Preconfiguration API behavior for missing package policies #113921

Closed
simitt opened this issue Oct 5, 2021 · 12 comments · Fixed by #119488
Labels: bug, discuss, Feature:Fleet, QA:Validated, Team:Fleet, v7.17.1, v8.0.0, v8.1.0

Comments

@simitt
Contributor

simitt commented Oct 5, 2021

Using the preconfiguration API to set up agent policies results in an empty agent policy if the package registry cannot be reached. If the registry later becomes reachable, the agent policy is not updated, not even on Kibana startup.

This can leave agent policies in a half-baked state that is not automatically recoverable.

There are probably a number of questions to answer about whether policies from the preconfig API should update existing agent policies, but can we discuss comparing preconfigured agent policies with installed ones and adding missing package policies (without updating existing ones if they differ)?
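
To make the proposal concrete, here is a minimal sketch of the comparison, not Fleet's actual code. The `PackagePolicyRef` shape, the `findMissingPackagePolicies` helper, the example names, and the choice to match package policies by name are all assumptions made for illustration.

```typescript
// Hypothetical, simplified shape; Fleet's real types differ.
interface PackagePolicyRef {
  name: string; // package policy name
  packageName: string; // package the policy belongs to
}

// Return the preconfigured package policies that are absent from the
// installed agent policy. Matching is by name (an assumption). Existing
// package policies are never modified, even if their settings differ.
function findMissingPackagePolicies(
  preconfigured: PackagePolicyRef[],
  installed: PackagePolicyRef[]
): PackagePolicyRef[] {
  return preconfigured.filter(
    (wantedPolicy) => !installed.some((existing) => existing.name === wantedPolicy.name)
  );
}

// Example: the registry was unreachable during the first setup, so the
// agent policy was created without any package policies.
const wanted: PackagePolicyRef[] = [
  { name: 'Fleet Server', packageName: 'fleet_server' },
  { name: 'apm-cloud-123', packageName: 'apm' },
];
const toAdd = findMissingPackagePolicies(wanted, []);
console.log(toAdd.map((p) => p.name).join(', ')); // Fleet Server, apm-cloud-123
```

Because the helper only returns additions, running it again once everything is installed yields an empty list, so it would be safe to call on every startup.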

cc @jen-huang @joshdover @ruflin

@simitt added the discuss and Feature:Fleet labels Oct 5, 2021
@elasticmachine
Contributor

Pinging @elastic/fleet (Feature:Fleet)

@botelastic bot added the needs-team label Oct 5, 2021
@ruflin added the Team:Fleet label Oct 11, 2021
@botelastic bot removed the needs-team label Oct 11, 2021
@joshdover added the bug label Oct 18, 2021
@joshdover
Contributor

@nchaulet this seems like something we should be able to handle. We shouldn't have any of the fleet-preconfiguration-deletion-record objects around for these, so I think this is something we can achieve.

The challenge is deciding when to execute this retry, which will be something that should be tackled as part of #111859

@nchaulet
Member

nchaulet commented Oct 18, 2021

> @nchaulet this seems like something we should be able to handle. We shouldn't have any of the fleet-preconfiguration-deletion-record objects around for these, so I think this is something we can achieve.

Yes, this is something we can achieve. I have been digging a little more into our code, and there are probably a few things in the way the preconfiguration service works that can lead to errors:

  • If a policy is managed, we update the agent policy fields (but not the associated package policies; we should probably change that)
  • We have a bug: for the previous feature we do not use the is_managed flag from the config file, only the one from the saved object. This causes the previous point to not work correctly, because we first create the policy saved object without the is_managed flag
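
The second point can be illustrated with a small sketch. The types and both helper functions below are hypothetical and heavily simplified; they model the described behavior and one possible fix, not the actual preconfiguration service.

```typescript
// Hypothetical, simplified shapes; not Fleet's actual types.
interface PreconfiguredPolicy {
  id: string;
  is_managed?: boolean; // value read from the config file
}
interface SavedPolicy {
  id: string;
  is_managed: boolean; // value stored on the saved object
}

// Behavior described as buggy: only the saved object's flag is consulted.
function isManagedFromSavedObjectOnly(saved: SavedPolicy): boolean {
  return saved.is_managed;
}

// One possible fix (an assumption): the config file flag counts as well.
function isManagedWithConfig(config: PreconfiguredPolicy, saved: SavedPolicy): boolean {
  return config.is_managed === true || saved.is_managed;
}

const fromConfig: PreconfiguredPolicy = { id: 'example-policy', is_managed: true };
// The saved object is first created without the is_managed flag.
const justCreated: SavedPolicy = { id: 'example-policy', is_managed: false };

console.log(isManagedFromSavedObjectOnly(justCreated)); // false
console.log(isManagedWithConfig(fromConfig, justCreated)); // true
```

In the first case the managed-policy update path would be skipped right after creation, which matches the symptom described above.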

@joshdover
Contributor

Seems like we need to audit the existing behavior of preconfiguration, including how it handles failures and kibana.yml config changes, so we can determine a consistent model that should meet all of these types of requirements. I suspect this might be better than patching one thing at a time?

@simitt
Contributor Author

simitt commented Nov 22, 2021

As discussed with @nchaulet async, this is important for the cloud setup for 8.0 for the apm-server. The cloud setup is in managed mode, so adding the missing packages would only be required for the managed policy. Is this doable for 8.0?

@nchaulet self-assigned this Nov 22, 2021
@nchaulet
Member

@simitt I have a draft PR that does that. I will work on adding solid tests, but we should be able to land this for 8.0: #119488

@joshdover
Contributor

joshdover commented Feb 24, 2022

Test instructions:

  1. Start an on-prem 8.1 Kibana and ES cluster, add this to the kibana.yml config before starting Kibana:
xpack.fleet.packages:
  - name: fleet_server
    version: latest
  - name: apm
    version: latest
xpack.fleet.agentPolicies:
  # Cloud Agent policy
  - name: Elastic Cloud agent policy
    id: policy-elastic-cloud
    description: Default agent policy for agents hosted on Elastic Cloud
    is_default: false
    is_managed: true
    is_default_fleet_server: true
    namespace: default
    monitoring_enabled: []
    package_policies:
      - name: Fleet Server
        package:
          name: fleet_server
        inputs:
          - type: fleet-server
            keep_enabled: true
            vars:
              - name: host
                value: 0.0.0.0
              - name: port
                value: 8220
  2. Go to the Fleet app and wait for the UI to load
  3. Stop Kibana
  4. Change the kibana.yml to use these settings:
xpack.fleet.packages:
  - name: fleet_server
    version: latest
  - name: apm
    version: latest
xpack.fleet.agentPolicies:
  # Cloud Agent policy
  - name: Elastic Cloud agent policy
    id: policy-elastic-cloud
    description: Default agent policy for agents hosted on Elastic Cloud
    is_default: false
    is_managed: true
    is_default_fleet_server: true
    namespace: default
    monitoring_enabled: []
    package_policies:
      - name: apm-cloud-123
        package:
          name: apm
      - name: Fleet Server
        package:
          name: fleet_server
        inputs:
          - type: fleet-server
            keep_enabled: true
            vars:
              - name: host
                value: 0.0.0.0
              - name: port
                value: 8220
  5. Start Kibana and go to the Fleet app
  6. Go to the Elastic Cloud agent policy and verify that an APM integration policy was added
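
The expected outcome of these steps can be modeled with a small in-memory simulation. This is a sketch under stated assumptions: `PolicyConfig`, `store`, and `simulateStartup` are hypothetical names, package policies are matched by name, and additions on later startups are applied only to managed policies.

```typescript
// Hypothetical in-memory model of the test; names and shapes are illustrative only.
type PolicyConfig = { id: string; is_managed: boolean; package_policies: string[] };

// Stand-in for the saved objects: agent policy id -> installed package policy names.
const store: Record<string, string[]> = {};

// Simulates one Kibana startup. A new policy gets every configured package
// policy. An existing policy only gains the missing ones, and only when it
// is managed (an assumption). Existing package policies are left untouched.
function simulateStartup(config: PolicyConfig): string[] {
  const existing: string[] | undefined = store[config.id];
  const installed: string[] = existing ?? [];
  if (existing !== undefined && !config.is_managed) {
    return [];
  }
  const added = config.package_policies.filter((policyName) => installed.indexOf(policyName) === -1);
  store[config.id] = installed.concat(added);
  return added;
}

// Step 1: first startup with only the Fleet Server package policy.
const firstRun = simulateStartup({
  id: 'policy-elastic-cloud',
  is_managed: true,
  package_policies: ['Fleet Server'],
});

// Steps 4 and 5: restart after adding apm-cloud-123 to the config.
const secondRun = simulateStartup({
  id: 'policy-elastic-cloud',
  is_managed: true,
  package_policies: ['apm-cloud-123', 'Fleet Server'],
});

console.log(JSON.stringify(firstRun)); // ["Fleet Server"]
console.log(JSON.stringify(secondRun)); // ["apm-cloud-123"]
console.log(JSON.stringify(store['policy-elastic-cloud'])); // ["Fleet Server","apm-cloud-123"]
```

The second startup adds only the APM package policy, which is what step 6 checks in the UI.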

@amolnater-qasource

Hi @joshdover
We have revalidated this on self-managed 8.1 BC-4 and had the following observation:

  • We added the required configuration to kibana.yml; however, we get the error below on logging in to Kibana.
    [screenshot of the error]

Could you please confirm whether we need to create any policy manually for this?

Build details:
VERSION: 8.1.0
BUILD: 50428
COMMIT: 015578b

Please let us know if we are missing anything here.
Thanks

@joshdover
Contributor

@amolnater-qasource Apologies, I've updated the test instructions above to include the id

@amolnater-qasource

amolnater-qasource commented Mar 4, 2022

Hi @joshdover
Thanks for updating the configuration. We revalidated this with the updated configuration on an 8.1 BC-6 self-managed environment.

Build details:
BUILD: 50485
COMMIT: 4aaeda2
Artifact Link: https://staging.elastic.co/8.1.0-6b89df21/summary-8.1.0.html

We are still getting some errors, which prevent us from accessing the Fleet tab.
There is probably something still missing in our configuration.

Screenshots:
[two screenshots of the errors]

We tried to troubleshoot, but without success.
Could you please look into this?

Thanks!

@joshdover
Contributor

@amolnater-qasource Good catch, these need to be updated for the removal of default packages. I've updated the configs above again to include the fleet_server package.

@amolnater-qasource

amolnater-qasource commented Mar 7, 2022

Hi @joshdover
Thank you for updating the configuration. We have revalidated this on an 8.1 BC-6 self-managed environment and found it working fine.

Steps followed:

  1. Started self-managed Kibana with the config below in kibana.yml.
xpack.fleet.packages:
  - name: fleet_server
    version: latest
  - name: apm
    version: latest
xpack.fleet.agentPolicies:
  # Cloud Agent policy
  - name: Elastic Cloud agent policy
    id: policy-elastic-cloud
    description: Default agent policy for agents hosted on Elastic Cloud
    is_default: false
    is_managed: true
    is_default_fleet_server: true
    namespace: default
    monitoring_enabled: []
    package_policies:
      - name: Fleet Server
        package:
          name: fleet_server
        inputs:
          - type: fleet-server
            keep_enabled: true
            vars:
              - name: host
                value: 0.0.0.0
              - name: port
                value: 8220
  2. Under Fleet > Agent Policies, the Elastic Cloud agent policy is available with only the Fleet Server integration.
  3. Stopped Kibana and added the updated configuration:
xpack.fleet.packages:
  - name: fleet_server
    version: latest
  - name: apm
    version: latest
xpack.fleet.agentPolicies:
  # Cloud Agent policy
  - name: Elastic Cloud agent policy
    id: policy-elastic-cloud
    description: Default agent policy for agents hosted on Elastic Cloud
    is_default: false
    is_managed: true
    is_default_fleet_server: true
    namespace: default
    monitoring_enabled: []
    package_policies:
      - name: apm-cloud-123
        package:
          name: apm
      - name: Fleet Server
        package:
          name: fleet_server
        inputs:
          - type: fleet-server
            keep_enabled: true
            vars:
              - name: host
                value: 0.0.0.0
              - name: port
                value: 8220
  4. Reran kibana.bat; the Elastic Cloud agent policy is now available with the Fleet Server and Elastic APM integrations.

Build details:

BUILD: 50485
COMMIT: 4aaeda23aea9c3bf29698878c70a0107ea3c1659
Artifact Link: https://staging.elastic.co/8.1.0-6b89df21/summary-8.1.0.html

Screenshots:
[two screenshots]

Hence marking this as QA: Validated.
Thanks!

@amolnater-qasource added the QA:Validated label and removed the QA:Ready for Testing label Mar 7, 2022

6 participants