[APM] Create multi-signal service inventory : Tech Preview 'off' by default #183012

Closed
roshan-elastic opened this issue May 9, 2024 · 9 comments · Fixed by #183605
Labels: enhancement, Team:obs-ux-infra_services

Comments

@roshan-elastic

roshan-elastic commented May 9, 2024

📖 Description

Create initial multi-signal service inventory under tech preview (off by default)

🎨 Designs

See Figma

See design issue

✔️ Acceptance criteria

1. Must Have

Must be delivered in this issue in order for the release to be valuable

| Name | Description |
| --- | --- |
| New service inventory must be marked as technical preview | - |
| Log rate and log error rate must be available as columns | These will only be populated for logs-only services. If you add APM to a logs-only service, we will stop populating these columns.<br><br>This is because when you click through into the APM view, there will be no capability to debug these logs (and also, log error rate is not simple for APM logs as there is no log.level). We will figure this out in a future iteration. |
| Logs-only services, APM-enabled services, and services with both APM and log data must show in the same list | - |
| Logs-only services must support service.environment | Just like in APM, this metadata is optional and will be handled just as it is with APM-instrumented services |
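
As a rough illustration of the log-column rule above, the check could look something like the sketch below. The `ServiceRow` shape, the `SignalType` values, and both function names are hypothetical and are not taken from the Kibana codebase.

```typescript
// Hypothetical row shape for illustration only; not the actual Kibana types.
type SignalType = "logs" | "traces";

interface ServiceRow {
  serviceName: string;
  environment?: string; // service.environment is optional, as with APM services
  signalTypes: SignalType[];
  logRatePerMinute?: number;
  logErrorRate?: number;
}

// A service is logs-only when it has log data and no APM traces.
function isLogsOnly(row: ServiceRow): boolean {
  return row.signalTypes.includes("logs") && !row.signalTypes.includes("traces");
}

// Returns the values for the two log columns; both stay undefined (blank cells)
// as soon as the service is instrumented with APM.
function logColumns(row: ServiceRow): { logRate?: number; logErrorRate?: number } {
  if (!isLogsOnly(row)) {
    return {};
  }
  return { logRate: row.logRatePerMinute, logErrorRate: row.logErrorRate };
}
```

With this rule, a service that has both logs and traces still appears in the list, but its log rate and log error rate cells are left empty.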

2. Should Have

Should be delivered in this issue in order for the release to be valuable

| Name | Description |
| --- | --- |
| - | - |

3. Could Have

Could be delivered in this issue in order for the release to be valuable

| Name | Description |
| --- | --- |
| Alert counts could show but only for services with traces | - |
| Health could show but only for services with traces | - |
| Comparison could be available but only for trace-related metrics | - |
| Alert creation workflows could be available but with messaging to explain they only work for those using APM/tracing | - |
| Anomaly Detection workflows could be available but with messaging to explain they only work for those using APM/tracing | - |
| Service Groups could be available but with messaging to explain they only work for those using APM/tracing | - |

4. Will Not Have (for now)

Explicitly will not be looked at within this issue

| Name | Description |
| --- | --- |
| We won't let users toggle the feature flag via the service inventory directly | For now, this will only be controlled via feature flag which is accessed via APM 'settings' and 'stack management' |
| Service Map will not show logs-only services | - |
| Services cannot be 'created' | Users won't be able to create this |
| Fast filter will not be available | - |
| Log Rate and Log Error % values will not show for any service instrumented with APM | We will figure this out in a future iteration |

📈 Telemetry Process

  • Telemetry requirements must be part of the acceptance criteria (above) (defined by the Epic creator, e.g. the Product Manager) during refinement.
  • See Telemetry Process for full details/process/implementation conventions.
@botelastic bot added the needs-team label May 9, 2024
@roshan-elastic changed the title from "[APM] Create multi-signal service views (logs-only)" to "[APM] Create multi-signal service inventory" May 9, 2024
@roshan-elastic added the Team:obs-ux-infra_services label May 9, 2024
@elasticmachine
Contributor

Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)

@botelastic bot removed the needs-team label May 9, 2024
@roshan-elastic changed the title from "[APM] Create multi-signal service inventory" to "[APM] Create multi-signal service inventory (tech preview - default off)" May 9, 2024
@roshan-elastic changed the title from "[APM] Create multi-signal service inventory (tech preview - default off)" to "[APM] Create multi-signal service inventory : Tech Preview 'off' by default" May 9, 2024
@kpatticha
Contributor

I have created this ticket #182675 based on the initial AC. Happy to close it as a duplicate, but can we update this one to include only the requirements for this iteration, e.g. Must Have and Will Not Have (for now)?

It would be really useful if the implementation tickets were as concrete as possible. It helps both the engineer and the reviewer.

@roshan-elastic wdyt?

@roshan-elastic
Author

Ah sorry @kpatticha - I completely missed that and thanks for flagging! I was working with @cauemarcondes on this.

The 'must have' requirements are the only things required for this first iteration. Is that OK or is there anything that could make this clearer?

@kpatticha
Contributor

@roshan-elastic the must have requirements are clear.

However, I would appreciate it if the ticket includes only the things necessary for this first iteration.

@roshan-elastic
Author

No worries @kpatticha. Could I ask about this to understand a bit more? I just want to make sure I structure these issues in the right way moving forwards.

Should these implementation issues only have the minimum criteria for success or should they have stretch goals in there too (e.g. could have)?

I'm OK either way - I'm just trying to figure out where (or if) we should put things which aren't critical in this kind of ticket.

Thanks!

@kpatticha self-assigned this May 15, 2024
@smith added the enhancement label May 17, 2024
@smith
Contributor

smith commented May 17, 2024

> log error rate is not simple for APM logs as there is no log.level

APM errors use error.log.level (not ECS) and we have an open issue on semantic conventions to add log.level to otel logs: open-telemetry/semantic-conventions#134

@kpatticha
Contributor

Pausing this work for a bit to work on #183818 in order to match the output of the entities.

@kpatticha
Contributor

Resuming the work on the UI.

@roshan-elastic
Author

@chrisdistasio / @tommyers-elastic / @kpatticha

Getting Log Rate and Log Error % to work in Service Inventory for services with only logs, only APM or both

Kate and I were discussing how the 'log rate' and 'log error rate' columns would work in the new Service Inventory and also ensuring users can dig into these logs by going to the logs tab in the service view.

Where we'd like to end up is that this works for services with only logs, only APM or both.

Looking at the data, I'm wondering whether we might be able to achieve this with some small configuration?

My Test on edge

Here are the formulae I tried for displaying log rate and log error % in the service inventory:

Log Rate

Index pattern : logs-*
Formula : Log rate for records where service.name : "NAME OF SERVICE"

Log Error Rate

Index pattern : logs-*
Formula : Count of records where service.name : "NAME OF SERVICE" AND (log.level : "error" OR error.log.level : "error")

Note : Both of the above should also match custom logs (log-*) and also APM logs (logs-apm-*).
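
To make the two formulae concrete, here is a minimal sketch that applies them to in-memory records. The `LogRecord` shape and the `logStats` helper are hypothetical, and turning the error count into a percentage of all matching logs is an assumption about what "Log Error %" means; the formula above only specifies the count.

```typescript
// Hypothetical in-memory log record; field names follow the formulae above.
interface LogRecord {
  "service.name": string;
  "log.level"?: string;
  "error.log.level"?: string;
}

// A record counts as an error if either level field equals "error".
function isErrorLog(record: LogRecord): boolean {
  return record["log.level"] === "error" || record["error.log.level"] === "error";
}

function logStats(records: LogRecord[], serviceName: string, windowMinutes: number) {
  const matching = records.filter((r) => r["service.name"] === serviceName);
  const errors = matching.filter(isErrorLog).length;
  return {
    logRatePerMinute: matching.length / windowMinutes,
    errorCount: errors,
    // Assumption: error % is errors over all matching logs; 0 when there are no logs.
    logErrorPercent: matching.length === 0 ? 0 : (errors / matching.length) * 100,
  };
}
```

Checking both `log.level` and `error.log.level` is what lets the same calculation cover custom logs and APM error logs.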

Sample Lens Visualisation on edge

Note : I've simplified the logic for 'log rate' to just 'log count' in Lens to save some time to illustrate my proposal

image

It seems to work but I might be missing some detail here.

If this works, I think the 'logs' tab in the service view should also work

I have a feeling that the Logs tab in the Service View should automatically show any logs where service.name matches the current service, from any indices configured under "error indices" in the APM settings.

Therefore, I'm wondering if we could see both logs from APM and custom logs if this index pattern was just set to logs-*?

image

WDYT?

bhapas pushed a commit to bhapas/kibana that referenced this issue Jun 24, 2024
## Summary

fixes elastic#183012 
- Rename `assets` to `entities`
- Update the entities index to `.entities-observability.latest-*`, the index
where the data transform writes the summaries
- Show a search field to filter by service name. This will allow the
user to filter the table without knowing the entities index and the
fields.
- Use the same template and path for the service inventory (`/services`).
- `throughput` remains the initial sorting field
- Merge the entities with the same service name and calculate the
averages in the front end
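
The last item, merging entities that share a service name and averaging their metrics, could be sketched roughly as follows. The `ServiceEntity` and `MergedService` shapes and the function name are hypothetical, and a plain unweighted mean is assumed; the actual front-end code may weight or combine metrics differently.

```typescript
// Hypothetical entity shape for illustration; the real entity documents may differ.
interface ServiceEntity {
  serviceName: string;
  environment?: string;
  metrics: Record<string, number>;
}

interface MergedService {
  serviceName: string;
  environments: string[];
  metrics: Record<string, number>;
}

function mergeByServiceName(entities: ServiceEntity[]): MergedService[] {
  // Group entities (one per service.name + service.environment) by service name.
  const groups = new Map<string, ServiceEntity[]>();
  for (const entity of entities) {
    const group = groups.get(entity.serviceName) ?? [];
    group.push(entity);
    groups.set(entity.serviceName, group);
  }

  const merged: MergedService[] = [];
  groups.forEach((group, serviceName) => {
    const totals: Record<string, number> = {};
    const counts: Record<string, number> = {};
    for (const entity of group) {
      for (const name of Object.keys(entity.metrics)) {
        totals[name] = (totals[name] ?? 0) + entity.metrics[name];
        counts[name] = (counts[name] ?? 0) + 1;
      }
    }
    // Assumption: each metric is a plain unweighted mean over the entities reporting it.
    const metrics: Record<string, number> = {};
    for (const name of Object.keys(totals)) {
      metrics[name] = totals[name] / counts[name];
    }
    const environments = group
      .map((entity) => entity.environment)
      .filter((env): env is string => env !== undefined);
    merged.push({ serviceName, environments, metrics });
  });
  return merged;
}
```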

https://github.com/elastic/kibana/assets/3369346/2dbc07e9-3086-4d32-a98e-5dc364f59554



### How to test
1. Add the config to your kibana.yml
```
xpack.assetManager:
  alphaEnabled: true
```
2. Enable `observability:apmEnableMultiSignal` in advanced settings
 
<details>


<summary>3. Run the entities definition in the dev tools</summary>


```
POST kbn:/internal/api/entities/definition
{
  "id": "apm-services-with-metadata",
  "name": "Services from logs and metrics",
  "displayNameTemplate": "test",
  "history": {
    "timestampField": "@timestamp",
    "interval": "5m"
  },
  "type": "service",
  "indexPatterns": [
    "logs-*",
    "metrics-*"
  ],
  "timestampField": "@timestamp",
  "lookback": "5m",
  "identityFields": [
    {
      "field": "service.name",
      "optional": false
    },
    {
      "field": "service.environment",
      "optional": true
    }
  ],
  "identityTemplate": "{{service.name}}:{{service.environment}}",
  "metadata": [
    "tags",
    "host.name",
    "data_stream.type",
    "service.name", 
    "service.instance.id",
    "service.namespace",
    "service.environment",
    "service.version",
    "service.runtime.name",
    "service.runtime.version",
    "service.node.name",
    "service.language.name",
    "agent.name",
    "cloud.provider",
    "cloud.instance.id",
    "cloud.availability_zone",
    "cloud.instance.name",
    "cloud.machine.type",
    "container.id"
  ],
  "metrics": [
    {
      "name": "latency",
      "equation": "A",
      "metrics": [
        {
          "name": "A",
          "aggregation": "avg",
          "field": "transaction.duration.histogram"
        }
      ]
    },
    {
      "name": "throughput",
      "equation": "A / 5",
      "metrics": [
        {
          "name": "A",
          "aggregation": "doc_count",
          "filter": "transaction.duration.histogram:*"
        }
      ]
    },
    {
      "name": "failedTransactionRate",
      "equation": "A / B",
      "metrics": [
        {
          "name": "A",
          "aggregation": "doc_count",
          "filter": "event.outcome: \"failure\""
        },
        {
          "name": "B",
          "aggregation": "doc_count",
          "filter": "event.outcome: *"
        }
      ]
    },
    {
      "name": "logErrorRate",
      "equation": "A / B",
      "metrics": [
        {
          "name": "A",
          "aggregation": "doc_count",
          "filter": "log.level: \"error\""
        },
        {
          "name": "B",
          "aggregation": "doc_count",
          "filter": "log.level: *"
        }
      ]
    },
    {
      "name": "logRatePerMinute",
      "equation": "A / 5",
      "metrics": [
        {
          "name": "A",
          "aggregation": "doc_count",
          "filter": "log.level: *"
        }
      ]
    }
  ]
}
```

</details>

4. Generate data with synthtrace

    1. logs only: `node scripts/synthtrace simple_logs.ts`
    2. APM only: `node scripts/synthtrace simple_trace.ts` 
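
When checking the generated data against the table, it can help to work through the metric equations from the entity definition in step 3 by hand. The sketch below does that arithmetic for `throughput`, `failedTransactionRate`, and `logErrorRate`, assuming one 5-minute bucket. The `BucketCounts` shape and the sample numbers are made up, and returning `null` for an empty denominator is an assumption, since the definition does not say how division by zero is handled.

```typescript
// Doc counts for a single 5-minute bucket. Hypothetical shape for illustration.
interface BucketCounts {
  transactions: number;   // docs matching transaction.duration.histogram:*
  failedOutcomes: number; // docs matching event.outcome: "failure"
  allOutcomes: number;    // docs matching event.outcome: *
  errorLogs: number;      // docs matching log.level: "error"
  allLogs: number;        // docs matching log.level: *
}

const INTERVAL_MINUTES = 5; // matches the "5m" interval in the definition

// Assumption: a ratio with an empty denominator is reported as null.
function ratio(numerator: number, denominator: number): number | null {
  return denominator === 0 ? null : numerator / denominator;
}

function evaluateMetrics(counts: BucketCounts) {
  return {
    throughput: counts.transactions / INTERVAL_MINUTES, // equation "A / 5"
    failedTransactionRate: ratio(counts.failedOutcomes, counts.allOutcomes), // "A / B"
    logErrorRate: ratio(counts.errorLogs, counts.allLogs), // "A / B"
  };
}
```

Dividing the document count by 5 turns a per-bucket count into a per-minute value, which is why `throughput` uses `A / 5` alongside the 5-minute interval.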


### Checklist
- [ ] There is an issue with the `SearchBar` that causes a gap between
the search field and the time range. I need to check it.