Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Confusing log events for invalid requests #14232

Open
up2neck opened this issue Oct 2, 2024 · 4 comments
Open

Confusing log events for invalid requests #14232

up2neck opened this issue Oct 2, 2024 · 4 comments
Labels

Comments

@up2neck
Copy link
Contributor

up2neck commented Oct 2, 2024

APM Server log events and correlating trace (with self-instrumentation enabled) contains 2 different HTTP errors:

Whether log event contains 503 error with "request timed out" error:

Log entry
{
  "message": "request timed out",
  "url.original": "/intake/v2/events",
  "http.request.body.bytes": 0,
  "log.level": "error",
  "transaction.id": "d91bb495a6ab25bb",
  "http.request.method": "POST",
  "ecs.version": "1.6.0",
  "http.response.status_code": 503,
  "error.message": "request timed out",
  "log.logger": "request",
  "@timestamp": "2024-10-02T07:22:04.439Z",
  "service.name": "apm-server",
  "user_agent.original": "apm-agent-java/1.45.0 (autocode-app 2.0.0)",
  "log.origin": {
    "function": "github.com/elastic/apm-server/internal/beater/api.apmMiddleware.LogMiddleware.func1.1",
    "file.name": "middleware/log_middleware.go",
    "file.line": 59
  },
  "event.duration": 3499790781,
  "http.request.id": "d91bb495a6ab25bb",
  "source.address": "000.000.000.000",
  "trace.id": "d91bb495a6ab25bbdede8859c7cf8c8a"
}

APM interface shown actual error was:

Image

Trace body
{
  "_index": ".ds-traces-apm-dummy_sub2-2024.09.20-000009",
  "_id": "urkbTJIBO4jYgZRmiL6k",
  "_version": 1,
  "_score": 0,
  "_source": {
    "agent": {
      "name": "go",
      "version": "2.6.0"
    },
    "process": {
      "args": [
        "apm-server",
        "run",
        "-e",
        "-c",
        "config/config-secret/apm-server.yml"
      ],
      "pid": 1,
      "title": "apm-server"
    },
    "source": {
      "nat": {
        "ip": "00.000.00.000"
      },
      "ip": "000.000.000.000"
    },
    "processor": {
      "event": "transaction"
    },
    "url": {
      "path": "/intake/v2/events",
      "scheme": "https",
      "domain": "mydomain",
      "full": "https://mydomain/intake/v2/events"
    },
    "labels": {
      "project": "dummy-sub2"
    },
    "cloud": {
      "availability_zone": "europe-west3-a",
      "instance": {
        "name": "gke-dummy-sub-elastic-europe-w-generic-c4e5a328-8ipl",
        "id": "6772592666386103052"
      },
      "provider": "gcp",
      "project": {
        "id": "or2-ms-dummy-sub-elastic-t1iylu"
      },
      "region": "europe-west3"
    },
    "edp": {
      "domain": "mydomain"
    },
    "observer": {
      "hostname": "apm-server-v2-apm-server-6f6db466bd-22ccb",
      "type": "apm-server",
      "version": "8.14.3"
    },
    "trace": {
      "id": "d91bb495a6ab25bbdede8859c7cf8c8a"
    },
    "@timestamp": "2024-10-02T07:22:00.940Z",
    "data_stream": {
      "namespace": "dummy_sub2",
      "type": "traces",
      "dataset": "apm"
    },
    "service": {
      "node": {
        "name": "apm-server-v2-apm-server-6f6db466bd-22ccb"
      },
      "environment": "sandbox",
      "name": "apm-server",
      "runtime": {
        "name": "gc",
        "version": "go1.22.7"
      },
      "language": {
        "name": "go",
        "version": "go1.22.7"
      },
      "version": "8.14.3"
    },
    "host": {
      "hostname": "apm-server-v2-apm-server-6f6db466bd-22ccb",
      "os": {
        "platform": "linux"
      },
      "name": "apm-server-v2-apm-server-6f6db466bd-22ccb",
      "architecture": "amd64"
    },
    "client": {
      "geo": {
        "region_iso_code": "BE-BRU",
        "continent_name": "Europe",
        "city_name": "Brussels",
        "country_iso_code": "BE",
        "country_name": "Belgium",
        "location": {
          "lon": 4.347,
          "lat": 50.8534
        },
        "region_name": "Brussels Capital"
      },
      "ip": "000.000.000.000"
    },
    "http": {
      "request": {
        "headers": {
          "X-Cloud-Trace-Context": [
            "61ce7649b7546eb12e48bd1b05dc5acf/7453092485412124311"
          ],
          "Cache-Control": [
            "no-cache"
          ],
          "Accept": [
            "*/*"
          ],
          "Content-Encoding": [
            "deflate"
          ],
          "X-Forwarded-Proto": [
            "https"
          ],
          "User-Agent": [
            "apm-agent-java/1.45.0 (autocode-app 2.0.0)"
          ],
          "X-Forwarded-For": [
            "000.000.000.000,00.00.00.000"
          ],
          "Pragma": [
            "no-cache"
          ],
          "Via": [
            "1.1 google"
          ],
          "Content-Type": [
            "application/x-ndjson"
          ]
        },
        "method": "POST",
        "cookies_list": [],
        "cookies": {}
      },
      "response": {
        "headers": {
          "X-Content-Type-Options": [
            "nosniff"
          ],
          "Connection": [
            "Close"
          ],
          "Content-Type": [
            "application/json"
          ]
        },
        "status_code": 400
      },
      "version": "1.1"
    },
    "event": {
      "agent_id_status": "missing",
      "ingested": "2024-10-02T07:22:05Z",
      "success_count": 1,
      "outcome": "success"
    },
    "transaction": {
      "result": "HTTP 4xx",
      "duration": {
        "us": 3499844
      },
      "representative_count": 1,
      "name": "POST /intake/v2/events",
      "id": "d91bb495a6ab25bb",
      "span_count": {
        "dropped": 0,
        "started": 0
      },
      "type": "request",
      "sampled": true
    },
    "user_agent": {
      "original": "apm-agent-java/1.45.0 (autocode-app 2.0.0)",
      "name": "Other",
      "device": {
        "name": "Other"
      }
    },
    "span": {
      "id": "d91bb495a6ab25bb"
    },
    "timestamp": {
      "us": 1727853720940194
    }
  },
  "fields": {
    "transaction.name.text": [
      "POST /intake/v2/events"
    ],
    "http.request.headers.Accept": [
      "*/*"
    ],
    "http.response.headers.Connection": [
      "Close"
    ],
    "transaction.representative_count": [
      1
    ],
    "user_agent.original.text": [
      "apm-agent-java/1.45.0 (autocode-app 2.0.0)"
    ],
    "client.geo.country_iso_code": [
      "BE"
    ],
    "host.hostname": [
      "apm-server-v2-apm-server-6f6db466bd-22ccb"
    ],
    "process.pid": [
      1
    ],
    "service.language.name": [
      "go"
    ],
    "transaction.result": [
      "HTTP 4xx"
    ],
    "cloud.availability_zone": [
      "europe-west3-a"
    ],
    "process.title.text": [
      "apm-server"
    ],
    "transaction.id": [
      "d91bb495a6ab25bb"
    ],
    "http.request.headers.Via": [
      "1.1 google"
    ],
    "http.request.method": [
      "POST"
    ],
    "processor.event": [
      "transaction"
    ],
    "labels.project": [
      "dummy-sub2"
    ],
    "source.ip": [
      "000.000.000.000"
    ],
    "agent.name": [
      "go"
    ],
    "host.name": [
      "apm-server-v2-apm-server-6f6db466bd-22ccb"
    ],
    "client.geo.country_name": [
      "Belgium"
    ],
    "event.agent_id_status": [
      "missing"
    ],
    "http.response.status_code": [
      400
    ],
    "http.version": [
      "1.1"
    ],
    "event.outcome": [
      "success"
    ],
    "user_agent.original": [
      "apm-agent-java/1.45.0 (autocode-app 2.0.0)"
    ],
    "cloud.region": [
      "europe-west3"
    ],
    "transaction.duration.us": [
      3499844
    ],
    "service.runtime.version": [
      "go1.22.7"
    ],
    "span.id": [
      "d91bb495a6ab25bb"
    ],
    "edp.domain": [
      "mydomain"
    ],
    "client.ip": [
      "000.000.000.000"
    ],
    "user_agent.name": [
      "Other"
    ],
    "data_stream.type": [
      "traces"
    ],
    "host.architecture": [
      "amd64"
    ],
    "http.request.headers.X-Forwarded-For": [
      "000.000.000.000,00.00.00.000"
    ],
    "cloud.provider": [
      "gcp"
    ],
    "timestamp.us": [
      1727853720940194
    ],
    "url.path": [
      "/intake/v2/events"
    ],
    "observer.type": [
      "apm-server"
    ],
    "observer.version": [
      "8.14.3"
    ],
    "agent.version": [
      "2.6.0"
    ],
    "transaction.name": [
      "POST /intake/v2/events"
    ],
    "process.title": [
      "apm-server"
    ],
    "client.geo.continent_name": [
      "Europe"
    ],
    "http.request.headers.Cache-Control": [
      "no-cache"
    ],
    "http.request.headers.Content-Encoding": [
      "deflate"
    ],
    "service.node.name": [
      "apm-server-v2-apm-server-6f6db466bd-22ccb"
    ],
    "http.request.headers.X-Forwarded-Proto": [
      "https"
    ],
    "url.scheme": [
      "https"
    ],
    "source.nat.ip": [
      "00.000.00.000"
    ],
    "transaction.sampled": [
      true
    ],
    "cloud.instance.id": [
      "6772592666386103052"
    ],
    "trace.id": [
      "d91bb495a6ab25bbdede8859c7cf8c8a"
    ],
    "client.geo.region_name": [
      "Brussels Capital"
    ],
    "client.geo.location": [
      {
        "coordinates": [
          4.347,
          50.8534
        ],
        "type": "Point"
      }
    ],
    "event.success_count": [
      1
    ],
    "transaction.span_count.dropped": [
      0
    ],
    "http.request.headers.Content-Type": [
      "application/x-ndjson"
    ],
    "url.full": [
      "https://mydomain/intake/v2/events"
    ],
    "http.request.headers.User-Agent": [
      "apm-agent-java/1.45.0 (autocode-app 2.0.0)"
    ],
    "service.environment": [
      "sandbox"
    ],
    "client.geo.city_name": [
      "Brussels"
    ],
    "client.geo.region_iso_code": [
      "BE-BRU"
    ],
    "service.name": [
      "apm-server"
    ],
    "data_stream.namespace": [
      "dummy_sub2"
    ],
    "service.runtime.name": [
      "gc"
    ],
    "process.args": [
      "apm-server",
      "run",
      "-e",
      "-c",
      "config/config-secret/apm-server.yml"
    ],
    "observer.hostname": [
      "apm-server-v2-apm-server-6f6db466bd-22ccb"
    ],
    "http.request.headers.X-Cloud-Trace-Context": [
      "61ce7649b7546eb12e48bd1b05dc5acf/7453092485412124311"
    ],
    "url.full.text": [
      "https://mydomain/intake/v2/events"
    ],
    "transaction.type": [
      "request"
    ],
    "transaction.span_count.started": [
      0
    ],
    "event.ingested": [
      "2024-10-02T07:22:05.000Z"
    ],
    "http.request.headers.Pragma": [
      "no-cache"
    ],
    "@timestamp": [
      "2024-10-02T07:22:00.940Z"
    ],
    "service.version": [
      "8.14.3"
    ],
    "host.os.platform": [
      "linux"
    ],
    "data_stream.dataset": [
      "apm"
    ],
    "http.response.headers.Content-Type": [
      "application/json"
    ],
    "http.response.headers.X-Content-Type-Options": [
      "nosniff"
    ],
    "service.language.version": [
      "go1.22.7"
    ],
    "url.domain": [
      "mydomain"
    ],
    "user_agent.device.name": [
      "Other"
    ],
    "cloud.instance.name": [
      "gke-dummy-sub-elastic-europe-w-generic-c4e5a328-8ipl"
    ],
    "cloud.project.id": [
      "or2-ms-dummy-sub-elastic-t1iylu"
    ]
  }
}
@up2neck up2neck added the bug label Oct 2, 2024
@inge4pres
Copy link
Contributor

hello @up2neck can you please give a bit more informations on the environment?
I see from the trace you're running APM Server v 8.14.3, is Elasticsearch deployed at the same version?
The trace body and log body you posted are from documents stored in Elasticsearch?

@up2neck
Copy link
Contributor Author

up2neck commented Oct 2, 2024

@inge4pres

Log entry was taken from APM Server logs (stdout, collected by GCP Loggins if it matters)
Trace was taken from Elasticsearch itself.

Our team, has 2 environments (both facing the issue):
GCP Load balancer -> APM Server (GKE, ECK) -> Elasticsearch cluster (GKE, ECK)
GCP Load balancer -> APM Server (GKE, ECK) -> Elasticsearch cluster (Elastic Cloud)

My thoughts are some middleware, probably, rewrites the exact error, which leads to incorrect log entry.

@inge4pres
Copy link
Contributor

Thanks for the additional info 🙏🏼

The reason I was asking is that you may have transforms or ingest pipelines in Elasticsearch that remaps fields, in this case the http.response.status_code field.
Do you know if there is special handling of this field?

My thoughts are some middleware, probably, rewrites the exact error, which leads to incorrect log entry.

I am not aware of anything inside apm-server binary modifying the value of this field, hence my confusion as to why it is different in logs than what's stored in ES as trace.
I'll verify this is bug and report back.

@up2neck
Copy link
Contributor Author

up2neck commented Oct 3, 2024

@inge4pres

There are few additional steps on the top of default "APM" integration-provided ingest pipelines, we have, but all of them don't interact with http.response.status_code field, keeps it original.

@inge4pres inge4pres self-assigned this Oct 3, 2024
@inge4pres inge4pres removed their assignment Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants