[receiver/awscloudwatch] Missing log stream events #32231
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
It is also not iterating over all the log groups in my AWS account. It iterates over and over again through log groups that don't even have log streams anymore, and it never tries any log groups other than the ones shown below.
|
@djaglowski @schmikei could you help me set up my dev environment so that I can fix this? |
Is #32057 the fix? |
I think it might be related, if you'd be willing to try out the PR feel free to see if it resolves your issue, we'd very much appreciate it! |
@schmikei how can I do it? I'm using the otel-collector binary |
You can checkout the fork and do a |
@schmikei can I build the Linux distribution on a Mac? |
never mind @schmikei |
@schmikei do you get this error while building? I checked out your branch, and then I ran the |
@schmikei I spent the day building this stuff, but it did not work :/ I used the same
Below is the output when using 0.97.0
Below you can see the result I got from the new binary
|
@djaglowski @schmikei can you help here? |
Aah, i see this: |
OK, so I did some tests with AWS API Gateway. Of those 1000 requests, all 1000 are processed by AWS API Gateway. I am adding the timestamps of the CloudWatch logs and the Loki logs here. When you compare the two files, you see a gap in the logs about every minute (my poll interval). |
Increasing or decreasing the poll interval does not help. |
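The comparison described above (CloudWatch timestamps versus the timestamps that reached Loki) can be automated. The following is only a minimal sketch: it assumes both exports have already been parsed into lists of `datetime` values, and the function name `find_gaps` is hypothetical, not part of the receiver or of any AWS or Loki tooling.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def find_gaps(source_ts: List[datetime],
              exported_ts: List[datetime]) -> List[Tuple[datetime, datetime, int]]:
    """Return runs of consecutive source timestamps that are missing from the export.

    Each result is (first_missing, last_missing, number_of_missing_events).
    """
    exported = set(exported_ts)
    gaps = []
    run: List[datetime] = []
    for ts in sorted(source_ts):
        if ts in exported:
            # An exported event closes the current run of missing events, if any.
            if run:
                gaps.append((run[0], run[-1], len(run)))
                run = []
        else:
            run.append(ts)
    if run:
        gaps.append((run[0], run[-1], len(run)))
    return gaps


if __name__ == "__main__":
    # Synthetic data for illustration only: one event per second, three missing.
    base = datetime(2024, 4, 9, 12, 0, 0)
    cloudwatch = [base + timedelta(seconds=i) for i in range(10)]
    loki = [ts for i, ts in enumerate(cloudwatch) if i not in (3, 4, 5)]
    for first, last, count in find_gaps(cloudwatch, loki):
        print(f"missing {count} events between {first} and {last}")
```

If the gaps line up with multiples of the poll interval, that points at the boundaries of the polling window rather than at random loss.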
I'm starting to think CloudWatch may take a second to serve the log entries on the API (just a suspicion). I created a spike PR in #33809 that will set the next end time... I will try to dedicate some time to see if I can replicate the specific behavior, but if you'd like to test out the PR to see if it helps, that would help as I try to replicate your test case. |
@schmikei once I have time I will test it again |
I did another test, this time with only API Gateway log groups. That is 4 log groups out of the 38 that I have in total. The gaps in the logs that I see are between 14 and 18 seconds with a 1-minute poll interval. @schmikei I don't have the mad skills to do this. But if you can provide a Docker image, I can test. |
@schmikei Here is a graph of the logs I get. This was done with a poll interval of 5 minutes, and you can see a big gap every 5 minutes. If you look closely, you also see small gaps every couple of seconds. It turns out that when CloudWatch creates a new log stream, the first logs (API Gateway call) are not picked up. A possible solution to fix this would be to work with a time offset equal to the poll rate, so if |
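One way to read the time-offset idea is that each poll queries a window that ends one offset in the past, so late-arriving events have time to become visible. The sketch below illustrates that reading only. It is not the receiver's actual implementation, and `next_window` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

Window = Tuple[datetime, datetime]


def next_window(last_end: datetime, now: datetime, offset: timedelta) -> Optional[Window]:
    """Return the (start, end) range to query on this poll, or None if nothing is due.

    The window ends `offset` before `now`, and starts where the previous window
    ended, so consecutive windows are contiguous and never overlap.
    """
    end = now - offset
    if end <= last_end:
        return None  # the delayed window has not advanced yet
    return (last_end, end)


if __name__ == "__main__":
    poll = timedelta(minutes=1)  # assumed poll interval; the offset is set equal to it
    start = datetime(2024, 4, 9, 12, 0, tzinfo=timezone.utc)
    last_end = start
    for tick in range(1, 4):
        now = start + tick * poll
        window = next_window(last_end, now, offset=poll)
        if window is not None:
            print(window[0].time(), "->", window[1].time())
            last_end = window[1]
```

The trade-off is latency: with the offset equal to the poll interval, every event is delivered at least one interval late.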
@AllanOricil @schmikei |
I did not have time to test the latest changes. If you did and the issue is still there, make a video and post it here. |
Heya @AllanOricil |
@schmikei |
@Jessimon I can't invest time in testing it again. Once I reach a point in my project where it makes sense to add probes to every single running service, I will try again. |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
Component(s)
receiver/awscloudwatch
What happened?
Description
1 - Past log streams aren't exported
2 - Log streams are incomplete
Steps to Reproduce
1 - create a Lambda function that writes more than 15 log lines to CloudWatch
2 - run this function several times and ensure a single log stream has more than 15 log lines
3 - on an EC2 instance, run otel-collector with this receiver
WARNING: don't forget to replace
NAME_OF_YOUR_LAMBDA_FUNCTION_LOG_GROUP
with the name of the log group of your Lambda function
Don't forget to register the receiver in the logs pipeline
4 - verify that logs are processed with no error. You should see something as shown below
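For step 1, a minimal sketch of a Lambda function that produces enough log lines is shown here. It is not the reporter's original function: the handler name, the line count of 20, and the message format are all assumptions chosen for illustration. It relies on the fact that standard output from a Python Lambda function is written to the function's CloudWatch log stream.

```python
def handler(event, context):
    """Write more than 15 numbered log lines so that missing events are easy to spot."""
    total = 20  # assumed value; anything above 15 matches the steps above
    for i in range(1, total + 1):
        # Numbering each line makes gaps visible when comparing
        # CloudWatch against the exported logs.
        print(f"test log line {i} of {total}")
    return {"lines_logged": total}


if __name__ == "__main__":
    # Local smoke test; in Lambda, the runtime calls handler() itself.
    print(handler({}, None))
```

Because each line carries its own sequence number, any missing event in the exported logs shows up as a skipped number.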
Expected Result
Actual Result
Collector version
0.97.0
Environment information
Environment
NAME="Amazon Linux"
VERSION="2023"
ID="amzn"
ID_LIKE="fedora"
VERSION_ID="2023"
PLATFORM_ID="platform:al2023"
PRETTY_NAME="Amazon Linux 2023"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2023"
HOME_URL="https://aws.amazon.com/linux/"
BUG_REPORT_URL="https://github.com/amazonlinux/amazon-linux-2023"
SUPPORT_END="2028-03-15"
OpenTelemetry Collector configuration
Log output
Additional context
I'm using SigNoz to see my logs
Compare the images to verify that there are missing log events for my log stream named
2024/04/09/[$LATEST]9a79cb34998a4037bf1e3ff5df35fea5
In SigNoz, I filtered logs by the log stream name