Restore the test assertion that agent monitoring logs should not contain errors #3562

cmacknz · 2023-10-06T21:01:14Z

We have a test to ensure that agent monitoring logs are shipped to Fleet:

elastic-agent/testing/integration/monitoring_logs_test.go

Line 31 in dcf2263

func TestMonitoringLogsShipped(t *testing.T) {

This test used to contain an assertion that the monitoring logs did not contain any error-level logs. It was added because several kinds of problems can hide in the logs, since the agent automatically restarts failing processes. Panics in inputs and processors are a recent example.

This assertion was removed because it made the test flaky: several error-level log messages can occur as part of normal operation (retries connecting to Fleet or Elasticsearch, for example). Rather than eliminating this check completely, we should add the ability to whitelist these expected errors, or convert them to the warning level if they are truly expected.

Having a test looking for unexpected errors in our logs is extremely valuable and would have caught bugs that were released in the past. We should eventually expand this to all of our tests but we can start by introducing it in a single test.

The function to check for errors in the monitoring logs in Elasticsearch still exists:

elastic-agent/pkg/testing/tools/estools/elasticsearch.go

Lines 156 to 160 in 99b14c8

    
// CheckForErrorsInLogs checks to see if any error-level lines exist
// excludeStrings can be used to remove any particular error strings from logs
func CheckForErrorsInLogs(client elastictransport.Interface, namespace string, excludeStrings []string) (Documents, error) {
	return CheckForErrorsInLogsWithContext(context.Background(), client, namespace, excludeStrings)
}

The call to it was removed from the test in 908c912, which shows how it was used:

t.Log("Making sure there are no error logs")
docs = findESDocs(t, func() (tools.Documents, error) {
	return tools.CheckForErrorsInLogs(info.ESClient, info.Namespace, []string{})
})
t.Logf("errors: Got %d documents", len(docs.Hits.Hits))
for _, doc := range docs.Hits.Hits {
	t.Logf("%#v", doc.Source)
}


elasticmachine · 2023-10-06T21:01:16Z

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

cmacknz added the Team:Elastic-Agent label Oct 6, 2023

pierrehilbert assigned michalpristas Oct 10, 2023

michalpristas mentioned this issue Oct 17, 2023

Enable log errors check test and filter for acceptable errors #3616

Merged

michalpristas closed this as completed in #3616 Oct 19, 2023
