Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Error in experimental file data visualizer #29578

Closed
wolframhaussig opened this issue Jan 30, 2019 · 7 comments
Closed

[ML] Error in experimental file data visualizer #29578

wolframhaussig opened this issue Jan 30, 2019 · 7 comments
Labels

Comments

@wolframhaussig
Copy link
Contributor

Kibana version: 6.5.4

Elasticsearch version: 6.5.4

Server OS version: RedHat 7

Browser version: Chrome 71

Browser OS version: Windows 10

Original install method (e.g. download page, yum, from source, etc.): download page

Description of the problem including expected versus actual behavior:
I tried the experimental File Data Visualizer with a logback generated logfile of 390KB. Instead of showing the data an error occurs:

File could not be read
[timeout_exception] Aborting structure analysis during [timestamp format determination] as it has taken longer than the timeout of [25s]

We have a 3 node elasticsearch cluster - all nodes were available, management console shows green status.

Errors in browser console (if relevant):

Failed to load resource: the server responded with a status of 400 (Bad Request)
error: "Bad Request"
message: "[timeout_exception] Aborting structure analysis during [timestamp format determination] as it has taken longer than the timeout of [25s]"
statusCode: 400

Provide logs and/or server output (if relevant):
Elasticsearch:

[2019-01-30T09:55:10,280][WARN ][r.suppressed ] [node-1] path: /_xpack/ml/find_file_structure, params: {}
org.elasticsearch.ElasticsearchTimeoutException: Aborting structure analysis during [timestamp format determination] as it has taken longer than the timeout of [25s]
at org.elasticsearch.xpack.ml.filestructurefinder.TimeoutChecker.check(TimeoutChecker.java:82) ~[?:?]
at org.elasticsearch.xpack.ml.filestructurefinder.TimeoutChecker.grokCaptures(TimeoutChecker.java:103) ~[?:?]
at org.elasticsearch.xpack.ml.filestructurefinder.TimestampFormatFinder.findFirstMatch(TimestampFormatFinder.java:236) ~[?:?]
at org.elasticsearch.xpack.ml.filestructurefinder.TimestampFormatFinder.findFirstMatch(TimestampFormatFinder.java:191) ~[?:?]
at org.elasticsearch.xpack.ml.filestructurefinder.TextLogFileStructureFinder.mostLikelyTimestamp(TextLogFileStructureFinder.java:158) ~[?:?]
at org.elasticsearch.xpack.ml.filestructurefinder.TextLogFileStructureFinder.makeTextLogFileStructureFinder(TextLogFileStructureFinder.java:35) ~[?:?]
at org.elasticsearch.xpack.ml.filestructurefinder.TextLogFileStructureFinderFactory.createFromSample(TextLogFileStructureFinderFactory.java:45) ~[?:?]
at org.elasticsearch.xpack.ml.filestructurefinder.FileStructureFinderManager.makeBestStructureFinder(FileStructureFinderManager.java:278) ~[?:?]
at org.elasticsearch.xpack.ml.filestructurefinder.FileStructureFinderManager.findFileStructure(FileStructureFinderManager.java:150) ~[?:?]
at org.elasticsearch.xpack.ml.filestructurefinder.FileStructureFinderManager.findFileStructure(FileStructureFinderManager.java:121) ~[?:?]
at org.elasticsearch.xpack.ml.action.TransportFindFileStructureAction.buildFileStructureResponse(TransportFindFileStructureAction.java:50) ~[?:?]
at org.elasticsearch.xpack.ml.action.TransportFindFileStructureAction.lambda$doExecute$0(TransportFindFileStructureAction.java:39) ~[?:?]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.5.4.jar:6.5.4]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_77]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_77]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]

Kibana:

{"type":"response","@timestamp":"2019-01-30T08:54:45Z","tags":[],"pid":21927,"method":"post","statusCode":400,"req":{"url":"/api/ml/file_data_visualizer/analyze_file","method":"post","headers":{"host":"host:5601","connection":"keep-alive","content-length":"402517","kbn-system-api":"true","origin":"https://host:5601","kbn-version":"6.5.4","user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36","content-type":"application/json","accept":"/","referer":"https://host:5601/app/ml","accept-encoding":"gzip, deflate, br","accept-language":"de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7"},"remoteAddress":"XXX.XXX.XXX.XXX","userAgent":"XXX.XXX.XXX.XXX","referer":"https://host:5601/app/ml"},"res":{"statusCode":400,"responseTime":25146,"contentLength":9},"message":"POST /api/ml/file_data_visualizer/analyze_file 400 25146ms - 9.0B"}

@elasticmachine
Copy link
Contributor

Pinging @elastic/ml-ui

@peteharverson peteharverson changed the title Error in experimental file data visualizer [ML] Error in experimental file data visualizer Jan 30, 2019
@droberts195
Copy link
Contributor

There are certain log structures that are slow to analyse, including those that have multiple timestamps on the same line that nearly match our supported timestamps but not quite - see elastic/elasticsearch#35137

It’s going to be hard to debug this problem without the file that causes it. @wolframhaussig would you be able to supply us with an anonymised version of the file, with everything confidential or identifying individuals removed?

If this is not possible something else you could do to help narrow down the problem is to call the backend API directly, gradually increasing lines_to_sample from a very low number, say 10, up to the 1000 that the UI uses. The point at which the time taken increases dramatically will give a clue about the characteristic of the line that makes it slow. Then maybe you can describe those characteristics to us without giving away anything that is confidential or infringes anybody’s privacy. The documentation for the backend endpoint is in https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-find-file-structure.html

@wolframhaussig
Copy link
Contributor Author

Is there a minimum number of lines which can be visualized? I reduced the logfile to 9 rows and get the error message:

File could not be read
[illegal_argument_exception] Number of messages analyzed must be positive.

I added an anonymized version of the file with which the error is reproducable:
Reduced1.log

I will now try to reproduce the timeout error.

@droberts195
Copy link
Contributor

@wolframhaussig thanks for uploading that sample.

I can see what the problem is now. The file structure finder endpoint only recognises a small set of timestamp formats, and yours is not one of them. To see the formats it recognises search for "the formats it knows" in https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-find-file-structure.html

When analysing a semi-structured log file the structure finder groups the lines into messages by assuming the timestamp is on the first line of each message. But for your file it didn't recognise the timestamp at all so it thinks the whole file is leftovers from a previous incomplete message before the first complete message. This is why you get the "Number of messages analyzed must be positive" error - it thinks there are 0 complete messages after grouping the lines into messages.

Since this is a backend problem I will open an issue in the elasticsearch repo to improve the timestamp recognition.

@droberts195
Copy link
Contributor

I will now try to reproduce the timeout error.

Given what the problem is, please don't spend any more time doing this.

@droberts195
Copy link
Contributor

But for your file it didn't recognise the timestamp at all

Actually that statement I made was wrong, and it's why you don't get a more useful "this file does not contain a timestamp" error.

Mon Jan 21 11:04:19 CET 2019 is a recognised timestamp, but it only appears once in the file. So the structure finder ends up thinking the file consists of some preamble before the first message and then one message consisting of every line from the one with Mon Jan 21 11:04:19 CET 2019 to the end. It doesn't analyse the last message in case it is truncated, so you still end up with 0 messages to analyse. But then the error is very confusing. I'll add that to the list of things to fix.

@droberts195
Copy link
Contributor

I opened elastic/elasticsearch#38086. Anyone who encounters the problem reported in this issue should subscribe to elastic/elasticsearch#38086 to follow progress towards improving things. Since this issue is in the UI repo I will close it now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants