Analyze dataset edge case failure #45
Comments
Hi there, The file I followed the tutorial (with very minor changes) and did not get any error messages. These are the commands I ran
Mind the
What seems odd to me is that '2019-04-02' doesn't match the input frequency of the data. All entries correspond to quarterly results, so the first entry of a year falls on 03-31, the second on 06-30, the third on 09-30 and the last on 12-31; a prediction with timestamp 2019-04-02 doesn't really make sense. Interestingly enough
What's even more interesting is that the time difference between consecutive predictions is 92 days, which is certainly not "a quarter": 92 x 4 = 368, which happens to be 365 + 3! Another interesting "feature" is that the SQL commands listed above yield predictions for
but for 2020
In summary
I suggest opening a PR for this issue that fixes the problem by properly handling unevenly sampled data, either through resampling or through timestamp matching (i.e. neglecting the offsets produced by the uneven sampling).
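To make the arithmetic above concrete, here is a minimal sketch using only Python's standard library. It is not MindsDB code: the helper names `quarter_end` and `snap_to_quarter_end` are hypothetical, and the starting date 2018-12-31 is an assumption chosen for illustration. It compares the real gaps between quarter ends with a fixed 92-day step, then shows one possible form of timestamp matching that snaps each drifting timestamp back to the nearest quarter end.

```python
from datetime import date, timedelta


def quarter_end(year: int, quarter: int) -> date:
    """Last calendar day of the given quarter (1-4). Hypothetical helper."""
    month = quarter * 3 + 1  # first month after the quarter
    if month > 12:
        return date(year, 12, 31)
    return date(year, month, 1) - timedelta(days=1)


def snap_to_quarter_end(d: date) -> date:
    """Nearest quarter end to d, looking at the previous and current quarter."""
    q = (d.month - 1) // 3 + 1
    this_end = quarter_end(d.year, q)
    prev_end = quarter_end(d.year - 1, 4) if q == 1 else quarter_end(d.year, q - 1)
    return min((prev_end, this_end), key=lambda e: abs((e - d).days))


# Assumed last observation, for illustration only.
start = date(2018, 12, 31)
ends_2019 = [quarter_end(2019, q) for q in range(1, 5)]

# Real spacing between consecutive quarter ends (uneven).
gaps = [(b - a).days for a, b in zip([start] + ends_2019, ends_2019)]

# Timestamps produced by a constant 92-day step (drifts past each quarter end).
fixed_step = [start + timedelta(days=92 * k) for k in range(1, 5)]

# Timestamp matching: snap each drifting timestamp to the nearest quarter end.
snapped = [snap_to_quarter_end(d) for d in fixed_step]

print(gaps)
print([d.isoformat() for d in fixed_step])
print([d.isoformat() for d in snapped])
```

Under these assumptions the quarter gaps are 90, 91, 92 and 92 days, while the constant step lands a few days after each quarter end and, by the fourth step, in the following year. Snapping recovers the four 2019 quarter ends.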
What a great analysis @pedrofluxa 🚀! Apologies for not including additional information here; it could have saved you a bunch of time. Even though what you diagnosed is correct, I think it is appropriate to move it to a different issue, as OP is more concerned about the key error and the subsequent failure in dataprep itself than about the generated predictions.
No problem: detective work is always fun (and I learned a lot) :D Got it. I'll investigate how to fix the particular problem described by OP. Would you please send the exact dataset and commands/code that OP used to trigger this error?
Exact commands are not available anymore, but:
Scratch that, managed to find it 😁:
Nice! I am not able to reproduce the error using a clean installation of MindsDB. I actually get these results back
I am not entirely sure whether the missing values for ma_timestamp_{1,2,3} are expected or point to something not working correctly.
Ok, closing then 🤝
Triggered by the attached dataset:
analyze_dataset_failure.csv