Add a method to add the quantiles in a timeseries graph #124

Open
lordofthejars opened this issue Apr 11, 2024 · 13 comments


lordofthejars commented Apr 11, 2024

As we discussed at DevNexus, I'd love to plot a time series where I set the mean and the quantiles. What I have is an array of floats for each part; here is a real example.

==== target =====
[112.0, 118.0, 132.0, 129.0, 121.0, 135.0, 148.0, 148.0, 136.0, 119.0, 104.0, 118.0, 115.0, 126.0, 141.0, 135.0, 125.0, 149.0, 170.0, 170.0, 158.0, 133.0, 114.0, 140.0, 145.0, 150.0, 178.0, 163.0, 172.0, 178.0, 199.0, 199.0, 184.0, 162.0, 146.0, 166.0, 171.0, 180.0, 193.0, 181.0, 183.0, 218.0, 230.0, 242.0, 209.0, 191.0, 172.0, 194.0, 196.0, 196.0, 236.0, 235.0, 229.0, 243.0, 264.0, 272.0, 237.0, 211.0, 180.0, 201.0, 204.0, 188.0, 235.0, 227.0, 234.0, 264.0, 302.0, 293.0, 259.0, 229.0, 203.0, 229.0, 242.0, 233.0, 267.0, 269.0, 270.0, 315.0, 364.0, 347.0, 312.0, 274.0, 237.0, 278.0, 284.0, 277.0, 317.0, 313.0, 318.0, 374.0, 413.0, 405.0, 355.0, 306.0, 271.0, 306.0, 315.0, 301.0, 356.0, 348.0, 355.0, 422.0, 465.0, 467.0, 404.0, 347.0, 305.0, 336.0, 340.0, 318.0, 362.0, 348.0, 363.0, 435.0, 491.0, 505.0, 404.0, 359.0, 310.0, 337.0, 360.0, 342.0, 406.0, 396.0, 420.0, 472.0, 548.0, 559.0, 463.0, 407.0, 362.0, 405.0, 417.0, 391.0, 419.0, 461.0, 472.0, 535.0, 622.0, 606.0, 508.0, 461.0, 390.0, 432.0]
==================

==== prediction =====
[442.40707, 442.26746, 473.85284, 503.2582, 544.3242, 631.8106, 703.4895, 682.01056, 584.2869, 496.65305, 458.26706, 486.83524]
================================

==== Quantile 50% =====
[441.90524, 438.94196, 475.58868, 504.77777, 546.2015, 631.03503, 703.1645, 684.6702, 585.4043, 497.68643, 456.81332, 487.01727]
================================

==== Quantile 90% =====
[470.60236, 472.7702, 506.5765, 537.26117, 573.006, 667.6558, 739.82074, 709.30133, 611.8065, 520.3827, 486.74255, 515.27356]
================================

The target holds the real values (the past) and should be plotted from the start of the graph up to its last point; the prediction starts one point after that. The prediction is the array of mean predictions, which forms the main line, and then there are the two quantiles.

Python does this, and the output graph looks like this:

https://d2kv9n23y3w0pn.cloudfront.net/static/README/forecasts.png

It doesn't need to look the same. Also note that that graph shows 3 predictions; in my case a single prediction is enough.

:)

HanSolo self-assigned this Apr 11, 2024

Owner

HanSolo commented Apr 11, 2024

Short question, where is the time in the data?

Author

lordofthejars commented Apr 11, 2024

Oh, the start time is a YearMonth object for the first value of the series, and then there is a String that represents the frequency. For example, YearMonth 01-1964 and freq M means that consecutive values are one month apart. Of course the freq can be seconds, minutes, hours, days, months, ...

So if you want to keep it generic, I would suggest not using the YearMonth object but the Temporal interface, calling its plus method with a TemporalUnit chosen from the freq string.
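
A minimal sketch of that idea, assuming a small lookup from the freq string to a ChronoUnit. Only "M" (monthly) appears in this thread; the other codes, the class name SeriesTimeline, and the method names are hypothetical and not part of the library.

```java
import java.time.YearMonth;
import java.time.temporal.ChronoUnit;
import java.time.temporal.Temporal;
import java.time.temporal.TemporalUnit;
import java.util.ArrayList;
import java.util.List;

class SeriesTimeline {

    // Assumed mapping from a freq string to a unit; only "M" is confirmed by the thread.
    static TemporalUnit unitFor(String freq) {
        return switch (freq) {
            case "S" -> ChronoUnit.SECONDS; // assumption
            case "H" -> ChronoUnit.HOURS;   // assumption
            case "D" -> ChronoUnit.DAYS;    // assumption
            case "M" -> ChronoUnit.MONTHS;  // from the thread: one month between values
            default -> throw new IllegalArgumentException("Unsupported freq: " + freq);
        };
    }

    // Builds one timestamp per value: start, start + 1 step, start + 2 steps, ...
    static List<Temporal> timeline(Temporal start, String freq, int count) {
        TemporalUnit unit = unitFor(freq);
        List<Temporal> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            result.add(start.plus(i, unit));
        }
        return result;
    }

    public static void main(String[] args) {
        // Three monthly steps starting at 1964-01
        System.out.println(timeline(YearMonth.of(1964, 1), "M", 3));
    }
}
```

Note that the start type has to support the chosen unit: a YearMonth works with months but would reject days or seconds, so a finer freq would need a start such as LocalDateTime.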

Owner

HanSolo commented Apr 11, 2024

Hmm... OK, I was playing around with it and here is a first test. The question is whether I'm doing it right. At the moment, the area defined by Quantile 50 and Quantile 90 is calculated as the prediction y value +/- the Quantile 50 y value, and the same for the Quantile 90 data. This might be wrong...
[Screenshot: Xnip2024-04-11_13-46-56]

Author

lordofthejars commented Apr 11, 2024

I think it is right. I compared this graph with the one I sent, which uses the same dataset, and the green part looks similar. Note that the quantile function returns what I showed you: absolute values, not offsets relative to the mean.

The array I pasted before comes from calling forecast.quantile(0.90f).
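
Since the quantile arrays hold absolute values, one way to read them is that the shaded area runs directly between two quantile arrays, with nothing added to or subtracted from the mean. The sketch below illustrates that reading only; QuantileBand, Band, and between are hypothetical names, not the library's API, and the sample values are the first three entries of each quantile array above.

```java
class QuantileBand {

    // Lower and upper edge of a shaded area, one value per time step.
    record Band(double[] lower, double[] upper) {}

    // Takes two arrays of absolute quantile values and returns the band between them.
    static Band between(double[] quantileA, double[] quantileB) {
        if (quantileA.length != quantileB.length) {
            throw new IllegalArgumentException("Quantile arrays must have the same length");
        }
        double[] lower = new double[quantileA.length];
        double[] upper = new double[quantileA.length];
        for (int i = 0; i < quantileA.length; i++) {
            lower[i] = Math.min(quantileA[i], quantileB[i]);
            upper[i] = Math.max(quantileA[i], quantileB[i]);
        }
        return new Band(lower, upper);
    }

    public static void main(String[] args) {
        double[] q50 = {441.90524, 438.94196, 475.58868};
        double[] q90 = {470.60236, 472.7702, 506.5765};
        Band band = between(q50, q90);
        System.out.println(band.lower()[0] + " .. " + band.upper()[0]);
    }
}
```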

Owner

HanSolo commented Apr 11, 2024

One more question: how do I figure out the time of the prediction? It is just a set of numbers. At the moment I just moved the data manually to overlay the prediction.

Author

lordofthejars commented Apr 11, 2024

It uses the same time step as the original set. If target data1 and data2 are 1 month apart, then pred1 and pred2 are also 1 month apart. So in this case, 12 numbers in the prediction means 12 months.

Owner

HanSolo commented Apr 12, 2024

Could you provide the data for all 3 predictions with their start timestamps and the time step between the data points, and also the start timestamp for the target data? I would like to recreate the complete chart. 😁

Author

lordofthejars commented Apr 12, 2024

Initial date for target: 1949-01

Target values: 112.0, 118.0, 132.0, 129.0, 121.0, 135.0, 148.0, 148.0, 136.0, 119.0, 104.0, 118.0, 115.0, 126.0, 141.0, 135.0, 125.0, 149.0, 170.0, 170.0, 158.0, 133.0, 114.0, 140.0, 145.0, 150.0, 178.0, 163.0, 172.0, 178.0, 199.0, 199.0, 184.0, 162.0, 146.0, 166.0, 171.0, 180.0, 193.0, 181.0, 183.0, 218.0, 230.0, 242.0, 209.0, 191.0, 172.0, 194.0, 196.0, 196.0, 236.0, 235.0, 229.0, 243.0, 264.0, 272.0, 237.0, 211.0, 180.0, 201.0, 204.0, 188.0, 235.0, 227.0, 234.0, 264.0, 302.0, 293.0, 259.0, 229.0, 203.0, 229.0, 242.0, 233.0, 267.0, 269.0, 270.0, 315.0, 364.0, 347.0, 312.0, 274.0, 237.0, 278.0, 284.0, 277.0, 317.0, 313.0, 318.0, 374.0, 413.0, 405.0, 355.0, 306.0, 271.0, 306.0, 315.0, 301.0, 356.0, 348.0, 355.0, 422.0, 465.0, 467.0, 404.0, 347.0, 305.0, 336.0, 340.0, 318.0, 362.0, 348.0, 363.0, 435.0, 491.0, 505.0, 404.0, 359.0, 310.0, 337.0, 360.0, 342.0, 406.0, 396.0, 420.0, 472.0, 548.0, 559.0, 463.0, 407.0, 362.0, 405.0, 417.0, 391.0, 419.0, 461.0, 472.0, 535.0, 622.0, 606.0, 508.0, 461.0, 390.0, 432.0

Initial date of prediction: 1961-01

Quantiles:

==== Quantile 50% =====
[441.90524, 438.94196, 475.58868, 504.77777, 546.2015, 631.03503, 703.1645, 684.6702, 585.4043, 497.68643, 456.81332, 487.01727]

==== Quantile 90% =====
[470.60236, 472.7702, 506.5765, 537.26117, 573.006, 667.6558, 739.82074, 709.30133, 611.8065, 520.3827, 486.74255, 515.27356]

Freq: M
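
As a quick check of how these numbers line up: the target list has 144 monthly values starting at 1949-01, so the step right after the last target value is 1961-01, which matches the prediction start given above. A small sketch of that arithmetic, where PredictionAlignment and predictionStart are hypothetical names used only for illustration:

```java
import java.time.YearMonth;

class PredictionAlignment {

    // The prediction begins one step after the last target value,
    // i.e. targetLength monthly steps after the target start.
    static YearMonth predictionStart(YearMonth targetStart, int targetLength) {
        return targetStart.plusMonths(targetLength);
    }

    public static void main(String[] args) {
        YearMonth targetStart = YearMonth.of(1949, 1);
        int targetLength = 144; // number of target values listed above
        System.out.println(predictionStart(targetStart, targetLength));
    }
}
```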

Owner

HanSolo commented Apr 12, 2024

I will implement predictions as overlays on a line chart, meaning you will be able to add multiple predictions to an existing line chart.

Author

lordofthejars commented Apr 12, 2024 via email

Author

lordofthejars commented Apr 12, 2024 via email

Owner

HanSolo commented Apr 13, 2024

Made some progress. Could you also provide the data for the other 2 predictions (the green and blue ones) in the original chart? This is how it looks at the moment.
[Screenshot: Xnip2024-04-13_10-55-03]
You will find the demo in the tests on the jdk21 branch, in TimeSeriesPredictionTest.java, and you can run it via TimeSeriesPredictionTestLauncher.java.

Author

lordofthejars commented Apr 13, 2024

Oh, looks great. I don't have them, as my prediction was for only 12 months and the model I am using is trained only for 12 months. If you want to try with more, I'd suggest copying and pasting the previous ones. I know it doesn't make much sense from an AI point of view, but for display I think it is OK.
