Timeseries models derived from generative graph #642
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
I think this one is ready for a first review round. I have not been able to make the scan reference work with {func} |
View / edit / reply to this conversation on ReviewNB AlexAndorra commented on 2024-03-11T18:53:02Z
ricardoV94 commented on 2024-03-13T09:58:34Z
Agree with Alex. The main motivation is that this framework allows you to define many arbitrary time series, not just those that come pre-packaged in PyMC. For the AR example, one could for instance add a different noise distribution (StudentT) or covariates that change over time...
ricardoV94 commented on 2024-03-13T10:07:00Z
Maybe add a second more complex example, either MA2 https://gist.github.com/ricardoV94/a49b2cc1cf0f32a5f6dc31d6856ccb63#file-pymc_timeseries_ma-ipynb or one of those Jesse wrote here https://gist.github.com/jessegrabowski/ccda08b8a758f882f5794b8b89ace07a ?
jessegrabowski commented on 2024-03-13T10:28:06Z
I actually disagree, I think an AR(2) is a fine choice. I was going to put suggestions for other models here (ARIMA-GARCH or ETS), but I actually think it's better to keep this notebook really simple and focus on the machinery, which is quite complex.
ricardoV94 commented on 2024-03-13T10:51:14Z
Showing a non-recursive time-varying parameter could be useful though? Can we split into two separate notebooks?
jessegrabowski commented on 2024-03-13T10:54:36Z
I think that's a good 2nd example, because it also serves as a tutorial on the difference between
Even if it's not a time-varying parameter, maybe an example that shows how to combine an exogenous regression with an AR model, so you're just scanning in some covariate data and doing a linear model with AR-distributed errors.
juanitorduz commented on 2024-05-06T12:17:35Z
"Maybe add a second more complex example, either MA2?"
I suggest we keep this notebook simple and work out other more complex examples in a different notebook (I can also work on it). In my experience, the first time a user sees these models it can be overwhelming, so let's keep it simple for this one :D |
View / edit / reply to this conversation on ReviewNB AlexAndorra commented on 2024-03-11T18:53:03Z
ricardoV94 commented on 2024-03-13T10:01:31Z
Re: collect_default_updates, it tells PyMC that the RV in the generative graph should be updated in every iteration of the loop. Agree with adding more context on how Scan is defined and linking to the PyTensor docs for a deeper dive: https://pytensor.readthedocs.io/en/latest/library/scan.html
juanitorduz commented on 2024-05-06T13:34:35Z
Added more info! |
View / edit / reply to this conversation on ReviewNB
AlexAndorra commented on 2024-03-11T18:53:04Z
Explain why you set all the observed data nodes to 0
jessegrabowski commented on 2024-03-13T10:51:13Z
Why is there an observed value for the initial condition? We never observe this by definition.
ricardoV94 commented on 2024-03-13T12:50:51Z
I don't see why you can't observe it?
jessegrabowski commented on 2024-03-13T13:41:37Z
Because the first observation in the data is
The way this model is written, it assumes that the first observations of the data are generated by some arbitrary normal distribution, which then go on to spontaneously kick off an autoregressive process that describes the rest of the data. This isn't logical. The correct definition of the model should consider all observed data as part of the autoregressive process.
juanitorduz commented on 2024-05-06T13:35:35Z
@jessegrabowski I used the code you shared on Discourse and that is why I added you as a co-author :) |
View / edit / reply to this conversation on ReviewNB
AlexAndorra commented on 2024-03-11T18:53:05Z
Can't we get rid of the
ricardoV94 commented on 2024-03-13T10:03:25Z
Agree. Let me know if something breaks
jessegrabowski commented on 2024-03-13T13:46:49Z
I blatantly plagiarized this notebook and used
juanitorduz commented on 2024-05-06T13:35:51Z
Fixed in 67ec83d |
View / edit / reply to this conversation on ReviewNB
AlexAndorra commented on 2024-03-11T18:53:06Z
"we see the model is capturing the global dynamics of the time series. In order to have a
jessegrabowski commented on 2024-03-13T10:24:47Z
I think a discussion of conditional and unconditional posteriors is needed here. Many users will be surprised by this posterior because they are used to seeing conditional one-step forecasts,
At the risk of scope creep, I think it's also important to show users how to use a predictive model to get the conditional posterior. It would also be the first place in pymc-examples that shows how to use a predictive model -- up until now we only have the labs blog. |
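To make the conditional vs. unconditional distinction concrete, here is a minimal plain-Python sketch (standard library only, not the PyMC predictive model discussed in the thread). The function names, and the convention that `rho[0]` multiplies the lag-1 value, are assumptions made for illustration. The unconditional path feeds on its own simulated past, while the conditional one-step forecast feeds on the observed past at every step.

```python
import random


def simulate_ar2(rho, sigma, init, n_steps, rng):
    """Unconditional simulation: every step uses the *simulated* past."""
    path = list(init)
    for _ in range(n_steps):
        mean = rho[0] * path[-1] + rho[1] * path[-2]
        path.append(mean + rng.gauss(0.0, sigma))
    return path[len(init):]


def one_step_means(rho, observed):
    """Conditional one-step forecast means: every step uses the *observed* past."""
    return [
        rho[0] * observed[t - 1] + rho[1] * observed[t - 2]
        for t in range(2, len(observed))
    ]


rng = random.Random(42)
rho = (0.5, 0.25)  # hypothetical AR(2) coefficients, lag-1 first
observed = [0.0, 1.0] + simulate_ar2(rho, 0.5, [0.0, 1.0], 50, rng)

# Only shares the two initial values with the data, then evolves freely.
unconditional = simulate_ar2(rho, 0.5, observed[:2], 50, rng)
# Re-anchored on the data at every step, so it tracks the series closely.
conditional = one_step_means(rho, observed)
```

Under these assumptions, the unconditional draws only reproduce the global dynamics of the series, whereas the conditional forecasts stay close to each observation, which is the behavior many users expect to see.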
Love it @juanitorduz ! Added some comments in ReviewNB for the next round
May be a good excuse to use |
Thank you all for the feedback! Now that the HSGP example was merged I can come back to this one (there are no dependencies but I was busy with other stuff 🫠). Apologies for the delay 🙈😄. |
Awesome! |
Thanks! Done |
I added some explanation. |
Done! |
Done! Thanks for the reference! |
Added! |
yes! |
yes! |
I think as the conditional step has |
Any suggested values? Changing the seed does not change that much and the rho values go very close to zero many times :D |
@OriolAbril, I am able to add links to the classes, but the functions and methods won't work for some reason ... do you have any tip :) Concretely, |
I don't know if this can be recovered, but perhaps worth a shot? https://colab.research.google.com/drive/1yLrxTBRPa08B8EIEh6NGWG_aLFxIbanh?usp=sharing |
Does that matter? |
You could pick parameters that are 1) strongly persistent, and 2) give imaginary eigenvalues and generate oscillating trajectories. For example, rho_1 = 0.99, rho_2 = -0.99/4 |
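A quick way to check a candidate pair of coefficients is to compute the roots of the AR(2) characteristic polynomial. The sketch below is a standard-library illustration, assuming the convention x_t = rho_1 * x_{t-1} + rho_2 * x_{t-2} + noise; the helper name `ar2_eigenvalues` is hypothetical.

```python
import cmath


def ar2_eigenvalues(rho_1, rho_2):
    """Roots of lambda**2 - rho_1 * lambda - rho_2 = 0 (AR(2) companion matrix)."""
    disc = cmath.sqrt(rho_1**2 + 4 * rho_2)
    return (rho_1 + disc) / 2, (rho_1 - disc) / 2


lam_1, lam_2 = ar2_eigenvalues(0.99, -0.99 / 4)

# A non-zero imaginary part means the trajectories oscillate.
is_oscillating = lam_1.imag != 0
# All eigenvalues inside the unit circle means the process is stationary.
is_stationary = max(abs(lam_1), abs(lam_2)) < 1
# The modulus controls how quickly the oscillation decays.
modulus = abs(lam_1)
```

For the suggested values the discriminant rho_1**2 + 4 * rho_2 is negative, so the roots form a complex-conjugate pair; the modulus is worth inspecting alongside it when judging how persistent the resulting series will be.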
@juanitorduz These seem particular problems. pytensor.scan is explicitly not indexed within the pytensor docs as a function: https://github.com/pymc-devs/pytensor/blob/main/doc/library/scan.rst?plain=1#L679. My guess from a quick look at the blame is theano probably had It looks like the main alternative right now is using For Here you should leave the reference correctly added and then it needs to be fixed on pymc end (once that is done, regenerating the examples will fix the issue without any extra work). Regarding fixing on the pymc docs, do we really expect users to use If not, use the actual import path in the reference, and when fixing it document it where the users are expected to import it from (in this case If yes, then the notebook would need to be updated to use that. |
Thank you very much @OriolAbril ! (and apologies for the late reply, these last two weeks have been hectic 🫠) |
@juanitorduz let us know if you need anything else from us at the moment |
Apologies @ricardoV94 🙈. I just need some time... Things are busy at this stage 🫠. Thanks for checking in! This PR is still on my to-do list. I think the main content is there: I just need to address the two open points:
|
View / edit / reply to this conversation on ReviewNB
jessegrabowski commented on 2024-07-28T05:49:25Z
Line #2. trials = 100 # Time series length
Poke to push this over the finish line, but also a real point:
I really dislike that the time series length is a global variable hidden away here. It seems to me like it should be inferred from either the shape of the observed data, or the requested size/shape of the
    def ar_dist(ar_init, rho, sigma, size):
        ar_innov, _ = pytensor.scan(
            fn=ar_step,
            outputs_info=[{"initial": ar_init, "taps": range(-lags, 0)}],
            non_sequences=[rho, sigma],
            n_steps=size[0],
            strict=True,
        )
        return ar_innov
But this no longer works. I made this comment because I came back to this code for a project, and I was getting an error about
If you insist on using a global for |
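The idea of deriving the number of steps from `size` rather than from a global can be illustrated without PyTensor. The following is a toy, plain-Python stand-in for a scan with taps: `toy_scan` is a hypothetical helper and not PyTensor's API, the noise term is dropped so the result is deterministic, and the argument order of `ar_step` (oldest lag first, then the non-sequences) is an assumption chosen to mirror the `taps=range(-lags, 0)` pattern in the snippet above.

```python
def toy_scan(fn, initial, taps, non_sequences, n_steps):
    """Toy stand-in for a scan over one recurrent output (NOT PyTensor's API)."""
    history = list(initial)
    for _ in range(n_steps):
        # Negative taps index from the end of the history, oldest lag first.
        lagged = [history[tap] for tap in taps]
        history.append(fn(*lagged, *non_sequences))
    return history[len(initial):]


def ar_step(x_tm2, x_tm1, rho_1, rho_2):
    """Deterministic AR(2) mean update; the innovation term is omitted here."""
    return rho_1 * x_tm1 + rho_2 * x_tm2


def ar_mean_path(ar_init, rho, size):
    """The lag count comes from ar_init and the length from size: no globals."""
    lags = len(ar_init)
    return toy_scan(ar_step, ar_init, range(-lags, 0), rho, n_steps=size[0])


path = ar_mean_path([1.0, 2.0], (0.5, 0.25), size=(3,))
```

Here both the number of lags and the series length are read from the function's own arguments, which is the design the comment argues for.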
Next week I'll have the time to work on this one 🙏 |
@ricardoV94 @jessegrabowski Sorry for the late reply, but I am particularly busy (not just work, but kids and moving and...). Anyway, I cleaned up and updated the PyMC version (5.16). Regarding the open points:
@ricardoV94 Making the series smoother: I could use @jessegrabowski 's suggestion of setting the values for rho directly. But then I would need to generate the series again instead of using the prior samples, which I believe was the idea of the initial gist. This, again, would require more work for the first iteration. All in all, I think this notebook is in a very good state for the first iteration. I suggest we merge this one, and then we can take over additional enhancements in future PRs. This is very cool material and needs to be exposed to the users ;) WDYT? (Again, apologies for the time constraints to address all the comments in this iteration 🙏 ) |
@juanitorduz sounds good to me |
Let's make sure we advertise this one. Do you want to post it @juanitorduz? |
I will do so! |
As discussed with @ricardoV94 I will port the gist https://gist.github.com/ricardoV94/a49b2cc1cf0f32a5f6dc31d6856ccb63#file-pymc_timeseries_ma-ipynb into the PyMC Example Gallery. I will add text and explanation to the existing working code :)
📚 Documentation preview 📚: https://pymc-examples--642.org.readthedocs.build/en/642/