Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

updated data_container.ipynb to reflect changes to pm.Data containers in pymc v5.16 #676

Merged
merged 5 commits into from
Aug 5, 2024

Conversation

Dekermanjian
Copy link
Contributor

@Dekermanjian Dekermanjian commented Jul 28, 2024

related to #659

The data container notebook is out of date. It uses the deprecated pm.ConstantData and pm.MutableData methods.
The changes I made were:

  • Update the introduction to reflect the change to using pm.Data but keeping a section that refers to the deprecated pm.MutableData and pm.ConstantData in case a new user sees that in example code elsewhere.
  • Replace all instances of pm.ConstantData and pm.MutableData with pm.Data in the code and in the markdown text
  • Replace the instance of mutable coords with coords to reflect that all coordinates are now mutable and explain that fact in the preceding markdown text
  • I removed some text that no longer applied like checking the type of data container twice previously for each type of data container
  • I made a couple of typo corrections

📚 Documentation preview 📚: https://pymc-examples--676.org.readthedocs.build/en/676/

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@@ -58,19 +58,17 @@ This notebook will illustrate each of these benefits in turn, and show you the b

+++

## Types of Data Containers
## Current and past states of Data Containers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this whole section can be deleted and maybe replaced with an info box of your last comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I will make the change.

@@ -99,20 +97,20 @@ Furthermore, inside `idata`, PyMC has automatically saved the observed (endogeno
idata.observed_data
```

In this next model, we create a `pm.ConstantData` containers to hold the observations, and pass this container to the `observed`. We also make a `pm.ConstantData` container to hold the `x` data:
In this next model, we create a `pm.Data` container to hold the observations, and pass this container to the `observed`. We also make a `pm.Data` container to hold the `x` data:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In this next model, we create a `pm.Data` container to hold the observations, and pass this container to the `observed`. We also make a `pm.Data` container to hold the `x` data:
In this next model, we create a {class}`pm.Data` container to hold the observations, and pass this container to the `observed`. We also make a {class}`pm.Data` container to hold the `x` data:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change everywhere we use pm.Data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for clarifying that for me. I was not sure if the {class} was only supposed to appear the first time we introduce the class. I will make the changes to fix it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be one of {class}`pymc.Data` (will render pymc.Data as link) or {class}`~pymc.Data` (will render only Data as link). Sphinx doesn't know about pm. I prefer the 1st one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@twiecki @OriolAbril Would it be preferable to change all pm.Data to {class}pymc.Data or is it preferred that I go back to keeping it as pm.Data without {class}? At the start of the notebook, there is a reference to {class}pymc.Data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally prefer pymc.Data as a link, but I don't really care much either way

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I went ahead and changed it so that it is {class} pymc.Data everywhere in the markdown text.

beta = pm.Normal("beta")
mu = pm.Deterministic("mu", beta * x_data)
sigma = pm.Exponential("sigma", 1)
obs = pm.Normal("obs", mu=mu, sigma=sigma, observed=y_data)
idata = pm.sample(random_seed=RANDOM_SEED)
```

Because we used a `pm.ConstantData` container, the data now appears on our probabilistic graph. It is downstream from `obs` (since the `obs` variable "causes" the data), shaded in gray (because it is observed), and has a special rounded square shape to emphasize that it is data. We also see that `x_data` has been added to the graph.
Because we used a `pm.Data` container, the data now appears on our probabilistic graph. It is downstream from `obs` (since the `obs` variable "causes" the data), shaded in gray (because it is observed), and has a special rounded square shape to emphasize that it is data. We also see that `x_data` has been added to the graph.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Because we used a `pm.Data` container, the data now appears on our probabilistic graph. It is downstream from `obs` (since the `obs` variable "causes" the data), shaded in gray (because it is observed), and has a special rounded square shape to emphasize that it is data. We also see that `x_data` has been added to the graph.
Because we used a `pm.Data` container, the data now appears in our probabilistic graph. It is downstream from `obs` (since the `obs` variable "causes" the data), shaded in gray (because it is observed), and has a special rounded square shape to emphasize that it is data. We also see that `x_data` has been added to the graph.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I missed that. Thank you, for catching it. I will fix it.

@OriolAbril OriolAbril linked an issue Aug 1, 2024 that may be closed by this pull request
@OriolAbril OriolAbril mentioned this pull request Aug 1, 2024
3 tasks

In past versions of PyMC, the only data container was `pm.Data`. This container is still available for backwards compatability, but the current best practice is to use either `pm.MutableData` or `pm.ConstantData`.
:::{important}
In past versions of PyMC, there were two types of data containers {func}`pymc.MutableData` and {func}`pymc.ConstantData` these are now deprecated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In past versions of PyMC, there were two types of data containers {func}`pymc.MutableData` and {func}`pymc.ConstantData` these are now deprecated.
In past versions of PyMC, there were two types of data containers {func}`pymc.MutableData` and {func}`pymc.ConstantData`. These have been deprecated as all data containers are mutable now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, for the review @twiecki. I have made the change that you have suggested.

@twiecki twiecki merged commit 0a2ea64 into pymc-devs:main Aug 5, 2024
2 checks passed
@twiecki
Copy link
Member

twiecki commented Aug 5, 2024

Thanks @Dekermanjian!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update data container notebook to reflect changes to pm.Data
3 participants