Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Pyobis example notebook #115

Merged
merged 25 commits into from
Mar 24, 2023
Merged

Adding Pyobis example notebook #115

merged 25 commits into from
Mar 24, 2023

Conversation

MathewBiddle
Copy link
Contributor

closes #106

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@MathewBiddle MathewBiddle marked this pull request as ready for review March 15, 2023 17:52
@ocefpaf
Copy link
Member

ocefpaf commented Mar 15, 2023

pre-commit.ci autofix

Copy link
Member

@ocefpaf ocefpaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments:

  • In cell [5] you have a comment for the future. Maybe, to keep the notebook "clean," that could be an issue instead of a commented out cell.
  • The tables tables in the notebook are quite big. If they are only for a quick data check/demo, maybe consider using just a few rows like df_meta.head(5) instead of all 45 rows. (Same in cell 11).
  • You don't need plt.show() in a notebook but it doesn't hurt to leave it there.
  • There seems to be some commented out debug print statement, consider removing them to improve the notebook readability.
  • A comment that describes what is being done decreases readability instead of improving it I would remove the one at the top of cell 17 b/c the next cell is almost the same sentence in s highly legible code.
  • Everything after cell 18 is commented out. If we cannot run maybe consider removing it.

@ocefpaf
Copy link
Member

ocefpaf commented Mar 15, 2023

pre-commit.ci autofix

@MathewBiddle you'll have to do a git pull b/c I asked pre-commit to perform a fix automated fixes. If that doesn't work don't worry, just force push what you have and we can re-run pre-commit later.

@MathewBiddle
Copy link
Contributor Author

@ocefpaf I was hoping to only use pyobis for this notebook, but it wasn't quite ready yet. I made some work arounds and added some context. Let me know what you think.

At the end of the notebook, what I tried to do is repeat this figure [1] from [2]. But, the latitudes on the map don't match the latitudes on the histogram (so I did some fiddling). We've tried figuring it out before, but never quite finished it up (see last cell in [3].

[1] - image

[2] - https://github.com/iobis/pyobis/blob/main/notebooks/biodiversity_mapping.ipynb

[3] - https://github.com/MathewBiddle/bio_ice/blob/main/create_map_from_OBIS_API.ipynb

@MathewBiddle
Copy link
Contributor Author

@ocefpaf I think I addressed your comments.

@ocefpaf
Copy link
Member

ocefpaf commented Mar 24, 2023

Some minor comments:

  • the re, sys, and os modules are imported but unused.
  • in cell [5] you have df_meta.head(5) but the table is showing 45 rows. Maybe you should restart and run all to ensure the notebook is run from top to bottom in order.
  • import matplotlib.pyplot as plt and from datetime import datetime are repeated in the notebook. You can drop all that are after the first one. (Or move to the first cell for consistency).
  • cell [25] has some commented out code.
  • last cell is empry
  • suggest: change the line plt.title( "Total occurrence records contributed to OBIS from the US MBON network as of %s: %i" % (today, gdf.records.sum() ) ) to ax.set_title( f"Total occurrence records contributed to OBIS from the US MBON network as of {today}: {gdf.records.sum() }" (modern mpl and modern f-strings). Same in cell [11]

The notebook is great and is good to go IMO. When you are done on your side let me know so I can run pre-commit and conda-lock to update the packages in the CIs.

PS: In notebooks we break the rule of imports at the top and usually import them at first cell that uses them to help show its use and track what is being used and what was abandoned along the way.

@MathewBiddle
Copy link
Contributor Author

Thanks for taking a look!

cell [25] - The commented code is a search that would replace the for loop above, if the API responded as expected. Unfortunately it caps responses at 10k records and we can't figure out why (or a workaround). I wanted to leave it in so there was a reference as to how the package should work. In the end, I don't think it's all that difficult to recreate that line, if the API response is fixed. I'll remove.

This is a similar issue with cell[10] and cell[11]. Cell[10] is less code and should work, but the API limits responses to 10k. Cell[11] iterates dataset by dataset and concatenates them all together to give the true representation of points. So, my question to you is, do we leave both options in there? Or should we only show the one that gives the appropriate response?

I do like having the imports where they are actually being used. I'll work through making those changes. Although, it will be odd with the various pyobis calls throughout the notebook. I'll see how it looks.

@ocefpaf
Copy link
Member

ocefpaf commented Mar 24, 2023

So, my question to you is, do we leave both options in there? Or should we only show the one that gives the appropriate response?

I'd leave just the one that works, open an issue, and re-visit the notebook when/if we can the better option that doesn't work now.

Although, it will be odd with the various pyobis calls throughout the notebook.

B/c of the narrative in the notebook you "introduce" pyobis at the first use/import, so readers kind of expect it to show up again. At least that is how I read notebooks.

@ocefpaf
Copy link
Member

ocefpaf commented Mar 24, 2023

This line

f"Total aggregated occurrence records contributed to OBIS from the US MBON network as of {today}: {gdf_ghsh.records.sum() }"

in in cell [18] is failing. I belive it was a shape there instead of a records sum. I'll fix that in my copy here and push.

Also, the file US_MBON_bounding_boxes_20230315.geojson is not used and creating it is not too slow. We can demonstrate how to save it but no need to track it on GH IMO.

@MathewBiddle
Copy link
Contributor Author

MathewBiddle commented Mar 24, 2023

Also, the file US_MBON_bounding_boxes_20230315.geojson is not used and creating it is not too slow. We can demonstrate how to save it but no need to track it on GH IMO.

Agreed. my mistake on that one. I was using that geojson in the map presented at https://marinebon.github.io/map-of-activities/. See
https://github.com/marinebon/map-of-activities/blob/e97577145a7a2e54d7b166c756cf536e4d349193/lib/functions.R#L21

But I could easily port code to do that in the marinebon map-of-activities repo.

@MathewBiddle
Copy link
Contributor Author

I'm thinking we should remove cell 9 and the italicized text in the markdown above. That cell is only returning the first 10k entries and exemplifies the failing API response, not something we want to promote at the moment.

@ocefpaf
Copy link
Member

ocefpaf commented Mar 24, 2023

I'm thinking we should remove cell 9 and the italicized text in the markdown above. That cell is only returning the first 10k entries and exemplifies the failing API response, not something we want to promote at the moment.

Done!

@MathewBiddle
Copy link
Contributor Author

I'm okay with merging this! Thanks for all your help on this one!

@ocefpaf
Copy link
Member

ocefpaf commented Mar 24, 2023

I'm okay with merging this! Thanks for all your help on this one!

Just waiting the CIs to merge this one and #131.

@ocefpaf ocefpaf merged commit 883f50b into ioos:main Mar 24, 2023
@ocefpaf ocefpaf mentioned this pull request Mar 31, 2023
3 tasks
@MathewBiddle MathewBiddle deleted the pyobis branch August 31, 2023 13:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pyobis example notebook using known dataset ids
2 participants