Example Notebook Clarifications/Suggestions #57

Closed
rwegener2 opened this issue Nov 23, 2024 · 2 comments
Labels
documentation (Improvements or additions to documentation), enhancement (New feature or request)

Comments

@rwegener2
Contributor

The example notebook is really helpful for learning about the tool! A few suggestions to make it even clearer:

  1. When using the api.radial_search() method I had to trigger an error message to find out which options the units parameter accepts. A sentence listing the three options and what they correspond to would be great.
  2. In the "Obtain measurements over a given time range" section there is the following sentence: "The returned pd.DataFrame will match the format explored earlier, with one key difference. The station_id is now included as a column in addition to the other features.". It looks like 'station' is a level of a multi-index, not a column. If that's true, this sentence can be updated.
  3. I like the paragraph in "Check the API's supported data formats" that gives some context about modes, since modes seem to be a level of organization unique to NDBC. Some pieces of info that I would add to this paragraph (assuming I've understood correctly) are: 1) the data variables that are available at a station depend on which modes the station is operating in, 2) a station can be operating in multiple modes simultaneously, and 3) you can't search by data variable, you can only search by mode. If there is a way for people to figure out which data variables correspond to each mode, that may also be helpful. Maybe consider a table displaying the information from the top of section 8 in the NDBC data guide, which gives a plain language description of the mode codes (ex. stdmet -> Standard Meteorological data).
  4. Your example notebook has a really nice flow, but it feels like it gets interrupted in the middle by the api.available_realtime() and api.available_historical() methods. As far as I can tell, those methods weren't important for getting data into pandas; they were bonus methods that generate access links to the raw data in case I wanted to download the files. I would suggest moving these methods to the end of the notebook to keep together the workflow that I would guess is most likely to be used (find a station -> access station metadata -> access station data).
  5. In the notebook as it is, I don't actually see how a user would figure out which modes are available at a station. The api.available_historical() method gives the plain language description of the mode, but not the code. I realized later that the api.get_modes() method accepts a station key. Do you think showing users how to use .get_modes() would be a more streamlined way of showing them how to get the mode codes for a station?
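
The distinction raised in point 2 can be checked directly in pandas. The sketch below is a minimal illustration with made-up values; the index level names `timestamp` and `station_id` and the column `WSPD` are assumptions for the example, not the library's confirmed output.

```python
import pandas as pd

# Hypothetical measurements for two stations; the values and names are
# illustrative only and are not taken from the NDBC API.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2024-01-01 00:00", "2024-01-01 01:00"]), ["tplm2", "41013"]],
    names=["timestamp", "station_id"],
)
df = pd.DataFrame({"WSPD": [5.1, 6.3, 4.8, 7.0]}, index=index)

# A MultiIndex level is not a column: it is listed in df.index.names,
# not in df.columns.
is_index_level = "station_id" in df.index.names
is_column = "station_id" in df.columns

# reset_index moves the chosen level into an ordinary column.
flat = df.reset_index(level="station_id")
```

If the notebook text is meant to describe a column, calling `reset_index` as above would make the sentence accurate; otherwise the sentence can simply refer to an index level.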

Feel free to disagree with me here; these are just some thoughts after reading through the example notebook with fresh eyes. Please don't hesitate to ask follow-up questions if anything is unclear!

openjournals/joss-reviews#7406

@CDJellen self-assigned this Nov 23, 2024
@CDJellen added the documentation and enhancement labels Nov 23, 2024
@CDJellen
Owner

Thank you again for your review and feedback @rwegener2; I've updated the overview notebook and some of the method docstrings based on these suggestions.

The inclusion of the available_historical and available_realtime methods in the overview notebook is intended to answer question 5 above. Without speculatively making an API call, the easiest way to determine what modes are available at a given station is to use one of these two methods for the station of interest based on the time period required. The responses do include the mode indicators, but only through URLs.

I believe that either adding a get_modes(station_id=...) method or refactoring the response of these two methods will help make this more intuitive and fit-for-purpose.

One of these two changes will be included as a feature in the next release, along with updates to the flags and keywords across some of the data retrieval methods and possibly the inclusion of station metadata in the opendap data response.

All other changes were made in #58. If you would prefer to see these changes expedited, please feel free to re-open this issue. Thank you again and have an excellent rest of your day.

@rwegener2
Contributor Author

All sounds good. I see you added a sentence explaining that the indicator is in the URL ("In the response above, the modes which are available for station "tplm2" are reported in plain text. Their indicators (stdmet, cwind) are also specified as the suffix of the URLS for the data files."), which is super helpful. I think one of the two more in-depth options you listed will really help your users in the next release!
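
Until one of those options lands, the indicators can be pulled out of the URLs by hand. The following is a rough sketch and not part of the library: the `response` dictionary, its keys, and the `example.invalid` URLs are placeholders, and the hypothetical helper `mode_from_url` assumes the indicator is the final dot-separated suffix of the file name, which may not hold for every data file.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def mode_from_url(url: str) -> str:
    """Return the text after the last dot in the URL's file name."""
    filename = PurePosixPath(urlparse(url).path).name
    # rpartition yields an empty separator when the name contains no dot.
    _, dot, suffix = filename.rpartition(".")
    return suffix if dot else ""


# Placeholder response: plain-text descriptions mapped to data file URLs.
# The real structure returned by the library may differ.
response = {
    "Standard Meteorological Data": "https://example.invalid/data/tplm2.stdmet",
    "Continuous Winds Data": "https://example.invalid/data/tplm2.cwind",
}

modes = {description: mode_from_url(url) for description, url in response.items()}
```

Here `modes` maps each plain-text description to the indicator parsed from its URL, which is the pairing a user would need before requesting data for a given mode.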
