Best practices: adding Dataset Publishing guidelines and Practice Recommendations for all files #406

Merged (3 commits) on Nov 16, 2023


Sergiodero
Collaborator

Problem

MobilityData has heard a number of pain points from the community about the GTFS best practices and the spec’s SHOULD statements living in two different places:

  • Producers do not always refer to the best practices, and so moving these into the official spec would give the best practices greater visibility and improve data quality for everyone
  • Merging the best practices in the spec would make it easier for regulators to point producers to one place to get the information they need to create their GTFS feeds

Proposed solution

This proposal focuses on adding the Dataset Publishing & General Practice guidelines, and Practice Recommendations for all files into the GTFS specification's reference file. This represents the second phase of the merging of Best Practices into the GTFS specification as outlined in issue #396.

The incorporation of Dataset Publishing & General Practice guidelines would include an update of the text linking to the Google transitfeed tool merge function, so it references a list of merge tools instead. This content is proposed to be inserted in the reference document as a separate section before the current Field Definitions section to provide greater visibility to this information. Regarding the Practice Recommendations for all files, this content would be incorporated under the File Requirements section matching its bullet point format.

For both sections, minor editorial changes have been made to conform to RFC 2119 without affecting the severity of any statements (as recommendations, they would all remain as SHOULD statements).

As this change focuses only on moving the Best Practices content in its current state into the spec, it is suggested that any potential revisions and improvements to these best practices be discussed in a separate conversation if there is interest, ideally after the content has been merged into the specification.

Incorporating the following Best Practices into the GTFS Reference document:

- Dataset Publishing & General Practice guidelines
- Practice Recommendations for all files
Moving Dataset Publishing & General Practice guidelines before Field Definitions section.

Merging File Recommendations with File Requirements section.
@google-cla

google-cla bot commented Oct 24, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@e-lo

e-lo commented Oct 24, 2023

This looks great, @Sergiodero - thank you for operationalizing this need. 🎉

@skinkie
Contributor

skinkie commented Oct 24, 2023

What I am honestly missing is more information on feed_info.txt, specifically regarding feed_version, feed_start_date, and feed_end_date.

@Sergiodero
Collaborator Author

Thanks @skinkie, as mentioned in the initial post these changes are only focusing on moving the existing best practices into the spec as they currently are (see issue #396 for further reference).

I would recommend discussing any potential changes to the content that is being moved as a second step, once it is incorporated in the reference document. Additionally, to propose any new best practices, I would suggest opening a new issue to discuss them separately.

Please note that we're in the process of consolidating outstanding issues and PRs from MobilityData's best practices repos for Schedule and Realtime and moving them to the Google/transit repo as the Best Practices content is expected to become part of the spec soon.

@emmambd added the "Change: Best Practice" label (changes focusing on recommendations for optimal use of the specification) on Nov 1, 2023
@Sergiodero
Collaborator Author

Since there are no additional comments and considering the engagement and interest in the phasing plan outlined in issue #396 and the implementation of the previous phase (PR #386), I'm opening the vote on this proposal.

This PR adds the Dataset Publishing & General Practice guidelines and Practice Recommendations for all files to the spec. These guidelines are taken from GTFS Best Practices.

Please vote with a +1 (for) or -1 (against) in the comments. Voting ends on 2023-11-15 at 23:59:59 UTC.

@doconnoronca

TransSee +1
Looking forward to suggesting improvements once this is incorporated.

@skinkie
Contributor

skinkie commented Nov 1, 2023

+1 (OpenGeo)

@drewda

drewda commented Nov 1, 2023

+1 from @interline-io

@e-lo

e-lo commented Nov 1, 2023

➕ Woohoo! +1 from me (UrbanLabs)

@leonardehrenfried
Contributor

+1 OpenTripPlanner

@westontrillium
Contributor

+1 from Trillium

@bdferris-v2
Collaborator

+1 from Google

@Transnnovation-GTFSMgr

+1, with the following consideration for those who have tested the merge feed function described at https://github.com/google/transitfeed/wiki/Merge (from ~2014). I had success in the ~2008-2012 timeframe with merge.exe (a drag-and-drop Windows XP utility, though I can't fully recall). I attempted it again more recently, with some frustration, and gave up. If merge is working well for everyone, green light from me; otherwise, perhaps don't reference it as a "best practice". GTFS service_id can work to model current and future route modifications within a dataset.

Also, I'd highly recommend, as a best practice, that all URLs in the feed do not fail. I had suggested that the MD feed validator test all URLs (perhaps I could request that at least agency_url is tested during validation and does not fail). Since there can be 00s of redirects with an exponential number of broken links, we can't use the validator (it times out). Still, it may be a best practice for the spec to reaffirm that ALL URLs (not just agency_url) must be fully tested, as often as possible, and must not fail.

@Sergiodero
Collaborator Author

Thanks @Transnnovation-GTFSMgr! Regarding the use of Google’s Merge tool, with this change we’re moving away from referring users to only this specific tool, and instead referencing a set of different merging tools listed on GTFS.org. This list can be updated in the GTFS.org repo as new tools are developed and added.

Similarly, other new best practices can be added after this PR, including additional best practices for providing URLs. For now, the purpose of this change is to migrate existing content from the BP document into the specification.

Please note that we're in the process of consolidating outstanding issues and PRs from MobilityData's best practices repos for Schedule and Realtime and moving them to the Google/transit repo, as the Best Practices content is expected to become part of the spec soon.

@evansiroky
Contributor

+1 Caltrans. We will create a new issue for a new suggestion regarding URL best practices.

@philip-cline

+1 from Arcadis IBI Group

@Sergiodero
Collaborator Author

The vote passed on 2023-11-15 at 23:59:59 UTC with 10 votes in favor and no votes against.

The votes came from:

Thanks to everyone who participated! After merging these changes, a new PR will soon be opened in the Best Practices repo to remove the content that has been integrated into the reference document.

You can follow the progress of the best practices migration into the specification here.

Updated revision date to Nov 16, 2023