Proposed Changes to Standardization Process #181
Comments
@SteVwonder Again, a very well thought out and written proposal 😄 I have no objection to any of the above - in fact, FWIW, I strongly endorse it.
I'd suggest being rigorous here - it isn't a significant burden and avoids getting into hair-splitting arguments.
We have a template RFC - it is a markdown file in the RFC repo. Easy to convert that to an issue/PR template. It seemed to work well for us - I'd suggest integrating that with your list, at least as a starting point.
What we have been doing is asking that it be a PR against the reference implementation (so it remains current with the standard). It gets merged when the standard's RFC is committed. In a perfect world, we would always have the merge done by someone other than the author. However, we also didn't want the standards repo to be open for commits by a wide audience, and so we restricted it to a couple of people. Since one of those was me, and I have been the most prolific RFC author, it made the "merge by someone else" somewhat impractical. We substituted a requirement that at least one other person "approve" using the GitHub "review" buttons. With a larger set of participants, I suspect we can make the "someone else" rule work now. We might also want to "loosen" the requirement about the PR being against the reference implementation - seems somewhat unfair to make people whose interest lies in a different implementation to have to contribute to the reference one. Perhaps we could instead ask that the PR be against some open source implementation (so we can see the code to help understand what is being proposed) combined with some output that helps understand the intended functionality? Dunno - probably worth you folks discussing it. |
Thanks for the quick feedback @rhc54!
Thanks 😄. Most of the credit goes to those present on the concall last week. I mainly just copied-and-pasted the notes from the discussion into an issue.
Excellent suggestions! I think the title and action fields are covered under the existing metadata that GitHub provides on PRs. I also assume a Copyright disclaimer is not necessary for a PR (please correct me if I'm wrong there). I've integrated the other fields into the PR template proposal in the original comment of this issue; as well as the open-source implementation suggestion.
👍 Those both seem reasonable to me.
We require them only for code, not for text going into the standard. However, we have required a "Signed-off-by" line on the PR itself just because it is going into a repo. No strong feelings on that point - just seemed a way to be consistent and help train good habits for people that might be contributing to code. |
BTW: if you'd like to add that template to the pmix-standards repo, please let me know and I'll add write permissions for you. Frankly, your proposal looks good enough I see no reason not to adopt the mechanics now. Even if you decide to modify it in the future, no harm done and (now that you encouraged me to go back and look at the RFCs) I think that RFC format really helped capture the theory of what was being proposed. The text for the standard doesn't necessarily provide a place for it as that is more focused on pedantics - i.e., this is the API and associated attributes, not the theory of operation behind its design. Guess I'm kinda thinking the "issue" filed on pmix-standard is where the RFC template goes, the pmix-standard PR is the corresponding text mod to the standard doc itself, and the PR against the code repository is the associated implementation (all appropriately "linked"). Does that make sense? Or is my fossil brain lost in the fluffy clouds? |
This captures very well what we discussed over the phone. Like it.
Yeah. That is a good point. I guess it depends on the situation. In this case, for example, there was already a proposal in mind when the issue was originally posted. So filling out an RFC-based issue template would be relatively straightforward. In another scenario, someone may have a use-case that they lay out in an issue but not have a solution to propose right away. Over the course of the discussion, a proposed change (or multiple competing changes) to the standard may emerge. How should we handle those types of issues, where the proposed change isn't known at submit time? Or the types of issues where multiple competing changes/proposals emerge?
Are the "signed-off-by" line and the GitHub PR review approval interchangeable, or do they serve distinct purposes?
Ok. I will submit a PR soon. The templates look to be version controlled files that can be placed in the repo by a PR, so I'll include those in the PR along with changes to the README.md and add a "Contributing" document. |
I would suggest we create a "use-case" label for the issue to indicate this is just a description of a desired behavior or problem
I would suggest creating an "RFC" label to be applied to each such proposed change, and add a comment "Ref #" to link it to the related use-case
They are distinct. The "signed-off-by" line has a very specific meaning derived from the Linux community's practice - it means that the author is signifying that this is their original work and they have the authority (from their company or whatever) to contribute it. The GitHub review is someone other than the author indicating that they approve of the proposed change. |
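To make the distinction concrete, here is a minimal sketch of the two checks side by side. It assumes the conventional `Signed-off-by: Name <email>` trailer (the form `git commit -s` appends); the function names and the idea of passing in a list of approver names are hypothetical, for illustration only, and not part of any PMIx tooling.

```python
import re

# Conventional trailer form: "Signed-off-by: Full Name <email>"
SIGNOFF = re.compile(
    r"^Signed-off-by: (?P<name>.+) <(?P<email>[^<>]+)>$", re.MULTILINE
)


def signoffs(commit_message):
    """Return the names found in Signed-off-by trailers of a commit message."""
    return [m.group("name") for m in SIGNOFF.finditer(commit_message)]


def satisfies_both(commit_message, author, approvers):
    """Hypothetical check: the author signed off AND someone else approved."""
    signed = author in signoffs(commit_message)          # author attests origin/authority
    reviewed = any(a != author for a in approvers)       # a different person approves
    return signed and reviewed


msg = "Fix typo in chapter 3\n\nSigned-off-by: Jane Doe <jane@example.org>\n"
print(signoffs(msg))
print(satisfies_both(msg, "Jane Doe", ["John Roe"]))
```

The point of keeping the two checks separate is that an author's own approval never counts as review, and a reviewer's approval never substitutes for the author's sign-off.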
I think this is looking good. So we are talking about having:
Then on the Wiki we can accumulate links to the Use Case GH issues. This is instead of having a wiki page dedicated to the use case. I like the idea of having the use case in an Issue since then it is open for discussion, whereas a wiki page does not have that option. We may want some language in the GH Issue template for Use Case authors letting them know that they need not fill out all of the RFC fields if they don't apply. Once the PR(s) associated with an RFC have been accepted then the RFC issue can be closed. The Use Case issue can be reopened if additional future work is needed (e.g., extensions to the model), but it might be more appropriate to open a new Use Case for that extension.
Yeah, I think that's all true. One minor suggestion: I'd have a "use-case" template that is separate from the "RFC" template. GH allows you to have multiple templates (IIRC) that the user can select from, so we can tailor the use-case one for that purpose.
On the teleconf today we discussed quorum and voting. Since it is part of the proposal for the standardization process it is best to capture it here for discussion. Note that all of the below is still under discussion. Please provide constructive feedback so we make sure this works well for the community. A few additional notes on the PR process:
Voting on a Pull Request
Voting Eligibility:
On the topic of quorum/voting, I'm wondering if we are making the process unnecessarily formal. I worry that we will be spending a lot of time on process tracking when we might not need to. In the prior RFC process, the PMIx community used a notion of "silence is lack of dissent" which might be a more straightforward way to move forward. The idea is that folks can discuss issues on the PR and mailing list. Any dissent must be addressed by the author. The teleconfs serve as synchronization points for the discussion. After two teleconfs (first 'reading' and second 'vote to accept') the proposal is accepted if the dissenting opinions are addressed. Any strong objection can hold a PR, but more often it is in the best interest of everyone if we can resolve those objections before accepting a PR. The presentation of PRs is well publicized so folks can chime in on the GitHub PR if they cannot attend the teleconf. |
Thanks @jjhursey for the summary of the discussion today. However we end up implementing this, I think we should strive for a system where the "critical functions" (e.g., voting, objecting to/discussing changes, proposing changes) can be performed "asynchronously" (i.e., you are not required to join a conference call or a face-to-face meeting). The more we can push the "critical functions" onto mediums like GitHub, the better for those that reside outside of the Americas. That is not to say we shouldn't leverage phone calls or face-to-face meetings; I think we should. They are high bandwidth and generally very productive, but their discussions and outcomes should make their way back onto GitHub/the mailing list.
This is a good point. I certainly don't want to be the one sending out emails to coordinate and collect votes, ensure people have met the criteria for being active and eligible to vote, etc. That process is probably necessary at some point down the road, but maybe not now.
This is probably controversial, so take the idea with a grain of salt, but IMO, we should try to keep as much of the discussion on one platform as possible, rather than fragmenting it across platforms. The mailing list is archived and searchable, which is great, but (for example) years from now, it would be arduous to reconstruct a discussion if it is split across GitHub and the mailing list. Maybe we leverage the mailing list for "broadcasts"/announcements, and GitHub issues/PRs for discussions/back-and-forths on particular topics? Just a thought.
Sounds reasonable to me. My only suggestion would be that in the case that a PR's original author cannot attend the concall, they can nominate someone else to "read" the PR on the concall and represent that change. Then the original author can address feedback/objections asynchronously (after they have been added to the GitHub PR by the representative or nominated recorder). |
I agree about the concern over making this too formal - I don't think we want to create a "PMIx Forum", at least at this stage.
Not controversial, IMO - I think we should confine discussion to the GitHub issues and PRs. I would suggest that the mailing list be used as you suggest.
Agreed - we should also allow that someone wishing to dissent in person but who cannot make the "accept" call can request that the PR be delayed until they can, subject to some reasonable time. For example, if someone is going on sabbatical for a year or taking a 3-month vacation, then they need to get someone else to represent them. Otherwise, dissents posted on the PR/issue can be addressed asynchronously there. |
In thinking about this some more over the weekend I am also leaning more towards the model of "silence is lack of dissent" vs a formal "PMIx Forum" model. We have decided to use the mailing list and GitHub to discuss topics so that the thought process is well archived. This also allows those that cannot attend the teleconferences to still actively participate in the process. Teleconferences are useful synchronization points that provide a high bandwidth discussion medium. We decided that the teleconf meeting notes will be archived on the wiki for those that cannot attend to review and discuss. A nice fallout of this PR is that decisions are made through the GitHub issues (or mailing list if necessary) so that everyone gets an opportunity to voice support/dissent and the discussion is archived. I think that this model allows for the broadest participation with the least amount of logistical overhead and maintains (maybe the most important aspect of) traceability of the discussion. I think it is fine to pick the GitHub service as the primary medium for communication, with the mailing list being for administrative announcements (e.g., meeting notices), or general project queries that don't fit inside a GitHub issue. With that in mind, let me see if I can outline this notion of participation as a proposal.
There was an idea brought up during the concall that I'd like to be part of the above: the PR must be in use by n>1 end users to qualify for a reading (obviously this applies to changes in substance rather than presentation). As a related project, I'd like to get the source code from several different projects that use pmix and see what parts of the standard are being used by the wider community. That's probably worth starting a separate thread, though. |
@rountree I'm not sure I quite understand - how can a PR be in use by an end user?? It won't be in master and won't be in a release branch, so how/why would an end user get ahold of it and implement to it? |
The end users would be running a patched version of the reference implementation (or another implementation). If a feature is only useful to a single user, that feature probably shouldn't be in the standard and that user can keep running patched code. Once multiple users have deployed the feature, then we have high confidence that a) it works as advertised and b) it's broadly useful. At that point both the standard and the official reference implementation can be updated.
I honestly don't believe any user of an implementation would agree to such a thing, especially with the possibility of it changing on them once the PR starts working thru this process or perpetually being in a state of limbo. How would you even monitor compliance? Would you require that they disclose their code to "prove" they were using it? Proprietary users would never agree to that requirement. If we look at other standards out there, we find many things that are used by a very small fraction of the community (MPI being a classic example). PMIx has always maintained that anyone has the right to not implement something - i.e., everyone has the right to say "not supported". Seems like that ought to be sufficient and we shouldn't be judging the usefulness of someone's feature based on how many other people want to use it. Perhaps it would be better if you expressed the concern that you are trying to address with this proposal. Are you worried that the standard incorporates features that are not widely used? If so, your implementation doesn't have to implement them unless your users want to utilize them, so how is it negatively affecting things? I'm not rejecting the idea, just trying to understand the motivation and why/how this would be necessary. |
@rhc54 The underlying concern is software specification versus software standardization. It's perfectly appropriate to specify new features for a particular implementation in hopes that they might eventually be useful, but a standard needs to reflect what's being used. As to compliance --- these are publicly visible changes to public interfaces. Proprietary implementations don't need to be shared. If IBM is shipping a flavor of pmix with a new interface, I'll take Josh's word for whether or not their users are finding that useful. If we start shipping a version of Flux with the same interface, I hope he'll take my word for it that our users like it, too. I certainly agree that there's a risk that changes won't make it into the standard, but having 2+ implementations using the change lowers that risk. The higher-risk approach is to propose a change that appears to benefit only a single set of users, or no users at all. Creating and maintaining patches that benefit only local users can still be a good investment, but that's not a sufficient reason for all conformant implementations to have to implement it as well. Which speaks to your next point: "the right not to implement" is fine for a specification, but it doesn't work for standards. When Livermore puts out a request for proposals, we want to be able to say "the successful bidder will provide software conformant with the PMIx standard." The more bits that are optional, the less certain we are about what we're getting, and the more time we have to spend creating our own mini-standard for that contract. Our vendors hate that, and it's not a good use of our time, either. One way to resolve this might be to have separate development and reference implementations. The current specification document would describe the development implementation. There would be a low barrier to making changes to both and it could be used as a way of advertising new features that might eventually make it into the standard.
The reference implementation would reflect only what was in the standard, which in turn would only reflect what was being used in the community. The standard would be mostly composed of "shall"s and "must"s, and it would be straightforward to determine if a particular version of a particular implementation was conformant to a particular version of the standard. Thoughts?
On the call, I thought that we were discussing that once a PR was accepted that it be merged in and labeled as The interface classes allow us to communicate stability and wide use to the end user while not having to express the need for "PMIx Standard version X + PR 123 + PR 125" but instead "All stable interfaces in PMIx Standard version X + experimental interfaces in section 2.3 + 'widely used' interfaces in section 3.5 and 4.6". Or possibly, depending on the direction of the grouping/slicing discussion, "All interfaces from PMIx Standard version X in the use case chapter Bootstrapping and Fault Tolerance". I like the idea of accepting new items and using the interface stability classes to express stability a bit better than tracking patches. There are some tricky edges to stability classes that I'd like to explore in that discussion, but nothing that I don't think that we can resolve. Certainly, implementations can provide functionality beyond the standard but should caution users about them. I think that all PRs would tie back to a use case or problem statement. Maybe that should be called out in the PR or Issue template - "Description of and links to use case scenarios" or something like that. |
@jjhursey, thanks, that's indeed what I had (mostly [somewhat?]) remembered. I hadn't caught where the notes had landed; I'll go back and review those now. |
@rountree
I'm not sure I understand - this is common practice for every standard. Consider MPI as an example. Many MPI implementations don't support the dynamics APIs, and several don't even include those APIs in their headers. The historical way for dealing with this has been the method @jjhursey described - i.e., the Labs stipulate that "the software must support v3.1 of the PMIx Standard, minus the tools chapter", just like they do today for MPI. I'm afraid I don't see the issue here nor why PMIx should behave differently. I think we also need to be careful here that we recall PMIx is composed of very generic APIs plus attributes that can be rather specific. I'm unaware of any API (past, present, or envisioned) that would be specific to a given environment. However, there are a number of attributes which fall into that category. This is why we provided a method for querying the attributes supported by any given API and return both machine parsable and human readable output. Acceptance by RMs and others has been predicated on this philosophy that allows them to "not support" various APIs and attributes. What we have seen so far is that they tend to provide support for the launch-related APIs/attributes early, and then gradually expand their coverage over time based on the demands from their target market. Thus, as long as we organize the standard doc in a way that facilitates specifying desired support (both in terms of APIs and attributes), I think we should be okay with the current proposal.
I think we may be mixing terms here. Are you proposing that the PR must have two implementations adopt it, or two users? I can readily see scenarios where only one implementation supports an API or attribute as the decision to support is often based on market segment. For example, a PMIx implementation built by an RM targeting the small cluster market has no need for the APIs associated with "instant on", and thus would have no reason to implement them. If we see a world where there are dozens of implementations, then requiring at least two to support an API (or at least indicate their intent to support it) might be a reasonable requirement. However, we currently only know of perhaps 2-3 planned implementations, which makes any numerical requirement synonymous with unanimity - a bar that seems too high to me. If we look at users instead of implementations, then that provides a larger pool but creates its own problems. I have found it difficult to get users to work off of "proposed" APIs. Remember, someone has to implement and distribute the code, and then users have to modify their library/application to use the proposed API/attribute. A lot of risk being taken there. People prefer to review the PR and see it accepted first, then get the updated implementation and begin to integrate it into their app. This is why I support the "experimental" vs "stable" identification. At least you know the API is in the Standard and therefore can reasonably expect it not to change. Ditto for attributes, though we'll need to figure out more details on how to classify them. HTH |
Notes from Teleconf April 26, 2019:
I opened a Draft PR to try and capture the process described in this Issue: Please take a look and make sure I captured everything from this Issue. A couple things that I did not include in this draft:
Notes from Teleconf May 10, 2019 Regarding discussion here We circled back to the scenario:
The general sentiment was that the community may choose to side with the PR author and move the PR forward. The question becomes how do we define that the community is in agreement, or at least a majority of the community, to move it forward? Suggestion was:
I think it addresses the point I raised. We may want to clarify what simple majority of participants means, which I think, is also what you mean with your question mark. It could be the standard 1/2+1 or 2/3 (2/3 would ensure it is not too controversial). |
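For a sense of how much the two thresholds differ in practice, a small sketch follows. It assumes "1/2+1" means strictly more than half of the participants and "2/3" means at least two-thirds, rounded up; both readings and the function names are my assumptions, not something the thread has settled.

```python
def simple_majority(participants):
    """Votes needed under the assumed "1/2+1" rule: strictly more than half."""
    return participants // 2 + 1


def two_thirds(participants):
    """Votes needed under the assumed "2/3" rule: ceil(2 * n / 3)."""
    return (2 * participants + 2) // 3


for n in (3, 9, 10):
    print(n, simple_majority(n), two_thirds(n))
```

With only a handful of participants the two rules coincide (2 of 3 either way), so the choice only starts to matter as the group grows.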
Notes from May 17, 2019 teleconf:
PR #193 is another proposal towards this issue.
FWIW, I ran across a straw poll "in the wild" while looking at json-schema. I thought their template was quite nice. Here is an example poll: json-schema-org/json-schema-spec#15 (comment) If others like it, maybe we leverage it/tweak it for PMIx's straw polls. |
Per pmix#181 (comment) Signed-off-by: Ralph Castain <[email protected]>
@SteVwonder Added some definition based on this to the proposed process - see what you think. |
Question: Will PR #193 close this issue or is there more to do?
Per teleconf July 26, 2019 and Aug. 2, 2019 we think that this can be closed now that PR #193 has been merged. If there are outstanding issues to resolve this Issue can be reopened or (preferably) a new issue can be filed for discussion.
Problem Statement
During the Runtime Standards meeting in Chattanooga, there was some discontent expressed with regards to the current standardization process. I believe the main sources of discontent were over the mechanisms for achieving consensus (two weeks of no objections during the conference call) and in regards to stability/backwards-compatibility (i.e., there is no formal support for an "experimental" interface or attribute). Note: this issue only focuses on the former, the latter is discussed in #179.
Speaking personally, I have found great value in going back and reading the PMIx RFCs. For the changes/interfaces/attributes that have RFCs, I find the less formal language and documentation of the thought processes motivating a change very helpful. While rigorous formal processes may seem like an unnecessary burden, I think it is beneficial to have processes in place to ensure that the discussions and thought processes behind changes are documented and preserved.
Background
During last week's (April 5, 2019) concall, we discussed the current standardization process and the above problems. Notes from the call can be found here. We decided it would be appropriate to follow the proposed process when proposing the changes (how meta of us).
The main proposal is to generally follow the COSS model, but use issues & PRs in the pmix-standard repo rather than RFCs. As @rhc54 mentioned in #179, this is essentially what the PMIx community is already doing. The main differences (from what I can tell) between this proposal and what the community has currently documented are formally requiring certain documentation in a PR before it can be merged (dates of concalls where consensus was achieved, a link to the spawning GH issue, etc) and "nitty gritty details" (don't squash commits, use draft PRs, etc.)
Proposed Process
A change to PMIx starts with a new issue on GitHub. The issue should highlight the problem with the current standard but does not need to contain proposed changes, standards language, or working code (although any/all of those are nice additions). The main motivation for opening an issue is to document problems/suggestions that people have, start discussions around potential solutions/changes, and keep everyone in sync with current efforts in the community.
Once the discussion in the GitHub issue has reached a point where concrete changes to the standards documents are ready to be proposed, then a Pull Request (PR) should be opened with the proposed changes. Initially, the PR only needs to contain a reference to the aforementioned GitHub issue and the proposed changes to the wording of the PMIx document. If the PR does not contain a reference to a working implementation, we recommend that a GitHub Draft PR be used. Discussion around the specific proposed changes can then take place within GitHub's PR comment thread. Based on the discussion of the PR, the PR may need to be adjusted. The PR author should push a new commit without squashing so that the history of the PR is preserved. If there is a competing change or the wording of the current PR has diverged significantly from the original wording, a new PR can be created.
The next step for the proposed change is implementation. Once the change has been implemented, the PR shall be updated with a link to the implementation. If the PR was opened as a "Draft PR", at this stage, it should be converted into a regular PR. Discussion on the proposed change should continue online until consensus is reached.
Once consensus is reached online, the PR must then be discussed at least twice on the weekly PMIx conference call. If there are no changes or objections two weeks in a row, then the PR must be updated with the dates of the two concalls as well as links to the notes from those concalls so that it is easy to see the discussion that happened and how consensus was reached. At this point, the PR is eligible to be merged.
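The merge criteria above can be read as a checklist. Below is a rough sketch of that checklist, not an actual tool: the `StandardPR` record and its field names are hypothetical, and "two weeks in a row" is interpreted as two recorded concalls exactly seven days apart, which assumes the call stays weekly.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class StandardPR:
    """Hypothetical record of a PR against the standard document."""
    issue_ref: Optional[int] = None           # originating GitHub issue number
    implementation_url: Optional[str] = None  # link to a working implementation
    is_draft: bool = True
    # Concalls where the PR was discussed with no changes or objections.
    concall_dates: List[date] = field(default_factory=list)
    concall_notes: List[str] = field(default_factory=list)  # links to notes


def merge_blockers(pr: StandardPR) -> List[str]:
    """Return the unmet steps; an empty list means eligible to merge."""
    blockers = []
    if pr.issue_ref is None:
        blockers.append("missing reference to the originating issue")
    if pr.implementation_url is None:
        blockers.append("missing link to an implementation")
    if pr.is_draft:
        blockers.append("still a draft PR")
    dates = sorted(pr.concall_dates)
    # Assumption: consecutive weekly calls are exactly 7 days apart.
    in_a_row = any(b - a == timedelta(days=7) for a, b in zip(dates, dates[1:]))
    if not in_a_row:
        blockers.append("needs two consecutive concalls without objections")
    if len(pr.concall_notes) < 2:
        blockers.append("missing links to the concall notes")
    return blockers


ready = StandardPR(
    issue_ref=181,
    implementation_url="https://example.org/impl",  # placeholder URL
    is_draft=False,
    concall_dates=[date(2019, 4, 5), date(2019, 4, 12)],
    concall_notes=["notes-1", "notes-2"],
)
print(merge_blockers(ready))
print(merge_blockers(StandardPR()))
```

Returning the list of unmet steps, rather than a single yes/no, is meant to match the "second pair of eyes" idea: a reviewer can see exactly which item is still missing.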
Things to discuss/decide
What level of implementation is sufficient for merging a PR? Is a branch of or PR against the reference implementation sufficient? Should we require the code actually be merged into the mainline first? Is it a judgement call by those reviewing the standard document PR?
Who can "press the button" and merge a PR? The Collective Code Construction Contract has a rule that a maintainer cannot accept their own "patch" (PR in this case). With all of these "checklist" items, it seems like a useful convention to have a second pair of eyes go over the PR and ensure all the steps have been followed before a PR is merged.
Does every little change to the standard need to go through this process? Does every change require both an issue and a PR (can spelling mistakes and format fixes go straight to a PR)?
There was a discussion of leveraging GitHub's issue and PR templates. I think that is a good idea. What should they include?
Strawman proposal for PR template: