diff --git a/docs/RFCS/00000000_template.md b/docs/RFCS/00000000_template.md index c106a9786314..788ce6724881 100644 --- a/docs/RFCS/00000000_template.md +++ b/docs/RFCS/00000000_template.md @@ -8,11 +8,14 @@ **Remember, you can submit a PR with your RFC before the text is complete. Refer to the [README](README.md#rfc-process) for details.** -# Summary +**Remember, you can either fill in this template from scratch, for +example if you prefer working from a blank slate, or you can follow +the writing prompts in the [GUIDE](GUIDE.md). In any case, please ensure +at the end that you have all relevant topics from the guide covered in +your prose.** -One paragraph explanation of the proposed change. +# Summary -Suggested contents: - What is being proposed - Why (short reason) - How (short plan) @@ -20,85 +23,24 @@ Suggested contents: # Motivation -Why are we doing this? What use cases does it support? What is the expected outcome? - -# Guide-level explanation - -How do we teach this? - -Explain the proposal as if it was already included in the project and -you were teaching it to another CockroachDB programmer. That generally means: - -- Introducing new named concepts. -- Explaining the feature largely in terms of examples. Take into account that a product manager (PM) will want to connect back the work introduced by the RFC with user stories. Whenever practical, do ask PMs if they already have user stories that relate to the proposed work, and do approach PMs to attract user buy-in and mindshare if applicable. -- Explaining how CockroachDB contributors and users should think about - the feature, and how it should impact the way they use - CockroachDB. It should explain the impact as concretely as possible. -- If applicable, provide sample error messages, deprecation warnings, or migration guidance. -- If applicable, describe the differences between teaching this to - existing roachers and new roachers. - -For implementation-oriented RFCs (e.g. 
for core internals), this -section should focus on how contributors should think about -the change, and give examples of its concrete impact. For policy RFCs, -this section should provide an example-driven introduction to the -policy, and explain its impact in concrete terms. - -# Reference-level explanation - -This is the technical portion of the RFC. Explain the design in sufficient detail that: +Audience: PMs, end-users, CockroachDB team members. -(You may replace the section title if the intent stays clear.) +# Technical design -- Its interaction with other features is clear. -- It covers where this feature may be surfaced in other areas of the product - - If the change influences a user-facing interface, make sure to preserve consistent user experience (UX). Prefer to avoid UX changes altogether unless the RFC also argues for a clear UX benefit to users. If UX has to change, then prefer changes that match the UX for related features, to give a clear impression to users of homogeneous CLI / GUI elements. Avoid UX surprises at all costs. If in doubt, ask for input from other engineers with past UX design experience and from your design department. -- It considers how to monitor the success and quality of the feature. - - Your RFC must consider and propose a set of metrics to be collected, if applicable, and suggest which metrics would be useful to users and which need to be exposed in a public interface. - - Your RFC should outline how you propose to investigate when users run into related issues in production. If you propose new data structures, suggest how they should be checked for consistency. If you propose new asynchronous subsystems, suggest how a user can observe their state via tracing. In general, think about how your coworkers and users will gain access to the internals of the change after it has happened to either gain understanding during execution or troubleshoot problems. -- It is reasonably clear how the feature would be implemented. 
-- Corner cases are dissected by example. - -The section should return to the examples given in the previous -section, and explain more fully how the detailed proposal makes those -examples work. - -## Detailed design - -What / how. - -Outline both "how it works" and "what needs to be changed and in which order to get there." - -Describe the overview of the design, and then explain each part of the -implementation in enough detail that reviewers will be able to -identify any missing pieces. Make sure to call out interactions with -other active RFCs. +Audience: CockroachDB team members, expert users. ## Drawbacks -Why should we *not* do this? - -If applicable, list mitigating factors that may make each drawback acceptable. - -Investigate the consequences of the proposed change onto other areas of CockroachDB. If other features are impacted, especially UX, list this impact as a reason not to do the change. If possible, also investigate and suggest mitigating actions that would reduce the impact. You can for example consider additional validation testing, additional documentation or doc changes, new user research, etc. - -Also investigate the consequences of the proposed change on performance. Pay especially attention to the risk that introducing a possible performance improvement in one area can slow down another area in an unexpected way. Examine all the current "consumers" of the code path you are proposing to change and consider whether the performance of any of them may be negatively impacted by the proposed change. List all these consequences as possible drawbacks. +... ## Rationale and Alternatives -This section is extremely important. See the -[README](README.md#rfc-process) file for details. +... + +# Explain it to folk outside of your team -- Why is this design the best in the space of possible designs? -- What other designs have been considered and what is the rationale for not choosing them? -- What is the impact of not doing this? 
+Audience: PMs, doc writers, end-users, CockroachDB team members in other areas of the project. -## Unresolved questions +# Unresolved questions -- What parts of the design do you expect to resolve through the RFC - process before this gets merged? -- What parts of the design do you expect to resolve through the - implementation of this feature before stabilization? -- What related issues do you consider out of scope for this RFC that - could be addressed in the future independently of the solution that - comes out of this RFC? +Audience: all participants to the RFC review. diff --git a/docs/RFCS/GUIDE.md b/docs/RFCS/GUIDE.md new file mode 100644 index 000000000000..cf08ed4c0cff --- /dev/null +++ b/docs/RFCS/GUIDE.md @@ -0,0 +1,218 @@ + +# Summary + +One paragraph explanation of the proposed change. + +Suggested contents: +- What is being proposed +- Why (short reason) +- How (short plan) +- Impact + +# Motivation + +Why are we doing this? What use cases does it support? What is the expected outcome? + +Is there a PM in this product area already? Does the PM know of user +stories that relate to the proposed work? Can we list these user +stories here? (Specific customer names need not be included, for +confidentiality, but it is still useful to describe their use cases.) + +# Technical design + +This is the technical portion of the RFC. Explain the design in sufficient detail. + +Important writing prompts follow. You do not need to answer them in +this particular order, but we wish to find answers to them throughout +your prose. + +Some of these prompts may not be relevant to your RFC; in which case +you can spell out “this change does not affect ...” or answer “N/A” +(not applicable) next to the question. + +- Questions about the change: + + - What components in CockroachDB need to change? How do they change? + + This section outlines the implementation strategy: for each + component affected, outline how it is changed. 
+ + - Are there new abstractions introduced by the change? New concepts? + If yes, provide definitions and examples. + + - How does this work in a multi-tenant deployment? + + - How does the change behave in mixed-version deployments? During a + version upgrade? Which migrations are needed? + + - Is the result/usage of this change different for CC end-users than + for on-prem deployments? How? + + - What are the possible interactions with other features or + sub-systems inside CockroachDB? How does the behavior of other code + change implicitly as a result of the changes outlined in the RFC? + + (Provide examples if relevant.) + + - Is there other ongoing or recent RFC work that is related? + (Cross-reference the relevant RFCs.) + + - What are the edge cases? What are example uses or inputs that we + think are uncommon but are still possible and thus need to be + handled? How are these edge cases handled? Provide examples. + + - What is the effect of possible mistakes by other CockroachDB team + members trying to use the feature in their own code? How does the + change impact how they will troubleshoot things? + +- Questions about performance: + + - Does the change impact performance? How? + + - If new algorithms are + introduced whose execution time depends on per-deployment parameters + (e.g. number of users, number of ranges, etc.), what is their + high-level worst-case algorithmic complexity? + + - How is resource usage affected for “large” loads? For example, + what do we expect to happen when there are 100000 ranges? 100000 + tables? 10000 databases? 10000 tenants? 10000 SQL users? 1000000 + concurrent SQL queries? + +- Stability questions: + + - Can this new functionality affect the stability of a node or the + entire cluster? How does the behavior of a node or a cluster degrade + if there is an error in the implementation? + + - Can the new functionality be disabled? Can a user opt out? How? 
+ + - Can the new functionality affect clusters which are not explicitly + using it? + + - What testing and safeguards are being put in place to + protect against unexpected problems? + +- Security questions: + + - Does the change concern authentication or authorization logic? If + so, mention this explicitly and tag the relevant security-minded + reviewer as a reviewer of the RFC. + + - Does the change create a new way to communicate data over the + network? What rules are in place to ensure that this cannot be + used by a malicious user to extract confidential data? + + - Is there telemetry or crash reporting? What mechanisms are used to + ensure no sensitive data is accidentally exposed? + +- Observability and usage questions: + + - Does the change affect asynchronous / background subsystems? + + - If so, how can users and our team observe the run-time state via tracing? + + - Which other inspection APIs exist? + + (In general, think about how your coworkers and users will gain + access to the internals of the change after it has happened to + either gain understanding during execution or troubleshoot + problems.) + + - Are there new APIs, or API changes (either internal or external)? + + - How would you document the new APIs? Include example usage. + + - What are the other components or teams that need to know about the + new APIs and changes? + + - Which principles did you apply to ensure the APIs are consistent + with other related features / APIs? (Cross-reference other APIs + that are similar or related, for comparison.) + + - Is the change visible to users of CockroachDB or operators who run CockroachDB clusters? + + - Are there any user experience (UX) changes needed as a result of this RFC? + + - Are the UX changes necessary or clearly beneficial? (Cross-reference the motivation section.) + + - Which principles did you apply to ensure the user experience + (UX) is consistent with other related features? 
+ (Cross-reference other CLI / GUI / SQL elements or features + that have related UX, for comparison.) + + - Which other engineers or teams have you polled for input on the + proposed UX changes? Which engineers or teams may have relevant + experience to provide feedback on UX? + + - Is usage of the new feature observable in telemetry? If so, + mention where in the code telemetry counters or metrics would be + added. + +The section should return to the user stories in the motivation +section, and explain more fully how the detailed proposal makes those +stories work. + +## Drawbacks + +Why should we *not* do this? + +If applicable, list mitigating factors that may make each drawback acceptable. + +Investigate the consequences of the proposed change on other areas +of CockroachDB. If other features are impacted, especially UX, list +this impact as a reason not to do the change. If possible, also +investigate and suggest mitigating actions that would reduce the +impact. You can for example consider additional validation testing, +additional documentation or doc changes, new user research, etc. + +Also investigate the consequences of the proposed change on +performance. Pay special attention to the risk that introducing a +possible performance improvement in one area can slow down another +area in an unexpected way. Examine all the current "consumers" of the +code path you are proposing to change and consider whether the +performance of any of them may be negatively impacted by the proposed +change. List all these consequences as possible drawbacks. + +## Rationale and Alternatives + +This section is extremely important. See the +[README](README.md#rfc-process) file for details. + +- Why is this design the best in the space of possible designs? +- What other designs have been considered and what is the rationale for not choosing them? +- What is the impact of not doing this? + +# Explain it to someone else + +How do we teach this? 
+ +Explain the proposal as if it were already included in the project and +you were teaching it to an end-user, or a CockroachDB team member in a different project area. + +Consider the following writing prompts: + +- Which new concepts have been introduced to end-users? Can you + provide examples for each? + +- How would end-users change their apps or thinking to use the change? + +- Are there new error messages introduced? Can you provide examples? + If there are SQL errors, what are their 5-character SQLSTATE codes? + +- Are there new deprecation warnings? Can you provide examples? + +- How are clusters that were created before this change affected? Are + there migrations to consider? + +# Unresolved questions + +- What parts of the design do you expect to resolve through the RFC + process before this gets merged? + +- What parts of the design do you expect to resolve through the + implementation of this feature before stabilization? + +- What related issues do you consider out of scope for this RFC that + could be addressed in the future independently of the solution that + comes out of this RFC? diff --git a/docs/RFCS/README.md b/docs/RFCS/README.md index 2e5fb69348e2..e97da35839d4 100644 --- a/docs/RFCS/README.md +++ b/docs/RFCS/README.md @@ -56,9 +56,20 @@ guidance and to help shepherd your RFC through the process. 2. Copy `00000000_template.md` to a new file and fill in the details. Commit this version in your own fork of the repository or - a branch. Your commit message (and corresponding pull request) + a branch. Your commit message (and corresponding pull request) should include the prefix `rfc`. Eg: `rfc: edit RFC template` + If you are a creative person, you may prefer to start with this blank + slate, write your prose and then later check that all necessary topics + have been covered. + + If you feel intimidated by a blank template, you can instead peruse + the list of requested topics and use the questions in there as + writing prompts. 
+ + The list of topics and questions that can serve as a writing guide + can be found in the separate file [GUIDE.md](GUIDE.md). + 3. Submit a pull request (PR) to add your new file to the main repository. Each RFC should get its own pull request; do not combine RFCs with other files.