GSOD 2022 Case Study

bettinagruen edited this page Dec 2, 2022 · 16 revisions

Expand and Reorganize the R Development Guide

Organization or Project: The R Project (Season of Docs 2022 wiki)

Organization Description: The R Project maintains and develops R, an integrated suite of software facilities for data manipulation, calculation, and graphical display. There are thousands of add-on packages available through the Comprehensive R Archive Network (CRAN), extending the core functionality of R to cover a wide range of tools for modern data science.

Authors: Heather Turner (@hturner)

Date: 30 November 2022

Problem Statement

What problem were you trying to solve with new or improved documentation?

In the past, novice contributors to the R project had to pick up information on the development process from disparate sources, some of which are difficult to find and most of which assume a lot of experience and background knowledge. A first draft of the R Development Guide (or R Dev Guide for short) was written in early 2021 to address this issue. As people began to refer to the guide, they started suggesting new topics that would be useful to cover and other ways to make the guide more useful to the community, such as making it easier to find and improving the organization of topics. By responding to this feedback, we hope to make the guide the go-to resource for new contributors and a useful reference to support outreach activities aimed at broadening the community of contributors.

Proposal Abstract

A brief summary of your original organization proposal. Link to the proposal page on your project site, if possible.

The proposal for Season of Docs was closely linked to issues that had been raised by the community. The main objectives were to add new chapters or sections covering missing topics and make it more helpful to newcomers by adding further examples and improving the organization of material. At the same time, we planned to work on miscellaneous issues, such as creating the guide in alternative formats and improving the infrastructure for contributors to the guide itself.

Project Description

Creating the proposal

How did you come up with your Season of Docs proposal? What process did your organization use to decide on an idea? How did you solicit and incorporate feedback?

During the Season of Docs Exploration phase, one of the administrators opened a number of issues on the rstats-gsod22 GitHub repository with potential project ideas arising from meetings of the R Contribution Working Group, the R Forwards taskforce for underrepresented groups, the R Consortium Working Group on Repositories, and other forums of the R community. Further ideas were proposed by others. The project ideas were openly discussed on the GitHub issues as well as on the R Contributors Slack and in R Contribution Working Group meetings. We prioritised the R Dev Guide proposal for three reasons: it was in direct support of R itself (vs other parts of the ecosystem such as The R Journal or The Comprehensive R Archive Network), it was aligned with the current activities of the R Contribution Working Group (who oversee the R Project's involvement in Season of Docs), and it was unlikely to be addressed through other means (e.g., Google Summer of Code, or the work of other working groups or individuals). We also had community members interested in being technical writers for this project.

Once the project idea was settled, we drafted the proposal in the open on our rstats-gsod22 repository. We solicited feedback from the interested technical writers and people we had invited to be on a Steering Committee. Many people edited the proposal directly or gave feedback to the Season of Docs admins to incorporate.

Budget

Include a short section on your budget. How did you estimate the work? Were there any unexpected expenses? Did you end up spending less than the grant award? Did you allocate funds properly or were some items you budgeted for more/less/unnecessary? Did you have other funds outside of Season of Docs that you were able to use?

We designed our proposal to require similar effort and personnel to our Season of Docs 2021 project, so that we could use the same budget (we did not consider it necessary to increase the rates this time around). In particular, we budgeted for two technical writers from lower-middle-income countries to work approximately 30 hours/month over 6 months and one project manager from a high-income country to work 4-6 hours/month over 6 months. Members of the steering committee were given the option of a $500 stipend - some asked in advance for a donation to be made in lieu of a stipend, so we could include this in the budget. We did not have any additional funding, so we also needed to budget for payment processing fees.

At the end of Season of Docs, we have spent less than the grant award, as various factors hampered initial progress and the technical writers have not worked as many hours as expected. This means we have not fully achieved the project objectives, so we plan to use the funds to continue work beyond the official timeline.

Participants

Who worked on this project (use usernames if requested by participants)? How did you find and hire your technical writer? How did you find other volunteers or paid participants? What roles did they have? Did anyone drop out? What did you learn about recruiting, communication, and project management?

The people who worked on this project were:

The following people contributed as volunteers on the Steering Committee: Michael Lawrence (@lawremi), Thomas Lumley (@tslumley), Carol Willing (@willingc), Bettina Grün (@bettinagruen), Toby Dylan Hocking (@tdhock), Mine Çetinkaya-Rundel (@mine-cetinkaya-rundel), Ben Ubah (@benubah), and Heather Turner (@hturner) (also Season of Docs admin).

Initially, we recruited another person as Technical Writer 2 who had approached us early in the exploration phase and shown enthusiasm for the project idea. We discussed the proposal and he identified areas he would be most interested to work on, which matched well with our needs. He had been a technical writer for Season of Docs the previous year for another open source software project, so we felt confident selecting him for the role. Unfortunately, it did not work out. After our kick-off meeting, we lost communication and it turned out he had been ill. Since it was the start of the project, we were relaxed about a delay. However, we went through a frustrating cycle of finding it difficult to reach him, having the odd catch-up where he returned with renewed enthusiasm and promises of delivering work, but no such work materialising. Eventually we decided to find a replacement, which took some time to sort out.

In the end, both technical writers were recruited from the R Contribution Working Group. Saranjeet authored the first version of the R Dev Guide and Lluís contributed as a volunteer reviewer on that version, so both were very familiar with the project. The project manager was recommended by one of the Season of Docs admins, Matthew Bannert. The Steering Committee were selected on the basis of their roles in the R/open source communities; many had also contributed as volunteer reviewers on the first version of the guide.

We followed a similar process for recruitment and project management to the one we used for Season of Docs 2021. In that project we also had to replace a technical writer part-way through. At the time, it felt like a one-off issue, due to family bereavement. However, it seems we must be more prepared to deal with such situations. In both years, the whole project was derailed for some time, even though the technical writers were assigned different tasks, because the project manager and Season of Docs admins spent the small amount of time they had on the project trying to sort the matter out.

Communicating with a Steering Committee spread around the globe also presented challenges. We often held meetings in two parts to catch people in different time zones and would still miss people. Although we intended to record meetings, this was often forgotten, so it was difficult to keep everyone in the loop. Keeping meeting minutes on the project wiki and creating a channel on the R Contributors Slack helped mitigate this a little, but it was still important to update people by email.

Timeline

Give a short overview of the timeline of your project (indicate estimated end date or intermediate milestones if project is ongoing). Did the original timeline need adjustment?

At the beginning of the project, we agreed how to divide the planned tasks between the two technical writers. We originally planned for the tasks to be completed in two phases and to have meetings with the Steering Committee at the start, middle and end of each phase.

We had to adjust our plans as we went along due to the issues discussed in the Participants section. The timeline below reflects the main tasks completed by each technical writer (or their replacement) in each phase. We have added two additional phases beyond the official deadline to show our plans for ongoing work.

| Stage | Technical Writer 1 | Technical Writer 2 | Completed By |
|---|---|---|---|
| Kick-off meeting | | | 17 May |
| Phase 1a | Add contributors table; start on translations chapter | | 2 July |
| Mid-phase 1 meeting | | | 7 July |
| Phase 1b | Merge previous work with contributed translations chapter | | 30 July |
| Delivery Phase 1/Kick-off Phase 2 meeting | | | 11 August |
| Phase 2a | Improve book/website infrastructure; update documentation chapter | Add instructions on installing from source on Linux | 29 September |
| Mid-phase 2 meeting | | | 29 September |
| Phase 2b | Write new introduction chapter; draft chapter on tests for R | Add instructions on installing from source on Windows; draft section on GitHub workflow for testing patches | 28 November |
| Delivery Phase 2 meeting | | | 28 November |
| Prepare and submit case study | | | 30 November |
| Phase 3 | Complete chapters on tests and translations; add examples | Complete section on GitHub workflow | 31 January |
| Phase 4 | | Add instructions on installing from source on macOS | February/March |

Saranjeet can continue to work in December and January, so she will be able to work on meeting other key deliverables originally allocated to her. Lluís is not available to work after the end of November, so we may redirect some funding to another writer to document installing R from source on macOS. Since we do not yet have a suitable writer in place, it is unclear when this part will be completed.

Results

What was created, updated, or otherwise changed? Include links to published documentation if available. Were there any deliverables in the proposal that did not get created? List those as well. Did this project result in any new or updated processes or procedures in your organization?

Key deliverables for this project are shown in bold.

Deliverables created

  1. New chapter on Message Translations. This explains the message translation infrastructure of .mo, .po and .pot files.
  2. New sections on installing R from source in the chapter R Patched and Development Versions, covering Linux and Windows.
  3. New Introduction chapter, to give a starting point for new contributors and an overview of the guide.
  4. Cover page meeting the requirements to get the guide listed on the bookdown.org home page. This does not appear as a cover page in the HTML version, but shows in social media cards as demonstrated on the OpenGraph.xyz previewer.
  5. Revised landing page, with updated acknowledgements and a new CC BY 4.0 license.
  6. Improved contributor infrastructure, including contributors table generated by the All Contributors bot and GitHub issue templates.
  7. Updated GitHub infrastructure, including changing branch name from master to main and updating GitHub action to deploy PDF and EPUB versions in addition to HTML.
  8. Minor editing, including adding a favicon, and fixing broken links and typos.

Deliverables in progress

  1. New chapter on tests for base R. PR #73 contains a first draft that has been reviewed.
  2. Describing a git workflow for testing proposed patches. PR #110 contains a first draft that has been reviewed.
  3. Reviewing the R Dev Guide to improve its readability and structure.

Deliverables without progress so far

  1. A case study chapter giving an end-to-end example.
  2. Expanding the existing chapters with simple & beginner-friendly examples.
  3. Minting a DOI for the guide &/or repo. Publishing it on Zenodo.

Metrics

What metrics did you choose to measure the success of the project? Were you able to collect those metrics? Did the metrics correlate well or poorly with the behaviors or outcomes you wanted for the project? Did your metrics change since your proposal? Did you add or remove any metrics? How often do you intend to collect metrics going forward?

The metrics we proposed and their current status are outlined below:

  1. The key deliverables are met.

    The key deliverables have been partially met as detailed below

    • New chapter on translations. The new chapter only partially addresses the original issue #34 since it does not yet document how to contribute translations.
    • New chapter on tests for base R. The draft chapter requires substantial work to address the motivating issue #28 before it can be published.
    • Describing a git workflow for testing proposed patches. This section is still in draft, so the motivating issue #23 cannot be closed.
    • Adding novice-friendly instructions on installing R from source. The new sections are sufficient to close the motivating issue #33 and a Windows-specific issue #29; however, a macOS-specific issue #62 is outstanding.
    • Including more examples in the existing chapters to serve as a reference for new/future contributors. Examples have been included in new sections, but we still wish to add simple & beginner-friendly examples to the Bug Tracking and Reviewing Bugs chapters.
    • Reviewing the R Dev Guide to improve its readability and structure. As the new chapters and sections were added late in the project, work is still required to improve its overall structure.
  2. At least 75% of the open issues on the R Dev Guide GitHub repository are closed or at least become work-in-progress.

    Of the 17 issues open at the start of GSoD 2022, 8 issues have been closed and 6 issues are in progress (82% overall)

    | Status | Issues | Count |
    |---|---|---|
    | Closed | #1, #20, #29, #30, #32, #33, #34, #67 | 8 |
    | In progress | #22, #23, #9, #28, #35, #64 | 6 |
    | Open no progress | #2, #25, #62 | 3 |
  3. Monthly visitor counts for existing chapters increase by at least 10%.

    Visitor counts for the most popular chapters have increased by at least 10%

    The following table shows the visitor counts for the month prior to Season of Docs (April 14 - May 15) and the final month of Season of Docs (October 14 - November 15):

    | Chapter Topic | Apr-May | Oct-Nov | % increase |
    |---|---|---|---|
    | R Installation | 38 | 56 | 47 |
    | Bug Tracking | 23 | 39 | 70 |
    | Finding the Source Code | 15 | 31 | 107 |
    | Developer Tools | 11 | 10 | -9 |
    | R Core Developers | 9 | 5 | -44 |
    | Reviewing Bugs | 8 | 10 | 25 |
    | Lifecycle of a Patch | 7 | 7 | 0 |
    | Where to Get Help | 7 | 5 | -29 |
    | Testing Pre-release Versions | 7 | 4 | -43 |
    | Documenting R | 7 | 22 | 214 |
    | News and Announcements | 3 | 4 | 33 |

    Visitor counts for some chapters are quite low (<15 per month); for these chapters it is not meaningful to consider percentage changes corresponding to a difference of less than 5 people. Only one chapter with previously low counts, on documenting R, showed a meaningful increase of 15 people (over 200% increase). For the more commonly visited chapters, visitor counts increased by at least 47%, comfortably meeting our target.

  4. Monthly visitor counts for new chapters are commensurate with existing chapters at a similar technical level.

    Visitor counts for new chapters added at least a month before the end of Season of Docs are commensurate with existing chapters at a similar technical level

    By the final month of Season of Docs (October 14 - November 15), two new chapters had been added: a new introduction chapter and a chapter on message translations. Their visitor counts are shown below:

    | Chapter Topic | Oct-Nov |
    |---|---|
    | Introduction | 37 |
    | Translating R Messages | 4 |

    Visitor counts for the introduction are similar to those for the bug tracking chapter; both chapters are suitable starting points for new contributors. Visitor counts for the translation chapter are similar to the counts for chapters suited to more experienced contributors such as Testing Pre-release Versions or Lifecycle of a Patch. This is appropriate for the current content, which focuses on how message translations work, rather than how to contribute message translations.

  5. The upgraded version of the R Dev Guide is listed on the home page of bookdown.org.

    The R Dev Guide is not yet listed on the home page of bookdown.org

    We have added the basic infrastructure to enable the R Dev Guide to be listed on the bookdown.org homepage and restarted discussion with the maintainers of bookdown.org in issue 165 of their GitHub repo to make this happen.

  6. At least two community members that are not directly involved in the project contribute corrections/updates to the documentation, by submitting issues, making a pull request or making a commit to the git repository.

    In addition, the wider community contributed to the new content on installing R from source on Linux. Henrik Bengtsson suggested adding specific instructions for Fedora-based distributions in the discussion on PR 65, while Martin Maechler and Iñaki Ucar helped work these instructions out in the discussion on PR 105.

The metrics adequately capture three things: our progress on improving the R Dev Guide, the involvement of the community (which indicates the sustainability of the project moving forward), and the level of interest from the wider community. We will continue to monitor visitor numbers to evaluate the visibility and usefulness of the guide.
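
The percentages reported for the issue and visitor metrics can be reproduced with a short calculation. The following is a minimal sketch in Python, not the method actually used for this report: the function names are illustrative, the figures are copied from the tables in this section, and applying the 5-visitor threshold to every chapter (rather than only to the low-count chapters) is a simplifying assumption.

```python
# Figures copied from the tables in this case study.
ISSUES = {"closed": 8, "in progress": 6, "open, no progress": 3}

# Chapter -> (visitors Apr-May, visitors Oct-Nov)
VISITORS = {
    "R Installation": (38, 56),
    "Bug Tracking": (23, 39),
    "Finding the Source Code": (15, 31),
    "Developer Tools": (11, 10),
    "R Core Developers": (9, 5),
    "Reviewing Bugs": (8, 10),
    "Lifecycle of a Patch": (7, 7),
    "Where to Get Help": (7, 5),
    "Testing Pre-release Versions": (7, 4),
    "Documenting R": (7, 22),
    "News and Announcements": (3, 4),
}

# Assumption: a change of fewer than 5 visitors is treated as noise
# for every chapter, not only for the low-count ones.
MIN_DIFFERENCE = 5


def issue_progress_pct(issues):
    """Share of issues that are closed or in progress, as a whole percent."""
    addressed = issues["closed"] + issues["in progress"]
    return round(100 * addressed / sum(issues.values()))


def pct_change(before, after):
    """Percentage change in monthly visitors, rounded to a whole percent."""
    return round(100 * (after - before) / before)


def meaningful_changes(visitors, min_difference=MIN_DIFFERENCE):
    """Chapters whose visitor count moved by at least min_difference people."""
    return {
        chapter: pct_change(before, after)
        for chapter, (before, after) in visitors.items()
        if abs(after - before) >= min_difference
    }


if __name__ == "__main__":
    print(f"Issues closed or in progress: {issue_progress_pct(ISSUES)}%")
    for chapter, change in meaningful_changes(VISITORS).items():
        print(f"{chapter}: {change:+d}%")
```

Filtering on the absolute difference before looking at percentages keeps small chapters, where one or two visitors swing the percentage widely, from dominating the comparison.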

Analysis

What went well? What was unexpected? What hurdles or setbacks did you face? Do you consider your project successful? Why or why not? (If it's too early to tell, explain when you expect to be able to judge the success of your project.)

Although we have not reached all our metrics, we have made substantial improvements to the R Dev Guide, addressing many of the motivating issues that the community had raised. It was good to have involvement from the wider community during the Season of Docs project, with both large and small contributions to the text, as well as helpful discussions as the content was being drafted. The Steering Committee also made direct contributions and provided useful input through reviewing the drafts on GitHub or through the discussions at the five catch-up meetings we arranged throughout Season of Docs.

A development that we did not foresee was a member of the wider community setting up an experimental Weblate server for people to contribute message translations, as an alternative to the existing workflow of directly editing .po files. The Weblate server was set up in parallel to the new chapter on message translations being drafted and the server has been tweaked and trialed since. New contributors are finding the Weblate server easier than the existing workflow, so we held back on documenting the existing workflow as the way to contribute message translations, but were not in a position to document the new workflow while it was still being established.

As discussed in the Participants section, the main setback we faced was the availability of our technical writers in the early months of Season of Docs. For the work planned for Technical Writer 2, no progress was made until Lluís was recruited at the end of August. Lluís did a good job of catching up with the tasks, getting close to the key deliverables for that side of the project. The main outstanding task is to add instructions for installing R on macOS; Lluís found it difficult to make progress here as he did not have direct access to a macOS machine. For the work planned for Technical Writer 1, progress was slow in the first half, with Saranjeet mainly addressing minor issues, though her work was boosted by an outside contributor writing the first draft of the new translations chapter. Progress picked up in the second half as the project got back on track and the writers were in more frequent communication with the project manager and the Season of Docs admin. As a result, Saranjeet also made good progress on her key deliverables by the end of Season of Docs.

Overall, we are pleased with the progress made given the hours worked by each writer. However, as outlined in the Timeline section, we are behind schedule and plan to continue work. For recently added content or work in progress we need more time to get feedback and to assess whether people are viewing the new material. We should be in a better position to judge the success of the project in February/March.

Summary

In 2-4 paragraphs, summarize your project experience. Highlight what you learned, and what you would choose to do differently in the future. What advice would you give to other projects trying to solve a similar problem with documentation?

We have appreciated the opportunity to give the R Dev Guide a major upgrade, making it more relevant and useful for novice contributors. Working with technical writers who are themselves interested in contributing to the R project has enabled them to deepen their own understanding of the R development process. This is part of our aim in involving community members with this documentation project: not only to distribute the work, but to foster a wider community of people with the knowledge and skills to contribute to the R project.

As with our first experience of Season of Docs in 2021, we found that the team came together and worked more effectively towards the end of the project. While it will inevitably take time to gel as a team and for writers to get going on a new project, we need to adjust our approach to make more progress earlier on. We expected the technical writers to plan their own work given a set of allocated tasks, but this led to late delivery of material to review, meaning the Steering Committee had little opportunity to influence the work.

In future, we will ask candidates to make a small contribution as part of the recruitment process, so we can evaluate how effectively they work with us, as well as their ability to grasp a topic and document it well. We will also need to set clearer expectations around the regularity of commits and reporting hours worked, so that if our expectations are not met or personal circumstances mean that someone needs to withdraw, we can end our agreement with a fair pro-rata payment.

Our advice to other projects would be to engage as much as possible with the target audience. Because we based our plans on suggestions made by community members via GitHub issues, or on interactions with potential readers in meetings and events, we can be confident that people will be interested in reading the resulting documentation. This is evidenced by our increasing visitor numbers, and we hope that this virtuous circle will help to maintain the resource in the future.

Appendix

If you have other materials you'd like to link to (for example, if you created a contract for working with your technical writer that you'd like to share, or templates for your documentation project, or other open documentation resources, you can list and link them here). The Appendix is also a good place to list links to any documentation tools or resources you used, or a place to add thanks or acknowledgments that might not fit into the sections above.

The R Dev Guide is written in bookdown. We would like to acknowledge Jonathan Godfrey for his advice on the accessibility of different bookdown formats and alternatives to bookdown, which led us to stick with bookdown and the GitBook format for the present time.

Project links