Rough draft of contributor help script #4293

sarayourfriend · 2024-05-09T07:36:17Z

Fixes

Description

Adds a new script get-started.sh for contributors to run. The script checks for dependencies on their system and gives suggestions for how to install them.
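
As a rough illustration of the kind of check described above (this is a sketch, not the contents of get-started.sh), tool detection can be done with `command -v`, with a separate comparison for the Python 3.11 floor mentioned in the script output. The function names `has_command`, `version_at_least_3_11`, and `has_python_311` are hypothetical and chosen only for this example.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; these function names are hypothetical and are not
# taken from get-started.sh.

# Succeeds if the named command is on PATH.
has_command() {
  command -v "$1" >/dev/null 2>&1
}

# Succeeds if a "major.minor[.patch]" version string is at least 3.11.
version_at_least_3_11() {
  local major minor
  major="${1%%.*}"
  minor="${1#*.}"
  minor="${minor%%.*}"
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 11 ]; }
}

# Succeeds if python3 exists and reports version 3.11 or later.
has_python_311() {
  has_command python3 || return 1
  local version
  version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')" || return 1
  version_at_least_3_11 "$version"
}
```

A caller could then write something like `has_python_311 || echo "Python 3.11 or later could not be found"`, keeping detection separate from the wording of the suggestions.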

I've gone with a basic approach here, and we can expand or change references however we see fit. One thing I've noticed is that the installation suggestions naturally overlap with the documentation we have in the getting started page. Is there a reasonable way to merge these? Should we go "all in" on the script? Or should the script refer to the documentation pages instead of putting the suggestions in-line? The latter is probably the clearest and easiest way to remove the duplication. What do y'all think?

Testing Instructions

I've added a temporary Dockerfile and just recipe to make testing the different scenarios as easy as possible.

Run just test-get-started <target> with each of the targets defined in Dockerfile.get-started-test:

no-deps
with-python
with-just
with-docker
with-docker-compose
with-node
with-pnpm-direct
with-everything

I'll remove the Dockerfile and the just recipe before merging.
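
For convenience, the targets above can be run in one pass. This is a small sketch that assumes it is run from the repository root with `just` and Docker available; the `run_all_targets` function name is made up for this example, and the only command it relies on is the `just test-get-started <target>` recipe described above.

```shell
#!/usr/bin/env bash
# Sketch: run the temporary test recipe once per target listed above.
# Assumes the repository root as the working directory.
targets="no-deps with-python with-just with-docker with-docker-compose with-node with-pnpm-direct with-everything"

run_all_targets() {
  local target
  for target in $targets; do
    echo "=== $target ==="
    just test-get-started "$target"
  done
}
```

Calling `run_all_targets` prints a header per target followed by that target's script output, which makes it easier to compare the scenarios side by side.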

Output when there's nothing missing

Welcome to...

   ____
  / __ \
 | |  | |_ __   ___ _ ____   _____ _ __ ___  ___
 | |  | | '_ \ / _ \ '_ \ \ / / _ \ '__/ __|/ _ \
 | |__| | |_) |  __/ | | \ V /  __/ |  \__ \  __/
  \____/| .__/ \___|_| |_|\_/ \___|_|  |___/\___|
        | |
        |_|

This script will check your local development environment for the tools required to work on Openverse.

If anything is missing, it will let you know, and provide a suggestion for where to find it.


Enabling corepack in the repository for pnpm!

Congrats! Your system appears to be all set up for Openverse development!

Try running 'just install' followed by 'just up' and then 'just init'.

Further setup instructions can be found in the quick start guide.
The guide also includes instructions for setting up individual parts of the Openverse stack.

https://docs.openverse.org/general/quickstart.html

Output when everything is missing

Welcome to...

   ____
  / __ \
 | |  | |_ __   ___ _ ____   _____ _ __ ___  ___
 | |  | | '_ \ / _ \ '_ \ \ / / _ \ '__/ __|/ _ \
 | |__| | |_) |  __/ | | \ V /  __/ |  \__ \  __/
  \____/| .__/ \___|_| |_|\_/ \___|_|  |___/\___|
        | |
        |_|

This script will check your local development environment for the tools required to work on Openverse.

If anything is missing, it will let you know, and provide a suggestion for where to find it.



====== Python language ======

Python 3.11 or later could not be found on your system.
Please update or install Python according to the instructions from the Python Foundation:

https://docs.python.org/3/using/unix.html#getting-and-installing-the-latest-version-of-python


====== 'just' command runner ======
The 'just' command runner could not be found on your system.
Try installing 'just' using your OS's package manager: https://github.com/casey/just?tab=readme-ov-file#packages

For debian or debian-derived systems (like Ubuntu) that do not have makedeb configured,
'just' also provides pre-built binaries. However, you'll need to manually keep them updated:

https://github.com/casey/just?tab=readme-ov-file#pre-built-binaries

Alternatively, you may prefer using the 'just-install' NPM package, endorsed by the 'just' project:

https://github.com/brombal/just-install#readme


====== Docker container runtime ======

Docker is missing from your system. Install it and Docker compose using Docker's instructions.

Docker engine: https://docs.docker.com/engine/install/
Docker compose: https://docs.docker.com/compose/install/

Podman is not currently supported for Openverse development.


====== pnpm Node.js package manager ======

pnpm is missing from your system, and corepack was unavailable to automatically install it using standard Node.js tooling.

For ease of use, Corepack is highly recommended and is the most flexible approach to Node.js package manager installation.

Refer to Corepack's documentation for installation instructions: https://github.com/nodejs/corepack?tab=readme-ov-file#how-to-install

Alternatively, to install pnpm directly, refer to pnpm's installation instructions: https://pnpm.io/installation#on-posix-systems


I detected the following missing dependencies:
- pnpm package manager
- Docker container runtime
- Just command runner
- Python 3.11 or greater

Checklist

My pull request has a descriptive title (not a vague title like Update index.md).
My pull request targets the default branch of the repository (main) or a parent feature branch.
My commit messages follow best practices.
My code follows the established code style of the repository.
[N/A] I added or updated tests for the changes I made (if applicable).
I added or updated documentation (if applicable).
I tried running the project locally and verified that there are no visible errors.

Developer Certificate of Origin

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

github-actions · 2024-05-15T01:13:24Z

Full-stack documentation: https://docs.openverse.org/_preview/4293

Please note that GitHub pages takes a little time to deploy newly pushed code, if the links above don't work or you see old versions, wait 5 minutes and try again.

You can check the GitHub pages deployment action list to see the current status of the deployments.

Changed files 🔄:

https://docs.openverse.org/_preview/4293/general/general_setup.html

dhruvkb · 2024-05-15T05:26:32Z

Should we go "all in" on the script? Or should the script refer to the documentation pages instead of putting the suggestions in-line?

I'd love to review this PR in depth soon, but I do want to quickly chime in to voice my support for keeping the docs in the docs, and for the script to print links to the docs site. For these reasons:

It'll introduce contributors to the docs site and encourage them to read and search through it for common issues. It's also good to cultivate an expectation that the docs site should have answers for commonly asked questions.
It's easier to add rich content like formatting, images, and step-by-step guides on the docs site compared to plain text content in a script. Additionally, a web page is much more capable and interactive with hyperlinks and search.

sarayourfriend · 2024-05-15T05:34:35Z

I agree with you, Dhruv.

Do you think we should have the instructions say something like:

Install git if you don't have it
Clone the repository
Run the get-started script (or whatever we end up calling it) and it will tell you what dependencies you're missing (you can read about those below)

That would change the docs from "install these dependencies" to "run this script and it will tell you what to do next". It feels like there's a bit of a structural change warranted by the introduction of this script, mainly because I'm not entirely confident when it should be brought up in the documentation.

dhruvkb · 2024-05-15T07:00:55Z

We can suggest the script at the very top of the general setup page, in a "tip"-style admonition. This lets novice contributors, who will primarily be interested in a script like this, find it immediately instead of having to read through any docs first.

I'm not a fan of the curl | bash pattern but it lets us skip the Git clone step as well.

curl https://raw.githubusercontent.com/WordPress/openverse/main/get_started.sh | bash

Then the entire rest of the page can be left as-is as reading material for slightly advanced contributors who would prefer to set things up manually with more control over their systems. Any missing deps can be listed by the script as links to the very specific sections in the file for those deps. I would love it if we could go one step further and just install the missing deps for the users too (but that can make the script extremely complicated and, since we'll need sudo, potentially dangerous too).

Oh and I think the script name get_started.sh is quite good and on point, but with _ instead of the -, similar to the load_sample_data.sh script.

Additionally, I feel like the script should ask the contributor what part of the stack they would like to work on, because some setup can be avoided, and some deps may differ, if they decide to only work on a smaller part of the stack, like the docs or the frontend (which do not need Docker) or the API or ingestion server (which do not need PDM to be present locally).
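
To make the idea concrete, here is a hedged sketch of how a per-stack dependency list might look. Only two facts in it come from this comment: docs and frontend do not need Docker, and the API and ingestion server do not need PDM locally. The rest of each list, the stack names accepted as arguments, and the function names `deps_for_stack` and `missing_for_stack` are assumptions made purely for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical mapping from a chosen part of the stack to the tools to check.
# Apart from "docs/frontend: no Docker" and "API/ingestion server: no PDM",
# the contents of these lists are illustrative assumptions.
deps_for_stack() {
  case "$1" in
    docs)                 echo "python3 pdm just" ;;
    frontend)             echo "node pnpm just" ;;
    api|ingestion_server) echo "docker just" ;;
    all)                  echo "python3 pdm just docker node pnpm" ;;
    *)                    echo "unknown stack: $1" >&2; return 1 ;;
  esac
}

# Prints one line per tool that is not on PATH for the chosen stack.
missing_for_stack() {
  local deps dep
  deps="$(deps_for_stack "$1")" || return 1
  for dep in $deps; do
    command -v "$dep" >/dev/null 2>&1 || echo "$dep"
  done
}
```

For example, `missing_for_stack frontend` would only report tools from the frontend list, so a contributor working on the frontend alone would never be told to install Docker.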

Also about linting, I think we can keep the entire thing as optional and let the CI check it for them. Linting makes a lot of dependencies like pnpm and Docker become non-optional which may be a lot to ask of a new contributor.

sarayourfriend · 2024-05-15T07:27:18Z

Sounds good on changing the script to link to the docs, it simplifies things quite a bit!

I'm not a fan of the curl | bash pattern but it lets us skip the Git clone step as well.

I wondered about this approach as well, but also am not a fan of it 🤔

Git is typically easy to install anywhere we support development 🤔. It's available by default on many Linux distros and macOS (as far as I understand it).

I would love it if we could go one step further and just install the missing deps for the users too (but that can make the script extremely complicated and, since we'll need sudo, potentially dangerous too).

FWIW, just to clarify, I'm firmly against this for a lot of reasons, even if we could work around the need for sudo. I don't think it actually creates less friction for contributors and the potential for wreaking havoc in someone's environment is too significant.

If anything, we could simplify things by ditching just, but that would be a huge change in our workflow. It's for some reason a recurring issue that people have trouble installing it on Ubuntu/Debian generally.

Installing things ourselves presents a huge list of problems, which I wrote about in the issue. I don't think it's something we should get in the mindset of trying to manage in the project unless we managed to move the entire development environment into Docker and made Docker the only dependency. That would be awesome, but would require making sure docker socket pass through was intuitive and easy for everyone to get working regardless of context.

Doing that would avoid all of this; Docker is trivially easy to install these days pretty much everywhere.

Additionally, I feel like the script should ask the contributor what part of the stack they would like to work on, because some setup can be avoided, and some deps may differ, if they decide to only work on a smaller part of the stack, like the docs or the frontend (which do not need Docker) or the API or ingestion server (which do not need PDM to be present locally).

That would be great, but maybe a fast follow? I agree it's a good feature, but more significant, and there are already some big questions that need answering in this basic version.

What do you think about this compared to making a development environment in a Docker image and Docker being the only requirement? We'd still need to write some kind of initialisation script to configure git inside the container, I think? Or could require git and docker, and everything else happens inside an openverse-dev-env Docker container?

I played around a bit with nix because it's another option for creating a zero-effort development environment... but only if nix is already available, and that's a question in itself 😅

Also about linting, I think we can keep the entire thing as optional and let the CI check it for them

The issue here is that then contributors are not running linters, and we repeatedly have to ask them to do so in PRs, or run it ourselves. This is a massive increase in the time and friction it takes for us to review PRs. I've talked about this in the discussion on this PR, #3889, but I think running the linters is a reasonable baseline expectation that you must be able to do to contribute. Same with writing unit tests when a change would require them.

We need to have some baseline expectation of being able to run development tools and follow the quick start guide. There are essentially no contributions that are less complex than that, and while it's nice to be able to accept easy one-off contributions, I question the value of them in and of themselves if we have to spend more time requesting changes or running linters on a PR ourselves than it would have taken to just implement the issue. As discussed in that PR, I don't think Openverse has the context, time, or other resources to be a significant aspect of someone's learning the basic skills of software development in Python or JavaScript. It would be great if we did, but we're a very small team working on an ambitious project, and already are bogged down for time and energy to tackle our most urgent and pressing needs.

dhruvkb · 2024-05-15T16:10:49Z

I would be all in on the devcontainers workflow, where 100% of the development happens in a container with everything pre-provisioned, but that has not worked out so reliably. I remember this issue with executables: github.com/docker/for-mac/issues/5029.

For now, I think a simple script that only checks for all our dependencies (PDM, pnpm, Docker... everything) and reports a list of missing ones with docs links will be good enough. If it can be run with curl | bash that's good because it's one less step, but if not, that's good too because any contributor will undoubtedly need to clone the repo anyways.
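
A minimal sketch of that shape, under stated assumptions: the docs URL is the quick start link printed by the script output in the PR description, while the `#tool` fragments appended to it are hypothetical anchors, not real sections of the docs site. The `report_missing` function name is also made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: check a list of tools and print a docs link for each missing one.
DOCS_URL="https://docs.openverse.org/general/quickstart.html"

report_missing() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      # The '#tool' fragment is a hypothetical anchor, not a real docs section.
      echo "- ${tool} is missing, see ${DOCS_URL}#${tool}"
      missing=$((missing + 1))
    fi
  done
  if [ "$missing" -eq 0 ]; then
    echo "All checked tools were found."
  fi
  # Exit status is zero only when nothing was missing.
  [ "$missing" -eq 0 ]
}
```

Calling it as `report_missing python3 just docker pnpm` keeps all installation guidance on the docs site and leaves the script with nothing to maintain beyond the list of tool names.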

What I mean to say is that this is a good start, and I'd be totally open to merging this PR as-is, albeit after a proper review, with further enhancements as needed based on contributor feedback.

zackkrida · 2024-05-15T16:30:50Z

I think the approach @dhruvkb just outlined (which is basically what's here so far) is ideal.

I'll plug my related issue here, which should make actually cloning the repo much more accessible if it remains a requirement for running this script: #4329

sarayourfriend · 2024-05-16T07:04:57Z

I'm going to close this for now, because I think #4343 is a better approach that fixes all the problems we wanted to solve with this script, without introducing a bunch of complex caveats about how to install certain dependencies on systems where it's a pain to do so for one reason or another.

If we can just entirely obviate the need to think about any of this for a normal contributor, and I believe we can by using just git, bash, and docker, then I'd prefer that to trying to sort out the complexity of instructing contributors to use this script (or not) and the documentation to support it.

#4343 would also be something that all of us could use all the time when working on Openverse, and so we'd be confident that it would continue to work and remain true to the needs of the project. A get-started script like the one in this PR would essentially never be used by us directly, and probably go out of sync with the project's needs.

Rough draft of contributor help script

097795b

github-actions bot added 🏷 status: label work required Needs proper labelling before it can be worked on 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work labels May 9, 2024

openverse-bot added 🟩 priority: low Low priority and doesn't need to be rushed 🌟 goal: addition Addition of new feature 🤖 aspect: dx Concerns developers' experience with the codebase labels May 9, 2024

obulat added 🧱 stack: documentation Related to Sphinx documentation and removed 🏷 status: label work required Needs proper labelling before it can be worked on 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work labels May 9, 2024

sarayourfriend added 3 commits May 14, 2024 18:41

Fix message indentation and add temporary testing dockerfile

9d09ec1

Clean up script output and expand testing dockerfile

d64d9bb

Add new script to CODEOWNERS

10775e0

sarayourfriend marked this pull request as ready for review May 15, 2024 00:55

sarayourfriend requested review from a team as code owners May 15, 2024 00:55

sarayourfriend requested review from zackkrida and krysal May 15, 2024 00:55

Add note in documentation about get-started script

19d6d72

sarayourfriend mentioned this pull request May 16, 2024

Dockerfy the Openverse development environment #4343

Merged

7 tasks

sarayourfriend closed this May 16, 2024

sarayourfriend deleted the add/openverse-setup-script branch May 16, 2024 07:05
