[dec 12 merge] fix: add cross links to the packaging guide #431

Merged on Dec 12, 2024 (11 commits)
30 changes: 23 additions & 7 deletions package-structure-code/intro.md
Original file line number Diff line number Diff line change
@@ -1,13 +1,29 @@
# Python package structure information
# Python Package Structure

This section provides guidance on your Python package's structure, code formats
and style. It also reviews the various packaging tools that you can use to
support building and publishing your package.
This section provides guidance on your Python package's structure, code format,
and style. It also reviews the various [packaging tools](python-package-build-tools) you can use to
[build](python-package-distribution-files-sdist-wheel) and [publish](publish-python-package-pypi-conda) your Python package.

If you want end-to-end tutorials, check out our tutorial series that starts by introducing [what a Python package is](what-is-a-package).

If you are confused by Python packaging, you are not alone! The good news is
there are some great modern packaging tools that ensure that you're following
best practices. Here, we review tool features and suggest tools that might be
best fitted for your workflow.
that some great modern packaging tools ensure you follow
best practices. Here, we review tool features and suggest tools you can use
for your Python packaging workflow.

:::{button-link} /tutorials/intro
:color: success
:class: sd-rounded-pill float-left

Check out our beginning-to-end tutorials on creating a Python package

:::


:::{admonition} How this content is developed
All of the content in this guide has been vetted by community members, including maintainers and developers of the core packaging tools.
:::


:::::{grid} 1 1 2 2
:class-container: text-center
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -152,7 +152,7 @@ Also note that we are not discussing conda build workflows in this section.
```

(python-source-distribution)=
## Source Distribution (sdist)
## What is a source distribution (sdist)

**Source files** are the unbuilt files needed to build your
package. These are the "raw / as-is" files that you store on GitHub or whatever
Expand Down Expand Up @@ -226,7 +226,7 @@ the sdist may also contain a file that stores the version.
```

(python-wheel)=
## Wheel (.whl files):
## What is a Python wheel (whl):

A wheel file is a ZIP-format archive whose filename follows a specific format
(below) and has the extension `.whl`. The `.whl` archive contains a specific
Expand Down
116 changes: 68 additions & 48 deletions package-structure-code/python-package-structure.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,25 +2,41 @@

There are two different layouts that you will commonly see
within the Python packaging ecosystem:
[src and flat layouts.](https://packaging.python.org/en/latest/discussions/src-layout-vs-flat-layout/)
src and flat layouts.
Both layouts have advantages for different groups of maintainers.

We strongly suggest, but do not require, that you use the **src/** layout (discussed below)
for creating your Python package. This layout is also recommended in the
[PyPA packaging guide](https://packaging.python.org/en/latest/tutorials/packaging-projects/).
[PyPA packaging guide tutorial](https://packaging.python.org/en/latest/tutorials/packaging-projects/).

```{admonition} pyOpenSci will never require a specific package structure for peer review
:class: important

We understand that it would be tremendous effort for existing
We understand that it would take significant effort for existing
maintainers to move to a new layout.

The overview on this page presents recommendations that we think are best for
someone getting started with Python packaging or someone whose package
has a simple build and might be open to moving to a more fail-proof approach.

Other resources you can check out:
* [PyPA's overview of src vs flat layouts](https://packaging.python.org/en/latest/discussions/src-layout-vs-flat-layout/)
```

An example of the **src/package** layout structure can be seen below.
You can use tools like Hatch to quickly create a modern Python package structure. Check out our quickstart tutorial:


:::{button-link} /tutorials/installable-code.html#step-1-set-up-the-package-directory-structure
:color: success
:class: sd-rounded-pill float-left

Want to learn how to create the structure to build your package? Click here.

:::

## What is the Python package source layout?

An example of the **src/package** layout structure is below.

```
myPackageRepoName
Expand All @@ -44,8 +60,8 @@ myPackageRepoName

Note the location of the following directories in the example above:

- **docs/:** discussed in our docs chapter, this directory contains your user-facing documentation website. In a **src/** layout docs/ are normally included at the same directory level of the **src/** folder.
- **tests/** this directory contains the tests for your project code. In a **src/** layout tests are normally included at the same directory level of the **src/** folder.
- **docs/:** Discussed in our docs chapter, this directory contains your user-facing documentation website. In a **src/** layout docs/ are normally included at the same directory level as the **src/** folder.
- **tests/:** This directory contains the tests for your project code. In a **src/** layout, tests are normally included at the same directory level as the **src/** folder.
Review comment (Collaborator): You capitalized this bullet, but no others. Not sure what your preference is.

Reply (Member Author): This happens because I never know when to capitalize and when not to capitalize! Let's capitalize?

- **src/package/**: this is the directory that contains the code for your Python project. "Package" is normally your project's name.

Also in the above example, notice that all of the core documentation files that
Expand All @@ -61,14 +77,12 @@ include:
<!-- TODO: CHANGELOG is not mentioned in either documentation nor peer review -->

```{button-link} https://www.pyopensci.org/python-package-guide/documentation
:color: primary
:color: info
:class: sd-rounded-pill

Click here to read about our packaging documentation requirements.
```

While we recommend the **src/** layout we also review the **flat** layout here. Both are used in the Python ecosystem.

```{admonition} Example scientific packages that use **src/package** layout

* [Sourmash](https://github.com/sourmash-bio/sourmash)
Expand All @@ -81,14 +95,13 @@ While we recommend the **src/** layout we also review the **flat** layout here.

## The src/ layout and testing

The benefit of using the **src/package** layout, particularly if you
are creating a new package, is that it ensures tests are run against the
The benefit of using the **src/package** layout is that it ensures tests are run against the
installed version of your package rather than the files in your package
working directory. If you run your tests on your files rather than the
installed version, you may be missing issues that users encounter when
installed version of your package, you may be missing issues that users encounter when
your package is installed.

If `tests/` are outside of the **src/package** directory, they aren't included in the package wheel. This makes your package size slightly smaller which then places places a smaller storage burden on PyPI which has over 400,000 packages to support.
If `tests/` are outside the **src/package** directory, they aren't included in the package's [wheel](python-wheel). This makes your package slightly smaller, which places a smaller storage burden on PyPI and makes the package faster to fetch.

- [Read more about reasons to use the **src/package** layout](https://hynek.me/articles/testing-packaging/)

Expand All @@ -98,14 +111,14 @@ By default, Python adds a module in your current working directory to the front

This means that if you run your tests in your package's working directory using a flat layout (`/package/module.py`), Python will discover the local `package/module.py` file before it discovers the installed package.

However, if your package lives in a src/ directory structure **src/package** then it won't be, by default, added to the Python path. This means that when you import your package, Python will be forced to search the active environment (which has your package installed).
However, if your package lives in a src/ directory structure **src/package**, then it won't be added to the Python path by default. This means that when you import your package, Python will be forced to search the active environment (which has your package installed).

Note: Python versions 3.11 and above have a path setting that can be adjusted to ensure the priority is to use installed packages first (e.g. `PYTHONSAFEPATH`).
Note: Python versions 3.11 and above have a path setting that can be adjusted to ensure the priority is to use installed packages first (e.g., `PYTHONSAFEPATH`).
```
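
The import-order behavior described in the note above can be checked directly. The following is a minimal, standard-library-only sketch, not part of the guide itself; the helper names `import_origin` and `is_loaded_from` are hypothetical and chosen only for illustration. It reports which file Python would load for a given module name, and whether that file sits under a given directory (for example, your repository root).

```python
import importlib.util
import sys
from pathlib import Path
from typing import Optional


def import_origin(module_name: str) -> Optional[Path]:
    """Return the file Python would load for `module_name`, or None."""
    spec = importlib.util.find_spec(module_name)
    # Built-in, frozen, and namespace packages have no single file on disk.
    if spec is None or not spec.has_location or spec.origin is None:
        return None
    return Path(spec.origin).resolve()


def is_loaded_from(module_name: str, directory: str) -> bool:
    """Check whether `module_name` resolves to a file under `directory`."""
    origin = import_origin(module_name)
    return origin is not None and Path(directory).resolve() in origin.parents


if __name__ == "__main__":
    # sys.flags.safe_path exists on Python 3.11+; fall back to False on older versions.
    print("safe path enabled:", getattr(sys.flags, "safe_path", False))
    print("first sys.path entry:", sys.path[0] if sys.path else None)
    print("json is loaded from:", import_origin("json"))
```

As a rough guide, calling `is_loaded_from("package", ".")` from the root of a flat-layout repository would be expected to return `True`, because the local directory shadows the installed copy. With a **src/package** layout and a regular (non-editable) install, it would be expected to return `False`. Editable installs point back at your source tree, so results may differ there.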

### Sometimes tests are needed in a distribution
### Don't include tests in your package wheel

We do not recommend including tests as part of your package wheel by default. However, not including tests in your package distribution will make it harder for people other than yourself to test whether your package is functioning correctly on their system. If you have a small test suite (Python files + data), and think your users may want to run tests locally on their systems, you can include tests by moving the `tests/` directory into the **src/package** directory (see example below).
Writing [tests](tests-intro) for your package is important; however, we do not recommend including tests as part of your [package wheel](python-wheel) by default. That said, leaving tests out of your package distribution makes it harder for people other than yourself to test whether your package runs properly on their system. If you have a small test suite (Python files + data) and think your users may want to run tests locally on their systems, you can include tests by moving the `tests/` directory into the **src/package** directory (see example below).

```bash
src/
Expand All @@ -114,69 +127,76 @@ src/
docs/
```

Including the **tests/** directory in your **src/package** directory ensures that tests will be included in your package's wheel.
Including the **tests/** directory in your **src/package** directory ensures that tests will be included in your package's [wheel](python-wheel).

Be sure to read the [pytest documentation for more about including tests in your package distribution](https://docs.pytest.org/en/7.2.x/explanation/goodpractices.html#choosing-a-test-layout-import-rules).
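
Because a wheel is a ZIP-format archive, you can check whether your tests actually ended up inside it. The sketch below is one possible way to do that using only the standard library; the function name `wheel_test_files` and the wheel filename in the usage note are hypothetical.

```python
import zipfile
from pathlib import PurePosixPath


def wheel_test_files(wheel_path):
    """List files inside the wheel that live under a `tests/` directory."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return [
            name
            for name in wheel.namelist()
            # ZIP member names use forward slashes; check the parent directories only.
            if "tests" in PurePosixPath(name).parts[:-1]
        ]
```

For example, after building your package you might call `wheel_test_files("dist/mypackage-0.1.0-py3-none-any.whl")`. An empty list suggests the tests were left out of the wheel; a non-empty list shows which test files were packaged.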

```{admonition} Challenges with including tests and data in a package wheel
:class: tip

Tests, especially when accompanied by test data can create a few small challenges including:
Tests, especially when accompanied by test data, can create a few small challenges, including:

- Take up space in your distribution which will build up over time as storage space on PyPI
- Large file sizes can also slow down package install.
- Tests and test data take up space in your distribution, which adds up over time as storage space on PyPI.
- Large file sizes can also slow down package installation.

However, in some cases, particularly in the scientific Python ecosystems you may need to include tests.
However, in some cases, particularly in the scientific Python ecosystem, you may need to include tests.
```

### **Don't include test suite datasets in your package**

If you do include your tests in your package distribution, we strongly
If you include your tests in your package distribution, we strongly
discourage you from including data in your test suite directory. Rather,
host your test data in a repository such as Figshare or Zenodo. Use a
tool such as [Pooch](https://www.fatiando.org/pooch/latest/) to access
the data when you (or a user) run tests.

Check out the testing section of our guide for more information about tests.
For more information about Python package tests, see the [tests section of our guide](tests-intro).

- The **src/package** layout is semantically clearer: code is always found in the
**src/package** directory, while `tests/` and `docs/` are in the root directory.

```{important}
If your package tests require data, we suggest that you do NOT include that
data within your package structure. We will discuss this in more detail in a
tutorial. Include data in your package structure increases the size of your
If your package tests require data, do NOT include that
data within your package structure. Including data in your package structure increases the size of your
distribution files. This places a maintenance toll on repositories like PyPI and
Anaconda.org that have to deal with thousands of package uploads.
```

## About the flat Python package layout

Currently most scientific packages use the **flat-layout** given:
:::{button-link} /tutorials/installable-code.html#step-1-set-up-the-package-directory-structure
:color: success
:class: sd-rounded-pill float-left

- It's the most commonly found layout with the scientific Python ecosystem and
people tend to look to other packages / maintainers that they respect for examples
of how to build Python packages.
- Many Python tools depend upon tools in other language and / or complex builds
with compilation steps. Many developers thus appreciate / are used to features
of the flat layout.

While we present this layout here in our guide, we suggest that those just
getting started with python packaging start with the src/package layout
discussed above. Numerous packages in the ecosystem [have had to move to a
src/ layout](https://github.com/scikit-build/cmake-python-distributions/pull/145)
Click here for a quickstart tutorial on creating your Python package.

:::

(flat-layout)=
## What is the flat Python package layout?

Many scientific packages use the **flat-layout** given:

- This layout is used by many core scientific Python packages such as NumPy, SciPy, and Matplotlib.
- Many Python tools depend upon tools in other languages and/or complex builds
with compilation steps. Many maintainers prefer features
of the flat layout for more complex builds.

While we suggest that you use the **src/package** layout discussed above, it's important to also
understand the flat layout, especially if you plan to contribute to a package that uses this layout.


```{admonition} Why most scientific Python packages do not use source
:class: tip

In most cases the advantages of using the **src/package** layout for
larger scientific packages that already use flat approach are not worth it.
Moving from a flat layout to a **src/package** layout would come at a significant cost to
maintainers.
In most cases, moving to the **src/package** layout for
larger scientific packages that already use a flat layout would consume significant time.

However, the advantages of using the **src/package** layout for a beginner are significant.
As such, we recommend that if you are getting started with creating a package,
that you consider using a **src/package** layout.
As such, we recommend that you use the **src/package** layout if you are creating a new package.

Numerous packages in the ecosystem [have had to move to a
**src/package** layout](https://github.com/scikit-build/cmake-python-distributions/pull/145).
```

## What does the flat layout structure look like?
Expand Down Expand Up @@ -234,7 +254,7 @@ There are some benefits to the scientific community in using the flat layout.

It would be a significant maintenance cost and burden to move all of these
packages to a different layout. The potential benefits of the source layout
for these tools is not worth the maintenance investment.
for these tools are not worth the maintenance investment.
```

<!--
Expand All @@ -243,12 +263,12 @@ Not sure where to put this now ... most new users won't have multiple packages.
```{admonition} Multiple packages in a src/ folder
:class: tip

In some more advanced cases you may have more than one package in your src/ directory. See [black's GitHub repo](https://github.com/psf/black/tree/main/src) for an example of this. However, for most beginners you will likely only have one sub-directory in your **src/** folder.
In some more advanced cases, you may have more than one package in your **src/** directory. See [Black's GitHub repo](https://github.com/psf/black/tree/main/src) for an example of this. However, for most beginners you will likely only have one sub-directory in your **src/** folder.
``` -->

<!--
```{admonition} A few notes about the src/ layout
:class: tip

It is important to note here that sometimes when using the src/package structure the directory name (e.g. package name) is different from the actual project or package name. What is important to take away here is that you should store your code within a sub directory within **src/**.
It is important to note here that sometimes, when using the **src/package** structure, the directory name (e.g., package name) is different from the actual project or package name. What is important to take away here is that you should store your code within a subdirectory within **src/**.
``` -->
5 changes: 3 additions & 2 deletions tests/index.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,12 @@
(tests-intro)=
# Tests and data for your Python package

Tests are an important part of your Python package because they
provide a set of checks that ensure that your package is
functioning how you expect it to.

In this section you will learn more about the importance of writing
tests for your Python package and how you can setup infrastructure
In this section, you will learn more about the importance of writing
tests for your Python package and how you can set up infrastructure
to run your tests both locally and on GitHub.


Expand Down
2 changes: 2 additions & 0 deletions tutorials/intro.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
(packaging-101)=
# Python packaging 101

_A start to finish beginner-friendly tutorial_
Expand Down Expand Up @@ -75,6 +76,7 @@ After reading this lesson you will:

:::

(what-is-a-package)=
## What is a Python package?

At a high level, you can think about a Python package as a toolbox
Expand Down