Skip to content

Latest commit

 

History

History
190 lines (137 loc) · 6.68 KB

CONTRIBUTING.rst

File metadata and controls

190 lines (137 loc) · 6.68 KB

Contributing to skrub

First off, thanks for taking the time to contribute!

The following is a set of guidelines for contributing to skrub.



We use GitHub Discussions for general chat and Q&As. Check it out!

If you want to understand in more depth the incentives behind skrub, read our vision statement! If scientific literature doesn’t scare you, we greatly encourage you to read the two following papers:

  • `Similarity encoding for learning

with dirty categorical variables <https://hal.inria.fr/hal-01806175>`__ - Encoding high-cardinality string categorical variables.

Even if we unit-test our code, using the library is the best way to discover new bugs and limitations.

If you find one, please check if a similar or identical issue already exists.

  • If yes:
    • The issue is still open: leave the emote 👍 on the original message, which will let us know there are several users affected by this issue.
    • The issue has been closed:
      • It has been closed by a merged pull request (1) update your skrub version, or (2) the fix has not been released in a version yet.
      • Otherwise, there might be a wontfix label, and / or a reason at the bottom of the conversation.
  • If not, file a new issue.

To solve your issue, first explain the problem and include additional details to help maintainers easily reproduce the problem:

  • Use a clear and descriptive title which identifies the problem.
  • Describe the result you expected.
  • Add additional details to your description problem such as situations where the bug should have appeared but didn’t.
  • Include a snippet of code that reproduces the error, as it allows maintainers to reproduce it in a matter of seconds!
  • Specify versions of Python, skrub, and other dependencies which might be linked to the issue (e.g., scikit-learn, numpy, pandas, etc.).

Some of these bullet points might not apply depending on the kind of error you’re submitting.

This section will guide you through submitting a new enhancement for skrub, whether it is a small fix or a new feature.

First, you should check if the feature has not already been proposed or implemented.

If not, before writing any code, submit a new issue proposing the change.

  • Use a clear and descriptive title.
  • Provide a quick explanation of the goal of this enhancement.
  • Provide a step-by-step description of the suggested enhancement with as many details as possible.
  • If it exists elsewhere, link resources.

Of course, some of these bullet points might not apply depending on the kind of enhancement you’re submitting.

Let maintainers know whether :

There are specific incentives behind skrub. While most enhancement ideas are good, they don’t always fit in the context of the library.

If you’d like to implement your idea regardless, we’d be very glad if you create a new package that builds on top of skrub! In some cases, we might even feature it on the official repository!

If not already done, first create an issue, and discuss the changes with the project’s maintainers.

See in the sections above for the right way to do this.

For this process, please refer to the installation process, section Advanced usage, for contributors.

There are a few specific project goals to keep in mind:

  • Pure Python code - no binary extensions, Cython, etc.
  • Make production-friendly code.
    • Try to target the broadest range of versions (Python and dependencies).
    • Use the least amount of dependencies.
    • Make code as backward compatible as possible.
  • Prefer performance to readability.
  • Use explicit, borderline verbose variables / function names
  • Public functions / methods / variables / class signatures should be documented and type-hinted.
    • The public API describes the components users of the library will import and use. It’s everything that can be imported and does not start with an underscore.

First, you’ll want to fork skrub on Github.

That will enable you to push your commits to a branch on your fork. It is advised to create a new branch every time you work on a new issue, to avoid confusion. Use the following command to create a branch:

git checkout -b branch_name

Next, you can use the Github “Compare & pull request” button to submit your branch code as a PR.

Community consensus is key in the integration process. Expect a minimum of 1 to 3 reviews depending on the size of the change before we consider merging the PR.

Once again, remember that maintainers are volunteers and therefore cannot guarantee how much time it will take to review the changes.