First, please see contribution-guide.org for the steps we expect from contributors before submitting an issue or bug report. Be as concrete as possible, and include relevant logs, package versions, and so on.
Also, please check the Gensim FAQ page before posting.
The proper place for open-ended questions is the Gensim mailing list. GitHub is not the right place for research discussions or feature requests.
- Fork the Gensim repository
- Clone your fork:
git clone https://github.com/<YOUR_GITHUB_USERNAME>/gensim.git
- Create a new branch based on develop:
git checkout -b my-feature develop
- Set up your Python environment
- Create a new virtual environment:
pip install virtualenv; virtualenv gensim_env
and activate it:
- For linux:
source gensim_env/bin/activate
- For windows:
gensim_env\Scripts\activate
- Install Gensim and its test dependencies in editable mode:
- For linux:
pip install -e .[test]
- For windows:
pip install -e .[test-win]
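The choice between the two install commands above can be sketched as a small shell helper. This is only an illustration, not part of Gensim: the function name `gensim_test_extra` is hypothetical, and the mapping assumes that a Unix-like shell on Windows (Git Bash, MSYS2 or Cygwin) reports a `uname -s` value starting with MINGW, MSYS or CYGWIN.

```shell
# Hypothetical helper (not part of Gensim): print the name of the pip
# "extra" to install, given an OS name as reported by `uname -s`.
# With no argument, the current system's name is used.
gensim_test_extra() {
    case "${1:-$(uname -s)}" in
        MINGW*|MSYS*|CYGWIN*) echo "test-win" ;;  # Unix-like shells on Windows
        *)                    echo "test" ;;      # Linux and everything else
    esac
}

# Example use, from the root of your clone with the virtual environment active:
#   pip install -e ".[$(gensim_test_extra)]"
```

Quoting the argument to pip keeps shells such as zsh from treating the square brackets as a glob pattern.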
- Implement your changes
- Check that everything's OK in your branch:
- Check it for PEP8:
flake8 --ignore E12,W503 --max-line-length 120 --show-source gensim
- Build its documentation (works only for macOS/Linux):
make -C docs/src html
(documentation stored in docs/src/_build)
- Run unit tests:
pytest -v gensim/test
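The three checks above can be run in one go, stopping at the first failure. The following is a minimal sketch under stated assumptions: `run_steps` and `check_branch` are hypothetical names, not Gensim tooling, and the commands are exactly the ones listed above, run from the root of your clone.

```shell
# Hypothetical helper: run each command line in order and stop at the
# first one that fails, reporting which step it was.
run_steps() {
    for step in "$@"; do
        echo ">>> $step"
        sh -c "$step" || { echo "FAILED: $step" >&2; return 1; }
    done
    echo "All checks passed"
}

# The checks from this guide, in the order listed. The documentation
# build step works only on macOS/Linux. Defined here but not invoked.
check_branch() {
    run_steps \
        "flake8 --ignore E12,W503 --max-line-length 120 --show-source gensim" \
        "make -C docs/src html" \
        "pytest -v gensim/test"
}
```

Calling `check_branch` before pushing gives a single pass/fail answer. Running the quick style check first means a PEP8 problem is reported before the slower documentation build and test run start.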
- Add files, commit and push:
git add ... ; git commit -m "my commit message"; git push origin my-feature
- Create a PR on GitHub. Write a clear description for your PR, including all the context and relevant information, such as:
- The issue that you fixed, e.g.
Fixes #123
- Motivation: why did you create this PR? What functionality did you set out to improve? What was the problem, and how did you fix it? Whom does it affect, and how should people use it?
- Any other useful information: links to other related GitHub or mailing list issues and discussions, benchmark graphs, academic papers…
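The points above can be turned into a starting skeleton for the PR description. This is a sketch only: the helper name `pr_description` and the exact layout are assumptions for illustration, not a template that Gensim requires.

```shell
# Hypothetical helper (not part of Gensim): print a skeleton for a PR
# description. The argument is the number of the issue the PR fixes.
pr_description() {
    cat <<EOF
Fixes #$1

Motivation:
(why you created this PR, what it improves, whom it affects and how to use it)

Other useful information:
(links to related issues and discussions, benchmark graphs, academic papers)
EOF
}

# Example: pr_description 123
```

Replace the text in parentheses with your own content before submitting.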
P.S. for developers: see our Developer Page for details on the Gensim code style, CI, testing and similar.
Thanks and let's improve the open source world together!