updating CONTRIBUTING.md for jazzband #501

Merged
merged 2 commits into from
Mar 10, 2024
119 changes: 4 additions & 115 deletions CONTRIBUTING.md
@@ -1,116 +1,5 @@
Any and all contributions are welcome and appreciated. To make it easy
to keep things organized, this project uses the
[general guidelines](https://guides.github.com/introduction/flow/)
for the fork-branch-pull request model for github. Briefly, this means:
[![Jazzband](https://jazzband.co/static/img/jazzband.svg)](https://jazzband.co/)

1. Make sure your fork's `master` branch is up to date:

        git remote add deanmalmgren https://github.com/deanmalmgren/textract.git
        git checkout master
        git pull deanmalmgren master

2. Start a feature branch with a descriptive name about what you're
trying to accomplish:

        git checkout -b csv-support

3. Make commits to this feature branch (`csv-support`, in this case)
in a way that other people can understand with good commit messages
to explain the changes you've made:

        emacs textract/parsers/csv_parser.py
        git add textract/parsers/csv_parser.py
        git commit -m 'added csv_parser'

4. If an issue already exists for the code you're contributing, use
[issue2pr](http://issue2pr.herokuapp.com/) to attach your code to
that issue:

        git push origin csv-support
        chrome http://issue2pr.herokuapp.com
        # enter the issue URL, HEAD=yourusername:csv-support, Base=master

    If the issue doesn't already exist, just send a pull
    request in the usual way:

        git push origin csv-support
        chrome http://github.com/deanmalmgren/textract/compare


Common contributions: support for new file type
-----------------------------------------------

This project has really taken off, much more so than I would have
thought (thanks everybody!). To help new contributors, I thought I'd
jot down some notes for one of the more common contributions---how to
add support for a hitherto unsupported file type `.abc123`:

* write a `Parser` class in `textract/parsers/abc123_parser.py` that
inherits from `textract.parsers.utils.BaseParser` or
`textract.parsers.utils.ShellParser` and implements the
`extract(self, filename, **kwargs)` method.

* add a test file in `tests/abc123/raw_text.abc123`, run textract on
it like this:

```shell
textract tests/abc123/raw_text.abc123 > tests/abc123/raw_text.txt
```

and add the basic test suite by creating
a file called `tests/test_abc123.py` with content that looks
something like this:

```python
# tests/test_abc123.py
import unittest

import base


class Abc123TestCase(unittest.TestCase, base.BaseParserTestCase):
    extension = 'abc123'
```

now you should be able to run tests on your parser with `nosetests
tests/test_abc123.py` or the tests for every parser with `nosetests`.

* if your package relies on any external sources, be sure to add them
in either `requirements/python` (for python packages) or
`requirements/debian` (for debian packages) and update the
installation documentation accordingly in `docs/installation.rst`.

* add documentation about the awesome new file format that is being
supported in `docs/index.rst` and be sure to give yourself a pat on
the back by updating the changelog in `docs/changelog.rst`

* finally, make sure the entire test suite passes by running
`./tests/run.py` and fix any lingering problems (usually PEP-8
nonsense).
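
As a rough illustration of the first step above, a parser for the
made-up `.abc123` format might look something like the sketch below. It
assumes `.abc123` files are plain text, which is purely hypothetical.
The `BaseParser` class defined here is only a stand-in so the example
runs on its own; in the real package you would import it from
`textract.parsers.utils` instead. The `demo` helper is also just for
illustration and is not part of textract.

```python
# Hypothetical sketch of textract/parsers/abc123_parser.py.
# In the real package you would import the base class instead:
#     from .utils import BaseParser
# The stand-in below only exists so this example runs on its own.
import os
import tempfile


class BaseParser(object):
    """Stand-in for textract.parsers.utils.BaseParser (assumption)."""

    def extract(self, filename, **kwargs):
        raise NotImplementedError('must be overwritten by child classes')


class Parser(BaseParser):
    """Extract text from a made-up .abc123 file, treated as plain text."""

    def extract(self, filename, **kwargs):
        # Read the whole file and hand the text back to the caller.
        with open(filename) as stream:
            return stream.read()


def demo():
    """Write a temporary .abc123 file and run the parser on it."""
    with tempfile.NamedTemporaryFile(
            'w', suffix='.abc123', delete=False) as tmp:
        tmp.write('hello abc123')
    try:
        return Parser().extract(tmp.name)
    finally:
        # Clean up the temporary file whether or not parsing worked.
        os.remove(tmp.name)
```

A parser for a real format would replace the body of `extract` with
whatever is needed to pull text out of that format, for example by
calling a third-party library or a command line tool.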


Style guidelines
----------------

As a general rule of thumb, the goal of this package is to be as
readable as possible to make it easy for novices and experts alike to
contribute to the source code in meaningful ways. Pull requests that
favor cleverness or optimization over readability are less likely to be
incorporated.

To make this notion of "readability" more concrete, here are a few
stylistic guidelines that are inspired by other projects and we
generally recommend:

- write functions and methods that can `fit on a screen or two of a
standard
terminal <https://www.kernel.org/doc/Documentation/CodingStyle>`_
--- no more than approximately 40 lines.

- unless it makes code less readable, adhere to `PEP
8 <http://legacy.python.org/dev/peps/pep-0008/>`_ style
recommendations --- use an appropriate amount of whitespace.

- `code comments should be about *what* is being done, not *how* it is
being done <https://www.kernel.org/doc/Documentation/CodingStyle>`_
--- that should be self-evident from the code itself.
This is a [Jazzband](https://jazzband.co/) project. By contributing you agree to
abide by the [Contributor Code of Conduct](https://jazzband.co/about/conduct)
and follow the [guidelines](https://jazzband.co/about/guidelines).
7 changes: 7 additions & 0 deletions README.rst
@@ -12,9 +12,16 @@ Extract text from any document. No muss. No fuss.

`Full documentation <http://textract.readthedocs.org>`__.

Originally written by @deanmalmgren. Maintained by the good people at
@jazzband |Jazz Band|

|Build Status| |Version| |Downloads| |Test Coverage| |Documentation Status|
|Updates| |Stars| |Forks|

.. |Jazz Band| image:: https://jazzband.co/static/img/badge.svg
   :target: https://jazzband.co/
   :alt: Jazzband

.. |Build Status| image:: https://travis-ci.org/deanmalmgren/textract.svg?branch=master
   :target: https://travis-ci.org/deanmalmgren/textract
