Download, parse, and convert raw Wikimedia data into standard formats.

chartbeat-labs/kwnlp-preprocessor

 
 

Kensho Wikimedia for Natural Language Processing - Preprocessor

kwnlp_preprocessor is a Python package to help you convert raw Wikimedia data to standard formats.

Quick Install (Requires Python >= 3.6)

# Install the pre-commit setup (linters in our case)
pip install pre-commit
pre-commit install

pip install .  # This package is not on PyPI yet
# or "pip install -e ." to install in editable mode
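
Since the install requires Python >= 3.6, it can help to check the interpreter before running pip. The snippet below is a minimal sketch of such a check. The helper name `python_is_supported` and the constant `MIN_PYTHON` are made up for illustration and are not part of kwnlp_preprocessor.

```python
import sys

# Minimum version taken from the "Requires Python >= 3.6" note in this README.
# MIN_PYTHON and python_is_supported are hypothetical names, not package API.
MIN_PYTHON = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if python_is_supported():
        print("Python version OK; proceed with: pip install .")
    else:
        sys.exit("Python >= 3.6 is required")
```

Run it with the same interpreter you plan to use for `pip install .`, so that the check reflects the environment the package will land in.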

Status

This code is not battle-tested production code. It is used mostly by the R&D team to prototype new ideas with Wikimedia data.
