This repository contains OpenNMT-related utilities used in the RXN universe.
For general utilities not related to OpenNMT, see our other repository rxn-utilities.
This package is supported on all operating systems. It has been tested on the following systems:
- macOS: Big Sur (11.1)
- Linux: Ubuntu 18.04.4
Python 3.6, 3.7, or 3.8 is recommended. Python 3.9 and above are not expected to work because of compatibility constraints of the selected version of OpenNMT.
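To make the version requirement concrete, here is a minimal sketch of a check you could run before installing. The helper name is_supported_python is hypothetical and not part of this package; the version bounds are the ones stated above.

```python
import sys

# Recommended range taken from the text above: Python 3.6, 3.7, or 3.8.
MIN_SUPPORTED = (3, 6)
MAX_SUPPORTED = (3, 8)


def is_supported_python(version=None):
    """Return True if the (major, minor) version lies in the recommended range.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED
```

Calling is_supported_python() with no argument checks the interpreter you are currently running.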
The package can be installed from PyPI:
pip install rxn-onmt-utils
For local development, the package can be installed with:
pip install -e ".[dev]"
By importing the following,
from rxn.onmt_utils import Translator, translate
you can do OpenNMT translations directly in Python.
The translate function acts on input/output files, while the Translator class gives you the flexibility to get translation results for strings directly.
The script rxn-strip-opennmt-model, installed with the package, lets you reduce the size of model checkpoints by about 2/3 by removing the state of the optimizer.
You can do this safely if you don't need to continue training on these checkpoints.
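The following toy sketch illustrates why dropping the optimizer state can save roughly two thirds of a checkpoint. It is not the implementation of the script: it uses only the standard library instead of PyTorch, the key names "model" and "optim" are assumptions, and it assumes an Adam-style optimizer that keeps two extra values per model weight.

```python
import pickle
from array import array


def strip_optimizer_state(checkpoint):
    """Return a shallow copy of the checkpoint with the optimizer state dropped."""
    stripped = dict(checkpoint)
    stripped["optim"] = None  # "optim" is an assumed key name
    return stripped


# Toy checkpoint: an Adam-style optimizer (assumption) stores two moment
# estimates per weight, so its state is about twice the size of the model.
n_weights = 10_000
checkpoint = {
    "model": array("f", [0.0] * n_weights),
    "optim": {
        "exp_avg": array("f", [0.0] * n_weights),
        "exp_avg_sq": array("f", [0.0] * n_weights),
    },
}

full_size = len(pickle.dumps(checkpoint))
stripped_size = len(pickle.dumps(strip_optimizer_state(checkpoint)))

# Fraction of the serialized size that came from the optimizer state.
saved_fraction = 1 - stripped_size / full_size
```

The stripped copy still contains the model weights, which is all that is needed for inference; resuming training from it would restart the optimizer from scratch.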
If you fine-tune a model on a dataset that contains tokens missing from the base model's vocabulary, you will need to extend the model weights.
The script rxn-extend-model-with-vocab does that for you.
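As a conceptual illustration of what extending the weights means, the sketch below appends freshly initialized rows to a toy embedding table for tokens that the base vocabulary lacks. This is not the script's implementation: extend_embeddings is a hypothetical helper, the table is a plain list of lists rather than a tensor, and the initialization scheme (small Gaussian noise) is an assumption.

```python
import random


def extend_embeddings(weights, old_vocab, new_vocab, seed=0):
    """Return (extended_weights, extended_vocab).

    Existing rows are copied unchanged. Each token of new_vocab that is
    missing from old_vocab gets one new row appended at the end.
    """
    rng = random.Random(seed)
    dim = len(weights[0])
    extended_vocab = list(old_vocab)
    extended_weights = [list(row) for row in weights]
    known = set(old_vocab)
    for token in new_vocab:
        if token not in known:
            known.add(token)
            extended_vocab.append(token)
            # Assumed initialization: small random values for the new token.
            extended_weights.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
    return extended_weights, extended_vocab


old_vocab = ["<unk>", "C", "O", "N"]
weights = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
new_weights, new_vocab = extend_embeddings(weights, old_vocab, ["C", "Cl", "Br"])
```

Keeping the original rows at their original indices means the base model's learned token representations stay valid; only the appended rows need to be learned during fine-tuning. In a real model, every layer whose size depends on the vocabulary would likely need the same treatment.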