rxn4chemistry/rxn-onmt-utils

RXN package for OpenNMT-related utilities

This repository contains OpenNMT-related utilities used in the RXN universe. For general utilities not related to OpenNMT, see our other repository rxn-utilities.

Links:

System Requirements

This package is supported on all operating systems. It has been tested on the following systems:

  • macOS: Big Sur (11.1)
  • Linux: Ubuntu 18.04.4

Python 3.6, 3.7, or 3.8 is recommended. Python 3.9 and above are not expected to work because they are incompatible with the selected version of OpenNMT.
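As a minimal sketch, the recommended range can be checked before installing. The helper name `is_recommended_python` is hypothetical and not part of this package; the bounds simply restate the recommendation above.

```python
import sys

# Bounds taken from the recommendation above: Python 3.6, 3.7, or 3.8.
MIN_RECOMMENDED = (3, 6)
MAX_RECOMMENDED = (3, 8)


def is_recommended_python(version=None):
    """Return True if (major, minor) lies within the recommended range.

    `version` is any sequence starting with (major, minor); it defaults to
    the version of the running interpreter.
    """
    if version is None:
        version = sys.version_info
    major_minor = (version[0], version[1])
    return MIN_RECOMMENDED <= major_minor <= MAX_RECOMMENDED


if __name__ == "__main__":
    if not is_recommended_python():
        print(
            "Warning: Python %d.%d is outside the recommended range 3.6-3.8."
            % (sys.version_info[0], sys.version_info[1])
        )
```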

Installation guide

The package can be installed from PyPI:

pip install rxn-onmt-utils

For local development, install the package in editable mode from the root of a cloned repository:

pip install -e ".[dev]"

Package highlights

Translate with OpenNMT from Python code

By importing the following,

from rxn.onmt_utils import Translator, translate

you can do OpenNMT translations directly in Python. The translate function acts on input/output files, while the Translator class gives you flexibility for getting translation results on strings directly.
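The following is a rough sketch of string-based translation, not a definitive reference. It assumes that `Translator` offers a `from_model_path` constructor and a `translate_single` method; these names are not confirmed by this README, so check them against the installed version. The wrapper `translate_one` is a hypothetical helper.

```python
def translate_one(model_path, tokenized_input):
    """Translate a single tokenized string with an OpenNMT model.

    Sketch only: the Translator method names used below are assumptions.
    """
    # Imported lazily so that this function can be defined even when the
    # package is not installed in the current environment.
    from rxn.onmt_utils import Translator

    # Assumption: the class can be built directly from a model checkpoint path.
    translator = Translator.from_model_path(model_path=model_path)

    # Assumption: one input string in, one translated string out.
    return translator.translate_single(tokenized_input)
```

A call could then look like `translate_one("path/to/model.pt", "C C O")`, where both the path and the space-separated input are placeholders.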

Stripping models

The script rxn-strip-opennmt-model, installed with the package, reduces the size of model checkpoints by about two thirds by removing the optimizer state. This is safe as long as you don't need to continue training from these checkpoints.
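Conceptually, stripping amounts to clearing the optimizer entry of the checkpoint dictionary. The sketch below illustrates the idea on a plain dictionary; it is not the implementation of the packaged script. It assumes the optimizer state lives under the key `"optim"`, and in practice the dictionary would be loaded and saved with `torch.load` and `torch.save`. With an Adam-style optimizer, which keeps two running averages per parameter, the optimizer state is roughly twice the size of the weights, which is consistent with the figure quoted above.

```python
def strip_optimizer_state(checkpoint, optimizer_key="optim"):
    """Return a shallow copy of `checkpoint` with the optimizer state cleared.

    Assumption: the optimizer state is stored under `optimizer_key`.
    The input dictionary itself is left unmodified.
    """
    stripped = dict(checkpoint)
    if optimizer_key in stripped:
        # Keep the key so that code expecting it still finds it, but drop
        # the (large) state it pointed to.
        stripped[optimizer_key] = None
    return stripped
```

Everything needed for inference (weights, vocabulary, options) is kept, which is why the stripped checkpoint can still be used for translation but not for resuming training.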

Extending the model vocabulary

If you fine-tune a model on a dataset that contains tokens missing from the base model's vocabulary, you will need to extend the model weights to cover the new tokens. The script rxn-extend-model-with-vocab does that for you.
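To illustrate what extending the weights means, the sketch below appends one embedding row per unseen token while leaving the existing rows untouched. It uses plain Python lists instead of tensors, and the small random initialization is an assumption; the packaged script may initialize new rows differently and must also resize the output layer. The function name `extend_embeddings` is hypothetical.

```python
import random


def extend_embeddings(vocab, weights, new_tokens, seed=0, scale=0.01):
    """Append embedding rows for tokens that are not yet in `vocab`.

    `weights` is a non-empty list of rows, one per token in `vocab`.
    Returns new (vocab, weights) lists; the inputs are not modified.
    """
    rng = random.Random(seed)
    dim = len(weights[0])
    extended_vocab = list(vocab)
    extended_weights = [list(row) for row in weights]
    known = set(vocab)
    for token in new_tokens:
        if token in known:
            continue  # already covered (by the base model or an earlier row)
        known.add(token)
        extended_vocab.append(token)
        # Assumption: new rows start as small random values.
        extended_weights.append([rng.uniform(-scale, scale) for _ in range(dim)])
    return extended_vocab, extended_weights
```

Because existing rows keep their indices, the behavior of the base model on previously known tokens is preserved, and only the new rows need to be learned during fine-tuning.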