Using v1.2.1, the following command successfully downloads the MultiCCAligned resources. After the download, however, the conversion to Moses format fails without any error message, apparently because the process runs out of memory (RAM).
opus_read --directory MultiCCAligned -r v1 --source en --target de --write en-de.en en-de.de --write_mode moses
opus_read seems to read the dataset into memory. Memory usage climbs above 60 GB before the process dies.
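To make the memory growth reproducible for others, a small wrapper along these lines could record the peak memory of the command. This is only a sketch: `peak_child_rss` is a helper name made up for illustration, it relies on Python's Unix-only `resource` module, and it does not touch opus_read's internals.

```python
import resource
import subprocess


def peak_child_rss(cmd):
    """Run cmd to completion; return (exit_code, peak_rss) for child processes.

    peak_rss is ru_maxrss exactly as the OS reports it:
    kilobytes on Linux, bytes on macOS.
    """
    proc = subprocess.run(cmd)
    # RUSAGE_CHILDREN covers child processes that have already been waited for.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


# Hypothetical usage with the failing command from this report:
# peak_child_rss(["opus_read", "--directory", "MultiCCAligned", "-r", "v1",
#                 "--source", "en", "--target", "de",
#                 "--write", "en-de.en", "en-de.de", "--write_mode", "moses"])
```

A negative exit code means the process was terminated by a signal. A value of -9 (SIGKILL) would be consistent with the kernel's out-of-memory killer, although that alone does not prove it.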
A similar command for the WMT-News dataset works:
opus_read --directory WMT-News -r v2019 --source en --target de --write en-de.en en-de.de --write_mode moses
Thanks for this library. A tool to collect and filter the ever-increasing datasets is of great use.