I'm running the following step and have tried it twice. Both attempts ended with the process being "killed" (the second attempt already had the downloaded files, so the download was skipped). Any idea what the cause might be? The machine has 32 GB of RAM; is that not enough memory?
type: opus_read
parameters:
  corpus_name: CCMatrix
  source_language: de
  target_language: en
  preprocessing: raw
  src_output: sents.de.gz
  tgt_output: sents.en.gz
This is possibly the same issue as OpusTools#32 ("opus_read fails to extract CCMatrix"). I tried to run the step, and it does indeed use a lot of memory. (I killed the process at 15 GB, before it started swapping.)
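To check whether memory really is the cause, one option is to run the step through a small wrapper that reports the exit status and peak memory of the child process. The following is only a rough sketch, assuming a Unix-like system (the `resource` module is not available on Windows); the helper name `run_and_report_peak_memory` and the placeholder child command are made up for illustration and are not part of OpusFilter or OpusTools.

```python
import resource
import subprocess
import sys


def run_and_report_peak_memory(command):
    """Run `command` and return (exit_code, peak_rss_mib).

    The peak is the largest resident set size among all child processes
    this interpreter has waited for so far, not only the last one.
    """
    completed = subprocess.run(command)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in kibibytes on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return completed.returncode, usage.ru_maxrss / divisor


if __name__ == "__main__":
    # Placeholder child process; replace it with the real command.
    child = [sys.executable, "-c", "x = bytearray(50_000_000)"]
    code, peak = run_and_report_peak_memory(child)
    # subprocess reports a negative code -N when the child died from signal N,
    # so -9 means SIGKILL, which is what the Linux OOM killer sends.
    print(f"exit code: {code}, peak RSS: {peak:.0f} MiB")
```

If the real command comes back with exit code -9 and a peak close to the installed RAM, running out of memory is the likely explanation; the kernel log would normally confirm that.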
I cannot replicate this; downloading ParaCrawl v9 works fine for me with both OpusFilter and OpusTools.
Hi,
Thanks!
common:
  output_directory: CCMatrix_de-en
steps:
  - type: opus_read
    parameters:
      corpus_name: CCMatrix
      source_language: de
      target_language: en
      preprocessing: raw
      src_output: sents.de.gz
      tgt_output: sents.en.gz