Step6_zero_phrase_filtering problem #18

Tamali6 · 2021-01-29T13:01:23Z

While training monoses, I got the following error in Step 7:

Traceback (most recent call last):
  File "/home/xyz/monoses/training/tuning/tune.py", line 335, in <module>
    main()
  File "/home/xyz/monoses/training/tuning/tune.py", line 322, in main
    extract_zmert_params(tmp + '/dcfg.txt.ZMERT.final'))
  File "/home/xyz/monoses/training/tuning/tune.py", line 73, in extract_zmert_params
    with open(path, encoding='utf-8', errors='surrogateescape') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpv1m8y_i1/dcfg.txt.ZMERT.final'
clean-corpus.perl: processing /home/xyz/models/monoses/src-tgt/tmpzbtcque6/train.bt & .trg to /home/xyz/models/monoses/src-tgt/tmpzbtcque6/train-supervised/clean, cutoff 3-80, ratio 9

From the log file and the intermediate results, I found the following:

The phrase tables were generated successfully.
However, in Step 6 the filter removed 0% of the phrase pairs, which looks suspicious to me.

P(f|e) filter limit: 100
Filtering using P(e|f) only. n=100

..................................................[n:500000]
..................................................[n:1000000]
..................................................[n:1500000]
..................................................[n:2000000]
..................................................[n:2500000]
..................................................[n:3000000]
..................................................[n:3500000]
..................................................[n:4000000]
..................................................[n:4500000]
..................................................[n:5000000]
..................................................[n:5500000]
..................................................[n:6000000]
..................................................[n:6500000]
..................................................[n:7000000]
..................................................[n:7500000]
..................................................[n:8000000]
..................................................[n:8500000]
..................................................[n:9000000]
..................................................[n:9500000]
..................................................[n:10000000]

unfiltered phrases pairs: 10000000
 P(f|e) filter [first]: 0   (0%)
   significance filter: 0   (0%)
        TOTAL FILTERED: 0   (0%)

FILTERED phrase pairs: 10000000   (100%)

Then, in Step 7, while running the decoder, it printed:

Call to decoder returned 1; was expecting 0.
Z-MERT exiting prematurely (MertCore returned 30)...

kellymarchisio · 2021-09-14T01:58:50Z

I can confirm that I have run into this issue many times. The temporary directory is deleted before extract_zmert_params is called. The failure is intermittent.

kellymarchisio · 2021-09-25T19:08:54Z

Ok, I finally figured out my related issue.

I got Z-MERT exiting prematurely (MertCore returned 1)...

This was caused by moses2 segfaulting under the hood: one of the lines in the dev file I was passing to it was too long. I truncated each line in the dev set to 200 characters, and the segfault went away. If you're doing unsupervised tuning, I recommend truncating the dev file you pass to moses2.
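
For anyone who wants to try the same workaround, here is a minimal sketch of the truncation step in Python. The function name truncate_lines and the file names in the usage note are made up for illustration and are not part of monoses. The 200-character limit is simply the value that worked for me, and cutting by character count can split a token in the middle of a line.

```python
def truncate_lines(src_path, dst_path, max_chars=200):
    """Copy src_path to dst_path, cutting each line to at most max_chars characters.

    Hypothetical helper, not part of monoses. Returns how many lines were cut.
    """
    truncated = 0
    # Same encoding settings as the open() call shown in the traceback above.
    with open(src_path, encoding='utf-8', errors='surrogateescape') as fin, \
         open(dst_path, 'w', encoding='utf-8', errors='surrogateescape') as fout:
        for line in fin:
            text = line.rstrip('\n')
            if len(text) > max_chars:
                text = text[:max_chars]  # may split a token mid-word
                truncated += 1
            fout.write(text + '\n')
    return truncated
```

Usage would look like truncate_lines('dev.txt', 'dev.trunc.txt'), after which the truncated file is the one passed on for tuning.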

Note: This also happened when I accidentally passed in two files for --supervised-tuning that were of different lengths.
Note 2: Failing to run the moses tokenizer or the escape-special-chars.perl script can also cause moses2 to segfault within Z-MERT (https://github.com/moses-smt/mosesdecoder/tree/master/scripts/tokenizer)
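
To catch the mismatched-files case from the first note before tuning starts, a quick sanity check along these lines may help. It is only a sketch: it assumes "different lengths" means different numbers of lines, and the names count_lines and check_same_length are hypothetical, not monoses functions.

```python
def count_lines(path):
    """Count lines by reading raw bytes, so odd encodings cannot raise."""
    with open(path, 'rb') as f:
        return sum(1 for _ in f)


def check_same_length(src_path, trg_path):
    """Raise ValueError if the two files have different numbers of lines.

    Hypothetical pre-flight check for the two --supervised-tuning files.
    Returns the shared line count when the files match.
    """
    n_src = count_lines(src_path)
    n_trg = count_lines(trg_path)
    if n_src != n_trg:
        raise ValueError(
            'line count mismatch: %s has %d lines, %s has %d lines'
            % (src_path, n_src, trg_path, n_trg))
    return n_src
```

Running check_same_length on the two dev files right before launching training gives a readable error instead of a segfault deep inside Z-MERT.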
