
run error due to dataset #44

Closed
angeluau opened this issue Jun 12, 2019 · 16 comments


@angeluau

Traceback (most recent call last):
File "train.py", line 340, in
train(args, device_id)
File "train.py", line 272, in train
trainer.train(train_iter_fct, args.train_steps)
File "/home/wsy/xry/BertSum-master/src/models/trainer.py", line 142, in train
for i, batch in enumerate(train_iter):
File "/home/wsy/xry/BertSum-master/src/models/data_loader.py", line 131, in iter
for batch in self.cur_iter:
File "/home/wsy/xry/BertSum-master/src/models/data_loader.py", line 235, in iter
batch = Batch(minibatch, self.device, self.is_test)
File "/home/wsy/xry/BertSum-master/src/models/data_loader.py", line 27, in init
src = torch.tensor(self._pad(pre_src, 0))
File "/home/wsy/xry/BertSum-master/src/models/data_loader.py", line 14, in _pad
width = max(len(d) for d in data)
ValueError: max() arg is an empty sequence

@nlpyang
Owner

nlpyang commented Jun 12, 2019

This error means your batch size is too small.
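
For context, the batching in data_loader.py follows the usual token-budget pattern: examples are collected until the running size estimate exceeds batch_size, and the overflowing example is carried over to the next minibatch. If batch_size is smaller than a single document, the yielded minibatch can be empty, and _pad then calls max() on an empty list, which is exactly the ValueError above. A rough sketch of the mechanism (not the repository's exact code):

def simple_batch(examples, batch_size):
    # Token-budget batching: yield a minibatch once the budget is exceeded.
    minibatch = []
    for ex in examples:
        minibatch.append(ex)
        size_so_far = max(len(e) for e in minibatch) * len(minibatch)
        if size_so_far > batch_size:
            # The overflowing example is carried over; if it alone exceeds
            # the budget, minibatch[:-1] is empty.
            yield minibatch[:-1]
            minibatch = [ex]
    if minibatch:
        yield minibatch

# With a budget smaller than one ~900-token document, the first yielded
# batch is [], and max() over it raises "max() arg is an empty sequence".
for batch in simple_batch([list(range(900))], batch_size=800):
    width = max(len(d) for d in batch)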

@angeluau
Author

Thanks for your guidance.

@w5688414

python train.py -mode train -encoder classifier -dropout 0.1 -bert_data_path ../bert_data/cnndm -model_path ../models/bert_classifier -lr 2e-3 -visible_gpus 0 -gpu_ranks 0 -world_size 1 -report_every 50 -save_checkpoint_steps 1000 -batch_size 800 -decay_method noam -train_steps 50000 -accum_count 2 -log_file ../logs/bert_classifier -use_interval true -warmup_steps 10000
I use this configuration and it works.

gpu_rank 0
[2019-12-24 13:37:56,642 INFO] * number of parameters: 109483009
[2019-12-24 13:37:56,643 INFO] Start training...
[2019-12-24 13:37:56,801 INFO] Loading train dataset from ../bert_data/cnndm.train.123.bert.pt, number of examples: 2001
[2019-12-24 13:38:20,795 INFO] Step 50/50000; xent: 7.57; lr: 0.0000001; 8 docs/s; 24 sec
[2019-12-24 13:38:45,591 INFO] Step 100/50000; xent: 6.42; lr: 0.0000002; 8 docs/s; 49 sec
[2019-12-24 13:39:10,421 INFO] Step 150/50000; xent: 5.29; lr: 0.0000003; 8 docs/s; 74 sec
[2019-12-24 13:39:34,860 INFO] Step 200/50000; xent: 4.07; lr: 0.0000004; 8 docs/s; 98 sec
[2019-12-24 13:40:00,014 INFO] Step 250/50000; xent: 3.43; lr: 0.0000005; 8 docs/s; 123 sec
[2019-12-24 13:40:25,245 INFO] Step 300/50000; xent: 3.33; lr: 0.0000006; 8 docs/s; 148 sec
[2019-12-24 13:40:49,972 INFO] Step 350/50000; xent: 3.52; lr: 0.0000007; 8 docs/s; 173 sec
[2019-12-24 13:41:14,567 INFO] Step 400/50000; xent: 3.31; lr: 0.0000008; 8 docs/s; 198 sec
[2019-12-24 13:41:38,936 INFO] Step 450/50000; xent: 3.27; lr: 0.0000009; 9 docs/s; 222 sec
[2019-12-24 13:42:02,810 INFO] Step 500/50000; xent: 3.38; lr: 0.0000010; 9 docs/s; 246 sec
[2019-12-24 13:42:26,544 INFO] Step 550/50000; xent: 3.25; lr: 0.0000011; 9 docs/s; 270 sec
[2019-12-24 13:42:50,622 INFO] Step 600/50000; xent: 3.35; lr: 0.0000012; 9 docs/s; 294 sec
[2019-12-24 13:43:14,930 INFO] Step 650/50000; xent: 3.29; lr: 0.0000013; 8 docs/s; 318 sec
[2019-12-24 13:43:39,289 INFO] Step 700/50000; xent: 3.20; lr: 0.0000014; 8 docs/s; 342 sec
[2019-12-24 13:44:03,805 INFO] Step 750/50000; xent: 3.41; lr: 0.0000015; 8 docs/s; 367 sec
[2019-12-24 13:44:28,189 INFO] Step 800/50000; xent: 3.36; lr: 0.0000016; 8 docs/s; 391 sec
[2019-12-24 13:44:52,998 INFO] Step 850/50000; xent: 3.29; lr: 0.0000017; 8 docs/s; 416 sec
[2019-12-24 13:45:19,066 INFO] Step 900/50000; xent: 3.30; lr: 0.0000018; 8 docs/s; 442 sec
[2019-12-24 13:45:44,456 INFO] Step 950/50000; xent: 3.15; lr: 0.0000019; 8 docs/s; 468 sec
[2019-12-24 13:45:55,944 INFO] Loading train dataset from ../bert_data/cnndm.train.91.bert.pt, number of examples: 1998
[2019-12-24 13:46:09,317 INFO] Step 1000/50000; xent: 3.39; lr: 0.0000020; 8 docs/s; 493 sec
[2019-12-24 13:46:09,320 INFO] Saving checkpoint ../models/bert_classifier/model_step_1000.pt
[2019-12-24 13:46:35,466 INFO] Step 1050/50000; xent: 3.21; lr: 0.0000021; 8 docs/s; 519 sec

@nimahassanpour

But when I run the same command it gives this error:
train.py: error: argument -encoder: invalid choice: 'classifier' (choose from 'bert', 'baseline')

After changing -encoder from classifier to bert or baseline, it gives this error:
train.py: error: unrecognized arguments: -dropout 0.1 -world_size 1 -decay_method noam

I deleted these args "-dropout 0.1 -world_size 1 -decay_method noam" from your command line and got this error:

gpu_rank 0
[2020-02-06 22:17:01,325 INFO] * number of parameters: 35456513
[2020-02-06 22:17:01,326 INFO] Start training...
[2020-02-06 22:17:01,438 INFO] Loading train dataset from ../bert_data/cnndm.train.123.bert.pt, number of examples: 2001
Traceback (most recent call last):
File "train.py", line 146, in
train_ext(args, device_id)
File "/data/examples/nhassanp/PreSumm-master/src/train_extractive.py", line 203, in train_ext
train_single_ext(args, device_id)
File "/data/examples/nhassanp/PreSumm-master/src/train_extractive.py", line 245, in train_single_ext
trainer.train(train_iter_fct, args.train_steps)
File "/data/examples/nhassanp/PreSumm-master/src/models/trainer_ext.py", line 137, in train
for i, batch in enumerate(train_iter):
File "/data/examples/nhassanp/PreSumm-master/src/models/data_loader.py", line 142, in iter
for batch in self.cur_iter:
File "/data/examples/nhassanp/PreSumm-master/src/models/data_loader.py", line 278, in iter
for idx, minibatch in enumerate(self.batches):
File "/data/examples/nhassanp/PreSumm-master/src/models/data_loader.py", line 256, in create_batches
for buffer in self.batch_buffer(data, self.batch_size * 300):
File "/data/examples/nhassanp/PreSumm-master/src/models/data_loader.py", line 224, in batch_buffer
ex = self.preprocess(ex, self.is_test)
File "/data/examples/nhassanp/PreSumm-master/src/models/data_loader.py", line 195, in preprocess
tgt = ex['tgt'][:self.args.max_tgt_len][:-1]+[2]
KeyError: 'tgt'

Can you please help me find the solution?

@RafaelWO

RafaelWO commented Feb 7, 2020

It seems that a certain target cannot be found. Did you run all suggested preprocessing steps?

@nimahassanpour

It seems that the issue was with the downloaded data. I downloaded it again, and that issue was solved. Now the code gives this error:
RuntimeError: Subtraction, the - operator, with a bool tensor is not supported. If you are trying to invert a mask, use the ~ or logical_not() operator instead.

I am using this command:
python train.py -mode train -encoder bert -bert_data_path ../bert_data/cnndm -model_path ../models/bert_classifier -lr 2e-3 -visible_gpus 0 -gpu_ranks 0 -report_every 50 -save_checkpoint_steps 1000 -batch_size 800 -train_steps 50000 -accum_count 2 -log_file ../logs/bert_classifier -use_interval true -warmup_steps 10000 -ext_dropout 0.1

@RafaelWO

RafaelWO commented Feb 7, 2020

Downgrade to PyTorch 1.1.0, see also #73
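
If downgrading is not possible, the same error can usually be fixed in place: on PyTorch >= 1.2, arithmetic on bool mask tensors is rejected, and the mask has to be inverted with ~ or logical_not() instead, as the message itself suggests. An illustrative snippet (not a patch for a specific file in this repo):

import torch

mask = torch.tensor([True, False, True])

# Raises the RuntimeError above on PyTorch >= 1.2:
# inverted = 1 - mask

# Supported replacements:
inverted = ~mask                # tensor([False,  True, False])
inverted = mask.logical_not()   # equivalent
as_float = (~mask).float()      # if the surrounding code expects 0/1 floats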

@nimahassanpour

I want to test new datasets with BertSum, e.g. a list of paper abstracts in a CSV file. Can you please let me know how I can pre-process such a dataset and create a .pt file which BertSum accepts as input?
Thank you!

@RafaelWO

Just create .story files which are structured the same way as the original ones from the dataset.
Then you can use the preprocessing in the same way as with the original data.

@nimahassanpour

Thank you! Do you know any tool for making .story files?

@RafaelWO

E.g. in Python:

# Write the article text to a .story file.
text = "abc"
with open("text.story", "w") as f:
    f.write(text)
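
For a whole CSV of abstracts, a minimal sketch along the same lines (the file name and column names here are assumptions, adjust them to your data):

import csv
import os

os.makedirs("raw_stories", exist_ok=True)

# Assumed input: abstracts.csv with columns "id" and "abstract".
with open("abstracts.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        out_path = os.path.join("raw_stories", row["id"] + ".story")
        with open(out_path, "w", encoding="utf-8") as out:
            # Article text first; reference summaries, if you have them,
            # would follow as "@highlight" sections (see below).
            out.write(row["abstract"])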

@nimahassanpour

nimahassanpour commented Feb 12, 2020

Thank you!
Just a quick question: do I need to have the @highlight part for my samples? Because my samples do not have a summary part.

@tcqiuyu

tcqiuyu commented Feb 18, 2020

This error means your batch size is too small.

I came across this error as well. I wondered why the batch size would be "too" small, since I didn't see such a constraint documented. Is this constraint mentioned anywhere? And how do I determine whether a batch size is too small?

@tcqiuyu

tcqiuyu commented Feb 18, 2020

This error means your batch size is too small.

#33
I've seen your explanation there. Thanks, I will dig into it.

@RafaelWO

Thank you!
Just a quick question: do I need to have the @highlight part for my samples? Because my samples do not have a summary part.

@nimahassanpour It depends on what you want to do with your samples. If you want to train the model, you need a reference summary (the @highlight section). If you only want to predict, then it should be fine without them.
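
For reference, a CNN/DailyMail-style .story file is plain text: the article body first, then each reference-summary sentence in its own @highlight block, e.g.:

First sentence of the article.
Second sentence of the article.

@highlight

First summary sentence.

@highlight

Second summary sentence.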

@nimahassanpour

nimahassanpour commented Feb 19, 2020

@RafaelWO I am sorry for asking so many questions, and thank you for your replies! I am having another strange problem. When I follow Option 2 for pre-processing the data, I can tokenize the .story files successfully and generate the .story.json files. But when I run step 5, I always get three empty square brackets:

[nhassanp@uc1f-bioinfocloud-assembly-base src]$ /data/conda_envs/20200204/miniconda3/bin/python preprocess.py -mode format_to_bert -raw_path ../merged_stories_tokenized/ -save_path ../bert_cnn/ -oracle_mode greedy -log_file ../logs/preprocess.log
[]
[]
[]

(Since I don't have URLs for my data, I skip step 4.)
