How much GPU memory is required to run the model fully #4

Open
cmd0714 opened this issue Dec 14, 2021 · 4 comments
Labels: enhancement (New feature or request)


cmd0714 commented Dec 14, 2021

Hi, dear author, I wanted to test the performance of the model on Colab using the arXiv summarization dataset, which is a long-document dataset.
Although the input is truncated to no more than 4096 tokens, it still goes OOM.
What is the minimum amount of GPU memory I need to run with a 4096 input length?
The dataset is available at https://github.com/armancohan/long-summarization

abhilash1910 (Owner) commented

Hi @cmd0714,
Could you please share a reproducible sample in a Colab notebook?
Thanks for the link; I will test it on my end.


cmd0714 commented Dec 17, 2021

Hi @abhilash1910,
The code is as follows:
import json
from LongPegasus.LongPegasus import LongPegasus
from transformers import PegasusTokenizer, TFPegasusForConditionalGeneration

# Convert the base Pegasus checkpoint into a long-input variant and save it.
l = LongPegasus()
model_name = 'human-centered-summarization/financial-summarization-pegasus'
model, tokenizer = l.create_long_model(save_model='Pegasus/', attention_window=4096, max_pos=4096, model_name=model_name)
model = TFPegasusForConditionalGeneration.from_pretrained('Pegasus/')
tokenizer = PegasusTokenizer.from_pretrained('Pegasus/')

with open('arxiv-dataset/test.txt', 'r', encoding='utf-8') as f:  # test.txt is from the arxiv dataset
    lines = f.readlines()

for line in lines:
    data = json.loads(line)
    summary = ' '.join(data['abstract_text'])
    article = ' '.join(data['article_text'])
    ARTICLE_TO_SUMMARIZE = article
    inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=4096, truncation=True, return_tensors='tf')
    summary_ids = model.generate(inputs['input_ids'])
    summary_generate = ' '.join([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])
    print(summary_generate)
https://colab.research.google.com/drive/1xyX82-FE8XR7RIFICoswCjC46fHd0IB_?usp=sharing
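If the OOM is raised inside model.generate rather than in the encoder forward pass, the decoding settings also contribute: num_beams and max_length are standard transformers generate() arguments, and greedy decoding (num_beams=1) keeps only one hypothesis in memory. A possible variant of the call above, with purely illustrative values:

# Illustrative only: smaller beam width and a capped output length
# reduce decode-time memory; values here are not from the original report.
summary_ids = model.generate(inputs['input_ids'], num_beams=1, max_length=256)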

abhilash1910 (Owner) commented

Hi @cmd0714,
I have a few questions:

  • Are you using the model and tokenizer for training on the corpus, or only for inference?
  • If you are using it for training/fine-tuning, I would suggest a TPU workflow (I am currently working on adding one to the module).
  • For inference, does the same issue arise when reducing the tokens to 1024?
  • Since there is no proper way to determine TF GPU usage (apart from the Profiler), maybe wrapping the model inside a tf.Session() and then specifying TF_FORCE_GPU_ALLOW_GROWTH=true in the configuration might help? (I have to try this; see the sketch after this list.)

Will be running a few tests on this.
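A minimal sketch of the allow-growth idea, assuming TensorFlow 2.x (tf.Session() exists only in TF1; tf.config.list_physical_devices and tf.config.experimental.set_memory_growth below are the standard TF2 equivalents, untested here against this model):

import os
# The environment variable must be set before TensorFlow initializes the GPU.
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'

import tensorflow as tf

# TF 2.x equivalent of the TF1 session-config approach: allocate GPU
# memory on demand rather than reserving the whole card up front.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)

Note that allow-growth only changes how memory is allocated; it does not lower the model's peak requirement, so a genuine 4096-token OOM may persist.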

abhilash1910 self-assigned this Dec 24, 2021
abhilash1910 added the enhancement (New feature or request) and question (Further information is requested) labels Dec 24, 2021

cmd0714 commented Dec 25, 2021

Hi @abhilash1910,
I tested the model only for inference.
The issue no longer occurs when the tokens are reduced to 1024.
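To put a number on the original question, one way is to measure the peak GPU memory at each input length. A minimal sketch, assuming TF 2.5+ (tf.config.experimental.get_memory_info and reset_memory_stats are the documented APIs) and reusing model, tokenizer, and ARTICLE_TO_SUMMARIZE from the snippet above:

import tensorflow as tf

for max_len in (1024, 2048, 4096):
    tf.config.experimental.reset_memory_stats('GPU:0')  # clear the peak counter
    inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=max_len,
                       truncation=True, return_tensors='tf')
    _ = model.generate(inputs['input_ids'])
    peak = tf.config.experimental.get_memory_info('GPU:0')['peak']  # in bytes
    print(f"max_length={max_len}: peak GPU memory {peak / 2**30:.2f} GiB")

On a Colab GPU the 4096 case would presumably OOM before printing, per the report above, so the shorter lengths at least bound the requirement from below.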

cmd0714 closed this as completed Dec 25, 2021
abhilash1910 reopened this Dec 26, 2021
abhilash1910 removed the question (Further information is requested) label Dec 26, 2021