How much GPU memory is required to run the model fully #4

Open
cmd0714 opened this issue Dec 14, 2021 · 4 comments
Labels: enhancement (New feature or request)


cmd0714 commented Dec 14, 2021

Hi, dear author, I wanted to test the performance of the model on Colab using the arXiv summarization dataset, which is a long-document dataset.
Although the input is truncated to no more than 4096 tokens, it still goes OOM.
What is the minimum amount of GPU memory I need to run with a 4096 input length?
The dataset is available at https://github.com/armancohan/long-summarization

abhilash1910 (Owner) commented

Hi @cmd0714,
Could you please share a reproducible sample in a Colab notebook?
Thanks for the link; I will test it on my end.


cmd0714 commented Dec 17, 2021

Hi @abhilash1910,
The code is as follows:
import json
from LongPegasus.LongPegasus import LongPegasus
from transformers import PegasusTokenizer, TFPegasusForConditionalGeneration

# Convert the base Pegasus checkpoint into a long-input variant and save it.
l = LongPegasus()
model_name = 'human-centered-summarization/financial-summarization-pegasus'
model, tokenizer = l.create_long_model(save_model='Pegasus/', attention_window=4096, max_pos=4096, model_name=model_name)
model = TFPegasusForConditionalGeneration.from_pretrained('Pegasus/')
tokenizer = PegasusTokenizer.from_pretrained('Pegasus/')

with open('arxiv-dataset/test.txt', 'r', encoding='utf-8') as f:  # test.txt is from the arxiv dataset
    lines = f.readlines()

for line in lines:
    data = json.loads(line)
    summary = ' '.join(data['abstract_text'])
    article = ' '.join(data['article_text'])
    ARTICLE_TO_SUMMARIZE = article
    inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=4096, truncation=True, return_tensors='tf')
    summary_ids = model.generate(inputs['input_ids'])
    summary_generate = ' '.join([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])
    print(summary_generate)
https://colab.research.google.com/drive/1xyX82-FE8XR7RIFICoswCjC46fHd0IB_?usp=sharing
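If the OOM is raised inside model.generate rather than in the encoder forward pass, the decoding settings also contribute: num_beams and max_length are standard transformers generate() arguments, and greedy decoding (num_beams=1) keeps only one hypothesis in memory. A possible variant of the call above, with purely illustrative values:

# Illustrative only: smaller beam width and a capped output length
# reduce decode-time memory; values here are not from the original report.
summary_ids = model.generate(inputs['input_ids'], num_beams=1, max_length=256)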

abhilash1910 (Owner) commented

Hi @cmd0714,
I have a few questions:

  • Are you using the model and tokenizer for training on the corpus, or only for inference?
  • If you are using it for training/fine-tuning, I would suggest a TPU workflow (I am currently working on adding one to the module).
  • For inference, does the same issue arise when reducing the tokens to 1024?
  • Since there is no proper way to determine TF GPU usage (apart from the Profiler), maybe wrapping the model inside a tf.Session() and then specifying TF_FORCE_GPU_ALLOW_GROWTH=true in the configuration might help? (I have to try this; see the sketch after this list.)

Will be running a few tests on this.
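A minimal sketch of the allow-growth idea, assuming TensorFlow 2.x (tf.Session() exists only in TF1; tf.config.list_physical_devices and tf.config.experimental.set_memory_growth below are the standard TF2 equivalents, untested here against this model):

import os
# The environment variable must be set before TensorFlow initializes the GPU.
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'

import tensorflow as tf

# TF 2.x equivalent of the TF1 session-config approach: allocate GPU
# memory on demand rather than reserving the whole card up front.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)

Note that allow-growth only changes how memory is allocated; it does not lower the model's peak requirement, so a genuine 4096-token OOM may persist.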

abhilash1910 self-assigned this Dec 24, 2021
abhilash1910 added the enhancement (New feature or request) and question (Further information is requested) labels Dec 24, 2021

cmd0714 commented Dec 25, 2021

Hi @abhilash1910,
I tested the model only for inference.
The issue no longer occurs when the tokens are reduced to 1024.
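To put a number on the original question, one way is to measure the peak GPU memory at each input length. A minimal sketch, assuming TF 2.5+ (tf.config.experimental.get_memory_info and reset_memory_stats are the documented APIs) and reusing model, tokenizer, and ARTICLE_TO_SUMMARIZE from the snippet above:

import tensorflow as tf

for max_len in (1024, 2048, 4096):
    tf.config.experimental.reset_memory_stats('GPU:0')  # clear the peak counter
    inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=max_len,
                       truncation=True, return_tensors='tf')
    _ = model.generate(inputs['input_ids'])
    peak = tf.config.experimental.get_memory_info('GPU:0')['peak']  # in bytes
    print(f"max_length={max_len}: peak GPU memory {peak / 2**30:.2f} GiB")

On a Colab GPU the 4096 case would presumably OOM before printing, per the report above, so the shorter lengths at least bound the requirement from below.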

cmd0714 closed this as completed Dec 25, 2021
abhilash1910 reopened this Dec 26, 2021
abhilash1910 removed the question (Further information is requested) label Dec 26, 2021