How to get the document frequencies? #5

Open
Cppowboy opened this issue Mar 8, 2019 · 1 comment

Comments

Cppowboy commented Mar 8, 2019

The instructions say:

> If using some other corpus, get the document frequencies into a similar format as "coco-val-df", and put them in the data/ folder as a pickle file. Then set mode to the name of the document frequency file (without the '.p' extension).

How do I compute the document frequencies for my own corpus?

awkrail commented May 15, 2019

I found a script that computes the n-gram counts and saves them as a .p file:
https://github.com/ruotianluo/self-critical.pytorch/blob/master/scripts/prepro_ngrams.py

The target dataset of this script is MSCOCO, but it may be helpful to you.
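
For a corpus other than MSCOCO, the general idea can be sketched roughly as below. This is only a minimal sketch, not the linked script: it counts, for each n-gram of length 1 to 4, how many documents contain it at least once, then pickles the result. The function names, the input layout (a list of documents, each a list of tokenized captions), and the dictionary keys `"df"` and `"ref_len"` are all assumptions for illustration. Load the existing "coco-val-df" pickle and inspect its keys to confirm the layout your loader expects.

```python
import pickle
from collections import defaultdict


def ngrams(tokens, max_n=4):
    """Yield every n-gram of length 1..max_n as a tuple of tokens."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def document_frequencies(documents, max_n=4):
    """Count how many documents contain each n-gram at least once.

    `documents` is a list of documents (e.g. one per image); each document
    is a list of captions, and each caption is a list of tokens.
    """
    df = defaultdict(int)
    for captions in documents:
        seen = set()  # n-grams present anywhere in this document
        for caption in captions:
            seen.update(ngrams(caption, max_n))
        for gram in seen:
            df[gram] += 1  # one count per document, not per occurrence
    return dict(df)


def save_document_frequencies(documents, path, max_n=4):
    """Pickle the counts together with the number of documents.

    The key names "df" and "ref_len" are assumptions; match them to
    whatever the existing coco-val-df pickle actually contains.
    """
    payload = {
        "df": document_frequencies(documents, max_n),
        "ref_len": len(documents),
    }
    with open(path, "wb") as f:
        pickle.dump(payload, f)
    return payload
```

Calling `save_document_frequencies(corpus, "data/my-corpus-df.p")` (a hypothetical file name) would then produce a file you can select by its name without the '.p' extension, provided the keys match what the loader reads.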
