Arxiv Scraper

This is a small program that grabs all the papers posted to arXiv in the last day, converts them to plain text, and searches each one for the regular expressions listed in the configuration file.
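As a rough illustration of the final search step, here is a minimal sketch that assumes the paper text has already been extracted to a string. The function name find_matches and the sample text are hypothetical and not taken from arxivscraper.py.

```python
import re


def find_matches(text, patterns):
    """Return {pattern: [matched strings]} for each pattern found in text.

    Hypothetical helper: arxivscraper.py may organise its search differently.
    """
    hits = {}
    for pattern in patterns:
        found = [m.group(0) for m in re.finditer(pattern, text)]
        if found:
            hits[pattern] = found
    return hits


# Stand-in for the plain text extracted from one paper.
sample = "We report charge density waves in TaS2 and superconductivity in NbSe2."
print(find_matches(sample, [r"charge density waves?", r"skyrmions?"]))
```

Patterns with no match are left out of the result, so an empty dictionary means the paper can be skipped.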

Installation

First install the required dependencies:

$ pip install arxiv python-dateutil slate3k

Then clone this repository:

$ git clone https://github.com/bfichera/arxivscraper

Finally, create a file called conf.py in the base directory containing your configuration details. See example_conf.py for a template.

Usage

Just run:

$ python arxivscraper.py --config-file /path/to/my/config/file

Configuration

See example_conf.py. Search terms that are chemical formulas go in the chem_terms field, with spaces between elements. Each one is converted to a regular expression that, in my experience, matches well most of the time: different authors format TaS₂ as TaS2, TaS$_2$, TaS$_{2}$, etc., and chem_terms tries to catch all of them. All other search terms go in the terms field.
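To make the chem_terms idea concrete, here is a minimal sketch of how a space-separated formula could be turned into a regular expression that tolerates the formatting variants listed above. It is not the conversion used in arxivscraper.py: the function name chem_term_to_regex, the assumed input form "Ta S2" (element symbol followed directly by its count), and the handling of Unicode subscripts are all assumptions made for illustration.

```python
import re

# Map ASCII digits to Unicode subscript digits, e.g. "2" -> "₂".
SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
# Assumed token shape: an element symbol followed by an optional count.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def chem_term_to_regex(term):
    """Compile a pattern for a formula written like 'Ta S2' (hypothetical helper)."""
    parts = []
    for token in term.split():
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised formula token: {token!r}")
        element, count = match.groups()
        piece = re.escape(element)
        if count:
            # Accept a bare count (2), LaTeX ($_2$ or $_{2}$), or Unicode (₂).
            latex = r"\$?_\{?" + count + r"\}?\$?"
            unicode_sub = count.translate(SUBSCRIPTS)
            piece += "(?:" + count + "|" + latex + "|" + unicode_sub + ")"
        parts.append(piece)
    # Allow optional whitespace between elements.
    return re.compile(r"\s*".join(parts))


pattern = chem_term_to_regex("Ta S2")
for text in ["TaS2", "TaS$_2$", "TaS$_{2}$", "TaS₂"]:
    print(text, bool(pattern.search(text)))
```

The sketch does not guard against longer counts, so a pattern built from "Ta S2" would also match the start of "TaS20"; a stricter version could add a negative lookahead for a trailing digit.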
