Hello! Happy to have you following along. The session deck will be available after the presentation, so please feel free to poke around here while I talk.
First, clone this repository:
$ git clone git@github.com:mattdennewitz/sloan-2019-scraping-code.git
Enter the repository:
$ cd sloan-2019-scraping-code
Create a virtual environment for this project, then activate it:
$ python3 -m venv .
$ source bin/activate
Finally, install its requirements:
$ pip install -r requirements.txt
This will install requests-html and requests.
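If you want to confirm the install worked before running a scraper, a quick check like the one below may help. It is only a sketch: the helper name `missing_packages` is made up for illustration and is not part of this repo. It relies on the fact that requests-html is imported as `requests_html`.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the import names that cannot be found in the active environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Import names for the two packages mentioned above.
    missing = missing_packages(["requests", "requests_html"])
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All requirements are importable.")
```

Run it with the virtual environment activated. If anything is reported missing, re-run the pip install step.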
With your environment still activated, you may run any of the Python scripts to pull data. These scripts will overwrite the CSV files included in the repo.
Running the BP scraper:
$ python p_bpro.py
Once you've run a scraper, view its relevant CSV output file to see the results.
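To peek at a CSV without opening a spreadsheet, something like the following could work. This is a minimal sketch using only the standard library. The function name `preview_csv` is hypothetical, and it assumes the output file is UTF-8 with a header row. Pass the path of whichever CSV the scraper wrote.

```python
import csv
import sys
from itertools import islice


def preview_csv(path, limit=5):
    """Return up to `limit` rows of a CSV file as dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        return list(islice(reader, limit))


if __name__ == "__main__":
    # Usage: python preview.py path/to/output.csv
    if len(sys.argv) > 1:
        for row in preview_csv(sys.argv[1]):
            print(row)
```

Reading only the first few rows keeps this quick even when a scraper produces a large file.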