Web-Crawler

Basic web crawling and data processing in Python (using selenium and openpyxl) and R (using rvest, xml2 and xlsx).

Work done during the first day of an internship as a data analyst at AssetPro in Dec. 2019.

`WebCrawler.py`

This script gives a basic example of how to use webdriver to crawl fund IDs from a funding company's website, www.ifund.com.hk. You can crawl other data from other webpages by following the same webdriver usage pattern.
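The pattern described above can be sketched roughly as follows. This is a minimal illustration and not the contents of `WebCrawler.py`: the function names (`clean_fund_ids`, `crawl_fund_ids`, `save_ids`), the CSS selector, the `https://` scheme, and the output file name are all assumptions made for the example, since this README does not describe the page structure of www.ifund.com.hk.

```python
def clean_fund_ids(raw_texts):
    """Strip whitespace, drop blanks, and de-duplicate while keeping order."""
    seen = set()
    fund_ids = []
    for text in raw_texts:
        fund_id = text.strip()
        if fund_id and fund_id not in seen:
            seen.add(fund_id)
            fund_ids.append(fund_id)
    return fund_ids


def crawl_fund_ids(url, css_selector):
    """Open `url` in Chrome and return the cleaned text of matching elements."""
    # Imported lazily so clean_fund_ids() works without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # expects chromedriver to be reachable on PATH
    try:
        driver.get(url)
        elements = driver.find_elements(By.CSS_SELECTOR, css_selector)
        return clean_fund_ids(element.text for element in elements)
    finally:
        driver.quit()  # always close the browser, even if the crawl fails


def save_ids(fund_ids, path):
    """Write the IDs to a one-column Excel sheet with openpyxl."""
    from openpyxl import Workbook

    workbook = Workbook()
    sheet = workbook.active
    sheet.title = "fund_ids"
    sheet.append(["fund_id"])  # header row
    for fund_id in fund_ids:
        sheet.append([fund_id])
    workbook.save(path)


# Hypothetical usage; the selector is a placeholder, not the site's real markup:
# ids = crawl_fund_ids("https://www.ifund.com.hk", "td.fund-id")
# save_ids(ids, "fund_ids.xlsx")
```

To adapt this to another page, only the URL and the selector passed to `crawl_fund_ids` should need to change, provided the target data is present in the page once it has loaded.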

Author: Changyuan Qiu

Contact: peterqiu@umich.edu

Latest Update: Nov. 12, 2020

Build:

Make sure that the latest versions of selenium and openpyxl are installed on your computer.

Apart from selenium and openpyxl, you also need to download ChromeDriver from

https://sites.google.com/a/chromium.org/chromedriver/downloads

and add it to your PATH so that the script can launch Chrome.
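Before running the script, it may help to confirm that the setup above is in place. The following is a small sketch using only the Python standard library. The helper names `missing_packages` and `chromedriver_path` are made up for this example, and it assumes the driver executable is named `chromedriver`.

```python
import importlib.util
import shutil


def missing_packages(names=("selenium", "openpyxl")):
    """Return the packages from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def chromedriver_path(executable="chromedriver"):
    """Return the full path of the driver if it is on PATH, otherwise None."""
    return shutil.which(executable)


missing = missing_packages()
if missing:
    print("Missing packages (try pip install):", ", ".join(missing))
else:
    print("selenium and openpyxl are importable.")

driver_location = chromedriver_path()
if driver_location is None:
    print("chromedriver was not found on PATH.")
else:
    print("chromedriver found at:", driver_location)
```

This only checks that the pieces are present. It does not verify that the ChromeDriver version matches the installed Chrome version.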
