Web-Crawler

Basic web crawling and data processing in Python (using selenium and openpyxl) and R (using rvest, xml2 and xlsx).

Work done on the first day of an internship as a data analyst @AssetPro in Dec. 2019.

WebCrawler.py

This script gives a basic example of how to use webdriver to crawl fund IDs from a funding company's website, www.ifund.com.hk. You can crawl other data from other webpages by following a similar pattern of webdriver usage.
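The general pattern can be sketched roughly as follows. This is not the code in WebCrawler.py, only a minimal illustration under stated assumptions: the function names (`crawl_fund_ids`, `save_ids`, `main`), the CSS selector `td.fund-id`, the `https://` scheme and the output file name `fund_ids.xlsx` are all hypothetical, and the real page layout would need to be inspected to pick a working selector.

```python
"""Sketch: collect text from page elements with a webdriver and save it to Excel."""

# Selenium's By.CSS_SELECTOR is the plain string "css selector"; using the
# string here keeps this module importable before selenium is installed.
CSS_SELECTOR = "css selector"


def crawl_fund_ids(driver, url, selector):
    """Open `url` and return the stripped, non-empty, de-duplicated text
    of every element matching the CSS `selector`, in page order."""
    driver.get(url)
    ids = []
    for element in driver.find_elements(CSS_SELECTOR, selector):
        text = element.text.strip()
        if text and text not in ids:
            ids.append(text)
    return ids


def save_ids(ids, path):
    """Write the IDs to a one-column Excel sheet with a header row."""
    from openpyxl import Workbook  # imported lazily; only needed when saving

    workbook = Workbook()
    sheet = workbook.active
    sheet.append(["Fund ID"])
    for fund_id in ids:
        sheet.append([fund_id])
    workbook.save(path)


def main():
    """Run the crawl in Chrome (requires ChromeDriver to be available)."""
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        # Hypothetical selector: replace it after inspecting the real page.
        ids = crawl_fund_ids(driver, "https://www.ifund.com.hk", "td.fund-id")
    finally:
        driver.quit()  # always close the browser, even if the crawl fails
    save_ids(ids, "fund_ids.xlsx")
```

Calling `main()` would open Chrome, collect the matching text and write it to the spreadsheet. Keeping the driver as a parameter of `crawl_fund_ids` means the same function could be reused with another browser driver or another page by changing only the URL and selector.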

Author: Changyuan Qiu
Latest Update: Nov. 12, 2020

Build:

Make sure that the latest versions of selenium and openpyxl are installed on your computer.

Apart from selenium and openpyxl, you also need to download ChromeDriver from

https://sites.google.com/a/chromium.org/chromedriver/downloads

and add it to your PATH so that the script can run.
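Before running the script, the prerequisites above can be checked with a short sketch like the one below. The helper name `missing_prerequisites` is hypothetical and not part of this repository, and the executable name `chromedriver` is an assumption about how the downloaded driver is named on your system.

```python
import importlib.util
import shutil


def missing_prerequisites(packages=("selenium", "openpyxl"),
                          executables=("chromedriver",)):
    """Return a list describing each prerequisite that cannot be found."""
    missing = []
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(f"python package: {name}")
    for name in executables:
        # shutil.which returns None when the executable is not on the PATH.
        if shutil.which(name) is None:
            missing.append(f"executable on PATH: {name}")
    return missing
```

An empty list means everything was found; otherwise each entry names what still needs to be installed or added to the PATH.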