Technologies used for web-scraping #75

Open
4 tasks done
jasmineyang opened this issue Aug 21, 2018 · 0 comments

jasmineyang commented Aug 21, 2018

We want a simple beginner's tutorial that covers the installation steps for the basic technologies used for web-scraping. The goal is for students to become acquainted with these tools (e.g. BeautifulSoup, Scrapy) by following the exact steps; later on we may use examples (#70) to demonstrate the technologies.

List of topics in order:

  • Introduction to web-scraping: what it is and why it is useful
  • BeautifulSoup: Installation, Expressions & Examples (e.g. extracting needed information from HTML pages)
  • Manually scrape data using browser extensions
  • Scrapy: Installation, Rules & Examples (e.g. writing a simple scraper, telling Scrapy to follow URLs and scrape contents)
    and more.
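
To give a feel for the BeautifulSoup item above, here is a minimal sketch of extracting information from an HTML page. It assumes the `beautifulsoup4` package is already installed (e.g. via `pip install beautifulsoup4`). The sample HTML, the `SAMPLE_HTML` name, and the `extract_links` helper are made up for illustration and are not taken from the notes.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, invented for this sketch (not from the notes).
SAMPLE_HTML = """
<html>
  <body>
    <h1>Course list</h1>
    <ul>
      <li><a href="/econ101">ECON 101</a></li>
      <li><a href="/econ102">ECON 102</a></li>
    </ul>
  </body>
</html>
"""


def extract_links(html):
    """Return (link text, href) pairs for every <a> tag that has an href."""
    # "html.parser" is Python's built-in parser, so no extra parser is needed.
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


if __name__ == "__main__":
    for text, href in extract_links(SAMPLE_HTML):
        print(text, href)
```

In a real tutorial the HTML string would presumably come from a downloaded page rather than a literal. A literal keeps the sketch runnable offline.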

Link to notes: https://github.com/ubcecon/computing_and_datascience/blob/master/python_sandbox/Web-Scraping.md

@jasmineyang jasmineyang self-assigned this Aug 21, 2018