I wanted a little learning project, so I decided on:
Scrape all links from "Ask HN: What is your blog and why should I read it?"
Publish it online, so that when you visit, you get a random blog (or a random blog post).
So far, I've got a web scraper together that scrapes the top-level comments of the above thread and saves them to links.txt.
To visualize: this tool shows links from JUST the top-level comments on the above thread:
Next, I'll get some basic routing in place with Sinatra, and put it on Heroku.
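The routing step could look something like this — a minimal sketch, assuming the sinatra gem is installed, a links.txt with one URL per line sits next to the app, and that the root route is where "random blog" lives (all my guesses, not the final design):

```ruby
# app.rb — minimal sketch; assumes the sinatra gem and a links.txt
# next to this file with one URL per line.
require "sinatra"

LINKS = File.readlines("links.txt", chomp: true).reject(&:empty?)

# Visiting / bounces you to a random blog from the thread.
get "/" do
  redirect LINKS.sample
end
```

Heroku would then just need a Procfile (or Gemfile + `config.ru`) pointing at this app.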
Should be a cool little thing.
These are notes I've taken, ordered by when each thought occurred to Josh, that I'll use to guide myself in building additional resources/drills:
Do this kind of scraping three times total, saving the outputs to a text file or database.
Nokogiri was a big part of this, but I knew so little about it that I ended up creating this, which will eventually be one of many pieces of intermediate_ruby.
I used my new Nokogiri knowledge to get this list of links:
Link to another effort along these lines:
https://www.dannysalzman.com/2020/04/08/analyzing-hn-readers-personal-blogs
- dealing with feature flags in the real world? https://boringrails.com/articles/feature-flags-simplest-thing-that-could-work/
- Redis obstacle course? (ties into feature flags)
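One of the notes above is saving outputs to a text file across repeated scrapes. A small helper like this keeps links.txt from accumulating duplicates between runs (the file name is just the one I used above; nothing special about it):

```ruby
# Merge a fresh batch of scraped links into links.txt without
# duplicating what earlier runs already saved.
def save_links(new_links, path: "links.txt")
  existing = File.exist?(path) ? File.readlines(path, chomp: true) : []
  merged = (existing + new_links).uniq
  File.write(path, merged.join("\n") + "\n")
  merged
end
```

A database would replace this once the "three times total" goal makes a flat file annoying.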
Boot the app with `rerun 'ruby app.rb'` (rerun restarts it on file changes).