szabinah90/Website-parser

Website Parser

This code parses the search results page of https://index.hu/, a Hungarian news portal. It picks out the URLs related to a search keyword and extracts relevant information such as the author name, title, content, and publication date.
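As a rough illustration of the URL-collection step, here is a minimal Python sketch that uses only the standard library. It is not the repository's actual implementation: the class `ResultLinkParser`, the function `extract_result_links`, and the sample markup are hypothetical, and the real structure of the index.hu search results page may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ResultLinkParser(HTMLParser):
    """Collect absolute same-site URLs from the anchors of a results page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.site = urlparse(base_url).netloc
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the base URL
        url = urljoin(self.base_url, href)
        # Keep only links on the same site, skipping duplicates
        if urlparse(url).netloc == self.site and url not in self.links:
            self.links.append(url)


def extract_result_links(html, base_url="https://index.hu/"):
    parser = ResultLinkParser(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical markup, used only to exercise the sketch
SAMPLE = """
<ul>
  <li><a href="/belfold/2018/05/08/example_article/">Example</a></li>
  <li><a href="https://index.hu/kulfold/2018/05/07/other_article/">Other</a></li>
  <li><a href="https://example.com/ad">Ad</a></li>
</ul>
"""

links = extract_result_links(SAMPLE)
```

Filtering on the host name is only one possible way to separate article links from unrelated ones; a real parser would more likely target the specific elements that wrap each search hit.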

An Elasticsearch connector is also included; it uses the high-level client.
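To show what indexing the extracted fields could look like, the sketch below builds a request body in the newline-delimited JSON format of the Elasticsearch Bulk API. It deliberately avoids any client library, so it does not reflect how the included connector is written. The function `to_bulk_body`, the index name `articles`, the field names, and the sample document are all assumptions made for illustration.

```python
import json


def to_bulk_body(articles, index="articles"):
    """Serialize article dicts into an Elasticsearch Bulk API body (NDJSON)."""
    lines = []
    for article in articles:
        # Action line: index the document, using its URL as the document id.
        # Older Elasticsearch versions may also expect a document type,
        # either in this line or in the request path.
        lines.append(json.dumps({"index": {"_index": index, "_id": article["url"]}}))
        # Source line: the extracted fields themselves
        lines.append(json.dumps(article, ensure_ascii=False))
    # Every line, including the last one, must end with a newline
    return "".join(line + "\n" for line in lines)


# Hypothetical document with the fields named in the description above
sample_article = {
    "url": "https://index.hu/belfold/2018/05/08/example_article/",
    "author": "Example Author",
    "title": "Example title",
    "content": "Example content.",
    "published": "2018-05-08",
}

body = to_bulk_body([sample_article])
```

Using the article URL as the document id means that re-running the parser overwrites existing documents instead of creating duplicates; whether the repository does this is not stated.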

Kibana example figures:

screenshot from 2018-05-08 21-15-30

screenshot from 2018-05-08 21-16-19
