alextgu/hackathon-webscrape

HACKATHON DATASCRAPE PROJECT

This program automates the process of gathering hackathon information from specified websites, focusing on extracting the "name" and "date" of each event. It leverages the Selenium automation tool to handle JavaScript-driven content that traditional web scrapers often struggle with. After ensuring all dynamic content is loaded, the program uses BeautifulSoup for data extraction.
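The two-stage flow described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `fetch_rendered_html` and `parse_events`, the fixed wait time, and the CSS selectors passed in are all hypothetical, and the real selectors depend on each site's current markup.

```python
from bs4 import BeautifulSoup


def fetch_rendered_html(url, wait_seconds=5.0):
    """Load `url` in headless Chrome and return the HTML after scripts have run."""
    # Imported lazily so parse_events() can be used without a browser installed.
    import time
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    # webdriver-manager downloads a matching chromedriver and returns its path.
    service = Service(ChromeDriverManager().install())
    driver = webdriver.Chrome(service=service, options=options)
    try:
        driver.get(url)
        # Assumption: a crude fixed pause so JavaScript can render the event list.
        time.sleep(wait_seconds)
        return driver.page_source
    finally:
        driver.quit()


def parse_events(html, card_selector, name_selector, date_selector):
    """Extract {"name", "date"} records from rendered HTML.

    The selectors are placeholders supplied by the caller; cards that lack
    either field are skipped.
    """
    soup = BeautifulSoup(html, "html.parser")
    events = []
    for card in soup.select(card_selector):
        name = card.select_one(name_selector)
        date = card.select_one(date_selector)
        if name is None or date is None:
            continue
        events.append({
            "name": name.get_text(strip=True),
            "date": date.get_text(strip=True),
        })
    return events
```

A call might look like `parse_events(fetch_rendered_html("https://devpost.com/hackathons"), "div.card", "h3", "span.date")`, with the three selectors replaced by whatever matches the page at the time. Keeping the browser step separate from the parsing step means the parser can be checked against saved HTML without launching Chrome.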

Websites this program scans:

https://devpost.com/hackathons

https://mlh.io/seasons/2025/events

Fields extracted for each event:

Name

Date
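Since pandas and openpyxl appear in the installation list, the name/date records are presumably collected into a table and written to a spreadsheet. A minimal sketch of that step, assuming each event is a dict with `name` and `date` keys (the function names and the default output file name `hackathons.xlsx` are hypothetical):

```python
import pandas as pd


def events_to_frame(events):
    """Build a two-column table from a list of {"name", "date"} dicts."""
    frame = pd.DataFrame(events, columns=["name", "date"])
    # Drop exact duplicate rows, e.g. if the same listing is scraped twice.
    return frame.drop_duplicates().reset_index(drop=True)


def save_events(events, path="hackathons.xlsx"):
    """Write the table to an Excel file; pandas uses openpyxl for .xlsx output."""
    events_to_frame(events).to_excel(path, index=False)
```

Fixing the column order in `events_to_frame` keeps the output layout stable even when a scrape returns no events.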

HOW TO RUN THE CODE:

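  1. Install the dependencies from a terminal:

    • pip install selenium
    • pip install beautifulsoup4
    • pip install pandas
    • pip install TIME-python (likely unnecessary: the time module is part of Python's standard library)
    • pip install openpyxl
    • pip install webdriver-manager (this downloads and manages the browser driver for you)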

    if you want to manually install webdriver manager:

  2. (Optional) Add more links and fields to extract, then update the rest of the code to match those changes.

  3. Run the program.

Sample Output

(Screenshot of the program's output, taken 2024-08-26.)

This program is straightforward but versatile. Feel free to use and modify the code as needed. Feedback is always appreciated (even the mean kind).

This project is licensed under the MIT License. See the LICENSE file for details.