This project uses Python to scrape data from a housing research website and then analyzes the collected data. It relies on the following libraries:
- bs4 is the BeautifulSoup library used for parsing HTML and XML documents.
- pandas is a library used for data manipulation and analysis.
- numpy is a library used for mathematical operations on arrays and matrices.
- requests is a library used for making HTTP requests to websites.
- re is Python's built-in regular expression module, used for pattern matching and string manipulation.
- seaborn is a data visualization library built on top of matplotlib.
- matplotlib is a plotting library for creating static, animated, and interactive visualizations in Python.
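
As a rough illustration of how bs4, re, and pandas might fit together in the scraping step, the sketch below parses a small HTML snippet into a DataFrame. The markup, the class names (`listing`, `city`, `price`, `beds`), the sample values, and the helper `parse_listings` are all hypothetical; the actual site's structure is not described here. In practice the HTML string would typically come from something like `requests.get(url, timeout=10).text`.

```python
import re

import pandas as pd
from bs4 import BeautifulSoup

# Made-up markup for illustration only; the real site's tags and
# class names are assumptions, not taken from this project.
SAMPLE_HTML = """
<div class="listing">
  <span class="city">Springfield</span>
  <span class="price">$450,000</span>
  <span class="beds">3 bd</span>
</div>
<div class="listing">
  <span class="city">Riverton</span>
  <span class="price">$320,500</span>
  <span class="beds">2 bd</span>
</div>
"""


def parse_listings(html: str) -> pd.DataFrame:
    """Turn listing cards in an HTML string into one row per house."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.find_all("div", class_="listing"):
        price_text = card.find("span", class_="price").get_text()
        beds_text = card.find("span", class_="beds").get_text()
        rows.append(
            {
                "city": card.find("span", class_="city").get_text(strip=True),
                # Strip "$" and "," so the price can be stored as an integer.
                "price": int(re.sub(r"[^\d]", "", price_text)),
                # Pull the first run of digits out of text such as "3 bd".
                "beds": int(re.search(r"\d+", beds_text).group()),
            }
        )
    return pd.DataFrame(rows)


listings = parse_listings(SAMPLE_HTML)
print(listings)
```

Keeping the parsing in a function that takes a plain string makes it possible to test the cleaning logic without hitting the website.
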
- Exploratory data analysis
- Price distribution across houses
- Average increase in house price for each city
- Factors that affect house prices
- Average housing prices by city
- Effects of bedroom count on house prices
- Average number of bathrooms by price
- Average price per square foot by county
- Multiple linear regression using ordinary least squares (OLS)
- Predictive analysis
- Residual plots
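
The regression and residual steps listed above can be sketched roughly as follows. This is a minimal example on synthetic data, not the project's actual model: the feature names (`sqft`, `beds`, `baths`), the coefficients used to generate prices, and the noise level are all assumptions, and `numpy.linalg.lstsq` stands in for whichever OLS routine the project uses.

```python
import numpy as np

# Synthetic data for illustration only; none of these numbers come
# from the scraped dataset.
rng = np.random.default_rng(0)
n = 200
sqft = rng.uniform(800, 3500, n)
beds = rng.integers(1, 6, n)
baths = rng.integers(1, 4, n)
noise = rng.normal(0, 10_000, n)
price = 50_000 + 150 * sqft + 10_000 * beds + 15_000 * baths + noise

# Design matrix with an intercept column, then the three predictors.
X = np.column_stack([np.ones(n), sqft, beds, baths])

# Ordinary least squares: find coef minimizing ||X @ coef - price||^2.
coef, *_ = np.linalg.lstsq(X, price, rcond=None)

# Predictions and residuals (observed minus fitted).
predicted = X @ coef
residuals = price - predicted

# Coefficient of determination, R^2 = 1 - SS_res / SS_tot.
ss_res = np.sum(residuals**2)
ss_tot = np.sum((price - price.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print("coefficients:", coef)
print("R^2:", r_squared)
```

A residual plot is then a scatter of `predicted` against `residuals`, for example with matplotlib's `plt.scatter`; a structureless band around zero suggests the linear form is a reasonable fit.
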