Welcome to my Data Science Projects Repository! This repository contains a collection of data science projects I have worked on over the years.
-
Fine-Tuning CodeT5 on Ruby Summarization
- Description: Fine-tuning CodeT5 (LLM) on the Ruby code summarization task.
- Technologies: PyTorch, PyTorch Lightning, Hugging Face, Docker
-
CNN-news summarization and topic modeling with LLMs
- Description: Text summarization and topic modeling with LLMs, using T5 (encoder-decoder) and BERT (encoder), respectively.
- Technologies: PyTorch, Hugging Face
-
Bird species image and audio classification
- Description: Image classification of 525 bird species using InceptionV3 transfer learning and fine-tuning, plus audio classification of 25 bird species with a custom convolutional neural network architecture.
- Technologies: TensorFlow, Keras, Kaggle datasets
-
f‑AI‑nance: Automated Stock Market Screener
- Description: Deployment of stock market screening models enhanced by deep learning techniques (LSTM).
- Technologies: TensorFlow, Keras, Pandas, NumPy
-
BTC‑Sentinel: Unraveling Bitcoin Market Sentiment with Machine Learning
- Description: A daily pipeline that gathers and merges data from diverse sources (Reddit, Google Trends, web articles, etc.) via APIs and web scraping to construct a Bitcoin market sentiment index.
- Technologies: SQL, Selenium (web scraping), Reddit API, RDBMS, NLP sentiment analysis, Hugging Face
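
As a rough illustration of the merging step, here is a minimal sketch of how daily readings from several sources could be combined into one index. It is not the project's actual pipeline: the function names, the min-max scaling, the equal-weight average, and the sample values are all assumptions made up for this example.

```python
from statistics import mean


def min_max_scale(series):
    """Rescale a {date: value} mapping to the range [0, 1]."""
    lo, hi = min(series.values()), max(series.values())
    if hi == lo:
        # A constant series carries no signal; place it at the midpoint.
        return {day: 0.5 for day in series}
    return {day: (value - lo) / (hi - lo) for day, value in series.items()}


def build_sentiment_index(sources):
    """Average the scaled sources on the days that every source covers."""
    scaled = {name: min_max_scale(series) for name, series in sources.items()}
    common_days = set.intersection(*(set(s) for s in scaled.values()))
    return {
        day: mean(s[day] for s in scaled.values())
        for day in sorted(common_days)
    }


# Hypothetical sample data, invented for illustration only.
sources = {
    "reddit_sentiment": {"2024-01-01": -0.2, "2024-01-02": 0.4, "2024-01-03": 0.1},
    "google_trends": {"2024-01-01": 40, "2024-01-02": 80, "2024-01-03": 60},
    "news_sentiment": {"2024-01-02": 0.3, "2024-01-03": -0.1},
}

index = build_sentiment_index(sources)
for day, value in index.items():
    print(day, round(value, 3))
```

Scaling each source before averaging keeps a source with large raw values (such as search interest) from dominating the index, and restricting to shared days avoids comparing days with missing inputs.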
-
EarlyCervix: Machine Learning for Cervical Cancer Detection
- Description: Supervised classification analysis using several machine learning models to predict the outcome of a biopsy test.
- Technologies: scikit-learn, KNIME
-
Development of a Light Boosting Model for Diabetes Prediction
- Description: The aim of this study was to develop a Light Boosting model for diabetes prediction that serves as a valuable tool for healthcare providers, helping them identify patients at risk of diabetes and enabling earlier intervention and prevention strategies.
- Technologies: scikit-learn, KNIME
-
E‑Commerce Sales Forecasting and Interactive Monitoring Visualization
- Description: Forecasting future sales for an e-commerce store using various techniques and identifying the best‑performing predictive model. An interactive dashboard was also designed to present the forecasting results in a clear and accessible manner.
- Technologies: TensorFlow, Keras, pmdarima: SARIMAX, scikit-learn, Tableau, R
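
The idea of comparing several forecasters and keeping the best one can be sketched as follows. This is only an illustration under stated assumptions: simple baseline forecasters stand in for the project's actual models, the holdout split and the MAE metric are choices made for this example, and the sales figures are invented.

```python
def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def naive_forecast(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def seasonal_naive_forecast(train, horizon, season=4):
    """Repeat the last full season of observations."""
    return [train[-season + (i % season)] for i in range(horizon)]


def mean_forecast(train, horizon):
    """Repeat the historical average."""
    return [sum(train) / len(train)] * horizon


# Hypothetical quarterly sales, invented for illustration only.
sales = [10, 20, 30, 40, 12, 22, 32, 42, 14, 24, 34, 44]
train, test = sales[:8], sales[8:]  # hold out the last four periods

candidates = {
    "naive": naive_forecast,
    "seasonal_naive": seasonal_naive_forecast,
    "mean": mean_forecast,
}

# Score every candidate on the held-out periods and keep the lowest error.
scores = {name: mae(test, fn(train, len(test))) for name, fn in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

The same loop extends to heavier models: any forecaster that takes the training series and a horizon can be added to `candidates` and is scored on the same held-out periods.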
-
The Impact of Short-Term Rentals on European Cities
- Description: This study examines the impact of short-term rentals (STRs), specifically Airbnb, on urban housing markets, demographic shifts, and regulatory responses in Barcelona. A comparative analysis with Milan, where regulations remain lenient, highlights the critical role of robust regulatory frameworks in curbing the negative externalities of the STR market while balancing economic benefits and housing affordability.
- Technologies: Python, NumPy, Pandas, Matplotlib