nmahmudova/DataMining
SNCB Anomaly Detection Project

Overview

This is a Data Mining project at the Université Libre de Bruxelles (ULB). It aims to detect anomalies in SNCB train data by combining data preprocessing, domain knowledge, and clustering and outlier detection algorithms.

Team Members:

  • Simon Coessens
  • Md Kamrul Islam (Konok)
  • Narmina Mahmudova
  • José Carlos Lozano (Pepe)

Objectives

Identify anomalies through:

  1. Data Preprocessing: Handling data quality issues.
  2. Domain Knowledge Analysis: Addressing specific research questions.
  3. Advanced Algorithms: Implementing clustering and outlier detection.
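
As an illustration of the outlier detection objective, here is a minimal sketch of a rule-based outlier check using the interquartile range (IQR). It is not taken from the project notebooks: the function name `iqr_outliers`, the multiplier `k`, and the sample temperature readings are all hypothetical, and the project may well use a different method.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical temperature readings (degrees Celsius), not real SNCB data.
readings = [71.0, 72.5, 70.8, 73.1, 72.0, 118.4, 71.7, 72.9]
outlier_indices = iqr_outliers(readings)
print(outlier_indices)
```

A fixed rule like this is easy to explain and makes a reasonable baseline to compare clustering-based detectors against.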

Key Steps

  1. Data Handling: Migrated from CSV to PostgreSQL for better performance.
  2. Exploratory Data Analysis: Identified anomalies in temperature, RPM, etc.
  3. Data Enrichment: Added weather data (temperature, humidity, rain).
  4. Research Questions: Investigated temperature anomalies, sensor errors, and speed irregularities.
  5. Anomaly Detection: Utilized clustering and classification techniques.
  6. Dashboard Development: Created visualizations for anomaly insights.
  7. Real-Time Detection: Set up streaming algorithms to flag live anomalies.
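
The real-time detection step could be sketched as a rolling z-score detector that scores each new reading against a window of recent values. This is an assumption about what a streaming detector might look like, not the project's actual algorithm: the class name `RollingZScoreDetector`, its parameters, and the sample RPM stream are hypothetical.

```python
from collections import deque
from statistics import mean, stdev


class RollingZScoreDetector:
    """Flag readings that deviate strongly from a window of recent values."""

    def __init__(self, window=20, threshold=3.0, min_samples=5):
        self.window = deque(maxlen=window)  # most recent normal readings
        self.threshold = threshold          # z-score above which we flag
        self.min_samples = min_samples      # readings needed before scoring

    def update(self, value):
        """Score one reading; return True if it is flagged as anomalous."""
        is_anomaly = False
        if len(self.window) >= self.min_samples:
            mu = mean(self.window)
            sigma = stdev(self.window)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        # Only normal readings enter the window, so a single spike
        # does not inflate the baseline used for later readings.
        if not is_anomaly:
            self.window.append(value)
        return is_anomaly


# Hypothetical RPM stream, not real SNCB data.
detector = RollingZScoreDetector(window=10, threshold=3.0)
stream = [800, 805, 798, 802, 799, 801, 1500, 803, 797]
flags = [detector.update(v) for v in stream]
print(flags)
```

Keeping flagged readings out of the window is a design choice: it stops one spike from widening the baseline, but a sustained shift in the signal will keep being flagged until the detector is reset.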

Meetings & Tasks

  • 2 Nov: Initial anomaly techniques, data cleaning, visualizations.
  • 9 Nov: MobilityDB setup, local Jupyter, anomaly refinement.
  • 19 Nov: Feature engineering and database updates.
  • 23 Nov: Data Mining lab preparation.

Next Steps

  • Refine streaming algorithms.
  • Improve clustering and classification for data cleaning.
