CS598 Project: HiCu-ICD

This repository is a modified version of the original HiCu-ICD repository for the purposes of our course project. The original repository contains code for the paper HiCu: Leveraging Hierarchy for Curriculum Learning in Automated ICD Coding.

Our modified code adds support for recent versions of the required packages and libraries, and reduces the number of models to be trained to fit the scope of our project.

Clone the repository to a local directory.

Once you have gained access to the MIMIC-III v1.4 dataset (see link for requirements), download the required files and move them into a /data folder within the root of the repository. Use the following directory structure:

 HiCu-ICD/ (root)
 |
 ... (other files in the repository)
 └── data/
 |   |   D_ICD_DIAGNOSES.csv
 |   |   D_ICD_PROCEDURES.csv
 |   └───mimic3/
 |   |   |   NOTEEVENTS.csv
 |   |   |   DIAGNOSES_ICD.csv
 |   |   |   PROCEDURES_ICD.csv
 |   |   |   train_full_hadm_ids.csv
 |   |   |   train_50_hadm_ids.csv
 |   |   |   dev_full_hadm_ids.csv
 |   |   |   dev_50_hadm_ids.csv
 |   |   |   test_full_hadm_ids.csv
 |   |   |   test_50_hadm_ids.csv

The *_hadm_ids.csv files can be found here.
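
Before running anything else, it can help to confirm that every file from the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_data_files` is hypothetical, and it assumes the layout reads as `data/D_ICD_*.csv` plus a `data/mimic3/` subfolder and that it is run from the repository root.

```python
from pathlib import Path

# File names copied from the directory layout in this README.
ROOT_FILES = ["D_ICD_DIAGNOSES.csv", "D_ICD_PROCEDURES.csv"]
MIMIC3_FILES = ["NOTEEVENTS.csv", "DIAGNOSES_ICD.csv", "PROCEDURES_ICD.csv"] + [
    f"{split}_{size}_hadm_ids.csv"
    for split in ("train", "dev", "test")
    for size in ("full", "50")
]


def missing_data_files(data_dir="data"):
    """Return the expected paths that do not exist under data_dir.

    Hypothetical helper for illustration; it is not provided by the repository.
    """
    data_dir = Path(data_dir)
    expected = [data_dir / name for name in ROOT_FILES]
    expected += [data_dir / "mimic3" / name for name in MIMIC3_FILES]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected data files are present.")
```

An empty result means all eleven files listed in the layout were found.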

Install the required packages using the following command:

# Python v3.11.7 is recommended
pip install -r requirements.txt

Run the pre-processing script to generate the necessary sampled data files. The --ratio flag sets the fraction of the dataset to use. For example, to use 5% of the dataset, run the following command:
```
python preprocess_mimic3.py --ratio 0.05
```
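To illustrate what a ratio such as 0.05 means in practice, here is a minimal sketch of reproducible subsampling over a list of admission IDs. This is an assumption about the general idea only: it is not the logic of `preprocess_mimic3.py`, and the function name `sample_ids` and its `seed` parameter are hypothetical.

```python
import random


def sample_ids(ids, ratio, seed=0):
    """Return a reproducible random subset holding roughly `ratio` of `ids`.

    Illustrative only; the repository's script may sample differently.
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    ids = sorted(ids)  # fixed order so the same seed yields the same subset
    if not ids:
        return []
    k = max(1, round(len(ids) * ratio))  # keep at least one item
    return sorted(random.Random(seed).sample(ids, k))


if __name__ == "__main__":
    subset = sample_ids(range(1000), 0.05)
    print(len(subset))  # 50 of 1000 IDs are kept at ratio 0.05
```

Fixing the seed keeps the sampled subset identical across runs, which makes results on the reduced dataset comparable.
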
Use the training scripts in the ./runs directory to train each of the models. For example, to train the base MultiResCNN model, run the following command:
```
./runs/run_multirescnn.sh
```
Evaluation metrics are saved to a timestamped folder inside the ./models directory.
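
After several training runs, the most recent output folder can be located programmatically. The sketch below is a hypothetical helper, not part of the repository. Because the exact timestamp format of the folder names is not documented here, it picks the folder by modification time instead of parsing names.

```python
from pathlib import Path


def latest_run_dir(models_dir="models"):
    """Return the most recently modified subdirectory of models_dir, or None.

    Hypothetical helper; it uses modification time because the folder naming
    scheme is not specified in this README.
    """
    models_dir = Path(models_dir)
    if not models_dir.is_dir():
        return None
    run_dirs = [path for path in models_dir.iterdir() if path.is_dir()]
    return max(run_dirs, key=lambda path: path.stat().st_mtime, default=None)


if __name__ == "__main__":
    print(latest_run_dir())
```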
