This track focuses on answering biological questions using a synthetic data set of diabetes patients. It is up to you to find interesting biological patterns in the data, but here are a few hints to get you started:
- Can you explore which treatments are effective, and which ones are not?
- What are the most important factors for hospital readmission?
- Is the best cure insulin or a combination of drugs?
- What's the best treatment for a newly diagnosed patient, based on the patient's features and the response to certain drugs?
Aim: To generate new biological/medical knowledge from the synthetic data set. If you use machine learning, it is important that you can explain why the model predicts the way it does (the biological pattern in the data), so deep learning might not be well suited to this challenge.
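One simple way to keep an analysis explainable is to start from plain group-wise rates before reaching for a model. The sketch below is only an illustration, not a prescribed method: the helper name `readmission_rates` is hypothetical, the label `<30` for a readmission within 30 days is an assumption about how the `readmitted` column is encoded, and the four sample rows are made up to show the call.

```python
import csv
import io
from collections import Counter


def readmission_rates(rows, group_col, target_col="readmitted", positive="<30"):
    """Return {group value: share of rows whose target equals `positive`}.

    `rows` is any iterable of dicts, e.g. a csv.DictReader.
    The default `positive` label "<30" is an assumed encoding.
    """
    totals = Counter()
    hits = Counter()
    for row in rows:
        group = row[group_col]
        totals[group] += 1
        if row[target_col] == positive:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Tiny made-up sample, only to show the call; it is NOT taken from the data set.
sample = io.StringIO(
    "A1Cresult,readmitted\n"
    ">8,<30\n"
    ">8,No\n"
    "normal,No\n"
    "normal,>30\n"
)
rates = readmission_rates(csv.DictReader(sample), "A1Cresult")
print(rates)
```

To run the same helper on the real file, open data/synthetic_data.csv with `open(..., newline="")`, wrap it in `csv.DictReader`, and pass a column of interest such as `insulin` as `group_col`. Check the actual labels in the file first, since the ones used here are assumed.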
Note: To access the data set, Download or Fork this project (on the left under Source Files). Due to the file size (19MB), it may take a few minutes to download.
You are provided with one set of synthetic patient data (synthetic_data.csv), which consists of 78441 rows and 42 columns.
The file can be found in the data directory in the source files.
The feature descriptions can be found in data/feature_descriptions.csv:
race
- Values: Caucasian, Asian, African American, Hispanic, and other
gender
- Values: male and female
age
- Grouped in 10-year intervals: [0, 10), [10, 20), ..., [90, 100)
time_in_hospital
- Integer number of days between admission and discharge
num_lab_procedures
- Number of lab tests performed during the encounter
num_procedures
- Number of procedures (other than lab tests) performed during the encounter
num_medications
- Number of distinct generic names administered during the encounter
number_outpatient
- Number of outpatient visits of the patient in the year preceding the encounter
number_emergency
- Number of emergency visits of the patient in the year preceding the encounter
number_inpatient
- Number of inpatient visits of the patient in the year preceding the encounter
number_diagnoses
- Number of diagnoses entered into the system
max_glu_serum
- Indicates the range of the result or if the test was not taken. Values: >200, >300, normal, and none if not measured
A1Cresult
- Indicates the range of the result or if the test was not taken. Values: >8 if the result was greater than 8%, >7 if the result was greater than 7% but less than 8%, normal if the result was less than 7%, and none if not measured
metformin
- Medication - decreases blood glucose levels by decreasing hepatic glucose production (gluconeogenesis)
repaglinide
- Medication - lowers blood glucose levels by blocking ATP-dependent potassium channels in pancreatic beta cells, which in turn stimulates insulin secretion
nateglinide
- Medication - lowers blood glucose levels by stimulating insulin secretion from the pancreas (same as above)
chlorpropamide
- Medication - acts by stimulating beta cells of the pancreas to release insulin, binding to ATP-sensitive potassium channels on the pancreatic cell surface (sulfonylurea)
glimepiride
- Medication - stimulates the secretion of insulin granules from pancreatic islet beta cells by blocking ATP-sensitive potassium channels (KATP channels) and causing depolarization of the beta cells (sulfonylurea)
acetohexamide
- Medication - same as above (sulfonylurea)
glipizide
- Medication - same as above (sulfonylurea)
glyburide
- Medication - same as above (sulfonylurea)
tolbutamide
- Medication - same as above (sulfonylurea)
pioglitazone
- Medication - a selective agonist at peroxisome proliferator-activated receptor-gamma (PPARγ) in target tissues for insulin action such as adipose tissue
rosiglitazone
- Medication - same as above
acarbose
- Medication - acts as a competitive, reversible inhibitor of pancreatic alpha-amylase and membrane-bound intestinal alpha-glucoside hydrolase
miglitol
- Medication - reversible inhibition of membrane-bound intestinal alpha-glucoside hydrolase enzymes
troglitazone
- Medication - same as other glitazones
tolazamide
- Medication - sulfonylurea
examide
- Medication - diuretic
citoglipton
- Medication - an oral dipeptidyl peptidase-4 (DPP-4) inhibitor used in conjunction with diet and exercise to improve glycemic control in patients with type 2 DM
insulin
- Medication
glyburide-metformin
- Medication - glyburide belongs to sulfonylureas
glipizide-metformin
- Medication - glipizide: sulfonylurea
glimepiride-pioglitazone
- Medication - glimepiride: sulfonylurea
metformin-rosiglitazone
- Medication
metformin-pioglitazone
- Medication
change
- Indicates if there was a change in diabetic medications (either dosage or generic name). Values: change and no change
diabetesMed
- Indicates if there was any diabetic medication prescribed. Values: yes and no
readmitted
- Values: <30 if the patient was readmitted in less than 30 days, >30 if the patient was readmitted in more than 30 days, and No for no record of readmission
_diag_1
- Generic primary diagnosis extracted from ICD-9 codes
_diag_2
- Generic secondary diagnosis extracted from ICD-9 codes
_diag_3
- Additional generic secondary diagnosis extracted from ICD-9 codes
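Several of the features above are stored as range labels rather than numbers, so a first step in many analyses is to turn them into ordered values. The following is a minimal sketch under stated assumptions: the helper names `age_midpoint` and `a1c_level` are hypothetical, the age labels are assumed to look like `[70, 80)` (a hyphen separator is also accepted in case the file uses `[70-80)`), and the ordinal coding of `A1Cresult` is one possible choice rather than part of the data set.

```python
import re

# Assumed ordinal coding for A1Cresult; "none" (not measured) maps to None.
A1C_ORDER = {"normal": 0, ">7": 1, ">8": 2}

# Accepts "[70, 80)" as well as "[70-80)"; the exact label format is an assumption.
AGE_PATTERN = re.compile(r"\[(\d+)\s*[,-]\s*(\d+)\)")


def age_midpoint(interval):
    """Turn an age interval label such as "[70, 80)" into its midpoint (75.0)."""
    match = AGE_PATTERN.fullmatch(interval.strip())
    if match is None:
        raise ValueError(f"unexpected age label: {interval!r}")
    low, high = (int(group) for group in match.groups())
    return (low + high) / 2


def a1c_level(value):
    """Map an A1Cresult label to an ordinal level, or None if not measured."""
    return A1C_ORDER.get(value.strip().lower())


print(age_midpoint("[70, 80)"))  # 75.0
print(a1c_level(">8"))           # 2
print(a1c_level("none"))         # None
```

Keeping "not measured" as `None` instead of a number avoids treating a missing test as a low result, which matters when looking at how test results relate to readmission.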
Please join the #diabetes-bioinformatics room on Slack.