6.S897/HST.956: Machine Learning for Healthcare

Instructors: David Sontag, Peter Szolovits
Teaching Assistants: Willie Boag, Irene Chen (Office Hours: Monday 1pm, 32-G 9th floor lounge)
Graduate level; Units 3-0-9 (counts as an AAGS subject)
Time: Tuesdays & Thursdays, 2:30-4pm
Location: 4-270
Prerequisite: 6.036/6.862 or 6.867 or 9.520/6.860 or 6.806/6.864 or 6.438 or 6.034
Recitations (optional): Fridays at 2pm (4-153)
Contact: Piazza
Stellar page: https://stellar.mit.edu/S/course/HST/sp19/HST.956/

Course Description | Schedule | Prerequisite quiz | Grading | Problem sets | Lecture scribes | MLHC Community Consulting | Final projects | Collaboration Policy | Problem Set Late Policy

Course description

Introduces students to machine learning in healthcare, including the nature of clinical data and the use of machine learning for risk stratification, disease progression modeling, precision medicine, diagnosis, subtype discovery, and improving clinical workflows. Topics include causality, interpretability, algorithmic fairness, time-series analysis, graphical models, deep learning and transfer learning. Guest lectures by clinicians from the Boston area and course projects with real clinical data emphasize subtleties of working with clinical data and translating machine learning into clinical practice.

Note that because of high demand, we do not have space for listeners.


Schedule

The schedule is subject to change.

Class | Date | Lecture | Materials | Assignments
1 | Tues Feb 05 | Introduction: What makes healthcare unique? | | Prerequisite quiz due, Pset0 out
2 | Thurs Feb 07 | Overview of clinical care | |
3 | Tues Feb 12 | Deep dive into clinical data | | Reflection questions, Pset0 due, Pset1 out
4 | Thurs Feb 14 | Risk stratification using EHRs and insurance claims | | Reflection questions
Tues Feb 19 - President's Day, Monday schedule
5 | Thurs Feb 21 | Survival modeling | | Reflection questions, Pset1 due
6 | Tues Feb 26 | Physiological time-series | | Reflection questions, Pset2 out (on stellar)
7 | Thurs Feb 28 | Clinical text part 1 | | Reflection questions
8 | Tues Mar 05 | Clinical text part 2 | | Pset2 due, Pset3 out
9 | Thurs Mar 07 | Translating technology into the clinic | |
10 | Tues Mar 12 | Machine learning for cardiology | | Reflection questions, Pset3 due, Pset4 out
11 | Thurs Mar 14 | Machine learning for differential diagnosis | | Reflection questions
12 | Tues Mar 19 | Machine learning for pathology | | Reflection questions, Pset4 due
13 | Thurs Mar 21 | Machine learning for mammography | | Reflection questions, project proposals due
Tues Mar 26 & Thurs Mar 28 - Spring vacation
14 | Tues Apr 02 | Causal inference part 1 | | Pset5 out
15 | Thurs Apr 04 | Causal inference part 2 | |
16 | Tues Apr 09 | Reinforcement learning part 1 | | Pset5 due
17 | Thurs Apr 11 | Reinforcement learning part 2 | |
Tues Apr 16 - Patriots Day holiday
18 | Thurs Apr 18 | Disease progression modeling | | Pset6 out
19 | Tues Apr 23 | Disease subtyping | |
20 | Thurs Apr 25 | Precision medicine | | Pset6 due
21 | Tues Apr 30 | Automating clinical workflows | |
22 | Thurs May 02 | Regulation of ML/AI in the US | |
23 | Tues May 07 | Fairness, transparency, and accountability | |
24 | Thurs May 09 | Robustness to dataset shift | |
25 | Tues May 14 | Interpretability part 1 | | Project poster presentations (evening)
26 | Thurs May 16 | Interpretability part 2 | | Projects due

Prerequisite quiz

This quiz will not count toward your grade, but will be used by the course staff to check prerequisites (6.036/6.862 or 6.867 or 9.520/6.860 or 6.806/6.864 or 6.438 or 6.034) and to assess students' preparation for this class.

The prerequisite quiz is now closed, but you can view the questions here.


Problem sets

We expect there will be seven problem sets this year.

Lecture scribes

Each student is expected to either "scribe" for one lecture or "consult" for one MLHC community evening session (see below). A given lecture will have 1-2 scribes who are responsible for summarizing what was discussed in class. The first draft of the notes should be submitted to the TAs by 11:59pm on the day after class (i.e., about 32 hours after lecture ends). We will send you suggested revisions, and once the notes are finalized, we will post them on the course website. The goal is to get the notes out within one week of the corresponding class.
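As a quick check on the draft deadline, the sketch below computes 11:59pm on the day after a lecture and how many hours that leaves after class. It assumes lecture ends at 4:00pm (taken from the listed class time of 2:30-4pm); the function names are illustrative only and not part of any course tooling.

```python
from datetime import date, datetime, time, timedelta

# Assumption: lecture ends at 4:00pm, per the listed class time (2:30-4pm).
LECTURE_END = time(16, 0)


def scribe_draft_deadline(lecture_day: date) -> datetime:
    """Return 11:59pm on the day after the lecture."""
    return datetime.combine(lecture_day + timedelta(days=1), time(23, 59))


def hours_after_lecture(lecture_day: date) -> float:
    """Hours between the assumed end of lecture and the draft deadline."""
    lecture_end = datetime.combine(lecture_day, LECTURE_END)
    delta = scribe_draft_deadline(lecture_day) - lecture_end
    return delta.total_seconds() / 3600


if __name__ == "__main__":
    first_class = date(2019, 2, 5)  # Class 1, Tues Feb 05
    print(scribe_draft_deadline(first_class))          # 2019-02-06 23:59:00
    print(round(hours_after_lecture(first_class), 2))  # 31.98
```

Under that assumption the window is just under 32 hours, regardless of which lecture is scribed.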

We expect writing up lecture notes to take no more than 3 hours. If there are two scribes for one lecture, the two scribes should collaborate and submit one writeup. The notes you write should cover all the material covered during the relevant lecture, plus real references to the papers containing the covered material. Your notes should be understandable to someone who has not been to the lecture. You should write in full sentences where appropriate; point form is often too terse to follow without a sound track (though occasionally it is appropriate). Use numbered sections, subsections, etc. to organize the material hierarchically and with meaningful titles. Try to preserve the motivation, difficulties, solution ideas, failed attempts, and partial results obtained along the way in the actual lecture.

Write your notes using LaTeX. Please use our template, either by downloading it or by copying it in Overleaf (Menu -> Copy project).

Sign up for lecture scribing here.

MLHC Community Consulting

Each student is expected to either "scribe" for one lecture (see above) or "consult" for one Machine Learning for Healthcare (MLHC) community evening session. Throughout the semester, we will organize four evening sessions to engage with the larger MLHC community. Clinicians and other Boston-area people interested in machine learning for healthcare will come to talk through their problems and ideas.

MLHC Community Consulting for this semester will occur:

Clinicians are welcome to sign up here for more information, or see our poster.

Students who sign up for community consulting will be expected to attend the entire session and submit a write-up of their experiences shortly after the session. We expect one write-up per clinician, so students who talked to the same clinician should coordinate. Write-ups are due one week after the consulting session.

Sign up for MLHC community consulting here.


Final projects

Projects will include a proposal, poster presentation, and final report. We will add more information here shortly.

Collaboration Policy

Students must write up their problem sets individually. Students should not share their code or solutions (i.e., the write-up) with anyone inside or outside of the class, nor should they post them publicly to GitHub or any other website. You are asked on problem sets to identify your collaborators. If you did not discuss the problem set with anyone, you should write "Collaborators: none." If in writing up your solution you make use of any external reference (e.g., a paper, Wikipedia, a website), both acknowledge your source and write up the solution in your own words. It is a violation of this policy to submit a problem solution that you cannot orally explain to a member of the course staff.

Plagiarism and other dishonest behavior cannot be tolerated in any academic environment that prides itself on individual accomplishment. If you have any questions about the collaboration policy, or if you feel that you may have violated the policy, please talk to one of the course staff.

Problem Set Late Policy

(starting from Pset2 onwards)