Task: given a card and its review history, predict the probability of recall at a given time.
Good metrics: Log Loss (cross-entropy), RMSE (root mean square error)
Bad metrics: MAE (mean absolute error), AUC (area under the ROC curve), AP (average precision)
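
A minimal sketch of why the split above might be drawn this way, assuming the goal is calibrated probabilities. The data is hypothetical (10 reviews, 8 recalled), and the helper names `log_loss`, `rmse`, and `mae` are illustrative, not taken from any particular library. Log Loss and RMSE score the calibrated prediction (0.8) better than an overconfident one (1.0), whereas MAE prefers the overconfident one.

```python
import math


def log_loss(outcomes, predictions, eps=1e-15):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(outcomes, predictions):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(outcomes)


def rmse(outcomes, predictions):
    """Root mean square error between outcomes (0/1) and predicted probabilities."""
    squared = sum((y - p) ** 2 for y, p in zip(outcomes, predictions))
    return math.sqrt(squared / len(outcomes))


def mae(outcomes, predictions):
    """Mean absolute error between outcomes (0/1) and predicted probabilities."""
    return sum(abs(y - p) for y, p in zip(outcomes, predictions)) / len(outcomes)


# Hypothetical review outcomes: 1 = recalled, 0 = forgotten (true recall rate 0.8).
outcomes = [1] * 8 + [0] * 2
calibrated = [0.8] * 10      # predicts the true recall rate
overconfident = [1.0] * 10   # always predicts certain recall

for name, metric in [("log loss", log_loss), ("RMSE", rmse), ("MAE", mae)]:
    print(f"{name}: calibrated={metric(outcomes, calibrated):.4f}, "
          f"overconfident={metric(outcomes, overconfident):.4f}")
```

Lower is better for all three. AUC and AP are not shown here; they depend only on how predictions are ranked, so rescaling every probability would leave them unchanged.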