Houjun Liu

evaluating model fitness

We want to compare features of the model to features of the data:

Visual diagnostics

  1. PDF plot
  2. CDF of data vs. CDF of model
  3. Quantile-Quantile plot
  4. Calibration Plot
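
The CDF and quantile-quantile comparisons above can be sketched numerically before any plotting. This is a minimal sketch, assuming NumPy and assuming we can draw samples from the model; `empirical_cdf`, `qq_points`, and the synthetic `data` and `model_samples` arrays are illustrative names, not part of the original notes.

```python
import numpy as np

def empirical_cdf(samples, grid):
    """Fraction of samples that are <= each grid point."""
    samples = np.sort(np.asarray(samples, dtype=float))
    return np.searchsorted(samples, grid, side="right") / samples.size

def qq_points(data, model_samples, n_quantiles=99):
    """Matched quantiles of data and model samples (the points of a Q-Q plot)."""
    probs = np.linspace(0.01, 0.99, n_quantiles)
    return np.quantile(data, probs), np.quantile(model_samples, probs)

# Synthetic stand-ins (assumption): both drawn from the same distribution.
rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=5000)
model_samples = rng.normal(0.0, 1.0, size=5000)

# Largest vertical gap between the two empirical CDFs on a grid.
grid = np.linspace(-3, 3, 61)
cdf_gap = np.max(np.abs(empirical_cdf(data, grid) - empirical_cdf(model_samples, grid)))

# Largest deviation of the Q-Q points from the diagonal.
q_data, q_model = qq_points(data, model_samples)
qq_gap = np.max(np.abs(q_data - q_model))
```

Plotting `q_data` against `q_model` gives the Q-Q plot; points near the diagonal suggest the model and data distributions agree.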

Summative Metrics

  1. KL Divergence
  2. Expected Calibration Error
  3. Maximum Calibration Error
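
As a rough illustration of these metrics, the sketch below computes a discrete KL divergence between two histograms, plus binned expected and maximum calibration error. It assumes the common binned form of calibration error (gap between mean predicted probability and observed frequency per bin); the function names and the number of bins are hypothetical choices, not taken from the notes.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Discrete D_KL(p || q) for two histograms over the same bins."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    mask = p > 0  # terms with p = 0 contribute nothing
    # eps guards against q = 0 where p > 0 (gives a large finite value)
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))

def calibration_errors(confidences, correct, n_bins=10):
    """Return (ECE, MCE) using equal-width confidence bins on [0, 1]."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.digitize(confidences, edges[1:-1])  # bin index in 0..n_bins-1
    ece, mce = 0.0, 0.0
    for b in range(n_bins):
        in_bin = idx == b
        if not in_bin.any():
            continue
        gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
        ece += in_bin.mean() * gap  # weight by fraction of samples in the bin
        mce = max(mce, gap)         # worst bin
    return ece, mce

ece, mce = calibration_errors([0.6, 0.6, 0.9, 0.9], [0, 0, 1, 1])
```

ECE averages the per-bin gaps weighted by bin size, while MCE reports only the worst bin, so MCE is the more conservative of the two.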

Marginalization Ignores Covariances

Notice that the figure on the right captures the distribution much better, yet the marginal distributions don't show this. This is because marginalizing ignores the covariances between dimensions. Hence, remember to keep the dimensions together, and make sure any projections you use capture covariances, etc.
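
A small synthetic example makes the point concrete. In this sketch (assuming NumPy; the "data" and "model" arrays are made up for illustration), both sets have standard normal marginals, so per-dimension checks agree, yet only the data has correlated dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# "Data": two strongly correlated dimensions, each with unit variance.
cov = [[1.0, 0.9], [0.9, 1.0]]
data = rng.multivariate_normal([0.0, 0.0], cov, size=n)

# "Model": same marginals, but the dimensions are independent.
model = rng.normal(size=(n, 2))

# Marginal check: per-dimension standard deviations nearly match.
marginal_std_gap = np.max(np.abs(data.std(axis=0) - model.std(axis=0)))

# Joint check: the correlation exposes the mismatch.
corr_data = np.corrcoef(data, rowvar=False)[0, 1]
corr_model = np.corrcoef(model, rowvar=False)[0, 1]
```

Here `marginal_std_gap` is close to zero while `corr_data` and `corr_model` differ sharply, which is the failure a marginal-only comparison would miss.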

Conditional Distributions

Bin the conditions into groups and perform evals on each.
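
One way to sketch this: split the conditioning variable into quantile bins and compare a statistic of the outcome within each bin. The helper `binned_mean_gaps`, the choice of the mean as the statistic, and the synthetic data are all assumptions for illustration; any of the diagnostics above could be run per bin instead.

```python
import numpy as np

def binned_mean_gaps(cond_data, y_data, cond_model, y_model, n_bins=4):
    """Absolute gap in mean outcome between data and model, per condition bin."""
    cond_data, y_data = np.asarray(cond_data), np.asarray(y_data)
    cond_model, y_model = np.asarray(cond_model), np.asarray(y_model)
    # Quantile bin edges computed from the data's conditions.
    inner = np.quantile(cond_data, np.linspace(0, 1, n_bins + 1))[1:-1]
    bins_data = np.digitize(cond_data, inner)
    bins_model = np.digitize(cond_model, inner)
    gaps = np.full(n_bins, np.nan)  # NaN marks a bin with no samples
    for b in range(n_bins):
        d, m = y_data[bins_data == b], y_model[bins_model == b]
        if d.size and m.size:
            gaps[b] = abs(d.mean() - m.mean())
    return gaps

rng = np.random.default_rng(0)
n = 8000
c_data = rng.uniform(0, 1, n)
y_data = c_data + rng.normal(0, 0.1, n)   # outcome depends on the condition
c_model = rng.uniform(0, 1, n)
y_model = 0.5 + rng.normal(0, 0.1, n)     # model ignores the condition

overall_gap = abs(y_data.mean() - y_model.mean())
gaps = binned_mean_gaps(c_data, y_data, c_model, y_model)
```

The overall means nearly coincide, but the per-bin gaps are large in the lowest and highest bins, so the conditional evaluation catches a model that the pooled evaluation would pass.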

Turing Test

If expert knowledge is available, you can show an expert rollouts from the data and from the model, and see if they can tell them apart.