for X:
- the usual guidelines: predictor of interest, confounders, effect modifiers, precision variables
- think carefully about interpretation and question

for M:
- don't throw everything in there! many M's likely to be highly correlated
- definitely include indicators of an external database if combining reference genomes
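
A minimal sketch of the two checks for M, assuming M is stored as named numeric columns with one entry per reference genome. That layout is an assumption, not something the notes specify. The column names (`genome_length`, `gene_count`, `gc_content`), the database labels (`dbA`, `dbB`), the 0.9 cutoff, and all values are made up for illustration. The first helper flags pairs of candidate M's that are nearly collinear; the second builds 0/1 indicators for the source database.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Plain Pearson correlation between two equal-length numeric lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


def database_indicators(sources, reference):
    """Build 0/1 indicator columns for each database except the reference."""
    levels = sorted(set(sources) - {reference})
    return {
        f"from_{level}": [1 if s == level else 0 for s in sources]
        for level in levels
    }


# Hypothetical candidate M covariates (illustrative values only).
M = {
    "genome_length": [2.1, 3.4, 4.0, 5.2, 1.8, 4.6],
    "gene_count": [2000, 3300, 3900, 5100, 1700, 4500],
    "gc_content": [0.42, 0.55, 0.38, 0.61, 0.47, 0.50],
}
# Hypothetical source database for each reference genome.
source_db = ["dbA", "dbA", "dbB", "dbB", "dbA", "dbB"]

flagged = correlated_pairs(M)  # candidates to drop or combine
indicators = database_indicators(source_db, reference="dbA")

print(flagged)
print(indicators)
```

A flagged pair is a prompt to keep only one of the two (or a summary of both), not an automatic rule. The indicator columns would be added to M alongside whatever covariates survive.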