Capacitance between coils is measured at a frequency low enough that the self inductance doesn't matter.
Self capacitance is found by measuring impedance over frequency and fitting an equivalent circuit of inductance, resistance (core loss) and capacitance. (Typically the equivalent circuit also has a number of additional RLC elements in parallel, representing skin effect, inter-layer capacitance and other complexities; how many depends on how complex the transformer is and how accurate the model needs to be.)
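As a rough illustration of that fitting step, here is a minimal sketch that assumes the simplest possible model: a single parallel RLC, with none of the extra elements mentioned above. The function names and the component values in the synthetic sweep are made up for the example. It takes L from the lowest-frequency point (where the winding still looks purely inductive), R from the height of the impedance peak, and C from the frequency of that peak. A real transformer usually needs a proper curve fit over more elements.

```python
import math

def parallel_rlc_impedance(f, L, R, C):
    """Complex impedance of L, R and C all in parallel at frequency f (Hz)."""
    w = 2 * math.pi * f
    admittance = 1 / R + 1 / (1j * w * L) + 1j * w * C
    return 1 / admittance

def estimate_rlc(freqs, z):
    """Estimate (L, R, C) from an impedance sweep, assuming one parallel RLC.

    freqs must be ascending and start well below the self-resonance.
    """
    # Well below resonance the reactance is approximately w*L.
    L = z[0].imag / (2 * math.pi * freqs[0])
    # At parallel resonance |Z| peaks and equals the parallel loss resistance.
    k = max(range(len(z)), key=lambda i: abs(z[i]))
    f_res = freqs[k]
    R = abs(z[k])
    # Resonance condition: (2*pi*f_res)^2 * L * C = 1
    C = 1 / ((2 * math.pi * f_res) ** 2 * L)
    return L, R, C

# Synthetic "measurement" with assumed values: 1 mH, 50 kohm, 20 pF.
freqs = [1e3 * 10 ** (4 * i / 2000) for i in range(2001)]  # 1 kHz to 10 MHz, log spaced
z = [parallel_rlc_impedance(f, 1e-3, 50e3, 20e-12) for f in freqs]

L_est, R_est, C_est = estimate_rlc(freqs, z)
print(f"L = {L_est * 1e3:.3f} mH, R = {R_est / 1e3:.1f} kohm, C = {C_est * 1e12:.1f} pF")
```

The accuracy of C here is limited by how finely the sweep resolves the resonant peak, so a denser sweep around the peak helps.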
It can also be predicted using transmission line (TL) theory, which works best when the transformer was designed from TL theory in the first place. A conventional winding that violates good TL design can still be estimated in part this way, but there will be more uncertainty about how much worse its bandwidth will be.
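For a ballpark of what the TL view gives, here is a small sketch using only textbook lossless-line relations. It assumes the winding behaves as one uniform line of known wire length, characteristic impedance and velocity factor, which is an idealization. The function names, the 0.7 default velocity factor and the numbers in the example are placeholders for illustration, not values for any particular transformer.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s

def quarter_wave_frequency(wire_length_m, velocity_factor=0.7):
    """Frequency at which the line is a quarter wavelength long.

    velocity_factor is an assumed placeholder; it depends on the insulation
    and core around the winding.
    """
    return velocity_factor * C0 / (4.0 * wire_length_m)

def line_lumped_equivalents(z0_ohm, wire_length_m, velocity_factor=0.7):
    """Total series L and shunt C of a lossless line: L = Z0*tau, C = tau/Z0."""
    tau = wire_length_m / (velocity_factor * C0)  # one-way delay, seconds
    return z0_ohm * tau, tau / z0_ohm

# Example with made-up numbers: 5 m of wire forming a 100 ohm line.
f_qw = quarter_wave_frequency(5.0)
L_tot, C_tot = line_lumped_equivalents(100.0, 5.0)
print(f"quarter-wave at {f_qw / 1e6:.1f} MHz, "
      f"L = {L_tot * 1e6:.2f} uH, C = {C_tot * 1e12:.0f} pF")
```

The quarter-wave frequency is only a rough marker for where the lumped model stops being trustworthy; a winding that departs from good TL design will generally fall short of it by an amount this sketch can't predict.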
Tim