The EM algorithm for factor analysis (FA) is derived in detail below; working through it is a useful exercise if you want to become proficient at the math. For mixtures of factor analyzers, see "The EM Algorithm for Mixtures of Factor Analyzers", Zoubin Ghahramani and Geoffrey E. Hinton, 1996.
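We begin with the model (centering $\mathbf{x}$ to cancel $\boldsymbol{\mu}$, w.l.o.g.):

$$p(\mathbf{z}) = \mathcal{N}(\mathbf{z} \mid \mathbf{0}, \mathbf{I}), \qquad p(\mathbf{x} \mid \mathbf{z}) = \mathcal{N}(\mathbf{x} \mid \mathbf{W}\mathbf{z}, \boldsymbol{\Psi}),$$

where we have centered $\mathbf{x}$ to simplify the derivation. Applying (4.124) and (4.125) to these two equations gives the posterior over the latent variable:

$$p(\mathbf{z} \mid \mathbf{x}) = \mathcal{N}(\mathbf{z} \mid \mathbf{m}, \boldsymbol{\Sigma}), \qquad \boldsymbol{\Sigma} = \left(\mathbf{I} + \mathbf{W}^T \boldsymbol{\Psi}^{-1} \mathbf{W}\right)^{-1}, \qquad \mathbf{m} = \boldsymbol{\Sigma} \mathbf{W}^T \boldsymbol{\Psi}^{-1} \mathbf{x}.$$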
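As a numerical sanity check, the posterior of the centered FA model, $\boldsymbol{\Sigma} = (\mathbf{I} + \mathbf{W}^T\boldsymbol{\Psi}^{-1}\mathbf{W})^{-1}$ and $\mathbf{m}_i = \boldsymbol{\Sigma}\mathbf{W}^T\boldsymbol{\Psi}^{-1}\mathbf{x}_i$, can be compared with the result of conditioning the joint Gaussian of $(\mathbf{z}, \mathbf{x})$ directly. The following is a minimal NumPy sketch; the function name `posterior_moments`, the dimensions, and the random test data are illustrative assumptions, not part of the original derivation.

```python
import numpy as np

def posterior_moments(X, W, psi):
    """Moments of p(z_i | x_i) = N(m_i, Sigma) for centered data.

    X: (N, D) data, W: (D, L) loadings, psi: (D,) diagonal of Psi.
    Returns M with rows m_i^T, shape (N, L), and the shared Sigma, shape (L, L).
    """
    L = W.shape[1]
    WtPinv = W.T / psi                              # W^T Psi^{-1}, shape (L, D)
    Sigma = np.linalg.inv(np.eye(L) + WtPinv @ W)   # (I + W^T Psi^{-1} W)^{-1}
    M = X @ WtPinv.T @ Sigma                        # row i: (Sigma W^T Psi^{-1} x_i)^T
    return M, Sigma

# Hypothetical test problem (all values are assumptions for illustration).
rng = np.random.default_rng(0)
D, L, N = 5, 2, 4
W = rng.normal(size=(D, L))
psi = rng.uniform(0.5, 1.5, size=D)
X = rng.normal(size=(N, D))

M, Sigma = posterior_moments(X, W, psi)

# Cross-check by conditioning the joint Gaussian:
# cov(x) = W W^T + Psi and cov(z, x) = W^T.
C = W @ W.T + np.diag(psi)
M_joint = X @ np.linalg.solve(C, W)                 # rows: (W^T C^{-1} x_i)^T
Sigma_joint = np.eye(L) - W.T @ np.linalg.solve(C, W)

print(np.allclose(M, M_joint), np.allclose(Sigma, Sigma_joint))
```

The two routes agree by the matrix inversion lemma; the first only inverts an $L \times L$ matrix, which is cheaper when $L \ll D$.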
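The log-likelihood for the complete data set $\{\mathbf{x}_i, \mathbf{z}_i\}_{i=1}^{N}$ is:

$$\log \prod_{i=1}^{N} p(\mathbf{x}_i, \mathbf{z}_i \mid \boldsymbol{\theta}) = \sum_{i=1}^{N} \left[ \log p(\mathbf{z}_i) + \log p(\mathbf{x}_i \mid \mathbf{z}_i, \boldsymbol{\theta}) \right].$$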
We are now ready to formulate the auxiliary function, let
:
where the conditional first and second moments are:
Therefore:
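Finally, we take the partial derivatives of the auxiliary function w.r.t. $\mathbf{W}$ and $\boldsymbol{\Psi}$. We start with $\mathbf{W}$. Note that the posterior moments $\mathbf{m}_i = \mathbb{E}[\mathbf{z}_i \mid \mathbf{x}_i]$ and $\boldsymbol{\Sigma} = \operatorname{cov}[\mathbf{z}_i \mid \mathbf{x}_i]$ depend only on $\boldsymbol{\theta}^{\text{old}}$, hence the first term of the auxiliary function, $\sum_i \mathbb{E}[\log p(\mathbf{z}_i)]$, is a constant for $\mathbf{W}$:

$$\frac{\partial Q}{\partial \mathbf{W}} = \sum_{i=1}^{N} \left[ \boldsymbol{\Psi}^{-1} \mathbf{x}_i \mathbf{m}_i^T - \boldsymbol{\Psi}^{-1} \mathbf{W} \left( \boldsymbol{\Sigma} + \mathbf{m}_i \mathbf{m}_i^T \right) \right].$$

Setting it to zero yields:

$$\mathbf{W} = \left[ \sum_{i=1}^{N} \mathbf{x}_i \mathbf{m}_i^T \right] \left[ \sum_{i=1}^{N} \left( \boldsymbol{\Sigma} + \mathbf{m}_i \mathbf{m}_i^T \right) \right]^{-1}.$$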
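The closed-form update for $\mathbf{W}$, namely $\mathbf{W} = \left[\sum_i \mathbf{x}_i \mathbf{m}_i^T\right]\left[\sum_i (\boldsymbol{\Sigma} + \mathbf{m}_i\mathbf{m}_i^T)\right]^{-1}$, can be checked numerically by confirming that the gradient $\boldsymbol{\Psi}^{-1}\left(\sum_i \mathbf{x}_i\mathbf{m}_i^T - \mathbf{W}\sum_i(\boldsymbol{\Sigma} + \mathbf{m}_i\mathbf{m}_i^T)\right)$ vanishes there. The sketch below assumes NumPy and uses made-up random data; the variable names (`W_old`, `A`, `B`) are illustrative choices only.

```python
import numpy as np

# Hypothetical centered data and current parameters (assumptions for illustration).
rng = np.random.default_rng(1)
D, L, N = 6, 2, 50
X = rng.normal(size=(N, D))
X -= X.mean(axis=0)                        # center the data so mu drops out
W_old = rng.normal(size=(D, L))
psi = rng.uniform(0.5, 1.5, size=D)        # diagonal of Psi

# E-step under the old parameters.
WtPinv = W_old.T / psi                     # W^T Psi^{-1}
Sigma = np.linalg.inv(np.eye(L) + WtPinv @ W_old)
M = X @ WtPinv.T @ Sigma                   # rows are m_i^T, shape (N, L)

# Sufficient statistics for the M-step.
A = X.T @ M                                # sum_i x_i m_i^T, shape (D, L)
B = N * Sigma + M.T @ M                    # sum_i (Sigma + m_i m_i^T), shape (L, L)

# W = A B^{-1}; B is symmetric, so solve B W^T = A^T instead of forming the inverse.
W_new = np.linalg.solve(B, A.T).T

# Gradient of Q with respect to W, evaluated at the new W.
grad = (A - W_new @ B) / psi[:, None]
print(np.abs(grad).max())
```

Because $\boldsymbol{\Psi}^{-1}$ multiplies the whole gradient from the left, the stationary point for $\mathbf{W}$ does not depend on $\boldsymbol{\Psi}$, which is why $\mathbf{W}$ can be updated first.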
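Meanwhile, let $\boldsymbol{\Lambda} = \boldsymbol{\Psi}^{-1}$. Since $\boldsymbol{\Psi}$ is diagonal, only the diagonal entries of the gradient matter:

$$\frac{\partial Q}{\partial \boldsymbol{\Lambda}} = \frac{N}{2} \boldsymbol{\Psi} - \frac{1}{2} \operatorname{diag}\left\{ \sum_{i=1}^{N} \left[ \mathbf{x}_i \mathbf{x}_i^T - 2\, \mathbf{W} \mathbf{m}_i \mathbf{x}_i^T + \mathbf{W} \left( \boldsymbol{\Sigma} + \mathbf{m}_i \mathbf{m}_i^T \right) \mathbf{W}^T \right] \right\}.$$

Hence, setting it to zero and using the update for $\mathbf{W}$, which gives $\mathbf{W} \sum_i \left( \boldsymbol{\Sigma} + \mathbf{m}_i \mathbf{m}_i^T \right) = \sum_i \mathbf{x}_i \mathbf{m}_i^T$:

$$\boldsymbol{\Psi} = \frac{1}{N} \operatorname{diag}\left\{ \sum_{i=1}^{N} \left( \mathbf{x}_i - \mathbf{W} \mathbf{m}_i \right) \mathbf{x}_i^T \right\}.$$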
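Putting the E-step and both M-step updates together gives one EM iteration for the centered FA model. The sketch below is a minimal NumPy illustration under stated assumptions: the function names `em_step` and `log_lik`, the synthetic data, and the number of iterations are hypothetical choices, and the marginal log-likelihood uses $\mathbf{x} \sim \mathcal{N}(\mathbf{0}, \mathbf{W}\mathbf{W}^T + \boldsymbol{\Psi})$. It tracks the log-likelihood, which EM should never decrease.

```python
import numpy as np

def em_step(X, W, psi):
    """One EM iteration for centered FA; psi holds the diagonal of Psi."""
    N, _ = X.shape
    L = W.shape[1]
    # E-step: posterior moments under the current parameters.
    WtPinv = W.T / psi
    Sigma = np.linalg.inv(np.eye(L) + WtPinv @ W)
    M = X @ WtPinv.T @ Sigma                       # rows are m_i^T
    # M-step for W: [sum_i x_i m_i^T][sum_i (Sigma + m_i m_i^T)]^{-1}.
    A = X.T @ M
    B = N * Sigma + M.T @ M
    W_new = np.linalg.solve(B, A.T).T
    # M-step for Psi: (1/N) diag{ sum_i (x_i - W m_i) x_i^T }, using the new W.
    psi_new = (np.sum(X * X, axis=0) - np.sum((M @ W_new.T) * X, axis=0)) / N
    return W_new, psi_new

def log_lik(X, W, psi):
    """Marginal log-likelihood of centered data under x ~ N(0, W W^T + Psi)."""
    N, D = X.shape
    C = W @ W.T + np.diag(psi)
    _, logdet = np.linalg.slogdet(C)
    quad = np.sum(X * np.linalg.solve(C, X.T).T)   # sum_i x_i^T C^{-1} x_i
    return -0.5 * (N * D * np.log(2 * np.pi) + N * logdet + quad)

# Synthetic data drawn from an FA model (parameters are assumptions for illustration).
rng = np.random.default_rng(2)
N, D, L = 200, 5, 2
W_true = rng.normal(size=(D, L))
psi_true = rng.uniform(0.2, 1.0, size=D)
Z = rng.normal(size=(N, L))
X = Z @ W_true.T + rng.normal(size=(N, D)) * np.sqrt(psi_true)
X -= X.mean(axis=0)

W, psi = rng.normal(size=(D, L)), np.ones(D)
lls = [log_lik(X, W, psi)]
for _ in range(30):
    W, psi = em_step(X, W, psi)
    lls.append(log_lik(X, W, psi))
print(lls[0], lls[-1])
```

Note that $\mathbf{W}$ is only identified up to a rotation of the latent space, so the recovered loadings need not match `W_true` entry by entry even when the fit is good.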
This completes the proof.