# Importance Sampling and Estimating Marginal Likelihood in Variational Autoencoders

It took a while but I'm back! This post is kind of a digression (which seems to happen a lot) along my journey of learning more about probabilistic generative models. There's so much in ML that you can't help learning a lot of random things along the way. That's why it's interesting, right?

Today's topic is *importance sampling*. It's a really old idea that you may
have learned in a statistics class (I didn't) but somehow is useful in deep learning,
what's old is new right? How this is relevant to the discussion is that when
we have a large latent variable model (e.g. a variational
autoencoder), we want to be able to efficiently estimate the marginal likelihood
given data. The marginal likelihood is kind of taken for granted in the
experiments of some VAE papers when comparing different models. I was curious
how it was actually computed and it took me down this rabbit hole. Turns out
it's actually pretty interesting! As usual, I'll have a mix of background
material, examples, math and code to build some intuition around this topic.
Enjoy!