It's been a long time coming but I'm finally getting this post out! I read this paper a couple of years ago and wanted to really understand it because it was state of the art at the time (still pretty close even now). As usual though, once I started down the variational autoencoder line of posts, there was always yet another VAE paper to look into so I never got around to looking at this one.

This post is all about a proper probabilistic generative model called Pixel Convolutional Neural Networks or PixelCNN. It was originally proposed as a side contribution of Pixel Recurrent Neural Networks in [1] and later expanded upon in [2,3] (and I'm sure many other papers). The real cool thing about it is that it's (a) probabilistic, and (b) autoregressive. It's still counter-intuitive to me that you can generate images one pixel at at time, but I'm jumping ahead of myself here. We'll go over some background material, the method, and my painstaking attempts at an implementation (and what I learned from it). Let's get started!

Read more…

Importance Sampling and Estimating Marginal Likelihood in Variational Autoencoders

It took a while but I'm back! This post is kind of a digression (which seems to happen a lot) along my journey of learning more about probabilistic generative models. There's so much in ML that you can't help learning a lot of random things along the way. That's why it's interesting, right?

Today's topic is importance sampling. It's a really old idea that you may have learned in a statistics class (I didn't) but somehow is useful in deep learning, what's old is new right? How this is relevant to the discussion is that when we have a large latent variable model (e.g. a variational autoencoder), we want to be able to efficiently estimate the marginal likelihood given data. The marginal likelihood is kind of taken for granted in the experiments of some VAE papers when comparing different models. I was curious how it was actually computed and it took me down this rabbit hole. Turns out it's actually pretty interesting! As usual, I'll have a mix of background material, examples, math and code to build some intuition around this topic. Enjoy!

Read more…

Universal ResNet: The One-Neuron Approximator

"In theory, theory and practice are the same. In practice, they are not."

I read a very interesting paper titled ResNet with one-neuron hidden layers is a Universal Approximator by Lin and Jegelka [1]. The paper describes a simplified Residual Network as a universal approximator, giving some theoretical backing to the wildly successful ResNet architecture. In this post, I'm going to talk about this paper and a few of the related universal approximation theorems for neural networks. Instead of going through all the theoretical stuff, I'm simply going introduce some theorems and play around with some toy datasets to see if we can get close to the theoretical limits.

(You might also want to checkout my previous post where I played around with ResNets: Residual Networks)

Read more…

Hyperbolic Geometry and Poincaré Embeddings

This post is finally going to get back to some ML related topics. In fact, the original reason I took that whole math-y detour in the previous posts was to more deeply understand this topic. It turns out trying to under tensor calculus and differential geometry (even to a basic level) takes a while! Who knew? In any case, we're getting back to our regularly scheduled program.

In this post, I'm going to explain one of the applications of an abstract area of mathematics called hyperbolic geometry. The reason why this area is of interest is because there has been a surge of research showing its application in various fields, chief among them is a paper by Facebook researchers [1] in which they discuss how to utilize a model of hyperbolic geometry to represent hierarchical relationships. I'll cover some of the math weighting more towards intuition, show some of their results, and also show some sample code from Gensim. Don't worry, this time I'll try much harder not going to go down the rabbit hole of trying to explain all the math (no promises though).

(Note: If you're unfamiliar with tensors or manifolds, I suggest getting a quick overview with my previous two posts: Tensors, Tensors, Tensors and Manifolds: A Gentle Introduction)

Read more…

Hi, I'm Brian Keng. This is the place where I write about all things technical.

Twitter: @bjlkeng

Signup for Email Blog Posts