Universal ResNet: The One-Neuron Approximator

Brian Keng

2018-08-03 08:03

"In theory, theory and practice are the same. In practice, they are not."

I read a very interesting paper titled ResNet with one-neuron hidden layers is a Universal Approximator by Lin and Jegelka [1]. The paper describes a simplified Residual Network as a universal approximator, giving some theoretical backing to the wildly successful ResNet architecture. In this post, I'm going to talk about this paper and a few of the related universal approximation theorems for neural networks. Instead of going through all the theoretical stuff, I'm simply going introduce some theorems and play around with some toy datasets to see if we can get close to the theoretical limits.

(You might also want to checkout my previous post where I played around with ResNets: Residual Networks)

Hyperbolic Geometry and Poincaré Embeddings

Brian Keng

2018-06-17 08:20

This post is finally going to get back to some ML related topics. In fact, the original reason I took that whole math-y detour in the previous posts was to more deeply understand this topic. It turns out trying to under tensor calculus and differential geometry (even to a basic level) takes a while! Who knew? In any case, we're getting back to our regularly scheduled program.

In this post, I'm going to explain one of the applications of an abstract area of mathematics called hyperbolic geometry. The reason why this area is of interest is because there has been a surge of research showing its application in various fields, chief among them is a paper by Facebook researchers [1] in which they discuss how to utilize a model of hyperbolic geometry to represent hierarchical relationships. I'll cover some of the math weighting more towards intuition, show some of their results, and also show some sample code from Gensim. Don't worry, this time I'll try much harder not going to go down the rabbit hole of trying to explain all the math (no promises though).

(Note: If you're unfamiliar with tensors or manifolds, I suggest getting a quick overview with my previous two posts: Tensors, Tensors, Tensors and Manifolds: A Gentle Introduction)

Manifolds: A Gentle Introduction

Brian Keng

2018-04-17 06:24

Following up on the math-y stuff from my last post, I'm going to be taking a look at another concept that pops up in ML: manifolds. It is most well-known in ML for its use in the manifold hypothesis. Manifolds belong to the branches of mathematics of topology and differential geometry. I'll be focusing more on the study of manifolds from the latter category, which fortunately is a bit less abstract, more well behaved, and more intuitive than the former. As usual, I'll go through some intuition, definitions, and examples to help clarify the ideas without going into too much depth or formalities. I hope you mani-like it!

Tensors, Tensors, Tensors

Brian Keng

2018-03-13 08:24

This post is going to take a step back from some of the machine learning topics that I've been writing about recently and go back to some basics: math! In particular, tensors. This is a topic that is casually mentioned in machine learning papers but for those of us who weren't physics or math majors (*cough* computer engineers), it's a bit murky trying to understand what's going on. So on my most recent vacation, I started reading a variety of sources on the interweb trying to piece together a picture of what tensors were all about. As usual, I'll skip the heavy formalities (partly because I probably couldn't do them justice) and instead try to explain the intuition using my usual approach of examples and more basic maths. I'll sprinkle in a bunch of examples and also try to relate it back to ML where possible. Hope you like it!

Residual Networks

Brian Keng

2018-02-18 13:55

Taking a small break from some of the heavier math, I thought I'd write a post (aka learn more about) a very popular neural network architecture called Residual Networks aka ResNet. This architecture is being very widely used because it's so simple yet so powerful at the same time. The architecture's performance is due its ability to add hundreds of layers (talk about deep learning!) without degrading performance or adding difficulty to training. I really like these types of robust advances where it doesn't require fiddling with all sorts of hyper-parameters to make it work. Anyways, I'll introduce the idea and show an implementation of ResNet on a few runs of a variational autoencoder that I put together on the CIFAR10 dataset.