Bounded Rationality (Posts about residual networks)

Label Refinery: A Softer Approach

Brian Keng — Tue, 04 Sep 2018 11:26:02 GMT

This post is about a really simple but surprisingly effective idea from a paper by Bagherinezhad et al. called Label Refinery: Improving ImageNet Classification through Label Progression. The title pretty much says it all, but I'll also discuss some intuition and show some experiments on the CIFAR10 and SVHN datasets. Simple and surprising is my favourite kind of idea! Let's take a look.

Universal ResNet: The One-Neuron Approximator

Brian Keng — Fri, 03 Aug 2018 12:03:28 GMT

"In theory, theory and practice are the same. In practice, they are not."

I read a very interesting paper titled ResNet with one-neuron hidden layers is a Universal Approximator by Lin and Jegelka [1]. The paper shows that a simplified Residual Network is a universal approximator, giving some theoretical backing to the wildly successful ResNet architecture. In this post, I'm going to talk about this paper and a few of the related universal approximation theorems for neural networks. Instead of going through all the theoretical stuff, I'm simply going to introduce some theorems and play around with some toy datasets to see if we can get close to the theoretical limits.

(You might also want to check out my previous post where I played around with ResNets: Residual Networks)

Residual Networks

Brian Keng — Sun, 18 Feb 2018 18:55:13 GMT

Taking a small break from some of the heavier math, I thought I'd write a post about (aka learn more about) a very popular neural network architecture called Residual Networks, aka ResNet. This architecture is very widely used because it's so simple yet so powerful at the same time. The architecture's performance is due to its ability to add hundreds of layers (talk about deep learning!) without degrading performance or adding difficulty to training. I really like these types of robust advances that don't require fiddling with all sorts of hyper-parameters to make them work. Anyways, I'll introduce the idea and show an implementation of ResNet on a few runs of a variational autoencoder that I put together on the CIFAR10 dataset.
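
To give a rough feel for the idea before the full post, here's a minimal NumPy sketch of a residual block, i.e. a layer that computes x + F(x) rather than just F(x). This is only an illustration and not the implementation from the post: the function name `residual_block`, the two-layer ReLU form of F, and the sizes `dim` and `width` are all assumptions picked for the example.

```python
import numpy as np

def residual_block(x, w1, b1, w2, b2):
    """Compute x + F(x), where F is a small two-layer ReLU network (an assumed form)."""
    hidden = np.maximum(0.0, x @ w1 + b1)  # first layer followed by ReLU
    return x + hidden @ w2 + b2            # skip connection adds the input back

rng = np.random.default_rng(0)
dim, width = 8, 16                         # hypothetical sizes for the sketch
x = rng.normal(size=(4, dim))              # a batch of 4 inputs

# With all weights at zero, F(x) = 0 and the block is exactly the identity.
zero_out = residual_block(
    x,
    np.zeros((dim, width)), np.zeros(width),
    np.zeros((width, dim)), np.zeros(dim),
)

# Stack 100 blocks with small random weights; the input is carried through
# by the skip connections while each block adds a small correction.
h = x
for _ in range(100):
    h = residual_block(
        h,
        0.01 * rng.normal(size=(dim, width)), np.zeros(width),
        0.01 * rng.normal(size=(width, dim)), np.zeros(dim),
    )
```

The zero-weight case is the part worth noticing: a block can always fall back to passing its input straight through, which is one intuition for why stacking many of them doesn't have to hurt.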