Bounded Rationality (Posts about entropy)http://bjlkeng.github.io/enTue, 04 Jun 2024 00:49:16 GMTNikola (getnikola.com)http://blogs.law.harvard.edu/tech/rssLossless Compression with Asymmetric Numeral Systemshttp://bjlkeng.github.io/posts/lossless-compression-with-asymmetric-numeral-systems/Brian Keng<div><p>During my undergraduate days, one of the most interesting courses I took was on
coding and compression. Here was a course that combined algorithms,
probability and secret messages, what's not to like? <a class="footnote-reference brackets" href="http://bjlkeng.github.io/posts/lossless-compression-with-asymmetric-numeral-systems/#id2" id="id1">1</a> I ended up not going
down that career path, at least partially because communications systems had
their heyday around the 2000s with companies like Nortel and Blackberry and their
predecessors (some like to joke that all the major theoretical breakthroughs
came from Shannon and his discovery of information theory around 1950). Fortunately, I
eventually wound up studying industrial applications of classical AI techniques
and then machine learning, which has really grown like crazy in the last 10
years or so. Which is exactly why I was so surprised that a <em>new</em> and <em>better</em>
method of lossless compression was developed in 2009, <em>after</em> I finished my
undergraduate degree and when I was well into my PhD. It's a bit mind-boggling that
something as well-studied as entropy-based lossless compression still had
(has?) totally new methods to discover, but I digress.</p>
<p>In this post, I'm going to write about a relatively new entropy-based encoding
method called Asymmetric Numeral Systems (ANS) developed by Jaroslaw (Jarek)
Duda [2]. If you've ever heard of Arithmetic Coding (probably best known for
its use in JPEG compression), ANS runs in a very similar vein. It can
generate codes that are close to the theoretical compression limit
(similar to Arithmetic Coding) but is <em>much</em> more computationally efficient. It's been used in
modern compression algorithms since 2014 including compressors developed
by Facebook, Apple and Google [3]. As usual, I'm going to go over some
background, some math, some examples to help with intuition, and finally some
experiments with a toy ANS implementation I wrote. I hope you're as
excited as I am. Let's begin!</p>
<p><a href="http://bjlkeng.github.io/posts/lossless-compression-with-asymmetric-numeral-systems/">Read more…</a> (32 min remaining to read)</p></div>Arithmetic Codingasymmetric numeral systemscompressionentropyHuffman codingmathjaxhttp://bjlkeng.github.io/posts/lossless-compression-with-asymmetric-numeral-systems/Sat, 26 Sep 2020 14:37:43 GMTThe Calculus of Variationshttp://bjlkeng.github.io/posts/the-calculus-of-variations/Brian Keng<div><p>This post is going to describe a specialized type of calculus called
variational calculus.
Analogous to the usual methods of calculus that we learn in university,
this one deals with functions <em>of functions</em> and how to
minimize or maximize them. It's used extensively in physics problems such as
finding the minimum energy path a particle takes under certain conditions. As
you can imagine, it's also used in machine learning/statistics, where you
want to find a density that optimizes an objective <a class="footnote-reference brackets" href="http://bjlkeng.github.io/posts/the-calculus-of-variations/#id4" id="id1">1</a>. The explanation I'm
going to use (at least for the first part) is heavily based upon Svetitsky's
<a class="reference external" href="http://julian.tau.ac.il/bqs/functionals/functionals.html">Notes on Functionals</a>, which so far is
the most intuitive explanation I've read. I'll try to follow Svetitsky's
notes to give some intuition on how we arrive at variational calculus from
regular calculus with a bunch of examples along the way. Eventually we'll
get to an application that relates back to probability. I think with the right
intuition and explanation, it's actually not too difficult. Enjoy!</p>
<p><a href="http://bjlkeng.github.io/posts/the-calculus-of-variations/">Read more…</a> (16 min remaining to read)</p></div>differentialsentropylagrange multipliersmathjaxprobabilityvariational calculushttp://bjlkeng.github.io/posts/the-calculus-of-variations/Sun, 26 Feb 2017 15:08:38 GMTMaximum Entropy Distributionshttp://bjlkeng.github.io/posts/maximum-entropy-distributions/Brian Keng<div><p>This post will talk about a method to find the probability distribution that best
fits your given state of knowledge. Using the principle of maximum
entropy and some testable information (e.g. the mean), you can find the
distribution that makes the fewest assumptions about your data (the one with maximal
information entropy). As you may have guessed, this is used often in Bayesian
inference to determine prior distributions and also (at least implicitly) in
natural language processing applications with maximum entropy (MaxEnt)
classifiers (i.e. a multinomial logistic regression). As usual, I'll go through
some intuition, some math, and some examples. Hope you find this topic as
interesting as I do!</p>
<p><a href="http://bjlkeng.github.io/posts/maximum-entropy-distributions/">Read more…</a> (11 min remaining to read)</p></div>entropymathjaxprobabilityhttp://bjlkeng.github.io/posts/maximum-entropy-distributions/Fri, 27 Jan 2017 14:05:00 GMT