<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Bounded Rationality (Posts about Rubikloud)</title><link>http://bjlkeng.github.io/</link><description></description><atom:link href="http://bjlkeng.github.io/categories/rubikloud.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><lastBuildDate>Tue, 10 Mar 2026 20:54:58 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>The Hard Thing about Machine Learning</title><link>http://bjlkeng.github.io/posts/the-hard-thing-about-machine-learning/</link><dc:creator>Brian Keng</dc:creator><description>&lt;div&gt;&lt;p&gt;I wrote a post on the hard parts about machine learning over
at Rubikloud:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://rubikloud.com/labs/data-science/hard-thing-machine-learning/"&gt;The Hard Thing about Machine Learning&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here's a blurb:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Much of the buzz around machine learning lately has been around novel
applications of deep learning models. They have captured our imagination: we
have anthropomorphized them, allowing them to dream, play games at superhuman
levels, and read x-rays better than physicians. While these deep learning
models are incredibly powerful, with remarkable ingenuity built into them,
they are not human, nor are they much more than “sufficiently large
parametric models trained with gradient descent on sufficiently many
examples.” In my experience, this is not the hard part about machine
learning.&lt;/p&gt;
&lt;p&gt;Beyond the flashy headlines, the high-level math, and the computation-heavy
calculations, the whole point of machine learning — as it has been with
computing and software before it — has been its application to real-world
outcomes. Invariably, this means dealing with the realities of messy data,
generating robust predictions, and automating decisions.&lt;/p&gt;
&lt;p&gt;...&lt;/p&gt;
&lt;p&gt;Just as much of the impact of machine learning is beneath the surface, the
hard parts of machine learning are not usually sexy. I would argue that the
hard parts about machine learning fall into two areas: generating robust
predictions and building machine learning systems.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Enjoy!&lt;/p&gt;&lt;/div&gt;</description><category>Machine Learning</category><category>Rubikloud</category><category>systems</category><guid>http://bjlkeng.github.io/posts/the-hard-thing-about-machine-learning/</guid><pubDate>Tue, 22 Aug 2017 12:32:55 GMT</pubDate></item><item><title>Building A Table Tennis Ranking Model</title><link>http://bjlkeng.github.io/posts/building-a-table-tennis-ranking-model/</link><dc:creator>Brian Keng</dc:creator><description>&lt;div&gt;&lt;p&gt;I wrote a post about building a table tennis ranking model over at Rubikloud:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://rubikloud.com/labs/building-table-tennis-ranking-model/"&gt;Building A Table Tennis Ranking Model&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It uses the
&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Bradley%E2%80%93Terry_model"&gt;Bradley-Terry&lt;/a&gt;
probability model to predict the outcome of pairwise comparisons (e.g. games
or matches).  I describe an easy algorithm for fitting the model (via
MM-algorithms) as well as adding a simple Bayesian prior to handle ill-defined
cases.  I even have some
&lt;a class="reference external" href="https://github.com/bjlkeng/Bradley-Terry-Model"&gt;code on GitHub&lt;/a&gt;
so you can build your own ranking system using Google Sheets.&lt;/p&gt;
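&lt;p&gt;For a feel of the MM update, here is a minimal, illustrative Python sketch (not the code from the GitHub repo above). It fits Bradley-Terry skills from a list of (winner, loser) game results, with the Bayesian prior implemented the simplest way: one virtual win and one virtual loss for every player against a dummy opponent of fixed skill 1.0, and skills normalized to mean 1 after each pass.&lt;/p&gt;

```python
def fit_bradley_terry(games, players, iters=200, prior_games=1.0):
    """games: list of (winner, loser) pairs; returns a dict of skills.

    Illustrative sketch of the MM algorithm for the Bradley-Terry model:
    repeatedly set skill_i = wins_i / sum over games of 1/(skill_i + skill_opp).
    A dummy opponent of skill 1.0 supplies prior_games virtual wins and
    prior_games virtual losses per player, which regularizes players who
    never win (or never lose) a real game.
    """
    skill = {p: 1.0 for p in players}
    wins = {p: prior_games for p in players}  # prior: virtual wins vs dummy
    for w, l in games:
        wins[w] = wins[w] + 1.0
    for _ in range(iters):
        new = {}
        for p in players:
            # 2 * prior_games virtual games against the dummy (skill 1.0)
            denom = 2.0 * prior_games / (skill[p] + 1.0)
            for w, l in games:
                if p == w or p == l:
                    opp = l if p == w else w
                    denom = denom + 1.0 / (skill[p] + skill[opp])
            new[p] = wins[p] / denom
        mean = sum(new.values()) / len(new)
        skill = {p: s / mean for p, s in new.items()}  # normalize to mean 1
    return skill
```

&lt;p&gt;The fitted skills give win probabilities directly: player p beats player q with probability skill[p] / (skill[p] + skill[q]), which is what makes the model handy for ranking everyone, not just the tournament winner.&lt;/p&gt;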
&lt;p&gt;Here's a blurb:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Many of our Rubikrew are big fans of table tennis; in fact, we’ve held an
annual table tennis tournament for all employees for three years
running (and I’m the reigning champion). It’s an incredibly fun event where
everyone in the company gets involved, from the tournament participants to
the spectators who provide lively play-by-play commentary.&lt;/p&gt;
&lt;p&gt;Unfortunately, not everyone gets to participate, whether due to travel and
scheduling issues or because they miss the actual tournament period, as in
the case of our interns and co-op students. Another downside is
that the event is a single-elimination tournament, so while it has a clear
winner, the ranking of the other participants is not clear.&lt;/p&gt;
&lt;p&gt;Being a data scientist, I identified this as a thorny issue for our
Rubikrew table tennis players. So, I did what any data scientist would do
and I built a model.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Enjoy!&lt;/p&gt;&lt;/div&gt;</description><category>Bradley-Terry</category><category>ping pong</category><category>ranking</category><category>Rubikloud</category><category>table tennis</category><guid>http://bjlkeng.github.io/posts/building-a-table-tennis-ranking-model/</guid><pubDate>Wed, 19 Jul 2017 12:51:41 GMT</pubDate></item><item><title>Beyond Collaborative Filtering</title><link>http://bjlkeng.github.io/posts/beyond-collaborative-filtering/</link><dc:creator>Brian Keng</dc:creator><description>&lt;div&gt;&lt;p&gt;I wrote a couple of posts about some of the work on recommendation systems and
collaborative filtering that we're doing at my job as a Data Scientist at
&lt;a class="reference external" href="http://www.rubikloud.com"&gt;Rubikloud&lt;/a&gt;:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://rubikloud.com/labs/data-science/beyond-collaborative-filtering/"&gt;Beyond Collaborative Filtering (Part 1)&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://rubikloud.com/labs/data-science/beyond-collaborative-filtering-part-2/"&gt;Beyond Collaborative Filtering (Part 2)&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here's a blurb:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Here at Rubikloud, a big focus of our data science team is empowering retailers
to deliver personalized one-to-one communications with their customers. A
big aspect of personalization is recommending products and services that are
tailored to a customer’s wants and needs. Naturally, recommendation systems are
an active research area in machine learning, with practical large-scale
deployments from companies such as Netflix and Spotify. In Part 1 of this
series, I’ll describe the unique challenges that we have faced in building a
retail-specific product recommendation system and outline one of the main
components of our recommendation system: a collaborative filtering algorithm.
In Part 2, I’ll follow up with several useful applications of collaborative
filtering and end by highlighting some of its limitations.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Hope you like it!&lt;/p&gt;&lt;/div&gt;</description><category>Collaborative Filtering</category><category>Machine Learning</category><category>Recommendation Systems</category><category>Rubikloud</category><guid>http://bjlkeng.github.io/posts/beyond-collaborative-filtering/</guid><pubDate>Sat, 11 Jun 2016 22:00:34 GMT</pubDate></item></channel></rss>