Hotel Ranking

Ranking is my single favorite problem in data science.

In terms of theory, the core ranking problem is very hard: given a set of `n` items, return the permutation of those items that minimizes some penalty function. Since there are `n!` permutations of `n` items, exhaustively searching for the best one is intractable for all but tiny sets, and everyone has to make simplifying assumptions. In my opinion, learning different ways to simplify the ranking problem is instructive for learning modeling in general. And the ranking research space moves fast - every year there is something new!
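To make the `n!` blow-up concrete, here is a minimal sketch that compares an exhaustive search over permutations with one common simplification: score each item on its own and sort. Everything in it is an assumption for illustration: the hotel names, the relevance values, and the choice of negative DCG as the penalty function. With known relevances and a position-discounted penalty like this one, sorting happens to recover the same answer as the exhaustive search; in practice the relevances have to be estimated by a model.

```python
import math
from itertools import permutations

def penalty(order, relevance):
    """Negative DCG of an ordering: lower is better (illustrative choice of penalty)."""
    return -sum(relevance[item] / math.log2(pos + 2) for pos, item in enumerate(order))

def brute_force_rank(relevance):
    """Try every permutation and keep the one with the lowest penalty: O(n!) work."""
    return min(permutations(relevance), key=lambda order: penalty(order, relevance))

def pointwise_rank(relevance):
    """Simplification: treat items independently and sort by score: O(n log n) work."""
    return tuple(sorted(relevance, key=relevance.get, reverse=True))

# Hypothetical items and relevance values, made up for this example.
relevance = {"hotel_a": 0.2, "hotel_b": 0.9, "hotel_c": 0.5, "hotel_d": 0.7}

best = brute_force_rank(relevance)
fast = pointwise_rank(relevance)
n_perms = math.factorial(len(relevance))  # 24 orderings for just 4 items
```

Four items already mean 24 orderings to evaluate, and 20 items would mean roughly 2.4 quintillion, which is why the exhaustive version is only useful as a reference point.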

In terms of engineering, ranking is an extremely high-throughput problem with extremely tight latency requirements. Most models predict on one input at a time, but ranking requires predictions on a whole set of inputs per request. Depending on your setup, that candidate set can be very large, and for each item you could be pulling a lot of relevant information to inform your ranking decision. No matter what the computational requirements are, you have to do it fast - the user is waiting for you to present a recommendation!
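As a rough sketch of what "predicting on a set" looks like, the snippet below scores every candidate for a single request and keeps only the top `k`. The feature names, the weights, and the linear scoring function are all hypothetical stand-ins for whatever model and feature store a real system would use.

```python
import heapq

# Hypothetical per-feature weights; a real ranker would learn these.
WEIGHTS = {"price_score": 0.5, "review_score": 0.3, "distance_score": 0.2}

def score(features):
    """Toy linear model: weighted sum of one candidate's features."""
    return sum(WEIGHTS[name] * features[name] for name in WEIGHTS)

def rank_candidates(candidates, k):
    """Score every candidate for one request and return the ids of the top k."""
    return heapq.nlargest(k, candidates, key=lambda cid: score(candidates[cid]))

# One request's candidate set (made-up values); real sets can be far larger.
candidates = {
    "hotel_a": {"price_score": 0.9, "review_score": 0.4, "distance_score": 0.5},
    "hotel_b": {"price_score": 0.6, "review_score": 0.9, "distance_score": 0.8},
    "hotel_c": {"price_score": 0.3, "review_score": 0.7, "distance_score": 0.2},
}

top_two = rank_candidates(candidates, k=2)
```

Even in this toy version, the cost of one request scales with the number of candidates times the cost of fetching and scoring each one, which is where the latency pressure comes from.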

I've spent a lot of time working on sub-problems in ranking at Rocket and at NYU, and I'm happy to share some presentable work here.