Sunday, June 2, 2013

Notable presentations at Technion TCE conference 2013: RevMiner & Boom

I found two interesting talks at the TCE conference. 

The first was by Prof. Oren Etzioni from the University of Washington, on the RevMiner system. RevMiner mines the text of Seattle restaurant reviews obtained from Yelp!. The results were published in a UIST paper in 2012: http://turing.cs.washington.edu/papers/uist12-huang.pdf
In a nutshell, they look for tokens and their relations in the text, and you can ask questions like "good sushi @ seattle" and get a recommendation based on user reviews.  They are trying to compete in the Yelp! business prediction contest I wrote about here. I wonder how their system will perform.

They also have a nice UI: a color bar that summarizes the level of the different ratings. For each recommendation, you can see how it was produced by browsing the original reviews.
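To make the "tokens and their relations" idea a bit more concrete, here is a toy sketch in Python. It is not RevMiner's actual extraction method: it assumes a relation is simply two adjacent words, and the reviews, restaurant names, and function names (pair_counts, search) are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy reviews: (restaurant, review text). Not real Yelp data.
REVIEWS = [
    ("Restaurant A", "good sushi and friendly staff"),
    ("Restaurant A", "really good sushi but slow service"),
    ("Restaurant B", "good ramen and bad sushi"),
]


def pair_counts(reviews):
    """Count adjacent word pairs per restaurant (a crude stand-in for relations)."""
    counts = defaultdict(Counter)
    for place, text in reviews:
        tokens = text.lower().split()
        for left, right in zip(tokens, tokens[1:]):
            counts[place][(left, right)] += 1
    return counts


def search(reviews, query):
    """Rank restaurants by how often the two-word query appears in their reviews."""
    left, right = query.lower().split()
    counts = pair_counts(reviews)
    ranked = [
        (place, c[(left, right)])
        for place, c in counts.items()
        if c[(left, right)] > 0
    ]
    # Most mentions first; ties broken by name so the order is stable.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))
```

Calling search(REVIEWS, "good sushi") ranks the restaurants whose reviews contain that pair. Keeping the matching reviews alongside the counts would give the "browse the original reviews" explanation mentioned above.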

Another related paper, on text mining Twitter data: http://turing.cs.washington.edu/papers/kdd12-ritter.pdf

The second interesting talk, by Yoram Singer from Google, was about a system called Boom. The context is classification of ads, to predict which ad the user will click. The binary features of an ad come from a very high dimensional space, but are very sparse.

Boom uses a very simple parallel coordinate descent to optimize a cost function that is a sum of convex functions. The main trick is speeding up Nesterov's accelerated gradient method by exploiting the fact that the data is highly skewed.
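For readers who have not seen coordinate descent on sparse binary features, here is a minimal sequential sketch in Python. It is not Boom itself: there is no parallelism, no Nesterov acceleration, and no use of the data skew. The choice of L2-regularized logistic loss, the curvature bound used as step size, and all names (coordinate_descent, logistic_loss, lam, sweeps) are my own assumptions for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logistic_loss(rows, labels, w, lam):
    """Sum of per-example logistic losses (each convex) plus an L2 penalty."""
    total = 0.0
    for active, y in zip(rows, labels):
        z = sum(w[j] for j in active)  # binary features: margin = sum of active weights
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - y * z
    return total + 0.5 * lam * sum(v * v for v in w)


def coordinate_descent(rows, labels, n_features, lam=0.1, sweeps=20):
    """rows[i] lists the indices of the features that are 1 in example i."""
    # Inverted index: for each feature, the examples in which it is active.
    by_feature = [[] for _ in range(n_features)]
    for i, active in enumerate(rows):
        for j in active:
            by_feature[j].append(i)

    w = [0.0] * n_features
    margins = [0.0] * len(rows)  # cached w . x_i for every example
    for _ in range(sweeps):
        for j in range(n_features):
            idx = by_feature[j]
            if not idx:
                continue  # feature never appears, nothing to update
            grad = sum(sigmoid(margins[i]) - labels[i] for i in idx) + lam * w[j]
            # Upper bound on the second derivative along coordinate j
            # (the logistic curvature is at most 1/4 per example).
            curv = 0.25 * len(idx) + lam
            step = -grad / curv
            w[j] += step
            for i in idx:  # only examples containing feature j are touched
                margins[i] += step
    return w
```

The point of the sketch is the sparsity: updating one coordinate only touches the examples where that feature is active, so a rare feature costs almost nothing to update.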

An update: I just got a note from my reader Patrick Durusau with a link to the Boom video lecture. Thanks, Patrick!
