Thursday, December 15, 2016

Neural networks for graphs

I met Thomas Kipf from the University of Amsterdam at NIPS 2016, and he pointed me to an interesting blog post he wrote about neural networks for graph analytics.

Friday, June 24, 2016

Thursday, June 23, 2016

4th Large Scale Recommender Systems workshop - deadline extended

We have extended the deadline of our Large Scale Recommender Systems workshop to June 30. This is the 4th year we are organizing this workshop as part of ACM RecSys 2016. Anyone with novel work in the area of applied recommender systems is welcome to submit a talk proposal.

Sunday, June 19, 2016

GraphLab Create healthcare use case

A nice blog post from Mark Pinches, our Manchester evangelist, who is working with John Snow Labs. It shows how to use GraphLab Create for slicing, dicing and aggregating healthcare data.

Saturday, June 18, 2016

Novomatic - comparing Spark, pandas and Dato

Interesting slides from Bogdan Pirvu, head of data science @ Novomatic (Austrian Gaming Industries). I met Bogdan at RecSys Vienna last fall and he got interested in GraphLab Create. In his talk Bogdan compares pandas, Spark and GraphLab Create. Guess who is the winner?

Thursday, June 16, 2016

Prof. Alex Smola moves to Amazon

Here is the note he wrote on his blog. Alex will be heading Amazon's cloud ML effort. Definitely a great win for Amazon!

If you would like to hear Alex speak about his recent research, you should attend our Data Science Summit, July 12-13 in SF. You are welcome to use discount code DSSfriend.

Tuesday, June 14, 2016

RSA fraud in social media report

An interesting report from RSA can be found here. It has a lot of cool visualizations of social communities of fraudsters.

Tuesday, June 7, 2016

Data Science Summit Europe - A great success!!

Thanks to everyone who helped us create this 1,000-attendee data science event in Jerusalem. Special thanks to Assaf Araki from Intel, Avner Algom, CEO of IGTCloud, and all the guests who came especially from abroad.

Here are some pictures from the event.

Dato training day at Azrieli college:

Ben Lorica's keynote

JVP Dinner
My keynote
Full house!
Meeting with Dr. Kim Larsen, SVP Deutsche Telekom
Dato booth
My parallel session talk:

Friday, June 3, 2016

MMDS 2016 is coming up

My friend Michael Mahoney is once again organizing MMDS (Workshops on Algorithms for Modern Massive Data Sets). Only 9 days are left for registration:

MMDS 2016.
Workshops on Algorithms for Modern Massive Data Sets 

Register for MMDS:
Confirmed speakers List:
UC Berkeley, CA

Tuesday, June 21 through Friday, June 24


MMDS Registration Deadline - ONLY 9 DAYS LEFT!

This is a quick reminder that MMDS 2016 Registration is available until June 12th, 2016. Please don't forget to register early to secure your spot!  

MMDS 2016 is a four-day series of academic workshops that address algorithmic and statistical challenges in modern large-scale data analysis. The goals of this series of workshops are to explore novel techniques for modeling and analyzing massive, high-dimensional, and non-linearly structured scientific and internet data sets; and to bring together computer scientists, statisticians, mathematicians, and data analysis practitioners to promote the cross-fertilization of ideas.
MMDS Organizers: Michael Mahoney (Chair), Alex Shkolnik, and Petros Drineas

Monday, May 16, 2016


I got this from my colleague Rajat Arya: Amazon recently released a new deep learning framework called Amazon DSSTNE. I have connected with Scott Le Grand, one of the main developers, and he kindly agreed to talk about DSSTNE at our Data Science Summit US, July 12-13 in SF. In addition, Jeff Dean, Google's TensorFlow creator, will speak as well.

Sunday, May 15, 2016

Ligra wins ACM PhD Dissertation Award

From my friend Jonathan Rosenblatt, who is a Prof. at BGU. My other friend Michael Mahoney just published a blog post about the Ligra work by Julian Shun, which won the ACM Dissertation Award. Ligra is a shared-memory graph processing system which improves on our older PowerGraph work.

Thursday, April 7, 2016

SalesForce Acquires MetaMind

Perhaps not surprisingly, the purchase streak continues: it was recently announced that SalesForce is acquiring MetaMind. I mentioned Richard Socher in my blog 3 years ago, and wrote about MetaMind in 2014. Maybe VCs should read my blog?

Anyway, MetaMind joins other companies recently sold to SalesForce, Sense among them.

Friday, April 1, 2016

My PyData Amsterdam Talk

I have never published any of my video talks on this blog before, but last month I was at PyData Amsterdam and had a great time. I gave a talk which was well received - the audience laughed a lot, and so did I. Here is the video:

A much better talk came from Rodrigo Agundez from Qualogy - a tutorial on face detection in Python with the IPython notebook and OpenCV. Highly recommended!


An interesting paper I got from my friend Asher Cohen: Mihalcea, Rada, and Paul Tarau. "TextRank: Bringing order into texts." Association for Computational Linguistics, 2004. pdf
A very simple construction: mine text entities into a graph, then compute a graph ranking on the resulting graph to find the most important keywords.
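
To make the construction concrete, here is a minimal Python sketch of the idea. It is my own simplification and not the paper's exact algorithm: I assume a tiny hand-written stopword list instead of the part-of-speech filter used in the paper, I link words that co-occur within a small window after filtering, and I rank nodes with a plain PageRank-style iteration. The function names (build_graph, rank, keywords) and the default parameters are hypothetical.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list. This is an assumption of the sketch;
# the paper filters candidate words by part of speech instead.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "on", "is", "are",
             "for", "with", "that", "this", "it", "as", "by"}


def build_graph(text, window=2):
    """Link words that appear within `window` positions of each other.

    Simplification: stopwords are removed before the window is applied.
    """
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOPWORDS]
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1:i + window + 1]:
            if other != word:
                # Undirected, unweighted edge between co-occurring words.
                graph[word].add(other)
                graph[other].add(word)
    return graph


def rank(graph, damping=0.85, iterations=50):
    """PageRank-style power iteration on the undirected word graph."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        # The right-hand side reads the previous scores; the new dict
        # replaces them only after the whole pass is computed.
        scores = {
            node: (1 - damping) + damping * sum(
                scores[nb] / len(graph[nb]) for nb in graph[node])
            for node in graph
        }
    return scores


def keywords(text, top_k=5):
    """Return the `top_k` highest-ranked words of `text`."""
    scores = rank(build_graph(text))
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Calling keywords(some_text, top_k=5) returns the five highest-scoring words. On real text you would probably want a proper stopword list or a POS tagger, and the paper also merges adjacent keywords into multi-word phrases, which this sketch skips.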

Interesting comparison of XGBoost and

I got this from my colleague Brian Kent.

Domino Data Labs compared XGBoost boosted decision trees to another implementation, claiming significant performance gains (10x) for XGBoost.

When I tried to access this blog post, I found out it had been removed. But here is the chart that was attached to it:

Monday, February 22, 2016

MXNET vs. TensorFlow

An interesting comparison by Prof. Alex Smola from CMU (thanks to Matt Grover from Walmart for sharing!)

SalesForce Acquires

Relatively fresh news from my colleague Brian:

I invited Simon Chan to present at our GraphLab Conference in July 2014. Good luck, Simon!!

Tuesday, January 26, 2016

Google is pumping TensorFlow

Some misc news regarding deep learning. I got these from my colleagues Guy Rapoport and Charlie Maalouf.

1) Google released an interesting new course about TensorFlow. I viewed the first lecture, and it looks like a lot of effort was invested. Interestingly, you need to install sklearn to do the first exercise, which means that TensorFlow does not solve everything, as some people think.

2) DataBricks claims to support TensorFlow, although when reading the content it is not clear where Spark is used. Seems like an opportunity to hop on something cool. It seems they propose to run many different experiments in parallel (what we call embarrassingly parallel), but for that you can write a bash script, so it is not clear why Spark is needed here.

3) Microsoft woke up and released their own deep learning toolkit. Not clear what is new there (in terms of functionality). But they claim much superior performance to TensorFlow:

Friday, January 15, 2016

Data Science Summit - registration is open

Join us July 12-13, 2016 in SF for the 5th annual Data Science Summit. The Summit brings together 1400 developers and data scientists for talks and tutorials by the top minds in industry and research. Learn to leverage the best new advances in data science and applied machine learning, and discover how industry greats are building the next generation of data products and intelligent applications.

This non-profit event is organized jointly by Intel, Dato and O’Reilly. The conference agenda has been co-created with Dr. Ben Lorica, Chief Scientist of O’Reilly Media who serves as the content manager of the O’Reilly Strata Conferences.

Preliminary speakers, more coming soon…

  • Prof. Alex Smola - Carnegie Mellon University & Marianas Labs
  • Xavier Amatriain - Quora
  • Prof. Carlos Guestrin - Dato & University of Washington
  • Wes McKinney - Cloudera
  • Prof. Magdalena Balazinska - University of Washington
  • Anthony Goldbloom - CEO Kaggle
  • Prof. Mike Jordan - UC Berkeley
  • Kevin Novak - Uber
  • Prof. Chris Re - Stanford University

We invite all related companies and research labs to submit talks about novel case studies, algorithms and applications.

There are flexible sponsorship options for companies who want to get involved.

Expected audience: 1400 data scientists, developers, researchers, CTOs, VPs of Engineering and Product, VPs of data analytics, professors and students.

Date & Location: July 12-13, 2016 at the Fairmont San Francisco Hotel

Registration is open - please use this 20% off discount code when registering:


Thursday, January 14, 2016

3D modeling with GraphLab Create

About a year ago I gave a demo of GraphLab Create to Prof. Amit Wolf from the Southern California Institute of Architecture (SCI-Arc). I was pleasantly surprised to learn this week that Amit started to use GraphLab Create for 3D modeling. Together with his assistant Jordan Squires, he created a workshop for training architecture students. The task is to bridge the gap between 3D scans of 3D printed objects and their 3D plan mesh. I got some amazing photos which present this work:

Anyone who wants to learn more about this work is welcome to attend our Data Science Summit US where Amit will give a talk about this work.

Deep learning

A nice treatment of the buzz around deep learning (got this from Yishay Carmiel @ Spoken)

Sunday, January 10, 2016

1-page acquires Marianas Labs

Just found out today that Alex Smola's startup was purchased 12 days ago:
Here is a talk by Alex about his startup at our annual Data Science Summit.