I am currently a data scientist at Foursquare, where I apply machine learning algorithms to large spatiotemporal datasets. I recently completed my PhD at Columbia University with advisor Tony Jebara.

Research Interests: visualization, dimensionality reduction, spatiotemporal data, networks, spectral optimizations, semidefinite programming, data mining, graph algorithms, large datasets

Machine Learning


Extracting Diurnal Patterns of Real World Activity from Social Media

Nir Grinberg, Mor Naaman, Blake Shaw, Gilad Lotan

International AAAI Conference on Weblogs and Social Media, ICWSM, 2013.

Download: Paper (PDF) | BibTex

Learning to Rank for Spatiotemporal Search

Blake Shaw, Jon Shea, Siddhartha Sinha, Andrew Hogue

Web Search and Data Mining Proceedings, WSDM, 2013.

Download: Paper (PDF) | BibTex

In this article we consider the problem of mapping a noisy estimate of a user's current location to a semantically meaningful point of interest, such as a home, restaurant, or store. We propose a novel spatial search algorithm that infers a user's location by combining aggregate signals mined from billions of Foursquare check-ins with real-time contextual information.

Learning a Distance Metric from a Network

Blake Shaw, Bert Huang, Tony Jebara

Neural Information Processing Systems, NIPS, December 2011.

Download: Paper (PDF) | Supplemental (PDF) | Poster (PDF) | BibTex | Code

Many real-world networks are described by both connectivity information and features for every node. To better model and understand these networks, we present structure preserving metric learning (SPML), an algorithm for learning a Mahalanobis distance metric from a network such that the learned distances are tied to the inherent connectivity structure of the network.
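As a rough illustration of the kind of distance SPML learns, the sketch below evaluates a squared Mahalanobis distance, (x - y)^T M (x - y), for a given matrix M. The toy features and the matrix `example_M` are hypothetical and hand-picked; nothing here is learned from a network, and this is not the SPML optimization itself.

```python
def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a square matrix M."""
    d = [xi - yi for xi, yi in zip(x, y)]
    Md = [sum(M[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return sum(d[i] * Md[i] for i in range(len(d)))

# Hypothetical toy features for three nodes (not from any real network).
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

identity = [[1.0, 0.0], [0.0, 1.0]]    # M = I reduces to squared Euclidean distance
example_M = [[1.0, 0.0], [0.0, 0.25]]  # hand-picked, not learned: shrinks the second feature

d_euclid = mahalanobis_sq(features[0], features[2], identity)
d_scaled = mahalanobis_sq(features[0], features[2], example_M)
```

With the identity matrix the two nodes are as far apart as in Euclidean space; under `example_M` the second feature counts for less, so the same pair moves closer together. A learned metric plays the same role, with M chosen so that distances agree with the observed connectivity.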

Structure Preserving Embedding

Blake Shaw, Tony Jebara

International Conference on Machine Learning, ICML, June 2009.

Best Paper Award Winner

Download: Paper (PDF) | Poster (PDF) | Slides (PDF) | BibTex

View Talk: videolectures.net | MP4

Structure Preserving Embedding (SPE) is an algorithm for embedding graphs in Euclidean space such that the embedding is low-dimensional and preserves the global topological properties of the input graph. Topology is preserved if a connectivity algorithm, such as k-nearest neighbors, can easily recover the edges of the input graph from only the coordinates of the nodes after embedding.
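The preservation criterion described above can be checked with a short sketch: build a k-nearest-neighbor graph from the embedded coordinates and measure what fraction of the original edges it recovers. The function names, the toy graph, and both embeddings are hypothetical; this only tests a given embedding and does not compute one.

```python
import math

def knn_edges(coords, k):
    """Undirected edge set linking each point to its k nearest neighbors."""
    edges = set()
    for i, p in enumerate(coords):
        others = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: math.dist(p, coords[j]),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

def edge_recovery_rate(coords, true_edges, k):
    """Fraction of the input graph's edges that appear in the k-NN graph."""
    true_set = {(min(a, b), max(a, b)) for a, b in true_edges}
    return len(true_set & knn_edges(coords, k)) / len(true_set)

# Hypothetical input graph: two disjoint edges, 0-1 and 2-3.
graph_edges = [(0, 1), (2, 3)]

# An embedding that keeps connected nodes close together...
good_coords = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
# ...and one that places each node next to a non-neighbor instead.
bad_coords = [(0.0, 0.0), (5.0, 0.0), (0.0, 1.0), (5.0, 1.0)]

good_rate = edge_recovery_rate(good_coords, graph_edges, k=1)
bad_rate = edge_recovery_rate(bad_coords, graph_edges, k=1)
```

In this toy case the first embedding recovers every edge and the second recovers none. Note that the rate only measures recovered edges; a stricter check would also penalize extra edges that the k-NN graph adds.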

Minimum Volume Embedding

Blake Shaw, Tony Jebara

Artificial Intelligence and Statistics, AISTATS, March 2007.

Download: Paper (PDF) | Poster | BibTex | Code

Minimum Volume Embedding (MVE) is an algorithm for non-linear dimensionality reduction that uses semidefinite programming (SDP) and matrix factorization to find a low-dimensional embedding that preserves local distances between points while representing the dataset in many fewer dimensions.
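To make "preserves local distances" concrete, the sketch below measures how much the distance from each point to its nearest neighbors changes between the original data and a low-dimensional embedding. The function name and the toy data are hypothetical, and this is only a diagnostic for a given embedding; it does not solve the semidefinite program that MVE uses.

```python
import math

def local_distortion(high, low, k):
    """Largest absolute change in distance between each point and its
    k nearest neighbors, with neighbors chosen in the original space."""
    worst = 0.0
    for i, p in enumerate(high):
        nbrs = sorted(
            (j for j in range(len(high)) if j != i),
            key=lambda j: math.dist(p, high[j]),
        )[:k]
        for j in nbrs:
            change = abs(math.dist(p, high[j]) - math.dist(low[i], low[j]))
            worst = max(worst, change)
    return worst

# Hypothetical data: three collinear points in 3-D and a 1-D embedding of them.
high = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 8.0, 0.0)]
low = [(0.0,), (5.0,), (10.0,)]

distortion = local_distortion(high, low, k=1)
```

Because the toy points lie on a line, a one-dimensional embedding can keep every neighbor distance unchanged, so the distortion here is zero. For real data the value indicates how far an embedding departs from the local-distance constraints.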

Workshop Papers

Recommending Interesting Events in Real-time with Foursquare Check-ins

Max Sklar, Blake Shaw, Andrew Hogue

ACM Conference on Recommender Systems, RecSys, 2012.

Download: Paper (PDF)

Learning a Degree-Augmented Distance Metric from a Network

Bert Huang, Blake Shaw, Tony Jebara

Beyond Mahalanobis: Supervised Large-Scale Learning of Similarity. NIPS 2011 workshop.

Download: Paper (PDF)

Visualizing Social Networks with Structure Preserving Embedding

Blake Shaw, Tony Jebara

Interdisciplinary Workshop on Information and Decision in Social Networks 2011

Download: Paper (PDF) | Poster (PDF)

Network Prediction with Degree Distributional Metric Learning

Bert Huang, Blake Shaw, Tony Jebara

Interdisciplinary Workshop on Information and Decision in Social Networks 2011

Download: Paper (PDF) | Poster (PDF)

Dimensionality Reduction, Clustering, and PlaceRank Applied to Spatiotemporal Flow Data

Blake Shaw, Tony Jebara

New York Academy of Sciences - Machine Learning Symposium 2009.

Download: Paper (PDF) | Poster (PDF)

Visualizing Graphs with Structure Preserving Embedding

Blake Shaw, Tony Jebara

Analyzing Graphs: Theory and Applications. NIPS Workshop. December 2008.

Download: Paper (PDF)

Graph Embedding with Global Structure Preserving Constraints

Blake Shaw, Tony Jebara

New York Academy of Sciences - Machine Learning Symposium, October 2008.

Download: Paper (PDF) | Poster (PDF)

Minimum Volume Embedding (NYAS)

New York Academy of Sciences - Machine Learning Symposium 2007.

Download: Paper (PDF)

Optimizing Eigengaps and Spectral Functions using Iterated SDP

Tony Jebara, Blake Shaw, Andrew Howard -- Learning Workshop 2007.

Download: Paper (PDF)

B-matching for Embedding

Tony Jebara, Blake Shaw, Vlad Schogolev

Snowbird Machine Learning Conference, April 2006.

Download: Paper (PDF)

Blog Posts and Talks at Foursquare

Big Data and the Big Apple: Understanding New York City using Millions of Check-ins

DataGotham -- September 2012

Blog post | Video of Talk

Machine Learning with Large Networks of People and Places

Blog post | Slides | Video of Talk

Foursquare is now aware of over 1.5 billion check-ins from 15 million people at 30 million different places all over the world. Each check-in can be thought of as an edge in a vast network connecting people to each other and to the places that they care about most. Graph-based machine learning algorithms are critical not only for making sense of the networks that emerge from patterns of human mobility, but also for creating useful data-driven products that help people better navigate the real world. In this talk, we will examine two networks that we have observed at Foursquare, the Social Graph and the Place Graph, and then discuss various machine learning and big data techniques for better understanding these networks and for using them to build a novel recommendation engine we call Explore.

A Hackday Project: What neighborhood is the ‘East Village’ of San Francisco?

Blog Post

Have you ever wondered what the equivalent of your neighborhood is in another city? How would you find the Times Square of Tokyo? The Beverly Hills of Dallas? Or the East Village of San Francisco? For a hackday project this January, we mapped our 1,500,000,000 check-ins to 140,000 neighborhoods all over the world to better understand and compare the different places we live, work, and play. Here is a brief account of our hack.

Projects at Sense Networks

CabSense - The Smartest Way to Hail a Cab in NY

MacroSense - Relevant Recommendation, Personalization and Discovery from Mobile Location Data

CitySense - Live San Francisco Nightlife Activity


Teaching

Programming Languages (Matlab)

w3101 section 1 - Spring 2008

Course Website


Patents

12/134,634 (pending) - System and Method of Performing Location Analytics

12/241,266 (pending) - Event Identification in Sensor Analytics

12/241,227 (pending) - Comparing Spatial-Temporal Trails in Location Analytics

Earlier Projects in Computer Science at Columbia