Uber’s open source data visualization for advanced geospatial exploration


Pic 1: Using data from data.gov.uk, this 3D heat map shows the number of personal injuries in the UK from 1979 to the present day in a hexagonal layer.
Pic 2: Dragging the cursor over a given radius reveals the distribution of Uber dropoffs in real time.

The two visualizations above are visually striking, and that is the point: the idea is to deliver intelligence by crafting visual exploratory data analysis tools for Uber’s datasets.

It is not just about pretty pictures, though. Many readers may not be aware that Uber has invested heavily in developing an open source data visualization tool, and the investment shows in its production value. Uber uses the software to help users develop new insights from data, and the results can be notable.

Open Source Data Visualization

In November 2016, Uber made its data visualization software, deck.gl, open source. deck.gl is a WebGL-powered framework designed specifically for exploring and visualizing data sets at scale. Last month, the company introduced an updated version, deck.gl 4.0. It lets users extract both historical and real-time insights from large, complex datasets, allowing them to think in 3D.

“Personally, I’m a very big advocate for open source, so I think that open source can help the business in many other ways, and those ways are immediately more valuable than someone paying for this software,” says Nicolas Garcia Belmonte, Head of Data Visualization at Uber. “I would say this is more about trying to reach out to developers and seeing what they can come up with in creative ways.”

Uber’s goal for 4.0 is threefold, and the three sections that follow cover each part in turn.

Advanced Geospatial Visualization

deck.gl 4.0 is particularly strong at visualizing geospatial data. It brings a new set of powerful layers to the catalog, capable of covering more use cases from across the mapping world.

Take the Grid and Hexagon layers, for example. These layers aggregate location data into square or hexagonal cells and encode the result in the color or height of each cell, producing a 2D heat map or a 3D height map of the data. Pic 1 (above) uses data from data.gov.uk to depict the number of personal injuries by location in the UK since 1979. The model demonstrates variable granularity for grouping data, a variable radius for hexagons, and a simple way to select percentiles from the overall distribution.
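
To make the aggregation idea concrete, here is a minimal TypeScript sketch of binning points into square cells, filtering cells by percentile, and mapping counts to heights. It is a conceptual illustration only and not deck.gl’s implementation or API: every name in it (Point, Cell, aggregateToGrid, filterByPercentile, cellHeight) is hypothetical, it works on flat x/y coordinates rather than longitude and latitude, and it uses a simple nearest-rank percentile as an assumption.

```typescript
// Conceptual sketch only: NOT deck.gl's implementation. All names are hypothetical.

type Point = { x: number; y: number };
type Cell = { col: number; row: number; count: number };

// Group points into square cells of side `cellSize` and count the points in each cell.
function aggregateToGrid(points: Point[], cellSize: number): Cell[] {
  const cells = new Map<string, Cell>();
  for (const p of points) {
    const col = Math.floor(p.x / cellSize);
    const row = Math.floor(p.y / cellSize);
    const key = `${col},${row}`;
    const cell = cells.get(key);
    if (cell) {
      cell.count += 1;
    } else {
      cells.set(key, { col, row, count: 1 });
    }
  }
  return Array.from(cells.values());
}

// Keep only cells whose count lies between the lower and upper percentile (0 to 100)
// of all cell counts. Assumption: nearest-rank percentile.
function filterByPercentile(cells: Cell[], lower: number, upper: number): Cell[] {
  if (cells.length === 0) return [];
  const sorted = cells.map((c) => c.count).sort((a, b) => a - b);
  const rank = (p: number): number => {
    const idx = Math.ceil((p / 100) * sorted.length) - 1;
    return sorted[Math.min(sorted.length - 1, Math.max(0, idx))];
  };
  const lo = rank(lower);
  const hi = rank(upper);
  return cells.filter((c) => c.count >= lo && c.count <= hi);
}

// Map a cell count to a height, as a 3D height map would.
function cellHeight(count: number, maxCount: number, maxHeight: number): number {
  return maxCount === 0 ? 0 : (count / maxCount) * maxHeight;
}

const samplePoints: Point[] = [
  { x: 0.2, y: 0.3 }, { x: 0.8, y: 0.1 }, { x: 0.5, y: 0.9 }, // fall in cell (0, 0)
  { x: 1.5, y: 0.5 }, { x: 1.1, y: 0.2 },                     // fall in cell (1, 0)
  { x: 2.5, y: 2.5 },                                         // falls in cell (2, 2)
];
const gridCells = aggregateToGrid(samplePoints, 1);
const keptCells = filterByPercentile(gridCells, 0, 50);
```

Changing cellSize corresponds to the variable granularity mentioned above, and the percentile bounds drop the densest or sparsest cells before anything is drawn. A hexagonal version would differ only in how a point is assigned to a cell.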

Beyond Maps

But it’s not all about geospatial data. Data science teams at Uber are also using the software to visualize machine learning models and get a more intuitive sense of how they are working. The team has also decoupled the Mercator projection viewport from deck.gl, so that the framework can work with any type of viewport, including standard 3D perspective and orthographic viewports.
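
One way to picture this decoupling is a small interface that any viewport can implement, so drawing code never needs to know which projection is in use. The TypeScript sketch below is an assumption-laden illustration and not deck.gl’s actual classes: Viewport, WebMercatorSketch, OrthographicSketch, and projectAll are hypothetical names, the Mercator version uses the standard Web Mercator formula and ignores altitude, and perspective projection is left out for brevity.

```typescript
// Conceptual sketch only: hypothetical names, not deck.gl's API.

type Position = [number, number, number];

interface Viewport {
  // Convert a data-space position to [x, y] in pixels.
  project(position: Position): [number, number];
}

// Web Mercator: position is [longitude, latitude, altitude] in degrees.
// Altitude is ignored in this simplified sketch.
class WebMercatorSketch implements Viewport {
  worldSize: number;
  constructor(worldSize: number) {
    this.worldSize = worldSize;
  }
  project([lng, lat]: Position): [number, number] {
    const latRad = (lat * Math.PI) / 180;
    const x = ((lng + 180) / 360) * this.worldSize;
    const mercatorY = Math.log(Math.tan(Math.PI / 4 + latRad / 2));
    const y = ((1 - mercatorY / Math.PI) / 2) * this.worldSize;
    return [x, y];
  }
}

// Orthographic: position is plain [x, y, z]; z is dropped, the origin is
// centered on screen, and y is flipped so that it points up.
class OrthographicSketch implements Viewport {
  width: number;
  height: number;
  scale: number;
  constructor(width: number, height: number, scale: number) {
    this.width = width;
    this.height = height;
    this.scale = scale;
  }
  project([x, y]: Position): [number, number] {
    return [this.width / 2 + x * this.scale, this.height / 2 - y * this.scale];
  }
}

// Drawing code depends only on the Viewport interface, not on a projection.
function projectAll(viewport: Viewport, positions: Position[]): [number, number][] {
  return positions.map((p) => viewport.project(p));
}

const mercatorView = new WebMercatorSketch(512);
const orthoView = new OrthographicSketch(800, 600, 10);
const mapPixels = projectAll(mercatorView, [[0, 0, 0]]);
const plotPixels = projectAll(orthoView, [[1, 2, 5]]);
```

The same projectAll call serves both a map and a non-geospatial plot, which is the practical benefit of separating the projection from the rest of the framework.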

For instance, a 3D surface layer can be used to render things like partial dependence plots for machine learning models, highlighting the correlations between two variables and their impact on a prediction.
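
For readers unfamiliar with partial dependence, the surface such a layer would render can be computed as follows: for each pair of values of the two chosen variables, overwrite those variables in every data row and average the model’s predictions. The TypeScript sketch below shows that calculation under stated assumptions. The names (Model, partialDependence2D, toyModel, sampleRows) and the toy model itself are invented for illustration and have nothing to do with Uber’s models or with deck.gl.

```typescript
// Conceptual sketch only: hypothetical names and a made-up toy model.

type Model = (features: number[]) => number;

// Two-variable partial dependence: for each pair (a, b) taken from the two grids,
// set feature i to a and feature j to b in every row, then average the predictions.
// Returns a surface indexed as surface[indexInGridI][indexInGridJ].
function partialDependence2D(
  model: Model,
  rows: number[][],
  i: number,
  j: number,
  gridI: number[],
  gridJ: number[],
): number[][] {
  return gridI.map((a) =>
    gridJ.map((b) => {
      let sum = 0;
      for (const row of rows) {
        const modified = row.slice(); // copy so the original data is untouched
        modified[i] = a;
        modified[j] = b;
        sum += model(modified);
      }
      return sum / rows.length;
    }),
  );
}

// Toy model (an assumption for this example): prediction = x0 * x1 + x2.
const toyModel: Model = (f) => f[0] * f[1] + f[2];
const sampleRows: number[][] = [
  [1, 1, 0],
  [2, 2, 4],
];
const pdSurface = partialDependence2D(toyModel, sampleRows, 0, 1, [0, 1, 2], [0, 3]);
```

Each entry of pdSurface is one height value, so the result maps directly onto a grid of points that a surface layer could draw.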

Making deck.gl Easier to Use

In addition to adding new layers and functionalities, deck.gl 4.0 makes the framework more accessible to new and experienced users alike. Documentation has been extensively improved and reorganized in response to user feedback. In particular, every layer now has an interactive layer browser allowing the user to play with all the properties of the layer while reading the docs. This makes it much easier to understand how the provided deck.gl layers work, as well as determine whether or not they work for your use cases.

“The main idea behind this library is that it’s a WebGL-powered framework that is designed for exploring and visualizing data assets at scale,” explains Belmonte. “There’s a lot of geospatial stuff that we do here, as well, as you can probably imagine from the core business, so we visualize a lot of data on maps.”