Predicting the Geospatial Availability of Mobility Services like Bird and Lime

Using Geo Viz to visualize predictions on a map

Part 1 : Data Acquisition

From the AutoTel website, I extracted the locations of the parked cars every two minutes for several months. The raw data was saved to Google Cloud Storage in CSV format and later loaded into a BigQuery table. This short clip shows a visualization of the recorded data using Uber’s kepler.gl tool.
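As a rough illustration of the recording step, here is a minimal Python sketch that appends one polling snapshot to a CSV stream. The column layout, the `car_id` field, and the sample coordinates are assumptions made for the example; only the latitude and longitude columns appear in the queries later in this post, and the actual scraping code is not shown here.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical column layout: only latitude/longitude are named in the post's SQL.
FIELDS = ["timestamp", "car_id", "latitude", "longitude"]


def append_snapshot(out, cars, taken_at, write_header=False):
    """Write one polling snapshot (a list of parked-car dicts) as CSV rows."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    if write_header:
        writer.writeheader()
    for car in cars:
        writer.writerow({
            "timestamp": taken_at.isoformat(),
            "car_id": car["car_id"],
            "latitude": car["latitude"],
            "longitude": car["longitude"],
        })
    return len(cars)


# Made-up sample data standing in for one two-minute poll of the site.
buffer = io.StringIO()
snapshot = [
    {"car_id": "A1", "latitude": 32.08, "longitude": 34.78},
    {"car_id": "B2", "latitude": 32.07, "longitude": 34.77},
]
taken_at = datetime(2018, 10, 1, 12, 0, tzinfo=timezone.utc)
rows_written = append_snapshot(buffer, snapshot, taken_at, write_header=True)
csv_text = buffer.getvalue()
```

In a real pipeline the in-memory buffer would be replaced by a file or an object in a storage bucket, with one call per polling interval.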

Parked shared-car locations on a map with timeline visualization using Kepler.gl

Part 2 : Geographical Joins and Grouping

Each row in the raw dataset represents the location of a parked car at a given time. To aggregate the data to the neighborhood level, I needed a spatial join that matches each car location to the neighborhood polygon that contains it.

SELECT car_locs.*, ta_dis.neighbourhood_name
FROM car_locs
JOIN ta_dis
ON ST_WITHIN(ST_GEOGPOINT(car_locs.longitude, car_locs.latitude),
             ST_GEOGFROMTEXT(ta_dis.area_polygon))
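To make the idea of the spatial join concrete, the following Python sketch reproduces the same logic on a toy scale: each car point is tested against each neighborhood polygon, and matching pairs are kept. It uses a planar ray-casting test, which is a simplification and not what BigQuery does internally (BigQuery geography functions work on a sphere). The polygons, car IDs, and coordinates are made up for illustration.

```python
from collections import Counter


def point_in_polygon(lon, lat, polygon):
    """Planar ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def spatial_join(cars, neighborhoods):
    """Inner join: cars that fall in no polygon are dropped."""
    joined = []
    for car in cars:
        for name, polygon in neighborhoods.items():
            if point_in_polygon(car["longitude"], car["latitude"], polygon):
                joined.append({**car, "neighbourhood_name": name})
    return joined


# Hypothetical neighborhoods (unit squares) and car positions.
neighborhoods = {
    "west": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "east": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
cars = [
    {"car_id": "c1", "longitude": 0.5, "latitude": 0.5},
    {"car_id": "c2", "longitude": 1.5, "latitude": 0.25},
    {"car_id": "c3", "longitude": 5.0, "latitude": 5.0},  # outside both polygons
]
joined = spatial_join(cars, neighborhoods)
cars_per_neighborhood = Counter(row["neighbourhood_name"] for row in joined)
```

The nested loop is quadratic and only suitable for small examples; the point of running the join in BigQuery is that the engine handles this matching at scale.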

Part 3 : Model Training

As of December 2018, BQML supports two types of models: linear regression for regression tasks and logistic regression for classification tasks. We train the model with the following SQL command:

CREATE OR REPLACE MODEL
  `autotel_demo.free_cars_model` -- model save path
OPTIONS
  (model_type='linear_reg', ls_init_learn_rate=.015, l1_reg=0.1,
   l2_reg=0.1, data_split_method='seq', data_split_col='split_col',
   min_rel_progress=0.001, max_iterations=30) AS
SELECT
  free_cars AS label, -- declaring target variable
  timestamp AS split_col -- column used for the sequential data split
  -- independent variables:
  , age5to14
  ...
FROM
  `autotel_demo.autotel_dataset` AS dataset
WHERE dataset.timestamp < TIMESTAMP '2018-10-11'
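The options data_split_method='seq' and data_split_col='split_col' ask for a sequential split: rows are ordered by the split column, and the most recent ones are held out for evaluation instead of a random sample. The Python sketch below illustrates that idea only; the 0.2 evaluation fraction and the toy rows are assumptions for the example, not values taken from the training command.

```python
def sequential_split(rows, split_key, eval_fraction=0.2):
    """Order rows by split_key and hold out the last eval_fraction for evaluation."""
    ordered = sorted(rows, key=lambda row: row[split_key])
    n_eval = int(len(ordered) * eval_fraction)
    cut = len(ordered) - n_eval
    return ordered[:cut], ordered[cut:]


# Toy rows in arbitrary order; split_col stands in for the timestamp column.
rows = [{"split_col": t, "label": t % 3} for t in (7, 2, 9, 0, 4, 1, 8, 3, 6, 5)]
train_rows, eval_rows = sequential_split(rows, "split_col")
```

For time-dependent data such as car availability, this keeps the evaluation rows strictly later than the training rows, so the model is never scored on time periods it has already seen.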

Part 4 : Model Evaluation

After training, we can use the ML.EVALUATE function to obtain metrics on the model's performance. While these metrics are useful in many cases, this time we are facing a geospatial prediction task, and we would like to view the model predictions on a map.
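For readers who want to see what such metrics measure, here is a small Python sketch that computes three common regression metrics (mean absolute error, mean squared error, and R²) from actual and predicted values. The numbers are invented for illustration and are not results from the model in this post; the dictionary keys are chosen to resemble typical metric names and are an assumption.

```python
def regression_metrics(actual, predicted):
    """Compute MAE, MSE and R^2 for paired actual/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_absolute_error = sum(abs(e) for e in errors) / n
    mean_squared_error = sum(e * e for e in errors) / n
    mean_actual = sum(actual) / n
    total_sum_squares = sum((a - mean_actual) ** 2 for a in actual)
    residual_sum_squares = sum(e * e for e in errors)
    r2_score = 1 - residual_sum_squares / total_sum_squares
    return {
        "mean_absolute_error": mean_absolute_error,
        "mean_squared_error": mean_squared_error,
        "r2_score": r2_score,
    }


# Invented free-car counts for four neighborhood/time slots.
actual_free_cars = [3, 5, 2, 6]
predicted_free_cars = [2, 5, 4, 5]
metrics = regression_metrics(actual_free_cars, predicted_free_cars)
```

A single number like this summarizes error across all neighborhoods at once, which is why a map view is a useful complement: it shows where the errors occur.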

Visualizing the predictions using Geo Viz

Part 5 : Deploying to Production

Let’s examine the overall architecture that was created.

Pipeline architecture

Summary

As part of the effort to simplify the integration of machine learning models into production systems, BigQuery ML offers a managed tool that can serve as an end-to-end mechanism for the core part of the pipeline. If a simple linear or logistic regression model is sufficient, you will find that BQML can support:

  1. Data hosting
  2. Dataset creation
  3. Model training + hosting
  4. Inference and serving

Gad Benram

Machine Learning Architect and Google Developer Expert.