Like tens of millions of commuters around the world, 35-year-old east London resident Jemma Taylor monitors her cycle ride to and from work.
Using the fitness tracker app Strava, Ms Taylor logs the route, speed and other data on her 7.5-mile journey to the capital’s financial centre. She uses this to set goals on her commute and for training at her local cycling club.
Her information is also being used by London’s planners. In 2016 Strava signed a contract with Transport for London, the city’s transport agency, to share its data. In October TfL renewed the relationship, signing up to a four-year contract worth $272,080.
“It’s a unique data set,” said TfL data planner Louise Hall.
Data generated by Strava’s 47m global users is now being aggregated and anonymised on its Metro platform, a crowdsourced resource the company describes as the “largest active transport data set on the planet”.
The platform showcases the potential, as well as the risks, of using privately generated big data for public development.
Information about travel routes is amassed from a wealth of sources, including self-tracking apps; bike hire schemes such as Lime, which shares its data with London local authorities; and mapping apps like CityMapper, which uses its own data and that from TfL to recommend better routes around the city.
Strava created Metro in 2012 as a way to share its data with researchers and local authorities on how people move around their cities. It is now being used by some 300 organisations and has just been relaunched so it presents data in easy-to-use maps and graphs.
When the UK capital built a “cycle superhighway” in 2016, Strava indicated where people had changed their route and showed that the number of cyclists increased by 60 per cent when a bike-only lane was built along the Victoria Embankment on the Thames. Planners can observe changes, such as many cyclists avoiding a direct route, to see where roads may be dangerous.
Granular data from Strava also show where cyclists have to stop and wait, information Ms Hall used to review traffic light patterns so more cyclists could get a clear run on their commute.
While recognising its potential, however, researchers warned that Strava and other crowdsourced data sets should be treated with caution. Giulio Ferrini, from cycling charity Sustrans, said the average Strava user was probably “not representative” of the average cyclist.
Strava says it has 5.5m users in the UK and one in seven cyclists in the country use the app. But researchers fear they are a self-selecting group, filtered by an affinity for exercise apps that may make them more competitive than others. According to Ms Hall at TfL, they “tend to be more gung-ho”.
Relying on crowdsourced data, Mr Ferrini said, could lead to cities being designed for “white men in Lycra” who usually travel speedily from A to B, while neglecting groups such as parents who cycle with their children to school.
Tom Knights, Europe manager for Strava, acknowledged the tool was not “trying to do everything”. But he pointed to several academic studies that found similar travel patterns in Strava data and other sources.
David McArthur, a researcher at Glasgow University’s Urban Big Data Centre, compared journeys logged by Strava against roadside counts and found the numbers correlated relatively well.
He has found, based on data from the app, that when segregated cycle-only paths were built cycling volumes increased by 12-18 per cent; when a lane was simply painted on the road the number dropped.
The app “opens up new modelling possibilities”, Mr McArthur said, but noted important discrepancies in the data. There are more Strava users, for example, in wealthier west Glasgow than the east of the city, so using the data alone could risk replicating deeper inequalities.
The app also does not help plan cities to encourage more people to cycle.
“In many cities, just 2 per cent of trips are currently made by cycling,” said Robin Lovelace, a researcher at Leeds University. “If you are building on the basis of journeys made yesterday, you’re only building for a tiny number of people compared with the huge potential for cycling to grow.”
Mr Lovelace is on the team using open-source and government data to create the Propensity to Cycle Tool, which is used by the government as part of work to improve cycling and walking infrastructure. It is publicly available and measures the difficult-to-quantify potential rides.
Strava is pushing to be recognised as a “trusted source” among other inputs in that official planning, too.
“When you start to look at Strava alongside different pieces of data, it starts to look really powerful,” Mr Knights said. “It’s an example of tech companies doing something authentic with their data, not just for financial gain.”