Good origin-destination data is crucial to many mobility-related studies, but most of it is only available at the level of statistical sectors, municipalities, regions, and so forth. Such data is useful for generic, high-level analysis, but it is too coarse for analysis on a microscopic scale (e.g., the number of travelers on each road segment).

A common workaround is to use each area’s centroid as the base point. However, these areas are usually irregular polygons, which makes the centroid a rather poor representative of the data sample. It is better to use the information that comes with the entire polygon.
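
To illustrate why a centroid can misrepresent an irregular area, here is a minimal sketch that computes the area-weighted centroid of a U-shaped polygon with the shoelace formula. The polygon and the function name `polygon_centroid` are made up for illustration and do not come from any real dataset. For this shape the centroid comes out at roughly (1.5, 1.36), which lies in the notch of the U, outside the area it is supposed to represent.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def polygon_centroid(ring: Sequence[Point]) -> Point:
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        cross = x1 * y2 - x2 * y1
        twice_area += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    # Centroid = sum / (6 * area), and 6 * area == 3 * twice_area.
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# Hypothetical U-shaped area: a 3 x 3 square with a notch cut out
# of the top, covering 1 < x < 2 and 1 < y < 3.
U_SHAPE = [(0, 0), (3, 0), (3, 3), (2, 3), (2, 1), (1, 1), (1, 3), (0, 3)]

centroid = polygon_centroid(U_SHAPE)
print(centroid)
```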

In some origin-destination datasets, we know that one side of the data describes, for example, residential points of origin. This is usually also the privacy-sensitive part, which is why it is aggregated to larger areas. Knowing that these points are residential, we can combine this with land use data and enhance the original origin-destination dataset by randomly distributing its samples over the areas with the corresponding land use.

To do this, we use a random point cloud, distributed based on land use data and knowledge about cyclists' behavior.

Residential areas are among the most relevant areas, if not the most relevant: they are usually where trips start or end. Thanks to OpenStreetMap (OSM), we know where these areas are.

Residential areas around schools in Bordet in Brussels.

The next challenge is to attribute the origin-destination data to the corresponding land use areas. For instance, in a school accessibility analysis by bicycle, we use the residential areas surrounding the school. But we cannot simply distribute the data randomly over all residential areas around the schools without first answering a crucial question: how far around the school?

Some origin-destination datasets partially answer this question. In addition, existing research suggests a clear link between mode choice, in this case the bicycle, and the distance traveled.

Students by distance between place of living and college location; avg 2011-13.

With this in mind, distance intervals can serve as bins, and the proportion of cycling trips from each distance interval to the school determines the expected number of seeds (i.e., students) in each bin.
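
The binning step could be sketched as follows. The distance bins and cycling shares in `CYCLING_SHARE` are placeholder values invented for this example, not figures from the study, and `seeds_per_bin` is a hypothetical helper name. The sketch turns shares into whole numbers of students with a largest-remainder rounding, so that the seeds always add up to the total.

```python
from typing import Dict, Tuple

Bin = Tuple[float, float]  # (min_km, max_km) distance interval

# Placeholder shares of cycling trips per distance bin (assumption).
# Replace with values from your own origin-destination data or survey.
CYCLING_SHARE: Dict[Bin, float] = {
    (0.0, 1.0): 0.20,
    (1.0, 2.0): 0.35,
    (2.0, 5.0): 0.30,
    (5.0, 10.0): 0.15,
}


def seeds_per_bin(total_students: int, shares: Dict[Bin, float]) -> Dict[Bin, int]:
    """Split total_students over the bins in proportion to their shares."""
    weight_sum = sum(shares.values())
    exact = {b: total_students * s / weight_sum for b, s in shares.items()}
    # Start from the rounded-down value in every bin ...
    seeds = {b: int(value) for b, value in exact.items()}
    # ... then hand the leftover students to the largest remainders.
    leftover = total_students - sum(seeds.values())
    by_remainder = sorted(exact, key=lambda b: exact[b] - seeds[b], reverse=True)
    for b in by_remainder[:leftover]:
        seeds[b] += 1
    return seeds


for distance_bin, count in seeds_per_bin(100, CYCLING_SHARE).items():
    print(distance_bin, count)
```

Each bin's count is then the number of random points to place in the residential polygons that fall inside that distance interval.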

Distance-based classification of Residential areas around schools in Bordet in Brussels.

Now it’s time to distribute points randomly and create one or more point clouds over the corresponding polygons, classified by distance interval. Each point represents a possible student. Of course, the profile of students in a given school may change every school year: new students may well mean new addresses. Therefore, the more iterations of the random point distribution we run, the more reliable the accessibility analysis and the subsequent assessments and conclusions.
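
One simple way to generate such point clouds is rejection sampling: draw uniform points in the polygon's bounding box and keep only those that fall inside the polygon. The sketch below uses only the Python standard library. The L-shaped polygon, the function names, and the choice of 5 seeds are assumptions for illustration, and this is not necessarily how the original analysis generated its points. It also assumes projected coordinates (e.g., meters), so that uniform sampling in x and y is uniform in area.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(p: Point, ring: Sequence[Point]) -> bool:
    """Even-odd ray casting test for a simple polygon without holes."""
    x, y = p
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray from p towards +x cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def random_points(ring: Sequence[Point], count: int, rng: random.Random) -> List[Point]:
    """Rejection sampling; assumes the polygon has a non-zero area."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    points: List[Point] = []
    while len(points) < count:
        candidate = (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        if point_in_polygon(candidate, ring):
            points.append(candidate)
    return points


def point_clouds(ring: Sequence[Point], seeds: int, iterations: int, seed: int = 0) -> List[List[Point]]:
    """One independent point cloud per iteration, reproducible via seed."""
    rng = random.Random(seed)
    return [random_points(ring, seeds, rng) for _ in range(iterations)]


# Hypothetical L-shaped residential polygon in projected coordinates.
L_SHAPE = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
clouds = point_clouds(L_SHAPE, seeds=5, iterations=14)
print(len(clouds), "clouds of", len(clouds[0]), "points each")
```

Each iteration stands for one plausible set of student addresses, so routing every cloud separately gives a spread of outcomes rather than a single answer.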

Random Points Distribution; 14 point Clouds.

Finally, we used the point clouds as origins and the schools as destinations. With an appropriate routing profile and our QGIS plugin, we planned routes between all origin-destination pairs and imported them into QGIS for further analysis.

Cycling trips between potential origins and a set of destinations.

Next, we can also run a network frequency analysis: in how many scenarios (school routings) has each link been used? The result can be interpreted as the core of the most pertinent cycling network serving these schools.
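
The counting itself can be sketched in a few lines, assuming each scenario is available as a list of routes and each route as a list of link identifiers. The scenario names and link IDs below are invented for the example, and treating one school's routing as one scenario is an interpretation of the text. A link counts once per scenario, no matter how many routes in that scenario use it.

```python
from collections import Counter
from typing import Dict, Hashable, List

Route = List[Hashable]  # ordered link identifiers along one route


def link_frequency(scenarios: Dict[str, List[Route]]) -> Counter:
    """Count in how many scenarios each link is used at least once."""
    counts: Counter = Counter()
    for routes in scenarios.values():
        # A set removes repeats of the same link within one scenario.
        used_links = {link for route in routes for link in route}
        counts.update(used_links)
    return counts


# Hypothetical routing results for three scenarios.
SCENARIOS = {
    "school_a": [["e1", "e2"], ["e2", "e3"]],
    "school_b": [["e2", "e4"]],
    "school_c": [["e3", "e4"], ["e4"]],
}

frequency = link_frequency(SCENARIOS)
for link, count in sorted(frequency.items()):
    print(link, count)
```

Links with the highest counts are the candidates for the core network described above.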

A network frequency analysis, the most frequently used links over all scenarios.

In conclusion, even with sparse origin-destination data, we can analyze cycling use around schools. This is made possible by combining the origin-destination data with land use data and knowledge about cyclists' behavior.

Posted by Hamed Eftekhar,