Mapping metropolitan areas with machine learning

Client: Tauranga City Council

Tauranga City Council (TCC) maintains an ongoing programme to monitor change across the city in support of a range of planning and infrastructure needs.

In late 2019 Lynker Analytics was awarded a contract to use machine learning to extract land surface classes from the most recent (2019) high-resolution aerial photography covering most of the TCC area (approximately 210 square km). The work aimed to produce detailed land cover maps to inform hydrological imperviousness estimates and hydraulic roughness values for overland flow in the city’s 2D stormwater models.

We used computer vision to extract multiple land surface classes at high resolution from the aerial photography, which had a spatial resolution of 0.10 m. Eight land cover classes were identified and mapped:

  • Buildings

  • Road & footpath

  • Vegetation (trees and shrubs)

  • Scrub

  • Grass (open space without trees and shrubs)

  • Water

  • Sand/Gravel/Bare Earth

  • Other impervious surfaces e.g. driveways, car parks
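As a rough illustration of how class areas like these can feed an imperviousness estimate, the sketch below computes the impervious fraction of a single catchment. Treating Buildings, Road & footpath and Other impervious surfaces as the impervious classes is an assumption made for this example, and the catchment areas are invented values, not project data.

```python
# Assumption: these three mapped classes are treated as impervious.
IMPERVIOUS_CLASSES = {"Buildings", "Road & footpath", "Other impervious surfaces"}


def impervious_fraction(class_areas_m2):
    """Return the share of total mapped area that falls in impervious classes."""
    total = sum(class_areas_m2.values())
    if total == 0:
        return 0.0
    impervious = sum(
        area for name, area in class_areas_m2.items() if name in IMPERVIOUS_CLASSES
    )
    return impervious / total


# Hypothetical land cover areas (square metres) for one catchment.
catchment = {
    "Buildings": 2500.0,
    "Road & footpath": 1500.0,
    "Other impervious surfaces": 1000.0,
    "Grass": 4000.0,
    "Vegetation": 1000.0,
}
fraction = impervious_fraction(catchment)
print(f"Impervious fraction: {fraction:.2f}")
```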

TCC supplied existing Roads and Buildings datasets, and machine learning was used to detect change in these two classes.

The example below shows the imagery alongside the classified land cover map in an area that includes residential, industrial, parkland and coastal environments.

tcc1.png

Python, TensorFlow and Keras were used to build neural network architectures to annotate, classify and segment the imagery and to localise objects within it. Our approach used Active Learning or ‘human in the loop’ computing, a form of supervised machine learning that requires only the most informative samples for training. This technique provides a pathway to rapidly create a highly specific training dataset whilst simultaneously training a model with that data.
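One common way to pick the "most informative" samples in an active learning loop is uncertainty sampling. The sketch below ranks unlabelled image tiles by the entropy of their predicted class probabilities and returns the top few for human annotation. This is an illustrative assumption, not a description of the selection rule used in the project; the function names and the probability values are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_informative(predictions, k):
    """Return the indices of the k samples with the highest predictive entropy.

    `predictions` is a list of per-sample class-probability lists.
    """
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical softmax outputs for four unlabelled tiles (three classes each).
tile_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # close to uniform, most uncertain
]
to_label = select_most_informative(tile_probs, k=2)
print(to_label)  # indices of the tiles to send for annotation
```

In a full loop, the selected tiles would be annotated, added to the training set, and the model retrained before the next round of selection.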

For this project we used a specially designed software interface that applies iterative semi-supervised machine learning to minimise the amount of data needed to train industrial artificial neural network models.

Multiple inference cycles were run. Vectorisation and comprehensive post-processing were then carried out. This included clipping, sliver detection, geometry size checks, vertex counts, and criteria-based dissolve and eliminate tasks before full data re-assembly. Finally, a quality check and validation exercise was run to ensure all features were representative and topologically correct.
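Two of the checks named above, geometry size checks and sliver detection, can be sketched in plain Python. The example flags polygons that fall below a minimum area, or whose Polsby-Popper compactness score is very low, which indicates a long thin shape. The thresholds, the choice of compactness measure and the function names are assumptions for illustration only; the project's actual criteria are not stated.

```python
import math

MIN_AREA_M2 = 1.0        # assumed minimum polygon area
MIN_COMPACTNESS = 0.05   # assumed threshold below which a shape is a sliver


def _edges(ring):
    """Yield consecutive vertex pairs, closing the ring back to its start."""
    return zip(ring, ring[1:] + ring[:1])


def ring_area(ring):
    """Unsigned area of a simple polygon ring via the shoelace formula."""
    total = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in _edges(ring))
    return abs(total) / 2.0


def ring_perimeter(ring):
    """Total edge length of the ring."""
    return sum(math.dist(p, q) for p, q in _edges(ring))


def compactness(ring):
    """Polsby-Popper score: 1.0 for a circle, approaching 0 for thin slivers."""
    perimeter = ring_perimeter(ring)
    if perimeter == 0:
        return 0.0
    return 4 * math.pi * ring_area(ring) / perimeter ** 2


def flag_polygon(ring):
    """Classify a polygon ring as 'too_small', 'sliver' or 'ok'."""
    if ring_area(ring) < MIN_AREA_M2:
        return "too_small"
    if compactness(ring) < MIN_COMPACTNESS:
        return "sliver"
    return "ok"


# Hypothetical polygons in metres (rings given without a repeated end vertex).
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
sliver = [(0, 0), (100, 0), (100, 0.1), (0, 0.1)]
speck = [(0, 0), (0.5, 0), (0.5, 0.5), (0, 0.5)]

for name, ring in [("square", square), ("sliver", sliver), ("speck", speck)]:
    print(name, flag_polygon(ring))
```

Flagged polygons would typically be merged into a neighbour or removed in the dissolve and eliminate steps rather than deleted outright.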

The models were then calibrated using a separate hold-out set of ground truth data that was carefully selected and not used for training. Several refinements were made at this quality control stage to improve feature definition. Model accuracy was calculated using Absolute Accuracy, Precision, Recall and F1 scores.
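For reference, the sketch below shows how precision, recall and F1 are computed per class from hold-out labels, alongside an overall accuracy figure. Interpreting "Absolute Accuracy" as the overall fraction of correctly labelled samples is an assumption, and the labels are made-up values rather than project results.

```python
def per_class_scores(y_true, y_pred, classes):
    """Return precision, recall and F1 for each class as a nested dict."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        scores[c] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the ground truth."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical hold-out labels for six sample points.
y_true = ["building", "building", "grass", "grass", "water", "water"]
y_pred = ["building", "grass", "grass", "grass", "water", "water"]

scores = per_class_scores(y_true, y_pred, ["building", "grass", "water"])
accuracy = overall_accuracy(y_true, y_pred)
for name, s in scores.items():
    print(name, {k: round(v, 3) for k, v in s.items()})
print("accuracy", round(accuracy, 3))
```

Reporting the scores per class matters here because a single overall figure can hide weak performance on classes that cover little area.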

Overall, model scores exceeded 90% for all eight land cover classes, with some classes reaching up to 97%.

The model performs best in residential areas, where the land cover is generally a mix of buildings, footpaths, roads, grass, vegetation, and other impervious surfaces such as concrete, driveways and patios. This finer detail is captured well by both the high-resolution ortho-photography and our models.

tcc2.png

In coastal, rural, farmland and horticultural areas the land cover tends to be more heterogeneous, so smaller polygons are generated as the model attempts to describe everything on the surface. There is also a mixed surface response at this resolution across rural land, as in the example below, where predominantly pasture (light green) is interspersed with vegetation (dark green). Earthworks for new sub-divisions, roadworks and the sandy beach are shown in yellow.

tcc3.png

This work extended our previous modelling with urban ortho-photography. We were able to adapt the neural networks we had already built to this image data and its multi-class output requirements, using an optimum volume of hand-captured features for training and ground truth.

Deliverables

  • Multi-class land cover GIS data

  • Data accuracy and quality report