Training Machine Learning models with ArcGIS Online

Rātā Chapman Olsen, Senior Geospatial Consultant

Our work with location-aware machine learning covers a broad range of industries and use cases, from land cover mapping and habitat monitoring (e.g. beaver ponds, riparian margins, wetland extent) through to fish detection and asset inspection (e.g. roof, road or pipe material or condition). With this diversity, it is impractical for Lynker Analytics to employ staff with expert knowledge in every sub-field to supply our machine learning (ML) models with high-quality examples to learn from.

To overcome this, we often partner with our clients to bolster and optimise the training effort with tuned, location-specific examples. In some cases, we have also worked with domain experts such as PDP or CSU to supply expert training.

In many of our projects we employ a training approach called Active Learning, a ‘human in the loop’ computing technique that can significantly outperform typical, passive supervised ML methods. In Active Learning, the ML model guides the trainer to where the most informative training samples can be captured, using a statistical measure of uncertainty (entropy).
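As a minimal sketch of the uncertainty measure mentioned above, the snippet below computes the Shannon entropy of a model's predicted class probabilities. The function name and the example probabilities are illustrative assumptions, not taken from our production code; the point is only that a near-uniform prediction scores higher than a confident one.

```python
import math


def prediction_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution.

    `probs` is a sequence of class probabilities that sum to 1.
    Zero probabilities are skipped, since p * log(p) tends to 0 as p tends to 0.
    """
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical softmax outputs for two locations (three classes each).
confident = prediction_entropy([0.98, 0.01, 0.01])  # model is nearly sure
uncertain = prediction_entropy([0.34, 0.33, 0.33])  # model is close to guessing

# The uncertain location has the higher entropy, so it is the more
# informative place to ask a human trainer for an annotation.
print(confident < uncertain)
```

The maximum possible value for a given number of classes is the logarithm of that number, reached when every class is equally likely.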

In typical projects, multiple training/inference cycles are run using Python and TensorFlow/Keras before the ML model achieves the required accuracy. In each cycle, high-entropy situations, features or locations are presented to the trainer for review, and the resulting annotations are passed back to the ML model for use in the next training and inference cycle. These cycles of entropy-based training necessitate a flow of data to and from the ML model, and this dataflow can be laborious to orchestrate, even more so when the trainer is from a different organisation or in a different time zone.
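The selection step within one such cycle might look like the sketch below: given predicted class probabilities for a set of locations, rank them by entropy and hand the top few to the trainer. The tile identifiers, probabilities and the `select_for_review` helper are hypothetical, and the real pipeline would take its probabilities from the model's inference output rather than a hard-coded dictionary.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a class probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_review(predictions, k):
    """Return the ids of the k highest-entropy locations, most uncertain first.

    `predictions` maps a location id to its predicted class probabilities.
    """
    ranked = sorted(
        predictions,
        key=lambda loc: entropy(predictions[loc]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical inference output for four image tiles (three classes each).
predictions = {
    "tile_a": [0.97, 0.02, 0.01],
    "tile_b": [0.40, 0.35, 0.25],
    "tile_c": [0.55, 0.44, 0.01],
    "tile_d": [0.90, 0.05, 0.05],
}

# These are the tiles a trainer would be asked to annotate in this cycle.
to_review = select_for_review(predictions, k=2)
print(to_review)
```

After the trainer annotates the selected tiles, the annotations would be added to the training set and the model retrained before the next round of inference.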

We originally trained the majority of our location-aware ML models using desktop GIS applications like ArcGIS Pro, with geospatial result files shared back and forth. However, this becomes difficult because of data size and file versioning, and simply keeping track of numerous datasets moving between people and organisations.

To overcome this problem and provide scalability, we have adopted a Web GIS architecture, specifically ArcGIS Online from Esri. Lynker Analytics has a strong relationship with Esri, having been a partner since our inception, and the suite of web-hosted maps, apps and dashboards from Esri makes sharing data, model versions and labelling tools with a client simple and effective.

Web GIS viewers are a great means of presenting interim results and getting feedback

Using off-the-shelf features in ArcGIS Online, Lynker Analytics can quickly set up an app containing the target geography to be trained and have an expert familiar with the area and subject start capturing data. Our interfaces can be as complex or as simple as the task requires, with the web GIS interface supporting both object detection (OD) and semantic segmentation ML models.

Example app used to create entropy-based training data, with targets in the left-hand panel

Using this process, the data we use to train models is fully contained within ArcGIS Online, including:

  1. Entropy locations which guide the human trainer to the next training site. These locations are generated from the ML model using entropy.

  2. Training annotation dataset, into which the human trainer captures data and which is then used to train the model.
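To illustrate what the first of these datasets could contain, the sketch below packages entropy locations as a GeoJSON FeatureCollection of points. GeoJSON is used here only as a neutral interchange format; the coordinates, the `entropy` and `reviewed` attribute names and the helper function are assumptions for illustration, and the step of loading the result into a hosted layer is not shown.

```python
import json


def entropy_locations_to_geojson(locations):
    """Build a GeoJSON FeatureCollection of point features.

    `locations` is a list of dicts with hypothetical keys
    'lon', 'lat' (WGS84 degrees) and 'entropy'.
    """
    features = []
    for loc in locations:
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {
                "type": "Point",
                "coordinates": [loc["lon"], loc["lat"]],
            },
            "properties": {
                "entropy": loc["entropy"],
                # Illustrative flag a trainer could set once a site is annotated.
                "reviewed": False,
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical high-entropy sites produced by one inference run.
sites = [
    {"lon": 174.78, "lat": -41.29, "entropy": 1.08},
    {"lon": 174.80, "lat": -41.31, "entropy": 0.74},
]

collection = entropy_locations_to_geojson(sites)
print(json.dumps(collection, indent=2))
```

Keeping the entropy value as an attribute lets the map or app sort and symbolise the sites so that the trainer visits the most uncertain locations first.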

This method is especially useful for earth observation studies, where multiple frames can be used within the editing application to display false colour ratios such as water or vegetation indices. It also enables multiple users to participate in model training, which not only accelerates training but also helps to reduce training bias, while providing useful visual reassurance as the models steadily improve to completion.
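As one example of a vegetation index, the widely used Normalised Difference Vegetation Index (NDVI) is the ratio (NIR - Red) / (NIR + Red). The sketch below computes it for single reflectance values; which indices are displayed in any given project is not stated here, so treat NDVI and the sample values as an illustrative assumption.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index for one pixel.

    `nir` and `red` are near-infrared and red reflectance values.
    Returns 0.0 when both bands are zero to avoid dividing by zero.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


# Hypothetical reflectances: healthy vegetation reflects strongly in NIR,
# while open water reflects very little NIR.
vegetation = ndvi(nir=0.5, red=0.1)
water = ndvi(nir=0.02, red=0.05)

print(vegetation > 0 > water)
```

Values range from -1 to 1, with dense vegetation towards the upper end and water typically below zero, which is what makes such ratios useful as a visual aid when annotating.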