GeoAI at the American Association of Geographers’ 2024 Annual Meeting

Modern research applications of machine learning and deep learning in remote sensing.
Author

Liam Smith

Published

April 29, 2024

Introduction

Last week, I travelled to Honolulu, Hawaii, where I presented my summer research at the American Association of Geographers’ Annual Meeting. This was my first academic conference, and it was exciting to connect with others in the field and learn from their work. I am particularly interested in the applications of machine learning towards remote sensing, so I made an effort to attend presentations on the topic.

With the emergence of large language models and the hype surrounding AI, it appears that a new buzzword – GeoAI – has taken root amongst geographers. There was a whole slew of sessions on different GeoAI topics, and on April 17, I attended the session on GeoAI for Feature Detection and Recognition. In this session, graduate students and researchers presented on how they are using machine learning and deep learning methods to identify objects and perform classification on remote sensing imagery. In the following sections, I provide a summary of each presentation, as well as my personal reflections on the topic.

Context-Enhanced Small Object Detection through High-Resolution Remote Sensing Imagery

The first presentation was by Tao Jiang, who received her PhD in Biomathematics, Bioinformatics, and Computational Biology from the Chinese University of Hong Kong in 2023. Her research goal was to detect individual cars using high-resolution satellite imagery. To do so, with the assumption that all cars are found on roads, she first performed semantic segmentation on her satellite imagery to distinguish roads from non-roads. With a binary image of roads vs non-roads in hand, she then applied deep learning models to search for cars exclusively within the road areas. As part of her study, she developed a new deep learning methodology specifically for car identification in satellite imagery and compared its performance with that of preexisting algorithms. Her model achieved a minor improvement in overall car-identification accuracy, but she intends to continue improving it. She also wants to expand upon her current work to track changes in traffic over time. Unfortunately, I have been unable to find a publication on her work for further reading.
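Jiang did not share code, but to make the two-stage idea concrete, here is a rough sketch of how detections might be filtered to a road mask. The `road_mask` and candidate boxes below are stand-ins of my own; in her actual pipeline, the mask would come from her semantic segmentation model and the boxes from her car detector.

```python
import numpy as np

def filter_detections_to_roads(detections, road_mask):
    """Keep only candidate car detections whose center falls on a road pixel.

    detections: list of (x_min, y_min, x_max, y_max, score) in pixel coordinates
    road_mask:  2-D boolean array from a road-segmentation model (True = road)
    """
    kept = []
    for x_min, y_min, x_max, y_max, score in detections:
        cx = int((x_min + x_max) / 2)
        cy = int((y_min + y_max) / 2)
        inside = 0 <= cy < road_mask.shape[0] and 0 <= cx < road_mask.shape[1]
        if inside and road_mask[cy, cx]:
            kept.append((x_min, y_min, x_max, y_max, score))
    return kept

# Toy example: a horizontal "road" strip and two candidate boxes.
road_mask = np.zeros((100, 100), dtype=bool)
road_mask[40:60, :] = True
boxes = [(10, 45, 18, 52, 0.9), (10, 5, 18, 12, 0.8)]
print(filter_detections_to_roads(boxes, road_mask))  # only the first box survives
```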

Personally, I think it’s somewhat incredible that we can now detect individual vehicles using instruments that are not even on our planet. While there is a certain wow-factor inherent to Jiang’s research, I couldn’t help but wonder about (1) the practical applications of detecting cars on roads and (2) whether these applications would actually benefit society.

There were a few minutes for questions at the end of the presentation, and one audience member asked about the applications. Jiang's response was that, in near-real time, her model could in theory be used to identify traffic conditions. In response, another member of the audience chimed in, asking how that would be any improvement over preexisting methods for assessing traffic conditions. For example, Google Maps and other services already provide relatively high-quality traffic information using location data from smartphones. Jiang did not have a satisfactory response to that point. Personally, I do not see how an imagery-based traffic report would be an improvement, as smartphone GPS data is updated constantly, while satellites need to be physically above the region of interest on relatively cloudless days to be of any use.

While her algorithm may not provide much improvement in identifying traffic conditions, I would imagine that her area of research would be of interest to militaries and intelligence agencies. With transfer learning, it is not difficult to imagine extending her research towards military interests like identifying specific military vehicles or various military developments. For me, this raises concerns regarding the morality of this research and the various military actors that could exploit object identification models. And more generally, the capability to identify objects as small as a car raises a whole host of privacy concerns. Gone are the days when the best available satellite imagery had 30-by-30-meter pixels. With imagery of high enough resolution to identify individual cars, anybody with access to it could snoop in your backyard from outer space without you ever knowing.

Classification of USGS 3DEP LiDAR Point Clouds Using Deep Learning Models

In the second presentation, Jung-Kuan Liu described his efforts to create a deep learning model to classify 3DEP LiDAR point cloud data. In order to discuss his work, I should first provide some necessary context regarding 3DEP and LiDAR.

First of all, 3DEP is the USGS’s 3D Elevation Program, which seeks to provide open, consistent, and high-quality LiDAR elevation data across the entire United States. LiDAR stands for light detection and ranging, and LiDAR data is collected by repeatedly bouncing a laser beam from an aircraft to the ground and back, determining elevation from the amount of time it takes for the reflected beam to return to the aircraft. Combining these return times with other information such as the scan angle, calibration data, and the aircraft’s coordinates yields a LiDAR point cloud, which is a set of X, Y, and Z coordinates corresponding to the locations where each laser beam struck a surface. X and Y correspond to longitude and latitude coordinates, respectively, while Z corresponds to elevation. These point clouds can be processed to develop models of the Earth’s surface such as Digital Elevation Models (DEMs), which are single-band images where each pixel’s value represents the elevation at that location. A more detailed discussion of elevation data is beyond the scope of this blog post, but for further information, a good place to start is NOAA’s article on LiDAR.
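To give a concrete (if highly simplified) sense of how a point cloud becomes a raster elevation product, here is a sketch that grids a synthetic cloud of X, Y, Z returns into an elevation grid by keeping the lowest return in each cell. Real DEM production involves ground filtering, interpolation, and quality control that I am glossing over entirely.

```python
import numpy as np

def rasterize_to_dem(points, cell_size=1.0):
    """Grid a point cloud (N x 3 array of X, Y, Z) into a simple elevation
    raster by keeping the lowest return in each cell."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    col = ((x - x.min()) // cell_size).astype(int)   # west -> east
    row = ((y.max() - y) // cell_size).astype(int)   # north -> south
    dem = np.full((row.max() + 1, col.max() + 1), np.inf)
    np.minimum.at(dem, (row, col), z)                # lowest Z per cell
    dem[np.isinf(dem)] = np.nan                      # cells with no returns
    return dem

# Synthetic example: 100,000 returns over a 500 m x 500 m tile, 5 m cells.
rng = np.random.default_rng(0)
pts = np.column_stack([
    rng.uniform(0, 500, 100_000),    # X (easting, meters)
    rng.uniform(0, 500, 100_000),    # Y (northing, meters)
    rng.uniform(120, 160, 100_000),  # Z (elevation, meters)
])
print(rasterize_to_dem(pts, cell_size=5.0).shape)  # roughly a 100 x 100 grid
```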

Liu works as a Physical Research Scientist at the USGS’s Center of Excellence for Geospatial Information Science, and his presentation revolved around his efforts to improve the landcover classification of USGS’s point cloud data. In particular, the USGS wants to automatically detect whether points represent vegetation, water, buildings, bridges, or other landcover. Current 3DEP standards classify points only as ground, non-ground, or unclassified. This is sufficient for extracting “bare earth” DEMs, which are images where each pixel represents the elevation of the ground at that location, and Digital Surface Models (DSMs), which are images where each pixel represents the elevation of the very first object that a laser would hit when beamed down from above (for example, trees and buildings). However, these current standards are insufficient for more detailed analyses of the type of landcover each point represents. Liu is seeking to improve upon current standards by training deep learning models on point cloud data, and he has substantially expanded the variety of landcover classes that his point cloud classification models can identify. A current weak point is that his model struggles to identify water and bridge features. Liu indicated that water tends to absorb the wavelengths typically used in aerial LiDAR data collection, and that its low reflectance has been a particular challenge to his work.
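Liu did not walk through his architecture in detail, so the following is only my own minimal sketch of one common way to attach per-point landcover labels to a cloud, in the spirit of PointNet-style segmentation; the class list and layer sizes are my assumptions, not his.

```python
import torch
import torch.nn as nn

class TinyPointSegNet(nn.Module):
    """Minimal PointNet-style per-point classifier (illustrative only).

    Shared 1x1 convolutions build a feature vector for every point, a
    max-pooled global feature is concatenated back onto each point, and a
    small head predicts a landcover class (e.g. ground, vegetation,
    building, water, bridge) for every point in the cloud.
    """

    def __init__(self, n_classes=5):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
        )
        self.seg_head = nn.Sequential(
            nn.Conv1d(128 + 128, 128, 1), nn.ReLU(),
            nn.Conv1d(128, n_classes, 1),
        )

    def forward(self, xyz):                      # xyz: (batch, 3, n_points)
        per_point = self.point_mlp(xyz)          # (batch, 128, n_points)
        global_feat = per_point.max(dim=2, keepdim=True).values
        fused = torch.cat([per_point, global_feat.expand_as(per_point)], dim=1)
        return self.seg_head(fused)              # (batch, n_classes, n_points)

# Toy usage: two clouds of 1,024 XYZ points -> per-point class logits.
logits = TinyPointSegNet()(torch.randn(2, 3, 1024))
print(logits.shape)  # torch.Size([2, 5, 1024])
```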

After improving upon these current weaknesses, Liu hopes to make his model available for public use. In fact, his end-goal is to develop an open-source tool that others can use to perform their own point cloud classification of 3DEP’s point cloud data without actually writing any code. The audience was interested in this product and I think it will be a welcome addition to the current 3DEP program.

Detecting Particular Types of Agriculture With Satellite Imagery

In this section, I will actually discuss two presentations due to the similarity in their content and methods. These presentations were delivered by two PhD students at the University of Florida, Torit Chakraborty and Mashoukur Rahaman, who conduct research for the same lab.

The first of these presentations was titled From Grapes to Pixels: Advanced Deep Learning Techniques in Stellenbosch Vineyard Management, and Chakraborty’s goal in the project was to detect certain types of vineyards in Stellenbosch, South Africa. Chakraborty fit several different deep learning models, including Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Convolutional LSTMs (ConvLSTMs), and assessed their performance to determine which model was best for the task. In addition to accurately locating vineyards, Chakraborty analyzed changes to the vineyards over time in order to assess the sustainability of Stellenbosch’s wine-tourism economy. While the level of accuracy he achieved was impressive, it was unclear to me what benefit remote sensing offered over traditional methods. To my knowledge, Chakraborty has not yet published a paper on the topic, making it difficult to learn more about his methods and rationale.

The second presentation was titled Examining Machine and Deep Learning Techniques to Predict Almond Crop Locations in Central Valley California Using Remote Sensing Technologies. Completed in 2023, this study was Rahaman’s Master’s thesis, and access to his paper appears to be restricted to University of Florida faculty and students at the moment. Rahaman’s goal was to identify locations of almond agriculture in Central Valley, California. To do so, he fit a handful of traditional machine learning models, including random forest, k-nearest neighbors, and logistic regression, as well as several deep learning models, including U-Net, MAnet, and DeepLabv3+. He then compared the performance of his models to determine which was best for the job. He found that all of the algorithms performed with at least 90% accuracy, with the deep learning models outperforming the machine learning models by one or two percentage points. With simpler models like logistic regression and random forest performing almost as well as complex, black-box deep learning architectures, Rahaman spoke to the continued value of simpler models. When an explainable and interpretable model has almost the same predictive power as the most sophisticated model available, perhaps it is more valuable to use the simpler one.
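His thesis isn’t openly accessible, so I can’t show his actual code, but the machine learning half of such a comparison typically looks something like the sketch below, where per-pixel spectral features and almond/not-almond labels are assumed to have been extracted from the imagery already (the arrays here are random stand-ins, so the printed accuracies are meaningless).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data: 10,000 pixels x 6 spectral bands, binary almond labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 6))
y = rng.integers(0, 2, size=10_000)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Fit each simple model and report held-out accuracy.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: {accuracy_score(y_test, model.predict(X_test)):.3f}")
```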

In both of these presentations, I did not feel satisfied with the discussion of the applications of their work. Yes, it is cool that you can identify a very particular subclass of agriculture with 95% accuracy, but why does that matter? Is it just easier to perform a large-scale remote sensing analysis than to ask the farmers what they are growing? Or is the practical impact actually increased surveillance, like identifying illicit drug plantations? What benefit will this research have on humanity? There may be encouraging answers to these questions, but the presenters failed to point them out. To me, this highlights the importance of communicating not only the technical components of one’s research, but also the tangible impact one’s work could have on the world.

Conclusion

In this session on GeoAI for Feature Detection and Recognition, I gained exposure to three different yet related areas of research within remote sensing. In the first presentation, I learned that semantic segmentation and deep learning are being used to automate the detection of individual cars in satellite imagery. In the second presentation, I learned how deep learning is improving the quality of 3DEP’s elevation data. And in the third and fourth presentations, I learned about the power of machine learning and deep learning for detecting particular subclasses of agriculture from satellite imagery. Personally, I was impressed with the level of accuracy of these models and the potential of deep learning for automating object identification.

On the other hand, I was somewhat disappointed by the extent to which this research appears to proceed by trial and error. I’m the type of person who wants to understand exactly why something works, from the most basic assumptions to the overarching framework, as well as why a particular algorithm outperforms its alternatives. When you throw a dozen models at a problem and simply select the one with the highest accuracy, research feels less like scientific inquiry and more like an art of algorithm-matching, especially when there is no discussion of the theory behind why one model might perform better than others.

Furthermore, as I mentioned earlier, I felt that some of the researchers failed to communicate why their work was relevant or important. I understand the fascination surrounding the power of deep learning, but I’m more interested in how these tools can be used to solve tangible and meaningful problems. One of my main takeaways from my first conference experience is how important it is to inform the audience not only about one’s methods, but also about the potential impacts of one’s research.