Satellite imagery analysis

Machine learning algorithms applied to satellite imagery can monitor land use and the management of natural resources. Novel applications in the financial sector include predicting crop yields in order to price crop futures.

The figure below illustrates a land use application. We trained a deep neural network to detect the presence of buildings in satellite imagery over Bungoma, Kenya. Red squares denote areas where the machine detected no buildings, while green squares denote areas where buildings were detected. Interact with the map by dragging and zooming.

[Interactive map: building detection over Bungoma, Kenya]

The map also illustrates a simple GIS (Geographic Information System) application. Of the 10 randomly selected parcels of land (yellow dots), which is most suitable for a centralized utility such as a borehole? One approach is to calculate the distances between all the parcels, then select the parcel with the smallest cumulative distance to all the others. This is precisely the analysis performed to arrive at the optimal location, shaded green. Note that by "distance" we mean not straight-line geometric distance, but actual walking distance along roads.
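The selection step above can be sketched in a few lines. The distance matrix here is a hypothetical stand-in (straight-line distances scaled by a rough detour factor); in the actual analysis, each entry would be a walking distance along roads from a routing engine such as GraphHopper.

```python
# Sketch: pick the parcel with the smallest cumulative distance to all
# others. The coordinates and the 1.3x detour factor are illustrative
# placeholders, not values from the analysis above.
import numpy as np

rng = np.random.default_rng(0)
coords = rng.uniform(0, 5, size=(10, 2))  # 10 hypothetical parcels (km)

# Pairwise straight-line distances, crudely scaled to mimic road distance.
straight = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
dist = straight * 1.3

totals = dist.sum(axis=1)          # cumulative distance per parcel
best = int(np.argmin(totals))      # index of the optimal parcel
print(f"Optimal parcel: {best} (total walking distance {totals[best]:.2f} km)")
```

With real road distances substituted in, this is the "smallest cumulative distance" criterion described above (the 1-median of the parcels).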

Each square has sides of length 125 meters; the total area covered by the squares is about 35 square kilometers. One major advantage machine learning offers over traditional analytics is the ability to scale to immense areas. Once trained, the machine can analyze a million square kilometers almost as easily (in machine terms) as it analyzes 35 square kilometers. In contrast, a human analyst studying satellite imagery would reach a limit very quickly.
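The squares are produced by tiling the large scene into fixed-size chips. A minimal sketch, assuming the scene is a NumPy array and each chip is `size` pixels on a side (e.g. 125 pixels for a 125 m square at 1 m/pixel resolution); edge remainders are dropped for simplicity:

```python
# Sketch: cut a large image array into square chips for classification.
import numpy as np

def make_chips(image, size):
    """Return an array of non-overlapping size x size chips."""
    h, w = image.shape[:2]
    chips = []
    for r in range(0, h - size + 1, size):
        for c in range(0, w - size + 1, size):
            chips.append(image[r:r + size, c:c + size])
    return np.stack(chips)

scene = np.zeros((500, 375, 3), dtype=np.uint8)  # placeholder imagery
chips = make_chips(scene, 125)
print(chips.shape)  # (12, 125, 125, 3): a 4 x 3 grid of chips
```

Because chipping and classification are independent per chip, the same loop scales from dozens of chips to millions with no change in logic, which is the scaling advantage described above.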

Machine learning thus enables satellite imagery analysis at a scale no human team could match, opening the door to a wide range of applications.


TECHNICAL DETAILS:

SOFTWARE: Linux, Python, TensorFlow, Keras, Docker, VirtualBox, git, Mapbox/GraphHopper.
TRAINING DETAILS: The entire image (not shown) is 12 GB and produces 60,000 'chips' (the squares in the image above). Training was iterative, with ~1,000 chips in each class. We used a deep convolutional neural network with 12 layers. Each training run took several hours on a machine with 4 Intel Xeon 2.3 GHz processors, 60 GB of memory, and an NVIDIA K80 GPU with 12 GB of memory.
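A Keras model of this general shape can be sketched as follows. This is an illustrative small convolutional classifier for binary chip labels (buildings / no buildings), not the exact 12-layer architecture used above; the 64x64 RGB input size is also an assumption.

```python
# Sketch: a small convolutional network for classifying image chips as
# "buildings" vs. "no buildings". Layer sizes and depth are illustrative.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(input_shape=(64, 64, 3)):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # P(buildings present)
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Trained on the labeled chips (~1,000 per class), a model like this outputs a per-chip probability that buildings are present, which is then thresholded to color each square red or green.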