I recently had the task of serving 200+ terabytes of map tiles to users through a web map. The two map products were 250-meter crop-type layers covering 2000 to 2014. For a little background, this is a USGS and NASA funded project with the mission of mapping the world's croplands at 30-meter resolution over two decades.
The process is quite simple and nothing unusual, except for the ability to make these maps dynamic! It is a far simpler task to create the tiles once and serve them forever as they were originally created. Granted, my work is hardly dynamic at this point (switching between years in an image stack and applying different masks), but I can easily extend the concept. All of this is accomplished with Google Earth Engine, not to be confused with Google Earth.
About the Solution
The primary pieces of the solution are AWS CloudFront, Redis, Heroku, Python Flask, and Google Earth Engine. Each of these tools was chosen for a specific purpose. First, Google Earth Engine provides the power for the creation of the tiles.
Google Earth Engine brings together the world's satellite imagery — trillions of scientific measurements dating back over 40 years — and makes it available online with tools for scientists, independent researchers, and nations to mine this massive warehouse of data to detect changes, map trends and quantify differences on the Earth's surface. Applications include: detecting deforestation, classifying land cover, estimating forest biomass and carbon, and mapping the world’s roadless areas.
It's an amazing tool, and I always find myself running into the mathematics and statistics wall when playing around with it. That's how powerful a tool it is for the more experienced remote sensing scientist. Here is how the tile-serving process works:
- An x/y/z slippy tile map request comes into my server.
- A Flask view parses out the necessary information to build the map and retrieves a map id and token from Google Earth Engine.
- The map id and token are stored in Redis for 12 hours for future requests.
- The app then builds a Google Earth Engine URL for a tile on this map with the same x/y/z from the first step.
- The Python app then fetches this tile and returns it to the original request.
For the most part this works quite smoothly. However, with more complex maps, tiles can take a significant amount of time to be generated by Google Earth Engine.
The solution for this is to put a cache layer in front of these tiles. Amazon CloudFront is the perfect fit: it brings incredible ease of use at low cost, charging pennies per gigabyte. (My next step is to require signed URLs for accessing these tiles, but in the meantime I configured some alarms!)
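For CloudFront to be effective, the origin has to tell it how long tiles may be cached. A minimal sketch of that in Flask, assuming a one-day lifetime (the exact `max-age` value here is illustrative, not what the project uses):

```python
from flask import Flask

app = Flask(__name__)


@app.route("/ping")
def ping():
    # Stand-in route; the real app serves tile responses.
    return "ok"


@app.after_request
def add_cache_headers(response):
    # Tiles for a finished year never change, so let the CDN
    # serve repeats without hitting the app or Earth Engine.
    response.headers["Cache-Control"] = "public, max-age=86400"
    return response
```

With this header in place, CloudFront only forwards a request to the origin when a tile is missing from its edge cache.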
How many tiles?
The neat part of this solution is the number of tiles that are suddenly available to the user. Each global layer with all of the zoom levels from 1 to 18 amounts to 91,625,968,980 tiles, EACH! Now consider that there are numerous layers, each with several years... 2,748,779,069,400 tiles. With an average tile size of 2-3 KB, this is a possible 8 petabytes of data.
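Those numbers are easy to sanity-check: a slippy map has 4**z tiles at zoom level z. The layer count of 30 below is an assumption used to reproduce the total (e.g. two products times fifteen years).

```python
# Tiles in one global layer, zoom levels 1 through 18.
tiles_per_layer = sum(4 ** z for z in range(1, 19))
print(tiles_per_layer)  # 91625968980

# Assumed 30 layers (e.g. 2 products x 15 years).
total_tiles = tiles_per_layer * 30
print(total_tiles)  # 2748779069400

# At 3 KB per tile, roughly 7.5 binary petabytes.
petabytes = total_tiles * 3 / 1024 ** 4
print(round(petabytes, 1))  # 7.5
```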
Maybe I need to reconsider those signed URLs on AWS CloudFront after all.
Here is a view of a Leaflet map with one of these layers and several tiles from somewhere in Africa. Please ignore the control aesthetics, as it is still a work in progress, but the play buttons work and allow the user to loop through the years!
Next time I will talk about the Leaflet maps part of this and how I incorporate Angular into the mix.