How Web Maps Work
This post aims to summarize the basic elements of web maps at a deep level but without any implementation-specific details: it should roughly describe the paradigm adopted by Leaflet, Modest Maps, and OpenLayers, and the slippy map tilename standard.
It does generally describe tiled vector maps, with the obvious caveat that, instead of simply displaying raster data, the browser rasterizes a vector source before adding it to the map.
Goal: Tile Layout
The end result of a map client is a tile layout: given a viewport of, say, 640x480px, the client finds all tiles at a given zoom level and centerpoint that intersect this viewport. It then arranges them so that the tiles are perfectly adjacent.
Tiles are chunks of raster or vector data. Most commonly, tiles are 256x256px images, because images are broadly supported and fast to consume, but they can also be 512x512px.
Tiles can also be JSON, GeoJSON, Protocol buffers, or another kind of vectorized data, so that rasterization can be pushed to the client. This has its advantages and disadvantages, but that’s another discussion: to the map client, vector tiles are equivalent to raster tiles except with an additional rendering step in-browser.
Tile coordinates are tuples with three elements:
tile: [zoom, column, row]
Unlike locations and pixel coordinates, tile coordinates uniquely identify a piece of the map because they include a zoom level. They also differ in that a coordinate can be interpreted as having area, so the coordinate [2, 0, 0] occupies the square of space between it and [2, 1, 1].
Since tiles are quad-tree indexed, coordinates are transformed between zoom levels like so:
tile: [zoom level, column, row] = [zoom level + 1, column * 2, row * 2] = [zoom level + n, column * 2^n, row * 2^n]
So the location of the tile [2, 1, 1] (zoom 2, column & row 1) becomes [3, 2, 2] at zoom level 3, and so on.
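As a minimal sketch of that transformation, here is a pair of helpers that work on plain [zoom, column, row] arrays. The names zoomTo and containingTile are made up for this post and do not come from any of the libraries mentioned above.

```javascript
// Hypothetical helper: re-express a [zoom, column, row] coordinate
// at another zoom level.
function zoomTo(coordinate, targetZoom) {
  const [zoom, column, row] = coordinate;
  // Each zoom step doubles (or halves) the number of columns and rows.
  const scale = Math.pow(2, targetZoom - zoom);
  return [targetZoom, column * scale, row * scale];
}

// Hypothetical helper: the whole tile that contains a possibly
// fractional coordinate.
function containingTile(coordinate) {
  const [zoom, column, row] = coordinate;
  return [zoom, Math.floor(column), Math.floor(row)];
}

console.log(zoomTo([2, 1, 1], 3)); // [3, 2, 2]
console.log(containingTile(zoomTo([3, 5, 3], 2))); // [2, 2, 1]
```

Zooming out can produce fractional columns and rows, which is why the second helper floors them to get back to a whole tile.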
Locations are geographical locations: they can be represented as [latitude, longitude] pairs. A location maps to a different, unique coordinate at each zoom level.
The latitude and longitude values are assumed to be WGS84 unless otherwise specified.
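To make the location-to-coordinate step concrete, here is a sketch that assumes the spherical Mercator projection used by the slippy map tilename convention. The function names are hypothetical, and a real client may use a different projection.

```javascript
// Assumption: spherical Mercator, as in the slippy map tilename scheme.
// Converts a [latitude, longitude] pair in degrees to [zoom, column, row].
function locationToCoordinate(location, zoom) {
  const [latitude, longitude] = location;
  const tilesAcross = Math.pow(2, zoom);
  const latRad = (latitude * Math.PI) / 180;
  const column = ((longitude + 180) / 360) * tilesAcross;
  const row =
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) *
    tilesAcross;
  return [zoom, column, row];
}

// The inverse: a coordinate back to a [latitude, longitude] pair.
function coordinateToLocation(coordinate) {
  const [zoom, column, row] = coordinate;
  const tilesAcross = Math.pow(2, zoom);
  const longitude = (column / tilesAcross) * 360 - 180;
  const latRad = Math.atan(Math.sinh(Math.PI * (1 - (2 * row) / tilesAcross)));
  return [(latRad * 180) / Math.PI, longitude];
}

// The same location lands on a different coordinate at each zoom level.
console.log(locationToCoordinate([0, 0], 0)); // [0, 0.5, 0.5]
console.log(locationToCoordinate([0, 0], 1)); // [1, 1, 1]
```

The returned columns and rows are fractional, so flooring them gives the tile that contains the location.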
Pixel locations are the most ephemeral and immediate units in web maps: the pixel location of a point is an [x, y] pair giving its offset from the top-left of the map element. Every time that the map moves, all pixel locations become invalid, so positioning elements on the map with pixels requires you to reposition those elements every time the map is redrawn.
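A rough sketch of why pixel locations go stale: the pixel position of a coordinate depends on the current map center. The coordinateToPoint function and the viewport object below are hypothetical, and 256px tiles are assumed.

```javascript
// Hypothetical helper: pixel offset of a coordinate from the top-left
// of the map element, given the coordinate at the center of the map.
function coordinateToPoint(coordinate, center, viewport, tileSize = 256) {
  const [zoom, column, row] = coordinate;
  const [centerZoom, centerColumn, centerRow] = center;
  // Bring the coordinate to the zoom level of the map center first.
  const scale = Math.pow(2, centerZoom - zoom);
  const x = viewport.width / 2 + (column * scale - centerColumn) * tileSize;
  const y = viewport.height / 2 + (row * scale - centerRow) * tileSize;
  return [x, y];
}

const viewport = { width: 640, height: 480 };
// The same coordinate, before and after the map center moves half a tile.
console.log(coordinateToPoint([2, 1, 1], [2, 2, 2], viewport)); // [64, -16]
console.log(coordinateToPoint([2, 1, 1], [2, 2.5, 2], viewport)); // [-64, -16]
```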
Relationships Between Data Types
You can ‘convert’ between these different representations, but since each type represents a different concept, it’s not 1:1.
- Coordinate → Location: Each coordinate maps to one geographical location
- Location → Coordinate: Since coordinates have zoom as well as location, locations map to a different coordinate at each zoom level
- Coordinate / Location → Point: Coordinates and locations at a particular zoom level map to pixel points, but this changes whenever the map centerpoint moves.
- Coordinate → Tile: Coordinates yield tiles: the coordinate [0, 0, 0] could yield a tile with the URL http://c.tile.openstreetmap.org/0/0/0.png. Tiles represent coordinates.
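The coordinate-to-tile step can be sketched as filling in a URL template. The {z}/{x}/{y} placeholder syntax and the tileUrl name are assumptions made for illustration. The URL layout (zoom, then column, then row) follows the example above.

```javascript
// Hypothetical helper: fill a URL template from a [zoom, column, row]
// coordinate. The {z}, {x} and {y} placeholders are an assumed convention.
function tileUrl(template, coordinate) {
  const [zoom, column, row] = coordinate;
  return template
    .replace("{z}", String(zoom))
    .replace("{x}", String(column))
    .replace("{y}", String(row));
}

const template = "http://c.tile.openstreetmap.org/{z}/{x}/{y}.png";
console.log(tileUrl(template, [0, 0, 0]));
// http://c.tile.openstreetmap.org/0/0/0.png
```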
Given the goal of tile layout and the definition of a tile, let’s start thinking about the map client. Like any application, it is defined by its state.
The state of the map is its current zoom level and centerpoint: this changes every time that the map moves, zooms, pans, etc. Other attributes, like layer selections, styling, and such, will be described as configuration.
There are multiple ways to represent state:
- A ‘geographical location’ and a zoom level
- A coordinate
The Modest Maps tradition uses a coordinate, so we’ll describe those first, but Leaflet and OpenLayers use geographical locations, which are equivalent.
Changing Map State
Changing the map’s state, in this case represented by a coordinate, is an appropriate place for getters and setters. A minimal API for changing map state could be a getter and a setter for that coordinate.
And, indeed, this is what Modest Maps does inside of its higher abstractions. But a typical set of abstractions would be functions that accept a geographical location rather than a coordinate.
Internally each of these functions will convert the location into a coordinate, assign that as the current map state, and call draw(), a function that renders the map onto the page, which is what we’ll address in the next section.
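Here is a sketch of that convert, assign, and draw pattern. The SketchMap class and its method names are invented for this example and are not the API of Modest Maps, Leaflet, or OpenLayers. The projection is assumed to be spherical Mercator.

```javascript
// Hypothetical map object whose only state is a [zoom, column, row] coordinate.
class SketchMap {
  constructor(draw) {
    this.draw = draw; // callback that renders the map for a coordinate
    this.coordinate = [0, 0.5, 0.5]; // middle of the single zoom-0 tile
  }

  // Assumption: spherical Mercator, as in the slippy map tilename scheme.
  static project([latitude, longitude], zoom) {
    const n = Math.pow(2, zoom);
    const latRad = (latitude * Math.PI) / 180;
    const column = ((longitude + 180) / 360) * n;
    const row =
      ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n;
    return [zoom, column, row];
  }

  // Convert the location to a coordinate, store it as state, then redraw.
  setCenterZoom(location, zoom) {
    this.coordinate = SketchMap.project(location, zoom);
    this.draw(this.coordinate);
    return this;
  }

  // Keep the same center, but re-express it at the new zoom level.
  setZoom(zoom) {
    const [currentZoom, column, row] = this.coordinate;
    const scale = Math.pow(2, zoom - currentZoom);
    this.coordinate = [zoom, column * scale, row * scale];
    this.draw(this.coordinate);
    return this;
  }
}

const map = new SketchMap((coordinate) => console.log("draw", coordinate));
map.setCenterZoom([0, 0], 1).setZoom(3);
```

Every setter ends by calling draw, so the page never shows a state that differs from the stored coordinate.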
Maps are rendered: typically as <img> elements in the DOM.
The structure of an images-in-the-dom library centers on a drawing routine that places each tile on the page.
So the main task of positionTile in the common case, in which it’s dealing with HTML elements, is positioning.
The simplest way to do this is with absolute positioning inside of relative positioning: the map’s parent is relative, the children are absolutely positioned.
An optimization here is to use CSS transforms, which have a few advantages: they can cause fewer reflows, use hardware acceleration in their 3D versions, and, unlike CSS’s width and height properties, the scaling transformation is inherited.
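To compare the two positioning approaches, here is a sketch that computes the style values a tile-positioning function might set. The tileStyles name and the returned object shape are hypothetical. It assumes 256px tiles and that the tile and the map center are at the same zoom level.

```javascript
// Hypothetical helper: compute CSS for one tile, both ways.
// Assumes `tile` and `center` are [zoom, column, row] at the same zoom.
function tileStyles(tile, center, viewport, tileSize = 256) {
  const [, column, row] = tile;
  const [, centerColumn, centerRow] = center;
  // Offset of the tile's top-left corner from the map's top-left corner.
  const left = Math.round(viewport.width / 2 + (column - centerColumn) * tileSize);
  const top = Math.round(viewport.height / 2 + (row - centerRow) * tileSize);
  return {
    // Absolute positioning inside a relatively positioned parent.
    absolute: { position: "absolute", left: `${left}px`, top: `${top}px` },
    // The same placement expressed as a 3D transform.
    transform: {
      position: "absolute",
      transform: `translate3d(${left}px, ${top}px, 0)`,
    },
  };
}

const styles = tileStyles([2, 1, 1], [2, 2, 2], { width: 640, height: 480 });
console.log(styles.absolute.left, styles.absolute.top); // 64px -16px
console.log(styles.transform.transform); // translate3d(64px, -16px, 0)
```

In a browser, either object could be copied onto an element's style with Object.assign(element.style, styles.transform).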
Interaction is typically in the domain of ‘handlers’, which split up the different ways of moving and changing the map. A typical set of interactions might be:
- Double-click → zoom in around a point
- Drag → pan
- MouseWheel → zoom
- Single-finger swipe → pan
- Double-finger pinch → zoom
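Whatever the input device, these handlers mostly reduce to two changes to the center coordinate. The sketch below shows one way to express them. The panBy and zoomAround names are hypothetical, and 256px tiles are assumed.

```javascript
// Hypothetical: dragging the map by (dx, dy) pixels moves the center
// coordinate the opposite way, measured in tiles.
function panBy(center, dx, dy, tileSize = 256) {
  const [zoom, column, row] = center;
  return [zoom, column - dx / tileSize, row - dy / tileSize];
}

// Hypothetical: change zoom while keeping the coordinate under `point`
// (an [x, y] pixel location) fixed on screen.
function zoomAround(center, zoomDelta, point, viewport, tileSize = 256) {
  const [zoom, column, row] = center;
  // Offset of the point from the middle of the viewport, in tiles.
  const offsetX = (point[0] - viewport.width / 2) / tileSize;
  const offsetY = (point[1] - viewport.height / 2) / tileSize;
  const scale = Math.pow(2, zoomDelta);
  return [
    zoom + zoomDelta,
    (column + offsetX) * scale - offsetX,
    (row + offsetY) * scale - offsetY,
  ];
}

const viewport = { width: 640, height: 480 };
console.log(panBy([2, 2, 2], 256, 0)); // [2, 1, 2]
console.log(zoomAround([2, 2, 2], 1, [576, 240], viewport)); // [3, 5, 4]
```

A double-click handler would call something like zoomAround with a delta of 1. A pinch or mouse wheel handler could pass a fractional delta.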
Tile Management and Removal
With each zoom level, the number of potential tiles increases exponentially: zoom level z has 4^z tiles, so at zoom level 5, 1,024 tiles are possible. The interface therefore needs to intelligently load tiles that are visible and prune those that aren’t.
To find what tiles are on screen, you find the coordinates of the top-left and bottom-right corners of the screen, and loop through every column and row between them.
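A minimal sketch of that loop follows. The visibleTiles name is hypothetical, 256px tiles are assumed, and the map center is taken to be a [zoom, column, row] coordinate. For simplicity it clamps to the edge of the world rather than wrapping around it.

```javascript
// Hypothetical helper: list the tiles that intersect the viewport.
function visibleTiles(center, viewport, tileSize = 256) {
  const [zoom, centerColumn, centerRow] = center;
  // Half the viewport size, measured in tiles.
  const halfWidth = viewport.width / 2 / tileSize;
  const halfHeight = viewport.height / 2 / tileSize;
  // Valid columns and rows at this zoom run from 0 to 2^zoom - 1.
  const maxIndex = Math.pow(2, zoom) - 1;
  const clamp = (n) => Math.min(Math.max(n, 0), maxIndex);

  // Tiles containing the top-left and bottom-right corners of the screen.
  const firstColumn = clamp(Math.floor(centerColumn - halfWidth));
  const lastColumn = clamp(Math.floor(centerColumn + halfWidth));
  const firstRow = clamp(Math.floor(centerRow - halfHeight));
  const lastRow = clamp(Math.floor(centerRow + halfHeight));

  const tiles = [];
  for (let column = firstColumn; column <= lastColumn; column++) {
    for (let row = firstRow; row <= lastRow; row++) {
      tiles.push([zoom, column, row]);
    }
  }
  return tiles;
}

// A 640x480 viewport centered on [2, 2, 2] covers 4 columns and 2 rows.
console.log(visibleTiles([2, 2, 2], { width: 640, height: 480 }).length); // 8
```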
The simplest way to implement tile removal is with an LRU cache: tiles that haven’t been displayed recently are the first ones to be flushed from the cache. On the web, this isn’t necessarily part of the map client itself, since web browsers implement their own caching.
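For clients that do manage their own tiles, an LRU cache can be sketched with a JavaScript Map, which remembers insertion order. The TileCache class below is a hypothetical illustration, keyed by whatever string identifies a tile, such as its URL.

```javascript
// Hypothetical LRU cache: the least recently used tile is evicted first.
class TileCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.tiles = new Map(); // iteration order is oldest first
  }

  get(key) {
    if (!this.tiles.has(key)) return undefined;
    const tile = this.tiles.get(key);
    // Re-insert so that this key becomes the most recently used.
    this.tiles.delete(key);
    this.tiles.set(key, tile);
    return tile;
  }

  set(key, tile) {
    this.tiles.delete(key);
    this.tiles.set(key, tile);
    if (this.tiles.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.tiles.keys().next().value;
      this.tiles.delete(oldest);
    }
  }

  has(key) {
    return this.tiles.has(key);
  }
}

const cache = new TileCache(2);
cache.set("0/0/0", "tile a");
cache.set("1/0/0", "tile b");
cache.get("0/0/0"); // touch the first tile
cache.set("1/1/0", "tile c"); // evicts "1/0/0"
console.log(cache.has("1/0/0")); // false
```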