Rob Emanuele edited this page Jun 10, 2013 · 21 revisions

The ARG (Azavea Raster Graphics) format is a simple way to encode raster data.

Overview

A raster encoded in ARG comprises two files: foo.json and foo.arg. The JSON file contains all metadata about the raster, including name, data type, resolution, and the geographic extent. The ARG file contains the actual raster data (a two-dimensional grid of numbers). Both files are required.

Metadata

Here is a sample JSON metadata file. Every key seen here is required. Other keys are allowed but will be ignored.

{
  "layer": "philly rainfall",
  "type": "arg",
  "datatype": "int8",
  "xmin": -8507736.525864778,
  "ymin": 4847928.144104313,
  "xmax": -8457736.525864778,
  "ymax": 4897928.144104313,
  "cellwidth": 10.0,
  "cellheight": 10.0,
  "rows": 5000,
  "cols": 5000
}

The layer key provides the name of the raster as a string. The type must be set to arg.
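The key requirements above can be checked mechanically. The following is a minimal sketch, not part of any ARG tooling: the names `REQUIRED_KEYS` and `validate_metadata` are hypothetical, and the checks cover only the rules stated so far (all sample keys present, type equal to arg).

```python
# Keys taken from the sample metadata file; the constant name is illustrative.
REQUIRED_KEYS = {
    "layer", "type", "datatype",
    "xmin", "ymin", "xmax", "ymax",
    "cellwidth", "cellheight", "rows", "cols",
}

def validate_metadata(meta):
    """Return a list of problems; an empty list means the checks passed."""
    problems = ["missing key: %s" % k for k in sorted(REQUIRED_KEYS - set(meta))]
    # Unknown keys are allowed by the format, so they are not reported.
    if "type" in meta and meta["type"] != "arg":
        problems.append('"type" must be "arg"')
    return problems

example = {
    "layer": "philly rainfall", "type": "arg", "datatype": "int8",
    "xmin": 0.0, "ymin": 0.0, "xmax": 40.0, "ymax": 40.0,
    "cellwidth": 10.0, "cellheight": 10.0, "rows": 4, "cols": 4,
    "comment": "extra keys are ignored",
}
print(validate_metadata(example))  # []
```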

The geographic area covered by the raster is given by the points (xmin, ymin) and (xmax, ymax). xmin is the western edge of the raster and ymin is the southern edge. Together these values form the "lower left point" of the rectangle. xmax is the eastern edge of the raster and ymax is the northern edge. Together these values form the "upper right point" of the rectangle.

The coordinate system for these points is unspecified. While the web mercator projection is often used, it is not required.

The resolution of the raster is provided by cellwidth and cellheight, which describe how much geographic area is covered by a single pixel's width and height, respectively. In general, these values should be the same when the output will be rendered with square pixels, although this is not enforced.

Geographic values like xmin and cellwidth are interpreted as 64-bit floating point values. Thus, some slight rounding error may accumulate when doing calculations with these values. For this reason, the dimensions of the raster are also provided: the width of the raster is given in columns (cols) and the height of the raster is given in rows (rows). Both values must be positive integers.
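Since cols and rows are stated explicitly, a reader can compare them against the extent and cell size, allowing for floating point error. This is a rough sketch of such a check; the function name `check_dimensions` and the tolerance are assumptions made for illustration, and the format itself does not mandate this check. Only the keys needed for the calculation are included, with values copied from the sample metadata.

```python
import json
import math

def check_dimensions(meta, rel_tol=1e-6):
    """Return True if cols/rows agree with the extent divided by the cell size."""
    width = meta["xmax"] - meta["xmin"]
    height = meta["ymax"] - meta["ymin"]
    # Compare with a tolerance because the extent values are 64-bit floats.
    cols_ok = math.isclose(width, meta["cols"] * meta["cellwidth"], rel_tol=rel_tol)
    rows_ok = math.isclose(height, meta["rows"] * meta["cellheight"], rel_tol=rel_tol)
    return cols_ok and rows_ok

sample = json.loads("""
{
  "xmin": -8507736.525864778,
  "ymin": 4847928.144104313,
  "xmax": -8457736.525864778,
  "ymax": 4897928.144104313,
  "cellwidth": 10.0,
  "cellheight": 10.0,
  "rows": 5000,
  "cols": 5000
}
""")
print(check_dimensions(sample))
```

For the sample file, the extent is 50,000 units on each side, which matches 5000 cells of size 10.0 in both directions.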

The datatype parameter communicates the width (in bytes) of each cell, as well as how to interpret the value. It must be one of the following:

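| datatype | bytes per cell | signed? | min | max | no data | supported in GeoTrellis? |
|----------|----------------|---------|-----|-----|---------|--------------------------|
| int8 | 1 | yes | -127 | 127 | -128 | yes |
| int16 | 2 | yes | -32,767 | 32,767 | -32,768 | yes |
| int32 | 4 | yes | -2^31+1 | 2^31-1 | -2^31 | yes |
| int64 | 8 | yes | -2^63+1 | 2^63-1 | -2^63 | no |
| uint8 | 1 | no | 0 | 254 | 255 | no |
| uint16 | 2 | no | 0 | 65,534 | 65,535 | no |
| uint32 | 4 | no | 0 | 2^32-2 | 2^32-1 | no |
| uint64 | 8 | no | 0 | 2^64-2 | 2^64-1 | no |
| float32 | 4 | yes | -10^38.5 | 10^38.5 | NaN | yes |
| float64 | 8 | yes | -10^308.2 | 10^308.2 | NaN | yes |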

There are three specific keys which are not directly used by GeoTrellis but which could be used by other clients: epsg, xskew and yskew.

The epsg attribute gives the coordinate system of the data. When absent, the coordinate system is assumed to be Web Mercator (i.e. "3785").

The xskew and yskew attributes support rotated rasters. When unspecified their values are assumed to be "0.0". GeoTrellis does not currently support rasters whose skew attributes are not zero (although they are allowed by the ARG format).

Data

The ARG file contains every cell value, starting with the upper-left cell (Northwest corner) and ending with the lower-right cell (Southeast corner). Notice that this differs from the geographic extent, which is described starting from the lower-left point.
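To make the relationship between cell order and geography concrete, here is a small sketch that maps a cell position to its byte offset and geographic bounds. It assumes row-major order (each row stored west to east, rows stored north to south) and zero skew; the function names `cell_offset` and `cell_bounds` are hypothetical.

```python
def cell_offset(meta, row, col, bytes_per_cell):
    """Byte offset of the cell at (row, col), assuming row-major order."""
    return (row * meta["cols"] + col) * bytes_per_cell

def cell_bounds(meta, row, col):
    """Return (xmin, ymin, xmax, ymax) of one cell; row 0 is the northern row."""
    x0 = meta["xmin"] + col * meta["cellwidth"]
    y1 = meta["ymax"] - row * meta["cellheight"]
    return (x0, y1 - meta["cellheight"], x0 + meta["cellwidth"], y1)

# Illustrative 4x4 raster covering the square from (0, 0) to (40, 40).
meta = {"xmin": 0.0, "ymin": 0.0, "xmax": 40.0, "ymax": 40.0,
        "cellwidth": 10.0, "cellheight": 10.0, "rows": 4, "cols": 4}

print(cell_offset(meta, 0, 0, 4))  # 0: the first cell in the file
print(cell_bounds(meta, 0, 0))     # (0.0, 30.0, 10.0, 40.0): the northwest cell
```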

There is no header information in an ARG file, no checksum, and no form of compression. Each cell uses the same amount of space, so the total size of an ARG file is always equal to the number of cells times the size of each cell. For example, a 40x40 raster has 1600 cells, so at 4 bytes per cell it occupies 1600 x 4 = 6400 bytes, or 6.4 KB.
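The size rule lends itself to a quick sanity check before reading a file. The sketch below computes the expected size from the metadata; the names `BYTES_PER_CELL`, `expected_size`, and `has_expected_size` are illustrative, and the byte widths come from the datatype table.

```python
import os

# Cell widths in bytes, per the datatype table.
BYTES_PER_CELL = {
    "int8": 1, "int16": 2, "int32": 4, "int64": 8,
    "uint8": 1, "uint16": 2, "uint32": 4, "uint64": 8,
    "float32": 4, "float64": 8,
}

def expected_size(meta):
    """Number of bytes the ARG file must contain: cells times bytes per cell."""
    return meta["rows"] * meta["cols"] * BYTES_PER_CELL[meta["datatype"]]

def has_expected_size(path, meta):
    """Compare the size of the file at `path` with the size implied by `meta`."""
    return os.path.getsize(path) == expected_size(meta)

print(expected_size({"rows": 40, "cols": 40, "datatype": "int32"}))  # 6400
```

By the same rule, the sample metadata file (5000 x 5000 cells of int8) implies a 25,000,000-byte ARG file.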

Sometimes this kind of file is called a "raw file".

All data is stored in network byte order (big-endian). That is, a 32-bit integer with the value 1 would be represented as 0x00000001.

Notice that without the accompanying JSON file there is no way to know what geographic area a raster pertains to, or even the correct dimensions of the raster.
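Putting the pieces together, decoding requires the datatype and dimensions from the JSON file. The following is a minimal sketch of a decoder, not GeoTrellis's implementation: `FORMATS` and `decode` are hypothetical names, cells are assumed to be stored row by row, and no-data cells are returned as None.

```python
import struct

# struct type code and no-data value for each datatype.
# Floats use NaN as no data, which is detected separately below.
FORMATS = {
    "int8": ("b", -(1 << 7)),      "uint8": ("B", (1 << 8) - 1),
    "int16": ("h", -(1 << 15)),    "uint16": ("H", (1 << 16) - 1),
    "int32": ("i", -(1 << 31)),    "uint32": ("I", (1 << 32) - 1),
    "int64": ("q", -(1 << 63)),    "uint64": ("Q", (1 << 64) - 1),
    "float32": ("f", None),        "float64": ("d", None),
}

def decode(raw, meta):
    """Decode ARG bytes into a list of rows, with None for no-data cells."""
    code, nodata = FORMATS[meta["datatype"]]
    rows, cols = meta["rows"], meta["cols"]
    count = rows * cols
    if len(raw) != count * struct.calcsize(">" + code):
        raise ValueError("data length does not match rows x cols x cell size")
    # '>' selects big-endian (network byte order) with standard sizes.
    values = struct.unpack(">%d%s" % (count, code), raw)

    def clean(v):
        # NaN is the only value that is not equal to itself.
        return None if (v != v or v == nodata) else v

    return [[clean(v) for v in values[r * cols:(r + 1) * cols]]
            for r in range(rows)]

raw = struct.pack(">4i", -(1 << 31), 2, -3, -4)
print(decode(raw, {"datatype": "int32", "rows": 2, "cols": 2}))
# [[None, 2], [-3, -4]]
```

The length check is a design choice of this sketch: since the format has no header or checksum, comparing the byte count against the metadata is the only cheap way to detect a mismatched pair of files.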

Examples

Below is a simple Python program that encodes raster data into ARG format. For each of the 10 data types, it prints a sample raster's encoding, as well as the encoding of the no-data value.

#!/usr/bin/env python3

import struct

# given fmt and nodata, encodes a value as bytes
def pack(fmt, nodata, value):
    if value is None:
        value = nodata
    return struct.pack(fmt, value)

# packs the given values together as bytes
def encode(fmt, nodata, values):
    chunks = [pack(fmt, nodata, v) for v in values]
    return b''.join(chunks)

# translates the bytes b"\x12\x13" into "0x1213"
def show(s):
    return '0x' + s.hex()

# None means "no data"; '>' in each format selects big-endian byte order
tests = [
    {'formats': [('int8', '>b', -(1<<7)),
                 ('int16', '>h', -(1<<15)),
                 ('int32', '>i', -(1<<31)),
                 ('int64', '>q', -(1<<63))],
     'data': [None, 2, -3, -4]},
    {'formats': [('uint8', '>B', (1<<8)-1),
                 ('uint16', '>H', (1<<16)-1),
                 ('uint32', '>I', (1<<32)-1),
                 ('uint64', '>Q', (1<<64)-1)],
     'data': [None, 2, 3, 4]},
    {'formats': [('float32', '>f', float('nan')),
                 ('float64', '>d', float('nan'))],
     'data': [None, 1.1, -20.02, 300.003]},
]

print("2x2 raster values:")
for d in tests:
    print("  data: %r" % d['data'])
    for (name, fmt, nodata) in d['formats']:
        encoded = encode(fmt, nodata, d['data'])
        print("  %-7s %s" % (name, show(encoded)))
    print("")

print("nodata values:")
for d in tests:
    for (name, fmt, nodata) in d['formats']:
        nd = pack(fmt, nodata, nodata)
        print("  %-7s %s (%s)" % (name, show(nd), nodata))
    print("")