EOS Electronic Supplement



New Shoreline Map-Drawing Data Available


by Rainer Feistel, Baltic Sea Research Institute, Seastrasse 15, D-18119 Warnemünde, Germany; E-mail: rainer.feistel@io-warnemuende.de.


A new shoreline database for making geophysical maps is now available to software developers and should speed up the map-drawing process considerably. The new database, known as Regionally Accessible Nested Global Shorelines (RANGS), attempts to combine the best of the two sets of data most often used in recent years. It follows its forerunners, World Vector Shoreline (WVS) and global self-consistent hierarchical high-resolution shorelines (GSHHS), in providing coastline information for computing geophysical maps.

RANGS data are based on the GSHHS data set computed by Wessel and Smith [1996]. GSHHS data were derived by Wessel and Smith from the widely used World Vector Shoreline (WVS) data set and additional Central Intelligence Agency data.

WVS was originally provided by the U.S. Defense Mapping Agency (DMA) [1988], now the National Imagery and Mapping Agency (see, e.g., Soluri and Woodson [1990]). WVS data are organized in 1° x 1° cells that cover the entire surface of the globe, each containing only water, only land, or coastlines. The coastlines consist of line segments of varying length, that is, sequences of latitude-longitude coordinate pairs. They resolve structures smaller than 100 m. Vertex coordinates are given on a 0.1" raster (about 3 m); their absolute accuracy is 500 m. Although no resolution is officially stated for WVS, we will assume it to be 100 m in the following. The 64,800 cells of Earth's surface are grouped in ten different ocean basin area files.

Wessel and Smith merged the separate WVS files and concatenated the various line segments into closed polygon sequences, assigning each its own ID number. They assigned hierarchy level indices to these polygons in order to distinguish ocean shores from lakes on land, islands in lakes, and so forth. Using algorithms developed by Douglas and Peucker [1973], they derived lower resolution versions of the polygon sets (full resolution = 0.1 km, high = 0.2 km, intermediate = 1.0 km, low = 5.0 km, and crude = 25 km). They called the corresponding files gshhs_f_, gshhs_h_, gshhs_i_, gshhs_l_, and gshhs_c_. The coordinates are expressed as integer numbers of microdegrees (thus theoretically resolving 0.1 m).
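The line-reduction idea referred to above can be sketched as follows. This is a generic, minimal version of the Douglas-Peucker scheme, not the code used by Wessel and Smith. The function names, the planar distance measure, and the use of the distance to the chord segment (rather than to the infinite line) are assumptions made for illustration; the tolerance is in the same units as the coordinates, not in kilometers.

```python
import math

def point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the closest point on the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, tolerance):
    """Return a reduced copy of an open line, keeping both end points."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        # Every intermediate point is close enough to the chord.
        return [first, last]
    # Keep the farthest point and reduce both halves recursively.
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right
```

A larger tolerance removes more vertices, which is how one polygon set can yield several successively coarser versions.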

Compared to WVS, these GSHHS data are a significant improvement for software applications that visualize graphics using masking or rendering methods. However, they still have two major shortcomings. First, their mutual topological relations are not specified; that is, it is not indicated whether two given polygons are disjoint or one is inside the other. Second, polygon parts cannot be accessed locally, as they can with the WVS cell bins. As an example, the Warnemünde Baltic Sea Research Institute is located on a shoreline polygon that starts at the Baltic, continues around the Iberian Peninsula, the Mediterranean, Africa, Arabia, India, China, Siberia, and Scandinavia, and eventually returns to the Baltic, all with about 100 m spatial resolution. Moreover, to draw, say, an island in the Baltic, the program has to process the full GSHHS file because there is no indication whether any of the polygons intersects the given cell(s). Numerical processing, or even just drawing, of such huge polygons consumes a great deal of time and memory, especially in interactive software with graphical user interfaces.

RANGS has been developed to overcome these disadvantages while preserving the benefits of GSHHS. Like WVS, RANGS is composed of 1° x 1° cells covering the globe. For each cell, the intersection of each GSHHS polygon with the cell square is computed. All these local RANGS polygons keep a reference to their global GSHHS parent polygons. The RANGS polygons were then nested by determining which polygons surround a given one and which lie inside it. RANGS polygons are stored as additionally generated vertices together with pointers into the GSHHS files. GSHHS itself is only slightly modified compared to the original data given by Wessel and Smith.

Processing GSHHS Files

The construction of RANGS polygons required some preparation and some modification of the original GSHHS files. For one, the byte sequence of the binary numbers needed to be reversed. (GSHHS comes in the Unix/Mac number format of a Motorola processor, whereas we were using a Windows PC with an Intel processor.) Wessel and Smith provided a byte-reversal program, but we could not get it working, so we used our own program written for this purpose.
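A byte reversal of this kind might look like the sketch below, which converts one header-plus-points block from Motorola (big-endian) to Intel (little-endian) order. It is not the program used for RANGS. The name convert_block and the 36-byte packed header (eight 4-byte integers followed by two 2-byte integers, no padding) are assumptions based on the header fields listed under Structure of Files.

```python
import struct

# Assumed packed header: 8 x int32 followed by 2 x int16 (36 bytes).
HEADER_BE = struct.Struct(">8i2h")   # Motorola / big-endian
HEADER_LE = struct.Struct("<8i2h")   # Intel / little-endian

def convert_block(data, offset=0):
    """Convert one polygon block; return (converted bytes, next offset)."""
    fields = HEADER_BE.unpack_from(data, offset)
    n = fields[1]                      # second field: number of points
    start = offset + HEADER_BE.size
    coords = struct.unpack_from(">%di" % (2 * n), data, start)
    out = HEADER_LE.pack(*fields) + struct.pack("<%di" % (2 * n), *coords)
    return out, start + 8 * n
```

Note that the two short fields have to be swapped as 2-byte quantities, so reversing the whole file in 4-byte words would not be sufficient under this layout.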

Rim rotation also was performed. For all GSHHS polygons not confined to a single cell, the polygon rim points were moved forward within the file and the resulting excess points were appended at the end until the first and last points of the polygon became located in different cells.
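The rim rotation described above can be illustrated with a short sketch. It assumes that a polygon is held as a list of (lon, lat) tuples in integer microdegrees and that the closing point is not repeated at the end of the list; both the representation and the names cell_of and rotate_rim are hypothetical.

```python
def cell_of(point):
    """Return the (lon, lat) index of the 1-degree cell containing a point."""
    x, y = point
    return (x // 1_000_000, y // 1_000_000)

def rotate_rim(points):
    """Rotate the ring so that its first and last points lie in different cells."""
    n = len(points)
    for shift in range(n):
        # After rotating by shift, points[shift] is first and
        # points[shift - 1] is last.
        if cell_of(points[shift]) != cell_of(points[shift - 1]):
            return points[shift:] + points[:shift]
    # Polygon confined to a single cell: leave it unchanged.
    return list(points)
```

After the rotation, the polygon start no longer splits a rim segment in the middle of a cell, so each passage of the polygon through a cell is one contiguous run of points.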

Cleaning also was necessary. Longitude coordinates had to be restricted to the range 0 ≤ x < 360. All duplicate vertices were removed.
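A minimal sketch of such a cleaning pass is given below. It assumes coordinates in integer microdegrees and interprets "duplicate vertices" as consecutive repeated points; that interpretation and the name clean_points are assumptions, since the text does not spell out the details.

```python
FULL_CIRCLE = 360_000_000  # 360 degrees in microdegrees

def clean_points(points):
    """Wrap longitudes into [0, 360) degrees and drop consecutive duplicates."""
    cleaned = []
    for x, y in points:
        x %= FULL_CIRCLE            # Python's % always yields a value >= 0
        if not cleaned or cleaned[-1] != (x, y):
            cleaned.append((x, y))
    return cleaned
```

Wrapping is done before the comparison so that two points that differ only by a full turn in longitude are also recognized as duplicates.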

Additionally, the level index had to be corrected for four polygons. They were ID 3087 in gshhs_f_ and ID 2418 in gshhs_h_ (both corrected to level 2 from level 1); and ID 47992 in gshhs_f_ and ID 490 in gshhs_i_ (both corrected to level 2 from level 3). The processed GSHHS files gshhs_f_, gshhs_h_, gshhs_i_, gshhs_l_, and gshhs_c_ were then renamed to gshhs(0).rim, gshhs(1).rim, gshhs(2).rim, gshhs(3).rim, and gshhs(4).rim.

Structure of Files

The file structure of the modified gshhs(?).rim files is identical to that described by Wessel and Smith [1996] for Version 1.1, April 30, 1996. It is this version that we processed here, downloaded in January 1997 from their FTP site (ftp://kiawe.soest.hawaii.edu/pub/wessel/gshhs/). The files contain several successive logical blocks, each consisting of a gshhs header followed by gshhs points. Each gshhs header consists of the following variables:

int id;                          /* Unique polygon id number, starting at 0 */
int n;                           /* Number of points in this polygon */
int level;                       /* 1 land, 2 lake, 3 island_in_lake, 4 pond_in_island_in_lake */
int west, east, south, north;    /* min/max extent in micro-degrees */
int area;                        /* Area of polygon in 1/10 km^2 */
short int greenwich;             /* Greenwich is 1 if Greenwich is crossed */
short int source;                /* 0=CIA WDBII, 1=WVS */

Here, int denotes a 4-byte integer and short int a 2-byte integer. The gshhs points are stored as n successive records of the form int x (/* longitude of a point in microdegrees */) and int y (/* latitude of a point in microdegrees */).
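Reading such blocks could be sketched as follows. The example assumes that the header is stored packed in the declared order (36 bytes, without compiler padding) and that the processed .rim files are little-endian (Intel); neither is stated explicitly in the text. The names Header and iter_polygons are hypothetical.

```python
import struct
from collections import namedtuple

Header = namedtuple(
    "Header", "id n level west east south north area greenwich source")

# Assumed layout: 8 x int32 + 2 x int16, little-endian, no padding.
HEADER = struct.Struct("<8i2h")

def iter_polygons(data):
    """Yield (header, points) for each block in the bytes of a .rim file."""
    offset = 0
    while offset + HEADER.size <= len(data):
        header = Header._make(HEADER.unpack_from(data, offset))
        offset += HEADER.size
        flat = struct.unpack_from("<%di" % (2 * header.n), data, offset)
        offset += 8 * header.n      # each point is two 4-byte integers
        # Pair up x (longitude) and y (latitude), both in microdegrees.
        yield header, list(zip(flat[0::2], flat[1::2]))
```

Because the header carries the point count n, the whole file can be walked block by block without any separate index.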

Processing RANGS Files

Assigning polygon parts to 1° cells, reconnecting these parts to form polygons, and nesting the small polygons involved a number of processing steps. First, for each 1° x 1° cell, the rim segments of all polygons inside the cell were determined. (A given polygon can lie entirely inside a cell, or it can enter and exit the same cell once or several times; it can touch the cell border with one or more vertices without really crossing it; and it can cross the cell without having a single vertex within the cell.) For polygons passing through more than one cell, all entry and exit vertices on the cell border were computed and stored. The address and the number of points of each gshhs(?).rim sequence inside the cell were stored (the number can be zero). For polygons not crossing the cell's border, the first polygon point was used as both entry and exit vertex, its address and the number of polygon points were noted, and a "closed-loop" flag was set, indicating that the entry/exit vertex need not be on the cell border. For every segment described in this way by entry point, rim segment address, rim segment length, and exit point, the clockwise or counterclockwise rotation of the entire original polygon was computed and flagged. Together with the parity of the polygon level index, this flag tells whether land or water is found on the right-hand or the left-hand side along the sequence.
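The relation between rotation sense, level parity, and the land side can be made concrete with a small sketch. It uses the planar shoelace formula on (lon, lat) pairs, which is an assumption for illustration: polygons that cross the 0°/360° seam or enclose a pole would need special treatment. The function names are hypothetical.

```python
def is_clockwise(points):
    """Return True if the ring runs clockwise (x east, y north)."""
    twice_area = 0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1   # shoelace term
    return twice_area < 0                 # negative signed area: clockwise

def land_on_right(points, level):
    """Return True if land lies to the right when walking along the ring."""
    # Odd levels (1 land, 3 island in lake) enclose land;
    # even levels (2 lake, 4 pond) enclose water.
    interior_is_land = level % 2 == 1
    # The interior of a clockwise ring is on the right-hand side.
    return is_clockwise(points) == interior_is_land
```

For instance, a counterclockwise lake (level 2) has water on the left and therefore land on the right.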

Second, cell segments had to be concatenated to form small polygons ("cell polygons"), the intersections of the global parent polygons with the cell. For this purpose, sections of the cell border (with or without cell corner points) had to be inserted between the isolated polygon rim segments.

Third, for each cell polygon of a given cell it has to be determined which is inside which other cell polygon. This polygon nesting procedure involves checking all rim points of one polygon to see if they are inside another polygon.
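The nesting test can be sketched with a standard ray-casting point-in-polygon check, as below. This is a generic illustration rather than the RANGS code; the names are hypothetical, and points lying exactly on the other polygon's rim (which does occur for cell polygons sharing the cell border) are not treated specially here.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count rim crossings of a ray running east from point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through y.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def is_nested(inner, outer):
    """Return True if every rim point of inner lies inside outer."""
    return all(point_in_polygon(p, outer) for p in inner)
```

Because the cell polygons of one cell do not cross each other, testing the rim points is enough to decide which polygon contains which.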

Fourth, for each cell it must be determined whether it is at least partially in the world ocean or entirely inside one global polygon. This is cell nesting. For all cell borderlines not crossed by shorelines, a surrounding polygon must be found. This can be done, knowing that the North Pole is ocean and the South Pole is land, by handing over the information of a given cell to its adjacent cells.

Structure of RANGS Files

Two different kinds of RANGS files exist, rangs(?).cat and rangs(?).cel, where the question mark is a placeholder for 0, 1, 2, 3, or 4, denoting the different resolution levels. The Cell Address Table file, rangs(?).cat, contains one long (4-byte) integer for each cell of the globe's surface. Its value is the address of the cell description in the rangs(?).cel file. Note that for all addresses here and in the following, "address" means byte address starting with 1 for the first byte in the file.

If lon (0 to 359) is the longitude and lat (89 to -90) the latitude of the lower left corner of the cell, then addr=1 + 4 * ((89 - lat) * 360 + lon) is the address of this cell in rangs(?).cat, where the pointer into the rangs(?).cel file is looked up. The Cell Extraction List file, rangs(?).cel, contains pointers to all GSHHS shoreline segments belonging to a particular cell and information on how these segments are to be connected to form a closed, simple (non-self-crossing) cell polygon, and how these polygons with different ID numbers are nested inside each other.
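The address formula translates directly into code. In the sketch below, cat_address follows the formula given above; cel_pointer additionally assumes that the table entries are little-endian 4-byte integers and that the whole .cat file has been read into a bytes object. Both function names are hypothetical.

```python
import struct

def cat_address(lon, lat):
    """1-based byte address in rangs(?).cat of the cell whose lower left
    corner is at integer degrees (lon, lat)."""
    if not (0 <= lon <= 359 and -90 <= lat <= 89):
        raise ValueError("lon must be 0..359 and lat must be -90..89")
    return 1 + 4 * ((89 - lat) * 360 + lon)

def cel_pointer(cat_bytes, lon, lat):
    """Look up the address of the cell description in rangs(?).cel."""
    addr = cat_address(lon, lat)
    # Addresses start at 1, Python offsets at 0.
    return struct.unpack_from("<i", cat_bytes, addr - 1)[0]
```

The table is ordered row by row from the northernmost row of cells (lat = 89) to the southernmost (lat = -90), so the cell at lon = 0, lat = 89 has address 1 and the complete table occupies 64,800 x 4 = 259,200 bytes.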

The outermost polygon, to which the pointer of the rangs(?).cat file points, is always the cell border square with 4 vertices and polygon ID -1. Whether it is embedded in ocean or land is specified in its SegmentByte. If a cell does not contain any shoreline, then this cell border is the only description of that cell.

A recursive data structure, PolygonList, is stored at each address in rangs(?).cel to which the rangs(?).cat file points:

PolygonList:=

PolygonByte (=1 or 2)
SegmentLoop
PolygonList
PolygonList
...
PolygonList
PolygonByte (=0)

Here the values of the byte PolygonByte have the following meanings:

0 End_PolygonList
1 Begin_Polygon_CCW (counterclockwise)
2 Begin_Polygon_CW (clockwise)

SegmentLoop describes a single polygon. It is immediately followed by the set of all polygons (PolygonList) it directly encloses (the "holes" in the polygon).

SegmentLoop :=

PolygonID
SegmentByte
SegmentData
SegmentByte
SegmentData
...
SegmentData
SegmentByte (with End_SegmentLoop flagged)

PolygonID is the 4-byte integer gshhs header ID number of the original GSHHS polygon we refer to in gshhs(?).rim.

SegmentByte is a single byte composed as follows:

SegmentByte = DataType + 8 * Clockwise + 16 * Interior

DataType  = 0      End_SegmentLoop
            1..6   (= n) SegmentData is an n-vertex cell border segment
            7      SegmentData is a rim segment

Clockwise = 0      counterclockwise polygon
            1      clockwise polygon, interior is on the right

Interior  = 0      inside is ocean
            1      inside is land
            2      inside is lake on land
            3      inside is island in lake
            4      inside is pond on island
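Since DataType ranges from 0 to 7 and Clockwise is 0 or 1, the three fields occupy separate bit groups of the SegmentByte and can be recovered with shifts and masks. The following sketch shows one way to do this; the function name is hypothetical.

```python
def decode_segment_byte(value):
    """Split a SegmentByte into (DataType, Clockwise, Interior)."""
    data_type = value & 0x07         # lowest three bits: 0..7
    clockwise = (value >> 3) & 0x01  # bit with weight 8
    interior = value >> 4            # remaining bits, weight 16
    return data_type, clockwise, interior
```

For example, a rim segment (7) of a clockwise polygon (8 * 1) enclosing a lake (16 * 2) is stored as 47, which decodes back to (7, 1, 2).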

SegmentData can be one of two possibilities depending on DataType, either a cell border segment or a rim segment. A cell border segment consists of n = 1 to 6 vertices on the cell border. The first vertex and the last vertex are the polygon exit point and the polygon entry point (which may be just one point if the border is touched but not crossed); in between are 0 to 4 corner points of the cell square. Each vertex is explicitly given as a pair (lon, lat) of coordinates in microdegrees (4-byte integers). This SegmentData is 8n bytes long. For a cell border segment, the interior and exterior levels are the same. A rim segment is 8 bytes long and consists of a 4-byte integer segment address in gshhs(?).rim and a 4-byte integer segment length (number of vertices). For a rim segment, the exterior level is one less than the interior level.
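Putting the pieces together, a reader for the recursive PolygonList structure might look like the sketch below. It follows the grammar as written above and is not taken from the RANGS software. It assumes little-endian 4-byte integers, assumes that the closing SegmentByte (DataType 0) carries no SegmentData, and ignores the Clockwise and Interior bits of each SegmentByte for brevity. The function name and the dictionary layout are hypothetical.

```python
import struct

def parse_polygon_list(data, pos=0):
    """Parse one PolygonList at 0-based offset pos; return (tree, next pos)."""
    begin = data[pos]
    if begin not in (1, 2):
        raise ValueError("expected Begin_Polygon byte (1 or 2)")
    pos += 1
    (polygon_id,) = struct.unpack_from("<i", data, pos)
    pos += 4
    segments = []
    while True:                       # SegmentLoop
        data_type = data[pos] & 0x07
        pos += 1
        if data_type == 0:            # End_SegmentLoop
            break
        if data_type == 7:            # rim segment: address and length
            address, length = struct.unpack_from("<2i", data, pos)
            segments.append(("rim", address, length))
            pos += 8
        else:                         # cell border segment, n = data_type
            flat = struct.unpack_from("<%di" % (2 * data_type), data, pos)
            segments.append(("border", list(zip(flat[0::2], flat[1::2]))))
            pos += 8 * data_type
    children = []
    while data[pos] != 0:             # nested PolygonLists ("holes")
        child, pos = parse_polygon_list(data, pos)
        children.append(child)
    pos += 1                          # consume End_PolygonList
    tree = {"clockwise": begin == 2, "id": polygon_id,
            "segments": segments, "children": children}
    return tree, pos
```

To read a cell, one would start the parser at the address taken from rangs(?).cat, reduced by 1 because addresses count from 1. Rim segments would then be resolved by reading the stated number of vertices from gshhs(?).rim at the stated address.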

Below is a RANGS cell example, in this case a cell with multiple entry/exit points. The location is Sealand, Denmark, at 11° east longitude, 55° north latitude. Multiple islands appear as "holes" in the cell border polygon at the same level. The same global polygon (Sealand, ID 105) appears as 2 cell polygons (Sealand Main and Sealand Halsnaes). The sequence of rim points along the cell border is, in general, different from the sequence of appearance of the same points along the global polygon.

Information about WVS data can be found on the Web (http://crusty.er.usgs.gov/coast/wvs.html and http://www.ngdc.noaa.gov/mgg/fliers/93mgg01.html) as can information about GSHHS files (http://www.soest.hawaii.edu/wessel/gshhs/gshhs.html). RANGS can be downloaded from http://www.io-warnemuende.de/public/phy/rfeistel/index.htm.

Acknowledgment

This paper contributes to the European Union research project ENVIFISH (Environmental Conditions and Fluctuations in Distribution of Small Pelagic Fish Stocks), [http://www.sea.uct.ac.za/research/envi.htm], contract IC18-CT98-0329.

Author

Rainer Feistel, Baltic Sea Research Institute, Seastrasse 15, D-18119 Warnemünde, Germany; E-mail: rainer.feistel@io-warnemuende.de.

References

Douglas, D. H., and T. K. Peucker, Algorithms for the reduction of the number of points required to represent a digitized line or its caricature, Can. Cartogr., 10, 112-122, 1973.

Soluri, E. A., and V. A. Woodson, World vector shoreline, Int. Hydrogr. Rev., LXVII, 27-36, 1990.

U.S. Defense Mapping Agency (DMA), Defense Mapping Agency Product Specifications for World Vector Shoreline, PS/2GC/030, 1st ed., DMA Hydrographic/Topographic Center, Washington, D.C., May 1988.

Wessel, P., and W. H. F. Smith, A global self-consistent, hierarchical, high-resolution shoreline database, J. Geophys. Res., 101, 8741-8743, 1996.


