
shptree: Shapefile Spatial Indexing


The official shptree documentation is at: http://www.mapserver.org/utilities/shptree.html


The shptree utility creates a spatial index file (.qix) for a single shapefile. MapServer takes advantage of this spatial index to more rapidly identify the shapes that are needed to render a particular map extent.

Note that the .qix file does not provide any indexing for attribute queries, and is unrelated to the shapefile tile index created by tile4ms.

The shptree utility is distributed with the MapServer source tree, and is built by default when building the other MapServer Utilities (at least on Unix).

Usage

Syntax:
    shptree <shpfile> [<depth>] [<index_format>]
Where:
 <shpfile> is the name of the .shp file to index.
 <depth>   (optional) is the maximum depth of the index
           to create, default is 0 meaning that shptree
           will calculate a reasonable default depth.
 <index_format> (optional) is one of:
           NL: LSB byte order, using new index format
           NM: MSB byte order, using new index format
       The following old format options are deprecated:
           N:  Native byte order
           L:  LSB (intel) byte order
           M:  MSB byte order
       The default index_format on this system is: NL

In its simplest form, it is sufficient to run the shptree utility once for each shapefile you wish to have spatially indexed. Each run will create a .qix file with the same basename as the shapefile.
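Running shptree once per shapefile is easy to script. The sketch below is only an illustration and is not part of MapServer: it assumes shptree is on the PATH, and the helper names build_shptree_commands and index_directory are hypothetical. It follows the positional syntax shown above, so a depth is always passed and the index format is appended only when requested.

```python
import subprocess
from pathlib import Path


def build_shptree_commands(directory, depth=0, index_format=None):
    """Return one shptree argument list per .shp file found in `directory`."""
    commands = []
    for shp in sorted(Path(directory).glob("*.shp")):
        # Arguments are positional: <shpfile> [<depth>] [<index_format>]
        cmd = ["shptree", str(shp), str(depth)]
        if index_format is not None:
            cmd.append(index_format)
        commands.append(cmd)
    return commands


def index_directory(directory, depth=0, index_format=None):
    """Run shptree for each shapefile; raises if a run exits non-zero."""
    for cmd in build_shptree_commands(directory, depth, index_format):
        subprocess.run(cmd, check=True)
```

For example, index_directory("data") would index every .shp file in a (hypothetical) data directory with the default depth of 0.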

The spatial index breaks the total shapefile area into subregions using a quad-tree approach, recursively splitting the area into four sub-quadrants until each lowest-level area has only a few shapes in it (possibly 8; the exact threshold is not confirmed here). The depth argument may be used to control the maximum depth to which the area will be subdivided.
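The idea can be sketched in a few lines of Python. This is a simplified illustration of the quad-tree approach described above, not MapServer's actual implementation: the names Node, build and search, the threshold MAX_SHAPES = 8, and the rule of keeping boundary-straddling shapes in the parent node are all assumptions made for the example. Shapes are represented only by their bounding boxes.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)
MAX_SHAPES = 8  # assumed per-node threshold, for illustration only


@dataclass
class Node:
    bounds: Box
    shape_ids: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def contains(outer: Box, inner: Box) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def intersects(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def quadrants(b: Box) -> List[Box]:
    minx, miny, maxx, maxy = b
    midx, midy = (minx + maxx) / 2, (miny + maxy) / 2
    return [(minx, miny, midx, midy), (midx, miny, maxx, midy),
            (minx, midy, midx, maxy), (midx, midy, maxx, maxy)]


def build(bounds: Box, shapes, max_depth: int, depth: int = 0) -> Node:
    """shapes is a list of (shape_id, bounding_box) pairs."""
    node = Node(bounds)
    # Stop splitting when the node is small enough or the depth limit is hit.
    if len(shapes) <= MAX_SHAPES or depth >= max_depth:
        node.shape_ids = [i for i, _ in shapes]
        return node
    remaining = list(shapes)
    for quad in quadrants(bounds):
        inside = [s for s in remaining if contains(quad, s[1])]
        if inside:
            remaining = [s for s in remaining if not contains(quad, s[1])]
            node.children.append(build(quad, inside, max_depth, depth + 1))
    # Shapes that straddle a quadrant boundary stay in this node.
    node.shape_ids = [i for i, _ in remaining]
    return node


def search(node: Node, query: Box) -> List[int]:
    """Return candidate shape ids from every node intersecting the query."""
    if not intersects(node.bounds, query):
        return []
    hits = list(node.shape_ids)
    for child in node.children:
        hits.extend(search(child, query))
    return hits
```

Note that search returns candidates only: every shape stored in a node whose area intersects the query box is reported, so the caller would still test each candidate against the actual query extent. The benefit is that whole subtrees outside the extent are skipped without looking at their shapes.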

The N/L/M/NL/NM argument allows control of the format of the .qix file: it selects old or new format, and what byte order to generate. This page originally described the default as "old style" indexes in the byte order of the current system, but the usage message above reports NL (new format, LSB byte order) as the default, so check the usage output of your own build. Generally the spatial indexes should be generated with the byte order of the system on which MapServer will be run (LSB for Intel, MSB for Solaris, Irix) to reduce byte swapping overhead.
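When the index is generated on the same machine that will run MapServer, the matching new-format flag can be derived from the host byte order. A minimal sketch, assuming Python is available on that machine; native_index_format is a hypothetical helper name, not a MapServer function.

```python
import sys


def native_index_format():
    """Return the new-format flag (NL or NM) matching this machine's byte order."""
    return "NL" if sys.byteorder == "little" else "NM"
```

The returned value can be passed as the third shptree argument, after an explicit depth.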

Old vs. New Format

At this time I am not sure what the differences are between old and new formats. The new format seems to be smaller in the cases I have tried. Hopefully the developers (Steve Lime / Carl Anderson) will add notes here.

The new format is essentially the old format with a header added to explicitly indicate the byte order of the index, and a version number. This was in preparation for as-yet unsubmitted work to add an integer-based index in addition to the current (double) floating-point index.

Eventually the old format would become deprecated then abandoned.
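The History section below notes that the new-format header carries the tag "SQT". Assuming that tag occupies the first three bytes of the file (an assumption; the exact header layout is not documented on this page), a quick check for a new-format index might look like the following. The helper name has_sqt_tag is hypothetical.

```python
def has_sqt_tag(path):
    """Return True if the file starts with the 3-byte tag b"SQT"."""
    with open(path, "rb") as f:
        return f.read(3) == b"SQT"
```

A False result only means the tag is absent; it does not by itself prove the file is a valid old-format index.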

Related Utilities

The shptreevis utility can be used to generate a shapefile showing the quadtree stored in a particular .qix file. For instance, the following would generate a shapefile (quad.shp) showing the quads into which the input file (abc.shp) was split. This is mainly used to get a sense of how the area is being subdivided, and for debugging.

shptreevis abc.shp quad.shp

The shptreetst utility allows executing a query against a spatial index and dumps some information useful for debugging how the search worked. It is mainly useful for testing.

Usage: shptreetst shapefile {minx miny maxx maxy}

Example: shptreetst abc.shp -104 32 -103 33

History

The original shapetree (in memory) indexing work was done by Frank Warmerdam within the ShapeLib distribution. It was adapted for MapServer, and improved by Steve Lime and Carl Anderson. I am not sure which of them was responsible for the format rewrite (the new format).

The original shptree only did native order indexes. Carl was trying to use MapServer in a mixed server environment and added some byte swapping code. The old code was adjusted to guess byte order without changing the file format. The new format added a header with a tag "SQT", a byte order, and a format version number to allow future changes to be correctly detected.

At that time Carl also was having problems generating shptree indexes on machines with limited memory. The integer format is an attempt to reduce the index size and to speed up the indexing operation.

Last modified on Jan 21, 2011 12:56:43 PM