Name

rtree — build and query R-tree spatial indices

Synopsis

#include <rtree.h>
    
rtree_t* rtree_new( size_t dim,
  unsigned flags);
 
void rtree_destroy( rtree_t *rtree);
 
rtree_t* rtree_clone( const rtree_t *rtree);
 
bool rtree_empty( const rtree_t *rtree);
 
rtree_height_t rtree_height( const rtree_t *rtree);
 
int rtree_envelope( const rtree_t *rtree,
  rtree_coord_t *rect);
 
int rtree_add_rect( rtree_t *rtree,
  rtree_id_t id,
  const rtree_coord_t *rect);
 
int rtree_update( rtree_t *rtree,
  rtree_update_t *f,
  void *context);
 
int rtree_search( const rtree_t *rtree,
  const rtree_coord_t *rect,
  rtree_search_t *f,
  void *context);
 
int rtree_csv_write( const rtree_t *rtree,
  FILE *stream);
 
rtree_t* rtree_csv_read( FILE *stream,
  size_t dim,
  unsigned flags);
 
int rtree_json_write( const rtree_t *rtree,
  FILE *stream);
 
rtree_t* rtree_json_read( FILE *stream);
 
int rtree_bsrt_write( const rtree_t *rtree,
  FILE *stream);
 
rtree_t* rtree_bsrt_read( FILE *stream);
 
int rtree_postscript( const rtree_t *rtree,
  const rtree_postscript_t *options,
  FILE *stream);
 
bool rtree_identical( const rtree_t *rtree0,
  const rtree_t *rtree1);
 
size_t rtree_bytes( const rtree_t *rtree);
 
size_t rtree_dims( const rtree_t *rtree);
 
size_t rtree_page_size( const rtree_t *rtree);
 
size_t rtree_node_size( const rtree_t *rtree);
 
size_t rtree_rect_size( const rtree_t *rtree);
 
size_t rtree_branch_size( const rtree_t *rtree);
 
size_t rtree_branching_factor( const rtree_t *rtree);
 
double rtree_unit_sphere_volume( const rtree_t *rtree);
 
const char* rtree_strerror( int err);
 

DESCRIPTION

On including the header rtree.h the following functions become available:

rtree_new

The size_t argument specifies the dimension of the (empty) R-tree returned; it should be a positive integer. The flags is the bitwise-or of one or more of

  1. One of RTREE_SPLIT_LINEAR, RTREE_SPLIT_QUADRATIC or RTREE_SPLIT_GREENE; these determine the splitting strategy, the linear strategy is faster to build, the quadratic produces a better-quality R-tree which is faster to query. The Greene variant is a quadratic-time alternative to the Guttman original.

  2. The macro RTREE_NODE_PAGE(n) which sets the nodes-per-page value n. This value can affect performance quite dramatically, particularly build time. A value which is too large would result in an infeasible branching factor for the R-tree and will case the function to error with errno set to EINVAL.

    A value of n of zero is permitted and the default; in this case the function will choose a good value based on heuristics. You may get better performance for your use-case by manual experimentation, but zero is a good place to start.

  3. The RTREE_DEFAULT value which is (and will remain) zero. When used alone it will select the default splitting strategy and the heuristic nodes-per-page.

On error, returns NULL, and may set the errno variable to indicate the cause.

rtree_destroy

Releases the memory used by the rtree_t* which is its argument, as returned by rtree_new or rtree_clone. A NULL argument is acceptable here.

rtree_clone

Creates and returns a deep-copy of the rtree_t* which is its argument. On failure NULL is returned. The copy shares no data with the original and should be destroyed separately.

A typical use case for this function is a sequence of R-trees which share a common part. Once can create the latter, then clone and add the specific part for each instance.

On error, returns NULL, and may set the errno variable to indicate the cause.

rtree_empty

Returns true if the tree has no leaf nodes.

Equivalent to a test that the rtree_height is zero, but does not need to iterate over the whole tree.

rtree_height

Returns the height of the R-tree in the usual mathematical sense. A newly created rtree_t* has height zero (it has a root-node with no descendants).

The return values is of type rtree_height_t, an unsigned integer type.

rtree_envelope

For a non-NULL and non-empty rtree_t*, writes the minimal rectangle which contains all of the rectangles so-far inserted: the bounding box.

If the first argument is NULL or empty, then the function returns non-zero and does not modify the passed rect argument.

rtree_add_rect

Adds a rectangle (or higher-dimensional cuboid) to the R-tree structure specified by the first rtree_t* argument.

The second id argument is an integer used to identify the rectangle. The type rtree_id_t of this integer is determined at compile-time. By default it is the largest unsigned integer type which is no larger than a pointer, so a uint32_t on x86 and uint64_t on the AMD64 architecture. One can, however, override the default at compilation by defining the RTREE_ID_TYPE constant.

It is anticipated that the id will be used as an index for application specific data to which this rectangle relates, but this is entirely at the discretion of the caller, the library makes no use of the value, treating it as payload. In particular, the value may be non-unique and may be zero.

The final argument is (a pointer to) an array of rtree_coord_t, twice as many as the dimension of the R-tree. The floating point values are the lower extents of each dimension, followed by the upper extents of each dimension; these values are copied into the tree. By default the rtree_coord_t type is float, but this can be overridden at compilation by defining the RTREE_COORD_TYPE constant to the desired floating-point type.

rtree_update

Modifies the rtree which is the first argument according to the function f which is the second. This function, which has prototype

int f( rtree_id_t id,
  rtree_coord_t *rect,
  void *context);
 

is applied to all of the rectangles in the tree, it is passed the id as used in the constructor, and may modify the rect which is its second argument. If the return value is non-zero, then the iteration will abort and return RTREE_ERR_USER. Otherwise, once all rectangles have been updated, the spatial index will be updated and queries against the tree should be correct.

Notes

  1. The update of the spatial index does not modify the structure of the tree, it just recalculates the bounding rectangles for the parent branches down to the root (which is much faster than rebuilding the tree).

  2. If the rectangles are only slightly modified, then the resulting tree should be nearly as efficient as the original in spatial searching, but if the rectangles are substantially changed then one could find a collapse in search speed, one should instead rebuild the tree from scratch.

  3. If the function does not return RTREE_OK (in particular, if it returns RTREE_ERR_USER) then subsequent spatial queries against the tree may not be correct.

rtree_search

This function searches the rtree_t* which is its first argument for rectangles which intersect the the second rect argument. For each found, the callback function f, whose prototype is

int f( rtree_id_t id,
  void *context);
 

is called. The id is the rectangle identifier as given in rtree_add_rectangle, and the context is the void* passed as the final argument to rtree_search.

The function f should return zero to continue the search, non-zero to terminate it. In the latter case it is this value which rtree_search returns.

rtree_csv_write

Writes the ids and rectangles from the rtree_t* in the format described by rtree-csv (5) to the stream. One could then rebuild an equivalent rtree_t* with the rtree_csv_read function (equivalent in that the rebuilt rtree_t* will give the same results on search, but may fail on call to rtree_identical since the rectangle may not have been inserted in the same order).

rtree_csv_read

Reads rectangles from the stream in the format described by rtree-csv (5) writing them to the rtree_t* returned. One also passes the dimension of the tree as size_t; not guessing this value from the CSV file allows the function to handle files with extra columns. The final flags arguments is as described in rtree_new, above.

On error, returns NULL, and may set the errno variable to indicate the cause.

rtree_json_write

Writes the R-tree in the first argument to the stream in the second in the JSON format as described in rtree-json (5) .

rtree_json_read

Reads JSON in the format as described in rtree-json (5) from stream and returns the R-tree that it describes.

Note that the read will fail if the JSON input was created on a system with a different page-size, or where the library was compiled with a different float size. In this case one would need to build the R-tree afresh.

On error, returns NULL, and may set the errno variable to indicate the cause. In particular, the value ENOSYS indicates that the library was compiled without JSON support, while EINVAL that the input file is incompatible with the library (a later version, of differing page-size and so on).

rtree_bsrt_write

Writes the R-tree in the first argument to the stream in the second in the BSRT (binary serialised R-tree) format as described in rtree-bsrt (5) .

rtree_bsrt_read

Reads BSRT in the format as described in rtree-bsrt (5) from stream and returns the R-tree that it describes.

Note that the read will fail if the BSRT input was created on a system with a different page-size, or where the library was compiled with a different float size. In this case one would need to build the R-tree afresh.

On error, returns NULL, and may set the errno variable to indicate the cause. In particular, the value ENOSYS indicates that the library was compiled without BSRT support, while EINVAL that the input file is incompatible with the library (a later version, of differing page-size and so on).

rtree_postscript

Plots the rtree argument as PostScript to the specified stream. The appearance of the plot is controlled by the options argument.

rtree_identical

Two R-trees are identical if they have the same nodes, branches and leaves in the same order. This function returns a Boolean indicating whether its arguments are identical or not. It is mainly intended for tests (of rtree_clone and the serialisation functions for example).

rtree_bytes

The memory allocated by the R-tree, or zero if the argument is NULL.

rtree_dims

The dimension of the R-tree, or zero if the argument is NULL.

rtree_page_size

The hardware page-size used by R-tree, typically 1024 on 32-bit hardware, 4096 on 64-bit, or zero if the argument is NULL.

rtree_node_size

The size in bytes of an R-tree node, or zero if the argument is NULL.

rtree_rect_size

The size in bytes of an R-tree rectangle, or zero if the argument is NULL.

rtree_branch_size

The size in bytes of an R-tree branch, or zero if the argument is NULL.

rtree_branching_factor

The number of child-nodes for each parent-node in the R-tree (see the notes in rtree_new ), or zero if the argument is NULL.

rtree_unit_sphere_volume

The volume of the unit sphere in the dimension of the R-tree, or zero if the argument is NULL.

rtree_strerror

For library functions which return an integer, zero represents success, non-zero an error. A readable description of the error can be obtained by passing the value to this function. The returned string is a constant: it should not be freed.

The one exception to this is rtree_search, since that returns a user-defined value.

AUTHOR

J.J. Green, .