Rgeo — A Go package for basic, fast, local reverse geocoding

Rgeo is a fast, simple solution for local reverse geocoding. Rather than relying on external software or online APIs, rgeo packages all of the data it needs into your binary. This means it will only ever work down to the level of cities, but if that’s all you need then this is the library for you.

Rgeo uses data from naturalearthdata.com. If your coordinates are going to be near specific borders, I would advise checking the data beforehand (links to which are in the files). If you want to use your own dataset, check out datagen.

Key Features

  • Fast - I haven’t benchmarked other reverse geocoding tools, but on my laptop rgeo runs at under 800ns/op.
  • Local - Rgeo doesn’t require pinging some API, most of which either cost money to use or have severe rate limits.
  • Lightweight - The rgeo repo is 141MB, which is large for a Go package, but compared to the 800GB needed for a full planet install of Nominatim it’s minuscule.

Installation

Download with

go get github.com/sams96/rgeo

and add

import "github.com/sams96/rgeo"

to the top of your Go file to include it in your project.

Usage

r, err := rgeo.New(rgeo.Provinces10, rgeo.Cities10)
if err != nil {
	// Handle error
}

loc, err := r.ReverseGeocode([]float64{141.35, 43.07})
if err != nil {
	// Handle error
}

fmt.Println(loc)
// Output: <Location> Sapporo, Hokkaido, Japan (JPN), Asia

First initialise rgeo using rgeo.New,

func New(datasets ...func() []byte) (*Rgeo, error)

which takes any non-zero number of datasets as arguments. The included datasets are:

  • Countries110 - Just country information, smallest and lowest detail of the included datasets.
  • Countries10 - The same as above but with more detail.
  • Provinces10 - Includes province information as well as country, so can still be used alone.
  • Cities10 - Just city information. If you want provinces and/or countries as well, use one of the above datasets with it.

Once initialised, use ReverseGeocode on the value returned by New to get the location information for your coordinates. See the Go Docs for more information on usage.

func (r *Rgeo) ReverseGeocode(loc geom.Coord) (Location, error)

The input is a geom.Coord, which is just a []float64 with the longitude in the zeroth position and the latitude in the first position (i.e. []float64{lon, lat}). ReverseGeocode returns a Location, which looks like this:

type Location struct {
	// Commonly used country name
	Country string `json:"country,omitempty"`

	// Formal name of country
	CountryLong string `json:"country_long,omitempty"`

	// ISO 3166-1 alpha-2 and alpha-3 codes
	CountryCode2 string `json:"country_code_2,omitempty"`
	CountryCode3 string `json:"country_code_3,omitempty"`

	Continent string `json:"continent,omitempty"`
	Region    string `json:"region,omitempty"`
	SubRegion string `json:"subregion,omitempty"`

	Province string `json:"province,omitempty"`

	// ISO 3166-2 code
	ProvinceCode string `json:"province_code,omitempty"`

	City string `json:"city,omitempty"`
}
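Every field carries an omitempty JSON tag, so fields that the chosen datasets cannot fill are left out when a Location is serialised. The following is a minimal sketch of that behaviour. It declares a local copy of the struct so that it compiles without importing rgeo, and encodeLocation is a hypothetical helper, not part of the package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Location is a local copy of the struct shown above, declared here only so
// the example compiles without importing rgeo.
type Location struct {
	Country      string `json:"country,omitempty"`
	CountryLong  string `json:"country_long,omitempty"`
	CountryCode2 string `json:"country_code_2,omitempty"`
	CountryCode3 string `json:"country_code_3,omitempty"`
	Continent    string `json:"continent,omitempty"`
	Region       string `json:"region,omitempty"`
	SubRegion    string `json:"subregion,omitempty"`
	Province     string `json:"province,omitempty"`
	ProvinceCode string `json:"province_code,omitempty"`
	City         string `json:"city,omitempty"`
}

// encodeLocation is a hypothetical helper (not part of rgeo) that marshals a
// Location to a JSON string. Empty fields are dropped because of omitempty.
func encodeLocation(loc Location) (string, error) {
	b, err := json.Marshal(loc)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Only country-level fields are set, as if a country-only dataset were used.
	loc := Location{
		Country:      "United Kingdom",
		CountryCode2: "GB",
		CountryCode3: "GBR",
		Continent:    "Europe",
	}

	out, err := encodeLocation(loc)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
	// Output: {"country":"United Kingdom","country_code_2":"GB","country_code_3":"GBR","continent":"Europe"}
}
```

An absent key in the JSON therefore means "not known from the loaded datasets" rather than an explicit empty value.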

So, to put it all together:

r, err := rgeo.New(rgeo.Countries110)
if err != nil {
	// Handle error
}

loc, err := r.ReverseGeocode([]float64{0, 52})
if err != nil {
	// Handle error
}

fmt.Printf("%s\n", loc.Country)
fmt.Printf("%s\n", loc.CountryLong)
fmt.Printf("%s\n", loc.CountryCode2)
fmt.Printf("%s\n", loc.CountryCode3)
fmt.Printf("%s\n", loc.Continent)
fmt.Printf("%s\n", loc.Region)
fmt.Printf("%s\n", loc.SubRegion)

// Output: United Kingdom
// United Kingdom of Great Britain and Northern Ireland
// GB
// GBR
// Europe
// Europe
// Northern Europe
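Because ReverseGeocode expects the longitude first, an easy mistake is to pass coordinates in the more familiar latitude, longitude order. One way to guard against that is a small wrapper that takes named arguments and checks their ranges before building the slice. This is only a sketch: lonLat is a hypothetical helper, not something rgeo provides.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// lonLat is a hypothetical helper (not part of rgeo). It takes a latitude and
// longitude in degrees and returns them as []float64{lon, lat}, the order
// described above for ReverseGeocode.
func lonLat(lat, lon float64) ([]float64, error) {
	if math.IsNaN(lat) || math.IsNaN(lon) {
		return nil, errors.New("coordinate is NaN")
	}
	if lat < -90 || lat > 90 {
		return nil, errors.New("latitude out of range [-90, 90]")
	}
	if lon < -180 || lon > 180 {
		return nil, errors.New("longitude out of range [-180, 180]")
	}
	return []float64{lon, lat}, nil
}

func main() {
	// Sapporo: latitude 43.07, longitude 141.35.
	coord, err := lonLat(43.07, 141.35)
	if err != nil {
		fmt.Println("invalid coordinate:", err)
		return
	}
	fmt.Println(coord)
	// Output: [141.35 43.07]
}
```

The range check only catches swapped arguments when the longitude falls outside [-90, 90], so it is a safety net rather than a guarantee.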

Projects using rgeo

Currently the only project that I know of is my own rgeoSrv, which aims to wrap rgeo into a microservice. I am also planning on writing a command line application to apply IPTC location tags to images that already contain geolocation information.

How it works

The data used by rgeo is made from a collection of GeoJSON files acquired from Natural Earth Data, which are packaged into Go files by datagen. The Go files contain functions which return byte slices that contain the base64 encoded, gzipped GeoJSON files. They’re packaged into functions because that seems to be the only way to have the Go compiler ignore them if they aren’t used (so your program isn’t inflated by huge data files that you aren’t using).
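The packaging scheme can be illustrated with a round trip through gzip and base64 using only the standard library. This is a sketch under stated assumptions, not datagen's actual code: pack, unpack and exampleDataset are hypothetical names, the standard base64 alphabet is assumed, and the dataset is built at run time here whereas the generated files would embed the encoded bytes directly.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"fmt"
	"io"
)

// pack gzips raw and then base64-encodes the compressed bytes.
func pack(raw []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(raw); err != nil {
		return nil, err
	}
	// Close flushes the remaining compressed data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	enc := make([]byte, base64.StdEncoding.EncodedLen(buf.Len()))
	base64.StdEncoding.Encode(enc, buf.Bytes())
	return enc, nil
}

// unpack reverses pack: base64-decode, then gunzip.
func unpack(packed []byte) ([]byte, error) {
	dec := base64.NewDecoder(base64.StdEncoding, bytes.NewReader(packed))
	zr, err := gzip.NewReader(dec)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// exampleDataset has the same shape as the functions passed to New
// (func() []byte). A function that is never referenced can be dropped from
// the final binary, which is the point of wrapping the data in one.
func exampleDataset() []byte {
	packed, err := pack([]byte(`{"type":"FeatureCollection","features":[]}`))
	if err != nil {
		panic(err)
	}
	return packed
}

func main() {
	raw, err := unpack(exampleDataset())
	if err != nil {
		fmt.Println("unpack failed:", err)
		return
	}
	fmt.Println(string(raw))
	// Output: {"type":"FeatureCollection","features":[]}
}
```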

New takes the datasets, decodes the data and parses the GeoJSON. It creates an Rgeo struct which contains an s2 shape index, a map to store the location information of each area and an s2 contains point query. The shape index contains s2 polygons for each of the areas in the given dataset, and the contains point query is what goes from the input coordinates to a shape. The map is then used to get the location information from the shape.

One thing I discovered while working on this is that initialising s2 loops and polygons is very slow, and there’s no way to store them that avoids running all the validation code when they are loaded. I found that going from creating the s2 types every time to using geom’s IsPointInRing function yields a ~100x speed increase. I used s2 types because I wrote rgeo to be used many times in a program rather than a few, and s2’s contains point query is very fast (although I don’t have any benchmarks to hand).
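For readers unfamiliar with point-in-ring tests, here is a minimal sketch of the general idea using the classic ray casting (even-odd) rule. It is not geom’s IsPointInRing implementation and not what rgeo runs: pointInRing is a hypothetical function that works on flat x, y coordinates, ignores the curvature of the Earth and the antimeridian, and makes no promise about points lying exactly on the boundary.

```go
package main

import "fmt"

// pointInRing reports whether p = {x, y} lies inside the closed ring, using
// the even-odd ray casting rule: cast a ray from p towards +x and count how
// many ring edges it crosses. An odd count means p is inside.
func pointInRing(p []float64, ring [][]float64) bool {
	inside := false
	// j always trails i by one vertex, wrapping around at the start.
	for i, j := 0, len(ring)-1; i < len(ring); j, i = i, i+1 {
		xi, yi := ring[i][0], ring[i][1]
		xj, yj := ring[j][0], ring[j][1]

		// The edge can only be crossed if its endpoints are on opposite
		// sides of the horizontal line through p.
		if (yi > p[1]) != (yj > p[1]) {
			// x coordinate where the edge meets that horizontal line.
			xCross := (xj-xi)*(p[1]-yi)/(yj-yi) + xi
			if p[0] < xCross {
				inside = !inside
			}
		}
	}
	return inside
}

func main() {
	// A 10 by 10 square with one corner at the origin.
	square := [][]float64{{0, 0}, {10, 0}, {10, 10}, {0, 10}}

	fmt.Println(pointInRing([]float64{5, 5}, square))
	fmt.Println(pointInRing([]float64{15, 5}, square))
	// Output:
	// true
	// false
}
```

A test like this needs no setup per polygon, which is why it suits one-off lookups, whereas an index costs more to build but pays off over many queries.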