Opened 5 years ago

Closed 4 years ago

Last modified 4 years ago

#2918 closed patch (fixed)

Use GeographicLib functions for ST_Azimuth, ST_Distance, ST_Project, ST_Length, ST_Perimeter and ST_Area

Reported by: Mike Taves Owned by: pramsey
Priority: critical Milestone: PostGIS 2.2.0
Component: postgis Version: trunk
Keywords: Cc:

Description

I've been testing the direct and inverse functions in geodesic.h, and they look pretty good! This stand-alone C library which is detachable from GeographicLib was recently published by Karney (2013) with an MIT/X11 License. The peer-reviewed algorithms are "accurate, robust, and fast solutions to the direct and inverse geodesic problems".

The attached patch is only a start, with a USE_GEODESIC def at the top of lwspheroid.c to toggle the library's use, or the existing library use. Yet to come are more regression tests, speed testing (I'd like help here), and any related discussion on if this is a good idea. Also, there are polygon area functions that I'll look at further only if things are looking good.

The library is actually really easy to use. All angular units are in degrees, and linear units are in metres.

  • geod_inverse : takes two points, and returns the forward and reverse azimuths (these are different!), and the distance between the two points. Used in spheroid_distance and spheroid_direction.
  • geod_direct : takes a point, a forward azimuth, and distance, and returns the second point and the reverse azimuth. Used in spheroid_project.

Although I haven't written the extra regression tests yet, the library solves #2913 for inverse solutions to near-antipodal points.

The patch passes all of the existing regression tickets.

Patch source at GitHub pull request:

References

  1. F. F. Karney, Algorithms for geodesics, J. Geodesy 87(1), 43–55 (Jan. 2013); DOI:10.1007/s00190-012-0578-z Addenda

Change History (28)

comment:1 Changed 5 years ago by robe

Two questions to mind?

1) Have you done any speed tests at all between the two? 2) Does the code make any assumptions about spatial projection? One of the changes made in 2.1 was support for any spheroidal projection (which though I haven't tested) would allow supporting work on Mars or for gaming environments that have a spheroidal model but not an earthly one. I wouldn't want to lose that feature.

comment:2 Changed 5 years ago by pramsey

The ticket #2618 also implicitly does this, since the geodesic functions in Proj 4.9 were also copied from geographiclib I believe.

comment:3 Changed 5 years ago by Mike Taves

I haven't done any tests yet, but I'll design a few. Basically they would find the azimuth and distances from a cross product of point grids over a few extents: global, continental, and regional. I'll try a few latitudes to see if one algorithm converges quicker than another, because both old and new algorithms are iterative.

The implementation works on different spheroid definitions, such as Mars, e.g. comparing equatorial antipodes:

SELECT ST_Distance_Spheroid(A, B, earth)/1000.0 AS earth_km, ST_Distance_Spheroid(A, B, mars)/1000.0 AS mars_km
FROM (
  SELECT 'POINT(-90 0)'::geometry AS A, 'POINT(90 0)'::geometry AS B,
    'SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]]'::spheroid AS earth,
    'SPHEROID["Mars_2000_IAU_IAG",3396190.0,169.89444722361179]'::spheroid AS mars
) AS f;

     earth_km     |     mars_km      
------------------+------------------
 20003.9314586254 | 10638.0685065677
(1 row)

One limitation, however, is that the geography type has a hard-coded spheroid, so this doesn't work:

INSERT into spatial_ref_sys (srid, auth_name, auth_srid, proj4text, srtext) values ( 949900, 'iau2000', 49900, '+proj=longlat +a=3396190 +b=3376200 +no_defs ', 'GEOGCS["Mars 2000",DATUM["D_Mars_2000",SPHEROID["Mars_2000_IAU_IAG",3396190.0,169.89444722361179]],PRIMEM["Greenwich",0],UNIT["Decimal_Degree",0.0174532925199433]]');
SELECT degrees(ST_Azimuth(A, B)), ST_Distance(A, B)/1000.0 AS distance_km
FROM (
  SELECT
    'SRID=949900;POINT (-90 0)'::geography AS A,
    'SRID=949900;POINT (90 0)'::geography AS B
) AS f;

Gives Earth numbers, and not Martian numbers.

comment:4 Changed 5 years ago by Mike Taves

An that last SQL result was correct for Earth, but not Mars:

 degrees |   distance_km    
---------+------------------
       0 | 20003.9314586254
(1 row)

comment:5 Changed 5 years ago by Mike Taves

Performance tests are looking just over twice as slow for the geodesic C functions from GeographicLib compared to the current algorithms in PostGIS, equally for ST_Azimuth and ST_Distance which call the same function geod_inverse. I've also tested modifications to geodesic.c's #define GEOGRAPHICLIB_GEODESIC_ORDER from 6 to 3, which speeds things up with less precision, but they are still less than twice as slow as the existing functions.

The geodesic length and area functions are not tested to compare, but I'll investigate only if PSC think it is worth doing. There could be improvements to speed by changing internal coordinate units for GEOGRAPHIC_POINT in degrees, rather than constantly flipping them to radians (I'm yet to find any geodesic tool that uses radians), but this would require broader internal changes.

Some of the commands used to do some testing are provided here:

-- Sanity check to see if PostGIS's or GeographicLib's functions are being used
SELECT degrees(ST_Azimuth(A, B)), ST_Distance(A, B)/1000.0 AS distance_km
FROM (
  SELECT
    'POINT (-90 0)'::geography AS A,
    'POINT (90 0)'::geography AS B
) AS f;

CREATE OR REPLACE FUNCTION ST_GriddedPoints(
        nlon integer, nlat integer,
        lonsize float8, latsize float8,
        lon0 float8, lat0 float8)
    RETURNS SETOF geography AS
'SELECT ST_MakePoint(i / ($1::float8 - 1) * $3 + $5, j / ($2::float8 - 1) * $4 + $6)::geography
FROM
  generate_series(0, $1 - 1) AS i,
  generate_series(0, $2 - 1) AS j' LANGUAGE sql IMMUTABLE STRICT;

CREATE TEMP TABLE pts (gid serial primary key, geog geography(Point,4326));

TRUNCATE pts;
INSERT INTO pts(geog)
SELECT g
FROM ST_GriddedPoints(31, 21, 10, 10, 0, 0) AS g;

SELECT ST_Azimuth(p1.geog, p2.geog), ST_Distance(p1.geog, p2.geog)
INTO UNLOGGED foo
FROM pts p1, pts p2;
drop table foo;

Also, I've only tested this on a Debian x64-based computer. I haven't tested on any other platform, but would be keen to hear the comparisons.

comment:6 Changed 5 years ago by pramsey

The geodesic area in the current geography is quite brittle with many known failure modes (poles? etc). An improved area calculation, even a slower one, would be good to have. I'm less concerned with near-antipodal lines mostly because I want to discourage them in general (since actually antipodal lines are meaningless). How near to the antipodes do we have to get before Vincenty starts to fall apart?

comment:7 Changed 5 years ago by Mike Taves

Looking at the existing Vincenty methods in PostGIS, it looks like "near antipodal" is just under 1 mm distance away from the true antipode for ST_Distance to provide reliable numbers, and about 100 m distance for ST_Azimuth to provide reliable numbers. So it seems that "near antipode" is a corner case.

More importantly, I'm yet to compare the results between existing Vincenty's to Karney's methods. It is claimed that Vincenty's is less accurate, but I'd like to quantify this figure. The round-off errors are less than 15 nanometers. Also, the benchmarking done in Karney (2013) is essentially the same what I've found for the inverse methods (ST_Distance, ST_Azimuth): compare 2.34 μs to 1.34 μs for Vincenty's, or just over twice as slow. So there were never claims that GeographicLib were faster than existing methods. However, for things like ST_Segmentize for straight lines, each additional point can be found in 0.37 μs. Section 7 of the paper describes the implementation, and is a good and short read that is highly relevant for this ticket.

I'll set-up ptarray_area_spheroid to do something similar to the `planimeter.c` example to get a good idea of the area calculation abilities. The error in the area calculation in GeographicLib is about 0.1 m².

comment:8 Changed 5 years ago by Mike Taves

ST_Area has been updated to use GeographicLib, changes in GitHub PR.

We can now do area calculations like this:

select ST_Area('POLYGON((0 0, 1 1, 1 0, 0 0))'::geography);
-[ RECORD 1 ]-------------
st_area | 6154854786.72143

whereas with trunk, it cannot do this:

ERROR: ptarray_area_spheroid: cannot handle ptarray that crosses equator

I've checked areas for various countries with this shapefile, and compared old and new results

Picking a few countries with high/mid/low differences of areas, in terms of percent:

  • -0.48811% MB Martinique
  • -0.15857% SC Saint Kitts and Nevis
  • -0.00061% CA Canada
  • -0.00005% CH China
  • 0.00000% BR Brazil
  • 0.00000% RS Russia
  • 0.00015% US United States
  • 0.23673% AV Anguilla
  • 0.34825% BD Bermuda

And as a difference of areas in km²:

  • -78.9457 km² ML Mali
  • -59.2028 km² CA Canada
  • -4.7327 km² CH China
  • 0.0000 km² BR Brazil
  • 0.00000 km² RS Russia
  • 14.3232 km² US United States
  • 48.1370 km² ET Ethiopia
  • 50.1431 km² NI Nigeria

So this means that both results are <1% apart, but can be tens of km² difference for larger nations. Countries like Brasil and Russia, however, happen to calculate very similar areas.

comment:9 Changed 5 years ago by Mike Taves

More testing with ST_Area on with this more detailed shapefile shows very different results of comparisons of area from both difference of areas in km² and in %. Here are some samples of these results:

  • -0.2264 km² -0.01968% MB Martinique
  • -4.7335 km² -0.00038% ML Mali
  • -3.9577 km² -0.00020% MX Mexico
  • -2.2862 km² -0.00002% US United States
  • 0.0000 km² 0.00000% AY Antarctica
  • 0.0022 km² 0.00075% SC Saint Kitts and Nevis
  • 0.5005 km² 0.00001% CA Canada
  • 1.1292 km² 0.00001% CH China
  • 6.5751 km² 0.00056% NG Niger
  • 6.6234 km² 0.00156% YM Yemen
  • 0.0127 km² 0.01337% AV Anguilla
  • 0.1483 km² 0.02382% IM Isle of Man

As with the earlier simplified world borders, the errors have a mean close to zero, but they are different for the same countries in simplified and detailed datasets, so I don't think there is much bias of latitude in the Vincenty's vs. Karney's methods. However, it does indicate that there are errors in one or both calculations, which is expected. I can't say which errors are worse, unless I can find a slower/more precise geodesic method to test against.

Using the detailed world borders mentioned above, I've run some performance tests, which similarly shows that Karney's methods are about twice as slow as Vincenty's. E.g.:

SELECT ST_Area(geog) INTO UNLOGGED foo FROM world;

which is (on typical average) 539 ms for Vincenty's and 1144 ms for Karney's method.

comment:10 Changed 5 years ago by robe

Milestone: PostGIS 2.2.0

comment:11 Changed 5 years ago by darkblueb

I have built the GeographicLib on Ubuntu 14.04 64bit, Postgres 9.3; and applied the supplied patch to 2.2trunk.

The patch line 23 shows adding geodesic.o to NM_OBJS, but the build library shows

make[1]: Entering directory `/pinedisk/shared/srcs_thuban1/GeographicLib-1.37/src'
make[2]: Entering directory `/pinedisk/shared/srcs_thuban1/GeographicLib-1.37/src'
 /bin/mkdir -p '/usr/local/lib'
 /bin/bash ../libtool   --mode=install /usr/bin/install -c   libGeographic.la '/usr/local/lib'
libtool: install: /usr/bin/install -c .libs/libGeographic.so.13.0.0 /usr/local/lib/libGeographic.so.13.0.0
libtool: install: (cd /usr/local/lib && { ln -s -f libGeographic.so.13.0.0 libGeographic.so.13 || { rm -f libGeographic.so.13 && ln -s libGeographic.so.13.0.0 libGeographic.so.13; }; })
libtool: install: (cd /usr/local/lib && { ln -s -f libGeographic.so.13.0.0 libGeographic.so || { rm -f libGeographic.so && ln -s libGeographic.so.13.0.0 libGeographic.so; }; })
libtool: install: /usr/bin/install -c .libs/libGeographic.lai /usr/local/lib/libGeographic.la
libtool: install: /usr/bin/install -c .libs/libGeographic.a /usr/local/lib/libGeographic.a
libtool: install: chmod 644 /usr/local/lib/libGeographic.a
libtool: install: ranlib /usr/local/lib/libGeographic.a
libtool: finish: PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/sbin" ldconfig -n /usr/local/lib

oddly, when I searched for geodesic.o, I found one here..

/pinedisk/srcsAced/grass_trunk/lib/gis/OBJ.x86_64-unknown-linux-gnu/geodesic.o

onward...

comment:12 Changed 5 years ago by darkblueb

some basic timings, which support the timings above:

9.3
 public | tm3_world_borders         | table    | dbb   | 6568 kB    | 

test_geog=# SELECT ST_Area(geog) INTO UNLOGGED foo FROM tm3_world_borders;
SELECT 246
Time: 648.776 ms

select name, st_area(geog) from tm3_world_borders order by name;
Time: 644.433 ms

##----
9.4b2
 public | tm3_world_borders         | table    | dbb   | 6568 kB    | 

test_geog=# SELECT ST_Area(geog) INTO UNLOGGED foo FROM tm3_world_borders;
SELECT 246
Time: 1615.268 ms

select name, st_area(geog) from tm3_world_borders order by name;
Time: 1601.604 ms

comment:13 in reply to:  11 Changed 5 years ago by Mike Taves

@darkblueb the GeographicLib C++ library is not needed for this feature. The patch includes two files from the C version, which only uses standard library features. There are a few other similar stand-alone libraries that natively reimplement the same algorithms and structures in various languages, such as Python, JavaScript?, and Matlab.

As for GRASS, they appear to have a file geodist.c which has the same name, but was created independently of GeographicLib and is very different.

comment:14 Changed 5 years ago by Mike Taves

Here are some more performance and accuracy tests on Debian x86_64. A suite of 500k points is available here. Descriptions of the fields are provided in the link, and most fields are regarded to be either exact, or accurate to 10-18 .

The GeodTest.dat.gz file can be gunzip'ed and loaded into a PostGIS database:

create unlogged table testgeod (
  id serial primary key,
  lat1 numeric, -- degrees
  lon1 numeric, -- always 0
  azi1 numeric, -- azimuth, in degrees
  lat2 numeric,
  lon2 numeric,
  azi2 numeric,
  s12 numeric,  -- distance, in metres
  a12 numeric,
  m12 numeric,
  "S12" numeric  -- area under the geodesic, in square metres
);

copy testgeod (lat1, lon1, azi1, lat2, lon2, azi2, s12, a12, m12, "S12")
from '/path/to/GeodTest.dat'
delimiter ' ';

alter table testgeod add column geog1 geography(Point,4326);
alter table testgeod add column geog2 geography(Point,4326);
alter table testgeod add column distance double precision;
alter table testgeod add column azimuth double precision;

update testgeod set
  geog1 = ST_MakePoint(lon1, lat1)::geography,
  geog2 = ST_MakePoint(lon2, lat2)::geography,
  azimuth = radians(azi1),
  distance = s12;

Direct geodesic calculations

-- test performance of st_project
drop table if exists foo;
select id, st_project(geog1, s12, azimuth) into unlogged foo from testgeod;

-- test accuracy of st_project
select f.id, lat1, lon1, lat2, lon2,
  st_y(st_project::geometry) as st_project_lat, st_x(st_project::geometry) as st_project_lon,
  s12, azi1, st_distance(st_project, geog2) AS st_distance_error
from foo f
join testgeod g on g.id=f.id
order by st_distance(st_project, geog2) desc
limit 10;

-- percent below 10 nm accuracy
select (count(*)/5e5*100)::numeric(8,4) || '%'
from foo f
join testgeod g on g.id=f.id
where st_distance(st_project, geog2) < 1e-8

For the performance, compare 2490 ms for current, and 3423 ms for patch, or about 1.37 slower.

For precision of distance in comparison with an exact number, compare a maximum of 0.002373693 m to 1.3e-08 m. Looking at % that is less than 10 nm accuracy, compare 24.8922% to 99.9522%. At the micrometer and millimetre scale, the current routines have accuracy for 28.0476% and 96.3272% of the 500k points, respectively, while GeographicLib has 100% accuracy at the 13 nm level. Considering only a small performance hit, there is a gain of precision with GeographicLib for those that care for sub millimetre accuracy.

I'll continue on tomorrow for a similar comparison with the inverse geodesic functions, ST_Azimuth and ST_Distance. I'm still figuring out how to test S12 for ST_Area.

comment:15 Changed 5 years ago by Mike Taves

More performance and precision results ....

Test data

First, the testgeod table has been updated from yesterday.

drop table testgeod;
create unlogged table testgeod (
  id serial primary key,
  lat1 numeric, -- degrees
  lon1 numeric, -- always 0
  azi1 numeric, -- azimuth, in degrees
  lat2 numeric,
  lon2 numeric,
  azi2 numeric,
  s12 numeric,  -- distance, in metres
  a12 numeric,
  m12 numeric,
  "S12" numeric,  -- area under the geodesic, in square metres
  geog1 geography(Point,4326),
  geog2 geography(Point,4326),
  polyg geography(Polygon,4326),  -- Karney (2013) Fig. 1, quadrilateral AFHB
  area double precision,
  azimuth double precision,
  distance double precision,
  e real, -- first eccentricity: circle is zero, one is thin
  antipode_fraction real
);

copy testgeod (lat1, lon1, azi1, lat2, lon2, azi2, s12, a12, m12, "S12")
from '/path/to/GeodTest.dat'
delimiter ' ';

update testgeod set
  geog1 = ST_MakePoint(lon1, lat1)::geography,
  geog2 = ST_MakePoint(lon2, lat2)::geography,
  polyg = ST_MakePolygon(ST_MakeLine(
    ARRAY[ST_MakePoint(lon1, lat1), -- A
          ST_MakePoint(lon1, 0),    -- F
          ST_MakePoint(lon2, 0),    -- H
          ST_MakePoint(lon2, lat2), -- B
          ST_MakePoint(lon1, lat1)])),  -- close with A
  area = abs("S12"),
  azimuth = radians(azi1),
  distance = s12,
  e = case when abs(lat1 - lat2) < abs(lon1 - lon2)
           then sqrt(1 - pow(lat1 - lat2, 2)/pow(lon1 - lon2, 2))
           else sqrt(1 - pow(lon1 - lon2, 2)/pow(lat1 - lat2, 2)) end,
  antipode_fraction = s12 / 20003931.459;

The test column e ("first eccentricity" of the bounds) is used to tell very thin quadrilaterals (1.0) to roundish (0.0). The antipode_fraction is near 0.0 for short geodesics, and 1.0 for ones that are 180°. These are necessary to filter out some end-member tests for fair comparisons with Vincenty's algorithms. The test data has a wide mixture of geodesics, including a number of corner cases like near-antipodes. Please keep this in mind for both interpretations for both timings and analysis of precision distributions, as they are probably biased against Vincenty's.

Direct geodesic calculation (ST_Project) - update

The only change is to avoid casting s12 from numeric, and just use distance (double precision).

drop table if exists f;
select id, st_project(geog1, distance, azimuth) into unlogged f from testgeod;

This boots the performance of each test to 1871 ms vs 2742 ms, or 1.46 times slower for Karney.

The precision results are the same as before, but to describe the % accuracy at a few other arbitrary cutoffs:

select 
  avg(case when diff_distance <= 1e-2 then 100 else 0 end)::numeric(8,2) || '%' as cm,
  avg(case when diff_distance <= 1e-3 then 100 else 0 end)::numeric(8,2) || '%' as mm,
  avg(case when diff_distance <= 1e-6 then 100 else 0 end)::numeric(8,2) || '%' as µm,
  avg(case when diff_distance <= 1e-8 then 100 else 0 end)::numeric(8,2) || '%' as ten_nm
from (
  select st_distance(st_project, geog2) as diff_distance
  from f join testgeod g on g.id=f.id
) as f;
Method 1 cm 1 mm 1 µm 10 nm
Vincenty100.00%96.33%28.05%24.90%
Karney100.00%100.00%100.00%99.98%

Inverse geodesic calculations

Azimuth

Test performance of ST_Azimuth:

drop table if exists f;
select id, st_azimuth(geog1, geog2) into unlogged f from testgeod;

With these tests, Karney's is 3 times faster than Vincenty's, which is the opposite of what I've found earlier, and I think it is due to the dataset which has challenging geodesics that require more iteration with Vincenty's. Compare 11384 ms to 3786 ms for Karney's.

Test accuracy of ST_Azimuth:

select f.id, s12, antipode_fraction, e, degrees(st_azimuth) as st_azimuth_degrees, azi1,
  (((degrees(st_azimuth) - azi1 + 180.0)::numeric % 360.0) - 180.0) as diff_degrees
from f join testgeod g on g.id=f.id
-- where antipode_fraction between 0.004 and 0.996
order by abs(((degrees(st_azimuth) - azi1 + 90.0)::numeric % 180.0) - 90.0) desc
limit 10;

With the near-antipodes included (no filter), compare maximum absolute errors 89.992633730349° to 0.031697236597° error. Removing the comment to apply the filter to remove near-antipodes, compare 0.000006260972° to 0.000000000003°. And the % within precision table:

select 
  avg(case when diff_degrees <= 90 then 100 else 0 end)::numeric(8,3) || '%',
  avg(case when diff_degrees <= 1e+0 then 100 else 0 end)::numeric(8,3) || '%',
  avg(case when diff_degrees <= 1e-3 then 100 else 0 end)::numeric(8,3) || '%',
  avg(case when diff_degrees <= 1e-6 then 100 else 0 end)::numeric(8,3) || '%',
  avg(case when diff_degrees <= 1e-9 then 100 else 0 end)::numeric(8,3) || '%',
  avg(case when diff_degrees <= 1e-12 then 100 else 0 end)::numeric(8,3) || '%',
  avg(case when diff_degrees <= 1e-15 then 100 else 0 end)::numeric(8,3) || '%'
from (
  select
    abs(((degrees(st_azimuth) - azi1 + 180.0)::numeric % 360.0) - 180.0) as diff_degrees
  from f join testgeod g on g.id=f.id
) as f;
Method90°1e-3°1e-6°1e-9°1e-12°1e-15°
Vincenty91.675%89.842%84.959%78.527%73.551%32.790%31.574%
Karney100.000%100.000%99.995%86.328%82.551%61.332%57.107%

Distance

Test performance of ST_Distance:

drop table if exists f;
select id, st_distance(geog1, geog2) into unlogged f from testgeod;

The results are essentially the same for azimuth, since they use the same underlying methods.

For the accuracy:

select f.id, s12, antipode_fraction, e, distance, st_distance,
  (st_distance - distance) as diff_distance
from f join testgeod g on g.id=f.id
-- where antipode_fraction between 0.004 and 0.996
order by abs(st_distance - distance) desc
limit 10;

With the near-antipodes included (no filter), compare maximum absolute errors 133912.525 m to 1.49e-8 m error. Removing the comment to apply the filter to remove near-antipodes, the error of Vincenty's reduces to 0.01953874 m, and Karney's remains the same, at 14.9 nm. And the % within precision table:

select 
  avg(case when diff_distance <= 1e+5 then 100 else 0 end)::numeric(8,3) || '%',
  avg(case when diff_distance <= 1e+3 then 100 else 0 end)::numeric(8,3) || '%',
  avg(case when diff_distance <= 1e+0 then 100 else 0 end)::numeric(8,3) || '%',
  avg(case when diff_distance <= 1e-3 then 100 else 0 end)::numeric(8,3) || '%',
  avg(case when diff_distance <= 1e-6 then 100 else 0 end)::numeric(8,3) || '%',
  avg(case when diff_distance <= 1e-8 then 100 else 0 end)::numeric(8,3) || '%',
  avg(case when diff_distance <= 1e-9 then 100 else 0 end)::numeric(8,3) || '%'
from (
  select
    abs(st_distance - distance) as diff_distance
  from f join testgeod g on g.id=f.id
) as f;
Method 500 km 1 km 1 m 1 mm 1 µm 10 nm 1 nm
Vincenty99.659%97.404%96.400%65.387%19.603%6.425%3.135%
Karney100.000%100.000%100.000%100.000%100.000%99.873%58.909%

Area

The geodesic test data use a the quadrilateral AFHB (Fig. 1, Karney 2013), which is area under the geodesic to the equator. Unfortunately, the current algorithm in PostGIS cannot use points in the same hemisphere ("ERROR: ptarray_area_spheroid: cannot handle ptarray that crosses equator"), so only 325088 tests or 65.02% can be used. Here are the performance tests:

drop table if exists f;
select id, st_area(polyg) into unlogged f from testgeod
where (lat1 > 0 and lat2 < 0) or (lat1 < 0 and lat2 > 0);

The performance is identical, about 1950 ms. I think I need to dig around a bit more to find out why, as I'm seeing really odd behaviour. This Polygon has a NaN area for both unpatched and patched:

select ST_Area('POLYGON((0 78.703946026663,0 0,179.999997913235 0,179.999997913235 -33.0888306884702,0 78.703946026663))'::geography);

Looking at the debug messages, it seems to be happy to calculate the dot product instead of the area

NOTICE:  [lwgeom.c:lwgeom_set_srid:1499] entered with srid=4326
NOTICE:  [lwgeom.c:lwgeom_is_empty:1277] lwgeom_is_empty: got type Polygon
NOTICE:  [lwgeom.c:lwgeom_is_empty:1277] lwgeom_is_empty: got type Polygon
NOTICE:  [lwgeodetic.c:edge_point_side:655] dot product 3.64209202e-08
NOTICE:  [lwgeodetic.c:edge_point_side:655] dot product -3.97170884e-09
NOTICE:  [lwgeom_transform.c:PROJ4SRSCacheDelete:139] deleting projection object (0x7f35d11ba360) with MemoryContext key (0x7f35d11d8c10)

whereas with select ST_Area('POLYGON((0 0, 1 1, 1 0, 0 0))'::geography) the messages in this location are:

NOTICE:  [lwgeom.c:lwgeom_set_srid:1499] entered with srid=4326
NOTICE:  [lwgeom.c:lwgeom_is_empty:1277] lwgeom_is_empty: got type Polygon
NOTICE:  [lwgeom.c:lwgeom_is_empty:1277] lwgeom_is_empty: got type Polygon
NOTICE:  [lwspheroid.c:ptarray_area_spheroid:143] geod_polygon_addpoint 1: 1 1
NOTICE:  [lwspheroid.c:ptarray_area_spheroid:143] geod_polygon_addpoint 2: 0 1
NOTICE:  [lwspheroid.c:ptarray_area_spheroid:143] geod_polygon_addpoint 3: 0 0
NOTICE:  [lwspheroid.c:ptarray_area_spheroid:151] geod_polygon_compute area: -6154854786.72
NOTICE:  [lwgeom_transform.c:PROJ4SRSCacheDelete:139] deleting projection object (0x7f35d11ba360) with MemoryContext key (0x7f35d11d8c10)

so why the dot product instead of ptarray_area_spheroid?

comment:16 Changed 5 years ago by Mike Taves

Well, I figured out the NaN issue, and is detailed in ticket #2932. The issue was due to a shortcut of logic that determines the sphere area if it is deemed too difficult to solve on an ellipsoid of revolution. I've moved the USE_GEODESIC def to liblwgeom/liblwgeom_internal.h, which is referenced in postgis/geography_measurement.c. The updated PR updates this. Should the USE_GEODESIC def appear elsewhere?

The previous attempts to find the area "S12" are mixed, as it is not easy to define the quadrilateral AFHB using a planimeter approach, since the FH geodesics don't necessarily follow the equator (since the shortest path is sometimes across the poles for edge length (1-f)*180 deg or 179.396494 deg). It would require geod_gendirect to properly determine "S12", which isn't useful for ST_Area. So I'm scrapping this test approach for area, and I don't have any more rigid way to test and compare methods except for the previous reports comparing country sizes. I've individually tested polygons with GeographicLib's Planimeter with exact option (-E) show essentially the same area as normal calculations without -E.

However, with respects to GeographicLib's area calculation, it looks much more improved in several ways:

  1. it actually calculates on the ellipsoid of a revolution if the user requests it (i.e. use_spheroid=true, whereas the current implementation silently calculates on a sphere for some brittle cases)
  2. areas can be calculated around poles, and across the equator, etc., and is generally much more robust for different spheroid definitions.

E.g., area of Antarctica:

SELECT ST_Area('POLYGON ((-74 -72.9, -102 -71.9, -102 -74.9, -131 -74.3, -163 -77.5, 163 -77.4, 172 -71.7, 140 -65.9, 113 -65.7, 88 -66.6, 59 -66.9, 25 -69.8, -4 -70, -14 -71, -33 -77.3, -46 -77.9, -61 -74.7, -74 -72.9))'::geography);

previously, this would have silently short-cut to a sphere calculation, regardless of use_spheroid option. The exact calculation is 13376856682207.375 m²

Now what? Possible ideas: redefine ST_Length and ST_Perimeter for geography. Possibly look at ST_Segmentize. And round a few asserts in liblwgeom/cunit/cu_geodetic.c so they work with current and GeographicLib's results, since they are most likely very close, but certainly not exactly the same.

comment:17 Changed 5 years ago by karney

A couple of comments on this ticket:

(1) A table of how the error varies as you alter GEOGRAPHICLIB_GEODESIC_ORDER is given in table 2 of

http://arxiv.org/abs/1102.1215

Unless there's a real permformance hit in a realistic application, I recommend sticking with the default order.

(2) If you want some test areas that GeographicLib can handle OK, but many existing area algorithms have problems with, try the following triangle on a sphere:

    lat lon
    80 0
    0 0
    40 0.00001

This causes problems for some algorithms because they use a formula for the area in terms of the lengths of the sides. For a sphere of radius 6400 km, the area is 3986457.56957 m2

(3) You might want to consider exposing the differential quantities associated with geodesics. Here's the solution of the closest approach problem using Newton's method

https://sourceforge.net/p/geographiclib/discussion/1026620/thread/33ce09e0/

This uses the reduced length and geodesic scale to compute the derivative needed by Newton's method.

comment:18 Changed 5 years ago by karney

There's a bug in the geodesic routines slated for proj 4.9.0. If you add the next vertex with geod_polygon_addedge and if this edge extends more that 180 degrees in longitude, then (probably) the wrong area will be returned. I've posted the bug + sample code illustrating it to

http://trac.osgeo.org/proj/ticket/251

This also includes a patch fixing the problem (which I hope will make into in the released version of proj 4.9.0).

comment:19 Changed 5 years ago by Mike Taves

Summary: Use GeographicLib functions for ST_Azimuth, ST_Distance and ST_ProjectUse GeographicLib functions for ST_Azimuth, ST_Distance, ST_Project, ST_Length, ST_Perimeter and ST_Area

Recent updates to the patch include upgrade to the most recent GeographicLib release (from 1.37 to 1.39), additions and refinement of regression testings (where possible, using independent tests from GeographicLib utilities), and amendment to documentation. More details are included with individual commits on github PR #27.

Before the patch gets too big, could this get a review?

comment:20 Changed 5 years ago by robe

pramsey has commented on this on github:

repeated here for completeness: https://github.com/postgis/postgis/pull/27

"OK, so I haven't actually run this, but reading it, it seems to take over all geodesic calculations, the area calculation explicitly and the length/distance/projection ones by taking over the core edge routines for geodetic calculations. I thought some performance work showed that there was a speed penalty for this approach and we were looking at just replacing area."

comment:21 Changed 5 years ago by karney

As the author of GeographicLib, I am not a disinterested observer; so treat the following comments in this light...

I recommend using the GeographicLib routines for all geodesic calculations rather than a hybrid of legacy code (presumably Vincenty) and GeographicLib. The reasons are:

(1) Improved accuracy. Vincenty is pretty accurate (the error is about 0.1 mm). However this error can by magnified in certain circumstances (a poorly conditioned triangulation, for example). GeographicLib gives full double-precision accuracy.

(2) Solution to the inverse problem is always found. Vincenty fails in a small fraction of cases which means that all callers of the inverse routines need to handle a possible failure.

(3) GeographicLib calculates differential quantities related to geodesics. These are useful in solving some problems, e.g., computing the time of closest approach for two vessels moving along geodesics.

(4) Smaller library size. To calculate the area you need to pull in the GeographicLib routines anyway. It make sense to dispense with the (now redundant) code.

(5) Consistency. In the case where there are multiple shortest geodesics, it is important that the area reported apply to the geodesic which is found. This is difficult to do with the hybrid approach.

This just leaves the issue of the times. Most likely this is not a key issue since the times are so small -- roughly equivalent to performing a map projection. (In any case, many users would trade a more accurate and robust solution for a little speed.) The comparative times for randomly sampled points reported in my paper (Sec. 7) are:

         Vincenty  GeographicLib 
direct    1.11 us     0.88 us
line                  0.37 us
inverse   1.34 us     2.34 us

The time reported for "line" is the time per point to compute several points along a single geodesic (duplicating this functionality with Vincenty is straightforward but typically isn't provided).

comment:22 Changed 5 years ago by robe

pramsey if it helps -- I can try to do some tests against this git pull comparing against existing times and numbers to confirm no major performance or accuracy issues we haven't accounted for. I'm strapped for time this week, but should have time to devote in another 2 weeks.

comment:23 Changed 5 years ago by Mike Taves

I've merged the latest svn-trunk to the pull request, so it should be all caught up. There are a few other misc. changes that caught up, such as using math constants for fractions of PI, and wording in the manual.

I'm neutral for a hybrid approach. If such approach were implemented for inverse geodesic measures (distance), it would switch methods where there are known precision issues (e.g.) >0.1 mm error, and/or the difference of performance is negligible. It would take a bit more investigating to find out how to implement these limits (or to find out if that approach is too complicated). Area and direct geodesic (ST_Project) methods should definitely use GeographicLib.

At any rate, the user has the choice of using the use_spheroid=true option if they don't want to compromise speed. But if they want to use more expensive and accurate geodesic methods, then they can choose to have it with use_spheroid=false.

comment:24 Changed 4 years ago by pramsey

Priority: mediumcritical

comment:25 Changed 4 years ago by pramsey

Committed at r13521 and r13522

comment:26 Changed 4 years ago by pramsey

Backed out the geodesic.c/h files and depend on proj 4.9+ for the new function instead at r13526

comment:27 Changed 4 years ago by pramsey

Resolution: fixed
Status: newclosed

Thanks for the great patch!

comment:28 Changed 4 years ago by dbaston

Should this GitHub PR be closed out?

Note: See TracTickets for help on using tickets.