Opened 6 years ago

Last modified 2 years ago

#2564 new enhancement

Implement the inverse of the ST_AsLatLonText function

Reported by: andrewxhill
Owned by: pramsey
Priority: low
Milestone: PostGIS Fund Me
Component: postgis
Version: 2.1.x
Keywords:
Cc: strk, dbaston

Description

It would be helpful (likely to a number of applications and users) if PostGIS had a function to go from DMS to geometry.

For example, supporting these two variants would be a great start:

geometry pt ST_FromLatLonText(text, text); example: SELECT ST_FromLatLonText('31°55''12.000"N', '98°58''48.000"W');

and

geometry pt ST_FromLatLonText(text); example: SELECT ST_FromLatLonText('31°55''12.000"N 98°58''48.000"W');
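To make the requested behaviour concrete, here is a minimal Python sketch of the conversion behind both variants. It is only an illustration, not PostGIS code: the names `dms_to_decimal` and `from_lat_lon_text` are hypothetical, it accepts only the exact `D°M'S"H` layout used in the examples above, and it assumes the single-argument form separates latitude and longitude with whitespace.

```python
import re

# Hypothetical helper, not part of PostGIS. Accepts only tokens shaped like
# 31°55'12.000"N (degrees, minutes, seconds, hemisphere letter).
_DMS = re.compile(r"""(\d{1,3})°(\d{1,2})'(\d{1,2}(?:\.\d+)?)"([NSEW])""")


def dms_to_decimal(token, hemispheres):
    """Convert one DMS token to signed decimal degrees.

    `hemispheres` lists the letters allowed for this axis ("NS" or "EW").
    """
    m = _DMS.fullmatch(token.strip())
    if m is None:
        raise ValueError("unrecognised DMS token: %r" % token)
    deg, minutes, seconds, hemi = m.groups()
    if hemi not in hemispheres:
        raise ValueError("unexpected hemisphere %r in %r" % (hemi, token))
    value = int(deg) + int(minutes) / 60 + float(seconds) / 3600
    # South and west are negative by convention.
    return -value if hemi in "SW" else value


def from_lat_lon_text(lat_text, lon_text=None):
    """Return WKT for the point, with X = longitude and Y = latitude."""
    if lon_text is None:
        # Single-argument form: assume "lat lon" separated by whitespace.
        lat_text, lon_text = lat_text.split()
    lat = dms_to_decimal(lat_text, "NS")
    lon = dms_to_decimal(lon_text, "EW")
    return "POINT(%.6f %.6f)" % (lon, lat)
```

With the two example inputs from this ticket, both call forms return `POINT(-98.980000 31.920000)`.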

Change History (10)

comment:1 Changed 6 years ago by robe

Milestone: PostGIS 2.1.2 → PostGIS 2.2.0

New function can't go into a micro release

comment:2 Changed 6 years ago by dbaston

Cc: dbaston added

This is something I've thought would be useful too, and I'd be happy to put together a patch for consideration.

I see a few directions this could take:

1) Input text must match a very rigid format (no risk of misinterpretation), otherwise the function returns an error. This somewhat reduces the utility of the function, since it is most useful when you have coordinates that you don't feel like parsing yourself.

2) Input text must match a looser format (some risk of misinterpreting strange input); the function returns an error only if something is really weird. By loose I mean accepting up to 3 valid numbers divided by non-numbers and assuming they represent D, M, S.

3) Input text must match a user-supplied format string. ST_FromLatLonText($$31°55'12.000"N$$, $$DD°MM'SS.SSS"X$$) for example.
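As a rough illustration of option 3, the Python sketch below compiles a format string such as `DD°MM'SS.SSS"X` into a regular expression and applies it to one coordinate. The mini-language is an assumption made for this sketch, not an agreed design: a run of `D`, `M` or `S` is a fixed-width integer field, a `.` followed by the same letter adds a fraction of any length, `X` is a hemisphere letter, and every other character is a literal. The function names are hypothetical.

```python
import re


def compile_format(fmt):
    """Translate a format string into a compiled regex with named groups."""
    parts, seen, i = [], set(), 0
    while i < len(fmt):
        ch = fmt[i]
        if ch in "DMS":
            j = i
            while j < len(fmt) and fmt[j] == ch:
                j += 1
            # Integer part: exactly as many digits as letters in the run.
            pattern = r"\d{%d}" % (j - i)
            # Fraction: ".S..." after the run means "one or more decimals".
            if j + 1 < len(fmt) and fmt[j] == "." and fmt[j + 1] == ch:
                j += 1
                while j < len(fmt) and fmt[j] == ch:
                    j += 1
                pattern += r"\.\d+"
            name = ch
        elif ch == "X":
            j, pattern, name = i + 1, "[NSEW]", "X"
        else:
            parts.append(re.escape(ch))  # literal separator such as ° or '
            i += 1
            continue
        if name in seen:
            raise ValueError("field %r appears twice in format" % name)
        seen.add(name)
        parts.append("(?P<%s>%s)" % (name, pattern))
        i = j
    if "D" not in seen:
        raise ValueError("format has no degrees field")
    return re.compile("".join(parts))


def parse_with_format(text, fmt):
    """Parse one coordinate with the given format; return decimal degrees."""
    m = compile_format(fmt).fullmatch(text)
    if m is None:
        raise ValueError("%r does not match format %r" % (text, fmt))
    g = m.groupdict()
    value = (float(g["D"])
             + float(g.get("M") or 0) / 60
             + float(g.get("S") or 0) / 3600)
    return -value if g.get("X") in ("S", "W") else value
```

Because integer widths are exact, a format like `DDMM.MMM` also covers undelimited input, at the cost of needing a separate format (`DDDMM.MMM`) for three-digit degrees.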

Any thoughts?

comment:3 Changed 6 years ago by pramsey

In my musing on this topic over the years, I've always felt a format string would be req'd.

DDMMSS
DDMMSS.SS
DDMM.MMM
DD.DDDD
DDW MM" SS'

Some quirks would be around noticing the NSEW tokens and ensuring the right value ended up in X and Y. Dealing with 2 and 3 digit east/west degree values. Dealing with both NSEW tokens and decimal degrees as potential inputs. Could be a very challenging piece. I always looked at it and said "let them write their own regex". Might be easier to write a regex tutorial page than write the code itself :)

comment:4 Changed 6 years ago by strk

I think a format string is the way to go, including a format character to obtain the "looser" behavior, to make every case supported. The "loose format string" would be the default.

Any output from ST_AsLatLonText, with a given format string, should be interpreted to result in its input when passed with the same format string to ST_FromLatLonText. The output from ST_AsLatLonText with no format string should be interpreted to result in its input when no format string is used in the ST_FromLatLonText function.

comment:5 Changed 6 years ago by dbaston

I see how the idea for regex tutorial came about! I was thinking a format string seemed necessary too, but as I thought more about it, I'm not sure.

Say you're willing to require the following about the input.

1) Latitude and Longitude have the same formatting (they're both DMS or DD.DDDD, etc., but not mixed)

2) Either cardinal directions are provided, or latitude can be assumed to come before longitude.

3) If N, S, E, and W appear in the input string, they can be assumed to represent cardinal directions

4) Some kind of delimiter is used between degrees, minutes, and seconds. (you don't have 431720.33 for 43°17'20.33)

I put together a function that parses under these assumptions, and I'm getting good results on a pretty broad set of inputs. The lines below show raw input strings, followed by the returned lat/lon or error condition.

I put code for this on github at https://github.com/dbaston/parse_dms . If you're comfortable with the approach, I can work on a patch to liblwgeom (or wherever it would be appropriate)

raw: 2°19'29.928"S 3°14'3.243"W ;

lat:-2.324980 lon:-3.234234

raw: 2 degrees, 19 minutes, 30 seconds to the S 3 degrees, 14 minutes, 3 seconds to the W

lat:-2.325000 lon:-3.234167

raw: -2°19'29.928" -3°14'3.243"

lat:-2.324980 lon:-3.234234

raw: 2.3250 degrees S 3.2342 degrees W

lat:-2.325000 lon:-3.234200

raw: 44° 8.156', -72° 16.194'

lat:44.135933 lon:-72.269900

raw: 32° 18' 23.1" N 122° 36' 52.5" W

lat:32.306417 lon:-122.614583

raw: 32° 18.385' N 122° 36.875' W

lat:32.306417 lon:-122.614583

raw: 32.30642° N 122.61458° W

lat:32.306420 lon:-122.614580

raw: +32.30642, -122.61458

lat:32.306420 lon:-122.614580

raw: the coordinates were 122° 36' 52.5" W and 32° 18' 23.1" N

lat:32.306417 lon:-122.614583

raw: 40:26:46.302N 079:58:55.903W

lat:40.446195 lon:-79.982195

raw: 40°26′46″N 079°58′56″W

lat:40.446111 lon:-79.982222

raw: 40d 26′ 46″ N 079d 58′ 56″ W

lat:40.446111 lon:-79.982222

raw: 40.446195N 79.982195W

lat:40.446195 lon:-79.982195

raw: 40.446195, -79.982195

lat:40.446195 lon:-79.982195

raw: 40.446195,-79.982195

lat:40.446195 lon:-79.982195

raw: 40° 26.7717, -79° 58.93172

lat:40.446195 lon:-79.982195

raw: N40:26:46.302 W079:58:55.903

lat:40.446195 lon:-79.982195

raw: N40°26′46″ W079°58′56″

lat:40.446111 lon:-79.982222

raw: N40d 26′ 46″ W079d 58′ 56″

lat:40.446111 lon:-79.982222

raw: N40.446195 W79.982195

lat:40.446195 lon:-79.982195

raw: -16 deg. 23.44 min. S, -44 deg. 32.2 min. W

lat:-16.390667 lon:-44.536667

raw: 32.30642° NE 122.61458° W (problem is too many cardinal directions)

too many cardinal directions

raw: 32°23.45' N 122.61458° W (problem is differing numbers of components DM vs D)

invalid # numeric components.

raw: N40d 26′ 46″ W079d 58′ 56″ 24ms (problem is too many components)

invalid # numeric components.

raw: 40.446.195, -79.982195 (unparseable number)

numeric parse error

raw: 32°26′46″ N 122.61458° W (inconsistent format)

coordinates not same format
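For readers who want to experiment with the four assumptions above, here is a minimal Python sketch of a parser built on them. It is not the code from the linked repository and will not reproduce every result or error message listed above. It assumes cardinal directions are uppercase letters, splits the numbers evenly between the two coordinates, and treats a leading minus sign or an S/W letter as negative; the function names are hypothetical.

```python
import re

NUMBER = re.compile(r"[+-]?\d+(?:\.\d+)?")
CARDINAL = re.compile(r"[NSEW]")  # assumption 3, restricted to uppercase


def _to_degrees(parts, hemisphere):
    """Combine 1 to 3 numeric strings (D, M, S) into signed decimal degrees."""
    deg = abs(float(parts[0]))
    rest = [float(p) for p in parts[1:]]
    if any(v < 0 or v >= 60 for v in rest):
        raise ValueError("minutes or seconds out of range")
    value = deg + sum(v / 60 ** (i + 1) for i, v in enumerate(rest))
    negative = parts[0].startswith("-") or hemisphere in ("S", "W")
    return -value if negative else value


def parse_lat_lon(text):
    """Return (lat, lon) in decimal degrees, or raise ValueError."""
    cardinals = CARDINAL.findall(text)
    if len(cardinals) not in (0, 2):
        raise ValueError("expected zero or two cardinal directions")
    numbers = NUMBER.findall(text)
    if len(numbers) not in (2, 4, 6):
        raise ValueError("invalid number of numeric components")
    # Assumption 1: both coordinates use the same number of components.
    half = len(numbers) // 2
    h1, h2 = cardinals if cardinals else (None, None)
    first = _to_degrees(numbers[:half], h1)
    second = _to_degrees(numbers[half:], h2)
    if cardinals:
        if (h1 in "NS") == (h2 in "NS"):
            raise ValueError("need one N/S and one E/W direction")
        # Assumption 2: with directions present, longitude may come first.
        lat, lon = (first, second) if h1 in "NS" else (second, first)
    else:
        lat, lon = first, second  # no directions: latitude comes first
    if abs(lat) > 90 or abs(lon) > 180:
        raise ValueError("coordinate out of range")
    return lat, lon
```

One known weakness of splitting the numbers evenly is that mixed input such as `32°26′46″ N 122.61458° W` is rejected only because the misassigned "seconds" value exceeds 60, not because the formats were recognised as different.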

comment:6 Changed 6 years ago by pramsey

I always go back to one of my examples from a project where the numbers were

4812.2321 12043.1234

Turns out the fields were DDMMM.MMM. Decimal minutes might well be a corner case not worth thinking about, but I think undelimited inputs are perhaps not as rare as one might think.

481232.132 1202314.123

DDMMSS.SSS seems like a very likely format to see in the wild.

Interested to see what others think. Pleasing everyone is definitely not an option on a task with so much potential variation.

comment:7 Changed 6 years ago by dbaston

Interesting.

I was getting hung up on how to write a format string that could handle undelimited versions of both

99°59'59.999" and 100°00'00.000" ?

It seems like you'd have to go with

DMMSS.SSS : you specify the number of digits for minutes and seconds, and everything leftover on the left is taken to be degrees? Minutes and seconds would have to be left-padded with zeros for correct interpretation...

Of course, with input that rigid, it would be pretty easy to use a substring function to build the geometry too. Just thinking out loud...
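The "everything left over is degrees" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: minutes and seconds are each exactly two zero-padded digits, the decimal fraction belongs to the last field, and the function name and its `has_seconds` switch (for DMM.MMM input) are hypothetical.

```python
import re

_UNDELIMITED = re.compile(r"([+-]?)(\d+)(?:\.(\d+))?", re.ASCII)


def parse_undelimited(text, has_seconds=True):
    """Parse DMMSS.SSS (or DMM.MMM when has_seconds is False)."""
    m = _UNDELIMITED.fullmatch(text.strip())
    if m is None:
        raise ValueError("not an undelimited coordinate: %r" % text)
    sign, whole, frac = m.groups()
    width = 4 if has_seconds else 2  # digits reserved for MMSS or MM
    if len(whole) <= width:
        raise ValueError("no digits left over for degrees in %r" % text)
    # Whatever precedes the fixed-width fields is taken to be degrees.
    degrees = int(whole[:-width])
    fields = whole[-width:]
    fraction = "." + (frac or "0")
    if has_seconds:
        minutes = float(fields[:2])
        seconds = float(fields[2:] + fraction)
    else:
        minutes = float(fields + fraction)
        seconds = 0.0
    if minutes >= 60 or seconds >= 60:
        raise ValueError("minutes or seconds out of range in %r" % text)
    value = degrees + minutes / 60 + seconds / 3600
    return -value if sign == "-" else value
```

Under these assumptions `995959.999` and `1000000.000` are both unambiguous: the first yields 99 degrees, the second 100.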

comment:8 Changed 4 years ago by pramsey

Milestone: PostGIS 2.2.0 → PostGIS 2.3.0

comment:9 Changed 3 years ago by robe

Milestone: PostGIS 2.3.0 → PostGIS 2.4.0

comment:10 Changed 2 years ago by robe

Milestone: PostGIS 2.4.0 → PostGIS Fund Me