Opened 13 years ago

Last modified 8 years ago

#1424 new defect

Profile tool always casts to integers (incl. csv output)

Reported by: hamish Owned by: grass-dev@…
Priority: major Milestone: 6.4.6
Component: wxGUI Version: svn-releasebranch64
Keywords: profile tool, precision Cc:
CPU: All Platform: All

Description

Hi,

as reported on the grass-user mailing list yesterday, export to .csv always exports as integers, regardless of if the source data is CELL, FCELL, or DCELL.

Moritz prepared a patch:

--- profile.py    2011-08-19 13:28:17.000000000 +0200
+++ profile_new.py    2011-08-19 13:28:43.000000000 +0200
@@ -598,7 +598,7 @@
                                   caption=_("Error"), style=wx.OK | wx.ICON_ERROR | wx.CENTRE)
                     return False
                 for datapair in r['datalist']:
-                    file.write('%d,%d\n' % (float(datapair[0]),float(datapair[1])))
+                    file.write('%f,%f\n' % (float(datapair[0]),float(datapair[1])))

                 file.close()

but I think we need to go deeper than that, as %d shows up all around the script, both in preparing the source points to feed into r.profile, and in processing the results of it.

  • Rather importantly in lat/lon locations casting the tie points to int can silently return the wrong results, with coords shifted a long way from where you expected.

On the input side it will be easy enough to change %d to %.15g (technically %.9g would be enough to cover sub-millimeter in lat/lon) for r.profile and r.what's coord input.

On the output site the distance along transect should be at least to the mm (%.3f or %.3g assuming the tool preserves r.profile's column 1 distance as meters), and test for map type, exporting column 2 as %d for CELL, %.7g for FCELL, and %.15g for DCELL. (see #335)

another thing to keep in mind is that map_units is not always going to be feet, meters, or degrees. there's little reason it couldn't be microns or parsecs, so to avoid others being limited by our humble imaginations we should avoid unnecessary assumptions about that whenever possible.

Hamish

ps- see also #1292 "Profile tool incorrectly processes no data values"

Attachments (2)

15g_profile.patch (1.6 KB ) - added by hamish 13 years ago.
wxprofile.diff (2.5 KB ) - added by mlennert 11 years ago.
diff combining Hamish' patch with the use of m.measure

Download all attachments as: .zip

Change History (11)

comment:1 by hamish, 13 years ago

Also I notice an occurrence of sqrt(a^2 + b^2) in profile.py, which is to be avoided as it gives the wrong answer from lat/long locations.

if lasteast and lastnorth:
    dist = math.sqrt(math.pow((lasteast-point[0]),2) + math.pow((lastnorth-point[1]),2))

in trunk there is the m.measure C module to use a front end to G_distance().

In 6.x we used to have swig/python/examples/m.distance as a python front-end to that; I'm not sure if a simple ctypes replacement exists yet or not.

thanks, Hamish

comment:2 by hamish, 13 years ago

math.sqrt() is also used in mapdisp_window.py and nviz_mapdisp.py, and will need to be checked/replaced.

Hamish

in reply to:  1 ; comment:3 by glynn, 13 years ago

Replying to hamish:

Also I notice an occurrence of sqrt(a^2 + b^2) in profile.py, which is to be avoided as it gives the wrong answer from lat/long locations.

That will give the wrong answer for all locations; Python follows C and uses ^ for the bitwise exclusive-or operator. Exponentiation uses ** (which should be preferred over math.pow()), i.e. sqrt(a2 + b2).

However, Python's math module also has a hypot() function, which is probably the optimal approach here.

in reply to:  3 comment:4 by hamish, 13 years ago

Replying to glynn:

That will give the wrong answer for all locations; Python follows C and uses ^ for the bitwise exclusive-or operator. Exponentiation uses ** (which should be preferred over math.pow()), i.e. sqrt(a2 + b2).

oh, I was just pseudo-coding. The actual python code uses math.pow(,2)

However, Python's math module also has a hypot() function, which is probably the optimal approach here.

but not if the map units are in degrees, which is why I'm looking for a python front-end to G_distance(), or failing that a pipe_command() to the m.measure module.

finishing a patch for the %d problem, Hamish

by hamish, 13 years ago

Attachment: 15g_profile.patch added

comment:5 by hamish, 13 years ago

basic patch added to the ticket. But before applying in svn:

grass.raster_info(raster)['datatype']

should be used to choose export precision based on the raster map's type, something like:

@@ -585,9 +585,18 @@
                                   message = _("Unable to open file <%s> for writing.") % pfile,
                                   caption = _("Error"), style = wx.OK | wx.ICON_ERROR | wx.CENTRE)
                     return False
+
+                rast_type = grass.raster_info(raster)['datatype']
+                if rast_type == 'CELL':
+                    write_fmt = '%.3f,%d\n'
+                if rast_type == 'FCELL':
+                    write_fmt = '%.3f,%.7g\n'
+                if rast_type == 'DCELL':
+                    write_fmt = '%.3f,%.15g\n'
+
                 for datapair in r['datalist']:
-                    file.write('%d,%d\n' % (float(datapair[0]),float(datapair[1])))
-                                        
+                    file.write(write_fmt % (float(datapair[0]),float(datapair[1])))
+
                 file.close()
 
         dlg.Destroy()

(vs trunk; untested)

... but why are we doing that at all? for the .csv export why not just re-run r.profile and save to the output= file directly, with all the above stuff already taken care of? no risk of introducing aliasing or bugs if we produce that straight from the source + it would make the code a bit simpler.

a good test for this is to go into Spearfish, and prepare a limited range map:

g.region rast=elevation.dem
r.mapcalc "elev_1k = elevation.dem / 1000.0"
r.info -r elev_1k
 min=1.066
 max=1.84

then pull that into a lat/long location with r.proj. So both map units (decimal degrees) and z-values will show up as obviously wrong if something gets cast to an integer.

Hamish

comment:6 by mlennert, 11 years ago

A student of mine was bitten by this again, recently, so I looked into it.

I'll attach a new patch which combines Hamish' original patch with the addition of using m.measure. This solves the issue of casts to integers (and no need for all the raster format checks Hamish suggested as the data actually is the original r.profile output with just nulls filtered out. And .15g seems appropriate for me as the output is then adapted to the type of data in input (i.e. integers are displayed as integers).

I'm still a bit stuck on the use of the profile tool in lat-long locations. The use of m.measure was only one aspect. Antoher issue is on line 232ff where the resolution to be fed to r.profile is calculated. Using m.measure to determine the transect length gives a length in meters and thus the calculated resolution is in meters which then does not work for r.profile in a lat-long location.

So, the patch is still unfinished work, but I'm very busy with other things these days, and do not have the time in the immediate future to look at this. Maybe someone else can pick up on what I did.

In any case, it would be great to have the integer->float parts of the patch applied (and backported to 64release ?).

Moritz

by mlennert, 11 years ago

Attachment: wxprofile.diff added

diff combining Hamish' patch with the use of m.measure

in reply to:  6 ; comment:7 by neteler, 8 years ago

Milestone: 6.4.26.4.6

Replying to mlennert:

A student of mine was bitten by this again, recently, so I looked into it.

I'll attach a new patch which combines Hamish' original patch with the addition of using m.measure.

...

So, the patch is still unfinished work, but I'm very busy with other things these days, and do not have the time in the immediate future to look at this. Maybe someone else can pick up on what I did.

In any case, it would be great to have the integer->float parts of the patch applied (and backported to 64release ?).

Moritz

What is the state here?

in reply to:  7 comment:8 by mlennert, 8 years ago

Replying to neteler:

Replying to mlennert:

A student of mine was bitten by this again, recently, so I looked into it.

I'll attach a new patch which combines Hamish' original patch with the addition of using m.measure.

...

So, the patch is still unfinished work, but I'm very busy with other things these days, and do not have the time in the immediate future to look at this. Maybe someone else can pick up on what I did.

In any case, it would be great to have the integer->float parts of the patch applied (and backported to 64release ?).

Moritz

What is the state here?

No one ever reacted to the patch. It worked for me at the time. Currently I can't check because I'm bitten by #2558.

comment:9 by mlennert, 8 years ago

Sorry, was a bit quick with my answer here: there was still the issue of getting the region in a resolution in meters to be able to use it in relation with the m.measure output.

I see two options:

  • Declare the profiling tool unusable in lat-long and output an error message in that case.
  • Use g.region -m to get region resolution in meters. This would mean something like this on line 239:
region = grass.parse_command('g.region', flags='m')

I'm not so sure about the meaning of this -m flag in g.region as a constant resolution in degrees does not mean a constant resolution in meters. So this does not seem correct. But then again, in this case it is only used to get an idea of the step size for the profile, so this is not much of a problem.

Note: See TracTickets for help on using tickets.