Ticket #660 (closed defect: fixed)

Opened 5 years ago

Last modified 4 years ago

v.delaunay producing incorrect results

Reported by: cmbarton Owned by: grass-dev@…
Priority: critical Milestone: 6.4.0
Component: Vector Version: 6.4.0 RCs
Keywords: delaunay Cc: midinastasurazz
Platform: MacOSX CPU: OSX/Intel

Description

When I tried v.delaunay after compiling GRASS 6.5 from the SVN on Friday (20 May), it gave very weird results. I'm attaching an example from using it on archsites from Spearfish.

Attachments

delaunay.png Download (16.4 KB) - added by cmbarton 5 years ago.
delaunay-valgrind1.txt Download (206.1 KB) - added by kyngchaos 4 years ago.
valgrind on OSX, with a few detail options, part 1
delaunay-valgrind2.txt Download (208.7 KB) - added by kyngchaos 4 years ago.
valgrind on OSX, with a few detail options, part 2
dt_archsites.png Download (3.9 KB) - added by cmbarton 4 years ago.
archsites from Speafish demo data
dt_allsites2005.png Download (12.8 KB) - added by cmbarton 4 years ago.
v.delaunay run on all archsites in June 2005 (correct behavior)
dt_fewsites2009.png Download (7.2 KB) - added by cmbarton 4 years ago.
v.delaunay run on a few of archsites today, i.e., late 2009 (correct behavior)
dt_allsites2009.png Download (17.7 KB) - added by cmbarton 4 years ago.
v.delaunay run on a all archsites today, i.e., late 2009 (incorrect behavior)
archsites11.zip Download (2.4 KB) - added by cmbarton 4 years ago.
v.delaunay.64.patch Download (23.6 KB) - added by mmetz 4 years ago.
bugfix and clean up, patch is for grass64 svn
delaunay1000_fixed.png Download (60.0 KB) - added by cmbarton 4 years ago.
v.delaunay run on 1000 random elevation points from the SC demo data set

Change History

Changed 5 years ago by cmbarton

  Changed 5 years ago by hamish

  • summary changed from v.dalaunay producing incorrect results to v.delaunay producing incorrect results

see these mailing list threads:

 http://thread.gmane.org/gmane.comp.gis.grass.user/30001/  http://thread.gmane.org/gmane.comp.gis.grass.devel/33926

including this valgrind error:

...
WARNING: Vector map <DelaunayTriangles> already exists and will be
         overwritten
==7844== Syscall param write(buf) points to uninitialised byte(s)
==7844==    at 0x71C09F0: write (in /lib/libc-2.7.so)
==7844==    by 0x716E909: _IO_file_write (in /lib/libc-2.7.so)
==7844==    by 0x716E569: (within /lib/libc-2.7.so)
==7844==    by 0x716E8A4: _IO_do_write (in /lib/libc-2.7.so)
==7844==    by 0x7170256: _IO_switch_to_get_mode (in /lib/libc-2.7.so)
==7844==    by 0x716ED8F: _IO_file_seekoff (in /lib/libc-2.7.so)
==7844==    by 0x7164049: ftell (in /lib/libc-2.7.so)
==7844==    by 0x5D455FD: dig_ftell (file.c:41)
==7844==    by 0x5D4600F: dig__write_head (head.c:56)
==7844==    by 0x4E5AC9C: V1_open_new_nat (open_nat.c:127)
==7844==    by 0x4E5A155: Vect_open_new (open.c:565)
==7844==    by 0x403B0D: main (main.c:106)
==7844==  Address 0x4022009 is not stack'd, malloc'd or (recently) free'd
Building topology for vector map <DelaunayTriangles>...
Registering primitives...
...

works ok in Linux + latest grass 6.5svn, both 32 and 64bit.

Hamish

  Changed 5 years ago by hamish

is this seen on 6.4.0rc as well?

  Changed 5 years ago by cmbarton

Yes unfortunately.

I just updated 6.4 from the SVN and compiled in 32 bit. I get the same kind of output seen in the PNG attached to this report. This is a blocker for 6.4.0.

Michael

  Changed 5 years ago by martinl

  • priority changed from critical to blocker
  • version changed from svn-develbranch6 to 6.4.0 RCs
  • milestone changed from 6.5.0 to 6.4.0

  Changed 5 years ago by kyngchaos

On a random suggestion from Markus Neteler, I tested a build with no GCC optimization. Same results. 32bit and 64bit.

  Changed 5 years ago by hamish

  • priority changed from blocker to critical

triage

  Changed 5 years ago by adhollander

On Linux v.delaunay is looking fine to me, testing it on 64bit Ubuntu 9.4 running 6.4.0RC5.

  Changed 5 years ago by cmbarton

Still badly broken on my Mac in 6.4 compiled from svn 26 September. Has there been any fix since that date?

Michael

  Changed 4 years ago by cmbarton

v.delaunay is still broken. It creates a mess of overlapping triangles in places and leaves complete blankes in other places.

  Changed 4 years ago by marisn

Michael, could You, please, run affected module from 6.5 under valgrind (or any other memory tracing tool)? It works just fine on Linux and I'm unable to get meaningfull errors out of valgrind/mudflap.

Changed 4 years ago by kyngchaos

valgrind on OSX, with a few detail options, part 1

Changed 4 years ago by kyngchaos

valgrind on OSX, with a few detail options, part 2

  Changed 4 years ago by kyngchaos

maris, just checking if you saw my valgrind attachments - trac didn't email me like it does for comments.

  Changed 4 years ago by marisn

kyngchaos - no need for leakcheck, as we are not hunting leaks :) Thanks for output. Same as on my AMD64 Linux box, still here results are fine. No other ideas from me (mudflap is somehow broken and thus requires some tweaking before use and it doesn't show anything interesting on my Linux box).

MarkusM - what's up with those warnings in vlib/diglib code?

Changed 4 years ago by cmbarton

archsites from Speafish demo data

Changed 4 years ago by cmbarton

v.delaunay run on all archsites in June 2005 (correct behavior)

Changed 4 years ago by cmbarton

v.delaunay run on a few of archsites today, i.e., late 2009 (correct behavior)

Changed 4 years ago by cmbarton

v.delaunay run on a all archsites today, i.e., late 2009 (incorrect behavior)

  Changed 4 years ago by cmbarton

I did some more tests and have posted the results to better show what is happening. For comparison, I show a map of Spearfish archsites (blue triangles in all images). In June 2005, v.delaunay worked correctly on my computer. It also worked correctly 24 April 2008, so that narrows down when it broke better.

For small datasets (4-5 points) it seems to work fine. See dt_fewsites.png.

For medium sized datasets (all archsites) it runs but gives incorrect results. See dt_allsites.png

For larger datasets (1000 points) it uses all available CPU resources and takes a VERY long time to complete (or perhaps just locks up).

Michael

Michael

  Changed 4 years ago by cmbarton

More information. When I ran v.delaunay on 1000 points, I finally killed it, but it had created a coor file of 1.37 Gb. Yes that is Gb. Before I realized how huge it was, I tried to run v.build on it. It was up to < 300,000 primitives before I killed that.

Something is seriously wrong here.

Michael

  Changed 4 years ago by mmetz

Replying to marisn:

MarkusM - what's up with those warnings in vlib/diglib code?

I assume you refer to the valgrind output, at least I don't get any warnings when running v.delaunay without valgrind. The only valgrind errors that tell me something are right at the beginning, here

 http://trac.osgeo.org/grass/attachment/ticket/660/delaunay-valgrind1.txt#L17

and here

 http://trac.osgeo.org/grass/attachment/ticket/660/delaunay-valgrind1.txt#L27

The second one saying that Address 0x66d5f29 is 9 bytes inside a block of size 4,096 alloc'd may cause problems, but I have not the faintest idea where this is coming from, I don't get this one on Linux, only number 1.

If all other vector modules are working, the problem is IMHO more likely somewhere in the module than in the vector libs. So, are other vector modules working?

On my Linux system, v.delaunay is fast and produces correct results, also with e.g. elev_lid792_bepts with 118,716 points.

Sorry, not much help from my side,

Markus M

follow-up: ↓ 21   Changed 4 years ago by mmetz

Please try attached patch in verbose mode.

Markus M

follow-up: ↓ 20   Changed 4 years ago by kyngchaos

Same output on archsites for me.

v.voronoi appears to work, on archsites. I don't have a large point file to test it - Michael?

  Changed 4 years ago by cmbarton

I tried v.voronoi on 1000 elevation points in the SC demo data set. It took only a couple seconds and worked fine.

Michael

  Changed 4 years ago by cmbarton

I applied the patch in GRASS 6.4 (just updated from the SVN). Unfortunately, it gives the same results as before.

Michael

in reply to: ↑ 17   Changed 4 years ago by hamish

Replying to kyngchaos:

I don't have a large point file to test it

you can make one from noise with v.random or from a real DEM with 'r.random vector_output='.

in reply to: ↑ 16 ; follow-up: ↓ 22   Changed 4 years ago by marisn

Replying to mmetz:

Please try attached patch in verbose mode. Markus M

On my Linux box still works just fine. Only question is regarding this warning. Is it OK to ignore?

==10913== Syscall param write(buf) points to uninitialised byte(s)
==10913==    at 0x749D210: __write_nocancel (syscall-template.S:82)
==10913==    by 0x7447422: _IO_file_write@@GLIBC_2.2.5 (fileops.c:1268)
==10913==    by 0x7447099: new_do_write (fileops.c:522)
==10913==    by 0x74473C4: _IO_do_write@@GLIBC_2.2.5 (fileops.c:495)
==10913==    by 0x7448CC6: _IO_switch_to_get_mode (genops.c:189)
==10913==    by 0x7447AB7: _IO_file_seekoff@@GLIBC_2.2.5 (fileops.c:978)
==10913==    by 0x743CD0B: ftell (ioftell.c:41)
==10913==    by 0x5D22F8B: dig__write_head (head.c:56)
==10913==    by 0x4E568BE: V1_open_new_nat (open_nat.c:127)
==10913==    by 0x4E55A7B: Vect_open_new (open.c:565)
==10913==    by 0x40309B: main (main.c:106)
==10913==  Address 0x4036009 is not stack'd, malloc'd or (recently) free'd

in reply to: ↑ 21   Changed 4 years ago by mmetz

Replying to marisn:

Replying to mmetz:

Please try attached patch in verbose mode. Markus M

On my Linux box still works just fine. Only question is regarding this warning. Is it OK to ignore?

==10913== Syscall param write(buf) points to uninitialised byte(s)
==10913==    at 0x749D210: __write_nocancel (syscall-template.S:82)
==10913==    by 0x7447422: _IO_file_write@@GLIBC_2.2.5 (fileops.c:1268)
==10913==    by 0x7447099: new_do_write (fileops.c:522)
==10913==    by 0x74473C4: _IO_do_write@@GLIBC_2.2.5 (fileops.c:495)
==10913==    by 0x7448CC6: _IO_switch_to_get_mode (genops.c:189)
==10913==    by 0x7447AB7: _IO_file_seekoff@@GLIBC_2.2.5 (fileops.c:978)
==10913==    by 0x743CD0B: ftell (ioftell.c:41)
==10913==    by 0x5D22F8B: dig__write_head (head.c:56)
==10913==    by 0x4E568BE: V1_open_new_nat (open_nat.c:127)
==10913==    by 0x4E55A7B: Vect_open_new (open.c:565)
==10913==    by 0x40309B: main (main.c:106)
==10913==  Address 0x4036009 is not stack'd, malloc'd or (recently) free'd

This warning is a complete mystery to me, it shows up with some vector modules but not with others, therefore I would say it is safe to ignore. The beginning of v.delaunay is nearly identical with v.voronoi, but v.voronoi does not show this warning?

AFAICT, this warning appears because the file pointer for the coor file points to the end of the coor file (uninitialised bytes), but that's what it's supposed to do.

Markus M

  Changed 4 years ago by mmetz

Apparently v.delaunay worked in grass6.3, so I see 3 options to solve this ticket:

1) Remove v.delaunay and reactivate v.delaunay in v.voronoi. Not that good because v.voronoi can do with some maintenance itself and there was a reason for the GSoC project (time and memory requirements I guess).

2) A grass developer working on a Mac digs into the source code of v.delaunay and tears it apart until the bug is found and eliminated.

3) A grass developer without a Mac supplies patches that are tested by others with a Mac. Will take some time.

Going for 3), please try the updated patch. Before applying the patch, all files in v.delaunay2 must be deleted if the previous patch was applied, then svn up. Just for the record, run make | grep warning to see if the Mac compiler has some complains (Linux gcc doesn't). Then apply the patch, make and grep warnings, test patched v.delaunay if no compile warnings or errors occurred. If this patch still doesn't fix it, I am at a loss.

Markus M

follow-up: ↓ 28   Changed 4 years ago by kyngchaos

1) yes, I remember memory issues when running the old v.delaunay on a large number of points.

2) not me, I can tinker, but don't know C well enough for this sort of task ;)

3) No change (both 32 and 64bit). No compile errors before or after the patch :(

I can run it thru GDB, though I've only ever used that once thru the Xcode GUI. Any suggestions what I might look at/for? Is there an automatic trace that can be generated (minimal entry of gdb commands) that can be compared with Linux?

follow-up: ↓ 26   Changed 4 years ago by cmbarton

I notice that in GRASS 6.4+ v.delaunay is now v.delauany2. Perhaps a change was made at some time after April 2008 that replaced v.delaunay (working on Mac) with v.delaunany2 (not working on Mac). In that case, it seems like a good start to see what was changed.

Michael

in reply to: ↑ 25   Changed 4 years ago by hamish

  • cc mpavlovsky@… added

Replying to cmbarton:

I notice that in GRASS 6.4+ v.delaunay is now v.delauany2. Perhaps a change was made at some time after April 2008 that replaced v.delaunay (working on Mac) with v.delaunany2 (not working on Mac). In that case, it seems like a good start to see what was changed.

what changed is that the module was entirely replaced by a new version by Martin Pavlovsky, one of the Google Summer of Code students from 2008.

The old version had more fundamental problems (which is why it was replaced).

Hamish

  Changed 4 years ago by cmbarton

I'll warrant that it is the replacement that broke on the Mac. It probably wasn't tested on that platform. Has anyone tested this on Windows? Maybe a look at the differences would provide a clue to what broke.

Michael

in reply to: ↑ 24 ; follow-up: ↓ 29   Changed 4 years ago by hamish

Replying to kyngchaos:

I can run it thru GDB, though I've only ever used that once thru the Xcode GUI. Any suggestions what I might look at/for? Is there an automatic trace that can be generated (minimal entry of gdb commands) that can be compared with Linux?

see http://grass.osgeo.org/wiki/Bugs#Using_GDB

Hamish

in reply to: ↑ 28   Changed 4 years ago by cmbarton

Replying to hamish:

Replying to kyngchaos:

I can run it thru GDB, though I've only ever used that once thru the Xcode GUI. Any suggestions what I might look at/for? Is there an automatic trace that can be generated (minimal entry of gdb commands) that can be compared with Linux?

see http://grass.osgeo.org/wiki/Bugs#Using_GDB Hamish

Thanks. *I* don't know what to look for, but Martin M might.

Michael

  Changed 4 years ago by midinastasurazz

This error was definitely introduced by the rewrite I did over the summer of 2008. At the time I tested it on Ubuntu 8.04(32-bit) and I didn't notice any incorrect output. However, I didn't do any testing on MacOS X platform. I will have a look at this and report if I manage to fix it.

Martin

  Changed 4 years ago by hamish

  • cc midinastasurazz added; mpavlovsky@… removed

Michael: can you find/extract a really small test case that fails?

e.g. g.region= + v.in.region + v.select to extract a handful of points from e.g. bugsites@spearfish?

if we can get it down to a low enough number of points we can follow the action by hand, step by step.

Changed 4 years ago by cmbarton

  Changed 4 years ago by cmbarton

Thanks for all your efforts to get this fixed. I've attached a file of cats 1-11 of archsites. This seems the minimum to cause trouble (1-10 produce good delaunay triangles, but 1-11 are a mess). Just drop this into $GISDBASE/Spearfish/PERMANENT/vector.

Michael

Changed 4 years ago by mmetz

bugfix and clean up, patch is for grass64 svn

  Changed 4 years ago by mmetz

I got the same weird results on Windows XP, found the bug and fixed it, hopefully also for MacOSX. Please try the updated patch v.delaunay.64.patch. This patch is against svn, so again, delete all files in v.delaunay2, svn up, apply the patch, test. Works for me now in Linux and Windows, can't test Mac.

Markus M

  Changed 4 years ago by kyngchaos

Woohoo! Works on archsites. So, what was affecting Mac and Windows but not Linux?

Changed 4 years ago by cmbarton

v.delaunay run on 1000 random elevation points from the SC demo data set

follow-up: ↓ 38   Changed 4 years ago by cmbarton

  • status changed from new to closed
  • resolution set to fixed

YEA!!!

I just updated from svn, patched, and compiled. It works great. I tried it on 1000 points from the SC demo data set. Before, I had to kill it after several minutes and over a Gb of bad data. This took a second or two and worked perfectly. I posted a PNG of the output.

Thanks very much for sticking with it and getting this blocker fixed! I'm changing this to fixed. Please backport to 6.5 and 7.

Michael

  Changed 4 years ago by neteler

I take liberty to reopen since the changes are yet unsubmitted.

  Changed 4 years ago by cmbarton

You are correct. I was so overjoyed at having this finally fixed that I was somewhat premature.

Michael

in reply to: ↑ 35   Changed 4 years ago by hellik

Replying to cmbarton:

YEA!!! I just updated from svn, patched, and compiled. It works great. I tried it on 1000 points from the SC demo data set. Before, I had to kill it after several minutes and over a Gb of bad data. This took a second or two and worked perfectly. I posted a PNG of the output. Thanks very much for sticking with it and getting this blocker fixed! I'm changing this to fixed. Please backport to 6.5 and 7. Michael

Grass64.svn self-compiled in the osgeo4w-tree with patch included with a WinVista?32-box

just for confirmation: I've testest it with the firestations in the nc-sample-dataset, v.delaunay producs correct results.

best regards Helmut

follow-up: ↓ 40   Changed 4 years ago by hamish

Nice.

How does it perform when the number of input points > 10,000 (& up to ~3M)? which was where the old v.delaunay1 fell over?

re. the patch, three minor comments after a cursory look.

1) MarkusM, you is the man, congrats 2) TRUE=1, FALSE=0 is defined by gis.h, no need to repeat it. 3) cmp_v() appears to test two FP values with ==. Should it use

 if( fabs(val1 - val2) < GRASS_EPSILON )

instead? Is it looking for itself or another point at the same place as it is?

Hamish

in reply to: ↑ 39   Changed 4 years ago by mmetz

Replying to hamish:

Thanks for looking at the code, two people see more than one alone.

How does it perform when the number of input points > 10,000 (& up to ~3M)? which was where the old v.delaunay1 fell over?

I generated 10,000,000 random points and killed v.delaunay when building topology for the output vector, my system ran out of memory (8GB RAM). The delaunay triangulation went fast and did not use a lot of memory, but the topo file would be larger than 2GB which is only supported in grass7.

Then I generated 5,000,000 random points, v.delaunay finsihed successfully in 11 min. The Delaunay triangulation took just a few seconds, impressive! That is the original algorithm of Martin Pavlovsky, slightly simplified by me. Again, the bottleneck was building topology, although I used lines not areas as output. Areas as output for 5 million points will only work in grass7.

re. the patch, three minor comments after a cursory look. 2) TRUE=1, FALSE=0 is defined by gis.h, no need to repeat it.

TRUE=1, FALSE=0 are used in geometry.c only which does not include gis.h, and it uses #ifndef TRUE...

3) cmp_v() appears to test two FP values with ==. Should it use {{{ if( fabs(val1 - val2) < GRASS_EPSILON ) }}} instead? Is it looking for itself or another point at the same place as it is?

It's looking for another point at the same place. There is a lot of fp comparison in grass, should something like (x1 < x2) also be replaced with (fabs(x2 - x1) > GRASS_EPSILON)? With respect to the v.delaunay2 code, it would be safest and most consistent to use cmp() from main.c instead of cmp_v(), that also avoids code duplication. BTW, cmp_v() corresponds to compare() in the old code which caused qsort to fail on Windows and Mac for at least one, maybe three reasons (correct usage in v.voronoi).

Markus M

  Changed 4 years ago by mmetz

Fixed in trunk, devbr6, and relbr64 (r40246 - r40249).

I have removed cmp_v() completely and now use only cmp(), the function that passed tests on Linux, Windows and Mac.

Markus M

Note: See TracTickets for help on using tickets.