Opened 14 years ago
Closed 11 years ago
#1214 closed defect (fixed)
r.li.edgedensity: segmentation fault
Reported by: | neteler | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | 6.4.4 |
Component: | Raster | Version: | svn-releasebranch64 |
Keywords: | r.li.edgedensity | Cc: | |
CPU: | x86-64 | Platform: | All |
Description
Running r.li.edgedensity in 6.4.svn, we obtain a segmentation fault. The input map corine_1_2000_nonull contains a ring-like area surrounded by -1 cells and -1 in the center (if that matters):
(gdb) r map=corine_1_2000_nonull conf=corine_1_2000 output=corine_1_2000_edge Starting program: /usr/local/grass-6.4.1svn/bin/r.li.edgedensity map=corine_1_2000_nonull conf=corine_1_2000 output=corine_1_2000_edge Detaching after fork from child process 30690. Detaching after fork from child process 30691. Detaching after fork from child process 30692. Detaching after fork from child process 30693. Detaching after fork from child process 30694. Detaching after fork from child process 30695. Detaching after fork from child process 30696. Detaching after fork from child process 30697. Detaching after fork from child process 30698. Detaching after fork from child process 30699. Program received signal SIGSEGV, Segmentation fault. 0x00002aaaaaf42655 in next_Area (parsed=1, l=0x604690, g=0x6046b0, m=0x7fffffffadc0) at daemon.c:651 651 memcpy(&tmp, l->head->m, sizeof(msg)); (gdb) bt full #0 0x00002aaaaaf42655 in next_Area (parsed=1, l=0x604690, g=0x6046b0, m=0x7fffffffadc0) at daemon.c:651 tmp = {type = 2, f = {f_a = {aid = 2, x = 0, y = 0, rl = 147, cl = 295}, f_ma = {aid = 2, x = 0, y = 0, rl = 147, cl = 295, mask = "corine_1_2000", '\000' <repeats 191 times>"\324, .\364\252\252*", '\000' <repeats 14 times>, "\001\000\000\000\300\255\377\377\377\177\000\000\000\000\000\000\006", '\000' <repeats 14 times>}, f_d = {aid = 2, pid = 0, res = 3.1193338127189503e-312}, f_e = {aid = 2, pid = 0}, f_t = {pid = 2}}} #1 0x00002aaaaaf40845 in calculateIndex (file=0x604250 "corine_1_2000", f=0x401238 <edgedensity>, parameters=0x0, raster=0x604230 "corine_1_2000_nonull", output=0x604270 "corine_1_2000_edge") at daemon.c:156 pathSetup = "/home/lucadelu/.r.li/history/corine_1_2000\000\000\000\000\000\000/grassdata/eu_laea/slovenia", '\000' <repeats 77 times>, "Xڀh<", '\000' <repeats 11 times>"\250, \331\363\252\252*\000\000h8\300i<\000\000\000'|\200h<", '\000' <repeats 67 times>, "\001", '\000' <repeats 23 times>, "\037a\200h<", '\000' <repeats 70 times>, "j<\000\000\000\000 \000j<\000\000\000\224\037\000j<\000\000\000\224\037\000j<", '\000' <repeats 11 times>, "Xڀh<\000\000\000\000 j<\000\000\000\250\331\363\252\252*\000\000\363"... out = "/home/lucadelu/.r.li/output/corine_1_2000_edge\000\000\033", '\000' <repeats 11 times>"\377, \377\377\377\034\036\260\252\252*", '\000' <repeats 946 times>"\300, \314\377\377\377\177\000\000\266\070\260\252\252*\000\000\266\070\260\252\252*\000\000\a\000\000\000\000\000\000\000?5\204i<\000\000\000\020\000\000\000\060\000\000\000@\311\377\377\377\177\000\000\200\310\377\377\377\177\000\000\266\070\260\252\252*\000\000\a\000\000\000\000\000\000\000?5\204i<\000\000\000\b\000\000\000\000\000\000\000\260\314\377\377\377\177\000\000\020\000\000\000\060\000\000\000\200\311\377\377\377\177\000\000\300\310\377\377\377\177\000\000\035\036\260\252\252*\000\000\b\000\000\000\000\000\000\000?5\204i<\000\000\000\261\070\260\252\252*\000\000'\214\200h<\000\000\000\030\036\260\252\252*\000\000\330v\024\253\252*\000\000\006\000\000\000\000\000\000\000"... parsed = 1 '\001' reportChannelName = 0x604710 "/grassdata/eu_laea/slovenia/.tmp/blade06/30687.0" random_access_name = 0x3c68a1bbc0 "" history = {mapid = '\000' <repeats 79 times>, title = '\000' <repeats 79 times>, mapset = '\000' <repeats 79 times>, creator = '\000' <repeats 79 times>, maptype = '\000' <repeats 79 times>, datsrc_1 = '\000' <repeats 79 times>, datsrc_2 = '\000' <repeats 79 times>, keywrd = '\000' <repeats 79 times>, edlinecnt = 0, edhist = {'\000' <repeats 79 times> <repeats 45 times>, "\000\000\000\000\060\307\377\377\377\177\000\000\266\070\260\252\252*\000\000\266\070\260\252\252*\000\000\a\000\000\000\000\000\000\000?5\204i<", '\000' <repeats 11 times>"\360, \306\377\377\377\177\000\000\000\000\000\000\000\000\000\000p\307\377\377\377\177\000\000\035\036\260\252", "\252*\000\000\035\036\260\252\252*\000\000\b\000\000\000\000\000\000\000?5\204i<\000\000\000\261\070\260\252\252*\000\000\060\307\377\377\377\177\000\000\000\000\000\000\000\000\000\000P\302\377\377\377\177", '\000' <repeats 18 times>"\377, \377\377\377", "\377\377\377\377\240\310\377\377\377\177\000\000\030\036\260\252\252*\000\000\200\307\377\377\377\177\000\000\000\000\000\000\001", '\000' <repeats 27 times>"\377, \377\377\377\377\377\377\377\030\036\260\252\252*\000\000\000\000\000", "\000\000\000\000\070\306\377\377\377\177\000\000 \000\000\000\001", '\000' <repeats 59 times>, " \000\000", '\000' <repeats 15 times>, "s", '\000' <repeats 60 times>, "\022\000\000"}} g = 0x6046b0 receiveChannel = 15 res = 16 child = {{pid = 30690, pipe = 0x604750 "/grassdata/eu_laea/slovenia/.tmp/blade06/30687.1", channel = 5}, {pid = 30691, pipe = 0x604790 "/grassdata/eu_laea/slovenia/.tmp/blade06/30687.2", channel = 6}, { pid = 30692, pipe = 0x6047d0 "/grassdata/eu_laea/slovenia/.tmp/blade06/30687.3", channel = 7}, {pid = 30693, pipe = 0x604810 "/grassdata/eu_laea/slovenia/.tmp/blade06/30687.4", channel = 8}, { pid = 30694, pipe = 0x604850 "/grassdata/eu_laea/slovenia/.tmp/blade06/30687.5", channel = 9}, {pid = 30695, pipe = 0x604890 "/grassdata/eu_laea/slovenia/.tmp/blade06/30687.6", channel = 10}, { pid = 30696, pipe = 0x6048d0 "/grassdata/eu_laea/slovenia/.tmp/blade06/30687.7", channel = 11}, {pid = 30697, pipe = 0x604910 "/grassdata/eu_laea/slovenia/.tmp/blade06/30687.8", channel = 12}, { pid = 30698, pipe = 0x604950 "/grassdata/eu_laea/slovenia/.tmp/blade06/30687.9", channel = 13}, {pid = 30699, pipe = 0x604990 "/grassdata/eu_laea/slovenia/.tmp/blade06/30687.10", channel = 14}} i = 2 mypid = 30687 doneDir = -1 withoutJob = 32767 mv_fd = 1753295058 random_access = 60 l = 0x604690 m = {type = 2, f = {f_a = {aid = 2, x = 0, y = 0, rl = 147, cl = 295}, f_ma = {aid = 2, x = 0, y = 0, rl = 147, cl = 295, mask = "corine_1_2000", '\000' <repeats 242 times>}, f_d = { aid = 2, pid = 0, res = 3.1193338127189503e-312}, f_e = {aid = 2, pid = 0}, f_t = { pid = 2}}} doneJob = {type = 0, f = {f_a = {aid = 0, x = 0, y = 0, rl = 0, cl = 0}, f_ma = {aid = 0, x = 0, y = 0, rl = 0, cl = 0, mask = '\000' <repeats 255 times>}, f_d = {aid = 0, pid = 0, res = 0}, f_e = {aid = 0, pid = 0}, f_t = {pid = 0}}} testpath = "/home/lucadelu/.r.li/history/", '\000' <repeats 4066 times> #2 0x0000000000401236 in main (argc=4, argv=0x7fffffffe3c8) at edgedensity.c:72 raster = 0x2aaaaad14ee0 conf = 0x604090 output = 0x604150 class = 0x604470 module = 0x2aaaaad14f80 par = 0x0 (gdb)
Any ideas what's wrong in raster/r.li/r.li.daemon/daemon.c ?
Markus
Change History (15)
comment:1 by , 11 years ago
comment:2 by , 11 years ago
Keywords: | r.li.edgedensity added |
---|---|
Milestone: | 6.4.1 → 6.4.4 |
follow-up: 4 comment:3 by , 11 years ago
Hi,
you might test in grass7 where the server/client stuff has been removed so r.li.daemon is simpler. If that works and 6.x doesn't, it narrows down where to look.
Also, to help with reproducibility it would be good if we could all test with the same data and commands- see the r.li module help page examples or the r.li/TODO which has a number of usage examples using the standard sample datasets.
Hamish
follow-ups: 5 6 comment:4 by , 11 years ago
Replying to hamish:
Hi,
you might test in grass7 where the server/client stuff has been removed so r.li.daemon is simpler. If that works and 6.x doesn't, it narrows down where to look.
edegedensity and patchdensity fails for me in grass70 and works in grass65 and grass64 svn revision: 58968 location: nc_spm_08 mapset: PERMANENT
r.li.patchdensity map=geology conf=geology_mov output=pdens_out r.li.edgedensity map=geology conf=geology_mov output=edens_out
geology_mov:
SAMPLINGFRAME 0|0|1|1 SAMPLEAREA -1|-1|0.03571428571428571|0.02631578947368421 MOVINGWINDOW
r.li.patchdensity map=lc96ras conf=lc96_overlay output=pdens_out_overlay r.li.edgedensity map=lc96ras conf=lc96_overlay output=edens_out_overlay
lc96_overlay:
SAMPLINGFRAME 0|0|1|1 MASKEDOVERLAYAREA rli_samp_btest_480|205983.75|205812.75|929030.5031|928916.5031 MASKEDOVERLAYAREA rli_samp_btest_874|61716.75|61403.25|728333.5031|727963.0031 RASTERMAP lc96ras@PERMANENT VECTORMAP btest@PERMANENT
btest created from:
v.extract input=boundary_county@PERMANENT output=btest random=3
Also, to help with reproducibility it would be good if we could all test with the same data and commands- see the r.li module help page examples or the r.li/TODO which has a number of usage examples using the standard sample datasets.
I am using geology map from spearfish60 dataset for moving window configuration. For vector overlay option i use the data lc96ras.img mentioned in bug #2024 comment #4.
Hamish
comment:5 by , 11 years ago
Replying to rashadkm:
edegedensity and patchdensity fails for me in grass70 and works in grass65 and grass64 svn revision: 58968 location: nc_spm_08 mapset: PERMANENT
r.li.patchdensity map=geology conf=geology_mov output=pdens_out r.li.edgedensity map=geology conf=geology_mov output=edens_outgeology_mov:
SAMPLINGFRAME 0|0|1|1 SAMPLEAREA -1|-1|0.03571428571428571|0.02631578947368421 MOVINGWINDOW
I'll start with this one, both fail for me in grass7 too.
r.li.edgedensity segfaults in Rast_get_cellhd():
(gdb) list 78 { 79 struct Cell_head hd; 80 int ris = -1; 81 double indice = 0; 82 83 Rast_get_cellhd(ad->raster, "", &hd); 84 85 switch (ad->data_type) { 86 case CELL_TYPE: 87 {
on the fifth pass into the function, although I'm not sure why, 'hd' is defined only a few lines earlier and ad->raster
seems ok and the same setup as r.li.patchdensity. (weirdness generally makes me suspect the stack getting smashed sometime earlier)
I can get r.li.patchdensity to work in grass7 if I comment out the G_free(sup);
at the very end of main.c, but then of course it leaks memory for every row of raster data. I notice in top
that while it's running 5-10% of the CPU time is spent for the system: there's something horribly inefficient going on internally.
I'd also note while visually similar the r.li.patchdensity raster output is numerically different from G6.5's r.li.patchdensity run with the same input files, so beware in trusting the results.
Again, the valgrind errors are probably a good place to start:
==15739== ==15739== Conditional jump or move depends on uninitialised value(s) ==15739== at 0x4E2FA8D: RLI_get_cell_raster_row (worker.c:256) ==15739== by 0x4010B0: patch_density (main.c:101) ==15739== by 0x4E2F763: worker_process (worker.c:179) ==15739== by 0x4E2CF2A: calculateIndex (daemon.c:133) ==15739== by 0x400EB4: main (main.c:54) ==15739== ==15739== Invalid read of size 4 ==15739== at 0x401316: patch_density (main.c:153) ==15739== by 0x4E2F763: worker_process (worker.c:179) ==15739== by 0x4E2CF2A: calculateIndex (daemon.c:133) ==15739== by 0x400EB4: main (main.c:54) ==15739== Address 0x62e7464 is 4 bytes inside a block of size 2,004 free'd ==15739== at 0x4C240FD: free (vg_replace_malloc.c:366) ==15739== by 0x526F1A6: G_free (alloc.c:151) ==15739== by 0x4015E8: patch_density (main.c:205) ==15739== by 0x4E2F763: worker_process (worker.c:179) ==15739== by 0x4E2CF2A: calculateIndex (daemon.c:133) ==15739== by 0x400EB4: main (main.c:54) ==15739== (many more) ... Program received signal SIGABRT, Aborted. (gdb) bt #0 0x00007ffff71af1b5 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #1 0x00007ffff71b1fc0 in *__GI_abort () at abort.c:92 #2 0x00007ffff71e55bb in __libc_message (do_abort=<value optimized out>, fmt=<value optimized out>) at ../sysdeps/unix/sysv/linux/libc_fatal.c:189 #3 0x00007ffff71eee16 in malloc_printerr (action=3, str=0x7ffff72a66f8 "free(): invalid next size (fast)", ptr=<value optimized out>) at malloc.c:6267 #4 0x00007ffff71f3b8c in *__GI___libc_free (mem=<value optimized out>) at malloc.c:3739 #5 0x00007ffff776e1a7 in G_free (buf=0x6167f0) at alloc.c:151 #6 0x00000000004015e9 in patch_density (fd=0, par=0x0, ad=0x602990, result=0x7ffff7dde540) at main.c:205 #7 0x00007ffff7bdc764 in worker_process (ret=0x7fffffffb970, m=0x7fffffffba90) at worker.c:179 #8 0x00007ffff7bd9f2b in calculateIndex (file=0x6026e0 "geology_mov", f=0x400eb7 <patch_density>, parameters=0x0, raster=0x6026c0 "geology_30m", output=0x602700 "pdens_out7") at daemon.c:133 #9 0x0000000000400eb5 in main (argc=4, argv=0x7fffffffdd78) at main.c:54 ... (gdb) frame 6 #6 0x00000000004015e9 in patch_density (fd=0, par=0x0, ad=0x602990, result=0x7ffff7dde540) at main.c:205 205 G_free(sup); (gdb) list 200 if (area != 0) 201 *result = (count / area) * 1000000; 202 else 203 *result = -1; 204 205 G_free(sup); 206 207 return RLI_OK; 208 }
?
Hamish
comment:6 by , 11 years ago
Replying to rashadkm:
Replying to hamish:
Hi,
you might test in grass7 where the server/client stuff has been removed so r.li.daemon is simpler. If that works and 6.x doesn't, it narrows down where to look.
edegedensity and patchdensity fails for me in grass70 and works in grass65 and grass64 svn revision: 58968 location: nc_spm_08 mapset: PERMANENT
r.li.patchdensity map=geology conf=geology_mov output=pdens_out r.li.edgedensity map=geology conf=geology_mov output=edens_out
geology_mov:
SAMPLINGFRAME 0|0|1|1 SAMPLEAREA -1|-1|0.03571428571428571|0.02631578947368421 MOVINGWINDOW
r.li.patchdensity and r.li.edgedensity pass the test with trunk r59072.
follow-up: 9 comment:8 by , 11 years ago
Replying to neteler:
What is our backport policy here?
The r.li modules should be backported, or else the r.li modules should be deactivated in G6 because they are broken.
Do we want to keep the parallelization in G6?
follow-up: 10 comment:9 by , 11 years ago
Replying to mmetz:
Replying to neteler:
What is our backport policy here?
The r.li modules should be backported, or else the r.li modules should be deactivated in G6 because they are broken.
Do we want to keep the parallelization in G6?
Since probably only "brute-force" backporting is possible (i.e. replace the current G6 version with trunk and then merge back the libgis calls), we will not be able to keep the parallelization in G6.
follow-up: 11 comment:10 by , 11 years ago
Replying to neteler:
Replying to mmetz:
Replying to neteler:
What is our backport policy here?
The r.li modules should be backported, or else the r.li modules should be deactivated in G6 because they are broken.
Do we want to keep the parallelization in G6?
Since probably only "brute-force" backporting is possible (i.e. replace the current G6 version with trunk and then merge back the libgis calls), we will not be able to keep the parallelization in G6.
For the actual modules, I agree that a brute-force backport is the easiest solution. The daemon could be updated by just backporting the relevant changes which were not that many. Essentially it is only the fixes regarding the IO cache, the handling of the computational region and the creation of the r.li-specific mask. That would allow for testing the parallelization, preferably with a moving window configuration.
follow-up: 12 comment:11 by , 11 years ago
Platform: | Linux → All |
---|
Replying to mmetz:
Replying to neteler:
Replying to mmetz:
Replying to neteler:
What is our backport policy here?
The r.li modules should be backported, or else the r.li modules should be deactivated in G6 because they are broken.
Do we want to keep the parallelization in G6?
Since probably only "brute-force" backporting is possible (i.e. replace the current G6 version with trunk and then merge back the libgis calls), we will not be able to keep the parallelization in G6.
For the actual modules, I agree that a brute-force backport is the easiest solution.
I have done so in r59304 (G6.5.svn) and in r59305 (G6.4.svn).
Open issues: the modules r.li.dominance, r.li.pielou, r.li.renyi, r.li.shannon, r.li.simpson are not yet producing proper output. Since they appear to be similar among each other, there might be the same issue to be fixed in all. Help needed.
follow-up: 13 comment:12 by , 11 years ago
Replying to neteler:
Open issues: the modules r.li.dominance, r.li.pielou, r.li.renyi, r.li.shannon, r.li.simpson are not yet producing proper output. Since they appear to be similar among each other, there might be the same issue to be fixed in all. Help needed.
I have cleaned up the backport to 6.4 a bit in r59598, then tested with the spearfish example described in the TODO, both moving window and whole region. The results are identical between G64 and G71. Can you test again and provide the example that produces incorrect results, if the results are still incorrect?
comment:13 by , 11 years ago
Replying to mmetz:
I have cleaned up the backport to 6.4 a bit in r59598
I needed to add r59743 in order to avoid odd error messages if the input map is not there. Still I get
GRASS 6.4.4svn (spearfish60):~ > g.gisenv set=DEBUG=1 GRASS 6.4.4svn (spearfish60):~ > r.li.edgedensity map=geology conf=movwindow7 output=edens_out D1/1: r.li.daemon pathSetup: [/home/neteler/.r.li/history/movwindow7] Illegal filename. Character <> not allowed. Illegal filename. Character <> not allowed. Illegal filename. Character <> not allowed. ...
comment:14 by , 11 years ago
After "make distclean" it runs fine. I added two new test scripts:
- raster/r.li/r.li.testing_nc_asc.sh
- raster/r.li/r.li.testing_sp_mov.sh
Whole maplayer versus moving window. Both produce identical results in GRASS 6 and 7, respectively.
I think the bug(s) have been solved and the ticket can be closed.
comment:15 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
related * Tickets: #2024. Markus, Could you try the patch attached in the related ticket