Opened 7 years ago

Closed 6 years ago

#1736 closed defect (fixed)

wxNVIZ volume display crashes Mac

Reported by: cmbarton
Owned by: grass-dev@…
Priority: critical
Milestone: 6.4.4
Component: wxGUI
Version: svn-releasebranch64
Keywords: wxNVIZ
Cc:
CPU: OSX/Intel
Platform: MacOSX

Description

I am unable to display a volume that I created and have displayed previously. I can add a slice or an isosurface. But as soon as I try to change any setting for either, the entire GUI crashes. I will attach the error message that I sent to Anna previously in case someone else can figure this out. Note that whenever I quit the GUI after using the 3D manager, I also get a similar error message.

Attachments (1)

NVIZ_volume_crash.txt (71.7 KB) - added by cmbarton 7 years ago.
Crash report when wxNVIZ tries to display a volume or even when it quits


Change History (33)

Changed 7 years ago by cmbarton

Attachment: NVIZ_volume_crash.txt added

Crash report when wxNVIZ tries to display a volume or even when it quits

comment:1 Changed 7 years ago by martinl

It seems to be a Mac OS X related bug. Unfortunately, few developers have access to this platform. The question is whether it should be set as a blocker if there is probably nobody who can fix it in the next few days.

comment:2 Changed 7 years ago by cmbarton

This is something that did work within the past year and now is broken. Not sure why.

But it effectively makes all the volume commands useless if you cannot display the result. I was trying to test some alternative files with Helena to debug this, but we found out that r3.in.ascii is also broken now. I don't know if she filed a report or not. I'd copy her here, but apparently there is no longer a way to add someone to this ticket.

Michael

comment:3 in reply to:  2 Changed 7 years ago by martinl

Replying to cmbarton:

This is something that did work within the past year and now is broken. Not sure why.

But it effectively makes all the volume commands useless if you cannot display the result. I was trying to test some alternative files with Helena to debug this, but we found out that r3.in.ascii is also broken now. I don't know if she filed a report or not. I'd copy her here, but apparently there is no longer a way to add someone to this ticket.

Can you provide more detailed info, at least in what sense r3.in.ascii is broken? Note that the ticket is marked as a blocker. We need to move on.

comment:4 Changed 7 years ago by cmbarton

Trying. But am traveling and in meetings. So time is limited. Still, I just sent in some info today. I will try Anna's suggestions too. Lot of people using GRASS on the Mac. Would help if more people could test.

comment:5 Changed 7 years ago by helena

I have just freshly compiled 6.4.3 on my home mac with osx 10.6.8 and the isosurfaces work properly for my small example. The isosurfaces using binary from Michael crash the GUI on the same machine, so it may be an issue related to compiling on 10.7? Also, if the number of layers is large (in my second test case it was 42), nothing gets drawn. So there are issues in terms of handling cases that should provide an error rather than quietly do nothing but with the smaller data set (9 layers) the volumes run OK on mac osx 10.6.

Helena

comment:6 in reply to:  5 Changed 7 years ago by annakrat

Replying to helena:

The isosurfaces using binary from Michael crash the GUI on the same machine, so it may be an issue related to compiling on 10.7? Also, if the number of layers is large (in my second test case it was 42), nothing gets drawn. So there are issues in terms of handling cases that should provide an error rather than quietly do nothing but with the smaller data set (9 layers) the volumes run OK on mac osx 10.6.

Maximum numbers of volumes/slices/isosurfaces are defined here. In case of loading too many volumes there is a warning, in other cases probably not.
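To illustrate the difference between warning and quietly doing nothing, here is a minimal sketch of a fixed-capacity list that reports when its limit is reached. The names `MAX_ITEMS`, `item_list`, and `item_list_add` and the limit of 12 are hypothetical and are not taken from the ogsf sources.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_ITEMS 12  /* hypothetical limit; the real limits are defined in the library */

/* Hypothetical container that only tracks how many items are in use. */
typedef struct {
    int count;
} item_list;

/* Returns the index of the new item, or -1 (after printing a warning)
   when the list is already full, so the caller never fails silently. */
int item_list_add(item_list *l)
{
    assert(l != NULL);
    if (l->count >= MAX_ITEMS) {
        fprintf(stderr, "WARNING: maximum number of items (%d) reached\n", MAX_ITEMS);
        return -1;
    }
    return l->count++;
}
```

A caller can test the return value and show the message in the GUI instead of drawing nothing.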

comment:7 in reply to:  4 Changed 7 years ago by martinl

Replying to cmbarton:

Lot of people using GRASS on the Mac. Would help if more people could test.

OK, so probably some of them could fix it. Neither I nor most of the GRASS devs I know have access to a Mac, so there is little chance we can fix it. Please bear in mind what blocker means.

comment:8 in reply to:  5 Changed 7 years ago by martinl

Priority: blocker → critical

Replying to helena:

I have just freshly compiled 6.4.3 on my home mac with osx 10.6.8 and the isosurfaces work properly for my small example. The isosurfaces using binary from Michael crash the GUI on the same machine, so it may be an issue related to compiling on 10.7? Also, if the number of layers is large (in my second test case it was 42), nothing gets drawn. So there are issues in terms of handling cases that should provide an error rather than quietly do nothing but with the smaller data set (9 layers) the volumes run OK on mac osx 10.6.

Based on this info I took the liberty of decreasing the priority.

comment:9 Changed 7 years ago by cmbarton

The fact that this runs on the Mac OS that is 2 generations old is encouraging. But the fact that it doesn't run on the 2 most recent versions of the Mac OS is a big problem. Hopefully this will turn out to be a Mac compiling issue that William and I can work out. But it still may be something in the wxNVIZ code that is only showing up now in current versions of the OS. So releasing a stable version (i.e., one that can no longer be altered) with an important part of GRASS non-functional is something that I don't think we should do. So we need to at least figure out what is wrong. Anyway, I just updated dependencies (frameworks) and tested William's new build of GRASS 6.4.3. In this case, 3D does not work at all, but maybe the errors will be helpful in troubleshooting the problem.

Message on starting GRASS:

3D view mode: dlopen(/Applications/GRASS-6.4.app/Contents/MacOS/lib/libgrass_ogsf.6.4.3svn.dylib, 10): Library not loaded: /Users/Shared/unix/ffmpeg-snow/lib/libavutil.dylib
  Referenced from: /Applications/GRASS-6.4.app/Contents/MacOS/lib/libgrass_ogsf.6.4.3svn.dylib
  Reason: image not found

Starting 3D mode failed after:

1) opening JR_2008_ALL_dem in the layer manager
2) opening jr_7408MR_2m_t70 in the layer manager
3) setting region to match jr_7408MR_2m_t70 using g.region
4) displaying layers (JR_2008_ALL_dem only shows)
5) switching to 3D mode in the display

Here is the error:

Starting 3D view mode...
Exception in thread Thread-14:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/Applications/GRASS-6.4.app/Contents/MacOS/etc/wxpython/nviz/mapwindow.py", line 64, in run
    self._display = wxnviz.Nviz(self.log, self.progressbar)
  File "/Applications/GRASS-6.4.app/Contents/MacOS/etc/wxpython/nviz/wxnviz.py", line 98, in __init__
    self.Init()
  File "/Applications/GRASS-6.4.app/Contents/MacOS/etc/wxpython/nviz/wxnviz.py", line 124, in Init
    GS_libinit()
NameError: global name 'GS_libinit' is not defined
Exception AttributeError: "'Nviz' object has no attribute 'data'" in <bound method Nviz.__del__ of <nviz.wxnviz.Nviz object at 0x618ec30>> ignored
Traceback (most recent call last):
  File "/Applications/GRASS-6.4.app/Contents/MacOS/etc/wxpython/mapdisp/toolbars.py", line 229, in OnSelectTool
    self.parent.AddNviz()
  File "/Applications/GRASS-6.4.app/Contents/MacOS/etc/wxpython/mapdisp/frame.py", line 294, in AddNviz
    Map = self.Map, tree = self.tree, lmgr = self._layerManager)
  File "/Applications/GRASS-6.4.app/Contents/MacOS/etc/wxpython/nviz/mapwindow.py", line 156, in __init__
    self.decoration['arrow']['size'] = self._getDecorationSize()
  File "/Applications/GRASS-6.4.app/Contents/MacOS/etc/wxpython/nviz/mapwindow.py", line 1172, in _getDecorationSize
    size = self._display.GetLongDim() / 8.
AttributeError: 'NoneType' object has no attribute 'GetLongDim'

comment:10 Changed 7 years ago by cmbarton

I tried setting debug=5 and crashing wxNVIZ with the most recent build from trunk. To get the crash, I used Helena's test DEM and volume. I set the region to match the volume and loaded both DEM and volume into the layer manager. Then I displayed them with wxNVIZ. I switched to the volume window under the data tab and added an isosurface. No crash yet. Then I tried to change the level displayed by the isosurface from the default (minimum value) to 10. This is where the crash happens. Here is the debug output for that action. Hopefully it can identify what is happening

D3/5: GVL_vol_exists
D5/5: gvl_get_vol():
D5/5:     id=81721
D3/5: GVL_isosurf_num_isosurfs
D5/5: gvl_get_vol():
D5/5:     id=81721
D3/5: GVL_vol_exists
D5/5: gvl_get_vol():
D5/5:     id=81721
D3/5: GVL_isosurf_num_isosurfs
D5/5: gvl_get_vol():
D5/5:     id=81721
D3/5: GVL_isosurf_set_att_const() id=81721 isosurf_id=0 att=1 const=10.000000
D5/5: gvl_isosurf_get_isosurf(): id=81721 isosurf=0
D5/5: gvl_get_vol():
D5/5:     id=81721
D5/5: gvl_isosurf_set_att_const(): att=1, const=10.000000
D5/5: gvl_isosurf_set_att_src
D5/5: isosurf_get_att_src
D5/5: gvl_isosurf_set_att_changed
D3/5: GS_clear
D3/5: GS_ready_draw
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_aspect(): left=0, right=797, top=545, bottom=0
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_draw_wire(): id=110658
D5/5: gs_get_surf():
D5/5:   id=110658
D3/5: gsd_wire_surf(): id=110658
D5/5: gs_get_att_src(): id=110658, desc=1
D5/5: gs_get_att_typbuff(): id=110658 desc=1 to_write=0
D5/5: gs_get_att_typbuff(): id=110658 desc=2 to_write=0
D5/5: gs_update_curmask(): id=110658
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_done_draw
D3/5: GS_ready_draw
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_aspect(): left=0, right=797, top=545, bottom=0
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_clear
D5/5: gs_get_surf():
D5/5:   id=110658
D3/5: GS_get_zextents(): id=110658
D3/5: GS_draw_surf(): id=110658
D5/5: gs_get_surf():
D5/5:   id=110658
D5/5: gsd_surf(): id=110658
D5/5: gs_get_att_src(): id=110658, desc=1
D5/5: gs_get_att_typbuff(): id=110658 desc=1 to_write=0
D5/5: gs_get_att_typbuff(): id=110658 desc=2 to_write=0
D5/5: gs_update_curmask(): id=110658
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_ready_draw
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_aspect(): left=0, right=797, top=545, bottom=0
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_done_draw
D3/5: GS_ready_draw
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_aspect(): left=0, right=797, top=545, bottom=0
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_done_draw
D3/5: GS_ready_draw
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_aspect(): left=0, right=797, top=545, bottom=0
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D5/5: gvl_get_vol():
D5/5:     id=81721
D5/5: gvld_vol(): id=81721
D5/5: gvl_slices_calc(): id=81721
D5/5: color buf = [0% yellow]
D5/5: color buf = [20% green]
D5/5: color buf = [40% cyan]
D5/5: color buf = [60% blue]
D5/5: color buf = [80% magenta]
D5/5: color buf = [100% red]
D5/5: gvld_slices
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: gvl_write_char(): reallocate memory for pos : 0 to : 1000000 B
pythonw2.6(12150,0xacbe1a28) malloc: *** error for object 0xdea2000: pointer being realloc'd was not allocated
*** set a breakpoint in malloc_error_break to debug

comment:11 Changed 7 years ago by cmbarton

No problem displaying volumes on OS X 10.8 via the old TclTk NVIZ with GRASS 6.4.2.

comment:12 Changed 7 years ago by cmbarton

After combined work by Anna, Helena, and me, I've found a way to fix this. Remove all references to gvl_align_data in ../lib/ogsf/gvl_calc.c

gvl_align_data is called in 2 places in gvl_calc.c, once for viewing isosurfaces and once for viewing slices. It needs to be removed in both places. Since these are the only places it is called AFAICT, the function can be removed too.

No one (including the developer whose name is listed on the gvl_calc.c source code) seems to know what gvl_align_data is supposed to do. BUT I don't know if this will cause a problem on any other platform, including:

Mac OS X 10.6, Linux, and Windows.

So testing is needed. If this works on all platforms, it needs to be propagated to all GRASS versions now in development.

Michael

comment:13 Changed 7 years ago by annakrat

Just for the record, Markus Metz tried to fix it in r54866.

The crash report is still the same?

comment:14 Changed 7 years ago by cmbarton

That fix did not work. I wrote back to Markus about it.

Michael

comment:15 Changed 7 years ago by cmbarton

I only tested trunk, but that's what he said he changed.

comment:16 in reply to:  15 Changed 7 years ago by mmetz

Replying to cmbarton:

I only tested trunk, but that's what he said he changed.

OK, I think I have understood what gvl_align_data() does and how it is used. Its purpose is to reduce memory allocation to the actual amount of data, that is all. The function itself had a bug, wrong pointer arithmetic (correct in gvl_write_char()).

I have fixed gvl_align_data() in trunk r54877 and disabled in relbr64 and devbr6 (r54878-9). If it is still not working in trunk, we have to disable it in trunk as well.

Markus M
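The pattern described above (grow a buffer in fixed steps while writing, then shrink it to the bytes actually used) can be sketched as follows. This is a minimal illustration, not the code from gvl_calc.c: `byte_buf`, `buf_write_char`, and `buf_align` are hypothetical names, and the 1000000 B step is only taken from the debug output earlier in this ticket. One way pointer mistakes in such code can surface is the "pointer being realloc'd was not allocated" error, which malloc reports when realloc receives an address other than the one the allocator returned.

```c
#include <assert.h>
#include <stdlib.h>

#define CHUNK 1000000  /* growth step in bytes, as seen in the debug log */

/* Hypothetical stand-in for the library's data buffer. */
typedef struct {
    unsigned char *data;  /* start of the allocation (NULL when empty) */
    size_t capacity;      /* bytes currently allocated */
} byte_buf;

/* Store one byte at offset pos, growing the allocation by CHUNK
   whenever pos lies past the end. Returns 0 on success, -1 on failure. */
int buf_write_char(byte_buf *b, size_t pos, unsigned char c)
{
    assert(b != NULL);
    while (pos >= b->capacity) {
        unsigned char *p = realloc(b->data, b->capacity + CHUNK);
        if (p == NULL)
            return -1;
        b->data = p;
        b->capacity += CHUNK;
    }
    b->data[pos] = c;
    return 0;
}

/* Shrink the allocation to the bytes actually used. realloc must be
   given the start of the block (b->data), never an offset into it. */
int buf_align(byte_buf *b, size_t used)
{
    assert(b != NULL && used <= b->capacity);
    if (used == 0) {
        free(b->data);
        b->data = NULL;
        b->capacity = 0;
        return 0;
    }
    unsigned char *p = realloc(b->data, used);
    if (p == NULL)
        return -1;
    b->data = p;
    b->capacity = used;
    return 0;
}
```

Starting from an empty `byte_buf` (`{NULL, 0}`), writing at offset 0 allocates the first chunk, writing at offset 1000000 adds a second, and `buf_align` then trims the buffer to the requested length.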

comment:17 Changed 7 years ago by cmbarton

OK. I'll try to compile again. Thanks.

Michael

comment:18 Changed 7 years ago by cmbarton

That fixed it!!!

Thanks. If you want to backport to GRASS 6.x, I can test.

Michael

comment:19 Changed 7 years ago by cmbarton

Umm. I managed to get it to crash by changing settings several times. Let me test further and see if it is something in particular. It may just be that gvl_align_data does not really do what it is supposed to do, at least on the Mac. The important thing is that you can now view volumes in nviz again.

Michael

comment:20 Changed 7 years ago by cmbarton

I'm afraid I spoke too soon. It is certainly better than it was. But when I reduce resolution to 1 with isosurfaces, it crashes. Otherwise, it seems to work.

Michael

comment:21 in reply to:  20 Changed 7 years ago by mmetz

Replying to cmbarton:

I'm afraid I spoke too soon. It is certainly better than it was. But when I reduce resolution to 1 with isosurfaces, it crashes.

Does it crash with the same error or with a different error? By looking at the ogsf and nviz libraries I got the suspicion that there may be more bugs to be discovered, in particular when settings are changed several times.

Markus M

comment:22 Changed 7 years ago by cmbarton

The Apple error report shows a crash after gvl_write_char, also in gvl_calc.c. Here is the last part of the GRASS Debug output for changing resolution from 3 to 2 (no crash) and 2 to 1 (crash).

CHANGING RESOLUTION FROM 3 TO 2. OK

D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: gvl_write_char(): reallocate memory for pos : 0 to : 1000000 B
D3/5: gvl_align_data(): reallocate memory finally to : 349308 B
D5/5: gvld_isosurf():
D5/5:   start : gvl: jr_7408MR_2m_t70@test3d isosurf : 0

D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D5/5:   intialize OK
D5/5:   end : isosurf : 0 datalength : 349308 B

D3/5: GS_done_draw
D3/5: GS_done_draw

CHANGING RESOLUTION FROM 2 TO 1. CRASH

D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: gvl_write_char(): reallocate memory for pos : 0 to : 1000000 B
D3/5: gvl_write_char(): reallocate memory for pos : 1000000 to : 2000000 B
D3/5: gvl_align_data(): reallocate memory finally to : 1840605 B
D5/5: gvld_isosurf():
D5/5:   start : gvl: jr_7408MR_2m_t70@test3d isosurf : 0

D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D5/5:   intialize OK

Maybe you can do a similar variable initialization to help gvl_write_char that will fix this.

Michael

comment:23 in reply to:  22 Changed 7 years ago by mmetz

Replying to cmbarton:

The Apple error report shows a crash after gvl_write_char, also in gvl_calc.c. Here is the last part of the GRASS Debug output for changing resolution from 3 to 2 (no crash) and 2 to 1 (crash).

CHANGING RESOLUTION FROM 3 TO 2. OK

D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: gvl_write_char(): reallocate memory for pos : 0 to : 1000000 B
D3/5: gvl_align_data(): reallocate memory finally to : 349308 B
D5/5: gvld_isosurf():
D5/5:   start : gvl: jr_7408MR_2m_t70@test3d isosurf : 0

D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D5/5:   intialize OK
D5/5:   end : isosurf : 0 datalength : 349308 B

D3/5: GS_done_draw
D3/5: GS_done_draw

CHANGING RESOLUTION FROM 2 TO 1. CRASH

D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: gvl_write_char(): reallocate memory for pos : 0 to : 1000000 B
D3/5: gvl_write_char(): reallocate memory for pos : 1000000 to : 2000000 B
D3/5: gvl_align_data(): reallocate memory finally to : 1840605 B
D5/5: gvld_isosurf():
D5/5:   start : gvl: jr_7408MR_2m_t70@test3d isosurf : 0

D3/5: GS_get_zrange(): min=-0.09 max=22.33
D3/5: GS_get_zrange(): min=-0.09 max=22.33
D5/5:   intialize OK

There is no error message, everything seems to be ok.

Markus M

comment:24 Changed 7 years ago by cmbarton

There are never any error messages because this is not a normal GRASS module that goes through the GRASS error system. It simply crashes the entire interface. The native error system says that the crash starts at gvl_write_char. In the GRASS debug output, the second case crashes after the "initialize OK" line.

If you compare the two debug outputs above, gvl_write_char allocates enough memory for a resolution of 2 (1000000 > 349308) but runs twice and allocates 2 chunks in the second case.

gvl_align_data correctly allocates the needed memory in both cases now.

I'm hoping that this is enough to suggest a fix.

Michael

comment:25 in reply to:  24 Changed 7 years ago by mmetz

Replying to cmbarton:

There are never any error messages because this is not a normal GRASS module that goes through the GRASS error system. It simply crashes the entire interface. The native error system says that the crash starts at gvl_write_char.

I guess that this is a symptom but not the cause, i.e. the memory corruption happens earlier because to me gvl_write_char() seems to be ok.

In the GRASS debug output, the second case crashes after the "initialize OK" line.

As above, I think this is a symptom, not the cause.

If you compare the two debug outputs above, gvl_write_char allocates enough memory for a resolution of 2 (1000000 > 349308) but runs twice and allocates 2 chunks in the second case.

AFAICT, the behaviour of gvl_write_char is correct because the function gets the correct arguments.

gvl_align_data correctly allocates the needed memory in both cases now.

I'm hoping that this is enough to suggest a fix.

Unfortunately I cannot suggest a fix right now. There are various places in the ogsf and nviz libraries that could cause a crash, but an in-depth analysis of both libraries will take some more time. Having had a quick look at them, I would like to see these libraries rewritten, provided that the resources (expertise, time, and money, in that order) are available.

Markus M

comment:26 Changed 6 years ago by annakrat

Milestone: 6.4.3 → 6.4.4

Update: crash happens when drawing isosurfaces both on Mac and Linux (tested on G7). It does not happen always, I think it depends on data. Testing data from Helena are here. Try e.g. isosurface of value 10. This is the relevant part of Mac report:

0   libgrass_ogsf.7.0.svn.dylib   	0x077aaf09 gvl_read_char + 41 (gvl_calc.c:770)
1   libgrass_ogsf.7.0.svn.dylib   	0x077b0d59 gvld_isosurf + 3577 (gvld.c:329)
2   libgrass_ogsf.7.0.svn.dylib   	0x077afea6 gvld_vol + 134 (gvld.c:54)
3   libgrass_ogsf.7.0.svn.dylib   	0x0777a4ed GVL_draw_vol + 45 (GVL2.c:410)
4   libgrass_nviz.7.0.svn.dylib   	0x0787c2af Nviz_draw_all_vol + 63 (draw.c:186)
5   libgrass_nviz.7.0.svn.dylib   	0x0787c399 Nviz_draw_all + 153 (draw.c:241)

The problem is (as debugger in Qt shows) that variable 'crnt_ev' in gvld.c on line 320 is 12:

pos[i] = edge_pos[crnt_ev];

but 'edge_pos' is an array of size 12 (see line 111), so index 12 is past its end and the read returns wrong data. The value of the variable 'crnt_ev' comes from a big look-up table (mc33_table.h, line 340). So either the size of the array 'edge_pos' is wrong or the table is wrong. On line 323 in gvld.c, 'crnt_ev' is tested against the value 12, so it seems that 12 is valid?

The marching cubes algorithm is unfortunately quite complicated, and there are almost no comments in the code, so I have no idea what all the numbers mean. I tried setting the size of the array 'edge_pos' to 13, and isosurfaces no longer crash (tested on Linux only) and look normal. But I have no idea if they are drawn correctly. Any ideas?
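The out-of-bounds read can be illustrated with a small bounds-checked lookup. This is only a sketch under stated assumptions: `edge_lookup` is a hypothetical helper and does not exist in gvld.c, and whether a 13th slot is the right fix for the marching cubes tables is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy table[idx] into *out and return 1 when idx
   lies inside the table; return 0 and leave *out untouched otherwise.
   With a 12-element table the valid indices are 0..11, so 12 is rejected. */
int edge_lookup(const float *table, size_t table_len, int idx, float *out)
{
    assert(table != NULL && out != NULL);
    if (idx < 0 || (size_t)idx >= table_len)
        return 0;
    *out = table[idx];
    return 1;
}
```

With a 12-element array, `edge_lookup(t, 12, 12, &v)` returns 0 instead of reading past the end; with a 13-element array the same call succeeds.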

comment:27 Changed 6 years ago by cmbarton

This is progress because 1) the error can be replicated on more than the Mac and 2) it is narrowing in on the place that is actually causing the crash. Thanks for keeping up the effort here. This kind of display could be very important to a new research project that was just funded by NSF.

Michael

comment:28 Changed 6 years ago by annakrat

Since nobody has a better idea and this is blocking me, I committed the change in r57620 in GRASS 7.

comment:29 Changed 6 years ago by cmbarton

Thanks. Hope to test this and Moritz's patch and let you both know.

Michael

comment:30 Changed 6 years ago by cmbarton

I just tested this using Helena's test data for the NC_spm_08 demo data location. I've added several isosurfaces and a slice, and set resolution down to 1 and cannot seem to crash it. I thought this fixed or greatly improved the problem.

BUT I cannot get my own data to display at all. Helena, can you check to see if your test data set is displaying correctly. Not sure why my data do not display (they did a year or two back).

Michael

comment:31 Changed 6 years ago by cmbarton

I was wrong about my data. They DO show up for the first time in months. This is GREAT. I'm still not certain that it is showing the z-axis correctly. So Helena it is still a good idea to check your test data for this. But this is a huge improvement it seems. Maybe it is "fixed" as well as it can be.

Michael

comment:32 in reply to:  31 Changed 6 years ago by neteler

Keywords: wxNVIZ added
Resolution: fixed
Status: new → closed

Replying to cmbarton:

Maybe it is "fixed" as well as it can be.

Closing. Reopen if needed.
