Ticket #299 (reopened defect)

Opened 3 years ago

Last modified 3 months ago

geos::geom::CoordinateArraySequenceFactory unit test failing on Bad Apple

Reported by: pramsey Owned by: geos-devel@…
Priority: major Milestone: GEOS Future
Component: Unit Tests Version: svn-trunk
Severity: Unassigned Keywords: mac,macosx,osx
Cc:

Description (last modified by strk) (diff)

---> group: geos::geom::CoordinateArraySequenceFactory, test: test<1>
     problem: assertion failed

The problem goes away when I comment out lines 55 and 56 of the test.

source:trunk/tests/unit/geom/CoordinateArraySequenceFactoryTest.cpp#L55

Problem seen on OS/X 10.6, i686-apple-darwin10-gcc-4.2.1

Attachments

299-patch.patch Download (2.1 KB) - added by hobu 3 months ago.

Change History

  Changed 3 years ago by strk

Line 55 tests that the CoordinateFactory? returned by CoordinateArrayFactory::instance() is really a CoordinateArrayFactory?.

Can't understand why this would fail. Maybe some "Apple GCC" crap ?

  Changed 3 years ago by strk

Building on the reporter's system with these flags fixed the problem:

CXXFLAGS="-Wnon-virtual-dtor -Woverloaded-virtual -Wall -pedantic"

  Changed 3 years ago by strk

CXXFLAGS="-Wnon-virtual-dtor -Woverloaded-virtual" was also enough

  Changed 2 years ago by pramsey

  • status changed from new to closed
  • resolution set to fixed

I've conditionally added these magic flags for Snow Leopard OS/X only at r2744

  Changed 2 years ago by pramsey

  • status changed from closed to reopened
  • resolution fixed deleted
  • milestone changed from 3.2.0 to 3.3.0

  Changed 2 years ago by pramsey

Now the magic flags don't seem to fix things anymore... the test has returned to failing.

  Changed 2 years ago by strk

The new failure is in another test file, right ? (IsValidOp? vs CoordinateSequenceFactory?) Can you paste the error if new ?

  Changed 2 years ago by pramsey

No, same file same place :(

---> group: geos::geom::CoordinateArraySequenceFactory, test: test<1>
     problem: assertion failed

  Changed 2 years ago by pramsey

It gets better (or worse?). I compiled and ran make check on OpenSolaris?

geos::util::UniqueCoordinateArrayFilter: .

---> group: geos::linearref::LocationIndexedLine, test: test<11>
     problem: assertion failed

gcc (GCC) 3.4.3 (csl-sol210-3_4-20050802)
SunOS opensolaris 5.11 snv_111b i86pc i386 i86pc Solaris

  Changed 2 years ago by kyngchaos

Answering your call on the mailing list, I tried a few different things (GEOS 3.2 rc3):

- OSX 10.6/gcc 4.2 (default), 32 and 64bit both fail at CoordinateArraySequenceFactory?.

- 10.5/gcc 4.0 (default), 32 and 64bit pass all tests.

- 10.6/gcc 4.0 (CC=gcc-4.0 CXX=g++-4.0), 32 and 64bit pass all tests.

  Changed 2 years ago by pramsey

Thanks! Is gcc-4.0 part of the 10.6 XCode distribution, or do I have it because I happened to upgrade? Can I just detect 10.6 and force gcc 4.0 do you think?

  Changed 2 years ago by kyngchaos

It is a part of Xcode, but unless everyone sticks with the C API, there is the possibility of the dreaded C++ version incompatibilities. Though I haven't seen problems with Qt (gcc 4.0 built) in Qgis (gcc 4.2 built).

Maybe just recommend setting CC and CXX manually in case it gets fixed?

  Changed 2 years ago by mloskot

  • owner mloskot deleted
  • status changed from reopened to new

  Changed 2 years ago by kyngchaos

More tests - narrowed down the problem. I configured, built and ran the tests with no GCC optimization, with GCC 4.2: success! I've seen optimization bugs before on OSX, when Apple first moved to GCC 4.0 (they're all fixed now). So maybe there's a new optimization bug in 4.2?

Next test: configured and built GEOS with -Os (Apple's default, what I normally use and what I used for the tests in comment 10, though I'm guessing you used the GEOS default -O2?), then hacked the unit test makefile to have no optimization (-O0). That failed. So it looks like the optimization bug is happening in the library code, not the test code :(

Next test, to retain the most optimization possible: configured -Os, then add a target in geom/makefile to override optimization of CoordinateArraySequenceFactory?.cpp with -O0. Tested with and without unit optimization. With: failed. Without: success.

So now, a combination of optimization for all library code except CoordinateArraySequenceFactory?.cpp, and no optimization in the tests works. Phwew.

Maybe not acceptable to work around compiler bugs? Maybe CoordinateArraySequenceFactory?.cpp can be tweaked to not trigger the bug - it worked in one case (I don't remember what source, I didn't make the changes)?

  Changed 2 years ago by pramsey

Well, it meets the "better than nothing" test, I suppose... and if we carefully tie it to the combination of OS/X and GCC 4.2.1 (build 5646), then when a new GCC comes out, the behavior will revert to default and we can see if the bug has gone away. Do we know if this bug is on Apple's radar at all? Is there a ticket we can reference?

  Changed 2 years ago by kyngchaos

Bug reporting at Apple is a bit closed - one logs in to their developer account (free or paid) to report a bug, and I don't see any way to search bugs reported by others.

As for the solution, not much of one yet, as it requires both the library source (just CoordinateArraySequenceFactory?.cpp) and the program using GEOS (ie GDAL, MapServer?, ...) to be compiled without optimization. bleh.

I wonder - all these tests are C++, what about tests from C? If a C program optimized doesn't have the problem, maybe it would be acceptable to force CoordinateArraySequenceFactory? unoptimized and leave the optimized C++usage broken until GCC 4.2 is fixed. (do GDAL and Qgis use GEOS C or C++?) I see the capi unit tests, but it doesn't appear to cover this exact test.

Yet another test, to fill out the combinations - library unoptimized, test optimized: fail. Looks like both sides definitely need to be unoptimized for it to work.

FYI, I added this target right before the .cpp.o target in source/geom/makefile.in:

CoordinateArraySequenceFactory.lo: CoordinateArraySequenceFactory.cpp
@am__fastdepCXX_TRUE@	$(LTCXXCOMPILE) -MT $@ -MD -MP -MF $(DEPDIR)/$*.Tpo -O0 -c -o $@ $<
@am__fastdepCXX_TRUE@	mv -f $(DEPDIR)/$*.Tpo $(DEPDIR)/$*.Plo
@AMDEP_TRUE@@am__fastdepCXX_FALSE@	source='$<' object='$@' libtool=yes @AMDEPBACKSLASH@
@AMDEP_TRUE@@am__fastdepCXX_FALSE@	DEPDIR=$(DEPDIR) $(CXXDEPMODE) $(depcomp) @AMDEPBACKSLASH@
@am__fastdepCXX_FALSE@	$(LTCXXCOMPILE) -O0 -c -o $@ $<

I disable static libs, but for static I think you'd need a similar target for CoordinateArraySequenceFactory?.o. Of course, they'd be wrapped in an ifdef or ifeq.

  Changed 2 years ago by mloskot

  • owner set to geos-devel@…

  Changed 2 years ago by mloskot

  • keywords mac,macosx,osx added

Duplicated as #328

  Changed 13 months ago by strk

Bad apples... ... is anyone willing to fight for this ?

  Changed 12 months ago by kyngchaos

...as I suggested a while back, how about just add a note to compilation instructions to force GCC 4.0 when using Xcode 3.2 (on OS X 10.6)? And it's really easy with the new cmake build in GEOS 3.3.

I'm just too lazy to go any deeper into this (heck, I forgot about this bug until just now when getting ready for the 3.3 release).

  Changed 12 months ago by strk

You mentioned on the mailing list that this bug is fixed by using GCC-4.0, right ?

  Changed 12 months ago by kyngchaos

As a workaround. I still need to try it with QGIS compiled with GCC 4.2. Everything else (GDAL, MapServer?, ...) uses the C API so they shouldn't have any problems with C++ version issues.

  Changed 12 months ago by strk

  • milestone changed from 3.3.0 to GEOS Future

Yeah, hopefully there will be no problem trough the C API.

  Changed 10 months ago by kyngchaos

Well, gcc-4.0 seems to be working. BUT.

In OS X 10.7 Lion/Xcode? 4.1 Apple dropped gcc-4.0! Probably goes along with dropping the 10.5 SDK (10.5 used gcc 4.0).

Maybe I'll have better luck with the now-default llvm, if I can get it to work.

  Changed 8 months ago by hobu

  • description modified (diff)

After spending an afternoon on this, I think the problem is related to the fact that geos::geom::CoordinateSequenceFactory has virtual methods, and the instance is not complete, and the dynamic_cast at source:/trunk/tests/unit/geom/CoordinateArraySequenceFactoryTest.cpp@3496#L53 is not valid.

From http://www.cplusplus.com/doc/tutorial/typecasting/ in the dynamic_cast section

When dynamic_cast cannot cast a pointer because it is not a complete object of the required class -as in the second conversion in the previous example- it returns a null pointer to indicate the failure.

This test fails for me on OSX using both clang and gcc 4.2.1. That it fails on two recent compilers makes me think the problem is that the test is bad, and that it succeeded before was just luck.

  Changed 8 months ago by strk

  • description modified (diff)

  Changed 8 months ago by strk

The code in the typecastin example is not the same as ours. In our case CoordinateArraySequenceFactory::instance() should be returning a complete instance of a CoordinateArraySequence?

  Changed 4 months ago by strk

  • status changed from new to closed
  • resolution set to wontfix
  • summary changed from geos::geom::CoordinateArraySequenceFactory unit test failing to geos::geom::CoordinateArraySequenceFactory unit test failing on Bad Apple

After going through all comments about this I really believe it's not GEOS business to work around Apple bugs.

  Changed 4 months ago by pramsey

  • status changed from closed to reopened
  • resolution wontfix deleted

You don't have to block release, but you don't get to close it out. It's marked Future and when through some combination of Apple improvements or our own research we manage to close it, we will. As it stands, I can't regress geos, it's a bug :)

  Changed 3 months ago by 2sin18

  • status changed from reopened to closed
  • resolution set to fixed

It is a GEOS bug, I fix it using following steps: 1 add a dealloc method implementation for geos::operation::releate::RelateComputer? 2 add missing includes in geos::geom::util::SineStarFactory? implementation:

#include <geos/geom/CoordinateSequence.h> #include <geos/geom/LinearRing.h>

It works good with GEOS 3.3.0 for Mac OSX 10.7.

I do not know how to commit a patch, somebody could make one, thanks.

BTW GEOS should do more code review.

follow-up: ↓ 33   Changed 3 months ago by pramsey

  • status changed from closed to reopened
  • resolution fixed deleted
  • Please don't close tickets if the resolution isn't actually committed into the code base
  • Please include patches with your fixes (see the Attach File button above, and 'svn diff')
  • If you find "GEOS", please let him know we're looking forward to his code reviews.

follow-up: ↓ 34   Changed 3 months ago by hobu

Additionally, how did you track this down? I spent quite a while chasing this and mostly chased my tail...

Do you have a fix for #479 too, or did you compile this with clang?

in reply to: ↑ 31   Changed 3 months ago by 2sin18

Replying to pramsey:

* Please don't close tickets if the resolution isn't actually committed into the code base * Please include patches with your fixes (see the Attach File button above, and 'svn diff') * If you find "GEOS", please let him know we're looking forward to his code reviews.

My apologies for the misuse of "close ticket" and poor English. I am very much new to trac. Validating and patching with your help is appreciated.

in reply to: ↑ 32   Changed 3 months ago by 2sin18

Replying to hobu:

Additionally, how did you track this down? I spent quite a while chasing this and mostly chased my tail... Do you have a fix for #479 too, or did you compile this with clang?

I used llvm/clang with the help of xCode. xCode suggested the problem is that some classes are incomplete types at the point where auto_ptr destructor is instantiated. So I checked the source code, and found one class missing non-trivial destructor and another class missing necessary includes for forward declared types.

Sorry, my English is too poor to explain more.

Changed 3 months ago by hobu

  Changed 3 months ago by hobu

Can you please supply a patch? The patch I have attached that attempts to do what you suggest does not seem to fix the issue.

  Changed 3 months ago by pramsey

Yes, a patch would make us very happy. 'svn diff' your check-out, please!

Note: See TracTickets for help on using tickets.