Opened 14 years ago

Last modified 4 years ago

#299 closed defect

geos::geom::CoordinateArraySequenceFactory unit test failing — at Version 25

Reported by: pramsey Owned by: geos-devel@…
Priority: major Milestone: GEOS Fund Me
Component: Unit Tests Version: main
Severity: Unassigned Keywords: mac, macosx, osx
Cc:

Description (last modified by hobu)

---> group: geos::geom::CoordinateArraySequenceFactory, test: test<1>
     problem: assertion failed

The problem goes away when I comment out lines 55 and 56 of the test.

:trac:browser/trunk/tests/unit/geom/CoordinateArraySequenceFactoryTest.cpp#L55

Problem seen on OS/X 10.6, i686-apple-darwin10-gcc-4.2.1

Change History (25)

comment:1 by strk, 14 years ago

Line 55 tests that the CoordinateFactory returned by CoordinateArrayFactory::instance() is really a CoordinateArrayFactory.

Can't understand why this would fail. Maybe some "Apple GCC" crap ?

comment:2 by strk, 14 years ago

Building on the reporter's system with these flags fixed the problem:

CXXFLAGS="-Wnon-virtual-dtor -Woverloaded-virtual -Wall -pedantic"

comment:3 by strk, 14 years ago

CXXFLAGS="-Wnon-virtual-dtor -Woverloaded-virtual" was also enough

comment:4 by pramsey, 14 years ago

Resolution: fixed
Status: newclosed

I've conditionally added these magic flags for Snow Leopard OS/X only at r2744

comment:5 by pramsey, 14 years ago

Milestone: 3.2.03.3.0
Resolution: fixed
Status: closedreopened

comment:6 by pramsey, 14 years ago

Now the magic flags don't seem to fix things anymore... the test has returned to failing.

comment:7 by strk, 14 years ago

The new failure is in another test file, right ? (IsValidOp vs CoordinateSequenceFactory) Can you paste the error if new ?

comment:8 by pramsey, 14 years ago

No, same file same place :(

---> group: geos::geom::CoordinateArraySequenceFactory, test: test<1>
     problem: assertion failed

comment:9 by pramsey, 14 years ago

It gets better (or worse?). I compiled and ran make check on OpenSolaris

geos::util::UniqueCoordinateArrayFilter: .

---> group: geos::linearref::LocationIndexedLine, test: test<11>
     problem: assertion failed

gcc (GCC) 3.4.3 (csl-sol210-3_4-20050802)
SunOS opensolaris 5.11 snv_111b i86pc i386 i86pc Solaris

comment:10 by kyngchaos, 14 years ago

Answering your call on the mailing list, I tried a few different things (GEOS 3.2 rc3):

  • 10.5/gcc 4.0 (default), 32 and 64bit pass all tests.
  • 10.6/gcc 4.0 (CC=gcc-4.0 CXX=g++-4.0), 32 and 64bit pass all tests.

comment:11 by pramsey, 14 years ago

Thanks! Is gcc-4.0 part of the 10.6 XCode distribution, or do I have it because I happened to upgrade? Can I just detect 10.6 and force gcc 4.0 do you think?

comment:12 by kyngchaos, 14 years ago

It is a part of Xcode, but unless everyone sticks with the C API, there is the possibility of the dreaded C++ version incompatibilities. Though I haven't seen problems with Qt (gcc 4.0 built) in Qgis (gcc 4.2 built).

Maybe just recommend setting CC and CXX manually in case it gets fixed?

comment:13 by mloskot, 14 years ago

Owner: mloskot removed
Status: reopenednew

comment:14 by kyngchaos, 14 years ago

More tests - narrowed down the problem. I configured, built and ran the tests with no GCC optimization, with GCC 4.2: success! I've seen optimization bugs before on OSX, when Apple first moved to GCC 4.0 (they're all fixed now). So maybe there's a new optimization bug in 4.2?

Next test: configured and built GEOS with -Os (Apple's default, what I normally use and what I used for the tests in comment 10, though I'm guessing you used the GEOS default -O2?), then hacked the unit test makefile to have no optimization (-O0). That failed. So it looks like the optimization bug is happening in the library code, not the test code :(

Next test, to retain the most optimization possible: configured -Os, then add a target in geom/makefile to override optimization of CoordinateArraySequenceFactory.cpp with -O0. Tested with and without unit optimization. With: failed. Without: success.

So now, a combination of optimization for all library code except CoordinateArraySequenceFactory.cpp, and no optimization in the tests works. Phwew.

Maybe not acceptable to work around compiler bugs? Maybe CoordinateArraySequenceFactory.cpp can be tweaked to not trigger the bug - it worked in one case (I don't remember what source, I didn't make the changes)?

comment:15 by pramsey, 14 years ago

Well, it meets the "better than nothing" test, I suppose... and if we carefully tie it to the combination of OS/X and GCC 4.2.1 (build 5646), then when a new GCC comes out, the behavior will revert to default and we can see if the bug has gone away. Do we know if this bug is on Apple's radar at all? Is there a ticket we can reference?

comment:16 by kyngchaos, 14 years ago

Bug reporting at Apple is a bit closed - one logs in to their developer account (free or paid) to report a bug, and I don't see any way to search bugs reported by others.

As for the solution, not much of one yet, as it requires both the library source (just CoordinateArraySequenceFactory.cpp) and the program using GEOS (ie GDAL, MapServer, ...) to be compiled without optimization. bleh.

I wonder - all these tests are C++, what about tests from C? If a C program optimized doesn't have the problem, maybe it would be acceptable to force CoordinateArraySequenceFactory unoptimized and leave the optimized C++usage broken until GCC 4.2 is fixed. (do GDAL and Qgis use GEOS C or C++?) I see the capi unit tests, but it doesn't appear to cover this exact test.

Yet another test, to fill out the combinations - library unoptimized, test optimized: fail. Looks like both sides definitely need to be unoptimized for it to work.

FYI, I added this target right before the .cpp.o target in source/geom/makefile.in:

CoordinateArraySequenceFactory.lo: CoordinateArraySequenceFactory.cpp
@am__fastdepCXX_TRUE@	$(LTCXXCOMPILE) -MT $@ -MD -MP -MF $(DEPDIR)/$*.Tpo -O0 -c -o $@ $<
@am__fastdepCXX_TRUE@	mv -f $(DEPDIR)/$*.Tpo $(DEPDIR)/$*.Plo
@AMDEP_TRUE@@am__fastdepCXX_FALSE@	source='$<' object='$@' libtool=yes @AMDEPBACKSLASH@
@AMDEP_TRUE@@am__fastdepCXX_FALSE@	DEPDIR=$(DEPDIR) $(CXXDEPMODE) $(depcomp) @AMDEPBACKSLASH@
@am__fastdepCXX_FALSE@	$(LTCXXCOMPILE) -O0 -c -o $@ $<

I disable static libs, but for static I think you'd need a similar target for CoordinateArraySequenceFactory.o. Of course, they'd be wrapped in an ifdef or ifeq.

comment:17 by mloskot, 14 years ago

Owner: set to geos-devel@…

comment:18 by mloskot, 14 years ago

Keywords: mac macosx osx added

Duplicated as #328

comment:19 by strk, 13 years ago

Bad apples... ... is anyone willing to fight for this ?

comment:20 by kyngchaos, 13 years ago

...as I suggested a while back, how about just add a note to compilation instructions to force GCC 4.0 when using Xcode 3.2 (on OS X 10.6)? And it's really easy with the new cmake build in GEOS 3.3.

I'm just too lazy to go any deeper into this (heck, I forgot about this bug until just now when getting ready for the 3.3 release).

comment:21 by strk, 13 years ago

You mentioned on the mailing list that this bug is fixed by using GCC-4.0, right ?

comment:22 by kyngchaos, 13 years ago

As a workaround. I still need to try it with QGIS compiled with GCC 4.2. Everything else (GDAL, MapServer, ...) uses the C API so they shouldn't have any problems with C++ version issues.

comment:23 by strk, 13 years ago

Milestone: 3.3.0GEOS Future

Yeah, hopefully there will be no problem trough the C API.

comment:24 by kyngchaos, 13 years ago

Well, gcc-4.0 seems to be working. BUT.

In OS X 10.7 Lion/Xcode 4.1 Apple dropped gcc-4.0! Probably goes along with dropping the 10.5 SDK (10.5 used gcc 4.0).

Maybe I'll have better luck with the now-default llvm, if I can get it to work.

comment:25 by hobu, 13 years ago

Description: modified (diff)

After spending an afternoon on this, I think the problem is related to the fact that geos::geom::CoordinateSequenceFactory has virtual methods, and the instance is not complete, and the dynamic_cast at source:/trunk/tests/unit/geom/CoordinateArraySequenceFactoryTest.cpp@3496#L53​ is not valid.

From http://www.cplusplus.com/doc/tutorial/typecasting/ in the dynamic_cast section

When dynamic_cast cannot cast a pointer because it is not a complete object of the required class -as in the second conversion in the previous example- it returns a null pointer to indicate the failure.

This test fails for me on OSX using both clang and gcc 4.2.1. That it fails on two recent compilers makes me think the problem is that the test is bad, and that it succeeded before was just luck.

Note: See TracTickets for help on using tickets.