Opened 17 months ago

Closed 3 months ago

#2874 closed defect

Error on MgMappingUtil.StylizeLayers

Reported by: dfanetti
Owned by:
Priority: low
Milestone:
Component: General
Version:
Severity: trivial
Keywords:
Cc:
External ID:

Description

Hi, I'm testing MapGuide Open Source 4.0.0 Beta 1. When I zoom or pan on the map, the map occasionally comes back completely white. According to the log, the request that triggers it is:

/mapguide/mapagent/mapagent.fcgi?USERNAME=Anonymous&OPERATION=GETDYNAMICMAPOVERLAYIMAGE&VERSION=2.1.0&LOCALE=en&CLIENTAGENT=ol.source.ImageMapGuide%20source&CLIP=1&SETDISPLAYDPI=96&SETDISPLAYWIDTH=1920&SETDISPLAYHEIGHT=860&SETVIEWSCALE=9017.889084962837&SETVIEWCENTERX=1688615.15550708&SETVIEWCENTERY=4799071.07608587&MAPNAME=catasto_analisi&SESSION=e4bc5cc2-1661-11ee-8000-0242ac110002_en_MTI3LjAuMC4x0AFC0AFB0AFA&FORMAT=PNG8&BEHAVIOR=2

The strange thing is that if I issue the same request 10 times, the stylized map arrives 8 or 9 times and a white image arrives 1 or 2 times.
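
A quick way to quantify this intermittent behavior is to replay the identical request and group the responses by size and hash; a blank image should land in a different group from the stylized one. The sketch below is only illustrative: `fetch_url` and `tally_responses` are hypothetical helpers, not part of MapGuide, and grouping by hash assumes the server renders identical bytes for identical successful requests.

```python
import hashlib
import urllib.request
from collections import Counter


def fetch_url(url, timeout=30):
    """Download one response body with a plain HTTP GET (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def tally_responses(fetch, attempts=10):
    """Call fetch() repeatedly and count the distinct response bodies.

    Returns a Counter keyed by (byte length, first 12 hex chars of SHA-256).
    """
    tally = Counter()
    for _ in range(attempts):
        body = fetch()
        digest = hashlib.sha256(body).hexdigest()[:12]
        tally[(len(body), digest)] += 1
    return tally
```

Calling `tally_responses(lambda: fetch_url(request_url))`, where `request_url` is the full mapagent URL from the log, would return more than one key if the server produced different images for the same request.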

In Error.log, when the image is blank, the following error is written once for each layer in the map:

<2023-06-29T10:38:47>   -343939520      ol.source.ImageMapGuide source  172.17.0.1      Anonymous
 Error: Failed to stylize layer: territorio
        An exception occurred in FDO component.
        Error occurred in Feature Source (Library://vhdevelopment/siena/data/siena.FeatureSource): PROJ: SQLite3 version is 3.3.17, whereas at least 3.11 should be used (Cause: , Root Cause: PROJ: SQLite3 version is 3.3.17, whereas at least 3.11 should be used)
 StackTrace:
  - MgMappingUtil.StylizeLayers() line 918 file /tmp/work/src/Server/src/Services/Mapping/MappingUtil.cpp
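
To see how often each layer fails and which SQLite version PROJ reports, the log text can be summarized with two regular expressions. This is a minimal sketch that assumes the entry layout shown above; `summarize_error_log` is a hypothetical helper name.

```python
import re
from collections import Counter

# Patterns are based on the log excerpt in this ticket (an assumption about
# the general Error.log layout, not a documented format).
LAYER_RE = re.compile(r"Error: Failed to stylize layer: (.+)")
SQLITE_RE = re.compile(r"SQLite3 version is (\d+(?:\.\d+)*)")


def summarize_error_log(text):
    """Return (failure count per layer, set of SQLite versions PROJ reported)."""
    failed_layers = Counter(name.strip() for name in LAYER_RE.findall(text))
    sqlite_versions = set(SQLITE_RE.findall(text))
    return failed_layers, sqlite_versions
```

Feeding it the contents of Error.log gives a per-layer failure count, which makes it easier to check whether every layer fails together on a blank render.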

This looks like a version problem in some library used by dbxml. However, the fact that the error occurs randomly for the same request leaves me a bit perplexed. Could some concurrency condition cause the error?

I'm using the OSGeo.OGR provider and the database I'm connecting to is PostgreSQL.

Change History (11)

comment:1 by jng, 17 months ago

Which linux environment and which linux build are you using?

comment:2 by dfanetti, 17 months ago

Hi jng,

root@a8f22ff1ec2b:/# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.2 LTS
Release:        22.04
Codename:       jammy
root@a8f22ff1ec2b:/# uname -a
Linux a8f22ff1ec2b 6.2.0-24-generic #24-Ubuntu SMP PREEMPT_DYNAMIC Fri Jun 16 12:03:50 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

comment:3 by crispinatime, 17 months ago

Is this part of the error text correct: "PROJ: SQLite3 version is 3.3.17, whereas at least 3.11 should be used"

SQLite 3.3.17 is from 2007-04-25
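
The message is internally consistent once the versions are compared component by component rather than as decimals: 3.3.17 has minor version 3, which is lower than the required minor version 11. A minimal sketch of that comparison (the helper names are illustrative only):

```python
def parse_version(text):
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(found, minimum):
    """Compare two dotted versions numerically, padding the shorter with zeros."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


print(meets_minimum("3.3.17", "3.11"))  # False
```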

comment:4 by dfanetti, 17 months ago

Yes, very old. I think the installer carries with it some old libsqlite3 library in the Berkeley DB part. I don't understand why this error only happens sometimes even in the same exact request. If there was a library compatibility issue it should always happen.

comment:5 by jng, 17 months ago

I just searched for sqlite in our various MapGuide/FDO sources and found the following versions:

  • FDO / SDF Provider - 3.3.13 with custom modifications
  • FDO / SQLite Provider - 3.7.0.1 with custom modifications
  • MapGuide / berkeleydb - 3.5.9

None of these versions match the 3.3.17 version being reported here. I'm really curious where this SQLite 3.3.17 is coming from, because for the Ubuntu build of MapGuide/FDO we build against the distro-provided gdal library and all of its library dependencies.

If you install the distro-provided gdal binaries and run some PostGIS spatial queries with ogrinfo, do you get the same error?

We really need to know where this SQLite 3.3.17 is coming from. Because I'm very certain it's not our code.
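
One way to check which SQLite version a given shared library reports is to load it and call `sqlite3_libversion()`, which is part of the public SQLite C API. The sketch below does this through Python's `ctypes`. `reported_sqlite_version` is a hypothetical helper, and whether any particular MapGuide/FDO library exports that symbol is an assumption to verify, not something this ticket confirms.

```python
import ctypes
import ctypes.util


def reported_sqlite_version(library_path=None):
    """Return the version string reported by sqlite3_libversion(), or None.

    With no argument, the system's default sqlite3 library is looked up.
    None is returned if the library cannot be loaded or lacks the symbol.
    """
    path = library_path or ctypes.util.find_library("sqlite3")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        func = lib.sqlite3_libversion
    except (OSError, AttributeError):
        return None
    func.restype = ctypes.c_char_p
    func.argtypes = []
    return func().decode("ascii")
```

Comparing the result for the distro-supplied library against the result for each bundled .so file would show which one carries an old SQLite.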

comment:6 by dfanetti, 16 months ago

Hi jng, I recompiled the gdal library (3.4.3) to have ECW support. However, I don't think this led to such an old version of the sqlite library. After checking out the MapGuide source code (trunk) from svn and grepping for version 3.3.17, I get the following results:

duccio@ldp052:~/devel/mapguide_rev10052$ grep -iIR "3\.3\.17"
.svn/pristine/5f/5fcf6c13383e1540150dfb67d989ee1235c3deaa.svn-base:** version 3.3.17.  By combining all the individual C code files into this 
.svn/pristine/5f/5fcf6c13383e1540150dfb67d989ee1235c3deaa.svn-base:#define SQLITE_VERSION         "3.3.17"
.svn/pristine/4b/4b97512ca3fd39246d866e21cc56e98b22a2cb77.svn-base:#define SQLITE_VERSION         "3.3.17"
.svn/pristine/b0/b09db8cda017a16e43a45a663a3ae5326f537916.svn-base: *  3.3.17 int
.svn/pristine/f6/f68e7e3145901750b35cce24085a9767b951b6de.svn-base: * 3.3.17.2 Canonical representation
.svn/pristine/f6/f68e7e3145901750b35cce24085a9767b951b6de.svn-base: * Lexical representation (3.3.17.1). Specifically,
MgDev/Oem/dbxml/xerces-c-src/src/xercesc/validators/datatype/DecimalDatatypeValidator.cpp: *  3.3.17 int
MgDev/Oem/dbxml/xerces-c-src/tests/src/XSValueTest/XSValueTest.cpp: * 3.3.17.2 Canonical representation
MgDev/Oem/dbxml/xerces-c-src/tests/src/XSValueTest/XSValueTest.cpp: * Lexical representation (3.3.17.1). Specifically,
MgDev/Oem/DWFTK/develop/global/src/dwfcore/sqlite/sqlite3.c:** version 3.3.17.  By combining all the individual C code files into this 
MgDev/Oem/DWFTK/develop/global/src/dwfcore/sqlite/sqlite3.c:#define SQLITE_VERSION         "3.3.17"
MgDev/Oem/DWFTK/develop/global/src/dwfcore/sqlite/sqlite3.h:#define SQLITE_VERSION         "3.3.17"

Do you think any of these occurrences could cause the error?
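
A narrower search can filter out the unrelated hits (the Xerces comments and the .svn pristine copies) by looking only for the `SQLITE_VERSION` define in C sources and headers. A minimal sketch, with `find_bundled_sqlite` as a hypothetical helper name:

```python
import os
import re

DEFINE_RE = re.compile(r'#define\s+SQLITE_VERSION\s+"([^"]+)"')


def find_bundled_sqlite(root):
    """Return sorted (path, version) pairs for bundled SQLite copies under root."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Subversion metadata so pristine copies are not reported.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for name in filenames:
            if not name.endswith((".c", ".h")):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="replace") as handle:
                    for line in handle:
                        match = DEFINE_RE.search(line)
                        if match:
                            hits.append((path, match.group(1)))
                            break
            except OSError:
                continue
    return sorted(hits)
```

Run against the checkout above, this should leave only the files that actually embed a SQLite amalgamation or header.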

comment:7 by jng, 16 months ago

Wow! How did I miss the DWF Toolkit in my sqlite search? I think you're on to something here.

We had issues in the past in the SDF and SQLite FDO providers, where somehow the "wires got crossed" and these providers were calling sqlite3_* functions in the DWF Toolkit core library instead of their own private versions of SQLite in their own provider .so files. We fixed these issues by making these providers compile on Linux with -Bsymbolic to force these provider .so files to look within themselves when calling sqlite3_* functions.

But before I look at possible remedies, I'd like to see this issue be more reproducible.

Instead of a PostGIS database, could you use ogr2ogr to export a subset of the data from PostGIS to some flat-file format? If you set up an OGR feature source against this flat file instead, does the problem still occur?

And if it is reproducible with a flat file via an OGR feature source, could I then get a copy of this flat file?

Thanks.

comment:8 by dfanetti, 16 months ago

Sorry jng, unfortunately I personally don't have much time to help you with these tests. I will try to hear from some of my colleagues if they can take some time to help you better identify the problem.

Thank you

comment:9 by jng, 14 months ago

This is more a mental note for myself when I finally give this ticket my full attention:

My current running theory is that because our Linux builds of both MapGuide and FDO do not use GCC symbol visibility (-fvisibility=hidden), symbols of third-party libs we use (such as SQLite) are present in our final .so files. You can see sqlite3_* symbols when doing a symbol dump of some of our .so files.
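
Such a symbol dump can be scripted. The sketch below wraps GNU `nm -D --defined-only` and keeps only the defined sqlite3_* names; the helper names are hypothetical, and the parsing assumes the usual `address type name` column layout of nm output.

```python
import subprocess


def exported_sqlite_symbols(nm_output):
    """Extract defined sqlite3_* symbol names from `nm -D` text output."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        symbol_type, name = fields[-2], fields[-1]
        name = name.split("@", 1)[0]  # drop any symbol-version suffix
        if symbol_type != "U" and name.startswith("sqlite3_"):
            names.add(name)
    return sorted(names)


def dump_exported_sqlite_symbols(library_path):
    """Run nm on a shared library and return its exported sqlite3_* symbols."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", library_path],
        capture_output=True, text=True, check=True,
    )
    return exported_sqlite_symbols(result.stdout)
```

An empty list for a given .so file would mean it no longer exposes SQLite symbols to the rest of the process.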

Because these symbols are present, I theorize that when GDAL calls PROJ for any CS/projection functionality, PROJ's use of SQLite is hitting our .so files for SQLite3 symbols instead of the normal expected SQLite3 .so library and this "crossing of the wires" is the cause of the reported error:

PROJ: SQLite3 version is 3.3.17, whereas at least 3.11 should be used

We had this "crossing of the wires" issue in the past with the SDF provider (looking for sqlite3 functions in the DWF Toolkit .so), but we were able to band-aid over this problem by building the SDF provider with -Bsymbolic, forcing the SDF provider to look within itself for SQLite3 functions.

That kind of band-aid won't work here. We need to make these sqlite3 symbols hidden, and GCC visibility attributes would be the proper way to make sure this happens.

But I don't see adding GCC visibility attributes to MapGuide/FDO as a trivial task. While conceptually similar to __declspec(dllimport) and __declspec(dllexport) on Windows, we have to make sure the proper preprocessor symbols are added to all the MapGuide/FDO projects so that we are properly exporting and importing any given public API (just like on Windows).

And that's a lot of projects we have to update!

Another, less nuclear option (we only want to hide our internal sqlite3 symbols!) may be something like this:

https://stackoverflow.com/questions/61598075/hide-symbols-from-a-3rd-party-a-file-that-is-linked-into-a-so-file

comment:10 by jng, 3 months ago

Running mgserver with LD_DEBUG=bindings,all confirms that the dynamic linker is binding various sqlite3 symbols to libdwfcore.so instead of the distro-supplied libsqlite3.so.
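
The LD_DEBUG output is large, so it helps to reduce it to a count of which library each sqlite3_* symbol was bound to. The sketch below assumes glibc's usual ``binding file A [0] to B [0]: normal symbol `name'`` line format; `sqlite_binding_targets` is a hypothetical helper, and the pattern may need adjusting for other glibc versions.

```python
import re
from collections import Counter

# Assumed glibc LD_DEBUG=bindings line layout (not taken from this ticket).
BINDING_RE = re.compile(
    r"binding file (\S+) \[\d+\] to (\S+) \[\d+\]: \w+ symbol `([^']+)'"
)


def sqlite_binding_targets(ld_debug_output):
    """Count, per providing library, how many sqlite3_* bindings it satisfied."""
    targets = Counter()
    for _caller, provider, symbol in BINDING_RE.findall(ld_debug_output):
        if symbol.startswith("sqlite3_"):
            targets[provider] += 1
    return targets
```

If libdwfcore.so shows up as a key in the result, some caller had its SQLite calls resolved there rather than in libsqlite3.so.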

Current proposed solution being tested:

  • Section off thirdparty sources in DWFCore into its own static library target
  • Link this static library into DWFCore with -Wl,--exclude-libs,ALL

Observations:

  • No sqlite3* symbols in libdwfcore.so (yay!)
  • HOWEVER, OGR FDO provider is now almost unusable as querying feature data with a basic BBOX causes a segfault in distro-supplied libgeos. We need to find a resolution on this in order for the proposed solution to be accepted.

comment:11 by jng, 3 months ago

Status: new → closed

The criss-crossing symbol resolution behavior is fixed in r10094.

However, with this change, the PostGIS support in the OGR provider is still somewhat flaky as I get constant errors in Error.log of the form:

cursor "ogrpglayerreader0xsomememoryaddr" does not exist

So for Beta 2 at least, the recommendation is to use the OSGeo.PostgreSQL provider instead. This will be noted as a known issue in the Beta 2 release notes.
