Ticket #3638 (reopened enhancement)

Opened 3 years ago

Last modified 2 months ago

[PATCH] OGR Driver for Norwegian SOSI (.sos) files.

Reported by: relet Owned by: relet
Priority: normal Milestone:
Component: default Version: svn-trunk
Severity: normal Keywords:
Cc: warmerdam

Description

 http://www.statkart.no/nor/SOSI/SOSI_in_English/

Just to leave a note that we are working on it. I'll attach some code when there is a stable version.

Attachments

sositest.png Download (52.3 KB) - added by relet 3 years ago.
Norwegian administrative boundaries visualized by mapserver, read from SOSI file using OGR connection.
sosi-topo2.png Download (117.4 KB) - added by relet 3 years ago.
Another snapshot of Norway rendered entirely from SOSI files using mapserver OGR connection.
GNUmakefile Download (243 bytes) - added by relet 3 years ago.
makefile.vc Download (210 bytes) - added by relet 3 years ago.
trunk.diff Download (2.3 KB) - added by relet 3 years ago.
20BygnAnlegg.SOS Download (201.7 KB) - added by relet 3 years ago.
example sosi file - contains geometries
20Tekst5000biterISO8859-10.sos Download (1.2 KB) - added by relet 3 years ago.
example sosi file - contains text elements
ogr_sosi.h Download (4.6 KB) - added by relet 3 years ago.
ogrsosidatasource.cpp Download (18.4 KB) - added by relet 3 years ago.
ogrsosilayer.cpp Download (10.2 KB) - added by relet 3 years ago.
valgrind.log Download (130.8 KB) - added by relet 3 years ago.
ogrsosidriver.cpp Download (3.8 KB) - added by relet 3 years ago.
fyba_melding.cpp Download (2.2 KB) - added by relet 3 years ago.
sosi.patch.20487 Download (31.8 KB) - added by relet 3 years ago.
Patch based on r20487- some bugfixes (proper UTF encoding), cleanup, and a code stub for datafile creation.
sosi.patch.22134 Download (7.5 KB) - added by relet 2 years ago.
patch based on r22134 - support for BUEP (arc) element using interpolation to linestrings

Change History

Changed 3 years ago by relet

Norwegian administrative boundaries visualized by mapserver, read from SOSI file using OGR connection.

Changed 3 years ago by relet

Another snapshot of Norway rendered entirely from SOSI files using mapserver OGR connection.

Changed 3 years ago by relet

Changed 3 years ago by relet

Changed 3 years ago by relet

Changed 3 years ago by relet

example sosi file - contains geometries

Changed 3 years ago by relet

example sosi file - contains text elements

  Changed 3 years ago by relet

  • status changed from new to closed
  • resolution set to fixed

Here we go. I have attached the driver code as single attachments so that they remain readable.

You will want to store most files in ./ogr/ogrsf_frmts/sosi/ with the following exceptions

  • trunk.diff contains some modifications to trunk, to enable the SOSI driver in general and in the GNUmakefiles.
  • the .sos files are example files which display some geometries and text elements in Finnmark. (The geometries are mostly disconnected, this is correct)
  • the .png files are older examples of driver progress, you can ignore these.

Known issues:

  • You will need the FYBA library (and the helper libraries GM and UT) available as Windows VC++ binary LIB and DLL from  http://www.statkart.no/nor/SOSI/Program/ (last section). We hope to release that for other architectures or as source code in the future.
  • SOSI files are often enough encoded in ISO8859-10, as are the examples. I added the #define for that, so that the CPL transcoding generates a sensible error message when we try to read the file. The current fallback to ISO8859-1 works, but will mess with your Saami.
  • To actually include SOSI support on windows you will have to edit the makefile.vc files in ogrsf_frmts to include:
    • "sosi" in DIRLIST
    • "sosi\*.obj" and the three FYBA .lib files in the "lib" list
  • and ogrsf_frmts/generic to include:
    • "-DSOSI_ENABLED" in BASEFORMATS (or make it optional)

  Changed 3 years ago by rouault

  • status changed from closed to reopened
  • resolution fixed deleted
  • summary changed from OGR Driver for Norwegian SOSI (.sos) files. to [PATCH] OGR Driver for Norwegian SOSI (.sos) files.

Reopening. If closed, I'm afraid it will be forgotten forever !

I've looked at the new files and the first impression I got is that the indentation should be fixed ! Please conform with http://trac.osgeo.org/gdal/wiki/rfc8_devguide, especially for :

# Use 4 character indentation levels.
# Use spaces instead of hard tab characters in source code.

I've noticed an extensive use of printf() in the code. This should be avoided. Use CPLError() for errors that must be reported to the user (by default on standard error stream, or possibly caught by a user callback), or CPLDebug() for debug info that appears only if selectively turned on with environment variable CPL_DEBUG=ON.

I also think there are a few memory leaks. If you test on Linux, I'd advise you using valgrind. For example :

  • in OGRSOSILayer::GetNextFeature?(), you construct a poGeom object and assign it to the feature with SetGeometry?(). I think you should rather use SetGeometryDirectly?() so that the feature becomes the owner of the geometry and frees it when necessary (I can be wrong on that point. This is just by quick code inspection).
  • In ogrosidatasource.cpp, the return value of CPLRecode() should be freed with CPLFree() after having been passed to the OGRFieldDefn constructor.
  • poFeatureDefn probably also leak. A poFeatureDefn->Release(); should be probably added in the destructor of the layer
  • There might be other leaks...

At line 195 of ogrsosilayer.cpp, you use CPLMalloc(), but you deallocate with free(). For consistency, you should rather use CPLFree() (you can catch those kinds of errors by uncommenting //#define DEBUG_VSIMALLOC in port/cpl_vsisimple.cpp. This will crash horribly when you don't pair things correctly ;-).

Questions :

  • In OGRSOSIDataSource::Open(), I see that LC_Init() is called. Is it safe to call it several times in a process lifetime ?
  • Is there a LC_Terminate() or something like that ? If LC_Init() initializes memory, that would be cool to have a terminate function called in the driver destructor to free what needs to be freed.
  • What is the runtime cost of LC_Init() and HO_TestSOSI() ? If they're fast, it's fine. If it takes some time, something should be done to have a fast detection of SOSI file, or in fact a fast detection of *non* SOSI files. The aim is that the opening of non-SOSI files that can be opened by another OGR driver isn't delayed too much.

I've not been able to test the driver, as my current environment is Linux.

Personnal note that doesn't engage the GDAL project: I'm wondering if it wouldn't be more convenient to drop the content of your FYBA lib in the driver itself, mainly to avoid one more GDAL external dependency and make users life easier, unless the FYBA source code is huge, isn't cross-platform or cannot be released under the terms of the X/MIT licence... or you don't want that for any reason (if it's used by third-party projects, ...). Anyway, that isn't a requirement and it's too soon to call for that because I haven't seen what it looks like ;-)

(Note: Starting with next week, I will be mostly away for the 3 following weeks)

follow-up: ↓ 4   Changed 3 years ago by relet

Thanks a lot for the very quick review! I'm going to address the issues today and update the files above. And I'll address your questions when I've verified some my answers. :)

Just a quick stupid question - I have used valgrind/alleyoop against ogr2ogr and ogrinfo through the development process at the most verbose level I could find. I debugged the memory management until, at this point, the only message it repeats, is:

==10466== HEAP SUMMARY:
==10466==     in use at exit: 0 bytes in 0 blocks
==10466==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==10466== 
==10466== All heap blocks were freed -- no leaks are possible

That made me confident, but some of these mistakes you mention should have been reported, shouldn't they? Have I been missing something?

in reply to: ↑ 3   Changed 3 years ago by relet

Never mind. I must have dropped the option to trace child processes somewhere. The lack of leaks should've made me more suspicious. :}

[x] Indentation fixed
[x] spaces used
[x] printfs replaced by CPLError and CPLDebug as appropriate

[x] Geometry set directly
[x] All output of CPLRecode CPLFree'd again
[x] poFeatureDefn released
[x] some other leaks fixed. ;)

I'm still not entirely confident that I caught them all, and would appreciate any help with that. I'll attach a log - there are still some leaks reported from processes which I couldn't identify as my driver or the fyba lib - these two would have debug symbols. I could also provide you with a copy of libfyba for linux i386 if that helps. I'm just not able to release it or the source code for now.

[x] DEBUG_VSIMALLOC enabled and not seen any horrible crashes.
[x] LC_Init moved to the constructor of OGRSOSIDriver, since it is indeed not safe to call it multiple times in the currently released version. 
[x] LC_Close added to the destructor.

The file checks are reasonably fast - I'll try to profile them to have some numbers. ogrinfo on an unknown file still takes 0.143s to terminate.

The option to release FYBA with a MIT licence exists, and if that happens, I will definitely remove the dependency here. (It is using another Hungarian naming convention though, which is certain to confuse people. I speak from experience.)

Going to replace the files now.

Changed 3 years ago by relet

Changed 3 years ago by relet

Changed 3 years ago by relet

Changed 3 years ago by relet

Changed 3 years ago by relet

follow-ups: ↓ 6 ↓ 7   Changed 3 years ago by rouault

The version of fyba_melding.cpp you've attached is incorrect. It looks like the result of a svn merge that conflicted. Not sure which implementation is wanted: the empty one or the one with Norwegian error messages and the Windows Beep() ;-) You could probably map those error callbacks to CPLError() or CPLDebug() I imagine.

Anyway I've tried on a Windows VM to compile and run it. I managed to get it compile and link, although that's painful currently : a better approach would be to add commented variables in nmake.opt and use the variable names in the makefiles to get the include directories, get optionnal compilation. You can take example on the other drivers that depend on third-party DLL.

So I managed to link, but ogrinfo fails to start with an "the application couldn't be initalized correctly (0xc01500002).". I downloaded the Visual C++ 2005 Redistribuable DLL as I saw that fyba_dll.dll has a dependency to MSVCR80.DLL but that didn't help. And curiously that fact that fyba_dll.dll is in the path or not doesn't change the error message.

Changed 3 years ago by relet

in reply to: ↑ 5   Changed 3 years ago by relet

Can you try running dependency walker to see what is missing?  http://www.dependencywalker.com/

in reply to: ↑ 5   Changed 3 years ago by relet

saw that fyba_dll.dll has a dependency to MSVCR80.DLL but that didn't help.

I am still trying to reproduce the problem - did you link against the .dll or against the .lib?

follow-up: ↓ 9   Changed 3 years ago by rouault

I've finally integrated the driver in r20487, but I'm still unable to execute ogrinfo. Dependecy Walker mentions "The Side-by-Side configuration information for FYBA_DLL.dll contains errors". No way to solve this on my side. I presume there's an issue with the way you compiled the fyba stuff. You should try to deploy it on a machine with no Visual Studio installed. (I compile GDAL with Visual Studio 2008 Express for the record)

in reply to: ↑ 8   Changed 3 years ago by tamas

In response to a question I've already posted some comments to the -dev list with regards to this driver, see  http://permalink.gmane.org/gmane.comp.gis.gdal.devel/25135

The Side-by-Side configuration information error reported above may be due to an incorrect compilation of the dependent library, probably the manifest file (describing the CRT dependency) is not embedded as a resource in the dll. In order to integrate the driver in the daily builds of  http://vbkto.dyndns.org/sdk/ the proper compilation for the supported compilers and architectures would also be required. It could probably be done easily if the corresponding sources was available somewhere do download.

Changed 3 years ago by relet

Patch based on r20487- some bugfixes (proper UTF encoding), cleanup, and a code stub for datafile creation.

  Changed 3 years ago by relet

You are right, and we are working on it. These mills grind slowly, but we should be able to release the source code eventually and include it in this driver.

  Changed 3 years ago by rouault

sosi.patch.20487 commited in r21065

Changed 2 years ago by relet

patch based on r22134 - support for BUEP (arc) element using interpolation to linestrings

  Changed 2 years ago by rouault

Reviewing sosi.patch.2213, I think that naming sqr() a method is a bit risky. There's a risk of symbol conflict with such a short and generic name. Perhaps making it a static function, or a static method of the class, or a macro... would avoit it.

I also want to mention that there is a OGRGeometryFactory::approximateArcAngles() method, but I do not know if it is directly usable for your needs.

Any news on the open-sourcing of the fyba lib ?

  Changed 21 months ago by rouault

I just saw your post on IRC. Is the open-sourcing of the fyba lib made such as it is intended to be just included in the directory where lies the OGR SOSI driver (and thus with the licence GDAL/OGR use), or is it a third-party lib that should be compiled separately and then refered by ./configure / nmake.opt ? If the former, we'll need to review it of course (you can attach it in this ticket). If the latter, not necessarily ;-)

  Changed 21 months ago by relet

Our release will be a standalone library, as it is used by other software packages. But I made sure that the licence is MIT/X style as is ogr's, and would like to include it in this driver directly. I wanted to see if you have any technical comments or objections after seeing the code. ;)

If that's ok with you I would like to send you a package directly, since the initial release is coordinated with PR and might take a moment longer. :P

  Changed 21 months ago by rouault

ok, you can send the package to me directly (provided you don't ask me to sign a NDA ;-). I can't promise a review in depth, mostly because I don't know anything about the SOSI format itself. What would be intersting to know is how you intend to manage the coexistence of the standalone library and its import into GDAL tree. The integration solution you describe could be similar to what is done with the MITAB or AVC drivers, that have both a upstream project, with its own revision control and bug tracker. When a bug related to one of this driver is filed in GDAL Trac, the procedure we follow is then to make sure that it is first fixed in the upstream project and then merged back in GDAL source tree to avoid forking from the upstream project.

Anyway, even if you cannot share your code with everybody, it might be good that you send an email to gdal-dev to share your plans and gets a feedback from the community. I woud like that the GDAL PSC knows on how you intend to manage the coexistence of the standalone library and its import into GDAL tree, and can express its opinion.

  Changed 21 months ago by relet

Thanks, good point. We had in mind to maintain the library in our public svn. I currently expect more contributions coming from ongoing internal development than external contributors but that might just as well change. The integration upstream sounds good. I'm a bit reluctant to set up an own bug tracker, just because it's a lot of trouble to maintain and ward from spam for a single small project. If it was for me, we would just use github or google code. I am going to discuss that.

I will send an email to gdal-dev then.

  Changed 21 months ago by rouault

I've seen your message on IRC but we're never connected at the same time apparently. You'll find my email address in the COMMITERS file at the root of GDAL source tree.

  Changed 21 months ago by rouault

Thomas (assuming it is your first name from the bickbucket invitation),

I've looked for 10 minutes at the FYBA code, but honestly don't expect me to spend more... The size of the code, the comments, functions and variable names in Norwegian, plus my lack ok knowledge of the formats are show stoppers for me to review as a after-day work (not sure I'd accept it as a day job either ;-)).

Well, from this quick visit, I have just a few comments :

  • The X/MIT licence should be included in all source files
  • the indentation is somewhat inconsistant. I guess there is a mix of spaces and tabs. Tabs should be banned as a way of indenting code
  • I see there are global variables. Doesn't sound good if there are multiple OGR SOSI datasources open at the same time...
  • I have the feeling the code is mainly designed for Windows (stdafx files, etc). How does it fly with other environments such as Linux ? I see the use of the "long" type: no issue with 64 bit Linux ?

I'm now convinced that including this code in GDAL would require discussion and agreement of the PSC. It is obvious that no-one apart you or your team would be able to maintain it, so I think that it would require you to be a GDAL commiter to do the integration and maintenance work. Perhaps in a first step, it could be done in a sandbox.

  Changed 21 months ago by rouault

  • cc warmerdam added

  Changed 21 months ago by relet

Thanks for the assessment. I guess it's good that you now have a feel for what the library looks like. ;)

Quick comments: - including the license and beautifying the code is no problem of course - The library was usually written for a wide spectrum of systems (hence the "UT" portability lib), then maintained for many years on Windows only. I got it to run fine on Linux 32/64bit, but lack the understanding of porting issues to call it a proper port. - I have been running fyba/sosi as the backend of a mapserver instance during testing, but not done further testing with multiple instances. I put that on my todo list. - We are running the code on 64bit Linux (usually for conversion jobs from SOSI->PostGIS). No issues with that so far, even for massive datasets, but I wouldn't totally exclude it.

  Changed 19 months ago by relet

To those following the thread.

The source code to fyba is available at  https://bitbucket.org/relet/fyba Some binary builds of fyba and ogr/sosi are available in the "releases" branch, but I'll be happy to include more or link to other builds if you manage to compile on platforms I do not have access to.

  Changed 2 months ago by relet

Update because of increased interest:

The latest versions can be found at our github:  http://github.com/kartverket/gdal  http://github.com/kartverket/fyba

Would love to get some help with correct the automake setup in gdal, so that I can issue a pull request to the official branch now and then.

Note: See TracTickets for help on using tickets.