Opened 10 years ago

Closed 10 years ago

#3514 closed defect (fixed)

Support for large DBF/SHP files

Reported by: jmckenna
Owned by: warmerdam
Priority: normal
Milestone: 6.0 release
Component: MapServer C Library
Version: unspecified
Severity: normal
Keywords:
Cc: sdlime, dmorissette, pramsey, warmerdam, woodbri


One of the requirements of this year's benchmarking exercise is to display a large shapefile (the DBF is 2.9 GB, the SHP is 1.3 GB). The file's name is 'buildings.shp'. The problem is that with native MapServer the file is not displaying. If I change to an OGR connection the file does display. Other server teams can display this same shapefile (in fact, 2 days ago GeoServer had to be modified to display this same large shapefile: 'use longs instead of ints for dbase offsets'). MapServer must also be hitting some kind of limit with this file.

I have created a small mapfile on the Linux and Windows benchmark machines (/benchmarking/mapserver/). We can provide full access to these machines if you wish.

Also here are working getmap requests showing our problem:

Change History (9)

comment:1 Changed 10 years ago by warmerdam

Cc: warmerdam added

The same issue was encountered in shapelib, and the solution was to use a "large file API" for access to the DBF file instead of the traditional '32bit signed' stdio. There may also need to be some care with use of unsigned longs instead of signed longs for offset calculations.
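To illustrate the offset concern, here is a minimal sketch of a DBF record offset calculation. The names `dbf_record_offset` and `exceeds_32bit_signed` and the sizes in the usage note are hypothetical, not taken from shapelib or MapServer; the only point is that the arithmetic has to be carried out in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of a record inside a DBF file.
 * Every operand is widened to 64 bits before the multiplication, so the
 * result stays correct beyond the 2 GB limit of a signed 32-bit offset. */
int64_t dbf_record_offset(uint32_t header_length,
                          uint32_t record_length,
                          uint32_t record_index)
{
    /* A DBF record always holds at least the deletion-flag byte. */
    assert(record_length > 0);
    return (int64_t)header_length
         + (int64_t)record_length * (int64_t)record_index;
}

/* Returns 1 when an offset no longer fits in a signed 32-bit value,
 * which is all fseek() can take on platforms where long is 32 bits. */
int exceeds_32bit_signed(int64_t offset)
{
    return offset > INT32_MAX;
}
```

With an assumed 321-byte header and 300-byte records, record 10,000,000 starts at byte 3,000,000,321, which is already past what a signed 32-bit offset can address.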

I could take on this ticket if Daniel wants it addressed as part of our maintenance arrangement.

comment:2 Changed 10 years ago by pramsey

The "hack fix" approach is available here, if you are in a hurry and not committing back.

comment:3 Changed 10 years ago by warmerdam

I would note that this (Paul's referenced patch) is only likely to help on 64bit systems where long and int are not the same size.

comment:4 Changed 10 years ago by woodbri

Cc: woodbri added

comment:5 Changed 10 years ago by dmorissette

Milestone: 6.0 release
Owner: changed from sdlime to warmerdam

Frank, yes please go ahead. Note however that since this is more than just a trivial fix it should happen only in 6.0 and cannot be backported to 5.6 unless the PSC decides to make an exception.

comment:6 Changed 10 years ago by warmerdam

Status: changed from new to assigned

My usual approach to large file support is to "hook" the IO API. On unix there are a few options, as seen in gdal/port/cpl_vsil_unix_stdio_64.cpp, while on windows I use the native win32 file IO API (CreateFile(), etc). However, this is a fairly heavyweight approach. Digging a bit deeper, it seems there are easier alternatives.
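For readers unfamiliar with the "hook" idea, a rough sketch follows. It is not the GDAL code: the `LargeFileHooks` struct, the `stdio_*` functions, and the feature-test macros are assumptions made for illustration, and only a POSIX backend is shown (a Windows backend would sit behind the same struct).

```c
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64      /* ask for a 64-bit off_t */
#endif
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* make fseeko()/ftello() visible */
#endif

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* The int64_t -> off_t casts below are lossless only if off_t is 64-bit. */
static_assert(sizeof(off_t) >= 8, "off_t must be at least 64 bits wide");

/* Hypothetical hook table: callers only ever see 64-bit offsets. */
typedef struct {
    void   *(*open)(const char *path, const char *mode);
    int     (*seek)(void *handle, int64_t offset, int whence);
    int64_t (*tell)(void *handle);
    size_t  (*read)(void *buffer, size_t size, size_t count, void *handle);
    int     (*close)(void *handle);
} LargeFileHooks;

/* POSIX backend built on stdio plus fseeko()/ftello(). */
static void *stdio_open(const char *path, const char *mode)
{
    return fopen(path, mode);
}

static int stdio_seek(void *handle, int64_t offset, int whence)
{
    return fseeko((FILE *)handle, (off_t)offset, whence);
}

static int64_t stdio_tell(void *handle)
{
    return (int64_t)ftello((FILE *)handle);
}

static size_t stdio_read(void *buffer, size_t size, size_t count, void *handle)
{
    return fread(buffer, size, count, (FILE *)handle);
}

static int stdio_close(void *handle)
{
    return fclose((FILE *)handle);
}

const LargeFileHooks stdio_hooks = {
    stdio_open, stdio_seek, stdio_tell, stdio_read, stdio_close
};
```

Code that reads the DBF would then call `stdio_hooks.seek(handle, offset, SEEK_SET)` instead of `fseek()`, so the platform-specific part stays in one place.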

I have confirmed that on 32bit linux defining _FILE_OFFSET_BITS=64 allows writing large files. But it does not alter the "long" argument to fseek(). I will continue to dig...
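The limitation can be sketched as follows, assuming a POSIX system. fseek() keeps its long argument no matter how _FILE_OFFSET_BITS is set, whereas fseeko() and ftello() take and return off_t, which becomes 64 bits wide with that define. The helper name `seek_absolute` is hypothetical, and this is not necessarily the approach that ended up in trunk.

```c
/* Both macros must be defined before any system header is included. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64      /* 64-bit off_t, also on 32-bit Linux */
#endif
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* make fseeko()/ftello() visible */
#endif

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Compile-time check that the define took effect. */
static_assert(sizeof(off_t) == 8, "off_t is expected to be 64 bits wide");

/* Seek to an absolute position with fseeko(), which takes an off_t
 * rather than the long taken by fseek(). Returns the position reported
 * by ftello(), or -1 on failure. */
off_t seek_absolute(FILE *fp, off_t position)
{
    if (fseeko(fp, position, SEEK_SET) != 0)
        return (off_t)-1;
    return ftello(fp);
}
```

For example, `seek_absolute(fp, (off_t)3 * 1024 * 1024 * 1024)` positions the stream 3 GB into the file, which a 32-bit long passed to fseek() cannot express.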

comment:7 Changed 10 years ago by warmerdam

A preliminary, incompletely tested fix is in trunk (r10462, r10463). I have not confirmed mapserver itself working with a >2GB file. The fix should work on Visual Studio 2005 or later, newish 32bit unix/linux and 64bit unix/linux.

It would be fairly involved to test so I'm going to "assume" it works for now.

comment:8 Changed 10 years ago by jmckenna

I have confirmed this fix with the problem 2.9GB DBF file, on Windows x64. Great work Frank!!!! Thank you so much. I'll ask Mike to rebuild MapServer on Linux to test and I will report back.

comment:9 Changed 10 years ago by warmerdam

Resolution: fixed
Status: changed from assigned to closed