Opened 17 years ago
Closed 16 years ago
#1594 closed defect (fixed)
Memory allocation error during SPATIAL INDEX creation for Shapefile
Reported by: | Mateusz Łoskot | Owned by: | Mateusz Łoskot |
---|---|---|---|
Priority: | normal | Milestone: | 1.4.3 |
Component: | OGR_SF | Version: | unspecified |
Severity: | normal | Keywords: | index shapefile spatial |
Cc: | ftrastour, warmerdam |
Description
Yesterday, a user with the nickname Kredik reported the following problem with creating a spatial index for his shapefile. Here is the story:
I am trying to index a point shapefile. I use:
ogrinfo -sql "CREATE SPATIAL INDEX ON temp" temp.shp
ogrinfo uses more than 1.5 GB of VM and the process failed with a memory allocation error. If I use shptree temp.shp, the index creation is done in less than 15 seconds...
The shptree.exe binary is taken from FWTools 1.1.4 (I can't find the source of this application).
It's a point shapefile with 1,136,000 features (PointZ).
I have tried to specify the depth in the CREATE SPATIAL INDEX statement. I think the default depth used is 19. I have tried 8, 7, ... but the memory footprint is always very large.
Kredik sent me his file and I confirm the problem occurs. I also tested index creation with other big files I have (e.g. from MassGIS) and there everything works. I suppose the problem is with Kredik's file (he is going to confirm whether the file is valid or not).
However, Frank suggests investigating the problem with Kredik's file, so we will know the reason for the problem.
Attachments (3)
Change History (14)
comment:1 by , 17 years ago
Description: | modified (diff) |
---|
comment:2 by , 17 years ago
Cc: | added |
---|---|
Owner: | changed from | to
comment:3 by , 17 years ago
Milestone: | → 1.4.3 |
---|
comment:4 by , 17 years ago
Status: | new → assigned |
---|
Unfortunately, I've lost Kredik's test file. I tried to reproduce this problem using other big files but without luck. Kredik or Frank, do you still have it on your disk? Could you send it to me?
I suppose this problem might be related to 3D geometries, perhaps similar to #1790.
comment:5 by , 16 years ago
Trying to reproduce this issue:
~/dev/gdal/bugs/1594 $ ogrinfo -sql "CREATE SPATIAL INDEX ON ptsz" ptsz.shp
-bash: ogrinfo: command not found
~/dev/gdal/bugs/1594 $ ~/dev/gdal/_svn/trunk/gdal/apps/ogrinfo -sql "CREATE SPATIAL INDEX ON ptsz" ptsz.shp
INFO: Open of `ptsz.shp' using driver `ESRI Shapefile' successful.
ogrinfo(9627) malloc: *** vm_allocate(size=1069056) failed (error code=3)
ogrinfo(9627) malloc: *** error: can't allocate region
ogrinfo(9627) malloc: *** set a breakpoint in szone_error to debug
Bus error
I'm using a Mac Pro, 2 x 2.66 GHz Intel Xeon (4 cores) + 5 GB RAM.
~/dev/gdal/bugs/1594 $ ulimit -a
core file size        (blocks, -c) 0
data seg size         (kbytes, -d) 6144
file size             (blocks, -f) unlimited
max locked memory     (kbytes, -l) unlimited
max memory size       (kbytes, -m) unlimited
open files                    (-n) 256
pipe size          (512 bytes, -p) 1
stack size            (kbytes, -s) 8192
cpu time             (seconds, -t) unlimited
max user processes            (-u) 266
virtual memory        (kbytes, -v) unlimited
comment:6 by , 16 years ago
Analysing the vm_allocate(size=1069056) failed (error code=3) message:
- code 3 means KERN_NO_SPACE
- KERN_NO_SPACE is defined in /usr/include/mach/kern_return.h (Mac OS X 10.4)
#define KERN_NO_SPACE 3
/* The address range specified is already in use, or
 * no address range of the size specified could be found. */
comment:7 by , 16 years ago
The last two comments above apply to tests made using Kredik's dataset:
~/dev/gdal/bugs/1594 $ ls -lh
total 109344
-rw-rw-rw- 1 mloskot mloskot 14M Oct 24 18:27 ptsz.dbf
-rw-rw-rw- 1 mloskot mloskot 32M Oct 24 18:34 ptsz.shp
-rw-rw-rw- 1 mloskot mloskot  7M Oct 24 18:34 ptsz.shx
by , 16 years ago
Attachment: | ogrinfo-winxp-test-1.png added |
---|
First spatial index generation test on Windows (note the amount of VM used).
by , 16 years ago
Attachment: | ogrinfo-winxp-test-2.png added |
---|
Second spatial index test on Windows, after adding defensive code to shptree.c. Again, compare the amount of VM used with the VM in test No. 1.
comment:8 by , 16 years ago
Similarly to the tests I ran under Windows, ogrinfo under Mac OS X fails every time near a VM usage of 1.8 GB.
comment:9 by , 16 years ago
The problem seems to be identified. All tests were made using a fairly big (33 MB) shapefile I got from Kredik:
D:\dev\gdal\bugs\1594>%GDAL%\apps\ogrinfo -so ptsz.shp ptsz
OGR: OGROpen(ptsz.shp/003B57D8) succeeded as ESRI Shapefile.
INFO: Open of `ptsz.shp' using driver `ESRI Shapefile' successful.
OGR: GetLayerCount() = 1
Layer name: ptsz
Geometry: 3D Point
Feature Count: 932870
Extent: (440001.000000, 5652001.000000) - (441999.000000, 5653999.000000)
Layer SRS WKT:
(unknown)
X: Integer (6.0)
Y: Integer (7.0)
Z: Integer (2.0)
The indexing algorithm, unless requested otherwise, calculates the number of tree levels automatically, and for the ptsz.shp file it comes out as 17-18 levels. This number of levels requires a huge number of memory allocations, which causes the memory failure.
There are two possible solutions; the first does not require any changes in the code, the second one does:
- Avoid automatic estimation of tree levels by specifying the depth manually this way:
CREATE SPATIAL INDEX ON mylayer DEPTH 8
- Use a hardcoded max level limit in the shptree algorithm, for example 8, 10 or 12 levels.
We will probably also fix this issue following the second solution, to avoid similar problems in the future.
Users can estimate the max number of tree levels by trying to create the spatial index a few times with different values and observing whether it succeeds or not.
For example, on my Mac OS box, I found that I can generate a spatial index for Kredik's shapefile with level 16:
D:\dev\gdal\bugs\1594>%GDAL%\apps\ogrinfo -sql "CREATE SPATIAL INDEX ON ptsz DEPTH 16" ptsz.shp
OGR: OGROpen(ptsz.shp/003B5B80) succeeded as ESRI Shapefile.
INFO: Open of `ptsz.shp' using driver `ESRI Shapefile' successful.
SHAPE: Creating index file ptsz.qix
OGR: GetLayerCount() = 1
With DEPTH equal to 16, the produced index file is 213 MB in size:
D:\dev\gdal\bugs\1594>ls -lh
total 267M
-rw-rw-rw- 1 mloskot 0  15M 2007-10-24 18:27 ptsz.dbf
-rw-rw-rw- 1 mloskot 0 213M 2007-10-26 07:00 ptsz.qix
-rw-rw-rw- 1 mloskot 0  33M 2007-10-24 18:34 ptsz.shp
-rw-rw-rw- 1 mloskot 0 7.2M 2007-10-24 18:34 ptsz.shx
I hope this makes sense and helps in understanding the problem and how to solve it.
The bug will be closed as fixed after a patch following the second solution is applied.
comment:10 by , 16 years ago
Mateusz,
If maxdepth is not passed in, I think we should use a value of 12 instead of the current, apparently unbounded, depth.
comment:11 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
I applied a fix following the second proposed solution, using the value suggested by Frank:
#define MAX_DEFAULT_TREE_DEPTH 12
Now, if the user does not specify the depth of the spatial index tree, the algorithm makes a simple estimation based on the number of features in the shapefile. If the calculated number of tree levels is higher than MAX_DEFAULT_TREE_DEPTH, the algorithm falls back to using the MAX_DEFAULT_TREE_DEPTH value (and a short message is printed if CPL_DEBUG=ON is set).
by , 16 years ago
Attachment: | shptree-depth-size-chart.png added |
---|
Chart presenting how the DEPTH value influences the size of the .qix file. The chart was generated using Kredik's test dataset (ptsz.shp, 33 MB).
Target to fix this for 1.4.3 ...
I don't know that I still have the data for this bug, so hopefully this won't prove too hard to reproduce.