Opened 16 years ago

Closed 16 years ago

Last modified 15 years ago

#2767 closed defect (wontfix)

Memory problem using mapscript shapefileObj

Reported by: adube Owned by: aboudreault
Priority: normal Milestone: 5.4 release
Component: MapScript-PHP Version: 5.0
Severity: normal Keywords:
Cc: adube@…, dmorissette

Description

After a couple of tests using mapscript 5.0.2 and PHP 5.2.5, I found out that the memory used by a simple script looping through all shapes of a shapefile keeps increasing, even after the shapefile object is freed using free().

The shapefile size 17.9 KB and has 16473 polygon shapes. Here's the results and the script I used for my tests :

### RESULTS ###

before getShape : 60220
after getShape  : 10180208
after free      : 10180208

### SCRIPT ###

<pre>
<?php
dl("php_mapscript.so");
$oShapeFile = ms_newShapefileObj(
    "../../data/bdga/BDGA_HYDRO_S_POLY.shp", -1);

echo "before getShape : ".memory_get_usage()."\n";

for ($i=0; $i<$oShapeFile->numshapes; $i++){
    $oShape = $oShapeFile->getShape($i);
}

echo "after getShape  : ".memory_get_usage()."\n";
$oShapeFile->free();
echo "after free      : ".memory_get_usage()."\n";
?>
</pre>

Change History (8)

comment:1 by dmorissette, 16 years ago

Cc: dmorissette added
Owner: changed from mapserverbugs to aboudreault

Alan, can you please check this?

comment:2 by adube, 16 years ago

I also tried with MS 5.2 and got the same problem :

before getShape : 60628
after getShape  : 10180716
after free      : 10180716

comment:3 by adube, 16 years ago

Oops ! The shapfile size is more like 17.9 MB.

Sorry.

comment:4 by aboudreault, 16 years ago

Resolution: wontfix
Status: newclosed

There is no problem with PHP/MapScript. There is no memory leaks either. All the memory is well freed at the end of the script. The problem is in PHP itself. The garbage collector seems to have some problems in this kind of cases. PHP guys are supposed to make a new GC patch for the 5.3 release.

comment:5 by dmorissette, 16 years ago

Alan, can you please include a link to the PHP bug or mailing list message that talks about this GC patch?

comment:6 by aboudreault, 16 years ago

Here's the response i received from the PHP mailing-list:

I think the engine doesn't really worry about freeing memory until it has to. Or was this only on circular references? I remember there was that whole GSOC project to implement garbage collection, but it never got implemented. Maybe someone who really knows will chime in.

you're right; I remember the discussion over on internals 3rd December last year; they made a new gc patch for 5.3; not sure if it got implemented or not..

here's what andy wrote (there followed a massive discussion)

[quote] Hi all,

Was hoping to send this off earlier but I was travelling for the past week and had very limited email access.

As promised in the past few weeks we have spent a significant amount of time in reviewing the garbage collector work and testing it in our performance lab. Dmitry has been exchanging ideas and patches with David Wang during this time. Suffice to say that as I've mentioned in the past, memory management is an extremely sensitive piece of PHP, which is why it was important for us to review this work in great depth before committing it to the code base.

The updated patch still retains the same algorithm as David's original implementation however the data structures have been changed in order to work faster and use less memory.

The revised patch has the following advantages:

  • It fixes several crash bugs
  • Enhances performance by removing several unnecessary checks
  • The memory overhead was reduced (from 12 to 4 bytes for each

heap-allocated zval)

  • The speed of "clean" PHP code (code that doesn't create cycles) was

improved

  • Additional test cases were created

The end result is a more stable and faster GC patch. That said we have yet to find real-life applications that create significant cycles which would benefit from this patch. In fact as you'll see from the results our tests show an increase in maximum memory use and slower execution (to be fair they are marginal).

We have tested both PHP_5_3 without any patches, the original patch and the new patch.

The following table shows execution time (seconds for N requests) and slowdown.

PHP_5_3

Original GC patch

Current GC patch

slowdown

slowdown

bench

11,550

12,310

6,58%

12,170

5,37%

hello

8,781

8,852

0,81%

8,813

0,36%

xoops

128,500

135,100

5,14%

130,200

1,32%

static

18,540

20,840

12,41%

18,920

2,05%

qdig

29,320

30,270

3,24%

29,610

0,99%

qdig2

13,960

14,100

1,00%

14,090

0,93%

The next table shows memory usage in MB and overhead

PHP_5_3

Original GC patch

Current GC patch

overhead

overhead

hello

13,750

13,945

1,42%

13,765

0,11%

xoops

18,036

18,636

3,33%

18,568

2,95%

static

15,300

16,000

4,58%

15,308

0,05%

qdig

14,820

15,008

1,27%

14,828

0,05%

qdig2

14,824

15,012

1,27%

14,838

0,09%

To summarize the patch lead to approx. 5% slowdown and 3% memory overhead for typical applications (as always, you mileage may vary depending on your system's architecture and OS although my guesstimate is that you will see even worse results in a 64bit environment). We also tested SugarCRM to get another sanity for these results and we got similar results.

I am not quite sure where this leaves us with this patch. On one hand I think now the effort has been made it's a good thing to offer it as part of PHP. The downside is of course some performance degradation and possible instabilities as this patch continues to stabilize (it took about two releases for us to stabilize the Zend Memory Manager even after in-depth testing due to edge cases in allocation patterns and various extensions, etc...).I'd be surprised if our team has found all possible bugs.

Personally I think the decision should be either in or out. Adding this as a compile option is not a good idea as it would create binary compatibility issues and would be a pain for the community. I think my inclination is to commit the patch, not have it #ifdef'ed but always compiled but to have the garbage collection itself turned off by default (mainly for stability reasons. Note: the increased memory footprint and performance loss also exists with the collection itself turned off). We can turn it on when we're in dev for snapshots so that we iron out bugs. That said, as you can see from the results most people and real-life applications will be worse off than today.

Thanks to David & Dmitry for working hard on this (and everyone else who contributed).

The stage is open for ideas/thoughts/suggestions J

Andi quote

comment:7 by aboudreault, 16 years ago

Here's a related bug about the garbage collector bug: http://bugs.php.net/bug.php?id=33595
It seems to be fixed in the php trunk. I'll try to test the PHP 5.3 alpha1 with php/mapscript extension in the next days to see if the patch solve the problem.

comment:8 by aboudreault, 15 years ago

I just committed a very small change in php mapscript that solve the primary problem.(r8445)

  1. without the patch (php config: memory_limit = 16M)
    PHP Memory usage test
    Beginning of script : 67072 bytes
    shapeFile object test
    -------------------------------------------------
    shapeFile object created: 68568
    Fatal error: Allowed memory size of 16777216 bytes exhausted (tried to allocate 16 bytes) in /opt/fgs/www/htdocs/2767/shapeTest.php on line 19
    
  1. with the patch (php config: memory_limit = 16M)
    PHP Memory usage test
    Beginning of script : 67072 bytes
    shapeFile object test
    -------------------------------------------------
    shapeFile object created: 68568
    End of script : 69812 bytes
    

Now, the "internal GC of php" releases properly the C objects when needed. However, there is still a "small" bug. All "normal php objects" are well freed when it's the time, but all "php objects that have other php objects in their class members" are just freed at the end of the script. I tried a few "patches" from other php extensions... but no success. Our php extension will need to be rewrited, a day :P, using all up-to-date PHP API methods (or using swig). At least, the primary bug is fixed.

Note: See TracTickets for help on using tickets.