Opened 19 years ago

Closed 18 years ago

Last modified 18 years ago

#1322 closed defect (fixed)

saveWebImage filenames not unique

Reported by: brage@… Owned by: dmorissette
Priority: high Milestone: 4.8 release
Component: MapScript-PHP Version: 4.4
Severity: major Keywords: VERIFIED
Cc: bartvde@…, sylvain.pasche@…, bill@…

Description

I have had several instances of map images beeing overwritten because of the
filenames  generated by saveWebImage are not unique when running PHP/Mapscript
under mod_php.

The time() and getpid() parts of the filename generated by saveWebImage refer to
the parent apache process, but the counter part of the filename is only unique
within each child process.

Attachments (2)

diff (3.9 KB ) - added by brage@… 18 years ago.
let savewebimage use msTmpFile
bug1322.tar.gz (667 bytes ) - added by dmorissette 18 years ago.
Scripts to reproduce the issue (.tar.gz)

Download all attachments as: .zip

Change History (41)

comment:1 by dmorissette, 19 years ago

blocked: 67
Cc: mapserver-bugs@… added
Milestone: 4.6 release
Owner: changed from mapserverbugs to dmorissette@…
Um, nice catch, that would explain the old bug 67 (I now feel ashamed for never
taking time to research bug 67 any deeper).

I'll need to get setup to test with the PHP module and look into this ASAP.

comment:2 by bartvde@…, 19 years ago

Cc: bartvde@… added
We have been experiencing this as well using Neapoljs, which uses Mapscript
4.2.5, when using PHP as a module on Linux.

Adding myself to the cc.

comment:3 by brage@…, 19 years ago

Maybe this could be fixed before releasing 4.4.3/4.6? The fix should be trivial
- move the getpid()-call from phpms_init_globals() to php3_ms_img_saveWebImage().

comment:4 by brage@…, 19 years ago

| The fix should be trivial
| move the getpid()-call from phpms_init_globals() to php3_ms_img_saveWebImage().

I was a bit fast there. Obviously just moving the getpid() won't be enough for
all cases since the pids might be reused.



comment:5 by dmorissette, 19 years ago

Cc: bill@… added

comment:6 by dmorissette, 19 years ago

Status: newassigned
Bill Binko wrote:

> Ok: quickly looking at that: is there any reason we're not using the system
tempnam?
>

I don't remember exactly, but I think the reason was that tempnam() returns a
filename that doesn't exist yet but it doesn't create it in an atomic way, so by
the time you create that file, nothing guarantees that another process/thread
didn't call tempnam() as well and get the same response. Instead of dealing with
the fact that tempnam() + fopen(..., "w") is not atomic, it seemed simpler to
use a unique Id for the filename based on processId + timestamp + counter which
is guaranteed to be unique... well, there seems to be a bug with the way the
counter is initialized at the moment, but once that's fixed we should be in
business. I hope so anyway, it's been a while since I looked at that code.

> Either that or appending a count if the file exists.
>

Looping and appending a count is also an option but is not atomic either and
needs the same kind of handling of possible concurrent processes trying to
create files with the same name.

I'll have another look now, I just need to find the right place to initialize
the counter in the PHP init functions. 

comment:7 by dmorissette, 19 years ago

The lifecycle of an extension and the way to deal with globals and initalization
is all explained here... sounds like I need to go back and do some reading:
http://www.zend.com/php/internals/extension-writing1.php

comment:8 by brage@…, 19 years ago

Any reason why msTmpFile(), or the same approach as msTmpFile() cannot be used?

However, the concatenation of timestamp + pid + counter isn't _guaranteed_ to be
unique, but collisions should be very rare.

comment:9 by dmorissette, 19 years ago

> Any reason why msTmpFile(), or the same approach as msTmpFile() cannot be used?

The PHP module tries to do exactly the same thing as msTmpFile(), except that it
tries to do things "the right way" for a multithread PHP extension.

> However, the concatenation of timestamp + pid + counter isn't _guaranteed_ to
> be unique, but collisions should be very rare.

I guess that depends on whether each instance of mapscript in memory gets a
different pid or not. In your opening comment to this bug, you wrote that the
pid "refer to the parent apache process", I assume you meant the main apache
process as opposed to the pid of the child in which we're running, which implies
that multiple mapscript instances shared the same pid, right? In this case the
chances of collisions with msTmpFile() are not negligible and we need proper
handling of the counter (shared between all threads) to avoid collisions. That's
the kind of stuff I would like to verify to fix this.

comment:10 by brage@…, 19 years ago

> The PHP module tries to do exactly the same thing as msTmpFile(), except that
> it tries to do things "the right way" for a multithread PHP extension.

Now I'm a bit confused. Wouldn't thread safety for tmpid/tmpcounter defeat their
purpose?

>> However, the concatenation of timestamp + pid + counter isn't _guaranteed_ to
>> be unique, but collisions should be very rare.

>I guess that depends on whether each instance of mapscript in memory gets a
>different pid or not. In your opening comment to this bug, you wrote that the
>pid "refer to the parent apache process", I assume you meant the main apache
>process as opposed to the pid of the child in which we're running, which 
>implies that multiple mapscript instances shared the same pid, right? In this
>case the chances of collisions with msTmpFile() are not negligible and we need 

Yes that is what I meant in the bug report.

> proper handling of the counter (shared between all threads) to avoid 
> collisions. That's the kind of stuff I would like to verify to fix this.

I might be wrong, but aren't static variables like in msTmpFile shared by all
threads within each process? 

The reason  timestamp + pid + counter isn't guaranteed to be unique even with
correct handling of timestamp/pid/counter is that the concatenation of
pid+counter may refer to several different pid/counter-combination (is 1111 pid
11 #11 or pid 111 #1?). This type of collisions will of course be very rare.

comment:11 by bill@…, 19 years ago

Daniel (et al),

I really think this should be done in a more standard way.  A bit of googling
has lead to MANY security exploits related to how tempfiles are created.

It looks like the current recommendation for temporary file creation on Unix
(and its friends) is to use mkstemp() (see
http://www.opengroup.org/onlinepubs/009695399/functions/mkstemp.html )

For Windows, it looks like GetTempFileName is the choice ( see
http://tinyurl.com/ag9ps )

Here's an example of an exploit based on flaws like this:
http://www.outpost9.com/exploits/mailx.bug

Perhaps its time to redo msTmpFile() to use a more standard (and secure) method?

Bill


comment:12 by bill@…, 19 years ago

By the way, I should have mentioned that the reason those two methods are
recommended is that they both create an actual file before they return.  That
allows the atomic file creation system call to be used.


comment:13 by dmorissette, 19 years ago

I like the idea of using mkstemp() and GetTempFileName() if they both are
atomic, but I'm worried of breaking the build a few days before the final
release. I know this sounds like a trivial change, but it's this kind of last
minute trivial changes that can somtimes turn bad and cause delayed or buggy
releases.

I'd propose that we do this right after the 4.6.0 release to be included in
4.6.1, and that we take the time to test this well on all platforms.

comment:14 by bill@…, 19 years ago

I'm fine with that: I only got involved with this bug because you didn't want to
invite PHP testers until it was fixed.  If we're postponing this to 4.6.1, can
we ask for PHP testers now?

Bill

comment:15 by dmorissette, 19 years ago

Sure, go ahead, but you may want to warn them that saveWebImage() is not safe
yet so their script should avoid using it.

comment:16 by bill@…, 19 years ago

I was just thinking about this, and shouldn't we change msTempFile to use the
atomic temp file and then use that from Mapscript?  That would change the scope
of the bug to be mapserver wide, and we should include Steve, but wouldn't that
be the best approach?

Write one, secure, atomic, thread safe implementaiotn of msTempFile and then use
it from everywhere?

Just a thought

comment:17 by dmorissette, 19 years ago

Cc: steve.lime@… added
Yes, that's what I was thinking of doing, but forgot to mention it. I've added
Steve to the CC to bring him in the loop.

comment:18 by bill@…, 19 years ago

Daniel,

Since we are reducing this problem to fixing the PHP Mapscript filename creation
(for 4.6) here is what I've found.  It is a simple problem (and solution).

The Time and PID portion of the tmpid are set in phpms_init_globals(), and the
counter is set to 0.  XXX_init_globals is called only once for each process
(top-level process) that loads module XXX.  That is, when the .so gets loaded by
the process.  

See http://www.zend.com/php/internals/extension-writing1.php

Here's the relavent quote:

"The key lies in when the two functions are called. php_hello_init_globals() is
only called when a new process or thread is started; however, each process can
serve more than one request..."

So what is happening (on multi-threaded platforms) is that when the parent
process is created, the tmpId is being set to the time&pid of the parent and
never changes.  The counter is set to 0.

The problem is that the variables defines inside ZEND_BEGIN_MODULE_GLOBALS are
thread-safe and (HTTP) request-specific.  That means that the increment that is
used is only seen within a single request.  Two concurrent requests will not see
each others increments.


The problem is fairly simple, the solution may not be.  I think the simplest
solution would be to generate tempId during each request (saveImage()) using
something other than PID.  Perhaps the php APIs have a getRequestId() or
getThreadID() that could be used.

Bill

comment:19 by brage@…, 19 years ago

> ------- Additional Comments From bill@binko.net  2005-08-02 17:07 -------

> The Time and PID portion of the tmpid are set in phpms_init_globals(),
> and the counter is set to 0.  XXX_init_globals is called only once for
> each process (top-level process) that loads module XXX.  That is, when
> the .so gets loaded by the process.
>
> So what is happening (on multi-threaded platforms) is that when the parent
> process is created, the tmpId is being set to the time&pid of the parent and
> never changes.  The counter is set to 0.

This is a problem with all apache environments, not just multi-threaded.
In the case of multi-process and preloaded modules, the listening process load
the module and run XXX_init_globals before it forks of the serving processes.

The current code probably works in the multi-process case if the
mapserver module is loaded with dl() instead of preloaded (like it is in
my case).

> The problem is that the variables defines inside
> ZEND_BEGIN_MODULE_GLOBALS are thread-safe and (HTTP) request-specific.
> That means that the increment that is used is only seen within a single
> request. Two concurrent requests will not see each others increments.

Making tmpId and tmpCount thread-safe with ZEND_BEGIN_MODULE_GLOBALS makes
the problem worse. But making them normal globals would only help the
multi-threaded case, not the multi-process one. For the multi-process case
the initialization of tmpId must be postponed to after forking.

> The problem is fairly simple, the solution may not be.  I think the
> simplest solution would be to generate tempId during each request
> (saveImage()) using something other than PID.  Perhaps the php APIs
> have a getRequestId() or getThreadID() that could be used.

Making tmpId and tmpCount normal process globals, and initializing tmpId
with getpid() and time() upon first use should in principle work for both
for multi-threaded and multi-process apache.

However, there might be a small problem with Linux for the threaded case. Linux
assigns a new pid to each thread, and getpid() returns this temporary pid
instead of the main pid. This could lead to collisions if the pid is reused
by different processes within the same second, but that should be extremly
rare if not impossible.


comment:20 by brage@…, 19 years ago

Since 4.8 is drawing closer..

Again, why wouldn't msTmpFile() work?
- tmpCount is a global, so it is shared between all threads within each process
- tmpCount is initialized upon first use, so it should always refer
  to the correct process

Or am I totally mistaken?

comment:21 by dmorissette, 19 years ago

Milestone: 4.6 release4.8 release
I haven't checked but I want to try that alternative, perhaps simply with a
mutex around the counter.

comment:22 by bill@…, 18 years ago

Dan,

Wrapping all of the contents of msForceTmpFileBase() and the one line in
msTmpFile() that calls tmpCount++ with a single mutex should solve the PHP race
condition.  It doesn't solve the security issues, but those can be post 4.8, or
never dealt with (although a separate bug might be nice).

Bill

comment:23 by bartvde@…, 18 years ago

Guys, what is the status of this? I am running into this again with 4.8.3 :-(
Any chance of a 4.8.X fix?

comment:24 by bartvde@…, 18 years ago

I tried Bill's suggestion (as per comment #21) but had no luck, but maybe I got
things wrong. What I did is:

1)
in mapthread.h I added a constant:
#define TLOCK_TMPFILE 11

2)
in maputil.c I changed msForceTmpFileBase in order to use the lock.

/************************************************************************/
/*                         msForceTmpFileBase()                         */
/************************************************************************/

static int tmpCount = 0;
static char *ForcedTmpBase = NULL;

void msForceTmpFileBase( const char *new_base )
{
  msAcquireLock( TLOCK_TMPFILE );

/* -------------------------------------------------------------------- */
/*      Clear previous setting, if any.                                 */
/* -------------------------------------------------------------------- */
    if( ForcedTmpBase != NULL )
    {
        free( ForcedTmpBase );
        ForcedTmpBase = NULL;
    }

    tmpCount = -1;

    if( new_base == NULL )
    {
        msReleaseLock( TLOCK_TMPFILE );
        return;
    }

/* -------------------------------------------------------------------- */
/*      Record new base.                                                */
/* -------------------------------------------------------------------- */
    ForcedTmpBase = strdup( new_base );
    tmpCount = 0;
    msReleaseLock( TLOCK_TMPFILE );
}

3) in maputil.c, msTmpFile I used the same lock around the line which increases
the counter:

    msAcquireLock( TLOCK_TMPFILE );
    sprintf(tmpFname, "%s%d.%s", tmpId, tmpCount++, ext);
    msReleaseLock( TLOCK_TMPFILE );

Am I doing something wrong?

comment:25 by brage@…, 18 years ago

Did you update php_mapscript.c as well? It doesn't currently use msTmpfile().

comment:26 by bartvde@…, 18 years ago

No I didn't. Do you have an example/code fragment of what should be changed?
Thanks in advance.

by brage@…, 18 years ago

Attachment: diff added

let savewebimage use msTmpFile

comment:27 by bartvde@…, 18 years ago

I have tried your patch next to the changes I described in comment #23 but now I
get the same tmpfilename all the time:

http://145.50.148.45/ms_tmp/1150202079133901.png

it's always the same filename.

comment:28 by bartvde@…, 18 years ago

I was too quick, it's not always the same, but still the same problem persists
of temp file names not being unique.

comment:29 by sylvain.pasche@…, 18 years ago

Cc: sylvain.pasche@… added

comment:30 by brage@…, 18 years ago

Bart,
Can you confirm that you still got non-unique filenames with the patch applied? 

I have been testing the patch, and it does return filenames created by
msTmpFile, that is time()+getpid()+(within process counter). It seems strange
that the msTmpFile() code should suffer the same problems.

(To check if the correct msTmpFile() is used, check that the time()-part of the
filename (the first ten digits) is incremented every second.)

I have also been running it for months without problems.


comment:31 by dmorissette, 18 years ago

Guys, I would really like to fix this bug, but before committing any change I
would like to reproduce the issue, and so far I am unable to reproduce it.

My test environment is Apache 1.3.33 with PHP 4.3.11 compiled as a DSO
(libexec/libphp4.so), with php_mapscript.so from CVS HEAD. I have two PHP
scripts that each draw a different map, call save WebImage() and then use
readfile() to return the contents of the temp file to the client. I use
http_load to generate multiple parallel requests on the server with both URLs
with the -checksum option to validate the results and no errors are detected.

I tried both loading php_mapscript.so at runtime with dl() or preloading in
php.ini and in both cases no errors.

Can you please tell me which Apache/PHP combinations you use to get this error?

comment:32 by bartvde@…, 18 years ago

Hi Daniel,

we use Apache 2.0.41 and PHP 4.3.11. We cannot use dl in our environment (dl()
is not supported in multithreaded Web servers.). Phpinfo in our case says for
the Server API Apache 2.0 Handler (from my bare head).

Bart

comment:33 by brage@…, 18 years ago

To reproduce the bug use a non-threaded apache, (1.3 or 2.0 using the Prefork
MPM), and load php_mapscript.so in php.ini.

Notice that the stem (time+pid) of the filename is the same for all generated
filenames. A collision occurs when the counter of each process matches. To
increase the chance of collision, set the MaxRequestsPerChild to some ridiculous
low value (2?) in httpd.conf. 

by dmorissette, 18 years ago

Attachment: bug1322.tar.gz added

Scripts to reproduce the issue (.tar.gz)

comment:34 by dmorissette, 18 years ago

Another note: I verified that with my config if you use dl() then I get no
collisions. I get collisions only if I preload php_mapscript.so in php.ini.

comment:35 by dmorissette, 18 years ago

Resolution: fixed
Status: assignedclosed
Using msTmpFile() fixes the issue for my test environment.

I committed the fix in CVS HEAD (4.9/4.10) and will mark this bug fixed.

Bart, can you please test again with the CVS version and reopen with more
details if you still notice the problem?

Notes:
- The fix uses msTmpFile() in php_mapscript.so as per Brage's patch
- I modified msTmpFile() to put a lock around the line that increments tmpCount
- I modified msTmpFile() to include '_' as separators in the tmp filename
between the pid, time and counter to prevent the possibility of collisions that
was raised earlier in this bug
- I modified msTmpFile() to use hex encoding for the pid, time and counter
values as suggested by John N. on the -dev list since that makes the temporary
filename more compact.
- I noticed that Brage's patch uses msBuildPath() to build the URL returned by
saveWebImage(). This works now but I'm worried that this may break later if
msBuildPath() is ever modified to do file system checks on the path that it
builds. Anyway, we'll keep it like this for now.

comment:36 by dmorissette, 18 years ago

Cc: dmorissette@… added

comment:37 by dmorissette, 18 years ago

*** Bug 67 has been marked as a duplicate of this bug. ***

comment:38 by bartvde@…, 18 years ago

Daniel, I tested and it seems to work fine now. Thanks. Bart. Also a big thank
you to Brage.

comment:39 by dmorissette, 18 years ago

Keywords: VERIFIED added
Woohoo! Finally this saga is over! Thanks for testing the fix Bart. And thanks
to all for the help and input.
Note: See TracTickets for help on using tickets.