#1322 closed defect (fixed)
saveWebImage filenames not unique
Reported by: | Owned by: | dmorissette | |
---|---|---|---|
Priority: | high | Milestone: | 4.8 release |
Component: | MapScript-PHP | Version: | 4.4 |
Severity: | major | Keywords: | VERIFIED |
Cc: | bartvde@…, sylvain.pasche@…, bill@… |
Description
I have had several instances of map images beeing overwritten because of the filenames generated by saveWebImage are not unique when running PHP/Mapscript under mod_php. The time() and getpid() parts of the filename generated by saveWebImage refer to the parent apache process, but the counter part of the filename is only unique within each child process.
Attachments (2)
Change History (41)
comment:2 by , 19 years ago
Cc: | added |
---|
We have been experiencing this as well using Neapoljs, which uses Mapscript 4.2.5, when using PHP as a module on Linux. Adding myself to the cc.
comment:3 by , 19 years ago
Maybe this could be fixed before releasing 4.4.3/4.6? The fix should be trivial - move the getpid()-call from phpms_init_globals() to php3_ms_img_saveWebImage().
comment:4 by , 19 years ago
| The fix should be trivial | move the getpid()-call from phpms_init_globals() to php3_ms_img_saveWebImage(). I was a bit fast there. Obviously just moving the getpid() won't be enough for all cases since the pids might be reused.
comment:5 by , 19 years ago
Cc: | added |
---|
comment:6 by , 19 years ago
Status: | new → assigned |
---|
Bill Binko wrote: > Ok: quickly looking at that: is there any reason we're not using the system tempnam? > I don't remember exactly, but I think the reason was that tempnam() returns a filename that doesn't exist yet but it doesn't create it in an atomic way, so by the time you create that file, nothing guarantees that another process/thread didn't call tempnam() as well and get the same response. Instead of dealing with the fact that tempnam() + fopen(..., "w") is not atomic, it seemed simpler to use a unique Id for the filename based on processId + timestamp + counter which is guaranteed to be unique... well, there seems to be a bug with the way the counter is initialized at the moment, but once that's fixed we should be in business. I hope so anyway, it's been a while since I looked at that code. > Either that or appending a count if the file exists. > Looping and appending a count is also an option but is not atomic either and needs the same kind of handling of possible concurrent processes trying to create files with the same name. I'll have another look now, I just need to find the right place to initialize the counter in the PHP init functions.
comment:7 by , 19 years ago
The lifecycle of an extension and the way to deal with globals and initalization is all explained here... sounds like I need to go back and do some reading: http://www.zend.com/php/internals/extension-writing1.php
comment:8 by , 19 years ago
Any reason why msTmpFile(), or the same approach as msTmpFile() cannot be used? However, the concatenation of timestamp + pid + counter isn't _guaranteed_ to be unique, but collisions should be very rare.
comment:9 by , 19 years ago
> Any reason why msTmpFile(), or the same approach as msTmpFile() cannot be used? The PHP module tries to do exactly the same thing as msTmpFile(), except that it tries to do things "the right way" for a multithread PHP extension. > However, the concatenation of timestamp + pid + counter isn't _guaranteed_ to > be unique, but collisions should be very rare. I guess that depends on whether each instance of mapscript in memory gets a different pid or not. In your opening comment to this bug, you wrote that the pid "refer to the parent apache process", I assume you meant the main apache process as opposed to the pid of the child in which we're running, which implies that multiple mapscript instances shared the same pid, right? In this case the chances of collisions with msTmpFile() are not negligible and we need proper handling of the counter (shared between all threads) to avoid collisions. That's the kind of stuff I would like to verify to fix this.
comment:10 by , 19 years ago
> The PHP module tries to do exactly the same thing as msTmpFile(), except that > it tries to do things "the right way" for a multithread PHP extension. Now I'm a bit confused. Wouldn't thread safety for tmpid/tmpcounter defeat their purpose? >> However, the concatenation of timestamp + pid + counter isn't _guaranteed_ to >> be unique, but collisions should be very rare. >I guess that depends on whether each instance of mapscript in memory gets a >different pid or not. In your opening comment to this bug, you wrote that the >pid "refer to the parent apache process", I assume you meant the main apache >process as opposed to the pid of the child in which we're running, which >implies that multiple mapscript instances shared the same pid, right? In this >case the chances of collisions with msTmpFile() are not negligible and we need Yes that is what I meant in the bug report. > proper handling of the counter (shared between all threads) to avoid > collisions. That's the kind of stuff I would like to verify to fix this. I might be wrong, but aren't static variables like in msTmpFile shared by all threads within each process? The reason timestamp + pid + counter isn't guaranteed to be unique even with correct handling of timestamp/pid/counter is that the concatenation of pid+counter may refer to several different pid/counter-combination (is 1111 pid 11 #11 or pid 111 #1?). This type of collisions will of course be very rare.
comment:11 by , 19 years ago
Daniel (et al), I really think this should be done in a more standard way. A bit of googling has lead to MANY security exploits related to how tempfiles are created. It looks like the current recommendation for temporary file creation on Unix (and its friends) is to use mkstemp() (see http://www.opengroup.org/onlinepubs/009695399/functions/mkstemp.html ) For Windows, it looks like GetTempFileName is the choice ( see http://tinyurl.com/ag9ps ) Here's an example of an exploit based on flaws like this: http://www.outpost9.com/exploits/mailx.bug Perhaps its time to redo msTmpFile() to use a more standard (and secure) method? Bill
comment:12 by , 19 years ago
By the way, I should have mentioned that the reason those two methods are recommended is that they both create an actual file before they return. That allows the atomic file creation system call to be used.
comment:13 by , 19 years ago
I like the idea of using mkstemp() and GetTempFileName() if they both are atomic, but I'm worried of breaking the build a few days before the final release. I know this sounds like a trivial change, but it's this kind of last minute trivial changes that can somtimes turn bad and cause delayed or buggy releases. I'd propose that we do this right after the 4.6.0 release to be included in 4.6.1, and that we take the time to test this well on all platforms.
comment:14 by , 19 years ago
I'm fine with that: I only got involved with this bug because you didn't want to invite PHP testers until it was fixed. If we're postponing this to 4.6.1, can we ask for PHP testers now? Bill
comment:15 by , 19 years ago
Sure, go ahead, but you may want to warn them that saveWebImage() is not safe yet so their script should avoid using it.
comment:16 by , 19 years ago
I was just thinking about this, and shouldn't we change msTempFile to use the atomic temp file and then use that from Mapscript? That would change the scope of the bug to be mapserver wide, and we should include Steve, but wouldn't that be the best approach? Write one, secure, atomic, thread safe implementaiotn of msTempFile and then use it from everywhere? Just a thought
comment:17 by , 19 years ago
Cc: | added |
---|
Yes, that's what I was thinking of doing, but forgot to mention it. I've added Steve to the CC to bring him in the loop.
comment:18 by , 19 years ago
Daniel, Since we are reducing this problem to fixing the PHP Mapscript filename creation (for 4.6) here is what I've found. It is a simple problem (and solution). The Time and PID portion of the tmpid are set in phpms_init_globals(), and the counter is set to 0. XXX_init_globals is called only once for each process (top-level process) that loads module XXX. That is, when the .so gets loaded by the process. See http://www.zend.com/php/internals/extension-writing1.php Here's the relavent quote: "The key lies in when the two functions are called. php_hello_init_globals() is only called when a new process or thread is started; however, each process can serve more than one request..." So what is happening (on multi-threaded platforms) is that when the parent process is created, the tmpId is being set to the time&pid of the parent and never changes. The counter is set to 0. The problem is that the variables defines inside ZEND_BEGIN_MODULE_GLOBALS are thread-safe and (HTTP) request-specific. That means that the increment that is used is only seen within a single request. Two concurrent requests will not see each others increments. The problem is fairly simple, the solution may not be. I think the simplest solution would be to generate tempId during each request (saveImage()) using something other than PID. Perhaps the php APIs have a getRequestId() or getThreadID() that could be used. Bill
comment:19 by , 19 years ago
> ------- Additional Comments From bill@binko.net 2005-08-02 17:07 ------- > The Time and PID portion of the tmpid are set in phpms_init_globals(), > and the counter is set to 0. XXX_init_globals is called only once for > each process (top-level process) that loads module XXX. That is, when > the .so gets loaded by the process. > > So what is happening (on multi-threaded platforms) is that when the parent > process is created, the tmpId is being set to the time&pid of the parent and > never changes. The counter is set to 0. This is a problem with all apache environments, not just multi-threaded. In the case of multi-process and preloaded modules, the listening process load the module and run XXX_init_globals before it forks of the serving processes. The current code probably works in the multi-process case if the mapserver module is loaded with dl() instead of preloaded (like it is in my case). > The problem is that the variables defines inside > ZEND_BEGIN_MODULE_GLOBALS are thread-safe and (HTTP) request-specific. > That means that the increment that is used is only seen within a single > request. Two concurrent requests will not see each others increments. Making tmpId and tmpCount thread-safe with ZEND_BEGIN_MODULE_GLOBALS makes the problem worse. But making them normal globals would only help the multi-threaded case, not the multi-process one. For the multi-process case the initialization of tmpId must be postponed to after forking. > The problem is fairly simple, the solution may not be. I think the > simplest solution would be to generate tempId during each request > (saveImage()) using something other than PID. Perhaps the php APIs > have a getRequestId() or getThreadID() that could be used. Making tmpId and tmpCount normal process globals, and initializing tmpId with getpid() and time() upon first use should in principle work for both for multi-threaded and multi-process apache. However, there might be a small problem with Linux for the threaded case. Linux assigns a new pid to each thread, and getpid() returns this temporary pid instead of the main pid. This could lead to collisions if the pid is reused by different processes within the same second, but that should be extremly rare if not impossible.
comment:20 by , 19 years ago
Since 4.8 is drawing closer.. Again, why wouldn't msTmpFile() work? - tmpCount is a global, so it is shared between all threads within each process - tmpCount is initialized upon first use, so it should always refer to the correct process Or am I totally mistaken?
comment:21 by , 19 years ago
Milestone: | 4.6 release → 4.8 release |
---|
I haven't checked but I want to try that alternative, perhaps simply with a mutex around the counter.
comment:22 by , 19 years ago
Dan, Wrapping all of the contents of msForceTmpFileBase() and the one line in msTmpFile() that calls tmpCount++ with a single mutex should solve the PHP race condition. It doesn't solve the security issues, but those can be post 4.8, or never dealt with (although a separate bug might be nice). Bill
comment:23 by , 18 years ago
Guys, what is the status of this? I am running into this again with 4.8.3 :-( Any chance of a 4.8.X fix?
comment:24 by , 18 years ago
I tried Bill's suggestion (as per comment #21) but had no luck, but maybe I got things wrong. What I did is: 1) in mapthread.h I added a constant: #define TLOCK_TMPFILE 11 2) in maputil.c I changed msForceTmpFileBase in order to use the lock. /************************************************************************/ /* msForceTmpFileBase() */ /************************************************************************/ static int tmpCount = 0; static char *ForcedTmpBase = NULL; void msForceTmpFileBase( const char *new_base ) { msAcquireLock( TLOCK_TMPFILE ); /* -------------------------------------------------------------------- */ /* Clear previous setting, if any. */ /* -------------------------------------------------------------------- */ if( ForcedTmpBase != NULL ) { free( ForcedTmpBase ); ForcedTmpBase = NULL; } tmpCount = -1; if( new_base == NULL ) { msReleaseLock( TLOCK_TMPFILE ); return; } /* -------------------------------------------------------------------- */ /* Record new base. */ /* -------------------------------------------------------------------- */ ForcedTmpBase = strdup( new_base ); tmpCount = 0; msReleaseLock( TLOCK_TMPFILE ); } 3) in maputil.c, msTmpFile I used the same lock around the line which increases the counter: msAcquireLock( TLOCK_TMPFILE ); sprintf(tmpFname, "%s%d.%s", tmpId, tmpCount++, ext); msReleaseLock( TLOCK_TMPFILE ); Am I doing something wrong?
comment:25 by , 18 years ago
Did you update php_mapscript.c as well? It doesn't currently use msTmpfile().
comment:26 by , 18 years ago
No I didn't. Do you have an example/code fragment of what should be changed? Thanks in advance.
comment:27 by , 18 years ago
I have tried your patch next to the changes I described in comment #23 but now I get the same tmpfilename all the time: http://145.50.148.45/ms_tmp/1150202079133901.png it's always the same filename.
comment:28 by , 18 years ago
I was too quick, it's not always the same, but still the same problem persists of temp file names not being unique.
comment:29 by , 18 years ago
Cc: | added |
---|
comment:30 by , 18 years ago
Bart, Can you confirm that you still got non-unique filenames with the patch applied? I have been testing the patch, and it does return filenames created by msTmpFile, that is time()+getpid()+(within process counter). It seems strange that the msTmpFile() code should suffer the same problems. (To check if the correct msTmpFile() is used, check that the time()-part of the filename (the first ten digits) is incremented every second.) I have also been running it for months without problems.
comment:31 by , 18 years ago
Guys, I would really like to fix this bug, but before committing any change I would like to reproduce the issue, and so far I am unable to reproduce it. My test environment is Apache 1.3.33 with PHP 4.3.11 compiled as a DSO (libexec/libphp4.so), with php_mapscript.so from CVS HEAD. I have two PHP scripts that each draw a different map, call save WebImage() and then use readfile() to return the contents of the temp file to the client. I use http_load to generate multiple parallel requests on the server with both URLs with the -checksum option to validate the results and no errors are detected. I tried both loading php_mapscript.so at runtime with dl() or preloading in php.ini and in both cases no errors. Can you please tell me which Apache/PHP combinations you use to get this error?
comment:32 by , 18 years ago
Hi Daniel, we use Apache 2.0.41 and PHP 4.3.11. We cannot use dl in our environment (dl() is not supported in multithreaded Web servers.). Phpinfo in our case says for the Server API Apache 2.0 Handler (from my bare head). Bart
comment:33 by , 18 years ago
To reproduce the bug use a non-threaded apache, (1.3 or 2.0 using the Prefork MPM), and load php_mapscript.so in php.ini. Notice that the stem (time+pid) of the filename is the same for all generated filenames. A collision occurs when the counter of each process matches. To increase the chance of collision, set the MaxRequestsPerChild to some ridiculous low value (2?) in httpd.conf.
comment:34 by , 18 years ago
Another note: I verified that with my config if you use dl() then I get no collisions. I get collisions only if I preload php_mapscript.so in php.ini.
comment:35 by , 18 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Using msTmpFile() fixes the issue for my test environment. I committed the fix in CVS HEAD (4.9/4.10) and will mark this bug fixed. Bart, can you please test again with the CVS version and reopen with more details if you still notice the problem? Notes: - The fix uses msTmpFile() in php_mapscript.so as per Brage's patch - I modified msTmpFile() to put a lock around the line that increments tmpCount - I modified msTmpFile() to include '_' as separators in the tmp filename between the pid, time and counter to prevent the possibility of collisions that was raised earlier in this bug - I modified msTmpFile() to use hex encoding for the pid, time and counter values as suggested by John N. on the -dev list since that makes the temporary filename more compact. - I noticed that Brage's patch uses msBuildPath() to build the URL returned by saveWebImage(). This works now but I'm worried that this may break later if msBuildPath() is ever modified to do file system checks on the path that it builds. Anyway, we'll keep it like this for now.
comment:36 by , 18 years ago
Cc: | added |
---|
comment:38 by , 18 years ago
Daniel, I tested and it seems to work fine now. Thanks. Bart. Also a big thank you to Brage.
comment:39 by , 18 years ago
Keywords: | VERIFIED added |
---|
Woohoo! Finally this saga is over! Thanks for testing the fix Bart. And thanks to all for the help and input.
Note:
See TracTickets
for help on using tickets.