Opened 19 years ago

Closed 19 years ago

Last modified 19 years ago

#1354 closed defect (fixed)

PHP Source Dependency for Regex Fix

Reported by: bill@… Owned by: dmorissette
Priority: high Milestone: 4.6 release
Component: MapScript-PHP Version: 4.6
Severity: normal Keywords:
Cc: mapserver@…

Description

Here is a patch (plus some new files) to fix the source dependency when 
PHP is compiled with its internal regex.

To install:
$ cd mapserver
$ tar xvfz regex.tar.gz
$ patch < regex.patch
$ ./configure .... --with-php=/usr (assuming PHP include files are in /usr/include/php)

I have only tested this on Linux Mandrake 10.1 using PHP 4.3.8, however, 
I've tried to make it as clean a fix as possible.  I would suspect that most
*nix systems will work fine, and that some tweaking will be needed in Windows.

Attachments (2)

regex.tar.gz (5.0 KB ) - added by bill@… 19 years ago.
Patch to allow PHP/Mapscript to build w/o PHP Sources (.tgz format)
regex2.patch (22.7 KB ) - added by bill@… 19 years ago.
Additional patch to fix segfault on PHP calls that rely on sizeof expressionObj

Download all attachments as: .zip

Change History (26)

by bill@…, 19 years ago

Attachment: regex.tar.gz added

Patch to allow PHP/Mapscript to build w/o PHP Sources (.tgz format)

comment:1 by bill@…, 19 years ago

Here's a thread on the mailing list that explains this:

http://lists.umn.edu/cgi-bin/wa?A2=ind0505&L=mapserver-dev&T=0&F=&S=&X=6E3CFD2A93D55E119B&Y=bill@binko.net&P=1423

comment:2 by bill@…, 19 years ago

Here's a quick breakdown on how the solution works:

1) The basics of the detection haven't changed: if PHP/Mapscript isn't being
built or PHP uses the system regex, it just build with the system regex.
2) If PHP uses its own regex, map.h includes mapregex.h instead of the PHP regex
include files.  mapregex.h redirects all calls to regcomp and friends to their
ms_reg* counterparts using #defines.
3) There are two implementations of the ms_reg* counterparts: one in cgiregex.c
(in the main mapserver directory) and one in php_regex.c (under mapscript/php3).
 The first calls the system regex, the second calls php's
4) The main makefile links all of the command line utilities with cgiregex.o,
and the php/mapscript makefile links the .so with php_regex.o.

This way, all of the command line utilities call the system regex (no matter
what), and the php_mapscript.so calls PHP's if they don't use the system regex.

I hope this explains it.

Bill

comment:3 by dmorissette, 19 years ago

attachments.description: Patch to allow PHP/Mapscript to build w/o PHP SourcesPatch to allow PHP/Mapscript to build w/o PHP Sources (.tgz format)

comment:4 by mapserver@…, 19 years ago

Cc: mapserver@… added

comment:5 by dmorissette, 19 years ago

Cc: mapserver-bugs@… added
Owner: changed from mapserverbugs to dmorissette@…
I'll try to integrate this now.

comment:6 by dmorissette, 19 years ago

Status: newassigned

comment:7 by dmorissette, 19 years ago

I had to make the following changes to the submitted patch to make it compile
(using PHP 4.3.11):

- php_regex.c: include regex_extra.h before regex.h to solve a problem with
undefined php_regex* symbols.

- renamed cgiregex.c to mapregex.c (this is what configure.in was expecting anyway)

- Added @PHP_REGEX_OBJ@ to the mapscript/php3/Makefile.in

- Modified the main makefile to always link mapregex.o into the libmap.a/.so.
This way external apps that link ith libmap.so won't need to worry about linking
with mapregex.o as well. The trick is that php_mapscript.so is linked with both
libmap.a/.so and php_regex.o, and the symbols from php_regex.o take precedence
over those from mapregex.o that was included in libmap.a/.so. This works fine on
Linux and I remember using that trick on other platforms as well in a previous
life (HP-UX, AIX, Solaris)

comment:8 by dmorissette, 19 years ago

Bill, the changes in mapfile.c to msEvalRegex() (malloc or the regex struct)
causes a memory leak. Can you please explain why this change is required? Could
we revert back to the original version where the regex struct was a local variable?

Also, why was the 'test = strdup(filename);' added in loadMapInternal()? Won't
that cause a leak as well?

comment:9 by dmorissette, 19 years ago

Another question: why the 'msyyin = (FILE *)0;' in loadMapInternal()?

comment:10 by bill@…, 19 years ago

Sorry Daniel,

the changes to mapfile.c were for debugging only: I have reverted to the CVS
version and it works fine.

I will check the others in the patch, but I don't think I left anything else in.

Bill

comment:11 by dmorissette, 19 years ago

Cool. I'll drop the mapfile.c changes then.

I think that what's been done is not sufficient, there is a regex_t member in
expressionObj that can vary in size depending on which regex it is compiled
with.  I think we'll need to add two new functions to mapregex.c and php_regex.c
to alloc and free a regex_t struct, and modify all of MapServer's source to use
them, including changing the definition of expressionObj to use a (regex_t *).
Fortunately the expressionObj is not exposed in SWIG so we won't need to worry
about updating MapScript (I hope).

Before I start making all those changes, can you please tell me if my
interpretation seems right to you?

comment:12 by dmorissette, 19 years ago

Cc: steve.lime@… added
Milestone: 4.6 release
Adding Steve to CC since this is going to affect quite a bit of his code.

comment:13 by bill@…, 19 years ago

Milestone: 4.6 release
Daniel,

I don't seen anything wrong with your changes.  I made the mapregex ->cgiregex
when I decided that it shouldn't be in the libmap.a

You should probably remove all of the $(CGI_REGEX_OBJ) from the final
command-line link phase or you'll get duplicate symbols (some platforms might
care -- Windows?)

I suppose since the actual symbols are map_reg*, you shouldn't have a problem.

Hope this wasn't too big a hassle.

I'll be happy to test when you're done.

comment:14 by bill@…, 19 years ago

That shouldn't be a problem

If you're building with USE_PHP_REGEX, then the #define in mapregex.h will cause
all regex_t's to be map_regex_t's.  Those are fixed size regardless of library
(note the ugly void*)

If you're not building with USE_PHP_REGEX, then it will use the system libraries
in all cases, and it's ok.

comment:15 by dmorissette, 19 years ago

Milestone: 4.6 release
Disregard my concerns from coment #9 about alloc/free of the regex_t. Bill has
explained to me on IRC how things work and once you understand it, everything
seems to be fine.

I have committed the changes to CVS. 

Assefa, we'd need you to try to Build on Windows and tell us if you run into any
problems

comment:16 by dmorissette, 19 years ago

Resolution: fixed
Status: assignedclosed
Marking fixed. Reopen if there are problems (on Windows or other platforms)

comment:17 by dmorissette, 19 years ago

Resolution: fixed
Status: closedreopened, dmorissette@dmsolutions.ca
Reopening. The current approach causes a seg fault:

In bug 1364, Bill wrote:
-----
dan, it is definitely related to the regex.  

It looks like it's going to be a bit more difficult than I thought yesterday. 
Because expressionObj holds it by value, and that gets included on both sides of
the fence, the #define needs be present on both sides.  However, PHP already
#defines it, which causes a problem.

This isn't a problem when things are held by reference.

comment:18 by dmorissette, 19 years ago

*** Bug 1365 has been marked as a duplicate of this bug. ***

comment:19 by dmorissette, 19 years ago

Um... could it be something that I broke while applying your patch? 


comment:20 by bill@…, 19 years ago

No, Daniel didn't break anything.  The patch caused problems because PHP uses
the same hack: #defining regex_t to php_regex_t.  That made my "interesting"
#define compile fine, but break at runtime.

It's true that ms_regex_t was fixed size, and we need that.  However, PHP would
use its own type (also #defined to regex_t) and boom.

After lengthy discussion, Daniel and I both set out to modify all calls to the
regex library to use the ms_reg* functions and types.  It actually wasn't that
much code since most of them used msEvelRegex anyway.  Our patches were nearly
identical, so I think we were on the right track.

This patch fixes the php segfault, and has been tested in the following
environments:

1) Built without any -with-php support: 
         CGI and command lines work fine
2) Built with -with-php=/usr using PHP Regex
         CGI and command line work fine.  
         PHP (including calls that rely on sizeof(expressionObj) ) work from
command line
         PHP (including calls that rely on sizeof(expressionObj) ) work from
Apache as a DSO

I have not tested PHP as a CGI
I have not tested using a PHP that uses the system regex

(I'm sorry for the premature commit - it really did look right)

by bill@…, 19 years ago

Attachment: regex2.patch added

Additional patch to fix segfault on PHP calls that rely on sizeof expressionObj

comment:21 by dmorissette, 19 years ago

Status: reopenedassigned
Thanks for your continued work on this Bill. I'll try to integrate the new patch
now.

comment:22 by dmorissette, 19 years ago

Resolution: fixed
Status: assignedclosed
I've applied and tested the new patch with the PHP regex (PHP DSO) and the
system regex (PHP CGI) and everything appears to be working fine now.

Committed to CVS, marking fixed. Thanks a lot for your work Bill!

comment:23 by assefa, 19 years ago

adding this in mapregex.c seems to fix the build problems on Windows

#if defined(_WIN32) && !defined(__CYGWIN__)
#define off_t	_off_t
#endif

Do I need to use at all the php_regex.c on Windows ?

comment:24 by dmorissette, 19 years ago

no you don't need php_regex.c on Windows because you already built mapregex.obj
using the PHP regex, so everything in your case is using the PHP regex and you
onlu need mapregex.obj.
Note: See TracTickets for help on using tickets.