Ticket #2661 (assigned enhancement)

Opened 2 months ago

Last modified 2 months ago

support libharu as alternative PDF library

Reported by: kyngchaos
Assigned to: kyngchaos (accepted)
Priority: normal
Milestone: 5.4 release
Component: Output-PDF
Version: unspecified
Severity: normal
Keywords:
Cc: sdlime, dmorissette, zjames, assefa

Description

A couple problems with PDFlib:

  • it's bloated with its own customized libtiff, libjpeg, libpng, & libz. I have been able to reduce that to just libtiff, but that's the biggest one.
  • the "Lite" version is not all that "free".
  • it took them a while to update it for 64-bit OSX 10.5. Even then, the Lite version has a crippled build system, so a multi-architecture build is difficult (but still possible).

It looks like libharu has matured since I originally found it last year.

It may not be as full featured, yet, as PDFlib, but it could be enough for MapServer.

Attachments

mappdf.c (66.9 kB) - added by kyngchaos on 07/05/08 18:28:26.
mappdf.h (4.7 kB) - added by kyngchaos on 07/05/08 18:28:51.
mapserver.h.patch (2.4 kB) - added by kyngchaos on 07/05/08 18:29:02.

Change History

06/25/08 15:28:59 changed by dmorissette

  • cc set to sdlime.
  • milestone set to 5.4 release.

Good to know that there is an alternative to pdflib... its not-all-that-free licence has been bugging me as well.

I'll tag this ticket as 5.4 release milestone, but for anything to happen we'll need someone with the time and interest to investigate libharu and possibly reimplement mappdf.c if libharu turns out to be a good option.

06/30/08 23:58:28 changed by kyngchaos

  • status changed from new to assigned.
  • owner changed from mapserverbugs to kyngchaos.

I started working on this. So far my basic C skills are enough, and I can figure out the equivalent Haru functions. One problem is that Haru can't yet set the page translation matrix to move the origin to the upper left (UL), but I think that can be worked around.
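
One possible workaround, sketched here as an assumption rather than what the attached mappdf.c necessarily does: leave the PDF origin at the lower left and flip each y coordinate against the page height before drawing. The helper names are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: convert a y coordinate measured down from the
 * upper-left corner into PDF's default lower-left coordinate system. */
float ul_to_pdf_y(float page_height, float y_from_top)
{
    assert(page_height >= 0.0f);
    return page_height - y_from_top;
}

/* Hypothetical helper: a rectangle given by its top edge (measured from
 * the top of the page) and its height is drawn in PDF from its bottom
 * edge, so the flipped y must account for the rectangle height too. */
float ul_rect_to_pdf_y(float page_height, float top, float rect_height)
{
    assert(page_height >= 0.0f);
    return page_height - (top + rect_height);
}
```

With a 600-unit-high page, a point 100 units from the top lands at y = 500, and a 50-unit-high rectangle whose top edge is 100 units down is drawn from y = 450.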

07/01/08 09:11:59 changed by dmorissette

  • cc changed from sdlime to sdlime, dmorissette, zjames, assefa.

Great news. Adding Assefa and Zak to CC since they know the current mappdf.c code and may be able to assist if you have any questions.

07/05/08 18:27:51 changed by kyngchaos

OK, here goes nothin'!

  • Patch for mapserver.h to conditionalize including the correct PDF library header, add a needed hashtable to the PDFObj, and comment out a bogus msDrawMapPDF() function.
  • New mappdf.h for some macros for various PDF library functions.
  • Fresh copy of mappdf.c - so many changes, and I tidied up the formatting to help me understand what was going on.

All based on 5.2.0b2 sources. It now compiles without warnings or errors, but I still need to find a test to try it with.

Notes and questions follow. I added some /**** FIXME ****/ comments for things I couldn't figure out, didn't want to do yet, or noticed could be improved in the PDFlib implementation. I mention them below.

- haru version - minimum 2.2.0 (not yet released, but the dev version is OK). This is needed for the new function to load a PNG from memory (2.1 can only load raw image data from memory). Configure should check for HPDF_LoadPngImageFromMem().

- configure and USE_PDF - USE_PDF is now the general define for enabling PDF support. Additional defines are USE_HARUPDF and USE_PDFLIB for the two different libraries, and are mutually exclusive. Configure needs --with-pdflib and --with-libharu options, and --with-pdf could be dropped (or retained as an alias for pdflib). The PDF var in the makefile would then get -DUSE_PDF -DUSE_HARUPDF or -DUSE_PDF -DUSE_PDFLIB depending on which one the user chooses.
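
The define scheme above could look roughly like this in a header. This is only a sketch: the USE_* macro names are the ones proposed in this note, hpdf.h and pdflib.h are the headers the two libraries normally install, and PDF_BACKEND_NAME and pdf_backend_name() are hypothetical names added for illustration.

```c
/* Sketch only: select exactly one PDF backend at compile time. */
#ifdef USE_PDF
#  if defined(USE_HARUPDF) && defined(USE_PDFLIB)
#    error "USE_HARUPDF and USE_PDFLIB are mutually exclusive"
#  elif defined(USE_HARUPDF)
#    include <hpdf.h>
#    define PDF_BACKEND_NAME "libharu"
#  elif defined(USE_PDFLIB)
#    include <pdflib.h>
#    define PDF_BACKEND_NAME "pdflib"
#  else
#    error "USE_PDF requires USE_HARUPDF or USE_PDFLIB"
#  endif
#else
#  define PDF_BACKEND_NAME "none"
#endif

/* Hypothetical helper: report which backend was compiled in. */
const char *pdf_backend_name(void)
{
    return PDF_BACKEND_NAME;
}
```

Failing the build with #error when both or neither library define is set keeps a misconfigured makefile from silently producing a build without PDF output.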

- msDrawMapPDF() - I noticed that this was declared twice in mapserver.h, but it's not used anywhere, nor is there any source for it in mappdf.c. I removed the duplicate declaration and commented out the other one in mapserver.h, but that could also be removed, unless someone knows better.

- PDF info attribs - I wonder if the PDF info attributes should be updated for the MapServer version? Or is 3.7 meant to be when PDF support was added?

- Image loading - I made a request, and it's now in the dev libharu, for loading PNGs from memory. I see that the PDFlib implementation uses JPEG because of a PDF_open_image() limitation. And, I see in the PDFlib docs at least as far back as v5 that they suggest using virtual files to load a PNG from memory. I wonder if this should be done now and make PDFlib v5 a minimum requirement? On their web site, I could not find any downloads at all of old versions, and only documentation back to v5.

- Dash patterns - Haru currently supports a maximum of 8 pattern segments, vs PDFlib's unlimited, and MapServer needs 10. I asked the Haru devs to remove the limit, but until (if) that happens, this will be a small nuisance.

I also see a conditional for PDFlib >= 6 to support dashed lines at all. PDFlib v5 can do dashed lines (I can't think why even earlier versions wouldn't support them). Is there a bug in earlier versions?
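
Until the limit is lifted, one possible stopgap is to truncate longer patterns before handing them to Haru. The sketch below is an assumption, not what the attached mappdf.c does: it assumes the MapServer pattern arrives as an int array, and HARU_MAX_DASH and clamp_dash_pattern() are hypothetical names. The result would then be passed to Haru's dash-setting call.

```c
#include <assert.h>
#include <stddef.h>

#define HARU_MAX_DASH 8  /* segment limit reported in this ticket */

/* Copy at most HARU_MAX_DASH segments from src into dst, clamping each
 * value into the unsigned 16-bit range. dst must have room for
 * HARU_MAX_DASH entries. Returns the number of segments written. */
size_t clamp_dash_pattern(const int *src, size_t n, unsigned short *dst)
{
    size_t i, count;

    assert(src != NULL && dst != NULL);
    count = (n > HARU_MAX_DASH) ? HARU_MAX_DASH : n;
    for (i = 0; i < count; i++) {
        int v = src[i];
        if (v < 0) v = 0;
        if (v > 65535) v = 65535;
        dst[i] = (unsigned short)v;
    }
    return count;
}
```

Truncating a 10-segment pattern changes how it repeats, so this only keeps such maps rendering. It does not reproduce the PDFlib output.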

- Font loading - this is a bit of a mess in the PDFlib implementation. On starting the PDF with msImageCreatePDF(), all fonts in the fontset are loaded into the PDF, ready to be referenced by their mapfile name alias. But in msDrawTextPDF(), the requested font is loaded again by file path on each call, instead of being referenced directly in the PDF by name alias. Yet in msDrawMarkerSymbolPDF(), when drawing a TT symbol, the font is referenced directly by the name alias.

And msLoadFontSetPDF(), which msImageCreatePDF() uses to load the fonts at the start, actually reloads the fontset data from the font file, using the fontset obj only to get the path to that file, instead of using the font file paths already loaded into the fontset object. I didn't try to fix this, but it looks like a big waste of time.

Also, loading all the fonts in the fontset into the PDF seems like a waste of time if not all the fonts are used in the map. And I'm not sure whether PDFlib or libHaru strips out unused fonts when the PDF data is returned. I would suggest loading fonts as requested. PDFlib already handles load requests transparently for a font that is already loaded. Haru does not, but I have a hash table set up anyway.

- Font aliases - Haru doesn't refer to loaded fonts by their aliases, but by the "real" names it finds in the font file. So I added a hash table to the MS PDFObj to store this for lookup later. I don't know if this was the right place to put it, but it was the best place I could think of that would survive between the PDF initialization and various drawing calls.
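
The alias lookup can be illustrated with a small stand-alone table. This is a simplified sketch: it uses a fixed-size array with a linear search instead of a real hash table, and FontAliasMap, font_alias_put() and font_alias_get() are hypothetical names, not anything in MapServer or Haru.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FONTS 32
#define NAME_LEN  64

/* Maps a mapfile font alias to the "real" name found in the font file. */
typedef struct {
    char alias[MAX_FONTS][NAME_LEN];
    char real_name[MAX_FONTS][NAME_LEN];
    size_t count;   /* must be 0 before the first put */
} FontAliasMap;

/* Record alias -> real_name. Returns 0 on success, -1 if the table is
 * full or either name is too long to store. */
int font_alias_put(FontAliasMap *map, const char *alias, const char *real_name)
{
    assert(map != NULL && alias != NULL && real_name != NULL);
    if (map->count >= MAX_FONTS) return -1;
    if (strlen(alias) >= NAME_LEN || strlen(real_name) >= NAME_LEN) return -1;
    strcpy(map->alias[map->count], alias);
    strcpy(map->real_name[map->count], real_name);
    map->count++;
    return 0;
}

/* Return the real name stored for alias, or NULL if it is not loaded yet. */
const char *font_alias_get(const FontAliasMap *map, const char *alias)
{
    size_t i;

    assert(map != NULL && alias != NULL);
    for (i = 0; i < map->count; i++) {
        if (strcmp(map->alias[i], alias) == 0) return map->real_name[i];
    }
    return NULL;
}
```

A NULL result from font_alias_get() doubles as the "not loaded yet" signal, so a drawing call could load the font on first use and record its real name then, which also fits loading fonts only as requested.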

- Font types - Haru handles only TTF and T1 fonts for now, using its own library. No FreeType at this time. It can't do OTF fonts (bummer) yet - I made a request for this - or Mac-style font files (only flat ttf/pfa/pfb). T1 fonts require their companion AFM file, so its path would have to be derived from the font file name.

Also, it has separate routines for each font type, so we need to figure out from file extension which type a font is. I didn't do this and just assumed all are TT for now.
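
A minimal sketch of the extension check and the AFM path derivation, assuming fonts use the flat .ttf/.pfa/.pfb extensions mentioned above and that the AFM file sits next to the font with the same base name. All names here (FontKind, path_extension(), font_kind_from_path(), afm_path_for()) are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum { FONT_UNKNOWN, FONT_TRUETYPE, FONT_TYPE1 } FontKind;

/* Case-insensitive comparison of two extensions such as ".TTF" and ".ttf". */
static int ext_equals(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Return a pointer to the extension (including the dot) of the last
 * path component, or NULL if that component has no extension. */
const char *path_extension(const char *path)
{
    const char *dot = strrchr(path, '.');
    const char *slash = strrchr(path, '/');

    if (dot == NULL || (slash != NULL && dot < slash)) return NULL;
    return dot;
}

/* Guess the font type from the file extension alone. */
FontKind font_kind_from_path(const char *path)
{
    const char *ext;

    assert(path != NULL);
    ext = path_extension(path);
    if (ext == NULL) return FONT_UNKNOWN;
    if (ext_equals(ext, ".ttf")) return FONT_TRUETYPE;
    if (ext_equals(ext, ".pfa") || ext_equals(ext, ".pfb")) return FONT_TYPE1;
    return FONT_UNKNOWN;
}

/* Write the companion AFM path (same name, ".afm" extension) into out.
 * Returns 0 on success, -1 if there is no extension or out is too small. */
int afm_path_for(const char *font_path, char *out, size_t out_size)
{
    const char *ext;
    size_t stem;

    assert(font_path != NULL && out != NULL);
    ext = path_extension(font_path);
    if (ext == NULL) return -1;
    stem = (size_t)(ext - font_path);
    if (stem + sizeof ".afm" > out_size) return -1;
    memcpy(out, font_path, stem);
    memcpy(out + stem, ".afm", sizeof ".afm");
    return 0;
}
```

The result of font_kind_from_path() would pick between Haru's TrueType and Type 1 loading routines, with afm_path_for() supplying the extra file name the Type 1 routine needs. An unknown extension, such as .otf, is reported rather than assumed to be TrueType.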


So, a lot I did by example. Pointers and addresses (&) confuse me a bit. I hope I didn't go overboard on the macros.

07/05/08 18:28:26 changed by kyngchaos

  • attachment mappdf.c added.

07/05/08 18:28:51 changed by kyngchaos

  • attachment mappdf.h added.

07/05/08 18:29:02 changed by kyngchaos

  • attachment mapserver.h.patch added.