Opened 16 years ago

Closed 13 years ago

#2661 closed enhancement (wontfix)

support libharu as alternative PDF library

Reported by: kyngchaos
Owned by: kyngchaos
Priority: normal
Milestone: 6.0 release
Component: Output-PDF
Version: unspecified
Severity: normal
Keywords:
Cc: sdlime, dmorissette, zjames, assefa, aboudreault, bfraser

Description

A couple problems with PDFlib:

  • it's bloated with its own customized libtiff, libjpeg, libpng, and libz. I have been successful in reducing that to just libtiff, but that's the biggest one.
  • the "Lite" version is not all that "free".
  • it took them a while to update it for 64-bit OSX 10.5. Even then, the Lite version has a crippled build system, so a multi-architecture build is difficult (but still possible).

It looks like libharu has matured since I originally found it last year.

It may not be as full featured, yet, as PDFlib, but it could be enough for MapServer.

Attachments (3)

mappdf.c (66.9 KB) - added by kyngchaos 16 years ago.
mappdf.h (4.7 KB) - added by kyngchaos 16 years ago.
mapserver.h.patch (2.4 KB) - added by kyngchaos 16 years ago.


Change History (16)

comment:1 by dmorissette, 16 years ago

Cc: sdlime added
Milestone: 5.4 release

Good to know that there is an alternative to pdflib... its not-all-that-free licence has been bugging me as well.

I'll tag this ticket as 5.4 release milestone, but for anything to happen we'll need someone with the time and interest to investigate libharu and possibly reimplement mappdf.c if libharu turns out to be a good option.

comment:2 by kyngchaos, 16 years ago

Owner: changed from mapserverbugs to kyngchaos
Status: changed from new to assigned

I started working on this. So far my basic C skills are enough, and I can figure out the equivalent Haru functions. One problem is that Haru can't yet set the page translation matrix to move the origin to the upper left (UL), but I think that can be worked around.
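A minimal sketch of one possible workaround, assuming it is enough to flip each y coordinate by hand before passing it to the library. The helper names are hypothetical and are not taken from mappdf.c or libharu.

```c
/* Hypothetical helper, not part of libharu or MapServer: convert a y
 * coordinate measured down from the top edge (upper-left origin) into
 * PDF user space, which measures up from the bottom edge. */
double ul_to_pdf_y(double page_height, double y_from_top)
{
    return page_height - y_from_top;
}

/* For a box given by its top edge and height, PDF wants the lower-left
 * corner, which is the flipped position of the box's bottom edge. */
double ul_box_to_pdf_y(double page_height, double top, double box_height)
{
    return page_height - (top + box_height);
}
```

Every drawing call would need to route its y values through helpers like these, which is more error-prone than a single matrix change but needs nothing from the library.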

comment:3 by dmorissette, 16 years ago

Cc: dmorissette zjames assefa added

Great news. Adding Assefa and Zak to CC since they know the current mappdf.c code and may be able to assist if you have any questions.

comment:4 by kyngchaos, 16 years ago

OK, here goes nothin'!

  • Patch for mapserver.h to conditionalize including the correct PDF library header, add needed hashtable to the PDFObj, and comment a bogus msDrawMapPDF() function.
  • New mappdf.h for some macros for various PDF library functions.
  • Fresh copy of mappdf.c - so many changes, and I tidied up the formatting to help me understand what was going on.
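For readers without the attachments, the conditional include in the first item might be structured roughly like this. It is only a sketch, not the contents of mapserver.h.patch: it uses the USE_PDF, USE_HARUPDF and USE_PDFLIB defines proposed in the notes further down, MS_PDF_BACKEND and pdf_backend_name() are hypothetical names, and the real header includes are left as comments so the fragment builds without either library installed.

```c
/* Pretend configure passed -DUSE_PDF -DUSE_HARUPDF (assumption for the demo). */
#ifndef USE_PDF
#define USE_PDF
#define USE_HARUPDF
#endif

#ifdef USE_PDF
#  if defined(USE_HARUPDF) && defined(USE_PDFLIB)
#    error "USE_HARUPDF and USE_PDFLIB are mutually exclusive"
#  elif defined(USE_HARUPDF)
     /* a real build would #include <hpdf.h> here */
#    define MS_PDF_BACKEND "libharu"
#  elif defined(USE_PDFLIB)
     /* a real build would #include <pdflib.h> here */
#    define MS_PDF_BACKEND "pdflib"
#  else
#    error "USE_PDF requires USE_HARUPDF or USE_PDFLIB"
#  endif
#endif

/* Hypothetical accessor so the chosen backend can be reported at run time. */
const char *pdf_backend_name(void)
{
    return MS_PDF_BACKEND;
}
```

Failing at compile time when both or neither backend is selected keeps a misconfigured build from silently producing a binary without PDF output.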

All based on 5.2.0b2 sources. I've compiled it to clean out all the warnings and errors, but I need to look up a test to try.

Notes and questions follow. I added some /**** FIXME ****/ comments for stuff I couldn't figure out or didn't want to do yet or that I noticed could be improved in the PDFlib implementation, and I'll mention them below.

  • haru version - minimum 2.2.0 (not yet released, but dev OK) - needed for new function to load a PNG from memory (2.1 can only load raw from mem). Check for HPDF_LoadPngImageFromMem() in configure.
  • configure and USE_PDF - USE_PDF is now the general define for enabling PDF support. Additional defines are USE_HARUPDF and USE_PDFLIB for the two different libraries, and are mutually exclusive. Configure needs --with-pdflib and --with-libharu options, and --with-pdf could be dropped (or retained as an alias for pdflib). The PDF var in the makefile would then get -DUSE_PDF -DUSE_HARUPDF or -DUSE_PDF -DUSE_PDFLIB depending on which one the user chooses.
  • msDrawMapPDF() - I noticed that this was declared twice in mapserver.h, but it's not even used nor is there source in mappdf.c. I removed a dup def, and commented the other in mapserver.h, but that could also be removed, unless someone knows better.
  • PDF info attribs - I wonder if the PDF info attributes should be updated for the MapServer version? Or is 3.7 meant to be when PDF support was added?
  • Image loading - I made a request, and it's now in the dev libharu, for loading PNGs from memory. I see that the PDFlib implementation uses JPEG because of a PDF_open_image() limitation. And, I see in the PDFlib docs at least as far back as v5 that they suggest using virtual files to load a PNG from memory. I wonder if this should be done now and make PDFlib v5 a minimum requirement? On their web site, I could not find any downloads at all of old versions, and only documentation back to v5.
  • Dash patterns - Haru currently supports a maximum of 8 pattern segments, vs PDFlib's unlimited, and MapServer needs 10. I asked the Haru devs to remove the limit, but until (if) that happens, this will be a small nuisance.

I also see a conditional for PDFlib >= 6 to support dashed lines at all. PDFlib v5 can do dashed lines (I can't think why even earlier versions wouldn't support them). Is there a bug in earlier versions?
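Until the limit is lifted, one lossy fallback for the dash pattern issue would be to truncate long patterns and flag that it happened. This is a sketch only: HARU_MAX_DASH is the limit quoted above, not a value read from the library, and fit_dash_pattern() is a hypothetical helper.

```c
#include <stddef.h>

#define HARU_MAX_DASH 8  /* limit as reported in this ticket (assumption) */

/* Copy at most HARU_MAX_DASH segments from pattern into out, which must
 * have room for HARU_MAX_DASH values. Returns the number of segments
 * copied and sets *truncated to 1 if any were dropped. */
size_t fit_dash_pattern(const double *pattern, size_t count,
                        double *out, int *truncated)
{
    size_t used = count > HARU_MAX_DASH ? HARU_MAX_DASH : count;
    size_t i;

    for (i = 0; i < used; i++)
        out[i] = pattern[i];
    if (truncated != NULL)
        *truncated = (used < count);
    return used;
}
```

A truncated pattern will not render the same as the original, so the caller would probably want to log a warning when the flag is set.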

  • Font loading - this is a bit of a mess in the PDFlib implementation. On starting the PDF with msImageCreatePDF(), all fonts in the fontset are loaded into the PDF, ready to be referenced by mapfile name alias. But in msDrawTextPDF(), the requested font for each call is loaded again by file path, instead of being referenced directly in the PDF by name alias. Yet in msDrawMarkerSymbolPDF(), when drawing a TT symbol, the font is referenced directly by the name alias.

And, in msLoadFontSetPDF(), which is used by msImageCreatePDF() to load the fonts at the start, it actually reloads the fontset data from the font file, while using the fontset obj only to get the path to that file, instead of using the font file paths already loaded into the fontset object. I didn't try to fix this, but it looks like a big waste of time.

Also, loading all the fonts in the fontset into the PDF seems like a waste of time if not all the fonts are used in the map. And I'm not sure if PDFlib or libHaru strip out unused fonts when the PDF data is returned. I would suggest loading fonts as requested. PDFlib already handles load requests transparently for a font already loaded. Haru does not, but I have a hash table set up anyways.
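Loading on request could look roughly like the sketch below. It is not the code in the attached mappdf.c: a small linear table stands in for the hash table to keep the example short, fake_load_font() is a stand-in for the real library call, and all names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define FONT_CACHE_MAX 32

typedef struct {
    char alias[64];      /* name used in the mapfile */
    char real_name[64];  /* name reported by the PDF library */
} FontCacheEntry;

typedef struct {
    FontCacheEntry entries[FONT_CACHE_MAX];
    int count;
    int loads;           /* how many times the loader actually ran */
} FontCache;

/* Stand-in for the library's font loader; returns 0 on success. */
static int fake_load_font(const char *path, char *real_name, size_t size)
{
    snprintf(real_name, size, "Real-%s", path);
    return 0;
}

/* Return the library-side name for alias, loading the font from path
 * only the first time it is requested. Returns NULL on failure. */
const char *font_cache_get(FontCache *cache, const char *alias,
                           const char *path)
{
    FontCacheEntry *entry;
    int i;

    for (i = 0; i < cache->count; i++)
        if (strcmp(cache->entries[i].alias, alias) == 0)
            return cache->entries[i].real_name;

    if (cache->count >= FONT_CACHE_MAX ||
        strlen(alias) >= sizeof cache->entries[0].alias)
        return NULL;

    entry = &cache->entries[cache->count];
    if (fake_load_font(path, entry->real_name, sizeof entry->real_name) != 0)
        return NULL;
    strcpy(entry->alias, alias);
    cache->count++;
    cache->loads++;
    return entry->real_name;
}
```

With this shape, fonts that are never drawn never reach the PDF, and a repeated request for the same alias costs a lookup instead of another load.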

  • Font aliases - Haru doesn't refer to fonts loaded with aliases, but the "real" names it finds in the font file. So I added a hash table to the MS PDFObj to store this for lookup later. I don't know if this was the right place to put it, but it was the best place I could think of that would survive between the PDF initialization and various drawing calls.
  • Font types - Haru handles only TTF and T1 fonts for now, using its own library. No FreeType at this time. Can't do OTF fonts (bummer) yet - I made a request for this - or Mac-style font files (only flat ttf/pfa/pfb). T1 fonts require their companion AFM file, so this would have to be calculated from the file name.

Also, it has separate routines for each font type, so we need to figure out from file extension which type a font is. I didn't do this and just assumed all are TT for now.
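A sketch of what that extension check could look like, together with the AFM lookup mentioned above. It assumes only flat .ttf/.pfa/.pfb files and that the AFM file sits next to the font with the same base name and a lowercase .afm extension; both function names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum { FONT_UNKNOWN, FONT_TRUETYPE, FONT_TYPE1 } FontKind;

/* Case-insensitive comparison of two short strings. */
static int ext_equals(const char *ext, const char *want)
{
    while (*ext != '\0' && *want != '\0') {
        if (tolower((unsigned char)*ext) != tolower((unsigned char)*want))
            return 0;
        ext++;
        want++;
    }
    return *ext == '\0' && *want == '\0';
}

/* Guess the font type from the file extension alone. */
FontKind font_kind_from_path(const char *path)
{
    const char *dot = strrchr(path, '.');

    if (dot == NULL)
        return FONT_UNKNOWN;
    if (ext_equals(dot, ".ttf"))
        return FONT_TRUETYPE;
    if (ext_equals(dot, ".pfa") || ext_equals(dot, ".pfb"))
        return FONT_TYPE1;
    return FONT_UNKNOWN;
}

/* Build the companion AFM path by swapping the extension for ".afm".
 * Returns 0 on success, -1 if out is too small. */
int afm_path_for(const char *font_path, char *out, size_t out_size)
{
    const char *dot = strrchr(font_path, '.');
    size_t stem;

    if (dot != NULL && strchr(dot, '/') != NULL)
        dot = NULL;  /* the dot belongs to a directory name */
    stem = dot != NULL ? (size_t)(dot - font_path) : strlen(font_path);
    if (stem + 5 > out_size)
        return -1;
    memcpy(out, font_path, stem);
    memcpy(out + stem, ".afm", 5);  /* includes the terminating NUL */
    return 0;
}
```

An OTF or Mac-style font file falls through to FONT_UNKNOWN here, which gives the caller a place to report the unsupported font instead of passing it to the TrueType loader.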


So, a lot I did by example. Pointers and addresses (&) confuse me a bit. I hope I didn't go overboard on the macros.

by kyngchaos, 16 years ago

Attachment: mappdf.c added

by kyngchaos, 16 years ago

Attachment: mappdf.h added

by kyngchaos, 16 years ago

Attachment: mapserver.h.patch added

comment:5 by dmorissette, 16 years ago

Cc: aboudreault added

comment:6 by assefa, 16 years ago

William,

I have been able to build/test the code on windows. I will be providing more comments on this in the coming days/weeks as I continue to play with it.

comment:7 by bfraser, 15 years ago

Cc: bfraser added

Assefa,

I've been able to compile and run with the HARU pdf library using VC++ 7.1. A few minor changes to mappdf.c were required, but it still does not create a valid pdf. The error value in the HARU's pdf structure shows 0x1025 (HPDF_INVALID_DOCUMENT). To find out where this is being set, I'll likely implement a user-defined error handler. But I thought I'd see if anyone else was working on this...

Brent Fraser

comment:8 by bfraser, 15 years ago

More on the Above:

  1. There is already an error handler routine (good)
  2. My problem was in using the OpenType fonts delivered with Windows (bad), Haru not setting the error code as stated in its doc (bad), and our mappdf.c code not checking for null returns from HPDF_LoadTTFontFromFile (well heck, it's pre-alpha)
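A sketch of how both gaps could be covered: an error callback that remembers where the first error was raised, plus a check that treats a NULL font name as a failure even when no error code was set. The callback takes the same three arguments as a libharu error handler, but pdf_status is a stand-in type so the example builds without the library, and the struct and function names are hypothetical.

```c
#include <stdio.h>

/* Stand-in for the library's status type (assumption for this sketch). */
typedef unsigned long pdf_status;

typedef struct {
    pdf_status first_error;    /* 0 means no error has been seen */
    pdf_status first_detail;
    const char *stage;         /* set by the caller before each risky call */
    const char *failed_stage;  /* stage active when the first error fired */
} ErrorLog;

/* Record only the first error, since later ones are often knock-on effects. */
void record_error(pdf_status error_no, pdf_status detail_no, void *user_data)
{
    ErrorLog *log = (ErrorLog *)user_data;

    if (log->first_error == 0) {
        log->first_error = error_no;
        log->first_detail = detail_no;
        log->failed_stage = log->stage;
    }
    fprintf(stderr, "PDF error 0x%04lX (detail %lu) during %s\n",
            error_no, detail_no, log->stage != NULL ? log->stage : "?");
}

/* A font counts as loaded only if a name came back and no error is pending. */
int font_loaded_ok(const char *font_name, const ErrorLog *log)
{
    return font_name != NULL && log->first_error == 0;
}
```

Setting log.stage before each group of library calls makes the first logged error point at the step that caused it, rather than at whatever call happened to notice the bad document later.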

I'll stick with PDFLib until Haru supports OpenType fonts. I might look into PoDoFo.

Brent

comment:9 by assefa, 15 years ago

Brent,

I think I initially had the same issues, but I was able to produce a PDF with small changes. Those changes are not part of the patch yet. I am not actively working on this right now, but my hope is that this will make it into the 5.4 release.

comment:10 by sdlime, 15 years ago

Where is this one at? It feels moot with Cairo PDF support sitting not too far out with 6.0...

Steve

comment:11 by dmorissette, 15 years ago

Milestone: changed from 5.4 release to 6.0 release

I guess we'd need someone to compare Haru vs Cairo PDF before we make a decision

(Pushing target milestone to 6.0 since this is clearly not going into 5.4)

comment:12 by kyngchaos, 15 years ago

Off the top of my head, I can say that Haru is missing some features that would make it more universal, though what's available may be enough for MapServer (for now). Development is also slow: there was a spurt just before I found it that made it mostly usable, but it has slowed since then. Major missing features I can think of:

  • OTF fonts
  • TIFF images

Cairo is probably more complete (?). I haven't spent much time yet on getting a working Cairo library on OSX (I don't like their insistence on using pkg-config and font-config, though I'm giving in to needing font-config). Cairo may be more complicated to program for: I heard that you have to program knowledge of the various backends you want to support into your software, so for the OSX quartz font backend you need to know about the OSX font API, and similarly for the Win32 font and other system-specific render backends. MapServer should be fine, though, limited to the freetype+standard render backends.

Personally, I'd want Cairo to be usable as a universal PDF library: MapServer, PHP extension, GRASS, Python. I'm not sure about PHP, but the others are OK or in the works.

So, Cairo is OK with me. As long as we get a less-bloated and less-license-restricted PDF library.

comment:13 by tbonfort, 13 years ago

Resolution: wontfix
Status: changed from assigned to closed

so we went the cairo way...
