Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#6200 closed defect (invalid)

xls file is not properly read

Reported by: EliL
Owned by: warmerdam
Priority: normal
Milestone:
Component: default
Version: svn-trunk
Severity: normal
Keywords: xls, formula, freeXL
Cc: Bas Couwenberg, a.furieri@…

Description

The attached .xls file is not read properly by OGR with the XLS/freeXL driver. I tried editing the file (with LibreOffice; the file probably originates from an ancient version of Excel) to isolate the problem to specific rows, but the results have been very inconsistent. Sometimes I copy and paste into new sheets/tabs and get a working result, and sometimes I don't, with no apparent explanation.

Also attached are my OGR commands.

The file contains some formulas which, based on the driver documentation page, I expect might not work. From other experience (OSGeo4W), I would expect formula fields to return null, not the entire layer to fail to read.

I worked around the problem by converting the file to .xlsx, which works fine with the XLSX driver.

Contact me if you need additional similar files that work differently.

Attachments (2)

Rd_list2.xls (14.0 KB) - added by EliL 5 years ago.
xls file that is not read properly
OGR_commands_xls.txt (2.9 KB) - added by EliL 5 years ago.
OGR commands and output


Change History (11)

Changed 5 years ago by EliL

Attachment: Rd_list2.xls added

xls file that is not read properly

Changed 5 years ago by EliL

Attachment: OGR_commands_xls.txt added

OGR commands and output

comment:1 Changed 5 years ago by Even Rouault

I can't reproduce the problem with the freexl version I use (no idea which one; I haven't upgraded it in years). The general structure of the sheets is returned, except for computed fields, which are null. Which freexl version do you use?

comment:2 Changed 5 years ago by EliL

Never mind that: early-morning confusion, and I checked the wrong (virtual) computer.


comment:3 Changed 5 years ago by EliL

Here are the results I expect: PAVED and ROCKED have values, and TOTAL is null (different file names and layers, but similar data):

eadam@ELA-gdal:~$ ogrinfo Rd_list_again.xls -sql "SELECT * FROM Sheet2 WHERE FID <3" --config OGR_XLS_HEADERS FORCE
Had to open data source read-only.
INFO: Open of `Rd_list_again.xls'
      using driver `XLS' successful.

Layer name: Sheet2
Geometry: None
Feature Count: 2
Layer SRS WKT:
(unknown)
Field1: String (0.0)
RD_Num: Integer (0.0)
ROAD_NAME: String (0.0)
PAVED: Real (0.0)
ROCKED: Real (0.0)
TOTAL: Real (0.0)
OD_Permit: String (0.0)
Field8: String (0.0)
Field9: String (0.0)
ZONE: String (0.0)
OGRFeature(Sheet2):2
  Field1 (String) = (null)
  RD_Num (Integer) = 1
  ROAD_NAME (String) = THREE ROCKS RD                
  PAVED (Real) = 2.36
  ROCKED (Real) = 0.57
  TOTAL (Real) = (null)
  OD_Permit (String) = Y
  Field8 (String) = (null)
  Field9 (String) = (null)
  ZONE (String) = N 
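To compare a good and a bad run like the ones in this ticket without reading the dumps by eye, the ogrinfo text output can be scanned for fields printed as (null). The following is only a minimal sketch: the helper name null_fields and the regular expression are illustrative assumptions, not part of GDAL/OGR, and the sample text is copied from the output above.

```python
import re

# Matches ogrinfo feature lines such as "  PAVED (Real) = 2.36".
# The pattern is an assumption based on the dump shown in this ticket.
FIELD_LINE = re.compile(r"^\s+(\w+) \((\w+)\) = (.*)$")


def null_fields(ogrinfo_text):
    """Return the names of fields that ogrinfo printed as (null)."""
    nulls = []
    for line in ogrinfo_text.splitlines():
        match = FIELD_LINE.match(line)
        if match and match.group(3).strip() == "(null)":
            nulls.append(match.group(1))
    return nulls


# Feature dump copied from the expected output in this comment.
SAMPLE = """OGRFeature(Sheet2):2
  Field1 (String) = (null)
  RD_Num (Integer) = 1
  ROAD_NAME (String) = THREE ROCKS RD
  PAVED (Real) = 2.36
  ROCKED (Real) = 0.57
  TOTAL (Real) = (null)
  OD_Permit (String) = Y
  Field8 (String) = (null)
  Field9 (String) = (null)
  ZONE (String) = N
"""

print(null_fields(SAMPLE))
```

In the expected case only the formula column TOTAL and the unnamed or empty columns are reported as null; PAVED and ROCKED are not.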

comment:4 Changed 5 years ago by Even Rouault

Cc: Bas Couwenberg added

Bas, I've cc'ed you since I'm wondering whether there is a problem with the Ubuntu 14.04 freexl package (1.0.0g-1ubuntu0.14.04.1). When I use freexl 1.0.0g compiled from upstream sources with no additional patch (or the latest 1.0.2), I get non-null content with ogrinfo on the above-mentioned xls file. But if I use the packaged freexl, I get null content. It appears that the Ubuntu version has security patches, and I'm wondering whether one of them causes read problems with legitimate files. Actually, I've just manually downgraded to the previous version https://launchpad.net/ubuntu/+archive/primary/+files/libfreexl1_1.0.0g-1_amd64.deb and I get the expected result, so there's definitely an issue with 1.0.0g-1ubuntu0.14.04.1.

comment:5 in reply to:  4 Changed 5 years ago by Bas Couwenberg

Cc: a.furieri@… added

The only change in freexl (1.0.0g-1ubuntu0.14.04.1) is the addition of afl-vulnerabilitities.patch which is taken from the freexl repository.

The Ubuntu security update is nearly identical to the freexl (1.0.0g-1+deb8u1) security update for Debian.

If afl-vulnerabilitities.patch is the problem, it should also affect freexl in Debian jessie. And it does, I get the same results as in attachment:OGR_commands_xls.txt on Debian jessie with gdal (1.10.1+dfsg-8+b3) & freexl (1.0.0g-1+deb8u2).

I've added Sandro to the CC to get his view on the freexl changes.

comment:6 Changed 5 years ago by Even Rouault

Resolution: invalid
Status: new → closed

Sebastian, looking at the fossil history, it seems that the security patch caused a regression that was later fixed with https://www.gaia-gis.it/fossil/freexl/fdiff?v1=61618ce51a9b0c15&v2=4f9408c216ead322&sbs=1

It isn't very visible in the HTML output, but the fix is not the addition of the freexl_version() function. It is the change of if (workbook->sector_end <= (workbook->p_in - workbook->sector_buf)) to if (workbook->sector_end < (workbook->p_in - workbook->sector_buf)) at line 3762 of freexl.c
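The difference between the two comparisons only shows up at the exact boundary. As a rough illustration (a Python model, not the freexl code), assume the condition guards against the read cursor running past the end of the sector buffer; the ticket does not say what the branch does, so that reading, the function names, and the value 512 are all assumptions made for the example.

```python
def regressed_check(sector_end, consumed):
    """Model of the check from the security patch: sector_end <= consumed."""
    return sector_end <= consumed


def fixed_check(sector_end, consumed):
    """Model of the later fix: sector_end < consumed."""
    return sector_end < consumed


# `consumed` stands in for (workbook->p_in - workbook->sector_buf).
# 512 is a hypothetical sector size chosen only for the demonstration.
SECTOR_END = 512

for consumed in (511, 512, 513):
    print(consumed,
          regressed_check(SECTOR_END, consumed),
          fixed_check(SECTOR_END, consumed))
```

The two checks agree everywhere except when the cursor sits exactly at the end of the buffer (consumed == sector_end): the `<=` form triggers there, the `<` form does not. Under the assumption above, that single boundary case would explain why some legitimate files stopped being read while others still worked.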

I'm going to close this ticket, as it is now obvious this isn't a GDAL bug.

comment:7 Changed 5 years ago by Bas Couwenberg

Thanks for pointing out the regression fix. I've confirmed that the less-than change fixes the regression and prepared updates of the freexl package to include the fix. These will hopefully be available in Debian & Ubuntu soon.

comment:8 Changed 5 years ago by Bas Couwenberg

In Debian the regression has been fixed for jessie in freexl (1.0.0g-1+deb8u3) and wheezy in freexl (1.0.0b-1+deb7u3) with DSA 3208-2 freexl regression update.

In Ubuntu the regression has been fixed for trusty in freexl (1.0.0g-1ubuntu0.14.04.2) and for vivid in freexl (1.0.0h-1~exp1ubuntu1.1) via Launchpad Bug #1516257. The updates for Ubuntu also include the security fix from FreeXL 1.0.2 as mentioned in the Launchpad bug.

comment:9 Changed 5 years ago by EliL

With this update, OGR and these .xls files now operate as expected. Thanks.
