Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#4314 closed enhancement (fixed)

Add ability to detect geojson files without an extension

Reported by: springmeyer Owned by: warmerdam
Priority: normal Milestone: 1.9.0
Component: OGR_SF Version: unspecified
Severity: normal Keywords:
Cc:

Description (last modified by springmeyer)

GeoJSON is becoming a very common format for data apis. And therefore it is common for applications to grab it remotely. OGR nicely supports reading geojson (and detecting it as geojson) from a string via url or locally.

But in some cases an application may wish to first download the data locally, cache it, and then pass that data to ogr. In the common case that the url does not end in a well know geojson extension, simply saving the file without an extension and passing the filename to ogr fails.

This failure to detect as geojson a file like 'jsonwithoutextension' is understandable. You would have to open it to see if there is json inside and because OGR already has a ton of other checks for different file types if each driver did this you would likely see quite slow initialization of OGR datasources. So, feel free to reject this patch outright if this seems like to much to ask of OGR.

But, from my limited view, it would be nice if ogr would handle this kind of detection better for GeoJSON because it seems like a potentially common situation. Instead of requiring the calling application to open the file and detect if it is geojson just to give it a filename extension, ogr could do this in one pass (from the perspective of the calling application).

The attached patch is a first stab at that. I'm totally unfamiliar with CPL usage and coding styles (eg. I free the char* but should the CPLString also be freed?), so please have mercy.

Attachments (3)

detect_files_with_no_ext_if_geojson.diff (2.6 KB ) - added by springmeyer 12 years ago.
ogr_geojson_test_for_extensionless_geojson.diff (2.0 KB ) - added by springmeyer 12 years ago.
patch to test new ability to detect extension-less geojson files
detect_files_with_no_ext_if_geojson2.diff (2.6 KB ) - added by springmeyer 12 years ago.
read a few more bytes to enable latest autotests to pass

Download all attachments as: .zip

Change History (9)

comment:1 by Even Rouault, 12 years ago

Which attached patch ? ;-)

(CPLString is a C++ object, so no need to explicitely destroy it.)

by springmeyer, 12 years ago

comment:2 by springmeyer, 12 years ago

Lost my internet as patch was uploading yesterday, apologies. Also just realized my autotest directory was out of data, so all tests (that I've added) do not yet pass. Another patch coming in a sec (along with mod to autotests).

by springmeyer, 12 years ago

patch to test new ability to detect extension-less geojson files

by springmeyer, 12 years ago

read a few more bytes to enable latest autotests to pass

comment:3 by springmeyer, 12 years ago

Description: modified (diff)

With latest patch and updated tests I get:

$python ogr_geojson.py

Test Script: ogr_geojson
Succeeded: 21
Failed:    0 (0 blew exceptions)
Skipped:   0
Expected fail:0

comment:4 by Even Rouault, 12 years ago

Milestone: 1.9.0
Resolution: fixed
Status: newclosed

Dane, I've applied your patch with a few changes :

1) no need to check the file size (which can be a very slow opertion when reading gzipped files for example) : we just try to read 6000 bytes and if there are less, it is OK. 2) I've improved the detection logic, which didn't pass with the ESRI json files.

r23293 /trunk/gdal/ogr/ogrsf_frmts/geojson/ogrgeojsonutils.cpp: Add ability to detect geojson files without an extension (#4314)

comment:5 by Even Rouault, 12 years ago

r23294 /trunk/autotest/ogr/ogr_geojson.py: Test detection of geojson files without an extension (#4314)

comment:6 by springmeyer, 12 years ago

great, thanks for the fixes and for applying!

Note: See TracTickets for help on using tickets.