Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#4314 closed enhancement (fixed)

Add ability to detect geojson files without an extension

Reported by: springmeyer Owned by: warmerdam
Priority: normal Milestone: 1.9.0
Component: OGR_SF Version: unspecified
Severity: normal Keywords:
Cc:

Description (last modified by springmeyer)

GeoJSON is becoming a very common format for data apis. And therefore it is common for applications to grab it remotely. OGR nicely supports reading geojson (and detecting it as geojson) from a string via url or locally.

But in some cases an application may wish to first download the data locally, cache it, and then pass that data to ogr. In the common case that the url does not end in a well know geojson extension, simply saving the file without an extension and passing the filename to ogr fails.

This failure to detect as geojson a file like 'jsonwithoutextension' is understandable. You would have to open it to see if there is json inside and because OGR already has a ton of other checks for different file types if each driver did this you would likely see quite slow initialization of OGR datasources. So, feel free to reject this patch outright if this seems like to much to ask of OGR.

But, from my limited view, it would be nice if ogr would handle this kind of detection better for GeoJSON because it seems like a potentially common situation. Instead of requiring the calling application to open the file and detect if it is geojson just to give it a filename extension, ogr could do this in one pass (from the perspective of the calling application).

The attached patch is a first stab at that. I'm totally unfamiliar with CPL usage and coding styles (eg. I free the char* but should the CPLString also be freed?), so please have mercy.

Attachments (3)

detect_files_with_no_ext_if_geojson.diff (2.6 KB) - added by springmeyer 6 years ago.
ogr_geojson_test_for_extensionless_geojson.diff (2.0 KB) - added by springmeyer 6 years ago.
patch to test new ability to detect extension-less geojson files
detect_files_with_no_ext_if_geojson2.diff (2.6 KB) - added by springmeyer 6 years ago.
read a few more bytes to enable latest autotests to pass

Download all attachments as: .zip

Change History (9)

comment:1 Changed 6 years ago by Even Rouault

Which attached patch ? ;-)

(CPLString is a C++ object, so no need to explicitely destroy it.)

Changed 6 years ago by springmeyer

comment:2 Changed 6 years ago by springmeyer

Lost my internet as patch was uploading yesterday, apologies. Also just realized my autotest directory was out of data, so all tests (that I've added) do not yet pass. Another patch coming in a sec (along with mod to autotests).

Changed 6 years ago by springmeyer

patch to test new ability to detect extension-less geojson files

Changed 6 years ago by springmeyer

read a few more bytes to enable latest autotests to pass

comment:3 Changed 6 years ago by springmeyer

Description: modified (diff)

With latest patch and updated tests I get:

$python ogr_geojson.py

Test Script: ogr_geojson
Succeeded: 21
Failed:    0 (0 blew exceptions)
Skipped:   0
Expected fail:0

comment:4 Changed 6 years ago by Even Rouault

Milestone: 1.9.0
Resolution: fixed
Status: newclosed

Dane, I've applied your patch with a few changes :

1) no need to check the file size (which can be a very slow opertion when reading gzipped files for example) : we just try to read 6000 bytes and if there are less, it is OK. 2) I've improved the detection logic, which didn't pass with the ESRI json files.

r23293 /trunk/gdal/ogr/ogrsf_frmts/geojson/ogrgeojsonutils.cpp: Add ability to detect geojson files without an extension (#4314)

comment:5 Changed 6 years ago by Even Rouault

r23294 /trunk/autotest/ogr/ogr_geojson.py: Test detection of geojson files without an extension (#4314)

comment:6 Changed 6 years ago by springmeyer

great, thanks for the fixes and for applying!

Note: See TracTickets for help on using tickets.