Opened 14 years ago

Closed 9 years ago

#3443 closed enhancement (wontfix)

PATCH: Speed up the MI_TAB driver by lazy loading the attribute data

Reported by: ksgeograf Owned by: warmerdam
Priority: normal Milestone:
Component: OGR_SF Version: 1.7.1
Severity: normal Keywords: mitab TAB columns attributes speedup
Cc: Daniel Morissette

Description

When reading features from an MI_TAB layer, the attributes are always loaded (copied into local variables). For some uses (eg. rendering) the attributes are not used. For many other uses (eg. labeling) only a few attributes are used.

My fix simply declares the IsFieldSet function virtual, so the MI_TAB layer can intercept the call and make sure that the attribute data only gets loaded when needed.

On actual data, I get a +30% percent speedup when rendering some 20.000 parcels in MapGuide.

The downside to this patch is that the IsFieldSet can no longer be declared const (as it may modify data).

The filter evaluation uses GetRawFieldRef so I inserted a call to IsFieldSet here as well.

The lazy loading approach can probably be more generic and use a designated method to perform lazy loading, instead of hooking the IsFieldSet. It could be used for other filebased formats as well.

The method that actually lazy loads the fields just scan the record from left to right, meaning that the speedup will be highest if the used fields are at the very left (position 0). It could be further optimized so the index of fields are calculated without reading the data, and then keeping a map of which columns are loaded. But the load is fairly fast, so there is a risk that the check and file seek makes the operation slower.

Attachments (1)

gdal-ogr-mitab-lazyload.patch (6.4 KB ) - added by ksgeograf 14 years ago.

Download all attachments as: .zip

Change History (5)

by ksgeograf, 14 years ago

comment:1 by warmerdam, 14 years ago

Cc: Daniel Morissette added
Keywords: mitab added

As there is no requirement on applications to call IsFieldSet(), I don't think this is a particularly general approach. I think lazy/optional loading of attribute (and/or spatial data) needs to be handled more comprehensively in OGR.

I'm cc:ing Daniel who maintains MITAB in case he is interested.

We can happily leave this patch here for folks who need the speedup in the meantime (possibly quite a long meantime).

comment:2 by ksgeograf, 14 years ago

I agree that this is not the best possible way to do this. Intercepting the IsFieldSet is bad practice as it abuses a function with another purpose. I chose it because all the property reading functions call it.

It would be much nicer if OGR had explicit methods to choose when/if the attribute loading should occur.

Unfortunately my knowledge about OGR is not great enough to provide a proposal that enables generic lazy-load support.

comment:3 by Jukka Rahkonen, 9 years ago

I do not believe that an open GDAL ticket is the best place for storing a mitab patch that gives some speedup for some users.

Perhaps mitab maintainer could copy the patch to wiki and close this ticket.

comment:4 by Even Rouault, 9 years ago

Resolution: wontfix
Status: newclosed

A better and safer approach now would be to implement the SetIgnoredFields() API used in a few drivers. So this patch will never be applied as it was designed.

Closing this ticket consequently.

Note: See TracTickets for help on using tickets.