ogr2ogr - SQL Server data loading very slow
Following is a discussion we've had on the mailing list.
Basically, it seems that the SQL Server driver does not use batch/bulk insert.
It's at least 10 times slower than what it could be.
I left the priority and severity to normal, but in reality, for me, it means that I will have to go with a custom made solution. I could have done the fix myself I guess, but the time involved to setup a development environment for ogr2ogr would exceed the time to manually read a mapinfo file and right through jdbc.
Keep up the good work by the way.
Seems we can get the significant improvement by using bulk inserts then. You may submit a ticket to keep this enhancements in scope. Some of the guys used the driver successfully by using FreeTDS on Linux.
2012/8/27 Duchesne, Jimmy <JDuchesne@…> Thanks for the answers guys.
What you're saying concurs with what I was experiencing, and I believe, what I was seeing with the SQL Server Management profiler.
Also, I don't know if inserting with something else than WKT would actually improve the performance because my test, bulk inserts done with Java, were also inserting using WKT.
Finally, just to be sure, is there any issue with using this driver on a Linux OS?
From: fwarmerdam@… fwarmerdam@… On Behalf Of Frank Warmerdam Sent: August-24-12 6:39 PM To: Duchesne, Jimmy Cc: gdal-dev@…; Szekeres Tamás Subject: Re: [gdal-dev] SQL Server Driver
From an examination of MSSQLSpatialTableLayer::CreateFeature() it appears a regular INSERT statement is done for each feature written to the db rather than any sort of bulk mechanism. I could be missing something of course. Hopefully Tamas will provide a more experienced answer.
Best regards, Frank
On Fri, Aug 24, 2012 at 3:03 PM, Duchesne, Jimmy <JDuchesne@…> wrote:
I've been using SQL Server Driver for a few days now, and I noticed it was very slow, whatever the parameters, such as GT, I was using.
I need to insert millions of rows in a table.
To give you an idea, it takes around 1 minute to import 10k records with the driver
On the other hand, if I write my own routine which reads a MapInfo file and does batch insert into the database, I can insert over 125k records per minute.
I tried reading the same TAB file with ogr2ogr but writing to PostGis and noticed the speed was about the same than my routine, which is enough for my needs.
Would it be possible that the SQL Server Driver does not do bulk insert?
Thanks for the help.