Opened 17 years ago

Closed 11 years ago

Last modified 10 years ago

#175 closed defect (wontfix)

Unmanaged Files slow down as More Files get Added

Reported by: cgountanis
Owned by:
Priority: high
Milestone:
Component: Server
Version: 2.0.2
Severity: major
Keywords: fdo, mapguide 1.2 beta 2, shp, dbf
Cc:
External ID: 940227

Description

Why do unmanaged FeatureSources respond more slowly as more files are added to the directory location? We have a FeatureSource called SHPDefualt, for example. Zooms take longer, as do other functions. It seems that when you query a FeatureSource, it does not hit the DBF you want directly; it seems to fiddle with everything in the unmanaged directory. We love the concept of a folder that contains all SHP files. One location is great for batch processes and great for editing new layers. It is simple for customers to understand, and for sure easier to back up and recreate. Will this always be a performance killer? I was testing on Windows 2000, XP and Vista, with the same results, using MGOS 1.2 B2 and FDO 3.2.2 B2.

Is this a bug? Will this be fixed soon?

This really takes the complication out of knowing what the heck is going on in the repository. Here is an XML example:

<?xml version="1.0" encoding="utf-8" ?>
<FeatureSource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:noNamespaceSchemaLocation="FeatureSource-1.0.0.xsd">
  <Provider>OSGeo.SHP</Provider>
  <Parameter>
    <Name>DefaultFileLocation</Name>
    <Value>C:\SHPFILES</Value>
  </Parameter>
</FeatureSource>

An easy way to reproduce this is to create an unmanaged SHP feature source. Put one SHP file with its associated files in the directory and create a map. Go to the web layout stage and make sure you test with something that queries feature sources, like zooms. Now add 20+ SHP files; the bigger the FDO, the slower it gets. Don't change the map or web layout. Do the same thing as before: run the site with the initial one layer and zoom. You should notice a HUGE slowdown compared to the single SHP file you started with.

Change History (24)

comment:1 by cgountanis@…, 17 years ago

My 2 cents: This is becoming a huge issue for our customers who want a simple directory drop point. Actually, most of our customers do not even want to learn Load Procedures, and I don't trust the repository enough to handle the data. Unmanaged SHP and SDF are the only way to go with 90% of our authors.

comment:2 by amorsell@…, 17 years ago

Along those same lines, I have discovered that displaying tooltips from unmanaged shape files causes a much more significant performance hit and delay. In my case, I have a 150 MB parcel polygon shape file for a county. In the unmanaged case, each tooltip AJAX fetch spikes the CPU with mgserver.exe at 100% for a couple of seconds. Doing the same with the exact same shape file in a managed state (not converted to SDF at load time), the CPU will only hit 25% or less and responds much more quickly.

Further, I discovered that the more unmanaged data in a map, the slower the tooltip response is for all layers. For instance, if I add another 7 or so layers (like streets, water bodies, etc) to the map, when I float over a parcel polygon it can take up to 10 seconds for a tooltip response during which time mgserver is cranking away. So, I'm not sure what it's doing at that point. All of these additional layers are all in the same unmanaged data source folder.

I went ahead and loaded all of the shape files into the repository and created a duplicate map pointing to the new managed feature sources. This map behaves fine with tooltip fetches consuming 25% of a single CPU and very quick response time. So, this seems to confirm that the more unmanaged feature sources in a map (at least from the same directory), the larger the performance hit and the slower the map responds to requests.

comment:3 by amorsell@…, 17 years ago

I don't recall seeing this tooltip delay with the exact same map and feature sources with 1.2.0 beta 1 so perhaps the problem has been introduced with beta 2.

comment:4 by zspitzer, 17 years ago

I think there should be some caching going on with these FDO resource classes.

99.9% of the time the underlying resources are going to be the same, so caching should increase rendering performance.

MgFeatureService.setUseFeatureClassCache(true/false)
MgFeatureService.refreshFeatureClassCache()

comment:5 by tomfukushima, 17 years ago

I tried this out, and although I am seeing slowdown with 17 SHP files (200 MB worth of data), it is still very reasonable. The 17

Any help that you can provide in pinpointing this problem would be useful. For example, what particular thing with SHPs in a single directory causes the problem (since I'm not seeing the same problem). Also, does turning connection pooling on for SHP fix the problems?

comment:6 by cgountanis, 17 years ago

Either way, this is still a performance bug. You should be able to query a feature source (zoom by ID code) and not have slower speeds because of how many SHP files are in the directory. When will we see this improved, please? I have customers complaining, so I am complaining to you and Autodesk :-) It rolls downhill, what can I say.

In the meantime we will try the pooling and also try converting to SDF. For some customers SDF3 (Map2008) is not an option, especially ESRI customers. We need unmanaged SHP to work FAST.

comment:7 by tomfukushima, 17 years ago

Hi Chris, what I am saying is it is not enough of a performance issue from what I am seeing to justify fixing it. Since this is an open source forum, please help us in finding the problem. Thanks

comment:8 by cgountanis, 17 years ago

Hey Tom,

If I was better at C I would try to look at the problem. I realize this is open source, but with Autodesk as its big brother. I am sure they want to hit ESRI with fast SHP file performance, right?

I brought the issue up in the user group and many others have the same issue, including with tooltips. Your quote "although I am seeing slowdown" should prove the issue is in the way FDO reads the SHP files in one single directory. My solution would be, when you query unmanaged SHP feature sources, to ignore all other files in the directory and hit the one set of files only. For example, you have 32 SHPs/layers and you do a zoom on the parcel layer. It takes 15 seconds to zoom because there are other SHP files in the same directory. Is it not an easy fix to just nail the parcel files and ignore the rest? You have to specify the layer name when creating the resID, so why not use it?
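To illustrate the difference being asked for here, the following is a minimal sketch and not FDO's actual code: `files_for_class` resolves only the files belonging to one named feature class, while `classes_in_directory` performs the full directory enumeration whose cost grows with the number of SHP files. Both function names and the extension list are assumptions made for illustration.

```python
from pathlib import Path

# Sidecar extensions seen next to a .shp file (assumed list, for illustration only).
SIDECAR_EXTENSIONS = (".shp", ".shx", ".dbf", ".idx", ".prj")


def files_for_class(directory, class_name):
    """Return only the existing files that belong to one SHP feature class.

    The work done here does not depend on how many other files are in the directory.
    """
    base = Path(directory)
    candidates = (base / f"{class_name}{ext}" for ext in SIDECAR_EXTENSIONS)
    return [path for path in candidates if path.is_file()]


def classes_in_directory(directory):
    """Enumerate every feature class in the directory.

    This is the full scan; its cost grows with the number of .shp files present.
    """
    return sorted(path.stem for path in Path(directory).glob("*.shp"))
```

With a layer name in hand, a query for the parcel layer would only need `files_for_class(directory, "parcels")` rather than anything derived from the full scan.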

What else can I help with?

comment:9 by tomfukushima, 17 years ago

Hi Chris, open source projects do not expect the community to build and debug into the code, however if you could do this that would be great. Typically, what open source projects expect from the community is help in debugging and pinpointing a problem. I am not a QA person, so I'm not sure exactly what strategies are best here. But helping us find the problem by figuring out what combination of SHP files cause the problem, whether connection pooling makes a difference or not, etc. All of this could help us pinpoint and reproduce the problem.

To get Autodesk involved, I suggest you use your MGE subscription and contact Autodesk directly.

comment:10 by amorsell, 17 years ago

Yes, enabling connection pooling for shape made a huge (positive) difference. Performance is now acceptable. I'm not sure the slow request every 10 minutes will be a large issue. In my case, the slowness was not noticeable until fetching parcel polygon tooltips (10 seconds for each tooltip). But, at a higher zoom factor, other shape files from the same unmanaged feature source are referenced in the map which is where I imagine it creates the initial XML config and caches it.

One thing to note is that the default installation has connection pooling enabled, but the shape provider is excluded from pooling. The default line in serverconfig.ini reads:

DataConnectionPoolExcludedProviders = OSGeo.SDF,OSGeo.SHP

So it must be edited and OSGeo.SHP removed. Not a huge deal, but maybe it should be this way by default?
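For anyone scripting this change across several servers, here is a minimal sketch that removes one provider from the exclusion list in the text of serverconfig.ini. The function name `enable_pooling_for` is hypothetical; the only thing taken from the actual configuration is the `DataConnectionPoolExcludedProviders` key and its comma-separated value quoted above. It assumes the key appears as a plain `key = value` line.

```python
def enable_pooling_for(config_text, provider="OSGeo.SHP"):
    """Return config_text with `provider` removed from the pooling exclusion list.

    Only the DataConnectionPoolExcludedProviders line is rewritten; every other
    line is passed through unchanged. A trailing newline is not preserved.
    """
    key = "DataConnectionPoolExcludedProviders"
    out = []
    for line in config_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            # Keep every listed provider except the one being re-enabled.
            kept = [p.strip() for p in value.split(",")
                    if p.strip() and p.strip() != provider]
            line = f"{key} = {','.join(kept)}"
        out.append(line)
    return "\n".join(out)
```

Back up the original file before writing the result out, so the exclusion can be restored if pooling causes problems with the provider.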

comment:11 by rexszeto, 17 years ago

External ID: 940227

comment:12 by cgountanis, 17 years ago

Yeah, the pooling changes in serverconfig.ini did seem to make basic navigation of the map much faster. However, caching all of that data was harsh on memory, plus it has to refresh every 10 minutes or so. Not to mention it locks files. I don't see this as a good workaround for the unmanaged SHP feature source problem we are all having. Zooms, however, are still slow because the query side of the feature source is still not hitting the DBF directly, but seemingly hitting everything in the directory before results are pushed to the viewer.

Can we say this is a bug? Can this be fixed before 2008 Ent?

comment:13 by tomfukushima, 17 years ago

Milestone: 1.21.3

comment:14 by tomfukushima, 17 years ago

Some information from Bruce that was posted to the mailing list today: SDF and SHP are excluded from pooling because of an update/reader issue within FDO that has not been addressed yet. If you notice some problems with update/readers then you should leave it as excluded. I believe this is planned to be fixed with the next release of FDO post MGOS 1.2.

comment:15 by cgountanis, 17 years ago

So leave pooling out on this topic and just get the more than one shp file in a folder, unmanaged feature source, smoothed out then please. Pooling was only tried as a failed work around to the unmanaged solution, which I have to say customers are yelling for. They don't understand the repo and want to batch to a single folder location for simplicity.

comment:16 by tomfukushima, 16 years ago

Milestone: 2.0

Is this still an issue? Chris, please retest with the 2.0 final release.

comment:17 by cgountanis, 16 years ago (in reply to: comment:16)

Resolution: fixed
Status: new → closed

Replying to tomfukushima:

Is this still an issue? Chris, please retest with the 2.0 final release.

I tried it briefly with 22 SHP files and a folder alias. It seems to function properly, with no really noticeable delay. I would have to assume this is not an issue. If I find more when the next project comes up, I will let you know. Thanks!

comment:18 by zspitzer, 16 years ago

Resolution: fixed
Status: closed → reopened
Version: 1.2.0 → 2.0.0

I just encountered this issue with MG 2.0 on Linux. Performance was terrible until I created a separate FeatureSource for each SHP, and then it started to fly along.

Here is the file list. When I had the problem, only the rail and road SHPs were being used, with no labelling. I only solved it because I remembered reading this bug a while ago :)

total 55M
8.0K tr_air_infra_area_centroid.dbf
8.0K tr_air_infra_area_centroid.idx
8.0K tr_air_infra_area_centroid.prj
8.0K tr_air_infra_area_centroid.shp
8.0K tr_air_infra_area_centroid.shx
 12K tr_air_infra_area_line.dbf
8.0K tr_air_infra_area_line.idx
8.0K tr_air_infra_area_line.prj
 16K tr_air_infra_area_line.shp
8.0K tr_air_infra_area_line.shx
 12K tr_air_infra_area_polygon.dbf
8.0K tr_air_infra_area_polygon.idx
8.0K tr_air_infra_area_polygon.prj
 16K tr_air_infra_area_polygon.shp
8.0K tr_air_infra_area_polygon.shx
8.0K tr_airport_area_centroid.dbf
8.0K tr_airport_area_centroid.idx
8.0K tr_airport_area_centroid.prj
8.0K tr_airport_area_centroid.shp
8.0K tr_airport_area_centroid.shx
 12K tr_airport_area_line.dbf
8.0K tr_airport_area_line.idx
8.0K tr_airport_area_line.prj
 12K tr_airport_area_line.shp
8.0K tr_airport_area_line.shx
8.0K tr_airport_area_polygon.dbf
8.0K tr_airport_area_polygon.idx
8.0K tr_airport_area_polygon.prj
 12K tr_airport_area_polygon.shp
8.0K tr_airport_area_polygon.shx
 20K tr_airport_infrastructure.dbf
8.0K tr_airport_infrastructure.idx
8.0K tr_airport_infrastructure.prj
 12K tr_airport_infrastructure.shp
8.0K tr_airport_infrastructure.shx
8.0K tr_ferry_route.dbf
8.0K tr_ferry_route.idx
8.0K tr_ferry_route.prj
8.0K tr_ferry_route.shp
8.0K tr_ferry_route.shx
200K tr_rail.dbf
 40K tr_rail.idx
 20K tr_rail_infrastructure.dbf
8.0K tr_rail_infrastructure.idx
8.0K tr_rail_infrastructure.prj
8.0K tr_rail_infrastructure.shp
8.0K tr_rail_infrastructure.shx
8.0K tr_rail.prj
200K tr_rail.shp
 16K tr_rail.shx
 24M tr_road.dbf
1.8M tr_road.idx
8.7M tr_road_infrastructure.dbf
1.5M tr_road_infrastructure.idx
8.0K tr_road_infrastructure.prj
1.4M tr_road_infrastructure.shp
400K tr_road_infrastructure.shx
2.6M tr_road_locality.dbf
3.0M tr_road_locality_section.dbf
8.0K tr_road.prj
 11M tr_road.shp
492K tr_road.shx
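The one-FeatureSource-per-SHP workaround described in this comment can be scripted. The sketch below generates one FeatureSource XML document per .shp file, reusing the XML layout from the ticket description. It assumes that the SHP provider accepts a path to a single .shp file in `DefaultFileLocation`, which should be verified against the provider version in use. The function names are hypothetical, and loading the resulting XML into the repository is left out.

```python
from pathlib import Path
from xml.sax.saxutils import escape

# Same layout as the FeatureSource XML in the ticket description.
TEMPLATE = """<?xml version="1.0" encoding="utf-8"?>
<FeatureSource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:noNamespaceSchemaLocation="FeatureSource-1.0.0.xsd">
  <Provider>OSGeo.SHP</Provider>
  <Parameter>
    <Name>DefaultFileLocation</Name>
    <Value>{location}</Value>
  </Parameter>
</FeatureSource>
"""


def feature_source_xml(shp_path):
    """Build FeatureSource XML pointing at one .shp file (assumption: a file path is accepted)."""
    return TEMPLATE.format(location=escape(str(shp_path)))


def feature_sources_for_directory(directory):
    """Map each feature class name (the .shp stem) to its own FeatureSource XML."""
    shp_files = sorted(Path(directory).glob("*.shp"))
    return {path.stem: feature_source_xml(path) for path in shp_files}
```

For the directory listed above, this would yield one document per .shp file, such as tr_rail and tr_road, so each layer can reference a feature source that names only its own file.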

comment:19 by zspitzer, 16 years ago

I am seeing the same issue using an unmanaged SDF file with multiple schemas (20).

comment:20 by tomfukushima, 16 years ago

I think that if we can get someone to implement http://trac.osgeo.org/fdo/wiki/FDORfc23 on the SHP and SDF providers, this performance problem will be resolved no matter how many SHP files or feature classes there are. Just an educated guess though.

comment:21 by zspitzer, 15 years ago

Summary: Unmanaged SHP File Locations Performance Issues as More Files get Added → Unmanaged Files slow down as More Files get Added
Version: 2.0.0 → 2.0.2

comment:22 by jbirch, 14 years ago

Is this any better with 2.1? Has RFC23 been implemented in MapGuide yet? Will close this ticket if no response by next ticket cleanup cycle.

comment:23 by jng, 11 years ago

Resolution: wontfix
Status: reopened → closed

Outside the scope of MapGuide.

If you want this fixed, push for FDO RFC23 to be implemented by the SHP provider.

comment:24 by jng, 10 years ago

The SHP provider in FDO trunk now implements the RFC 23 APIs, which resolve this issue. However, this was too late to make it into FDO 3.9, so the RFC23-enabled provider will land in the MapGuide release after 2.6.
