Changes between Version 4 and Version 5 of PscMeeting04-02-2009


Ignore:
Timestamp:
Apr 2, 2009, 1:10:18 PM (15 years ago)
Author:
ksgeograf
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • PscMeeting04-02-2009

    v4 v5  
    2020 * Way forward on Raster for 2.1
    2121 * Others?
     22
     23== Minutes ==
     24
     25PSC Members present: Tom, Bob, Bruce, Jason, Haris, Kenneth.
     26=== Review last meeting's action items ===
     27
     28PSC Members present:
     29Tom, Bob, Bruce, Jason, Haris, Kenneth
     30
     31=== Review last meeting's action items ===
     32 * All items are on todays meeting as well
     33
     34=== Status for the !PayPal initiative ===
     35 * There are legal issues involved, since Trevor is an Autodesk employee.
     36 * An outcome is expected before the next meeting.
     37
     38=== 2.1 and CS-MAP ===
     39 * Trunk should point to CS-MAP 12.01
     40 * Branches are not created until a specific major code change is expected
     41 * ... And a special case is the MG Connection Manager patches that Haris proposes.
     42 * Tom will assist Haris in setting up the branch, and Bruce agreed to work on the issue with changes.
     43
     44=== Fusio 2.0 beta timeline ===
     45 * Likely to be ready Monday
     46
     47=== Raster issues for 2.1 ===
     48 * Debate on what locks are required
     49 * Suggestion:
     50   * Add the two fixes produced by Haris (already commited)
     51   * Add the stylization lock produced by Haris
     52 * Votes: Tom +1, Bob +1, Bruce +1, Paul +1, Kenneth +1
     53 * Postpone the refcount fix produced by Traian, and initially suggested by Haris due to possible issues with the lock
     54
     55=== End of meeting ===
     56
     57== Full transcript ==
     58{{{
     59[INFO]  Channel view for “#mapguide” opened.
     60        <rbray> Hey everyone, are we ready to start?
     61        <tomf2> I'm ready
     62        <bdechant>      Same
     63        <ksgeograf>     ready.
     64        <rbray> Who want's to take minutes today - any volunteers?
     65        <ksgeograf>     I would like to give it a try today
     66        <bdechant>      I will volunteer
     67        <ksgeograf>     I'll step down then :)
     68        <bdechant>      you sure? :)
     69        <rbray> Ok, Kenneth it is
     70        <rbray> I checked last meetings minutes and there really are no actions with updates that are not in this weeks agenda already - so let's start there
     71        <HarisK>        hi
     72        <rbray> So first item: Status of the PayPal initiative for builds
     73        <rbray> Tom this is yours
     74        <tomf2> We are currently trying to figure out whether Trevor is allowed to do this as an employee of Autodesk
     75        <tomf2> I was hoping to have a better update today, but
     76        <tomf2> the meeting with the lawyers did not go through yesterday as planned. So we will have to wait some more
     77        <tomf2> That's it. Any questions?
     78        <rbray> Apparently the meeting with legal is monday - I asked Trevor just before the PSC meeting
     79        <rbray> So we should know more next week
     80        <rbray> I appologize for the delay on this, everything legal here takes a while
     81        <rbray> Anyway, next item: 2.1 branch (including setting externals to CS-Map 12.01 branch)
     82        <jasonbirch>    oops; sorry
     83        <tomf2> I'm wondering whether anyone is holding onto a big submission for branch, but they can't because we haven't created the branch yet
     84        <tomf2> I mean for trunk
     85        <tomf2> ...but maybe we are at the stage where it would be good to create the branch anyway
     86        <jasonbirch>    My main thing about getting a branch
     87        <jasonbirch>    Is that then we could hard-code specific tags for externals
     88        <jasonbirch>    Like for CS-Map
     89        <jasonbirch>    Instead of following their trunk
     90        <tomf2> We've always been holding off on creating branches because it saves developers some time if they don't have to merge. I'm good with pointing to CSMap 12.01 even from Trunk
     91        <jasonbirch>    OK, I'd be OK with that too.
     92        <jasonbirch>    We don't really have to branch until
     93        <jasonbirch>    we start getting people wanting to make massive chances
     94        <jasonbirch>    Or release, whatever comes first :)
     95        <jasonbirch>    chances -> changes
     96        <tomf2> OK, just say when you want the branch and I'll make it. I'll also make that change to point to CSMap 12.01 today.
     97        <bdechant>      thanks Tom
     98        <rbray> ok next item
     99        <HarisK>        If I would sumbit changes for MG connection manager, that would require branch to submit too ?
     100        <rbray> Oops - hold on
     101        <HarisK>        sorry, Bob slow here
     102        <rbray> Thats ok
     103        <jasonbirch>    If you were going to change connection manager to not use refcounting
     104        <jasonbirch>    I think that someone wanted that do be done in a post-2.1
     105        <jasonbirch>    Bruce maybe?
     106        <bdechant>      that was me Jason :)
     107        <bdechant>      The connection manager code is a vital part of MapGuide and changing it this close to release could be dangerous
     108        <HarisK>        agrre, only I would like to understand if I would like to change it
     109        <rbray> How big is the change?
     110        <HarisK>        not so big in coding
     111        <HarisK>        but may effect all parts of MG as Bruce said
     112        <rbray> I think we should assess the risk - maybe via code review
     113        <rbray> and then decide
     114        <bdechant>      agree
     115        <jasonbirch>    I guess we're still counting on the ADSK beta to give us some stability, and not relying on our own Beta process very much. That will have to change eventually.
     116        <jasonbirch>    We're not tehnically all that close to release is all I mean :)
     117        <HarisK>        agree, to do review I would ask fro branch and do changes there ?
     118        <rbray> That is why I think we should look at it
     119        <rbray> HarisK - can you make a patch?
     120        <jasonbirch>    HarisK: Someone could create a sandbox for you
     121        <jasonbirch>    to work in.
     122        <tomf2> There is a sandbox area, HarisK your change seems more appropriate for that
     123        <HarisK>        ok
     124        <tomf2> It seems that you are doing a proof of concept type exercise; please correct me if I'm wrong
     125        <HarisK>        I hope not :)
     126        <HarisK>        connection manager is not working properly in MG
     127        <jasonbirch>    If more than a couple files have to change, then a sandbox is better than patch anyway (I think)
     128        <jasonbirch>    And a patch can easily be generated via SVN once the changes are made and tested in sandbox.
     129        <HarisK>        and not a proof concept but try to make it work
     130        <jasonbirch>    If the changes aren't too risky, it would be nice.
     131        <tomf2> Couldn't you do that all locally? Why do you need a branch for this?
     132        <jasonbirch>    He could, but then would have to upload patch for review.
     133        <jasonbirch>    Also, local changes don't get backed up :)
     134        <ksgeograf>     He needs a branch for review purposes
     135        <tomf2> Reviews can be done via patches
     136        <jasonbirch>    So if he's doing a lot of work, would be better to do it in a branch.
     137        <jasonbirch>    The sandbox can easily be deleted post-work
     138        <tomf2> yes, if it's a lot of work a branch would be good
     139        <tomf2> That's a good reason
     140        <bdechant>      I think sandbox is right place to do this
     141        <tomf2> BTW, HarisK, thanks for doing this. Do you need any help creating the branch?
     142        <HarisK>        yes, i would need help with svn in any case, thanks
     143        <tomf2> OK, I'll create you a sandbox/haris/2.1 branch
     144        <tomf2> (feel free to give me a different name)
     145        <HarisK>        ok, I was also waiting when 2.1 will roll out
     146        <rbray> Anything else on this?
     147        <HarisK>        and chekc with others when would be appropriate time to start this bigger change
     148        <HarisK>        I am done :)
     149        <tomf2> Is there any reason not to start the change now?
     150        <HarisK>        I was kind of hopping that sombody who was working on current connection manager
     151        <HarisK>        will come to email list
     152        <HarisK>        and we discuss it
     153        <tomf2> I think that person would be Bruce
     154        <bdechant>      that is me :) I've started to look at the code
     155        <jasonbirch>    How about you commit what you have to sandbox so people can see,
     156        <jasonbirch>    And then discuss? :)
     157        <HarisK>        it is not only code which i wrote
     158        <HarisK>        but will need also little bit of changes
     159        <bdechant>      I'm well aware of the issue and would love to get rid of the reference count tracking
     160        <HarisK>        which would influence perhaps fewer places of MG code
     161        <HarisK>        ok, great
     162        <HarisK>        I would like that we exchange ideas how to solve it, before writting full imp[lementtation
     163        <bdechant>      I would happy to work with you on this in the sandbox :)
     164        <bdechant>      sounds good
     165        <HarisK>        sounds perfect to me
     166        <bdechant>      we can finish this issue chat offline so we can contiune
     167        <rbray> IMO - it would be great to get that in for 2.1
     168        <rbray> So let's see how it shapes up
     169        <bdechant>      we can try :)
     170        <rbray> Next item was: timeline for next Fusion 2.0 beta?
     171        <rbray> But no Paul today
     172        <jasonbirch>    next beta?
     173        <rbray> Anyone else know the status?
     174        -->|    pagameba has joined #mapguide
     175        <ksgeograf>     :)
     176        <jasonbirch>    lol
     177        <jasonbirch>    pagameba: ?
     178        <tomf2> not bad timing
     179        <jasonbirch>    your ears ringing?
     180        <pagameba>      hola!
     181        <pagameba>      sorry, was in a meeting and lost track of time
     182        <pagameba>      :D
     183        <pagameba>      burning more like
     184        <HarisK>        agile development
     185        <jasonbirch>    So, Fusion 2.0 beta next?
     186        <jasonbirch>    when?
     187        <pagameba>      zjames was eavesdropping for me
     188        <pagameba>      um
     189        <pagameba>      hang on ... lemme ask madair what his plans are
     190        <jasonbirch>    pagameba: he said on list he wanted to fix http://trac.osgeo.org/fusion/ticket/238 first
     191        <pagameba>      looking
     192        <pagameba>      oh right - we talked about it
     193        <pagameba>      yes, we want to fix that first
     194        <pagameba>      but we could do another beta right after
     195        <pagameba>      I won't be able to work on it until tomorrow
     196        <pagameba>      so earliest would be tomorrow or perhaps monday
     197        <rbray> ok - thanks for joining Paul and for the update
     198        <jasonbirch>    pagameba: I don't want to push. I could just point at trunk when building test installers
     199        <jasonbirch>    Modify my working copy externals, update, etc...
     200        <pagameba>      I guess although theoretically trunk could get changes that aren't going into 2.0
     201        <pagameba>      I'd rather we be more thorough in keeping 2.0 current
     202        <jasonbirch>    That would be cool :)
     203        <jasonbirch>    IS OL 2.8 going into 2.0? :)
     204        <pagameba>      yes, I think so
     205        <pagameba>      Mike has OL trunk in 2.0
     206        <jasonbirch>    That's cool. Guess I'm going too OT :)
     207        <pagameba>      which is basically 2.8
     208        <jasonbirch>    Next topic?
     209        <rbray> Sure - everyones favorite. Raster
     210        <rbray> There was some more mail on that today I think
     211        <rbray> Personally I'd like to see Traian's approach tested, as that change is more scalable than one big lock
     212        <HarisK>        I think we need to decide for 2.1
     213        <Traian_>       Hi, for 2.1, just ship the lock
     214        <Traian_>       people seem to like the lock
     215        <rbray> Well maybe everyone but me :)
     216        <jasonbirch>    rbray, I worry about that. Primarily because the only reason we started with locks in FDOGDAL, and single pool in MapGuide was multithreading instability in GDAL
     217        <ksgeograf>     I have a few machines that I can load test on, if I get some pathed dll's
     218        <Traian_>       You guys are trying to fix a refcounting bug with locks -- it's not right.
     219        <HarisK>        I really think we need to understand problems and divide them
     220        <HarisK>        no we are not
     221        <ksgeograf>     I don't have time to build a 2.0.2 from scratch, then patch, but I can test with a prebuilt dll set
     222        <jasonbirch>    If we can get the non-reference-counted connection manager into 2.1, then we can do away with the lock and allow users to choose between threaded and non-threaded providers.
     223        <rbray> It really requires two things I think
     224        <HarisK>        connection manager is something wrong
     225        <rbray> One is the connection manager changes
     226        <HarisK>        another story is if gdal can support multithreading
     227        <HarisK>        we cant mix that in
     228        <rbray> true - that is the other
     229        <HarisK>        if/when it will support we will remove locks from fdo gdal
     230        <Traian_>       Even if you ignore the multithreading part of GDAL, there is still a refcounting problem, that is causing behavior that is different from that of other providers.
     231        <jasonbirch>    Traian_: is refcounting part of the FDO contract though?
     232        <HarisK>        it seems not all
     233        <rbray> That should be fixed
     234        <HarisK>        I would assume ADSK raster too
     235        <HarisK>        and I think i saw one more provider with it
     236        <bdechant>      no ADSK raster doesn't have the same issue
     237        <HarisK>        but regardless
     238        <HarisK>        it chrashes same way with rasters
     239        <HarisK>        but it is not point anyhow
     240        <Traian_>       The feature reader in the GDAL provider needs a strong reference to the connection, since it needs the connection.
     241        <HarisK>        yes, I worte that in my first email
     242        <HarisK>        but
     243        <Traian_>       It's the same for the feature reader in SDF or SHP or whatever
     244        <HarisK>        also fdo client as MG
     245        <HarisK>        can't get object from connection,close connection and use object
     246        <pagameba>      is there anything wrong with putting both in?
     247        <Traian_>       If that bug is fixed, then you don't need the lock in the stylization code.
     248        <rbray> Um...not sure I agree Harris
     249        <Traian_>       There is nothing wrong with putting both in.
     250        <HarisK>        with both it will lock
     251        <Traian_>       That's why I said, just ship with the lock, if users want it.
     252        <rbray> If an object needs another for it to be valid, then it should hold a strong reference to it
     253        <HarisK>        there is no library of db
     254        <HarisK>        hich will allow you to close connection
     255        <HarisK>        and read from connection
     256        <HarisK>        but we are talking abut something which is not the reall issue/ not reall cause of problem
     257        <rbray> I agree with that too, Ideally the reader would hold a strong reference to the connection
     258        <ksgeograf>     If there is time pressure, I'm pro shipping with a lock, but if there is a possibility to test a lock free solution that would be nicer
     259        <rbray> if the connection is closed, the readeer should return an error
     260        <Traian_>       There is not a big performance difference between lock and no lock in practice -- I tried it, at least on a dual core machine.
     261        <HarisK>        btw, it is not even reader
     262        <rbray> But that is not the way any of the providers are currently implemented
     263        <HarisK>        but a raster object returned
     264        <HarisK>        I played with gdal a year or so ago
     265        <HarisK>        there were problems with it
     266        <rbray> Makes no difference - same logic should apply regardless of the object
     267        <HarisK>        Frank added lock on every gdal call in provider
     268        <HarisK>        i would not now unlock that until we have solve MG issues and then really look at gdal multithreading capabillities
     269        <Traian_>       Yes, even with the lock you get the crash, as I said, you can't fix a refcounting problem with locks. You can repro the problem from a single thread too.
     270        <HarisK>        and also reall gains in prfoamnce from that part
     271        <rbray> Gee - thanks Walt!
     272        <HarisK>        yes, closing connection and resuing object from it will crash
     273        <rbray> Seems the order of things should be:
     274        <HarisK>        there were 3 probems with MG 2.1 rasters
     275        <HarisK>        1. memory leak 2. unalloacted pointer use
     276        <HarisK>        and 3. this connection manager problem
     277        <rbray> Evaluate teh connection manager issues, if we fix, then decide what to do here
     278        <jasonbirch>    rbray, agreed
     279        <HarisK>        we cant fix connection manager for 2.1 ( two weks)
     280        <rbray> If we don't fix the connection manager, then let's just ship with the big freaking lock
     281        <jasonbirch>    2.1 doesn't have to be in two weeks.
     282        <HarisK>        not, fix
     283        <HarisK>        change
     284        <rbray> and the difference between fix and change is?
     285        <Traian_>       Can we also fix the refcounting in the provider? It can't hurt.
     286        <HarisK>        yes, ofcourse
     287        <bdechant>      I would like to do both - the strong reference in the reader of GDAL and the FDO connection manager
     288        <HarisK>        :) I really wrote in email that I want to change that
     289        <HarisK>        and in my code I did
     290        <HarisK>        just it is not the main issue to solve
     291        <ksgeograf>     Traian: your fix is only for the provider, not anything in the server code?
     292        <Traian_>       Yes, provider only, but with the fix, you can potentially use the provider without pooling connections to it, since connecting becomes pretty fast.
     293        <Traian_>       Thus bypassing the whole connection pool thing in theory...
     294        <HarisK>        I believe it will come to
     295        <HarisK>        unable to create new connections
     296        <ksgeograf>     So your fix will only introduce a new gdalprovider.dll ?
     297        <Traian_>       Yes, you can get to that, in my tests it does, and then it waits a bit and then it can create new ones.
     298        <Traian_>       In my code, I also removed the locks from the MG stylization code, and messed with serverconfig.ini.
     299        <Traian_>       The real change is in the provider though...
     300        <rbray> And what results did you get from your tests Traian?
     301        <ksgeograf>     So, if only the provider is updated, it gets stable, but has issues in long runs?
     302        <Traian_>       I am in favor with shipping with the lock, since I get a suspcious thing with FDO XML parsing, which happens when you open connections concurrently. I have not had time to debug this, but I think it happens any time FDO parses XML.
     303        <Traian_>       It affects all providers which parse XML on open though...
     304        <Traian_>       The problem with the crash/deadlock disappears in my tests.
     305        <jasonbirch>    I'd prefer to either fix the connection manager to work properly with single-threaded providers, or ship with the lock, before presenting multithreaded GDAL provider as a testing option for users.
     306        <jasonbirch>    It had problems in 1.2 when it wasn't constrained
     307        <jasonbirch>    And I value stability over performance.
     308        <Traian_>       jasonbirch: may be the problems had to do with refcounting, and not threading :)
     309        <jasonbirch>    Well, maybe, but we need a stable base to test from.
     310        <ksgeograf>     If it ships with singlethread setup as default in serverconfig.ini (and it works) I see no issues in allowing users to test multithreading, unless it is known to be unstable
     311        <Traian_>       Yes, let's just add the global raster lock, but also fix the refcoutning in the provider. We can remove the global lock from the provider itself since it is superceded by the lock in MG.
     312        <HarisK>        GDAL is allready single threaded
     313        <HarisK>        Imean fdo gdal
     314        <HarisK>        perfoamnce is gained on enabling pool for fdo gdal
     315        <HarisK>        and resuing connections
     316        <tomf2> GDAL Provider was made single threaded in a failed attempt to fix the crashing problem wasn't it? So there is no reason for it to be single threaded
     317        <ksgeograf>     HarisK: if Traian's refcount fix is added to the provider, what is the problems that you see?
     318        <HarisK>        :)
     319        <tomf2> I mean, we eventually want to remove the locks from the GDAL Raster provider, so we shouldn't be doing thigns thatwill assume that it will always be single-threaded
     320        <HarisK>        It is my suggestion from first email taht provider needs to add ref count as well behaved
     321        <ksgeograf>     I thought that was what Traians fix does?
     322        <jasonbirch>    tomf2: what are we doing that assumes single-threaded, other than temporary lock?
     323        <HarisK>        but only that will lock if we have lock on rasters too
     324        <Traian_>       I just did what Haris suggested for the provider. It's not something I invented
     325        <Traian_>       I also removed the locks and sped up connection opening
     326        <Traian_>       The locks inside the provider will be unnecessary due to the BFL in MapGuide itself
     327        <HarisK>        schema cache ?
     328        <Traian_>       BFL = big f-ing lock
     329        <HarisK>        but we have also other fdo clients
     330        <tomf2> jasonbirch: I was concerned about Haris' statement that "GDAL is allready single threaded", so if we do things that assume the slow nature of a single threaded provider we could end up with something that is not as fast as it could be in the future
     331        <HarisK>        and because we dont know (and i think is not stable) how gdal is safe in mt, i would leave those gdal locks
     332        <rbray> We certainly don't want BFL's in all clients
     333        <ksgeograf>     From here, it looks as if Traian and Haris disagree on the solution, but if Traians patch is what Haris suggested, I'm not following the dicussion...
     334        <HarisK>        no, not in all clienst
     335        <HarisK>        other clienst who open connection, use connection, close connection
     336        <HarisK>        will work
     337        <jasonbirch>    tomf2: I think maybe terminology difference between single-threaded and per-connection-threaded.
     338        <Traian_>       My patch is only part of what Haris wants... I claim it's sufficient...
     339        <Traian_>       This is going nowhere. Here is what I would do this release, you guys take it for what it's worth: 1. Fix refcount in provider, remove lock from provider. 2. Add BFL to MG.
     340        <jasonbirch>    I'd only want to do that if the changes proposed to the connection manager are not implemented.
     341        <jasonbirch>    The #2
     342        <jasonbirch>    The #1 should be done regardless
     343        <jasonbirch>    I think
     344        <rbray> Yep - I agree
     345        <ksgeograf>     isn't #2 what "CustomConnectionPoolSize=1" means?
     346        <rbray> That #1 should be done
     347        <rbray> Should be - yes
     348        <bdechant>      #1 should be done regardless
     349        <tomf2> ksgeograf, that's correct, that's what it's supposed to mean, but it didn't work...
     350        <Traian_>       I'd like #2 for now due to the FDO XML parsing being single-threaded, which gives me a heap corruption every few thousand concurrent requests. I don't think it will happen in real use cases, but until we fix it, I'd keep the lock.
     351        <HarisK>        remove gdal lock from provider is not right way, I strgly beleiev
     352        <HarisK>        I dont understand why adding uanother lkevel of uncertainty
     353        <tomf2> ksgeograf: perhaps it didn't work because of the refcount problem?
     354        <HarisK>        we have problem with rasters for very long time
     355        <HarisK>        why experiment with multithreading gdal
     356        <ksgeograf>     tomf2: yes, that was what I was thinking
     357        <rbray> Perhaps we need to think about this differently
     358        <jasonbirch>    You guys are guessing; haris tested this fairly thoroughly now, and a year ago.
     359        <rbray> Stability is key for 2.1 - yes
     360        <rbray> So lets do what we need to do to achieve that, even if there are extra locks we may not need
     361        <rbray> For 2.2, we find some time to remove all the freaking locks and look for the real source of hte problem
     362        <rbray> You'll never find it with a bunch of irrelevent locks in the way, which mask the problem
     363        <ksgeograf>     jasonbirch: Does that mean that Haris has found the "CustomConnectionPoolSize" to be faulty, even in the case of correct refcounts?
     364        <jasonbirch>    ksgeograf: yes
     365        <jasonbirch>    I think so.
     366        <jasonbirch>    Haris?
     367        <tomf2> jasonbirch: I thought that Haris didn't touch the GDAL Raster provider yet
     368        <HarisK>        I looked at provider year ago
     369        <HarisK>        and in last 2 weeks a lot
     370        <jasonbirch>    tomf2: his proposed fix didn't involve the provlder, but his testing (including this time around) included changes to refcounting.
     371        <HarisK>        it wasn't easy to find what is exactly going on
     372        <jasonbirch>    Alone, they didn't fix the problem.
     373        <Traian_>       What does the patch that people are happy with in the last week contain?
     374        <tomf2> HarisK: kenneth and I were wondering if you did the refcounting fix in the GDAL Raster PRovider and then make sure CustomConnectioPoolSize was set to 1 at the same time
     375        <HarisK>        3 fixes
     376        <HarisK>        PoolSize doesn pla with GDAL
     377        <HarisK>        there is fix peace of code in MG, if single threaded then pool size == 1
     378        <HarisK>        which is also something to change, I believe
     379        <jasonbirch>    Traian_: that patch contains a fix for the memory leak and invalid allocation, and the big lock. That's all.
     380        <HarisK>        there were 2 bugs causing unhandled exception when working with rasters
     381        <Traian_>       so, the only thing that is not checked in yet, is the big lock. Let's add that, and also fix the refcounting in the provider, and call it a day?
     382        <tomf2> +1
     383        <bdechant>      +1
     384        <rbray> +1 - for 2.1
     385        <jasonbirch>    I am concerned that you're looking at 2.2 to fix the connection manager.
     386        <ksgeograf>     HarisK: does that seem reasonable for you?
     387        <Traian_>       jason: correct.
     388        <jasonbirch>    I'm not convinced that the GDAL provider is the only place this problem is manifesting, just the most obvious place.
     389        <HarisK>        that would need to be tested
     390        <jasonbirch>    If the work that HarisK is doing is deemed safe enough, I want to see it in the 2.1 branch, if not for the initial 2.1 release.
     391        <HarisK>        if you add ref count and leave lock it will not work
     392        <HarisK>        if i remember correctly
     393        <HarisK>        also, there is allready one lock in stylization code
     394        <HarisK>        on excecute query
     395        <Traian_>       So fixing the refcount breaks your fix?
     396        <pagameba>      +1 for 2.1
     397        <pagameba>      not that I really understand :)
     398        <HarisK>        could be, cant remeber this second
     399        <tomf2> jasonbirch: so if I have this right, you want to put in some changes that might destabilize the code to fix some problems that we don't know exist (but I agree, probably do somewhere- but may not be caused by the connection manager)
     400        <jasonbirch>    tomf2: is the code stable? Not my experience...
     401        <ksgeograf>     +1 for 2.1
     402        <tomf2> it doesn't sound right to me, we could put it in a 2.1.1.
     403        <HarisK>        sorry guys, I am lost in debate now
     404        <HarisK>        I now what works
     405        <HarisK>        that is what we sumbited as patches
     406        <rbray> So then let's go with that for 2.1
     407        <HarisK>        all other stuff is something we need to further wok and test
     408        <ksgeograf>     ok, but it looks to me as if the patches Traian submitted are indeed fixing the refcount problem you describe?
     409        <HarisK>        there is a huge potential in improving rasters in MG
     410        <ksgeograf>     For 2.1, I can live with stable and slow... Right now anything is better than a constantly crashing service
     411        <jasonbirch>    OK, so as long as you're open to working on the connection manager for 2.1.1, then I'd be happy to see 2.1 ship with just the BFL :)
     412        <ksgeograf>     yes, I don't think that refcounting is the best possible way to determine usage of a connection
     413        <HarisK>        Kenneth, yes only adding ref to provider will halt MG ( or in combo with lock) can't remember now
     414        <bdechant>      I'm happy to work on the FDO connection manager post 2.1
     415        <jasonbirch>    I'd be worried about fixing the refcounting, if HarisK saw a negative interaction between that and the lock.
     416        <tomf2> jasonbirch: BTW I got some news today that there may be a problem with the way the web tier closes connections and this may cause some instability. Perhaps this is one of the stability issues you are seeing?
     417        <jasonbirch>    tomf2: yes, quite possibly.
     418        <jasonbirch>    tomf2: is fix for that likely to make it into 2.1? :)
     419        <tomf2> jasonbirch: ask Trevor :)
     420        <rbray> So are we done?
     421        <tomf2> I think that supporting many concurrent users hasn't been thoroughly done yet. And it seems that many developers are working on this at the same time
     422        <tomf2> Would it be worthwhile to set up a wiki or something for people to put their findings?
     423        <rbray> We should just make an area on the trac wiki for that
     424        <rbray> and Yes - that would be helpful as long as people update it
     425        <jasonbirch>    I plan to blog about the use of 'The Grinder" to replicate many users too. All of my testing so far has been with real users, but artificial load can be useful too.
     426        <jasonbirch>    Haris used it to good effect for the raster testing.
     427        <tomf2> We use the GRinder here for our load tests
     428        <tomf2> Chris has made some nice scripts
     429        <jasonbirch>    Oh. Would be cool to share those in the SVN :)
     430        <HarisK>        great, share it please :)
     431        <ksgeograf>     Is this the final suggestion for 2.1 then?
     432        <ksgeograf>     * Suggestion:
     433        <ksgeograf>     * Add the two fixes produced by Haris
     434        <ksgeograf>     * Add the refcount fix produced by Traian, and initially suggested by Haris
     435        <ksgeograf>     * Add a lock produced by Haris, but not yet submitted as a patch
     436        <HarisK>        we obviusly need better unittest
     437        <jasonbirch>    ksgeograf: I think all we're going to add is the big lock in the stylization code.
     438        <rbray> ksgeograf: I believe so
     439        <tomf2> Chris is on holidays until the 20th
     440        <tomf2> Remind me at that time
     441        <jasonbirch>    tomf2: will do.
     442        <ksgeograf>     the one called "mapguide_raster_stability.patch" ?
     443        <tomf2> I'm done, thanks Bob
     444        <jasonbirch>    I think so. The other one was already submitted (in a modified form)
     445        <rbray> OK, so that's what we'll do for 2.1
     446        <rbray> I have to run, so thanks everyone
     447        <jasonbirch>    And adding the refcounting fixes into fdo 3.4 may introduce more instability.
     448        <jasonbirch>    Nice short meeting...
     449        <Traian_>       jasonbirch: the refcounting fix does not introduce instability, but is useless if there is a giant lock. Of course, I can't convince you of that by just saying it.
     450        <jasonbirch>    no :)
     451        <jasonbirch>    Show me the money :)
     452        <Traian_>       it's impossible to prove that a problem doesn't exist
     453        <Traian_>       it's a general scientific concept
     454        <jasonbirch>    exactly...
     455        <Traian_>       the burden is on you to show a problem...
     456        <ksgeograf>     :D
     457        <Traian_>       I would personally go to B.C. and buy you a a beer if the lock + refcount fix breaks rasters...
     458        <Traian_>       and then go to Slovenia to buy a beer for Haris
     459        <jasonbirch>    OK, so if first thread has strong connection to fdo provider, and does execute, then second thread comes in and there's only a single connection what happens? Does it throw exception because second connection can't be opened? Or does it wait for the first one to close then proceed?
     460        <Traian_>       it waits
     461        <bdechant>      there is retry logic to wait
     462        <HarisK>        and it gies out after a short period
     463        <Traian_>       right, the retry logic kicks in
     464        <HarisK>        and with many concurent users no images for them
     465        <ksgeograf>     so it waits, then retries?
     466        <HarisK>        it doesnt work
     467        <Traian_>       I tested it with many concurrent and it worked fine
     468        <Traian_>       many = 16
     469        <HarisK>        on very short raster access yes
     470        <HarisK>        ut it doesnt really work
     471        <Traian_>       I tested with short and long (small and big rasters). Of course, I can't be 100% sure it works.
     472        <Traian_>       You guys have more real tests
     473        <HarisK>        we will talk about beer :)
     474        <HarisK>        either in Canada or Slovenia
     475        <Traian_>       Haris: it will have to be in June/July, that's when I go visit in the area
     476        <jasonbirch>    Haris will be in BC for GeoWeb :)
     477        <jasonbirch>    I wonder who else here is going to be there...
     478        <ksgeograf>     So, If I were to produce a raster provider with the fixes submitted by Traian, on a MapGuide running with the patched dll's Jason made, it would lock up?
     479        <HarisK>        great
     480        <Traian_>       I can send you guys a patched provider to play with
     481        <Traian_>       I compile against trunk, so I need to rebuild on 3.4...
     482        <jasonbirch>    Traian_: can you rebuild against 3.4 but using GDAL 1.6? :)
     483        <Traian_>       I can, but I thought it was using GDAL 1.6 already...
     484        <jasonbirch>    Oh...
     485        <jasonbirch>    You're right. I was thinking 3.3
     486        <jasonbirch>    Kenneth's still running MG 2.0.2 I think.
     487        <Traian_>       urgh
     488        <ksgeograf>     yes I am :D
     489        <jasonbirch>    I am too, in production.
     490        <jasonbirch>    But am testing against 2.1
     491        <Traian_>       in Open Source time frames, that's like running a Ford Model T
     492        <Traian_>       :)
     493        <ksgeograf>     I have MGE 2010 running in test, but I can't get the OGR provider working
     494        <HarisK>        I forgot a bit, but trying to remember
     495        <HarisK>        only adding ref count to provider will again throw exception
     496        <Traian_>       what part of OGR doesn't work?
     497        <Traian_>       HarisK: it would help if you tell me what exception -- I have found another problem which has to do with threading, when parsing XML using FDO's XML stuff.
     498        <ksgeograf>     not sure yet, it just throws Unclassified Exception
     499        <ksgeograf>     and occasionally "Resource busy"
     500        <Traian_>       what driver? PostGIS?
     501        <ksgeograf>     MapInfo
     502        <Traian_>       hmm... MapInfo is file-based, should be rock solid... Strange.
     503        <ksgeograf>     Not really asking for help, just an excuse for driving a Ford T :D
     504        <jasonbirch>    Is OGR provider advertised as single-threaded? :)
     505        <ksgeograf>     not sure...
     506        <Traian_>       No, MapGuide does dangerous things when the provider says "single-threaded" :)
     507        <jasonbirch>    Yeah, it's hard-coded to set the pool size to 1, regardless of the settings in the config file.
     508        <jasonbirch>    That was a bit annoying :)
     509        <tomf2> was?
     510        <jasonbirch>    For Haris when trying to figure out why his config file settings weren't changing the behaviour :)
     511        <tomf2> Ah, I see
     512        <Traian_>       Yeah, I had the same problem with the settings when I was testing... Took a while to figure out.
     513        <tomf2> Traian_, what are those dangerous thrings?
     514        <Traian_>       It was a joke -- it's the override setting that sets the Gdal provider to use one connection
     515        <Traian_>       it's not necessary anyway, since the provider already reports it's single-threaded
     516        <Traian_>       when I removed the setting from the config, it still remained single connection limit and I couldn't figure out why...
     517tomf2>  Blame bdecahnt :)
     518        <Traian_>       so, was there a decision what to do for 2.1?
     519        <jasonbirch>    I think it was commit the lock and be done with it.
     520        <Traian_>       ok
     521        <ksgeograf>     do you mean: not fix the refcounting?
     522        <jasonbirch>    The lock works OK without it, and make it pointless.
     523        <Traian_>       The refcounting is not a problem if the global lock is there... Unless you are doing something other than stylization, in which case you are screwed.
     524        <Traian_>       Since other APIs don't have the global lock...
     525        <ksgeograf>     ok, so basically what is out now in the patched dll's ?
     526        <jasonbirch>    ksgeograf: yes.
     527        <jasonbirch>    The memory leak was already fixed by ADSK for 2.1
     528        <jasonbirch>    And the second issue has been committed by Walt,
     529        <jasonbirch>    So the lock is all that's remaining.
     530        <ksgeograf>     ok
     531}}}