Opened 15 years ago

Closed 14 years ago

Last modified 14 years ago

#900 closed defect (fixed)

Connections in CLOSE_WAIT status never close and are not reused

Reported by: amitmarty
Owned by:
Priority: high
Milestone: 2.2
Component: Server
Version: 2.0.2
Severity: critical
Keywords:
Cc: brucedechant, trevorwekel
External ID:

Description

Some of this was discussed in Trac ticket 726, but I believe that ticket should be reserved for the 55-connection issue, which may be attributable to the ACE_WFMO_Reactor implementation that MapGuide uses.


My setup is Apache / Tomcat / MapGuide 2.0.2 with the Ajax Viewer / MySQL database / Windows Server 2003.


When a user tries to access a map, we load the base map (DWF) from the library and, based on user options, load a spatial feature layer from a MySQL database.


On the initial site load, Tomcat opens about 6 to 8 connections to port 2811 on MapGuide. After about 5 minutes of no activity on the map, all of these connections go to CLOSE_WAIT status.


If a user then performs an action, one of these 8 connections in CLOSE_WAIT is reused and the action is processed. The rest of the connections are never closed, nor are they ever reused.
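For readers unfamiliar with the state: a TCP socket enters CLOSE_WAIT when the peer has closed its end and the local application has not yet called close() on its own socket. The following is a minimal Python sketch of that sequence on a loopback connection. It is illustrative only and is not MapGuide code; the function name demo_half_close is made up for this example, and the ticket does not state which side of the port 2811 connection is the one left in CLOSE_WAIT.

```python
import socket


def demo_half_close():
    """Return what the accepting side reads after its peer closes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(5)  # do not hang forever if something goes wrong

    # The peer closes first. Once its FIN arrives, server_side is in
    # CLOSE_WAIT and stays there until this process closes it.
    client.close()

    # recv() returning b"" is how the application learns the peer is gone.
    data = server_side.recv(1024)

    # Closing here is what moves the socket out of CLOSE_WAIT. Code that
    # never reaches this call leaks the connection.
    server_side.close()
    listener.close()
    return data


if __name__ == "__main__":
    print(repr(demo_half_close()))
```

A connection pool that keeps such a socket without ever reading from it or closing it would show the build-up described in this ticket.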


As more users access the maps, these connections to MapGuide on port 2811 (client connections) build up, and at some point I have to restart the MapGuide server before users can access maps again.
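One way to track this build-up over time is to count the CLOSE_WAIT entries for port 2811 in the output of netstat -an. The sketch below assumes the Windows column layout (protocol, local address, foreign address, state); the function name count_close_wait and the sample text are hypothetical and are not taken from the reporter's machine.

```python
def count_close_wait(netstat_text, port=2811):
    """Count TCP lines in CLOSE_WAIT whose local or remote port matches."""
    suffix = ":" + str(port)
    count = 0
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed shape: proto, local address, foreign address, state
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if state == "CLOSE_WAIT" and (
            local.endswith(suffix) or remote.endswith(suffix)
        ):
            count += 1
    return count


# Made-up sample in the assumed Windows "netstat -an" layout.
SAMPLE = """\
  TCP    127.0.0.1:1045    127.0.0.1:2811    CLOSE_WAIT
  TCP    127.0.0.1:1046    127.0.0.1:2811    ESTABLISHED
  TCP    127.0.0.1:2811    127.0.0.1:1045    FIN_WAIT_2
  TCP    127.0.0.1:1047    127.0.0.1:2811    CLOSE_WAIT
  TCP    127.0.0.1:1050    127.0.0.1:8009    CLOSE_WAIT
"""

if __name__ == "__main__":
    print(count_close_wait(SAMPLE))
```

Running the counter on a schedule and logging the result would show whether the number grows with each idle period or levels off.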


Below is my workers.properties file. I have commented out a couple of the directives because they are deprecated.


# Define 1 real worker using ajp13
worker.list=worker1

# Set properties for worker1 (ajp13)
worker.worker1.type=ajp13
worker.worker1.host=localhost
worker.worker1.port=8009
worker.worker1.lbfactor=50

# Deprecated cachesize since 1.2.16, replaced by connection_pool_size
# For Apache, autodiscovery is done to match the ThreadsPerChild Apache directive
#worker.worker1.cachesize=10
worker.worker1.cache_timeout=600
worker.worker1.socket_keepalive=true

# Deprecated since 1.2.16, replaced by connection_pool_timeout
#worker.worker1.recycle_timeout=300
worker.worker1.connect_timeout=10000
worker.worker1.connection_pool_timeout=60


What I would like to do is set up a debug environment with the source for the MapGuide server and the Java web server extensions, and first find out what the initial connections are and then where the problem might be.


Is there any documentation on how to set up such an environment? Any suggestions for debugging JNI?

Thanks for all the great work on MapGuide and for any help you can offer.

Change History (16)

comment:1 by tomfukushima, 15 years ago

Cc: brucedechant trevorwekel added

comment:2 by brucedechant, 15 years ago

Milestone: 2.1
Resolution: fixed
Status: new → closed

Fixed.

See submission r3847

comment:3 by brucedechant, 15 years ago

Resolution: fixed
Status: closed → reopened

Reopening this ticket as the behavior was recently observed while testing.

comment:4 by jbirch, 14 years ago

Bruce, has this been re-encountered since? Is it still occurring in 2.1 or trunk? I'd like to either update the version for the ticket or close it.

comment:5 by brucedechant, 14 years ago

I believe the CLOSE_WAIT issue is resolved, but the connection limit of the ACE reactor used on Windows still exists. Linux uses a different ACE reactor that doesn't have a limit.

We should change the Windows implementation to use the same ACE reactor that Linux uses, i.e. the Select reactor.

The change involves adding the following line to the top of the ACE/ACE_wrappers/ace/config.h file:

#define ACE_USE_SELECT_REACTOR_FOR_REACTOR_IMPL

The Windows Service Control Manager code will also need to be updated to work with this reactor: with just this change, the MapGuide server fails to start as a service, although it does work properly from the command line.

comment:6 by trevorwekel, 14 years ago

I agree with Bruce. The limit of 62 concurrent connections for the ACE_WFMO_Reactor will be too small to fully load upcoming multi-core CPUs. The select reactor limit is somewhere near 1000 connections. I believe a 3:1 ratio of connections to cores may be required to adequately load the MapGuide Server under certain conditions.

AMD is scheduled to release 24 cores on a 2 socket machine sometime in Q1 2010 and Intel also has larger core counts in the works. Assuming a 3:1 ratio, 72 connections will be required to load a 24 core machine.
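The arithmetic behind this estimate can be written out as a short Python sketch. The 62-connection limit and the 3:1 ratio are the figures quoted in this comment; the function and constant names are made up for illustration.

```python
WFMO_LIMIT = 62  # ACE_WFMO_Reactor connection limit quoted in this ticket


def connections_needed(cores, ratio=3):
    """Connections required to load `cores` at `ratio` connections per core."""
    return cores * ratio


def max_cores_within_limit(limit=WFMO_LIMIT, ratio=3):
    """Largest core count whose connection requirement fits under `limit`."""
    return limit // ratio


if __name__ == "__main__":
    needed = connections_needed(24)
    print(needed, needed > WFMO_LIMIT)
    print(max_cores_within_limit())
```

Under these assumptions a 24-core machine needs 72 connections, which exceeds the limit of 62, and the limit is already reached at about 20 cores.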

comment:7 by brucedechant, 14 years ago

Resolution: fixed
Status: reopened → closed

Closing this ticket.

I have created ticket 1272 for the 62 connection limit. http://trac.osgeo.org/mapguide/ticket/1272

comment:8 by brucedechant, 14 years ago

Resolution: fixed
Status: closed → reopened

comment:9 by brucedechant, 14 years ago

Milestone: 2.3

comment:10 by brucedechant, 14 years ago

Reopening this ticket as the issue has been replicated.

comment:11 by brucedechant, 14 years ago

Fixed. sandbox/adsk/2.2gp r4681

comment:12 by brucedechant, 14 years ago

Milestone: 2.3 → 2.2

comment:13 by brucedechant, 14 years ago

Resolution: fixed
Status: reopened → closed

Fixed. trunk r4682

comment:14 by brucedechant, 14 years ago

Fixed Linux build.

comment:15 by brucedechant, 14 years ago

Fixed in sandbox/adsk/2.1 r4717

comment:16 by brucedechant, 14 years ago

Removed an unneeded flush() statement.

Fixed in sandbox/adsk/2.2gp r4726

Fixed in sandbox/adsk/2.1 r4727

Fixed in trunk r4728
