#1258 closed defect (fixed)
Healthcheck using wrong query for database healthcheck
Reported by: | ianwallen | Owned by: | jesseeichar |
---|---|---|---|
Priority: | major | Milestone: | v2.8.1 |
Component: | General | Version: | v2.8.0 |
Keywords: | Cc: |
Description
As pointed out by Montse in http://osgeo-org.1560.n6.nabble.com/Oracle-Database-health-check-issues-td5044740.html
DatabaseHealthCheck.java is using query "SELECT id from Metadata LIMIT 1" which is not compatible with other databases.
As this is a simple check to see if the database is responding, I believe it should be using the query that was identified in the config.xml "validationQuery" as this query should be compatible with the database.
Change History (10)
comment:1 by , 12 years ago
Summary: | Helthcheck using wrong query for database healthcheck → Healthcheck using wrong query for database healthcheck |
---|
follow-up: 4 comment:2 by , 12 years ago
comment:3 by , 12 years ago
Owner: | changed from | to
---|
Often the database configuration has a validation check sql. That should probably be used for this check since it is database specific.
comment:4 by , 12 years ago
Do you have some sql suggestions? Should the sql just have a ; ending the statement for oracle?
Replying to drh:
My sense is that the validationQuery is to test the JDBC connection, but the DatabaseHealthCheck should be a query (queries) aimed at checking the status of the database as well as the connection. In 2.8.0, the DatabaseHealthCheck fails on Oracle:
* Csw GetCapabilities: OK * Csw GetRecords: OK ! Database Connection: ERROR ! ORA-00933: SQL command not properly ended java.sql.SQLSyntaxErrorException: ORA-00933: SQL command not properly ended at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:440) at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396) at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:837) at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:445) at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:191) at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:523) at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:207) at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:863) at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1153) at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1275) at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3576) at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3620) at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1491) at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96) at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96) at jeeves.resources.dbms.Dbms.selectFull(Dbms.java:213) at jeeves.resources.dbms.Dbms.select(Dbms.java:176) at org.fao.geonet.monitor.health.DatabaseHealthCheck$1.check(DatabaseHealthCheck.java:25) at com.yammer.metrics.core.HealthCheck.execute(HealthCheck.java:165) at com.yammer.metrics.core.HealthCheckRegistry.runHealthChecks(HealthCheckRegistry.java:53) at com.yammer.metrics.reporting.HealthCheckServlet.doGet(HealthCheckServlet.java:61) at javax.servlet.http.HttpServlet.service(HttpServlet.java:617) at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) at com.yammer.metrics.reporting.AdminServlet.doGet(AdminServlet.java:103) at javax.servlet.http.HttpServlet.service(HttpServlet.java:617) at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.fao.geonet.monitor.webapp.WebappMetricsFilter.doFilter(WebappMetricsFilter.java:96) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.tuckey.web.filters.urlrewrite.RuleChain.handleRewrite(RuleChain.java:176) at org.tuckey.web.filters.urlrewrite.RuleChain.doRules(RuleChain.java:145) at org.tuckey.web.filters.urlrewrite.UrlRewriter.processRequest(UrlRewriter.java:92) at org.tuckey.web.filters.urlrewrite.UrlRewriteFilter.doFilter(UrlRewriteFilter.java:381) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.fao.geonet.monitor.MonitorSecurityFilter.doFilter(MonitorSecurityFilter.java:51) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.fao.geonet.monitor.webapp.MetricsRegistryInitializerFilter.doFilter(MetricsRegistryInitializerFilter.java:31) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298) at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:190) at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291) at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:769) at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:698) at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:891) at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690) at java.lang.Thread.run(Thread.java:662) * Lucene Index: OK * Metadata Index Errors: OK
comment:5 by , 12 years ago
The ";" is just to signify the end of the statement when using sqlplus or sql developer or similar tools. It cannot be used when supplying a statement via jdbc. So no it is not required.
If converting the following statement to oracle
SELECT id from Metadata LIMIT 1
then it would be
SELECT id from Metadata where rownum < 2
But I'm still not clear what the healtcheck is suppose to do?
This (DatabaseHealthCheck) will ensure that jdbc connection is working and that the database is up.
SELECT 1 from dual
This will ensure that the account that was logged in contains a table called metadata with at least 1 record.
SELECT id from Metadata where rownum < 2
If there is no data in the metadata table then it will return nothing (Which does not mean there is a database or application error). And if the table does not exist it will fail (which does not mean the database is down - just that the table does not exist which most likely means that geonetwork is not available). Also why the metadata table - i.e. the metadata table may exist however one of the other tables (Settings, languages, ....) may not. Seems like we are simply playing statistics - chances are that if the metadata exist then the others will exist as well.
So in the end I believe we are probably better off using the DatabaseHealthCheck query.
But if we want a quick fix for the existing logic then we can probably change the query to
SELECT count(*) from Metadata
As I believe all the databases understand this query. Note: It may not be as fast as the previous query.
comment:6 by , 12 years ago
Basically the check is to make sure that the system is running and usable. If there is no metadata or the table does not exist, then the system is not usable, although typically the failure of the healthcheck indicates the database is not reachable or something catastrophic has happened.
I wanted a little more than just a simple connection check and I wanted it to be an extremely fast check because in production I have nagios check it every 10 seconds.
I haven't looked into the performance characteristics of database queries to much but I would expect that SELECT count(*) from MEtadata; should be a very query even on very large datasets. Am I right?
comment:7 by , 12 years ago
I have geonetwork setup with the metadata table containing 4,000 records on a basic desktop system running oracle (nothing special). It took 1.638 seconds to perform the count(*) and 0.016 seconds to "select 1 from metadata where rownum < 2"
On the same system I have a table with a few column and 30,000,000 records - It took 5.085 seconds to perform a count(*) and 0.062 seconds to get the first row.
The point being - as the amount of data increases the count(*) will degrade. Whether this degradation is acceptable or not is debatable.
Why not choose a table that does not grow with the database. The "settings" table should not change as the database grows so it may be a good candidate. As I mentioned before, if the metadata table is all intact but the settings table is missing, geonetwork will not be functioning. So I think as long a you do a select count(*) from a core table that does not increase as the database increases then you should be ok.
comment:8 by , 12 years ago
Sounds good. I will use count(*) on another table. I have to take a look.
comment:9 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
My sense is that the validationQuery is to test the JDBC connection, but the DatabaseHealthCheck should be a query (queries) aimed at checking the status of the database as well as the connection. In 2.8.0, the DatabaseHealthCheck fails on Oracle: