This recipe is for users who manually handle persistent (shared) database connection, with Oracle, although it's probably easy to adapt for other database backends with small adjustments. I needed to implement this both in Ruby (Rails' ActiveRecord) and Python (django.db) daemons, both working in the same setting. I thought the solution was elegant enough to be cleaned up into a recipe.
Automated database connection pooling for Python (nor Ruby) is not supported with Oracle 10g. Only 11g introduced Database Resident Connection Pooling (DRCP). It probably makes this recipe redundant for Python users with 11g. Oracle has official support for using Python with Oracle Database 11g, using the cx_Oracle driver. I'm not sure but I think Oracle doesn't have any form of official support for Ruby, but the de-facto ruby-oci8 driver works well with ActiveRecord.
Without automated connection pooling you have two decent options. Either open and close a new connection for each request, or save open connections into shared memory. The latter has obvious performance benefits if the expected request rate is high, but exposes a number of other problems this recipe aims to address. A connection can abruptly terminate because of e.g. a network glitch or maintenance downtime, and the sysadmins deploying your software will be happier having to deal with less services to reboot.
The handler should catch exceptions only related to network or database resource failures, whereas in other cases (SQL typo?) the original exception should not be reacted upon. Figuring out the proper ORA error codes to use was an exciting task, I hope I got them right ;). Those guys at Oracle sure love to make things easy. Since I use a regular expression to match the error message string, a multitude of errors can be matched with a single rule. Fortunately, all errors dealing with Transparent Network Substrate are marked by prefix "TNS". In my tests most errors were indeed caught with the single "ORA-.....: TNS
" regexp.
Python and Ruby have much in common, both could let me write this very expressively with about the same LOC. There is one small detail I'd like to bring up. These two expressions both compare the codes listed with the exception (error message in variable "message"), and return true if the message matches to one of the given ORA error codes.
Python
# Example message="ORA-03113: end-of-file on communication channel" any(filter(lambda oracode: re.search('ORA-'+oracode,message), re.split(' *\n *| +(?=\d)',""" .....: TNS 01000 01001 01014 01033 01034 01037 01089 01090 011.. 015.. 016.. 031.. 28547 30678 """)[1:-1]))Ruby
# Example message="ORA-01034: ORACLE not available" %w{ .....:\ TNS 01000 01001 01014 01033 01034 01037 01089 01090 011.. 015.. 016.. 031.. 28547 30678 }.select {|oracode| message[/ORA-#{oracode}/]}.any?Especially take notice of Ruby's
%w{multiline string}
versus Python's re.split(' *\n *| +(?=\d)',"""multiline string""")[1:-1]
. This just for the sake of the visual appearance of input data on the screen. Ruby makes it easier to write aesthetic code. Perhaps Python has an idiom for this, though, it just didn't pass my mind.
example_oci_exception_handler.py
# -*- coding: utf-8 -*- """ You may use this file under the terms of the BSD license as follows: Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALLexample_oci_exception_handler.rbOR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. The views and conclusions contained in the software and documentation are those of the authors and should not be interpreted as representing official policies, either expressed or implied, of copyright holder. """ import re import sys from time import sleep from myapp.settings import settings # your django settings module from django.core.management import setup_environ setup_environ(settings) import cx_Oracle from django.db import connection from django.db.utils import DatabaseError from logging import getLogger log = getLogger(__name__) class ExampleOCIExceptionHandler(): """Recipe for handling lost Oracle database connection. Depends on Django ORM, and that proper settings.py exist and is initialized properly. Selects the network and connection errors, waits until the server is reachable again, and calls the method again during upon which the exception happened. This is very useful if the connection, for some reason, is shared between requests. Oracle 10g does not support connection pooling with cx_Oracle. """ @staticmethod def dispatch(message_data): """Processes request. In case of database connection failure, the waitForOCIConnection() loop is called and connection should be re-established. In case of other Oracle errors, the error is raised again. """ try: # do some database operations... except (cx_Oracle.Error, DatabaseError): """ Catch lost database connection. Handles all TNS errors: ORA-xxxxx: TNS errors And all errors from these ranges: ORA-011xx: Database file errors ORA-015xx: Execution errors ORA-016xx: Execution errors ORA-031xx: communication errors Along with these specific errors: ORA-01000: maximum open cursors exceeded ORA-01001: invalid cursor ORA-01014: ORACLE shutdown in progress ORA-01033: ORACLE initialization or shutdown in progress ORA-01034: ORACLE not available ORA-01037: cannot allocate sort work area cursor; too many cursors ORA-01089: immediate shutdown in progress - no operations are permitted ORA-01090: shutdown in progress - connection is not permitted ORA-28547: connection to server failed, probable Oracle Net admin error ORA-30678: too many open connections """ msg = str(sys.exc_info()[1]).rstrip() # is the error about broken connection? if any(filter(lambda oracode: re.search('ORA-'+oracode,msg), re.split(' *\n *| +(?=\d)',""" .....: TNS 01000 01001 01014 01033 01034 01037 01089 01090 011.. 015.. 016.. 031.. 28547 30678 """)[1:-1]) # ignore outermost items as they are empty strings ): log.error(msg) ExampleOCIExceptionHandler.waitForOCIConnection() # enter recursion and call this method again.. sleep(2) return ExampleOCIExceptionHandler.dispatch(message_data) else: log.debug(msg) log.debug("Error seems to be not about lost connection, raising again ..") raise @staticmethod def waitForOCIConnection(time_to_wait=60): """Creates a new database connection, loops until one is established.""" # nullify the connection first, since it can't discover that the socket is gone connection.connection = None while not connection._valid_connection(): try: log.info("Attempt to establish OCI connection to %s ..." % [ settings.DATABASES['default']['NAME']]) # in django parlance, calling _cursor() opens the database connection cursor = connection._cursor() sleep(2) return connection._valid_connection() except (cx_Oracle.Error, DatabaseError): exctype, message = sys.exc_info()[:2] log.error(str(message).rstrip()) log.info("Retrying OCI connection in %i seconds" % time_to_wait) sleep(time_to_wait) return ExampleOCIExceptionHandler.waitForOCIConnection(time_to_wait)
# encoding: utf-8 =begin You may use this file under the terms of the BSD license as follows: Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALLOR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. The views and conclusions contained in the software and documentation are those of the authors and should not be interpreted as representing official policies, either expressed or implied, of copyright holder. =end require 'rubygems' require 'active_record' # Recipe for handling lost Oracle database connection with ActiveRecord. # # Depends on RAILS_ROOT and RAILS_ENV variables to be set and RAILS_ROOT/config/database.yml file to exist. # # Selects the network and connection errors, waits until the server is reachable again, # and calls the method again during upon which the exception happened. # # This is very useful if the connection, for some reason, is shared between requests. class ExampleOCIExceptionHandler # Processes request. # # In case of database connection failure, the waitForOCIConnection() loop is called # and connection should be re-established. In case of other Oracle errors, # the error is raised again. def self.dispatch(message_data) begin # do some database operations... # Catch lost database connection. # # Handles all TNS errors: # ORA-xxxxx: TNS errors # # And all errors from these ranges: # ORA-011xx: Database file errors # ORA-015xx: Execution errors # ORA-016xx: Execution errors # ORA-031xx: communication errors # # Along with these specific errors: # ORA-01000: maximum open cursors exceeded # ORA-01001: invalid cursor # ORA-01014: ORACLE shutdown in progress # ORA-01033: ORACLE initialization or shutdown in progress # ORA-01034: ORACLE not available # ORA-01037: cannot allocate sort work area cursor; too many cursors # ORA-01089: immediate shutdown in progress - no operations are permitted # ORA-01090: shutdown in progress - connection is not permitted # ORA-28547: connection to server failed, probable Oracle Net admin error # ORA-30678: too many open connections # rescue OCIError, ActiveRecord::StatementInvalid msg = $!.message # is the error about broken connection? if %w{ .....: TNS 01000 01001 01014 01033 01034 01037 01089 01090 011.. 015.. 016.. 031.. 28547 30678 }.select{|oracode| msg[/ORA-#{oracode}/]}.any? logger.error msg waitForOCIConnection() # enter recursion and call this method again.. return dispatch(message_data) else logger.debug msg logger.debug("Error seems to be not about lost connection, raising again ..") raise $! end end end # Attempts to establish database connection. # Loops until ActiveRecord is connected. # # Set time_to_wait in seconds. def self.waitForOCIConnection(time_to_wait=60) begin database_configuration = YAML.load_file(File.join(RAILS_ROOT,"config","database.yml")) ActiveRecord::Base.configurations = database_configuration logger.info( "Attempt to establish OCI connection to %s ..." % database_configuration[RAILS_ENV]['database']) ActiveRecord::Base.establish_connection(database_configuration[RAILS_ENV]) ActiveRecord::Base.connection # essential to open connection return ActiveRecord::Base.connected? rescue OCIError, ActiveRecord::StatementInvalid logger.error($!.message) logger.info("Retrying OCI connection in %i seconds" % time_to_wait) sleep time_to_wait return waitForOCIConnection(time_to_wait) end end end