Intermittent ORA-12514 error connecting to SCAN listeners

Users were complaining of ORA-12514: TNS:listener does not currently know of service requested in connect descriptor errors whilst connecting to a database on a RAC cluster.

Testing manually:

  • The TNS entry using the SCAN listener connects successfully nearly all the time when using SQLplus.
  • The TNS entry using the SCAN listener failed every time using the batch script (upon viewing the code, the batch script created 3 connections in quick succession)
  • Changing the TNS entry to use the local listener caused the batch script to run successfully.

I checked that the database is configured to register the service with the SCAN and VIP listeners. The parameters were correct:

NAME               VALUE
-------------------------------------------------
service_names      SALES.MYDOMAIN.COM
remote_listener    devdbs-scan:1521
local_listener     (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=ora012-vip)(PORT=1528))))          

Then I checked that the service was registered with all the SCAN listeners:

ora014> lsnrctl status LISTENER_SCAN1 | grep -i SALES
Service "SALES.MYDOMAIN.COM" has 1 instance(s).
  Instance "SALES_1", status READY, has 1 handler(s) for this service...

ora012> lsnrctl status LISTENER_SCAN2 | grep -i SALES
Service "SALES.MYDOMAIN.COM" has 1 instance(s).
  Instance "SALES_1", status READY, has 1 handler(s) for this service...

ora015> lsnrctl status LISTENER_SCAN3 | grep -i SALES
So, one of the SCAN listeners did not know of the service...?
Next, I checked how long the SCAN listeners have been running:
ora014> lsnrctl status LISTENER_SCAN1 | grep -i uptime
Uptime                    106 days 23 hr. 3 min. 32 sec
ora012> lsnrctl status LISTENER_SCAN2 | grep -i uptime
Uptime                    46 days 14 hr. 57 min. 14 sec
ora015> lsnrctl status LISTENER_SCAN3 | grep -i uptime
Uptime                    7 days 12 hr. 37 min. 7 sec

The SCAN listener that did not have the service registered had an uptime much shorter than the others, meaning that this SCAN listener had failed over fairly recently. This led me to Bug 13066936 ("Instance does not register services when scan fails over") which was mentioned in DocID 1373350.1 in Metalink/MOS. The problem is fixed in 11.2.0.3.11, 11.2.0.4, 12.1.0.1 and 11.2.0.3 Bundle 24 for Exadata

Solution:

Upgrade the Grid Infrastructure and RDBMS Oracle installations to a fixed version.

Workaround:

The workaround is to make a note of the current value of remote_listener, change it to a null value and then change it back as follows:

col remote_val new_value remote_val
select value remote_val from v\$parameter where name='remote_listener';
alter system set remote_listener='';
alter system register;
alter system set remote_listener='&remote_val';
alter system register;

I have written a short kornshell script (below) which will re-register all running instances on a node with the SCAN listeners using the workaround detailed above. If you encounter this bug and need a quick workaround you should run this script once on each node in the cluster.

#!/bin/ksh
# ------------------------------------------------------------------------
# A simple kornshell script to force all running instances to re-register
# with the remote_listener (workaround for bug 13066936)
# ------------------------------------------------------------------------
for INST in `ps -aef | grep 'ora_pmon' | egrep -v '(grep|sed)' | sed 's/^.*ora_pmon_//'`
do
  echo . Reregistering $INST with remote_listener
  export ORAENV_ASK=NO
  . oraenv $INST
  sqlplus -SL "/ as sysdba" <<-EOF >/dev/null
    col remote_val new_value remote_val
    select value remote_val from v\$parameter where name='remote_listener';
    alter system set remote_listener='';
    alter system register;
    alter system set remote_listener='&remote_val';
    alter system register;
EOF
done