Wednesday, August 19, 2015

How to detect gaps in your Data Guard environments

In the OTN Data Guard Forum, a user was trying to understand if there's a gap between the Primary and the Standby database so he executed the following query:
select count(*) from v$archived_log where applied = 'NO'

The output of the query was higher than 3000, my response was that the fact that his query returns more than 3000  doesn't necessarily tell that there's any problem becuase when you query v$archived_log it will always report "NO" for the local archive log destinations, therefore he should check for APPLIED = 'NO' and also standby_dest = 'YES'.

Link to the Discussion:

I decided to enhance the query even more so it will dipslay per each standby archive log destinaion the following information:
  • INST_ID - The Instance ID, relevant for RAC with Data Guard configurations
  • DEST_ID - The ID of the Archive Log Destination
  • Status - Will indicate whether the standby destination status is valid or not
  • Destination - The name of the Archive Log Destination
  • Applied_Gap - The applied gap between the Primary and the Standby
  • Received_Gap - The received gap between the Primary and the Standby
  • Last_Received_Seq - Last archive log received in the standby destination
  • Last_Applied_Seq - Last archive log applied in the standby destination
The following query should always be executed on the Primary database:
select ar.inst_id "inst_id",
       ar.dest_id "dest_id",
       ar.status "dest_status",
       ar.destination "destination",
         (select MAX (sequence#) highiest_seq
            from v$archived_log val, v$database vdb
           where     val.resetlogs_change# = vdb.resetlogs_change#
                 and thread# = ar.inst_id
                 and dest_id = ar.dest_id)
       - NVL (
            (select MAX (sequence#)
               from v$archived_log val, v$database vdb
              where     val.resetlogs_change# = vdb.resetlogs_change#
                    and thread# = ar.inst_id
                    and dest_id = ar.dest_id
                    and standby_dest = 'YES'
                    and applied = 'YES'),
            0)
          "applied_gap",
         (SELECT MAX (sequence#) highiest_seq
            from v$archived_log val, v$database vdb
           where     val.resetlogs_change# = vdb.resetlogs_change#
                 AND thread# = ar.inst_id)
       - NVL (
            (SELECT MAX (sequence#)
               from v$archived_log val, v$database vdb
              where     val.resetlogs_change# = vdb.resetlogs_change#
                    and thread# = ar.inst_id
                    and dest_id = ar.dest_id
                    and standby_dest = 'YES'),
            0)
          "received_gap",
       NVL (
          (SELECT MAX (sequence#)
             from v$archived_log val, v$database vdb
            where     val.resetlogs_change# = vdb.resetlogs_change#
                  and thread# = ar.inst_id
                  and dest_id = ar.dest_id
                  and standby_dest = 'YES'),
          0)
          "last_received_seq",
       NVL (
          (SELECT MAX (sequence#)
             from v$archived_log val, v$database vdb
            where     val.resetlogs_change# = vdb.resetlogs_change#
                  and thread# = ar.inst_id
                  and dest_id = ar.dest_id
                  and standby_dest = 'YES'
                  and applied = 'YES'),
          0)
          "last_applied_seq"
  from (SELECT DISTINCT dest_id,
                        inst_id,
                        status,
                        target,
                        destination,
                        error
          from sys.gv_$archive_dest
         where target = 'STANDBY' and STATUS <> 'DEFERRED') ar

Example of the SQL output:






I executed the query in one of our data guard environments.
As you can see from the example output, the sequence# of the last archive log that was sent to the standby destination is 9699 and the last one that was applied is 9698, the applied gap is 1 which is definitely OK because I'm working with a standby redo log therefore the last sequence# should always be the standby redolog that Oracle is currently writing to.

Bottom line:

As long as the applied and the received gap are lower than 2 the it means that there are no gaps in your Data Guard environment.

9 comments:

  1. That is very nice query!

    Another way you can figure out if there is a gap between standby and primary, is if you use the V$DATAGUARD_STATS (in the standby database) to see how long is the gap (for both transport and apply processes).

    If the Value is null, the standby is not applying at all and if the number is > +00 00:00:00 then you have a gap between the databases and that is the time between them.

    you can also convert this varchar value to seconds if you like... :)

    ReplyDelete
    Replies
    1. Yes, definitely V$DATAGUARD_STATS completes the picture and provides the gap in terms of time (while my query provides the gap in terms of # of archive logs).
      Thanks Zohar!

      Delete