Recovering from BSDS or log failures during restart

When the bootstrap data set (BSDS) or part of the recovery log for Db2 is damaged or lost and that damage prevents restart, you need to recover from that situation. What you do to recover varies based on the particular circumstances.

If the problem is discovered at restart, begin with one of the following recovery procedures:

If the problem persists, return to the procedures in this section.

When Db2 recovery log damage terminates restart processing, Db2 issues messages to the console to identify the damage and issues an abend reason code. (The SVC dump title includes a more specific abend reason code to assist in problem diagnosis.) If the explanations for the reason codes indicate that restart failed because of some problem that is not related to a log error, contact IBM® Software Support.

To minimize log problems during restart, the system requires two copies of the BSDS. Dual logging is also recommended.

Basic approaches to recovery: The two basic approaches to recovery from problems with the log are:

Bypassing the damaged log

Even if the log is damaged and Db2 is started by circumventing the damaged portion, the log remains the most important source for determining what work was lost and what data is inconsistent.

Bypassing a damaged portion of the log generally proceeds with the following steps:
  1. Db2 restart fails. A problem exists on the log, and a message identifies the location of the error. The following abend reason codes, which appear only in the dump title, can be issued for this type of problem. This is not an exhaustive list; other codes might occur.
    • 00D10261
    • 00D10262
    • 00D10263
    • 00D10264
    • 00D10265
    • 00D10266
    • 00D10267
    • 00D10268
    • 00D10329
    • 00D1032A
    • 00D1032B
    • 00D1032C
    • 00E80084

    The following figure illustrates the general problem:

    Figure 1. General problem of damaged Db2 log information (a time line that depicts a damaged log)
  2. Db2 cannot skip over the damaged portion of the log and continue restart processing. Instead, you restrict processing to only a part of the log that is error free. For example, the damage shown in the preceding figure occurs in the log RBA range between X and Y. You can restrict restart to all of the log before X; then changes later than X are not made. Alternatively, you can restrict restart to all of the log after Y; then changes between X and Y are not made. In either case, some amount of data is inconsistent.
  3. You identify the data that is made inconsistent by your restart decision. With the SUMMARY option, the DSN1LOGP utility scans the accessible portion of the log and identifies work that must be done at restart, namely, the units of recovery that are to be completed and the page sets that they modified.

    Because a portion of the log is inaccessible, the summary information might not be complete. In some circumstances, your knowledge of work in progress is needed to identify potential inconsistencies.

  4. You use the CHANGE LOG INVENTORY utility to identify the portion of the log to be used at restart, and to specify whether to bypass any phase of recovery. You can choose to do a cold start and bypass the entire log.
  5. You restart Db2. Data that is unaffected by omitted portions of the log is available for immediate access.
  6. Before you allow access to any data that is affected by the log damage, you resolve all data inconsistencies. That process is described under Resolving inconsistencies resulting from a conditional restart.
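The trade-off in step 2 can be made concrete with a small model. The following Python sketch is illustrative only and is not part of any Db2 utility: the function name `restrict_restart`, the `RestartScope` type, and the use of half-open integer ranges for RBAs are all assumptions made for this example. It reports which part of the log restart would process, and which range of changes would not be made, for each of the two choices described above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RestartScope:
    """Hypothetical summary of one restart decision (not a Db2 structure)."""
    processed: Tuple[int, int]    # log range restart is restricted to
    not_applied: Tuple[int, int]  # log range whose changes are not made


def restrict_restart(log_start: int, log_end: int,
                     x: int, y: int, keep: str) -> RestartScope:
    """Model the two choices for a log damaged between RBA x and RBA y.

    Ranges are half-open (start inclusive, end exclusive); that boundary
    convention is an assumption of this sketch.
    """
    if not (log_start <= x < y <= log_end):
        raise ValueError("expected log_start <= x < y <= log_end")
    if keep == "before":
        # Restrict restart to the log before X: changes later than X are not made.
        return RestartScope(processed=(log_start, x), not_applied=(x, log_end))
    if keep == "after":
        # Restrict restart to the log after Y: changes between X and Y are not made.
        return RestartScope(processed=(y, log_end), not_applied=(x, y))
    raise ValueError("keep must be 'before' or 'after'")


if __name__ == "__main__":
    # Made-up RBA values, for illustration only.
    for choice in ("before", "after"):
        scope = restrict_restart(0x0000, 0x9000, 0x4000, 0x5000, choice)
        print(choice,
              [hex(v) for v in scope.processed],
              [hex(v) for v in scope.not_applied])
```

In either case the `not_applied` range is non-empty, which mirrors the point that some amount of data is inconsistent whichever part of the log you keep.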

Where to start

The specific procedure depends on the phase of restart that was in control when the log problem was detected. On completion, each phase of restart writes a message to the console. You must find the last of those messages in the console log. The next phase after the one that is identified is the one that was in control when the log problem was detected. Accordingly, start at:

As an alternative, determine which, if any, of the following messages was last received and follow the procedure for that message. Other DSN messages can also be issued.
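The rule for locating the failing phase (find the last completion message, then take the next phase) can be sketched as follows. This is a minimal illustration, not a Db2 interface: the function name `phase_in_control` is hypothetical, and the phase names and their order are taken from general Db2 restart documentation rather than from this section, so verify them against the documentation for your Db2 version.

```python
from typing import Optional

# Assumed restart phase order (not listed in this section; verify for your release).
RESTART_PHASES = (
    "log initialization",
    "current status rebuild",
    "forward log recovery",
    "backward log recovery",
)


def phase_in_control(last_completed: Optional[str]) -> Optional[str]:
    """Return the phase that was in control when the log problem was detected.

    last_completed is the phase named by the last completion message found in
    the console log, or None if no completion message was found. Returns None
    when every phase completed, which means the failure was not in a restart phase.
    """
    if last_completed is None:
        # No phase finished, so the first phase was in control.
        return RESTART_PHASES[0]
    # Raises ValueError if the name is not one of the assumed phases.
    next_index = RESTART_PHASES.index(last_completed) + 1
    if next_index < len(RESTART_PHASES):
        return RESTART_PHASES[next_index]
    return None


if __name__ == "__main__":
    print(phase_in_control("current status rebuild"))
```

Mapping each console message to a phase is deliberately left out of the sketch, because the message identifiers depend on the documentation for your release.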

Another procedure (Recovering from a failure resulting from total or excessive loss of log data) provides information to use if you determine (by using Recovering from failure during log initialization or current status rebuild) that an excessive amount (or all) of Db2 log information (BSDS, active, and archive logs) has been lost.

The last procedure, Resolving inconsistencies resulting from a conditional restart, can be used to resolve inconsistencies introduced while using one of the restart procedures in this information. If you decide to use Recovering from unresolvable BSDS or log data set problem during restart, you do not need to use Resolving inconsistencies resulting from a conditional restart.

Because of the severity of the situations described, the procedures identify Operations management action, rather than Operator action. Operations management might not be performing all the steps in the procedures, but they must be involved in making the decisions about the steps to be performed.