Oracle数据恢复 – 注意Expdp导出文件可能初始写入损坏

最近一个客户遭遇到了一次非常意外的数据损失,客户在维护时重建系统,先通过expdp备份了数据,结果当进行导入恢复数据时,发现部分重要的数据表无法恢复,出现错误。这个案例提示我们:不能尽信expdp等导出文件备份,必须通过多重手段协同,全面保障数据安全。

导入DMP文件时出现的错误大致如下:

Import: Release 11.2.0.1.0 – Production on 星期日 12月 30 15:37:27 2013
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
;;;
连接到: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – Production
With the Partitioning and Real Application Testing options
已成功加载/卸载了主表 “CW9999”.”SYS_IMPORT_TABLE_02″
启动 “CW9999”.”SYS_IMPORT_TABLE_02″:  cw9999/******** parfile=data10.par
处理对象类型 SCHEMA_EXPORT/TABLE/TABLE
处理对象类型 SCHEMA_EXPORT/TABLE/TABLE_DATA
ORA-31693: 表数据对象 “CW9999″.”PSXM” 无法加载/卸载并且被跳过, 错误如下:
ORA-02354: 导出/导入数据时出错
ORA-02368: the following file is not valid for this load operation
ORA-02369: internal number in header in file H:\DMP\bj.dmp is not valid
ORA-31693: 表数据对象 “CW9999″.”PSMOD” 无法加载/卸载并且被跳过, 错误如下:
ORA-02354: 导出/导入数据时出错
ORA-39776: 加载表 “CW9999″.”SPMOD” 时出现致命的直接路径 API 错误
ORA-00600: 内部错误代码, 参数: [klaprs_11], [60], [0], [], [], [], [], [], [], [], [], []

注意这其中最关键的错误是ORA-02369,提示DMP文件中的内部数字标记非法,这其实说明导出文件损坏了,以下是该错误的官方解释:

ORA-02369 internal number in header in file string is not valid
Cause: If the dump files were transferred over, verify the transfer operation was done correctly.
Action: The specified file could not be used for this load because the internal number in the header was not valid.

在MOS上的Note:785473.1 上有提示,如果文件通过ftp等方式传输,可能丢失了内容,可以尝试重传,但是这种概率极低,事实上是,在执行导出时文件就损坏了。这种情况下,不可避免的要丢失数据。在MOS上,有几十个与此错误有关的SR。

在这个客户案例中,部分数据表是完好的,可以导入,但是其中一些表出错,无法导入还原到数据库中。

在这种情况下,极限的数据挽救情况是:通过ODU、DUL等工具,可以扫描DMP文件,尝试恢复其中完好的数据,但是通常无法做到完好无损了。

技术提示:ORA-02369 不是指dmp文件的开始部分损坏,对于DMP文件来说,每个表开始部分都存在一个Header信息,这个错误是指对于指定表,这个信息损坏。在这个案例中,可能是并行导出导致的写损坏。

案例警示:在重要数据备份中,不能仅采用单一手段,多重手段结合,才能确保数据万无一失

Oracle ASM Truncate Table恢复

云和恩墨独立研发的Oracle 数据库恢复软件DATA-LADR 4.0.0以后的版本已经支持ASM,它能够直接从ASM磁盘中解析数据。即使由于硬件或错误地FORMAT导致ASM磁盘头部数据损坏而导致ASM磁盘组不能加载时,DATA-LADR仍然能够从ASM中恢复数据。正是由于DATA-LADR的这一特性,现在在ASM里恢复被误Truncate掉的表中的数据将变得轻而易举。

下面我利用DATA-LADR 4.1.2的试用版来演示如何在Oracle 11gR2 ASM中恢复被误Truncate掉的表中的数据。

请注意——DATA-LADR试用版仅用于测试、学习和验证,只能恢复SYSTEM表空间下的数据,对于其他表空间的数据,仅恢复少量的数据以验证数据可恢复。而正式版在获取LICENSE后能够恢复所有能够恢复的数据。

准备好DATA-LADR所需要的控制文件

ODU运行所需的控制文件control.txt的内容可以在mount或open状态下查询v$datafile,如下所示:

[oracle@bspdev odu]$ sqlplus ‘/ as sysdba’

SQL*Plus: Release 11.2.0.1.0 Production on Tue Apr 12 13:29:01 2011

Copyright (c) 1982, 2009, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

SQL> set wrap off
SQL> set linesize 800
SQL> select ts#,file#,rfile#,name from v$datafile;

TS# FILE# RFILE# NAME
———- ———- ———- ————————————-
0 1 1 +DATA/ora11g/datafile/system.256.747310449
1 2 2 +DATA/ora11g/datafile/sysaux.257.747310449
2 3 3 +DATA/ora11g/datafile/undotbs1.258.747310451
4 4 4 +DATA/ora11g/datafile/users.259.747310451

将上述查询到的内容添加到control.txt中即可,如下所示:

[oracle@bspdev odu]$ cat control.txt
0 1 1 +DATA/ora11g/datafile/system.256.747310449
1 2 2 +DATA/ora11g/datafile/sysaux.257.747310449
2 3 3 +DATA/ora11g/datafile/undotbs1.258.747310451
4 4 4 +DATA/ora11g/datafile/users.259.747310451

准备好DATA-LADR所需要的ASM磁盘信息文件

DATA-LADR运行所需要的ASM磁盘信息文件实际上是来源于v$asm_disk这个视图:

SQL> select group_number,name,state from v$asm_diskgroup;

GROUP_NUMBER NAME STATE
———— —————————— ———–
1 DATA CONNECTED
2 RECO MOUNTED

SQL> col path for a30
SQL> col name for a30
SQL> select disk_number,name,path,group_number from v$asm_disk order by group_number,disk_number;

DISK_NUMBER NAME PATH GROUP_NUMBER
———– —————————— ——————————
0 DATA_0000 /dev/raw/raw3 1
1 DATA_0001 /dev/raw/raw5 1
2 DATA_0002 /dev/raw/raw6 1
0 RECO_0000 /dev/raw/raw7 2
1 RECO_0001 /dev/raw/raw8 2

注意,上述系统是建立在Linux中的RAW分区上,这里/dev/raw/raw[i],对应的实际设备名是/dev/sda[i]。而对于Linux下的RAW磁盘(或RAW分区),DATA-LADR不直接支持,但是DATA-LADR支持RAW磁盘所对应的实际设备文件,比如数据库中使用的是/dev/raw/raw1,其对应的实际设备文件为/dev/sdc1,则需要在DATA-LADR的ASM磁盘信息文件的磁盘路径列(栏)填上/dev/sdc1。如果运行DATA-LADR的用户没有访问权限,则对用户赋予可读权限,或以有权限的用户比如root用户来运行DATA-LADR

如果直接用RAW磁盘的名称,则DATA-LADR将不能识别出正确的磁盘信息(这是由于LINUX的RAW限制所致,现在已经不鼓励使用RAW,而使用LVM):

[oracle@bspdev odu]$ cat asmdisk.txt
# disk_no disk_path group_name meta_block_size ausize disk_size header_offset
0 /dev/raw/raw3 DATA 4096 1048576
1 /dev/raw/raw5 DATA 4096 1048576
2 /dev/raw/raw6 DATA 4096 1048576
0 /dev/raw/raw7 RECO 4096 1048576
1 /dev/raw/raw8 RECO 4096 1048576

[oracle@bspdev odu]$ ./odu

Oracle Data Unloader:Release 4.1.3

Copyright (c) 2008,2009,2010,2011 XiongJun. All rights reserved.

Web: http://www.support.enmotech.com
Email: magic007cn@gmail.com

loading default config…….

byte_order little
block_size 8192
block_buffers 1024
db_timezone -7
client_timezone 8
asmfile_extract_path /odu/asmfile
data_path data
lob_path /odu/data/lob
charset_name AL32UTF8
ncharset_name AL16UTF16
output_format text
lob_storage infile
clob_byte_order big
trace_level 1
delimiter |
unload_deleted no
file_header_offset 0
is_tru64 no
record_row_addr no
convert_clob_charset yes
use_scanned_lob yes
trim_scanned_blob yes
lob_switch_dir_rows 20000
db_block_checksum yes
db_block_checking yes
rdba_file_bits 10
compatible 10
load config file ‘config.txt’ successful
loading default asm disk file ……

read data error from asm disk ‘/dev/raw/raw3’.error message:Invalid argument
read data error from asm disk ‘/dev/raw/raw5’.error message:Invalid argument
read data error from asm disk ‘/dev/raw/raw6’.error message:Invalid argument
read data error from asm disk ‘/dev/raw/raw7’.error message:Invalid argument
read data error from asm disk ‘/dev/raw/raw8’.error message:Invalid argument

grp# dsk# bsize ausize disksize diskname groupname path
—- —- —– —— ——– ————— ————— ———-

load asm disk file ‘asmdisk.txt’ successful
loading default control file ……

can not found diskgroup for file +DATA/ora11g/datafile/system.256.747310449.
can not found diskgroup for file +DATA/ora11g/datafile/sysaux.257.747310449.
can not found diskgroup for file +DATA/ora11g/datafile/undotbs1.258.747310451.
can not found diskgroup for file +DATA/ora11g/datafile/users.259.747310451.

ts# fn rfn bsize blocks bf offset filename
—- —- —- —– ——– — —— ————————————-
load control file ‘oductl.dat’ successful
loading dictionary data……done

loading scanned data……done

从提示信息里我们可以看到DATA-LADR并不能直接识别RAW类型的磁盘。

此时,我们应该将asmdisk.txt中的磁盘路径列改成RAW磁盘实际对应的设备文件名,并对用户赋予可读权限,或以有权限的用户比如root用户来运行DATA-LADR,如下所示:

[oracle@bspdev odu]$ su
Password:
[root@bspdev odu]# cat asmdisk.txt
# disk_no disk_path group_name meta_block_size ausize disk_size header_offset
0 /dev/sda3 DATA 4096 1048576
1 /dev/sda5 DATA 4096 1048576
2 /dev/sda6 DATA 4096 1048576
0 /dev/sda7 RECO 4096 1048576
1 /dev/sda8 RECO 4096 1048576
[root@bspdev odu]# ./odu

Oracle Data Unloader:Release 4.1.3

Copyright (c) 2008,2009,2010,2011 XiongJun. All rights reserved.

Web: http://www.support.enmotech.com
Email: magic007cn@gmail.com

loading default config…….

byte_order little
block_size 8192
block_buffers 1024
db_timezone -7
client_timezone 8
asmfile_extract_path /odu/asmfile
data_path data
lob_path /odu/data/lob
charset_name AL32UTF8
ncharset_name AL16UTF16
output_format text
lob_storage infile
clob_byte_order big
trace_level 1
delimiter |
unload_deleted no
file_header_offset 0
is_tru64 no
record_row_addr no
convert_clob_charset yes
use_scanned_lob yes
trim_scanned_blob yes
lob_switch_dir_rows 20000
db_block_checksum yes
db_block_checking yes
rdba_file_bits 10
compatible 10
load config file ‘config.txt’ successful
loading default asm disk file ……

grp# dsk# bsize ausize disksize diskname groupname path
—- —- —– —— ——– ————— ————— ——
1 0 4096 1024K 9000 DATA_0000 DATA /dev/sda3
1 1 4096 1024K 9000 DATA_0001 DATA /dev/sda5
1 2 4096 1024K 9000 DATA_0002 DATA /dev/sda6
2 0 4096 1024K 9000 RECO_0000 RECO /dev/sda7
2 1 4096 1024K 7288 RECO_0001 RECO /dev/sda8

load asm disk file ‘asmdisk.txt’ successful
loading default control file ……

ts# fn rfn bsize blocks bf offset filename
—- —- —- —– ——– — —— ——————————
0 1 1 8192 88320 N 0 +DATA/ora11g/datafile/system.256.747310449
1 2 2 8192 89600 N 0 +DATA/ora11g/datafile/sysaux.257.747310449
2 3 3 8192 12160 N 0 +DATA/ora11g/datafile/undotbs1.258.747310451
4 4 4 8192 640 N 0 +DATA/ora11g/datafile/users.259.747310451
load control file ‘oductl.dat’ successful
loading dictionary data……done

loading scanned data……done

从结果里我们可以看到,现在DATA-LADR已经能够正确的识别出所有的ASM磁盘。

依法炮制我们准备好上述库的ASM磁盘配置信息文件asmdisk.txt,如下所示:

[oracle@bspdev odu]$ cat asmdisk.txt
# disk_no disk_path group_name meta_block_size ausize disk_size header_offset
0 /dev/sda3 DATA 4096 1048576
1 /dev/sda5 DATA 4096 1048576
2 /dev/sda6 DATA 4096 1048576
0 /dev/sda7 RECO 4096 1048576
1 /dev/sda8 RECO 4096 1048576

准备好相关的测试数据及执行Truncate操作

[oracle@bspdev odu]$ sqlplus ‘/ as sysdba’;

SQL*Plus: Release 11.2.0.1.0 Production on Wed Apr 13 15:33:37 2011

Copyright (c) 1982, 2009, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

创建测试表t1:

SQL> create table t1 as select * from dba_objects;

Table created.

SQL> select count(*) from t1;

COUNT(*)
———-
72206

SQL> desc t1;
Name Null? Type
—————————————– ——– —————–
OWNER VARCHAR2(30)
OBJECT_NAME VARCHAR2(128)
SUBOBJECT_NAME VARCHAR2(30)
OBJECT_ID NUMBER
DATA_OBJECT_ID NUMBER
OBJECT_TYPE VARCHAR2(19)
CREATED DATE
LAST_DDL_TIME DATE
TIMESTAMP VARCHAR2(19)
STATUS VARCHAR2(7)
TEMPORARY VARCHAR2(1)
GENERATED VARCHAR2(1)
SECONDARY VARCHAR2(1)
NAMESPACE NUMBER
EDITION_NAME VARCHAR2(30)

创建表t1的备份表t1_backup,这是用于恢复数据后的比对工作:

SQL> create table t1_backup as select * from t1;

Table created.

SQL> select count(*) from t1_backup;

COUNT(*)
———-
72206

SQL> select * from t1 minus select * from t1_backup;

no rows selected

执行对表t1的truncate操作:

SQL> truncate table t1;

Table truncated.

执行一次full checkpoint,目的是把ODU所需要的数据字典信息让Oracle写回到datafile中:

SQL> alter system checkpoint;

System altered.

可以看到,t1中的72206条数据已经被清空了:

SQL> select count(*) from t1;

COUNT(*)
———-
0

为了说明ODU可以在ASM diskgroup不能成功mount的情况下依然可以恢复出数据的这一特点,我们现在关闭数据库实例和ASM实例:

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

[oracle@bspdev odu]$ su – grid
Password:
[grid@bspdev ~]$ sqlplus ‘/ as sysasm’;

SQL*Plus: Release 11.2.0.1.0 Production on Wed Apr 13 15:37:47 2011

Copyright (c) 1982, 2009, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – Production
With the Automatic Storage Management option

SQL> shutdown immediate
ASM diskgroups dismounted
ASM instance shutdown
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – Production
With the Automatic Storage Management option

从结果里我们可以看到,现在所有的diskgroup都已经dismount了:

[grid@bspdev ~]$ crsctl stat res
NAME=ora.DATA.dg
TYPE=ora.diskgroup.type
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.LISTENER.lsnr
TYPE=ora.listener.type
TARGET=ONLINE
STATE=ONLINE on bspdev

NAME=ora.RECO.dg
TYPE=ora.diskgroup.type
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.asm
TYPE=ora.asm.type
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.cssd
TYPE=ora.cssd.type
TARGET=ONLINE
STATE=ONLINE on bspdev

NAME=ora.diskmon
TYPE=ora.diskmon.type
TARGET=ONLINE
STATE=ONLINE on bspdev

NAME=ora.ora11g.db
TYPE=ora.database.type
TARGET=OFFLINE
STATE=OFFLINE

[grid@bspdev ~]$ crs_stat -t
Name Type Target State Host
————————————————————
ora.DATA.dg ora….up.type OFFLINE OFFLINE
ora….ER.lsnr ora….er.type ONLINE ONLINE bspdev
ora.RECO.dg ora….up.type OFFLINE OFFLINE
ora.asm ora.asm.type OFFLINE OFFLINE
ora.cssd ora.cssd.type ONLINE ONLINE bspdev
ora.diskmon ora….on.type ONLINE ONLINE bspdev
ora.ora11g.db ora….se.type OFFLINE OFFLINE

使用ODU试用版恢复上述被Truncate掉的表

现在我们来使用ODU的试用版来恢复上述被Truncate掉的表t1。
因为使用了Linux下的raw分区,所以这里我切换回root用户后再执行ODU:

[grid@bspdev ~]$ su
Password:
[root@bspdev grid]# cd /u01/app/oracle/odu
[root@bspdev odu]# ./odu

Oracle Data Unloader trial version 4.1.2

Copyright (c) 2008,2009,2010,2011 XiongJun. All rights reserved.

Web: http://www.support.enmotech.com
Email: magic007cn@gmail.com

loading default config…….

byte_order little
block_size 8192
block_buffers 1024
error at line 3.
db_timezone -7
client_timezone 8
asmfile_extract_path /odu/asmfile
data_path data
lob_path /odu/data/lob
charset_name AL32UTF8
ncharset_name AL16UTF16
output_format text
lob_storage infile
clob_byte_order big
trace_level 1
delimiter |
unload_deleted no
file_header_offset 0
is_tru64 no
record_row_addr no
convert_clob_charset yes
use_scanned_lob yes
trim_scanned_blob yes
lob_switch_dir_rows 20000
db_block_checksum yes
db_block_checking yes
rdba_file_bits 10
compatible 10
load config file ‘config.txt’ successful
loading default asm disk file ……

grp# dsk# bsize ausize disksize diskname groupname path
—- —- —– —— ——– ————— ————— ——
1 0 4096 1024K 9000 DATA_0000 DATA /dev/sda3
1 1 4096 1024K 9000 DATA_0001 DATA /dev/sda5
1 2 4096 1024K 9000 DATA_0002 DATA /dev/sda6
2 0 4096 1024K 9000 RECO_0000 RECO /dev/sda7
2 1 4096 1024K 7288 RECO_0001 RECO /dev/sda8

load asm disk file ‘asmdisk.txt’ successful
loading default control file ……

ts# fn rfn bsize blocks bf offset filename
—- —- —- —– ——– — —— —————————–
0 1 1 8192 92160 N 0 +DATA/ora11g/datafile/system.256.747310449
1 2 2 8192 92160 N 0 +DATA/ora11g/datafile/sysaux.257.747310449
2 3 3 8192 12160 N 0 +DATA/ora11g/datafile/undotbs1.258.747310451
4 4 4 8192 1280 N 0 +DATA/ora11g/datafile/users.259.747310451
load control file ‘control.txt’ successful
loading dictionary data……done

loading scanned data……done

ODU恢复被Truncate的表的数据的过程非常简单,一共只需3步即可:
第一步,首先unload数据字典:

ODU> unload dict
CLUSTER C_USER# file_no: 1 block_no: 208
TABLE OBJ$ file_no: 1 block_no: 240
CLUSTER C_OBJ# file_no: 1 block_no: 144
CLUSTER C_OBJ# file_no: 1 block_no: 144
found IND$’s obj# 19
found IND$’s dataobj#:2,ts#:0,file#:1,block#:144,tab#:3
found TABPART$’s obj# 576
found TABPART$’s dataobj#:576,ts#:0,file#:1,block#:3824,tab#:0
found INDPART$’s obj# 581
found INDPART$’s dataobj#:581,ts#:0,file#:1,block#:3872,tab#:0
found TABSUBPART$’s obj# 588
found TABSUBPART$’s dataobj#:588,ts#:0,file#:1,block#:3928,tab#:0
found INDSUBPART$’s obj# 593
found INDSUBPART$’s dataobj#:593,ts#:0,file#:1,block#:3968,tab#:0
found IND$’s obj# 19
found IND$’s dataobj#:2,ts#:0,file#:1,block#:144,tab#:3
found LOB$’s obj# 80
found LOB$’s dataobj#:2,ts#:0,file#:1,block#:144,tab#:6
found LOBFRAG$’s obj# 609
found LOBFRAG$’s dataobj#:609,ts#:0,file#:1,block#:4096,tab#:0

接着获得待恢复的表的tablespace number和object id:

ODU> desc sys.t1

Object ID:74332
Storage(Obj#=74332 DataObj#=74334 TS#=0 File#=1 Block#=83800 Cluster=0)

NO. SEG INT Column Name Null? Type
— — — —————————— ——— —————-
1 1 1 OWNER VARCHAR2(30)
2 2 2 OBJECT_NAME VARCHAR2(128)
3 3 3 SUBOBJECT_NAME VARCHAR2(30)
4 4 4 OBJECT_ID NUMBER
5 5 5 DATA_OBJECT_ID NUMBER
6 6 6 OBJECT_TYPE VARCHAR2(19)
7 7 7 CREATED DATE
8 8 8 LAST_DDL_TIME DATE
9 9 9 TIMESTAMP VARCHAR2(19)
10 10 10 STATUS VARCHAR2(7)
11 11 11 TEMPORARY VARCHAR2(1)
12 12 12 GENERATED VARCHAR2(1)
13 13 13 SECONDARY VARCHAR2(1)
14 14 14 NAMESPACE NUMBER
15 15 15 EDITION_NAME VARCHAR2(30)

第二步,根据desc的结果,执行scan extent的操作,scan extent命令后可以指定待扫描的表所在的tablespace number和object id:

ODU> scan extent tablespace 0 object 74332

scan extent start: 2011-04-13 15:40:33
scanning extent…
scanning extent finished.
scan extent completed: 2011-04-13 15:40:59

第三步,开始执行恢复操作,从结果里可以看到,ODU已经成功把被Truncate掉的72206条数据给恢复出来了:

ODU> unload table sys.t1 object truncate
Auto mode truncated table.

Unloading table: T1,object ID: 74332
Unloading segment,storage(Obj#=74332 DataObj#=74332 TS#=0 File#=1 Block#=83800 Cluster=0)
72206 rows unloaded

ODU> exit

数据的导入及验证过程

我们现在来用SQL*Loader导入被ODU恢复的那72206条记录。
缺省情况下,恢复出来的数据位于ODU所在目录中的data子目录:

[root@bspdev data]# pwd
/u01/app/oracle/odu/data

ODU会帮你把待恢复的表的建表语句,SQL*Loader所需要的control文件都自动生成好:

[root@bspdev data]# ls -lrt
total 8592
-rw-r–r–. 1 root root 468 Apr 13 15:41 SYS_T1.sql
-rw-r–r–. 1 root root 644 Apr 13 15:41 SYS_T1.ctl
-rw-r–r–. 1 root root 8770686 Apr 13 15:41 SYS_T1.txt

SYS_T1.sql就是重建表t1所需要的sql语句:

[root@bspdev data]# cat SYS_T1.sql
CREATE TABLE “SYS”.”T1″
(
“OWNER” VARCHAR2(30) ,
“OBJECT_NAME” VARCHAR2(128) ,
“SUBOBJECT_NAME” VARCHAR2(30) ,
“OBJECT_ID” NUMBER ,
“DATA_OBJECT_ID” NUMBER ,
“OBJECT_TYPE” VARCHAR2(19) ,
“CREATED” DATE ,
“LAST_DDL_TIME” DATE ,
“TIMESTAMP” VARCHAR2(19) ,
“STATUS” VARCHAR2(7) ,
“TEMPORARY” VARCHAR2(1) ,
“GENERATED” VARCHAR2(1) ,
“SECONDARY” VARCHAR2(1) ,
“NAMESPACE” NUMBER ,
“EDITION_NAME” VARCHAR2(30)
);

SYS_T1.ctl就是SQL*Loader所需要的控制文件:

[root@bspdev data]# cat SYS_T1.ctl

–Generated by ODU,for table “SYS”.”T1″

OPTIONS(BINDSIZE=8388608,READSIZE=8388608,ERRORS=-1,ROWS=50000)
LOAD DATA
INFILE ‘SYS_T1.txt’ “STR X’0a’”
APPEND INTO TABLE “SYS”.”T1″
FIELDS TERMINATED BY X’7c’ TRAILING NULLCOLS
(
“OWNER” CHAR(30),
“OBJECT_NAME” CHAR(128),
“SUBOBJECT_NAME” CHAR(30),
“OBJECT_ID” ,
“DATA_OBJECT_ID” ,
“OBJECT_TYPE” CHAR(19),
“CREATED” DATE “yyyy-mm-dd hh24:mi:ss”,
“LAST_DDL_TIME” DATE “yyyy-mm-dd hh24:mi:ss”,
“TIMESTAMP” CHAR(19),
“STATUS” CHAR(7),
“TEMPORARY” CHAR(1),
“GENERATED” CHAR(1),
“SECONDARY” CHAR(1),
“NAMESPACE” ,
“EDITION_NAME” CHAR(30)
)

缺省情况下,恢复出来的数据会以文本方式存在data子目录下,这里就是SYS_T1.txt:

[root@bspdev data]# tail -n 20 SYS_T1.txt
SYS|WRH$_ACTIVE_SESSION_HISTORY|WRH$_ACTIVE_4143510747_257|74296|74296|TABLE PARTITION|2011-04-12 19:10:38|2011-04-12 19:10:38|2011-04-12:19:10:38|VALID|N|N|N|1
SYS|WRH$_ACTIVE_SESSION_HISTORY_PK|WRH$_ACTIVE_4143510747_257|74298|74298|INDEX PARTITION|2011-04-12 19:10:38|2011-04-12 19:10:38|2011-04-12:19:10:38|VALID|N|N|N|4
SYS|WRH$_ACTIVE_SESSION_HISTORY_PK|WRH$_ACTIVE_4143510747_234|74297|74191|INDEX PARTITION|2011-04-12 19:10:38|2011-04-12 19:10:38|2011-04-12:19:10:38|VALID|N|N|N|4
SYS|WRH$_TABLESPACE_STAT|WRH$_TABLES_4143510747_257|74300|74300|TABLE PARTITION|2011-04-12 19:10:38|2011-04-12 19:10:38|2011-04-12:19:10:38|VALID|N|N|N|1
SYS|WRH$_TABLESPACE_STAT_PK|WRH$_TABLES_4143510747_257|74302|74302|INDEX PARTITION|2011-04-12 19:10:38|2011-04-12 19:10:38|2011-04-12:19:10:38|VALID|N|N|N|4
SYS|WRH$_TABLESPACE_STAT_PK|WRH$_TABLES_4143510747_234|74301|74195|INDEX PARTITION|2011-04-12 19:10:38|2011-04-12 19:10:38|2011-04-12:19:10:38|VALID|N|N|N|4
SYS|WRH$_OSSTAT|WRH$_OSSTAT_4143510747_257|74304|74304|TABLE PARTITION|2011-04-12 19:10:39|2011-04-12 19:10:39|2011-04-12:19:10:39|VALID|N|N|N|1
SYS|WRH$_OSSTAT_PK|WRH$_OSSTAT_4143510747_257|74306|74306|INDEX PARTITION|2011-04-12 19:10:39|2011-04-12 19:10:39|2011-04-12:19:10:39|VALID|N|N|N|4
SYS|WRH$_OSSTAT_PK|WRH$_OSSTAT_4143510747_234|74305|74199|INDEX PARTITION|2011-04-12 19:10:39|2011-04-12 19:10:39|2011-04-12:19:10:39|VALID|N|N|N|4
SYS|WRH$_SYS_TIME_MODEL|WRH$_SYS_TI_4143510747_257|74308|74308|TABLE PARTITION|2011-04-12 19:10:39|2011-04-12 19:10:39|2011-04-12:19:10:39|VALID|N|N|N|1
SYS|WRH$_SYS_TIME_MODEL_PK|WRH$_SYS_TI_4143510747_257|74310|74310|INDEX PARTITION|2011-04-12 19:10:39|2011-04-12 19:10:39|2011-04-12:19:10:39|VALID|N|N|N|4
SYS|WRH$_SYS_TIME_MODEL_PK|WRH$_SYS_TI_4143510747_234|74309|74203|INDEX PARTITION|2011-04-12 19:10:39|2011-04-12 19:10:39|2011-04-12:19:10:39|VALID|N|N|N|4
SYS|WRH$_SERVICE_WAIT_CLASS|WRH$_SERVIC_4143510747_257|74312|74312|TABLE PARTITION|2011-04-12 19:10:39|2011-04-12 19:10:39|2011-04-12:19:10:39|VALID|N|N|N|1
SYS|WRH$_SERVICE_WAIT_CLASS_PK|WRH$_SERVIC_4143510747_257|74314|74314|INDEX PARTITION|2011-04-12 19:10:39|2011-04-12 19:10:39|2011-04-12:19:10:39|VALID|N|N|N|4
SYS|WRH$_SERVICE_WAIT_CLASS_PK|WRH$_SERVIC_4143510747_234|74313|74207|INDEX PARTITION|2011-04-12 19:10:39|2011-04-12 19:10:39|2011-04-12:19:10:39|VALID|N|N|N|4
SYS|WRH$_EVENT_HISTOGRAM|WRH$_EVENT__4143510747_257|74316|74316|TABLE PARTITION|2011-04-12 19:10:39|2011-04-12 19:10:39|2011-04-12:19:10:39|VALID|N|N|N|1
SYS|WRH$_EVENT_HISTOGRAM_PK|WRH$_EVENT__4143510747_257|74318|74318|INDEX PARTITION|2011-04-12 19:10:39|2011-04-12 19:10:39|2011-04-12:19:10:39|VALID|N|N|N|4
SYS|WRH$_EVENT_HISTOGRAM_PK|WRH$_EVENT__4143510747_234|74317|74211|INDEX PARTITION|2011-04-12 19:10:39|2011-04-12 19:10:39|2011-04-12:19:10:39|VALID|N|N|N|4
SYS|T1||74332|74332|TABLE|2011-04-13 15:34:08|2011-04-13 15:34:08|2011-04-13:15:34:08|VALID|N|N|N|1
SYS|T5||74330|74331|TABLE|2011-04-13 10:15:32|2011-04-13 10:48:23|2011-04-13:10:15:32|VALID|N|N|N|1

可以看到,恢复出来的数据的条数确实是72206:

[root@bspdev data]# cat SYS_T1.txt|wc -l
72206

为了导入这72206条数据,我们把ASM和数据库实例都重启起来:
先启ASM实例:

[root@bspdev data]# su – grid
[grid@bspdev ~]$ sqlplus ‘/ as sysasm’;

SQL*Plus: Release 11.2.0.1.0 Production on Wed Apr 13 15:44:13 2011

Copyright (c) 1982, 2009, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup
ASM instance started

Total System Global Area 284565504 bytes
Fixed Size 1336036 bytes
Variable Size 258063644 bytes
ASM Cache 25165824 bytes
ASM diskgroups mounted
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – Production
With the Automatic Storage Management option

再启数据库实例:

[grid@bspdev ~]$ su – oracle
Password:
[oracle@bspdev ~]$ sqlplus ‘/ as sysdba’;

SQL*Plus: Release 11.2.0.1.0 Production on Wed Apr 13 15:44:39 2011

Copyright (c) 1982, 2009, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 845348864 bytes
Fixed Size 1339796 bytes
Variable Size 637537900 bytes
Database Buffers 201326592 bytes
Redo Buffers 5144576 bytes
Database mounted.
Database opened.

可以看到t1依然是处于被清空的状态:

SQL> select count(*) from t1;

COUNT(*)
———-
0

SQL> select count(*) from t1_backup;

COUNT(*)
———-
72206

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

[oracle@bspdev ~]$ cd $ORACLE_HOME/odu/data
[oracle@bspdev data]$ ls -lrt
total 8592
-rw-r–r–. 1 root root 468 Apr 13 15:41 SYS_T1.sql
-rw-r–r–. 1 root root 644 Apr 13 15:41 SYS_T1.ctl
-rw-r–r–. 1 root root 8770686 Apr 13 15:41 SYS_T1.txt

更改一下权限,让SQL*Loader能读这些文件:

[oracle@bspdev data]$ su
Password:
[root@bspdev data]# chmod 777 SYS_T1*
[root@bspdev data]# ls -lrt
total 8592
-rwxrwxrwx. 1 root root 468 Apr 13 15:41 SYS_T1.sql
-rwxrwxrwx. 1 root root 644 Apr 13 15:41 SYS_T1.ctl
-rwxrwxrwx. 1 root root 8770686 Apr 13 15:41 SYS_T1.txt

开始用SQL*Loader导入这72206条数据:

[root@bspdev data]# su – oracle
[oracle@bspdev ~]$ cd $ORACLE_HOME/odu/data
[oracle@bspdev data]$ ls
SYS_T1.ctl SYS_T1.sql SYS_T1.txt
[oracle@bspdev data]$ sqlldr “‘sys/oracle as sysdba’ control=SYS_T1.ctl”

SQL*Loader: Release 11.2.0.1.0 – Production on Wed Apr 13 15:48:58 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.

Commit point reached – logical record count 5302
Commit point reached – logical record count 10604
Commit point reached – logical record count 15906
Commit point reached – logical record count 21208
Commit point reached – logical record count 26510
Commit point reached – logical record count 31812
Commit point reached – logical record count 37114
Commit point reached – logical record count 42416
Commit point reached – logical record count 47718
Commit point reached – logical record count 53020
Commit point reached – logical record count 58322
Commit point reached – logical record count 63624
Commit point reached – logical record count 68926
Commit point reached – logical record count 69266
Commit point reached – logical record count 72206

[oracle@bspdev data]$ sqlplus ‘/ as sysdba’;

SQL*Plus: Release 11.2.0.1.0 Production on Wed Apr 13 15:49:07 2011

Copyright (c) 1982, 2009, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

可以看到,被清空的72206条数据已经失而复得:

SQL> select count(*) from t1;

COUNT(*)
———-
72206

SQL> select count(*) from t1_backup;

COUNT(*)
———-
72206

从下面的结果我们可以看出,ODU已经精确的恢复了那被Truncate掉的72206条数据:

SQL> select * from t1 minus select * from t1_backup;

no rows selected

从此,在ASM里恢复被误Truncate的数据将不再是一件极其困难的事情。

Oracle dmp文件损坏如何恢复其中的数据

在某些情况下,如果没有物理备份,只要逻辑的导出备份,需要进行恢复时,发现dmp文件又损坏了,通常来讲是比较悲剧的。我们可以通过dul 进行文件扫描并抽取其中的数据以实现数据恢复的目的。这些我们的测试过程:
##### 测试DUL是否支持抽取exp的dmp

当使用scan dump file扫描完exp的dmp文件之后,最后是讲scan的数据输出到文件,方便查看,然后可以直接抽取我们需要的表,如下:

使用unexp命令抽取之后,最后需会产生2个文件,其中是一个sqlldr的ctl文件,最后我们通过sqlldr讲数据加载到数据库即可。如下:

当测试完DUL的exp dmp支持之后,我比较好奇是否会支持expdp呢? 如下是测试过程!

###### 测试DUL是否支持expdp

下面我们来测试,Oracle DUL是否支持expdp的dmp文件的抽取。
++++使用unpump命令查看dmp文件头信息

这里需要注意的是,scan的结果有10条,表示这个expdp的dmp文件中包含了10个表的内容,如果我只想抽取之前的一个表怎么办呢? 可以结合expdp的日志来进行判断,或者利用strings来确认。

这里假设我想抽取第一个表的数据:

上面的结果中,最为关键的一行是:Table data from 94872 until 77916680
根据这行结果,我们知道该表的offset为94872到 77916680,因此下面根据偏移量来抽取该表的数据:

我们可以看到,目前DUL 算是比较完美的支持了exp和expdp 的dmp。(我测试过程中遇到了一些其他的状况,这里不描述了)。

某客户数据库崩溃同时dmp损坏的恢复案例

某客户的一个库系统损坏,导致oracle 崩溃,最后通过安全模式将数据文件拷贝出来,发现无法启动,非归档环境,

而且只有dmp 备份,之前他们通过dmp 备份进行了恢复,但是发现部分dmp 可能存在问题,导致部分表无法恢复,又尝试使用ODU进行数据文件的抽取,也发现部分表无法抽取(可能是system损坏较为严重,dbv检测有1000多个坏块)。

如下是尝试open时的alert log信息:

很明显,Oracle 在执行递归SQL的适合报错了,而且遇到了坏块。通过dbv检测,我发现存在大量的坏块,而且部分块还是连续损坏,极有可能是某个extent都损坏了。如下是dbv的检测结果:

我们看到,部分坏块是连续的,这种情况处理就比较麻烦了。开始我尝试通过一些其他手段想把库先拉起来,发现不行,如下是10046 trace的内容:

我们可以看到,在访问某个block的适合出问题了,而且该block后面连续几个都是损坏的。我尝试bbed copy修改了几个block都不行。最后发现其实这个递归SQL,可以想办法绕过去的,及通过修改Oracle 二进制文件的方法,可惜的是客户

的机器上已经跑了一个库了,无法停,因此这种方法也就作罢;当然或许还能通过gdb来实现。总的来讲比较麻烦。

考虑到他们本身具有dmp 备份,因此直接dul 抽取dmp 即可,把需要的表弄出来就完了。

如下是dul 抽取dump的步骤: