转载

批量插入数据时主键冲突的处理

客户有这样一个需求:有一个大表A,有主键,经常需要往这个表中批量插入大量数据,但插入的数据可能自身重复或者跟表A重复。
下面测试几种插入的方法:

DB:ORACLE 11.2.0.4

新建测试表:
新建表scott.tb_01
create table scott.tb_01
as
SELECT level c1,level c2,level c3
FROM DUAL
CONNECT BY LEVEL <= 100000;

在表scott.tb_01上增加主键
create unique index scott.pk_tb_01 on scott.tb_01(c1); --这一步不是必需的,因为下一步会自动建索引,但这种建索引再加主键的方式可以在建索引的时候加并行。
alter table scott.tb_01 add constraint pk_tb_01 primary key (c1) using index;

新建表scott.tb_02
create table scott.tb_02
as
SELECT level+100000 c1,level c2,level c3
FROM DUAL
CONNECT BY LEVEL <= 100000;

在表scott.tb_02上插入表scott.tb_01的100条数据做为重复数据
insert into scott.tb_02
SELECT c1,c2, c3
FROM scott.tb_01
where rownum <= 100;

commit;

现在需要将表scott.tb_02中和表scott.tb_01非重复的数据插入到scott.tb_01
方法1:关联插入
INSERT INTO SCOTT.TB_01
SELECT A.* --这里如果表SCOTT.TB_02自身有重复数据,还要加上distinct
FROM SCOTT.TB_02 A
LEFT JOIN SCOTT.TB_01 B
ON A.c1 = B.c2
WHERE B.c1 IS NULL;

100000 rows created.
上面一种常见的插入方法,这个方法的的问题在于,如果表SCOTT.TB_02和表SCOTT.TB_01都很大,两个表关联查询的成本会很高,影响性能。

方法2:使用HINT:IGNORE_ROW_ON_DUPKEY_INDEX
不使用HINT:
INSERT
INTO SCOTT.TB_01
SELECT * FROM SCOTT.TB_02 ;

ERROR at line 1:
ORA-00001: unique constraint (SCOTT.PK_TB_01) violated
从上面可以看出直接将表SCOTT.TB_02插入到表SCOTT.TB_01会出现主键冲突,报ORA-00001错误。

使用HINT:
INSERT /*+IGNORE_ROW_ON_DUPKEY_INDEX(a PK_tb_01)*/
INTO SCOTT.TB_01 A
SELECT * FROM SCOTT.TB_02 ;

100000 rows created.
从上面可以看出,加上HINT后,表SCOTT.TB_02中和表SCOTT.TB_01没有主键冲突的记录插入到表SCOTT.TB_01中,冲突的100条记录没有插入,也没有报错。

方法3:使用impdp的skip_constraint_errors选项
创建dump目录
create directory dump_home as '/home/oracle';

导出表scott.tb_02
expdp system tables=scott.tb_02 directory=dump_home dumpfile=expdp_tb_02.dmp logfile=expdp_tb_02.log
使用expdp导出表scott.tb_02,用于后续导入到表scott.tb_01中。

导入表scott.tb_02到scott.tb_01,不加skip_constraint_errors选项
impdp system tables=scott.tb_02 remap_table=scott.tb_02:tb_01 content=data_only directory=dump_home dumpfile=expdp_tb_02.dmp logfile=impdp_tb_02.log

Import: Release 11.2.0.4.0 - Production on Tue Dec 5 23:18:49 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Password:

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Master table "SYSTEM"."SYS_IMPORT_TABLE_01" successfully loaded/unloaded
Starting "SYSTEM"."SYS_IMPORT_TABLE_01": system/******** tables=scott.tb_02 remap_table=scott.tb_02:tb_01 content=data_only directory=dump_home dumpfile=expdp_tb_02.dmp logfile=impdp_tb_02.log
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
ORA-31693: Table data object "SCOTT"."TB_01" failed to load/unload and is being skipped due to error:
ORA-00001: unique constraint (SCOTT.PK_TB_01) violated
Job "SYSTEM"."SYS_IMPORT_TABLE_01" completed with 1 error(s) at Tue Dec 5 23:19:02 2017 elapsed 0 00:00:09
从上面可以看出,如果不加skip_constraint_errors选项就会报主键冲突错误,导入失败。

导入表scott.tb_02到scott.tb_01,加skip_constraint_errors选项
impdp system tables=scott.tb_02 remap_table=scott.tb_02:tb_01 content=data_only directory=dump_home dumpfile=expdp_tb_02.dmp logfile=impdp_tb_02.log data_options=skip_constraint_errors

Import: Release 11.2.0.4.0 - Production on Tue Dec 5 23:21:29 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Password:

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Master table "SYSTEM"."SYS_IMPORT_TABLE_01" successfully loaded/unloaded
Starting "SYSTEM"."SYS_IMPORT_TABLE_01": system/******** tables=scott.tb_02 remap_table=scott.tb_02:tb_01 content=data_only directory=dump_home dumpfile=expdp_tb_02.dmp logfile=impdp_tb_02.log data_options=skip_constraint_errors
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "SCOTT"."TB_01" 1.796 MB 100000 out of 100100 rows
100 row(s) were rejected with the following error:
ORA-00001: unique constraint (SCOTT.PK_TB_01) violated

Rejected rows with the primary keys are:
Rejected row #1:
column C1: 1
Rejected row #99:
column C1: 99
Rejected row #100:
column C1: 100
Job "SYSTEM"."SYS_IMPORT_TABLE_01" successfully completed at Tue Dec 5 23:21:56 2017 elapsed 0 00:00:23
从上面可以看出,导入顺利完成,并跳过了主键冲突的行,并在impdp的log中显示行的信息。
正文到此结束
Loading...