EXADATA混合列压缩
今天搞EXADATA,正好有时间,测试下只能在EXADATA下才能用的11g新特性-混合列压缩。
SQL> select segment_name,bytes/1024/1024/1024 from dba_segments where owner='XXX' and segment_name='XXXX'; SEGMENT_NAME BYTES/1024/1024/1024 ------------------------------ -------------------- XXXX 20.5195313 SQL> select count(*) from XXX.XXXX; COUNT(*) ---------- 40270380
把这张20G的表做数据来源,分别创建不压缩、BASIC、OLTP、HCC1、HCC2、HCC3、HCC4级别的压缩表,看看压缩率对比以及创建所需的时间情况。
SQL> create table enmo_basic nologging parallel 16 compress basic as select /*+ parallel (a 16) */ * from XXX.XXXX a; Table created. Elapsed: 00:01:15.05 SQL> create table enmo_oltp nologging parallel 16 compress for oltp as select /*+ parallel (a 16) */ * from XXX.XXXX a; Table created. Elapsed: 00:01:11.44 SQL> create table enmo_hcc1 nologging parallel 16 compress for query low as select /*+ parallel (a 16) */ * from XXX.XXXX a; Table created. Elapsed: 00:01:18.46 SQL> create table enmo_hcc2 nologging parallel 16 compress for query high as select /*+ parallel (a 16) */ * from XXX.XXXX a; Table created. Elapsed: 00:01:38.65 SQL> create table enmo_hcc3 nologging parallel 16 compress for archive low as select /*+ parallel (a 16) */ * from XXX.XXXX a; Table created. Elapsed: 00:02:35.13 SQL> create table enmo_hcc4 nologging parallel 16 compress for archive high as select /*+ parallel (a 16) */ * from XXX.XXXX a; Table created. Elapsed: 00:04:44.39
下面看下这几张表分别占用的空间情况。
SQL> select table_name,COMPRESSION,COMPRESS_FOR from user_tables where table_name like 'ENMO%' and table_name not like 'ENMO_A%'; TABLE_NAME COMPRESS COMPRESS_FOR ------------------------------ -------- ------------ ENMO_BASIC ENABLED BASIC ENMO_HCC1 ENABLED QUERY LOW ENMO_HCC2 ENABLED QUERY HIGH ENMO_HCC3 ENABLED ARCHIVE LOW ENMO_HCC4 ENABLED ARCHIVE HIGH ENMO_NOCOMPOSE DISABLED ENMO_OLTP ENABLED OLTP SQL> select segment_name,bytes/1024/1024/1024 from user_segments where segment_name like 'ENMO%'; SEGMENT_NAME BYTES/1024/1024/1024 ------------------------------ -------------------- ENMO_BASIC 19.1474609 ENMO_HCC1 1.69042969 ENMO_HCC2 1.015625 ENMO_HCC3 .686523438 ENMO_HCC4 .59375 ENMO_NOCOMPOSE 20.8486328 ENMO_OLTP 20.8486328
在看下全表扫描以上各表需要的时间。
SQL> select sum(CUST_ID) from ENMO_NOCOMPOSE; SUM(CUST_ID) ----------------------- 9046133444298327 Elapsed: 00:00:03.20 SQL> select sum(CUST_ID) from ENMO_basic; SUM(CUST_ID) ----------------------- 9046133444298327 Elapsed: 00:00:03.90 SQL> select sum(CUST_ID) from ENMO_oltp; SUM(CUST_ID) ----------------------- 9046133444298327 Elapsed: 00:00:04.01 SQL> select sum(CUST_ID) from ENMO_hcc1; SUM(CUST_ID) ----------------------- 9046133444298327 Elapsed: 00:00:01.27 SQL> select sum(CUST_ID) from ENMO_hcc2; SUM(CUST_ID) ----------------------- 9046133444298327 Elapsed: 00:00:01.32 SQL> select sum(CUST_ID) from ENMO_hcc3; SUM(CUST_ID) ----------------------- 9046133444298327 Elapsed: 00:00:01.20 SQL> select sum(CUST_ID) from ENMO_hcc4; SUM(CUST_ID) ----------------------- 9046133444298327 Elapsed: 00:00:01.80
对比情况参见下表。
表名 | 压缩模式 | 表创建时间 | 占用空间(GB) | 全表扫描时间 |
ENMO_NOCOMPOSE | 未压缩 | 00:01:30.51 | 20.8486328 | 00:00:03.20 |
ENMO_BASIC | BASIC | 00:01:15.05 | 19.1474609 | 00:00:03.90 |
ENMO_OLTP | OLTP | 00:01:11.44 | 20.8486328 | 00:00:04.01 |
ENMO_HCC1 | QUERY LOW | 00:01:18.46 | 1.69042969 | 00:00:01.27 |
ENMO_HCC2 | QUERY HIGH | 00:01:38.65 | 1.015625 | 00:00:01.32 |
ENMO_HCC3 | ARCHIVE LOW | 00:02:35.13 | 0.686523438 | 00:00:01.20 |
ENMO_HCC4 | ARCHIVE HIGH | 00:04:44.39 | 0.59375 | 00:00:01.80 |
从表创建时间看,除ARCHIVE模式的压缩,其他压缩模式对表创建时间影响不大,从占用空间看,混合列压缩的压缩率明显要比默认的BASIC压缩模式和高级压缩(OLTP)要好,从全表扫描的时间看,由于混合列压缩后的表占用空间比较少,全秒扫描的数据块要少很多,虽然解压需要消耗CPU等资源,但全表扫描的时间大大减少。
以上测试结果仅针对这张测试表,因为每张表上的数据,重复度都是不一样的,压缩程度也会不一样,本测试使用的是某银行的数据表,可能在其他行业的数据表上测试结果会不同。
混合列压缩虽然好,但也不是所有场景都适用,对OLTP系统来说,混合列压缩可能会带来大量的行级锁、热点块竞争,对频繁DML操作的表也不建议压缩,一是压缩、解压会消耗CPU资源,对单行操作影响较大,而且混合列压缩后的数据,每一个压缩后的数据块里可能含有很多的记录,当修改其中一条记录的时候,整个压缩后的数据块都将被锁住,这样锁的粒度就会相当大。
当对混合列压缩的表做DML操作时,会自动解压为OLTP压缩模式,当数据块写满后,会自动将解压为OLTP压缩模式的数据再压缩成混合列压缩模式。对压缩的表做DML操作,都会涉及到解压缩过程,不论是那种压缩模式,比如更新压缩表的一个字段的所有数据,压缩的表就会被放大,因此压缩技术适用于读操作多,写操作少的表。
从这个测试也能看出EXADATA强悍的性能,20G的表全表扫描只需要3秒多,而且还是在没开并行的情况下。