CSSCAN字符集检查工具
我们在创建数据库的时候,通常会把字符集设置为ZHS16GBK,有些时候,ZHS16GBK字符集满足不了业务的需求,会更换字符集,以AL32UTF8为例,ZHS16GBK字符集每个汉字以2个字符存放,而AL32UTF8字符集每个汉字是以3个自己存放,这样,有可能就要增大某些表的字段长度。ORACLE提供了CSSCAN工具来查看字符集转换的时候哪些字段需要增大到多大的长度。(当然,也可以写PL/SQL将所有的表为CHAR,VARCHAR2的字段都增大1.5倍),既然ORACLE提供了CSSCAN工具,我们就要学着使用。
在我们创建数据库的时候,CSSCAN工具一般不会被安装,我们需要手动执行$ORACLE_HOME/rdbms/admin/csminst.sql手动创建,这个操作很简单:
SQL> @?/rdbms/admin/csminst.sql 用户已创建。 授权成功。 … …
有可能会遇到如下的错误
drop public synonym csm$parameters * 第 1 行出现错误: ORA-01432: 要删除的公用同义词不存在
查看csminst.sql文件,会发现
drop public synonym csm$parameters / create public synonym csm$parameters for csmig.csm$parameters /
这个错误是由于csminst.sql这个脚本在创建同义词csm$parameters之前,会先删除这个同义词,然后在创建,也就是如果这个同义词存在,先删除,然后再创建,这个错误可以忽略。
CSSCAN工具安装完后,就可以使用CSSCAN工具查看字符集转换之后,哪些字段需要增大长度。操作也很简单,我们拿T_FONDS_INFO表为例:
SQL> desc T_FONDS_INFO 名称 是否为空? 类型 ----------------------------------------- -------- ------------------- FOND_CODE VARCHAR2(2) FOND_NAME VARCHAR2(33) FONDS_NUMBER VARCHAR2(40) ADMITER VARCHAR2(3) TERM VARCHAR2(35) FILE_SUM NUMBER PRINCIPLE VARCHAR2(315) FOND_INTRO CLOB REMARK VARCHAR2(70)
利用CSSCAN查看哪些字段需要增大
C:Documents and SettingsAdministrator>csscan system/oracle table= T_FONDS_INFO tochar= al32utf8 log=d:css.log Character Set Scanner v2.2 : Release 11.2.0.1.0 - Production on 星期五 9月 23 16:36:01 2011 Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – Production With the Partitioning, OLAP, Data Mining and Real Application Testing options Enter array fetch buffer size: 1024000 --默认为1M,可根据实际情况输入大一点的值 Enter number of scan processes to utilize(1..64): 1 --多少个进程来检查 Enumerating tables to scan... . process 1 scanning STREAM.T_FONDS_INFO[AAASgRAAEAAAADwAAA] Creating Database Scan Summary Report... Creating Individual Exception Report... Scanner terminated successfully.
执行完之后会在D:下看到css.log.err、css.log1.out和css.log1.txt文件,检查css.log1.err文件可以看到哪些字段需要修改到多大。例如:
User : STREAM Table : T_FONDS_INFO Column: TERM Type : VARCHAR2(35) Number of Exceptions : 3 Max Post Conversion Data Size: 46
此部分说明STREAM下的T_FONDS_INFO表的TERM字段更改为AL32UTF8字符集之后,有3条数据需要增大字段长度,需要从原长度35增大到46,下面是数据在AL32UTF8字符集下的实际长度:
ROWID Exception Type Size Cell Data(first 30 bytes) ------------------ ------------------ ----- ------------------------------ AAASgRAAEAAAAD0AAB exceed column size 46 光绪二十七年1901--光绪二十八年 AAASgRAAEAAAAD1AAB exceed column size 40 光绪二十八年1902--宣统三年1911 AAASgRAAEAAAAD1AAC exceed column size 40 光绪二十一年1895--宣统四年1912 ------------------ ------------------ ----- ------------------------------
可以通过csscan help=y查看CSSCAN的各个参数
C:Documents and SettingsAdministrator>csscan help=y You can let Scanner prompt you for parameters by entering the CSSCAN command followed by your username/password: Example: CSSCAN "SYSTEM/MANAGER AS SYSDBA" Or, you can control how Scanner runs by entering the CSSCAN command followed by various parameters. To specify parameters, you use keywords: Example: CSSCAN "SYSTEM/MANAGER AS SYSDBA" FULL=y TOCHAR=utf8 ARRAY=1024000 PROCESS=3 Keyword Default Prompt Description ---------- ------- ------ ------------------------------------------------- USERID yes username/password FULL N yes scan entire database USER yes owner of tables to be scanned TABLE yes list of tables to scan COLUMN yes list of columns to scan EXCLUDE list of tables to exclude from scan TOCHAR yes new database character set name FROMCHAR current database character set name TONCHAR new national character set name FROMNCHAR current national character set name ARRAY 1024000 yes size of array fetch buffer PROCESS 1 yes number of concurrent scan process MAXBLOCKS split table if block size exceed MAXBLOCKS CAPTURE N capture convertible data SUPPRESS maximum number of exceptions logged for each table FEEDBACK report progress every N rows BOUNDARIES list of column size boundaries for summary report LASTRPT N generate report of the last database scan LOG scan base file name of report files PARFILE parameter file name PRESERVE N preserve existing scan results LCSD N no enable language and character set detection LCSDDATA LOSSY no define the scope of the detection HELP N show help screen (this screen) QUERY N select clause to scan subset of tables or columns Scanner terminated successfully.
Linux系统下运行CSSCAN工具可能会遇到csscan: error while loading shared libraries: libclntsh.so.11.1: cannot open shared object file: No such file or directory错误,可以用下面的方法解决:
[oracl e@dbserver2 bin]$ env|grep LD_LIB LD_LIBRARY_PATH=/etc/emc/rsa/cst/lib [oracle@dbserver2 bin]$export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH
当然,也可以利用数据字典生成脚本直接修改所有表的CHAR类型的字段长度为原值的1.5倍,例如:
创建spool.sql文件,内容如下:
set heading off set feedback OFF set echo off set pages 1000 lines 120 spool d:length.sql select 'alter table '||table_name||' modify'||'('||COLUMN_NAME||' '||data_type ||'('||round(DATA_LENGTH*1.5)||'));' from user_tab_columns where data_type like '%CHAR%'; spool off
在SQL*PLUS里执行
SQL> @d:spool.sql
这样生成的length.sql文件内容如下:
alter table T_SPECIAL modify(CLASS_NUM VARCHAR2(45)); alter table T_SPECIAL modify(CLASS_NUM_A VARCHAR2(45)); alter table T_FONDS_INFO_RAW modify(FOND_ID VARCHAR2(15)); alter table T_FONDS_INFO_RAW modify(FOND_CODE VARCHAR2(60));
直接在SQL*PLUS里面运行D:length.sql即可讲所有表的%CHAR%类型的字段长度都增大1.5倍。