
GreenPlum Database Expansion: Adding Segments

This article demonstrates adding a Segment node to a GreenPlum database that runs without mirrors. The cluster currently consists of the servers mdw, mdw-std, sdw1, sdw2 and sdw3: mdw is the Master, mdw-std is the Master Standby, and sdw1 through sdw3 are the three Segments.

dbdream=# select * from gp_segment_configuration;
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts 
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+--------
    1 |       0 | p    | p              | s    | u      |  5432 | mdw      | mdw     |                  | 
    2 |       1 | p    | p              | s    | u      | 40000 | sdw1     | sdw1    |                  | 
    3 |       2 | p    | p              | s    | u      | 40000 | sdw2     | sdw2    |                  | 
    4 |       3 | p    | p              | s    | u      | 40000 | sdw3     | sdw3    |                  | 
    5 |      -1 | p    | p              | s    | u      |  5432 | mdw-std  | mdw-std |                  | 
(5 rows)

GreenPlum segments can be added in two ways: on the existing servers, or on brand-new servers. Adding segments on existing servers is simple, you just add them. Adding segments on a new server requires installing the GreenPlum software first, then configuring the time service, SSH trust between hosts, and so on. This article demonstrates adding a Segment node on a new server; the detailed steps follow.

For the software installation, see the first article in the GreenPlum installation series (installing the Master software in non-mirror mode). Once the software is in place, create a file describing the nodes to add; the file name is up to you, and its contents are the hostnames of the new nodes. Those hostnames must be present in the hosts file on every node. If several Segment hosts are being added at once, list one per line; this article adds a single Segment host, sdw4.

[gpadmin@mdw config]$ vi newseg

sdw4
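
If several Segment hosts were being added in one pass, the file would simply list one hostname per line, for example (sdw5 and sdw6 are hypothetical hosts, not part of this walkthrough):

sdw4
sdw5
sdw6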

Run the gpcheck utility to perform the prerequisite checks and see whether the current configuration is ready for the Segment addition.

[gpadmin@mdw config]$ gpcheck -f newseg 
20160227:09:25:46:006901 gpcheck:mdw:gpadmin-[INFO]:-dedupe hostnames
20160227:09:25:46:006901 gpcheck:mdw:gpadmin-[INFO]:-Detected platform: Generic Linux Cluster
20160227:09:25:46:006901 gpcheck:mdw:gpadmin-[INFO]:-generate data on servers
20160227:09:25:47:006901 gpcheck:mdw:gpadmin-[INFO]:-copy data files from servers
20160227:09:25:47:006901 gpcheck:mdw:gpadmin-[INFO]:-delete remote tmp files
20160227:09:25:47:006901 gpcheck:mdw:gpadmin-[INFO]:-Using gpcheck config file: /gpdb/greenplum-db/./etc/gpcheck.cnf
20160227:09:25:47:006901 gpcheck:mdw:gpadmin-[ERROR]:-GPCHECK_ERROR host(None): utility will not check all settings when run as non-root user
20160227:09:25:47:006901 gpcheck:mdw:gpadmin-[ERROR]:-GPCHECK_ERROR host(sdw4): ntpd not detected on machine
20160227:09:25:47:006901 gpcheck:mdw:gpadmin-[INFO]:-gpcheck completing...

This script must be run as root; running it as gpadmin produces the error above.

[root@mdw ~]# gpcheck -f /gpdb/app/config/newseg 
20160227:09:34:24:007203 gpcheck:mdw:root-[INFO]:-dedupe hostnames
20160227:09:34:24:007203 gpcheck:mdw:root-[INFO]:-Detected platform: Generic Linux Cluster
20160227:09:34:24:007203 gpcheck:mdw:root-[INFO]:-generate data on servers
20160227:09:34:25:007203 gpcheck:mdw:root-[INFO]:-copy data files from servers
20160227:09:34:25:007203 gpcheck:mdw:root-[INFO]:-delete remote tmp files
20160227:09:34:25:007203 gpcheck:mdw:root-[INFO]:-Using gpcheck config file: /gpdb/greenplum-db/./etc/gpcheck.cnf
20160227:09:34:25:007203 gpcheck:mdw:root-[ERROR]:-GPCHECK_ERROR host(sdw4): on device (/dev/sda3) blockdev readahead value '256' does not match expected value '16384'
20160227:09:34:25:007203 gpcheck:mdw:root-[ERROR]:-GPCHECK_ERROR host(sdw4): on device (/dev/sda1) blockdev readahead value '256' does not match expected value '16384'
20160227:09:34:25:007203 gpcheck:mdw:root-[ERROR]:-GPCHECK_ERROR host(sdw4): on device (/dev/sda2) blockdev readahead value '256' does not match expected value '16384'
20160227:09:34:25:007203 gpcheck:mdw:root-[ERROR]:-GPCHECK_ERROR host(sdw4): on device (/dev/sda) blockdev readahead value '256' does not match expected value '16384'
20160227:09:34:25:007203 gpcheck:mdw:root-[ERROR]:-GPCHECK_ERROR host(sdw4): ntpd not detected on machine
20160227:09:34:25:007203 gpcheck:mdw:root-[INFO]:-gpcheck completing...

The check reports that the disk read-ahead setting does not meet the requirement: GreenPlum wants a read-ahead of 16384 blocks (some sources suggest 65535 is even better), while this Linux release defaults to 256. Change the read-ahead setting.

[root@sdw4 ~]# /sbin/blockdev --setra 16384 /dev/sda
[root@sdw4 ~]# /sbin/blockdev --setra 16384 /dev/sda1
[root@sdw4 ~]# /sbin/blockdev --setra 16384 /dev/sda2
[root@sdw4 ~]# /sbin/blockdev --setra 16384 /dev/sda3
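
Note that blockdev --setra only changes the running system; the value is lost after a reboot. A common way to make it persistent (my habit, not something gpcheck verifies) is to append the same commands to /etc/rc.d/rc.local on sdw4, for example:

[root@sdw4 ~]# echo '/sbin/blockdev --setra 16384 /dev/sda' >> /etc/rc.d/rc.local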

After the change, run the check again.

[root@mdw ~]# gpcheck -f /gpdb/app/config/newseg 
20160227:09:37:04:007288 gpcheck:mdw:root-[INFO]:-dedupe hostnames
20160227:09:37:04:007288 gpcheck:mdw:root-[INFO]:-Detected platform: Generic Linux Cluster
20160227:09:37:04:007288 gpcheck:mdw:root-[INFO]:-generate data on servers
20160227:09:37:04:007288 gpcheck:mdw:root-[INFO]:-copy data files from servers
20160227:09:37:04:007288 gpcheck:mdw:root-[INFO]:-delete remote tmp files
20160227:09:37:05:007288 gpcheck:mdw:root-[INFO]:-Using gpcheck config file: /gpdb/greenplum-db/./etc/gpcheck.cnf
20160227:09:37:05:007288 gpcheck:mdw:root-[ERROR]:-GPCHECK_ERROR host(sdw4): ntpd not detected on machine
20160227:09:37:05:007288 gpcheck:mdw:root-[INFO]:-gpcheck completing...

This time the check complains about NTP. I have already configured an NTP service and all the servers' clocks are synchronized; presumably my setup is just not the method GreenPlum recommends, so the error can be ignored here. A gpssh run across all hosts confirms the clocks are in sync.

[root@mdw ~]# gpssh -f /gpdb/app/config/hostlist -e date
[mdw-std] date
[mdw-std] Sat Feb 27 09:39:25 CST 2016
[   sdw1] date
[   sdw1] Sat Feb 27 09:39:25 CST 2016
[    mdw] date
[    mdw] Sat Feb 27 09:39:25 CST 2016
[   sdw2] date
[   sdw2] Sat Feb 27 09:39:25 CST 2016
[   sdw4] date
[   sdw4] Sat Feb 27 09:39:25 CST 2016
[   sdw3] date
[   sdw3] Sat Feb 27 09:39:25 CST 2016
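
If you would rather satisfy gpcheck than ignore the error, the complaint is specifically about the ntpd daemon. Assuming a RHEL-style init system, which is what this generation of GreenPlum typically runs on, the daemon can be checked and enabled on the new host along these lines:

[root@sdw4 ~]# service ntpd status
[root@sdw4 ~]# service ntpd start
[root@sdw4 ~]# chkconfig ntpd on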

Now for the segment addition itself. My GreenPlum database was installed under the gpadmin user, so the steps here run as gpadmin as well.

[gpadmin@mdw ~]$ gpexpand -f /gpdb/app/config/newseg 
20160227:09:48:03:007702 gpexpand:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.6.2 build 1'
20160227:09:48:03:007702 gpexpand:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.6.2 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Nov 12 2015 23:50:28'
20160227:09:48:03:007702 gpexpand:mdw:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state
20160227:09:48:04:007702 gpexpand:mdw:gpadmin-[ERROR]:-gpexpand failed: Invalid database 'template1' specified.  Cannot use a template database.
Please set the environment variable PGDATABASE to a different database or use the -D option to specify a database and re-run 

Exiting...

The error gpexpand failed: Invalid database 'template1' specified. Cannot use a template database appears because the PGDATABASE environment variable is not set. If you prefer not to set PGDATABASE, you can name the database directly on the command line with -D; this database is where gpexpand stores the data it needs while adding nodes.

[gpadmin@mdw ~]$ gpexpand -f /gpdb/app/config/newseg -D dbdream
20160227:09:49:27:007820 gpexpand:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.6.2 build 1'
20160227:09:49:27:007820 gpexpand:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.6.2 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Nov 12 2015 23:50:28'
20160227:09:49:27:007820 gpexpand:mdw:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state

System Expansion is used to add segments to an existing GPDB array.
gpexpand did not detect a System Expansion that is in progress.

Before initiating a System Expansion, you need to provision and burn-in
the new hardware.  Please be sure to run gpcheckperf/gpcheckos to make
sure the new hardware is working properly.

Please refer to the Admin Guide for more information.

Would you like to initiate a new System Expansion Yy|Nn (default=N):

Edit the profile file and add the PGDATABASE environment variable.

[gpadmin@mdw ~]$ vi .bash_profile
source /gpdb/app/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/gpdb/gpdata/master/gpseg-1
export PGDATABASE=dbdream
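
The new variable only takes effect in new login shells; to apply it to the current session, source the profile first:

[gpadmin@mdw ~]$ source ~/.bash_profile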

With that in place, the -D option is no longer needed when adding segments.

[gpadmin@mdw ~]$ gpexpand -f /gpdb/app/config/newseg
20160227:09:51:41:007941 gpexpand:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.6.2 build 1'
20160227:09:51:41:007941 gpexpand:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.6.2 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Nov 12 2015 23:50:28'
20160227:09:51:41:007941 gpexpand:mdw:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state

System Expansion is used to add segments to an existing GPDB array.
gpexpand did not detect a System Expansion that is in progress.

Before initiating a System Expansion, you need to provision and burn-in
the new hardware.  Please be sure to run gpcheckperf/gpcheckos to make
sure the new hardware is working properly.

Please refer to the Admin Guide for more information.

Would you like to initiate a new System Expansion Yy|Nn (default=N):

Enter Y to confirm adding the Segment.

> y

    By default, new hosts are configured with the same number of primary
    segments as existing hosts.  Optionally, you can increase the number
    of segments per host.

    For example, if existing hosts have two primary segments, entering a value
    of 2 will initialize two additional segments on existing hosts, and four
    segments on new hosts.  In addition, mirror segments will be added for
    these new primary segments if mirroring is enabled.
    
How many new primary segments per host do you want to add? (default=0):

This prompt sets how many additional primary segments to add per host. With 0, segments are created only on the new host, matching the existing hosts (one segment here); with 1, one more segment is added on every host, including the new sdw4, so sdw4 would end up with two segments. Below is the configuration file produced when 1 is entered; reading it makes this easy to see.

[gpadmin@mdw ~]$ cat gpexpand_inputfile_20160227_095345
sdw4:sdw4:40000:/gpdb/gpdata/primary/gpseg3:6:3:p
sdw1:sdw1:40001:/gpdb/gpdata/primary/gpseg4:7:4:p
sdw2:sdw2:40001:/gpdb/gpdata/primary/gpseg5:8:5:p
sdw3:sdw3:40001:/gpdb/gpdata/primary/gpseg6:9:6:p
sdw4:sdw4:40001:/gpdb/gpdata/primary/gpseg7:10:7:p

With 0, the file would contain only the first line; with 1, every Segment host gains one more segment, ending up with two segments apiece.

Continuing the walkthrough, enter 0 here; since 0 is the default, simply pressing Enter works as well.

> 0
Enter new primary data directory 1:
> /gpdb/gpdata/primary
Enter new file space location for file space name: fspc1:

Enter the storage path for the filespace fspc1 here; it is best to keep it identical to the path used on the other nodes.

> /gpdb/gpdata/fspc_segment

Generating configuration file...

20160227:09:57:49:008121 gpexpand:mdw:gpadmin-[INFO]:-Generating input file...
20160227:09:57:49:008121 gpexpand:mdw:gpadmin-[INFO]:-Generating filespaces input file...

Input configuration files were written to 'gpexpand_inputfile_20160227_095749' and 'gpexpand_inputfile_20160227_095749.fs'.
Please review the file and make sure that it is correct then re-run
with: gpexpand -i gpexpand_inputfile_20160227_095749 -D dbdream
                
20160227:09:57:49:008121 gpexpand:mdw:gpadmin-[INFO]:-Exiting...

Inspect the generated configuration file: because 0 was entered above, it contains only a single line.

[gpadmin@mdw ~]$ cat gpexpand_inputfile_20160227_095749
sdw4:sdw4:40000:/gpdb/gpdata/primary/gpseg3:6:3:p
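
For reference, the colon-separated fields in each line appear to be the following (inferred from the generated files; consult the gpexpand documentation for the authoritative format):

hostname:address:port:data_directory:dbid:content_id:preferred_role
sdw4:sdw4:40000:/gpdb/gpdata/primary/gpseg3:6:3:p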

The gpexpand run above does not add any segments; it only generates the expansion configuration files. If you know the file format well, you can write the files by hand and skip the interactive step entirely. Once the files are generated, gpexpand prompts you to run gpexpand -i gpexpand_inputfile_20160227_095749 -D dbdream; it is this gpexpand -i invocation that actually adds the Segment nodes.
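
One thing worth verifying before running gpexpand -i: the parent directories for the new segment's data directory and filespace (here /gpdb/gpdata/primary and /gpdb/gpdata/fspc_segment) must already exist on sdw4 and be owned by gpadmin, or the expansion fails early. Something like the following takes care of it:

[gpadmin@mdw ~]$ gpssh -h sdw4 -e 'mkdir -p /gpdb/gpdata/primary /gpdb/gpdata/fspc_segment'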

Adding the Segment node then hit the following error:

[gpadmin@mdw ~]$ gpexpand -i ./gpexpand_inputfile_20160227_095749 -D dbdream -S -V -v -n 1 -B 1 -t /tmp

20160227:10:02:20:008290 gpexpand:mdw:gpadmin-[INFO]:-Configuring new segments (primary)
20160227:10:02:28:008290 gpexpand:mdw:gpadmin-[INFO]:-Configuring new segments (mirror)
20160227:10:02:38:008290 gpexpand:mdw:gpadmin-[ERROR]:-gpexpand failed. exiting...
Traceback (most recent call last):
  File "/gpdb/greenplum-db/./bin/gpexpand", line 3089, in <module>
    gp_expand.update_catalog()
  File "/gpdb/greenplum-db/./bin/gpexpand", line 1669, in update_catalog
    allowPrimary = True
  File "/gpdb/greenplum-db/lib/python/gppylib/system/configurationImplGpdb.py", line 121, in updateSystemConfig
    update.validate()
  File "/gpdb/greenplum-db/lib/python/gppylib/system/ComputeCatalogUpdate.py", line 181, in validate
    raise Exception("Primary must have mirror when mirroring enabled: %s" % repr(seg))
Exception: Primary must have mirror when mirroring enabled: 2|0|p|p|s|u|sdw1|sdw1|40000|None|/gpdb/gpdata/primary/gpseg0|17196:/gpdb/gpdata/fspc_segment/gpseg0|/gpdb/gpdata/primary/gpseg0/base/17198,/gpdb/gpdata/primary/gpseg0/base/10899,/gpdb/gpdata/fspc_segment/gpseg0/17199/17226,/gpdb/gpdata/primary/gpseg0/base/1,/gpdb/gpdata/primary/gpseg0/base/10898
20160227:10:02:38:008290 gpexpand:mdw:gpadmin-[ERROR]:-Please run 'gpexpand -r -D dbdream' to rollback to the original state.
20160227:10:02:38:008290 gpexpand:mdw:gpadmin-[INFO]:-Shutting down gpexpand...
20160227:10:02:38:008290 gpexpand:mdw:gpadmin-[DEBUG]:-WorkerPool haltWork()
20160227:10:02:38:008290 gpexpand:mdw:gpadmin-[DEBUG]:-[worker0] haltWork
20160227:10:02:41:008290 gpexpand:mdw:gpadmin-[DEBUG]:-[worker0] stopping

Adding the Segment node failed with the error above; the message says to roll the operation back with gpexpand -r -D dbdream.

[gpadmin@mdw ~]$ gpexpand -r -D dbdream
20160227:16:12:25:015950 gpexpand:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.6.2 build 1'
20160227:16:12:25:015950 gpexpand:mdw:gpadmin-[ERROR]:-gpexpand failed: could not connect to server: Connection refused
        Is the server running on host "localhost" (::1) and accepting
        TCP/IP connections on port 5432?
could not connect to server: Connection refused
        Is the server running on host "localhost" (127.0.0.1) and accepting
        TCP/IP connections on port 5432?

The rollback itself hit the error above: the database had already shut down. Starting the database then produced the error below.

[gpadmin@mdw ~]$ gpstart -a
20160227:16:13:14:015962 gpstart:mdw:gpadmin-[INFO]:-Starting gpstart with args: -a
20160227:16:13:14:015962 gpstart:mdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20160227:16:13:14:015962 gpstart:mdw:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 4.3.6.2 build 1'
20160227:16:13:14:015962 gpstart:mdw:gpadmin-[INFO]:-Greenplum Catalog Version: '201310150'
20160227:16:13:14:015962 gpstart:mdw:gpadmin-[INFO]:-Starting Master instance in admin mode
20160227:16:13:15:015962 gpstart:mdw:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20160227:16:13:15:015962 gpstart:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20160227:16:13:15:015962 gpstart:mdw:gpadmin-[INFO]:-Setting new master era
20160227:16:13:15:015962 gpstart:mdw:gpadmin-[INFO]:-Master Started...
20160227:16:13:15:015962 gpstart:mdw:gpadmin-[INFO]:-Shutting down master
20160227:16:13:16:015962 gpstart:mdw:gpadmin-[ERROR]:-gpstart error: Found a System Expansion Setup in progress. Please run 'gpexpand --rollback'

When starting the database, gpstart reports that the expansion setup must be rolled back first and refuses to start. To run the rollback, the database has to be brought up in restricted mode.

[gpadmin@mdw ~]$ gpstart -R

gpstart -R brings the database up in restricted mode, after which the earlier segment-addition attempt can be rolled back.

[gpadmin@mdw ~]$ gpexpand -r -D dbdream
20160227:16:14:28:016133 gpexpand:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.6.2 build 1'
20160227:16:14:28:016133 gpexpand:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.6.2 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Nov 12 2015 23:50:28'
20160227:16:14:47:016133 gpexpand:mdw:gpadmin-[INFO]:-Rolling back building of new segments
20160227:16:14:47:016133 gpexpand:mdw:gpadmin-[INFO]:-Rolling back segment template build
20160227:16:14:49:016133 gpexpand:mdw:gpadmin-[INFO]:-Rollback complete.  Greenplum Database can now be started

Then restart the database. The failed expansion threw Primary must have mirror when mirroring enabled: 2|0|p|p|s|u|sdw1|sdw1|40000|None|/gpdb/gpdata/primary/gpseg0|..., yet none of the Segments in this cluster has a mirror configured. The Master, however, does have a Standby. My guess is that the Standby is the cause: because the Master has a Standby, GreenPlum treats the cluster as mirror-enabled and therefore expects every primary segment to have a mirror as well, although I could not find any documentation confirming this.

To test this, remove the Master's Standby and try the segment addition again; gpinitstandby -r removes the Standby. I did not actually remove the Standby. Instead I performed a Master/Standby switchover, promoting the Standby to become the new Master and dropping the original Master from the cluster. See the earlier article GreenPlum数据库Master节点Standby同步与切换 for details.

After the switchover, the original Master is no longer part of the cluster, and the former Standby is the new Master.

dbdream=# select * from gp_segment_configuration;
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts 
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+--------
    2 |       0 | p    | p              | s    | u      | 40000 | sdw1     | sdw1    |                  | 
    3 |       1 | p    | p              | s    | u      | 40000 | sdw2     | sdw2    |                  | 
    4 |       2 | p    | p              | s    | u      | 40000 | sdw3     | sdw3    |                  | 
    1 |      -1 | p    | p              | s    | u      |  5432 | mdw-std  | mdw-std |                  | 
(4 rows)

Adding the Segment node again went through without a hitch, confirming the earlier guess: when the Master has a Standby, GreenPlum considers the database mirror-enabled while adding segments.

[gpadmin@mdw-std ~]$ gpexpand -i gpexpand_inputfile_20160227_173917 -D dbdream
20160227:17:41:00:004181 gpexpand:mdw-std:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.6.2 build 1'
20160227:17:41:00:004181 gpexpand:mdw-std:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.6.2 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Nov 12 2015 23:50:28'
20160227:17:41:00:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state
20160227:17:41:00:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Readying Greenplum Database for a new expansion
20160227:17:41:18:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Checking database dbdream for unalterable tables...
20160227:17:41:18:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Checking database postgres for unalterable tables...
20160227:17:41:18:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Checking database tt for unalterable tables...
20160227:17:41:18:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Checking database template1 for unalterable tables...
20160227:17:41:18:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Checking database dbdream for tables with unique indexes...
20160227:17:41:18:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Checking database postgres for tables with unique indexes...
20160227:17:41:18:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Checking database tt for tables with unique indexes...
20160227:17:41:18:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Checking database template1 for tables with unique indexes...
20160227:17:41:18:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Syncing Greenplum Database extensions
20160227:17:41:19:004181 gpexpand:mdw-std:gpadmin-[INFO]:-The packages on sdw4 are consistent.
20160227:17:41:19:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Creating segment template
20160227:17:41:19:004181 gpexpand:mdw-std:gpadmin-[INFO]:-VACUUM FULL on the catalog tables
20160227:17:41:21:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Starting copy of segment dbid 1 to location /gpdb/gpdata/master/gpexpand_02272016_4181
20160227:17:41:23:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Cleaning up catalog for schema only copy on destination
20160227:17:41:23:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Copying postgresql.conf from existing segment into template
20160227:17:41:23:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Copying pg_hba.conf from existing segment into template
20160227:17:41:23:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Adding new segments into template pg_hba.conf
20160227:17:41:23:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Creating schema tar file
20160227:17:41:26:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Distributing template tar file to new hosts
20160227:17:41:28:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Configuring new segments (primary)
20160227:17:41:32:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Configuring new segments (mirror)
20160227:17:41:32:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Backing up pg_hba.conf file on original segments
20160227:17:41:32:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Copying new pg_hba.conf file to original segments
20160227:17:41:32:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Configuring original segments
20160227:17:41:32:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Cleaning up temporary template files
20160227:17:41:33:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Starting Greenplum Database in restricted mode
20160227:17:41:38:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Stopping database
20160227:17:41:55:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Checking if Transaction filespace was moved
20160227:17:41:55:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Checking if Temporary filespace was moved
20160227:17:41:55:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Configuring new segment filespaces
20160227:17:41:56:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Cleaning up databases in new segments.
20160227:17:41:56:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Starting master in utility mode
20160227:17:41:57:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Stopping master in utility mode
20160227:17:42:01:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Starting Greenplum Database in restricted mode
20160227:17:42:07:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Creating expansion schema
20160227:17:42:07:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Populating gpexpand.status_detail with data from database dbdream
20160227:17:42:08:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Populating gpexpand.status_detail with data from database postgres
20160227:17:42:08:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Populating gpexpand.status_detail with data from database tt
20160227:17:42:08:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Populating gpexpand.status_detail with data from database template1
20160227:17:42:09:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Stopping Greenplum Database
20160227:17:42:26:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Starting Greenplum Database
20160227:17:42:31:004181 gpexpand:mdw-std:gpadmin-[INFO]:-************************************************
20160227:17:42:31:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Initialization of the system expansion complete.
20160227:17:42:31:004181 gpexpand:mdw-std:gpadmin-[INFO]:-To begin table expansion onto the new segments
20160227:17:42:31:004181 gpexpand:mdw-std:gpadmin-[INFO]:-rerun gpexpand
20160227:17:42:31:004181 gpexpand:mdw-std:gpadmin-[INFO]:-************************************************
20160227:17:42:31:004181 gpexpand:mdw-std:gpadmin-[INFO]:-Exiting...

Querying the gp_segment_configuration table shows that sdw4 has been added to the database.

dbdream=# select * from gp_segment_configuration;
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts 
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+--------
    2 |       0 | p    | p              | s    | u      | 40000 | sdw1     | sdw1    |                  | 
    3 |       1 | p    | p              | s    | u      | 40000 | sdw2     | sdw2    |                  | 
    4 |       2 | p    | p              | s    | u      | 40000 | sdw3     | sdw3    |                  | 
    5 |       3 | p    | p              | s    | u      | 40000 | sdw4     | sdw4    |                  | 
    1 |      -1 | p    | p              | s    | u      |  5432 | mdw-std  | mdw-std |                  | 
(5 rows)

Adding the Segment does not finish the job: a data redistribution is still required, or the existing data never reaches the new segment. Querying at this point shows the new segment holds no data; take the ORD_PAY table as an example, where the new segment (content ID 3) does not even appear in the per-segment counts.

dbdream=# select gp_segment_id,count(*) from ORD_PAY group by 1;
 gp_segment_id | count  
---------------+--------
             2 | 333385
             1 | 333259
             0 | 333356
(3 rows)
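
Which segment a row lands on is decided by the table's distribution key. A quick way to check a table's key is \d, whose footer shows the Distributed by clause; for ORD_PAY it is the id column (output abbreviated here):

dbdream=# \d ord_pay
...
Distributed by: (id)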

Because the -D option was specified when adding the Segment node, the information needed during the expansion was recorded in the database given by -D. The \dn command shows the newly created gpexpand schema.

dbdream=# \dn
       List of schemas
        Name        |  Owner  
--------------------+---------
 gp_toolkit         | gpadmin
 gpexpand           | gpadmin
 information_schema | gpadmin
 pg_aoseg           | gpadmin
 pg_bitmapindex     | gpadmin
 pg_catalog         | gpadmin
 pg_toast           | gpadmin
 public             | gpadmin
(8 rows)

This schema contains two tables, status and status_detail, plus the expansion_progress view built on top of them.

dbdream=# \d gpexpand.*
View "gpexpand.expansion_progress"
 Column | Type | Modifiers 
--------+------+-----------
 name   | text | 
 value  | text |

              Table "gpexpand.status"
 Column  |            Type             | Modifiers 
---------+-----------------------------+-----------
 status  | text                        | 
 updated | timestamp without time zone | 
Distributed by: (status)

                    Table "gpexpand.status_detail"
           Column            |            Type             | Modifiers 
-----------------------------+-----------------------------+-----------
 dbname                      | text                        | 
 fq_name                     | text                        | 
 schema_oid                  | oid                         | 
 table_oid                   | oid                         | 
 distribution_policy         | smallint[]                  | 
 distribution_policy_names   | text                        | 
 distribution_policy_coloids | text                        | 
 storage_options             | text                        | 
 rank                        | integer                     | 
 status                      | text                        | 
 expansion_started           | timestamp without time zone | 
 expansion_finished          | timestamp without time zone | 
 source_bytes                | numeric                     | 
Distributed by: (dbname)

The status_detail table records the redistribution details of every table in each database.

dbdream=# select dbname,fq_name,distribution_policy_names,rank,status from gpexpand.status_detail;
  dbname   |         fq_name          | distribution_policy_names | rank |   status    
-----------+--------------------------+---------------------------+------+-------------
 dbdream   | public.ord_pay_errs      | None                      |    2 | NOT STARTED
 dbdream   | public.ord_pay           | id                        |    2 | NOT STARTED
 dbdream   | public.med_ord_pgm_d_err | None                      |    2 | NOT STARTED
 dbdream   | public.med_ord_pgm_d     | id                        |    2 | NOT STARTED
 dbdream   | public.ord_pay_s         | id                        |    2 | NOT STARTED
 template1 | public.tt                | id                        |    2 | NOT STARTED
 tt        | public.tt                | id                        |    2 | NOT STARTED
(7 rows)
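
The rank column controls the order in which gpexpand redistributes tables: lower ranks are processed first. If a large, frequently queried table should move first, its rank can be updated in this schema before starting the redistribution; a hypothetical example:

dbdream=# update gpexpand.status_detail set rank=1 where fq_name='public.ord_pay';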

The above is what status_detail records before any redistribution has been run. Now perform the redistribution.

[gpadmin@mdw-std ~]$ gpexpand -d 00:05:00
20160303:17:24:54:012524 gpexpand:mdw-std:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.6.2 build 1'
20160303:17:24:54:012524 gpexpand:mdw-std:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.6.2 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Nov 12 2015 23:50:28'
20160303:17:24:54:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state
20160303:17:24:55:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Distribution policy for table public.ord_pay is 'id' 
20160303:17:24:55:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Expanding dbdream.public.ord_pay
20160303:17:24:56:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Finished expanding dbdream.public.ord_pay
20160303:17:24:56:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Distribution policy for table public.med_ord_pgm_d_err is 'None' 
20160303:17:24:56:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Expanding dbdream.public.med_ord_pgm_d_err
20160303:17:24:56:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Finished expanding dbdream.public.med_ord_pgm_d_err
20160303:17:24:56:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Distribution policy for table public.med_ord_pgm_d is 'id' 
20160303:17:24:56:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Expanding dbdream.public.med_ord_pgm_d
20160303:17:24:59:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Finished expanding dbdream.public.med_ord_pgm_d
20160303:17:24:59:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Distribution policy for table public.ord_pay_s is 'id' 
20160303:17:24:59:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Expanding dbdream.public.ord_pay_s
20160303:17:24:59:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Finished expanding dbdream.public.ord_pay_s
20160303:17:24:59:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Distribution policy for table public.tt is 'id' 
20160303:17:24:59:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Expanding template1.public.tt
20160303:17:25:00:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Finished expanding template1.public.tt
20160303:17:25:00:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Distribution policy for table public.ord_pay_errs is 'None' 
20160303:17:25:00:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Expanding dbdream.public.ord_pay_errs
20160303:17:25:00:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Finished expanding dbdream.public.ord_pay_errs
20160303:17:25:00:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Distribution policy for table public.tt is 'id' 
20160303:17:25:00:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Expanding tt.public.tt
20160303:17:25:01:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Finished expanding tt.public.tt
20160303:17:25:11:012524 gpexpand:mdw-std:gpadmin-[INFO]:-EXPANSION COMPLETED SUCCESSFULLY
20160303:17:25:11:012524 gpexpand:mdw-std:gpadmin-[INFO]:-Exiting...

The -d option of gpexpand sets the redistribution duration, i.e., the maximum time the run is allowed to take, in hh:mm:ss format. After the redistribution finishes, check its status again.

dbdream=# select dbname,fq_name,status,expansion_started,expansion_finished from gpexpand.status_detail;
  dbname   |         fq_name          |  status   |     expansion_started      |     expansion_finished     
-----------+--------------------------+-----------+----------------------------+----------------------------
 dbdream   | public.ord_pay           | COMPLETED | 2016-03-03 17:24:55.18742  | 2016-03-03 17:24:56.499739
 dbdream   | public.med_ord_pgm_d_err | COMPLETED | 2016-03-03 17:24:56.583678 | 2016-03-03 17:24:56.726709
 dbdream   | public.med_ord_pgm_d     | COMPLETED | 2016-03-03 17:24:56.809434 | 2016-03-03 17:24:59.534905
 dbdream   | public.ord_pay_s         | COMPLETED | 2016-03-03 17:24:59.616341 | 2016-03-03 17:24:59.67507
 template1 | public.tt                | COMPLETED | 2016-03-03 17:24:59.75698  | 2016-03-03 17:25:00.326403
 dbdream   | public.ord_pay_errs      | COMPLETED | 2016-03-03 17:25:00.417212 | 2016-03-03 17:25:00.531194
 tt        | public.tt                | COMPLETED | 2016-03-03 17:25:00.617067 | 2016-03-03 17:25:01.228834
(7 rows)

Query the data distribution of the ORD_PAY table again.

dbdream=# select gp_segment_id,count(*) from public.ord_pay group by gp_segment_id;
 gp_segment_id | count  
---------------+--------
             3 | 250000
             1 | 249999
             2 | 250001
             0 | 250000
(4 rows)
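
Since the rows have physically moved, the planner statistics of the redistributed tables are stale. Running ANALYZE on them afterwards (or passing -a to gpexpand so it analyzes each table as it finishes, an option not used in this walkthrough) is worthwhile:

dbdream=# analyze ord_pay;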

The data is now spread onto the new segment. After redistribution, run gpexpand -c to clean up the data produced during redistribution, namely the gpexpand schema.

[gpadmin@mdw-std ~]$ gpexpand -c
20160303:17:28:31:012962 gpexpand:mdw-std:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.6.2 build 1'
20160303:17:28:31:012962 gpexpand:mdw-std:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.6.2 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Nov 12 2015 23:50:28'
20160303:17:28:31:012962 gpexpand:mdw-std:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state

Do you want to dump the gpexpand.status_detail table to file? Yy|Nn (default=Y):

Enter Y to confirm the cleanup.

> y
20160303:17:28:34:012962 gpexpand:mdw-std:gpadmin-[INFO]:-Dumping gpexpand.status_detail to /gpdb/gpdata/master/gpseg-1/gpexpand.status_detail
20160303:17:28:34:012962 gpexpand:mdw-std:gpadmin-[INFO]:-Removing gpexpand schema
20160303:17:28:34:012962 gpexpand:mdw-std:gpadmin-[INFO]:-Cleanup Finished.  exiting...

Once the cleanup finishes, verify in the database that the gpexpand metadata is really gone.

dbdream=# \dn
       List of schemas
        Name        |  Owner  
--------------------+---------
 gp_toolkit         | gpadmin
 information_schema | gpadmin
 pg_aoseg           | gpadmin
 pg_bitmapindex     | gpadmin
 pg_catalog         | gpadmin
 pg_toast           | gpadmin
 public             | gpadmin
(7 rows)

The gpexpand schema has been dropped; with that, the segment addition is complete.
