IBM/ddlmig

Name: ddlmig

Owner: International Business Machines

Description: Db2 DDL migration and data migration through HPU

Created: 2018-02-22 19:28:33.0

Updated: 2018-02-24 20:59:37.0

Pushed: 2018-02-28 13:47:52.0

Homepage:

Size: 62

Language: Java


README

ddlmig

Db2 DDL migration and data migration through HPU

This tool migrates DDL and data from one Db2 system to another Db2 system, such as IBM Integrated Analytics System.

  1. The tool is designed so that each table and its partitioned and local indexes (as many as possible) reside in their own table spaces on the target Db2 system.

  2. Each non-partitioned index resides in its own table space whose name includes the index name.

  3. The table space naming conventions are:

  4. “SCHEMA_TABLE_DATA” for data and “SCHEMA_TABLE_IDX” for local indexes.

  5. “SCHEMA_TABLE_PARTNAME_DATA” for partitioned table data.

  6. “SCHEMA_TABLE_PARTNAME_IDX” for partitioned indexes.

  7. “SCHEMA_INDEXNAME_IDX” for non-partitioned indexes (one table space per index).

  8. The overall storage groups are defined as follows:

CREATE STOGROUP stogroupdata ON '/data/data1','/local/data2';
CREATE STOGROUP stogroupidx ON '/data/idx1';
CREATE STOGROUP stogrouptemp ON '/data/temp';
  9. The buffer pools are limited to just 3, as shown:
    CREATE BUFFERPOOL BP_DATA_32K PAGESIZE 32768;
    CREATE BUFFERPOOL BP_IDX_32K PAGESIZE 32768;
    CREATE BUFFERPOOL BP_TEMP_32K PAGESIZE 32768;
    
    The names of the storage groups, their mount points, and the buffer pools can be specified through the src.properties file. For example:
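    (A sketch only; these property names match the src.properties listing later in this README, and the paths match the storage group example above.)

    dataBufferpool=BP_DATA_32K,32768
    idxBufferpool=BP_IDX_32K,32768
    tempBufferpool=BP_TEMP_32K,32768
    stogroupdata=stogroupdata|'/data/data1','/local/data2'
    stogroupidx=stogroupidx|'/data/idx1'
    stogrouptemp=stogrouptemp|'/data/temp'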

Note: These are best practices, and it is not necessary to adhere to them. The tool is designed to keep moving parts to a minimum while still gaining maximum benefit.

The main contentious point may be putting each table and index in its own table space. Wherever there are choices, there will be as many differing opinions.

From a recoverability perspective, it makes things much easier. With automatic storage, managing table spaces is no longer a hassle, and it gives much better recoverability options.

src.properties

server=<IPAddressOfTheSourceDatabase>
port=<DB2SourceDatabasePortNumber>
dbname=<DB2SourceDatabaseName>
userid=<DB2SourceDatabaseUserID>
# You can also specify a plain text password instead of an encrypted password.
# To generate the encrypted password, use the crypt.sh utility.
password=<EncryptedDB2SourceDatabasePassword> # use utility crypt.sh to encrypt the password
autoCommit=true
batchSize=1000
enableSysplexWLB=false
CONNECTNODE=0
currentSchema=<DB2SourceDatabaseSchemaNameForMigration>
applicationName=<ApplicationName-WriteAnyName>
clientUser=<WriteAnyNameforClientUser>
clientAccountingInformation=<WriteAnyNameforAccounting>
clientHostname=<WriteSourceServerHostName>
# If all schemas are required then specify ALL, otherwise use a comma separated list
migrationSchemaList=<CommaSeparatedListofSchema|ALL>
# Initial size increment - increase by a factor
initialSizeIncrement=0.50 # This is the factor by which you would get the size of the initial table space
# How many target MLNs are there
targetMLNCount=<TotalNumberofDatabasePartitions> # Just count the number of lines in db2nodes.cfg and that is the MLN count
# If a bufferpool is defined, we will use this and not the one from the source
dataBufferpool=BP_DATA_32K,32768 # Name of the data bufferpool and its page size
idxBufferpool=BP_IDX_32K,32768 # Name of the index bufferpool and its page size
tempBufferpool=BP_TEMP_32K,32768 # Name of the temporary bufferpool and its page size
# Extent size of the target table spaces
extentSize=8
# If mapdbpg is defined then we will use this and not the one from the source
mapdbpg=ALLDATANODES:DATANODES:1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52|SINGLEDATANODE:FIRSTHOST:0
stogroupdata=stogroupdata|'/data/data1','/local/data1' # The paths for the data storage group. Comma separated list for multiple paths
stogroupidx=stogroupidx|'/local/idx1' # The paths for the index storage group. Comma separated list for multiple paths
stogrouptemp=stogrouptemp|'/data/temp1' # The paths for the temporary storage group. Comma separated list for multiple paths

The initial size of each table space is calculated from the source database. This is an important step: allocating the required space up front keeps Db2 and GPFS from competing and makes the data load faster. It is a good practice. The src.properties file has a parameter initialSizeIncrement through which you can scale the initial size calculation up or down.
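For example, assuming the factor simply multiplies the size computed from the source (an assumption, since the README does not spell out the formula):

initial size = computed source size x initialSizeIncrement = 100 GB x 0.50 = 50 GB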

HPU Installation

HPU needs to be installed on all hosts on both the source and the target. Please make sure that the HPU version is the same on all source and target hosts.

The example given here is on IBM Integrated Analytics System (IIAS).

cat /bin/runall
#!/bin/bash

NODES="node0101-fab node0102-fab node0103-fab"

for host in $NODES
do
   echo Running command on $host
   ssh $host "$@"
done

chmod +x /bin/runall
scp /bin/runall node0102-fab:/bin
scp /bin/runall node0103-fab:/bin
runall "rpm -ivh /scratch/xinetd-2.3.15-13.el7.ppc64le.rpm"
runall "systemctl enable xinetd"
runall "systemctl start xinetd"
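These runall invocations assume password-less ssh from this host to the other hosts; a quick sanity check (hostname is just an arbitrary command):

runall "hostname"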
Install HPU on all hosts

Assuming that you have copied the HPU GA and fix pack images to /scratch/hpu:

runall /scratch/hpu/HPU6100/install_hpu.sh -d /opt/ibm/HPU/V6.1 -s
runall /scratch/hpu/HPU6101/install_hpu.sh -d /opt/ibm/HPU/V6.1 -s

Add 3 entries in /etc/xinetd.d/db2hpudm61 for better performance:

cps             = 5000 10
instances       = 5000
per_source      = 100

For example:

service db2hpudm61
{
disable         = no
flags           = REUSE
socket_type     = stream
protocol        = tcp
wait            = no
user            = root
server          = /opt/ibm/HPU/V6.1/bin/db2hpudm
server_args     = --tophpu /opt/ibm/HPU/V6.1 --loglevel 3 --inetd --logfile /var/log/hpu/db2hpudm61.log
log_on_failure += USERID HOST
log_on_success += USERID PID HOST DURATION
cps             = 5000 10
instances       = 5000
per_source      = 100
}

And do not forget to copy this file to all hosts:

scp /etc/xinetd.d/db2hpudm61 node0102-fab:/etc/xinetd.d/db2hpudm61
scp /etc/xinetd.d/db2hpudm61 node0103-fab:/etc/xinetd.d/db2hpudm61

Restart the service

systemctl stop xinetd
systemctl start xinetd

Change the owner of the /opt/ibm/HPU/V6.1/cfg directory on all hosts to the instance owner so that you do not have to depend upon root access:

runall "chown -R dbpemon.db2iadm1 /opt/ibm/HPU/V6.1/cfg"

Create the db2hpu.cfg file with the following parameters.

# HPU default configuration
ize=xxxxxx
db2dbdft=BLUDB
db2instance=db2inst1
doubledelim=binary
netservice=db2hpudm61
allow_unlimited_memory=yes
keepalive_time=10
hreads=8
elects=10
pipe_timeout=60
extent_per_thread=4
stats=true
nbcpu=8
umask=022

Create a softlink for db2hpu in /bin:

runall "ln -s /opt/ibm/HPU/V6.1/bin/db2hpu /bin/db2hpu"

Check the version

runall "db2hpu -version"

Check that the daemon is running:

runall "netstat -a | grep -i hpu"

Log in inside the container as the user who is going to do the migration and create a .db2hpu directory in that user's home directory.

su - db2psc
mkdir .db2hpu
cd .db2hpu
touch db2hpu.creds
db2hpu --credentials local
059I Optim High Performance Unload for DB2 for Linux, UNIX and Windows 06.01.00.001(170531)
Management of credentials for 'local' type connections:
do you want to create or remove data (1/2)? 1
is it a new section (Y/N)? y
provide a section name: db2inst1
provide a user name: db2psc
Do you want to validate your data (Y/N)? y
061I Credentials of connections created for 'db2inst1'

Do this on all hosts

Create a map file so that HPU on each host can map the source to the target. Log in as root:

cd /opt/ibm/HPU/V6.1/cfg
cat db2hpu.map
node0101-fab=srchost401
node0102-fab=srchost402
node0103-fab=srchost403

The first entry is the host name of the target (as it is in db2nodes.cfg) and the second entry is the host name of the source machine.

DDL Migration

migr.sh calls com.ibm.migr.data.GenInitialObjects and expects the source database connection information. The JDBC drivers must be on the class path.
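As a rough sketch of an invocation (the exact arguments of migr.sh are not documented here, and the JDBC driver location below is just a typical Db2 client path, so treat both as assumptions):

# put the Db2 JDBC driver on the class path, then run the DDL generation script
export CLASSPATH=/opt/ibm/db2/V11.1/java/db2jcc4.jar:$CLASSPATH
./migr.sh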

HPU Migration

Db2 High Performance Unload (HPU) is used because it is the fastest method to unload and load data.

HPU runs against the source database on each database partition. It then transfers data using inetd to the target. HPU must be running on each target node as an xinetd daemon listening on a specific port.

When HPU sends data from the source to multiple targets using the inetd/xinetd protocol, it invokes the LOAD utility on the target and copies the data from the TCP/IP socket to a pipe so that LOAD can consume it.
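Conceptually, the target side behaves like the following hand-rolled pattern (purely illustrative; HPU's daemon and generated scripts do all of this for you, and the pipe, file, and table names are made up):

mkfifo /tmp/myschema_mytab.pipe
# a receiver process writes the rows arriving over TCP/IP into the pipe
cat /tmp/received_rows.del > /tmp/myschema_mytab.pipe &
# LOAD consumes the rows directly from the named pipe
db2 "LOAD FROM /tmp/myschema_mytab.pipe OF DEL REPLACE INTO MYSCHEMA.MYTAB"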

HPU repartitions the data on the source and sends each row to the correct target database partition, so this method works best when the MLN topology changes between the source and the target.

hpumigr.sh calls com.ibm.migr.data.GenHPUScripts, which detects the source MLN count, reads the target MLN count from src.properties, and generates migration scripts with the proper syntax.

