DB2 load partitioned data in parallel
Posted
by Erik Paulson
on Stack Overflow
See other posts from Stack Overflow
or by Erik Paulson
Published on 2010-02-02T22:19:32Z
Indexed on
2010/04/06
13:03 UTC
Read the original article
Hit count: 329
I have a 10-node DB2 9.5 database, with raw data on each machine (ie node1:/scratch/data/dataset.1 node2:/scratch/data/dataset.2 ... node10:/scratch/data/dataset.10
There is no shared NFS mount - none of my machines could handle all of the datasets combined.
each line of a dataset file is a long string of text, column delimited. The first column is the key. I don't know the hash function that DB2 will use, so dataset is not pre-partitioned.
Short of renaming all of my files, is there any way to get DB2 to load them all in parallel? I'm trying to do something like
load from '/scratch/data/dataset' of del modified by coldel| fastparse messages /dev/null replace into TESTDB.data_table part_file_location '/scratch/data';
but I have no idea how to suggest to db2 that it should look for dataset.1 on the first node, etc.
© Stack Overflow or respective owner