DB2 load partitioned data in parallel
- by Erik Paulson
I have a 10-node DB2 9.5 database, with raw data on each machine (ie
node1:/scratch/data/dataset.1
node2:/scratch/data/dataset.2
...
node10:/scratch/data/dataset.10
There is no shared NFS mount - none of my machines could handle all of the datasets combined.
each line of a dataset file is a long string of text, column delimited. The first column is the key. I don't know the hash function that DB2 will use, so dataset is not pre-partitioned.
Short of renaming all of my files, is there any way to get DB2 to load them all in parallel?
I'm trying to do something like
load from '/scratch/data/dataset' of del modified by coldel| fastparse messages /dev/null replace into TESTDB.data_table part_file_location '/scratch/data';
but I have no idea how to suggest to db2 that it should look for dataset.1 on the first node, etc.