A long time ago in a galaxy far, far away...
Trying to migrate a database from MySQL to PostgreSQL. All the documentation I have read covers, in great detail, how to migrate the structure. I have found very little documentation on migrating the data. The schema has 13 tables (which have been migrated successfully) and 9 GB of data.
MySQL version: 5.1.x
PostgreSQL version: 8.4.x
I want to use the R programming language to analyze the data using SQL select statements; PostgreSQL has PL/R, but MySQL has nothing (as far as I can tell).
A New Hope
Create the database location (/var has insufficient space; also dislike having the PostgreSQL version number everywhere -- upgrading would break scripts!):
sudo mkdir -p /home/postgres/main
sudo cp -Rp /var/lib/postgresql/8.4/main /home/postgres
sudo chown -R postgres.postgres /home/postgres
sudo chmod -R 700 /home/postgres
sudo usermod -d /home/postgres/ postgres
All good to here. Next, restart the server and configure the database using these installation instructions:
sudo apt-get install postgresql pgadmin3
sudo /etc/init.d/postgresql-8.4 stop
sudo vi /etc/postgresql/8.4/main/postgresql.conf
Change data_directory to /home/postgres/main
sudo /etc/init.d/postgresql-8.4 start
sudo -u postgres psql postgres
\password postgres
sudo -u postgres createdb climate
pgadmin3
Use pgadmin3 to configure the database and create a schema.
The episode continues in a remote shell known as bash, with both databases running, and the installation of a set of tools with a rather unusual logo: SQL Fairy.
perl Makefile.PL
sudo make install
sudo apt-get install perl-doc (strangely, it is not called perldoc)
perldoc SQL::Translator::Manual
Extract a PostgreSQL-friendly DDL and all the MySQL data:
sqlt -f DBI --dsn dbi:mysql:climate --db-user user --db-password password -t PostgreSQL > climate-pg-ddl.sql
mysqldump --skip-add-locks --complete-insert --no-create-db --no-create-info --quick --result-file="climate-my.sql" --databases climate --skip-comments -u root -p
The Database Strikes Back
Recreate the structure in PostgreSQL as follows:
pgadmin3 (switch to it)
Click the Execute arbitrary SQL queries icon
Open climate-pg-ddl.sql
Search for TABLE " replace with TABLE climate." (insert the schema name climate)
Search for on " replace with on climate." (insert the schema name climate)
Press F5 to execute
This results in:
Query returned successfully with no result in 122 ms.
Replies of the Jedi
At this point I am stumped.
Where do I go from here (what are the steps) to convert climate-my.sql to climate-pg.sql so that they can be executed against PostgreSQL?
How to I make sure the indexes are copied over correctly (to maintain referential integrity; I don't have constraints at the moment to ease the transition)?
How do I ensure that adding new rows in PostgreSQL will start enumerating from the index of the last row inserted (and not conflict with an existing primary key from the sequence)?
How do you ensure the schema name comes through when transforming the data from MySQL to PostgreSQL inserts?
Resources
A fair bit of information was needed to get this far:
https://help.ubuntu.com/community/PostgreSQL
http://articles.sitepoint.com/article/site-mysql-postgresql-1
http://wiki.postgresql.org/wiki/Converting_from_other_Databases_to_PostgreSQL#MySQL
http://pgfoundry.org/frs/shownotes.php?release_id=810
http://sqlfairy.sourceforge.net/
Thank you!