kick off a map reduce job from my java/mysql webapp
- by Brian
Hi guys,
I need a bit of archecture advice. I have a java based webapp, with a JPA based ORM backed onto a mysql relational database. Now, as part of the application I have a batch job that compares thousands of database records with each other. This job has become too time consuming and needs to be parallelized. I'm looking at using mapreduce and hadoop in order to do this. However, I'm not too sure about how to integrate this into my current architecture. I think the easiest initial solution is to find a way to push data from mysql into hadoop jobs. I have done some initial research on this and found the following relevant information and possibilities:
1) https://issues.apache.org/jira/browse/HADOOP-2536 this gives an interesting overview of some inbuilt JDBC support
2) This article http://architects.dzone.com/articles/tools-moving-sql-database describes some third party tools to move data from mysql to hadoop.
To be honest I'm just starting out with learning about hbase and hadoop but I really don't know how to integrate this into my webapp.
Any advice is greatly appreciated.
cheers,
Brian