MySQL Normalization stored procedure performance
- by srkiNZ84
Hi,
I've written a stored procedure in MySQL to take values currently in a table and to "Normalize" them. This means that for each value passed to the stored procedure, it checks whether the value is already in the table. If it is, then it stores the id of that row in a variable. If the value is not in the table, it stores the newly inserted value's id. The stored procedure then takes the id's and inserts them into a table which is equivalent to the original de-normailized table, but this table is fully normalized and consists of mainly foreign keys.
My problem with this design is that the stored procedure takes approximately 10ms or so to return, which is too long when you're trying to work through some 10million records. My suspicion is that the performance is to do with the way in which I'm doing the inserts. i.e.
INSERT INTO TableA (first_value) VALUES (argument_from_sp) ON DUPLICATE KEY UPDATE id=LAST_INSERT_ID(id);
SET @TableAId = LAST_INSERT_ID();
The "ON DUPLICATE KEY UPDATE" is a bit of a hack, due to the fact that on a duplicate key I don't want to update anything but rather just return the id value of the row. If you miss this step though, the LAST_INSERT_ID() function returns the wrong value when you're trying to run the "SET ..." statement.
Does anyone know of a better way to do this in MySQL?
Thank you