In MySQL, what is the most effective query design for joining large tables with many to many relatio

Posted by lighthouse65 on Stack Overflow
Published on 2010-03-13T19:27:39Z Indexed on 2010/03/13 19:35 UTC

In our application, we collect data on automotive engine performance -- basically source data on engine performance based on the engine type, the vehicle running it and the engine design. Currently, the basis for new row inserts is an engine on-off period; we monitor performance variables based on a change in engine state from active to inactive and vice versa. The related engineState table looks like this:

+---------+-----------+---------------+---------------------+---------------------+-----------------+
| vehicle | engine    | engine_state  | state_start_time    | state_end_time      | engine_variable |
+---------+-----------+---------------+---------------------+---------------------+-----------------+
| 080025  | E01       | active        | 2008-01-24 16:19:15 | 2008-01-24 16:24:45 |             720 | 
| 080028  | E02       | inactive      | 2008-01-24 16:19:25 | 2008-01-24 16:22:17 |             304 |
+---------+-----------+---------------+---------------------+---------------------+-----------------+

For a specific analysis, we would like to examine the table content at a row granularity of one minute, rather than the current basis of active / inactive engine state. For this, we are thinking of creating a simple productionMinute table with a row for each minute in the period we are analyzing, and joining the productionMinute and engineState tables on the date-time columns in each table. So if our period of analysis is from 2009-12-01 to 2010-02-28, we would create a new table with 129,600 rows (90 days x 1,440 minutes), one for each minute of each day in that three-month period. The first few rows of the productionMinute table:

+---------------------+ 
| production_minute   |
+---------------------+
| 2009-12-01 00:00    |
| 2009-12-01 00:01    |
| 2009-12-01 00:02    |     
| 2009-12-01 00:03    |
+---------------------+
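
As a rough sketch of how the productionMinute rows could be generated, here is a small Python example (Python rather than MySQL, only so it can be run stand-alone). The function name production_minutes is hypothetical, and the end bound is assumed to be exclusive so that 2010-02-28 is covered in full.

```python
from datetime import datetime, timedelta


def production_minutes(start, end_exclusive):
    """Yield one datetime per minute in the half-open range [start, end_exclusive)."""
    step = timedelta(minutes=1)
    current = start
    while current < end_exclusive:
        yield current
        current += step


# Analysis period from the question: 2009-12-01 through 2010-02-28 inclusive,
# expressed here as an exclusive upper bound of 2010-03-01 00:00 (an assumption).
rows = list(production_minutes(datetime(2009, 12, 1), datetime(2010, 3, 1)))

print(len(rows))          # number of minute rows for the period
print(rows[0], rows[-1])  # first and last minute generated
```

The row count printed here should match the 129,600 figure above; the resulting values could then be bulk-inserted into productionMinute.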

The join between the tables would be engineState AS es LEFT JOIN productionMinute AS pm ON es.state_start_time <= pm.production_minute AND pm.production_minute <= es.state_end_time. This join, however, raises several issues:

The engineState table has 5 million rows and the productionMinute table has 130,000 rows
When an engineState row spans more than one minute (i.e. the difference between es.state_start_time and es.state_end_time is greater than one minute), as is the case in the example above, there are multiple productionMinute table rows that join to a single engineState table row
When there is more than one engine in operation during any given minute, also as per the example above, multiple engineState table rows join to a single productionMinute row
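
To make the fan-out in both directions concrete, here is a minimal sketch that runs the range join on the two sample engineState rows. It uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for MySQL (an assumption for runnability; it says nothing about MySQL performance), stores timestamps as text, and uses an illustrative 15-minute productionMinute window that is not part of our real data.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE engineState (
        vehicle TEXT, engine TEXT, engine_state TEXT,
        state_start_time TEXT, state_end_time TEXT, engine_variable INTEGER
    );
    CREATE TABLE productionMinute (production_minute TEXT PRIMARY KEY);
""")

# The two sample rows from the engineState table shown earlier.
conn.executemany(
    "INSERT INTO engineState VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("080025", "E01", "active", "2008-01-24 16:19:15", "2008-01-24 16:24:45", 720),
        ("080028", "E02", "inactive", "2008-01-24 16:19:25", "2008-01-24 16:22:17", 304),
    ],
)

# Illustrative window: one row per minute from 16:15 to 16:29.
start = datetime(2008, 1, 24, 16, 15)
conn.executemany(
    "INSERT INTO productionMinute VALUES (?)",
    [((start + timedelta(minutes=i)).strftime("%Y-%m-%d %H:%M:%S"),) for i in range(15)],
)

joined = conn.execute("""
    SELECT es.engine, pm.production_minute
    FROM engineState AS es
    LEFT JOIN productionMinute AS pm
           ON es.state_start_time <= pm.production_minute
          AND pm.production_minute <= es.state_end_time
    ORDER BY pm.production_minute, es.engine
""").fetchall()

# Count the fan-out in each direction.
minutes_per_engine = {}
engines_per_minute = {}
for engine, minute in joined:
    minutes_per_engine[engine] = minutes_per_engine.get(engine, 0) + 1
    engines_per_minute[minute] = engines_per_minute.get(minute, 0) + 1

print(minutes_per_engine)
print(engines_per_minute)
```

With these two rows, E01 joins to five productionMinute rows and E02 to three, while the minutes 16:20 through 16:22 each join to both engines.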

In testing our logic with only a small extract of the productionMinute table (one day rather than three months), the query takes over an hour to run. To make a three-month query feasible, we are considering creating a temporary table from the engineState table that drops any data not critical for the analysis, and joining that temporary table to the productionMinute table. We also plan to experiment with different join types -- specifically an inner join -- to see if that would improve performance.
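
A sketch of the reduced-table idea, and of how the left and inner joins differ in their results, is below. It again uses Python's sqlite3 with an in-memory SQLite database in place of MySQL, so it illustrates result semantics only, not MySQL's optimizer or timings. The table name esReduced and the last two engineState rows are hypothetical, added only to show the edge cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE engineState (
        vehicle TEXT, engine TEXT, engine_state TEXT,
        state_start_time TEXT, state_end_time TEXT, engine_variable INTEGER
    );
    CREATE TABLE productionMinute (production_minute TEXT PRIMARY KEY);

    INSERT INTO engineState VALUES
        ('080025', 'E01', 'active',   '2008-01-24 16:19:15', '2008-01-24 16:24:45', 720),
        ('080028', 'E02', 'inactive', '2008-01-24 16:19:25', '2008-01-24 16:22:17', 304),
        -- Hypothetical: a state shorter than a minute that contains no minute boundary.
        ('080028', 'E02', 'active',   '2008-01-24 16:22:17', '2008-01-24 16:22:50', 12),
        -- Hypothetical: a state entirely outside the analysis period.
        ('080031', 'E03', 'active',   '2007-06-01 08:00:00', '2007-06-01 09:00:00', 55);

    INSERT INTO productionMinute VALUES
        ('2008-01-24 16:20:00'), ('2008-01-24 16:21:00'), ('2008-01-24 16:22:00'),
        ('2008-01-24 16:23:00'), ('2008-01-24 16:24:00'), ('2008-01-24 16:25:00');

    -- Reduced copy: only the columns the analysis needs, and only
    -- rows that overlap the period covered by productionMinute.
    CREATE TEMP TABLE esReduced AS
    SELECT engine, state_start_time, state_end_time, engine_variable
    FROM engineState
    WHERE state_end_time   >= (SELECT MIN(production_minute) FROM productionMinute)
      AND state_start_time <= (SELECT MAX(production_minute) FROM productionMinute);
""")

JOIN_SQL = """
    SELECT es.engine, es.state_start_time, pm.production_minute
    FROM esReduced AS es
    {join_type} JOIN productionMinute AS pm
           ON pm.production_minute BETWEEN es.state_start_time AND es.state_end_time
"""

left_rows = conn.execute(JOIN_SQL.format(join_type="LEFT")).fetchall()
inner_rows = conn.execute(JOIN_SQL.format(join_type="INNER")).fetchall()

# Rows the LEFT JOIN keeps even though no minute falls inside the state period.
unmatched = [row for row in left_rows if row[2] is None]

print(len(left_rows), len(inner_rows))
print(unmatched)
```

In this sketch the left join returns one more row than the inner join: the hypothetical 33-second state, which contains no minute boundary and so comes back with a NULL production_minute. Whether such rows matter for a per-minute analysis would decide which join type gives the result we want, independent of speed.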

What is the best query design for joining tables with the many:many relationship between the join predicates as outlined above? What is the best join type (left / right, inner)?
