JPA : optimize EJB-QL query involving large many-to-many join table

Posted by Fabien on Stack Overflow See other posts from Stack Overflow or by Fabien
Published on 2010-02-09T11:06:50Z Indexed on 2010/04/09 7:03 UTC
Read the original article Hit count: 942

Filed under:
|
|
|
|

Hi all. I'm using Hibernate Entity Manager 3.4.0.GA with Spring 2.5.6 and MySql 5.1. I have a use case where an entity called Artifact has a reflexive many-to-many relation with itself, and the join table is quite large (1 million lines). As a result, the HQL query performed by one of the methods in my DAO takes a long time. Any advice on how to optimize this and still use HQL ? Or do I have no choice but to switch to a native SQL query that would perform a join between the table ARTIFACT and the join table ARTIFACT_DEPENDENCIES ?

Here is the problematic query performed in the DAO :

@SuppressWarnings("unchecked")
   public List<Artifact> findDependentArtifacts(Artifact artifact) {
      Query query = em.createQuery("select a from Artifact a where :artifact in elements(a.dependencies)");
      query.setParameter("artifact", artifact);
      List<Artifact> list = query.getResultList();
      return list;
   }

And the code for the Artifact entity :

package com.acme.dependencytool.persistence.model;

import java.util.ArrayList;
import java.util.List;

import javax.persistence.CascadeType;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.JoinTable;
import javax.persistence.ManyToMany;
import javax.persistence.Table;
import javax.persistence.UniqueConstraint;

@Entity
@Table(name = "ARTIFACT", uniqueConstraints={@UniqueConstraint(columnNames={"GROUP_ID", "ARTIFACT_ID", "VERSION"})})
public class Artifact {

   @Id
   @GeneratedValue
   @Column(name = "ID")
   private Long id = null;

   @Column(name = "GROUP_ID", length = 255, nullable = false)
   private String groupId;

   @Column(name = "ARTIFACT_ID", length = 255, nullable = false)
   private String artifactId;

   @Column(name = "VERSION", length = 255, nullable = false)
   private String version;

   @ManyToMany(cascade=CascadeType.ALL, fetch=FetchType.EAGER)
   @JoinTable(
         name="ARTIFACT_DEPENDENCIES",
         joinColumns = @JoinColumn(name="ARTIFACT_ID", referencedColumnName="ID"),
         inverseJoinColumns = @JoinColumn(name="DEPENDENCY_ID", referencedColumnName="ID")
   )
   private List<Artifact> dependencies = new ArrayList<Artifact>();

   public Long getId() {
      return id;
   }

   public void setId(Long id) {
      this.id = id;
   }

   public String getGroupId() {
      return groupId;
   }

   public void setGroupId(String groupId) {
      this.groupId = groupId;
   }

   public String getArtifactId() {
      return artifactId;
   }

   public void setArtifactId(String artifactId) {
      this.artifactId = artifactId;
   }

   public String getVersion() {
      return version;
   }

   public void setVersion(String version) {
      this.version = version;
   }

   public List<Artifact> getDependencies() {
      return dependencies;
   }

   public void setDependencies(List<Artifact> dependencies) {
      this.dependencies = dependencies;
   }
}

Thanks in advance.

EDIT 1 : The DDLs are generated automatically by Hibernate EntityMananger based on the JPA annotations in the Artifact entity. I have no explicit control on the automaticaly-generated join table, and the JPA annotations don't let me explicitly set an index on a column of a table that does not correspond to an actual Entity (in the JPA sense). So I guess the indexing of table ARTIFACT_DEPENDENCIES is left to the DB, MySQL in my case, which apparently uses a composite index based on both clumns but doesn't index the column that is most relevant in my query (DEPENDENCY_ID).

mysql> describe ARTIFACT_DEPENDENCIES;
+---------------+------------+------+-----+---------+-------+
| Field         | Type       | Null | Key | Default | Extra |
+---------------+------------+------+-----+---------+-------+
| ARTIFACT_ID   | bigint(20) | NO   | MUL | NULL    |       |
| DEPENDENCY_ID | bigint(20) | NO   | MUL | NULL    |       |
+---------------+------------+------+-----+---------+-------+

EDIT 2 : When turning on showSql in the Hibernate session, I see many occurences of the same type of SQL query, as below :

select dependenci0_.ARTIFACT_ID as ARTIFACT1_1_, dependenci0_.DEPENDENCY_ID as DEPENDENCY2_1_, artifact1_.ID as ID1_0_, artifact1_.ARTIFACT_ID as ARTIFACT2_1_0_, artifact1_.GROUP_ID as GROUP3_1_0_, artifact1_.VERSION as VERSION1_0_ from ARTIFACT_DEPENDENCIES dependenci0_ left outer join ARTIFACT artifact1_ on dependenci0_.DEPENDENCY_ID=artifact1_.ID where dependenci0_.ARTIFACT_ID=?

Here's what EXPLAIN in MySql says about this type of query :

mysql> explain select dependenci0_.ARTIFACT_ID as ARTIFACT1_1_, dependenci0_.DEPENDENCY_ID as DEPENDENCY2_1_, artifact1_.ID as ID1_0_, artifact1_.ARTIFACT_ID as ARTIFACT2_1_0_, artifact1_.GROUP_ID as GROUP3_1_0_, artifact1_.VERSION as VERSION1_0_ from ARTIFACT_DEPENDENCIES dependenci0_ left outer join ARTIFACT artifact1_ on dependenci0_.DEPENDENCY_ID=artifact1_.ID where dependenci0_.ARTIFACT_ID=1;
+----+-------------+--------------+--------+-------------------+-------------------+---------+---------------------------------------------+------+-------+
| id | select_type | table        | type   | possible_keys     | key               | key_len | ref                                         | rows | Extra |
+----+-------------+--------------+--------+-------------------+-------------------+---------+---------------------------------------------+------+-------+
|  1 | SIMPLE      | dependenci0_ | ref    | FKEA2DE763364D466 | FKEA2DE763364D466 | 8       | const                                       |  159 |       |
|  1 | SIMPLE      | artifact1_   | eq_ref | PRIMARY           | PRIMARY           | 8       | dependencytooldb.dependenci0_.DEPENDENCY_ID |    1 |       |
+----+-------------+--------------+--------+-------------------+-------------------+---------+---------------------------------------------+------+-------+

EDIT 3 : I tried setting the FetchType to LAZY in the JoinTable annotation, but I then get the following exception :

Hibernate: select artifact0_.ID as ID1_, artifact0_.ARTIFACT_ID as ARTIFACT2_1_, artifact0_.GROUP_ID as GROUP3_1_, artifact0_.VERSION as VERSION1_ from ARTIFACT artifact0_ where artifact0_.GROUP_ID=? and artifact0_.ARTIFACT_ID=?
51545 [btpool0-2] ERROR org.hibernate.LazyInitializationException - failed to lazily initialize a collection of role: com.acme.dependencytool.persistence.model.Artifact.dependencies, no session or session was closed
org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: com.acme.dependencytool.persistence.model.Artifact.dependencies, no session or session was closed
    at org.hibernate.collection.AbstractPersistentCollection.throwLazyInitializationException(AbstractPersistentCollection.java:380)
    at org.hibernate.collection.AbstractPersistentCollection.throwLazyInitializationExceptionIfNotConnected(AbstractPersistentCollection.java:372)
    at org.hibernate.collection.AbstractPersistentCollection.readSize(AbstractPersistentCollection.java:119)
    at org.hibernate.collection.PersistentBag.size(PersistentBag.java:248)
    at com.acme.dependencytool.server.DependencyToolServiceImpl.createArtifactViewBean(DependencyToolServiceImpl.java:93)
    at com.acme.dependencytool.server.DependencyToolServiceImpl.createArtifactViewBean(DependencyToolServiceImpl.java:109)
    at com.acme.dependencytool.server.DependencyToolServiceImpl.search(DependencyToolServiceImpl.java:48)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.google.gwt.user.server.rpc.RPC.invokeAndEncodeResponse(RPC.java:527)
    at com.google.gwt.user.server.rpc.RemoteServiceServlet.processCall(RemoteServiceServlet.java:166)
    at com.google.gwt.user.server.rpc.RemoteServiceServlet.doPost(RemoteServiceServlet.java:86)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:487)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:362)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:729)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.handler.RequestLogHandler.handle(RequestLogHandler.java:49)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:324)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:505)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:843)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:647)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:205)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:380)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:395)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:488)

© Stack Overflow or respective owner

Related posts about jpa

Related posts about query