Tag Archive | jpa

Using JINQ with JPA and H2

A few days ago I have read the interesting interview with Ming-Yee Iu about JINQ. JINQ is, like the name already suggests, the attempt to provide something similar to LINQ for Java. The basic idea is to close the semantic gap between object-oriented code that executes queries on a relational data model. The queries for the relational database model should be easily integrated into the code such that it feels more natural.

The research behind LINQ came to the conclusion that the algorithms transforming the code into relational database queries work best with functional code. As Java 8 comes with the streams API, the author uses it to implement the ideas of his PhD in Java.

To get our hands dirty, we start with a simple project that uses Hibernate over JPA together with an H2 database and JINQ:

<dependencies>
	<dependency>
		<groupId>javax</groupId>
		<artifactId>javaee-api</artifactId>
		<version>${jee.version}</version>
		<scope>provided</scope>
	</dependency>
	<dependency>
		<groupId>com.h2database</groupId>
		<artifactId>h2</artifactId>
		<version>${h2.version}</version>
	</dependency>
	<dependency>
		<groupId>org.hibernate</groupId>
		<artifactId>hibernate-entitymanager</artifactId>
		<version>${hibernate.version}</version>
	</dependency>
	<dependency>
		<groupId>org.jinq</groupId>
		<artifactId>jinq-jpa</artifactId>
		<version>1.8.10</version>
	</dependency>
</dependencies>

In order to use JINQ streams we have to create a provider that gets the EntityManagerFactory as argument:

EntityManagerFactory factory = Persistence.createEntityManagerFactory("PersistenceUnit");
JinqJPAStreamProvider streams = new JinqJPAStreamProvider(factory);

Having inserted some persons into our database, we can easily query them:

List<String> firstNames = streams.streamAll(entityManager, Person.class)
		.map(Person::getFirstName)
		.collect(toList());
firstNames.forEach(System.out::println);

Using the method streamAll() of the previously created JinqJPAStreamProvider gives us access to all persons within the database. In this simple example we only want to output the first name of each person; hence we map the list and collect all results into a List. This list gets printed using the forEach() method and a reference to the println() method.

Taking a look at the generated SQL code, we see that all columns are selected:

select
	person0_.id as id1_4_,
	person0_.FIRST_NAME as FIRST_NA2_4_,
	person0_.ID_CARD_ID as ID_CARD_4_4_,
	person0_.LAST_NAME as LAST_NAM3_4_,
from
	T_PERSON person0_ 

Of course we can refine the statement using the select() method:

List<String> firstNames = streams.streamAll(entityManager, Person.class)
		.select(Person::getFirstName)
		.where(p -> p.equals("Homer"))
		.collect(toList());
firstNames.forEach(System.out::println);

Additionally we have also added a predicate (where firstName = 'Homer'):

    select
        person0_.FIRST_NAME as FIRST_NA2_4_
    from
        T_PERSON person0_ 
    where
        person0_.FIRST_NAME='Homer'

Leaving this simple example, we now want to create a query that selects all geeks with first name “Christian” that work in a time and material project:

List<String> geeks = streams.streamAll(entityManager, Project.class)
		.where(p -> p.getProjectType() == Project.ProjectType.TIME_AND_MATERIAL)
		.joinList(Project::getGeeks)
		.where(g -> g.getTwo().getFirstName().equals("Christian"))
		.map(p -> p.getTwo().getFirstName())
		.collect(toList());
geeks.forEach(System.out::println);

As can be seen from the code above, we use the first where() clause to select all time and material projects. The joinList() invocation joins the geek table while the subsequent where() clause also restricts to only select geeks with first name “Christian”. Et voila, that is the created SQL query:

select
	geek2_.FIRST_NAME as col_0_0_ 
from
	T_PROJECT project0_ 
inner join
	T_GEEK_PROJECT geeks1_ 
		on project0_.id=geeks1_.PROJECT_ID 
inner join
	T_GEEK geek2_ 
		on geeks1_.GEEK_ID=geek2_.id 
where
	project0_.projectType='TIME_AND_MATERIAL' 
	and geek2_.FIRST_NAME='Christian' limit ?

Conclusion: Having worked with JPA’s criteria API some time ago, I must say that the first steps with JINQ are more intuitive and where easier to write down. JINQ really helps to close the gap between the relational database world by using streams in Java 8.

JPA 2.1: Unsynchronized persistence context

The JPA version 2.1 brings a new way how to handle the synchronization between the persistence context and the current JTA transaction as well as the resource manager. The term resource manager comes from the Java Transaction API and denotes a component that manipulates one resource (for example a concrete database that is manipulated by using its JDBC driver). Per default a container-managed persistence context is of type SynchronizationType.SYNCHRONIZED, i.e. this persistence context automatically joins the current JTA transaction and updates to the persistence context are propagated to the underlying resource manager.

By creating a persistence context that is of the new type SynchronizationType.UNSYNCHRONIZED, the automatic join of the transaction as well as the propgation of updates to the resource manager is disabled. In order to join the current JTA transaction the code has to call the method joinTransaction() of the EntityManager. This way the EntityManager’s persistence context gets enlisted in the transaction and is registered for subsequent notifications. Once the transaction is commited or rolled back, the persistence context leaves the transaction and is not attached to any further transaction until the method joinTransaction() is called once again for a new JTA transaction.

Before JPA 2.1 one could implement a conversation that spans multiple method calls with a @Stateful session bean as described by Adam Bien here:

@Stateful
@TransactionAttribute(TransactionAttributeType.NEVER)
public class Controller {
	@PersistenceContext(type = PersistenceContextType.EXTENDED)
	EntityManager entityManager;

	public Person persist() {
		Person p = new Person();
		p.setFirstName(&quot;Martin&quot;);
		p.setLastName(&quot;Developer&quot;);
		return entityManager.merge(p);
	}

	public List&lt;Person&gt; list() {
		return entityManager.createQuery(&quot;from Person&quot;, Person.class).getResultList();
	}

	@TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
	public void commit() {
		
	}

	@Remove
	public void remove() {

	}
}

The persistence context is of type EXTENDED and therefore lives longer than the JTA transactions it is attached to. As the persistence context is per default also of type SYNCHRONIZED it will automatically join any transaction that is running when any of the session bean’s methods are called. In order to prevent that to happen for most of the bean’s methods, the annotation @TransactionAttribute(TransactionAttributeType.NEVER) tells the container to not open any transaction for this bean. Therefore the methods persist() and list() run without a transaction. This behavior is different for the method commit(). Here the annotation @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) tells the container to create a new transaction before the method is called and therefore the bean’s EntityManager will join it automatically.

With the new type SynchronizationType.UNSYNCHRONIZED the code above can be rewritten as depicted in the following listing:

@Stateful
public class Controller {
	@PersistenceContext(type = PersistenceContextType.EXTENDED, 
		synchronization = SynchronizationType.UNSYNCHRONIZED)
	EntityManager entityManager;

	public Person persist() {
		Person p = new Person();
		p.setFirstName(&quot;Martin&quot;);
		p.setLastName(&quot;Developer&quot;);
		return entityManager.merge(p);
	}

	public List&lt;Person&gt; list() {
		return entityManager.createQuery(&quot;from Person&quot;, Person.class).getResultList();
	}

	public void commit() {
		entityManager.joinTransaction();
	}

	@Remove
	public void remove() {

	}
}

Now that the EntityManager won’t automatically join the current transaction, we can omit the @TransactionAttribute annotations. Any running transaction won’t have an impact on the EntityManager until we explicitly join it. This is now done in the method commit() and could even be done on the base on some dynamic logic.

In order to test the implementation above, we utilize a simple REST resource:

@Path(&quot;rest&quot;)
@Produces(&quot;text/json&quot;)
@SessionScoped
public class RestResource implements Serializable {
	@Inject
	private Controller controller;

	@GET
	@Path(&quot;persist&quot;)
	public Person persist(@Context HttpServletRequest request) {
		return controller.persist();
	}

	@GET
	@Path(&quot;list&quot;)
	public List&lt;Person&gt; list() {
		return controller.list();
	}

	@GET
	@Path(&quot;commit&quot;)
	public void commit() {
		controller.commit();
	}

	@PreDestroy
	public void preDestroy() {
                controller.remove();
	}
}

This resource provides methods to persist a person, list all persisted person and to commit the current changes. As we are going to use a stateful session bean, we annotate the resource with @SessionScoped and let the container inject the Controller bean.

By calling the following URL after the application has been deployed to some Java EE container, a new person gets added to the unsynchronized persistence context, but is not stored in the database.

http://localhost:8080/jpa2.1-unsychronized-pc/rest/persist

Even a call of the list() method won’t return the newly added person. Only by finally synchronizing the changes in the persistence context to the underlying resource with a call of commit(), the insert statement is send to the underlying database.

Conclusion: The new UNSYNCHRONIZED mode of the persistence context lets us implement conversations over more than one method invocation of a stateful session bean with the flexibility to join a JTA transaction dynamically based on our application logic without the need of any annotation magic.

PS: The source code is available at github.

JPA 2.1 criteria delete/update and temporary tables in Hibernate

Since JPA version 2.0 the EntityManager offers the method getCriteriaBuilder() to dynamically build select queries without the need of string concatenation using the Java Persistence Query Languge (JPQL). With version 2.1 this CriteriaBuilder offers the two new methods createCriteriaDelete() and createCriteriaUpdate() that let us formulate delete and update queries using the criteria API.

For illustration purposes lets use a simple inheritance use case with the two entities Person and Geek:

@Entity
@Table(name = "T_PERSON")
@Inheritance(strategy = InheritanceType.JOINED)
public class Person {
	@Id
	@GeneratedValue
	private Long id;
	@Column(name = "FIRST_NAME")
	private String firstName;
	@Column(name = "LAST_NAME")
	private String lastName;
	...
}

@Entity
@Table(name = "T_GEEK")
@Access(AccessType.PROPERTY)
public class Geek extends Person {
	private String favouriteProgrammingLanguage;
	...
}

To delete all geeks from our database that favour Java as their programming language, we can utilize the following code using EntityManager’s new createCriteriaDelete() method:

EntityTransaction transaction = null;
try {
	transaction = entityManager.getTransaction();
	transaction.begin();
	CriteriaBuilder builder = entityManager.getCriteriaBuilder();
	CriteriaDelete<Geek> delete = builder.createCriteriaDelete(Geek.class);
	Root<Geek> geekRoot = delete.from(Geek.class);
	delete.where(builder.equal(geekRoot.get("favouriteProgrammingLanguage"), "Java"));
	int numberOfRowsUpdated = entityManager.createQuery(delete).executeUpdate();
	LOGGER.info("Deleted " + numberOfRowsUpdated + " rows.");
	transaction.commit();
} catch (Exception e) {
	if (transaction != null && transaction.isActive()) {
		transaction.rollback();
	}
}

Like with pure SQL we can use the method from() to specify the table the delete query should be issued against and where() to declare our predicates. This way the criteria API allows the definition of bulk deletion operations in a dynamic way without using too much string concatenations.

But how does the SQL look like that is created? First of all the ORM provider has to pay attention that we are deleting from an inheritance hierarchy with the strategy JOINED, meaning that we have two tables T_PERSON and T_GEEK where the second tables stores a reference to the parent table. Hibernate in version 4.3.8.Final creates the following SQL statements:

insert 
into
	HT_T_GEEK
	select
		geek0_.id as id 
	from
		T_GEEK geek0_ 
	inner join
		T_PERSON geek0_1_ 
			on geek0_.id=geek0_1_.id 
	where
		geek0_.FAV_PROG_LANG=?;

delete 
from
	T_GEEK 
where
	(
		id
	) IN (
		select
			id 
		from
			HT_T_GEEK
	);

delete 
from
	T_PERSON 
where
	(
		id
	) IN (
		select
			id 
		from
			HT_T_GEEK
	)

delete 
from
	HT_T_GEEK;

As we can see, Hibernate fills a temporary table with the ids of the geeks/persons that match our search criteria. Then it deletes all rows from the geek table and then all rows from the person table. Finally the temporary table gets purged.

The sequence of delete statements is clear, as the table T_GEEK has a foreign key constraint on the id column of the T_PERSON table. Hence the rows in the child table have to be deleted before the rows in the parent table. The reason why Hibernate creates a temporary table is explained in this article. To summarize it, the underlying problem is that the query restricts the rows to be deleted on a column that only exists in the child table. But the rows in the child table have to be deleted before the corresponding rows in the parent table. Having deleted the rows in the child table, i.e. all geeks with FAV_PROG_LANG='Java', makes it impossible to delete afterwards all corresponding persons as the geek rows have already been deleted. The solution to this problem is the temporary table that first collects all row ids that should be deleted. Once all ids are known, this information can be used to delete the rows first from the geek table and then from the person table.

The generated SQL statements above are of course independent from the usage of the criteria API. Using the JPQL approach leads to the same generated SQL:

EntityTransaction transaction = null;
try {
	transaction = entityManager.getTransaction();
	transaction.begin();
	int update = entityManager.createQuery("delete from Geek g where g.favouriteProgrammingLanguage = :lang").setParameter("lang", "Java").executeUpdate();
	LOGGER.info("Deleted " + update + " rows.");
	transaction.commit();
} catch (Exception e) {
	if (transaction != null && transaction.isActive()) {
		transaction.rollback();
	}
}

When we change the inheritance strategy from JOINED to SINGLE_TABLE, the generated SQL statements also changes to a single one (here the discriminator column is DTYPE):

delete 
from
	T_PERSON 
where
	DTYPE='Geek' 
	and FAV_PROG_LANG=?

Conclusion: The new additions to the criteria API for deletion and update let you construct your SQL statements without the need of any string concatenation. But be aware that bulk deletions from an inheritance hierarchy can force the underlying ORM to use temporary tables in order to assemble the list of rows that have to be removed in advance.

Using @NamedEntityGraph to load JPA entities more selectively in N+1 scenarios

The N+1 problem is a common issue when working with ORM solutions. It happens when you set the fetchType for some @OneToMany relation to lazy, in order to load the child entities only when the Set/List is accessed. Let’s assume we have a Customer entity with two relations: a set of orders and a set of addresses for each customer.

@OneToMany(mappedBy = "customer", cascade = CascadeType.ALL, fetch = FetchType.LAZY)
private Set<OrderEntity> orders;

@OneToMany(mappedBy = "customer", cascade = CascadeType.ALL, fetch = FetchType.LAZY)
private Set<AddressEntity> addresses;

To load all customers, we can issue the following JPQL statement and afterwards load all orders for each customer:

List<CustomerEntity> resultList = entityManager.createQuery("SELECT c FROM CustomerEntity AS c", CustomerEntity.class).getResultList();
for(CustomerEntity customerEntity : resultList) {
    Set<OrderEntity> orders = customerEntity.getOrders();
    for(OrderEntity orderEntity : orders) {
	...
    }
}

Hibernate 4.3.5 (as shipped with JBoss AS Wildfly 8.1.0CR2) will generate the following series of SQL statements out of it for only two(!) customers in the database:

Hibernate: 
     select
         customeren0_.id as id1_1_,
         customeren0_.name as name2_1_,
         customeren0_.numberOfPurchases as numberOf3_1_ 
     from
         CustomerEntity customeren0_
Hibernate: 
     select
         orders0_.CUSTOMER_ID as CUSTOMER4_1_0_,
         orders0_.id as id1_2_0_,
         orders0_.id as id1_2_1_,
         orders0_.campaignId as campaign2_2_1_,
         orders0_.CUSTOMER_ID as CUSTOMER4_2_1_,
         orders0_.timestamp as timestam3_2_1_ 
     from
         OrderEntity orders0_ 
     where
         orders0_.CUSTOMER_ID=?
Hibernate: 
     select
         orders0_.CUSTOMER_ID as CUSTOMER4_1_0_,
         orders0_.id as id1_2_0_,
         orders0_.id as id1_2_1_,
         orders0_.campaignId as campaign2_2_1_,
         orders0_.CUSTOMER_ID as CUSTOMER4_2_1_,
         orders0_.timestamp as timestam3_2_1_ 
     from
         OrderEntity orders0_ 
     where
         orders0_.CUSTOMER_ID=?

As we can see, the first query selects all customers from the table CustomerEntity. The following two selects fetch then the orders for each customer we have loaded in the first query. When we have 100 customers instead of two, we will get 101 queries. One initial query to load all customers and then for each of the 100 customers an additional query for the orders. That is the reason why this problem is called N+1.

A common idiom to solve this problem is to force the ORM to generate an inner join query. In JPQL this can be done by using the JOIN FETCH clause like demonstrated in the following code snippet:

entityManager.createQuery("SELECT c FROM CustomerEntity AS c JOIN FETCH c.orders AS o", CustomerEntity.class).getResultList();

As expected the ORM now generates an inner join with the OrderEntity table and therewith only needs one SQL statement to load all data:

select
    customeren0_.id as id1_0_0_,
    orders1_.id as id1_1_1_,
    customeren0_.name as name2_0_0_,
    orders1_.campaignId as campaign2_1_1_,
    orders1_.CUSTOMER_ID as CUSTOMER4_1_1_,
    orders1_.timestamp as timestam3_1_1_,
    orders1_.CUSTOMER_ID as CUSTOMER4_0_0__,
    orders1_.id as id1_1_0__
from
    CustomerEntity customeren0_
inner join
    OrderEntity orders1_
        on customeren0_.id=orders1_.CUSTOMER_ID

In situations where you know that you will have to load all orders for each customer, the JOIN FETCH clause minimizes the number of SQL statements from N+1 to 1. This comes of course with the drawback that you now transfer for all orders of one customer the customer data again and again (due to the additional customer columns in the query).

The JPA specification introduces with version 2.1 so called NamedEntityGraphs. This annotation lets you describe the graph a JPQL query should load in more detail than a JOIN FETCH clause can do and therewith is another solution to the N+1 problem. The following example demonstrates a NamedEntityGraph for our customer entity that is supposed to load only the name of the customer and its orders. The orders are described in the subgraph ordersGraph in more detail. Here we see that we only want to load the fields id and campaignId of the order.

@NamedEntityGraph(
        name = "CustomersWithOrderId",
        attributeNodes = {
                @NamedAttributeNode(value = "name"),
                @NamedAttributeNode(value = "orders", subgraph = "ordersGraph")
        },
        subgraphs = {
                @NamedSubgraph(
                        name = "ordersGraph",
                        attributeNodes = {
                                @NamedAttributeNode(value = "id"),
                                @NamedAttributeNode(value = "campaignId")
                        }
                )
        }
)

The NamedEntityGraph is given as a hint to the JPQL query, after it has been loaded via EntityManager using its name:

EntityGraph entityGraph = entityManager.getEntityGraph("CustomersWithOrderId");
entityManager.createQuery("SELECT c FROM CustomerEntity AS c", CustomerEntity.class).setHint("javax.persistence.fetchgraph", entityGraph).getResultList();

Hibernate supports the @NamedEntityGraph annotation since version 4.3.0.CR1 and creates the following SQL statement for the JPQL query shown above:

Hibernate: 
    select
        customeren0_.id as id1_1_0_,
        orders1_.id as id1_2_1_,
        customeren0_.name as name2_1_0_,
        customeren0_.numberOfPurchases as numberOf3_1_0_,
        orders1_.campaignId as campaign2_2_1_,
        orders1_.CUSTOMER_ID as CUSTOMER4_2_1_,
        orders1_.timestamp as timestam3_2_1_,
        orders1_.CUSTOMER_ID as CUSTOMER4_1_0__,
        orders1_.id as id1_2_0__ 
    from
        CustomerEntity customeren0_ 
    left outer join
        OrderEntity orders1_ 
            on customeren0_.id=orders1_.CUSTOMER_ID

We see that Hibernate does not issue N+1 queries but that instead the @NamedEntityGraph annotation has forced Hibernate to load the orders per left outer join. This is of course a subtle difference to the FETCH JOIN clause, where Hibernate created an inner join. The left outer join would also load customers for which no order exists in contrast to the FETCH JOIN clause, where we would only load customers that have at least one order.

Interestingly is also that Hibernate loads more than the specified attributes for the tables CustomerEntity and OrderEntity. As this conflicts with the specification of @NamedEntityGraph (section 3.7.4) I have created an JIRA issue for that.

Conclusion: We have seen that with JPA 2.1 we have two solutions for the N+1 problem: We can either use the FETCH JOIN clause to eagerly fetch a @OneToMany relation, which results in an inner join, or we can use @NamedEntityGraph feature that lets us specify which @OneToMany relation to load via left outer join.

Tracing SQL statements in JBoss AS 7 using a custom logging handler

Using an ORM to abstract from your specific database and to let it create and issue all the SQL statements you would have to write by hand yourself seems handy. This is what made ORM solutions popular.

But it also comes with a downside: As the ORM does a lot of work for you, you lose to some degree control over the generated SQL and you have to rely on the ORM to create a high-performance statement for you. But it can happen that the SQL generated by the ORM is not what you might have written by hand and expected the ORM to do for you. In this case you have to get back control over the SQL and put your hands on the code again.

In huge applications this task is not as trivial, as there might be hundreds of statements issued to the database that stem from hundreds of lines of Java code that makes heavy usage of JPA features. Tracing the SQL statement that your database profiling tool has identified as problematic down to the actual code line becomes tedious.

We know that we can enable SQL statement logging for Hibernate with the following two lines in our persistence.xml:

<property name="hibernate.show_sql" value="true"/>
<property name="hibernate.format_sql" value="true"/>

But this will only output the already generated SQL; the actual Java code line is still not visible. For smaller applications it might be feasible to attach a debugger to the application server and debug through the code until you have found the line that logs the problematic SQL statement, but for bigger applications this is time consuming.

As Hibernate itself does not provide any means of intercepting the logging and enhance it with more information, we will have to do this on our own. The JBoss documentation indicates that it is possible to write your own custom logging handler. As this logging handler receives all the logging messages and therewith also the messages produces by Hibernate with enabled SQL logging, we can try to find the line we are looking for and then output a stack trace to our own log file.

Writing a custom logging handler turns out to be very simple. All you have to do is setup a small project with a class that extends the class Handler from the JDK package java.util.logging:

package mypackage;

import java.util.logging.Handler;
import java.util.logging.LogRecord;

public class MyJBossLogger extends Handler {

	@Override
	public void publish(LogRecord record) {
	
	}
	
	@Override
	public void flush() {
	}
	
	@Override
	public void close() throws SecurityException {

	}
}

The publish() method receives all logging output in form of an instance of LogRecord. Its method getMessage() lets us access the output directly. Hence we can match this message against some keywords we have loaded from some configuration file:

@Override
public void publish(LogRecord record) {
	String message = record.getMessage();
	buffer.add(message + "\n");
	if (keywords == null) {
		keywords = loadKeywords();
	}
	if (matches(message, keywords)) {
		String stacktrace = "\nStacktrace:\n";
		StackTraceElement[] stackTrace = Thread.currentThread().getStackTrace();
		for (StackTraceElement element : stackTrace) {
			stacktrace += element.toString() + "\n";
		}
		buffer.add(stacktrace);
		flush();
	}
}

Buffer is here some simple data structure (e.g. guava’s EvictingQueue) that buffers the last lines, as the method publish() is called for each line(!) of output. As a complete SQL statement spans more than one line, we have to remember a couple of them. Next to the buffered lines and the current line we also output a String representation of the current stack trace. This tells us later in the log file from where we are called and therewith which line of Java code in our project causes the current statement.

Once we have compiled the project we can copy the resulting jar file to the newly created folder structure under: $JBOSS_HOME/modules/system/layers/base/com/mydomain/mymodule/main (for JBoss AS 7.2). In order to tell JBoss AS about our new module, we have to create a XML file called module.xml with the following content:

<?xml version="1.0" encoding="UTF-8"?>
<module xmlns="urn:jboss:module:1.1" name="com.mydomain.mymodule">
	<resources>
		<resource-root path="MyJBossLogger-0.0.1-SNAPSHOT.jar"/>
	</resources>
</module>

The name of the module corresponds to the path within the JBoss modules folder. It will also be used in the configuration file to configure our custom logging handler:

...
<subsystem xmlns="urn:jboss:domain:logging:1.2">
	<custom-handler name="CUSTOM" module="com.mydomain.mymodule" class="com.mydomain.mymodule.MyJBossLogger">
		<level name="DEBUG"/>
	</custom-handler>
	...

When we implement the flush() method of our logging handler to write the output to some log file, we will see something like the following (of course in condensed form):

Hibernate:     select ... from customer ...
Stacktrace:
java.lang.Thread.getStackTrace(Thread.java:1568)
com.mydomain.mymodule.MyJBossLogger.publish(MyJBossLogger.java:20)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:292)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:300)
org.jboss.logmanager.Logger.logRaw(Logger.java:721)
org.jboss.logmanager.Logger.log(Logger.java:506)
...
com.mydomain.myapp.ArticleEntity.getCustomers(ArticleRepository.java:234)
...

Here we can see clearly which OneToMany relation causes the problematic select statement we were looking for.

Conclusion: Using a custom logging handler to inject the current stack trace into the logging of the SQL statements may help you when you want to find the
exact location in the source code where a concrete query is issued. It turned also out that writing your own custom logging handler for JBoss AS is also a straight forward task.

Using Java EE’s ManagedExecutorService to asynchronously execute transactions

One year has passed by since the Java EE 7 specification has been published. Now that Wildfly 8 Final has been released, it is time to take a closer look at the new features.

One thing which was missing since the beginning of the Java EE days is the ability to work with fully-fledged Java EE threads. Java EE 6 has already brought us the @Asynchronous annotation with which we could execute single methods in the background, but a real thread pool was still out of reach. But all this is now history since Java EE 7 introduced the ManagedExecutorService:

@Resource
ManagedExecutorService managedExecutorService;

Like the well-known ExecutorService from the Standard Edition, the ManagedExecutorService can be used to submit tasks that are executed within a thread pool. One can choose if the tasks submitted should implement the Runnable or Callable interface.

In contrast to a normal SE ExecutorService instance, the ManagedExecutorService provides threads that can access for example UserTransactions from JNDI in order to execute JPA transactions during their execution. This feature is a huge difference to threads started like in a SE environment.

It is important to know, that the transactions started within the ManagedExecutorService’s thread pool run outside of the scope of the transaction of the thread which submits the tasks. This makes it possible to implement scenarios in which the submitting thread inserts some information about the started tasks into the database while the long-running tasks execute their work within an independent transaction.

Now, after we have learned something about the theory, let’s put our hands on some code. First we write a @Stateless EJB that gets the ManagedExecutorService injected:

@Stateless
public class MyBean {
    @Resource
    ManagedExecutorService managedExecutorService;
    @PersistenceContext
    EntityManager entityManager;
    @Inject
    Instance<MyTask> myTaskInstance;

    public void executeAsync() throws ExecutionException, InterruptedException {
        for(int i=0; i<10; i++) {
            MyTask myTask = myTaskInstance.get();
            this.managedExecutorService.submit(myTask);
        }
    }

    public List<MyEntity> list() {
        return entityManager.createQuery("select m from MyEntity m", MyEntity.class).getResultList();
    }
}

The tasks that we will submit to the ManagedExecutorService are retrieved from CDI’s Instance mechanism. This lets us use the power of CDI within our MyTask class:

public class MyTask implements Runnable {
    private static final Logger LOGGER = LoggerFactory.getLogger(MyTask.class);
    @PersistenceContext
    EntityManager entityManager;

    @Override
    public void run() {
        UserTransaction userTransaction = null;
        try {
            userTransaction = lookup();
            userTransaction.begin();
            MyEntity myEntity = new MyEntity();
            myEntity.setName("name");
            entityManager.persist(myEntity);
            userTransaction.commit();
        } catch (Exception e) {
            try {
                if(userTransaction != null) {
                    userTransaction.rollback();
                }
            } catch (SystemException e1) {
                LOGGER.error("Failed to rollback transaction: "+e1.getMessage());
            }
        }
    }

    private UserTransaction lookup() throws NamingException {
        InitialContext ic = new InitialContext();
        return (UserTransaction)ic.lookup("java:comp/UserTransaction");
    }
}

Here we can inject the EntityManager to persist some entities into our database. The UserTransaction that we need for the commit has to be retrieved from the JNDI. An injection using the @Resource annotation is not possible within a normal managed bean.

To circumvent the UserTransaction we could of course call the method of another EJB and use the other EJB’s transaction to commit the changes to the database. The following code shows an alternative implementation using the injected EJB to persist the entity:

public class MyTask implements Runnable {
    private static final Logger LOGGER = LoggerFactory.getLogger(MyTask.class);
    @PersistenceContext
    EntityManager entityManager;
    @Inject
    MyBean myBean;

    @Override
    public void run() {
		MyEntity myEntity = new MyEntity();
        myBean.persit(myEntity);
    }
}

Now we only need to utilize JAX-RS to call the functionality over a REST interface:

@Path("/myResource")
public class MyResource {
    @Inject
    private MyBean myBean;

    @Path("list")
    @GET
    @Produces("text/json")
    public List<MyEntity> list() {
        return myBean.list();
    }

    @Path("persist")
    @GET
    @Produces("text/html")
    public String persist() throws ExecutionException, InterruptedException {
        myBean.executeAsync();
        return "<html><h1>Successful!</h1></html>";
    }
}

That’s it. With these few lines of code we have implemented a fully working Java EE application whose functionality can be called over a REST interface and that executes its core functionality asynchronously within worker threads with their own transactions.

Conclusion: The ManagedExecutorService is a great feature to integrate asynchronous functionality using all the standard Java EE features like JPA and transactions into enterprise applications. I would say the waiting was worthwhile.

Example source code can be found on github.

Efficiently delete data with JPA and Hibernate

You may come to the situation where you have to perform a bulk deletion on a huge amount of datasets stored in a relational database. If you use JPA with Hibernate as underlying OR mapper, you might try to call the remove() method of the EntityManager in a way like the following:

public void removeById(long id) {
    RootEntity rootEntity = entityManager.getReference(RootEntity.class, id);
    entityManager.remove(rootEntity);
}

First of all, we load a reference representation of the entity we want to delete and then pass this reference to the EntityManager. Let’s assume the RootEntity from above has a child relation to a class called ChildEntity:

@OneToMany(mappedBy = "rootEntity", fetch = FetchType.EAGER, cascade = CascadeType.ALL)
private Set childEntities = new HashSet(0);

If we now turn on the property show_sql of hibernate, we will wonder what SQL statements are issued:

    select
        rootentity0_.id as id5_1_,
        rootentity0_.field1 as field2_5_1_,
        rootentity0_.field2 as field3_5_1_,
        childentit1_.PARENT as PARENT5_3_,
        childentit1_.id as id3_,
        childentit1_.id as id4_0_,
        childentit1_.field1 as field2_4_0_,
        childentit1_.field2 as field3_4_0_,
        childentit1_.PARENT as PARENT4_0_
    from
        ROOT_ENTITY rootentity0_
    left outer join
        CHILD_ENTITY childentit1_
            on rootentity0_.id=childentit1_.PARENT
    where
        rootentity0_.id=?

    delete
    from
        CHILD_ENTITY
    where
        id=?

   delete
   from
       ROOT_ENTITY
   where
       id=?

Why does Hibernate first load all data into memory in order to delete this data immediately afterwards? The reason is that JPA’s lifecycle requires that the object is in “managed” state, before it can be deleted. Only in this state all lifecycle functionality like interceptors is available (see here). Therefore Hibernate issues a SELECT query before the deletion in order to transfer both RootEntity and ChildEntity to the “managed” state.
But what can we do, if we just want to delete RootEntity and ChildEntity, if we know the id of RootEntity? The answer is to use a simple DELETE query like the following one. But due to the integrity constraint on the child table, we first have to delete all depending child entities. The following code demonstrates how:

List childIds = entityManager.createQuery("select c.id from ChildEntity c where c.rootEntity.id = :pid").setParameter("pid", id).getResultList();
for(Long childId : childIds) {
    entityManager.createQuery("delete from ChildEntity c where c.id = :id").setParameter("id", childId).executeUpdate();
}
entityManager.createQuery("delete from RootEntity r where r.id = :id").setParameter("id", id).executeUpdate();

The above code results in the three SQL statements we would have expected by calling remove(). Now you may argue, that this way of deletion is more complicated than just calling the EntityManager’s remove() method. It also ignores annotations like @OneToMany and @ManyToOne we have placed in the two entity classes.
So why not write some code that uses the knowledge about the two entities that already exists in the two class files? First of all, we look for @OneToMany annotations using reflection in the RootEntity class, extract the type of the child entity and then look for its back relation field annotated with @ManyToOne. Having done this, we can easily write the three SQL statements in a more generic way:

public void delete(EntityManager entityManager, Class parentClass, Object parentId) {
    Field idField = getIdField(parentClass);
    if (idField != null) {
        List oneToManyFields = getOneToManyFields(parentClass);
        for (Field field : oneToManyFields) {
            Class childClass = getFirstActualTypeArgument(field);
            if (childClass != null) {
                Field manyToOneField = getManyToOneField(childClass, parentClass);
                Field childClassIdField = getIdField(childClass);
                if (manyToOneField != null && childClassIdField != null) {
                    List childIds = entityManager.createQuery(String.format("select c.%s from %s c where c.%s.%s = :pid", childClassIdField.getName(), childClass.getSimpleName(), manyToOneField.getName(), idField.getName())).setParameter("pid", parentId).getResultList();
                    for (Long childId : childIds) {
                        entityManager.createQuery(String.format("delete from %s c where c.%s = :id", childClass.getSimpleName(), childClassIdField.getName())).setParameter("id", childId).executeUpdate();
                    }
                }
            }
        }
        entityManager.createQuery(String.format("delete from %s e where e.%s = :id", parentClass.getSimpleName(), idField.getName())).setParameter("id", parentId).executeUpdate();
    }
}

The methods getFirstActualTypeArgument(), getManyToOneField(), getIdField() and getOneToManyFields() in the code above are not depicted here, but do what their name sounds like. Once implemented we can easily delete all entities beginning with the root of the tree.

A simple example application that can be used to examine the behavior and solution described above, can be found on github.