Archive | October 2013

Efficiently delete data with JPA and Hibernate

You may find yourself in a situation where you have to perform a bulk deletion of a huge number of records stored in a relational database. If you use JPA with Hibernate as the underlying OR mapper, you might be tempted to call the remove() method of the EntityManager like this:

public void removeById(long id) {
    RootEntity rootEntity = entityManager.getReference(RootEntity.class, id);
    entityManager.remove(rootEntity);
}

First of all we load a reference to the entity we want to delete and then pass this reference to the EntityManager. Let’s assume that the RootEntity from above has a child relation to a class called ChildEntity:

@OneToMany(mappedBy = "rootEntity", fetch = FetchType.EAGER, cascade = CascadeType.ALL)
private Set<ChildEntity> childEntities = new HashSet<ChildEntity>(0);
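
The owning side of this relation is the ChildEntity. It might look roughly like the following sketch (only the id and the back-reference are taken from the mapping and the generated SQL below; everything else is assumed):

@Entity
@Table(name = "CHILD_ENTITY")
public class ChildEntity {

    @Id
    @GeneratedValue
    private Long id;

    // back-reference to the parent; maps the PARENT foreign key column
    @ManyToOne
    @JoinColumn(name = "PARENT")
    private RootEntity rootEntity;

    // further fields, getters and setters omitted
}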

If we now turn on Hibernate’s show_sql property, we will be surprised by the SQL statements that are issued:

    select
        rootentity0_.id as id5_1_,
        rootentity0_.field1 as field2_5_1_,
        rootentity0_.field2 as field3_5_1_,
        childentit1_.PARENT as PARENT5_3_,
        childentit1_.id as id3_,
        childentit1_.id as id4_0_,
        childentit1_.field1 as field2_4_0_,
        childentit1_.field2 as field3_4_0_,
        childentit1_.PARENT as PARENT4_0_
    from
        ROOT_ENTITY rootentity0_
    left outer join
        CHILD_ENTITY childentit1_
            on rootentity0_.id=childentit1_.PARENT
    where
        rootentity0_.id=?

    delete
    from
        CHILD_ENTITY
    where
        id=?

    delete
    from
        ROOT_ENTITY
    where
        id=?

Why does Hibernate first load all the data into memory only to delete it immediately afterwards? The reason is that the JPA lifecycle requires an object to be in the “managed” state before it can be removed, because only in this state is all lifecycle functionality, such as interceptors, available. Therefore Hibernate issues a SELECT query before the deletion in order to transfer both the RootEntity and its ChildEntity instances to the “managed” state.
But what can we do if we just want to delete a RootEntity and its ChildEntity instances and we already know the id of the RootEntity? The answer is to use a simple JPQL DELETE query like the following one. Due to the foreign key constraint on the child table, however, we first have to delete all dependent child entities. The following code demonstrates how:

List<Long> childIds = entityManager.createQuery("select c.id from ChildEntity c where c.rootEntity.id = :pid")
    .setParameter("pid", id).getResultList();
for (Long childId : childIds) {
    entityManager.createQuery("delete from ChildEntity c where c.id = :id")
        .setParameter("id", childId).executeUpdate();
}
entityManager.createQuery("delete from RootEntity r where r.id = :id")
    .setParameter("id", id).executeUpdate();

The above code results in the three SQL statements we would have expected from calling remove(). Now you may argue that this way of deleting is more complicated than just calling the EntityManager’s remove() method. It also ignores the annotations like @OneToMany and @ManyToOne that we have already placed in the two entity classes.
So why not write some code that uses the knowledge about the two entities that is already present in the two class files? First we use reflection to look for @OneToMany annotations in the RootEntity class, extract the type of the child entity and then look for the back-reference field annotated with @ManyToOne. Having done this, we can easily issue the three queries in a more generic way:

public void delete(EntityManager entityManager, Class<?> parentClass, Object parentId) {
    Field idField = getIdField(parentClass);
    if (idField != null) {
        List<Field> oneToManyFields = getOneToManyFields(parentClass);
        for (Field field : oneToManyFields) {
            Class<?> childClass = getFirstActualTypeArgument(field);
            if (childClass != null) {
                Field manyToOneField = getManyToOneField(childClass, parentClass);
                Field childClassIdField = getIdField(childClass);
                if (manyToOneField != null && childClassIdField != null) {
                    List<Long> childIds = entityManager.createQuery(String.format(
                            "select c.%s from %s c where c.%s.%s = :pid", childClassIdField.getName(),
                            childClass.getSimpleName(), manyToOneField.getName(), idField.getName()))
                            .setParameter("pid", parentId).getResultList();
                    for (Long childId : childIds) {
                        entityManager.createQuery(String.format("delete from %s c where c.%s = :id",
                                childClass.getSimpleName(), childClassIdField.getName()))
                                .setParameter("id", childId).executeUpdate();
                    }
                }
            }
        }
        entityManager.createQuery(String.format("delete from %s e where e.%s = :id",
                parentClass.getSimpleName(), idField.getName()))
                .setParameter("id", parentId).executeUpdate();
    }
}

The methods getFirstActualTypeArgument(), getManyToOneField(), getIdField() and getOneToManyFields() used in the code above do what their names suggest; a possible implementation is sketched below. Once they are in place, we can easily delete all entities, beginning with the root of the tree.
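
These helpers could be implemented along the following lines. This is a sketch under the assumption that the id fields carry @Id and that the relations are declared on fields rather than on getters; the javax.persistence and java.lang.reflect imports are omitted, as in the listings above:

private Field getIdField(Class<?> clazz) {
    for (Field field : clazz.getDeclaredFields()) {
        if (field.isAnnotationPresent(Id.class)) {
            return field;
        }
    }
    return null;
}

private List<Field> getOneToManyFields(Class<?> clazz) {
    List<Field> fields = new ArrayList<Field>();
    for (Field field : clazz.getDeclaredFields()) {
        if (field.isAnnotationPresent(OneToMany.class)) {
            fields.add(field);
        }
    }
    return fields;
}

private Class<?> getFirstActualTypeArgument(Field field) {
    // extracts e.g. ChildEntity from a field declared as Set<ChildEntity>
    Type genericType = field.getGenericType();
    if (genericType instanceof ParameterizedType) {
        Type[] typeArguments = ((ParameterizedType) genericType).getActualTypeArguments();
        if (typeArguments.length > 0 && typeArguments[0] instanceof Class) {
            return (Class<?>) typeArguments[0];
        }
    }
    return null;
}

private Field getManyToOneField(Class<?> childClass, Class<?> parentClass) {
    // finds the back-reference in the child class that points to the parent class
    for (Field field : childClass.getDeclaredFields()) {
        if (field.isAnnotationPresent(ManyToOne.class) && field.getType().equals(parentClass)) {
            return field;
        }
    }
    return null;
}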

A simple example application that can be used to examine the behavior and the solution described above can be found on GitHub.


Passing SFSBs as an argument to a custom JSF component?

Have you ever developed a custom JSF component that renders an img tag? If so, you will know that such a component consists of at least two parts: the first part is the UIComponent implementation that actually renders the img tag, the second part is the servlet filter or phase listener that answers the subsequent request for the image URL. But does it work to use a stateful session bean (SFSB) to load the image data from the database?
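
For the following code we assume a simple SFSB with a local business interface that loads the image from the database. A minimal sketch (the bean class name and its internals are assumptions; only getImage() is used later on):

@Local
public interface StatefulEjbLocal {
    byte[] getImage();
}

@Stateful
public class StatefulEjb implements StatefulEjbLocal {

    @PersistenceContext
    private EntityManager entityManager;

    @Override
    public byte[] getImage() {
        // placeholder: load the image bytes from the database, e.g. from an entity holding a BLOB
        return new byte[0];
    }
}
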
First of all we start with the UIComponent. As base class we choose UIOutput, which is sufficient for the current case:

@FacesComponent("EjbComponent")
public class EjbComponent extends UIOutput {
    private static final Logger logger = LoggerFactory.getLogger(EjbComponent.class);
    private static final String COMPONENT_FAMILY = "martins.developer.world.jsf.component.helloWorld";

    private enum PropertyKeys {
        statefulEjb
    };

    @Override
    public String getFamily() {
        return COMPONENT_FAMILY;
    }
    ...
}

We override the encodeBegin() method to render the img tag. We get the stateful session bean, which is passed as an argument to the component, using JSF’s StateHelper:

    @Override
    public void encodeBegin(FacesContext facesContext) throws IOException {
        ResponseWriter writer = facesContext.getResponseWriter();
        String imageSrc = createImageUrl(facesContext);
        writer.startElement("img", this);
        writer.writeAttribute("src", imageSrc, "");
        writer.endElement("img");
        Map<String,Object> sessionMap = FacesContext.getCurrentInstance().getExternalContext().getSessionMap();
        sessionMap.put("ejbComponent", getStatefulEjb());
    }

    public StatefulEjbLocal getStatefulEjb() {
        return (StatefulEjbLocal) getStateHelper().eval(PropertyKeys.statefulEjb);
    }

    public void setStatefulEjb(StatefulEjbLocal statefulEjb) {
        getStateHelper().put(PropertyKeys.statefulEjb, statefulEjb);
    }

The createImageUrl() method adds a parameter to the image URL so that we can detect this specific request later on in the PhaseListener:

    private String createImageUrl(FacesContext context) {
        StringBuilder builder = new StringBuilder(context.getExternalContext().getRequestContextPath());
        if (builder.indexOf("?") == -1) {
            builder.append('?');
        } else {
            builder.append('&');
        }
        builder.append("ejbComponent").append("=").append("ejbComponent");
        return builder.toString();
    }

The task of the PhaseListener is to detect the incoming request issued by the img tag that was rendered by the component above:

public class EjbComponentPhaseListener implements PhaseListener {
    private static final Logger logger = LoggerFactory.getLogger(EjbComponentPhaseListener.class);

    @Override
    public void beforePhase(PhaseEvent phaseEvent) {
        FacesContext facesContext = phaseEvent.getFacesContext();
        ExternalContext externalContext = facesContext.getExternalContext();
        String ejbComponentParameter = externalContext.getRequestParameterMap().get("ejbComponent");
        if (ejbComponentParameter != null) {
            Map<String, Object> sessionMap = externalContext.getSessionMap();
            StatefulEjbLocal ejbComponent = (StatefulEjbLocal) sessionMap.get("ejbComponent");
            if (ejbComponent != null) {
                byte[] image = ejbComponent.getImage();
                HttpServletResponse response = (HttpServletResponse) externalContext.getResponse();
                ServletOutputStream outputStream = null;
                try {
                    outputStream = response.getOutputStream();
                    IOUtils.copy(new ByteArrayInputStream(image), outputStream);
                    outputStream.flush();
                } catch (Exception e) {
                    logger.error(e.getMessage(), e);
                } finally {
                    IOUtils.closeQuietly(outputStream);
                    facesContext.responseComplete();
                }
            } else {
                logger.debug("Could not retrieve ejbComponent from session.");
            }
        } else {
            logger.debug("Request parameter not found");
        }
    }

    @Override
    public void afterPhase(PhaseEvent phaseEvent) {
        logger.debug("afterPhase()");
    }

    @Override
    public PhaseId getPhaseId() {
        return PhaseId.RENDER_RESPONSE;
    }
}

As you can see, we check whether the currently handled request has a parameter with the name "ejbComponent". If this is the case, we look up the SFSB from the session map, into which we put it before (see the code of the JSF component). With this reference to the SFSB we can load the image data and pass it to the browser.
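
Keep in mind that the PhaseListener also has to be registered so that JSF invokes it, for example with an entry in faces-config.xml like the following sketch (the package name is an assumption based on the component family used above):

<lifecycle>
	<phase-listener>martins.developer.world.jsf.component.EjbComponentPhaseListener</phase-listener>
</lifecycle>
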
In our web application we now create a simple backing bean that holds a reference to the SFSB, which later on is passed to the component on the JSF page:

@Named
public class HelloWorldBackingBean {
    @EJB
    private StatefulEjbLocal statefulEjb;
    ...

Finally we pass the SFSB from the backing bean to the component on our JSF page:

<mdw:ejbComponent statefulEjb="#{helloWorldBackingBean.statefulEjb}"/>

If we compile and deploy the application, the JSF page renders the image as expected. But there is one caveat: if the same SFSB is used to render more than one image within the same web session, concurrent access to the SFSB is serialized, as the EJB 3.1 specification demands in section 4.3.14. This means that each invocation blocks all other threads until a timeout occurs. This timeout can be configured with the @AccessTimeout annotation; the default in the JBoss AS 7.x container is 5 seconds. If a waiting thread has to wait too long, a ConcurrentAccessException is thrown. This may lead to sporadic failures that only happen when the system is under load.
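
If the serialized access itself cannot be avoided, the timeout can at least be raised on the bean, for example like this (a sketch; the concrete value is of course application specific):

@Stateful
@AccessTimeout(value = 30, unit = TimeUnit.SECONDS) // raise the lock timeout above the 5 second default
public class StatefulEjb implements StatefulEjbLocal {
    // bean implementation as before
}
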
Sample code can be found in my github repository.

Setting up your application server with maven

In many cases there is no way to deploy an application without having to set up the application server first. In JBoss AS 7.x you may, for example, want to configure your database connection, set up a security realm, or adjust the SLSB pool… In any of these cases all developers in the team have to share a common, or at least a similar, configuration.
Often this information can only be found in sporadically sent emails or on some wiki page. But what happens some time after a release, when you have to check out a branch to fix a bug or to add a new feature? You will have to reconstruct the configuration that was valid for that branch. So why not add the configuration files to your version control system, and along with them a Maven configuration that sets up the whole application server?
Let’s try to keep it simple and use only publicly available and commonly used plugins. First of all we add all versions we will need later on to the properties section of the pom.xml:

	<properties>
		<jboss.install.dir>${project.build.directory}/jboss</jboss.install.dir>
		<jboss.version>7.2.0.Final</jboss.version>
		<app.version>${project.version}</app.version>
		<ojdbc.version>11.2.0.1.0</ojdbc.version>
	</properties>

Here we also define the installation directory of the JBoss AS. This way we can change it, if we want to, using the command line option -D. Now we add a new profile, so that we have to switch the setup procedure on explicitly and it is not part of the normal build:

<profile>
	<id>setupAs</id>
	<build>
		<plugins>
			...
		</plugins>
	</build>
</profile>

If we have the current JBoss version deployed as a Maven artifact in our Maven repository, we can use the maven-dependency-plugin to download and unpack the JBoss distribution to the installation directory given above:

<plugin>
	<groupId>org.apache.maven.plugins</groupId>
	<artifactId>maven-dependency-plugin</artifactId>
	<version>2.8</version>
	<executions>
		<execution>
			<id>unpack-jboss</id>
			<phase>package</phase>
			<goals>
				<goal>unpack</goal>
			</goals>
			<configuration>
				<artifactItems>
					<artifactItem>
						<groupId>org.jboss</groupId>
						<artifactId>jboss-as</artifactId>
						<version>${jboss.version}</version>
						<type>zip</type>
						<outputDirectory>${project.build.directory}/jboss</outputDirectory>
					</artifactItem>
				</artifactItems>
			</configuration>
		</execution>

Now that the application server is unpacked, we have to add the JDBC driver as well as our application (or anything else you may need). We do this by adding another execution block to the maven-dependency-plugin:

<execution>
	<id>copy</id>
	<phase>package</phase>
	<goals>
		<goal>copy</goal>
	</goals>
	<configuration>
		<artifactItems>
			<artifactItem>
				<groupId>our-company</groupId>
				<artifactId>our-application-ear</artifactId>
				<version>${app.version}</version>
				<type>ear</type>
				<outputDirectory>${jboss.install.dir}/jboss-as-${jboss.version}/standalone/deployments</outputDirectory>
			</artifactItem>
			<artifactItem>
				<groupId>com.oracle</groupId>
				<artifactId>ojdbc6</artifactId>
				<version>${ojdbc.version}</version>
				<outputDirectory>${jboss.install.dir}/jboss-as-${jboss.version}/standalone/deployments</outputDirectory>
				<destFileName>ojdbc6.jar</destFileName>
			</artifactItem>
		</artifactItems>
	</configuration>
</execution>

Last but not least, we want to adjust the standard configuration files to our needs. We can use the maven-resources-plugin to substitute variable values within each file. To do so, we add templates for these files to the resources folder of our JBoss module and call the copy-resources goal:

<plugin>
	<artifactId>maven-resources-plugin</artifactId>
	<version>2.6</version>
	<executions>
		<execution>
			<id>copy-jboss-configuration</id>
			<phase>package</phase>
			<goals>
				<goal>copy-resources</goal>
			</goals>
			<configuration>
				<outputDirectory>${jboss.install.dir}/jboss-as-${jboss.version}/standalone/configuration</outputDirectory>
				<resources>
					<resource>
						<directory>src/main/resources/jboss/standalone/configuration</directory>
						<filtering>true</filtering>
					</resource>
				</resources>
			</configuration>
		</execution>
		<execution>
			<id>copy-jboss-bin</id>
			<phase>package</phase>
			<goals>
				<goal>copy-resources</goal>
			</goals>
			<configuration>
				<outputDirectory>${jboss.install.dir}/jboss-as-${jboss.version}/bin</outputDirectory>
				<resources>
					<resource>
						<directory>src/main/resources/jboss/bin</directory>
						<filtering>true</filtering>
					</resource>
				</resources>
			</configuration>
		</execution>
	</executions>
</plugin>

The values for the filtering can be given on the command line with the -D option. If the team has more than a few members, it is also possible to create a properties file for each user that contains his or her specific configuration values. If we use the OS user name as the file name, we can choose the file based on the currently logged-in user (a possible snippet for this is sketched after the command below). This way each team member can set up his or her own completely configured application server instance by simply running:

mvn clean install -PsetupAs
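
One way to pick up such a per-user properties file is Maven’s filter mechanism, which feeds the values into the resource filtering shown above. A sketch (the directory src/main/filters and the file naming are assumptions):

<build>
	<filters>
		<filter>src/main/filters/${user.name}.properties</filter>
	</filters>
	...
</build>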

To prevent the newly set-up server from being deleted with the next clean invocation, we disable the maven-clean-plugin for the normal build:

<plugin>
	<artifactId>maven-clean-plugin</artifactId>
	<version>2.5</version>
	<configuration>
		<skip>true</skip>
	</configuration>
</plugin>

Within the setupAs profile created above we of course have to enable it again (see the sketch below), so that we can delete the whole installation just by calling “mvn clean -PsetupAs”. Now switching to an older branch is easy, as we no longer lose any time searching for the right configuration…
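
Inside the plugins section of the setupAs profile, the corresponding override might look like this sketch, mirroring the configuration above:

<plugin>
	<artifactId>maven-clean-plugin</artifactId>
	<version>2.5</version>
	<configuration>
		<skip>false</skip>
	</configuration>
</plugin>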