
Using JINQ with JPA and H2

A few days ago I read an interesting interview with Ming-Yee Iu about JINQ. JINQ is, as the name already suggests, an attempt to provide something similar to LINQ for Java. The basic idea is to close the semantic gap between object-oriented code and the relational data model it queries. Queries against the relational database should integrate into the code so easily that writing them feels natural.

The research behind LINQ came to the conclusion that the algorithms transforming the code into relational database queries work best with functional code. As Java 8 comes with the streams API, the author uses it to implement the ideas of his PhD in Java.

To get our hands dirty, we start with a simple project that uses Hibernate as the JPA provider together with an H2 database and JINQ:

<dependencies>
	<dependency>
		<groupId>javax</groupId>
		<artifactId>javaee-api</artifactId>
		<version>${jee.version}</version>
		<scope>provided</scope>
	</dependency>
	<dependency>
		<groupId>com.h2database</groupId>
		<artifactId>h2</artifactId>
		<version>${h2.version}</version>
	</dependency>
	<dependency>
		<groupId>org.hibernate</groupId>
		<artifactId>hibernate-entitymanager</artifactId>
		<version>${hibernate.version}</version>
	</dependency>
	<dependency>
		<groupId>org.jinq</groupId>
		<artifactId>jinq-jpa</artifactId>
		<version>1.8.10</version>
	</dependency>
</dependencies>

In order to use JINQ streams we have to create a provider that gets the EntityManagerFactory as argument:

EntityManagerFactory factory = Persistence.createEntityManagerFactory("PersistenceUnit");
JinqJPAStreamProvider streams = new JinqJPAStreamProvider(factory);

Having inserted some persons into our database, we can easily query them:

List<String> firstNames = streams.streamAll(entityManager, Person.class)
		.map(Person::getFirstName)
		.collect(toList());
firstNames.forEach(System.out::println);

Using the method streamAll() of the previously created JinqJPAStreamProvider gives us access to all persons within the database. In this simple example we only want to output the first name of each person; hence we map the stream of persons to their first names and collect the results into a List. This list gets printed using the forEach() method and a reference to the println() method.

Taking a look at the generated SQL code, we see that all columns are selected:

select
	person0_.id as id1_4_,
	person0_.FIRST_NAME as FIRST_NA2_4_,
	person0_.ID_CARD_ID as ID_CARD_4_4_,
	person0_.LAST_NAME as LAST_NAM3_4_
from
	T_PERSON person0_ 

Of course we can refine the statement using the select() method:

List<String> firstNames = streams.streamAll(entityManager, Person.class)
		.select(Person::getFirstName)
		.where(p -> p.equals("Homer"))
		.collect(toList());
firstNames.forEach(System.out::println);

In addition, we have added a predicate (where firstName = 'Homer'), so the generated SQL now looks like this:

    select
        person0_.FIRST_NAME as FIRST_NA2_4_
    from
        T_PERSON person0_ 
    where
        person0_.FIRST_NAME='Homer'
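For comparison, here is a minimal sketch of the same projection and predicate written against an in-memory list with plain java.util.stream. It does not use JINQ or a database; the Person class and the method name firstNamesEqualTo are hypothetical stand-ins. It mainly illustrates why the lambda in where() can call equals() directly: after the projection, each element of the stream is already the first name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class InMemoryQuery {
    // Hypothetical stand-in for the article's Person entity.
    static class Person {
        private final String firstName;
        private final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        String getFirstName() { return firstName; }
        String getLastName() { return lastName; }
    }

    // In-memory analogue of select(Person::getFirstName).where(p -> p.equals(name)).
    static List<String> firstNamesEqualTo(List<Person> persons, String name) {
        return persons.stream()
                .map(Person::getFirstName)      // projection, comparable to select()
                .filter(n -> n.equals(name))    // predicate on the projected String, comparable to where()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Homer", "Simpson"),
                new Person("Marge", "Simpson"));
        firstNamesEqualTo(persons, "Homer").forEach(System.out::println);
    }
}
```

The difference to JINQ is that this version filters objects that are already in memory, whereas JINQ turns the lambdas into the SQL shown above and lets the database do the work.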

Leaving this simple example, we now want to create a query that selects all geeks with first name “Christian” that work in a time and material project:

List<String> geeks = streams.streamAll(entityManager, Project.class)
		.where(p -> p.getProjectType() == Project.ProjectType.TIME_AND_MATERIAL)
		.joinList(Project::getGeeks)
		.where(g -> g.getTwo().getFirstName().equals("Christian"))
		.map(p -> p.getTwo().getFirstName())
		.collect(toList());
geeks.forEach(System.out::println);

As can be seen from the code above, we use the first where() clause to select all time and material projects. The joinList() invocation joins the geek table, while the subsequent where() clause restricts the result to geeks with first name “Christian”. Et voilà, this is the generated SQL query:

select
	geek2_.FIRST_NAME as col_0_0_ 
from
	T_PROJECT project0_ 
inner join
	T_GEEK_PROJECT geeks1_ 
		on project0_.id=geeks1_.PROJECT_ID 
inner join
	T_GEEK geek2_ 
		on geeks1_.GEEK_ID=geek2_.id 
where
	project0_.projectType='TIME_AND_MATERIAL' 
	and geek2_.FIRST_NAME='Christian' limit ?
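To make the role of getTwo() more tangible, the following sketch rebuilds the join in memory with plain java.util.stream: every project is paired with each of its geeks, and the later steps work on the second element of that pair. This is only an analogy, not JINQ code; the classes Project and Geek, the enum value FIXED_PRICE and the method geeksNamed are hypothetical, and SimpleEntry merely stands in for the pair type that JINQ hands to the lambdas.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.stream.Collectors;

class InMemoryJoin {
    // FIXED_PRICE is an assumed second value, only here to have something to filter out.
    enum ProjectType { FIXED_PRICE, TIME_AND_MATERIAL }

    // Hypothetical stand-ins for the article's entities.
    static class Geek {
        final String firstName;
        Geek(String firstName) { this.firstName = firstName; }
    }

    static class Project {
        final ProjectType type;
        final List<Geek> geeks;
        Project(ProjectType type, List<Geek> geeks) {
            this.type = type;
            this.geeks = geeks;
        }
    }

    static List<String> geeksNamed(List<Project> projects, String firstName) {
        return projects.stream()
                // first where(): keep only time and material projects
                .filter(p -> p.type == ProjectType.TIME_AND_MATERIAL)
                // joinList(): one (project, geek) pair per geek of each project
                .flatMap(p -> p.geeks.stream()
                        .map(g -> new SimpleEntry<Project, Geek>(p, g)))
                // second where(): getValue() plays the role of getTwo()
                .filter(pair -> pair.getValue().firstName.equals(firstName))
                .map(pair -> pair.getValue().firstName)
                .collect(Collectors.toList());
    }
}
```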

Conclusion: Having worked with JPA’s criteria API some time ago, I must say that the first steps with JINQ are more intuitive and were easier to write down. JINQ really helps to close the gap between the relational database world and Java code by using streams in Java 8.


Verify HTML documents in junit tests with jsoup

Assume that you are developing an application that creates some kind of fancy HTML report for its users. When it comes down to writing your unit tests, you have two choices:

  • You test the generated report against a complete report prepared beforehand.
  • You parse the HTML document and test parts of it separately.

The first choice seems simple at first glance, because you have manually validated that the prepared report is correct. Writing this kind of test is also easy, as it boils down to the following pattern:

String preparedReport = loadReportFromSomeWhere();
assertThat(generatedReport, is(preparedReport));

But what happens when you change a small part of the report-generating code? You will probably have to change some or even all of the prepared reports. Hence the second choice is the better one in these cases, as you only have to adjust the test cases that are affected (and that you would have to change anyhow).

Here is the part where jsoup comes in handy. jsoup is a Java library developed for parsing HTML documents, but in contrast to other options for parsing XML-like structures it supports CSS selectors like those used in JavaScript libraries such as jQuery. This way you don’t have to write tons of code in order to verify exactly the part of the report that your current unit test is concerned with.
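To give an idea of what one of those other options looks like, here is a small sketch that reads the text of a single element with the XML parser and XPath support that ship with the JDK. It is only meant as a contrast and rests on an assumption that real reports often violate: the input must be well-formed XHTML. The class name XPathCheck and the method textOf are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathCheck {
    // Returns the text content of the first node selected by the XPath expression.
    // Works only for well-formed XHTML; ordinary HTML with unclosed tags fails to parse.
    static String textOf(String xhtml, String xpathExpression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            return XPathFactory.newInstance().newXPath().evaluate(xpathExpression, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + xpathExpression, e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><body><h1 id=\"title\">Testing HTML documents with jsoup</h1></body></html>";
        System.out.println(textOf(xhtml, "//h1[@id='title']"));
    }
}
```

The XPath expression //h1[@id='title'] addresses the same element that a CSS selector describes as h1#title.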

To demonstrate how jsoup can be used, we assume that our application has a simple HtmlReport class that can be used to create a valid HTML document using the builder pattern (https://en.wikipedia.org/wiki/Builder_pattern):

String html = HtmlReport.create()
	.addHeader1("title", "Testing HTML documents with jsoup")
	.addSection("intro", "This section explains what the text is all about.")
	.addHeader2("jsoup", "jsoup in a nutshell")
	.addSection("pjsoup", "This section explains jsoup in detail.")
	.addList("jsoup_adv", Arrays.asList("find data using CSS selectors", "manipulate HTML elements"))
	.build();

To keep it simple, the report just consists of a header element (h1) followed by a paragraph (p), and a second header (h2) followed by a paragraph (p) and an HTML list (ul). The first argument to each method is the id of the HTML element. This way we can use it later on to address exactly the element we want, and the ids also help with the formatting of the elements (the CSS designer will love us).
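The HtmlReport class itself is not shown in this article. The following is a minimal sketch of how such a builder could look, written only from the method names used above; the real class may differ, and details such as the surrounding html and body elements or the missing HTML escaping are assumptions made for brevity.

```java
import java.util.List;

class HtmlReport {
    private final StringBuilder body = new StringBuilder();

    private HtmlReport() {
        // instances are created through create()
    }

    static HtmlReport create() {
        return new HtmlReport();
    }

    HtmlReport addHeader1(String id, String text) {
        return element("h1", id, text);
    }

    HtmlReport addHeader2(String id, String text) {
        return element("h2", id, text);
    }

    HtmlReport addSection(String id, String text) {
        return element("p", id, text);
    }

    HtmlReport addList(String id, List<String> items) {
        body.append("<ul id=\"").append(id).append("\">");
        for (String item : items) {
            body.append("<li>").append(item).append("</li>");
        }
        body.append("</ul>");
        return this;
    }

    String build() {
        return "<html><body>" + body + "</body></html>";
    }

    // Appends <tag id="id">text</tag>; note that the text is not HTML-escaped in this sketch.
    private HtmlReport element(String tag, String id, String text) {
        body.append('<').append(tag).append(" id=\"").append(id).append("\">")
                .append(text)
                .append("</").append(tag).append('>');
        return this;
    }
}
```

Every add method returns the builder itself, which is what makes the chained calls possible.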

The first thing we want to test is that the document contains an h1 element with id “title”:

<h1 id="title">Testing HTML documents with jsoup</h1>

Using jsoup this verification becomes a two-liner:

Document doc = Jsoup.parse(html);
assertThat(doc.select("h1#title").text(), is("Testing HTML documents with jsoup"));

While we let jsoup parse the document in the first line, we use the provided method select() in the second line to query for the element using the selector h1#title, i.e. we are asking for an h1 element with id title. In the same way we can ensure that we have a paragraph with the correct content:

assertThat(doc.select("p#intro").text(), is("This section explains what the text is all about."));

A little bit more tricky is to verify that the list with id jsoup_adv is written in the correct order. For that we have to use the pseudo selector :eq(n), which allows us to query for a specific index position of a sibling:

assertThat(doc.select("ul#jsoup_adv > li:eq(0)").text(), is("find data using CSS selectors"));
assertThat(doc.select("ul#jsoup_adv > li:eq(1)").text(), is("manipulate HTML elements"));

The selector ul#jsoup_adv > li:eq(0) asks for the first (:eq(0)) li element that is a direct child of a ul element with id jsoup_adv.

Beyond that, one can even use regular expressions to find, for example, all h2 elements whose text ends with the string “nutshell”:

assertThat(doc.select("h2:matches(.*nutshell$)").size(), is(1));
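If you are unsure what the expression inside :matches() accepts, it can help to try it with plain java.util.regex first. The sketch below does that under the assumption that jsoup applies the pattern to the element text with the usual Java regex search semantics; the class and method names are made up for this illustration.

```java
import java.util.regex.Pattern;

class NutshellPattern {
    // Same expression as in the selector h2:matches(.*nutshell$).
    private static final Pattern ENDS_WITH_NUTSHELL = Pattern.compile(".*nutshell$");

    // True if the pattern is found in the text, i.e. the text ends with "nutshell".
    static boolean endsWithNutshell(String text) {
        return ENDS_WITH_NUTSHELL.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(endsWithNutshell("jsoup in a nutshell"));
        System.out.println(endsWithNutshell("nutshell guide"));
    }
}
```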

Conclusion: Using jsoup for parsing HTML documents in JUnit tests makes the verification of HTML documents much easier and more robust. If you are used to CSS selectors and like the way jQuery uses them, then jsoup is worth a look.

Writing an interpreter in Ceylon

Ceylon is a statically typed language for the Java Virtual Machine (JVM) that comes with a powerful set of features that the Java language does not have or is missing (depending on your point of view). About three years after the first release and about one year after the release of version 1.0.0, it is time to take a closer look.

A good way to get in touch with a new language is to start with a simple project, just for fun. One thing that came to my mind was to implement an interpreter for the programming language Brainfuck. The language consists of only eight simple commands and one instruction pointer. More details about Brainfuck can be found here.

The first surprising point about Ceylon is that we do not have the typical “main” method we know from other C-like languages. We can just implement a “global” method (here we call it “run”):

"Run the module `brainfuck`."
shared void run() {
	String? programFile = process.namedArgumentValue("program");
	if (exists programFile) {
		value bytes = readFile(programFile);
		BrainFuckInterpreter interpreter = BrainFuckInterpreter(bytes);
		interpreter.run();
	} else {
		print("Please provide the argument 'program'.");
	}
}

The string before the method definition is just the short form of the doc(“…”) annotation, the counterpart of the javadoc feature. To format the comments we can use Markdown.

Another surprising fact is the access modifier “shared” that we see in this example. Instead of using public in order to make the method accessible by other code, we use the keyword shared. In Ceylon the visibility hierarchy starts with modules that consist of one or more packages. A package consists of one or more source files, which again can share their elements. Sharing an element from one level of the hierarchy means making it visible to the next higher level. This way even packages can be invisible to other modules, as we see in the following snippet from the package.ceylon file, which is placed within each package and which declares in our case the package brainfuck as shared:

shared package brainfuck;

By the way: The package declarations that all Java classes start with are not necessary in Ceylon. Instead the package name is derived from the names of the folders the source file resides in, hence the redundant information can be left out. But let’s go back to the run() method above. As we do not have a special method that gets the command line arguments passed in from the runtime environment, we ask the global process instance for the named argument “program”. The returned String is an “optional” String, i.e. the value may or may not be set. In Java that would be modelled by a String reference that may be null. But in order to prevent possible NullPointerExceptions, the compiler ensures that you can only access the value of an “optional” reference after you have checked it using the exists keyword.
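The closest thing Java offers for this pattern is java.util.Optional. The following sketch mirrors the run() method above; the argument format --program=<path> and the names ProgramArgument, namedArgument and describe are assumptions made for this example and are not taken from the Ceylon API.

```java
import java.util.Optional;

class ProgramArgument {
    // Hypothetical lookup that scans the arguments for "--<name>=<value>".
    static Optional<String> namedArgument(String[] args, String name) {
        String prefix = "--" + name + "=";
        for (String arg : args) {
            if (arg.startsWith(prefix)) {
                return Optional.of(arg.substring(prefix.length()));
            }
        }
        return Optional.empty();
    }

    // Counterpart of "if (exists programFile) { ... } else { ... }".
    static String describe(String[] args) {
        return namedArgument(args, "program")
                .map(file -> "Reading file '" + file + "'.")
                .orElse("Please provide the argument 'program'.");
    }

    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```

The important difference remains: in Java the use of Optional is a convention, and nothing stops a method from returning null instead, whereas the Ceylon compiler enforces the check.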

Now that we have retrieved the path to our brainfuck “code”, we can read the bytes from the file using the following method:

ArrayList<Byte> readFile(String programFile) {
	value bytes = ArrayList<Byte>();
	print("Reading file '" + programFile + "'.");
	Path path = parsePath(programFile);
	OpenFile openfile = newOpenFile(path.resource);
	ByteBuffer buffer = newByteBuffer(1024);
	variable Integer bytesRead = openfile.read(buffer);
	print("bytesRead=" + bytesRead.string);
	while (bytesRead != -1) {
		buffer.flip();
		for (Byte byte in buffer) {
			bytes.add(byte);
		}
		buffer.clear();
		bytesRead = openfile.read(buffer);
	}
	openfile.close();
	return bytes;
}
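Readers coming from Java will recognize the flip() and clear() calls from java.nio. As a point of reference, this is a sketch of roughly the same loop written in Java with a FileChannel; it is not part of the interpreter, and the class name ProgramReader is made up for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

class ProgramReader {
    // Reads the whole file in chunks of 1024 bytes and returns its content as a list of bytes.
    static List<Byte> readFile(String programFile) {
        List<Byte> bytes = new ArrayList<>();
        try (FileChannel channel = FileChannel.open(Paths.get(programFile), StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            while (channel.read(buffer) != -1) {
                buffer.flip();                  // switch from filling the buffer to draining it
                while (buffer.hasRemaining()) {
                    bytes.add(buffer.get());
                }
                buffer.clear();                 // make the buffer ready for the next read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes;
    }
}
```

The try-with-resources statement closes the channel even if reading fails, a case the explicit close() call in the Ceylon version above does not cover.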

As storage for the bytes read from the program file, we utilize an ArrayList from the Java SDK. In order to do that, we have to import the corresponding Java module in our module definition, which resides in our module.ceylon file:

module brainfuck "1.0.0" {
	shared import ceylon.io "1.1.0";
	shared import java.base "8";
}

Besides the name of the module and its version, we can import other modules here. Line 3 imports the Java “base” module, i.e. the Java packages java.lang, java.util, java.io, java.net, java.text, NIO and security. Now that we have imported the java.base module, we can import the ArrayList at the beginning of our source file:

import java.util {
	ArrayList
}

Interestingly, Ceylon emphasizes immutability, hence by default you can assign a value to every reference only once. If you want to change its value, you have to mark the reference as mutable using the keyword variable. You can see this in the code of the readFile() method, as the references path, openfile and buffer are not marked as variable. In these cases the value is only assigned once. In contrast to that, the number of bytes read from the file is of course variable, hence we mark it so.

After having read the code of the brainfuck program into memory, we can start its execution. This is done by creating an instance of the Brainfuck interpreter and calling its run() method:

BrainFuckInterpreter interpreter = BrainFuckInterpreter(bytes);
interpreter.run();

The beginning of this class is shown in the following:

shared class BrainFuckInterpreter(ArrayList<Byte> buffer) {
	ArrayList<Integer> field = ArrayList<Integer>();
	variable Integer ptr = 0;
	
	shared void run() {
		variable Integer bufferPos = 0;
		while (bufferPos >= 0 && bufferPos < buffer.size() - 1) {
			value oldBufferPos = bufferPos;
			value byte = buffer.get(bufferPos);
			switch (byte.unsigned)
			case (43) {
				ensureFieldCapacity();
				value oldValue = field.get(ptr);
				field.set(ptr, oldValue + 1);
			}
			...
			else {
				print("Ignoring char " + byte.string);
			}
			bufferPos++;
		}
		...
	}
	...
}

If you are used to Java, you might be missing the constructor. Where do we assign the passed-in argument buffer to the corresponding member variable? The Ceylon feature here is that we can define some of our member variables within the parentheses after the class name. In our case we define a member with the name buffer of type ArrayList<Byte>. The constructor we would have to write in Java to assign the given value to the member variable is unnecessary in Ceylon. The compiler creates some kind of “default” constructor that does this job for us. Additional members of the class that are not passed to the constructor can be defined as we would do that in Java (see field and ptr in our example). And as long as we do not declare them to be shared, they are only visible to the class members. Defining the “signature” of the constructor together with the class name also means that we can have only one constructor per class. In case we also want to execute additional code during the “construction”, we can put that code into the class body.
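For comparison, this is a sketch of what the same class header expands to in Java. The class name BrainFuckInterpreterSkeleton and the two accessor methods are made up for this example; only the three members correspond to the Ceylon code above.

```java
import java.util.ArrayList;
import java.util.List;

class BrainFuckInterpreterSkeleton {
    // In Ceylon this member is declared by the class parameter list.
    private final List<Byte> buffer;
    // Members that are not passed in look much the same in both languages.
    private final List<Integer> field = new ArrayList<>();
    private int ptr = 0;

    // The constructor that Ceylon makes unnecessary: it only copies the argument.
    BrainFuckInterpreterSkeleton(List<Byte> buffer) {
        this.buffer = buffer;
    }

    int programLength() {
        return buffer.size();
    }

    int usedCells() {
        return field.size() + ptr;
    }
}
```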

Our run() method now iterates over the buffer with the instructions that we have read from the file. Depending on the value of the current byte, we execute the appropriate operation. This is implemented by a switch statement. In contrast to Java we can omit the curly braces that enclose the list of cases, but we cannot omit the “default” case, which is modelled in Ceylon as the “else” clause of the switch statement. In this “else” case we just print the input byte that we are ignoring. Interesting for Java developers is the fact that we cannot just “add” the byte to the preceding String, as we would do in Java. Instead we have to access the String representation of the byte by writing byte.string. An implicit invocation of a method like toString() in Java does not exist in Ceylon.
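Since the listing above only shows the case for the value 43 (the ASCII code of '+'), here is a compact sketch in Java of how the complete dispatch loop could look. It is not a translation of the Ceylon sources: the class name MiniBrainfuck, the fixed tape of 30000 cells, the 8-bit wrap-around of cell values and the missing support for the input command ',' are assumptions chosen to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class MiniBrainfuck {
    // Runs the program and returns everything it printed. The input command ',' is ignored.
    // Moving the pointer off the tape ends in an ArrayIndexOutOfBoundsException.
    static String run(String program) {
        byte[] code = program.getBytes(StandardCharsets.US_ASCII);
        int[] cells = new int[30000];
        int ptr = 0;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int pos = 0; pos < code.length; pos++) {
            switch (code[pos]) {
                case '+': cells[ptr] = (cells[ptr] + 1) & 0xFF; break;
                case '-': cells[ptr] = (cells[ptr] - 1) & 0xFF; break;
                case '>': ptr++; break;
                case '<': ptr--; break;
                case '.': out.write(cells[ptr]); break;
                // Skip the loop body if the current cell is zero.
                case '[': if (cells[ptr] == 0) pos = matching(code, pos, 1); break;
                // Jump back to the start of the loop if the current cell is not zero.
                case ']': if (cells[ptr] != 0) pos = matching(code, pos, -1); break;
                default: break; // every other byte counts as a comment
            }
        }
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    // Finds the bracket that matches the one at pos, scanning forward (step 1) or backward (step -1).
    private static int matching(byte[] code, int pos, int step) {
        int depth = 0;
        for (int i = pos; i >= 0 && i < code.length; i += step) {
            if (code[i] == '[') depth++;
            if (code[i] == ']') depth--;
            if (depth == 0) return i;
        }
        throw new IllegalArgumentException("Unbalanced bracket at position " + pos);
    }

    public static void main(String[] args) {
        // 8 * 8 + 1 = 65, the ASCII code of 'A'
        System.out.println(run("++++++++[>++++++++<-]>+."));
    }
}
```

In contrast to the Ceylon version, Java does not force us to write the default branch; leaving it out would silently ignore unknown bytes as well.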

Conclusion: The interpreter does not utilize all language features, but at least it covers a few interesting points. To someone who has worked with Java for years, the Ceylon features seem to tackle many well-known issues of the Java language (versioned modules, package visibility, immutability, optional pattern, etc.). Although the absence of multiple constructors is something you have to get used to, I must admit that I liked working with Ceylon.

PS: The sources can be found here.

Analyze package dependencies with structure101

One key to a stable application is a well-structured codebase. We know that we should build as many black boxes as possible, because as soon as one black box is finished, we no longer have to think about its interior. You just use the code you or another team member has written through a well-defined interface. This gives you the possibility to concentrate on the next feature you want to add.

When we think about black boxes we often have classes or whole jar packages in mind. Classes should of course be black boxes, no discussion about that. The same is true for jar packages. But in between classes and jar packages there is another level of structure, which is often not seen as clearly as a black box: packages.

Packages are often second-class citizens and their interrelationships are not analyzed as thoroughly. But there is a great tool for such analysis: Structure101. In general it helps you to monitor and verify the dependency structures and complexity of your project by means of well-organized diagrams.

So let’s start with a sample project. I have taken one of my own projects for this: japicmp is a tool that computes the differences between the APIs of two jar archives in terms of which methods and classes have changed. Structure101 has a great composition view, which shows you the dependencies between the packages of a project. This is how it looks for the current version of japicmp:

Image

 

Clearly we can see, for example, that the cli package, which is responsible for the command-line parsing, uses the exception as well as the config package and is itself used by the main package, where the main() method resides. With the cli package everything seems to be OK. But what about the three packages cmp, util and model? The difference computation between the classes and methods, i.e. the business logic, resides in the package cmp. Hence it should use the model as well as the util package. But these two packages should not have any backward dependencies. This problem is also shown in the matrix view:

Image

 

When we take a closer look at the tangle between these three packages, we see that the class AccessModifier, which is located in the cmp package, is used from the util package:

Image

 

Beyond that, this class is also used in the model. This clearly indicates that the class should reside in the model package rather than in the cmp package. This makes sense, as the access modifiers of a class or method are part of the model of a jar archive and do not belong in the business logic. If we move this class to the model package, we get the following result:

Image

This looks much better. We do not have any tangles within the package structure. The nice layout also shows clearly that the whole application depends on the model, as the package is located at the bottom of the diagram. The business logic, which resides within cmp, is called from the main package and uses util, config and the model, as it should be. The same is true for the output package, where the implementations for the cli and xml output reside. This package uses the config as well as the model, once it is computed.
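What such a tool flags as a tangle is, at its core, a cycle in the package dependency graph. The following sketch shows a simple depth-first check for that. It says nothing about how structure101 works internally; the class name PackageTangles is made up, and the dependency maps in the main() method are a simplified reading of the japicmp situation described above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PackageTangles {
    // Returns true if the "uses" relation between the packages contains at least one cycle.
    static boolean hasTangle(Map<String, List<String>> dependencies) {
        Set<String> explored = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String pkg : dependencies.keySet()) {
            if (visit(pkg, dependencies, explored, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String pkg, Map<String, List<String>> dependencies,
                                 Set<String> explored, Set<String> onPath) {
        if (onPath.contains(pkg)) {
            return true;            // we came back to a package on the current path: a cycle
        }
        if (!explored.add(pkg)) {
            return false;           // already fully explored without finding a cycle
        }
        onPath.add(pkg);
        for (String used : dependencies.getOrDefault(pkg, Collections.emptyList())) {
            if (visit(used, dependencies, explored, onPath)) {
                return true;
            }
        }
        onPath.remove(pkg);
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> before = new HashMap<>();
        before.put("cmp", Arrays.asList("util", "model"));
        before.put("util", Arrays.asList("cmp"));     // util uses AccessModifier in cmp
        before.put("model", Arrays.asList("cmp"));    // model uses AccessModifier in cmp

        Map<String, List<String>> after = new HashMap<>();
        after.put("cmp", Arrays.asList("util", "model"));
        after.put("util", Arrays.asList("model"));    // AccessModifier now lives in model

        System.out.println("before: " + hasTangle(before));
        System.out.println("after: " + hasTangle(after));
    }
}
```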

Conclusion: Packages should not be second-class citizens but should help you to structure your application such that it is easy to get an overview of the code and to separate functionality. Tools like structure101 help you to analyze the dependencies between packages and thereby let packages become an important level between the jar on the one side and the class on the other side.