This is the final post of a series where we have looked at persistence options in Java. In part 1 we looked at rolling your own lightweight solution and in part 2 we visited BerkeleyDB as an embeddable non-SQL solution. In this final piece, we look at the Java Persistence API (JPA), which was accepted as part of Enterprise JavaBeans 3.0 (EJB3) in JSR-220.
We’ll break format from the first post in this series, but rest assured that JPA addresses every issue we brought up. JPA (and its implementations) are humongous beasts and we highly suggest Java Persistence with Hibernate for more advanced features and a deeper understanding.
Everybody knows setup and installation is boring, so we’ll leave that to the end. This way we can get down to business.
Defining your Persistent Objects
The fantastic thing about JPA is that it allows you to define the persistence of your classes entirely using Java 5 annotations, and the implementation will translate this into native SQL to speak with any database of your choice. A class is made persistent simply by adding the @Entity annotation and an @Id:
@Entity
public class MyClass {
    @Id
    @GeneratedValue
    private Long id;

    @Column(nullable = false, name = "MY_TEXT_COLUMN")
    private String text;
    ...
}
You can choose to be in control of ID generation if you want, but most of the time you’ll want to let the database create one for you with @GeneratedValue. (Note that you can browse all the Javadocs at the Hibernate EJB3 API Documentation pages.) The code above will result in an SQL schema for a table named after the class (“MyClass”, which many databases fold to “MYCLASS”) with an autogenerated key column named after the field (“ID”) and a custom-named column called “MY_TEXT_COLUMN” which stores some text that can’t be null.
Unfortunately, because everything is considered to be a JavaBean it means you have to add in a ridiculous amount of boilerplate in your code to be fully compliant. When you see ... in these examples, it means you should really add a parameter-less public constructor, implement Serializable and add getters/setters for all the fields (even the ID which cannot be changed here). Hibernate is forgiving, but other implementations may not be.
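To make the boilerplate concrete, here is a rough sketch of what the ... stands for in the first example. The JPA annotations are deliberately left out so that it compiles with nothing but the JDK; in the real entity they stay exactly where the example above puts them. Treat it as an illustration of the JavaBean conventions, not as a complete entity.

```java
import java.io.Serializable;

// Sketch only: JPA annotations omitted so this compiles with the JDK alone.
// In the real entity, @Entity goes on the class, @Id/@GeneratedValue on id
// and @Column on text, as in the example above.
class MyClass implements Serializable {
    private static final long serialVersionUID = 1L;

    private Long id;
    private String text;

    // Parameter-less constructor, needed so the provider can instantiate us.
    public MyClass() {
    }

    public Long getId() {
        return id;
    }

    // Present only for JavaBean compliance; the database generates the ID.
    public void setId(Long id) {
        this.id = id;
    }

    public String getText() {
        return text;
    }

    public void setText(String text) {
        this.text = text;
    }
}
```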
There is a subtle problem with the example above with regards to the text column. Namely, it will store the text as a variable-length character column (typically a VARCHAR, and @Column defaults to a length of 255)… which is usually quite a restrictive length. If you want to store anything as large as the text for a blog post or comment, you should really be using @Lob:
@Entity
public class MyClass extends NonEntityClass {
    // you can use Strings for keys if it
    // makes sense for your data structures
    // (i.e. a string is a unique identifier)
    @Id
    private String id;

    // this can hold as much text as you care to throw at
    // it, and should support UTF-8 encoding too!
    // as long as you set your database up to use utf8
    // e.g. in PostgreSQL "create database mydb with encoding='utf8';"
    @Lob
    private String reallyLongText;

    // @Transient keeps a field out of the SQL persistent form.
    // The 'transient' keyword does so too, but it also affects Java
    // serialization, so when porting Serializable classes to JPA
    // decide per field which of the two you actually mean
    @Transient
    private Long trans;

    // must use this if the field is an @Entity with 1-to-1 multiplicity
    // we may also want to use ManyToOne if other MyClass objects can reference
    // this particular instance
    @OneToOne
    private MyOtherPersistentClass other;

    @Embedded
    private MyEmbeddableClass embedded;
    ...
}
Note that in this example, we are extending NonEntityClass… which may have state. If it isn’t an @Entity, then those parts of the class will not be persisted and you will most likely have a broken object when reviving from your database. Object inheritance in persistence is a very tricky subject, but one that JPA has thought about. Read more in the InheritanceType parameter to the @Inheritance annotation.
We also introduced a new annotation, @Embedded. An embedded object is one which is persistent but only when it is part of a persistent class. Embeddable classes must be declared like so…
@Embeddable
public class MyEmbeddableClass {
    private String text;
    ...
}
The final major piece of the puzzle is how Collections are persisted. Unfortunately, JPA (as defined in JSR-220) only allows @Entity types to be a part of a collection, but Hibernate offers extensions to allow both value types (e.g. numbers and Strings) and @Embeddable objects to be part of a collection.
@Entity
public class MyClass {
    @Id
    @GeneratedValue
    private Long id;

    // this is how we have sets of uni-directional entity types
    // Must be a generic "Set", not a HashSet on the left!
    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.EAGER)
    private Set<EntityClass> entities = new HashSet<EntityClass>();

    // this is how we have sets of value types
    // this is not supported by EJB3
    @org.hibernate.annotations.CollectionOfElements
    @JoinTable
    private Set<String> images = new HashSet<String>();

    // this is how we have sets of embedded types
    // this is not supported by EJB3
    @org.hibernate.annotations.CollectionOfElements
    @JoinTable
    private Set<MyEmbeddableClass> embeds = new HashSet<MyEmbeddableClass>();
    ...
}
The “cascade” option says that when we persist MyClass, we persist each of the members of the entities collection. The “fetch” option says that when a MyClass is loaded, all the values in the collection are loaded too! Your most typical use cases will dictate what makes the most sense for your persistent objects.
Beware that List objects will not necessarily be persisted in the correct order. Read more about orderings in @OrderBy. Notice that in our Set of @Entity types, we defined it to be @OneToMany. If the entities reference back to us (bi-directional), then they should declare that field as @ManyToOne and our @OneToMany should name it using the mappedBy parameter. Reach for @ManyToMany only when an entity may belong to the collections of several owners. For more details see the Hibernate documentation.
I will say that in my experience collections have resulted in very poor performance. However, to be fair, I did have about 1,000 @Entity objects in each collection. In that particular case, profiling found that storing a big parsable String as a @Lob (with each line de-serialisable into an object on demand) was a much more efficient way of storing my data. I also had trouble getting the @Embeddable classes to work in collections if they contained more than a single field, but as that isn’t supported by JPA anyway, I strayed away.
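For what it’s worth, the “big parsable String” idea can be sketched as below. The Point type, the "x,y" line format and the PointLines name are all made up for illustration (the data behind the anecdote above looked different); the String returned by encode is what you would keep in a single @Lob field instead of a collection of entities.

```java
import java.util.List;

// Hypothetical sketch: one record per line, parsed only when asked for.
final class PointLines {
    // Stand-in for whatever small object each line represents.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    private PointLines() {
    }

    // Builds the text that would be stored in the @Lob String field.
    static String encode(List<Point> points) {
        StringBuilder sb = new StringBuilder();
        for (Point p : points) {
            sb.append(p.x).append(',').append(p.y).append('\n');
        }
        return sb.toString();
    }

    // De-serialises just the requested line into an object.
    static Point decodeLine(String blob, int index) {
        String[] parts = blob.split("\n")[index].split(",");
        return new Point(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
    }

    // Number of records held in the text.
    static int count(String blob) {
        return blob.isEmpty() ? 0 : blob.split("\n").length;
    }
}
```

The trade-off is that the database can no longer query or join on the individual records, so this only pays off when you always load them through their owner.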
Persisting
Now that you know how to define your persistent classes, you’ll want to be able to get at them! This is accomplished through an EntityManager, obtained from an EntityManagerFactory. I noticed serious performance problems if I was creating a factory every time I needed one, so I recommend you create a single factory as a static field in one of your classes… but beware of static initialisers throwing runtime exceptions.
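One way to keep a single factory without running it in a static initialiser is to create it lazily on first use. The following is only a sketch: LazySingleton is a made-up helper, not part of JPA, and it uses a Java 8 lambda-friendly Supplier, which is newer than the Java 5 era the rest of this post assumes.

```java
import java.util.function.Supplier;

// Hypothetical helper: creates an expensive object once, on first use.
final class LazySingleton<T> {
    private final Supplier<T> creator;
    private T instance;

    LazySingleton(Supplier<T> creator) {
        this.creator = creator;
    }

    // If creation throws, the exception reaches the caller unchanged and
    // nothing is cached, so a later call simply tries again.
    synchronized T get() {
        if (instance == null) {
            instance = creator.get();
        }
        return instance;
    }
}
```

You would then hold it as something like static final LazySingleton<EntityManagerFactory> EMF = new LazySingleton<EntityManagerFactory>(() -> Persistence.createEntityManagerFactory("javablog.jpa.tutorial")); and call EMF.get() wherever you need the factory. Constructing the holder cannot fail, so a bad database configuration surfaces as an ordinary exception at the call site instead of as a class that refuses to load.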
Once you have an EntityManager, you then want to do all your writing (persist and merge) operations inside an EntityTransaction. Boilerplate alert…
// statically imported from Persistence
// "Persistence units" like this are defined in an XML file
EntityManagerFactory emf = createEntityManagerFactory("javablog.jpa.tutorial");
EntityManager em = emf.createEntityManager();
em.getTransaction().begin();
try {
    MyClass myClass = new MyClass();
    myClass.setText("Hello World!");
    // "persist" is pretty much the same as "add and save". Use "merge"
    // to "update and save" if the object is already persistent.
    // Remember that you must separately persist objects in a Collection
    // unless it has been defined with the appropriate CascadeType
    em.persist(myClass);
    em.getTransaction().commit();
} catch (Exception e) {
    // do dispute resolution here
    // a failed commit may already have ended the transaction, and
    // rollback() throws if the transaction is no longer active
    if (em.getTransaction().isActive()) {
        em.getTransaction().rollback();
    }
} finally {
    em.close();
}
// and don't forget to close the emf when closing your application
// or add it as a shutdown hook... or pray to the Tomcat Gods if it
// is a servlet. Hmm, clean shutdowns...
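The shutdown hook mentioned in the closing comment could look roughly like this. ShutdownCloser is a made-up name and the sketch assumes Java 8 or later; it takes an AutoCloseable so that it compiles without any JPA jars on the classpath.

```java
// Hypothetical helper: closes a resource when the JVM shuts down.
final class ShutdownCloser {
    private ShutdownCloser() {
    }

    // Registers the hook and returns it, so callers can remove it again
    // with Runtime.getRuntime().removeShutdownHook(...) if they close early.
    static Thread closeOnShutdown(final AutoCloseable resource) {
        Thread hook = new Thread(() -> {
            try {
                resource.close();
            } catch (Exception e) {
                // nothing sensible left to do this late; just report it
                System.err.println("Failed to close resource on shutdown: " + e);
            }
        }, "close-on-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

With the factory from the example above, the call would be ShutdownCloser.closeOnShutdown(emf::close);. Keep in mind that hooks only run on an orderly JVM exit, and inside a servlet container you are usually better off closing the factory from the container's own lifecycle callbacks.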
Notice that you get to hear about failed commits… and if your entities carry a @Version field, JPA’s optimistic locking makes a commit based on stale data fail, so the “last to commit, wins” strategy of other persistent options has been eradicated! This is good news, however actually coming up with a dispute resolution strategy is probably not the easiest thing in the world, and you’ll probably want to fall back to a user prompt. But at least stale data didn’t silently overwrite the new data!
If something is performed inside a transaction, then it is considered atomic. You’ll probably want to call several methods, but consider them to be a single action… so stray away from obtaining or committing transactions within those methods! Instead, prefer to keep the transaction handling separate from those methods and document their persistence requirements. Note that some read operations can only be performed while the persistence context is still alive (typically inside a transaction), such as obtaining the elements of a Collection that was defined to have FetchType.LAZY.
Retrieving
The most obvious way to retrieve an object is by using its ID (we assume you now know how to obtain and close an EntityManager from here on):
MyClass myClass = em.find(MyClass.class, "String ID 1");
if (myClass == null) { /* wasn't found */ }
However, searching by keys isn’t particularly interesting… most of the time you’ll be using an auto-generated key that is of no significance except that it is unique. What is much more interesting is the ability to query the database using the JPA Query Language (JPQL)! It looks a lot like SQL except you get to use the Java names for everything. The possibilities are endless so I will only demonstrate the most useful query… a lookup based on a text field that contains a substring, ignoring case:
String search = "a search string";
Query query = em.createQuery("select answer from MyClass answer " +
        " where upper(answer.text) like :search")
    .setParameter("search", "%" + search.toUpperCase() + "%");
List results = query.getResultList();
In order to avoid hilarious SQL injections, make sure to use the :parameter notation and then supply the value with the setParameter method, which passes it to the database as a bound parameter rather than splicing it into the query text. The % character is the wildcard character (note that a % or _ typed by the user will act as a wildcard too). Unfortunately you don’t get any type safety on the returned results.
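Two small helpers can soften both rough edges of the query above. This is a sketch under assumptions: JpqlSearch and its method names are invented here, and the choice of '!' as the escape character is arbitrary (it just avoids backslash quoting differences between databases).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helpers for the substring search shown above.
final class JpqlSearch {
    // Must match the character named in the query's "escape" clause.
    static final char ESCAPE = '!';

    private JpqlSearch() {
    }

    // Prefixes LIKE wildcards (and the escape character itself) so that
    // user input is matched literally.
    static String escapeLike(String term) {
        StringBuilder sb = new StringBuilder(term.length() + 8);
        for (char c : term.toCharArray()) {
            if (c == ESCAPE || c == '%' || c == '_') {
                sb.append(ESCAPE);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Upper-cases and escapes the term, then wraps it in % wildcards.
    static String containsPattern(String term) {
        return "%" + escapeLike(term.toUpperCase(Locale.ROOT)) + "%";
    }

    // Copies an untyped result list into a typed one, throwing
    // ClassCastException at the first element of the wrong type.
    static <T> List<T> castList(Class<T> type, List<?> raw) {
        List<T> typed = new ArrayList<T>(raw.size());
        for (Object o : raw) {
            typed.add(type.cast(o));
        }
        return typed;
    }
}
```

To use it, end the query text with like :search escape '!', pass JpqlSearch.containsPattern(search) to setParameter, and wrap the result as JpqlSearch.castList(MyClass.class, query.getResultList()). The cast is checked at runtime only, but it does fail early and in one place rather than somewhere deep in your code.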
You’re now ready to start using JPA to solve all your persistence needs, just skip to the end of this tutorial and get it installed! For further reading, add Hibernate Annotations Reference Guide to your del.icio.us feed and think about buying Java Persistence with Hibernate.
Downloads
The most popular open source implementation of JPA is Hibernate and it is a monster. To use the EJB3 annotations, you’ll need to download:
- ejb3-persistence-3.3.0.jar (from Hibernate Annotations), hidden in the lib folder
- hibernate-annotations-3.3.0.jar (from Hibernate Annotations)
- hibernate-3.2.5.jar (from Hibernate Core)
- hibernate-entitymanager-3.3.1.jar (from Hibernate EntityManager)
You might, however, want to place only the EJB3 persistence jar on your compile-time classpath to make sure you don’t stray from the standard. Make sure you have the following dependencies on your runtime classpath (which are included in the above downloads):
- hibernate-commons-annotations.jar (hidden in the Hibernate Annotations lib folder)
- antlr-2.7.6.jar
- cglib-2.1.3.jar
- jta.jar
- asm-attrs.jar
- commons-collections-2.1.1.jar
- javassist.jar
- log4j-1.2.11.jar
- asm.jar
- commons-logging-1.0.4.jar
- jboss-archive-browsing.jar
- c3p0-0.9.1.jar
- dom4j-1.6.1.jar
- jdbc2_0-stdext.jar
Unfortunately some of these jars are quite old and may result in dependency hell if you (or your middleware) depend on more recent versions. I don’t think there is a compatibility list anywhere that shows what versions will work. This list may be more than you actually need… but it’s not as long as the list that comes with Hibernate! (Although admittedly, a lot of those extra dependencies are only needed if you wish to enable caching… which is a very advanced topic that is only relevant if you have multiple machines accessing a central SQL database.)
XML Persistence Units
Earlier on we spoke about a Persistence Unit that is defined in an XML file. This is where we define everything related to the SQL connection itself. If you’re using NetBeans 6.0 you may never need to look at this file as it has a lovely UI for hiding the gory details. Here is a very basic example that shows a connection to a local, file-based HSQLDB database. The file must be placed at META-INF/persistence.xml in your application jar:
<?xml version="1.0" encoding="UTF-8"?>
<persistence version="1.0" xmlns="http://java.sun.com/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://java.sun.com/xml/ns/persistence
                                 http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd">
    <persistence-unit name="javablog.jpa.tutorial" transaction-type="RESOURCE_LOCAL">
        <provider>org.hibernate.ejb.HibernatePersistence</provider>
        <!-- Persistent entities should be here for J2SE, but strangely not for J2EE
             Hibernate is forgiving though, but try your best to remember. -->
        <class>javablog.jpa.tutorial.MyClass</class>
        <properties>
            <property name="hibernate.connection.username" value="sa"/>
            <property name="hibernate.connection.driver_class" value="org.hsqldb.jdbcDriver"/>
            <property name="hibernate.connection.password" value=""/>
            <property name="hibernate.connection.url" value="jdbc:hsqldb:file:hsqldb/cache;shutdown=true"/>
            <property name="hibernate.cache.provider_class" value="org.hibernate.cache.NoCacheProvider"/>
            <property name="hibernate.hbm2ddl.auto" value="update"/>
            <!-- setting the batch size to 0 is useful for debugging -->
            <!-- <property name="hibernate.jdbc.batch_size" value="0"/> -->
            <!-- Enable pooling of connections for efficiency. Disabled for development -->
            <!-- <property name="hibernate.c3p0.min_size" value="5" />
                 <property name="hibernate.c3p0.max_size" value="20" />
                 <property name="hibernate.c3p0.timeout" value="300" />
                 <property name="hibernate.c3p0.max_statements" value="50" />
                 <property name="hibernate.c3p0.idle_test_period" value="3000" /> -->
        </properties>
    </persistence-unit>
</persistence>
Note that the jdbc:hsqldb:file:hsqldb/cache;shutdown=true line is very important… if you leave out the shutdown=true piece, then HSQLDB won’t persist your database on shutdown. You should also look into the various options available for the hbm2ddl setting as it controls what Hibernate does to the existing database. Here the schema will be updated.
You should also be aware that all the JPA annotations can be avoided and you may completely define your persistence in Hibernate XML files! The XML files will override the annotations, so are perfect for fine-tuning of your schema on tricky specific deployment environments.
Backends
In the previous XML Persistence Unit we used HSQLDB. HSQLDB is a great little testbed database, but it won’t scale as it keeps everything in memory. However, it dumps everything to a very readable script file on shutdown, which is fantastic for debugging purposes.
Moving on to later-stage development, production and maintenance… Hibernate supports pretty much anything that has a JDBC implementation. That includes Microsoft SQL Server, Oracle, MySQL (GPL alert!) and my personal favourite, PostgreSQL (BSD licence). Don’t forget to add the JDBC implementation onto your runtime classpath!
I highly recommend PostgreSQL: it is so much easier to configure than MySQL and can force SSL for remote connections. Does anyone know of a non-GPL JDBC driver for MySQL? Or a cross-platform implementation of MS-SQL? Please let us know in the comments.
Happy hacking!