= Querying =
Querying is a fundamental part of persistence. Being able to persist something is not very useful without being able to query it back. There are many querying languages and frameworks; the most common query language is SQL used in relational databases.
JPA provides several querying mechanisms:
JPQL : (BNF)
Native SQL QueriesJPA primarily uses the Java Persistence Querying Language (JPQL), which is based on the SQL language and evolved from the EJB Query Language (EJBQL). It basically provides the SQL syntax at the object level instead of at the data level. JPQL is similar in syntax to SQL and can be defined through its BNF definition.
JPA also provides the Criteria API that allows dynamic queries to be easily built using a Java API. The Criteria API mirrors the JPQL syntax, but provides Java API for each operation/function instead of using a separate query language.
JPA provides querying through the Query interface, and the @NamedQuery and @NamedNativeQuery annotations and the and XML elements.
Other querying languages and frameworks include:
EQL (EclipseLink Query Language)
Query By Example (QBE)
Object Query Language (OQL)
== Named Queries ==
There are two main types of queries in JPA, named queries and dynamic queries. A named query is used for a static query that will be used many times in the application. The advantage of a named query is that it can be defined once, in one place, and reused in the application. Most JPA providers also pre-parse/compile named queries, so they are more optimized than dynamic queries which typically must be parsed/compiled every time they are executed. Since named queries are part of the persistence meta-data they can also be optimized or overridden in the orm.xml without changing the application code.
Named queries are defined through the @NamedQuery and @NamedQueries annotations, or XML element.
Named queries are accessed through the EntityManager.createNamedQuery API, and executed through the Query interface.
Named queries can be defined on any annotated class, but are typically defined on the Entity that they query for. The name of the named query must be unique for the entire persistence unit, they name is not local to the Entity. In the orm.xml named queries can be defined either on the or on any .
Named queries are typically parametrized, so they can be executed with different parameter values. Parameters are defined in JPQL using the : syntax for named parameters, or the ? syntax for positional parameters.
A collection of query hints can also be provided to a named query. Query hints can be used to optimize or to provide special configuration to a query. Query hints are specific to the JPA provider. Query hints are defined through the @QueryHint annotation or query-hint XML element.
=== Example named query annotation ===
=== Example named query XML ===
=== Example named query execution ===
== Dynamic Queries ==
Dynamic queries are normally used when the query depends on the context. For example, depending on which items in the query form were filled in, the query may have different parameters. Dynamic queries are also useful for uncommon queries, or prototyping.
JPA provides two main options for dynamic queries, JPQL and the Criteria API.
Dynamic queries can use parameters, and query hints the same as named queries.
Dynamic queries are accessed through the EntityManager.createQuery API, and executed through the Query interface.
==== Example dynamic query execution ====
=== Criteria API (JPA 2.0) ===
See Criteria API.
== JPQL ==
== Parameters ==
Parameters are defined in JPQL using the : syntax, i.e. "Select e from Employee e where e.id = :id". The parameter values are set on the Query using the Query.setParameter API.
Parameters can also be defined using the ?, mainly for native SQL queries. You can also use ?. These are positional parameters, not named parameters and are set using the Query API Query.setParameter. The int is the index of the parameter in the SQL. Positional parameters start a 1 (not 0). Some JPA providers also allow the : syntax for native queries.
For temporal parameters (Date, Calendar) you can also pass the temporal type, depending on if you want the Date, Time or Timestamp from the value.
Parameters are normally basic values, but you can also reference objects if comparing on their Id, i.e., "Select e from Employee e where e.address = :address" can take the Address object as a parameter. The parameter values are always at the object level when comparing to a mapped attribute, for example if comparing a mapped enum the enum value is used, not the database value.
Parameters are always set on the Query, no matter what type of query it is (JPQL, Criteria, native SQL, NamedQuery).
== Query Results ==
Normally JPA queries return your persistent Entity objects. The returned objects will be managed by the persistent context (EntityManager) and changes made to the objects will be tracked as part of the current transaction. In some cases, more complex queries can be built that just return data instead of Entity objects, or even perform update or deletion operations.
There are three methods to execute a Query, each returning different results:
Query.executeUpdategetResultList returns a List of the results. This is normally a List of Entity objects, but could also be a list of data, or arrays.
getSingleResult returns the results. This is normally an Entity object, but could also be data, or an object array. If the query returns nothing, an exception is thrown. This is unfortunate, as typically just returning null would be desired. Some JPA providers may have an option to return null instead of throwing an exception if nothing is returned. An exception is also thrown if the query returns more than just a single row. This is also unfortunate, as typically just returning the first result is desired. Some JPA providers may have an option to return the first result instead of throwing an exception, otherwise you need to call getResultList and get the first element.
executeUpdate returns the database row count. This can be used for UPDATE DELETE JPQL queries, or any native SQL (DML or DDL) query that does not return a result.
== Common Queries ==
=== Joining, querying on a OneToMany relationship ===
To query all employees with a phone number in 613 area code a join is used.
=== Subselect, querying all of a to many relationship ===
To query all employees whose projects are all in trouble a subselect with a double negation is used.
=== Subselect, querying a to many relationship where all of the relations are in a list ===
To query all employees who are in all the list of projects a subselect with a count is used.
First collect the project ids from the projects (using the objects may work as well). Then get the size of the list.
Your query will need two arguments, one of the list of ids, and one for the size of the list.
=== Join fetch, read both employee and address in same query ===
To query all employees and their address a join fetch is used. This selects both the employee and address data in the same query.
If the join fetch was not used, the employee address would still be available, but could cause a query for each employee for its address.
This reduces n+1 queries to 1 query.
Join fetch can also be used on collection relationships:
Outer joins can be used to avoid null and empty relationships from filtering the results:
You can also select multiple objects in a query, but note that this does not instantiate the relationship, so accessing the relationship could still trigger another query:
=== Inverse ManyToMany, all employees for a given project ===
To query all employees for a given project where the employee project relationship is a ManyToMany.
If the relationship is bi-directional you could use:
If it is uni-directional you could use:
=== How to simulate casting to a subclass ===
To query all employees who have a large project with a budget greater than 1,000,000 where the employee only has a relationship to Project, not to
the LargeProject subclass. JPA 1.0 JPQL does not define a cast operation (JPA 2.0 may define this), so querying on an attribute of a subclass is not obvious.
This can be done indirectly however, if you add a secondary join to the subclass to the query.
=== How to select the first element in a collection ===
To query the employees first project for a particular employee. There are a few different ways to do this, some using straight JPQL, and some using the Query setMaxResuls API. If a JPA 2.0 indexed list is used to map the collection, then the INDEX function can be used.
=== How to order by the size of a collection ===
To query all employees ordered by the number of projects. There are a few different ways to do this, some end up using sub selects in SQL, and some use group by. Depending on your JPA provider and database you solution may be limited to one or the other.
Using SIZE function (uses sub-select in SQL)
Using SIZE function, also selects the size (uses group by)
Using GROUP BY
Using GROUP BY and alias
= Advanced =
== Join Fetch and Query Optimization ==
There are several ways to optimize queries in JPA. The typical query performance issue is that an object is read first, then its related objects are read one by one. This can be optimized using JOIN FETCH in JPQL, otherwise by query hints specific for each JPA provider.
== Timeouts, Fetch Size and other JDBC Optimizations ==
There are several JDBC options that can be used when executing a query.
These JDBC options are not exposed by JPA, but some JPA providers may support query hints for them.
Fetch size : Configures the number of rows to fetch from the database in each page. A larger fetch size is more efficient for large queries.
Timeout : Instructs the database to cancel the query if its execution takes too long.
EclipseLink/TopLink : Provide many query hints including:
"eclipselink.jdbc.fetch-size" - Fetch size.
"eclipselink.jdbc.timeout" - Timeout.
"eclipselink.read-only" - The objects returned from the query are not managed by the persistence context, and not tracked for changes.
"eclipselink.query-type" - Defines the native type of query to use for the query.
"eclipselink.sql.hint" - Allows an SQL hint to be included in the SQL for the query.
"eclipselink.jdbc.bind-parameters" - Specifies if parameter binding should be used or not, (is used by default).
== Update and Delete Queries ==
JPQL also allows for UPDATE and DELETE queries to be executed. This is not the recommend or normal way to modify objects in JPA. Normally in JPA you first read the object, then either modify it directly using its set methods to update it, or call the EntityManager.remove() method to delete it.
UPDATE and DELETE queries in JPQL are for performing batch updates or deletions. There allow a set of objects to be updated or deleted in a single query. These queries are useful for performing batch operations, or clearing test data.
UPDATE and DELETE queries have a WHERE the same as SELECT queries, and can use the same functions and operations, and traverse relationships and make use of sub selects. UPDATE and DELETE queries are executed using the Query.executeUpdate() method, and return the row count from the database. Note that some caution should be used in execute these queries in an active persistence context, as the queries may effect the objects that have already been registered in the EntityManager. Normally it is a good idea to clear() the EntityManager after executing the query, or to execute the query in a new EntityManager or transaction.
=== Example update query ===
=== Example delete query ===
== Flush Mode ==
Within a transaction context in JPA, changes made to the managed objects are normally not flushed (written) to the database until commit. So if a query were executed against the database directly, it would not see the changes made within the transaction, as these changes are only made in memory within the Java. This can cause issues if new objects have been persisted, or objects have been removed or changed, as the application may expect the query to return these results. Because of this JPA requires that the JPA provider performs a flush of all changes to the database before any query operation. This however can cause issues if the application is not expecting that a flush as a side effect of a query operation. If the application changes are not yet in a state to be flushed, a flush may not be desired. Flushing also can be expensive and causes the database transaction, and database locks are other resources to be held for the duration of the transaction, which can effect performance and concurrency.
JPA allows the flush mode for a query to be configured using the FlushModeType enum and the Query.setFlushMode() API. The flush mode is either AUTO the default which means flush before every query execution, or COMMIT which means only flush on commit. The flush mode can also be set on an EntityManager using the EntityManager.setFlushMode() API, to affect all queries executed with the EntityManager. The EntityManager.flush() API can be called directly on the EntityManager anytime that a flush is desired.
Some JPA providers also let the flush mode be configured through persistence unit properties, or offer alternatives to flushing, such as performing the query against the in memory objects.
TopLink / EclipseLink : Allow the auto flush to be disabled using the persistence unit property "eclipselink.persistence-context.flush-mode"="COMMIT".
== Pagination, Max/First Results ==
A common requirement is to allow the user to page through a large query result. Typically a web user is given the first page of n results
after a query execution, and can click next to go to the next page, or previous to go back.
If you are not concerned about performance, or the results are not too big, the easiest way to implement this is to query all of the results, then access the sub-list from
the result list to populate your page. However, you will then have to re-query the entire results on every page request.
One simple solution is to store the query results in a stateful SessionBean or an http session. This means the initial query make take a while, but paging will be fast.
Some JPA providers also support the caching of query results, so you can cache the results in your JPA providers cache and just re-execute the query to obtain the cached results.
If the query result is quite large, then another solution may be required. JPA provides the Query API setFirstResult, setMaxResults to allow paging through a large query result. The maxResults can also be used as a safeguard to avoid letting users execute queries that return too many objects.
How these query properties are implemented depends on the JPA provider and database. JDBC allows the maxResults to be set, and most JDBC drivers support this, so it will normally work for most JPA providers and most databases. Support for firstResult can be less guaranteed to be efficient, as it normally requires database specific SQL. There is no standard SQL for pagination, so whether if this is supported depends on your database, and your JPA providers support.
When performing pagination, it is also important to order the result. If the query does not order the result, then each subsequent query could potentially return the results in a different order, and give a different page. Also if rows are insert/deleted in between the queries, the results can be slightly different.
=== Example using firstResult, maxResults ===
An alternative to using firstResult is to filter the first result in the where clause based on the order by and the value from the previous page.
=== Example using maxResults and order by ===
Another alternative is to only query the Ids, and store this result in a stateful SessionBean or an http session.
Then query for the set of Ids for each page.
=== Example using Ids and IN ===
Pagination can also be used for server processes, or batch jobs. On the server, it is normally used to avoid using too much memory upfront, and allow processing each batch one at a time. Any of these techniques can be used, also some JPA providers support returning a database cursor for the query results that allows scrolling through the results.
TopLink / EclipseLink : Support streams and scrollable cursors through the query hints "eclipselink.cursor.scrollable" and "eclipselink.cursor", and CursoredStream and ScrollableCursor classes.
== Native SQL Queries ==
Typically queries in JPA are defined through JPQL. JPQL allows the queries to be defined in terms of the object model, instead of the data model. Since developers are programming in Java using the object model, this is normally more intuitive. This also allows for data abstraction and database schema and database platform independence. JPQL supports much of the SQL syntax, but some aspects of SQL, or specific database extensions or functions may not be possible through JPQL, so native SQL queries are sometimes required. Also some developers have more experience with SQL than JPQL, so may prefer to use SQL queries. Native queries can also be used for calling some types of stored procedures or executing DML or DDL operations.
Native queries are defined through the @NamedNativeQuery and @NamedNativeQueries annotations, or XML element. Native queries can also be defined dynamically using the EntityManager.createNativeQuery() API.
A native query can be for a query for instances of a class, a query for raw data, an update or DML or DDL operation, or a query for a complex query result. If the query is for a class, the resultClass attribute of the query must be set. If the query result is complex, a Result Set Mapping can be used.
Native queries can be parameterized, so they can be executed with different parameter values. Parameters are defined in SQL using the ? syntax for positional parameters, JPA does not require native queries support named parameters, but some JPA providers may. For positional parameter the position starts a 1 (not 0).
A collection of query hints can also be provided to a native query. Query hints can be used to optimize or to provide special configuration to a query. Query hints are specific to the JPA provider. Query hints are defined through the @QueryHint annotation or query-hint XML element.
==== Example native named query annotation ====
==== Example native named query XML ====
==== Example native named query execution ====
==== Example dynamic native query execution ====
=== Result Set Mapping ===
When a native SQL query returns objects, the SQL must ensure it returns the correct data to build the resultClass using the correct
column names as specified in the mappings. If the SQL is more complex and returns different column names, or returns data for multiple objects
then a @SqlResultSetMapping must be used.
@SqlResultSetMapping is a fairly complex annotation containing an array of @EntityResult, @ConstructorResult, and @ColumnResult. This allows multiple Entity objects in combination with raw data, and non-mapped classes, to be returned. The @EntityResult contains an array of @FieldResult, which can be used to map the alias name used in the SQL to the column name required by the mapping. This is required if you need to return two different instances of the same class, or if the SQL needs to alias the columns differently for some reason. Note that in the @FieldResult the name is the name of the attribute in the object, not the column name in the mapping. This seems odd, because this would make mapping an Embedded or composite id relationship not possible.
Normally it is easiest to either select raw data or a single object with native SQL queries, so @SqlResultSetMappings can normally be avoided, as they are quite complex. Also note that even if you select the Employee and its Address with the SQL, these are two unrelated objects, the employee's address is not set, and may trigger a query if accessed unless a cache hit occurs. Some JPA providers may provide a query hint to allow join fetching to be used with native SQL queries.
TopLink / EclipseLink : Support join fetching with native SQL queries through the "eclipselink.join-fetch" query hint.
==== Example result set mapping annotation ====
=== ConstructorResult (JPA 2.1) ===
JPA 2.1 defines a @ConstructorResult annotation to allow the returning of non-mapped classes from native SQL queries. The ConstructorResult is similar to the JPQL NEW operator that allows the calling of a class constructor passing in the raw data. The ConstructorResult has a targetClass and columns array of ColumnResults. The target class must define a constructor taking the same number of arguments and types as defined by the columns.
==== Example constructor result annotation ====
== Stored Procedures ==
See Stored Procedures
== Raw JDBC ==
It can sometimes be required to mix JDBC code with JPA code. This may be to access certain JDBC driver specific features, or to integrate with another application that uses JDBC instead of JPA.
If you just require a JDBC connection, you could access one from your JEE server's DataSource, or connect directly to DriverManager or a third party connection pool. If you need a JDBC connection in the same transaction context and your JPA application, you could use a JTA DataSource for JPA and your JDBC access to have them share the same global transaction. If you are not using JEE, or not using JTA, then you may be able to access the JDBC connection directly from your JPA provider.
Some JPA providers provide an API to access a raw JDBC connection from their internal connection pool, or from their transaction context.
In JPA 2.0 this API is somewhat standardized by the unwrap API on EntityManager.
To access a JDBC connection from an EntityManager, some JPA 2.0 providers may support:
This connection could then be used for raw JDBC access. It normally should not be close when finished, as the connection is being used by the EntityManager and will be released when the EntityManager is closed or transaction committed.
TopLink / EclipseLink : Support unwrapping the JDBC Connection.