JPA slow on remote database queryes

C

carmelo

Hi everybody,
I developed a desktop application which accesses data from a remote
database with JPA using TopLink Essentials as provider. I'm having a
very slow performance in loading data when the database is remote,
which does not happen locally.

I hope you can help me about how to fix it. I hope that it's not
necessary to modify the whole application for loading data using JDBC,
which I saw is fast in this case...


Thank you in advance for your help!
 
C

carmelo

Maybe I could solve the problem with paging...
What do you think? How could I implement it? The application was
developed with Netbeans, do you know if it offers anything about
paging?
 
M

markspace

carmelo said:
Maybe I could solve the problem with paging...
What do you think? How could I implement it? The application was
developed with Netbeans, do you know if it offers anything about
paging?

What's paging?
 
A

Arne Vajhøj

Thanks. Normally when I think of "paging" I think of virtual memory.
I've never thought of "return only some results of a query" as a
performance enhancement, but I see how it could be considered one.

I am pretty sure that Google works faster showing only a
page instead of all hits.

Arne
 
M

markspace

Arne said:
I am pretty sure that Google works faster showing only a
page instead of all hits.


Perhaps the whole concept is just too obvious. It's less of a
"performance enhancement" and more of a "duh."
 
A

Arved Sandstrom

carmelo said:
Hi everybody,
I developed a desktop application which accesses data from a remote
database with JPA using TopLink Essentials as provider. I'm having a
very slow performance in loading data when the database is remote,
which does not happen locally.

I hope you can help me about how to fix it. I hope that it's not
necessary to modify the whole application for loading data using JDBC,
which I saw is fast in this case...

Thank you in advance for your help!

For starters, how slow is slow? Roughly how many seconds for what size
result sets?

If you're going to stick with Oracle the first thing I'd suggest is
switching to EclipseLink from Toplink Essentials. If you want to handle
large result sets with streams or cursors, and do so in a fashion that
is close to JPA (query hints on JPA queries) then EclipseLink exposes
this capability (which was in full Toplink, but IIRC not in TLE).

Also, if you decide to page using setFirstResult()/setMaxResults() then
EclipseLink is also a much better bet than TLE: Toplink Essentials
brings back the complete result set from a query and then uses JDBC
absolute() etc at the driver level to actually give _you_ the limited
set. Whereas EclipseLink uses appropriate syntax on various platforms
(e.g. ROWNUM filtering on Oracle) to get the _database_ to do that
subsetting for you.

If you can't switch from TLE right now you can consider chunking by
retrieving all the primary keys for the full result set, and then using
IN with subsets of the keys (although even EclipseLink 1.2.x and up do
this better because they let you set Java lists as single parameters for
situations like this).

Other things to consider are:

1) caching - what's the nature of the data being queried? Is it amenable
to caching? What type of caching is best?

2) database query optimization - look at the SQL generated by your ORM.
Then look at the query execution plan for that query on your target DB.
Do you have an optimized situation? Can you improve the indices etc etc?

3) Is the actual JPA query efficient? Are you using joins, fetch joins
etc if they are suitable? Again, look at the generated SQL.

These are some initial ideas.

AHS
 
C

carmelo

Thank you guys for your answers.

< how slow is slow? Roughly how many seconds for what size result
sets?
59 seconds for 439 records!

< database query optimization - joins, fetch joins etc
I'm using anything but a simple "SELECT m FROM Mytable WHERE <simcple
conditions>"

< I'd suggest is switching to EclipseLink from Toplink Essentials
Do you think EclipseLink will have a better performance? How can I
swith from Toplink to EclipseLink? I'm using Netbeans IDE

< I have implemented pagination in an JPA application by leaving the
JPA out of that specific function, in pure jdbc.
I've implemented a pagination in JPA following this useful tutorial
http://rubenlaguna.com/wp/2009/08/18/jtable-beans-binding-and-jpa-pagination.
But I am not satisfied by its performance, in fact, scrolling down is
not immediate, you have to wait until it loads the remaining rows
subset. Do you think I need pure JDBC for better performance?
 
L

Lew

carmelo said:
Thank you guys for your answers.

< how slow is slow? Roughly how many seconds for what size result
sets?
59 seconds for 439 records!

I load a 3,000-entry List from JPA over a relatively slow network in one
project virtually instantly. It's not JPA that is at fault.

I don't bother with pagination when I'm using JPA. It ends up defeating the
internal caching and available external caching capabilities of the JPA
provider. It also messes up your architecture.

When you use JPA, you need to avoid thinking of your persistent information as
"data". The whole point of JPA is to give an *object-oriented* view of
persistent entities and their relationships. The fact that you are tangled in
questions of "performance" (and trying to blame JPA for it) and "batches"
indicates that you are not working from an object-oriented view of your
architecture. That is the real mistake.

You get the best performance out of JPA when you use it in its natural mode.
Then it's quite fast, at least as much so as normal database access will let
it be. Done properly, it'll never be "JPA" that is to blame for performance
issues.

You have given us absolutely no information about the data or object models
you use, or how you try to integrate them. Then you expect miracles from
Usenet to solve your problem, while you go haring off after chimeras like "JPA
hurts performance". It doesn't work that way.

Well, I guess it does, now that you are reading this post.

JPA has built-in support for certain natural kinds of "batches", that is,
one-to-many, many-to-one and many-to-many relationships of entities or sets of
entities to each other. It fills these batches quite efficiently
appropriately to the programmer's choice of 'Fetch.EAGER' or 'Fetch.LAZY'.
The former uses JOIN syntax (assuming proper integrity constraints in the
database) and the latter conditional queries.

With only 439 entities ("records" is a data-oriented term) you don't need to
page. That's just silly.

If you keep entity size reasonable, let's say about a KiB apiece, you can hold
a thousand per MiB. In batch sizes over about ten thousand it starts to
become useful to think about paging.

Proper data models on the database side are an absolute must. Screw up the
data model and too bad for you.

JPA works best for simpler data models, ones that have clean representations
of entities. It can handle beautifully any relationships expressible in
collection terms, and since relational databases are set oriented by
definition, that is anything that a relational database expresses.

As relationship complexity grows, particularly with many-to-many links, the
@JoinX() annotations become mandatory. You have to get pretty specific in
your mappings, telling JPA not only the fields through which you link but the
table and column names involved.

The Java EE 5 tutorial on java.sun.com goes through some of this, and googling
around will find you more examples. I find the documentation for each of
EclipseLink, Hibernate and Apache OpenJPA all to be very useful, regardless of
which one you actually happen to use.
 
C

carmelo

Thank you Lew for your exaustive answer, but I've got this performance
problem and I need to solve it. I hope you can help me finding a
reasonable solution. I'm happy that you can load thousands of records
with JPA, so I hope I can load them so fast too.. However, locally, in
my local area network, I can load thousands of records with JPA too..
the performance problem begins when I try to load data from a remote
host database.

To be clear, here is my JPA query and my entity code:

//query
String query = "SELECT a FROM Articoli a";
Query q = entityManager.createQuery(query);
java.util.Collection data = q.getResultList();
for (Object entity : data) {
entityManager.refresh(entity);
}
list.clear();
list.addAll(data);

//entity, Articoli.java
@Entity
@Table(name = "articoli")
@NamedQueries({
..here are NamedQueries for each field..
})
public class Articoli implements Serializable {
@Transient
private PropertyChangeSupport changeSupport = new
PropertyChangeSupport(this);
private static final long serialVersionUID = 1L;
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@Basic(optional = false)
@Column(name = "CodiceArticolo")
private Integer codiceArticolo;
@Basic(optional = false)
@Column(name = "Descrizione")
private String descrizione;
@Column(name = "H")
private BigDecimal h;
@Column(name = "L")
private BigDecimal l;
@Column(name = "Diametro")
private BigDecimal diametro;
@Column(name = "Spessore")
private BigDecimal spessore;
@Basic(optional = false)
@Column(name = "prezzo")
private BigDecimal prezzo;
@Column(name = "Lungh")
private BigDecimal lungh;
@Column(name = "Peso")
private BigDecimal peso;
@Column(name = "Lung_scatola")
private BigDecimal lungscatola;
@Column(name = "Larg_scatola")
private BigDecimal largscatola;
@Column(name = "H_scatola")
private BigDecimal hscatola;
@Column(name = "Peso_scatola")
private BigDecimal pesoscatola;
@Column(name = "Spessore_max")
private BigDecimal spessoremax;
@Column(name = "Lampada")
private String lampada;
@Column(name = "Watt")
private BigDecimal watt;
@Column(name = "Lumen")
private BigDecimal lumen;
@Column(name = "Temp_colore")
private BigDecimal tempcolore;
@Column(name = "Trattamento")
private String trattamento;
@Column(name = "T_50_min")
private BigDecimal t50min;
@Column(name = "T_50_max")
private BigDecimal t50max;
@Column(name = "Id")
private BigDecimal id;
@Column(name = "Od")
private BigDecimal od;
@Version
protected int version;
@JoinColumn(name = "CodiceSezione", referencedColumnName =
"codice")
@ManyToOne
private Sezione codiceSezione;
@JoinColumn(name = "CodiceCategoria1", referencedColumnName =
"codice")
@ManyToOne
private Categoria codiceCategoria1;
@JoinColumn(name = "CodiceCategoria2", referencedColumnName =
"codice")
@ManyToOne
private Categoria codiceCategoria2;
@JoinColumn(name = "CodiceCategoria3", referencedColumnName =
"codice")
@ManyToOne
private Categoria codiceCategoria3;
@JoinColumn(name = "CodiceCategoria4", referencedColumnName =
"codice")
@ManyToOne
private Categoria codiceCategoria4;
@OneToMany(cascade = CascadeType.ALL, mappedBy =
"codicearticolomagaz")
private Collection<Magazzinodettaglio>
magazzinodettaglioCollection;
@OneToMany(cascade = CascadeType.ALL, mappedBy =
"codicearticolofornitore")
private Collection<Articolifornitori> articolifornitoriCollection;

public Articoli() {
}

public Articoli(Integer codiceArticolo) {
this.codiceArticolo = codiceArticolo;
}

public Articoli(Integer codiceArticolo, String descrizione,
BigDecimal prezzo) {
this.codiceArticolo = codiceArticolo;
this.descrizione = descrizione;
this.prezzo = prezzo;
}

..here are getter/setter methods for each field..

public Collection<Magazzinodettaglio>
getMagazzinodettaglioCollection() {
return magazzinodettaglioCollection;
}

public void
setMagazzinodettaglioCollection(Collection<Magazzinodettaglio>
magazzinodettaglioCollection) {
this.magazzinodettaglioCollection =
magazzinodettaglioCollection;
}

public Collection<Articolifornitori>
getArticolifornitoriCollection() {
return articolifornitoriCollection;
}

public void
setArticolifornitoriCollection(Collection<Articolifornitori>
articolifornitoriCollection) {
this.articolifornitoriCollection =
articolifornitoriCollection;
}


@Override
public int hashCode() {
int hash = 0;
hash += (codiceArticolo != null ? codiceArticolo.hashCode() :
0);
return hash;
}

@Override
public boolean equals(Object object) {
if (!(object instanceof Articoli)) {
return false;
}
Articoli other = (Articoli) object;
if ((this.codiceArticolo == null && other.codiceArticolo !=
null) || (this.codiceArticolo != null && !
this.codiceArticolo.equals(other.codiceArticolo))) {
return false;
}
return true;
}

@Override
public String toString() {
return "desktopapplication1.Articoli[codiceArticolo=" +
codiceArticolo + "]";
}

public void addPropertyChangeListener(PropertyChangeListener
listener) {
changeSupport.addPropertyChangeListener(listener);
}

public void removePropertyChangeListener(PropertyChangeListener
listener) {
changeSupport.removePropertyChangeListener(listener);
}

}


If it's not enough, please let me know, and I'll prepare a sample
application.

I hope you can help me to achive a better performance.

Thank you very much in advance for your help guys!
 
A

Alessio Stalla

To be clear, here is my JPA query and my entity code:

//query
String query = "SELECT a FROM Articoli a";
Query q = entityManager.createQuery(query);
java.util.Collection data = q.getResultList();
for (Object entity : data) {
   entityManager.refresh(entity);}

Are you aware of the fact that refresh() will reload the entity from
the database, therefore issuing a query? And that many-to-one
relations are eager by default, so for each entity you'll be fetching
all the related objects each time?

hth,
Alessio
 
C

carmelo

Thank you Alessio for your reply.
Are you aware of the fact that refresh() will reload the entity from the database, therefore issuing a query?

You're right! refresh() issues a new query. I removed it, but the
loading time is now about 49 seconds, still too high. Even if it's
better that 59 seconds :)
many-to-one relations are eager by default, so for each entity you'll be fetching all the related objects each time?

How can I do? There are n:1 relations between Sezione, Categoria and
Articoli. In other words, Articoli.sezione can have multiple values,
which values are taken from Sezione. And the same is for
Articoli.categoria1, Articoli.categoria2, Articoli.categoria3,
Articoli.categoria, which values are taken from Categoria.
However I tried to remove these relations, and the resulting loading
time was 46 secods, therefore I saved only 3 seconds.


I hope you can help me achiving better results. Thanks guys
 
M

Martin Gregorie

Thank you Lew for your exaustive answer, but I've got this performance
problem and I need to solve it. I hope you can help me finding a
reasonable solution. I'm happy that you can load thousands of records
with JPA, so I hope I can load them so fast too.. However, locally, in
my local area network, I can load thousands of records with JPA too..
the performance problem begins when I try to load data from a remote
host database.
I think your best bet is to do some monitoring to see exactly what is
going on at the network.

Start by instrumenting your code so you can see how big the retrieved
dataset has to be to cause problems.

Then install a copy of Wireshark (available for most OS) and use it to
inspect the network traffic when your query is running. Look at smaller
queries as well as ones big enough to cause the problem to pick up any
significant differences, e.g. does the nature of the traffic change as
the retrieved data expands from a few display lines to more than a
screenful? Does scrolling to the next page cause network activity?

Turning on JDBC and/or DBMS tracing can help too.

You're probably going to have to wade through great heaps of captured
data, especially if the problem only occurs with large data sets, but the
GUI used by later copies of Wireshark can help there.

I've seen awfully dumb tricks played by both software and database
designers that have only come to light when large data volumes and/or
slower data links were used, e.g.:

- really stupid auto-generated SQL
(PowerBuilder - fortunately it would also accept manually written SQL)

- dumb developer tricks
(Microsoft Foundation Classes - MFC queries managed to scan the whole
table when getting the column names for 'SELECT * FROM...' queries)

- poor DB schema design
( missing indexes aren't apparent with typical programmer small test
data sets but boy do they bite once row counts are in the ten
thousands)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,190
Members
46,736
Latest member
zacharyharris

Latest Threads

Top