Performance Bottlenecks of Database Layer in JEE Applications

Past week I have been in discussions with different people over performance bottlenecks in database layer. These discussions were more related to common mistakes that are made in the database layer which cause performance issues when load increases. So I thought it would be a good idea to share the discussion here as well.

I have worked on database performance issues a lot. Here is my list of most common problems I encountered:

  1. No proper Connection Pooling.
    Some people argue that databases have become so advanced these days that getting a connection is not much of a bottle neck but that is not true. Connection Pooling has several other advantages as well which I will be discussing later.
  2. Improper use of Statements and Prepared Statements.
    Many a times I have seen code using prepared statements where statements should have been used and vice versa. Also keep in mind that prepared statements are cached per connection so having a connection pool is very necessary.
  3. Poorly written queries especially the where clauses.
    Make sure you have the most restricting criteria first in the where clause. We have been using MS SQL server in our product line and although their documents suggest that this is not necessary as the plan will be optimized accordingly but I have seen many a times that just changing the order of conditions in the where clause can make huge difference. Selecting unnecessary data, unnecessary joins are also one of the most common issues.
  4. Not using Indexes.
    The reasoning behind this is usually that this makes writes slow. This is somewhat true but there are certain conditions for it to be true (most of the applications usually don’t end up meeting that case)
  5. Improper usage of Transaction Isolation Levels or mostly the defaults are applied as is.
    I have rarely seen anyone having a look at this side.
  6. Long Transactions.
    This is also one of the most common case. Keep the transactions as short as possible and group similar things in a transaction instead of having every other database operation in a single transaction. This is easier said than done as proper care should be taken in demarcating the Transaction boundaries.
  7. No Caching.
    This is self explanatory.
  8. Denormalizing the database schema w/o any strong reasons for doing so.
    I would take this is as a premature optimization that causes more issues than it resolves.
Performance Bottlenecks of Database Layer in JEE Applications

6 thoughts on “Performance Bottlenecks of Database Layer in JEE Applications

  1. Mehdi says:

    You are correct in most of the pointers

    Under the heading of (3) Poorly written queries, one should also consider the ordering of Joins as well should use “EXPLAIN” (in MYSQL) like statement before finalizing it to see what order the DB is using and also to see which indexing is being used .. and whether its correct or not?

    * Should try to lower down the result set.
    Usually we have to play with multiple BIG TABLES in one single query and there are some restrictions on it in JOIN and WHERE clauses .. so if we identify some clauses that can lower down the result set and help us in the final query .. then it works much better .. so in simple words sometimes BIG queries should be broken down into few SIMPLE queries

    * DISTINCT and GROUP BY clause should be used wisely because it gets the DB to go to temporary disk.

    * If the query is resulting in LARGE dataset .. then it should be done in batch with proper batch mechanism .. like for the result set of around 100 million rows. The single query will take much time but if you break it into like 10 BATCHES resulting 10 Million rows each or 100 batches of 1 Million rows each .. and run in parallel threads according to number of processors.

    1. Faisal Feroz says:

      @Mehdi Threading can help but but do keep an eye out on lock contention as well.

      Sharding or Horizontal Partitioning is also one technique which is similar to the thread usage you mentioned but more on the database layer than on the application layer.

  2. Magician says:

    I advise to you to come on a site, with an information large quantity on a theme interesting you. There you by all means will find all.

  3. Faisal Feroz says:

    I was asked about being Point No.2 and 3 in this post being valid for Oracle or not. So I thought I should share the details of my discussion here as well.

    Well they are valid on Oracle server as well.

    Point No. 2 talks about using appropriate objects for database communication. I do agree that advanced RDBMS cache the query plans but Prepared Statements are different thing. Its a special way of asking the server to cache the query plan with place holders. You can ask the server to cache these query plans globally (not on a per connection basis which is the default behavior most of the time) but then again this has its overhead. In my opinion caching per connection is much better so that these plans are cleared when the connection drops. I can see that you can argue that it can cause issues but connections only drop after idle period timeout or some other issues like server restarting etc. (not that common)

    In Point No 3 I think you are talking about Oracle creating optimized execution plans no matter how the joins in where clause are ordered. But I am not talking about any specific database here. It is always good to follow best practices so that in future if you happen to change your DBMS you don’t have to go through your code having to fix up all those queries. As I said having the most restricting criteria first is always better and usually the plan is generated based on the same criteria. But it happens that you get a really bad plan and its good to you EXPLAIN and similar things provided by the RDBMS to write the best possible query possible. I cannot explain everything about query optimization in this email but its more of an art and the internet has tons of information available on the subject.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s