Maintain Large Java Caches With Low Memory Overhead

Wow, it has been a month since my last post. Thought I would come on and provide everyone with an update of what I have been working on lately.

For the past week and a half I have had the pleasure of trying to get a Java cache implementation working. I was caching two very large tables locally from one of our applications at work that I pull data from. The system I am working with has a very simple SQL engine for accessing its files, and it does not support sub-queries. This usually leads to a lot of extra queries being run in a loop, which puts extra load on the DB and slows down the process of getting at the data I want.

The light bulb finally went on when I had the idea to just cache these two tables, using the PK as the cache key and, for the value, a map of column names to values from the DB for each row in the table. I had used EHCache in the past with success for some in-memory caching, so I thought I would just throw EHCache at this and be done. I basically wanted an eternal cache so I did not have to rebuild the entire cache from scratch every time I needed it. I got my cache all set up and built out, and caching to disk. The problem was that the cache was getting cleared out every time the JVM restarted. I researched this a bit and found out I was missing an option in my cache setup to tell EHCache to reload from disk on startup. Ah ha, I thought, this is going to totally rock, and then this happened:
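To make the cache layout concrete, here is a minimal sketch using only plain Java collections. It assumes the rows have already been read from the source system; the class name `RowCacheSketch`, the helper `row`, and the column names `ID` and `NAME` are made up for illustration and are not from the real tables.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RowCacheSketch {

    // Index rows by primary key: key = PK value, value = column name -> column value.
    static Map<Object, Map<String, Object>> indexByPk(List<Map<String, Object>> rows,
                                                      String pkColumn) {
        Map<Object, Map<String, Object>> cache = new HashMap<>();
        for (Map<String, Object> row : rows) {
            cache.put(row.get(pkColumn), row);
        }
        return cache;
    }

    // Hypothetical helper that builds one row with two illustrative columns.
    static Map<String, Object> row(Object id, String name) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("ID", id);
        r.put("NAME", name);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row(101, "Alice"));
        rows.add(row(102, "Bob"));

        Map<Object, Map<String, Object>> cache = indexByPk(rows, "ID");

        // A lookup by PK replaces what would otherwise be one more query inside a loop.
        System.out.println(cache.get(102).get("NAME")); // prints "Bob"
    }
}
```

The point of the shape is that each lookup that used to be an extra query in a loop becomes a single `get` on the PK.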

RuntimeException: This feature is only available in the enterprise version

Nooooooo, how could this be, lol! I checked out the pricing for the Enterprise version and you had to call. We did not want to spend a lot of money on this project, so I knew EHCache was not going to work out for us at this point. I have nothing against paying for software, by the way, and I am sure that EHCache is worth the money if you have the money to throw at it.

Well, after lots of Googling and reading, I landed on Infinispan, which also looked really nice. I really liked that it had a simpler way of moving data in and out of the cache than EHCache. With EHCache, everything has to be wrapped in an Element like so:

// putting
Element e = new Element("myKey", objectToCache);
cache.put(e);

// getting
Element cached = cache.get("myKey");
Object value = cached.getObjectValue();

It was much more Map-like with Infinispan:

cache.put("myKey", objectToCache);

cache.get("myKey");

I did a simple test with Infinispan and was able to push a value to the cache, shut the cache down, and retrieve the value again when I fired the cache back up. Another wave of excitement swept over me as I moved my existing EHCache code over to the Infinispan API. I was able to use their SingleFileCacheStore and build out the cache. I then started to use the cache and delighted at the speed at which it was flying through the data without having to do my extra DB queries. But then I watched as it started to go slower and slower: GC, GC, GC, Out of Memory. NOOOoooooo! The objects being deserialized from the cache never seemed to get garbage collected. I spent a while researching this and ran into a bunch of dead ends.
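The restart check described above (put a value, shut the cache down, start it again, read the value back) is worth keeping as a small repeatable test when comparing libraries. Below is a minimal sketch of its shape. `FileBackedCache` is a hypothetical stand-in written with plain Java serialization; it is not the Infinispan, EHCache, or any other library's API, and it keeps the whole map in memory, so it only illustrates the check itself, not how a real disk store behaves.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

class ReloadCheckSketch {

    // Hypothetical stand-in for a disk-backed cache: loads on open, saves on close.
    static final class FileBackedCache implements AutoCloseable {
        private final Path file;
        private final HashMap<String, String> data;

        @SuppressWarnings("unchecked")
        FileBackedCache(Path file) throws IOException, ClassNotFoundException {
            this.file = file;
            HashMap<String, String> loaded = new HashMap<>();
            // Reload previous contents if the file exists and is not empty.
            if (Files.exists(file) && Files.size(file) > 0) {
                try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
                    loaded = (HashMap<String, String>) in.readObject();
                }
            }
            this.data = loaded;
        }

        void put(String key, String value) { data.put(key, value); }

        String get(String key) { return data.get(key); }

        @Override
        public void close() throws IOException {
            // Write the whole map back to disk on shutdown.
            try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
                out.writeObject(data);
            }
        }
    }

    // Put a value, close the cache, reopen it from the same file, and read the value back.
    static String putThenReload(Path file, String key, String value) throws Exception {
        try (FileBackedCache cache = new FileBackedCache(file)) {
            cache.put(key, value);
        }
        try (FileBackedCache cache = new FileBackedCache(file)) {
            return cache.get(key);
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("cache-sketch", ".bin");
        try {
            System.out.println(putThenReload(file, "myKey", "cached value"));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

With a real library, the two `try` blocks would be replaced by that library's own start and stop calls. A `null` on the second read is the symptom of a cache that does not reload from disk.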

Back to The Google I go….

I briefly tried JCS, but could not get the cache to reload from disk on startup. I thought maybe it was something stupid in my configuration. I went through the docs multiple times and posted a question on Stack Overflow, with no success.

Yup you guessed it, back to Google AGAIN!

I finally landed on this gem, MapDB. It is actually built to be a database for storing Java objects, and it implements the Java Collections interfaces so you can work with it as a Map, List, etc. As of right now, I could not be happier. My Infinispan cache was a 2 GB file. The MapDB file is 500 MB caching the same data. It is a little bit slower pushing the data into it. It has an async option for writes that speeds it up quite a bit, but I was getting a concurrent modification exception when I had it turned on. I just turned it off for now, as I was not really concerned with the speed at which my cache got created. I did not notice any significant speed differences when pulling the objects out of the DB. The memory usage stays very low, and I am quite impressed with how fast it can go to the file and pull the information out without having it in memory. Here is the JavaDoc for anyone interested.

I would be very interested to know your thoughts / experiences with similar implementations.
