Tuesday, December 7, 2010

Memory grids

There's a great podcast on Software Engineering Radio with Nati Shalom. It discusses memory grids, and I certainly found it insightful. In essence, memory grids are being looked upon as the next disk.

The clincher for me was the revelation that the durability of data is not related to it being persisted to disk; it is related to the number of geographically dispersed copies of that data at any one time. Taken to the extreme, this could mean that you don't need disk at all, but in practice the data gets persisted to disk asynchronously. This is called "write-behind" and can be performed on any number of nodes, if not all of them.
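To make write-behind a little more concrete, here is a minimal Python sketch of the idea. It is not taken from the podcast or from any particular grid product: the class name WriteBehindCache and the persist callback are hypothetical, and replication across nodes is left out entirely. The point is only that the caller's write completes against memory, while a background thread pushes it to the backing store later.

```python
import queue
import threading


class WriteBehindCache:
    """Hypothetical single-node sketch: memory first, backing store later."""

    def __init__(self, persist):
        self._data = {}                # the memory-resident copy
        self._pending = queue.Queue()  # writes not yet persisted
        self._persist = persist        # assumed callable(key, value) for disk/DB
        threading.Thread(target=self._drain, daemon=True).start()

    def put(self, key, value):
        self._data[key] = value          # visible to readers immediately
        self._pending.put((key, value))  # persisted asynchronously

    def get(self, key):
        return self._data.get(key)

    def _drain(self):
        # Background loop; error handling and retries are omitted here.
        while True:
            key, value = self._pending.get()
            self._persist(key, value)
            self._pending.task_done()

    def flush(self):
        """Block until every queued write has reached the backing store."""
        self._pending.join()
```

With a plain dict standing in for the disk, `cache = WriteBehindCache(disk.__setitem__)` followed by `cache.put("user:1", "Alice")` returns straight away; `cache.flush()` waits until the entry has landed in `disk`.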

I think memory grids are very interesting. My prediction is that memcached, a popular open source memory cache that can be distributed over many nodes, will become a memory grid offering write-behind persistence. The same goes for Ehcache/Terracotta: these memory caches will evolve beyond being just caches. There are of course commercial offerings out there too, including VMware's GemFire and Oracle's Coherence.

One reason in my mind why memory grids are topical is that commodity hardware can now address one heck of a lot of memory-resident data. Since the introduction of 64-bit computing for the masses, we have a situation where a cheap processor can generally address about 256TB of data (2^48 bytes, given the 48-bit virtual addresses typical of current x86-64 chips) - more than enough for most databases! By contrast, a 32-bit processor could address only about 4GB, which is smaller than many databases.
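The two figures fall straight out of the address width. The short Python check below assumes the 256TB number refers to 48-bit virtual addressing, which is my reading of where it comes from rather than something stated in the podcast; addressable_bytes is just an illustrative helper.

```python
GIB = 2 ** 30  # bytes in a gibibyte
TIB = 2 ** 40  # bytes in a tebibyte


def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


print(addressable_bytes(32) // GIB)  # 4   -> about 4GB with 32-bit addresses
print(addressable_bytes(48) // TIB)  # 256 -> about 256TB with 48-bit addresses
```

How much physical RAM a given machine can actually hold is a separate and much smaller number; this is only the size of the address space.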

I think something that can be overlooked with memory grids is persistence. As mentioned, write-behind appears typical, but what rarely gets attention is what actually performs the write-behind. There's no reason why it can't be done with a conventional RDBMS, and I understand that many memory grids support such a thing. Thus with memory grids, it appears that you can have the best of both worlds.
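As a rough illustration of an RDBMS sitting behind the grid, here is a small Python sketch that uses SQLite from the standard library as a stand-in for a conventional database. Everything in it is an assumption made for the example: the SqlWriteBehind class, the grid table and its k/v columns are hypothetical, and real products will do this quite differently. Writes land in memory and are marked dirty; flush() later writes all dirty entries to the database in one transaction.

```python
import sqlite3


class SqlWriteBehind:
    """Hypothetical sketch: in-memory map flushed to an RDBMS in batches."""

    def __init__(self, conn):
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS grid (k TEXT PRIMARY KEY, v TEXT)"
        )
        self._data = {}      # memory-resident copy, always up to date
        self._dirty = set()  # keys changed since the last flush

    def put(self, key, value):
        self._data[key] = value
        self._dirty.add(key)  # the database is not touched here

    def get(self, key):
        return self._data.get(key)

    def flush(self):
        """Write all dirty entries in one transaction; return how many."""
        rows = [(k, self._data[k]) for k in self._dirty]
        with self._conn:  # commits on success, rolls back on error
            self._conn.executemany(
                "INSERT OR REPLACE INTO grid (k, v) VALUES (?, ?)", rows
            )
        self._dirty.clear()
        return len(rows)
```

A side effect of batching is that repeated writes to the same key between flushes cost the database only one row update, which is part of why write-behind can take load off the RDBMS.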

Bring on the memory grid (preferably open sourced!).
