Session management approach – choice vs. discipline
May 23, 2007
Bring up the topic of stateful and stateless applications and you are guaranteed a house divided, almost equally on both sides. There are die-hard proponents of statelessness (and frameworks to support it; Spring for one) and those who support stateful behavior (and frameworks to support it; JBoss Seam for one).
This leaves the average developer confused: is holding state good or bad? The answer is neither: state is inevitable in some circumstances and avoidable in others.
A regular web application normally contains state, while service calls in the integration or infrastructure layer may not. The latter are often designed that way, i.e. stateless, for the sake of scalability and fail-over.
An often used approach to maintaining state is a user session. The most common choice is the HttpSession. Cluster-aware application servers handle replication of these sessions. Inefficiencies in session replication are often cited as reasons to move to a stateless design or to look for alternative means of replication. Let's take a look at common approaches to managing user sessions before we decide on the merit of this move.
Session replication choices:
- No replication. No fail-over. Sticky sessions are the only choice in a redundant server deployment.
- In-memory replication. Default behavior in J2EE application servers.
- DB based replication. Optional behavior in the .NET and Ruby on Rails platforms.
Take any approach and you will find people giving you many reasons not to use it. Some of the reasons: lopsided load in the case of stickiness, inefficient replication and cluster license cost (3 times more) in the case of in-memory replication, and increased DB I/O in the case of DB based sessions.
We might do better by addressing the underlying problem before trying to find more efficient solutions. Control over what is considered valid state, and over the size of the state object graph, matters more than the replication mechanism. I follow these principles/practices when handling state in my applications. Some of them are not new and are in fact best practice recommendations for performance, robustness and overall hygiene of the system:
- Store only key information in the session, i.e. only the minimal data with which you can reconstruct the entire session graph.
- Store only domain model or equivalent data objects. Avoid objects that hold behavior. An easy way to implement this is to wrap session access with a layer that accepts, say, only XSD-derived objects, which effectively cuts out behavioral class instances.
- Set a limit on the size of the session, i.e. avoid large session graphs. The session wrapper can enforce this.
- Persist the session only if it is dirty. This applies both where there is container support and in custom session persistence implementations.
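A minimal sketch of such a session wrapper might look like the following. It is only an illustration: the class name `GuardedSession`, the whitelist of allowed classes (standing in for "XSD-derived objects") and the byte limit are all assumptions, and the size check uses plain Java serialization as a rough measure.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical wrapper around session state; names and limits are illustrative.
final class GuardedSession {
    private final Map<String, Serializable> attributes = new HashMap<>();
    private final Set<Class<?>> allowedTypes;
    private final int maxBytes;

    GuardedSession(Set<Class<?>> allowedTypes, int maxBytes) {
        this.allowedTypes = new HashSet<>(allowedTypes);
        this.maxBytes = maxBytes;
    }

    // Accepts only whitelisted data types and rejects writes that push the
    // serialized session over the configured limit.
    void put(String key, Serializable value) {
        if (!allowedTypes.contains(value.getClass())) {
            throw new IllegalArgumentException(
                "Type not allowed in session: " + value.getClass().getName());
        }
        Serializable previous = attributes.put(key, value);
        if (serializedSize() > maxBytes) {
            // Roll back so the session stays within its limit.
            if (previous == null) {
                attributes.remove(key);
            } else {
                attributes.put(key, previous);
            }
            throw new IllegalStateException("Session would exceed " + maxBytes + " bytes");
        }
    }

    Serializable get(String key) {
        return attributes.get(key);
    }

    // Rough size estimate: the length of the Java-serialized attribute map.
    int serializedSize() {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new HashMap<>(attributes));
            out.flush();
            return bytes.size();
        } catch (IOException e) {
            throw new IllegalStateException("Session is not serializable", e);
        }
    }

    public static void main(String[] args) {
        Set<Class<?>> allowed = new HashSet<>();
        allowed.add(String.class);
        allowed.add(Long.class);
        GuardedSession session = new GuardedSession(allowed, 1024);
        session.put("userId", "u-42");
        System.out.println("userId = " + session.get("userId"));
        System.out.println("size in bytes = " + session.serializedSize());
    }
}
```

Serializing the whole map on every write is wasteful; a real wrapper would probably track sizes incrementally, but the check-and-roll-back shape would stay the same.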
An application that follows all of the above would rarely need to debate the cost of maintaining state via sessions, whether in memory or in the DB.
DB based persistence is considered expensive and a misfit for storing transient data such as user session information. Interestingly, however, frameworks like .NET and Ruby on Rails (RoR), which matured later than J2EE, provide it as an option. In fact, it is the default in RoR, if I am not wrong.
Recently I had the chance to architect an SOA based platform on which to build applications. We wanted the core services to be stateless so that they could easily scale out when required. Naturally we preferred the application servers NOT to be clustered and to be load balanced instead. The applications built on top still had to hold minimal state, however. We also decided to hide session management from the consuming applications, and implemented session persistence, and therefore recovery, using the DB. While there were initial apprehensions about DB I/O bottlenecks, adopting the principles described above helped us tide over the issue. The end applications have been in production for a year now.
The logic we used in favor of DB based sessions was this: DB access for, say, 100 concurrent users would mostly be READs, with the odd WRITE (i.e. when a session gets dirty). 100 reads of small records using an index on a table are extremely fast; there are no concurrency or transaction isolation issues, as each read is for a specific record independent of the others. In any case, we have the option to switch back (courtesy of the wrapper over session management) to HTTP sessions and clustering if performance turns out to be poor, which hasn't happened to date.
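The read-mostly access pattern can be sketched roughly as below. This is not the platform's actual code: `SessionStore`, `CountingStore`, `TrackedSession` and `SessionManager` are hypothetical names, and an in-memory map stands in for the indexed DB table so that the example runs on its own. The store interface is also where one could swap in a different backend.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical persistence boundary; a JDBC-backed implementation would
// read and write one row per session id.
interface SessionStore {
    Map<String, String> load(String sessionId);
    void save(String sessionId, Map<String, String> state);
}

// In-memory stand-in for the DB table that counts reads and writes.
final class CountingStore implements SessionStore {
    private final Map<String, Map<String, String>> rows = new HashMap<>();
    int reads;
    int writes;

    public Map<String, String> load(String sessionId) {
        reads++;
        Map<String, String> copy = new HashMap<>();
        Map<String, String> row = rows.get(sessionId);
        if (row != null) {
            copy.putAll(row);
        }
        return copy;
    }

    public void save(String sessionId, Map<String, String> state) {
        writes++;
        rows.put(sessionId, new HashMap<>(state));
    }
}

// Session state that remembers whether it changed during the request.
final class TrackedSession {
    final String id;
    private final Map<String, String> state;
    private boolean dirty;

    TrackedSession(String id, Map<String, String> state) {
        this.id = id;
        this.state = state;
    }

    String get(String key) {
        return state.get(key);
    }

    void put(String key, String value) {
        String previous = state.put(key, value);
        if (!value.equals(previous)) {
            dirty = true; // only a real change marks the session dirty
        }
    }

    boolean isDirty() {
        return dirty;
    }

    Map<String, String> snapshot() {
        return new HashMap<>(state);
    }
}

final class SessionManager {
    private final SessionStore store;

    SessionManager(SessionStore store) {
        this.store = store;
    }

    // One keyed read at the start of each request.
    TrackedSession begin(String sessionId) {
        return new TrackedSession(sessionId, store.load(sessionId));
    }

    // A write happens at the end of the request only if the session is dirty.
    void end(TrackedSession session) {
        if (session.isDirty()) {
            store.save(session.id, session.snapshot());
        }
    }
}
```

With this shape, a request that only reads the session costs a single keyed lookup, and a write is issued only for the requests that actually change state.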
To sum it up: the debate between stateful and stateless applications, and consequently the one over the most efficient session persistence/replication mechanism, is really a matter of choice if the session is handled with some discipline in the application.