Is scalability a factor of number of machines/CPUs?
February 5, 2007
Thanks to the likes of Google, the world today sees application scalability in a new light – one that is not solely dependent on Symmetric Multi-Processor (SMP) boxes. What probably does not come out clearly from such implementations is the optimization of the available CPU power.
I have read that only a portion of the world’s available processing power is actually used. On the other hand, don’t many of us worry about applications being slow? One area being addressed to improve performance in regular J2EE applications is data access, both through application strategies (partitioning of data, lazy loading, etc.) and through technologies (caches and data grids), often provided by the vendors themselves.
The ubiquitous nature of HTTP and its applications has created its own patterns of application design. The positive ones are:
- Stateless behaviour of applications
- Tiers in the application
- Application security and identity management
On the other hand, it has also led to the stereotyping of applications. I am yet to see a significant number of applications that deviate from the standard J2EE patterns: MVC -> Service Locator -> Facade -> DAO. It has constrained us in some ways:
- The flow from web to the database and back is one thread of activity
- Patterns do not let us think otherwise
- Platform specifications are considered sacred – not spawning new threads inside an EJB container, for example
In physical deployments, we consider the job done by having, say, a hardware load balancer in place to seemingly “load balance” requests between servers. It is not often that load balancing happens based on the nature of the work that needs to be done to service a request. It is usually a simple IP-level round-robin, or at best a weighted one based on the number of CPUs on a server.
This leads to the question : Is scalability a factor of number of machines/CPUs?
It appears so, unless we think differently. To illustrate the point: in an IP-level load-balanced setup, once a request is assigned to a server, the burden falls solely on the machine servicing it, which processes the entire thread of execution for a period of time while other servers may have unused processing capacity.
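A minimal sketch of the problem (class and method names are illustrative, not taken from any vendor’s product): a plain round-robin balancer pins each request to one server for its whole lifetime, ignoring how heavy the request is.

```java
import java.util.List;

// Illustrative round-robin dispatcher: each request is pinned to one
// server for its entire lifetime, regardless of how heavy the request is.
class RoundRobinBalancer {
    private final List<String> servers;
    private int next = 0;

    RoundRobinBalancer(List<String> servers) {
        this.servers = servers;
    }

    // Picks the next server in strict rotation; request weight is ignored.
    synchronized String assign() {
        String server = servers.get(next);
        next = (next + 1) % servers.size();
        return server;
    }
}
```

A heavy request that lands on server A keeps A busy for its full duration, while B and C may sit idle – the balancer never revisits the decision.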
There are ways to address this issue and ensure high CPU utilization before concluding that scalability is a factor of the number of machines/CPUs:
- Co-locating applications: different applications have varied peak loads. Co-locating applications on a shared setup (software, i.e. framework, and hardware) ensures overall better scalability and availability. [I have worked on an engagement where we had 6 applications co-deployed in production on just 2 blade servers]
- Leveraging the multi-threading capabilities of the JVM. But isn’t that against the specifications? Actually no, not if you use the features of the container to multi-thread, say, Message Driven Beans (MDBs).
There are some fundamental changes needed to the way we design applications in order to make the second point (multi-threading) a reality.
Let’s take an example. The sequence of activity in a regular J2EE application to generate invoices would involve:
- Validating the incoming data
- Grouping request data – by article, customer, country, etc.
- Retrieving master and transactional data from the RDBMS
- Calculating the invoice amount – tax, other computations
- Generating the final artifact – XML, PDF, etc.
In most designs, all five steps happen via components implemented in one or other of the stereotyped J2EE tiers, and the execution is therefore serial in nature.
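The serial flow can be sketched as follows (the step bodies are deliberately trivial placeholders; only the single-threaded, one-after-another structure matters here):

```java
// Illustrative serial invoice pipeline: the five steps run one after
// another on the single thread that picked up the request.
class SerialInvoiceJob {
    static String run(String requestData) {
        String validated = validate(requestData);
        String grouped   = group(validated);
        String enriched  = loadMasterData(grouped);
        String amount    = calculate(enriched);
        return render(amount);                      // final artifact, e.g. XML or PDF
    }
    // Placeholder implementations standing in for real business logic.
    static String validate(String d)       { return d.trim(); }
    static String group(String d)          { return d; }            // by article, customer, country
    static String loadMasterData(String d) { return d + "+master"; }
    static String calculate(String d)      { return d + "+tax"; }
    static String render(String d)         { return "<invoice>" + d + "</invoice>"; }
}
```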
What if we implemented a few of the above steps using the Command pattern, i.e. the component takes a well-defined request and produces a well-defined response using only the request data provided?
This component may then be remoted: as SOA services, or as plain remotely invocable objects, say stateless EJBs.
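A minimal sketch of such a component, assuming a hypothetical `InvoiceCommand` interface (the names and the tax step are illustrative): because the command is a pure function of its request, it carries no conversational state and is safe to remote or run in parallel.

```java
import java.util.Locale;

// Illustrative Command contract: a well-defined request in, a
// well-defined response out, no shared conversational state.
interface InvoiceCommand {
    String execute(String request);
}

class TaxCalculationCommand implements InvoiceCommand {
    private final double rate;

    TaxCalculationCommand(double rate) { this.rate = rate; }

    // Pure function of its input: safe to expose as a stateless EJB
    // or SOA service, and safe to invoke from many threads at once.
    public String execute(String request) {
        double net = Double.parseDouble(request);
        return String.format(Locale.ROOT, "%.2f", net * (1 + rate));
    }
}
```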
Now go on and implement a request processor that breaks a request up into multiple discrete units of work. Each unit of work is then a message – a request and a response. The messages may then be distributed across machines using a suitable channel – a JMS queue with MDBs listening to it. The interesting thing happens here: the container spawns MDB instances depending on the number of messages in the queue and other resource availability factors, thereby providing multi-threaded execution. The MDBs themselves may be deployed across machines to “truly” spread the load across the available machines.
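The fan-out idea can be sketched with plain JDK concurrency – here an `ExecutorService` stands in for the JMS queue, and its worker threads play the role the container’s pool of MDB instances would play in a real deployment (class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative request processor: splits one request into discrete
// units of work and executes them in parallel. The thread pool is a
// stand-in for a JMS queue with MDB listeners spawned by the container.
class ParallelInvoiceProcessor {
    static List<String> process(List<String> customers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // One message (unit of work) per customer, submitted to the "queue".
            List<Future<String>> futures = new ArrayList<>();
            for (String c : customers) {
                futures.add(pool.submit(() -> "invoice-for-" + c));
            }
            // Gather the responses, preserving request order.
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

In the container-managed version, deploying the listeners across machines spreads these same units of work over the whole cluster rather than over one machine’s threads.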
I therefore believe that scalability in a well-designed system is a factor of the number of threads that can be efficiently executed in parallel, and not just of the number of machines/CPUs.