Is scalability a factor of number of machines/CPUs?

February 5, 2007

Thanks to the likes of Google, the world today sees application scalability in a new light: it no longer depends solely on Symmetric Multi-Processor (SMP) boxes. What probably does not come out clearly from such implementations is the optimization of the available CPU power.

I read somewhere that only a fraction of the world’s available processing power is actually used. On the other hand, don’t many of us worry about applications being slow? One area being addressed to improve performance in regular J2EE applications is data access, through application strategies (partitioning of data, lazy loading, etc.) and technologies (caches and data grids), often provided by the vendors themselves.
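
To make that concrete, here is a minimal sketch of the lazy-load-plus-cache idea in plain Java. CustomerDao and Customer are hypothetical placeholders for whatever data-access layer and domain object are actually in use:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // A minimal lazy-load cache: hit the RDBMS only on first access,
    // serve subsequent reads from memory.
    public class CustomerCache {
        private final Map<String, Customer> cache = new ConcurrentHashMap<String, Customer>();
        private final CustomerDao dao;   // hypothetical DAO

        public CustomerCache(CustomerDao dao) {
            this.dao = dao;
        }

        public Customer get(String id) {
            Customer c = cache.get(id);
            if (c == null) {
                c = dao.findById(id);   // the expensive round trip happens once
                cache.put(id, c);
            }
            return c;
        }
    }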

The ubiquitous nature of HTTP and its applications has created its own patterns of application design. The positive ones are:

  • Stateless behaviour of applications
  • Tiers in the application
  • Application security and identity management

On the other hand, it has also led to the stereotyping of applications. I am yet to see a significant number of applications that deviate from the standard J2EE patterns: MVC -> Service Locator -> Facade -> DAO. This has constrained us in some ways:

  • The flow from the web to the database and back is one thread of activity
  • Patterns do not let us think otherwise
  • Platform specifications are considered sacred – not spawning new threads inside an EJB container, for example

In physical deployments, we consider the job done by having, say, a hardware load balancer in place to seemingly “load balance” requests between servers. It is not often that load balancing happens based on the nature of the work that must be done to service a request; it is usually a simple IP-level round robin, or at best a weighted one based on the number of CPUs on each server.
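
For illustration, a sketch of what such a balancer actually decides – nothing more than picking the next server in a (possibly weighted) cycle. Server is a hypothetical holder for a server address:

    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;

    // Round-robin selection; weighting is simulated by listing a server
    // once per CPU. Note what is NOT considered: the actual cost of the
    // request, or the current load on each server.
    public class RoundRobinBalancer {
        private final List<Server> servers;
        private final AtomicInteger next = new AtomicInteger(0);

        public RoundRobinBalancer(List<Server> weightedServers) {
            this.servers = weightedServers;
        }

        public Server pick() {
            int i = Math.abs(next.getAndIncrement() % servers.size());
            return servers.get(i);
        }
    }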

This leads to the question : Is scalability a factor of number of machines/CPUs?

It appears so, unless we think differently. To prove the point: in an IP-level load-balanced setup, once a request is assigned to a server, the burden is solely on the machine servicing the request, which then processes the entire thread of execution for a period of time while other servers may have unused processing capability.

There are ways to address this issue and ensure high CPU utilization before concluding that scalability is simply a factor of the number of machines/CPUs:

  1. Co-locating applications: different applications have varied peak loads. Co-locating applications on a shared setup (software, i.e. framework, and hardware) ensures better overall scalability and availability. [I have worked on an engagement where we had 6 applications co-deployed in production on just 2 blade servers.]
  2. Leveraging the multi-threading capabilities of the JVM. But isn’t that against the specifications? Actually no, if you use the features of the JVM to multi-thread via, say, a Message-Driven Bean (MDB) – see the sketch after this list.
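
A minimal EJB 3.0-style sketch of that second point: the bean spawns no threads itself; the container runs as many instances in parallel as it sees fit. (The concurrent pool size, e.g. a “maxSession” property, is vendor-specific configuration, not part of the specification.)

    import javax.ejb.ActivationConfigProperty;
    import javax.ejb.MessageDriven;
    import javax.jms.Message;
    import javax.jms.MessageListener;

    // The container instantiates and invokes many instances of this bean
    // concurrently as messages arrive, so the application gets
    // multi-threaded execution without violating the "no threads in the
    // container" rule.
    @MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType",
                                  propertyValue = "javax.jms.Queue")
    })
    public class WorkItemBean implements MessageListener {
        public void onMessage(Message message) {
            // process one unit of work; other instances run in parallel
        }
    }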

There are some fundamental changes to the way we design applications in order to make the second point (multi-threading) a reality.

Let’s take an example: the sequence of activity in a regular J2EE application to generate invoices would involve:

  1. Validating the incoming data
  2. Grouping request data – by article, customer, country, etc.
  3. Retrieving master and transactional data from the RDBMS
  4. Calculating the invoice amount – tax and other computations
  5. Generating the final artifact – XML, PDF, etc.

In most designs, steps 1 to 5 happen via components implemented in one or another of the stereotyped J2EE tiers, and the execution is therefore serial in nature.

What if we implemented a few of the above steps using the Command pattern, i.e. the component takes a well-defined request and produces a well-defined response using the request data provided?

This component may then be remoted: as an SOA service or just a plain remotely invocable object, say a stateless EJB.
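
A minimal sketch of such a component, using step 4 (the invoice calculation) as the example; InvoiceRequest and InvoiceResponse are hypothetical value objects:

    import java.io.Serializable;
    import javax.ejb.Remote;
    import javax.ejb.Stateless;

    // Hypothetical value objects: a well-defined request and response.
    class InvoiceRequest implements Serializable { /* customer, line items... */ }
    class InvoiceResponse implements Serializable { /* computed amounts... */ }

    interface InvoiceCommand {
        InvoiceResponse execute(InvoiceRequest request);
    }

    // Stateless and self-contained, so it can be remoted as-is
    // (here as an EJB 3.0 stateless session bean).
    @Stateless
    @Remote(InvoiceCommand.class)
    public class InvoiceCalculator implements InvoiceCommand {
        public InvoiceResponse execute(InvoiceRequest request) {
            // step 4: compute tax and other amounts from the request data only
            return new InvoiceResponse();
        }
    }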

Now go on and implement a request processor that breaks a request up into multiple discrete units of work. Each unit of work is then a message – a request and a response. The messages may then be distributed across machines using a suitable channel – a JMS queue with MDBs listening to it. The interesting thing happens here: the container spawns MDBs depending on the number of messages in the queue and other resource-availability factors, thereby providing multi-threaded execution. The MDBs themselves may be deployed across machines to “truly” spread the load across the available machines.
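
A sketch of that processor, reusing the hypothetical types above; the grouping logic itself is elided:

    import java.io.Serializable;
    import java.util.List;
    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;

    // One discrete, serializable unit of work (hypothetical).
    class UnitOfWork implements Serializable { /* group key, data... */ }

    public class InvoiceRequestProcessor {
        private final ConnectionFactory factory;
        private final Queue workQueue;

        public InvoiceRequestProcessor(ConnectionFactory factory, Queue workQueue) {
            this.factory = factory;
            this.workQueue = workQueue;
        }

        // Break the request up and put one message per unit on the queue;
        // the MDBs draining the queue (possibly on other machines) provide
        // the parallelism.
        public void process(InvoiceRequest request) throws Exception {
            List<UnitOfWork> units = split(request);
            Connection conn = factory.createConnection();
            try {
                Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(workQueue);
                for (UnitOfWork unit : units) {
                    producer.send(session.createObjectMessage(unit));
                }
            } finally {
                conn.close();
            }
        }

        private List<UnitOfWork> split(InvoiceRequest request) {
            // hypothetical grouping by article, customer, country...
            throw new UnsupportedOperationException("grouping logic omitted");
        }
    }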

I therefore believe that scalability in a well-designed system is a factor of the number of threads that can be efficiently executed in parallel, and not just of the number of machines/CPUs.

9 Responses to “Is scalability a factor of number of machines/CPUs?”

  1. Guy Nirpaz Says:

    Hi,

    The scalability paradigm you describe here is also known as SBA – Space-Based Architecture.
    SBA is a share-nothing architecture: self-sufficient processing units are deployed on a grid.
    Every processing unit contains business logic and data and is activated based on a content-based routing (CBR) scheme.

    When the application tiers are logical, no network overhead is involved and no data format transformation is required. The data is co-located with the business logic, and since it is in object format, there is no longer any need for the fetch (from a central data source) -> lock -> convert to object -> process -> convert to SQL -> store paradigm.

    By collapsing the tiers and bringing data and logic into the same address space, you can achieve true linear scalability and low latency, which results in much better utilization of processing power.


  2. […] posted an interesting question recently: “Is scalability a factor of the number of machines/CPUs?”. His answer can ultimately be summed up as “yes, but…” – it was qualified in terms of […]


  3. Guy & Udi,
    Thanks for your views. They are definitely encouraging.
    One point that I have struggled with is that of optimizing data access when the data is shared between the independent “processing units”.
    Obviously data grid solutions(such the Gigaspaces IMDG) might be addressing these already.
    What is the guarantee on data correctness across the nodes in the grid? It would help to know how the grid works to address this issue…..
    The best, sort of “parallel data store”, I have known till date was Oracle RAC where the data files were shared between nodes and came from fast and efficient storage(such as a SAN).

  4. Guy Nirpaz Says:

Regu,

The key to data correctness is data replication. In GigaSpaces, we have a very sophisticated cluster-replication paradigm that makes sure data is consistent across the grid.

    For more information, please have a look at: http://www.gigaspaces.com/wiki/display/GS/Data+Grid


  5. Hi Regu, this is a topic that I have a great deal of interest in, and experience with as well.

Regu: “… in an IP-level load-balanced setup, once a request is assigned to a server, the burden is solely on the machine servicing the request, which then processes the entire thread of execution for a period of time while other servers may have unused processing capability.”

    While this is theoretically true, applications will scale and perform best if you can design them to be load-balance-able AND if they are CPU constrained. In other words, you should _design_ them to max out those CPUs, so that all you have to do to scale linearly or near-linearly is to add more and more commodity servers.

    Furthermore, attempting to distribute CPU-efficient work will negatively impact both latency (how long it takes to process a request) and aggregate CPU (since distribution itself will require CPU work on both ends).

The key to achieving the goal of bottlenecking on the CPU is to eliminate “IO wait” cycles, i.e. to make sure that the application never waits for data. In our case, we have witnessed this first-hand at quite a few of the world’s largest web sites, for example. (Our Coherence software is used by most of the largest Java- and .NET-based ecommerce, travel, gambling, banking and other web sites.)

Regu: “There are ways to address this issue and ensure high CPU utilization before concluding that scalability is simply a factor of the number of machines/CPUs”

    As I mentioned, the goal should be to achieve scalability by finding a way to turn “load on the application” into “high CPU utilization in the scalable app tier”.

Regu: “Co-locating applications: different applications have varied peak loads. Co-locating applications on a shared setup (software, i.e. framework, and hardware) ensures better overall scalability and availability. [I have worked on an engagement where we had 6 applications co-deployed in production on just 2 blade servers.]”

    This is a much different type of scalability than the one that I am referring to. In the case that I am describing, a large ecommerce site (like some un-named auction site) may have 6,000 application servers running the same application to handle the web traffic, as opposed to one server hosting 6 applications.

Regu: “Leveraging the multi-threading capabilities of the JVM. But isn’t that against the specifications? Actually no, if you use the features of the JVM to multi-thread via, say, a Message-Driven Bean (MDB)”

    Yes, the container will handle messaging, HTTP requests, etc. in a multi-threaded manner. This does help significantly to achieve high CPU utilization, by allowing one thread in an “IO wait” state to go to sleep, letting another one do its work.

    Of course, the key to making it scale across machines is to make sure that those machines are not all sharing a single data source, or they will all be “IO waiting” on the same shared server (e.g. a database), and the result will be that the waits will get longer and longer.

    I hope this helps!

    Peace,

    Cameron Purdy
    http://www.tangosol.com/

  6. Sam Says:

Well, there is a way to cluster these things: why not use Terracotta, so that the load will be scaled out efficiently?

    Happy teching.

  7. imparare Says:

Interesting comments… 😀

  8. Subramanian Says:

A question here on the co-location of the web and EJB containers. Is co-location the way forward in today’s eBusiness applications? The takers for a distributed architecture (separate web and EJB containers) seem to be going down – in practice, who would want to do away with a single EAR being deployed, rather than separate deployments on the web server and app server?

    However, ease of deployment apart, would it be better to go for the co-located approach as the application – the business services – grows? Is there a compromise on security with the co-located approach? Is the decision based on the number of end users/size of the application, or are there other factors to be considered?


  9. Subramanian,

The main reason I would isolate the web parts from the business tier is deployment flexibility. By web, I do not include static content; that would of course sit on an HTTP server in the DMZ. There are no security risks in co-locating the dynamic web content (servlet-container-hosted components) and the business tier if you have infrastructure like firewalls and a DMZ set up. With the move towards POJO-based business tiers and frameworks that support them, I personally wouldn’t want to add more tiers to my application unless there are significant benefits.

    There is a strong case, however, for module-based deployment (see OSGi), and we might still break up our EAR into multiple parts. But I don’t see it being done to separate the web and EJB parts; I don’t see value there.

    The number of end users does not affect the deployment approach. The size of the application might influence deployment if it is a standalone application deployed via the network. For server-side J2EE components, you might still break the application up for benefits like those you get with OSGi, i.e. control over dependencies, dynamic deployment without server restarts, the ability to break up monolithic applications into discrete modules, etc.

