<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Is scalability a factor of number of machines/CPUs?</title>
	<atom:link href="http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/feed/" rel="self" type="application/rss+xml" />
	<link>http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/</link>
	<description>A convenient place for me to capture my thoughts and experiences on technology</description>
	<lastBuildDate>Fri, 02 Oct 2009 11:16:31 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: regumindtrail</title>
		<link>http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-1405</link>
		<dc:creator>regumindtrail</dc:creator>
		<pubDate>Thu, 22 May 2008 05:32:55 +0000</pubDate>
		<guid isPermaLink="false">http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-1405</guid>
		<description>Subramanian,

The main reason I would isolate the web parts and the business tier is for deployment flexibility. By web, I do not include static content. Static content would of course sit on an HTTP server in the DMZ. There are no security risks in collocating the dynamic web content (servlet-container-hosted components) and the business tier if you have infrastructure like firewalls and a DMZ set up. With the move towards POJO-based business tiers and frameworks that support them, I personally wouldn't want to add more tiers to my application unless there are significant benefits.

There is a strong case, however, for module-based deployment (see OSGi), and we might still break up our EAR into multiple parts. But I don't see it being done to separate the web and EJB parts. I don't see value there.

The number of end users does not affect the deployment approach. The size of the application might influence deployment if it is a standalone application deployed via the network. For server-side J2EE components, you might still break it up for benefits like those you get with OSGi, i.e. control over dependencies, dynamic deployment without server restart, and the ability to break up monolithic applications into discrete modules.</description>
		<content:encoded><![CDATA[<p>Subramanian,</p>
<p>The main reason I would isolate the web parts and the business tier is for deployment flexibility. By web, I do not include static content. Static content would of course sit on an HTTP server in the DMZ. There are no security risks in collocating the dynamic web content (servlet-container-hosted components) and the business tier if you have infrastructure like firewalls and a DMZ set up. With the move towards POJO-based business tiers and frameworks that support them, I personally wouldn't want to add more tiers to my application unless there are significant benefits.</p>
<p>There is a strong case, however, for module-based deployment (see OSGi), and we might still break up our EAR into multiple parts. But I don't see it being done to separate the web and EJB parts. I don't see value there.</p>
<p>The number of end users does not affect the deployment approach. The size of the application might influence deployment if it is a standalone application deployed via the network. For server-side J2EE components, you might still break it up for benefits like those you get with OSGi, i.e. control over dependencies, dynamic deployment without server restart, and the ability to break up monolithic applications into discrete modules.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Subramanian</title>
		<link>http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-1404</link>
		<dc:creator>Subramanian</dc:creator>
		<pubDate>Wed, 21 May 2008 13:07:34 +0000</pubDate>
		<guid isPermaLink="false">http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-1404</guid>
		<description>A question here on collocation of the web and EJB containers. Is collocation the way forward in today's eBusiness applications? The takers for a distributed architecture (separate web and EJB containers) seem to be dwindling - in practice, who would want to give up deploying a single EAR in favor of separate deployments to the web server and app server?

However, ease of deployment apart - would it be better to go for the collocated approach as the application - the business services - grows? Is there a compromise on security with the collocated approach? Is the decision based on the number of end users or the size of the application - or what other factors need to be considered?</description>
		<content:encoded><![CDATA[<p>A question here on collocation of the web and EJB containers. Is collocation the way forward in today's eBusiness applications? The takers for a distributed architecture (separate web and EJB containers) seem to be dwindling &#8211; in practice, who would want to give up deploying a single EAR in favor of separate deployments to the web server and app server?</p>
<p>However, ease of deployment apart &#8211; would it be better to go for the collocated approach as the application &#8211; the business services &#8211; grows? Is there a compromise on security with the collocated approach? Is the decision based on the number of end users or the size of the application &#8211; or what other factors need to be considered?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: imparare</title>
		<link>http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-59</link>
		<dc:creator>imparare</dc:creator>
		<pubDate>Sun, 15 Apr 2007 08:34:32 +0000</pubDate>
		<guid isPermaLink="false">http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-59</guid>
		<description>Interesting comments.. :D</description>
		<content:encoded><![CDATA[<p>Interesting comments.. <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sam</title>
		<link>http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-10</link>
		<dc:creator>Sam</dc:creator>
		<pubDate>Fri, 16 Feb 2007 19:48:46 +0000</pubDate>
		<guid isPermaLink="false">http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-10</guid>
		<description>Well, there is a way to cluster things: why not use Terracotta, so that the load is scaled out efficiently?

Happy teching.</description>
		<content:encoded><![CDATA[<p>Well, there is a way to cluster things: why not use Terracotta, so that the load is scaled out efficiently?</p>
<p>Happy teching.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Cameron Purdy</title>
		<link>http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-9</link>
		<dc:creator>Cameron Purdy</dc:creator>
		<pubDate>Tue, 13 Feb 2007 20:10:31 +0000</pubDate>
		<guid isPermaLink="false">http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-9</guid>
		<description>Hi Regu, this is a topic that I have a great deal of interest in, and experience with as well.

Regu: &quot;... in a IP level load-balanced setup, once a request is assigned to a server, the burden is solely on the machine servicing the request and thereby processing the entire thread of execution for a period of time when other servers could have unused processing capability.&quot;

While this is theoretically true, applications will scale and perform best if you can design them to be load-balance-able AND if they are CPU constrained. In other words, you should _design_ them to max out those CPUs, so that all you have to do to scale linearly or near-linearly is to add more and more commodity servers.

Furthermore, attempting to distribute CPU-efficient work will negatively impact both latency (how long it takes to process a request) and aggregate CPU (since distribution itself will require CPU work on both ends).

The key to achieving the goal of bottlenecking on a CPU is to eliminate &quot;IO wait&quot; cycles, i.e. to make sure that the application never waits for data. We have witnessed this first-hand at quite a few of the world's largest web sites. (Our Coherence software is used by most of the largest Java- and .NET-based ecommerce, travel, gambling, banking and other web sites.)

Regu: &quot;There are ways to address this issue and ensure high CPU utilization before deciding that scalability is a factor of number of machines/CPU&quot;

As I mentioned, the goal should be to achieve scalability by finding a way to turn &quot;load on the application&quot; into &quot;high CPU utilization in the scalable app tier&quot;.

Regu: &quot;Co-locating applications : different applications have varied peak loads. Co-locating applications on a shared setup(software i.e framework, hardware)  ensures overall better scalability and availability. [I have worked on an engagement where we have 6 applications co-deployed in production on just 2 blade servers]&quot;

This is a very different type of scalability from the one that I am referring to. In the case that I am describing, a large ecommerce site (like some unnamed auction site) may have 6,000 application servers running the same application to handle the web traffic, as opposed to one server hosting 6 applications.

Regu: &quot;Leveraging multi-threading capabilities of the JVM. Now but isnt that against the specifications? Actually no, if you use the features of the JVM to multi-thread say Message Driven Bean(MDB) for e.g&quot;

Yes, the container will handle messaging, HTTP requests, etc. in a multi-threaded manner. This does help significantly to achieve high CPU utilization, by allowing one thread in an &quot;IO wait&quot; state to go to sleep, letting another one do its work.

Of course, the key to making it scale across machines is to make sure that those machines are not all sharing a single data source, or they will all be &quot;IO waiting&quot; on the same shared server (e.g. a database), and the result will be that the waits will get longer and longer.

I hope this helps!

Peace,

Cameron Purdy
http://www.tangosol.com/</description>
		<content:encoded><![CDATA[<p>Hi Regu, this is a topic that I have a great deal of interest in, and experience with as well.</p>
<p>Regu: &#8220;&#8230; in a IP level load-balanced setup, once a request is assigned to a server, the burden is solely on the machine servicing the request and thereby processing the entire thread of execution for a period of time when other servers could have unused processing capability.&#8221;</p>
<p>While this is theoretically true, applications will scale and perform best if you can design them to be load-balance-able AND if they are CPU constrained. In other words, you should _design_ them to max out those CPUs, so that all you have to do to scale linearly or near-linearly is to add more and more commodity servers.</p>
<p>Furthermore, attempting to distribute CPU-efficient work will negatively impact both latency (how long it takes to process a request) and aggregate CPU (since distribution itself will require CPU work on both ends).</p>
<p>The key to achieving the goal of bottlenecking on a CPU is to eliminate &#8220;IO wait&#8221; cycles, i.e. to make sure that the application never waits for data. We have witnessed this first-hand at quite a few of the world's largest web sites. (Our Coherence software is used by most of the largest Java- and .NET-based ecommerce, travel, gambling, banking and other web sites.)</p>
<p>Regu: &#8220;There are ways to address this issue and ensure high CPU utilization before deciding that scalability is a factor of number of machines/CPU&#8221;</p>
<p>As I mentioned, the goal should be to achieve scalability by finding a way to turn &#8220;load on the application&#8221; into &#8220;high CPU utilization in the scalable app tier&#8221;.</p>
<p>Regu: &#8220;Co-locating applications : different applications have varied peak loads. Co-locating applications on a shared setup(software i.e framework, hardware)  ensures overall better scalability and availability. [I have worked on an engagement where we have 6 applications co-deployed in production on just 2 blade servers]&#8221;</p>
<p>This is a very different type of scalability from the one that I am referring to. In the case that I am describing, a large ecommerce site (like some unnamed auction site) may have 6,000 application servers running the same application to handle the web traffic, as opposed to one server hosting 6 applications.</p>
<p>Regu: &#8220;Leveraging multi-threading capabilities of the JVM. Now but isnt that against the specifications? Actually no, if you use the features of the JVM to multi-thread say Message Driven Bean(MDB) for e.g&#8221;</p>
<p>Yes, the container will handle messaging, HTTP requests, etc. in a multi-threaded manner. This does help significantly to achieve high CPU utilization, by allowing one thread in an &#8220;IO wait&#8221; state to go to sleep, letting another one do its work.</p>
<p>Of course, the key to making it scale across machines is to make sure that those machines are not all sharing a single data source, or they will all be &#8220;IO waiting&#8221; on the same shared server (e.g. a database), and the result will be that the waits will get longer and longer.</p>
<p>I hope this helps!</p>
<p>Peace,</p>
<p>Cameron Purdy<br />
<a href="http://www.tangosol.com/" rel="nofollow">http://www.tangosol.com/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Guy Nirpaz</title>
		<link>http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-8</link>
		<dc:creator>Guy Nirpaz</dc:creator>
		<pubDate>Tue, 13 Feb 2007 19:15:36 +0000</pubDate>
		<guid isPermaLink="false">http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-8</guid>
		<description>Regu,

The key to data correctness is data replication. In GigaSpaces we have a very sophisticated cluster replication paradigm that makes sure data is consistent across the grid.

For more information, please have a look at: http://www.gigaspaces.com/wiki/display/GS/Data+Grid</description>
		<content:encoded><![CDATA[<p>Regu,</p>
<p>The key to data correctness is data replication. In GigaSpaces we have a very sophisticated cluster replication paradigm that makes sure data is consistent across the grid.</p>
<p>For more information, please have a look at: <a href="http://www.gigaspaces.com/wiki/display/GS/Data+Grid" rel="nofollow">http://www.gigaspaces.com/wiki/display/GS/Data+Grid</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: regumindtrail</title>
		<link>http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-7</link>
		<dc:creator>regumindtrail</dc:creator>
		<pubDate>Tue, 13 Feb 2007 06:52:38 +0000</pubDate>
		<guid isPermaLink="false">http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-7</guid>
		<description>Guy &amp; Udi,
Thanks for your views. They are definitely encouraging.
One point that I have struggled with is that of optimizing data access when the data is shared between the independent &quot;processing units&quot;.
Obviously, data grid solutions (such as the GigaSpaces IMDG) might be addressing this already.
What is the guarantee on data correctness across the nodes in the grid? It would help to know how the grid works to address this issue.
The best sort of &quot;parallel data store&quot; I have known to date is Oracle RAC, where the data files are shared between nodes and come from fast and efficient storage (such as a SAN).</description>
		<content:encoded><![CDATA[<p>Guy &amp; Udi,<br />
Thanks for your views. They are definitely encouraging.<br />
One point that I have struggled with is that of optimizing data access when the data is shared between the independent &#8220;processing units&#8221;.<br />
Obviously, data grid solutions (such as the GigaSpaces IMDG) might be addressing this already.<br />
What is the guarantee on data correctness across the nodes in the grid? It would help to know how the grid works to address this issue.<br />
The best sort of &#8220;parallel data store&#8221; I have known to date is Oracle RAC, where the data files are shared between nodes and come from fast and efficient storage (such as a SAN).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Udi Dahan - The Software Simplist &#187; Blog Archive &#187; So, how many machines/CPUs do we need?</title>
		<link>http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-6</link>
		<dc:creator>Udi Dahan - The Software Simplist &#187; Blog Archive &#187; So, how many machines/CPUs do we need?</dc:creator>
		<pubDate>Mon, 12 Feb 2007 13:02:27 +0000</pubDate>
		<guid isPermaLink="false">http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-6</guid>
		<description>[...] posted an interesting question recently: “Is scalability a factor of the number of machines/CPUs?”. His answer can ultimately be summed up as “yes, but…” – it was qualified in terms of [...]</description>
		<content:encoded><![CDATA[<p>[...] posted an interesting question recently: “Is scalability a factor of the number of machines/CPUs?”. His answer can ultimately be summed up as “yes, but…” – it was qualified in terms of [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Guy Nirpaz</title>
		<link>http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-5</link>
		<dc:creator>Guy Nirpaz</dc:creator>
		<pubDate>Fri, 09 Feb 2007 15:31:38 +0000</pubDate>
		<guid isPermaLink="false">http://regumindtrail.wordpress.com/2007/02/05/is-scalability-a-factor-of-number-of-machinescpus/#comment-5</guid>
		<description>Hi,

The scalability paradigm you describe here is also known as SBA - Space Based Architecture.
SBA is a shared-nothing architecture. Self-sufficient processing units are deployed on a grid.
Every processing unit contains business logic and data and is activated based on a CBR (content-based routing) scheme.

When application tiers are logical, no network overhead is involved and no data format transformation is required. The data is co-located with the business logic and is in object format, so there is no need for the fetch (from a central data source) -&gt; lock -&gt; convert to object -&gt; process -&gt; convert to SQL -&gt; store paradigm anymore.

By collapsing the tiers and bringing data and logic into the same address space, you can achieve true linear scalability and low latency, which results in much better utilization of processing power.</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>The scalability paradigm you describe here is also known as SBA &#8211; Space Based Architecture.<br />
SBA is a shared-nothing architecture. Self-sufficient processing units are deployed on a grid.<br />
Every processing unit contains business logic and data and is activated based on a CBR (content-based routing) scheme.</p>
<p>When application tiers are logical, no network overhead is involved and no data format transformation is required. The data is co-located with the business logic and is in object format, so there is no need for the fetch (from a central data source) -&gt; lock -&gt; convert to object -&gt; process -&gt; convert to SQL -&gt; store paradigm anymore.</p>
<p>By collapsing the tiers and bringing data and logic into the same address space, you can achieve true linear scalability and low latency, which results in much better utilization of processing power.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
