While writing Netty socket listeners for the Flipkart Phantom project, we realized the need for a Unix Domain Sockets (UDS) based transport. We wrote one and published more details here: https://github.com/Flipkart/phantom/wiki/Unix-domain-socket-transport-for-netty

It is a UDS transport implementation for Netty and is compliant with Netty version 3.3.x. The transport is a port of the Netty OIO transport implementation and uses UDS-specific APIs. It may be used independently of Phantom and depends only on the Netty and junixsocket libraries.
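
For readers unfamiliar with Unix domain sockets, here is a minimal sketch of a UDS echo exchange at the operating system level. It is not the Phantom or Netty transport and does not use junixsocket; it uses only the Python standard library, and the function name and socket path are made up for the illustration. It assumes a Unix-like OS.

```python
import os
import socket
import tempfile
import threading


def uds_echo_once(message: bytes) -> bytes:
    """Send `message` to a one-shot echo server over a Unix domain socket."""
    # Hypothetical socket path inside a private temporary directory.
    sock_dir = tempfile.mkdtemp()
    sock_path = os.path.join(sock_dir, "echo.sock")

    # A UDS server binds to a filesystem path instead of a host and port.
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(sock_path)
    server.listen(1)

    def serve() -> None:
        # Accept a single connection and echo back what was received.
        conn, _ = server.accept()
        with conn:
            conn.sendall(conn.recv(4096))

    worker = threading.Thread(target=serve, daemon=True)
    worker.start()
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            client.connect(sock_path)
            client.sendall(message)
            reply = client.recv(4096)
    finally:
        worker.join(timeout=5)
        server.close()
        os.unlink(sock_path)
        os.rmdir(sock_dir)
    return reply
```

Because both ends live on the same host, the exchange never touches the network stack, which is the main attraction of UDS for local inter-process communication.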


2012 in review

December 31, 2012

The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.

Here’s an excerpt:

600 people reached the top of Mt. Everest in 2012. This blog got about 5,300 views in 2012. If every person who reached the top of Mt. Everest viewed this blog, it would have taken 9 years to get that many views.


Code commits on Trooper (https://github.com/regunathb/Trooper/) have kept me busy the last few weeks. Trooper is an umbrella project for a number of things.

Trooper now has a RabbitMQ connector for Mule.

Search Google for Big Data and you are likely to find people talking about two things:

  • How Open Source solutions like Hadoop have pioneered this space
  • How some companies have used these solutions to build large scale analytics solutions and business intelligence modules.

Read more and you will find mention of Map Reduce and how many of the NoSQL data stores support this useful “Data Locality” pattern – taking compute to where the data is.

Hadoop users and the creators themselves acknowledge that the technology is good for “streaming reads” and supports high throughput at the cost of latency. This constraint, and the fact that Map Reduce tasks are very I/O bound, makes it seemingly unsuitable for use cases that involve users waiting for a response, such as OLTP applications.

While all of the above is relevant and mostly true, is it also leading to a certain stereotyping – that of equating Big Data to Big Analytics?

It might be useful to describe Big Data first. Gartner categorizes data build-up in an enterprise under three heads: Volume, Variety and Velocity. Rapid growth in any of these categories, or in a combination of them, results in Big Data. Note that none of the categories distinguishes transaction processing from analytics, which implies that Big Data is not just Big Analytics.

Big Data solutions need not be limited to Big Analytics and may extend to low latency data access workloads as well. A few random thoughts on patterns and solutions:

  • Data Sharding – useful to scale low latency data stores like RDBMS to store Big Data. Sharding may be built into application code, handled by an intermediary between the application and the data store, or supported natively by the data store through auto-sharding.
  • Data Stores by purpose – Big Data invariably means distribution and may result in data duplication, within a single store or across several. For example, data extracts from a DFS like Hadoop may also be stored in a high-speed NoSQL store or a sharded RDBMS and accessed via secondary indices. This could lead to the trade-offs described by the CAP theorem (http://en.wikipedia.org/wiki/CAP_theorem).
  • Data Stores that effectively leverage CPU, RAM and disk space – Moore’s Law has held true over the last few years, and data stores like Google Bigtable (or HBase) successfully leverage the trend of abundant commodity compute, memory and storage.
  • Optimized Compute Patterns – Efforts like Peregrine (http://peregrine_mapreduce.bitbucket.org/) that support pipelined Map Reduce jobs.
  • Data aware Grid topologies – A compute grid where worker participation in a compute task is influenced by data available locally to the worker, usually in-memory. Note that this is different from the data locality pattern implemented in most Map Reduce frameworks.
  • And more…
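
To make the first pattern concrete, here is a minimal sketch of sharding built into application code. It assumes a fixed set of shards and a simple hash-modulo routing rule. The class names and the in-memory dictionaries standing in for real data stores are hypothetical, and a production setup would also have to deal with re-sharding when shards are added or removed.

```python
import hashlib


class ShardRouter:
    """Routes keys to one of a fixed set of shards using a stable hash."""

    def __init__(self, shard_names):
        self.shard_names = list(shard_names)
        if not self.shard_names:
            raise ValueError("at least one shard is required")

    def shard_for(self, key: str) -> str:
        # A cryptographic hash is used only because it is stable across
        # processes, unlike Python's built-in hash() for strings.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shard_names)
        return self.shard_names[index]


class ShardedStore:
    """Toy key-value store; each dict stands in for a separate database."""

    def __init__(self, shard_names):
        self.router = ShardRouter(shard_names)
        self.shards = {name: {} for name in self.router.shard_names}

    def put(self, key: str, value) -> None:
        self.shards[self.router.shard_for(key)][key] = value

    def get(self, key: str):
        return self.shards[self.router.shard_for(key)].get(key)


store = ShardedStore(["shard-0", "shard-1", "shard-2"])
for user_id in range(100):
    store.put(f"user:{user_id}", {"id": user_id})
```

Each read or write touches exactly one shard, which is what keeps latency low as the total data volume grows.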

It may suffice to say that Big Analytics has been the most visible and commonly deployed use case on Big Data. New age companies, especially the internet based ones, have been using Big Data technologies to deliver content sharing, email, instant messaging and social platform services in near real time. Enterprises are slowly but surely warming up to this trend.

The big software vendors use the term Service Registry synonymously with SOA Governance. In the process they inadvertently lead readers to think that setting up a Service Registry is enough to ensure SOA Governance. I wrote an article on this subject that was published here: http://www.cioupdate.com/insights/article.php/3886106/SOA-Governance-Requires-More-than-a-Service-Registry.htm

I wrote this article for CIOUpdate.com (http://www.cioupdate.com) a while back on SOA and its relation to the Cloud. The article introduces the two concepts and compares them from different perspectives.

You can find the original article here: http://www.cioupdate.com/reports/article.php/3853076/How-SOA-and-the-Cloud-Relate.htm

Mention SOA or Services and most of your audience would immediately relate it to web-services – yes, the often unintended misuse of XML over HTTP that gives the technology and anything related to it a bad name in the world of high-performance J2EE applications.

Two of the biggest culprits in loss of performance are I/O and transformation overheads. Web services have both these drawbacks: increased data transfer, i.e. higher I/O caused by the markup overhead of well-formed XML, and the CPU overhead of converting XML to Java and back, also known as marshalling.
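
Both overheads can be seen in a small sketch that encodes the same record as XML and as a compact binary layout. The record fields, element names and binary layout here are invented for the illustration. The binary format simply stands in for any compact native encoding, and the comparison is meant to show the shape of the problem, not to serve as a benchmark.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical fixed layout: 8-byte order id, 4-byte quantity, 8-byte price.
_BINARY_LAYOUT = struct.Struct("!qid")


def to_xml(order_id: int, quantity: int, price: float) -> bytes:
    """Marshal the record into a well-formed XML document."""
    root = ET.Element("order")
    ET.SubElement(root, "orderId").text = str(order_id)
    ET.SubElement(root, "quantity").text = str(quantity)
    ET.SubElement(root, "price").text = repr(price)
    return ET.tostring(root, encoding="utf-8")


def from_xml(payload: bytes) -> tuple:
    """Unmarshal: parse the markup, then convert every field from text."""
    root = ET.fromstring(payload)
    return (
        int(root.findtext("orderId")),
        int(root.findtext("quantity")),
        float(root.findtext("price")),
    )


def to_binary(order_id: int, quantity: int, price: float) -> bytes:
    return _BINARY_LAYOUT.pack(order_id, quantity, price)


def from_binary(payload: bytes) -> tuple:
    return _BINARY_LAYOUT.unpack(payload)


record = (1001, 3, 19.99)
xml_payload = to_xml(*record)
binary_payload = to_binary(*record)
```

The XML payload carries the element names and declaration on every message, and reading it back requires parsing text and converting each field, whereas the binary payload is a fixed 20 bytes that unpacks directly into typed values.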

Web-services and their use of XML over HTTP are good when genuinely needed, for example when exposing services to partner organizations whose consuming technologies are not known, or when integrating disparate systems. Unfortunately, this need for integration often leads people to stereotype the services in an SOA as web-services.

The question then is: can we reap the benefits of SOA and not suffer the overheads inherent in web-services? I believe we can.

Quite a while back, I read this excellent IBM Redbook: Implementing SOA using ESB, where the author recommends deploying a B2B Gateway external to the ESB. I must admit it did not make much sense to me then. I have come to appreciate it much better these days. A B2B Gateway enables consumption of services by a “third-party”. This “third-party” may be a client from a different technology platform or from an altogether different organization.

A separate B2B gateway introduces the possibility of:

  • Making the web-service channel independent of the service implementation, so that using (and therefore suffering) the XML over HTTP interface becomes a matter of choice
  • Introducing the much required security standards (and implementations) for securing services and the data managed by the services
  • Using third party implementations that specialize in implementing WS-* policies
  • Using hardware to augment the processing capability provided by software frameworks – e.g. XML appliances

The SOA runtime must therefore enable services to be written independently of XML and the WS-* specifications and constraints. The web-service interface is then an optional channel, via a B2B Gateway, to invoke the services.
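
The separation can be sketched as follows. This is an illustration of the idea under simple assumptions, not the design of any particular product: the service, its fields and the gateway function are all hypothetical names. The service works on plain data, and the XML handling lives entirely in a gateway adapter that callers may or may not go through.

```python
import xml.etree.ElementTree as ET


def quote_service(request: dict) -> dict:
    """Hypothetical service: plain data in, plain data out, no XML awareness."""
    total = request["quantity"] * request["unit_price"]
    return {"item": request["item"], "total": total}


def invoke_in_process(request: dict) -> dict:
    """Native channel: callers on the same platform skip marshalling entirely."""
    return quote_service(request)


def invoke_via_xml_gateway(xml_request: bytes) -> bytes:
    """Gateway channel: XML is translated at the edge, outside the service."""
    root = ET.fromstring(xml_request)
    request = {
        "item": root.findtext("item"),
        "quantity": int(root.findtext("quantity")),
        "unit_price": float(root.findtext("unitPrice")),
    }
    response = quote_service(request)

    out = ET.Element("quoteResponse")
    ET.SubElement(out, "item").text = response["item"]
    ET.SubElement(out, "total").text = repr(response["total"])
    return ET.tostring(out)


native_reply = invoke_in_process(
    {"item": "widget", "quantity": 3, "unit_price": 2.5}
)
xml_reply = invoke_via_xml_gateway(
    b"<quoteRequest><item>widget</item>"
    b"<quantity>3</quantity><unitPrice>2.5</unitPrice></quoteRequest>"
)
```

Only callers that need the XML interface pay for parsing and marshalling; the service implementation is identical for both channels.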

We, at MindTree, have taken this design further in our implementation of Momentum – an SOA based delivery Platform. Interfaces like JMS and web-services are optional channels provided by the Framework to invoke any deployed service. A schematic that explains this approach is shown below:


Request flow to a Service from different channels

The web-service interface is therefore an optional means to invoke your service when you separate the service container and the (optional) ESB from the B2B Gateway and deploy the latter as separate infrastructure. You can then benefit from the good of web-services without compromising on your service’s QoS.