Rethinking Cloud Computing - From Client/Server to P2P
Krishnan Subramanian, Analyst and Researcher, www.krishworld.com
We are still in the early days of Cloud Computing, and the technologies underlying this concept are maturing fast. However, the services offered by different vendors are facing downtimes. In fact, such downtimes were also part of traditional infrastructure approaches like on-premise hosting and managed hosting. Because Cloud Computing is a new technology that has already captivated the imagination of businesses worldwide, we are also seeing some harsh criticism from certain segments of the media and the tech blogosphere. Even though some of it is unwarranted, there is some merit in these criticisms, and it is time for us to do a reality check. As I have mentioned in this blog and elsewhere, I believe in a future where we will have an ecosystem of open federated clouds, and I am pretty happy with the current state of progress in Cloud Computing. However, for the sake of this analysis, I would like to wear a contrarian hat and do a complete rethink of the Cloud Computing architectural model. Let us do a brief recap of how the Cloud is architected at present and then rethink this model to keep such downtimes to a bare minimum.
Before describing the nature of Cloud Computing as it exists today, let us dig back into the history of computing. Until a few decades ago, computing was done on huge centralized mainframes and supercomputers, which users accessed through dumb text-based terminals. All the software, peripherals, etc. were part of these huge, powerful centralized machines and were centrally managed by dedicated teams. This centralized client-server model of computing was in vogue for quite some time before the PC revolution ushered in a new era of distributed client-server computing. This new client-server model saw the federation of management and offered greater flexibility than the centralized client-server model.
The past few years saw the emergence of Cloud Computing, which is a much more sophisticated evolution of the centralized client-server system but is built using large numbers of cheaper x86 systems. Even though the computing resources in the Cloud model appear to be centralized, as in the client-server model of the mainframe years, there are some significant differences. In the traditional mainframe client-server model, the work was split between the server and the client, whereas in the Cloud model the work is done completely on the "server" side (I use the double quotes here to differentiate it from a single powerful server). In the traditional model, the server was a single powerful machine like a mainframe or a supercomputer, whereas in the Cloud model the "server" is actually a server farm with hundreds or thousands of cheap, low-end x86 machines that act as a centralized computing resource.
Even though the Cloud model is a much more sophisticated evolution of the previous client-server models, we are still dealing with a "centralized resource" from a single vendor. Some of the big vendors use geographically distributed datacenters and state-of-the-art virtualization or "fabric" technologies to offer high reliability in terms of uptime. However, that is not the case with all vendors. Many of them use a single datacenter and a Cloud-like architecture to offer their infrastructure services. This leads to a single point of failure, as happened in the case of Rackspace recently. Even with geo-distributed datacenters, there are partial outages, like the recent lightning strike on one of Amazon's datacenters. A way to minimize downtime is to do a complete reboot of the way we think about Cloud Computing and architect it using P2P technologies.
P2P, also known as Peer to Peer, is defined in Wikipedia as follows: "Peer-to-peer (P2P) networking is a method of delivering computer network services in which the participants share a portion of their own resources, such as processing power, disk storage, network bandwidth, printing facilities. Such resources are provided directly to other participants without intermediary network hosts or servers. Peer-to-peer network participants are providers and consumers of network services simultaneously, which contrasts with other service models, such as traditional client-server computing where the clients only consume the server's resources."
The idea of using P2P technology instead of the client-server model for Cloud Computing is nothing new. It has been discussed and debated in the hallways of academia for quite some time now. Even in the tech blogosphere, Bernard Lunn of ReadWriteWeb wondered about the use of P2P in Cloud Computing in one of his posts last year. In fact, some Cloud storage providers, like Wuala, are already using the technology. P2P is a great fit for Cloud storage systems, offering much-needed reliability. Another area where P2P can play a major role is Content Delivery Networks (CDNs), which are usually offered as an extension to Cloud storage offerings. It has been established that the use of Erasure Resilient Codes in P2P-based Cloud storage systems greatly improves the reliability of Cloud storage and also eliminates the need for redundant backup servers. They are also quite effective against malicious attacks, thereby offering higher levels of security. These low-cost, highly reliable, secure Cloud storage systems can be a boon to enterprise customers.
P2P on the computing side is not something altogether new either. It is an extension of the distributed computing model used in projects like SETI@Home, protein folding, financial modeling, etc. It is possible to build a Cloud that taps into the idle CPU cycles of desktops and, let me be a bit bold here, the servers in some enterprise datacenters. This underlying P2P technology can be masked with a "fabric" that offers the perception of a centralized computing resource. Of course, such a P2P cloud cannot be built on top of naked P2P nodes alone; it needs a hybrid approach that includes servers for management, messaging, monitoring, etc. (much like how it is done for Skype). I am not discussing my childhood fantasy here in this article.
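As an aside on the erasure-coding point made above, here is a minimal Python sketch of the simplest possible erasure code: k data fragments plus one XOR parity fragment, each imagined to live on a different peer. It is an illustration only, not how Wuala or any other provider actually does it; the function names (encode, decode) and the fragment layout are assumptions made up for this example. Production systems use stronger codes that survive several lost fragments, but the principle is the same: any lost fragment can be rebuilt from the survivors without keeping a full mirror.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size fragments and append one XOR parity fragment.

    Hypothetical helper: the caller must remember len(data) to strip padding later.
    """
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def decode(fragments: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the data when at most one of the k + 1 fragments is missing (None)."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) > 1:
        raise ValueError("a single parity fragment can repair only one loss")
    repaired = list(fragments)
    if missing:
        survivors = [frag for frag in fragments if frag is not None]
        # XOR of all surviving fragments equals the missing one.
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return b"".join(repaired[:-1])[:length]


if __name__ == "__main__":
    message = b"peer-to-peer cloud storage"
    pieces = encode(message, k=4)   # 5 fragments, one per (imaginary) peer
    pieces[2] = None                # one peer goes offline
    assert decode(pieces, len(message)) == message
```

With k = 4 the storage overhead is 5/4 of the original size, compared with 2x for a full backup copy, which is the sense in which erasure coding can stand in for redundant backup servers.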
In fact, a group of researchers at the University of Western Ontario has been working on a project to build a stable, predictable P2P cloud infrastructure that leverages idle CPU cycles. There are also other academic research groups looking at the P2P-based Cloud model. I am not aware of any company, like IBM, doing research on the P2P Cloud, but it is just a matter of time before some of these companies start exploring in this direction.
Let us now take a moment to see the advantages of a P2P Cloud:
- Better reliability than the "client-server" cloud.
- Much more cost effective, because there is no need to build expensive datacenters.
- Because expensive datacenters are not needed, even startups can be Cloud infrastructure players. This eliminates the possibility of a few players holding monopoly control over Cloud infrastructure services.
- Easy scalability.
- People who offer their idle CPU time will benefit commercially, and an ecosystem of P2P node providers will develop around every provider's infrastructure. This opens up new business opportunities for many, thereby helping us socially.
The downsides are:
- It is a long path before the technology matures enough to be used in production systems.
- Security and privacy are issues to be tackled.
- Ensuring the predictability of nodes is a big problem to be solved.
- Network infrastructure needs to be revamped to handle the extra traffic. (Greg Ness can add more to this point.)
- Management is another difficult problem to solve.
- Lack of control compared to the "client-server" architecture will make people uneasy.
- Regulatory issues could be a big deterrent.
Well, a P2P-based Cloud is a realistic possibility, but it is not clear if it will be more effective than the current "client-server" model. However, its reliability cannot be questioned, and Skype is a good example of this. The technology is still inside the academic labs, and there is a long way to go before we see its adoption on the commercial side.
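The reliability advantage and the node-predictability concern discussed above can be put side by side with a little arithmetic. The Python sketch below is a back-of-the-envelope model, not a measurement: it assumes every peer is online independently with the same probability p, which real networks will not satisfy exactly, and the function name and the sample numbers (p = 0.8, 3 replicas, 10-of-30 fragments) are purely illustrative. It computes the chance that at least k of n peers holding fragments are reachable, which is what an erasure-coded object needs in order to be readable.

```python
from math import comb


def availability(n: int, k: int, p: float) -> float:
    """Probability that at least k of n peers are online.

    Assumes (for illustration only) that peers fail independently and that
    each one is online with the same probability p.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    p = 0.8  # assumed per-peer uptime, deliberately unreliable

    # Three full copies: readable if any 1 of 3 peers is up (3x storage).
    replicated = availability(n=3, k=1, p=p)   # 1 - 0.2**3, about 0.992

    # Erasure coded: readable if any 10 of 30 fragments are up (also 3x storage).
    coded = availability(n=30, k=10, p=p)

    print(f"3 replicas:       {replicated:.6f}")
    print(f"10-of-30 coding:  {coded:.12f}")
```

Under these assumptions, spreading many small fragments over unreliable desktops gives better availability than a few full copies for the same storage cost. The model says nothing about correlated failures, such as a network outage taking many peers offline at once, which is exactly where the predictability problem bites.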
It is important that we explore the P2P model as an alternative to the current Cloud Computing model.