The undeniable highlight of Thursday’s cloud-happy Structure 09 conference was Vijay Gill, Google senior manager of engineering and architecture. As he described how Google’s famously distributed infrastructure shames the Redmond competition, he would occasionally point his audience to relevant online materials using a deadpan line that put Microsoft’s incurable Mountain View envy in sharp relief. “If you Bing for it,” he would say, “you can find it.”
Yesterday afternoon – as part of a panel discussion on the infrastructure philosophies of Google, Microsoft, and other web giants – Gill took the conference stage alongside Najam Ahmad, Redmond global networking services general manager. Ahmad spent several minutes describing Microsoft’s approach to boosting the worldwide performance of its myriad web services, before Gill calmly undercut just about everything Ahmad said.
“The challenge that we end up with is similar to what Yahoo! has,” the Microsoftee said. “The set of applications we have is tremendous. We have all sorts of applications on our network, and we can’t really come up with one solution that actually works – or even one set of KPIs [key performance indicators] that works. Some services are very stateful, like Messenger, and some are very stateless.
“What we end up with is a really varied mixture of KPIs and solutions in terms of building performance.”
As an example, he said, Microsoft is working to juice the performance of its Virtual Earth mapping service using so-called “edge technologies” – a combination of Akamai-like CDN (content delivery network) tools and actual code that runs at the edge of the network. “We’re looking at a mixture of approaches and trying to solve problems for specific applications and types of applications,” he said.
At which point, Gill piped up to explain – in his matter-of-fact monotone – why Microsoft’s philosophy is fundamentally flawed. He pointed his audience to a blog post where an online real estate outfit called Redfin says it recently ditched Microsoft Virtual Earth for Google Maps. Yes, Redfin believes that Google gives better performance.
“Our approach is a little more absolute than [Microsoft's],” Gill said. “Not only does getting to the end user have to be fast, but the back-end has to be extremely fast too…[We are] virtualizing the entire fabric so you get maximum utilization and speed on a global basis as opposed to local fixes – putting one service in a data center, for example, in Denver.
“You want to figure out how you want to distribute that across the entire system so you get it as horizontal as needed, which is essentially the definition of cloud computing.”
You can take issue with his terminology. But his argument is sound: While Microsoft is struggling to separately hone performance for each and every application, Google can uniformly juice speed across its entire portfolio. The secret to Google’s success, Gill said, is not in the company’s mystery data centers, but in its software infrastructure, including GFS, its distributed file system; BigTable, its distributed database; and MapReduce, its distributed number-crunching platform.
“[Data centers] are just atoms,” he said. “Any idiot can build atoms together and then create this vast infrastructure. The question is: How do you actually get the applications to use the infrastructure? How do you distribute it? How do you optimize it? That’s the hard part. To do that you require an insane amount of force of will.”
He’s the one with the force of will. Gill and his architecture team are charged with strong-arming Google developers into writing their applications to what is an extremely restrictive set of distributed platforms. “We have a set of primitives, if you would, that takes those collections of atoms – those data centers, those networks – that we’ve built, and then they abstract that entire infrastructure out as a set of services – some of the public ones are GFS obviously, BigTable, MapReduce.”
Notice he said some of the public ones. It would seem that Google’s working on other distributed platforms it’s yet to share with the rest of the world. The likes of GFS and MapReduce were discussed in Google research papers back in 2004, and they soon gave rise to Hadoop, an open-source distributed platform now used by Yahoo!, Facebook, and others.
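The programming model those 2004 papers describe — and that Hadoop later reimplemented — can be sketched in miniature. The following is an illustrative, in-process word count, not Google's actual code: the real systems run the map, shuffle, and reduce phases across thousands of machines, but the shape of the abstraction is the same.

```python
from collections import defaultdict

def map_phase(documents):
    """Mapper: emit an intermediate (word, 1) pair for every word."""
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle(pairs):
    """Group intermediate values by key -- the step the framework
    performs between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reducer: collapse each word's list of counts into a total."""
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["the cloud is the computer", "the data center as a computer"]
counts = reduce_phase(shuffle(map_phase(docs)))
# counts["the"] == 3, counts["computer"] == 2
```

The developer writes only the map and reduce functions; the framework owns distribution, grouping, and fault tolerance — which is precisely the "confined space" Gill describes asking Google's developers to design into.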
Actually, Microsoft uses Hadoop. But that’s only because it recently purchased the semantic search startup Powerset. Not only is open-source slow to reach the Microsoft back-end, but so is, well, the cloud.
“Microsoft has various vertical lines of business that use various portions of infrastructure from soup to nuts,” Google’s Gill said. “Our guys are more horizontally distributed. Pretty much every service runs on GFS as a baseline system.
“If we make a minor change to, say, disk storage to get a three per cent gain, and we roll that out to the GFS library, suddenly the entire base of applications stored on GFS sees that gain.”
The trouble, he says, is getting developers to use the thing. “People are lazy. They say ‘I don’t want to design my applications into this confined space.’ And it is a confined space. So you need a large force of will to get people to do that.”
Gill acknowledges that data-center design plays its own role in all this. But he insists it’s not the “key part,” and when asked to discuss Google’s data center operations, he demurred, pointing his audience to a now famous Google paper called The Data Center as a Computer. “If you Bing for it,” he said, “you can find it.”