Maximum Interview: The Science Behind Folding@Home


Walking into the Pande Lab at Stanford University is something of a hardcore geek’s ultimate dream. This is, after all, where the real work gets done, or should we say, the work units. The various desktop systems and consoles scattered around the area are all part of a larger initiative that you and I, along with Stanford graduate students, researchers from around the globe, and consortiums of geeks and enthusiasts, have likely all contributed to.

But don’t take my word for it. Dr. Vijay Pande, an associate professor of chemistry, structural biology, and computer science at Stanford (and the longtime director of the Folding@Home distributed computing project, which his aptly titled “Pande Lab” oversees), estimates that around 400,000 systems actively “fold” at the current moment. Given the program’s fairly linear growth of around 40,000 new systems a year, Folding@Home should be able to push past half a million “connected” PCs easily before its crystal anniversary.
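The arithmetic behind that projection is easy to check. The sketch below assumes growth stays exactly linear at the quoted 40,000 systems a year, that the project is 10 years old today, and that "crystal anniversary" means the 15th; the function and constant names are ours, chosen for illustration.

```python
def years_to_reach(target, current, per_year):
    """Years of steady linear growth needed to go from current to target."""
    return (target - current) / per_year


CURRENT_SYSTEMS = 400_000   # Pande's estimate of actively folding systems
GROWTH_PER_YEAR = 40_000    # quoted growth rate, assumed to stay linear
PROJECT_AGE = 10            # the project is marking its 10th birthday
CRYSTAL_ANNIVERSARY = 15    # assumption: "crystal" means the 15th anniversary

years = years_to_reach(500_000, CURRENT_SYSTEMS, GROWTH_PER_YEAR)
age_at_half_million = PROJECT_AGE + years

print(years, age_at_half_million)  # 2.5 12.5
```

Under those assumptions the half-million mark arrives when the project is about twelve and a half, comfortably ahead of year 15.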

"It isn’t something like someone finds the magic cure directly in the computer, but...our hope is to radically accelerate the time to discovery"

Which is all well and good, but what, er, does Folding@Home actually do? In short, the distributed computing network splits up the laborious task of analyzing how proteins fold from a simple chain of amino acids into a three-dimensional structure. Individual computers (or consoles) involved with Folding@Home run simulations in an effort to learn more about misfiring (or misfolding) proteins, which can clump together and form the basis for a number of diseases. Understanding the ins and outs of their metamorphoses can give researchers additional insight into the building blocks of those diseases.

Although the results of a single Folding@Home work unit aren’t going to unlock the keys to the kingdom of a disease like, say, Alzheimer’s, the research being collected by all the systems working in a kind of lockstep supplements the work of the Pande Lab and Stanford’s drug design program. And with 75 published papers and counting that can be traced directly to the distributed work of Folding@Home, it’s not as if the project has fallen short of its goal of producing actionable results.

"Running anything on a GPU was a really big deal in the early days. The new goal in Folding@Home version 7 is to really make it as easy as possible again"

Dr. Pande sat down with us for a quick 30-minute chat (just how many work units is that?) on the eve of Folding@Home’s 10th birthday earlier this month. We’ve reproduced some of his thoughts about distributed computing and Folding@Home below:

On Folding@Home’s 10th anniversary:

“I’ve made this joke with my wife and some of my group members that if I could be sent back in time ten years, the thing we’re doing now… I think people in the field would be really surprised and, in many ways, blown away because what we can do now, in some ways, exceeds what our hope was.”

“I think that in ten years of developing new algorithms, we’ve shown that we can really tackle these new problems. It’s also opened the door to show that we’ve radically changed how you can develop your algorithms to run on these distributed computing platforms.”

Folding@Home’s biggest challenge:

“It’s one thing to write an algorithm to run on one computer. It’s even a big deal to make something that can work on ten computers. To have something that can divide up the work amongst 100,000 or 500,000 computers is a big thing to do.”

“An analogy my wife likes to make is that you can’t take nine women and have them give birth to a baby in a month. Very few problems can be broken up into very small bits. And whenever you run something on Folding@Home, it has to be broken into these small chunks.”

“Folding@Home might have been viewed as a niche ten years ago, but that niche is becoming more mainstream.”

“The reason why it’s more of an issue is if you look at the roadmaps of Intel or AMD, they’re just adding more cores: four-core, six-core, eight-core chips are common. If you think about what it’ll be like in ten years, it’ll be 1,000-core chips. If you can’t break your problem up amongst more than four cores, you won’t see any kind of speedup. We can break our program up amongst half a million cores right now – we could scale to a billion cores, easily.”
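Pande’s point about core counts is commonly formalized as Amdahl’s law, which he does not name in the interview; the sketch below is ours, not his. If a fraction p of a program can be spread across n cores, the best-case speedup is 1 / ((1 - p) + p / n). The parallel fractions used here are illustrative assumptions, not measurements from Folding@Home.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Best-case speedup when only part of a program can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical workloads: half, 95%, and 99.99% of the work is parallelizable.
for p in (0.50, 0.95, 0.9999):
    speedups = [round(amdahl_speedup(p, n), 1) for n in (4, 1_000, 500_000)]
    print(f"parallel fraction {p}: {speedups}")
```

The serial part sets a hard ceiling of 1 / (1 - p): a program that is half serial never runs more than twice as fast no matter how many cores it gets, and one that is 95 percent parallel tops out below 20 times. Only work that splits almost entirely into independent chunks keeps benefiting as core counts climb toward the numbers Pande describes.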

The Growth of the Social/Communal/Team-based Aspect of Folding@Home:

“I wouldn’t say that was a surprise entirely, but it always surprised me. I’m always excited and happy to see the extent of what people do. It was clear that distributed computing would have to work in terms of getting people to work together at some level, even if it was just a bunch of individuals.”

“I think it’s also gotten people excited about the science. That’s made me very happy to see. There’s one story—I forget the name behind it—but I remember reading five years ago in our forums. I think it was a granddaughter and grandfather, or it could have been an uncle and niece. They ran Folding@Home together on her computer and he used it to teach her about the science of things, and I think for her Christmas present he printed out her certificate that shows all the work they did for the project.

It was both an educational sort of thing and a sort-of bonding kind of thing. Seeing it all together was very touching and surprising, in that sense.”

What Folding@Home Actually Accomplishes:

“In terms of what we can do, I think it’s really been a radical departure from what anyone else can do with simulation. I think it’s had significant payoffs both in terms of direct studying with disease and, in addition to the impact directly on disease, the general knowledge we’ve gotten--the basic research--could have a much greater impact in years to come.”

“If you can understand how proteins fold, you can do something where, ideally, you don’t have to run repetitive simulations for each disease, but you can understand general properties and basic concepts.”

“It has to feed into a drug-design team. Normally what feeds into this drug-design team are things called crystal structures, or experimental data that tells you what the protein target looks like. There isn’t any natural analog as to what people believe is the toxic element in Alzheimer’s and that’s something we can get out of the simulation.

It isn’t something like someone finds the magic cure directly in the computer, but it could never be that way because whatever you do computationally has to be tested experimentally and so on. Our hope is to radically accelerate the time to discovery.”

Folding@Home, Gamers, and Hardware:

“We’ve been moving aggressively into new hardware—especially into the PS3 and GPUs … One, there’s a great opportunity because those machines can be very fast if you can code for them. Second, I am constantly thinking about where the people interested in running something like Folding@Home would be moving to, and many of them are gamers.

For gamers or heavy-duty server stuff, that’s where the big desktops are these days. There are plenty of desktops being sold. Gamers typically have powerful machines and they’re not using them 24/7 for games… and they have powerful GPUs.”

Distributing Folding@Home Alongside Systems and Consoles:

“There are computer companies that have come to us: ‘Would you be interested in this? How can we make this work?’ Part of it is just to find a way to make this a win-win. I find, in general, the response from computer companies has been positive. They want these computers to do good; it’s a good feeling within the company with them.

One negative, if nothing else, is that a lot of these machines just come with so much bloatware. There’s a move to strip them down and that is sort of working against us right now.”

Upgrading Folding@Home’s Power or Presentation:

“We have new algorithms we’re doing all the time. A lot of the key to our algorithmic successes is that we come up with different methods that can give a massive speed increase versus what you could do in a vanilla way – there are 3, or 4, or 5 methods like that, which can increase speeds by 100 or 1,000 times.

One thing we can do algorithmically is just to come up with new ideas. There’s something new we’re working on right now that is another factor of 100 or 1,000 on top of GPUs, which can already have that factor of 100 or 1,000 (versus CPUs).”

“Running anything on a GPU was a really big deal in the early days. The new goal in Folding@Home version 7 is to really make it as easy as possible again. We started off trying to make Folding@Home as easy as possible, and we got a lot of computer power that way, and it had to be complicated up so we could use these advanced things like GPUs. But it has to get easier again. If we made it easier, I think it’d go a long way.

In my mind, I would love to get to a point, and maybe we’ll only get to this in version 8, but every time someone goes to the Folding@Home Web page, every click they make, you lose people. If we could get people to do something in one click, where you agree to download the client and everything else is done in a simple, automated way, that’s the kind of thing that could have the greatest chance of getting to people.”

Is Distributed Computing a… Competition?

“I think one question people ask: Are there going to be 500 distributed computing projects in the future? Or 50? Or five? If there are only going to be five, is that a sign of success or a sign of failure?”

“I have a feeling there’s going to be more merging of things where the amount of scientific output will increase but the amount of visible names of projects might even decrease. That might even be a good thing because it takes so much effort to run these infrastructures and there’s no need to reinvent the wheel in 500 different places.”

“In general, that would be one way—as well—you could use 1,000 cores. You could run 1,000 different projects. It’s a trivial way to parallelize things and I think, unfortunately, that dilutes the true power of these machines. I think what is probably going to be more likely is that we’ll see coalescing of these different infrastructures.”

Why Fold?

“In the end, I think this is really rooted in human nature. It’s both a combination of altruism and also competition. People do want to contribute the most and things like that—it’s like friendly competition. People want to get the high score in the game or something like that.

That gets people’s interest and builds communities, and it helps too that we really are making a difference in some pretty nasty diseases. And so it’s something where it also builds upon those people who have had family members with diseases and so on. It’s also something that I personally take very seriously.

I think about the next ten years of where I want to be. I’m very happy with the way things have come over the last ten years. I think the next ten years will be very interesting—ideally with a small-molecule drug.”

Former Maximum PC Editor David Murphy tried to multi-core fold on a Mac once, to great failure. He looks forward to the day when he can turn his NAS box into an F@H powerhouse.
