

The Celeste Project

Safe Storage Without Backups

Story by Al Riske. Photography by Howard Friedenberg.

13.Sept.07 - This is one of those stories where a common problem inspires an uncommon solution.

The problem, in this case, was that an important file had been inadvertently deleted.

No big deal. That's why we have backup systems, right? Well, yes, but ...

"In the end we weren't able to get it back," recalls Glenn Scott, a senior researcher in Sun Labs.

That got him thinking: "What's the deal here? We spend all this money on backup equipment and we have all these people running around doing backups and when it comes time to actually recover a file ... 'Oh, I'm sorry.'"

Irritating? You could say that.

"Then, as we swing the big spotlight on this problem (because we were working on something else at the time) we discover ... there's a train wreck coming," Scott says.

People load their computers with tons of applications, games, songs, pictures, videos -- you name it. Companies collect and store mountains of information. There's no shortage of storage in the world. But is it safe? Really safe?

"We've discovered that customers spend a lot of money on backup systems with very little return. It's kind of like money expended on insurance. Backups are insurance. If you don't need them, money's out the door. But if you do need them, 'Whew! You've got a backup.' But what if the backup fails?" Scott asks.


Three years ago, when his colleague lost that file, Scott posed a more interesting question: What would it take to make a storage system that didn't need to be backed up?

"The storage system would just be there. You could store stuff on it, and it would never forget. That means it would have to withstand failures and even equipment upgrades and migrations," he says.

Given the short shelf-life of technology, how could you design a storage system in which the underlying pieces are irrelevant? Scott wondered. What would that look like?

The answer: It would look like the Celeste Project.

Celeste is the software needed to create a storage system that never needs to be backed up: a secure, distributed storage system built out of tens (or millions) of disparate, untrusted computers.

Not bad for just 61,000 lines of Java code.

Parts can fail arbitrarily. Nodes can be added or removed at any time. Technologies can come and go. Celeste doesn't care. But that's nothing new.

"There are already systems that do this. Object stores. We sell one, Honeycomb. Others sell them, too. But there are some big differences between them and Celeste," Scott says.

"Object stores allow you to store data across many machines. Some of those machines may fail and can be replaced without the users of the system ever being aware of it. As requests for stored data come in, there's something that says, 'Oh, that's failed. Let me go over here.' But once you've stored the data you can't change it. So that's only suitable for things like photos, movies, music."
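The object-store behavior Scott describes (replicate across machines, route around failures, never modify stored data) can be sketched in a few lines of Python. This is only an illustration, not code from Honeycomb or Celeste; the names `Node`, `WriteOnceStore`, and the placement rule are assumptions made up for the example.

```python
import hashlib


class Node:
    """One storage machine; `alive` is toggled to simulate a failure."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.blobs = {}


class WriteOnceStore:
    """Hypothetical write-once object store spread over several nodes."""

    def __init__(self, nodes, replicas=2):
        self.nodes = nodes
        self.replicas = replicas

    def _placement(self, key):
        # Derive the replica nodes deterministically from the key.
        start = int(key, 16) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(self.replicas)]

    def put(self, data):
        # The key is the content hash, so stored bytes can never be
        # changed under the same key: different data gets a different key.
        key = hashlib.sha256(data).hexdigest()
        for node in self._placement(key):
            node.blobs.setdefault(key, data)
        return key

    def get(self, key):
        # "Oh, that's failed. Let me go over here."
        for node in self._placement(key):
            if node.alive and key in node.blobs:
                return node.blobs[key]
        raise KeyError(key)


nodes = [Node(f"n{i}") for i in range(4)]
store = WriteOnceStore(nodes)
key = store.put(b"holiday photo")
store._placement(key)[0].alive = False   # first replica fails
print(store.get(key))                    # still served by the second replica
```

Because the key is derived from the content, an update has no place to go in this design, which is the limitation the quote points out.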

Celeste, on the other hand, allows for changes.

And it doesn't simply work around failures. It works around malicious behavior from viruses and bad actors.

While Scott describes Celeste as a complete system from beginning to end, he recognizes that parts of it are useful in their own right.

"We made a storage system out of it, but really what we have underneath the storage system is a potentially large distributed system that has some interesting properties," he says.

"For one, no node is trusted. That's very different than most distributed systems. The state of the art is you have a master, and if the master fails there's an algorithm that's used by the remaining systems to elect a new master. But in the underpinnings of Celeste there is no master. Well, this is interesting. So part of what's come out of this project is some new ideas about how to do distributed systems."

There are some pretty powerful advantages.


"In a master system if you have a node that's malicious, what happens if the malicious node becomes elected as the master? Then the whole system is compromised or down. In a masterless system, because of the nature of masterless agreements, a malicious or non-conforming node can be detected. Over time a node that does not conform to the protocols can become ostracized and effectively disconnected," Scott explains.

"As long as nodes are interacting as their neighbors expect them to, everything is good. If a node is acting aberrantly or not answering the question it previously said it could answer, it loses reputation. So fewer nodes will choose it. Other nodes that are equivalent will be chosen instead. Bad nodes wane or waste away whether from maliciousness or just incidental failure."
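One way to picture this kind of reputation bookkeeping is the small Python sketch below. It is not Celeste's actual protocol, which the article does not spell out; the class name `ReputationTable`, the halving penalty, the small reward, and the 0.1 cutoff are all assumptions chosen for illustration. Each node keeps its own local table, so no master is involved.

```python
class ReputationTable:
    """A node's local, hypothetical view of how far it trusts each peer."""

    def __init__(self, peers, penalty=0.5, reward=0.1):
        self.scores = {p: 1.0 for p in peers}
        self.penalty = penalty   # multiplier applied after a bad answer
        self.reward = reward     # amount added after a good answer

    def record(self, peer, ok):
        if ok:
            # Good behavior restores reputation slowly, capped at 1.0.
            self.scores[peer] = min(1.0, self.scores[peer] + self.reward)
        else:
            # Aberrant or missing answers cut reputation sharply.
            self.scores[peer] *= self.penalty

    def choose(self):
        # Prefer the most reputable of the equivalent peers.
        return max(self.scores, key=self.scores.get)

    def ostracized(self, threshold=0.1):
        # Peers whose score fell below the cutoff are no longer asked.
        return sorted(p for p, s in self.scores.items() if s < threshold)


table = ReputationTable(["a", "b", "c"])
for _ in range(4):
    table.record("a", ok=False)   # "a" keeps failing to answer
print(table.choose(), table.ostracized())
```

The sketch treats a malicious node and a merely broken one the same way: both fail to behave as expected, and both fade from use.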

Another advantage:

"If a node becomes saturated with requests and becomes slow in responding, other nodes will deselect it -- in other words they'll back off so he'll be able to catch up. As his reputation wanes, other nodes will pick up the work, but as his speed picks back up, then his reputation improves. So there's some load-balancing that goes on automatically in this masterless world."
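The self-correcting load balancing described here can be illustrated with a short Python sketch. It assumes, purely for the example, that each node tracks a smoothed response time per peer and picks peers with probability inversely proportional to it; `LatencyBalancer`, `alpha`, and the 10 ms starting value are hypothetical and not taken from Celeste.

```python
import random


class LatencyBalancer:
    """Hypothetical peer selector driven by observed response times."""

    def __init__(self, peers, alpha=0.5, initial_ms=10.0):
        self.alpha = alpha
        self.latency = {p: initial_ms for p in peers}

    def observe(self, peer, ms):
        # Exponentially weighted moving average of the response time.
        old = self.latency[peer]
        self.latency[peer] = (1 - self.alpha) * old + self.alpha * ms

    def weights(self):
        # Faster peers get a proportionally larger share of requests.
        inverse = {p: 1.0 / ms for p, ms in self.latency.items()}
        total = sum(inverse.values())
        return {p: w / total for p, w in inverse.items()}

    def pick(self, rng=random):
        w = self.weights()
        return rng.choices(list(w), weights=list(w.values()))[0]


balancer = LatencyBalancer(["a", "b"])
balancer.observe("a", 190.0)      # "a" is saturated and slow
print(balancer.weights())         # most requests now go to "b"
for _ in range(10):
    balancer.observe("a", 10.0)   # "a" catches up
print(balancer.weights())         # shares are close to even again
```

Because a slow peer keeps a small share of the traffic rather than none, its recovery is still observed, and its share grows back without anyone coordinating the change.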

Although the Celeste Project started just three years ago, it incorporates concepts that Scott has been tinkering with for more than a decade. Chiefly, the notion of public utility computing.

"Look at the outlet over there. You don't know where the power comes from. All you know is it conforms to certain expectations. Not everybody even knows what they are. It's 120 volts, 60 hertz. You don't know if it comes from a dam, from a nuke, from a coal-burning plant. You don't even know whose equipment produced it. Siemens? GE? You don't care. You just plug in. Why can't we treat computing the same way?" Scott says.

"Celeste, underneath it all, is also a system that you or I could use to store data and yet not own the equipment. What if all the PCs out there contributed a little storage to a collective? So your data is encrypted and hidden but it's spread around the country, so to speak, on everybody else's PC [and vice versa]. You've effectively made a storage system out of nothing. Out of borrowed pieces. Or a service provider could own all the storage equipment and we'd just store stuff in there. We could be competitors, you and I. I can't see your data; you can't see mine. But we treat the storage as a utility just like power."

"At this point I'm pretty happy with what we have, which is a reference implementation -- an example of a Celeste system that works. This is the kind of project where you could always be buffing chrome, making this piece shinier or that piece shinier. You have to be careful about doing too much of that. Then you're not really working on the problem; you're just playing with the system. Right now we have enough of the hard problems solved to claim the system is good," Scott says.

He notes that a couple of important customers have shown a lot of interest in Celeste, but that he can't talk about them as yet.

"Our internal product people are also looking at it; I haven't got a firm commitment from anyone that they want it, because it is very different than what we're used to thinking about with storage," he says. "It's not an obvious, 'Oh, yeah, we have to do it this way.'"


A big part of what makes Celeste work is redundancy. Lots of redundancy. So it's not exactly cheap.

"But it's cheaper in other respects. Humans are probably the most expensive IT component. So the goal of Celeste is to never require human intervention. That means a bunch of things. There are no backups. There's no restore. Adding to the system of course has to be done by somebody. They have to tote it in on a dolly or whatever. So there is some human intervention, but a child should be able to augment the system with no futzing around," Scott says.

"When I talk with folks there's a lot of education involved: It works like this. Here's what you can do with it. Then they go off and think about it. Sometimes they come back with their own ideas. Sometimes they go off and try to build it themselves. This is all part of the process of tech transfer and what the role of a labs project should be. Even if it can't transfer itself, transfer its ideas."


Glenn Scott

Title: Senior researcher, Sun Microsystems Laboratories.

Expertise: Secure computing in unsecured public environments.

Current Focus: Creating a massive, secure, distributed storage system out of tens to millions of disparate, untrusted computers.

Quote: "We made a storage system out of [Celeste], but really what we have underneath the storage system is a potentially large distributed system that has some interesting properties. For one, no node is trusted."

Education: Studied computer science, electrical engineering, chemical engineering, and applied linguistics at Chapman University in Orange, California, graduating in 1995.

Background: Did research into formal methods of software development, automatic theorem provers, and trusted operating systems for System Development Corporation before joining Sun in 1990.

Accomplishments: Has worked on a wide array of concepts, projects, and products at Sun as an engineer, researcher, and manager.

Patents: 17.

Hobbies: Plays keyboard, trumpet, trombone, and French horn. Also enjoys woodworking, motorcycling, and ham radio.

Last Book Read: Genetic Programming: On the Programming of Computers by Means of Natural Selection, by John R. Koza

Favorite Food: Steak and potatoes.

Favorite Movie: Blade Runner.

Pet Peeve: "People who don't do their jobs."

Little-Known Fact: Went to a one-room school house as a boy.

First Job: Worked as a hired hand, milking cows and mending fences, on a ranch in South Dakota at age 14.

Inspiration: "Ever since I was a kid I've been motivated by a desire to make stuff better than it was before."

Favorite Destination: Switzerland. ("I have family there.")

Proudest Moment: "Watching my sons grow up a little more each day."