

No Bad Dogs

How to Make a Dog-Slow System Sit Up and Speak

By Al Riske

14.Nov.05 -- Bryan Cantrill, the slender, energetic engineer behind dynamic tracing, has been recognized as one of the world's top young innovators.

Technology Review -- the prestigious "Magazine of Innovation" published by MIT -- recently placed Cantrill among the top 35 innovators under the age of 35.

"You look at the other people [on the TR35] and they're literally rocket scientists and people finding cures for cancer. Things that I'm used to talking about merely metaphorically, they're actually doing," Cantrill says. "So it's quite humbling in that regard."


One line from the magazine's summary -- "They gravitate to the most interesting and difficult scientific and engineering problems at hand, and arrive at solutions no one had imagined" -- seems especially apt in Cantrill's case.

Apt because Cantrill's brainchild, DTrace, solves a problem software engineers had struggled with for decades.

"He refuses to believe hard problems can't be solved simply because no one else has been able to solve them yet," Sun CTO Greg Papadopoulos says of Cantrill. "DTrace is a perfect example. Bryan and his teammates came up with an elegant solution to a seemingly intractable problem. Not only that, they solved it within the constraints of a production environment."

DTrace, the dynamic tracing facility built into Solaris 10, provides an unprecedented view into interactions between the operating system and the applications running on it.

Simply put, DTrace makes it possible to quickly identify bottlenecks and dramatically increase system performance.

"When you have customers like FedEx or the Philadelphia Stock Exchange stand on the stage, as they did when we launched Solaris 10, and talk about the wins they got in production, it just about brings a tear to your eye," Cantrill says.


In the case of the stock exchange, he recalls, the team used DTrace to examine a sluggish application and soon had it running two-thirds faster -- on a server that was one-third the size.

"I love Jonathan's rhetoric about being the good guys. I think there are a lot of people at Sun who want to work for the good guys, and that's very important to me personally," Cantrill says. "That's why I like that example, but there are many more like that. We've done it over and over again."

Cantrill came up with the idea for DTrace in 1996, while he was still a computer science student at Brown University in Providence, Rhode Island.

His faculty adviser told him it couldn't be done.

"I sketched out some specific ideas on how I thought it would be possible, and his reaction was, 'Well, you know, if this were possible, they would have already done it by now,'" Cantrill recalls.

"At the time, I took what he said at face value. I thought, There's obviously some subtlety I'm missing -- something about the microprocessor or something -- that makes this impossible."

Fortunately, he didn't listen to the professor's other advice -- "This is the same person who told me that there was nothing interesting in operating system development, that operating systems were done" -- and joined Sun later that year to work on Solaris.


"I wanted to do operating system development, and I interviewed everywhere," Cantrill says. "The amount of energy at Sun was probably three orders of magnitude greater than any other place. All the other computer companies ... their operating system development groups were like morgues, because operating system development was viewed as something of the past."

Not so at Sun.

"I remember exactly where I was when this happened. It's one of those moments in your life that is crystal clear in your memory. We were on Willow Road on the bridge over the 101. I was in the backseat of a blue minivan, talking to Jeff Bonwick (now a Distinguished Engineer at Sun, then an engineer in the operating system group) and sketching out some of these ideas. My question was, 'Why is this impossible? I understand it must be impossible or you would have done it by now, but why is it impossible? What am I missing?' And Jeff said, 'Yeah, I think that would work.'

"It was clear to me at that moment that Sun was, particularly in operating system development, an environment where things were not thought to be impossible simply because they hadn't been done before," Cantrill continues.

"That's incredibly important. It's easy for us to forget, simply because we have such an innovative culture, that the idea that things are impossible simply because smart people have thought about the problem and they didn't come up with an answer -- that's an idea that's pervasive elsewhere."

The reason he pursued OS development says a lot about Cantrill -- and why he couldn't let go of his notions for dynamic tracing.

"I had written video games, I had written spreadsheets, and I had the absurd idea that I could implement anything. Then I did some kernel development. This was at a small company that has an operating system called QNX, in Canada. They had a uniprocessor kernel and I brought it up on a multiprocessor -- designed that architecture -- and oh, my god, did that thing kick my ass," he says.

"I got it working, but there were bugs where I thought to myself, 'I'm never going to find this.' It was a feeling that was completely foreign to me, because I had the idea that for a broken program, if I turned the crank long enough, I'd find the bug. But when you start implementing at the operating system kernel level, the pathologies you discover are much nastier and you lose that feeling. In fact, the feeling you get is, 'I could work in perpetuity on this and not find this bug. And this bug could prevent me from shipping.' That's the other thing that is just terrifying about doing operating system development work. You never know when the bug that you only run into during some stress test actually constitutes a design flaw that is a complete deal breaker for your software."

And this is a reason to pursue OS development?

"If you come up to a problem and you don't know that you can solve it, if somewhere in the back of your brain there's a voice saying, 'This one is going to have the last laugh' -- that's a hard problem. But the satisfaction you get from solving that kind of problem is incomparable," Cantrill says.

The amazing thing about DTrace is not that it solves a hard problem, but that it does so within the constraints of a production environment.

"The core problem we solved was the problem of a production system that is misbehaving in a transient way -- a polite way of saying the thing is dog slow," Cantrill explains.

"It's not crashing, but in many ways a fatal software problem is an easier problem. Why did the browser crash? Why did the operating system crash? But when you're dealing with a performance pathology, you have to ask the question, 'Why is the system slow?' Much harder question to answer."

If an application misbehaves in development, the coder can kill it, recompile, and restart. Not so in a production environment.

"If your Oracle database is misbehaving, you can't restart Oracle. That's not an option. You certainly can't recompile Oracle," he says.

"So, on that production machine, how do you see what the software is doing? Before DTrace there wasn't a lot you could do. There were just a lot of ad hoc tools basically."

At first DTrace was a wish as much as anything.

Cantrill and Mike Shapiro, a friend from Brown whom he recruited to join Sun, kicked the idea around over and over.

"By 1999 or 2000, we had such a clear idea of what we wanted to go build that we knew exactly the kind of problems it would solve. So we would say, 'Oh, man, I really needed DTrace today' or 'DTrace would have saved my ass today.' It was kind of a ridiculous thing for something that didn't exist. Even worse, we started telling other people who were having problems that they couldn't figure out, 'You know, DTrace would solve that problem.'"

Not a line of code had been written yet.

"I think someone finally said, 'If DTrace would solve that problem, why don't you go write DTrace?'"

And, to make a long story short, he and Shapiro -- joined by Adam Leventhal in early 2002 -- did just that.



Bryan Cantrill

Title: Senior staff engineer, Solaris kernel development.

Honors: Named one of the world's top 35 innovators under 35 by Technology Review, MIT's magazine of innovation. One of six Sun engineers to receive an InfoWorld Innovators award for their work on Solaris 10.

Claim to Fame: Inventor of DTrace, a dynamic tracing facility in Solaris 10 that can identify bottlenecks and dramatically increase system performance.

Quote: "With DTrace, I can walk into a room of hardened technologists and get them giggling."

What Others Say: "Bryan Cantrill has proved himself an extraordinary innovator. DTrace is a perfect example. Bryan and his teammates came up with an elegant solution to a seemingly intractable problem. Not only that, they solved it within the constraints of a production environment." - Sun CTO Greg Papadopoulos

Education: ScB in computer science from Brown University.

Interests: Observability and diagnosability, post-mortem analysis, real-time kernel implementations, microprocessor architecture.

Hobby: "Right now my hobby is my son. I have a one-year-old, so he soaks up a lot of my spare time. Spare time is a dim memory at this point."

Pet Peeve: Spelling "a lot" as one word and the misuse of the subjunctive.

Patents: 3 granted, 24 more in the works.

Passion: Ultimate Frisbee.

Current Reading: Beyond Oil, by Kenneth S. Deffeyes.

Favorite Food: Tandoori pizza.

Favorite Movie: Sudden Fear (1952), starring Joan Crawford and Jack Palance.

Favorite Song: "Reanimation" by Blackalicious.

Most-Admired Person: Paul O'Neill, former Secretary of the U.S. Treasury. "He had great courage and was willing to say some very unpopular things and lost his job as a result of it."

What He Wanted to be When He Grew Up: Economist.

What Keeps Him Up at Night: "I'm not optimistic about where our economy is going. I'm worried that the struggles facing my son's generation will be enormous and that we're making them worse."

Little-Known Fact: Can do a disturbingly good impersonation of a dolphin.

What's Next: "Whenever you develop an especially radical technology, you can't really predict where it's going to go or how it's going to be absorbed. What we're seeing with DTrace is that it's being absorbed in a lot of different capacities, some of which we didn't necessarily anticipate. So we've got a lot of work to do. I can see a very clear ribbon of highway for the next two to three years. Beyond that, we'll take stock. I'll keep innovating in system software, that's for sure."

 