Tapestry of Innovation

Highly Threaded Creations Sell Well for Sun

Story by Al Riske. Photography by Howard Friedenberg.

10.November.08 - Rick Hetherington has reason to feel good. In the high-risk business of developing microprocessors, he and the team have been on a roll.

Hetherington, you see, is chief technology officer in Sun's Microelectronics group and has been responsible for driving the company's innovative approach to processor design. Namely, chip multithreading, or CMT.

"We started out with 32 threads in 2005 with Niagara 1 and the T2000 and here we are in 2008 -- just three years later -- and we're up to 256 threads," he says.

That's quite a climb. Not surprisingly, sales of CMT systems have been climbing, too -- by more than 80 percent last quarter -- and now represent a $1 billion-plus business for Sun.

What's more, the most recent addition to the family is a blockbuster system that promises to change the economics of information technology: The Sun SPARC Enterprise T5440, a.k.a. "Batoka."

"It brings all the benefits we've known from CMT to midrange computing -- the outstanding performance, energy efficiency, and cost effectiveness that has only previously been available in the volume space," says System EVP John Fowler.

The concept behind CMT is quite simple. Instead of designing ever-more complex processors and chasing after ever-higher clock rates, Sun has done the opposite -- coming out with simpler designs that are, in some respects, actually slower.

Slower at running a single task, that is; but able to juggle many tasks at the same time.

And get more done.

The T5440, for example, had already posted seven world records by the time it was launched last month.

"Sun's latest multicore server, the Sparc Enterprise T5440, runs rings around more expensive devices when performing a lot of tasks concurrently," InformationWeek reports. "It's ideal for databases and virtualization and excels at encrypting and decrypting data thanks to on-board security co-processors."

Able to process 256 threads, or tasks, at once, the new server is almost like a datacenter in a box.

"We're headed in that direction," Hetherington says. "We are aggressively compacting systems. That will be our competitive advantage. Instead of datacenters composed of tiny two-socket servers -- thousands of them interconnected through very expensive networking equipment -- why not consolidate a lot of that activity onto as few boxes as possible?"

In other words, the approach Hetherington and team have been perfecting is far different -- and far more efficient -- than what most of the world has been doing.

"Sun's latest multicore server, the Sparc Enterprise T5440, runs rings around more expensive devices when performing a lot of tasks concurrently. It's ideal for databases and virtualization and excels at encrypting and decrypting data thanks to on-board security co-processors."

Jeff Ballard
InformationWeek

 

Of course Hetherington will be the first to point out that it's not all about the processor. As with everything Sun does, multithreading is designed into every part of the system.

"It's no different than any other breakthrough in the history of computing," he says. "Hardware typically has to precede software."

Fortunately, given Sun's long history of building servers with multiple processors, the Solaris operating system has long been capable of scheduling multiple tasks. (Now, of course, a single processor with multiple cores running multiple threads can do the work of many single-core processors.)

"We have a lot of close contact with the Solaris team, particularly with the scheduler group, so that threads and cores are scheduled properly, allowing applications to scale," Hetherington says.

What's more, the fault-management and power-management features in Solaris are designed with a deep understanding of the underlying hardware.

"When errors are reported, the diagnosis engine in the Solaris Fault Management Architecture makes, let's say, a thoughtful determination of what's going on with the hardware," he says.

Likewise, he adds, "The environmentals are controlled by the platform firmware. For example, there's no reason to run cooling fans at their peak if the conditions inside the box don't demand it."

In fact, Hetherington and team are also working hand-in-hand with Sun's tools group in an effort to make it easier for software developers to write code that takes full advantage of chip multithreading.

While Solaris, Java, and database software are equipped to deal with lots of threads, other applications are not.

"People don't typically think in parallel ways; we are essentially sequential beings," Hetherington says. "This is one of the most difficult problems in computer science. But our tools group is making a lot of progress with auto parallelization in our compilers. And I expect to see more improvements as we make cheap and highly threaded systems available to the developer community."

All of which leads up to what Hetherington describes as the toughest challenge in the processor game.

"Once the company determined that we were going down this path, we developed a roadmap and we developed a predictable cadence of introducing processors that had higher and higher thread counts," he explains. "But the challenge is keeping up with changes that happen in the industry."

In other words, a lot can happen in the five years it typically takes to develop a microprocessor.

"To be successful in the processor business you can't do a lot of about-faces midterm. You really have to set a course and stay with it. Otherwise you never really get there," he says.

"So how do you stick to the schedule and still react to those things that are changing in the industry and make sure your features are aligned so you have a product that's interesting when you get there? How do you stay relevant even though you have such long development cycles?"

Take the whole Web 2.0 phenomenon.

Hetherington notes that popular scripting languages such as PHP, Ruby, and Perl have little concept of threading or parallel processing.

"They're used because they're easy to code and easy to get a function up and running quickly and not really worry about underlying performance or 'How well will this perform on a processor with two or three levels of caching?' That's completely out of consideration," he says.

"So that's been a bit of change in our needs since we've gone along here. We're working on that now. We're studying these codes, studying the architectures, and coming up with ways of effectively running them on our CMT cores."

While competitors have begun to follow Sun's approach to processor design, we remain far ahead in terms of thread count with our third-generation CMT processors.

And one key to the success of the UltraSPARC T2 Plus was the addition of a patented coherence fabric.

Make that four independent coherence fabrics.

"That allows us to scale nearly linearly," Hetherington says. "Typically when you go from two sockets to four sockets, for example, you don't really get a doubling of performance. You usually realize a much more modest increase. But with the Victoria Falls coherence fabric, we get a near-linear improvement -- an 80 percent bump in performance as we double the socket count."
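The arithmetic behind that quote can be sketched in a few lines of Python. This is only an illustration: the 80 percent figure comes from the quote, while the function names and the idea of compounding the same gain over further doublings are assumptions made here, not anything Sun has published.

```python
def speedup_after_doublings(gain_per_doubling: float, doublings: int) -> float:
    """Compound throughput gain after doubling the socket count `doublings` times.

    Assumes (hypothetically) that every doubling yields the same relative gain.
    """
    return (1.0 + gain_per_doubling) ** doublings


def scaling_efficiency(gain_per_doubling: float, doublings: int) -> float:
    """Fraction of ideal, perfectly linear scaling that is actually achieved."""
    ideal = 2.0 ** doublings
    return speedup_after_doublings(gain_per_doubling, doublings) / ideal


# One doubling (for example, two sockets to four) at the quoted 80 percent gain.
print(speedup_after_doublings(0.80, 1))
print(scaling_efficiency(0.80, 1))
```

Under those assumptions, one doubling delivers 1.8 times the throughput, or 90 percent of what perfectly linear scaling would give.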

So how does it work?

"If a processor is doing a load instruction from memory, it will look in its lower level caches. If there's a miss in the primary cache, it probes the second level cache, which is resident on that processor. Load instructions that miss in the Level 2 cache will need to send a command that probes all the other Level 2 caches in the system. This is called a snoop. The core that sent the load will get a response back in the form of 'Sorry, not here' or 'Here's the data. I'll share it with you.' This is shared memory," he explains.

"So we use these links [the coherence fabrics] to interconnect the processors at very high speed, because now we're going to have a lot of concurrent memory accesses among these processors snooping their caches. If we had just one link, it would be a bottleneck. We maintain affinity within these channels, if you will, with these four independent coherence fabrics. With that, the aggregate bandwidth now is satisfactory for the 256 threads we support."
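The lookup sequence Hetherington describes can be modeled as a toy Python sketch. It is not Sun's implementation: the `Socket` class, the 64-byte line size, and the rule that picks a fabric from the line address are all assumptions made for illustration, and real coherence protocols also track line states and writes, which are left out here.

```python
from dataclasses import dataclass, field

LINE_BYTES = 64   # assumed cache-line size, for illustration only
NUM_FABRICS = 4   # the article mentions four independent coherence fabrics


@dataclass
class Socket:
    """One processor with a primary cache and an on-chip L2, modeled as dicts."""
    name: str
    l1: dict = field(default_factory=dict)
    l2: dict = field(default_factory=dict)


def fabric_for(address: int) -> int:
    """Pick a fabric from the line address (a hypothetical affinity rule)."""
    return (address // LINE_BYTES) % NUM_FABRICS


def load(requester: Socket, sockets: list, memory: dict, address: int) -> tuple:
    """Return (value, where_found) for a load issued by `requester`."""
    line = address // LINE_BYTES
    if line in requester.l1:                 # primary cache hit
        return requester.l1[line], "L1 hit"
    if line in requester.l2:                 # on-chip second-level cache hit
        requester.l1[line] = requester.l2[line]
        return requester.l2[line], "L2 hit"
    # L2 miss: snoop the L2 caches of all other sockets over one fabric.
    fabric = fabric_for(address)
    for other in sockets:
        if other is requester:
            continue
        if line in other.l2:                 # "Here's the data. I'll share it with you."
            value = other.l2[line]
            requester.l2[line] = requester.l1[line] = value
            return value, f"snoop hit in {other.name} via fabric {fabric}"
        # otherwise this socket answers "Sorry, not here"
    value = memory[line]                     # every snoop missed: read main memory
    requester.l2[line] = requester.l1[line] = value
    return value, "memory"


sockets = [Socket(f"cpu{i}") for i in range(4)]
memory = {0: "from-memory"}
sockets[2].l2[5] = "shared-line"             # cpu2 already caches line 5
print(load(sockets[0], sockets, memory, 5 * LINE_BYTES))
# ('shared-line', 'snoop hit in cpu2 via fabric 1')
```

Spreading line addresses across four fabrics, as `fabric_for` does here, is one plausible way to keep snoop traffic from queuing on a single link, which is the bottleneck the quote warns about.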

He notes that Sun's coherence fabric works amazingly well and says, "We will continue to use that as we go forward with follow-on products."

Follow-on products like the processor code-named "Rock."

"Rock is hot on the heels of what we just announced and will bring to market some new techniques to aid in parallelization, aid in the effort of programmers to write parallel programs with transactional memory built into the hardware," Hetherington says.

"We've got systems that will be getting out to beta sites in the very near future. It won't be long before John Fowler is back on stage to do another announcement."

Rick Hetherington

Title: Chief Technology Officer, Microelectronics

Job: Driving Sun's leadership in the chip-multithreading approach to processor design.

Quote: "Instead of datacenters composed of tiny two-socket servers -- thousands of them interconnected through very expensive networking equipment -- why not consolidate a lot of that activity onto as few boxes as possible?"

Background: Spent sixteen years with Digital Equipment Corp., working on various processors and systems, before joining Sun in 1996. From 2000 to 2002 he took a hiatus from Sun to join a networking startup as VP of engineering.

Education: Bachelor's degree from Pennsylvania State University.

Patents: 53 or so.

Proudest Moments: Getting Niagara and Niagara 2 systems to market. ("From a career perspective, that definitely was a highlight because I felt like I was a leader in that. This whole effort has been a real lift and a real joy to me.")

Hobby: Collecting wine.

Favorite Wine: Kistler pinot noir.

Favorite Food: Any and every kind of cheese.

What Annoys Him: Not being able to read without glasses.

Last Book Read: The Shock Doctrine, by Naomi Klein.

Favorite Movie: The Sand Pebbles, with Steve McQueen and Candice Bergen.

First Job: Worked in a hospital pushing food carts from the cafeteria up to patients' rooms.

Little-Known Fact: Was a diver (one meter and three meter) in high school and college.

Childhood Ambition: Wanted to be the captain of a submarine.

Retreat: Maui, Hawaii.

About Sun: "It is not a place where you are compartmentalized or constrained. If you really want to grow and develop, it's a wide open opportunity at Sun. I really enjoy that atmosphere."
