Storage with Smarts
New Paradigm Boosts Scalability and Speed
By Al Riske
27.Sept.07
In 40 years of programming high-performance systems, Harriet Coverston has never been this excited. The chief architect of QFS -- the distributed file system with near raw I/O performance -- Coverston is no stranger to ground-breaking projects. But this is different. What she's working on now represents a complete paradigm shift in data storage. "We haven't really had a change since SCSI was introduced back in 1976," she says. "This is so exciting. Like going to the moon." Call it the most exciting project in an exciting career.
Coverston began her programming career at Lawrence Livermore National Laboratory, then the largest computer center in the world, and has specialized in high-performance computing ever since. "It was an amazing place to work," she recalls. "Back then there were no computer science courses; I was learning from hands-on experience. Livermore was the best teaching school of all. If you visit the Computer History Museum in Mountain View, California, you will see Serial 1 CDC 7600. We had production running on that machine in January 1969. It was the height of the Cold War and every machine cycle was precious. There was no I/O idle -- all I/O was async."
From there she went to fabled supercomputing pioneer Control Data Corporation, where she created the operating system and storage subsystem for the Cyber 205, one of the first supercomputers to use a vector processor for improved math performance. That's when her interest in high-performance storage began, and she soon cofounded LSC, a small company specializing in large storage configurations. Coverston recalls cofounder Don Crouse drawing a picture in the early 1980s with storage in the center and computers around it. "Storage at that time was really the peripheral. Kind of a second-class citizen to computing," she says. "Now storage is first class. So we were really ahead of the curve."
In 1992, Coverston and a couple of coworkers created SAM-FS, a file system with integrated archive management. Two years later, they ported SAM-FS to Solaris. "We chose to run it on the Sun platform because of the threading capability of the Solaris operating system," she says. "We needed the async I/O capabilities of threads in order to keep our tape devices streaming at device speeds."
In 1997, Boeing, which had chosen Solaris for a large U.S. government data-capture project, presented Coverston with a new challenge. "They asked for some additions -- some performance enhancements -- to the file system of SAM-FS," Coverston recalls, "and that became QFS." It had to be fast enough to keep up with the download speeds of earth-observation satellites and geological survey systems without missing a beat.
The problem: file systems frequently need to make small changes to their metadata -- updating access times, modification dates, and so on. "So if you're streaming data from a satellite and, oops, you've got to update the access time, you have to move the head [the device that reads and writes information on the disk's surface] over there to do the update. Then you have to move it back. Now you've lost your performance," Coverston explains.
"What we did with QFS was to separate the metadata from the data."
When Sun bought her company in 2001, Coverston was already at work on Shared QFS, a multinode distributed version of the QFS file system. "Today a lot of jobs exceed the processing power of one node, so you need to spread the work across multiple nodes. Like Oracle RAC, for instance. Even when we act as a data repository with SAM, we set up a QFS shared file system so we can have multiple data paths pumping the data out from the repository. We'll have the big [Sun StorageTek] SL8500 jukeboxes -- multiple jukeboxes, petabytes of data -- so we need multiple nodes to deliver that. You want to do things in parallel today. You reach a wall with only one node," Coverston says. She went on to win the Chairman's Award for Shared QFS in 2004.
Coverston, who recently returned from a conference on government, education, and health care, was pleased to see that "Sun SEs and sales people are just really excited about how much QFS and SAM are solving problems out there."
What kinds of problems? "Did you know that when you watch HBO, all broadcasting, including on-demand video, is coming to you by way of QFS? QFS guarantees the speeds and reliability needed for HBO's playout system. HBO is an example of how bleeding-edge HPC programs like the large data-capture project drive commercial HPC," Coverston says.
"In the medical field, everything is going digital, so SAM provides the repository for radiology, for CAT scans, all the pictures. When a doctor walks into the operating room and pulls up the pictures, they're coming right off our media. It's really critical data we're protecting, and we're in five of the largest U.S. hospitals. We're really unbeatable in the medical field.
"We also just won the Library of Congress, where all the information is being digitized and stored forever. Not just 100 years but forever. And we have the capability to constantly, automatically, move to more up-to-date media as it becomes available. The end user doesn't even know it's happening."
She notes that Sun storage solutions based on QFS and SAM are also well established in financial services, entertainment, retail, media, Internet, telecommunications, and government. "Redshift companies like Salesforce.com and Qualcomm use QFS and SAM. They love it," she adds.
"What I'm working on now is the most exciting of all," says Coverston, who leads a small, passionate team -- "They're the best!" -- with loads of experience in high-performance I/O and large-scale systems.
"With QFS today we have one metadata server and hundreds of clients. We do name services -- create/remove/et cetera -- on the metadata server. We also allocate the storage on the metadata server. So, eventually, we hit a wall where we can't increase the number of clients because the metadata server becomes overloaded -- it's doing the name services and the space management," she explains.
"So what we're doing now, in layman's terms, is we're taking the file system and splitting it in two. One portion of the file system stays on the host and works with the metadata -- the creates, removes, those kinds of things. But we're taking the space management and moving it to the storage nodes, which are computers themselves, running Solaris, and there are lots of them, so we're able to do all that work in parallel."
Coverston calls it the biggest paradigm shift in storage in more than 30 years and says it will dramatically increase horizontal scalability. "Those storage nodes will become very intelligent and will understand the layout of the data. Storage that understands the layout of the data can organize it for optimum performance. You'll get different performance depending on where the data resides, so if some data is more important than other data, then you can place it on higher performance storage in the background. You could never do that before. The file system would say, 'Write this block here' and then the device just wrote it there and quit," she says.
"Today we support hundreds of nodes. With this change, we'll move up to thousands. This is very good for driving Solaris adoption on Andy Bechtolsheim's platform, the Sun Constellation System."
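To make the split Coverston describes a little more concrete, here is a minimal Python sketch of the idea: a metadata server that only handles names, and storage nodes that each manage their own free space. It is an illustration under stated assumptions, not how QFS is implemented. The class names `MetadataServer` and `StorageNode`, the round-robin placement, and the block-list allocator are all hypothetical and do not come from the article.

```python
class StorageNode:
    """Hypothetical storage node that owns its own space management."""

    def __init__(self, name, capacity_blocks):
        self.name = name
        self.free_blocks = list(range(capacity_blocks))
        self.objects = {}  # object id -> block numbers chosen by this node

    def write_object(self, object_id, num_blocks):
        # The node, not the metadata server, decides which blocks to use.
        if num_blocks > len(self.free_blocks):
            raise RuntimeError(f"{self.name}: out of space")
        blocks = self.free_blocks[:num_blocks]
        del self.free_blocks[:num_blocks]
        self.objects.setdefault(object_id, []).extend(blocks)
        return blocks


class MetadataServer:
    """Hypothetical metadata server that only does name services."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.names = {}  # path -> (object id, node name); no block lists here
        self.next_id = 0

    def create(self, path):
        if path in self.names:
            raise FileExistsError(path)
        object_id = self.next_id
        self.next_id += 1
        # Round-robin placement is an assumption made for this sketch.
        node = self.nodes[object_id % len(self.nodes)]
        self.names[path] = (object_id, node.name)
        return object_id, node

    def remove(self, path):
        return self.names.pop(path)


nodes = [StorageNode("node0", 8), StorageNode("node1", 8)]
mds = MetadataServer(nodes)
oid_a, node_a = mds.create("/capture/a")
oid_b, node_b = mds.create("/capture/b")
blocks_a = node_a.write_object(oid_a, 3)  # allocation happens on node0
blocks_b = node_b.write_object(oid_b, 2)  # allocation happens on node1
```

In this toy version the metadata server never sees a block number, so adding storage nodes adds allocation capacity without adding work to the name service, which is the scaling argument made in the quote.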
The new paradigm is called object-based storage, but Coverston prefers the term intelligent storage. "We're really making storage intelligent. What we're going to do is say, 'Write this to objects in the storage device.' The storage device then understands that the blocks in this object are related and it can act on that. It can do prefetching. It can do caching intelligently," she says.
Simply put, the storage device will be smart enough to figure out what kind of workload it's being given -- sequential or random. "It knows if it gets three sequential requests in a row it's probably going to read another one, so it can start reading ahead. If it's randomly accessing data, then it won't read ahead. So we can mix those kinds of workloads now. We've never been able to efficiently do that before," Coverston says. "It's very exciting. We can mine the data. We can deliver quality of service on the data. There's all kinds of stuff we can do."
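The read-ahead heuristic in that last quote can be sketched in a few lines of Python. This is only an illustration of the "three sequential requests in a row" rule as Coverston states it; the class name `ReadAheadDetector`, the block-number interface, and the choice to prefetch exactly one block are assumptions, not details of any Sun product.

```python
class ReadAheadDetector:
    """Hypothetical per-object detector for sequential access."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # sequential reads needed before prefetching
        self.next_expected = None   # block that would continue the current run
        self.run_length = 0

    def on_read(self, block):
        """Record a read and return the block to prefetch, or None."""
        if block == self.next_expected:
            self.run_length += 1
        else:
            # Random access: start a new run and stop reading ahead.
            self.run_length = 1
        self.next_expected = block + 1
        if self.run_length >= self.threshold:
            return block + 1
        return None


detector = ReadAheadDetector()
requests = [10, 11, 12, 13, 40, 7]
prefetches = [detector.on_read(block) for block in requests]
# prefetches == [None, None, 13, 14, None, None]
```

Keeping one detector per object is what would let a device mix workloads in this sketch: a sequential stream on one object keeps prefetching even while another object is being read at random.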