One Miracle at a Time

With the Neuromancer project, Jim Waldo and team add orders of magnitude to the scale of system design.

By Al Riske

14.Jul.05 -- "One of the questions I've always found interesting in research is what happens when you go up a couple orders of magnitude from where you are now?" says Jim Waldo, distinguished engineer and principal investigator on the Neuromancer project. With Neuromancer, Waldo and his six-person team, headquartered at Sun Laboratories in Burlington, Massachusetts, are exploring "the network of things." What kind of things? Waldo chose medical sensors, tiny devices that monitor a person's blood pressure, body temperature, blood oxygen, and other vital signs. But the project isn't really about sensors. It's about infrastructure.
"Suppose you could put 5 or 10 sensors on everybody in the U.S., and have the information gathered pretty much continuously -- uploading it automatically whenever people came within range of a wi-fi network, for example," Waldo says. "What would be the distributed system infrastructure needed to deal with that amount of information?" A 13-year Sun veteran, Waldo is perhaps best known as the lead architect for Jini technology. Indeed, Jini plays a key role in Neuromancer, but the project goes way beyond that. "The idea behind Neuromancer was to pick a problem that would have more data than any of us could think about comfortably right now," he says. "A back-of-the-envelope calculation we did early on was that with 10 sensors you're going to generate about 10 megabits of data per person per day. That's a lot of information."
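To give a sense of the scale Waldo is describing, the per-person figure can be multiplied out to a national total. This is only a rough sketch: the 10 megabits per person per day comes from the quote above, while the round population figure of 300 million and the function name `daily_volume_bytes` are assumptions made for illustration.

```python
# All inputs are rough assumptions for a back-of-the-envelope estimate.
BITS_PER_PERSON_PER_DAY = 10e6  # "about 10 megabits" per person per day (from the article)
POPULATION = 300e6              # assumed round figure for the U.S. population


def daily_volume_bytes(population=POPULATION, bits_per_person=BITS_PER_PERSON_PER_DAY):
    """Total bytes generated per day across the whole population."""
    return population * bits_per_person / 8  # 8 bits per byte


daily = daily_volume_bytes()
print(f"per day:  {daily / 1e12:.0f} TB")        # decimal terabytes
print(f"per year: {daily * 365 / 1e15:.0f} PB")  # decimal petabytes
```

Under those assumptions the total comes to roughly 375 terabytes a day, or about 137 petabytes a year, before any compression or summarization.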
Born and raised in Salt Lake City, Utah -- "in the culture but not of it" -- Waldo, now 54, remembers that he wanted to be a wizard when he was a boy. Naturally, his interests changed over the years, but one thing remained constant. "I always wanted to figure out things nobody knew the answer to," he says. A "sixties radical" in his college days, Waldo studied philosophy, earned a doctorate, and taught for a while. Later, he joined a colleague's startup and taught himself to write code. "Philosophy, in the Anglo-American tradition, is heavily tied to mathematics, and I had been a sort of applied logician as a grad student," he explains. For a long time now he has been focusing on distributed systems -- designing CORBA while at HP and Jini at Sun -- so his desire to take on challenges like the Neuromancer project makes perfect sense. "Doing research in distributed systems is as close as you can come to doing magic that I know of," he says. "You start with an idea and the goal is to make your idea real. All of design is like that, but distributed systems tend to be somewhat more mysterious. Inside a distributed system there is no fact of the matter."
The challenge wasn't selected arbitrarily. It was born out of need. The need of caregivers, overwhelmed with the needs of an aging population, to automate routine testing. The need of public health officials to quickly identify outbreaks of disease. The need of researchers to gather, store, and analyze long-term medical data. A network of sensors could help with all three. Indeed, while the work of the Neuromancer team will be applicable to other areas as well, the response of the medical profession has been especially encouraging. "When we have gone out and talked to people, expecting their answer to be, 'Well, I don't really think I could use this' or 'Tell me how I could use this,' instead their response has been, 'Tell me when I can use this,'" Waldo says. "These guys, at least the smart ones, understand that they don't get enough information to do something that could reasonably be called science. The evidence in most medical studies would count as anecdotal in any other field." There are a number of possible business scenarios for Neuromancer -- and there are members of the team focused on that -- but on the research side, there's still the challenge of how to design and build the necessary infrastructure. It's a challenge Waldo relishes.
With long-term studies in mind, he notes, "We're looking at a system that may have a lifetime of between 40 and 80 years. That becomes interesting in itself from a technology point of view because, over that many years, all the equipment used to build up this network will have changed, multiple times." So the scale is gigantic, the timeframe is long, and ... oh, yeah, there's a fundamental flaw in the way networks are designed today.
"If you move and get a new IP address, you are considered a new thing," Waldo notes. "That's not good if you want to be able to move things around and have them maintain their identity." That's a key factor because the Neuromancer team has rejected the idea of creating a single repository or directory for all the sensor data. Why? Because the repository would have to exist before anyone could use the system. "Instead we have a highly decentralized system ... but that means if a single doctor wanted to try this in his practice, he could. We think that's an important part of the business model -- ease of deployment and an ability to connect things afterward," Waldo says.
Virtually everything about the Neuromancer project is contrarian. "From the very beginning, the notion that the edge devices would essentially be servers rather than clients -- producers of information rather than just consumers of information -- was switching the general model on its head," he says. "The notion that the network itself was not going to be centralized into a small number of servers that would hold the information is also pretty radical. The usual notion is that you have a lot of clients going through a set of tiers to get to a database. The notion here is that there is no single database. There are thousands, millions, of data repositories, and those are the things you query. In fact, there would be so many of them that you would probably never be able to query them all." And yet, using some "statistical mechanisms," Waldo believes that Neuromancer infrastructure will be able to deliver answers with a high degree of confidence. "When you make a query with this," he says, "you are not just saying, 'I want the results,' but 'I want the results with a specified level of confidence in their statistical viability.'"
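The article does not say which "statistical mechanisms" the team has in mind. One plausible reading is random sampling: query a random subset of repositories and stop once the estimate is tight enough for the confidence level the caller asked for. The sketch below illustrates that idea only. The function `sampled_mean`, its parameters, and the modeling of each repository as a callable that returns one reading are all hypothetical, and the interval uses a plain normal approximation.

```python
import math
import random
import statistics


def sampled_mean(repositories, confidence=0.95, max_margin=0.5, batch=50, seed=0):
    """Estimate the mean reading across many repositories by sampling.

    Queries randomly chosen repositories in batches until the
    normal-approximation confidence interval half-width is at most
    max_margin, or until every repository has been queried.
    Returns (estimate, margin, number_queried).
    """
    rng = random.Random(seed)
    order = list(range(len(repositories)))
    rng.shuffle(order)  # random order, sampling without replacement

    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)

    values = []
    margin = math.inf
    for start in range(0, len(order), batch):
        # Each repository is modeled as a callable returning one reading.
        values.extend(repositories[i]() for i in order[start:start + batch])
        if len(values) >= 2:
            margin = z * statistics.stdev(values) / math.sqrt(len(values))
            if margin <= max_margin:
                break
    return statistics.fmean(values), margin, len(values)


if __name__ == "__main__":
    # Simulated repositories, each holding one made-up blood pressure reading.
    sim = random.Random(1)
    repos = [(lambda v=sim.gauss(120, 15): v) for _ in range(20000)]
    estimate, margin, queried = sampled_mean(repos, max_margin=1.0)
    print(f"{estimate:.1f} +/- {margin:.1f} after querying {queried} of {len(repos)}")
```

The point of the sketch is the trade-off in Waldo's description: a tighter margin or a higher confidence level means contacting more repositories, but usually still far fewer than all of them.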
The project has been in the labs for about a year now and is already beginning to show results, using Jini as a base technology and extending it in various ways. "One of my rules of thumb on a project is you only try to do one miracle at a time. We started out with the miracle of how you do location-independent references on a distributed object, letting an object move around from place to place but still being able to find it," he says. "We have a first cut at that." In fact, he adds, "We have some implementation work going on, some partnerships going on with places like Harvard, and we're looking at developing partnerships with potential users of the technology so we can start testing."
The team will also be looking at how to introduce new forms of data as new kinds of sensors are invented. The key in this case, Waldo says, is exploiting the portability of Java byte code. "When you introduce a new kind of sensor, you will introduce a new Java class and anyone who receives the data also receives the code that implements the class that lets you access the data," he explains. "Now what that means is we're assuming that Java byte codes aren't going to change over that time, which is probably false, but at some point you have to put a stake in the ground. That's the stake we've put in so far." One more thing: "This being medical data, the security and privacy requirements are also extremely interesting," Waldo says. "A simple-minded approach is to say, 'Okay, for someone to get access to your data, you have to actively give them permission.' That seems all right until you think about the time you're in an automobile accident and you're unconscious. You want the medical people to get access to your information, but, given the nature of the situation, you can't actively give it to them."
The approach Waldo and team are just now beginning to investigate is based on auditing access to information. "Most attempts at computer security say that the fail-safe position you should take is to deny access," Waldo says. "We are leaning -- and this is just a lean -- toward a system that will allow access but keep an audit trail so you can go back to the person later and say, 'Look, I know you accessed my information. Under what circumstances did you do that? Why do you think you were authorized?'" There are other privacy factors to consider as well. "If someone is trying to do a statistical analysis over large chunks of medical data, the current rules and regulations say they can have access to information if it's anonymous, not personally identifiable. And there's currently a debate about what constitutes personally identifiable. How disaggregated does the information have to be?" Waldo says. "Turns out if you have someone's date of birth and their zip code, you have a high probability of being able to pick out the individual. But what if you're trying to correlate age and environmental factors with some disease?" So the Neuromancer project is also looking into ways to anonymize data. "There are techniques we can use to ... change the data values among various records in a way that doesn't change the statistical distribution, but does mean you can't narrow in on a particular person," he says. "But that's a future miracle."
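The value-swapping idea Waldo mentions can be illustrated with a deliberately naive sketch: shuffle one identifying field among the records so that its overall distribution stays exactly the same while any single value no longer belongs to the record it sits in. The function name `swap_column` and the sample records are hypothetical, and this is not presented as the team's technique, which the article describes only as future work.

```python
import random


def swap_column(records, field, seed=None):
    """Return copies of the records with one field's values permuted.

    The multiset of values in `field` is unchanged, so its overall
    distribution is preserved, but a value no longer points back to
    the record it came from. The input records are not modified.
    """
    rng = random.Random(seed)
    values = [r[field] for r in records]
    rng.shuffle(values)
    return [{**r, field: v} for r, v in zip(records, values)]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    patients = [
        {"zip": "01803", "birth_year": 1950, "systolic": 135},
        {"zip": "02138", "birth_year": 1972, "systolic": 118},
        {"zip": "84101", "birth_year": 1961, "systolic": 127},
    ]
    for row in swap_column(patients, "zip", seed=7):
        print(row)
```

A plain shuffle like this also destroys any real relationship between the swapped field and the others, which is exactly the tension Waldo raises about correlating age and environmental factors with disease. Practical schemes would need to constrain which records may trade values.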