Managing Small Data in Huge Quantities
The Yggdrasil Framework and Sensor.Network Service Make It Easier to Collect, Analyze, Visualize, and Share Data Generated by Net Devices
May 27, 2009
The Internet is no longer just a network of computers. It has evolved into a network of devices of all types and sizes, all connected, all communicating and sharing information all the time: smart phones, cameras, cars, toys, medical instruments, home appliances, even trees embedded with sensors. This "Internet of Things" can generate an enormous volume of very small bits of data: temperature, GPS position, speed, rate of acceleration, compression, humidity, shock, altitude, and so on. Every day we are generating more data, faster than ever before. This data can be used to tackle all kinds of issues, from global ones like climate change to local ones like improving traffic flow during rush hour.
With the current state of the art, however, it can be very difficult for scientists to access data that they did not generate in their own studies. In many cases, original data from related studies is hard or impossible to find, and only papers with interpretations of the raw data are available. It is difficult to return to the source and draw new conclusions. Similarly, scientists today tend to place sensors where they can be easily accessed rather than where the richest data can be collected, because retrieving data from a sensor often requires a physical visit.
How can we help scientists collect richer data sets more easily? How can we enable sharing, visualization, and analysis of vast amounts of data so that it can be interpreted and acted upon?
Sun is working toward answers to these questions with what it calls the "Internet of Things, Actualized," or IOTA. The cornerstones of the IOTA vision are the Sensor.Network Web service, the Project Yggdrasil data collection framework, and the Sun SPOT Platform, including the Squawk Java Virtual Machine.
Yggdrasil:* A Flexible Framework for Collecting Sensor Data
The Yggdrasil framework is designed to make it easier for scientists, researchers, and other domain experts, who may have little or no computer programming expertise, to create applications that collect sensor data over long periods of time. The framework also captures metadata (data about the data, e.g. units of measurement, normal range, sensor location), which is useful in analyzing, interpreting, and cross-comparing data from multiple sources. Yggdrasil is initially targeted at applications such as environmental monitoring, asset tracking, datacenter management, and security surveillance.
* In Norse mythology, Yggdrasil is an immense ash tree with branches that extend far into the heavens.
The framework is designed for use with small, wireless, embedded devices and takes their special requirements into consideration: their limited power, the need to operate unattended under harsh environmental conditions, and the need to cope with node failures and replacements. For example, it allows battery-powered sensors to be duty-cycled, i.e. the devices sleep most of the time to conserve energy and wake up periodically for a short duration to record and/or transmit sample readings. The sampling of sensor data and its transmission can be scheduled independently, and this flexibility offers greater control over energy consumption.
A high-level API built into the framework abstracts away the low-level details of scheduling the radio and the various sensors. Creating a data collection application is as simple as defining a sensor (what values are reported, how they are collected, their units, etc.) and specifying sampling and transmission schedules. Data recorded by sensors is communicated via sensor-specific gateways for archival into a database. Table creation, insertion of data and metadata, and duty-cycling of energy-constrained sensor nodes are all handled automatically by the system.
Sensor.Network.com: Turning Data into Useful Information
Sun Labs is also building Sensor.Network, a cloud-based infrastructure for sharing, visualizing, and analyzing the data collected. The service is agnostic to the source of data, which could be a mobile phone, an automobile, a datacenter, or an embedded device like the Sun SPOT. Besides supporting a heterogeneous mix of data sources, the service supports multiple sensor installations, each of which could potentially be owned by a different entity.
"Sensor.Network isn't just a data repository but also a means for monitoring and controlling a sensor network installation," said Vipul Gupta, a Distinguished Engineer at Sun Labs. "The design of Sensor.Network places a strong emphasis on security and privacy concerns. Our aim is to provide fine-grained control over how data is shared among authorized partners."
Upon successful authentication, each entity is presented with a private portal to the installations it is authorized to access. Multiple kinds of views are supported. The dashboard view presents an overview of the health of the entire site: how long each sensor has been up and running, the time of the last reading, the state of its battery, etc. The data view allows users to get live sensor readings and plot different data streams (e.g. sunlight and photosynthetic activity) to study their correlation. The management view allows users to look at applications installed on a device, deploy and launch new applications, and even pause, resume, stop, and migrate active applications.
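To make the duty-cycling idea from the Yggdrasil section more concrete, here is a minimal Python sketch of a node whose sampling and transmission schedules are set independently. It is not the Yggdrasil API (which targets Java on Sun SPOT devices); the names `SensorSpec` and `DutyCycledNode`, the field names, and the salinity readings are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class SensorSpec:
    """Hypothetical sensor definition: what is reported, in which units, and how."""
    name: str
    unit: str
    normal_range: Tuple[float, float]   # metadata carried alongside the readings
    read: Callable[[int], float]        # returns a reading for a time in seconds


@dataclass
class DutyCycledNode:
    """Toy node that is asleep except at scheduled sample/transmit times."""
    sensor: SensorSpec
    sample_every_s: int
    transmit_every_s: int
    buffer: List[Tuple[int, float]] = field(default_factory=list)
    sent: List[List[Tuple[int, float]]] = field(default_factory=list)
    radio_wakeups: int = 0

    def run(self, duration_s: int) -> None:
        for t in range(1, duration_s + 1):
            if t % self.sample_every_s == 0:
                # Brief wake-up: record one reading; the radio stays off.
                self.buffer.append((t, self.sensor.read(t)))
            if t % self.transmit_every_s == 0 and self.buffer:
                # Less frequent wake-up: power the radio and send the whole batch.
                self.radio_wakeups += 1
                self.sent.append(self.buffer)
                self.buffer = []


# Assumed example values; not taken from any real deployment.
salinity = SensorSpec(
    name="salinity",
    unit="ppt",
    normal_range=(0.0, 40.0),
    read=lambda t: 30.0 + (t % 7) * 0.1,  # made-up readings
)

node = DutyCycledNode(sensor=salinity, sample_every_s=60, transmit_every_s=600)
node.run(duration_s=3600)  # simulate one hour

samples_sent = sum(len(batch) for batch in node.sent)
print(f"{samples_sent} samples sent in {node.radio_wakeups} radio wake-ups")
```

In this toy run the node samples once a minute but powers its radio only once every ten minutes, so readings are sent in batches. This is the kind of trade-off between data freshness and energy use that independent schedules make possible.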
Real World Applications
The powerful new capabilities of Yggdrasil and Sensor.Network are already being exploited in real-world applications.
In California, the Ravenswood project is working with scientists from the United States Geological Survey (USGS) to collect sensor data from a salt marsh restoration project, an initiative that will convert 15,100 acres of commercial salt ponds at the south end of San Francisco Bay to a mix of tidal marsh, mudflat, and other wetland habitats. Dozens of sensor nodes will monitor conditions within the salt ponds, such as the level of salinity, pH, or dissolved oxygen in the tidal marshes, and communicate that data to a back-end server. A long-range Wi-Fi link has already been deployed from the salt ponds to a datacenter inside Sun's Menlo Park campus.
As Sun's Yggdrasil and Sensor.Network teams build up their system, they plan to open it up to anyone who would like to store and share data from their sensor installation, whether it be a collection of parking meters, cellular phones, or even automobiles. The team is taking a collaborative approach to building the system and welcomes participation from students, scientists, universities, and other companies.
Visualize the Possibilities
Yggdrasil and Sensor.Network together provide a foundation for building better models of natural phenomena by letting scientists and researchers share not only their data but also their algorithms for making sense of those data sets.
"There are many questions that require correlation and cross-comparison of data from multiple sources," said team leader Arshan Poursohi. "How do fluctuations in water quality impact the success of bird breeding? How does the presence of automobiles and people impact the numbers of migratory birds in salt ponds? How does the level of air pollution in Beijing affect the weather in San Francisco? Traditional sensors only provide raw data; what we're trying to do is turn that data into actionable information. This requires not only the data collection capabilities of Yggdrasil but also advanced analytics and data sharing capabilities we aim to provide with Sensor.Network."
Specifically, using Yggdrasil in a cloud computing model would make it possible to create and apply cloud-based algorithms for analyzing many variables. One algorithm, for example, could combine data streams from the Fish and Wildlife Service and the US Geological Survey to produce a census of all bird species in a given area. Another algorithm could notify scientists in Georgia that researchers in Oregon had produced a data set with a strong correlation to their area of focus, and even send them an email notification and a request for data sharing.
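As a rough illustration of the "strong correlation" notification described above, the sketch below computes a Pearson correlation coefficient between two data streams and flags the pair when its magnitude crosses a threshold. The article does not say which statistic Sensor.Network would use, so Pearson correlation, the 0.8 threshold, the function names, and all of the sample values are assumptions made for this example.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def should_notify(xs: Sequence[float], ys: Sequence[float],
                  threshold: float = 0.8) -> bool:
    """True when the two streams are strongly (positively or negatively) correlated."""
    return abs(pearson(xs, ys)) >= threshold


# Made-up daily values from two hypothetical installations.
dissolved_oxygen = [6.1, 6.4, 5.9, 7.0, 7.2, 6.8]   # mg/L
nest_counts = [12, 14, 11, 18, 19, 16]
unrelated_stream = [3, 9, 1, 4, 8, 2]

for label, other in [("nest counts", nest_counts),
                     ("unrelated stream", unrelated_stream)]:
    r = pearson(dissolved_oxygen, other)
    action = "notify owners" if should_notify(dissolved_oxygen, other) else "no action"
    print(f"dissolved oxygen vs {label}: r = {r:.2f} -> {action}")
```

A correlation of this kind only says that two series move together in the sample; it says nothing about cause. A real service would also need to align timestamps, handle gaps in the data, and respect the sharing permissions of each data owner before sending any notification.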
Access to a wide variety of individually designed algorithms will greatly improve a scientist's ability to analyze data from multiple viewpoints.
The Sun Labs team is also researching ways to make the enormous volumes of data generated by sensors more visually accessible. The team is leveraging 3D visualization technology from Sun Labs' Project Wonderland to create realistic, virtual-world representations based on real-world data to help illustrate complex interactions.
"We're interested in giving scientists tools to evaluate the data and to decide what is interesting," said Poursohi. "We want to give them new tools and new ways to see patterns and visualize relationships. Using virtual world images, we hope to illustrate the impacts of real-world interactions, so researchers can test out various what-if scenarios and look at data in new and useful modalities."
As an example, Malden Labs, based in Syracuse, New York, is using Project Wonderland technologies to bring the richness of 3D content to a whole new generation of enterprise applications. For instance, datacenters use an enormous amount of energy. Sensors embedded within a real datacenter can provide temperature, humidity, and power usage data that can be transformed into images that identify potential problems.
"We're focused on solving real problems for real clients," said Malden Labs founder Mike Wetzer. "We're extremely excited about the possibilities and our clients are equally excited."
Current Status
The Yggdrasil framework is freely available as open source technology under a BSD license. A QuickStart guide is available at: http://wiki.java.net/bin/view/Mobileandembedded/QuickStart
Sensor.Network is currently in alpha state, and the team is working with potential customers to fine-tune the implementation.