Solaris MC: A Multi-Computer OS
Yousef A. Khalidi, Jose M. Bernabeu, Vlada Matena, Ken Shirriff, Moti Thadani
Introduction by Yousef A. Khalidi
This technical report describes a prototype distributed operating system for closely coupled computer clusters. The prototype system, called Solaris MC, was built as a set of extensions to the base Solaris(TM) Operating Environment UNIX® system. It provided the same ABI/API as the Solaris OE, so existing applications ran unmodified. We used several techniques to extend the Solaris OE using a CORBA-compliant object-oriented system. Objects communicated through a runtime system that borrowed from Solaris OE doors and Spring subcontracts. Solaris MC provided a single-system image: a cluster appeared to users and applications as a single computer running Solaris. Solaris MC was also designed for high availability: if a node failed, the remaining nodes and cluster services remained operational.
After the Solaris MC prototype was completed, the team and technology were transferred out of Sun Labs to form the core of a new product group chartered with building Sun's next-generation clustering product. This effort resulted in the Sun(TM) Cluster 3.0 (Full Moon) product, which was released in November 2000. Sun Cluster 3.0 is Sun's most powerful and comprehensive cluster solution to date. It focuses on delivering integrated availability, scalability, manageability, and ease of use with the core Solaris OE (http://www.sun.com/clusters).
We learned many lessons along the way, from the prototype work in Sun Labs through product development of Sun(TM) Cluster 3.0 to customer deployment of the final product. The initial emphasis on a single-system image was relaxed in the final product as we balanced it against the growing requirements of high availability.
Many of the initial design decisions we made in Solaris MC proved correct, including the use of object-oriented techniques and the seamless recovery of system services. Finally, our early decision to extend Solaris without breaking any applications was key to making the transition from a prototype system to a major Sun software product.
REFERENCES:
- Yousef A. Khalidi, Jose M. Bernabeu, Vlada Matena, Ken Shirriff, and Moti Thadani,
``Solaris MC: A Multicomputer Operating System,'' Proceedings of Usenix 1996, January 1996,
pp. 191-203.
- Ken Shirriff, Jose Bernabeu Auban, Yousef A. Khalidi, and Vlada Matena, ``Single System Image:
The Solaris MC Approach,'' Proceedings of PDPTA '97, July 1997, pp. 1097-1105.
- Ken Shirriff, ``Building distributed process management on an object-oriented framework,''
Proceedings of Usenix 1997, January 1997, pp. 119-131.
- Jose M. Bernabeu-Auban, Vlada Matena, and Yousef A. Khalidi, ``Extending a Traditional OS Using Object-Oriented Techniques,'' USENIX Conference on Object-Oriented Technologies, 1996.
