Achieving higher dependability through host and NIC processor collaboration


2008

Abstract (summary)

Traditionally, distributed systems requiring high dependability were designed using custom hardware with massive amounts of redundancy. In most of these systems, both the nodes and the network were replicated. Recently, the need for cost reduction and access to the latest commercial technologies has prompted the use of commercial off-the-shelf (COTS) hardware and software products in the design of such systems. However, reliance on COTS technology brings new challenges in system reliability. This dissertation attempts to address these challenges by developing fault tolerance techniques for systems built on modern high-speed networking.

Driven by the demand for greater network performance, emerging network technologies have complex network interfaces with a Network Interface Card (NIC) processor and large local memory. However, increasing complexity results in a larger set of failure points and a potential increase in the network failure rate. This is in addition to the system failures that can be caused by faults that strike the host system. In this dissertation, we propose to achieve higher dependability of distributed systems through host and NIC processor collaboration. The host processor detects a failed network interface and recovers it; in turn, this symbiotic relationship allows the NIC processor to aid in the recovery of a failed host system or application. More specifically, we present an effective, low-overhead, adaptive and concurrent self-testing technique to protect programmable high-speed network interfaces, and low-overhead message logging protocols to achieve fast recovery from host application crashes.

Indexing (details)


Subject
Computer science
Classification
0984: Computer science
Identifier / keyword
Applied sciences; Fault tolerance; Network interface card processors; Processor collaboration
Title
Achieving higher dependability through host and NIC processor collaboration
Author
Zhou, Yizheng
Number of pages
82
Publication year
2008
Degree date
2008
School code
0118
Source
DAI-B 69/12, Dissertation Abstracts International
Place of publication
Ann Arbor
Country of publication
United States
ISBN
9780549915171
Advisor
Koren, Israel
Committee member
Krishna, C. Mani; Weems, Charles C.; Wolf, Tilman
University/institution
University of Massachusetts Amherst
Department
Electrical & Computer Engineering
University location
United States -- Massachusetts
Degree
Ph.D.
Source type
Dissertations & Theses
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
3336931
ProQuest document ID
304568379
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Document URL
http://search.proquest.com/docview/304568379