Questions/Q1 - Cross-Layer Reliability Wiki

Question

How do we organize, manage, and analyze layering for cooperative fault mitigation?

Summary

A cross-layer reliable system must be able to communicate information and coordinate actions across levels. Broadly, this question addresses how these levels should interact.

Sub-Questions

What should new contracts and interfaces look like?
What information is useful to reflect up the stack?
What controls on lower levels should be exposed and how?
What information is useful for higher levels to pass down?
How do we evaluate and compose techniques across levels?
How do we engineer and analyze adaptation and repair control loops across layers?

Relevant Scenarios

Scenario 1

Workshop Materials

Workshop 1 Slides

Existing Work

add additional references here

Comments

We want to be careful about laying an increasing burden on the application programmer.
Software is not a single piece but many layers, and by treating them differently we can achieve more than by lumping them together. To enumerate: at the bottom is the virtual machine monitor/hypervisor, then the operating system, then the C++ library/runtime, then application frameworks (J2EE, Ruby on Rails, Python/Perl interpreters), and finally the actual application. Also, many new applications have multiple tiers with multiple instances (front-end web servers talking to app servers talking to back-end databases).

Some handling and information can be added by the compiler. Even for cases where we do communicate to/from the application, a number of things can be inserted automatically without adding a burden to the programmer.
Fault model coercion: if exposing errors to higher layers, it may be useful to coerce faults into a small number of equivalence classes to reduce the number of errors/behaviors that upper layers have to worry about. This is one of the reasons distributed systems often prefer fail-stop semantics: they reduce a large class of potential problems to one kind.
To comment, please add another bullet to this list.
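As an illustration of the fault model coercion comment above, here is a minimal Python sketch of how a layer boundary might collapse many low-level errors into a few equivalence classes before reporting upward. The names `FaultClass`, `coerce_fault`, and `call_with_coercion`, the choice of two classes, and the mapping of exception types to classes are all assumptions made for illustration; they are not an interface proposed on this page.

```python
from enum import Enum


class FaultClass(Enum):
    """Hypothetical equivalence classes exposed to upper layers."""
    TRANSIENT = "transient"   # retrying the operation may succeed
    FAIL_STOP = "fail_stop"   # treat the component as halted


def coerce_fault(exc):
    """Map an arbitrary low-level exception onto one of the few classes.

    The mapping is an assumption for this sketch: timeouts and interrupted
    calls count as transient; everything else is coerced to fail-stop.
    """
    if isinstance(exc, (TimeoutError, InterruptedError)):
        return FaultClass.TRANSIENT
    return FaultClass.FAIL_STOP


def call_with_coercion(operation, retries=2):
    """Run a lower-layer operation and report upward in coerced form.

    Returns (value, None) on success or (None, FaultClass) on failure.
    Transient faults are retried up to `retries` times before being reported.
    """
    for attempt in range(retries + 1):
        try:
            return operation(), None
        except Exception as exc:
            fault = coerce_fault(exc)
            if fault is FaultClass.FAIL_STOP or attempt == retries:
                return None, fault
```

With this shape, an upper layer only has to handle two outcomes besides success, regardless of how many distinct failure modes the lower layer has.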