Replication and fault-tolerance



A task is made fault-tolerant through replication. A replica runs on a different set of resources from the original, at the same quality level. The number of replicas that need to run depends on the fault model of the system. Hence, given a fault model, we can treat fault-tolerance as another QoS dimension. The value along this dimension equals the number of copies of the task (the original plus its replicas). The utility achieved depends on the fault model or the reliability of the system.

For example, consider a task that has the following resource vector allocation choices (options): $r_{i1}$, $r_{i2}$ and $r_{i3}$. At the same level of quality, any one of these resource choices can be allocated to the task. For the task to be fault-tolerant, more than one resource vector needs to be allocated. Thus, we can generate the QoS set-points in the following way:

Fault-tolerance Quality Index	Number of replicas	Resource Vectors
1	0	$r_{i1}$, $r_{i2}$, $r_{i3}$
2	1	$(r_{i1}+r_{i2})$, $(r_{i1}+r_{i3})$, $(r_{i2}+r_{i3})$
3	2	$(r_{i1}+r_{i2}+r_{i3})$
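The enumeration in the table can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `fault_tolerance_set_points`, the two-dimensional (CPU share, bandwidth) vectors, and their numeric values are all hypothetical, and the sketch assumes that the demand of a replicated task is the element-wise sum of the chosen resource vectors, as the table's $(r_{i1}+r_{i2})$ notation suggests.

```python
from itertools import combinations


def fault_tolerance_set_points(options):
    """Map each fault-tolerance quality index m (number of copies) to the
    combined resource demands, one entry per m-subset of `options`."""
    set_points = {}
    for m in range(1, len(options) + 1):
        set_points[m] = [
            # Element-wise sum of the resource vectors in this subset.
            tuple(sum(dim) for dim in zip(*subset))
            for subset in combinations(options, m)
        ]
    return set_points


# Hypothetical (CPU share, bandwidth) demands standing in for r_i1, r_i2, r_i3.
options = [(2, 1), (1, 3), (4, 2)]
set_points = fault_tolerance_set_points(options)

for m, vectors in set_points.items():
    print(f"index {m}: {m - 1} replica(s), {vectors}")
```

Each subset uses distinct options, which reflects the requirement that a replica runs on a different set of resources from the original.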

For a task with $N$ resource vector options, the fault-tolerance quality index of $M$ can be attained in $\left(\begin{array}{c} N \\ M \end{array} \right)$ combinations of resource vectors. This automatically limits the number of copies of the task to the number of independent resource options, that is, to at most $N-1$ replicas.
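As a quick check against the example with three resource vector options, the counts for quality indices 1, 2 and 3 work out to

$$\binom{3}{1} = 3, \qquad \binom{3}{2} = 3, \qquad \binom{3}{3} = 1,$$

which matches the three, three, and one resource-vector combinations listed for those indices.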


Sourav Ghosh 2002-09-13
