

Concluding remarks

In this chapter, we have studied a fundamental aspect of multiserver systems: how the number of servers affects their performance. DR allows us to study this question not only when jobs are served in the order of their arrival (FCFS) but also when there are priorities among jobs, whose analysis involves multidimensional Markov chains. An additional advantage of DR is that job sizes can be modeled by PH distributions, which allows us to study the impact of job size variability on performance.

Our analysis illuminates principles governing the performance (the mean response time, in particular) of multiserver systems as compared to single server systems, which can be characterized primarily by three rules of thumb.

(i) A single server system has an advantage over multiserver systems with respect to utilization, and this relative advantage becomes greater at higher load.
(ii) A multiserver system has an advantage over single server systems with respect to reducing the impact of job size variability on the mean response time, and this advantage becomes greater at higher load and at larger job size variability.
When jobs are served in FCFS order, the mean response time of single server and multiserver systems can be qualitatively characterized primarily by the above two rules. In particular, the rules characterize how the optimal number of servers is affected by the load and job size variability. However, the above two rules are not sufficient to characterize the mean response time when there are priorities among jobs, since
(iii) A multiserver system has an advantage over a single server system with respect to reducing the impact of prioritization on low priority jobs, and this advantage becomes greater when the mean and/or variability of the higher priority job size are larger and/or when the load of the higher priority job is higher.
We find that these three rules quantitatively characterize the mean response time under single and multiserver systems. In particular, these rules characterize well how the optimal number of servers (per class or overall) is affected by the load and by the mean and variability of the job size of each class.
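Rules (i) and (ii) can be seen in a small numerical sketch. The following is not the DR analysis of this chapter; it approximates the M/G/$k$ mean response time by scaling the M/M/$k$ waiting time by $(C^2+1)/2$ (the Lee-Longton style approximation), holding the total service capacity fixed as the number of servers $k$ varies. All parameter values are illustrative.

```python
import math

def erlang_c(k, a):
    """Probability of queueing in an M/M/k with offered load a = lam/mu (a < k)."""
    terms = sum(a**i / math.factorial(i) for i in range(k))
    last = a**k / (math.factorial(k) * (1.0 - a / k))
    return last / (terms + last)

def mean_response_time(k, rho, scv):
    """Approximate mean response time in an M/G/k with k servers of speed 1/k
    (total capacity 1), arrival rate rho, mean job size 1, and squared
    coefficient of variation scv, via E[W_MGk] ~ (scv+1)/2 * E[W_MMk]."""
    mu = 1.0 / k                                  # per-server service rate
    wait_mmk = erlang_c(k, rho * k) / (k * mu - rho)
    return (scv + 1) / 2.0 * wait_mmk + 1.0 / mu  # waiting time + service time

rho = 0.9
for scv in (1.0, 16.0):
    for k in (1, 2, 4):
        print(f"scv={scv:5.1f}  k={k}  E[T]={mean_response_time(k, rho, scv):7.2f}")
```

At this load, low job size variability ($C^2=1$) favors the single fast server, consistent with rule (i), while high variability ($C^2=16$) favors four slower servers, consistent with rule (ii).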

In fact, the third rule has an important implication for designing scheduling policies in multiserver systems: prioritizing small jobs to improve mean response time is not as effective in multiserver systems as in single server systems.
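The effect behind rule (iii) can be illustrated in a special case that needs no DR: two preemptive priority classes whose job sizes are identically and exponentially distributed. In that case the total number of jobs in the system is distributed as in a plain M/M/$k$, and the high priority class on its own also sees an M/M/$k$, so the low priority mean response time follows by subtraction and Little's law. The parameter values below are illustrative.

```python
import math

def erlang_c(k, a):
    """Probability of queueing in an M/M/k with offered load a = lam/mu (a < k)."""
    terms = sum(a**i / math.factorial(i) for i in range(k))
    last = a**k / (math.factorial(k) * (1.0 - a / k))
    return last / (terms + last)

def mean_jobs_mmk(k, lam, mu):
    """Mean number of jobs (queue + service) in an M/M/k."""
    a = lam / mu
    rho = a / k
    return a + erlang_c(k, a) * rho / (1.0 - rho)

def priority_penalty(k, lam_h, lam_l):
    """Ratio of the low priority mean response time under preemptive priority
    to the overall mean response time under FCFS, with k servers of speed 1/k.
    Valid because both classes have the same exponential size distribution:
    the total occupancy is that of an M/M/k, and the high priority class
    alone also behaves as an M/M/k."""
    mu = 1.0 / k
    n_total = mean_jobs_mmk(k, lam_h + lam_l, mu)
    n_high = mean_jobs_mmk(k, lam_h, mu)
    t_low = (n_total - n_high) / lam_l       # Little's law, low priority class
    t_fcfs = n_total / (lam_h + lam_l)       # Little's law, FCFS (no priority)
    return t_low / t_fcfs

for k in (1, 2, 4):
    print(f"k={k}  low-priority penalty = {priority_penalty(k, 0.45, 0.45):.2f}")
```

The penalty that prioritization inflicts on low priority jobs shrinks as $k$ grows: with several servers, a burst of high priority work occupies only some of the servers, whereas in a single (fast) server system it blocks everything behind it.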

We have also proposed an approximate analysis, DR-A, of the (per class or overall) mean response time in an M/PH/$k$ queue with many ($m>2$) priority classes. DR-A is based on DR but, unlike the two approximations introduced in Section 3.6 (DR-PI and DR-CI), its running time does not grow with the number of priority classes, $m$. The accuracy of DR-A, as well as that of two existing approximations (BB and MK-N), is evaluated against simulation, and the results are discussed extensively. We find that the error in DR-A is within 5% for a range of loads and job size variabilities, while the error in BB and MK-N can be as high as 50%. Since BB assumes that the effect of priority is similar in a single server system and in a multiserver system, its error can also be explained by the above observation that prioritizing small jobs to improve mean response time is not as effective in multiserver systems as in single server systems.

In this chapter, we primarily limit our discussion to ``how many servers are best?'' However, the analysis of the M/PH/$k$ queue with $m$ priority classes via DR has broad applicability in capacity planning for multiserver systems with multiple priority classes. For example, in [204], we study the impact of system tasks on the performance of user tasks in the context of dependability systems, where system tasks have higher priority than user tasks for the purpose of fault recovery, fault isolation, fault masking, intrusion detection, virus checking, etc.

The results in this chapter can also be used to infer how performance is affected by changing the number of servers, or by prioritizing some of the jobs, in more complex multiserver systems. For example, in the rest of the thesis, we consider multiserver systems consisting of multiple queues and multiple servers, where each server behaves differently from the others. We typically assume that there are no priorities within each queue and that there is only a single server of each type. By contrast, the model studied in this chapter has multiple homogeneous servers and a single queue in which some jobs have priority over others. The results in this chapter can thus be used to infer how the mean response time changes in the systems studied in the rest of the thesis when priorities are introduced within each queue or when there are multiple servers of each type.


Takayuki Osogami 2005-07-19