 
 
 
 
 
 
 
  
In Chapter 4, we address a fundamental question in
designing multiserver systems, namely ``How many servers are best to
minimize mean response time?''  As may be expected, this question has
a long history in the literature; however, it has not been fully
resolved.  As early as 1958, Morse observed that for an M/M/ /FCFS
system1.3,
the optimal number of servers is one (
/FCFS
system1.3,
the optimal number of servers is one ( ) [131].  In 1970,
Stidham formalized and generalized the observation by Morse
[188]; Stidham showed that a single server is optimal in a
G/Er/
) [131].  In 1970,
Stidham formalized and generalized the observation by Morse
[188]; Stidham showed that a single server is optimal in a
G/Er/ /FCFS system, namely when the arrival process is a general
process and the service demand has an Erlang distribution (i.e., when
the service demand is a sum of i.i.d., or independent identically
distributed, exponential random variables).  
Stidham's result is extended by Brumelle to the system where
the service demand is at most as variable as the exponential random variable;
i.e., a single server is optimal
in a G/G/
/FCFS system, namely when the arrival process is a general
process and the service demand has an Erlang distribution (i.e., when
the service demand is a sum of i.i.d., or independent identically
distributed, exponential random variables).  
Stidham's result is extended by Brumelle to the system where
the service demand is at most as variable as the exponential random variable;
i.e., a single server is optimal
in a G/G/ /FCFS system where the service demand distribution has the squared
coefficient variation at most one (
/FCFS system where the service demand distribution has the squared
coefficient variation at most one ( ) [31].
For general service
demand distributions, only results for limiting cases are known.
Specifically, the work by Reiman and Simon [161] can be
used to show that a single server (
) [31].
For general service
demand distributions, only results for limiting cases are known.
Specifically, the work by Reiman and Simon [161] can be
used to show that a single server ( ) is optimal in the light
traffic limit (when servers are almost always idle), and the work by
Iglehart and Whitt [79] implies that a multiserver
system (GI/GI/
) is optimal in the light
traffic limit (when servers are almost always idle), and the work by
Iglehart and Whitt [79] implies that a multiserver
system (GI/GI/ /FCFS) and its single server counterpart
(GI/GI/1/FCFS) coincide in the heavy traffic limit (when servers are
almost always busy).  In fact, based on these results in the light and
heavy traffic limits, Mandelbaum and Reiman state ``One expects,
therefore, that the difference in performance between the
systems1.4is maximal in light traffic, and it diminishes as utilization increases
to heavy traffic.'' [120].
/FCFS) and its single server counterpart
(GI/GI/1/FCFS) coincide in the heavy traffic limit (when servers are
almost always busy).  In fact, based on these results in the light and
heavy traffic limits, Mandelbaum and Reiman state ``One expects,
therefore, that the difference in performance between the
systems1.4is maximal in light traffic, and it diminishes as utilization increases
to heavy traffic.'' [120].
Contrary to the prior work that suggests the optimality of single server systems in special/limiting cases, we will see that a multiserver system often provides lower mean response time than its single server counterpart. Specifically, we will see that the optimal number of servers increases as the load and/or the service demand variability increases. Intuition behind this finding is that multiple servers allow short jobs to avoid queueing behind long jobs; by contrast, a long job blocks all the other jobs in a single server system, while being processed.
Above we assume that jobs are served in the order of their arrivals (FCFS); however, real world systems are not always FCFS. Rather, there is often inherent scheduling in the system to allow high priority jobs to move ahead of low priority jobs (priority scheduling). For example, the high priority jobs may carry more importance, e.g., representing users who pay more. Alternatively, high priority jobs may simply consist of those jobs with small service demand, since giving priority to short jobs reduces mean response time overall.
The prevalence of priority scheduling motivates us to study the question ``How many servers are best?'' under priority scheduling. In particular, how does the optimal number of servers under priority scheduling compare to that under FCFS, and how is the answer affected by system parameters such as loads and service demand distributions? To answer these questions, we analyze the mean response time in a multiserver systems with multiple priority classes via DR, as its analysis involves a multidimensional Markov chain.
Our analysis shows that the optimal number of servers can be quite different between FCFS and priority scheduling. Intuition behind this finding is seen by considering an example where shorter jobs have higher priority: in this case, the benefit of multiple servers (i.e., allowing short jobs to avoid queueing behind long jobs) is smaller under priority scheduling, since priority scheduling by itself allows small jobs to jump ahead of long jobs. We will also study how the optimal number of servers differs for different priority classes, and how the mean response time is affected by choosing a non-optimal number of servers. We will also evaluate the accuracy of various existing approximations for multiserver systems with multiple priority classes, and see that the error in the mean response time can be as high as 50% or more.
 
 
 
 
 
 
