Figure 4.1(a) shows the optimal number of servers as a function of the load and the variability of the job size, where the job size has a two-phase PH distribution. All of our results are expressed as a function of the variability of the job size distribution (specifically, its squared coefficient of variation, $C^2$) and the server load, $\rho$. While other factors, e.g. the exact form of the job size distribution, might affect our results, we posit that load and variability are the most relevant factors. As proved by Stidham, a single server minimizes mean response time when $C^2 = 1$ (exponential distribution) or when $C^2 = 1/2$ (Erlang-2 distribution). Observe, however, that under high job size variability and/or high load, the optimal number of servers is more than one; that is, we prefer multiple slow servers to a single fast server. For example, at load and , we see that three servers are best. Computations are done only for up to six servers -- the level curves shown will continue into the upper right portion of the plot if a larger number of servers is considered.
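The exponential case can be checked numerically with the standard Erlang C formula for the M/M/k queue. The sketch below is only an illustration, not the computation behind Figure 4.1: it assumes Poisson arrivals, exponential job sizes with mean 1, and k servers of speed 1/k each, so that the total capacity is fixed at 1 and the arrival rate equals the load. The function names are ours.

```python
from math import factorial


def mmk_mean_response_time(k: int, rho: float) -> float:
    """Mean response time of an M/M/k queue with k servers of speed 1/k.

    Assumptions (ours): mean job size 1 on a speed-1 server, total capacity 1,
    so the arrival rate equals the load rho and each server has rate 1/k.
    """
    if k < 1 or not 0.0 < rho < 1.0:
        raise ValueError("need k >= 1 and 0 < rho < 1")
    a = rho * k  # offered load in Erlangs (arrival rate / per-server rate)
    tail = a**k / factorial(k) / (1.0 - rho)
    # Erlang C: probability that an arriving job has to wait.
    p_wait = tail / (sum(a**n / factorial(n) for n in range(k)) + tail)
    # Mean wait is p_wait / (total rate - arrival rate); mean service time is k.
    return p_wait / (1.0 - rho) + k


def best_number_of_servers(rho: float, max_k: int = 6) -> int:
    """Number of servers in 1..max_k that minimizes mean response time."""
    return min(range(1, max_k + 1), key=lambda k: mmk_mean_response_time(k, rho))
```

For k = 1 the formula reduces to the M/M/1 mean response time 1/(1 - rho), and under these assumptions the minimizer over one to six servers is a single server at every load, in agreement with the exponential case above.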
Figure 4.1(b) shows that, for any particular job size variability, $C^2$, having a larger number of slower servers may reduce the mean response time up to a point, after which further increasing the number of servers increases the mean response time. To understand why, note that by increasing the number of servers (while keeping the total capacity fixed), we allow short jobs to avoid queueing behind long jobs -- specifically, an arriving short job is more likely to find a free server. Thus, increasing the number of servers mitigates the impact of job size variability and hence improves performance. If the number of servers is too high, however, servers are more likely to be idle: whenever fewer jobs than servers are present, part of the total capacity goes unused, under-utilizing the system resources.
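The trade-off described above can be explored with a small simulation. The following is a rough sketch under our own assumptions, not the analysis used to produce Figure 4.1: arrivals are Poisson, jobs are served FCFS from a central queue, each of the k servers has speed 1/k, and job sizes follow a two-phase hyperexponential distribution with balanced means (one particular two-phase PH distribution, valid for squared coefficient of variation at least 1). All names and default parameters are hypothetical.

```python
import heapq
import math
import random


def h2_sampler(c2: float, rng: random.Random):
    """Sampler for a balanced-means H2 distribution with mean 1 and SCV c2 >= 1."""
    if c2 < 1.0:
        raise ValueError("this H2 construction needs c2 >= 1")
    p = 0.5 * (1.0 + math.sqrt((c2 - 1.0) / (c2 + 1.0)))
    rates = (2.0 * p, 2.0 * (1.0 - p))  # each branch contributes mean 1/2

    def sample() -> float:
        rate = rates[0] if rng.random() < p else rates[1]
        return rng.expovariate(rate)

    return sample


def simulate_mean_response_time(k: int, rho: float, c2: float,
                                n_jobs: int = 200_000, seed: int = 1) -> float:
    """Estimate mean response time with k servers of speed 1/k (total capacity 1)."""
    rng = random.Random(seed)
    sample_size = h2_sampler(c2, rng)
    free_at = [0.0] * k  # min-heap of times at which each server becomes free
    now = 0.0
    total = 0.0
    for _ in range(n_jobs):
        now += rng.expovariate(rho)         # Poisson arrivals, rate = load
        start = max(now, free_at[0])        # FCFS: take the earliest-free server
        finish = start + sample_size() * k  # a speed-1/k server takes k times longer
        heapq.heapreplace(free_at, finish)
        total += finish - now
    return total / n_jobs
```

Because the random draws occur in the same order for every k, runs with the same seed see the same arrival times and job sizes, which makes comparisons across k less noisy. Calling `simulate_mean_response_time(k, rho, c2)` for k from 1 to 6 at a fixed load and variability gives a rough picture of the trade-off; the estimates remain noisy when the variability is high, so longer runs or several seeds are advisable.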