\documentstyle[11pt,doublespace,times,epsf]{article}
% Standard Article Macros
% Rangarajan Pitchumani, March 27, 1991
\newenvironment{centre}%		--- Centre
{\centering}{}%
% Standard format:
\setlength{\parskip}{0in}	    % Inter-paragraph space.
\setlength{\oddsidemargin}{0in}   % Adjust left/right margins.
\setlength{\textwidth}{6.6in}	    % Overall.
\setlength{\textheight}{9.in}	    % Overall.
\setlength{\topmargin}{-0.6in}	    % (Plus 1.5 inches.)
\setstretch{1.25} 		    % Inter-line spacing.
\setlength{\parindent}{0in}	    % Flush left starting.
\begin{document}

We would like to thank the reviewers  for their useful comments on our
manuscript titled ``Parallel and Distributed  Application  of an Urban
and Regional Multiscale Model'' for publication in {\it  Computers and
Chemical  Engineering}. We have taken into consideration the reviewers'
comments and revised our manuscript accordingly.

\vspace{0.4in}

{\bf Reply to Review I:}

\vspace{0.2in}
{\bf Comment 1:} 

{\it The parallel implementation is suited to the low
number of processors regime where the number of processors is smaller
than the number of layers.}

\vspace{0.2in}
{\bf Reply:} 

This  comment is true  as stated; however,  the implementation is even
better suited to the regime where the number of processors is equal to
the number of layers. Although it is true that the URM transport phase
does not scale quite as well as the chemistry phase, it in fact scales
beyond  the  number  of layers.  This  is achieved by distributing the
transport operator across both the layer  dimension and  the  chemical
species dimension, and by replicating the LU factorization for a given
layer on every transport  processor that  handles  species  for  that
layer. Further, scaling the number of
transport  processors  to greater than the  number  of layers  is very
practical, as demonstrated  by our  10-processor performance data.  In
this case, 10 processors (each running one chemistry and one transport
process) are used with a 5-layer data set for an effective speedup  of
greater than 6.  We have clarified this in the paper.
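
As a rough check on this figure, assuming the usual definition of
parallel efficiency as speedup divided by processor count (the symbols
$S$, $P$, and $E$ are introduced here only for illustration and do not
appear in the paper):
\[
E = \frac{S}{P} > \frac{6}{10} = 0.6 .
\]
If the transport work is split evenly, the 10 transport processes over
5 layers correspond to $10/5 = 2$ processes per layer, each handling
roughly half of the chemical species for that layer.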

\vspace{0.4in}
{\bf Comment 2:} 

{\it The parallelization of other models on different
architectures using comparable techniques is already discussed in the
literature. Thus, there is a minimal amount of really new material in
this paper.}

\vspace{0.2in}
{\bf Reply:} 

Although  our techniques  can  be considered comparable to  one or two
other  efforts, they are  different in significant  ways  that are the
direct  result  of structural  differences  between uniform-scale  and
multiscale  models.   Specifically,  other   efforts  parallelize  the
horizontal  transport phase with separate L$_x$  and  L$_y$  transport
operators.  As stated in Section 2 of  the paper, horizontal transport
is  performed  in  the  URM  model  by  a  two-dimensional  horizontal
(L$_{xy}$) finite element operator in  order to support  multiple grid
scales.   The degree  of  outer-loop  parallelism  available  from  an
L$_{xy}$ operator is lower than the degree available from separate L$_x$
and L$_y$ operators;  however,  as  multiple  scales  are  essential  for
large-scale  simulations  (to  achieve  accuracy without  overwhelming
computational  cost), the  development  of  a multiscale  method  that
scales well  to tens of processors  is a  significant  advance  beyond
existing  techniques.  Further,  as mentioned in Section  4  (Portable
Distributed  Code),  we incorporated data prefetching and static  load
balancing  into  our  model,  which  to  our  knowledge  have not been
previously  exploited   in  comparable   air  quality  parallelization
efforts. To address this comment,  we  have clarified this issue  in
the revised version of the paper.

\vspace{0.4in}
{\bf Comment 3:} 

{\it The researchers might consider using a different communication
protocol for future work. Other protocols also present the unified
framework of PVM without a high message overhead.}


\vspace{0.2in}
{\bf Reply:} 

We have considered other protocols, and we may use them in the future.
Our measurements show that relatively little time is currently spent in
PVM.

\vspace{0.4in}
{\bf Comment 4:} 

{\it Table 1 represents the main results of the paper. It would be
helpful to see the time vs. number of processors curves for each system.}

\vspace{0.2in}
{\bf Reply:} 

We have added this plot to the paper.

\newpage

{\bf Reply to Review II:}

\vspace{0.2in}

The reviewer requests more background discussion of air quality models. 
To address this comment, we have rewritten and added to the
introduction. 

The review also requests ``...further discussion on the potential
speedups through new algorithms which support scalable ideas...'' To
address this comment, we have added a brief discussion of our current
activities and future plans regarding the parallel URM model.


\end{document}

