Michael_Nielsen/Syd/Synergy.SYNERGY@synergy.com.au wrote:
>
> I KNOW this should be easy, but I'm stuck.
>
> My data frame consists of multiple observations from each of a number of
> stations, and what I would like to do is create another data frame that
> contains all the variables of the first, but only rows where a certain
> variable is at its maximum for the station.
>
> So, for example:
>
> > my.df
>    stn obs           v
> 1    1   1  0.26400396
> 2    2   1 -0.79194397
> 3    3   1  0.11924528
> 4    4   1  0.42596859
> 5    5   1 -0.50528235
> 6    1   2 -1.57524853
> 7    2   2  0.17762482
> 8    3   2 -0.83013770
> 9    4   2 -0.53203400
> 10   5   2 -2.71397275
> 11   1   3  0.26902053
> 12   2   3  2.01147908
> 13   3   3  0.73301643
> 14   4   3 -0.67333384
> 15   5   3 -1.36219773
> 16   1   4 -2.20342109
> 17   2   4  0.18941702
> 18   3   4  0.51492032
> 19   4   4  0.03597370
> 20   5   4 -1.43502366
> 21   1   5 -1.34589392
> 22   2   5  1.00389195
> 23   3   5 -0.21233041
> 24   4   5 -1.35141044
> 25   5   5 -0.02052348
>
> > tapply(v,factor(stn),max)
>           1           2           3           4           5
>  0.26902053  2.01147908  0.73301643  0.42596859 -0.02052348
>
> so my new data frame should contain (possibly multiple rows per station)
>
>    stn obs           v
> 1    1   3  0.26902053
> 2    2   3  2.01147908
> 3    3   3  0.73301643
> 4    4   1  0.42596859
> 5    5   5 -0.02052348


As a first idea:

  my.df[tapply(v,factor(stn), function(x) which(v==max(x))),]


Uwe
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._