1

This is kind of a follow-up to my question at Can I subset specific years and months directly from POSIXct datetimes?

I have a dataframe

test <- data.frame(seq(from = as.POSIXct("1983-03-09 01:00"), to = as.POSIXct("1985-01-08 00:00"), by = "hour"))
colnames(test) <- "DateTime"
test$Value<-sample(0:100,16104,rep=TRUE)

and I am subsetting particular years and months using

# Add year column
test$Year <- as.numeric(format(test$DateTime, "%Y"))

# Add month column
test$Month <- as.numeric(format(test$DateTime, "%m"))

# Subset specific year (1984 in this case)
sub1 = subset(test, Year!="1983" & Year!="1985")

# Subset specific months (April and May in this case)
sub2 = subset(test, Month=="4" | Month=="5")

From these subsets sub1 and sub2, I want to use the hourly data to calculate daily minimum, mean, and maximum from column Value.

I found a solution at Aggregating hourly data into daily aggregates

stat <- function(x) c(min = min(x), max = max(x), mean = mean(x))
sub1$Date <- as.Date(sub1$DateTime)
sub2$Date <- as.Date(sub2$DateTime)
aggregate(Value ~ Date, sub1, stat)
aggregate(Value ~ Date, sub2, stat)

This appears to give the minimum, mean, and maximum in columns (although I can't verify as I can't read the top in the R output window). I need to make these aggregate results into a dataframe containing Date, min, mean, and max. Does anyone know how I might do that? I've tried

sub1.sum <- aggregate(Value ~ Date, sub1, stat)

and

sub1.sum <- as.data.frame(aggregate(Value ~ Date, sub1, stat))

but that appears to just return a single value (i'm not sure if this is the min, mean, or max).

2
  • Ok when I run your code aggregate(Value ~ Date, sub1, stat) returns Date Value.min Value.max Value.mean 1 1984-01-01 1.00000 76.00000 41.93750 . .......... ....... ........ ........ Is that not what you wanted. Commented Apr 11, 2014 at 5:12
  • Hello CCurtis, yes that is what I wanted, but I would like to store those results in a dataframe with columns for date, min, mean, and max. Do you know how I might do that? My attempts thus far (above) have failed. Commented Apr 11, 2014 at 5:36

1 Answer 1

1

cbind(sub1.sum[,-2, drop=FALSE], as.data.frame(sub1.sum$Value)) gives you what you want, I guess. The problem before was the the value column consisting of 3 dimensions (min, mean, max)...

Sign up to request clarification or add additional context in comments.

3 Comments

Nice. Ok trying to understand how you answer works. What does indexing with -2 do? I've never had a problem like this where R will show you data but not allowing you to put it in a table.
-2 means all columns but the second. You could also say sub1.sum[,1, drop=FALSE] to only keep the first column (Dates). drop=FALSE prevents from converting a single column dataframe into a vector. So what the code does in total is: pasting together (cbind is for column bind) the Dates column with the three columns in Value.
Ok and calling sub1.sum$Value is just returning min max mean I'm assuming. Mainly hadn't seen negative values used to index before. Is a good trick to know. Also interesting to know that aggregate was adding three values into a single column but only returning 1 and not reporting any error.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.