Posted By: Anonymous
I have a data frame with two columns. First column contains categories such as “First”, “Second”, “Third”, and the second column has numbers that represent the number of times I saw the specific groups from “Category”.
Category Frequency First 10 First 15 First 5 Second 2 Third 14 Third 20 Second 3
I want to sort the data by Category and sum all the Frequencies:
Category Frequency First 30 Second 5 Third 34
How would I do this in R?
aggregate(x$Frequency, by=list(Category=x$Category), FUN=sum) Category x 1 First 30 2 Second 5 3 Third 34
In the example above, multiple dimensions can be specified in the
list. Multiple aggregated metrics of the same data type can be incorporated via
aggregate(cbind(x$Frequency, x$Metric2, x$Metric3) ...
(embedding @thelatemail comment),
aggregate has a formula interface too
aggregate(Frequency ~ Category, x, sum)
Or if you want to aggregate multiple columns, you could use the
. notation (works for one column too)
aggregate(. ~ Category, x, sum)
tapply(x$Frequency, x$Category, FUN=sum) First Second Third 30 5 34
Using this data:
x <- data.frame(Category=factor(c("First", "First", "First", "Second", "Third", "Third", "Second")), Frequency=c(10,15,5,2,14,20,3))