Count the number of items and sum the values per item in a column in R
Code:
mydf <- data.frame(Key=c('a','b','c','d','a','a','a','a','b','c','d','c','c','d','b','b','a'), Values=c(4,6,3,4,5,6,2,1,3,4,5,6,4,2,4,5,6))
mydf_keys <- unique(mydf$Key)
mydf_vals <- integer(0)  # empty vector; one count per key is appended in the loop
for (i in mydf_keys) {
  # count the rows whose Key matches the current key
  mydf_vals <- c(mydf_vals, nrow(mydf[mydf$Key %in% i, ]))
}
# mydf_count holds the count of each item, i.e. how many times it appears
mydf_count <- data.frame(item=mydf_keys, count=mydf_vals)
print(mydf_count)
# mydf_sum holds the sum of the Values column for each item
mydf_sum <- setNames(aggregate(mydf$Values, by = list(item=mydf$Key), FUN = sum, na.rm = TRUE), c('item', 'sum'))
print(mydf_sum)
Output:
  item count
1    a     6
2    b     4
3    c     4
4    d     3
  item sum
1    a  24
2    b  18
3    c  17
4    d  11
If you have a better approach for this then please share in the comments below or email me. :)
This R example groups rows by a key column and computes the count and the sum of the values within each group. The count is built by looping over the unique keys and counting the matching rows; the sum comes from a single call to aggregate.
R offers more concise built-in ways to do the same thing, such as aggregate or tapply in base R, or group_by with summarise in dplyr, which are usually easier to read and tend to be faster on larger data.
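As a rough sketch of the base R route, the loop can be replaced with table for the counts and tapply for the sums. The names key_counts, key_sums and mydf_summary are made up for this illustration, and stringsAsFactors = FALSE is an assumption added so the result is the same on older R versions.

```r
# Same example data as in the post
mydf <- data.frame(
  Key = c('a','b','c','d','a','a','a','a','b','c','d','c','c','d','b','b','a'),
  Values = c(4,6,3,4,5,6,2,1,3,4,5,6,4,2,4,5,6),
  stringsAsFactors = FALSE  # assumption: keep Key as plain character
)

# Count per key: table() tallies how often each key appears
key_counts <- table(mydf$Key)

# Sum per key: tapply() applies sum() to Values within each key
key_sums <- tapply(mydf$Values, mydf$Key, sum, na.rm = TRUE)

# Combine both results; counts are looked up by key name so the rows line up
mydf_summary <- data.frame(
  item = names(key_sums),
  count = as.vector(key_counts[names(key_sums)]),
  sum = as.vector(key_sums),
  stringsAsFactors = FALSE
)
print(mydf_summary)
```

This gives the count and the sum side by side in one data frame, one row per key. For key a, for example, it should show a count of 6 and a sum of 24.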
