Creating a new column based on function

Question

I have an Excel sheet containing employment numbers of each industry of each county in the US.

It looks like this:

County   Industry  Employees
a        1         49
a        2         1
b        1         4
b        2         19
...

I want to calculate the Herfindahl-Hirschman index (HHI) of employment in each county. I'm using R. Given some numbers, calculating the HHI is easy:

hhi <- function(x) {
  # calculate sum
  total <- sum(x)
  
  # calculate share
  share <- x*100/total
  
  # add
  return(sum(share^2))
  
}

So, for example, county 1 has a HHI of 9608 (= 98^2 + 2^2) and county 2 has a HHI of 7127.

But how can I create a new column with the HHI of that county?

akuiper · Accepted Answer · 2016-09-15 15:27:00Z

6

You can use dplyr:

library(dplyr)
df %>% group_by(County) %>% mutate(HHI = sum((Employees/sum(Employees) * 100)^2))

# Source: local data frame [4 x 4]
# Groups: County [2]

#   County Industry Employees      HHI
#   <fctr>    <int>     <int>    <dbl>
# 1      a        1        50 9615.532
# 2      a        2         1 9615.532
# 3      b        1         4 7126.654
# 4      b        2        19 7126.654

Or equivalently, use data.table:

setDT(df)[, HHI := sum((Employees/sum(Employees) * 100)^2), County][]

With your own customized function hhi, since all the functions it calls are vectorized, you can directly use it with mutate:

df %>% group_by(County) %>% mutate(HHI = hhi(Employees))

or:

setDT(df)[, HHI := hhi(Employees), County][]

answered Sep 15, 2016 at 15:27

akuiper

216k33 gold badges362 silver badges379 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

wwl Over a year ago

How do I ignore NA values using the first approach?

akuiper Over a year ago

You can specify na.rm = T for the sum functions.

akrun · Accepted Answer · 2016-09-15 15:30:28Z

3

We can use ave from base R (no packages used)

df1$HHI <- with(df1, ave(Employees, County, FUN = hhi))

answered Sep 15, 2016 at 15:30

akrun

891k38 gold badges590 silver badges700 bronze badges

Collectives™ on Stack Overflow

Creating a new column based on function

2 Answers 2

2 Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

2 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related