1

With my dataset "chile2", I am doing following calculations:

Tab<-with(chile2,table(Q27,Q12_1_TEXT))
Tab<-as.data.frame.matrix(Tab)

I want the output like below (only 4 columns are shown):

I am trying below code to do manipulations to the columns, which is strangely calculating few columns but not all:

 for(col in names(T1)){
    T1[col]=(T1[col]*100)/colSums(T1)
 }

  for(t in names(T1)){
    T1[t]=paste(round(T1[t],0),"%")
  }

The "All" row can be added after this.

Any other better way to do this?

dput(chile2)

structure(list(Q27 = structure(c(2L, 1L, 2L, 2L, 1L, 2L, 2L, 
2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L), .Label = c("Si", 
"No"), class = "factor"), Q12_1_TEXT = c("Abertis Autopistas S.a.", 
"Accenture", "Adessa Falabella", "Administradora de Fondos de Cesantía", 
"AES GENER S.A.", "AFP HABITAT S.A.", "Agrícola Ariztía", "Agrosuper S.A.", 
"Aguas de Antofagasta S.A.", "AIG Chile Compañía de Seguros Generales S.A.", 
"Arcos Dorados Restaurantes de Chile Ltda", "Artel", "Arval - Relsa", 
"Banchile Inversiones", "BANCO INTERNACIONAL", "BASF CHILE SA", 
"BBVA Chile", "BOEHRINGER INGELHEIM LTDA", "Bredenmaster S.A.", 
"Caja de Compensación 18 de Septiembre")), class = c("data.table", 
"data.frame"), row.names = c(NA, -20L), .internal.selfref = <pointer: 0x0000000000100788>, .Names = c("Q27", 
"Q12_1_TEXT"))
1
  • You may need addmargins(100*prop.table(Tab, 2), 1) Commented Feb 13, 2018 at 10:32

2 Answers 2

1

dcast the chile2 data to get the length by Q27 variable. Then find the proportions using prop.table and append it with '%' character using paste0()

library('data.table')
Tab <- dcast(chile2, Q27 ~ Q12_1_TEXT, fun.aggregate = length, value.var = 'Q12_1_TEXT')
Tab[, 2:ncol(Tab) := lapply(.SD, function(x) paste0( prop.table(x)*100, "%")), .SDcols = -1 ]
Tab
#    Q27 AES GENER S.A. AFP HABITAT S.A. AIG Chile Compañía de Seguros Generales S.A. Abertis Autopistas S.a.
# 1:  Si           100%               0%                                         100%                      0%
# 2:  No             0%             100%                                           0%                    100%

Now you can add All row.


EDIT:

I converted Q27 column of Tab data table from factor to character so it is consistent. Then, I added Total column which takes row sum from column 2 to last column. The .SDcols = -1 includes all columns except the first column, because it is of character type. We want to work with numeric columns when making row sums. Then I created the ALL row, so that it can be rbind later. Then the proportions with in the data is computed and % character is added to it. Finally, the Tab and ALL data are combined together using rbindlist.

library('data.table')
Tab <- dcast(chile2, Q27 ~ Q12_1_TEXT, fun.aggregate = length, value.var = 'Q12_1_TEXT')
Tab[, Q27 := as.character(Q27)]  # convert column one from factor to character
Tab[, Total := rowSums(.SD), .SDcols = -1 ]  # add Total column
ALL <- as.list( c( Q27 = "ALL", colSums(Tab[, 2:ncol(Tab)]) ) )  # create ALL row
Tab[, 2:ncol(Tab) := lapply(.SD, function(x) paste0( prop.table(x)*100, "%")), .SDcols = -1 ]  # find proportions
Tab <- rbindlist(l = list(Tab, ALL)) # row bind Tab and ALL

Output:

str(Tab)
# Classes ‘data.table’ and 'data.frame':    3 obs. of  22 variables:
# $ Q27                                         : chr  "Si" "No" "ALL"
# $ AES GENER S.A.                              : chr  "100%" "0%" "1"
# $ AFP HABITAT S.A.                            : chr  "0%" "100%" "1"
# $ AIG Chile Compañía de Seguros Generales S.A.: chr  "100%" "0%" "1"
# $ Abertis Autopistas S.a.                     : chr  "0%" "100%" "1"
# $ Accenture                                   : chr  "100%" "0%" "1"
# $ Adessa Falabella                            : chr  "0%" "100%" "1"
# $ Administradora de Fondos de Cesantía        : chr  "0%" "100%" "1"
# $ Agrosuper S.A.                              : chr  "0%" "100%" "1"
# $ Agrícola Ariztía                            : chr  "0%" "100%" "1"
# $ Aguas de Antofagasta S.A.                   : chr  "0%" "100%" "1"
# $ Arcos Dorados Restaurantes de Chile Ltda    : chr  "0%" "100%" "1"
# $ Artel                                       : chr  "100%" "0%" "1"
# $ Arval - Relsa                               : chr  "0%" "100%" "1"
# $ BANCO INTERNACIONAL                         : chr  "100%" "0%" "1"
# $ BASF CHILE SA                               : chr  "0%" "100%" "1"
# $ BBVA Chile                                  : chr  "100%" "0%" "1"
# $ BOEHRINGER INGELHEIM LTDA                   : chr  "100%" "0%" "1"
# $ Banchile Inversiones                        : chr  "0%" "100%" "1"
# $ Bredenmaster S.A.                           : chr  "0%" "100%" "1"
# $ Caja de Compensación 18 de Septiembre       : chr  "0%" "100%" "1"
# $ Total                                       : chr  "35%" "65%" "20"
# - attr(*, ".internal.selfref")=<externalptr> 
Sign up to request clarification or add additional context in comments.

6 Comments

This one's working, I am trying now to understand how it works, thanks
If I add a total column to Tab with - Tab2<-Tab%>%mutate("Total"=rowSums(Tab[,2:ncol(Tab)])), then the code you suggested s giving error: Error in [.data.frame(Tab2, , :=(2:ncol(Tab2), lapply(.SD, function(x) paste0(prop.table(x) * : unused argument (.SDcols = -1)
I think its because Tab2 is not data table
The output I shown in the question is what I want. Your code works fine . data table issue resolved. But adding "All" row using rbind is not working after this.rbind(Tab2,c("All",colSums(Tab2[,2:ncol(Tab2)])))
Yes, I added code to convert Q27 to character then the old code also working. Is it because rbind works like this?
|
1

We can do this with base R using addmargins and prop.table. From the table output in the OP's post, apply prop.table with margin specified as 2 (for columnwise). Create the Sum row with addmargins. Then, we multiply the rows except the last one with 100 and paste the % to the values

res <- addmargins(prop.table(Tab, 2), 1)
res[-nrow(res),] <- paste0(res[-nrow(res), ] * 100, "%")
dimnames(res)[[1]][3] <- "All"
names(dimnames(res)) <- NULL
res[,1:3]
#   Abertis Autopistas S.a. Accenture Adessa Falabella
#Si  0%                      100%      0%              
#No  100%                    0%        100%            
#All 1                       1         1           

2 Comments

This is giving error - Error in margin.table(x, margin) : 'x' is not an array
@SandhyaGhildiyal I was using only your Tab output based on the 'chile' data you provided. It is not giving me any problem

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.