I have two data frames, cont and expression:

> head(cont)
                    old_pert     cmap_name       conc   perturb_geo        t1        t2        t3        t4        t5
1 5202764005789148112904.A02     estradiol 0.00000001 GSM119257 GSM119218 GSM119219 GSM119221 GSM119222 GSM119223
2 5202764005789148112904.A01 valproic acid 0.00050000 GSM119256 GSM119218 GSM119219 GSM119221 GSM119222 GSM119223

> head(expression)[1:3,1:8]
          GSM118911 GSM118912 GSM118913 GSM118723 GSM118724 GSM118725 GSM118726 GSM118727
1007_s_at     387.6     393.2     290.5     378.6     507.8     383.7     288.8     451.9
1053_at        56.4      53.5      32.8      39.0      71.5      47.3      46.0      50.1
117_at          6.3      33.6      19.2      17.6      20.3      15.0       7.1      43.1

I want to loop over the rows of cont like this:

for(i in 1:nrow(cont)){

  # first take some values from cont which will be used below

  vehicle <- cont[i, 5:9]
  perturb <- cont[i, 4]
  col_name <- paste(cont[i, 2], cont[i, 3], sep = '_') #estradiol_.00001
  tmp <- sum(expression[,which(colnames(expression) == vehicle)])/5
  tmp2 <- expression[,which(colnames(expression) == perturb)]
  tmp3 <- tmp/tmp2
  div <- cbind(div, tmp3)
  colnames(div)[i + 1] <- col_name
}

Take the columns of expression whose names match vehicle and perturb, and divide one by the other:

div <- expression$vehicle / expression$perturb # I don't see how to pass the values held in `vehicle` and `perturb` here

Give this new column a name that combines drug_name and concentration:

col.names(div) <- drug_name_concentration

Assign it the row.names of expression:

row.names(div) <- row.names(expression)

This process iterates 271 times (nrow(cont) = 271), and each time a new divided column is cbind-ed to the previous div. The final outcome should therefore be:

                arachidonic acid_0.000010     oligomycin_0.000001 .........
1007_s_at            0.45                      0.30
1053_at              1.34                      0.65
117_at               0.11                      0.67
.....
.....

The logic is clear in my head, but I can't work out how to implement it. Thanks for your help.

1 Answer

You are not assigning the variables correctly in the loop. Below is a sample loop that goes over each row of cont and assigns the variables: on the first pass i == 1, on the second i == 2, and so on. Note that I have changed how the column name is generated.

for(i in 1:nrow(cont)){
  vehicle <- cont[i, 3]
  perturb <- cont[i, 4]
  col_name <- paste(cont[i, 5], cont[i, 6], sep = '_')
}

To select the columns whose names match these variables, you can use:

df[,which(colnames(df) == x)]

where df is your data frame and x is the variable holding the column name.
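
As a minimal sketch of that lookup, assuming a toy data frame df and a variable x (both invented for illustration, not the asker's data): indexing by the name directly should give the same column as the which() form. It also shows why expression$vehicle in the question cannot work, since $ looks for a column literally called "vehicle" rather than the name stored in the variable.

```r
# Toy data frame standing in for `expression` (values are made up)
df <- data.frame(GSM1 = c(10, 20), GSM2 = c(5, 4),
                 row.names = c("probe_a", "probe_b"))
x <- "GSM2"   # column name held in a variable

by_which <- df[, which(colnames(df) == x)]  # form used in this answer
by_name  <- df[, x]                         # indexing by name directly
by_list  <- df[[x]]                         # list-style extraction

# All three return the same numeric vector
same <- identical(by_which, by_name) && identical(by_name, by_list)

# `$` does not evaluate the variable: there is no column named "x"
missing_col <- is.null(df$x)
```

The shorter df[, x] or df[[x]] forms are usually enough when x holds exactly one existing column name.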

Therefore,

div <- data.frame(row.names(expression))  # placeholder first column, dropped after the loop
for(i in 1:nrow(cont)){
  vehicle <- cont[i, 3]
  perturb <- cont[i, 4]
  col_name <- paste(cont[i, 5], cont[i, 6], sep = '_')

  # divide the vehicle column by the perturb column, row by row
  tmp <- expression[,which(colnames(expression) == vehicle)]/
    expression[,which(colnames(expression) == perturb)]

  div <- cbind(div, tmp)
  colnames(div)[i + 1] <- col_name  # + 1 skips the placeholder column
}

div <- div[, -1, drop = FALSE]  # drop the placeholder column
row.names(div) <- row.names(expression)

The loop goes through each row of cont, assigns the values to the variables, finds the matching columns in expression and divides one resulting vector by the other.

It then binds the result by column to the div data frame, which was created before the loop from the row names of expression.

Finally, it renames the new column. Once the loop has finished, the first column, which holds the now redundant row names, is dropped and the row names of expression are assigned to div.

EDIT - the question changed: vehicle now spans five columns, so two lines of the loop need changing

change #1

vehicle <- cont[i, 5:9]

to

vehicle <- cont[i, c(5:9)] ## note c()

change #2

tmp <- sum(expression[,which(colnames(expression) == vehicle)])/5

to

tmp <- sum(expression[,which(colnames(expression) %in% vehicle)])/5
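
To illustrate why the comparison operator matters here, a small sketch, assuming the vehicle sample names have already been flattened into a plain character vector (the names and numbers below are invented). With several names, == compares element by element with recycling, whereas %in% tests membership. Note that c(5:9) and 5:9 are the same index vector, so the effective change is the switch to %in%. The second half shows that sum() collapses a whole block of columns to one number, while rowSums() keeps one value per row.

```r
cols    <- c("A", "B", "C", "D", "E", "F")   # stand-in for colnames(expression)
vehicle <- c("B", "D", "F")                  # hypothetical vehicle sample names

eq_match <- cols == vehicle     # pairs A-B, B-D, C-F, D-B, E-D, F-F (recycled)
in_match <- cols %in% vehicle   # TRUE wherever the column name is one of vehicle

same_index <- identical(5:9, c(5:9))   # wrapping in c() changes nothing

# sum() versus rowSums() on a block of two selected columns
m <- data.frame(B = c(1, 2), D = c(3, 4))
overall_mean <- sum(m) / 2        # single number for the whole block
row_means    <- rowSums(m) / 2    # one mean per row
```

Here eq_match is TRUE only at the last position, while in_match picks out B, D and F as intended.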

FINAL EDIT

Full working loop (it uses rowSums() instead of sum(), so the vehicle mean is computed per row):

div <- data.frame(row.names(expression))  # placeholder column, dropped after the loop
for(i in 1:nrow(cont)){

  perturb <- cont[i, 4]
  col_name <- paste(cont[i, 2], cont[i, 3], sep = '_')
  vehicle <- cont[i, c(5:9)]
  vehicle <- unname(unlist(vehicle[1,]))  # one-row data frame -> plain vector of sample names
  tmp <- expression[,which(colnames(expression) %in% vehicle)]
  row_tots <- as.data.frame(rowSums(tmp))
  row_tots <- row_tots/5  # per-row mean of the five vehicle samples

  tmp <- row_tots/expression[,which(colnames(expression) == perturb)]
  div <- cbind(div, tmp)
  colnames(div)[i + 1] <- col_name
}
div <- div[, -1, drop = FALSE]  # drop the placeholder column
row.names(div) <- row.names(expression)
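
An alternative sketch, not the code from this answer, that avoids growing div with cbind() inside the loop: compute each ratio column with sapply() and assemble the result once. The toy cont and expression below are invented stand-ins that only mirror the column layout in the question (name in column 2, concentration in column 3, perturbation sample in column 4, vehicle samples from column 5 onwards), with two vehicle columns instead of five. It uses rowMeans(), which divides by the number of selected columns rather than a hard-coded 5.

```r
# Invented stand-ins for the asker's data (values are made up)
expression <- data.frame(
  S1 = c(2, 4), S2 = c(4, 8), S3 = c(1, 2), S4 = c(3, 3),
  row.names = c("probe_a", "probe_b")
)
cont <- data.frame(
  old_pert    = c("x1", "x2"),
  cmap_name   = c("drugA", "drugB"),
  conc        = c(0.1, 0.5),
  perturb_geo = c("S3", "S4"),
  t1          = c("S1", "S1"),
  t2          = c("S2", "S2"),
  stringsAsFactors = FALSE
)
vehicle_cols <- 5:6   # assumption: would be 5:9 for the layout in the question

ratios <- sapply(seq_len(nrow(cont)), function(i) {
  vehicle <- unlist(cont[i, vehicle_cols])   # character vector of sample names
  perturb <- cont[i, "perturb_geo"]
  # per-probe mean of the vehicle samples, divided by the perturbed sample
  rowMeans(expression[, vehicle, drop = FALSE]) / expression[[perturb]]
})

# One column per row of cont, one row per probe
div <- as.data.frame(ratios, row.names = rownames(expression))
colnames(div) <- paste(cont$cmap_name, cont$conc, sep = "_")
div
```

Because the column names are assigned in one step at the end, duplicate drug and concentration pairs keep identical names here; whether that is desirable depends on how the result is used afterwards.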

10 Comments

Thanks a bundle. It worked.. I was wondering how this thing is working: In some cases the col_name <- paste(cont[i, 5], cont[i, 6], sep = '_') had the same name for 2 instances and this code handled it by giving names "metformin_0.00001" and "metformin_0.00001.1". Can you explain why and how it happened?
You could try creating an empty vector with col_names <- c() and then, within the loop, col_names <- c(col_names, paste(cont[i, 5], cont[i, 6], sep = '_')). Remove the other col_name assignment from the loop. Then, after the loop and after div <- div[,-1], assign the column names via colnames(div) <- col_names
Ok, thanks. Can you tell me a possible solution for the situation where perturb contains more than one column and I want to take perturb = sum of columns / no. of columns and then divide control / perturb?
It depends on whether you know how many columns there are going to be. If that varies, you will probably want to write a function to deal with it. The code above deals with a known number of columns. That detail aside, in df[,which(colnames(df) == x)] you can use the OR operator | so that it becomes df[,which(colnames(df) == x | colnames(df) == y)]. You could even wrap that in the sum()/nrow() functions to get the value out. However, that will give you a single value, which I'm guessing is the point, as you want the mean.
Now in every case I have to take 5 columns for vehicle (that I'm doing by: vehicle <- cont[i, 5:9]), sum their values and divide them by 5: It will be the vehicle (that I'm doing by: tmp <- sum(expression[,which(colnames(expression) == vehicle)])/5) but it is not working. @amwill04