7

My last few posts have been written poorly, so I will attempt to do a better and cleaner job this time.

I am learning how to work with data.table objects, and one task I am struggling with is updating values in a data.table by row number and column name at the same time. With a data.frame this is straightforward; I just do the following:

my_df = as.data.frame(matrix(ncol = 10, nrow = (100)))
names(my_df) = c("P1", "P2", "P3", "P4", "P5", "Q1", "Q2", "Q3", "Q4", "Q5")
head(my_df)

  P1 P2 P3 P4 P5 Q1 Q2 Q3 Q4 Q5
1 NA NA NA NA NA NA NA NA NA NA
2 NA NA NA NA NA NA NA NA NA NA
3 NA NA NA NA NA NA NA NA NA NA
4 NA NA NA NA NA NA NA NA NA NA
5 NA NA NA NA NA NA NA NA NA NA
6 NA NA NA NA NA NA NA NA NA NA

replacement = c(1, 2, 3, 4, 5)
my_df[2, names(my_df)[1:5]] = replacement
head(my_df)

  P1 P2 P3 P4 P5 Q1 Q2 Q3 Q4 Q5
1 NA NA NA NA NA NA NA NA NA NA
2  1  2  3  4  5 NA NA NA NA NA
3 NA NA NA NA NA NA NA NA NA NA
4 NA NA NA NA NA NA NA NA NA NA
5 NA NA NA NA NA NA NA NA NA NA
6 NA NA NA NA NA NA NA NA NA NA

So this is fairly easy with a data.frame. However, I am struggling with the exact same task on a data.table. Using the same structure as the data.frame example above, I've tried the following:

library(data.table)
my_dt = data.table(matrix(ncol = 10, nrow = 100))
names(my_dt) = c("P1", "P2", "P3", "P4", "P5", "Q1", "Q2", "Q3", "Q4", "Q5")
head(my_dt)

   P1 P2 P3 P4 P5 Q1 Q2 Q3 Q4 Q5
1: NA NA NA NA NA NA NA NA NA NA
2: NA NA NA NA NA NA NA NA NA NA
3: NA NA NA NA NA NA NA NA NA NA
4: NA NA NA NA NA NA NA NA NA NA
5: NA NA NA NA NA NA NA NA NA NA
6: NA NA NA NA NA NA NA NA NA NA

replacement = c(1, 2, 3, 4, 5)
# my_dt[i == 2, names(my_dt)[1:5]] = replacement
# my_dt[i == 2, names(my_dt)[1:5] := replacement]  
# my_dt[2, names(my_dt)[1:5]] = replacement
# my_dt[2, names(my_dt)[1:5] := replacement]  

However, none of the four commented lines performed the correct substitution. I'd appreciate any help!

Thanks, Canovice

3
  • Probably not what you are looking for, but how about writing your stuff into a data frame and then converting it to a data table? Commented Aug 9, 2016 at 15:05
  • I want to use data.table to speed up my code; performance is very important for this project. I frequently have to access subsets of this df / dt and update their values, and I thought accessing subsets of a dt was faster than accessing subsets of a df. That's why I don't want to write into a data.frame. Commented Aug 9, 2016 at 15:08
  • You shouldn't need this. That you do makes me suspect that you should use a sparse matrix instead of a data.table. Commented Aug 9, 2016 at 16:49

2 Answers

8

Or you can do this:

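# Names of the columns to update
x <- names(my_dt)[1:5]

# Convert those columns from logical to numeric, by reference
my_dt[, (x) := lapply(.SD, as.numeric), .SDcols = x]

# Assign one value per column in row 2
my_dt[2, (x) := as.list(replacement)]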

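First we convert the target columns in my_dt to numeric. This is needed because a matrix of NA produces logical columns, and assigning to a subset of rows with := does not change a column's type. .SD holds all the columns in the data.table (except the ones used in by), and .SDcols restricts .SD to the subset of columns we are interested in.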

Once we convert the target columns to numeric, we update the values by reference.

Note: It is not necessary to define x beforehand; everything can be done on the fly. However, if you do define x, you need to wrap it in () on the left-hand side of := so that data.table uses the names stored in x instead of looking for a column called x.
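As an end-to-end illustration of the steps above, here is a minimal sketch, assuming the data.table package is installed. It rebuilds the table from the question, uses setnames() instead of names<- (my choice, not something the answer requires), and checks that the NAs outside row 2 are left alone.

```r
library(data.table)

# Same setup as in the question: a matrix of NA is logical, so every column is logical
my_dt <- data.table(matrix(ncol = 10, nrow = 100))
setnames(my_dt, c(paste0("P", 1:5), paste0("Q", 1:5)))

x <- names(my_dt)[1:5]
replacement <- c(1, 2, 3, 4, 5)

# Convert only the target columns to numeric; NA values stay NA
my_dt[, (x) := lapply(.SD, as.numeric), .SDcols = x]

# as.list() gives one list element per column named in x
my_dt[2, (x) := as.list(replacement)]

sapply(my_dt, class)  # P1..P5 are numeric, Q1..Q5 are still logical
head(my_dt, 3)
```

The untouched Q columns stay logical here; convert them the same way if they will later hold numbers.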


2 Comments

Yes, thanks! I like both this and mkt's solution, but I will use yours because it is important that I initialize with NAs and keep track of NAs in my data.table. Thank you.
Last follow-up, thanks again. Can I update values by reference over multiple rows at a time? For example, if I wanted to update rows 2:5, columns (x), all with the replacement object? Yes I can! Okay, thanks again.
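For the multi-row case raised in the comment above, a minimal sketch might look like this. It assumes data.table is installed and, as a variation of my own, creates the table with NA_real_ so the columns are numeric from the start and no conversion step is needed.

```r
library(data.table)

# Numeric NA columns from the start, so no type conversion is needed afterwards
my_dt <- as.data.table(matrix(NA_real_, nrow = 100, ncol = 10))
setnames(my_dt, c(paste0("P", 1:5), paste0("Q", 1:5)))

x <- names(my_dt)[1:5]
replacement <- c(1, 2, 3, 4, 5)

# Each list element has length 1, so it is recycled down rows 2 to 5 of its column
my_dt[2:5, (x) := as.list(replacement)]

head(my_dt)
```

Rows 2 to 5 all receive the same five values; rows 1 and 6 onwards remain NA.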
4

Made a couple of small changes to your example, but this works:

library(data.table)

# Filled the data.table with integers instead of NAs to avoid converting
# from logical later
# Left out names as they weren't relevant to the example
my_dt = as.data.table(matrix(1L, nrow = 100, ncol = 10))
head(my_dt)

replacement <- 1:5
# Loop through columns and use set() to replace values without making a copy
for (k in 1:5) set(my_dt, i = 2L, j = k, value = replacement[k])
head(my_dt)
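If the NAs and the column names from the question matter, the same set() idea can be sketched as follows. This assumes data.table is installed; the NA_real_ initialization and the variable name cols are my own additions for illustration.

```r
library(data.table)

# Start from numeric NAs and the column names used in the question
my_dt <- as.data.table(matrix(NA_real_, nrow = 100, ncol = 10))
setnames(my_dt, c(paste0("P", 1:5), paste0("Q", 1:5)))

replacement <- c(1, 2, 3, 4, 5)
cols <- names(my_dt)[1:5]  # hypothetical helper name

# One set() call per column, addressing columns by name instead of position
for (k in seq_along(cols)) {
  set(my_dt, i = 2L, j = cols[k], value = replacement[k])
}

# set() can also take several columns at once when value is a list
set(my_dt, i = 3L, j = cols, value = as.list(replacement * 10))

head(my_dt, 4)
```

The loop form mirrors the answer; the single-call form avoids the loop at the cost of building a list first.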

2 Comments

Thanks! Do you think the issue may have been that I was initializing my data.table with all NAs?
No, that was not the only problem. I did that mainly for convenience in my response (it wasn't clear that NAs were important in your example). Your code was also trying to assign all 5 replacement values to each single cell (one row of one column) instead of one value per column.
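The difference between passing a plain vector and a list on the right-hand side of := can be sketched like this. It assumes data.table is installed; whether the vector form surfaces as a warning or an error may depend on the data.table version, so the sketch only captures and prints whatever condition is signalled, and runs that attempt on a copy.

```r
library(data.table)

my_dt <- as.data.table(matrix(NA_real_, nrow = 100, ncol = 10))
setnames(my_dt, c(paste0("P", 1:5), paste0("Q", 1:5)))
x <- names(my_dt)[1:5]
replacement <- c(1, 2, 3, 4, 5)

# Plain vector: all five values are offered to row 2 of each column.
# Run on a copy and capture whatever condition is signalled.
attempt <- tryCatch(
  {
    copy(my_dt)[2, (x) := replacement]
    "no condition signalled"
  },
  warning = function(w) conditionMessage(w),
  error = function(e) conditionMessage(e)
)
print(attempt)

# List: element k goes to column k, one value per cell
my_dt[2, (x) := as.list(replacement)]
head(my_dt, 3)
```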
