The logic is similar to a content-based recommender. My data looks like this:

content  undesirable  desirable  user_1  ...  user_10
1        3.00         2.77       0.11    ...  NA
...
5000     2.50         2.11       NA      ...  0.12

I need to fit a model with undesirable and desirable as the independent variables and each user column as the dependent variable, so I need to fit the model 10 times and predict each user's NA values.

This is the code I have hard-coded, but I wonder how to do the same with a for loop. I searched for several methods, but none of them worked for me.

The data frame is called test.

Hard-coded version:

#fit model
fit_1 = lm(user_1 ~ undesirable + desirable, data = test)
...
fit_10 = lm(user_10 ~ undesirable + desirable, data = test)

#prediction
u_1_na = test[is.na(test$user_1), c('user_1', 'undesirable', 'desirable')]
result1 = predict(fit_1, newdata = u_1_na)
which(result1 == max(result1))
max(result1)
...
u_10_na = test[is.na(test$user_10), c('user_10', 'undesirable', 'desirable')]
result10 = predict(fit_10, newdata = u_10_na)
which(result10 == max(result10))
max(result10)

#write to csv file
#(write each user's maximum predicted value to a CSV file)

This is what I have tried so far (for loop):

mod_summaries <- list() 

for(i in 1:10) {                 
  
  predictors_i <- colnames(data)[1:10]   
  mod_summaries[[i - 1]] <- summary(     
    lm(predictors_i ~ ., test[ , c("undesirable", 'desirable')]))
  
}
  • Create the formulas as strings, formulas <- paste0("user_", 1:10, " ~ undesirable + desirable"), then iterate over them to create the regression models: models <- lapply(formulas, \(x) lm(as.formula(x), data = test)) Commented Nov 14, 2022 at 15:32
  • R is a 1-indexed language, so there is no need to subtract 1 from i (i - 1) as 0-indexed languages like Python require. Simply refer to i. Commented Nov 14, 2022 at 15:42
  • @M.Viking "as 0-indexed languages like Python require": err, this isn't really required in Python either, since virtually every iteration there starts at 0, not at 1. Commented Nov 14, 2022 at 15:50
  • @Oliver, yes, this works for me, but I still need to use the models to predict the NA values for each user. Commented Nov 14, 2022 at 16:39

3 Answers

An apply method:

mod_summaries_lapply <-
  lapply(
    colnames(mtcars),
    FUN = function(x)
      summary(lm(reformulate(".", response = x), data = mtcars))
  )

A for loop method to make linear models for each column. The key is the reformulate() function, which creates the formula from strings. In the question, the formula is built with a character vector in place of a variable name, which results in the error invalid term in model formula. The string has to be converted into a formula first, for example with reformulate() or as.formula(). This example uses the mtcars dataset.

mod_summaries <- list() 
for(i in 1:11) {                 
  predictors_i <- colnames(mtcars)[i]   
  mod_summaries[[i]] <- summary(lm(reformulate(".", response = predictors_i), data=mtcars))
  #summary(lm(reformulate(". -1", response = predictors_i), data=mtcars))  # -1 to exclude intercept
  #summary(lm(as.formula(paste(predictors_i, "~ .")), data=mtcars)) # a "paste as formula" method
}
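
Applied to the question's setup, the same reformulate() pattern could be extended to the prediction step. The following is only a sketch: the data frame below is a randomly generated stand-in for test (the real data is not shown in the question), and the names user_cols, top_picks and out_file are made up for illustration.

```r
# Synthetic stand-in for the question's `test` data (assumed layout, random values).
set.seed(1)
n <- 50
test <- data.frame(
  content     = seq_len(n),
  undesirable = runif(n, 1, 5),
  desirable   = runif(n, 1, 5)
)
user_cols <- paste0("user_", 1:10)
for (u in user_cols) {
  ratings <- runif(n)
  ratings[sample(n, 10)] <- NA      # unrated items that need a prediction
  test[[u]] <- ratings
}

results <- vector("list", length(user_cols))
for (i in seq_along(user_cols)) {
  u <- user_cols[i]
  # Builds user_i ~ undesirable + desirable; lm() drops the NA rows when fitting.
  fit <- lm(reformulate(c("undesirable", "desirable"), response = u), data = test)
  unrated <- test[is.na(test[[u]]), ]
  pred <- predict(fit, newdata = unrated)
  best <- which.max(pred)           # first maximum only; which(pred == max(pred)) keeps ties
  results[[i]] <- data.frame(
    user           = u,
    content        = unrated$content[best],
    max_prediction = pred[[best]]
  )
}

top_picks <- do.call(rbind, results)
out_file <- file.path(tempdir(), "top_picks.csv")   # hypothetical output location
write.csv(top_picks, out_file, row.names = FALSE)
```

Each row of top_picks then holds one user, the content id with the highest predicted value among that user's unrated items, and the prediction itself.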

You could use the function as.formula together with the paste function to create your formula. The following is an example:

formula_lm <- as.formula(
    paste(response_var, 
          paste(expl_var, collapse = " + "), 
          sep = " ~ "))

This implies that you have more than one explanatory variable (separated in the paste with +). If you only have one, omit the second paste.

With the created formula, you can use the lm function like this:

lm(formula_lm, data)

Edit: the vector expl_var would in your case include the undesirable and desirable variable.
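
As a minimal sketch of how the pieces fit together with the question's column names: the six rows below are made-up values used only so the example runs, not the asker's data.

```r
# Made-up stand-in data with the question's column names.
test <- data.frame(
  undesirable = c(3.0, 2.5, 1.2, 4.1, 3.3, 2.0),
  desirable   = c(2.77, 2.11, 4.0, 1.5, 3.9, 2.2),
  user_1      = c(0.11, 0.30, 0.85, 0.05, 0.60, 0.42)
)

response_var <- "user_1"
expl_var <- c("undesirable", "desirable")

# Paste the pieces into "user_1 ~ undesirable + desirable" and convert to a formula.
formula_lm <- as.formula(
    paste(response_var,
          paste(expl_var, collapse = " + "),
          sep = " ~ "))

fit <- lm(formula_lm, data = test)
coef(fit)
```

Changing response_var inside a loop over the user columns would give one model per user.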

2 Comments

While this is possible and fairly common, it's actually quite convoluted. R has a conceptually simpler and much more elegant way of dynamically creating formulas, namely by interpolating a variable into an unevaluated expression, e.g.: eval(bquote(.(response_var) ~ .)).
Thanks Konrad, learned something new! I did not know that the function bquote existed, although I think that the readability suffers a bit.
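
A small sketch of the bquote() idea from the comment above, using mtcars. One assumption here: response_var holds the column name as a string, so it is converted to a symbol with as.name() before being interpolated; a bare string on the left-hand side would not refer to the column.

```r
response_var <- "mpg"

# .() splices the symbol `mpg` into the unevaluated expression `mpg ~ .`;
# eval() then turns that expression into a formula object.
f <- eval(bquote(.(as.name(response_var)) ~ .))

fit <- lm(f, data = mtcars)
summary(fit)
```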

Avoid the loop. Make your data tidy. Something like:

library(tidyverse)

test %>%
  select(-content) %>%
  pivot_longer(
    starts_with("user"),
    names_to="user",
    values_to="value"
  ) %>%
  group_by(user) %>%
  group_map(
    function(.x, .y) {
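      # After pivot_longer() the response is `value`; `user` is the grouping
      # column and is not part of .x inside group_map().
      summary(lm(value ~ ., data=.x))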
    }
  )

Untested code since your example is not reproducible.
