I'm looking to load and process a CSV file with seven variables: one is a grouping variable / factor (data$hashtag) and six are categories (data$support and others), each marked with either an "X" or "x" (or left blank).
data <- read.csv("maet_coded_tweets.csv", stringsAsFactors = FALSE)
names(data) <- c("hashtag", "support", "contributeConversation", "otherCommunities", "buildCommunity", "engageConversation", "unclear")
str(data)
'data.frame': 854 obs. of 7 variables:
$ hashtag : chr "#capstoneisfun" "#capstoneisfun" "#capstoneisfun" "#capstoneisfun" ...
$ support : chr "x" "x" "x" "x" ...
$ contributeConversation: chr "" "" "" "" ...
$ otherCommunities : chr "" "" "" "" ...
$ buildCommunity : chr "" "" "" "" ...
$ engageConversation : chr "" "" "" "" ...
$ unclear : chr "" "" "" "" ...
When I use a function to recode "X" or "x" to 1 and "" (blank) to 0, the columns end up as character type, not numeric as intended.
recode <- function(x) {
x[x=="x"] <- 1
x[x=="X"] <- 1
x[x==""] <- 0
x
}
data[] <- lapply(data, recode)
str(data)
'data.frame': 854 obs. of 7 variables:
$ hashtag : chr "#capstoneisfun" "#capstoneisfun" "#capstoneisfun" "#capstoneisfun" ...
$ support : chr "1" "1" "1" "1" ...
$ contributeConversation: chr "0" "0" "0" "0" ...
$ otherCommunities : chr "0" "0" "0" "0" ...
$ buildCommunity : chr "0" "0" "0" "0" ...
$ engageConversation : chr "0" "0" "0" "0" ...
$ unclear : chr "0" "0" "0" "0" ...
When I tried to coerce the values using as.numeric() inside the function, it still didn't work. What gives: why are the variables treated as characters, and how do I convert character variables to numeric?
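A small sketch of why this happens, using a hypothetical toy vector `v` rather than the real data: an atomic vector in R holds a single type, so a number assigned into a character vector is stored as a string, whether or not it is wrapped in as.numeric().

```r
# Hypothetical toy vector standing in for one coded column
v <- c("x", "X", "")

# The right-hand side is numeric, but v is character, so on assignment
# the value is coerced to the string "1" to match the rest of the vector
v[v == "x"] <- as.numeric(1)

class(v)  # still "character"
v         # the first element is now the string "1"
```

The conversion therefore has to be applied to the whole vector after the replacements, not to the individual values being assigned.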
as.numeric() in the function?
recode <- function(x) { x[x=="x"] <- as.numeric(1); x[x=="X"] <- as.numeric(1); x[x==""] <- as.numeric(0); x }
return(as.numeric(x)). As I said in my previous comment, the way you did that still forces conversion to character. Or you could do res <- ifelse(x %in% c("x","X"), 1, 0)
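The suggestion in the comments can be sketched as follows. The data frame `toy`, the helper `recode_numeric`, and the vector `coded` are hypothetical names used for illustration, and the values are made up to mirror the structure of the question's data. The sketch assumes that anything other than "x" or "X" should become 0.

```r
# Hypothetical toy data with the same kind of columns as in the question
toy <- data.frame(
  hashtag = c("#a", "#a", "#b"),
  support = c("x", "X", ""),
  unclear = c("", "x", ""),
  stringsAsFactors = FALSE
)

# Build a new numeric vector instead of assigning into the character one:
# %in% gives TRUE/FALSE, and as.numeric() turns that into 1/0
recode_numeric <- function(x) as.numeric(x %in% c("x", "X"))

# Recode only the coded columns and leave the grouping column untouched
coded <- setdiff(names(toy), "hashtag")
toy[coded] <- lapply(toy[coded], recode_numeric)

str(toy)
```

Restricting the recode to the coded columns matters here: calling as.numeric() on the hashtag strings would turn them into NA with a warning.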