I have a function that a user suggested as an answer to my previous question:

word_string <- function(x) {
  inds <- seq_len(nchar(x))
  start = inds[-length(inds)]
  stop = inds[-1]
  substring(x, start, stop)
}

The function works as expected and breaks a given word down into its component parts, as per my specifications:

 word_string('microwave')
[1] "mi" "ic" "cr" "ro" "ow" "wa" "av" "ve"

What I now want to do is apply the function to every row of a specified column in a dataframe.

Here's a dataframe for purposes of illustration:

word <- c("House", "Motorcar", "Boat", "Dog", "Tree", "Drink")
some_value <- c("2","100","16","999", "65","1000000")
my_df <- data.frame(word, some_value, stringsAsFactors = FALSE ) 
my_df
      word some_value
1    House          2
2 Motorcar        100
3     Boat         16
4      Dog        999
5     Tree         65
6    Drink    1000000

Now, if I use lapply to apply the function to my dataframe, I get not only incorrect results but also a warning message.

 lapply(my_df['word'], word_string)
$word
[1] "Ho" "ot" "at" ""   "Tr" "ri"

Warning message:
In seq_len(nchar(x)) : first element used of 'length.out' argument

So the function is being applied, but in a way that evaluates each row only partially. The desired output would be something like:

[1] "ho" "ou" "us" "se"
[2] "mo" "ot" "to" "or" "rc" "ca" "ar"
[3] "bo" "oa" "at"
[4] "do" "og"
[5] "tr" "re" "ee" 
[6] "dr" "ri" "in" "nk"

Any guidance greatly appreciated.

1 Answer

The reason is that [ with a single index (i.e. without a ,) still returns a data.frame with one column, so the unit that lapply works on is that whole column.

str(my_df['word'])
#'data.frame':   6 obs. of  1 variable:
# $ word: chr  "House" "Motorcar" "Boat" "Dog" ...

The lapply loops over that single column instead of each of the elements in that column.
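
A quick way to see the difference is to compare what the two forms of indexing return. This is only a sketch: it rebuilds the my_df from the question so that it runs on its own, and one_col_df and word_vec are names made up for the illustration.

```r
# Rebuild the example data from the question
my_df <- data.frame(
  word = c("House", "Motorcar", "Boat", "Dog", "Tree", "Drink"),
  some_value = c("2", "100", "16", "999", "65", "1000000"),
  stringsAsFactors = FALSE
)

one_col_df <- my_df['word']    # single bracket: still a data.frame
word_vec   <- my_df[['word']]  # double bracket: a plain character vector

class(one_col_df)   # "data.frame"
length(one_col_df)  # 1, so lapply calls the function once, on the whole column
class(word_vec)     # "character"
length(word_vec)    # 6, so lapply calls the function once per word

# When the whole column is passed in, nchar() returns six lengths at once,
# which is what seq_len() inside word_string complains about
nchar(word_vec)     # 5 8 4 3 4 5
```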

We need either $ or [[ to extract the column as a vector:

lapply(my_df[['word']], word_string)
#[[1]]
#[1] "Ho" "ou" "us" "se"

#[[2]]
#[1] "Mo" "ot" "to" "or" "rc" "ca" "ar"

#[[3]]
#[1] "Bo" "oa" "at"

#[[4]]
#[1] "Do" "og"

#[[5]]
#[1] "Tr" "re" "ee"

#[[6]]
#[1] "Dr" "ri" "in" "nk"
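
If the lowercase pairs in the desired output are intentional, and the result should stay next to each word, one possible variation is sketched below. The tolower() call and the column name pairs are assumptions made for illustration, not something the question asked for; word_string and my_df are copied from the question so the snippet runs on its own.

```r
# Function and data as given in the question
word_string <- function(x) {
  inds <- seq_len(nchar(x))
  start <- inds[-length(inds)]
  stop <- inds[-1]
  substring(x, start, stop)
}

my_df <- data.frame(
  word = c("House", "Motorcar", "Boat", "Dog", "Tree", "Drink"),
  some_value = c("2", "100", "16", "999", "65", "1000000"),
  stringsAsFactors = FALSE
)

# Lowercase each word first (assumption), then split one word at a time.
# Assigning the resulting list with $ creates a list column, one vector per row.
my_df$pairs <- lapply(tolower(my_df[['word']]), word_string)

my_df$pairs[[1]]
#[1] "ho" "ou" "us" "se"
```

Keeping the result as a list column means each row still carries its own vector of pairs, which can have a different length from row to row.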

1 Comment

Ah yes! That seems like such an obvious cause. Thank you, Akrun. I will accept your answer.
