For several days I have been trying to find a way to subset my data frame by comparing a character in one column with a string in another column.

If the character is not within the string, I want to copy a value to a new column. I searched high and low and tried many examples, but for some reason I cannot get it to work on my data frame.

    df <- structure(list(POLY = c("K3", "K3", "K3", "K4", "K4", "K4", "K4", 
    "K6", "K6", "K7", "K7", "K7", "L1", "L1", "L1"), FIX = c("O", 
    "K", "M", "M", "K", "O", "L", "K", "M", "K", "O", "M", "M", "L", 
    "O"), SESSTIME = c(310, 190, 181, 188, 151, 260, 268, 200, 259, 
    245, 180, 188, 259, 199, 244), CODE = c("KO", "KO", "KO", "KM", 
    "KM", "KM", "KM", "KM", "KM", "KO", "KO", "KO", "LMO", "LMO", 
    "LMO")), .Names = c("POLY", "FIX", "SESSTIME", "CODE"), row.names = c(42L, 
    44L, 46L, 115L, 116L, 117L, 133L, 225L, 231L, 269L, 270L, 328L, 
    420L, 425L, 431L), class = "data.frame")

This is what part of it looks like:

        row.names   POLY    FIX SESSTIME    CODE    SESSTIME2
    1   42          K3      O   310         KO      NA
    2   44          K3      K   190         KO      NA
    3   46          K3      M   181         KO      ...
    4   115         K4      M   188         KM
    5   116         K4      K   151         KM
    6   117         K4      O   260         KM      NA
    7   133         K4      L   268         KM      268
    8   225         K6      K   200         KM      NA
    9   231         K6      M   259         KM
    10  269         K7      K   245         KO
    11  270         K7      O   180         KO
    12  328         K7      M   188         KO      188
    13  420         L1      M   259         LMO
    14  425         L1      L   199         LMO
    15  431         L1      O   244         LMO

So when FIX is not in CODE, the value of SESSTIME should be copied to SESSTIME2 (a column already prepopulated with NA).

I tried it for example with

    df$FIX %in% strsplit(as.character(df$CODE,""))

or similar, but the comparison is always TRUE.

All the examples I found compare a single hardcoded character, e.g. "K", with a vector such as c("K","L","M"); none showed how to apply this to the columns and rows of a data frame.

I'm getting a little bit nervous ...

Does anyone have an idea what I'm doing wrong?

UPDATE:

Thanks to the answer below, my code now looks like this and does what I need:

    df$SESSTIME2[!(mapply(function(i, j) length(grep(i, j)), df$FIX, df$CODE)) & is.na(df$SESSTIME2)] <-
      df$SESSTIME[!(mapply(function(i, j) length(grep(i, j)), df$FIX, df$CODE)) & is.na(df$SESSTIME2)]

1 Answer


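Your code doesn't work because

    strsplit(as.character(df$CODE), "")

(with the "" passed to strsplit rather than to as.character) returns a list with one character vector per row, so %in% does not compare each FIX value with the characters of its own row. Instead, use mapply to test each FIX/CODE pair for a match.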

Here we use grep, which allows more flexible character matching:

    # The values of FIX & CODE are passed to i and j
    mapply(function(i, j) length(grep(i, j)), df$FIX, df$CODE)

or, using %in%:

    ## Suggested by akrun
    mapply('%in%', df$FIX, strsplit(as.character(df$CODE), ''))
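
To make the row-by-row pairing visible, here is a minimal sketch on three made-up values. The vectors `fix` and `code` are hypothetical stand-ins for `df$FIX` and `df$CODE`, and `parts`, `by_grep` and `by_in` are names chosen only for illustration:

```r
# Hypothetical stand-ins for df$FIX and df$CODE
fix  <- c("O", "M", "L")
code <- c("KO", "KO", "LMO")

# strsplit() returns a list with one character vector per input string
parts <- strsplit(code, "")

# Variant 1: count regex matches of each FIX value in its own CODE (0 or 1)
by_grep <- mapply(function(i, j) length(grep(i, j)), fix, code)

# Variant 2: test membership of each FIX value in its own split CODE
by_in <- mapply(`%in%`, fix, parts)

# Both agree once the counts are converted to logical
unname(by_grep > 0)  # TRUE FALSE TRUE
unname(by_in)        # TRUE FALSE TRUE
```

Note that grep treats its first argument as a regular expression, so a FIX value such as "." would match any CODE; passing `fixed = TRUE` to grep is one way to force literal matching if that matters for your data.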

2 Comments

Thank you very much! Obviously I have managed to avoid using all these apply functions so far. I have to learn this concept now! By the way, to get the non-matches I used "!(....)", which worked fine; I mention it just for completeness.
Erm, a second question (if allowed): is there a more elegant way to copy the value to the other column? At the moment I produce this monster: df$SESSTIME2[!(mapply(function(i, j) length(grep(i, j)), df$FIX, df$CODE)) & is.na(df$SESSTIME2)] <- df$SESSTIME[!(mapply(function(i, j) length(grep(i, j)), df$FIX, df$CODE)) & is.na(df$SESSTIME2)]. The "is.na" is used to select only those rows which don't have a new value yet. But is there a way to avoid duplicating the whole function on both sides?
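
One possible way to avoid the duplication is to compute the logical mask once, store it, and reuse it on both sides of the assignment. The following is only a sketch on a three-row subset of the question's data; the names `no_match` and `todo` are illustrative and not from the original code:

```r
# Three rows taken from the question's data, SESSTIME2 prepopulated with NA
df <- data.frame(
  FIX       = c("O", "L", "M"),
  SESSTIME  = c(310, 268, 188),
  CODE      = c("KO", "KM", "KO"),
  SESSTIME2 = NA_real_,
  stringsAsFactors = FALSE
)

# TRUE where FIX does not occur among the characters of CODE
no_match <- !mapply(`%in%`, df$FIX, strsplit(df$CODE, ""))

# Restrict to rows that have no SESSTIME2 value yet
todo <- unname(no_match) & is.na(df$SESSTIME2)

# The same mask indexes both sides, so the mapply call is written only once
df$SESSTIME2[todo] <- df$SESSTIME[todo]
df$SESSTIME2  # NA 268 188
```

Storing the mask in a variable also makes it easy to inspect which rows will be changed before the assignment is run.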
