I have one data frame (Table_A) with 3.4 million rows and 33 columns, and another data frame (Table_B) with 384 rows and 3 columns. (This is for one participant; I should have 40 in the end.)
Table_A
| Col1 |
|---|
| 100 |
| 143 |
| 178 |
| 245 |
| 265 |
Table_B
| start | stop | name |
|---|---|---|
| 101 | 144 | Name1 |
| 154 | 254 | Name2 |
I want to keep only the rows of Table_A whose Col1 value falls between the start and stop values of a row in Table_B, and label each kept row with the name from that Table_B row. The result should be:
Table_A adapted
| Col1 | name |
|---|---|
| 143 | Name1 |
| 178 | Name2 |
| 245 | Name2 |
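
For reference, the desired result can be reproduced on the toy data with a minimal base R sketch. It assumes that Table_B is sorted by start and that its intervals do not overlap, which holds for the example but has not been checked for the real data. `Table_A_adapted`, `idx` and `keep` are illustrative names only.

```r
# Toy data copied from the tables above
Table_A <- data.frame(Col1 = c(100, 143, 178, 245, 265))
Table_B <- data.frame(start = c(101, 154),
                      stop  = c(144, 254),
                      name  = c("Name1", "Name2"),
                      stringsAsFactors = FALSE)

# Assumption: Table_B is sorted by start and its intervals do not overlap.
# findInterval() gives, for each Col1, the index of the last start <= Col1
# (0 when Col1 is smaller than every start).
idx <- findInterval(Table_A$Col1, Table_B$start)

# A row is kept only if it also lies at or below the stop of that interval.
keep <- idx > 0
keep[keep] <- Table_A$Col1[keep] <= Table_B$stop[idx[keep]]

# Subset Table_A and attach the matching name.
Table_A_adapted <- Table_A[keep, , drop = FALSE]
Table_A_adapted$name <- Table_B$name[idx[keep]]
rownames(Table_A_adapted) <- NULL
Table_A_adapted
```

Because `findInterval()` does a sorted lookup rather than comparing every row with every interval, a sketch like this should stay cheap at 3.4 million rows. If the intervals can overlap, a non-equi join would be needed instead.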
I have tried
df_sub <- subset(Table_A, (Col1 >= Table_B$start) & (Col1 <= Table_B$stop))
names <- Table_B$name[(Table_A$Col1 >= Table_B$start) & (Table_A$Col1 <= Table_B$stop)]
df_out <- cbind(df_sub, names)
However, df_sub seems to contain only one or two rows per interval, whereas there should be about 187 rows for half of the intervals (192 of them) and about 375 rows for the other half. names, on the other hand, has more than 2 million elements.
I also tried
(Table_A$Col1 >= Table_B$start) & (Table_A$Col1 <= Table_B$stop)
and this returns a vector that is FALSE up to element 384 and NA after that.
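
A likely reason for these symptoms is that `>=` and `<=` compare two vectors element by element and recycle the shorter one, so each row of Table_A is tested against a single row of Table_B rather than against all 384. The sketch below illustrates this on the toy data; `paired_row`, `elementwise` and `all_pairs` are illustrative names, and the `outer()` step is only meant for small inputs, since a 3.4 million by 384 matrix would be very large.

```r
Table_A <- data.frame(Col1 = c(100, 143, 178, 245, 265))
Table_B <- data.frame(start = c(101, 154), stop = c(144, 254))

# With recycling, row i of Table_A is paired with exactly one row of Table_B.
paired_row <- (seq_len(nrow(Table_A)) - 1) %% nrow(Table_B) + 1

# Element-wise comparison as in the attempt above. R warns that the longer
# length is not a multiple of the shorter one; the warning is silenced here.
elementwise <- suppressWarnings(
  Table_A$Col1 >= Table_B$start & Table_A$Col1 <= Table_B$stop
)

# Testing every value against every interval needs a 5 x 2 matrix instead.
in_start <- outer(Table_A$Col1, Table_B$start, ">=")
in_stop  <- outer(Table_A$Col1, Table_B$stop, "<=")
all_pairs <- in_start & in_stop

# TRUE for each row of Table_A that falls inside at least one interval.
in_any <- rowSums(all_pairs) > 0
```

On the toy data, `elementwise` flags only the row with Col1 = 245, while `in_any` flags 143, 178 and 245. Using a logical vector longer than `Table_B$name` as an index would also pad the result with NA, which may be why names comes out so long.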