0 votes
1 answer
70 views

I am trying to fuzzyjoin two dataframes. Both contain a column with ZIP codes and some other columns. However, in the parent dataframe there are more ZIP codes than in the secondary one. I would ...
Kass
  • 103
1 vote
1 answer
82 views

I have two different datasets and would like to join Dataset2 to Dataset1. In Dataset1, there are different CBs in each day, and there are IV 1 to 3 in each CB. In Dataset2, there are different time ...
Mee
  • 321
1 vote
1 answer
352 views

I have a large dataset of over 43 million rows and 3.84 GB and another dataset of over 6000 rows and 459 KB. I am trying to do an inner_join() based on two columns: One exact column based on a common ...
bear_525
  • 123
0 votes
1 answer
637 views

I have 2 pandas dataframes that both contain company names. I want to merge these 2 dataframes on company names using a fuzzy match. But the problem is 1 dataframe contains 5m rows and the other 1 ...
L H
  • 27
0 votes
2 answers
505 views

I have a very large dataset that contains a very messy, non-uniform address field. I only care about extracting the country name from it. Most of the records contain country and city and some contain other ...
Annabanana
0 votes
0 answers
49 views

I have two data frames where I want to apply fuzzyjoin in R. I have written the code like this. library(tidyverse) library(fuzzyjoin) library(readxl) ex_hotels<-readRDS("expedia_hotels....
P Initiate
2 votes
3 answers
103 views

This question is expanding on this post: Pairing Time series Data with Batch Data in R A good solution was given to me that worked for the dputs I provided but the problem is that my dataset is quite ...
FactoryData999
0 votes
1 answer
80 views

I have two dataframes, both have a Last_Name column. First dataframe has a column Contains_First_Name and the second has a column called First_Name. I want to join the two on the exact spelling of ...
Annabanana
0 votes
1 answer
72 views

I'm trying to follow this answer that joins two tables by range: https://stackoverflow.com/a/46341899/6636572 I want to join two tables where one has some ranges and the other has numbers, and I want to ...
monotonic
  • 650
1 vote
1 answer
271 views

I have 2 data frames containing short (length == 20) sequences that I want to compare with string distance analysis techniques, returning highly similar sequences with a hamming distance of no greater ...
Bryan
  • 68
2 votes
2 answers
216 views

I have 2 data frames. I am trying to merge/join them together while specifying how I want rows to align. Mock data below. df <- data.frame(Race = c("White", "NHPI", "AA&...
shollaback
0 votes
1 answer
63 views

Here are the dataframes library(dplyr) set.seed(123) id <- rep(c("A", "B", "C"), each = 5) score <- sample(1:50, 15) label <- paste(sample(LETTERS, 15 * 5, ...
user22746157
0 votes
0 answers
53 views

I have two tables. Table one has an id column and a full_name column. Table two has only a full name column but the names are near-matches and not full matches. I would like to apply the id column to ...
Ben Blackburn
0 votes
1 answer
49 views

I am trying to join two tables based on a code created within each table that identifies a prescribed drug. The problem is that the drug code sometimes has additional numbers at the end in one table. ...
sebastian.mendoza
1 vote
2 answers
274 views

I am trying to left join table 1 'Person Name' to table 2 'Name' and get the values from the Work Group column in Table 2 df1 <- read.table(text=" Person_Name PEREZ, MINDY PEREZ, ABA CLARKE, ...
Pxanalyst
0 votes
1 answer
91 views

I am trying to run this code : main_df %>% fuzzy_anti_join(secondary_df, match_fun = list(`==`, `%within%`), by = c("ID","Date" = "Date_Interval"))...
marcelklib
0 votes
0 answers
53 views

I need to interval_left_join two dataframes by groups (the grouping variable is File), but using this code I get this error: library(BiocManager) library(fuzzyjoin) df1 %>% group_by(File) ...
Chris Ruehlemann
1 vote
1 answer
186 views

I have a table of 10,000 unique names. Using the fuzzyjoin package, I would like to match these unique names to names that differ by only one letter. I would like to group the ...
megsruppUNBC
0 votes
2 answers
101 views

I am working with two data.tables, predicted yields over age based on a variety of stand condition field measurements of yields at a particular field location, with a measured age I would like to ...
David
  • 779
0 votes
1 answer
205 views

I'm attempting to join two tables, one is a smaller table with a column of names of common food items (e.g. "Corn", "Peppers", "Squash"...etc...), and the other is a ...
droseraCapensis
1 vote
1 answer
163 views

I would like to compare two mixed-type data frames and return the rows that are different between them--but I would like numeric values to only be returned within a certain percentage. tbl1 <- ...
JemJem
  • 25
1 vote
1 answer
504 views

I'm trying to join two data sets using fuzzy matching through the stringdist_left_join function from the fuzzyjoin library, but I keep getting the error message "Error: vector memory exhausted (...
yankees_fan
0 votes
0 answers
302 views

In R, I have two dataframes, one with full names and one with abbreviated names, I want to dplyr join them to see which one has a flag. However, it is very hard to get matched names, even when I match ...
Mimi Guo
1 vote
2 answers
1k views

I'm trying to join two datasets based on the values of two variables. Both datasets have the same variable names/number of columns but may have a different number of rows. I want to join them based ...
JRock
  • 13
0 votes
0 answers
61 views

I have three data frames that need to be merged. There are a few small differences between the competitor names in each data frame. For instance, one name might not have a space between their middle ...
bandcar
  • 743
1 vote
0 answers
35 views

I have two databases, one designated data and another data1 (reference), where I want to compare the codes of each data designation and data2, I have to do it by writing the designations, if they are ...
Mariama Drame
1 vote
2 answers
149 views

I have a problem that can be reproduced in the following way: library(tidyverse) a <- tibble(navn=c("Oslo kommune", "Oslo kommune", "Kommunen i Os", "Kommunen i ...
Ajern
  • 11
2 votes
1 answer
525 views

When I want to join two data frames based on two intervals, I prefer to use the fuzzyjoin package because it is easy to read in my opinion. But when I need to work with large datasets, the fuzzyjoin ...
Quinten
  • 42.8k
2 votes
2 answers
72 views

I have some values in df: # A tibble: 7 × 1 var1 <dbl> 1 0 2 10 3 20 4 210 5 230 6 266 7 267 that I would like to compare to a second dataframe called value_lookup # A ...
Julian
  • 9,645
2 votes
1 answer
980 views

I was working on the following problem. I've got monthly data from a survey, let's call it df: df1 = tibble(ID = c('1','2'), reported_value = c(1200, 31000), anchor_month = c(3,5)) ID ...
Juan C
  • 6,148
0 votes
1 answer
68 views

I have a dataset that lists several possible genera of plants, and another dataset that lists all the species with their functional forms. I would like to merge these datasets in such a way that IF ...
salix7
  • 61
0 votes
1 answer
272 views

I would like to join two data sets that look like the following data sets. The matching rule would be that the Item variable from mykey matches the first part of the Item entry in mydata to some ...
lilla
  • 151
0 votes
1 answer
44 views

I have two data frames with columns of interest 'ParseCom', which is the left index of this fuzzy join, and 'REF' which should be a substring of 'ParseCom' during a join. This is iterating over the ...
Isaacnfairplay
2 votes
0 answers
233 views

Can someone help me understand what "multi_by" and "multi_match_fun" actually do in comparison to "by" and "match_fun" in the R package fuzzyjoin? This is from ...
tospo
  • 738
1 vote
1 answer
130 views

I would like to do an interval join with an additional key. The simplest way in dplyr is quite slow intervalDf <- tibble(id = rep(seq(1, 100000, 1), 10), k1 = rep(seq(1, 1000, ...
blahblah4252
1 vote
1 answer
672 views

I am trying to perform a join in R based on a regex pattern from one table. From what I understand, the fuzzyjoin package should be exactly what I need, but I can't get it to work. Here is an example ...
Nick Brown
-1 votes
2 answers
809 views

I would like to do exact joins for the columns state and name, but a fuzzy join for the "name" and "versus" columns: year <- c("2002", "2002", "1999&...
hy9fesh
  • 661
0 votes
3 answers
2k views

I'm fairly new to R, and have been sifting through other questions all morning trying to figure this out, but can't find anything related enough or my knowledge of R is not good enough to understand ...
P Meddyyy
0 votes
2 answers
208 views

I have two data frames I would like to merge a<- data.frame(x=c(1,4,6,8,1,6,7,2),ID=c("132","14.","732","2..","132","14.","732",...
mclofa
  • 33
0 votes
0 answers
59 views

I have been trying to use the fuzzyjoin package to join the "conservation status" column from the con_filtered_report_groups data frame to the report_groups_order dataframe that has the ...
rhelp
  • 1
3 votes
2 answers
90 views

I need to join two datasets and the only identifier in both are the company names. For example: db1 <- tibble( Company = c('Bombardier Inc.','Honeywell Development Corp','The Pepsi Bottling Group ...
msn
  • 123
2 votes
1 answer
607 views

I want to merge two data frames df1 and df2. df1<-tibble(x=c("FIDELITY FREEDOM 2015 FUND", "VANGUARD WELLESLEY INCOME FUND"),y=c(1,2)) df2<-tibble(x=c("FIDELITY ...
Jane
  • 81
1 vote
1 answer
89 views

Is there a way I can partially match the two data frames in R? df1<-data.frame("FIDELITY FREEDOM 2015 FUND", "ID") df2<-data.frame("FIDELITY ABERDEEN STREET TRUST: ...
Jane
  • 81
0 votes
1 answer
65 views

I am working on a very advanced join of dataframes that is complex for me. I would like to ask you for some help if possible. I have two dataframes, df1 and df2 which I include at the end as dput(). ...
Duck
  • 39.6k
1 vote
1 answer
166 views

The data is as follows: library(fuzzyjoin) nr <- c(1,2) col2 <- c("b","a") dat <- cbind.data.frame( nr, col2 ) thelist <- list( aa=c(1,2,3), bb=c(1,2,3) ) I would ...
Tom
  • 2,341
0 votes
0 answers
135 views

I previously asked a question here about how to use R to automatically "spellcheck" a big list of department names before I export a file and send it off. (Same data can be used as ...
Joe Crozier
  • 1,056
1 vote
2 answers
511 views

I am working with two datasets that I would like to join based not on exact matches between them, but rather on approximate matches. My question is similar to this OP. Here are examples of what my two ...
Blundering Ecologist
4 votes
1 answer
410 views

Is there a way of joining two dataframes where a row in the first dataframe is joined with every row in the second dataframe if they share a word in common? For example: companies1 <- data....
thetimidshrew
2 votes
3 answers
2k views

I am trying to left-join df2 onto df1. df1 is my dataframe of interest, df2 contains additional information I need. Example: #df of interest onto which the other should be joined key1 <- c("...
Auream
  • 55
1 vote
2 answers
217 views

I have a data table (lv_timest) with time stamps every 3 hours for each date: # A tibble: 6 × 5 LV0_mean LV1_mean LV2_mean Date_time Date <dbl> <dbl> <...
Lisa
  • 81