1

I have three tables like these that have a key and a field that describes that particular key's random data:

> json1
       key                                        field
1 hg8oxoi4 "components":{"a": "21","b": "12","c": "34"}
2 gic3bv14 "components":{"a": "78","b": "66","c": "54"}
3 yo47wglq  "components":{"a": "6","b": "12","c": "12"}
4 vibidd0l   "components":{"a": "45","b": "5","c": "1"}
> json2
       key                                          field
1 hg8oxoi4 "last_recall": {"date": "012118","size": "43"}
2 vibidd0l "last_recall": {"date": "101618","size": "12"}
> json3
       key                           field
1 hg8oxoi4 "other_fields":{"people": "11"}
2 gic3bv14 "other_fields":{"people": "10"}
3 yo47wglq  "other_fields":{"people": "4"}

whats the best way to combine all three tables into one, making sure to match all the keys to each other and deal with the differences in which keys have data and which don't? Ideally, each field would be appended onto the other so that the new table's field column is a json object with the different data.

Edit: Here's the expected output.

> json4
       key
1 hg8oxoi4
2 gic3bv14
3 yo47wglq
4 vibidd0l
                                                                                                                       field
1 {"components":{"a": "21","b": "12","c": "34"},"last_recall": {"date": "012118","size": "43"},"other_fields":{"people": "11"}}
2                                                {"components":{"a": "78","b": "66","c": "54"},"other_fields":{"people": "10"}}
3                                                  {"components":{"a": "6","b": "12","c": "12"},"other_fields":{"people": "4"}}
4                                   {"components":{"a": "45","b": "5","c": "1"},"last_recall": {"date": "101618","size": "12"}}

EDIT 2: dput of json1 and json2

> dput(json1)
structure(list(key = c("hg8oxoi4", "gic3bv14", "yo47wglq", "vibidd0l"
), field = c("\"components\":{\"a\": \"21\",\"b\": \"12\",\"c\": \"34\"}", 
"\"components\":{\"a\": \"78\",\"b\": \"66\",\"c\": \"54\"}", 
"\"components\":{\"a\": \"6\",\"b\": \"12\",\"c\": \"12\"}", 
"\"components\":{\"a\": \"45\",\"b\": \"5\",\"c\": \"1\"}")), .Names = c("key", 
"field"), row.names = c(NA, -4L), class = "data.frame")

> dput(json2)
structure(list(key = c("hg8oxoi4", "vibidd0l"), field = c("\"last_recall\": {\"date\": \"012118\",\"size\": \"43\"}", 
"\"last_recall\": {\"date\": \"101618\",\"size\": \"12\"}")), .Names = c("key", 
"field"), row.names = c(NA, -2L), class = "data.frame")
2
  • can you post the dput of the json1 and json2 Is it a character column. Also, it is better to show the expected output Commented Oct 30, 2018 at 2:02
  • just added the expected output, all field columns are character columns Commented Oct 30, 2018 at 2:18

1 Answer 1

1

We merge the datasets by 'key' after placing the 'datasets' in a list

out <- Reduce(function(...) merge(..., all = TRUE, by = "key"), 
       mget(ls(pattern ="^json\\d+$")))

Then, paste the non-NA elements rowwise

out$field <- apply(out[-1], 1, function(x) paste(x[!is.na(x)], collapse=", "))
out[c("key", "field")]
Sign up to request clarification or add additional context in comments.

2 Comments

if I were to repeat this with differently named datasets (like test.df, field_store for example) would I use list(test.df,field_store) in place of your ls statement?
@aabernathy You can use list(test.df, field_store) The only reason I used ls is because of the pattern in the object names. If there are 100 objects, it is more easy to do

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.