6

I have a data with a column as json string:

reservation  reasons
1592         [{"name"=>"jorge", "value"=>"MX"}, {"name"=>"Billing phone number", "value"=>"1123"}, {"name"=>"BillingCountry", "value"=>"USA"}]
1597         [{"name"=>"BillingAddress_Country", "value"=>"IN"}, {"name"=>"Billing phone number country code", "value"=>"IN"}, {"name"=>"Latest amount", "value"=>"583000000"}]

I want to parse the column as follows:

reservation   name                                value
1592          jorge                                mx
1592          Billing phone number                 1123
1592          BillingCountry                       USA
1597          BillingAddress_Country               IN
1597          Billing phone number country code    IN
1597          Latest amount                        583000000

I am us ing jsonlite in R. I am getting following error in my code:

data<-read.csv("data.csv")
json<-data$reasons
mydf <- fromJSON(json)
Error: Argument 'txt' must be a JSON string, URL or file.

Can anyone tell me where am I making mistakes? What modifications do I need to do? Many thanks in advance!

6
  • 1
    can you dput(head(dat))? Commented Jul 2, 2018 at 20:57
  • 1
    Are you sure this is JSON? It doesn't look that much like JSON. What's this =>? I think you need to reformat it to JSON first. I can show you how in an answer. Commented Jul 2, 2018 at 21:05
  • Sorry but I am not getting it. How dput(head(dat)) can solve the problem? Commented Jul 2, 2018 at 21:05
  • 2
    That is not a solution to the problem, that's how to reproducibly share your data. Please see the r tag description for more info on that. Commented Jul 2, 2018 at 21:06
  • This doesn't appear to be valid JSON. Where did it come from? It looks like maybe a Ruby hash Commented Jul 2, 2018 at 21:50

2 Answers 2

4

This doesn't look like normal JSON to me (or to fromJSON, which makes me feel a little better). Maybe it's some special case of it or something (?). Update: @camille identified it as Ruby Hash.

In any event, we can fix it:

reasons <-  '{"name"=>"jorge", "value"=>"MX"}, {"name"=>"Billing phone number", "value"=>"1123"}, {"name"=>"BillingCountry", "value"=>"USA"}'

reasons <- gsub("=>", ":", reasons)
reasons <- gsub("[{}]", "", reasons)
reasons <- paste0("{",reasons,"}")

fromJSON(reasons)
$`name`
[1] "jorge"

$value
[1] "MX"

$name
[1] "Billing phone number"

$value
[1] "1123"

$name
[1] "BillingCountry"

$value
[1] "USA"
Sign up to request clarification or add additional context in comments.

5 Comments

Thanks. I wrote the following code as you have mentioned: library(jsonlite) data<-read.csv("/Users/susmitaghosh/Downloads/sample_sift_reason.csv") dim(data) reasons<-data$sift_reasons reasons <- gsub("=>", ":", reasons) reasons <- gsub("[{}]", "", reasons) reasons <- paste0("{",reasons,"}") fromJSON(reasons). But I am still getting following error: Error: parse error: invalid object key (must be a string) {["name":"BillingAddress_Country (right here) ------^. Sorry for bad formatting. I don't know how to fix it in comment.
@user3642360 if it doesn't fit neatly in a comment, you can edit your original post and put additional code formatted in the question
@user3642360 I agree with camille's suggestion but also it sounds like my solution works for the data you provided, but not some other part of the data that we don't have. So we need the that data too. Further, since what you thought was JSON was JSON-like but not real JSON, we should look into where you got this data from. A better solution is to find out what format it's in and convert it directly from that instead of making into JSON.
@Hack-R. Thank you so much. Your code is working. I did the modification like this: reasons<-as.character(data$reasons)
@user3642360 Perfect! If this works for you could you please click the green checkmark to mark it as the answer?
1
  dat%>%
  mutate(reasons=str_split(gsub("[^=A-Za-z,0-9{} ]+","",reasons),"(?<=\\}),\\s*"))%>%
  unnest()%>%
  mutate(names=str_extract(reasons,"(?<=name=)[^,}]+"),
         values=str_extract(reasons,"(?<=value=)[^,}]+"),
         reasons=NULL)

 reservation                             names    values
1        1592                             jorge        MX
2        1592              Billing phone number      1123
3        1592                    BillingCountry       USA
4        1597             BillingAddressCountry        IN
5        1597 Billing phone number country code        IN
6        1597                     Latest amount 583000000

with this code, if you need the email, just add email=str_extract.. etc etc

6 Comments

Thanks. Your code works for this subset of data. But I am getting error while trying to run the code on the whole data: Error: Duplicate identifiers for rows (333, 334) In addition: Warning messages: 1: Expected 2 pieces. Additional pieces discarded in 49 rows [121, 124, 129, 136, 141, 146, 155, 158, 163, 166, 169, 172, 177, 207, 215, 224, 236, 239, 251, 265, ...]. 2: Expected 2 pieces. Missing pieces filled with NA in 57 rows [206, 208, 213, 214, 216, 217, 225, 226, 227, 240, 252, 253, 254, 268, 270, 286, 287, 290, 292, 305, ...]. Can you please help me to identify the error?
data<-read.csv("sample_reason1.csv") data%>% mutate(reasons=strsplit(gsub("[^=A-Za-z,0-9 ]+","",reasons),"\\s*,\\s*"))%>% unnest()%>% separate(reasons,c("name","value"),sep="\\s*=\\s*")%>% mutate(id=rep(1:(n()/2),each=2))%>% spread(name,value) This is the code.
Thanks. But I am not getting following error: Error in mutate_impl(.data, dots) : Column id must be length 516 (the number of rows) or one, not 56 In addition: Warning messages: 1: Expected 2 pieces. Additional pieces discarded in 49 rows [121, 124, 129, 136, 141, 146, 155, 158, 163, 166, 169, 172, 177, 207, 215, 224, 236, 239, 251, 265, ...]. 2: Expected 2 pieces. Missing pieces filled with NA in 57 rows [206, 208, 213, 214, 216, 217, 225, 226, 227, 240, 252, 253, 254, 268, 270, 286, 287, 290, 292, 305, ...].
@user3642360 That means that not all the values are present in the rows. give me a second
I just checked the data carefully. the data contains some other information like "email", "details", etc apart from "value" and "names"
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.