0

I want to tell json parser that one of the columns should be POSIXct. How to do this, when the parser loses time information.

    library(rvest)
    library(jsonlite)

      url = "http://www.cmegroup.com/CmeWS/mvc/Settlements/Futures/Settlements/5081/FUT?tradeDate=04/06/2018&strategy=DEFAULT&pageSize=500"
      strJson <- html_text(read_html(url))

      json <- map(strJson, ~fromJSON(.))
      mData <- as.matrix(as.data.frame(json))
      dat <- as.POSIXct(mData[,10], format='%H:%M:%S %d %b %Y')

All dat is NA

3 Answers 3

1

NOT AN ANSWER since Martin has a great+succinct one for the specific quetion (but pedants are invited to downvote if it makes you feel better abt yourselves)

just a suggestion for an alternate data acquisition and conversion workflow:

src_url <- "http://www.cmegroup.com/CmeWS/mvc/Settlements/Futures/Settlements/5081/FUT?tradeDate=04/06/2018&strategy=DEFAULT&pageSize=500"

x <- jsonlite::fromJSON(src_url)

x$settlements <- list(x$settlements)

tidyr::unnest(tibble::as_data_frame(x))
## # A tibble: 58 x 14
##    updateTime  dsHeader reportType tradeDate empty month open  high  low   last  change settle volume openInterest
##    <chr>       <chr>    <chr>      <chr>     <lgl> <chr> <chr> <chr> <chr> <chr> <chr>  <chr>  <chr>  <chr>       
##  1 Friday, 06… Dated B… Final      04/06/20… FALSE APR … -     -     -     -     -1.02  66.81  0      75          
##  2 Friday, 06… Dated B… Final      04/06/20… FALSE MAY … -     -     -     -     -1.24  66.84  0      75          
##  3 Friday, 06… Dated B… Final      04/06/20… FALSE JUN … -     -     -     -     -1.24  66.37  0      75          
##  4 Friday, 06… Dated B… Final      04/06/20… FALSE JLY … -     -     -     -     -1.25  65.95  0      75          
##  5 Friday, 06… Dated B… Final      04/06/20… FALSE AUG … -     -     -     -     -1.24  65.56  0      0           
##  6 Friday, 06… Dated B… Final      04/06/20… FALSE SEP … -     -     -     -     -1.24  65.19  0      0           
##  7 Friday, 06… Dated B… Final      04/06/20… FALSE OCT … -     -     -     -     -1.25  64.78  0      0           
##  8 Friday, 06… Dated B… Final      04/06/20… FALSE NOV … -     -     -     -     -1.24  64.46  0      0           
##  9 Friday, 06… Dated B… Final      04/06/20… FALSE DEC … -     -     -     -     -1.22  64.15  0      0           
## 10 Friday, 06… Dated B… Final      04/06/20… FALSE JAN … -     -     -     -     -1.20  63.82  0      0           
## # ... with 48 more rows
Sign up to request clarification or add additional context in comments.

2 Comments

with your code i get: "Error in open.connection(con, "rb") : HTTP error 407."
ah, you're behind either a transparent or full-on proxy and jsonlite still seems to not use proxies directly. One alternative might be x <- httr::content(httr::GET(src_url), as="parsed") then.
1

The problem might be the locale LC_TIME. Mine was set to de_DE.UTF-8 and therefore the strings could not be parsed because of locale specific options like %p. Check your locale with Sys.getlocale("LC_TIME").

Changing it to an English one does do the job:

library(rvest)
library(jsonlite)

url = "http://www.cmegroup.com/CmeWS/mvc/Settlements/Futures/Settlements/5081/FUT?tradeDate=04/06/2018&strategy=DEFAULT&pageSize=500"
strJson <- html_text(read_html(url))

json  <- map(strJson, ~fromJSON(.))
mData <- as.data.frame(json, stringsAsFactors = F)

Sys.setlocale("LC_TIME", "en_GB.UTF-8")
dat  <- as.POSIXct(mData$updateTime, format='%A, %d %b %Y %I:%M %p')



head(dat)
[1] "2018-04-06 22:33:00 CEST" "2018-04-06 22:33:00 CEST" "2018-04-06 22:33:00 CEST" "2018-04-06 22:33:00 CEST"
[5] "2018-04-06 22:33:00 CEST" "2018-04-06 22:33:00 CEST"

2 Comments

Your answer beats mine as you convert the PM into the hour as well by using %I. Did not know there was such a thing.
still i get NA with your solution
0

Make sure you put library(purrr) as well as map() is a function from that package. The column is imported as factor. It needs to change to character first.

cdat <- as.character(mData[,10]))

After that, the format you posted is not equal to the data format. You need to parse it exactly as the data says.

dat <- as.POSIXct(cdat, format = '%A, %d %b %Y %H:%M %p', tz = 'UTC')
# [1] "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC"
# [6] "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC"
#...

Or in one line:

dat <- as.POSIXct(as.character(mData[,10]), format = '%A, %d %b %Y %H:%M %p', tz = 'UTC')

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.