0

I am trying to webscrape a XML page into a dataframe to create a table as in https://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=yield

I tried:

library("XML")
library("methods")
xmldataframe <- xmlToDataFrame("http://data.treasury.gov/feed.svc/DailyTreasuryYieldCurveRateData?$filter=month(NEW_DATE)%20eq%2011%20and%20year(NEW_DATE)%20eq%202017")
xmldataframe

But I could not get it to work properly. Thank you for any help.

3
  • Do you want the table that contains "Date,1 Mo, 3 Mo,..."? Commented Nov 15, 2017 at 12:37
  • Yes, that is the table. Commented Nov 15, 2017 at 12:51
  • See the answer, I'm sure that it will help you Commented Nov 15, 2017 at 12:52

1 Answer 1

5

I prefer to use the package rvest, so try this

To download:

if(!require("rvest")){install.packages("rvest");library("rvest")}
url <- "https://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=yield"

xml_page <- read_html(url)

detail <- xml_page %>%
  html_nodes(".text_view_data") %>% #node of the table
  html_text()

Result

> detail
  [1] "11/01/17" "1.06"     "1.18"     "1.30"     "1.46"     "1.61"     "1.74"     "2.01"    
  [9] "2.22"     "2.37"     "2.63"     "2.85"     "11/02/17" "1.02"     "1.17"     "1.29"    
 [17] "1.46"     "1.61"     "1.73"     "2.00"     "2.21"     "2.35"     "2.61"     "2.83"    
 [25] "11/03/17" "1.02"     "1.18"     "1.31"     "1.49"     "1.63"     "1.74"     "1.99"    
 [33] "2.19"     "2.34"     "2.59"     "2.82"     "11/06/17" "1.03"     "1.19"     "1.30"    
 [41] "1.50"     "1.61"     "1.73"     "1.99"     "2.17"     "2.32"     "2.58"     "2.80"    
 [49] "11/07/17" "1.05"     "1.22"     "1.33"     "1.49"     "1.63"     "1.75"     "1.99"    
 [57] "2.17"     "2.32"     "2.56"     "2.77"     "11/08/17" "1.05"     "1.23"     "1.35"    
 [65] "1.53"     "1.65"     "1.77"     "2.01"     "2.19"     "2.32"     "2.57"     "2.79"    
 [73] "11/09/17" "1.07"     "1.24"     "1.36"     "1.53"     "1.63"     "1.75"     "2.01"    
 [81] "2.20"     "2.33"     "2.59"     "2.81"     "11/10/17" "1.06"     "1.23"     "1.37"    
 [89] "1.54"     "1.67"     "1.79"     "2.06"     "2.27"     "2.40"     "2.67"     "2.88"    
 [97] "11/13/17" "1.07"     "1.24"     "1.37"     "1.55"     "1.70"     "1.82"     "2.08"    
[105] "2.27"     "2.40"     "2.67"     "2.87"     "11/14/17" "1.06"     "1.26"     "1.40"    
[113] "1.55"     "1.68"     "1.81"     "2.06"     "2.26"     "2.38"     "2.64"     "2.84" 

Then, you have to adapt it to the format you need

Edit:

To dataframe

This is clearly not a ellegant way but it works.

table_names<-c("Date","1 Mo","3 Mo",    "6 Mo", "1 Yr", "2 Yr", "3 Yr", "5 Yr", "7 Yr", "10 Yr",    "20 Yr",    "30 Yr")
ndates<-sum(grepl("/",detail))
df_detail<-as.data.frame(matrix(nrow = ndates,ncol = length(table_names)))
names(df_detail)<-table_names

pos1<-which(grepl("/",detail))
pos2<-which(grepl("/",detail))-1
pos2<-pos2[-1]
pos2<-c(pos2,length(detail))

for(i in 1:ndates){
  df_detail[i,]<-detail[pos1[i]:pos2[i]]

}

Result

> df_detail
       Date 1 Mo 3 Mo 6 Mo 1 Yr 2 Yr 3 Yr 5 Yr 7 Yr 10 Yr 20 Yr 30 Yr
1  11/01/17 1.06 1.18 1.30 1.46 1.61 1.74 2.01 2.22  2.37  2.63  2.85
2  11/02/17 1.02 1.17 1.29 1.46 1.61 1.73 2.00 2.21  2.35  2.61  2.83
3  11/03/17 1.02 1.18 1.31 1.49 1.63 1.74 1.99 2.19  2.34  2.59  2.82
4  11/06/17 1.03 1.19 1.30 1.50 1.61 1.73 1.99 2.17  2.32  2.58  2.80
5  11/07/17 1.05 1.22 1.33 1.49 1.63 1.75 1.99 2.17  2.32  2.56  2.77
6  11/08/17 1.05 1.23 1.35 1.53 1.65 1.77 2.01 2.19  2.32  2.57  2.79
7  11/09/17 1.07 1.24 1.36 1.53 1.63 1.75 2.01 2.20  2.33  2.59  2.81
8  11/10/17 1.06 1.23 1.37 1.54 1.67 1.79 2.06 2.27  2.40  2.67  2.88
9  11/13/17 1.07 1.24 1.37 1.55 1.70 1.82 2.08 2.27  2.40  2.67  2.87
10 11/14/17 1.06 1.26 1.40 1.55 1.68 1.81 2.06 2.26  2.38  2.64  2.84
Sign up to request clarification or add additional context in comments.

3 Comments

Thanks, that is very helpful. But it seems to be a character string. How can I convert it into a dataframe table?
These are two questions in one. But see the answer again. I'm sure this is not the best way to adapt it into data frame but it works.
That is really great. Thanks very much for all your help.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.