I want to save the object returned by the htmlParse command. Here is some code to illustrate my problem. Simply put, I want to be able to save the parsed HTML page to an object and load it in a future session.

library(XML)
PATH = "/colleges/Bentley-University"
URL <- paste("http://www.cappex.com", PATH, sep="")
doc <- htmlParse(URL)
mylist <- list(doc)
mylist[[1]]
save(mylist, file="mylist.Rdata")
rm(list=ls())
load("mylist.Rdata")

However, when I try to recall the contents of my list, this is the error I get:

> mylist[[1]]
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
  cannot open file '/var/folders/hv/wtvckymn0230hpsdwylmtf0r0000gn/T//Rtmp8Mrpev/fileed256550e50': No such file or directory
    Warning to people who might replicate this: Btibert3 has not constructed a minimal example. The output of mylist[[1]] is many pages long. He should have saved mylist[[1]] if that was what he wanted, because running str on mylist gives: List of 1 $ :Classes 'HTMLInternalDocument', 'XMLInternalDocument' <externalptr> Commented Sep 19, 2012 at 4:10

1 Answer

doc can't be saved because it is a pointer to 'C-level nodes', and that pointer is not valid in a new R session. Putting it in a list doesn't change this. Instead, write the text representation of the XML tree to a file (or a string) first and save that. Afterwards you can recover the document by parsing the text again.

library(XML)
PATH = "/colleges/Bentley-University"
URL <- paste("http://www.cappex.com", PATH, sep="")
doc <- htmlParse(URL)
saveXML(doc, file="ex.txt")
rm(list=ls())

# recover the document by re-parsing the saved text
doc <- htmlParse("ex.txt")
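
If you would rather keep everything in an .Rdata file, as in the question, a minimal sketch of the "write it to a string first" idea might look like the following. It assumes that saveXML called without a file argument returns the document as a character string, and it uses a small inline HTML snippet and a temporary file as placeholders (neither is from the original post) so that it runs without a network connection.

```r
library(XML)

# Placeholder page: an inline snippet stands in for the real URL
html <- "<html><body><h1>Hello</h1></body></html>"
doc <- htmlParse(html, asText = TRUE)

# Serialize the tree to a plain character string, which save() can handle
txt <- saveXML(doc)

# Hypothetical file location; any path would do
f <- tempfile(fileext = ".Rdata")
save(txt, file = f)
rm(doc, txt)

# Later (or in a new session): load the string and parse it again
load(f)
doc2 <- htmlParse(txt, asText = TRUE)
heading <- xpathSApply(doc2, "//h1", xmlValue)
```

The restored doc2 is a fresh parse of the saved text rather than the original object, so anything that depended on the old pointer has to be recomputed from it.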

1 Comment

But how do I save a list of parsed HTML documents, not just one?
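
One possible way to handle a list, sketched under the same assumptions as the answer (the documents themselves cannot be saved, but their text can): convert every element to a string with saveXML, save the list of strings, and re-parse each one after loading. The two inline pages, the names a and b, and the temporary file are made up for illustration.

```r
library(XML)

# Placeholder pages standing in for several downloaded URLs
pages <- c(a = "<html><body><p>first</p></body></html>",
           b = "<html><body><p>second</p></body></html>")
docs <- lapply(pages, htmlParse, asText = TRUE)

# Turn each parsed document into a character string
texts <- lapply(docs, saveXML)

# Hypothetical file location
f <- tempfile(fileext = ".Rdata")
save(texts, file = f)
rm(docs, texts)

# Later: load the strings and rebuild the list of parsed documents
load(f)
docs <- lapply(texts, htmlParse, asText = TRUE)
paras <- sapply(docs, function(d) xpathSApply(d, "//p", xmlValue))
```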
