When converting a Nokogiri object to XML and then to JSON, most of the content disappears.
Here is the code that fetches the data and converts it:
    def get_data
      doc = Nokogiri::HTML(open("<url>", "User-Agent" => "Ruby/#{RUBY_VERSION}"))
      # Get interesting block of HTML
      blurb = doc.css('.entry')
      # Convert Nokogiri object to XML
      xmlBlurb = blurb.to_xml
      # Convert to JSON
      jsonBlurb = Hash.from_xml(xmlBlurb).to_json
      return jsonBlurb
    end
Somehow, between xmlBlurb and jsonBlurb, I go from 10+ lines of XML to a single JSON object of the form { attr: content } with only one attribute.
I know there are several questions on SO about converting XML to JSON, but none of the ones I read addresses this specific issue.
Does anyone know what can cause the loss of data?
Hash.from_xml? It's not a standard Ruby method, nor does it come from Nokogiri.
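If Hash.from_xml here is the ActiveSupport (Rails) core extension (an assumption, since the question does not say where it comes from), it is geared toward data-style XML, and HTML-like mixed content, where an element holds both text and child elements, may not survive the conversion. One way to check whether the loss happens in that step is to build the Hash yourself and compare. Below is a minimal sketch of that idea, not a definitive fix. `element_to_hash` is a hypothetical helper, the sample markup is made up, and it uses REXML rather than Nokogiri only so that it runs without extra gems; the same recursive walk should carry over to Nokogiri nodes.

```ruby
require 'rexml/document'
require 'json'

# Hypothetical helper (not part of Nokogiri, REXML, or ActiveSupport).
# Converts an element into a Hash that keeps its name, its attributes,
# and its children in document order, so mixed content is preserved.
def element_to_hash(element)
  children = element.children.map do |child|
    case child
    when REXML::Element
      element_to_hash(child)
    when REXML::Text
      text = child.value
      # Skip whitespace-only text nodes (indentation between tags).
      text unless text.strip.empty?
    end
  end.compact

  attributes = {}
  element.attributes.each { |name, value| attributes[name] = value }

  { 'name' => element.name, 'attributes' => attributes, 'children' => children }
end

# Made-up sample: a <p> that mixes text with a child element.
xml = '<div class="entry"><p>Hello <b>world</b>!</p></div>'
root = REXML::Document.new(xml).root
json = JSON.generate(element_to_hash(root))
puts json
```

The output is more verbose than a plain { attr: content } Hash, but every text node and child element ends up somewhere in the JSON, which makes it easy to see what the original conversion dropped.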