I've written a small scraper that opens an HTTP connection to a PHP script on a remote server and pumps the XML it finds there into a local file.
Not exactly rocket science, I know.
The code below is the scraper in its entirety (cleaned up and anonymized).
This code works fine except for one small detail: no matter the size of the XML file (1 MB or 7 MB), the resulting file is always missing a small section at the end (600-800 characters).
Notes:
If I open the PHP page in Firefox, I get the whole document, no problem.
If I fire up Wireshark and run the program below, I see the whole document transferred across the wire, but it is never all written to the file.
using System;
using System.IO;
using System.Collections.Generic;
using System.Text;

namespace myNameSpace
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.Write("BEGIN TRANSMISSION\n");
            writeXMLtoFile();
            Console.Write("END TRANSMISSION\n");
        }

        // Fetch the document at the URL and dump it to a local file.
        public static void writeXMLtoFile()
        {
            String url = "http://somevalidurl.com/dataPage.php?lotsofpars=true";
            TextWriter tw = new StreamWriter("xml\\myFile.xml");
            tw.Write(ScreenScrape(url));
            Console.Write(" ... DONE\n");
            tw.Close();
        }

        // Read the entire HTTP response body into a single string.
        public static string ScreenScrape(string url)
        {
            System.Net.WebRequest request = System.Net.WebRequest.Create(url);
            using (System.Net.WebResponse response = request.GetResponse())
            {
                using (System.IO.StreamReader reader = new System.IO.StreamReader(response.GetResponseStream()))
                {
                    return reader.ReadToEnd();
                }
            }
        }
    }
}
Should I be using a different writer? I've tried both TextWriter and StreamWriter, with the same result.
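In case it helps, below is a streaming variant I've been sketching but haven't tested against the real server (the URL and file path are the same placeholders as in the program above, and it would sit in the same Program class). It copies the raw response bytes straight to disk instead of buffering the whole document in one string, and the using blocks should guarantee the file is flushed and closed even if something throws mid-transfer:

        // Sketch of a streaming alternative (untested): copy the response
        // stream directly to the file, no string buffering, no encoding step.
        // URL and path are placeholders, as above.
        public static void StreamXMLtoFile()
        {
            System.Net.WebRequest request = System.Net.WebRequest.Create("http://somevalidurl.com/dataPage.php?lotsofpars=true");
            using (System.Net.WebResponse response = request.GetResponse())
            using (Stream input = response.GetResponseStream())
            using (FileStream output = new FileStream("xml\\myFile.xml", FileMode.Create))
            {
                byte[] buffer = new byte[8192];
                int bytesRead;
                // Copy until the server closes the stream; the using blocks
                // flush and close the file even on an exception mid-transfer.
                while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
                {
                    output.Write(buffer, 0, bytesRead);
                }
            }
        }

Is this direction worth pursuing, or am I missing something simpler in the version above?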
Kind regards from Iceland,
Gzur