0

I am experimenting with writing a small web crawler. I parse a URL out of some HTML, and sometimes it points to a PHP redirect page. I am looking for a way to get the URI of the page it redirects to.

I am trying to use System.Net.WebRequest to get a stream, using code like this:

        WebRequest req = WebRequest.Create(link);
        Stream s = req.GetResponse().GetResponseStream();
        // Reuse the stream from the first request instead of issuing a second one.
        StreamReader st = new StreamReader(s);

The problem is that the link is a PHP redirect, so the stream is always null. How would I get the URI of the page the PHP script redirects to?

1
  • Isn't it returning an HTTP 302? If so, there should be a response header indicating the new location. Check out stackoverflow.com/questions/1391373/… for more information. Commented Mar 17, 2011 at 1:13

2 Answers

1
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(link);
    req.AllowAutoRedirect = true;
    req.AutomaticDecompression = DecompressionMethods.GZip;

    // CharacterSet is a property of the response, not the request.
    HttpWebResponse resp = (HttpWebResponse)req.GetResponse();
    StreamReader st = new StreamReader(resp.GetResponseStream(), System.Text.Encoding.GetEncoding(resp.CharacterSet));

AllowAutoRedirect automatically follows the redirect to the new URI, if that is your desired effect. AutomaticDecompression automatically decompresses compressed responses. You should also wrap the calls that get the response and its stream in a try/catch block; in my experience they throw a lot of WebExceptions.

Since you're experimenting with this technology, make sure you read the data with the correct encoding. If you read data from a Japanese site using an encoding that does not match the page, the text will come out garbled.
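
To make the redirect-following and try/catch advice above concrete, here is a minimal sketch. The class `FinalUriSketch` and the helper `GetFinalUri` are hypothetical names, not part of the framework, and returning null on failure is just one possible convention. It assumes you only care about the final address, which `HttpWebResponse.ResponseUri` reports once redirects have been followed.

```csharp
using System;
using System.Net;

static class FinalUriSketch
{
    // Hypothetical helper: follows redirects and returns the URI that finally
    // answered, or null if the link is not an HTTP(S) URL or the request fails.
    public static Uri GetFinalUri(string link)
    {
        Uri uri;
        if (!Uri.TryCreate(link, UriKind.Absolute, out uri) ||
            (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps))
        {
            return null; // not something HttpWebRequest can fetch
        }

        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(uri);
        req.AllowAutoRedirect = true; // let the framework follow the 3xx responses

        try
        {
            using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
            {
                // ResponseUri is the URI of the resource that actually responded.
                return resp.ResponseUri;
            }
        }
        catch (WebException)
        {
            // Timeouts, DNS failures, and 4xx/5xx statuses all end up here.
            return null;
        }
    }
}
```

A crawler could call `FinalUriSketch.GetFinalUri(link)` for each parsed link and skip the ones that come back null.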


0

Check the "Location" header from the response - it should contain the new URL.
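
As a rough sketch of that approach: turn off automatic redirects so the 3xx response itself is returned, then read its Location header. The class `RedirectSketch` and both helpers are hypothetical names. The sketch assumes the header may be relative, so it resolves the value against the requested URI.

```csharp
using System;
using System.Net;

static class RedirectSketch
{
    // Hypothetical helper: resolves a Location header value (absolute or
    // relative) against the URI that was originally requested.
    public static Uri ResolveLocation(Uri requested, string location)
    {
        return new Uri(requested, location);
    }

    // Hypothetical helper: returns the redirect target, or null when the
    // response carries no Location header (i.e. it is not a redirect).
    public static Uri GetRedirectTarget(string link)
    {
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(link);
        req.AllowAutoRedirect = false; // keep the 3xx response instead of following it

        using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
        {
            string location = resp.Headers["Location"];
            return location == null ? null : ResolveLocation(req.RequestUri, location);
        }
    }
}
```

Unlike following redirects automatically, this exposes each hop, so a crawler can record or limit the redirect chain itself. `GetRedirectTarget` can still throw a WebException on network errors, so the same try/catch advice applies.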
