2

i am trying to make webrequests with multiple threads but if i try with more than 2 i get error

Index was outside the bonds of the array

on this line:

string username = ScrapeBox1.Lines[NamesCounter].ToString();

Here's the code:

while (working)
{
    while (usernamescount > NamesCounter)
    {
        string username = ScrapeBox1.Lines[NamesCounter].ToString();
        string url = "http://www.someforum.com/members/" + username + ".html";
        var request = (HttpWebRequest)(WebRequest.Create(url));
        var response = request.GetResponse();
        request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; rv:16.0) Gecko/20100101 Firefox/16.0";

        using (var responseStream = response.GetResponseStream())
        {
            using (var responseStreamReader = new StreamReader(responseStream))
            {

                var serverResponse = responseStreamReader.ReadToEnd();
                int startpoint = serverResponse.IndexOf("Contact Info</span>");
                try
                {
                    string strippedResponse = serverResponse.Remove(0, startpoint);
                    ExtractEmails(strippedResponse);

                }
                catch { }


            }
        }
        NamesCounter++;
        textBox1.Text = NamesCounter.ToString();
    }

}
4
  • 1
    your code is apparently not thread safe. What did you think would happen? Commented Nov 5, 2012 at 0:29
  • Yes but i don't wanna use a lock on the whole method because that makes the multithreading pointless. and if i put a lock only on the NamesCounter i still get the same error Commented Nov 5, 2012 at 0:47
  • 1
    You (probably) don't need to hold a lock on the entire method, just those variables that will be accessed by multiple threads... Commented Nov 5, 2012 at 0:51
  • see my answer below, explains what probably needs to be locked Commented Nov 5, 2012 at 0:59

1 Answer 1

2

This code is not thread safe.

You need the code for performing an HttpWebRequest to be atomic and outside the context of looping through the collection.

For example

public void MakeHttpWebRequest(string userName)
{
    string url = "http://www.someforum.com/members/" + userName + ".html";
    var request = (HttpWebRequest)(WebRequest.Create(url));
    var response = request.GetResponse();
    request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; rv:16.0) Gecko/20100101 Firefox/16.0";

    using (var responseStream = response.GetResponseStream())
    {
        using (var responseStreamReader = new StreamReader(responseStream))
        {

            var serverResponse = responseStreamReader.ReadToEnd();
            int startpoint = serverResponse.IndexOf("Contact Info</span>");
            try
            {
                string strippedResponse = serverResponse.Remove(0, startpoint);
                ExtractEmails(strippedResponse);

            }
            catch { }


        }
    }
}

Assuming ScrapeBox.Lines implements IEnumerable, I would reccomend using Parallel.ForEach and passing the ScrapeBox.Lines as the IEnumerable over which to iterate.

Now, there is one additional problem, the code for reading the response from the HttpWebRequest still needs to write its output to a shared location. To accomplish that in a thread safe manner. A common way to do this is with a semaphore. You need a object accessible to each thread instance. A class level private variable private object sharedMutex = new object(); would work. Then the code ExtractEmails(strippedResponse); should be changed to lock(sharedMutex) { ExtractEmails(strippedResponse); }

Without having the code for the ExtractEmails(<string>) method, I can't provide a thread safe implementation for that, so that part of the solution may still cause a problem.

Sign up to request clarification or add additional context in comments.

2 Comments

I upvoted your answer, until I saw you ignore exceptions. I won't downvote, however.
that's just a modified version of the original code, I was simply adding the thread safety items to it... not trying to rewrite it completely. Also, without knowing the real purpose of the code, it's hard to know what to do with the exception when it is caught, other than the standard log it and move on.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.