
I am writing a program that must process about 5000 strings as quickly as possible. About 2000 of these strings must be translated via a web request to mymemory.translated.net (see the code below; the JSON part is removed since it is not needed here).

Try

    url = "http://api.mymemory.translated.net/get?q=" & Firstpart & "!&langpair=de|it&[email protected]"

    request = DirectCast(WebRequest.Create(url), HttpWebRequest)
    response = DirectCast(request.GetResponse(), HttpWebResponse)
    myreader = New StreamReader(response.GetResponseStream())

    Dim rawresp As String
    rawresp = myreader.ReadToEnd()
    Debug.WriteLine("Raw:" & rawresp)

Catch ex As Exception
    MessageBox.Show(ex.ToString)

End Try

The code itself works fine. The problem is that it blocks and needs about 1 second per string, which is more than half an hour for all my strings. I need to convert this code to a non-blocking version and make multiple calls at the same time. Could somebody please tell me how I could do that? I was thinking of a BackgroundWorker, but that wouldn't speed things up; it would just execute the code on a different thread.

Thanks!

  • Is Firstpart an array value or a value from an IEnumerable? Commented Jun 7, 2013 at 6:15
  • It's a string containing the text I need to translate, usually between 1 and 5 words. Commented Jun 7, 2013 at 6:18
  • Which version of .NET are you targeting? 4.5? Commented Jun 7, 2013 at 6:25

3 Answers

The problem is that you aren't just being held back by the maximum number of concurrent operations. HttpWebRequests are throttled by nature (I believe the default policy allows only 2 connections per host at any given time), so you have to override that behaviour too. Please refer to the code below.

Imports System.Diagnostics
Imports System.IO
Imports System.Net
Imports System.Threading
Imports System.Threading.Tasks

Public Class Form1

  ''' <summary>
  ''' Test entry point.
  ''' </summary>
  Private Sub Form1_Load() Handles MyBase.Load
    ' Generate enough words for us to test throughput.
    Dim words = Enumerable.Range(1, 100) _
      .Select(Function(i) "Word" + i.ToString()) _
      .ToArray()

    ' Maximum theoretical number of concurrent requests.
    Dim maxDegreeOfParallelism = 24
    Dim sw = Stopwatch.StartNew()

    ' Capture information regarding current SynchronizationContext
    ' so that we can perform thread marshalling later on.
    Dim uiScheduler = TaskScheduler.FromCurrentSynchronizationContext()
    Dim uiFactory = New TaskFactory(uiScheduler)

    Dim transformTask = Task.Factory.StartNew(
      Sub()
        ' Apply the transformation in parallel.
        ' Parallel.ForEach implements clever load
        ' balancing, so, since each request won't
        ' be doing much CPU work, it will spawn
        ' many parallel streams - likely more than
        ' the number of CPUs available.
        Parallel.ForEach(words, New ParallelOptions With {.MaxDegreeOfParallelism = maxDegreeOfParallelism},
          Sub(word)
            ' We are running on a thread pool thread now.
            ' Be careful not to access any UI until we hit
            ' uiFactory.StartNew(...)

            ' Perform transformation.
            Dim url = "http://api.mymemory.translated.net/get?q=" & word & "!&langpair=de|it&[email protected]"
            Dim request = DirectCast(WebRequest.Create(url), HttpWebRequest)

            ' Note that unless you specify this explicitly,
            ' the framework will use the default and you
            ' will be limited to 2 parallel requests
            ' regardless of how many threads you spawn.
            request.ServicePoint.ConnectionLimit = maxDegreeOfParallelism

            Using response = DirectCast(request.GetResponse(), HttpWebResponse)
              Using myreader As New StreamReader(response.GetResponseStream())
                Dim rawresp = myreader.ReadToEnd()

                Debug.WriteLine("Raw:" & rawresp)

                ' Transform the raw response here.
                Dim processed = rawresp

                uiFactory.StartNew(
                  Sub()
                    ' This is running on the UI thread,
                    ' so we can access the controls,
                    ' i.e. add the processed result
                    ' to the data grid.
                    Me.Text = processed
                  End Sub, TaskCreationOptions.PreferFairness)
              End Using
            End Using
          End Sub)
      End Sub)

    transformTask.ContinueWith(
      Sub(t As Task)
        ' Always stop the stopwatch.
        sw.Stop()

        ' Again, we are back on the UI thread, so we
        ' could access UI controls if we needed to.
        If t.Status = TaskStatus.Faulted Then
          Debug.Print("The transformation errored: {0}", t.Exception)
        Else
          Debug.Print("Operation completed in {0} s.", sw.ElapsedMilliseconds / 1000)
        End If
      End Sub,
      uiScheduler)
  End Sub

End Class

15 Comments

Thanks! If I understand it properly, this code will translate 10 strings in parallel without blocking, right?
No. It will be 10 strings in parallel (so it will finish much quicker), but it will still block. If you don't want to block, wrap the Parallel.ForEach call in a Task and await.
@user2452250 I've changed my example shifting the call to Parallel.ForEach() into the thread pool. I've also incorporated the MaxDegreeOfParallelism feature to make sure that we don't flood the web request queue unnecessarily. Wire up the continuation to do what you need and you're good to go.
Hey, it's working great! It takes about 2 minutes to run through all elements! Can I use a number greater than 10, or is that not recommended? And how would I implement the thread to make it non-blocking? Thanks for your help, I really appreciate it!
@user2452250, I've already changed my example to refrain from blocking the UI. It's still a blocking call, but it's blocking on the thread pool, so you won't really notice. As for setting request.ServicePoint.ConnectionLimit to be greater than 10 - sure you can do that. Chances are the web service author will hate you though :) Experiment with different numbers. Just be sure to set MaxDegreeOfParallelism accordingly - perhaps move that number into a class-wide constant.
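
The "wrap the Parallel.ForEach call in a Task and await" suggestion from the comments could be sketched roughly as follows. This assumes .NET 4.5 (for Async/Await and Task.Run). TransformAllAsync and fakeTranslate are hypothetical names introduced only for illustration, and the stand-in delegate replaces the real web request so the sketch runs offline.

```vbnet
Imports System
Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Threading.Tasks

Module AwaitParallelSketch

  ' Hypothetical helper: runs "work" over all words on the thread pool
  ' and lets the caller await the whole batch.
  Async Function TransformAllAsync(words As IEnumerable(Of String),
                                   maxDegreeOfParallelism As Integer,
                                   work As Func(Of String, String)) As Task(Of ConcurrentDictionary(Of String, String))
    Dim results As New ConcurrentDictionary(Of String, String)()
    Dim options As New ParallelOptions With {.MaxDegreeOfParallelism = maxDegreeOfParallelism}

    ' Parallel.ForEach still blocks, but it blocks a pool thread, not the caller.
    Await Task.Run(
      Sub()
        Parallel.ForEach(words, options, Sub(word) results(word) = work(word))
      End Sub)

    Return results
  End Function

  Sub Main()
    ' Assumption: a stand-in for the real request, not the MyMemory API.
    Dim fakeTranslate As Func(Of String, String) = Function(s) s.ToUpperInvariant()

    Dim results = TransformAllAsync({"eins", "zwei", "drei"}, 24, fakeTranslate).Result
    Console.WriteLine(results.Count)
  End Sub

End Module
```

In a WinForms event handler marked Async, awaiting TransformAllAsync instead of reading .Result should resume on the UI thread, so the results can be pushed into controls without extra marshalling.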

If you want to send 10 parallel requests, you must create 10 BackgroundWorkers. Or manually create 10 threads. Then iterate, and whenever a worker/thread is done, give it a new task.

I do not recommend firing 5000 parallel threads/workers. You must be careful: a load like that could be interpreted as spamming or an attack by the server. Don't overdo it; maybe talk to translated.net and ask them about the workload they accept. Also think about what your machine and your internet upstream can handle.
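
A minimal sketch of the "fixed number of workers, each picking up a new item when it finishes" idea is shown below. It swaps in long-running tasks and a ConcurrentQueue rather than BackgroundWorkers or hand-made threads, which is an assumption made for brevity. ProcessAll and fakeTranslate are hypothetical names, and the stand-in delegate replaces the web request so the sketch runs offline.

```vbnet
Imports System
Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Threading.Tasks

Module WorkerPoolSketch

  ' Runs "work" over every input using a fixed number of workers.
  ' Each worker pulls the next item as soon as it finishes the previous one.
  Function ProcessAll(inputs As IEnumerable(Of String),
                      workerCount As Integer,
                      work As Func(Of String, String)) As ConcurrentDictionary(Of String, String)
    Dim queue As New ConcurrentQueue(Of String)(inputs)
    Dim results As New ConcurrentDictionary(Of String, String)()
    Dim workers As New List(Of Task)()

    For i = 1 To workerCount
      workers.Add(Task.Factory.StartNew(
        Sub()
          Dim item As String = Nothing
          ' Stop once the shared queue is empty.
          While queue.TryDequeue(item)
            results(item) = work(item)
          End While
        End Sub, TaskCreationOptions.LongRunning))
    Next

    ' Blocks the caller until every worker has drained the queue.
    Task.WaitAll(workers.ToArray())
    Return results
  End Function

  Sub Main()
    ' Assumption: a stand-in for the real request, not the MyMemory API.
    Dim fakeTranslate As Func(Of String, String) = Function(s) s.ToUpperInvariant()

    Dim results = ProcessAll({"eins", "zwei", "drei"}, 10, fakeTranslate)
    For Each pair In results
      Console.WriteLine(pair.Key & " -> " & pair.Value)
    Next
  End Sub

End Module
```

The worker count is the only knob here, which makes it easy to stay within whatever load the service accepts.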

1 Comment

You might also want to make some use of WebRequest.BeginGetResponse() to make the request asynchronously.
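
One way the BeginGetResponse() suggestion might look is to wrap the Begin/End pair in a task with TaskFactory.FromAsync. This is a sketch under assumptions: DownloadStringAsync is a hypothetical helper name, the URL is taken from the command line rather than hard-coded, and reading the body with ReadToEnd still blocks a thread pool thread inside the continuation.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Threading.Tasks

Module AsyncRequestSketch

  ' Starts the request without blocking the calling thread and
  ' returns a task that completes with the response body.
  Function DownloadStringAsync(url As String) As Task(Of String)
    Dim request = DirectCast(WebRequest.Create(url), HttpWebRequest)

    ' Wrap BeginGetResponse/EndGetResponse in a Task(Of WebResponse).
    Dim responseTask = Task(Of WebResponse).Factory.FromAsync(
      AddressOf request.BeginGetResponse,
      AddressOf request.EndGetResponse,
      Nothing)

    ' Read the body once the response has arrived.
    Return responseTask.ContinueWith(Of String)(
      Function(t As Task(Of WebResponse))
        Using response = t.Result
          Using reader As New StreamReader(response.GetResponseStream())
            Return reader.ReadToEnd()
          End Using
        End Using
      End Function)
  End Function

  Sub Main(args As String())
    ' Usage: pass the full request URL as the first argument.
    Console.WriteLine(DownloadStringAsync(args(0)).Result)
  End Sub

End Module
```

The connection limit discussed in the first answer still applies to requests made this way, so ServicePoint.ConnectionLimit would need raising here too.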

I would create a Task for every request, so you can have a Callback for every call using ContinueWith:

  For Each item As String In myCollectionString

    ' Copy the loop variable so that each lambda captures its own value.
    Dim input As String = item

    Tasks.Task(Of String).Factory.StartNew(
      Function()
        Dim rawResp As String = String.Empty

        Try
          Dim url As String = "http://api.mymemory.translated.net/get?q=" & input & "!&langpair=de|it&[email protected]"
          Dim request = DirectCast(WebRequest.Create(url), HttpWebRequest)

          ' Dispose the response and the reader once the body has been read.
          Using response = DirectCast(request.GetResponse(), HttpWebResponse)
            Using myreader As New StreamReader(response.GetResponseStream())
              rawResp = myreader.ReadToEnd()
              Debug.WriteLine("Raw:" & rawResp)
            End Using
          End Using

        Catch ex As Exception
          MessageBox.Show(ex.ToString)
        End Try

        Return rawResp
      End Function).ContinueWith(
      Sub(task As Tasks.Task(Of String))
        ' Do something with the result.
        Console.WriteLine(task.Result)
      End Sub)

  Next
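
If a single callback is also needed once every request has finished, the per-request tasks could be collected and passed to TaskFactory.ContinueWhenAll. The sketch below assumes that approach; fakeTranslate is a hypothetical stand-in for the web request so the example runs offline.

```vbnet
Imports System
Imports System.Linq
Imports System.Threading.Tasks

Module ContinueWhenAllSketch

  Sub Main()
    Dim inputs = {"eins", "zwei", "drei"}

    ' Assumption: a stand-in for the real request, not the MyMemory API.
    Dim fakeTranslate As Func(Of String, String) = Function(s) s.ToUpperInvariant()

    ' One task per input, mirroring the loop above.
    Dim tasks = inputs.Select(
      Function(s) Task(Of String).Factory.StartNew(Function() fakeTranslate(s))).ToArray()

    ' Runs once, after every task in the array has completed.
    Dim allDone = Task.Factory.ContinueWhenAll(
      tasks,
      Sub(finished As Task(Of String)())
        For Each t In finished
          Console.WriteLine(t.Result)
        Next
      End Sub)

    ' Only needed in a console program, to keep the process alive.
    allDone.Wait()
  End Sub

End Module
```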
