
I have between 1000 and 2000 webpages to download from one server, and I am using goroutines and channels to achieve high efficiency. The problem is that every time I run my program, up to 400 requests fail with the error "connection reset by peer". Rarely (maybe 1 out of 10 times), no requests fail.

What can I do to prevent this?

One interesting thing is that when I ran this program on a server in the same country as the server hosting the website, 0 requests failed, so I am guessing the problem is related to latency (it is now running on a server on a different continent).

The code I am using is basically just a simple http.Get(url) request, no extra parameters or a custom client.

  • Are all or a large portion of the pages coming from the same server? What is the max number of requests you're making concurrently? Commented Jun 12, 2016 at 14:46
  • All pages are from the same server (edited the question to reflect this). I am not sure how many are made concurrently. I just start as many goroutines as there are webpages to download and then let the CPU/Go runtime impose the limits on concurrency. Commented Jun 12, 2016 at 21:12
  • There are no defined limits on concurrency, you need to do that yourself. Commented Jun 12, 2016 at 21:22

5 Answers


The message connection reset by peer indicates that the remote server sent an RST to forcefully close the connection, either deliberately as a mechanism to limit connections, or as a result of a lack of resources. Either way you are likely opening too many connections, or reconnecting too fast.

Starting 1000-2000 connections in parallel is rarely the most efficient way to download that many pages, especially if most or all are coming from a single server. If you test the throughput you will find an optimal concurrency level that is far lower.

You will also want to set the Transport.MaxIdleConnsPerHost to match your level of concurrency. If MaxIdleConnsPerHost is lower than the expected number of concurrent connections, the server connections will often be closed after a request, only to be immediately opened again -- this will slow your progress significantly and possibly reach connection limits imposed by the server.


8 Comments

This is a great answer. I ended up doing some measurements on how many simultaneous connections gave the best performance, and for the connection I am currently on, that came out to be about 50; more connections than that gave very little to no extra performance. I limited the number of goroutines running to a max of 50, and set MaxIdleConnsPerHost to 50. Works every time now!
@AG1: what code are you looking for? The answer comes down to just setting MaxIdleConnsPerHost to equal the number of concurrent requests.
@JimB I added the code as an answer to make it more concrete.
@AG1: you can see a more complete example of that in this answer
A network connection could be closed at any point, so you may get that from Do(), or while reading the response. It doesn't really matter though, networks are unreliable, and if you get an unexpected error and want to retry, that is a perfectly normal thing to do.

Still a golang newbie, hopefully this helps.

package main

import (
    "log"
    "net/http"
)

var netClient *http.Client

func init() {
    tr := &http.Transport{
        MaxIdleConns:        20,
        MaxIdleConnsPerHost: 20,
    }
    netClient = &http.Client{Transport: tr}
}

func foo() {
    resp, err := netClient.Get("http://www.example.com/")
    if err != nil {
        log.Println(err)
        return
    }
    defer resp.Body.Close()
    // ... read resp.Body ...
}

1 Comment

Thanks! Using http.Transport with MaxIdleConns/MaxIdleConnsPerHost helped me avoid the "connection reset by peer" error with net.DialTCP

I had good results by setting the MaxConnsPerHost option on transport...

cl := &http.Client{
    Transport: &http.Transport{MaxConnsPerHost: 50},
}

MaxConnsPerHost optionally limits the total number of connections per host, including connections in the dialing, active, and idle states. On limit violation, dials will block.

https://golang.org/pkg/net/http/#Transport.MaxConnsPerHost

EDIT: To clarify, this option was released in Go 1.11 which was not available at the time of @AG1's or @JimB's answers above, hence me posting this.

4 Comments

This is basically the same solution that @AG1 posted over 2 years ago....
It's not; read my answer carefully. AG1 used MaxIdleConnsPerHost, which did not work for me. MaxConnsPerHost was introduced in Go 1.11 (released in August 2018), which was not even released when AG1 posted his answer...
Apologies, I read your answer a little too quickly. Nonetheless, thanks for the clarification; it will certainly help future readers.
How can I set different proxies for every request in this way? Is it possible?

It might be possible that the server from which you are downloading the webpages has some type of throttling mechanism which prevents more than a certain number of requests per second (or similar) from a certain IP. Try limiting yourself to maybe 100 requests per second, or adding a sleep between requests. "Connection reset by peer" is basically the server denying you service. (What does "connection reset by peer" mean?)

5 Comments

Considering that everything runs fine when I run it on a server in the same country as the web server, it seemingly does not have such limits (unless they are only imposed on people from other countries, which does not make a lot of sense in my scenario). However, I will look into limiting the number of requests per second.
Generally servers can only handle a certain number of concurrent requests, and you might be past that capacity. A reason it would run fine from the same country is that the request would probably take significantly less time, so the connection isn't used up as long and the server can handle more.
@robbrit I'm guessing that's probably the case. I will have to implement a connection pool I think.
@fgblomqvist: you don't need a connection pool, the http.Transport already does that for you. Just limit the concurrency, and set Transport.MaxIdleConnsPerHost to match your max concurrency.
@JimB Wanna expand on that? I don't understand how setting the MaxIdleConnsPerHost will limit the max open connections to the host? Also, why would I need to limit the concurrency as well? If I start 1000 go routines, all making one GET request each, they will open ~1000 connections, whether they share an HTTP client or not

On macOS, set the parameters:

ulimit -n 6049
sudo sysctl -w kern.ipc.somaxconn=1024

https://github.com/golang/go/issues/20960#issuecomment-465998114

On Linux, set:

ulimit -n 6049
sudo sysctl -w net.core.somaxconn=1024

(Note that ulimit is a shell builtin, so it runs without sudo and only affects the current shell session.)
