1
$curl = curl_init("http://example.com/");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($curl, CURLOPT_HTTPHEADER, array("Host: example.com",
                                                "Connection: keep-alive",
                                                "Upgrade-Insecure-Requests: 1",
                                                "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36",
                                                "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
                                                "Accept-Language: en-US,en;q=0.8"));
curl_setopt($curl, CURLOPT_VERBOSE, TRUE);
$result = curl_exec($curl);
echo $result;

The response is

<html><title>You are being redirected...</title>
<noscript>Javascript is required. Please enable javascript before you are allowed to see this page.</noscript>

I'm reusing exactly the headers that the browser sends to the site.

How can a site know this is not a real browser? The error occurs when loading the main page so it's not like there is any authentication going on.

In fact, JavaScript is not even needed for the majority of the page's content. I can see it's loaded as standard HTML, but for some reason, if JavaScript is not enabled, the entire page doesn't load.

Any ideas? (sorry, can't share real site name).

16
  • I would presume the cookie jar is empty? Commented Sep 13, 2016 at 2:49
  • Yes. I've tested with and without the cookiejar line. No difference. Commented Sep 13, 2016 at 2:51
  • What happens if you visit that page in a browser with JS disabled? The thing is, I cannot replicate the issue, which is more or less required for an MCVE, but that's understandable. Still, the issue seems very intriguing. Commented Sep 13, 2016 at 2:55
  • I disabled JS in Chrome and page loaded with just some images not loading properly. Commented Sep 13, 2016 at 3:00
  • 1
    @Xorifelse that cannot be true: google chrome sends more headers than OP does in their code. Commented Sep 13, 2016 at 3:09

3 Answers

1

To my knowledge, a minimum of two requests is needed to determine whether a client has JavaScript enabled. Since this is cURL, and the request can be set up to look like an "original" browser request, the response would not make any sense unless the website checks request headers like a hound dog.

As @zerkms mentioned, Chrome does send more headers than your cURL request:

Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate, sdch
Accept-Language:en-US,en;q=0.8,nl;q=0.6
Cache-Control:max-age=0
Connection:keep-alive
Cookie:cookiedata
DNT:1
Host:example.com
Upgrade-Insecure-Requests:1
User-Agent:Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.76 Mobile Safari/537.36

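There are a couple of mismatches. First, Host:example.com does not have a space after the colon. Secondly, cURL sets the Host header itself from the URL passed to curl_init(), so you don't need to send it manually. I also don't see DNT, Cache-Control, or Accept-Encoding in your request, and the Accept-Language value differs.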
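As a minimal sketch of closing that gap, the request could be set up roughly like this. The function names `browserLikeHeaders()` and `buildBrowserLikeHandle()` are made up for illustration, and whether any of this satisfies the site in question is untested. Setting `CURLOPT_ENCODING` to an empty string lets cURL send an Accept-Encoding header for the encodings it can actually decode, rather than copying Chrome's value by hand.

```php
<?php
// Hypothetical helper: the header set mirrors the Chrome request shown above.
function browserLikeHeaders(): array
{
    return [
        // No Host header: cURL derives it from the URL.
        'Connection: keep-alive',
        'Cache-Control: max-age=0',
        'DNT: 1',
        'Upgrade-Insecure-Requests: 1',
        'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36',
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'Accept-Language: en-US,en;q=0.8',
    ];
}

// Hypothetical helper: builds a cURL handle but does not execute it.
function buildBrowserLikeHandle(string $url)
{
    $curl = curl_init($url);
    curl_setopt_array($curl, [
        CURLOPT_RETURNTRANSFER => true,
        // Empty string: advertise and decode every encoding cURL supports.
        CURLOPT_ENCODING       => '',
        // COOKIEJAR only writes cookies on close; COOKIEFILE reads them back.
        CURLOPT_COOKIEJAR      => 'cookie.txt',
        CURLOPT_COOKIEFILE     => 'cookie.txt',
        CURLOPT_HTTPHEADER     => browserLikeHeaders(),
    ]);
    return $curl;
}

// Usage (performs a network request, so it is left commented out):
// $curl = buildBrowserLikeHandle('http://example.com/');
// echo curl_exec($curl);
```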

In theory, a server cannot detect client settings, but it can very well inspect every header.

If I were building this kind of software, for example, I would accumulate enough data to know what normal browser headers look like. If some of that data were missing from a request, I could tell whether or not it came from a real user.
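To make that idea concrete, here is a rough sketch of such a server-side check. It is purely an assumption about how a site might do it, not what the site in the question does: the function names, the list of expected headers, and the threshold are all invented for illustration.

```php
<?php
/**
 * Return the expected browser headers that are absent from a request.
 * Hypothetical heuristic; the baseline list below is an assumption.
 *
 * @param array $requestHeaders header name => value
 * @return string[] names of the missing headers
 */
function missingBrowserHeaders(array $requestHeaders): array
{
    $expected = ['Accept', 'Accept-Encoding', 'Accept-Language', 'User-Agent'];

    // Header names are case-insensitive, so compare in lower case.
    $present = array_map('strtolower', array_keys($requestHeaders));

    $missing = [];
    foreach ($expected as $name) {
        if (!in_array(strtolower($name), $present, true)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

/** Treat a request as browser-like if at most $maxMissing headers are absent. */
function looksLikeBrowser(array $requestHeaders, int $maxMissing = 0): bool
{
    return count(missingBrowserHeaders($requestHeaders)) <= $maxMissing;
}

// The headers from the cURL request in the question (values shortened).
$curlHeaders = [
    'Host'                      => 'example.com',
    'Connection'                => 'keep-alive',
    'Upgrade-Insecure-Requests' => '1',
    'User-Agent'                => 'Mozilla/5.0 ...',
    'Accept'                    => 'text/html',
    'Accept-Language'           => 'en-US,en;q=0.8',
];

// Reports Accept-Encoding as the one missing header.
print_r(missingBrowserHeaders($curlHeaders));
```

A real detector would probably also look at header order and values, but even this crude check would flag the request from the question.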


7 Comments

I'm not sure what to tell you. What I used in cURL are the headers I see being sent. And, as @zerkms suggested, I tried 'copy as curl' and I still get the error.
@user2029890 we are not sure what to tell you either :-) What you said just does not sound real.
@user2029890 Well, you're being redirected. Does this happen with a normal browser request as well? Perhaps cURL picks up the content of the page along with the header that redirects to the real index. Alternatively, your IP could have been banned/flagged as a bot by the server, with this as the default response, which would explain why your browser requests work.
@Xorifelse It does not happen with a normal request, and I doubt it's blocking the IP, as I tried 'copy as curl' on machines with IPs that would not have been used before. Furthermore, there are elements of the site I can access via cURL, but the main page (and some other pages) I cannot.
@user2029890 Does adding curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true); and also removing the Host: example.com header make any difference?
0

The site likely can't actually tell that it's not a browser making the request. The HTML <noscript> tag marks content that should be shown if and only if JavaScript is disabled or unsupported, so any client that doesn't run scripts will see that message. The page seems not to load because the remote server appears to have sent you a meta-refresh/redirect page; the solution, as I see it, is to send the same request to wherever you're being redirected.
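If the redirect really is a meta refresh, its target could be pulled out of the response with something like the sketch below. The helper name `extractMetaRefreshUrl()` is hypothetical, the pattern assumes the http-equiv attribute comes before content and that the URL is not wrapped in its own quotes, and it does nothing for a redirect performed by JavaScript.

```php
<?php
/**
 * Extract the target URL of a <meta http-equiv="refresh"> tag, if present.
 * Hypothetical helper; returns null when no such tag is found.
 */
function extractMetaRefreshUrl(string $html): ?string
{
    $pattern = '/<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*content=["\']\s*\d+\s*;\s*url=([^"\'>]+)/i';
    if (preg_match($pattern, $html, $m)) {
        // Decode entities such as &amp; that may appear inside the attribute.
        return trim(html_entity_decode($m[1]));
    }
    return null;
}

// Usage with the cURL response from the question:
// $next = extractMetaRefreshUrl($result);
// if ($next !== null) { /* repeat the request against $next */ }
```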


Aside from that, however, there is in fact a way for a server to tell what's sending a request: the User-Agent header. This header is typically hardcoded in most browsers and sent with every request; it contains information about what the client is. It's not completely reliable (it can be spoofed, which is what you're doing), but at least it's something.

2 Comments

You realize the OP made a CURL request with existing browser headers, spoofing the request? The question is, how could it detect that.
Yeah, I was mostly just making the point of, here is how the server would actually do this.
0

I ran into the same problem years later. Some old websites' security s****d minds create bogus security by hiding the PHP form submission in complicated JS files. The URL displayed in the browser/form is not the URL you actually post to. The real URL is hidden in the JS files.

Open the page source and look into the JS files.
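As a starting point, the script files a page references can be listed from the HTML that cURL returns. This is only a sketch: `listScriptSources()` is a made-up helper, it relies on PHP's DOM extension, and finding the real form URL inside those files is still manual work.

```php
<?php
/**
 * Return the src attribute of every external <script> tag in the HTML.
 * Hypothetical helper; inline scripts (no src) are skipped.
 *
 * @return string[]
 */
function listScriptSources(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; suppress the parser warnings.
    @$doc->loadHTML($html);

    $sources = [];
    foreach ($doc->getElementsByTagName('script') as $script) {
        $src = $script->getAttribute('src');
        if ($src !== '') {
            $sources[] = $src;
        }
    }
    return $sources;
}

// Usage: print_r(listScriptSources($result));
```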
