When I request this URL:

http://www.w3.org/TR/html5/embedded-content-1.html#the-img-element

the server responds with a 404 (Not Found) HTTP response. However, a few moments later, a different URL is loaded into the browser:

http://www.w3.org/TR/html5/the-img-element.html#the-img-element

It looks as if the server sends a second HTTP response, for a URL that is different from the one originally requested.

How is this "redirect" possible? The first HTTP response was a 404, not a 3xx. As far as I know, a 404 response does not trigger a second HTTP request by the browser. So, does the server just push the second response without any request being made? If so, why does the browser allow that?


See for yourself: open the "Network" tab of Chrome's dev tools and make sure that the "Preserve Log upon Navigation" option is enabled. Then load the first URL above.

  • Uh, wait... Is JavaScript at work here? Well, that would be embarrassing :) Commented Dec 12, 2012 at 18:48
  • They may use JavaScript, but it's not even needed: sending a Refresh: 5;url=new_url.html header would also work. Commented Dec 12, 2012 at 18:56
  • @Wrikken The 404 page that is returned here is generic, so they can't hard-code the specific new URL like that. Instead, they have a huge JavaScript file with all the URL mappings. The 404 page loads this file, which then performs the corresponding redirect. Commented Dec 12, 2012 at 19:01
  • Yep, that's what they do here; I just wanted to say there are more ways to get there. Be sure not to follow their lead in this, and if you encounter this, do proper permanent redirects... Commented Dec 12, 2012 at 19:06
  • @Wrikken Actually, why return a Refresh header when you can return a 301 (Moved Permanently) response? :) They decided to do it with JavaScript, but if they had done it on the server side, I think they would have used a 301 redirect, not a Refresh header. Commented Dec 12, 2012 at 20:28

1 Answer

Let me answer my own question here.

The server does not push anything. The second HTTP request is initiated by JavaScript code that runs as part of the page returned with the 404 response. That page contains:

<body onload="fixBrokenLink(404)">

and then:

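function fixBrokenLink(is404) {
    // Nothing to fix: no fragment in the URL, and this is not the 404 page.
    if (window.location.hash.length < 1 && !is404)
        return;

    // The fragment exists in this document, so the link is not broken.
    var fragid = window.location.hash.slice(1);
    if (fragid && document.getElementById(fragid))
        return;

    // Otherwise load the script that holds the URL mappings and redirects.
    var script = document.createElement('script');
    script.src = 'fragment-links.js';
    document.body.appendChild(script);
}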
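
The contents of fragment-links.js are not shown here, so the following is only a rough sketch of what such a script might do, assuming it maps fragment IDs to the pages that now contain them. The `fragmentLinks` table and the `resolveFragment` helper are hypothetical names; the single mapping entry is taken from the two URLs in the question.

```javascript
// Hypothetical mapping from fragment IDs to the pages that now contain them.
// The real fragment-links.js is not shown in this post; this one entry is
// derived from the two URLs in the question.
var fragmentLinks = {
    'the-img-element': 'the-img-element.html'
};

// Returns the relative URL to redirect to, or null if the fragment is unknown.
function resolveFragment(hash, links) {
    var fragid = hash.replace(/^#/, '');
    if (!fragid || !Object.prototype.hasOwnProperty.call(links, fragid))
        return null;
    return links[fragid] + '#' + fragid;
}

// In a browser, navigate to the mapped page. This navigation is what shows
// up as the second HTTP request in the network log.
if (typeof window !== 'undefined') {
    var target = resolveFragment(window.location.hash, fragmentLinks);
    if (target)
        window.location.replace(target);
}
```

Since the navigation is triggered by script on the client, the browser sees an ordinary request initiated by the page, which is why no 3xx status is involved.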

I love how I've asked the question, made a comment on it, and then answered it, all without any participation from anyone else. :)
