42

I'm writing a script that uploads a file to a CGI script that expects a multipart request, such as the one sent by a form on an HTML page. The boundary is a unique token that delimits the file contents in the request body. Here's an example body:

--BOUNDARY
Content-Disposition: form-data; name="paramname"; filename="foo.txt"
Content-Type: text/plain

... file contents here ...
--BOUNDARY--

The boundary cannot be present in the file contents, for obvious reasons.

What should I do in order to create a unique boundary? Should I generate a random string, check whether it occurs in the file contents, and if it does, generate a new one, rinse and repeat, until I have a unique string? Or would a "pretty random token" (say, a combination of timestamp, process ID, etc.) be enough?

3 Comments

  • What programming language do you use? Usually such a thing is handled by a library. Commented Jan 15, 2010 at 12:03
  • I'm using Ruby. It would have to be in the stdlib, though; I can't use gems, since the script should be runnable on any system with Ruby installed, without having to install gems. Commented Jan 15, 2010 at 12:32
  • BOUNDARY may be fine, but be sure to use \r\n (CRLF line endings), because with just \n the request fails with a "Header section has more than 10240 bytes" error. Commented Jun 18, 2020 at 14:18

4 Answers

57

If you use something random enough, like a GUID, there shouldn't be any need to hunt through the payload to check for an occurrence of the boundary. Something like:

----=NextPart_3676416B-9AD6-440C-B3C8-FC66DDC7DB45
Header:....

Payload
----=NextPart_3676416B-9AD6-440C-B3C8-FC66DDC7DB45--
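
Since the asker is restricted to Ruby's standard library, here is a minimal sketch of the GUID approach using `SecureRandom.uuid`. The helper names `generate_boundary` and `multipart_body` are hypothetical, chosen for illustration only; the `NextPart_` prefix simply mirrors the example above, and the body layout follows the example in the question, with \r\n line endings.

```ruby
require "securerandom"

# Hypothetical helper: builds a boundary from a random UUID.
# SecureRandom ships with Ruby, so no gems are needed.
def generate_boundary
  "--=NextPart_#{SecureRandom.uuid.upcase}"
end

# Hypothetical helper: wraps one file in a multipart/form-data body.
# Each delimiter line is "--" plus the boundary; the closing delimiter
# adds a trailing "--". Lines are joined with CRLF.
def multipart_body(boundary, field_name, filename, contents)
  [
    "--#{boundary}",
    %(Content-Disposition: form-data; name="#{field_name}"; filename="#{filename}"),
    "Content-Type: text/plain",
    "",
    contents,
    "--#{boundary}--",
    ""
  ].join("\r\n")
end

boundary = generate_boundary
body = multipart_body(boundary, "paramname", "foo.txt", "... file contents here ...")
# The same boundary goes into the request header; it is quoted here
# because it contains "=".
content_type = %(multipart/form-data; boundary="#{boundary}")
```

The `body` and `content_type` strings can then be handed to whatever HTTP client the script uses.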

4 Comments

Thanks! Your answer is just as good as the tagged answer, but he needed the rep more than you did ;)
This answer is better, since a GUID is specifically engineered to be "globally unique". When you can get a GUID from one line of code, why try to come up with your own somewhat random string?
This answer should mention uuidgen
It's never too late! 9 years later, your answer is now the tagged one. Tnx :)
14

For Java guys:

// Requires: import java.util.Random;
protected String generateBoundary() {
    StringBuilder buffer = new StringBuilder();
    Random rand = new Random();
    int count = rand.nextInt(11) + 30; // a random size from 30 to 40
    for (int i = 0; i < count; i++) {
        buffer.append(MULTIPART_CHARS[rand.nextInt(MULTIPART_CHARS.length)]);
    }
    return buffer.toString();
}

// Characters the boundary is drawn from.
private final static char[] MULTIPART_CHARS =
        "-_1234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
                .toCharArray();

Reference URL: https://github.com/apache/httpcomponents-client/blob/master/httpclient5/src/main/java/org/apache/hc/client5/http/entity/mime/MultipartEntityBuilder.java#L234

1

And for the Swift people (to balance the Java):

func createBoundaryString() -> String {
    var str = ""
    let length = arc4random_uniform(11) + 30
    let charSet = [Character]("-_1234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

    for _ in 0..<length {
        str.append(charSet[Int(arc4random_uniform(UInt32(charSet.count)))])
    }
    return str
}

0

If you are feeling paranoid, you can generate a random boundary and search for it in the data to be sent; if it is found, append a random character (or generate a new boundary) and repeat. In my experience, though, an arbitrary non-dictionary string of 10 or so characters practically never occurs in the content by accident, so picking something like ---BOUNDARY---BOUNDARY---BOUNDARY--- is perfectly sufficient.
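
The check-and-retry loop from the first sentence could look roughly like this in Ruby (stdlib only). This is a sketch under assumptions: `boundary_not_in` is a hypothetical name, and drawing candidates from `SecureRandom.hex(16)` is just one possible choice of random token. The generator is a parameter so a different token source can be swapped in.

```ruby
require "securerandom"

# Hypothetical helper: keeps drawing candidate boundaries until one
# does not occur anywhere in the payload.
def boundary_not_in(contents, generator = -> { SecureRandom.hex(16) })
  raw = contents.b # compare as raw bytes so binary files work too
  loop do
    candidate = generator.call
    return candidate unless raw.include?(candidate)
  end
end

boundary = boundary_not_in("... file contents here ...")
```

With a 32-character random token the loop will almost always finish on the first pass; the retry only matters for fixed or short boundaries.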

9 Comments

No, it is not sufficient. Because you won't be able to send your program source code (or this comment) using your program.
@stepancheg: It seems you are feeling paranoid, in this case use the solution from the first paragraph of my answer. If you are mentally healthy though, use Content-Encoding: gzip and stop worrying about users out there trying to get you.
It is the responsibility of the programmer to avoid foreseeable future errors.
@BornToCode: If the user purposefully tries to make the application fail, you can't stop them; you can only limit the impact to that single user. The chance that random compressed content accidentally contains one specific string of 39 characters is around 1:2^47, which means it's well within the limits of acceptability (a UUID is no better, and it is deemed sufficient). One would need to purposefully construct content that compresses to the boundary code, and then we can just reject it; it's not valid content but a malicious attack.
I think that many users will copy the boundary from this answer, as well as many other boundaries found on Stack Overflow and in other tutorials. Anyway, this kind of vulnerability is not so dangerous :D