
So I've been toying around with Regular Expressions, and my friend challenged me to write a script that replaced all hex within a string. He gave me a large file mixed with different characters and, of course, some hex strings.

Each occurrence of hex is preceded with \x, so for example: \x55.

I thought it'd be pretty easy, so I tried out this pattern on some online regex tester: /\\x([a-fA-F0-9]{2})/

It worked perfectly.

However, when I throw it into some PHP code, it fails to replace it at all.

Can anyone give me a nudge in the right direction as to where I'm going wrong?

Here's my code:

$toDecode = file_get_contents('hex.txt');
$pattern = "/\\x(\w{2})/";
$replacement = 'OK!';

$decoded = preg_replace($pattern, $replacement, $toDecode);

$fh = fopen('haha.txt', 'w');
fwrite($fh, $decoded);
fclose($fh);
  • Can you please provide a small snippet of your file hex.txt? Commented Apr 23, 2012 at 20:37
  • True hex only includes the characters 0-9 and A-F, while your regular expression will match characters other than these. Commented Apr 23, 2012 at 20:38
  • Good call, nick. I'll change that in a moment. Commented Apr 23, 2012 at 20:39
  • You could improve your recognition - \w is any letter or number, but hex is only 0-9 and A-F. You could replace it with [0-9a-fA-F] Commented Apr 23, 2012 at 20:42

2 Answers

<?php
  // grab the encoded file
  $toDecode = file_get_contents('hex.txt');

  // create a function to convert \x?? to its character equivalent
  function escapedHexToHex($escaped)
  {
    // return 'OK!'; // what you're doing now
    return chr(hexdec($escaped[1]));
  }

  // use preg_replace_callback and hand-off the hex code for re-translation
  $decoded = preg_replace_callback('/\\\\x([a-f0-9]{2})/i','escapedHexToHex', $toDecode);

  // save result(s) back to a file
  file_put_contents('haha.txt', $decoded);

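For reference, see the documentation for preg_replace_callback. Also, don't use \w: it is equivalent to [a-zA-Z0-9_], so it also matches letters and underscores that are not hex digits. Hex is base 16, so you want [a-fA-F0-9], or [a-f0-9] with the i flag, which makes the match case-insensitive.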

Working example, minus the file part.
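If you want to try the callback approach without any files, here is a minimal sketch that uses an anonymous function instead of a named one. The sample string is made up for illustration (it is not from the asker's hex.txt), and it assumes PHP 5.3 or later for closure support.

```php
<?php
// Hypothetical sample input. Single quotes keep \x48 as four literal
// characters (backslash, x, 4, 8) rather than letting PHP decode it.
$toDecode = 'Say \x48\x69 to \x50\x48\x50, but leave \xZZ alone.';

// Four backslashes in the PHP literal become two in the regex,
// which PCRE then reads as one literal backslash.
$decoded = preg_replace_callback(
    '/\\\\x([a-f0-9]{2})/i',
    function ($m) {
        // $m[1] holds the two captured hex digits, e.g. "48"
        return chr(hexdec($m[1]));
    },
    $toDecode
);

echo $decoded, "\n"; // Say Hi to PHP, but leave \xZZ alone.
```

Because the character class only accepts hex digits, \xZZ is left untouched, which \w{2} would not have done.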


3 Comments

wouldn't preg_replace($pattern, chr(hexdec($1)), $toDecode) have the same effect?
@Rob: Supposedly, though I had trouble getting it to work (even while using the e flag)
as did I. very strange. yours does indeed work, but now I want to know why my example above doesn't. Time to toy around with that. Thanks for the help!

Your problem is that you have not escaped your backslashes in the PHP string. It needs to be:

$pattern = "/\\\\x(\\w{2})/";

...or, with single quotes:

$pattern = '/\\\\x(\w{2})/';

Single quotes do not get you out of the escaping: PHP still collapses \\ to \ inside a single-quoted string, so the full four-backslash sequence is required there too.

But \w will match any Perl word character ([a-zA-Z0-9_]), which is not just hex characters. I would use the character class [a-fA-F0-9] instead.
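To see the two layers of escaping at work, a rough sketch like the following prints the regex that PCRE actually receives for each spelling of the pattern. The test subject is an invented string, not the asker's file. As far as I can tell, a regex of \x with no hex digits after it is read by PCRE as a NUL byte, which would explain why the two-backslash versions silently match nothing instead of raising an error.

```php
<?php
// The three spellings discussed in this thread, as PHP string literals.
$patterns = array(
    'double, 2 backslashes' => "/\\x(\w{2})/",
    'single, 2 backslashes' => '/\\x(\w{2})/',
    'single, 4 backslashes' => '/\\\\x(\w{2})/',
);

// Hypothetical input: a literal backslash followed by x55.
$subject = 'abc\x55def';

$results = array();
foreach ($patterns as $label => $pattern) {
    $results[$label] = preg_match($pattern, $subject);
    // Show the string PCRE receives and whether it matched.
    printf("%-22s regex: %-14s matches: %d\n", $label, $pattern, $results[$label]);
}
```

The first two entries produce the same regex, /\x(\w{2})/, and do not match the subject; only the four-backslash version reaches PCRE as /\\x(\w{2})/ and matches.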

1 Comment

All this escaping gives me a headache, but that was indeed the problem. Thanks!
