0

I have to replace xmlns with ns in my incoming XML in order to get SimpleXMLElement's xpath() function working. Most functions do not have a performance problem, but there always seems to be an overhead as the string grows.

E.g. preg_replace on a 2 MB string takes 50 ms to process, even if I limit the number of replacements to 1 and the replacement happens at the very beginning.

If I substr the first few characters and replace only within that part, it is slightly faster, but that is not really what I want.
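The two variants described above can be compared with a rough timing sketch like the following. The document shape, the 256-byte head length, and the variable names are assumptions for illustration, not the asker's real data, and the timings will vary by machine and PHP version.

```php
<?php
// Build a synthetic document of roughly 2 MB (assumed shape, for illustration only).
$xml = '<root xmlns="http://example.com/ns">'
     . str_repeat('<item>x</item>', 150000)
     . '</root>';

// Variant 1: regex replace over the whole string, limited to one replacement.
$t0 = microtime(true);
$a  = preg_replace('/xmlns=/', 'ns=', $xml, 1);
$t1 = microtime(true);

// Variant 2: replace only inside a short head, then append the untouched tail.
// Assumption: the namespace declaration sits within the first 256 bytes.
$headLen = 256;
$b  = str_replace('xmlns=', 'ns=', substr($xml, 0, $headLen)) . substr($xml, $headLen);
$t2 = microtime(true);

printf("preg_replace (limit 1): %.2f ms\n", ($t1 - $t0) * 1000);
printf("head-only replace:      %.2f ms\n", ($t2 - $t1) * 1000);
```

Both variants still build a new string of nearly the full length, because removing three bytes shifts everything after them.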

Is there a PHP function that would perform better for my problem? If not, could a simple PHP extension help, one that just does replace => SimpleXMLElement in C?

4
  • I'm not clear why SimpleXML can't handle xmlns, but have you tried str_replace()? Commented May 24, 2011 at 8:33
  • If you want to query namespaced XML with SimpleXml's xpath method, you have to register the namespace first. Changing xmlns to ns will result in invalid XML (which SimpleXML will choke on). Commented May 24, 2011 at 8:38
  • Is this helpful for you? stackoverflow.com/questions/737522/… Commented May 24, 2011 at 8:41
  • If you replace 3 characters by 0 characters (deleting the 3 characters) there will necessarily have to be copying around. Just for fun, try replacing the 3 characters "xml" with 3 spaces (or maybe "<xml" with 3 spaces followed by 1 "<") :) Commented May 24, 2011 at 19:22
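The comment above about registering the namespace can be sketched as follows. The Atom namespace URI, the sample document, and the prefix `a` are assumptions chosen for illustration; the prefix used in the XPath query only has to match the one registered, not anything in the XML itself.

```php
<?php
// Sample document with a default namespace (assumed data, for illustration only).
$xml = '<feed xmlns="http://www.w3.org/2005/Atom">'
     . '<entry><title>Hello</title></entry>'
     . '</feed>';

$doc = new SimpleXMLElement($xml);

// Bind an arbitrary prefix to the namespace URI so XPath can address the elements.
$doc->registerXPathNamespace('a', 'http://www.w3.org/2005/Atom');

// Every element step in the query uses the registered prefix.
$titles = $doc->xpath('//a:entry/a:title');

echo (string) $titles[0], PHP_EOL;
```

With this approach the XML string is left untouched, so no replacement on the 2 MB input is needed at all.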

4 Answers

2

If you know exactly where the offending "x", "m" and "l" are, you can just use something like $xml[$x_pos] = ' '; $xml[$m_pos] = ' '; $xml[$l_pos] = ' ' to transform them into spaces. Or transform them into ns___ (where _ = space).
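A minimal sketch of this idea, assuming the declaration appears as the literal text ` xmlns=` and that its position is found with strpos. The function name `blankOutDefaultXmlns` is hypothetical. Because the string length does not change, nothing after the overwritten bytes has to be shifted.

```php
<?php
// Hypothetical helper: turn the first ' xmlns=' into '    ns=' by overwriting bytes.
function blankOutDefaultXmlns(string $xml): string
{
    $pos = strpos($xml, ' xmlns=');
    if ($pos === false) {
        return $xml; // nothing to change
    }
    // Overwrite "x", "m" and "l" with spaces; the length stays the same.
    $xml[$pos + 1] = ' ';
    $xml[$pos + 2] = ' ';
    $xml[$pos + 3] = ' ';
    return $xml;
}

$in  = '<root xmlns="http://example.com/ns"><item>1</item></root>';
$out = blankOutDefaultXmlns($in);

// The former namespace declaration is now an ordinary attribute named "ns",
// so un-prefixed XPath queries match.
$items = (new SimpleXMLElement($out))->xpath('//item');
echo count($items), PHP_EOL;
```

Extra whitespace between a tag name and its attributes is allowed in XML, which is why padding with spaces keeps the document parseable.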


1 Comment

yeah, that is what I do now. It's three times faster than preg_replace or str_replace.
0

You're always going to get some overhead when doing this: you're dealing with a char array and trying to replace multiple matching elements of that array (i.e. words).

50ms is not much of an overhead, unless (as I suspect) you're trying to do this in a loop?

1 Comment

Not in a loop, but I have to parse 30MB of XML on some pages.
0

50ms sounds pretty reasonable to me, for something like this. The requirement itself smells of something being wrong.

Is there any particular reason that you're using regular expressions? Why do people keep jumping to the overkill regex solution?

There is a bog-standard string replace function called str_replace that may do what you want in a fraction of the time (though whether this is right for you depends on how complex your search/replace is).
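One caveat is that str_replace has no parameter to limit the number of replacements (its fourth argument only reports the count). A sketch of a single, non-regex replacement using strpos and substr_replace is shown below; the helper name `replaceFirst` is hypothetical, and whether it is faster than preg_replace on a given input would need to be measured.

```php
<?php
// Hypothetical helper: replace only the first occurrence of $search, without regex.
function replaceFirst(string $haystack, string $search, string $replace): string
{
    $pos = strpos($haystack, $search);
    if ($pos === false) {
        return $haystack; // no match, return the input unchanged
    }
    // Splice the replacement in at the match position.
    return substr_replace($haystack, $replace, $pos, strlen($search));
}

$xml    = '<a xmlns="u"><b xmlns="v"/></a>';
$result = replaceFirst($xml, 'xmlns=', 'ns=');
echo $result, PHP_EOL;
```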


0

Looking at the PHP source, for example here: http://svn.php.net/repository/php/php-src/branches/PHP_5_2/ext/standard/string.c

I don't see any copies, but I'm no expert in C. On the other hand, there are many convert-to-string calls, which at first sight could copy values. If they do copy values, then we are in trouble here.

Only if we are in trouble: try to reinvent the str_replace wheel with character-by-character string processing. For example, take the string $somestring = "somevalue". In PHP we can access its characters by index: echo $somestring{0} gives us "s" and echo $somestring{2} gives us "m". I'm not sure about this approach, but it is possible if the official implementations don't use references, as they should.
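The character access described above looks like this in current PHP. The curly-brace offset syntax was removed in PHP 8.0, so this sketch uses square brackets, which behave the same way for reading and also allow writing a single byte in place.

```php
<?php
$somestring = "somevalue";

// Read single characters by zero-based offset.
$first = $somestring[0];
$third = $somestring[2];
echo $first, $third, PHP_EOL;

// Overwrite one character; the string keeps its length.
$somestring[0] = 'S';
echo $somestring, PHP_EOL;
```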

