2

Problem

I wrote 2 programs, one in Delphi and one in Java, for string concatenation and I noticed a much faster string concatenation in Delphi compared to Java.

Java

String str = new String();
long t0 = System.nanoTime();
for (int i = 0; i < 50000; i++) 
    str += "abc";
long t1 = System.nanoTime();
System.out.println("String + String needed " + (t1 - t0) / 1000000 + "ms");

Delphi

Stopwatch.Start;
for i := 1 to 50000 do
  str := str + 'abc';
Stopwatch.Stop;
ShowMessage('Time in ms: ' + IntToStr(Stopwatch.ElapsedMilliseconds));

Question

Both measure the time in milliseconds but the Delphi program is much faster with 1ms vs. Javas 2 seconds. Why is string concatenation so much faster in Delphi?

Edit: Looking back at this question with more experience I should have come to the conclusion that the main difference comes from Delphi being compiled and Java being compiled and then run in the JVM.

10
  • 2
    Because it is a different language perhaps? Commented Oct 22, 2016 at 15:07
  • 2
    I'd say: because delphi is compiled in native code and java isn't (but still there's JIT so I may be wrong). But maybe you should show us both codes. Commented Oct 22, 2016 at 15:07
  • How faster is "faster"? Are you talking about it being 1% or maybe 5% faster, or was it more like ten times faster? Commented Oct 22, 2016 at 15:08
  • It depends, what does your code look like? Perhaps your Java code is not working in the most optimal way? Or perhaps because Delphi is 100% native as already mentioned? Commented Oct 22, 2016 at 15:12
  • 3
    maybe in java you used String instead of StringBuilder Commented Oct 22, 2016 at 15:24

2 Answers 2

1

TLDR

There may be other factors, but certainly a big contributor is likely to be Delphi's default memory manager. It's designed to be a little wasteful of space in order to reduce how often memory is reallocated.

Considering memory manager overhead

When you have a straight-forward memory manager (you might even call it 'naive'), your loop concatenating strings would actually be more like:

//pseudo-code
for I := 1 to 50000 do
begin
  if CanReallocInPlace(Str) then
    //Great when True; but this might not always be possible.
    ReallocMem(Str, Length(Str) + Length(Txt))
  else
  begin
    AllocMem(NewStr, Length(Str) + Length(Txt))
    Copy(Str, NewStr, Length(Str))
    FreeMem(Str)
  end;
  Copy(Txt, NewStr[Length(NewStr)-Length(Txt)], Length(Txt))
end;

Notice that on every iteration you increase the allocation. And if you're unlucky, you very often have to:

  • Allocate memory in a new location
  • Copy the existing 'string so far'
  • Finally release the old string

Delphi (and FastMM)

However, Delphi has switched from the default memory manager used in it's early days to a previously 3rd party one (FastMM) that's designed run faster primarily by:

  • (1) Using a sub-allocator i.e. getting memory from the OS a 'large' page at a time.
  • Then performing allocations from the page until it runs out.
  • And only then getting another page from the OS.
  • (2) Aggressively allocating more memory than requested (anticipating small growth).
  • Then it becomes more likely the a slightly larger request can be reallocated in-place.
  • These techniques can thought it's not guaranteed increase performance.
  • But it definitely does waste space. (And with unlucky fragmentation, the wastage can be quite severe.)

Conclusion

Certainly the simple app you wrote to demonstrate the performance greatly benefits from the new memory manager. You run through a loop that incrementally reallocates the string on every iteration. Hopefully with as many in-place allocations as possible.

You could attempt to circumvent some of FastMM's performance improvements by forcing additional allocations in the loop. (Though sub-allocation of pages would still be in effect.)
So simplest would be to try an older Delphi compiler (such as D5) to demonstrate the point.

FWIW: String Builders

You said you "don't want to use the String Builder". However, I'd like to point out that a string builder obtains similar benefits. Specifically (if implemented as intended): a string builder wouldn't need to reallocate the substrings all the time. When it comes time to finally build the string; the correct amount of memory can be allocated in a single step, and all portions of the 'built string' copied to where they belong.

Sign up to request clarification or add additional context in comments.

4 Comments

Wow thanks for that answer! I know that a StringBuilder provides benefits and I would probably use it, but I wanted to know why the why "string + string" was much faster in delphi. I just needed this information for a presentation at universaty.
regarding String Builders - well, yes, in theory that is how it is. TList<String> would be even more efficient in O(SQRT(N)) scaling thing. But reality is that string manipulations were ubiquitous and in Delphi prior to XE2 or after XE6 the compiler generates quite a tight code, combined with FastMM's tuning towards that string manipulations usecases. I guess you read those article where they compared different ways of building strings and different heap managers. With FastMM the TStringBuilder is significantly slower than concat, while with others it might work somewhat ok
@Arioch'The Yes, I've read articles and examined implementations. Which is why I qualified that section with the phrase "if implemented as intended". The first Delphi implementation I saw was shockingly disappointing. ... It had unnecessary complexity (seemingly for feeble micro-optimisation) yet demonstrated a complete lack of understanding: that the purpose of the class is to encapsulate a different (less intuitive) algorithm that allocates memory more efficiently. The net effect was that using the string builder was actually slightly slower yet less readable than alternatives.
Which algo then? I guess it should be some "double memory footprint" with significantly large initial buffer. Or is there something else?
1

In Java (and C#) strings are immutable objects. That means that if you have:

string s = "String 1";

then the compiler allocates memory for this string. Haven then

s = s + " String 2"

gives us "String 1 String 2" as expected but because of the immutability of the strings, a new string was allocated, with the exactly size to contain "String 1 String 2" and the content of both strings is copied to the new location. Then the original strings are deleted by the garbage collector. In Delphi a string is more "copy-on-write" and reference counted, which is much faster.

C# and Java have the class StringBuilder with behaves a lot like Delphi strings and are quite faster when modifying and manipulating strings.

4 Comments

When concatenating a new string is created. So this answer is no good.
The RTL internal for concatention (UStrCat - in XE4 at least) uses destination expansion. That is, if possible (only one reference to the string and memory allows) a realloc, rather than an alloc+copy. Obviously an alloc+copy (and potentially free) may still be necessary in some cases. Concatenation may create a new string, but will try not to. This is logical behaviour for a copy-on-write model; I expect this has been part of Delphi string handling since the introduction of ref. counted strings.
PS it is highly recommended to use TStringBuilder for the code which concatenates strings inside a loop even for Delphi
@MaxAbramovich that depends of heap manager used. There were articles comparing different patterns of building strings against different MMs. Quite often TStringBuilder performance was less than impressive.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.