0

Being a Delphi novice, I have written a simple function, which given a string and a character, returns the indexes of the character's occurrence in the string.

However I am not sure if my implementation is optimal (because I am constantly resizing the dynamic array). Is there a better way to implement this:

type Indexes = array of Integer;

function GetCharacterIndexesInString(inputstring:string; c:char) : Indexes;
  var
  s : char;
  count : integer;
  position: integer;

  begin
    count := 0;
    position := 0;
    SetLength(result, 0);
    for  s in inputstring do
    begin
      if s = c then
      begin
        count := count+1;
        SetLength(result, count);
        result[count-1] := position;
      end;
      position := position+1;
    end;
  end;
2
  • 1
    Probably depends on the typical input that you provide. I doubt one algorithm will be optimal for all possible input. Commented May 29, 2017 at 17:29
  • Another way would be to use TList<Integer> instead of dynamic array. There it is implemented already, we don't care about length and instead of 'result[count-1] := position' write 'Result.Add(position)'. But there is caveat: result of this function must be freed at some time. Commented May 30, 2017 at 12:37

1 Answer 1

7

There is classic algorithm for asymptotically fast array resizing when the final size is unknown. Instead of resizing it by 1, we double its size each time we're out of space. After we're done, set the actual size of array.

Here is the code:

type Indexes = array of Integer;

function GetCharacterIndexesInString(inputstring:string; c:char) : Indexes;
  var
  s : char;
  count : integer;
  position: integer;
  capacity: Integer; //TList uses same method internally :)

  const InitCapacity: Integer = 16;
  begin
    count := 0;
    position := 0;
    capacity := InitCapacity;
    SetLength(result, InitCapacity);
    for  s in inputstring do
    begin
      if s = c then
      begin
        inc(count); //adds 1 to count, delphi analog of ++
        if count > capacity then begin
          capacity := capacity * 2;
          SetLength(result, capacity);
        end;
        result[count-1] := position;
      end;
      inc(position);
    end;
    SetLength(result, count);
  end;

UPD. Modified code according to David Heffernan's suggestion, now it starts with length of 16, which should speed it up a little. Also, it's possible to 'tune' InitCapacity, code stays working with any positive number here. For example, you can gather statistics: average length of resulting array and set InitCapacity a little bigger than this average length.

Sign up to request clarification or add additional context in comments.

5 Comments

This is probably wasteful in early allocations because there will be a minimum block size. I'd likely start with 16 or 32.
Or better still. Read into a fixed length array to begin with. If it fills up, switch to dynamic array. If not, you can do the entire thing with one heap allocation.
FWIW, there have been complaints about TList<T>, which does indeed double the array size when it must be resized. People rather liked what TList (non-generic) does, i.e. add only half of the current size. They found doubling too wasteful.
Quite possibly, TList is actually more wasteful, performing more allocations, and creating blocks that cannot be re-used. You may end up with, for instance, blocks of size 256, 384 and 512, rather than just 256 and 512. The TList approach is only less wasteful if these 1.5 times blocks can re-used. And most people when they discuss these issues don't understand enough about how memory allocation works at all levels (system and library) to be able to construct valid arguments.
@DavidHeffernan I updated an answer to start with array size of 16.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.