C++ <algorithm> implementation explained

Question

When I want to know how an algorithm in the C++ Standard Library could be implemented, I always look at http://en.cppreference.com/w/cpp/algorithm, which is a great source. But sometimes I don't understand certain implementation details, and I need an explanation of why something is done a particular way. For example, in the implementation of std::copy_n, why is the first assignment made outside the loop, so that the loop starts at 1?

template< class InputIt, class Size, class OutputIt>
OutputIt copy_n(InputIt first, Size count, OutputIt result)
{
    if (count > 0) {
        *result++ = *first;
        for (Size i = 1; i < count; ++i) {
            *result++ = *++first;
        }
    }
    return result;
}

Additionally: Do you know a site where possible algorithm implementations are explained?

This possible implementation is based on the one found in libc++.
– Cubbi, Commented Jul 15, 2013 at 21:02
Also, a general note: possible implementations on cppreference are only provided when a functionally correct, but generally far from optimal, implementation can be written in just a few lines of code: std::shuffle is pushing it, and std::stable_sort is not going to happen. Do look at real implementations in C++ standard libraries such as LLVM libc++, GNU libstdc++, or what-have-you, if you're interested in what really happens (e.g. when std::copy is compiled as std::memmove).
– Cubbi, Commented Jul 15, 2013 at 22:33
@Cubbi: Before I asked this question, I looked at the MSVC implementation. I had to comb through a lot of function calls to reach the actual implementation. That is why I like cppreference for its "just a few lines of code", which are nevertheless correct, as the copy_n example showed. The algorithm implementations give me inspiration for writing my own generalized, iterator-based algorithms. Thank you for your valuable comments on this question and on the answers; I think it was you who pointed out the bug in the naive implementation of copy_n.
– Christian Ammer, Commented Jul 16, 2013 at 7:34

Yakk - Adam Nevraumont · Accepted Answer · 2013-07-16 12:30:19Z

21

Compare it with the naive implementation:

template< class InputIt, class Size, class OutputIt>
OutputIt copy_n(InputIt first, Size count, OutputIt result)
{
  for (Size i = 0; i < count; ++i) {
    *result++ = *first++;
  }
  return result;
}

This version does one more increment of first!

count==0: both do zero increments of first.
count==1: their version does zero increments of first. The above version does one.
count==2: their version does one increment of first. The above version does two.

One possible reason is to support iterators that are dereferenceable but not incrementable. At least in the STL days, there was such a distinction. I am not sure whether input iterators have this property today.

Here is a bug that seems to occur if you use the naive implementation, and here is some documentation that claims "The actual read operation is performed when the iterator is incremented, not when it is dereferenced."

I have not yet tracked down the chapter-and-verse for the existence of dereferenceable, non-incrementable input iterators. Apparently the standard details how many times copy_n dereferences the input/output iterators, but does not detail how many times it increments the input iterator.

The naive implementation increments the input iterator one more time than the non-naive implementation. If we have a single-pass input iterator that reads on ++ and the stream has no further data available, copy_n could block needlessly, trying to read data beyond the count elements it was asked to copy.
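To make the increment counts above concrete, here is a minimal sketch that wraps a pointer in a counting iterator and runs both versions. CountingIt, copy_n_naive, copy_n_careful and count_increments are hypothetical names introduced only for this illustration; they are not part of any standard library.

```cpp
#include <cassert>

// Hypothetical helper: a pointer wrapper that counts its own increments.
struct CountingIt {
    const int* p;
    int* increments;  // counter shared by all copies of the iterator

    const int& operator*() const { return *p; }
    CountingIt& operator++() { ++p; ++*increments; return *this; }
    CountingIt operator++(int) { CountingIt old = *this; ++*this; return old; }
};

// The naive loop from the answer: increments first once per element.
template <class InputIt, class Size, class OutputIt>
OutputIt copy_n_naive(InputIt first, Size count, OutputIt result) {
    for (Size i = 0; i < count; ++i) {
        *result++ = *first++;
    }
    return result;
}

// The cppreference-style version from the question: count - 1 increments.
template <class InputIt, class Size, class OutputIt>
OutputIt copy_n_careful(InputIt first, Size count, OutputIt result) {
    if (count > 0) {
        *result++ = *first;
        for (Size i = 1; i < count; ++i) {
            *result++ = *++first;
        }
    }
    return result;
}

// Copies `count` ints (at most 4) and returns how often the source
// iterator was incremented.
int count_increments(bool naive, int count) {
    assert(count >= 0 && count <= 4);
    const int src[4] = {10, 20, 30, 40};
    int dst[4] = {};
    int increments = 0;
    CountingIt it{src, &increments};
    if (naive) {
        copy_n_naive(it, count, dst);
    } else {
        copy_n_careful(it, count, dst);
    }
    return increments;
}
```

Calling count_increments(true, 2) yields 2, while count_increments(false, 2) yields 1, matching the table above.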

edited Jul 16, 2013 at 12:30

answered Jul 15, 2013 at 20:42

Yakk - Adam Nevraumont


15 Comments

Yakk - Adam Nevraumont Over a year ago

@RanEldan Who has an extra condition? Where?

Yakk - Adam Nevraumont Over a year ago

@JimBuck Give an example input, and count the if-tests run, that shows an extra if-test? I'm not seeing it. Both do 1 more if-test than the number of items processed, exactly.

Cubbi Over a year ago

+1, but it may help to point out specifically that the naive implementation is wrong because copy_n has to support input iterators

Cubbi Over a year ago

@MooingDuck gcc.gnu.org/bugzilla/show_bug.cgi?id=50119 lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20110221/…

David Rodríguez - dribeas Over a year ago

@ChristianAmmer: Each call to postfix operator++ requires the creation of a copy of the iterator in the current state before performing the increment.


David Rodríguez - dribeas · Accepted Answer · 2013-07-15 20:59:16Z

13

That is just an implementation. The implementation in GCC 4.4 is different (and conceptually simpler):

template<typename _InputIterator, typename _Size, typename _OutputIterator>
_OutputIterator
copy_n(_InputIterator __first, _Size __n,
       _OutputIterator __result)
{
  for (; __n > 0; --__n)
    {
      *__result = *__first;
      ++__first;
      ++__result;
    }
  return __result;
}

[With a bit of handwaving: I have only shown the implementation for the case where the iterator is an input iterator; there is a different implementation for the case where it is a random access iterator.] That implementation has a bug in that it increments the input iterator one more time than expected.

The implementation in GCC 4.8 is a bit more convoluted:

template<typename _InputIterator, typename _Size, typename _OutputIterator>
_OutputIterator
copy_n(_InputIterator __first, _Size __n,
       _OutputIterator __result)
{
  if (__n > 0)
    {
      while (true)
        {
          *__result = *__first;
          ++__result;
          if (--__n > 0)
            ++__first;
          else
            break;
        }
    }
  return __result;
}
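The separate random-access case mentioned earlier is typically selected by dispatching on the iterator category. The following is only a sketch of that technique under assumed names (the sketch namespace, copy_n_impl and first_n are hypothetical), not the actual libstdc++ source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

namespace sketch {

// Input iterators: increment only between elements, i.e. n - 1 times.
template <class InputIt, class Size, class OutputIt>
OutputIt copy_n_impl(InputIt first, Size n, OutputIt result,
                     std::input_iterator_tag) {
    if (n > 0) {
        while (true) {
            *result = *first;
            ++result;
            if (--n > 0)
                ++first;
            else
                break;
        }
    }
    return result;
}

// Random access iterators: forming first + n is harmless, so delegate to
// std::copy, which a library may lower to memmove for trivial types.
template <class RandomIt, class Size, class OutputIt>
OutputIt copy_n_impl(RandomIt first, Size n, OutputIt result,
                     std::random_access_iterator_tag) {
    if (n <= 0) return result;
    return std::copy(first, first + n, result);
}

// Entry point: picks an overload based on the iterator category tag.
template <class InputIt, class Size, class OutputIt>
OutputIt copy_n(InputIt first, Size n, OutputIt result) {
    typedef typename std::iterator_traits<InputIt>::iterator_category category;
    return copy_n_impl(first, n, result, category());
}

}  // namespace sketch

// Hypothetical helper: copies the first n ints of a container.
template <class Container>
std::vector<int> first_n(const Container& c, int n) {
    assert(n >= 0 && static_cast<std::size_t>(n) <= c.size());
    std::vector<int> out(n);
    sketch::copy_n(c.begin(), n, out.begin());
    return out;
}
```

A std::list iterator carries the bidirectional tag, which converts to std::input_iterator_tag and so selects the first overload; a std::vector iterator selects the second.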

edited Jul 15, 2013 at 20:59

answered Jul 15, 2013 at 20:53

David Rodríguez - dribeas


6 Comments

Yakk - Adam Nevraumont Over a year ago

As noted here, the above implementation is incorrect. (What @Cubbi said, with more words)

Mark B Over a year ago

@Yakk You should make an answer with that link, showing that a more naive implementation would be functionally wrong (due to the extra increment).

David Rodríguez - dribeas Over a year ago

@Cubbi: Correct, there is a bug in this implementation. Updated with the latest implementation in g++ 4.8.

Blastfurnace Over a year ago

The MSVC 2012 library also has slightly different implementations for input iterators, forward iterators, and pointers to scalars (calling memmove() in the last case).

Kerrek SB Over a year ago

Does the standard say anything about how the iterators may be incremented? I can't see it.


Kerrek SB · Accepted Answer · 2013-07-15 21:25:45Z

7

With the naive implementation, you increment the input iterator n times, not just n - 1 times. This is not just potentially inefficient (since iterators can have arbitrary and arbitrarily expensive user-defined types), but it may also be outright undesirable when the input iterator doesn't support a meaningful "past-the-end" state.

For a simple example, consider reading n elements from std::cin:

#include <iostream>    // for std::cin
#include <iterator>    // for std::istream_iterator


std::istream_iterator<int> it(std::cin);
int dst[3];

With the naive solution, the program blocks on the final increment:

int * p = dst;

for (unsigned int i = 0; i != 3; ++i) { *p++ = *it++; }   // blocks!

The standard library algorithm doesn't block:

#include <algorithm>

std::copy_n(it, 3, dst);    // fine

Note that the standard doesn't actually explicitly speak about iterator increments. It only says (25.3.1/5) that copy_n(first, n, result) has

Effects: For each non-negative integer i < n, performs *(result + i) = *(first + i).

There is only a note in 24.2.3/3:

[input-iterator] algorithms can be used with istreams as the source of the input data through the istream_iterator class template.
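The effect of the extra increment can also be observed without an interactive terminal by reading from a std::istringstream instead of std::cin. This is a minimal sketch; copy_n_naive, copy_n_careful and next_after_copy are hypothetical names, and hand-written loops are used because the standard does not pin down how many increments std::copy_n itself performs.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>

// Naive loop: one increment (and thus one stream read) per element copied.
template <class InputIt, class Size, class OutputIt>
OutputIt copy_n_naive(InputIt first, Size count, OutputIt result) {
    for (Size i = 0; i < count; ++i) {
        *result++ = *first++;
    }
    return result;
}

// Careful loop: increments only between elements.
template <class InputIt, class Size, class OutputIt>
OutputIt copy_n_careful(InputIt first, Size count, OutputIt result) {
    if (count > 0) {
        *result++ = *first;
        for (Size i = 1; i < count; ++i) {
            *result++ = *++first;
        }
    }
    return result;
}

// Copies three ints out of "1 2 3 4 5" and returns the next value that is
// still available from the stream afterwards.
int next_after_copy(bool naive) {
    std::istringstream in("1 2 3 4 5");
    std::istream_iterator<int> it(in);
    int dst[3] = {};
    if (naive) {
        copy_n_naive(it, 3, dst);
    } else {
        copy_n_careful(it, 3, dst);
    }
    assert(dst[0] == 1 && dst[1] == 2 && dst[2] == 3);
    int next = 0;
    in >> next;
    return next;
}
```

With the careful loop the stream still delivers 4 afterwards. With the naive loop the 4 has already been consumed by the final increment, so the stream delivers 5; on an interactive std::cin that same final read is what blocks.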

edited Jul 15, 2013 at 21:25

answered Jul 15, 2013 at 20:48

Kerrek SB


3 Comments

Ran Eldan Over a year ago

Is an increment cheaper than a condition?

Kerrek SB Over a year ago

@RanEldan: Iterator increment can be a user-defined operation, so it can be as expensive as you like. And you're doing n comparisons either way!

Kerrek SB Over a year ago

General comment: My current understanding is that the standard makes no requirements on the state of a stream after it's been subjected to an algorithm that consumes an input iterator for that stream. I don't see anything that forbids the naive implementation. Cubbi's link above to the GCC bug is an interesting example of unexpected behaviour when "reusing" streams (and it is not clear to me that it is an actual bug, rather than a very insidious gotcha).

kfsone · Accepted Answer · 2013-07-15 20:42:51Z

1

Because of the initial check

if (count > 0)

we know that count > 0, so the author of that code felt there was no need to test against count again before the first assignment, and the loop can start at 1. Remember that "for" executes the conditional test at the start of every iteration, not at the end.

Size count = 1;
for (Size i = 1; i < count; ++i) {
    std::cout << i << std::endl;
}

would print nothing.

Thus the code eliminates a conditional branch, and if count is 1, it eliminates the need to increment/adjust "first", hence the pre-increment.

answered Jul 15, 2013 at 20:42

kfsone

