1

I am working on a data aggregator implementation with nested lambda expressions. I am not very experienced with lambda functions yet and I am not 100% sure whether my implementation idea is implementable.

The Problem: I have a Multivalue class which has a vector as a private member. The data type of this private vector can be defined via a template parameter of the Multivalue class. I want to offer different aggregation functions (sum, harmonic mean, average, ...) for the data in the vector. BUT: if the vector data type is a tuple, the aggregation functions should also be available for the components of the tuple. The following link shows my first try, which is not working yet. I hope someone can explain to me what the problem is. I think there is a problem with the nested lambda expressions, I use:

#include <iostream>
#include <vector>
#include <functional>
#include <tuple>

using namespace std;

template<typename ... TTypes>
std::ostream& operator<<(std::ostream& out, const std::tuple<TTypes...>& value) { return out; }

/** our general abstract aggregation object */
template<typename T, typename C>
class AbstractAggregator {
public: 
    AbstractAggregator() { }
    AbstractAggregator(const std::vector<C> *data, std::function<const T&(const C&)> access) :
    data(data), access(access) { }
    
    T sum() const {
        T sum;
        for (auto &i : *data)
            sum += access(i);
        return sum;
    }
protected:
    const std::vector<C> *data;
    std::function<const T&(const C&)> access;
};

/** concrete aggregation implementation for types like int, float, double .. */
template<typename T, typename C>
class Aggregator : public AbstractAggregator<T, C> {
public:
    Aggregator() { }
    Aggregator(const std::vector<C> *data, std::function<const T&(const C&)> access) : AbstractAggregator<T, C>(data, access) { }    
};

/** aggregator implementation for tuple (with subaggregators for each component */
template<typename ... TTypes, typename C>
class Aggregator<std::tuple<TTypes...>, C> : public AbstractAggregator<std::tuple<TTypes...>, C> {
public:
    Aggregator() { }
    Aggregator(const std::vector<C> *data, std::function<const std::tuple<TTypes...>&(const C&)> access) : AbstractAggregator<std::tuple<TTypes...>, C>(data, access) { 
    initSubAggregators<sizeof...(TTypes), TTypes...>(access); 
}

    std::tuple<Aggregator<TTypes, C>...> subaggregators;
private:
    template<int N>
    void initSubAggregators(std::function<const std::tuple<TTypes...>&(const C&)> access) { }
    template<int N, typename THead, typename... TTail>
    void initSubAggregators(std::function<const std::tuple<TTypes...>&(const C&)> access) { 
        constexpr int I = N - sizeof...(TTail) - 1;
        std::get<I>(subaggregators) = Aggregator<THead, C>(AbstractAggregator<std::tuple<TTypes...>, C>::data, [&](const C &value) { return std::get<I>(access(value)); });        
        initSubAggregators<N, TTail...>(access);
    }
};

namespace std {
    template<size_t I, typename ... TTypes, typename C>
    auto get(Aggregator<std::tuple<TTypes...>, C>& k) -> decltype(std::get<I>(k.subaggregators)) {
        return std::get<I>(k.subaggregators);
    }

    template<size_t I, typename ... TTypes, typename C>
    auto get(const Aggregator<std::tuple<TTypes...>, C>& k) -> decltype(std::get<I>(k.subaggregators)) {
        return std::get<I>(k.subaggregators);
    }
}

/** multivalue attribute implementation (inherits corresponding aggregation object) */
template<typename T>
class Multivalue : public Aggregator<T, T> {
public:
    Multivalue() : Aggregator<T, T>(&data, [](const T &value) { return value; }) { }
    void append(const T& item) { data.push_back(item); }
private:
    std::vector<T> data;
};

int main() {
    Multivalue<std::tuple<std::tuple<uint32_t, uint32_t>, float, double>> mv;
    mv.append(std::make_tuple(std::make_tuple(13, 12), 2.5, 3.5));
    mv.append(std::make_tuple(std::make_tuple(1, 7), 2.55123, 1.5));
    mv.append(std::make_tuple(std::make_tuple(5, 3), 2.312, 1.8));

    auto &a1 = std::get<2>(mv);
    auto &a2 = std::get<1>(mv);
    auto &a3 = std::get<0>(std::get<0>(mv));
    auto &a4 = std::get<1>(std::get<0>(mv));

    std::cout << "Sum 1: " << a1.sum() << std::endl;
    std::cout << "Sum 2: " << a2.sum() << std::endl;
    std::cout << "Sum 3: " << a3.sum() << std::endl;
    std::cout << "Sum 4: " << a4.sum() << std::endl;

    return 0;
}

Moo

15
  • If the code base is not huge, please post it here. If it is huge, create an MCVE and post the MCVE here. Commented Apr 13, 2015 at 21:02
  • Have you seen the link, I have posted? I think it is already a MCVE example. Commented Apr 13, 2015 at 21:04
  • 1
    I think re-implementing something inside std namespace is a recipe for disaster. Commented Apr 13, 2015 at 21:14
  • 1
    See stackoverflow.com/questions/16173902/… Commented Apr 13, 2015 at 21:23
  • 1
    It should not segfault any more (at least, it doesn't on coliru). The remaining issues are 1) the one pointed out by vsoftco. 2) the identity-lambda causes UB as well: [](const T &value) { return value; } should be [](const T &value) -> T const& { return value; } It currently returns by value, and the std::function that stores it returns that (temporary) value by reference. Commented Apr 13, 2015 at 22:34

1 Answer 1

4

There are at least three issues in your code.


The first has been pointed out by vsoftco, in a now deleted answer:

In the summation function, you're reading from an uninitialized variable.

T sum() const {
    T sum; // uninitialized
    for (auto &i : *data)
    {
        std::cout << "summing from: " << i << "\n";
        sum += access(i);
    }
    return sum;
}

To be a bit more precise, the variable sum is default-initialized, but for non-class types, this implies no initialization at all.

A simple solution is to value-initialize sum:

T sum{}; // value-initialized

The second issue is a lifetime issue with a captured entity:

template<int N, typename THead, typename... TTail>
void initSubAggregators(std::function<const std::tuple<TTypes...>&(const C&)> access) { 
    constexpr int I = N - sizeof...(TTail) - 1;
    std::get<I>(subaggregators) = Aggregator<THead, C>(AbstractAggregator<std::tuple<TTypes...>, C>::data, [&](const C &value) { return std::get<I>(access(value)); });        
    initSubAggregators<N, TTail...>(access);
}

Since I find this rather hard to read, let's introduce some typedefs and other simplifications:

using R = std::tuple<TTypes...> const&;
using P = C const&;
using F = std::function<R(P)>;

template<int N, typename THead, typename... TTail>
void initSubAggregators(F access) {
    constexpr int I = N - sizeof...(TTail) - 1;
    auto l = [&](C const& value) { return std::get<I>(access(value)); };
    std::get<I>(subaggregators) = Aggregator<THead, C>(this->data, l);
    initSubAggregators<N, TTail...>(access);
}

The lamdba in the function initSubAggregators captures the function parameter access by reference. The function parameter is an object that goes out of scope at the end of the function, and is subsequently destroyed. The lambda however is stored within a std::function inside the sub-aggregator and lives beyond the scope of initSubAggregators. Since it captured access by reference, it will store a dangling reference after initSubAggregators returns. One possible solution is to store access by value (which creates a copy for each sub-aggregator). Another solution is to store access within the tuple-aggregator as a data member, and store a reference to that data member within each lambda.


The third issue is also related to lifetime of an object.

The lambda that is created in the constructor of Multivalue and passed to the aggregator returns by value due to the normal return type deduction:

Multivalue() : Aggregator<T, T>(&data, [](const T &value) { return value; }) { }

The constructor of Aggregator<T, T> however expects a function object that returns by const-reference:

Aggregator(const std::vector<C> *data, std::function<const T&(const C&)> access)

When the std::function object is called, it calls the lambda object. The lambda returns a T by value. std::function then returns that temporary object (the return value) by reference, creating a dangling reference. This is the same issue as in What is the return type of a lambda expression if an item of a vector is returned? It can be solved by manually defining the return type:

Multivalue() : Aggregator<T, T>(&data, [](const T &value) -> const T & { return value; }) { }
Sign up to request clarification or add additional context in comments.

4 Comments

There are probably better solutions to these problems, but I'm not sure if they're worth it. I'd consider redesigning the classes instead.
std::function< T&(???) > should really be spewing errors if you pass it a function that returns by value, as it is returning a reference to a temporary. Wonder why not? Maybe it is stored in an intermediate reference parameter before returning, and that confuses the compiler?
@Yakk I agree. I think the reason why it doesn't emit a warning is that the header containing the source of the lifetime issue is defined as GCC system_header via a pragma. In libc++, header __functional_base, if I copy the relevant __invoke_void_return_wrapper to a custom header, I get the warning. Similarly, if I deactivate the pragma via _LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER. Jonathan Wakely's comment here is also relevant.
Oh, that is a bad policy, to disable warnings in system headers. I guess the inability for end users to modify the code to say "yes, I understand, I can deal with this error" is missing. But with template code, when it interacts with a user-type, that lack of warning is ... unfortunate.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.