3

Is it possible to template methods for any kind of integer size ?

To illustrate, imagine this very trivial example (the body of the method is not important in my question):

public int Mul(int a, int b)    {
    return a*b;
}

Now, I want the same method that supports any kind of integer (excluding BigInteger of course). I have to write all variants :

public long Mul(long a, long b)    {
    return a*b;
}
public ulong Mul(ulong a, ulong b)    {
    return a*b;
}
public short Mul(short a, short b)    {
    return a*b;
}
public ushort Mul(ushort a, ushort b)    {
    return a*b;
}
public byte Mul(byte a, byte b)    {
    return a*b;
}

While this example is very trivial and it's not actually a problem to duplicate, if I have more complex algorithms like this (replicate for all integer kinds):

    public static IEnumerable<long> GetPrimesFactors(this long number)
    {
        for (long i = 2; i <= number / 2; i++)
        {
            while (number % i == 0)
            {
                yield return i;
                number /= i;
            }
        }
        yield return number;
    }

it introduces a maintenance risk as there is duplicated code and logic (coding integrists would say this is the evil to have same logic at multiple place).

Some of you may suggest to implements the long version and cast the result, but having to ask consumer code to cast can be confusing and reduce readability :

void SomeMethod(IEnumerable<int> valuesToProcess)
{
    foreach(int value in valuesToProcess) { Console.WriteLine(value); }
}

void Main()
{
    int i = 42;
    SomeMethod(((long)i).GetPrimesFactors().Select(l=>(int)l)); 
    SomeMethod(GetPrimesFactors(i));

    long l = 42L;
    SomeMethod(l.GetPrimesFactors().Select(l=>(int)l));
}

When I see the definition of the interface IEnumerable<T>, and especially the definitions of Sum method overloads :

    public static decimal? Sum(this IEnumerable<decimal?> source);
    public static decimal Sum(this IEnumerable<decimal> source);
    public static double? Sum(this IEnumerable<double?> source);
    public static double Sum(this IEnumerable<double> source);
    public static float? Sum(this IEnumerable<float?> source);       
    public static float Sum(this IEnumerable<float> source);
    public static int? Sum(this IEnumerable<int?> source);
    public static int Sum(this IEnumerable<int> source);
    public static long? Sum(this IEnumerable<long?> source);
    public static long Sum(this IEnumerable<long> source);

I conclude that it's not possible... that's why MS has to implement all overloads.

Does anyone have any tips for designing general purpose integer methods without having to duplicate logic ?

9
  • There is no clean high performance solution. You can use generics with a few tricks, but the performance is significantly worse. Commented Dec 12, 2011 at 13:17
  • can you make the methods generic? for instance GetPrimesFactor<T>, and then check if T is an integer type? Commented Dec 12, 2011 at 13:18
  • @CodeInChaos: "performance is significantly worse" : I can't accept that. If I have to choose between performance and readability, I would choose performance if there is a real gap. Commented Dec 12, 2011 at 13:20
  • @JAson: as my methods can be call multiples times, nested in complex algorithms, I can't loose (too much) performance. so testing type is not acceptable. And moreover, I don't think this can works without boxing/unboxing a lot Commented Dec 12, 2011 at 13:22
  • @SteveB If performance is so critical, then it sounds like you have already chosen the trade-off between duplicating/readability and the performance you seek. Code-gen it and move on. Any maintenance issues from a few pieces of extra code are likely dwarfed by the maintenance of the performance itself. Commented Dec 12, 2011 at 13:26

5 Answers 5

2

There is no clean high performance solution. The choices I can think of are:

  1. Manually duplicate the code (fast and redundant)
  2. Automatically duplicate the code with a code generator (fast but a bit ugly). One .net numerics library went that way, but I don't remember its name.
  3. Use some form of indirection, such as MiscUtil's Operator class, or the DLR (slow)
  4. The arithmetic helper struct. I'm not sure how good the performance is, but you can try.

Generic methods representing operators:

These were my first idea. The issue is how to implement them. MiscUtil does this by calling a delegate stored in a static field.

static Func<T,T,T> _multiply;
public static T Multiply(T n1,T n2)
{
  return _multiply(n1, n2);
}

One point to note here, is that you should avoid a static constructor, since its mere existence slows down static field access.

But that involves an indirect call, and that's expensive. I next tried to improve this by manually specializing for certain known types:

public static T Multiply(T n1,T n2)
{
  if(typeof(T)==typeof(int))
    return (T)(object)((int)(object)n1*(int)(object)n2);
  ...
  return _multiply(n1, n2);
}

The JIT compiler is smart enough to realize which of those if cases it has to take, and will remove them. While that improved performance, it bloated the IL representation of the methods. And the JIT compiler is not smart enough to inline those method now, since their IL representation is long, and the inline heuristic only looks at the IL length of a method, not its machine code length. I don't remember if these casts cause boxing, or if the JITter was smart enough to optimize that out. Still the lack of inlining is too costly.

How 4) works:

First create an interface that contains the basic operations you need(arithmetic operators,...):

interface IArithmetic<T>
{
   T Multiply(T n1,T n2);
}

Implement it for each type you need with a struct:

public struct Int32Arithmetic:IArithmetic<Int32>
{
   Int32 Multiply(Int32 n1,Int32 n2)
   {
     return n1*n2;
   }
}

Then make most of your actual code generic, and pass in an arithmetic helper:

internal T MyOperation<T,TArithmetic>(T n1, T n2)
  where TArithmetic:struct,IArithmetic<T>
{
   return default(TArithmetic).Multiply(n1,n2);
}

And if you want a clean interface for multiple types, create a thin overloaded wrapper forwarding to the generic method:

public Int32 MyOperation(Int32 n1,Int32 n2)
{
  return MyOperation<Int32,Int32Arithmetic>(n1, n2);
}

This might be fast, because generics get specialized for each value type. It uses no indirections and the method bodies in IL don't get too long, so inlining is possible. But I haven't tried this myself yet.

Sign up to request clarification or add additional context in comments.

Comments

2

Consider the DLR:

    static void Main(string[] args)
    {
        int i = Mul(2, 4);
        Console.WriteLine(i);
        Console.Read();
    }

    static dynamic Mul(dynamic x, dynamic y)
    {
        return x * y;
    }

Performance is to be determined (I'd expect it to be slower than straight overloads), but readability is much nicer. Could get a little hairy if you provide types that don't implement the required operators or different types that cause values to truncate.

Updated from comment: If performance is so critical, then it sounds like you have already chosen the trade-off between duplicating/readability and the performance you seek. Code-gen it and move on. Any maintenance issues from a few pieces of extra code are likely dwarfed by the maintenance of the performance itself.

3 Comments

Well, this introduces late binding which leads to performance issues, especially if the method is called very often.
@noah1989 It caches the binding on first call, so subsequent calls actually get faster. Not sure what the performance characteristics are compared to other indirect mechanisms such as emitted IL, reflection, or delegates. My argument to this? If performance is that critical, then you have already chosen the trade-off so the question is moot.
@AdamHouldsworth +1 And totally agree with your view on performance.
1

Well, what you could do is generate the duplicated code during your build process using, for example, T4.

2 Comments

I was thinking about T4, or pre-build regex replace. Has to be considerd actually
ah, yeah.. T4 is what I had in mind (couldn't remember the name), thanks :)
0
public T Mul<T>(T a, T b){
    dynamic x = a;
    dynamic y = b;
    return (T)x*y;
}

1 Comment

It's most likely even slower that a reflection based version such as MiscUtil. Unless the JIT compiler has some special knowledge about the DLR, but I never heard of that.
0

I once tried so implement something similar using CodeDom to generate an assembly during runtime and dynamically load it. This works rather well, but has some limitations. For example, your environment might not allow you to dynamically compile assemblies and there is the big one: performance. Although the "calculator"-class is only generated once, the overhead of calling a virtual method actually doubles the time necessary for the calculation.

You could give it a try to see how it would perform in your environment, I just lay out the classes (since this was a long time ago and I don't have the code anymore).

interface ICalculator<T> {
    T Add(T left, T right);
    T Multiply(T left, T right);
}

internal static class Calculator<T> {
     static ICalculator<T> instance;
     static Calculator() { 
          Type type = typeof(T);
          // 1. Use CodeDom to design a class that implements ICalculator<T> using the
          // builtin +,-,*,/ operators
          // 2. Compile assembly in memory
          // 3. Load assembly and create an instance of the ICalculator<T> class
          Type concreteType = GetTypeFromDynamicAssembly(); // Get this from the assembly.
          instance = Activator.CreateInstance(concreteType) as ICalculator<T>;
     }

     public static T Add(T left, T right) {
          return instance.Add(left, right);
     }
}

class MyClassUsingGenericMathType<T> {
    T Sum(params T[] values) {
        T sum = default(T);
        foreach (T value in values) {
            sum = Calculator<T>.Add(sum, value);
        }

        return sum;
    }
}

The idea is that you dynamically built the implementation first time it is used (the static constructor is invoked then), after that the Calculator methods directly call the corresponding operator of the numeric type you are using. As I said, I remember that this adds an overhead everytime an operation is performed, but I never analyzed whether there is a potential for speeding up the process using some Compiler-attributes.

Another thing: using a type that doesn't implement the corresponding operators would cause a runtime-exception rather than a compile error. So it's far from perfect.

1 Comment

Very similar to what MiscUtil does, but has a few more indirections, and thus is probably a bit slower.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.