83

How can I define algebraic data types in Python (2 or 3)?

7
  • 1
    Python has a very loose type system in the first place; what exactly are you trying to get out of this? Commented Apr 28, 2013 at 1:18
  • 12
    @Amber not loose. very strong, but duck. Commented Apr 28, 2013 at 1:19
  • 6
    @Elazar When I say "loose", I mean things like functions not having particular type signatures. But you're right, Python is not weakly typed. Commented Apr 28, 2013 at 1:25
  • 3
    @Amber I see what you mean. Commented Apr 28, 2013 at 1:28
  • 4
    @Amber Rather than "loose" maybe saying it's dynamically typed rather than statically typed would have been clearer. Commented Jan 8, 2017 at 16:23

5 Answers 5

74

Python 3.10 version

Here is a Python 3.10 version of Brent's answer with pattern-matching and prettier union type syntax:

from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

@dataclass
class Circle:
    x: float
    y: float
    r: float

@dataclass
class Rectangle:
    x: float
    y: float
    w: float
    h: float

Shape = Point | Circle | Rectangle

def print_shape(shape: Shape):
    match shape:
        case Point(x, y):
            print(f"Point {x} {y}")
        case Circle(x, y, r):
            print(f"Circle {x} {y} {r}")
        case Rectangle(x, y, w, h):
            print(f"Rectangle {x} {y} {w} {h}")

print_shape(Point(1, 2))
print_shape(Circle(3, 5, 7))
print_shape(Rectangle(11, 13, 17, 19))
print_shape(4)  # mypy type error

You can even do recursive types:

from __future__ import annotations
from dataclasses import dataclass

@dataclass
class Branch:
    value: int
    left: Tree
    right: Tree

Tree = Branch | None

def contains(tree: Tree, value: int):
    match tree:
        case None:
            return False
        case Branch(x, left, right):
            return x == value or contains(left, value) or contains(right, value)

tree = Branch(1, Branch(2, None, None), Branch(3, None, Branch(4, None, None)))

assert contains(tree, 1)
assert contains(tree, 2)
assert contains(tree, 3)
assert contains(tree, 4)
assert not contains(tree, 5)

Note the need for from __future__ import annotations in order to annotate with a type that hasn't been defined yet.

Exhaustiveness checking for ADTs can be enforced with mypy using typing.assert_never() in Python 3.11+ or as part of the typing-extensions backport for older versions of Python.

def print_shape(shape: Shape):
    match shape:
        case Point(x, y):
            print(f"Point {x} {y}")
        case Circle(x, y, r):
            print(f"Circle {x} {y} {r}")
        case _ as unreachable:
            # mypy will throw a type checking error
            # because Rectangle is not covered in the match.
            assert_never(unreachable)
Sign up to request clarification or add additional context in comments.

6 Comments

Great answer! Not that python does not optimize tail recursion and the contains function will use one stack frame per tree level. Given the logarithmic height of binary trees, this should be fine, just something to be aware of
@SimonKohlmeyer No language, even ones with tail call optimization/elimination, should be able to eliminate the tail in this case, as far as I can see. The contains branches twice, so only contains(right, value) is in tail position, while the prior one, contains(left, value) is not. The or operation follows it in any case and no tail can be eliminated (the locals are still required after returning from it). So this always uses a stack frame per level, regardless if Python supported TCO.
There's a project on Github to implement tail recursion for Python. I haven't tried using it, and can't vouch for it, but I'm aware it exists.
As I pointed out in Brent's answer, sum and union types are different.
What's that feature that allows to 'annotate with a type that hasn't been defined yet' called? Any idea in which python3 version it would be available by default?
|
27

The typing module provides Union which, dissimilar to C, is a sum type. You'll need to use mypy to do static type checking, and there's a notable lack of pattern matching, but combined with tuples (product types), that's the two common algebraic types.

from dataclasses import dataclass
from typing import Union


@dataclass
class Point:
    x: float
    y: float


@dataclass
class Circle:
    x: float
    y: float
    r: float


@dataclass
class Rectangle:
    x: float
    y: float
    w: float
    h: float


Shape = Union[Point, Circle, Rectangle]


def print_shape(shape: Shape):
    if isinstance(shape, Point):
        print(f"Point {shape.x} {shape.y}")
    elif isinstance(shape, Circle):
        print(f"Circle {shape.x} {shape.y} {shape.r}")
    elif isinstance(shape, Rectangle):
        print(f"Rectangle {shape.x} {shape.y} {shape.w} {shape.h}")


print_shape(Point(1, 2))
print_shape(Circle(3, 5, 7))
print_shape(Rectangle(11, 13, 17, 19))
# print_shape(4)  # mypy type error

5 Comments

Hi! If you are using mypy, you can check the exhaustiveness of your "pattern matching" using the assert_never idiom: github.com/python/typing/issues/735
And PEP 622 (pattern matching) makes using sum types even more similar to functional languages.
This isn't really sum types since it just relies on RTTI to identify the type.
As of today (October 15 2021) Python 3.10 is released, adding support for structural pattern matching. docs.python.org/3.10/whatsnew/…
Sum and union types are two different things. None | None, for example, is isomorphic (if not identical) to None itself. None + None (to use the obvious syntax for a hypothetical sum type) would be isomorphic to bool.
2

there is a library that does exactly what you want called ADT, it even has mypy typechecking. However it is unmaintained so use it with caution ig.

the code using it looks like this

@adt
class Tree:
    EMPTY: Case
    LEAF: Case[int]
    NODE: Case["Tree", "Tree"]

and matching works too

# Defined in some other module, perhaps
def some_operation() -> Either[Exception, int]:
    return Either.RIGHT(22)  # Example of building a constructor

# Run some_operation, and handle the success or failure
default_value = 5
unpacked_result = some_operation().match(
    # In this case, we're going to ignore any exception we receive
    left=lambda ex: default_value,
    right=lambda result: result)

Comments

1

With the introduction of structural pattern matching in Python 3.10, ADTs in Python became practical

from typing import Union
from dataclasses import dataclass

@dataclass
class Circle:
    radius: float

@dataclass
class Rectangle:
    width: float
    height: float

@dataclass
class Triangle:
    base: float
    height: float

Shape = Circle | Rectangle | Triangle

def area(shape: Shape) -> float:
    match shape:
        case Circle(radius):
            return 3.14159 * radius ** 2
        case Rectangle(width, height):
            return width * height
        case Triangle(base, height):
            return 0.5 * base * height

# Example usage
circle = Circle(radius=5)
print("Area of circle:", area(circle))

rectangle = Rectangle(height=10, width=5)
print("Area of rectangle:", area(rectangle))

triangle = Triangle(base=10, height=5)
print("Area of triangle:", area(triangle))

I would argue @dataclass is how Python supports Product types. And in line 18th, you'll see Python already supports Union Types. That enables us to create a Shape relationship (Type) without resorting to inheritance.

The big problem before 3.10 was the lack of Pattern Matching, which meant discriminating between the concrete shape required chained if-elif-else/instanceOf chains.

With pattern matching, we can express the cases directly and cleanly. To my surprise, mypy will give you an error if the pattern matching is not exhaustive.

I am disappointed that Python implemented match as a statement, not an expression, but I am guessing that an expression might have looked weird to long-time Pythonistas.

The above shows Python's support for ADTs is adequate, given the limitations of a dynamically typed imperative language.

If interested in a longer explanation, you could look at my blog

Comments

-3

Here's an implementation of sum types in relatively Pythonic way.

import attr


@attr.s(frozen=True)
class CombineMode(object):
    kind = attr.ib(type=str)
    params = attr.ib(factory=list)

    def match(self, expected_kind, f):
        if self.kind == expected_kind:
            return f(*self.params)
        else:
            return None

    @classmethod
    def join(cls):
        return cls("join")

    @classmethod
    def select(cls, column: str):
        return cls("select", params=[column])

Crack open an interpreter and you'll see familiar behavior:

>>> CombineMode.join()
CombineMode(kind='join_by_entity', params=[])

>>> CombineMode.select('a') == CombineMode.select('b')
False

>>> CombineMode.select('a') == CombineMode.select('a')
True

>>> CombineMode.select('foo').match('select', print)
foo

Note: The @attr.s decorator comes from the attrs library, it implements __init__, __repr__, and __eq__, but it also freezes the object. I included it because it cuts down on the implementation size, but it's also widely available and quite stable.

Sum types are sometimes called tagged unions. Here I used the kind member to implement the tag. Additional per-variant parameters are implemented via a list. In true Pythonic fashion, this is duck-typed on the input & output sides but not strictly enforced internally.

I also included a match function that does basic pattern matching. Type safety is also implemented via duck typing, a TypeError will be raised if the passed lambda's function signature doesn't align with the actual variant you're trying to match on.

These sum types can be combined with product types (list or tuple) and still retain a lot of the critical functionality required for algebraic data types.

Problems

This doesn't strictly constrain the set of variants.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.