The question asks for a Haskell encoding of a class hierarchy with two goals:
- being able to shove everything into one managing container
- code reuse
I'll use a smaller variant of the class hierarchy from the question for my examples. The easiest way to achieve goal 1 is to have a single algebraic data type for entities. We can then use a list, an array, or whatever other container we want to hold entities. So we want:
data Entity = ...
type ExampleContainer = [Entity]
How should we fill in the ...? I first show a naive approach, analyze why it fails to provide reuse, and then turn this insight into a more sophisticated approach that provides reuse.
Naive approach, bad reuse :(
There are multiple kinds of entities in the CEntity class hierarchy, so we could use multiple constructors for the Entity data type:
data Entity
  = Car Position Velocity Color
  | Player Position Velocity Gun
  | Door Position Key
  | Rock Position
Every leaf of the class hierarchy corresponds to a constructor, but the intermediate classes don't show up. This leads to a duplication in the datatype declaration: We repeat Position and Velocity multiple times. This duplication on the type level also influences the rest of our program: For example, a function that moves objects with a velocity one step forward would look like:
move :: Entity -> Entity
move (Car position velocity color) = Car (position + velocity) velocity color
move (Player position velocity gun) = Player (position + velocity) velocity gun
move (Door position key) = Door position key
move (Rock position) = Rock position
The duplication of the Position and Velocity fields indeed leads to a duplication of the position + velocity formula. Maybe if we reuse the Position and Velocity fields in the algebraic data type, we can also reuse the position + velocity formula?
Sophisticated approach, better reuse :)
We restructure our algebraic data type so that common fields are shared. All entities have a position, but the other fields differ according to what kind of entity we have:
data Entity
  = Entity Position EntityInfo
Moving objects have a velocity but fixed objects don't:
data EntityInfo
  = Moving Velocity Moving
  | Fixed Fixed
A moving object can be a car or a player:
data Moving
  = Car Color
  | Player Name
And a fixed object can be a door or a rock:
data Fixed
  = Door Key
  | Rock
So we still have the four constructors Car, Player, Door and Rock, but in addition we have the constructors Entity, Moving and Fixed to store information that is available for multiple kinds of entities. These additional constructors correspond to the intermediate classes in the class hierarchy. Note that we only mention Position and Velocity once, so hopefully the code duplication in the move function should go away. And indeed:
move :: Entity -> Entity
move (Entity position (Moving velocity info))
  = Entity (position + velocity) (Moving velocity info)
move (Entity position (Fixed info)) = Entity position (Fixed info)
Now, the formula position + velocity only appears once, as we hoped.
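To tie this back to goal 1, here is a minimal self-contained sketch that puts the restructured types into a plain list and moves everything one step. The concrete field types (Int for Position and Velocity, and the small Color, Name and Key types), the sample `world` and the helper `step` are my own stand-ins for illustration; the question does not specify them.

```haskell
-- Stand-in field types (assumptions, not taken from the question).
type Position = Int
type Velocity = Int
data Color = Red | Blue deriving (Show, Eq)
newtype Name = Name String deriving (Show, Eq)
newtype Key = Key Int deriving (Show, Eq)

-- The restructured hierarchy: shared fields appear exactly once.
data Entity = Entity Position EntityInfo
  deriving (Show, Eq)

data EntityInfo
  = Moving Velocity Moving
  | Fixed Fixed
  deriving (Show, Eq)

data Moving
  = Car Color
  | Player Name
  deriving (Show, Eq)

data Fixed
  = Door Key
  | Rock
  deriving (Show, Eq)

-- One step forward: only moving entities change position.
move :: Entity -> Entity
move (Entity position (Moving velocity info)) =
  Entity (position + velocity) (Moving velocity info)
move (Entity position (Fixed info)) = Entity position (Fixed info)

type ExampleContainer = [Entity]

-- A hypothetical world containing one entity of each kind.
world :: ExampleContainer
world =
  [ Entity 0 (Moving 2 (Car Red))
  , Entity 5 (Moving (-1) (Player (Name "Alice")))
  , Entity 3 (Fixed (Door (Key 42)))
  , Entity 7 (Fixed Rock)
  ]

-- Move every entity in the container one step.
step :: ExampleContainer -> ExampleContainer
step = map move
```

Loading this in GHCi and evaluating `step world` should shift the car and the player while leaving the door and the rock where they are.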
Summary
One approach for encoding a deep class hierarchy is to use algebraic data types. Every class corresponds to a constructor, and every class that has subclasses also corresponds to a data type. If we avoid field duplication in these data types, we also avoid code duplication in the code that manipulates values of the data types.
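The same effect shows up for functions other than move. As a rough sketch, accessors for the shared fields can each be written once instead of once per leaf. The names `positionOf` and `velocityOf` are hypothetical, Position and Velocity are assumed to be Int, and the leaf-specific fields (Color, Name, Key) are dropped here to keep the example short.

```haskell
-- Assumed field types; leaf-specific fields are omitted for brevity.
type Position = Int
type Velocity = Int

data Entity = Entity Position EntityInfo
data EntityInfo = Moving Velocity Moving | Fixed Fixed
data Moving = Car | Player
data Fixed = Door | Rock

-- Works for every entity, because Position lives in the shared
-- Entity constructor.
positionOf :: Entity -> Position
positionOf (Entity position _) = position

-- Works for every moving entity; fixed entities have no velocity.
velocityOf :: Entity -> Maybe Velocity
velocityOf (Entity _ (Moving velocity _)) = Just velocity
velocityOf (Entity _ (Fixed _)) = Nothing
```

With the naive encoding, `positionOf` would need one equation per constructor (four here), and adding a new kind of entity would mean touching every such function.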