Subtyping is About Total Mappings on Types

Julian Ferrone
I lay out a way to understand subtyping by implementing total functions which map subtypes to supertypes.
Published: June 27, 2026

I’ve been reading up on subtyping lately. Subtyping helps us understand what valid changes we can make to a program—amongst other helpful uses—but the rules can be a bit confusing.

I’d been thinking about this for quite some time, musing over it, when I had a sudden flash of inspiration that helped it all click in my head.

Instead of trying to brute-force memorise the subtyping rules on algebraic data types, we can understand those rules by writing total functions which map subtypes to supertypes. The existence of those functions acts as a proof that there is a subtyping relation between the two types.

What is Subtyping? Why Do We Care About It? #

Subtyping is about substituting values of one type for values of another type. If we have some type AA that is a subtype of a type BB , then we can provide a value of type AA whenever our program expects a value of type B.B. This has two consequences for our code.

Firstly, if we create code fragments which operate on simple supertypes (i.e. there are many types which are subtypes of these) then we can reuse functionality in many parts of our program by providing subtypes of the demanded supertypes. This helps us keeps our code decoupled.

Secondly, when we make changes to our program’s data structures, if our new versions of the type are subtypes of the old versions of the type, our old code will operate on the updated data structures without needing to be edited.

Subtyping helps us keep our programs modular over both space (different module boundaries) and time (our program’s change over time).

We can memorise the rules after we’ve properly understood how subtyping works.

Total Functions #

Rather than just memorising subtyping rules for algebraic data types, I want to provide a richer understanding under the lens of total functions—functions which are defined over the entire range of a type. Functions which aren’t defined for every instance of a type are known as partial functions.

A famous example of a partial function in Haskell is head, which extracts the first element of a non-empty list. head is implemented like so: I’m ignoring the HasCallStack typeclass requirement that head needs to be able to throw errors on empty lists.

head :: [a] -> a
head (x:_) = x

Because head only contains a pattern for non-empty lists, it succeeds on lists with at least one element:

ghci> head (1 : [])
1
ghci> head (2 : 3 : [])
2

But fails on empty lists:

ghci> head []
*** Exception: Prelude.head: empty list
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/List.hs:1644:3 in base:GHC.List
  errorEmptyList, called at libraries/base/GHC/List.hs:87:11 in base:GHC.List
  badHead, called at libraries/base/GHC/List.hs:83:28 in base:GHC.List
  head, called at <interactive>:1:1 in interactive:Ghci1

Contrast this with the total function safeHead implemented like so:

safeHead :: [a] -> Maybe a
safeHead []    = Nothing
safeHead (x:_) = Just x

Which works on both empty and non-empty lists:

ghci> safeHead []
Nothing
ghci> safeHead (1 : [])
Just 1
ghci> safeHead (2 : 3 : [])
Just 2

And is therefore a total function.

So when I say that we want to write “total” functions that map subtypes to supertypes, I mean that they must be well-defined on every value of the subtype—no throwing errors.

A Brief Review of Algebraic Data Types #

Algebraic data types are about composing together types to make new types. Besides your regular built-in compound types (strings, lists, and so on), there are a few special types to go over:

  1. Void
  2. Unit
  3. Product
  4. Sum
  5. Function

Let’s zip over these quickly.

The Void Type #

The void type is a data type that cannot be constructed:

data Void

We’ve defined the data type, but provided no constructors.

This type is also known as the empty type. It has 0 occupants When I’m talking about “occupants”, I’m really talking about the cardinality of the type, i.e. how many unique elements there are in the type. since there is no way to make a value of type Void:

Void0Void \equiv 0

It’s quite rare in programming languages. Interestingly, Rust uses this in the panic! macro (which throws a runtime exception) to notate that it doesn’t return a value, but calls it the never type (as in, it never returns a value).

The Unit Type #

The unit type is a data type that can only be constructed one way, and as such has only 1 occupant:

Unit1Unit \equiv 1

In Haskell, it’s represented with the empty tuple, but for ease of pronunciation I’ll distinguish between the type Unit and its constructor ():

data Unit = ()

Rust also calls it the () type.

Confusingly, the C languages and Java call it void.

Python calls it None.

Product Types #

Here we get to the interesting part—type constructors which take multiple types and create a new compound type.

The first type constructor we’ll look at is the product type which contains values of two types at once, meaning it corresponds with the logical predicate of conjunction—it says “we have an element of type AA AND an element of type BB ”.

These are everywhere, under the guise of tuples (product types indexed by integer):

data Tuple2 a b = (,) a b
-- or, infix:   = (a, b)

and records (product types indexed by a name): There isn’t a generic record type in Haskell unless you use a library.Tuple2 is a real type, though—the type of 2-tuples—so by analogy I defined the generic 2-record as Record2.

data Record2 a b = Record2 { recordA :: a, recordB :: b }

There’s lots of different names for records—Rust and C call these structs, Python calls these dictionaries or (slightly fancier) dataclasses, Erlang has records and maps Interestingly, Erlang’s records are compiled down to tuples. They’re basically a simple affordance for indexing into tuples with names—which works because tuples and records are isomorphic after reindexing.Conversely, maps are stored in different ways based on their size. Small maps (which store less than 32 key-value pairs) are stored as a pair of arrays (one of sorted keys, the other of values). Large maps (more than 32 key-value pairs) are stored as hash-array mapped tries., Elixir adds structs on top of Erlang. Elixir’s structs basically let us add compile-time checks on top of maps.

As for why these are called product types, it’s because the number of types in a product type is equal to the product of the cardinality of its component types:

(a,b)a×b(a, b) \equiv a \times b

Take the Boolean type, which has 2 occupants; True and False. If we have a tuple (Bool, Bool), it has 2 × 2 = 4 occupants:

(False, False)
(False, True)
(True, False)
(True, True)

Or take the tuple of type (Bool, Unit), which has 2 × 1 = 2 occupants:

(False, ())
(True, ())

Sum Types #

The sum type has much rarer language-level support. Where a product type says “we have an AA and a BB ”, a sum type says “we have an AA OR a BB ” and therefore corresponds with the logical predicate of disjunction.

In Haskell, it’s represented by the Either type:

data Either a b = Left a
                | Right b

It’s often doable in other languages by manually compiling to a tagged union, say in C:

typedef enum {
    VALUE_INT,
    VALUE_STRING
} ValueType;

typedef struct {
    ValueType type;
    union {
        int i;
        const char *s;
    } data;
} Value;

Or in old-school Java:

public class TaggedUnionExample {
    enum ValueType {
        INT,
        STRING
    }

    static class Value {
        ValueType type;

        int intValue;
        String stringValue;

        static Value ofInt(int value) {
            Value v = new Value();
            v.type = ValueType.INT;
            v.intValue = value;
            return v;
        }

        static Value ofString(String value) {
            Value v = new Value();
            v.type = ValueType.STRING;
            v.stringValue = value;
            return v;
        }
    }
}

Although modern Java lets us do this with sealed interfaces and records:

sealed interface Value permits IntValue, StringValue {}

record IntValue(int value) implements Value {}
record StringValue(String value) implements Value {}

Rust, like Haskell, just lets us define sum types without any fuss:

enum Value {
    Int(i64),
    String(String)
}

The sum type is also sometimes called the union type, like in Python, where we can notate this with the | operator and do case analysis with chains of if/elif isinstance checks (old-school) or pattern matching with match and case (as of Python 3.10):

Value = int | str

def do_old(value: Value):
    if isinstance(value, int):
        ...
    elif isinstance(value, str):
        ...

def do_new(value: Value):
    match value:
        case int:
            ...
        case str:
            ...

As you might have guessed, the sum type is called that because its cardinality is the sum of the cardinality of its variant types:

Either a ba+bEither \ a \ b \equiv a + b

Let’s work some examples. Take the data type Either Bool Bool. We should expect it to have 2 + 2 = 4 occupants. We can enumerate all the possible combinations and check:

Left False
Left True
Right False
Right True

And if we take the data type Either Unit Bool, we can check that it has 1 + 2 = 3 occupants:

Left ()
Right False
Right True

In fact we can even show an isomorphism between Bool, which has 2 occupants, and Either Unit Unit, which has 1 + 1 = 2 occupants as well:

class (Isomorphism b a) => Isomorphism a b where
  to :: a -> b

instance Isomorphism Bool (Either Unit Unit) where
  to False = Left ()
  to True  = Right ()

instance Isomorphism (Either Unit Unit) Bool where
  to (Left ())  = False
  to (Right ()) = True

Function Types #

And back into our comfort zone we go with the function type, which denotes a function from a value of type AA to a value of type BB .

On the logical level, function types correspond to implication—a function implies that we can go from having an AA to having a BB .

For each parameter of type AA , there are BB possible values that the function could return, so the cardinality of ABA \rarr B is:

B×B×...×B×BA times=BA\underbrace{B \times B \times ... \times B \times B}_\text{A\ times}=B^A

And hence functions are exponential types:

ABBAA \rarr B \equiv B^A

Let’s say we have a Colour type with 3 inhabitants, and we want to create a function of type Colour -> Bool. We should expect there to be 23 = 8 possible functions. Let’s enumerate and check:

data Colour = Red | Green | Blue

f0, f1, f2, f3, f4, f5, f6, f7 :: Colour -> Bool
f0 Red   = False
f0 Green = False
f0 Blue  = False

f1 Red   = False
f1 Green = False
f1 Blue  = True

f2 Red   = False
f2 Green = True
f2 Blue  = False

f3 Red   = False
f3 Green = True
f3 Blue  = True

f4 Red   = True
f4 Green = False
f4 Blue  = False

f5 Red   = True
f5 Green = False
f5 Blue  = True

f6 Red   = True
f6 Green = True
f6 Blue  = False

f7 Red   = True
f7 Green = True
f7 Blue  = True

Subtyping Rules #

Before we get into the subtyping rules, let’s briefly go over some notation—as I go through the subtyping rules, I’ll provide a formal definition, a translation of the notation, and the total function that justifies why the rule is true.

The subtype operator <:<\hspace{-0.08cm}: means “is a subtype of”. So A<:BA <\hspace{-0.08cm}: B should be read as “A is a subtype of B”.

I’ll use Gentzen’s notation to represent inferences; premises are shown above a horizontal inference line, separated by commas, and the conclusion is shown below the inference line. For example, the logical statement “If P, then Q. P. Therefore Q.” would be depicted as:

PQ,PQ\frac{P \rightarrow Q, P}{Q}

Now, let’s define how we’ll notate the function that maps subtypes to supertypes.

As I mentioned, a big part of the value of subtyping is about ensuring that new versions of our types (when we make changes to our program) are subtypes of the old types—subtyping over time.

Imagine we have some function mapToOld that maps the new version of a type to the old version of the same type. You can think of this as taking some type, indexed by its version, and mapping it to the same type, but indexed by its previous version. Morally, something like: I won’t put in the effort to make this computable by the GHC. Each time we created a new version of a type, we’d need to add a completely new implementation of the type, then create a new case for mapToOld from that version to the previous.Plus our old functions would all be operating on the old types, and we’d have to either change them manually or iteratively apply mapToOld until we got the correct version of the type. What a pain.

mapToOld :: Versioned (Succ Nat) a -> Versioned Nat a

Where Versioned is a GADT which constructs a versioned type, indexed by the natural numbers.

In fact, when we make some changes to a type, if we’re able to define a simple implementation of mapToOld (that doesn’t take any additional information outside of what’s in the new type), then that implies the new type is a subtype of the old one. Mapping a new type to an old type is really about casting to a supertype. To make this subtyping relation implementable across a wide range of types, I’ll notate it as a typeclass that we provide instances of:

class Subtype a b where
  upcast :: a -> b

This typeclass should be read as “AA is a subtype of BB , and to prove it, I have a function which lets me convert all values of type AA to values of type BB .”

This will be my template for whenever we’re trying to prove that AA is a subtype of BB by defining a concrete implementation. The subtyping rules I’m going to explain are generic over combinations of types, so I’ll define them using parametric polymorphism.We don’t care about what specific types AA and BB are (e.g. strings or integers or floats), we just care about the shape of the generic type constructors that are parameterised over them.

Strictly speaking, the definitions of the new typed versions should be done as variants within a GADT, where different constructors create different indexed types:

data Account (version :: Nat) where
  Account :: Int -> Account 1
  Account' :: String -> Int -> Account 2

instance Subtype (Account 2) (Account 1) where
  upcast (Account' _ balance) = Account balance

And let’s not even consider implementing Versioned. Maybe after I get in some practice with type-level programming. I’ll keep the Haskell notation simple by just providing instances of Subtype between different types, no type-level version indexing.

Also, for context, whenever I notate some concrete datatype as <type>', I’m really talking about the same data type (i.e. a programmer has edited the type) but at a later point in time. If the programmer were to literally create a new type, we would of course have to update all the call sites to use the new type.

So when I provide an implementation of Subtype that maps a new (sub-) type to the old (super-) type, I wouldn’t expect this to run in a program, Unless, of course, we do the GADT-indexed-by-version-type shenanigans. we are merely using it as a way to help us think about how there are legal ways to change a type (i.e. by creating a subtype of it) while not requiring any changes to the rest of the program.

With that bookkeeping squared away, let’s finally dive into the rules of subtyping.

The Base Case #

All types are subtypes of themselves.

A<:A\frac{}{A <\hspace{-0.08cm}: A}

Considering that the purpose of subtyping is “you can substitute a subtype wherever a type is expected and the program will run fine”, this is pretty obvious—of course you can give a value of type AA where a value of type AA is expected. It’s like saying xxx \leq x . It’s true by tautology.

instance Subtype a a where
  upcast = id

Transitivity #

Subtyping is transitive, which is to say that if AA is a subtype of BB and BB is a subtype of CC , then AA is a subtype of C:C:

A<:B,B<:CA<:C\frac{A <\hspace{-0.08cm}: B, B <\hspace{-0.08cm}: C}{A <\hspace{-0.08cm}: C}

This should also be fairly self-explanatory. Morally, this would look like:

instance (Subtype a b, Subtype b c) 
  => Subtype a c where
  upcast = upcast . upcast

But this would cause problems if we were to try to run it in Haskell (due to undecidability), so if you were to use the Subtype typeclass, you’d have to manually compose upcast with itself whenever you need to cast a type up multiple levels of the supertyping hierarchy.

Product Types #

Adding Fields to a Product #

A product type created by adding a field of type BB to a type AA is always a subtype of A:A:

A×B<:A\frac{}{A \times B <\hspace{-0.08cm}: A}

To give a concrete example, imagine we have a Customer type that we use in our program, which currently only stores their name and age:

data Customer = Customer { 
  customerAge :: Int, 
  customerName :: String
}

What happens to the rest of our program if we decide to add a new field for their contact details?

data Customer' = Customer' { 
  customerAge :: Int, 
  customerName :: String
  customerContactDetails :: ContactDetails
}

Nothing! The rest of the program doesn’t expect to operate on a customer’s contact details, and therefore silently ignores the additional information.

Every Customer' can be mapped into a Customer simply by forgetting the additional field, which lets us define upcast:

upcast :: Customer' -> Customer
upcast (Customer' age name _) = Customer age name

And so our new and updated Customer type is a subtype of the old one. Generically speaking:

instance Subtype (a, b) a where
  upcast (x, _) = x

instance Subtype (a, b) b where
  upcast (_, y) = y

Subtyping of Fields #

If you make the fields of a product type into subtypes, that creates a new subtype product type:

A<:A,B<:BA×B<:A×B\frac{A\prime <\hspace{-0.08cm}: A, B\prime <\hspace{-0.08cm}: B}{A\prime \times B\prime <\hspace{-0.08cm}: A \times B}

Imagine we have a product type Order which includes our Customer type:

data Order = Order {
  orderCustomer :: Customer,
  orderItem :: Item
}

After we upgrade the Customer type to its subtype Customer', all the old places which use Order don’t need to change because they can ignore the additional information in orderCustomer (and orderItem is a subtype of itself).

Concretely:

data Order' = Order {
  orderCustomer :: Customer',
  orderItem :: Item
}

upcast :: Order' -> Order
upcast (Order' customer item) = 
  Order (upcast customer) item

Or generically:

instance (Subtype a' a, Subtype b' b) 
  => Subtype (a', b') (a, b) where
  upcast (x, y) = (upcast x, upcast y)

Sum Types #

Deleting Cases From a Sum Type #

Deleting cases from a sum type always produces a subtype.

A<:A+B\frac{}{A <\hspace{-0.08cm}: A + B}

This is a bit confusing. Why doesn’t it work like products, where adding additional information creates a subtype?

Adding a case adds uncertainty to a sum type—there are more possibilities of what case the type could be, which must be tamed via case analysis.

Say we have some ContactDetails data type which we’ve simplified after some time (as none of our customers ever want us to contact them by phone):

data ContactDetails  = CDEmail EmailAddress
                       | CDPhone PhoneNumber

data ContactDetails' = CDEmail EmailAddress

All the places that expect a subtype of ContactDetails can operate just fine if we provide them the type ContactDetails'—they’ll have a case analysis branch that will never get run (and is hence dead code), but otherwise they’ll cope just fine.

Put generically:

instance Subtype a (Either a b) where
  upcast x = Left x

instance Subtype b (Either a b) where
  upcast y = Right y

Alternatively, imagine we tried to produce a subtype by adding cases to a sum type. How would we define it?

instance Subtype (Either a b) a where
  upcast (Left x)  = x
  upcast (Right y) = ???

Subtyping of cases #

If you make the cases of a sum type into subtypes, that creates a new subtype sum type: Say that three times fast!

A<:A,B<:BA+B<:A+B\frac{A\prime <\hspace{-0.08cm}: A, B\prime <\hspace{-0.08cm}: B}{A\prime + B\prime <\hspace{-0.08cm}: A + B}

This is similar to subtyping the fields of a product type. Instead of mapping upcast over all the fields at once, we check to see which case the sum type has, then map upcast over it (while keeping the same structure):

instance (Subtype a' a, Subtype b' b)
  => Subtype (Either a' b') (Either a b) where
  upcast (Left x) = Left (upcast x)
  upcast (Right y) = Right (upcast y)

  -- or, in point-free style:
  upcast = bimap upcast upcast

A Brief Interlude on Type Variance #

Before we get to the last subtyping rule of the bunch, we need to discuss type variance. I started off this article with a bit of an inaccurate generalisation, saying that subtyping is about substituting types for other types. While this is true, it’s not the case that we are always substituting subtypes for supertypes (like I may have led you to believe)—sometimes, we want to substitute supertypes for subtypes.

I’ll start with a concrete example and then we can generalise from there.

We’ll create a simple subtyping hierarchy of three types:

  1. Music
  2. Rock music
  3. Progressive rock music If King Crimson has million fans, then I’m one of them.If King Crimson has one fan, then I’m THAT ONE.If King Crimson has no fans, that means I’m dead.

Progressive rock is a subtype of rock, rock is a subtype of music.

Let’s talk about a Producer<RockMusic>. If some part of our program expects something which produces rock music, we can happily give it a Producer<ProgRockMusic>, because progressive rock is a subtype of rock. We couldn’t give it a Producer<Music>, though—what if it produces rap, or pop?

Note that the ordering of the subtyping goes the same way between the Producer<T> and its underlying type.

We call this covariance—the type constructor and its type argument co- (“together”, “with”, “jointly”) vary.

A<:AF(A)<:F(A)\frac{A\prime <\hspace{-0.08cm}: A}{F(A\prime) <\hspace{-0.08cm}: F(A)}

Where FF is a covariant type constructor of type TypeTypeType \rarr Type .

Both product types and sum types can be thought of as producers of their component values via pattern matching, which is essentially a message to the type asking for them to produce the values they wrap. Hence, they are both covariant—subtyping the fields makes a subtype product type, and subtyping the cases makes a subtype sum type.

Ah, but now let’s talk about a Consumer<RockMusic>. If some part of our program expects something which consumes rock music, we can’t provide it a Consumer<ProgRockMusic>, because the consumer is too discerning—it won’t be able to handle being given glam rock, or heavy metal, or surf rock, and will crash the program. But we can give it a more general Consumer<Music>, which will happily process the rock music that the rest of the program feeds it.

So, the ordering of the subtyping between Consumer<T> and T is reversed!

This is called contravariance, because the type constructor and its type argument contra- (“against”, “opposite”) vary.

A<:AG(A)<:G(A)\frac{A\prime <\hspace{-0.08cm}: A}{G(A) <\hspace{-0.08cm}: G(A\prime)}

Where GG is a contravariant type constructor of type TypeTypeType \rarr Type .

Finally, let’s talk about a type constructor which both consumes and produces RockMusic.

By the subtyping rules on production, the type constructor can only accept subtypes of RockMusic. By the subtyping rules on consumption, the type constructor can only accept supertypes of RockMusic. And so, much like how the only number both greater-than-or-equal-to and lesser-than-or-equal-to some number x is x itself, this type constructor can only accept the same type.

This is called invariance, because the type constructor and its type argument in- (“not”) vary, i.e. they don’t vary at all.

Incidentally, this is one of the problems with mutation. With mutable data types, every data type is both a consumer of its underlying values (via set) and producer of its underlying values (via get) and therefore mutable types are always invariant.

I won’t provide an inference rule as there’s no subtyping relations on an invariant type constructor.


To recap:

Covariant type constructors let us substitute subtypes for supertypes.

Contravariant type constructors let us substitute supertypes for subtypes.

Invariant type constructors don’t let us substitute types at all.

Function Types #

And now, back to our regularly scheduled discussion of subtyping on functions.

There is only one subtyping rule for functions.

A function ABA \rarr B consumes values of type AA and produces values of type BB , so they are contravariant in their parameter types, and covariant in their return type.

A<:A,B<:BAB<:AB\frac{A <\hspace{-0.08cm}: A\prime, B\prime <\hspace{-0.08cm}: B}{A\prime \rarr B\prime <\hspace{-0.08cm}: A \rarr B}

To spell out the variance rules:

  1. The parameters of the subtype function must be supertypes of the supertype function’s parameters
  2. The return value of the subtype function must be subtypes of the supertype function’s return value

In Haskell code, this looks like:

instance (Subtype a a') (Subtype b' b) 
  => Subtype (a' -> b') (a -> b) where
  upcast f = \x -> upcast (f (upcast x))

  -- or, in point-free style
  upcast f = upcast . f . upcast

Concretely, we can create subtypes of our function by either making our parameters into supertypes: by adding cases to a sum type parameter value:

-- supertype of parameter: adding cases to a sum type
f0  :: Either String Int -> String
f0' :: Either Bool (Either String Int) -> String

-- supertype of parameter: removing fields from a product type
f1  :: (Bool, Int, String) -> String
f1' :: (Bool, Int) -> String

Or by making our return value into a subtype:

-- subtype of return value: removing cases from a sum type
f2  :: Int -> Either Bool String
f2' :: Int -> Bool

-- subtype of return value: adding fields to a product type
f3  :: String -> (Bool, Int)
f3' :: String -> (Bool, Int, Float)

The Liskov Substitution Principle #

Finally, there is one thing I’ve not mentioned yet, and that’s the Liskov Substitution Principle:

A subtype should behave like a supertype as far as you can tell by using the supertype methods.

Barbara Liskov

Unfortunately, unlike the structural subtyping rules I’ve gone over, the Liskov Substitution Principle is undecidable by a simple type checker A simple proof: by LSP rules, we would at the least need to be able to prove that the termination/non-termination of each method was the same for a type and its subtype.This is the halting problem. Famously undecidable:w .—it defines subtyping by behaviour essentially in terms of implementing an interface, not in terms of storing information.

Of course, the confusion of subclassing with subtyping is very common thanks to object-oriented languages, but this is a category error.

Of course, this has been confused with the use of inheritance in object-oriented languages, but subclassing is not subtyping. This is a very common mistake, but it’s a category error—subtyping talks about substituting types for other types, subclassing is usually about reusing implementation behaviour.It’s very easy to create a subclass of a class that is nevertheless not a subtype of it.

I will have to revisit behavioural subtyping another time.

Conclusion #

Let me wrap up by collating all the implementations of upcast:

-- all types are their own subtypes
upcast x = x

-- subtyping is transitive
upcast = upcast . upcast

-- product: adding a field
upcast (x, _) = x
upcast (_, y) = y 

-- product: subtyping over fields
upcast (x, y) = (upcast x, upcast y)

-- sum: removing a case
upcast x = Left x
upcast y = Right y

-- sum: subtyping over cases
upcast (Left x) = Left (upcast x)
upcast (Right y) = Right (upcast y)

-- functions: subtyping by supertyping on parameters
--            and subtyping on return value
upcast f = upcast . f . upcast

To sum up, all of these implementations of upcast are total functions which map a subtype to its supertypes. Thus whenever we have some type in a code fragment, we can replace it with a subtype, upcasting if necessary.

Feel free to memorise the rules now. Flashcards are handy.

Anyways, we don’t need to implement upcast in our code for subtyping to be helpful. Just thinking in terms of subtyping can help us write better software, by being mindful of modularity over both time (measured along the world-line of our software’s changing existence) and space (by helping our code stay decoupled).

Firstly, when we update our program’s code, if we can keep our updated data types as subtypes of their previous versions, then we don’t need to update the rest of the program around them–the code will happily operate on the new subtype.

Secondly, if we design our code to operate on simple supertypes, then our old code can operate on very many subtypes and will not need to change in response to us changing data structures around. Our code will have low coupling and we’ll be free to update our data structures as user needs change.

Subtyping lets us safely substitute values of one type for another.