The Design of Types


One of the recent changes in Swift 1.1 has been the treatment of fromRaw initializers of enumerations. These initializers have been used quite a bit, for example by Natasha the Robot or Swiftris, a Tetris clone in Swift. Coming from a functional programming background, I was a bit surprised that people would actually want to manipulate the raw values underlying enumerations. This blog post aims to explain the way I think about enumerations (and a few other Swift types), which may differ from the intuitions of developers coming from Objective-C.

In Objective-C, enumerations are used all the time. For example, the NSStringEncoding enumeration lists the different possible encodings that an NSString might have:

// Enumerations in Objective-C
enum NSStringEncoding {
    NSASCIIStringEncoding = 1,
    NSNEXTSTEPStringEncoding = 2,
    NSJapaneseEUCStringEncoding = 3,
    NSUTF8StringEncoding = 4,
    // ...
};

Each of these encodings is represented by a number; the enum allows programmers to assign meaningful names to the integer constants associated with particular character encodings.

In Swift, the situation is a bit different. Where you can do crazy things in Objective-C, like assert that NSASCIIStringEncoding + NSNEXTSTEPStringEncoding == NSJapaneseEUCStringEncoding, Swift introduces a new type for each enumeration, entirely different from the underlying integers. As a result, you can’t add together two members of an arbitrary enumeration in Swift.
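
To make the difference concrete, here is a minimal sketch of what a Swift enumeration with raw values looks like. The Encoding type and its case names are hypothetical stand-ins for the Objective-C enumeration above, and the sketch uses current Swift syntax rather than the syntax of the release discussed in this post.

```swift
// Hypothetical Swift counterpart of the Objective-C enumeration above.
enum Encoding: Int {
    case ascii = 1
    case nextstep = 2
    case japaneseEUC = 3
    case utf8 = 4
}

// Dropping down to the underlying integer is an explicit step.
let raw = Encoding.utf8.rawValue

// Going back up can fail, so the initializer returns an optional.
let known = Encoding(rawValue: 3)
let unknown = Encoding(rawValue: 42)

// The following line does not compile, because Encoding is not a number:
// let sum = Encoding.ascii + Encoding.nextstep
```

Here known holds the japaneseEUC case, while unknown is nil because no case has the raw value 42.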

I always thought of the rawValue mechanism as an escape hatch to drop back down to the world of Objective-C, rather than something you should use in Swift if you can help it. I don’t want to say that using rawValue is wrong in any way, but rather that it doesn’t fit with my intuition. Years of Haskell programming have warped my brain in interesting ways. One important lesson I’ve learned is that designing the right types for your problem is a great way to help the compiler debug your program.

In the rest of this blog post, I want to share a few short examples that illustrate this lesson.

String sanitization

Anyone who has done any web programming in the 90’s will know about the importance of sanitizing user input. If you’re not careful, users may add SQL queries to their input, allowing them unintended access to your database. (XKCD has a great comic about this.)

The problem you keep running into is that every time the user provides you with a String, you should validate that it is safe before processing it any further. One obvious way to do this is by defining a function:

func sanitize(userInput : String) -> String

There is one drawback to this approach: whenever I have a String in my code, it is up to me to keep track of whether or not it has been sanitized. If you’re not careful you can sanitize the same String more than once (slightly bad) or not at all (really bad). For a small project with a single developer, the problem is still manageable. For a larger project with many developers, this can go wrong all too easily.

Instead of trying to keep track of this ourselves, we can actually teach the compiler to do this for us. To do so, we introduce two new types, SanitizedString and UserInput:

typealias UserInput = String

struct SanitizedString {
    let value : String

    init(input : UserInput) {
        value = sanitize(input)
    }
}

We now have two distinct types: UserInput and SanitizedString. As soon as we try to mix the two, providing raw UserInput to a function that expects a SanitizedString, we will get a type error. Put differently, you no longer need to track which strings have been sanitized; the compiler does this for you. Even better, the only way to produce a value of type SanitizedString is by running the input through the sanitize function. As long as that function does its job, we can safely assume any value stored in a SanitizedString is safe to use.
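
The following sketch puts the pieces together in current Swift syntax. Two parts are assumptions made purely for illustration: the body of sanitize, which only doubles single quotes and is not a real defence against SQL injection, and buildQuery, a hypothetical consumer that accepts only sanitized strings.

```swift
typealias UserInput = String

// Placeholder sanitizer (an assumption, not a production-grade one):
// it only doubles single quotes.
func sanitize(_ input: UserInput) -> String {
    var result = ""
    for character in input {
        if character == "'" {
            result += "''"
        } else {
            result.append(character)
        }
    }
    return result
}

struct SanitizedString {
    let value: String

    // Declaring this initializer in the struct body suppresses the
    // memberwise one, so every value passes through sanitize.
    init(input: UserInput) {
        value = sanitize(input)
    }
}

// Hypothetical consumer: it cannot be called with a plain String.
func buildQuery(name: SanitizedString) -> String {
    return "SELECT * FROM users WHERE name = '\(name.value)'"
}

let input: UserInput = "O'Brien"
let query = buildQuery(name: SanitizedString(input: input))

// The following line does not compile, because input is a String:
// let unsafe = buildQuery(name: input)
```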

This example may be a bit outdated: nowadays there are great libraries for database access that handle string sanitization for you. But the same problem keeps popping up in many different situations: constructing REST queries; writing shell scripts; assembling file paths; the list goes on. By investing more thought into the design of our types, we can avoid having to manipulate the raw underlying strings and instead focus on getting the important stuff right.

User registration

Let me give another example of choosing your types wisely.

Suppose you want to allow your users to register with their email address or Facebook account, or both. In Objective-C, we could create a UserAccount class with two fields, emailAddress and facebookAccount, either of which could be nil. Similarly, in Swift we might use optionals and tuples to write something like:

typealias UserInfo = (EmailAddress?, FacebookAccount?)

Unfortunately, this type contains ‘junk’ values:

let bogus : UserInfo = (nil, nil)

Yaron Minsky has a great slogan: make illegal states unrepresentable. In other words, if we want to avoid the situation where a UserInfo has neither an email address nor a Facebook account associated with it, we should make it impossible to represent this situation. By being more careful about the design of our types, we hope that our code will be easier to maintain in the long run.

The solution using enumerations may seem awkward at first:

enum UserInfo {
    case Email (EmailAddress)
    case Facebook (FacebookAccount)
    case EmailAndFacebook (EmailAddress, FacebookAccount)
}

We need to introduce a separate type with three different members, each with different associated values. That may seem like a lot of work. What have we gained? Quite a lot, I’d argue: we have turned a piece of business logic into an invariant that can never be broken.

For example, suppose you want to allow users to remove the information about their Facebook account. Using the naive approach with tuples, this is an easy operation to implement:

func removeFacebookInfo(userInfo : (EmailAddress?, FacebookAccount?)) -> (EmailAddress?, FacebookAccount?) {
    return (userInfo.0, nil)
}

This function may, however, break the invariant we mentioned previously: throwing away the Facebook account information, when we have no alternative email address.
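
A short sketch shows the problem in action. It is written in current Swift syntax, and the two typealiases are stand-ins assumed here only so that the example compiles on its own.

```swift
// Stand-in types, assumed only to make the sketch self-contained.
typealias EmailAddress = String
typealias FacebookAccount = String

func removeFacebookInfo(_ userInfo: (EmailAddress?, FacebookAccount?)) -> (EmailAddress?, FacebookAccount?) {
    return (userInfo.0, nil)
}

// A user who registered with a Facebook account only.
let facebookOnly: (EmailAddress?, FacebookAccount?) = (nil, "example.account")
let result = removeFacebookInfo(facebookOnly)

// result is now (nil, nil), the 'junk' value, and the compiler
// had no way of noticing.
```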

Working with enumerations does not make this problem magically go away. We can still only return valid new account details if we have both the user’s email address and Facebook account. However, the Swift compiler will reject a switch that leaves out any of the cases, such as the problematic case where we only have a user’s Facebook account information. One way around this is to return an optional value, using nil to indicate that the operation failed:

func removeFacebookInfo(userInfo : UserInfo) -> UserInfo? {
    switch userInfo {
    case let .EmailAndFacebook(email,_):
        return UserInfo.Email(email)
    default:
        return nil
    }
}
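
As a variation, here is a sketch of the same function in current Swift syntax that spells out every case instead of using default. The EmailAddress and FacebookAccount structs are hypothetical placeholders, and the behaviour is meant to mirror the function above.

```swift
// Hypothetical placeholder types for the associated values.
struct EmailAddress: Equatable { let address: String }
struct FacebookAccount: Equatable { let identifier: String }

enum UserInfo: Equatable {
    case email(EmailAddress)
    case facebook(FacebookAccount)
    case emailAndFacebook(EmailAddress, FacebookAccount)
}

func removeFacebookInfo(_ userInfo: UserInfo) -> UserInfo? {
    switch userInfo {
    case let .emailAndFacebook(email, _):
        // An email address remains, so the result is still valid.
        return UserInfo.email(email)
    case .email, .facebook:
        // Same outcome as the default branch above: the operation fails.
        return nil
    }
}

let email = EmailAddress(address: "user@example.com")
let account = FacebookAccount(identifier: "example.user")

let both = removeFacebookInfo(.emailAndFacebook(email, account))
let facebookOnly = removeFacebookInfo(.facebook(account))
```

Without a default branch, adding a fourth case to UserInfo later would stop this function from compiling until the new case is handled, which is exactly the kind of help from the compiler this post argues for.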

There is clearly a price to pay here for more accurate type information: we have had to introduce a separate enumeration; any function manipulating this enumeration must use switch statements; we can no longer work with simple types like tuples. Is it really worth it?

I’d argue it is. Implementing an operation like removeFacebookInfo correctly using tuples is going to be just as complicated. To make matters worse, the compiler cannot warn us about functions that (unintentionally) break the desired invariant. A judicious choice of type will lead to code that is easier to maintain and better structured in the long run.

Discussion

So, what does this have to do with raw values? Not much, arguably. But for me, enumerations are an important piece of technology that allows you to capture information about your data precisely. For example, Chris Eidhof has a nice blog post where he describes how to use enumerations to construct REST queries.

Manipulating the raw values underlying enumerations can be very handy at times, in the same way it can be useful to use bitshift operators to implement some low-level algorithm. For me, however, the most interesting challenges are in wielding enumerations, tuples, and structs effectively. Learning to use these language features well is an important part of coming to grips with Swift.

About the author

Wouter Swierstra is a lecturer at the University of Utrecht. Amongst other things, he recently wrote a book about Functional Programming in Swift.