The Curious Case of the Protocol Default

Chris Marshall
ChrisMarshallNY
Published in
7 min readApr 12, 2020

--

Image: studiostoks/Shutterstock

“Nobody expects the Spanish Inquisition!”
– From A Monty Python Sketch

META: This is a copy of the story on my personal Web site.

THE PROBLEM

I often write SDKs, and, when I do so, I like to follow a philosophy that I nickname “SQUID”.

One of the ways that I implement the “S” in “SQUID” (which stands for “Simplicity”), is to provide internal data structures as protocols, instead of classes or structs. It’s not a really big deal, but it does help to reduce the complexity and overhead of using the SDK.

As I was working on an SDK recently, I encountered a very strange issue. When I called methods, or referenced properties, in SDK classes, I got the default implementation behavior from the protocol definition; not the expected class implementation.

Whiskey Tango Foxtrot

Under the hood, the data structure was a class, but the presentation to the SDK user was as a protocol.

This protocol had an extension, in which I provided default implementations. I use this pattern frequently, in order to make protocol conformance “optional.” The user of the protocol doesn’t have to implement the entity (like a variable/property or method/function), as the default implementation “absorbs” the requirement. Like so:

protocol A {
var thisIsRequired: String { get }
var thisIsOptional: String { get }
}
extension A {
var thisIsOptional: String { get { "" } set {} }
}
struct conformsToA: A {
var thisIsRequired: String { "Struct conformsToA" }
}

In the above example, the conformant struct only needed to implement the “thisIsRequired” property, as the protocol extension took care of the “thisIsOptional” property.

Fair ’nuff. Seems pretty straightforward, eh?

There Ain’t No Such Thing As A Free Lunch (T.A.N.S.T.A.A.F.L.)

Protocols are really cool. They are one aspect of Swift that makes it an incredibly powerful language, but protocols are not classes.

That bears repeating: protocols are not classes.

They may smell like classes, because they offer a sort of “poor man’s hierarchy,” but they don’t have polymorphism.

A truly polymorphic class allows a “base class” to define a method or data member, only to have that co-opted by subclasses that derive from the base class. This is usually done via a mechanism called “vtables”. Swift uses this mechanism for classes, but not for structs, enums or protocols.

You can then treat the subclass as an instance of the base class, but when you call the method, or reference the overridden property, the subclass variant of the element is accessed.

You can’t derive from structs or enums. You can only extend them, which is a different mechanism.

When you conform to a protocol in Swift, you use almost exactly the same syntax as subclassing:

protocol PA {    // Defining Protocol PAstruct SA: PA {    // Conforming to Protocol PAclass CA: PA {    // Conforming to Protocol PAclass CB: CA {    // Deriving From (Subclassing) Class CA

This can easily lull us into thinking of conforming to protocols as “the same as” deriving from a class (especially when we are using default implementations in our protocols). That’s a big reason that Apple is so insistent on using the correct language, when discussing the use of protocols.

But under the hood, things are quite different.

When we implement a protocol-defined property/method, we are replacing the default implementation; not overriding it.

With a subclass, you can always use the “super” keyword to access the properties and methods of the class from which your subclass derives.

You can’t do that with protocols. Once you replace the default implementation; it’s gone for your instance.

That means there’s no vtable for protocols. All a protocol is, is a contract. It promises that the instance that conforms to the protocol will present a certain interface.

The default implementation is really just a way to “add value” to a protocol, so that conformance doesn’t have to be too onerous, and it reduces code duplication.

In that respect, protocols with default implementations more closely resemble PHP Traits, than classes (I think I just earned the enmity of an untold number of people by comparing Swift to PHP).

But What Does That Have to Do With Our problem?

Glad you asked. I decided to do some experimenting, and ended up using the following structure (and included playground) for testing:

UML Diagram for My Experiment. Purple is Protocol. Dotted Line means “Conforms To” or “Extends.”

In the above diagram, I have a protocol (PA) that defines a property as required for conformance (myName).

There is one struct (SA) and one class (CA) that conform to that protocol. Because myName is required, they each also define an instance property of myName.

Since CA is a class, we can derive from it (CB), and that subclass overrides the myName property.

We have a second protocol (PB), that extends protocol PA, and adds a default implementation of myName. This effectively makes myName an “optional” property. Conformant entities don’t need to implement it.

We define two structs (SB and SC) that conform to PB, but SB does not define the myName property; instead, relying on the default implementation defined by the PB extension.

We also define two classes (CC and CF) that conform to PB. As before, CC relies on the default implementation of myName, from PB.

We define two classes that derive from these classes, CD and CG. CD defines a first conformant implementation for the myName property. CG overrides the property that was defined in CF.

Then we have one more class that overrides its superclass: CE. CE overrides the CD implementation of myName.

In the playground below, you will see that I defined a few functions to print the values of myName.

If you read the playground, you will see that a straightforward, direct printing of the various structs and classes should result in something along these lines:

 ENTITY     SHOULD PRINT
Struct A "Struct A"
Class A "Class A"
Class B "Class B"
Struct B "Protocol B"
Struct C "Struct C"
Class C "Protocol B"
Class D "Class D"
Class E "Class E"
Class F "Class F"
Class G "Class G"

Note the expected printouts for Struct B (SB) and Class C (CC). These don’t define their own implementations of myName, so they fall back on the default implementation provided by Protocol B (PB).

These should be the only two instances where we see the String “Protocol B” printed.

Throw The Switch, Igor!

Image: Nantz/Shutterstock

Now, let’s run the playground, and see what printouts we actually get:

PART ONE
Direct Print:
Struct A: Struct A
Class A: Class A
Class B: Class B
printAsProtoclA(_: PA):
Struct A: Struct A
Class A: Class A
Class B: Class B
printAsStructA(_: SA):
Struct A: Struct A
printAsClassA(_: SA):
Class A: Class A
Class B: Class B
PART TWO
Direct Print:
Struct B: Protocol B
Struct C: Struct C
Class C: Protocol B
Class D: Class D
Class E: Class E
Class F: Class F
Class G: Class G
printAsProtoclA(_: PA):
Struct B: Protocol B
Struct C: Struct C
Class C: Protocol B
Class D: Protocol B
Class E: Protocol B
Class F: Class F
Class G: Class G
printAsProtoclB(_: PB):
Struct B: Protocol B
Struct C: Struct C
Class C: Protocol B
Class D: Protocol B
Class E: Protocol B
Class F: Class F
Class G: Class G
printAsStructB(_: SB):
Struct B: Protocol B
printAsStructC(_: SC):
Struct C: Struct C
printAsClassC(_: CC):
Class C: Protocol B
Class D: Protocol B
Class E: Protocol B
printAsClassD(_: CD):
Class D: Class D
Class E: Class E
printAsClassF(_: SF):
Class F: Class F
Class G: Class G

Oh, dear. We seem to have some erroneous printouts of “Protocol B” in Part Two (Part One looks fine).

Specifically, Class D (CD) and Class E (CE) are problematic. What do they have in common?

Well, CE derives directly from CD, which derives directly from CC. They are all in the same “family.” We expect CC to print “Protocol B,” but CD and CE should print “Class D” and “Class E,” respectively.

If we look up at the UML diagram, we see that CC did not define an instance of myName. Instead, it relied on the default implementation from Protocol B.

Now, note that this worked fine:

        printAsClassD(_: CD):
Class D: Class D
Class E: Class E

But this did not:

        printAsClassC(_: CC):
Class C: Protocol B
Class D: Protocol B
Class E: Protocol B

The difference between the two, is where the hierarchy started, with the argument passed into the function that did the printing.

It seems that we can’t expect a vtable, if we don’t start with a class that defines or overrides a virtual method.

If we didn’t have the protocol conformance (and default implementation), then this would be a syntax error:

let name = classInstanceCC.myName

That’s because CC never defined the myName property. myName wasn’t explicitly defined in the hierarchy until CD. The only reason we could get away with that statement, was because of the default implementation provided by Protocol B.

This Is A “Loophole”

No language is perfect, but (in my opinion), Swift comes fairly close. This is a rather benign example of a weakness. This issue is a fairly unavoidable by-product of having protocol default implementations.

I’d rather have default implementations, than a syntax error when we try to use an overridden method that was defined in a default implementation.

CONCLUSION

We learned that Swift has a “loophole,” where unexpected behavior can happen, in a fairly “edge” case, where we override methods that are defined in a protocol default implementation, then access those overrides before their virtual implementation.

It’s a bit weird, but now that we know what to look for, it’s easy to avoid.

UPDATE

I was directed towards this Apple bug report. It appears as if it is the cause of this issue.

SAMPLE PLAYGROUND

--

--

Chris Marshall
ChrisMarshallNY

Software engineer and developer of Things Apple, living in New York.