Sunday, December 10, 2017

Exploring Type Annotations in Scala

Do you come across code like this in Scala: def apply[T <% Mappable[T]](x: T): T or class ReferenceQueue[+T <: AnyRef] { .. } or def setValue[T1 >: T](value: T1): T? Code that, whenever you try to decipher it, makes your brain grind to a halt and fail to parse what it is seeing? At which point you silently go berserk and swear at this crazy thing called "Scala"?

Ok, take a deep breath. This post is for you. It will teach you what you need to know in order to read and demystify such incantations, a skill that comes in handy when encoding, or reading code that uses, Type classes in Scala.

This post is the 3rd post in the Typeclass knowledge pack series. It will take a brief look at types, type constructors, type parameters and the type annotations.


Type annotations are being explored in a series on Type-class because, as explained in the first post, Type-class in Scala is a pattern that is encoded using various language features. Type annotations happen to be one of the language features that are needed. This is why it is important to have a working knowledge of what they are about.

A little clarification before I proceed. In this post, I will be making use of the phrase "Java developers" a lot. This is not intended to be derogatory. For all it is worth, I still write code using Java myself. The idea behind the appellation is to stress the fact that the mental models developed while working in one language can easily and quickly become a stumbling block to appreciating concepts in another language. And since I see myself in the expression "Java developers", using that phrase also helps me in chronicling some of the mental blocks I had to do away with in order to appreciate the paradigm shift and new concepts that come with working with a language like Scala.

With the disclaimer out of the way, let us proceed in trying to learn how to decipher Scala code that seems impossible to understand, and see how this will help when it comes to encoding the Type-class pattern.

To start, let us revisit the concept of Types and classes.

What do you think when you think about classes?

For a long time, as a Java developer, whenever I thought about classes I only thought of them as being blueprints for creating objects.

I am pretty sure I am not the only one with this mental model. I would wager most Java developers do. Which is to be expected as the blueprint metaphor is a popular one often used when explaining classes and their relations to objects and object creation.

But there is yet another mental model you can have regarding classes.

This model requires seeing classes as a mechanism to ascribe a Type.

This approach is not totally foreign to most Java developers. In fact, most Java developers would think of "class" as being synonymous with "type".

You will see this if you listen carefully to Java developers discussing. You will probably hear something like: "I do not know why the code is not compiling, even though I am passing the expected type to the method" or "When you define that method, you have to make sure the return type implements the interface"

But when I say "see classes as a mechanism to ascribe a Type", I am referring to "type" in a sense that is similar to Sets from mathematics, and not as a synonym for class.

So if we want to talk about synonyms, the only synonym that would be accurate in this discussion is one that sees Types as a Synonym for Sets.

And what does that mean?

From elementary mathematics, we understand that a Set can be seen as a collection of well defined and distinct things which often share similar characteristics; hence why they belong to a set.

So you can have a Set of students in a particular dormitory, a Set representing primary colors, a Set of fruits, etc.

So seeing classes as a mechanism to ascribe a Type (which we are allowed to think of as a Set) means seeing classes as a mechanism to delineate the values that should belong together and be part of a certain Type.

So if I define a class I call Human, I am in essence saying, I have a Set which I choose to name Human. And all the instances that can be created by calling new on the Human class, are the acceptable values within this Human Set.

The next step in departing from the "class is just a blueprint for creating objects" metaphor towards the "class is a way to describe a Type" one is to confront the fact that you can have a Type without having a class.

You can create a Type, without defining a class. I explain how.

In the example of having a Human class, the number of possible inhabitants of the Set (or Type) is unbounded, since there is no limit to the number of Human instances you can create with the new keyword (if you ignore the limitation imposed by the available memory on your computer).

But you do have other situations where a given type has a known, finite set of values. A good example of this is the Boolean type, whose values can only be true or false.

There are programming languages whose Type system allows us to better model cases where the inhabitants of a type are finite and known. An example of such is Haskell, where you can see something like:

data Boolean = TRUE | FALSE

Which defines a type called Boolean, whose value can be either TRUE or FALSE.

These are called algebraic data types. In this case, as you can see, a type was defined without creating a class.

But that is Haskell, you might counter, and this post is not about Haskell but about Scala with some references made to Java. So show how it is possible to create types without class in these languages!

True, this post is about Scala. Also true, Scala and Java do not have dedicated syntax for algebraic data types the way Haskell does, but they do have something which we can use to model ADTs. And this can be used to drive home the point that you can have types without your traditional class definition.

This something is enums in Java, or Enumeration in Scala. Choosing to go with Java, we can see how they allow us to specify types with finite inhabitants without using classes.

So if you come across a piece of Java code like this:

public SubscriptionStatus status = ...

You can refer to SubscriptionStatus as a type...

But you will be wrong if you assume that this means there is a class called SubscriptionStatus defined somewhere else...because SubscriptionStatus could be an enum.

That is:

enum SubscriptionStatus { PAID, TRIAL, BANNED, CLOSED}

in which case, we still end up with a SubscriptionStatus type, which delineates a type that can contain exactly 4 values: PAID, TRIAL, BANNED and CLOSED. And this was done without the need for a class definition or instance creation using the new keyword.

So it is possible to have the concept of Types without classes. And it is possible to create values that inhabit the described Types, without using new to create an object instance, off a class blueprint.
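Since Scala's Enumeration was mentioned earlier, here is a minimal sketch of how the same idea could look on the Scala side. scala.Enumeration is part of the standard library; the wrapper object EnumerationSketch and the names status and inhabitantCount are made up purely for illustration.

```scala
object EnumerationSketch {
  // A type with exactly four inhabitants, defined without a class of our own
  object SubscriptionStatus extends Enumeration {
    val PAID, TRIAL, BANNED, CLOSED = Value
  }

  // The type of the values is SubscriptionStatus.Value; no `new` is involved
  val status: SubscriptionStatus.Value = SubscriptionStatus.TRIAL

  // The set of inhabitants is finite and known up front
  val inhabitantCount: Int = SubscriptionStatus.values.size
}
```

The point of the sketch is only that status has a type whose values are all listed in one place, which is the set-like view of types described above.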

The main paradigm shift here is to be able to tear away the blueprint mental model from class. From my experience, this is not an easy task. But this needs to be done.

The big idea here is to start seeing Types as a distinct concept. As a distinct thing. As a first-class citizen. And that a class is just a way to define a type (albeit not the only way). And also the fact that a type can be viewed as a set. And since all sets should contain things, the whole new instance creation is just a way to fill up this set.

The point is not that the blueprint mental model is wrong, just that, as with any mental model, it might limit how you think about things.

For example, you might choose to model numbers using Roman numerals (I, V, X etc) instead of Arabic numerals (1, 2, 3 etc). There is nothing incorrect about doing this, but certain operations like multiplication and division, or representing numbers in binary, would be a lot more complicated to do using Roman numerals than using Arabic numerals.

And this is what a mental model does. It can either limit or unleash how we think about things.

What I found out, from my own experience, was that the class as a blueprint mental model held me back from truly appreciating type level thinking and made thinking about things like type constructors, type parameters, type annotations more cumbersome.

What will it look like to view classes as a way to delineate values, in a set-like fashion? What thoughts will this mental model make easier? Let us find out as we take a step back to view classes in this fashion.

Classes as a way to delineate values; AKA delineate Types

Let us take a look at some examples to help solidify the idea of seeing classes as a way to ascribe types, where types are a distinct, set-like concept. The idea here is to exercise a mental model that is different from class as a blueprint.

So when you see a class definition like this:

class Country(name: String, continent:String)

do you see this as a blueprint you can use to create instances of Country? ie:

// Country is a plain class (not a case class), so instances are created with new
val nl: Country = new Country("Netherlands", "Europe")
val ng: Country = new Country("Nigeria", "Africa")

or

do you see Country as a way to describe a type, a set, which will only contain values that are country objects?

The idea is to start seeing it as a set.

Let us try another one, this time without having a complete class definition. Let's have:

class Gender (...) {
...
}

Do you see this as a blueprint for creating Gender objects using the new keyword, or as a mechanism to delineate a type which can contain whatever gender identity is out there? Try to see it without the blueprint-for-creating-objects mental model...

Now let's try this:

class Box[T]

Which in Java, will be written as class Box<T>

How will you interpret this using the set metaphor? The Gender example was clear enough: it defined a set of genders...but what does a Box[T] define, especially when T is unknown?

It could be a Box[Dog], in which case it can be seen as a set: a set which describes boxes that contain dogs...but it could also be a Box[Cat], in which case it will be correct to see it as a set which captures boxes that contain cats...it could be a Box[Parrot], etc.

It seems Box[T] is not well-formed yet...I mentioned trying to see "Types" as concrete things, as first-class citizens, but it seems impossible to think of Box[T] in such a concrete manner until the T is specified.

Hence we can say Box[T] is not a concrete set yet, or better put, not a concrete type, but it becomes a concrete concept the moment we supply what T is.

If you think about it, it would be perfectly acceptable to say Box[T] provides us a mechanism to construct a type. Box of ? does not mean much, but Box of Cat, is a Type which delineates a box that contains cats. Just as Box of Dog, is a concrete type which delineates a box that contains dogs.

Hence why structures like Box[T] are formally called Type Constructors. Since they allow us to create concrete types (read concrete set) from them. They come with holes, and once we provide another type that can fill the hole, we create a concrete type.

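The types that fill the holes provided by a Type constructor are referred to here as Type parameters. Dog is a type parameter in Box[Dog] just as Cat is a type parameter in Box[Cat]. (Strictly speaking, the placeholder T in Box[T] is the type parameter, while Dog and Cat are the type arguments supplied for it; this post uses "type parameter" loosely for both.)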
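To make the idea of filling the hole concrete, here is a small sketch. Dog, Cat and the wrapper object TypeConstructorSketch are illustrative names I am assuming for the example; only Box[T] comes from the discussion above.

```scala
object TypeConstructorSketch {
  case class Dog(name: String)
  case class Cat(name: String)

  // Box on its own is a type constructor: it has a hole named T
  case class Box[T](thing: T)

  // Filling the hole with Dog or Cat yields two distinct concrete types
  val dogBox: Box[Dog] = Box(Dog("bingo"))
  val catBox: Box[Cat] = Box(Cat("katty"))

  // val wrong: Box[Dog] = Box(Cat("katty")) // does not compile: a Box[Cat] is not a Box[Dog]
}
```

Seen through the set metaphor, Box[Dog] and Box[Cat] are two different sets, both built from the same constructor.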

And things get more interesting with type parameters in Scala because Scala provides us with mechanisms to "annotate" these type parameters.

Annotating type parameters means we can specify certain characteristics these type parameters must have.

This is when you start seeing Scala code that looks like Box[+Cat], or Box[-Cat], or Box[Cat <% Animal] or Box[Cat: Animal].

The type parameter, Cat above is not just provided. It is provided together with type annotations.

These annotations then provide us with additional information about the Type parameter and its relations to the type constructor.

There are four categories of type annotation I have been able to identify when it comes to type parameters. And these include:
  1. Annotation for specifying type bounds 
  2. Annotation for specifying variance relationships
  3. Annotation for specifying view bound
  4. Annotation for specifying context bound

1. Annotation for specifying type bounds
A definition of Box[T] means that the type parameter T is fully parametric, meaning that it could be anything. The universe of possible Ts is infinite. 

Sometimes we want to limit the possible values that can be passed on to T. And sometimes we want to make use of subtyping relationship to delineate the acceptable possible values of T.

We use the upper and lower bound annotations in Scala to denote this restriction.

Upper Bound Annotation
For example, we can say T is acceptable only if it is a subtype of another specified type, or if it is that specified type itself. This is referred to as setting the upper bound, and Scala uses the syntax T <: UpperBoundType to denote this.

To illustrate, if we have a Box[T] and we want to say the Box should only contain instances of Fruit or of its subtypes, then we can define it as follows:

// T is defined to only be subtype of Fruit
case class Box[T <: Fruit](thing:T)
class Fruit(name: String)

// Apple is subtype of fruit
case class Apple() extends Fruit ("apple")

// compiles since Apple is a subtype of Fruit
Box(Apple())

case class Stone()
// does not compile since stone is not a subtype of fruit
Box(Stone())

Lower bound Annotation
Or we can say T is acceptable only if it is a supertype of another specified type (or that specified type itself). This is referred to as setting the lower bound. And Scala uses the syntax T >: LowerBoundType to denote this.
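A lower bound can be sketched in the same spirit as the upper bound example. Everything here (Basket, put, GrannySmith and the wrapper object LowerBoundSketch) is a hypothetical name chosen for illustration, and Fruit is redefined with a val so its name can be read back.

```scala
object LowerBoundSketch {
  class Fruit(val name: String)
  class Apple extends Fruit("apple")
  class GrannySmith extends Apple

  // T is acceptable only if it is Apple or a supertype of Apple
  class Basket[T >: Apple] {
    // The bound guarantees every Apple is also a T, so this always type checks
    def put(apple: Apple): T = apple
  }

  val fruits = new Basket[Fruit] // compiles: Fruit is a supertype of Apple
  val apples = new Basket[Apple] // compiles: Apple itself satisfies the bound
  // new Basket[GrannySmith]     // does not compile: GrannySmith is a subtype, not a supertype, of Apple

  val stored: Fruit = fruits.put(new GrannySmith)
}
```

Note that the bound is checked on the type argument given to Basket, not on the values passed to put, which is why a GrannySmith instance is still accepted where an Apple is expected.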

2. Annotation for specifying Variance relationships
In languages that support subtyping, variance relationship describes how the subtyping relationships between individual types translate to the subtyping relationship between composite types formed using these individual types.

Basically how the subtyping relationship in type parameters, translates to the types created when they are passed to type constructors to create other concrete types.

For example, if Dog is a subtype of Animal, what relationship exists between Box[Animal] and Box[Dog]?

There are 3 kinds of variant relationship that can exist: covariance, contravariance, and invariance.

Scala allows us to indicate this relationship using type annotations. We go over these three variance-related annotations next:

Covariance Annotation
A covariant relationship stipulates that if you have two types where one is a subtype of the other, their subtype relationship carries over, in the same direction, to the types they create when they are used as type parameters.

For example, if Dog is a subtype of Animal, then a Box of dog, Box[Dog], is a subtype of a Box of animal, Box[Animal]. Scala uses the plus symbol + to indicate that a covariant relationship exists.

So, for example the definition of the Box type constructor will look like Box[+T]
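A minimal sketch of what the + buys us, assuming an Animal/Dog hierarchy and a wrapper object CovarianceSketch that I made up for the example:

```scala
object CovarianceSketch {
  class Animal(val name: String)
  class Dog(name: String) extends Animal(name)

  // +T marks Box as covariant in T
  case class Box[+T](thing: T)

  val dogBox: Box[Dog] = Box(new Dog("bingo"))

  // Compiles only because Box is covariant: Box[Dog] is a subtype of Box[Animal]
  val animalBox: Box[Animal] = dogBox
}
```

If the + is removed from the definition of Box, the last assignment should no longer compile.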

Contravariance Annotation
Contravariance is the flip of the covariant relationship. It stipulates that the subtype relationship between type parameters is reversed for the types they are used to construct.

In a contravariant case, if we have Cat as a subtype of Animal, then Box[Animal] is a subtype of Box[Cat]. This relationship is annotated using the minus symbol -

Thus the definition of Box with a contravariant relationship on its type parameter would look like Box[-T]
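Here is a small sketch showing only the mechanics of the annotation, without trying to justify it. The names accepts, animalBox, catBox and the wrapper object ContravarianceSketch are assumptions for the example. One thing the compiler enforces is that a contravariant T may only appear in input positions, which is why this Box takes a T rather than holding one.

```scala
object ContravarianceSketch {
  class Animal(val name: String)
  class Cat(name: String) extends Animal(name)

  // -T marks Box as contravariant in T
  trait Box[-T] {
    def accepts(thing: T): String
  }

  val animalBox: Box[Animal] = new Box[Animal] {
    def accepts(thing: Animal): String = s"holding ${thing.name}"
  }

  // Compiles only because Box is contravariant: Box[Animal] is a subtype of Box[Cat]
  val catBox: Box[Cat] = animalBox
}
```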

Do not worry if the contravariant relationship appears unintuitive. It is not you. The relationship is indeed counter-intuitive. But since the topic of this post is not to explore variance relationships, but to highlight various type annotations, I won't attempt an explanation of how the contravariant relationship can actually make sense (a topic for a separate blog post, I would say). For the purpose of this post, knowing that Box[-T] means that a contravariant relationship exists is enough.

Invariant Annotation
This is when there is no relationship between the subtype relationship of type parameters and the type they create.

This means that even if Dog is a subtype of Animal, it does not mean Box[Dog] is a subtype of Box[Animal]. With an invariant relationship, Box[Animal] is a separate, distinct type from Box[Dog], despite the fact that a subtype relationship exists between Animal and Dog.

There is no special annotation to represent invariance. Hence Box[T] indicates invariance. Invariance is thus the default case.


3. Annotation for specifying view bound
The view bound annotation and the next one, that is, the context bound annotation, require a working knowledge of implicits to fully appreciate what is being said. If thinking about implicits still makes you feel dizzy, I recommend reading both Understanding Implicits in Scala and Implicit Scope and Implicit Resolution in Scala before proceeding.

The view bound annotation is used to indicate that a type parameter can, in essence, be viewed and hence used as if it were of another type.

The syntax takes the following form:

[TypeParameter <% TypeItShouldBeSeenAs]

This annotation requires that there is an implicit conversion in scope that can convert from TypeParameter to TypeItShouldBeSeenAs.

This can be illustrated if we examine the following code snippets:

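// enables implicit conversions without a compiler feature warning
import scala.language.implicitConversions

case class Animal(name:String, speak: String)

// A <% Animal: an implicit conversion from A to Animal must be in scope
case class Box[A <% Animal](animal:A) {
  // animal is an A, but the view bound lets it be used as an Animal
  def speakFromBox: String = {
    animal.speak
  }
}
// the implicit conversion the view bound relies on
implicit def stringToAnimal(value: String): Animal = value match {
  case "dog" => Animal(value, "woof")
  case "cat" => Animal(value, "meow")
}
// returns "woof"
Box("dog").speakFromBox
// returns "meow"
Box("cat").speakFromBox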

Even though strings are passed into Box when creating instances, the speakFromBox method ends up being able to use the string that was passed in as if it were an Animal.

This is because, if we look at the definition of Box we see:

Box[A <% Animal](animal:A)...


which means, whatever type A is, convert it to an Animal using implicit converter that should be available in the implicit scope.

Hence why we have the stringToAnimal defined as an implicit. Which is what does the string to Animal conversion. Without this in scope, the code won't compile.

If you squint hard enough, the <% symbol will look like a pair of binoculars. Which fits into the narrative of what it does...binoculars that look at a type as if it were another type :)

It should be mentioned that using view bounds is discouraged, as the syntax has been deprecated. It is mentioned in this post for completeness' sake. Plus, you will probably still run into code using it in the wild.
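For reference, the usual replacement for a view bound is to ask for the conversion as an explicit implicit parameter. The sketch below assumes Scala 2.12 or 2.13; the parameter name asAnimal, the fallback case in stringToAnimal and the wrapper object ViewBoundSketch are my own additions for illustration.

```scala
import scala.language.implicitConversions

object ViewBoundSketch {
  case class Animal(name: String, speak: String)

  implicit def stringToAnimal(value: String): Animal = value match {
    case "dog" => Animal(value, "woof")
    case "cat" => Animal(value, "meow")
    case other => Animal(other, "...") // fallback added here so the match is total
  }

  // Same requirement as Box[A <% Animal], spelled out as an implicit parameter
  case class Box[A](animal: A)(implicit asAnimal: A => Animal) {
    def speakFromBox: String = asAnimal(animal).speak
  }

  val dogSound: String = Box("dog").speakFromBox
  val catSound: String = Box("cat").speakFromBox
}
```

Read this way, a view bound is just shorthand for "there must be an implicit function from A to Animal available where Box is created".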

4. Annotation for specifying context bound
This annotation indicates that for a Type parameter A, there exists a value in implicit scope whose type is built from another type constructor, with A passed as the type parameter of that other type constructor.

Its syntax takes the form of [A : B].

So, for example, if I see the following: case class Box[A : Animal](animal:A)... it means that:
  • Box is a type constructor that takes A as a type parameter
  • For whatever A is provided, there must exist a value of type Animal[A] in the implicit scope.
Here is a slightly more elaborate code snippet to illustrate this.

trait Animal[A] {
  def speak: String
}
case class Dog(name:String)
case class Cat(name:String)

implicit val dogAnimal: Animal[Dog] = new Animal[Dog] {
  override def speak: String = "Woof"
}
implicit val catAnimal: Animal[Cat] = new Animal[Cat] {
  override def speak: String = "meow"
}

case class Box[A : Animal](animal:A) {
  def speakFromBox = {
    implicitly[Animal[A]].speak
  }
}

// returns "Woof"
Box(Dog("bingo")).speakFromBox
// returns "meow"
Box(Cat("katty")).speakFromBox

As seen, the Box is defined as Box[A : Animal], which means that for whatever type A is, there should be an implicit value of type Animal[A] in scope. And that Animal[A] value would be made available automatically within the Box type.

This means when A is Dog, there must be an implicit value of type Animal[Dog] in scope. If A is Cat, there must be an implicit value of type Animal[Cat] in scope. And these implicit values would be available for use within Box.

So with Box(Dog("bingo")).speakFromBox, an instance of Dog is being passed into Box, which means the A in this scenario is Dog. For this to compile, there must be an implicit value of Animal[Dog] in scope. And this is exactly what the dogAnimal implicit value is.

Because of this, the implementation within Box, can reach out to this implicit value by using the implicitly method.
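It can help to see what the context bound stands for when written out by hand: an extra implicit parameter list. This is a sketch under that reading; the parameter name evidence and the wrapper object ContextBoundSketch are hypothetical, and only Dog is included to keep it short.

```scala
object ContextBoundSketch {
  trait Animal[A] {
    def speak: String
  }
  case class Dog(name: String)

  implicit val dogAnimal: Animal[Dog] = new Animal[Dog] {
    override def speak: String = "Woof"
  }

  // Box[A : Animal] written out: the Animal[A] value arrives as an implicit parameter
  case class Box[A](animal: A)(implicit evidence: Animal[A]) {
    // With a named parameter there is no need to call implicitly
    def speakFromBox: String = evidence.speak
  }

  val sound: String = Box(Dog("bingo")).speakFromBox
}
```

With the context bound syntax the parameter has no name, which is why the version with [A : Animal] has to fetch it using implicitly.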

While you can read a view bound to mean that an implicit conversion must exist in scope, you can read a context bound annotation to mean that an implicit value must exist in scope.

And it is this context bound annotation that comes in handy when encoding Type-class pattern.

Because, as mentioned in Revisiting Polymorphism: It is more than Inheritance and Subtyping, the idea behind Type-class is to have polymorphism where the same operation can be applied to different types, as long as these various types can provide the required evidence.

The context bound annotation allows us to express this idea easily.

Basically, if you have polymorphic code that uses the context bound annotation [A : B], we can view this as stating that we can operate on A, where A can be of any type. The only requirement is that for whatever type A is, there must be a B[A], where the B[A] represents the evidence needed in order to perform a polymorphic operation on A.
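As a rough sketch of that reading of [A : B], consider one polymorphic method that works for any A with the required evidence. Speaker, speakAll and the wrapper object EvidenceSketch are hypothetical names invented for this example; this is not the full Type-class encoding, just the shape of it.

```scala
object EvidenceSketch {
  // Plays the role of B in [A : B]
  trait Speaker[A] {
    def speak(a: A): String
  }
  case class Dog(name: String)
  case class Cat(name: String)

  implicit val dogSpeaker: Speaker[Dog] = new Speaker[Dog] {
    def speak(a: Dog): String = s"${a.name} says woof"
  }
  implicit val catSpeaker: Speaker[Cat] = new Speaker[Cat] {
    def speak(a: Cat): String = s"${a.name} says meow"
  }

  // Works for any A, provided a Speaker[A] is available as evidence
  def speakAll[A: Speaker](things: List[A]): List[String] =
    things.map(thing => implicitly[Speaker[A]].speak(thing))

  val dogs: List[String] = speakAll(List(Dog("bingo"), Dog("rex")))
  val cats: List[String] = speakAll(List(Cat("katty")))
  // speakAll(List(1, 2, 3)) // does not compile: there is no Speaker[Int] in scope
}
```

Dog and Cat share no supertype here (other than the ones every case class has), yet speakAll operates on both, because the evidence is supplied separately from the types themselves.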

We will look into the exact machinery to express this in the next blog post, which will be Encoding Type class in Scala (yet to be published).

Conclusion

Now let's revisit the impenetrable code snippets at the beginning of this post and let's see if they now make some sense...

def apply[T <% Mappable[T]](x: T): T or class ReferenceQueue[+T <: AnyRef] { ... } or def setValue[T1 >: T](value: T1): T

def apply[T <% Mappable[T]](x: T): T can be read as a method that takes any type T as long as there is an implicit conversion that can convert T into a Mappable[T]

ReferenceQueue[+T <: AnyRef] { .. } can be read as a type constructor that takes any T as long as T is a subtype of AnyRef (or AnyRef itself). The +T also tells us that ReferenceQueue is covariant in T: whatever subtype relationship exists between two types used for T is preserved, in the same direction, between the concrete ReferenceQueue types built from them.

def setValue[T1 >: T](value: T1): T can be read as a method that takes a type of T1 as long as T1 is a supertype of T.

It turns out these incantations can easily be understood. As long as the basic ideas are understood, reading them becomes straightforward.

The other thing worth stressing is the need to transcend the class as a blueprint metaphor, and to start seeing classes as a way to delineate values, much like sets in mathematics.

This type level way of thinking would come in handy not just with type-classes, but with other aspects of Scala.

This post thus brings to a close the overview of the features of the Scala language that should be understood in order to easily follow the Type-class pattern and appreciate the machinery involved in encoding it in Scala.

The next post, Encoding Type class in Scala will thus bring into convergence all the previously highlighted Scala features and show how they are used to encode the Type-class pattern.


3 comments:

Pantalejmon said...

"It should be mentioned that using Context bound is discouraged as the syntax has been deprecated"

Should be:"It should be mentioned that using view bound is discouraged as the syntax has been deprecated."

dade said...

@Pantalejmon Indeed. Thanks for spotting that. i have updated accordingly.

Anonymous said...

Great post. Note - your link to encoding type class at the bottom of the page is broken. It works from https://www.geekabyte.io/2017/11/exploring-typeclass-in-scala-knowledge.html