Tuesday, December 19, 2017

Encoding Type class in Scala

This post looks into how to encode the Type class pattern in Scala. It is the fourth post in the Type-class knowledge pack series.

The previous posts in this series covered the various language features of Scala that needs to be understood before approaching how to encode the Type class pattern.  As previously explained, Type class is not a native language features in Scala; it is a pattern that uses other native language features. Thus to be able to reason along with how the Type class is encoded, a good working knowledge of the main language feature that comes into play when encoding the pattern is needed.


The first post in the series, Revisiting Polymorphism. Tip: It is more than inheritance, was about developing a broader appreciation for what polymorphism is and how Type class pattern is one way towards achieving polymorphic code. The main goal of that post was to remove whatever mental block might exist due to not fully appreciating polymorphism in its broadest sense.

The second post, Review of Implicit and Implicit resolution, was about understanding implicit resolution in Scala. A feature that is very central to how the Type-class pattern is encoded.

The third post, Exploring Type Annotations in Scala, explored the various type annotations that type parameters could have in Scala.  It also touched on the limitations of seeing classes as just blueprints to creates objects and the need to treat types as first-class citizens and embrace type level thinking.

If you do not have a good working knowledge of the topics mentioned above, then it is advised to first go through all the previous posts in this series before continuing. It would be of tremendous help.

Components of Type class pattern

The Type class pattern consists of three components:
  1. The Type class, which is a trait with at least one type variable.
  2. Type class instance. This is the concrete implementation of the Type class for particular types
  3. An interface that allows applying the needed polymorphic operations over various types 
In this post, we would take a look at what exactly these components are, and how they are wired together to encode the Type class pattern.

Instead of taking the often so popular approach of explaining these individual components, I will attempt a different approach in which I build up a model that seeks to achieve ad-hoc polymorphism via Type classes, and in so doing, we will arrive at these components and see how they fit together.

Hopefully, this would make the explanation more accessible.

How is the Type class encoded

In Revisiting Polymorphism, we established that the Type class pattern allows us to write polymorphic code. Albeit ad-hoc polymorphism.

In that post, we had the EU custom control procedure at airports as an illustration of how Type class is an example of polymorphism.

We are going to continue with this EU custom control example as we explore the machinery of how the Type class pattern is encoded in Scala.

The gist of polymorphism is to be able to apply the same operations, transparently over different types of things. In the EU custom control example, the same operation to be applied is passport control for EU travelers. While the different travelers from the different EU countries represent the different types of things.

It was also established in that post, that polymorphism comes in different forms. One of such is ad-hoc polymorphism implemented via Type class.  A crucial part of this kind of polymorphism is the presence of "an evidence" and in the EU custom control example, the evidence is represented by the possession of an EU passports.

The idea is that, once a type can be qualified (or in more formal terms, constrained) by the existence of an evidence, then the generic operations can be applied.

So once a traveler can be seen to possess an EU passports (regardless of which EU country), the same passport control operations meant for EU travelers can be applied.

The goal now is to see how we can model this polymorphism in code. Let's give that a try.

Modeling A Type class based polymorphism

We can have an attempt at modeling a polymorphic Custom check operations that does passport control in this way:

object EUCustomCheck {
  def control(traveller: Dutch): Boolean = ???
  def control(traveller: German): Boolean = ???
  def control(traveller: Romanian): Boolean = ???
  def control(traveller: Polish): Boolean = ???
  ....
}

Where basically, we have functions listed in the EUCustomCheck object that takes care of all EU nationalities.

This would enable us to do something like EUCustomCheck.control(Dutch()) or EUCustomCheck.control(Polish())

But this approach would not lead us to a Type class based ad-hoc polymorphism. (In fact, it is actually another type of ad-hoc polymorphism, but one implemented via function overloading)

So let us give it another shot.

object EUCustomCheck {
   def control(traveller: EUTraveler): Boolean = ???
  ....
 }

class Dutch extends EUTraveler
class Serbian extends EUTraveler.

Which would allow us to do EUCustomCheck.control(Dutch())or EUCustomCheck.control(Serbian())

But again, this does not look like Type class polymorphism. In fact, this is polymorphism via subtyping.

Ok, let us give it a last try, but this time revisiting carefully the components in the EU custom check example.

In that example, we have an operation: a passport control operation, and we have travelers this operation should be performed on. We then have an evidence, which is an EU passport, which permits this operation to be performed on an EU traveler.

In code, we can model this as:
object EUCustomCheck {
  def control(traveller: Dutch, passport: DutchPassport): Boolean = ???
  def control(traveller: German, passport: GermanPassport): Boolean = ???
  def control(traveller: Romanian, passport: RomanianPassport): Boolean = ???
  def control(traveller: Polish, passport: PolishPassport): Boolean = ???
  ....
}

Which would allow the use via EUCustomCheck.control(Dutch(), DutchPassport())or EUCustomCheck.control(Romanian(), RomanianPassport())

Ok, we are sort of getting there. We have something in code that looks like the example we are working with. Which is, apply an operation, only if the object and the evidence are present.

But we are not there yet. The modeling can still be improved.

If we look at the code again, we might realize that there is no need to make the passport concrete in the EUCustomCheck.control method. Because at the end of the day, the passport control operations do not discriminate on whichever EU passport is present, as long as it is an EU passport.

To include this in the modeling, we modify the code a little bit and introduce a trait to represent EU passports. So we have:

trait EUPassport {
def validPassport: Boolean
}

// implementation of the EUPassport trait
object DutchPassport extends EUPassport {
override def validPassport: Boolean = ???
}
object PolishPassport extends EUPassport {
override def validPassport: Boolean = ???
}

Then the EUCustomCheck can look like this:

object EUCustomCheck {
   def control(traveller: Dutch, passport: EUPassport): Boolean = ???
   def control(traveller: German, passport: EUPassport): Boolean = ???
   def control(traveller: Romanian, passport: EUPassport): Boolean = ???
  def control(traveller: Polish, passport: EUPassport): Boolean = ???
  ....
 }

Ok, with this, the EUCustomCheck.control method no longer cares about the exact type of the passport as long as it is an EU one. But as you notice, we still have a lot of repetition. Why? Because what we have now, still cares about the exact nationality of the traveler.

But, would it not be better to have a modeling where the EUCustomCheck.control does not care about the exact country of the traveler also? but only cares if the traveler is from a country that possesses an EU passport.

That is, apply an operation to an object, no matter what the object is, as long as the object possess something...which is sometimes called an evidence (we would soon get to its more technical term in a bit)

Ok so let us see how we can update the modeling to remove the repetition and make it more in line with the example:

case class Dutch()
case class German()

trait EUPassport[T] {
  def validPassport(traveller: T): Boolean
}
object DutchPassport extends EUPassport[Dutch] {
  // sure in real life, there would be real validating logic
  override def validPassport(traveler: Dutch): Boolean = true
}
object GermanPassport extends EUPassport[German] {
  // sure in real life, there would be real validating logic
  override def validPassport(traveler: German): Boolean = true
}

object EUCustomCheck {
  def control[T](traveller: T, passport: EUPassport[T]): Boolean =
    passport.validPassport(traveller)
}

Which can be used like this:

EUCustomCheck.control(Dutch(), DutchPassport)
EUCustomCheck.control(German(), GermanPassport)

Now, when you look at the definition of EUCustomCheck.control, it is exactly as polymorphic as we want. It exposes an operation which takes any traveler T, as long as there is an implementation of EUPassport for that traveler.

The EUPassport[T], which up until now has been referred to as the "evidence" is the Type class.

It is the trait that abstracts the constraint that is used to deliminate which types would be allowed to be operated on polymorphically.

The DutchPassport and GermanPassport above are referred to as the Type class instance. They provide concrete evidence, (if we choose to put it that way) for the different objects.

Dutch and German above are the objects that need to be operated on in a uniform way. And they are linked to the type class instance by being the type parameter in the definitions of the type class instances.

In the code snippet above, you would see that in the definition of GermanPassport German is passed as the type parameter to EUPassport. This is also exactly how the DutchPassport is defined.

And last but not the least, we have the CustomCheck.control definition itself that provides an interface for us to use the code in a polymorphic fashion which is

object CustomCheck {
  def control[T](traveller: T, passport: EUPassport[T]): Boolean =
    passport.validPassport(traveller)
}

So, in summary, we can say the machinery for encoding the type class pattern consist of the following components:
  1. A Type class (a trait with type parameters that describes the evidence)
  2. Type class instances (implementations of the trait for each object required to provide an evidence)
  3. The interface via which the polymorphic operation is applied 

And this modeling, my friends, captures the essential part of the type class pattern.

But it does not end here. There is one more crucial thing remaining that makes the assembling of the machinery complete in Scala. 

And that is employing the implicit feature to make supplying the type class instance a lot more seamless. Basically, we will be making the Scala compiler work for us. 

Making use of Implicits

Looking at the code snippet below:

object EUCustomCheck {
  def control[T](traveller: T, passport: EUPassport[T]): Boolean =
    passport.validPassport(traveller)
}

We see that to use the EUCustomCheck.control method, one needs to explicitly pass in the object (traveler) together with its type class instance (EU passport).

But would it not be cool to be able to only specify the traveler, and let the passport be automatically provided for us by the Scala compiler? And if the Scala compiler cannot find the required passport (i.e. the required type class instance), then the compilation should fail?

This is possible by making use of implicits. To do this, we slightly modify the definition of EUCustomCheck.control. This looks like this:

object EUCustomCheck {
  def control[T](traveller: T)(implicit passport: EUPassport[T]): Boolean =
    passport.validPassport(traveller)
}

We break the parameter definitions into two, and we tag the last one with the implicit keyword, which would make it a candidate to be automatically provided for us by the Scala compiler in cases where it is omitted.

The next thing we need to do is to mark our concrete evidence, our type class instances with the implicit keyword. This would make them viable options to be included in the parameter list by the Scala compiler.

implicit object DutchPassport extends EUPassport[Dutch] {
// sure in real life, there would be validating logic
override def validPassport(traveler: Dutch): Boolean = true
}

implicit object GermanPassport extends EUPassport[German] {
// sure in real life, there would be validating logic
override def validPassport(traveler: German): Boolean = true
}

Let's take a look at these changes together with the rest of the code snippet

case class Dutch()
case class German()

trait EUPassport[T] {
  def validPassport(traveller: T): Boolean
}
implicit object DutchPassport extends EUPassport[Dutch] {
  // sure in real life, there would be real validating logic
  override def validPassport(traveler: Dutch): Boolean = true
}
implicit object GermanPassport extends EUPassport[German] {
  // sure in real life, there would be real validating logic
  override def validPassport(traveler: German): Boolean = true
}
object EUCustomCheck {
  def control[T](traveller: T)(implicit passport: EUPassport[T]): Boolean =
    passport.validPassport(traveller)
}
// still works because the implicit is automatically resolved
EUCustomCheck.control(Dutch())
// still works because the implicit is automatically resolved
EUCustomCheck.control(German())

In the above example, the missing parameter list is automatically provided because they are in the local scope. As usual, the Scala compiler would scan the implicit scope for values that are suitable.

You can revisit the post on Implicit Scope and Implicit Resolution in Scala to learn the other places the implicit values can be and the Scala compiler would still find it. This knowledge is important as it does help library authors in deciding where to put default implicit type class instances. It also helps an application developer to know the options available for packaging type class instances.

And because we have the implicit mechanism, that is why we could have the code below still work:

EUCustomCheck.control(Dutch())
EUCustomCheck.control(German())

In cases where we pass in a traveler that does not possess an EU passport, aka, a traveler without an instance of EUPassport[T] in implicit scope, then the compilation would fail. For example something like this:

EUCustomCheck.control(American()) 

won't work. It would fail at compile time.

The last thing we are going to touch on is how Context bound and implicitly function, two other language features of Scala are usually applied when putting type class pattern together.

Using Context Bound Annotation and the implicitly function

If you take a look at the definition of the control method, you might have a feeling it is a little bit verbose. That is:

...
def control[T](traveller: T)(implicit passport: EUPassport[T])
...

We can make things snappy by making use of the context bound annotation. That is, a context bound annotation like [T: Bound] which allows us to specify that for any type T there must be an implicit value of Bound[T].

Which in essence, is what we are doing. We want that for any traveler T there must be an instance of EUPassport[T].

You can go check out the post Exploring Type Annotations in Scala for an overview of context bound annotation.

Modifying the code, to include usage of context bound would leave us at:

object EUCustomCheck {
  def control[T: EUPassport](traveller: T): Boolean = ???
}

As we can see, since we have removed the (implicit passport: EUPassport[T]) portion of the method definition, we would no longer have access to passport in the method body. For example, the line passport.validPassport(traveler) would lead to a compile error since passport won't be found anymore.

But this is where the implicitly function comes to our rescue. The implicitly function allows us to reach into the implicit scope and get hold of an implicit value of interest.

In this case, we need to grab a hold of an implicit value of type EUPassport[T], which was what the passport variable was. but since it is now gone, we can replace it with: implicitly[EUPassport[T]].validPassport(traveller)

Taking together the updated code snippet would look like:

case class Dutch()
case class German()

trait EUPassport[T] {
def validPassport(traveller: T): Boolean
}

implicit object DutchPassport extends EUPassport[Dutch] {
// sure in real life, there would be real validating logic
override def validPassport(traveler: Dutch): Boolean = true
}

implicit object GermanPassport extends EUPassport[German] {
// sure in real life, there would be real validating logic
override def validPassport(traveler: German): Boolean = true
}

object EUCustomCheck {
def control[T: EUPassport](traveller: T): Boolean =
// implicitly[EUPassport[T]] allows us to grab a hold
// of the implicit that would be in scope due to the use
// of the context bound
implicitly[EUPassport[T]].validPassport(traveller)
}

// still works because the implicit is automatically resolved
EUCustomCheck.control(Dutch())
// still works because the implicit is automatically resolved
EUCustomCheck.control(German()) 

And with that, we are able to make use of context bound annotation ( together with the help of the implicitly function) to make the type class encoding a little less verbose

Conclusion

In this post, we saw there are three main components that comprise the type class pattern.
  1. The Type class
  2. The Type class instances
  3. The interface via which the polymorphic operation is applied
We also saw how to define these three components and put them together in other to achieve polymorphism. To a large extent, these things capture the core essence of the Type class pattern in Scala.

Every other thing you see regarding Type class in Scala can be seen as common practices or convenience that slightly modify the way the components are put together. In this post, we already explored one of these common practices: which is the usage of Context bound and implicitly function.

In the next post, Common patterns of Type class, we will look into more common patterns that are employed while encoding the Type class pattern.


No comments: