Saturday, September 22, 2018

Easing Into Cats, And The Case For Category Theory Inspired Abstractions

Cats is a lightweight, modular, and extensible library for functional programming in Scala. On its website, it is stated that the name “cats” is a playful shortening of the word, category - so it has nothing to do with the animal: cat (if you ever wondering), but with category theory :)

It is a library that I have personally found to be helpful in making functional programming in Scala less confusing. It does this by being more concrete and more obvious with regards the various patterns and abstractions that arises from a valid logical conclusion of programming with functions in a statically typed language that has higher-kinded types.

Unfortunately, due to the relative unfamiliarity with functional programming and the various abstraction that comes with it, the Cats library is usually approached with hesitation amongst a sizable amount of software developers.

The sentiments I have seen expressed directly or indirectly is that introducing Cats would inadvertently fling developers into the deep ends of various unknown, academic and good for nothing, category theory inspired abstractions, that only confuses and add complexity. Since nobody wants this, the decision usually gets made not to bother with a library like Cats.

Nothing can be farther from the truth.


Yes, for someone coming from an OOP background, the Cats library will introduce new concepts that are totally foreign. But it will be an error to then conclude that such concepts are impractical solely due to the fact that they are unfamiliar and not understood. That would be confusing unfamiliarity with complexity. Which should not be the case, especially when these concepts can be learned if they are approached with an open mind.

Thus the sentiment that you need to have a Ph.D. in category theory to do FP in Scala using Cats is misguided, as it is very much possible to adopt the Cats library and start getting practical, real-life benefits from it, without getting “forced” or “thrown”, or “dragged” into any terrain of abstractions you are not familiar with yet.

This does not mean that the theory and abstraction backing such a library should not be learned, it just means that it is possible (and practical) to gently ease into Cats, and then, at your own pace or your team's pace, start learning the functional abstractions that Cats provide and then gradually apply them as needed.

The other hesitations I see center around doubting the usefulness of the library. I guess this is just a result of its perceived complexity. What better way to avoid learning something deemed complex than to cast doubts over the usefulness of that thing? but again, nothing can be farther from the truth.

In this post, I will be making the case that, it is not only possible to get started using Cats, but it is also a very practical thing to do as it offers features that help make solving everyday programming task easier.

Note that all the reservations mentioned in this post, and all the argument refuting them and implementations showing the usefulness of Cats, also applies to the Scalaz library, which is the other category theory inspired library for doing functional programming in Scala.

Approach

One of the ways you can start benefiting from Cats is by using the various combinators and higher-order functions it provides. Doing this does not require a Ph.D. in any category theory inspired abstraction as programming with higher-order functions is one of the basic and elementary concepts that comes with functional programming.

So this is the way it is going to work: I will describe a problem, I will then present a solution to the problem using Scala with just the standard library and an alternative implementation using the Cats library.

I will strive to be as mundane as possible with the problems. That is, no task involving Fibonacci series, or building of DSLs for a fictional calculator or any fancy type-level wizardry. The tasks would be as real-life as possible, boring and relatable to any developer building business application software.

And by the way, a special shout-out to wongiseng who humored my request and helped with providing the plain Scala solution to the problems listed below :)

Comparing Implementations: Without and With Cats

Task 1: Merging two maps

Given two maps of type Map[Sting, List[String]], combine them to create a resulting map where identical keys from the three maps have their list values concatenated in the combined map.

So for example, combining Map("k1" -> List("One"), "k2" -> List("Zero")) and Map("k1" -> List("Two")) should give Map("k1" -> List("One", "Two"), "k2" -> List("Zero")).

Use the following test data for implementation:

val movies1 = Map(
"Will Smith" -> List("Wild Wild West", "Bad Boys", "Hitch"),
"Jada Pinkett" -> List("Woo", "Ali", "Gotham")
)

val movies2 = Map (
"Will Smith" -> List("Made in America"),
"Angelina Jolie" -> List("Foxfire", "The Bone Collector", "Original



Resulting map should contain same information as below:

Map(Will Smith -> List(Wild Wild West, Bad Boys, Hitch, Made in America), Angelina Jolie -> List(Foxfire, The Bone Collector, Original Sin), Jada Pinkett -> List(Woo, Ali, Gotham))

Implementation using only standard library:



Implementation using Cats library:



Task 2: Generalise merging of maps

Task one is limited to merging just two maps. This task is generalising the merging of maps to support multiple maps.

Implement a varadic method that takes any amount of maps and merge. The type signature of maps to be merged os Map[String, List[String]]

def mergeMaps(maps: Map[String, List[String]]*): Map[String, List[String]] = ???

Implementation using only standard library:



Implementation using Cats library:



Task 3: Mapping over nested structures

The task is about how to map over nested structures like Future[Option[value]], or List[Option[value]] such that given such nested structure we can manipulate the value they contain.

In this task the nested structure we would work with is List of Try of Ints. List[Try[Int]]. The task is given a List[Try[Int]] transformation is to add 1 to each value in the nested wrapper.

Use the following test data:

val nested: List[Try[Int]] = List (
Success(1),
Success(2),
Failure(new Throwable("Not valid")),
Success(3),
)


And define a method, given the above data would add one to the int value.

def mapNested(nested: List[Try[Int]]) = ???

Implementation using only standard library:



Implementation using Cats library:



Task 4: Validate list of values (fail fast scenario)

In this task, we want to take a list of Ints, apply a function that validates if the int value is even. If all the ints are even, return all of the values. If a number that is not even is encountered, stop the validation and return the invalid int value.

This task can be used to represent a fail fast validation scenario.

To implement this task, use the following:

// List of ints
val listOfInts: List[Int] = (1 to 10).toList
val listOfAllEven: List[Int] = List(2,4,6,8)

// validation function
def isEven(value: Int): Either[String, Int] = {
if (value % 2 == 0) {
   Right(value)
} else {
   Left(s"$value is not even")
  }
}

def validateEven(input: List[Int])(f: Int => Either[String, Int]): Either[String, List[Int]] = {
// Implement
???
}

validateEven(listOfInts)(isEven) // should say 1 is not even
validateEven(listOfAllEven)(isEven) // return a list of List(2,4,6,8)



Implementation using only standard library:



Implementation using Cats library:



Task 5: Validation of domain properties (fail slow scenario)

In this task we want to construct a Person object, but only if all properties required are valid. If any or all of the properties are not valid, then report the validation failure.

Our Person object would look like this:

case class Person(name: String, age: Int)

Implement the method below:

def validateAndCreatePerson(name: String, age: Int): Either[Vector[String], Person] = {
// to implement
???
}


The method takes name and age, if these values are valid, return a Right[Person]. If there are validation errors, return them in a Left[Vector[String]].

The validation rules to be applied is as follows:
  1. Name should be invalid if it contains numeric values
  2. Age should be invalid it is less or equal to 0
Validation should not stop at the first encounter of a validation failure. All validation should be applied and collated.

This task can be used to represent a fail slow validation behaviour.

The implementation should have the following behaviour:

validateAndCreatePerson("John", 20)
// return res0: Either[Vector[String],Person] = Right(Person(John,20))

validateAndCreatePerson("John1", 20)
// return res0: Either[Vector[String],Person] = Left(Vector(Name cannot contain numbers))

validateAndCreatePerson("John", -1)
// return res0: Either[Vector[String],Person] = Left(Vector(Age cannot be zero or less than zero))

validateAndCreatePerson("John1", -1)
// return res0: Either[Vector[String],Person] = Left(Vector(Name cannot contain numbers, Age cannot be zero or less than zero))



Implementation using only standard library:



Implementation using Cats library:



Task 6: File processing

For this task, we will imagine that we need to process a file containing lines of text in other to produce some aggregated information

The aggregated information we are interested is, is to know how many words are in the file, how many character are in the files, and the number counts of each words.

This aggregated information should be put in a tuple of type signature:
val result: (Int, Int, Map[String, Int])

You start off with a list, containing all the lines in a text file and you end up with a tuple containing number of words, number of characters and number of occurrence per words.

This task was inspired by taken from the "Oh, all the things you'll traverse" talk by Luka Jacobowitz

Use the following test data for implementation:

val lines: List[String] = List(
"A bird in the hand is worth two in the bush",
"Left hand doesn't know what the right hand is doing",
"Like a moth to a flame"
)

def process(lines: List[String]) : (Int, Int, Map[String, Int]) = {
// to implement
???
}

process(lines)
// should return res0: (Int, Int, Map[String,Int]) = (27,92,Map(Like -> 1, in -> 2, is -> 2, what -> 1, hand -> 3, bird -> 1, two -> 1, A -> 1, a -> 2, moth -> 1, to -> 1, know -> 1, doing -> 1, worth -> 1, Left -> 1, flame -> 1, doesn't -> 1, bush -> 1, right -> 1, the -> 3))



Implementation using only standard library:



Implementation using Cats library:



Task 7: Chaining asynchronous calls that can return or not return values

In this task, we would combine various calls, that are either asynchronous, might return/not return values, or both.

The set of operations we want to chain together is as follows:

  1. Given a user id, fetch a user. User might be found or not found
  2. If user is found, proceed to fetch membership info for the found user.

To help with this task, make use of the following test data and stub implementations:

// domain objects representing users and membership information
case class User(id:Int)
case class MembershipInfo(mtype: String)

// stub implementation for getUser that only returns a user when called with id of 1
def getUser(id: Int): Option[User] = {
if (id == 1) {
  Some(User(id))
} else {
 None
 }
}

// stub implementation of getMembershipIfo that only returns membership info when called with a user
// with id of 1
def getMembershipInfo(user:User): Future[Option[MembershipInfo]] = {
if (user.id == 1) {
  Future.successful(Some(MembershipInfo("Standard")))
} else {
  Future.successful(None)
 }
}

// makes use of both getUser and getMembershipInfo to get the membership info given a user id
def fetchMembershipInfo(id: Int): Future[Option[MembershipInfo]] = {
// to implement
???
}


The following behaviour is expected:

// should return Some(MembershipInfo(Standard))
Await.result(fetchMembershipInfo(1), Duration.Inf) 

// should return None
Await.result(fetchMembershipInfo(2), Duration.Inf)

Implementation using only standard library:



Implementation using Cats library:



Extra task: Generalise map

You have noticed that in Scala there are couple of data types that have the map function. For example, Either, Option, Try, Future, etc are example of such data types.

As good software developers, we know whenever we see a pattern and things getting repeated, it is always a sign that we should generalise. Which is what we are going to do in this task.

This task is about having a generalised map function, which can take either Either, Option, Try, Future and a function, and then apply the function to the value in the data type.

If such a function is named genericMap Then it should be possible to use it this way:

genericMap(Option(2))(_ + 1) // prints Some(3)
genericMap(Try(2))(_ + 1) // prints Success(3)
genericMap(Right(2): Either[Int, Int])(_ + 1) // prints Right(3)
genericMap(Future.successful(2))(_ + 1) // should return Future(3)


Implementation using only standard library:



Implementation using Cats library:



One thing to note though, is that the Scala implementation is not truly generalised, as there is a need to update the code in other to support other data-types that can be mapped. For example, trying to use the current genericMap function with Lists data type would lead to a compile error. This is not the case with the Cats implementation as that is truly generalised. The genericMap function implemented with the Cats library can used with any data type that is mappable. If you know of any implementation using just plain Scala that can be used to achieve similar level of generalisation, please do share in the comment section.

Observations

I posit that the solutions implemented using Cats are simpler to read and understand. They require less code and generally improves the programming ergonomics when compared with the solution that makes use of just the Scala standard library.

And as can be seen, all that was needed to achieve this simplicity was mostly the higher order functions in Cats.

No intimate knowledge of any academic/abstract abstraction was needed.

But you know the cool thing? Every single one of the Cats based solution above is based on a solid foundation of one or more category theory inspired abstractions! Which means that if you also spend time to learn about these abstractions, you get to understand the magic that makes the solutions possible, hence you would not be limited to only what the Cats library provide out of the box, you would be able to wield the library in more powerful ways, to solve problems and model abstractions that cater to your own specific needs!

As this post is not meant to be an expository of these abstractions, I would only mention the abstractions and FP patterns that appear in each of the solutions to the tasks above. Curious readers can then proceed to look them up and learn more about them.
  • Task 1 and 2 makes use of Monoids
  • Task 3 builds on Functors and the fact that Functors compose
  • Task 4 is possible because it makes use of some properties of Monads
  • Task 5 involves Applicatives or lax monoidal functors as it is formally known in maths
  • Task 6 involves Functors, Monads, and Monoids
  • Task 7 involves Traversable functors and Applicative

All of these examples make use of type-classes to achieve polymorphism.

So, as you can see, all these "scary words" actually do have real-life, practical applications :)

Conclusion

No need to be apprehensive about adopting a library like Cats. You can start using it and start getting real-life benefits right away. When it is time to go deeper and learn the more powerful concepts that the library is based on, you can do so at your own pace and you will soon realize that even the examples above does not in anyway scratch the surface of the usefulness of Cats.

Also, using abstractions inspired by category theory in FP is not also just an academic endeavor, it is a very practical thing to do. This is because such abstraction can easily be applied to real-life problems. They also provide more formalized guarantees, because the stipulations that arise from them are not based on whims, opinions or trends, but on formalized mathematical foundations.

Any new topic or concepts would appear like jargon and impenetrable until it is learned. This is no different when it comes to various patterns and abstraction (inspired or not inspired by category theory) that are unique to functional programming. But the nice thing is that there are also loads of resources available out there that can aid with learning.

When it comes to learning cats, you can start off by perusing the documentation pages: The section on type classes and data types is a good place to start. There is also the Scala with Cats book, by Noel Welsh and Dave Gurnell, which introduces monads, functors, and other functional programming patterns and how these concepts are implemented and can be put to good use in Cats.

There is also the very active gitter chat-room where you can pop in and ask any Cats/Scala/FP related questions. Folks are very helpful over there, and you can be rest assured you will get your questions answered.

So finally, no need to fear the Categories, You can ease into it, start reaping practical benefits right away, and when you ready for more, you can start exploring various new concepts that come with using the library, rest assured that: there are resources and a healthy community to help with this.



2 comments:

Bruno Medeiros said...

I think this makes a good show case of some of Cats functionality, however some of the standard library examples could be simplified. For example, fetchMembershipInfo in task 7 can be just:

def fetchMembershipInfo(id: Int): Future[Option[MembershipInfo]] = {
getUser(id)
.map(getMembershipInfo)
.getOrElse(Future.successful(None))
}

dade said...

Still not as concise and straight to the point as:


def fetchMembershipInfo(id: Int): Future[Option[MembershipInfo]] = {
getUser(id)
.flatTraverse(getMembershipInfo)
}