Sunday, May 13, 2018

Thoughts On Working With Nested Monad Within The Future Monad In Scala

This post is about nested contexts, specifically with Future, or more accurately, a monad nested within the Future monad, e.g. Future[Either[E, A]], Future[Option[A]], etc. It is also about how such nested contexts can easily lead to a codebase that is hard to read and hence hard to maintain.

This sort of context nesting and its impact is not limited to Future, but I will use Future as the reference in this post, strictly because it is the form that appears the most in a particular Scala codebase I work on at my day job.

Also in this post, I will use Monad and Context interchangeably to mean the same thing. I will also mention terms like Monads, Monad Stack, Monad Transformers, or For Comprehension without explaining them. This is because the thrust of this post is not to expound on these concepts but to share some recent thoughts I have been having regarding nested contexts within Future.

A reader not familiar with these concepts should still be able to follow along though.


Why We End Up With Nested Context Within Future.

Why do we end up with a Future of an Either, a Future of an Option or, in a Play application, a Future of a JsResult?

If you are developing in Scala and you are only using the standard library (i.e. no Cats or Scalaz), then sooner or later you will end up having to deal with Future.

This is almost inevitable as you would have to perform asynchronous operations like making a remote HTTP call or executing a database query. The library or framework you use for such operations would most likely return the results of these operations in a Future.

And since you also want to make use of the Scala type system to model your program properly, you end up with APIs that encode the result of their asynchronous actions not just as a Future of a plain value, but as a Future of values of type Either, Option, etc.

For example, an HTTP request to get a value of type A can successfully return with that value or fail with an error E. Using Scala's type system to model our application to be as descriptive as possible, we make use of the Either type to capture this possibility of an error. So instead of having an API that returns a Future[A], we end up with a return type of Future[Either[E, A]].

And there goes our first embedded context within a Future.

My submission in this post is that, more often than not, such nested contexts in Future are a code smell and should be avoided at all costs, since they lead to hard-to-read code. This conclusion is borne out of my experience with a Scala codebase at work which is riddled with such nested Futures.

The rest of this post makes this case.

Why nesting Contexts within Future is bad

To make this case, let us imagine we have two functions: a getIdByEmail function that retrieves a user ID given an email, and a getPostsById function that retrieves a list of posts given a user ID. Each of these functions returns a Future of an Either.

Let us imagine the implementations of these two functions are as follows:

// getIdByEmail

import scala.concurrent.{Await, Future}
import scala.language.implicitConversions
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._ // needed for 3.seconds further down

def getIdByEmail(email:String): Future[Either[String, Int]] = Future {
 if (email.endsWith("@gmail.com")) {
   Right(1)
 } else if (email.endsWith("@yahoo.com")) {
   Right(2)
 } else {
   Left("No User with given email")
 }
}

and

// getPostsById

import scala.concurrent.{Await, Future}
import scala.language.implicitConversions
import scala.concurrent.ExecutionContext.Implicits.global


def getPostsById(id:Int): Future[Either[String, List[String]]] = Future {
 if (id == 1) {
   Right(List("Post title 1", "Post title 2"))
 } else {
   Left("No posts found")
 }
}

We can then make use of these two functions to retrieve a list of posts given an email. An implementation that makes use of these functions might look like this:

val result: Future[Either[String, List[String]]] =
  getIdByEmail("mee@gmail.com").flatMap {
    case Right(id) =>
      getPostsById(id).map {
        case Right(posts) => Right(posts)
        case Left(error) => Left(error)
      }
    // No id was found, so short-circuit with the error
    case Left(error) => Future.successful(Left(error))
  }

println(Await.result(result, 3.seconds))

//updated: initially was 'case Right(books) => books'

I do not know about you, but code that looks like the above always leaves a bad taste in my mouth.

The combination of nested map/flatMap coupled together with pattern matching does not make for code that is easy to read.

The snippet above uses just two functions that return a nested context in a Future, and yet we already have code that does not read easily at first glance. What happens if we throw a third or fourth function into the mix? It gets unwieldy really quickly.
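To give a feel for that, here is a rough sketch with a third step added. The commentCount function and its dummy logic are made up purely for illustration, and idByEmail and postsById are simplified stand-ins with the same return types as the two functions above.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Stand-ins with the same shape as getIdByEmail and getPostsById
def idByEmail(email: String): Future[Either[String, Int]] = Future {
  if (email.endsWith("@gmail.com")) Right(1) else Left("No User with given email")
}

def postsById(id: Int): Future[Either[String, List[String]]] = Future {
  if (id == 1) Right(List("Post title 1", "Post title 2")) else Left("No posts found")
}

// Hypothetical third step: the "count" is just the title length (dummy logic)
def commentCount(postTitle: String): Future[Either[String, Int]] = Future {
  if (postTitle.nonEmpty) Right(postTitle.length) else Left("No such post")
}

// Every extra step adds another level of flatMap plus pattern matching
val firstPostComments: Future[Either[String, Int]] =
  idByEmail("mee@gmail.com").flatMap {
    case Left(error) => Future.successful(Left(error))
    case Right(id) =>
      postsById(id).flatMap {
        case Left(error)       => Future.successful(Left(error))
        case Right(Nil)        => Future.successful(Left("User has no posts"))
        case Right(first :: _) => commentCount(first)
      }
  }

println(Await.result(firstPostComments, 3.seconds))
```

The error handling is repeated at every level, and the actual business steps are buried in the middle of it.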

So is there anything we can do to improve the readability? What if we try using for comprehension? Since we know it provides syntax sugar for maps and flatMaps, maybe that can help in improving the readability? Let us see.

Our first attempt at using for comprehension does not work out as intended. The following code:

val result = for {
  id <- getIdByEmail("mee@gmail.com")
  posts <- getPostsById(id)
} yield posts

println(Await.result(result, 3.seconds))

Leads to the following compilation error:

Error:(44, 30) type mismatch;
 found   : Either[String,Int]
 required: Int
      posts <- getPostsById(id)


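We get this error because monads do not compose in general: the for comprehension here works only at the Future level, so id is bound to the Either[String, Int] held inside the Future rather than to the Int that getPostsById expects.

Is it possible to wrangle the code, still using for comprehension, and make it compile? It turns out we can pull that off.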
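A small sketch of what the for comprehension is doing may help here. The lookupId function below is a hypothetical stand-in with the same return type as getIdByEmail; the point is only to show what the name on the left of the arrow gets bound to.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical stand-in with the same return type as getIdByEmail
def lookupId(email: String): Future[Either[String, Int]] = Future {
  if (email.endsWith("@gmail.com")) Right(1) else Left("No User with given email")
}

// A single-generator for comprehension is sugar for map on the Future,
// so idOrError is whatever the Future holds: an Either, not an Int.
val sugared: Future[Either[String, Int]] =
  for { idOrError <- lookupId("mee@gmail.com") } yield idOrError

// The same thing written without the sugar
val desugared: Future[Either[String, Int]] =
  lookupId("mee@gmail.com").map(idOrError => idOrError)

val unwrapped: Either[String, Int] = Await.result(sugared, 3.seconds)
println(unwrapped) // Right(1)
```

The for comprehension only ever peels off one layer, the Future. Getting at the Int still requires dealing with the Either separately.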

Such an incarnation would look like this:

val result = for {
    eitherAnIdValue <- getIdByEmail("dadepo@outlook.com")
} yield for {
    id <- eitherAnIdValue
} yield for {
   posts <- getPostsById(id)
} yield posts

println(Await.result(result, 3.seconds))
//updated: initially was 'books <- getPostsById(id)'

Again, I do not know about you, but I do not want to write, nor do I want to read, code like this just to perform an operation that technically involves two steps: look up an Id by email, then use the found Id to look up the associated posts. Worse, the result is not even the type we want: it is a Future[Either[String, Future[Either[String, List[String]]]]], with the second Future left nested inside the Either.

If this is the kind of code that Functional programming in Scala leads to, then I do not want to have anything to do with it!

The two code snippets above are why I have come to the conclusion that a function with the return type of Future[Either[E, A]], or Future[Option[A]] or something similar is usually a code smell. This is because, as seen above, they lead to unnecessarily convoluted and unwieldy code.

How to deal with this? Is there a better way?

I can currently make two propositions for avoiding the unwieldy code that nested contexts lead to: encode the failure in Future, or use Monad Transformers.

Let’s explore these two propositions.

Encode the Failure in Future

The Future type in Scala comes with a mechanism for encoding errors within it. Although it might not be the most robust way, it is definitely an option.

This can be done by calling the failed method with an instance of an exception representing the error. For example:

Future.failed(new UnsupportedOperationException("not implemented"))

This mechanism can then be used to encode exceptional scenarios instead of using Either (or Option, JsResult, etc.), leading to a simpler return type of the form Future[A], where A is the type of the value of interest.

This then means that the functions can be used easily in a for comprehension. The only caveat is that a recover block needs to be introduced to take care of the situation where an exception is encountered.
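As a quick illustration of that recover step, here is a minimal sketch using only the standard library. The values and the fallback choices (-1 and 0) are arbitrary and only for demonstration. It also shows the difference between recover, whose handler returns a plain value, and recoverWith, whose handler returns another Future.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// A Future that has already failed, with the error carried as an exception
val failing: Future[Int] =
  Future.failed(new NoSuchElementException("No User with given email"))

// recover: the handler returns a plain value of the element type
val withDefault: Future[Int] =
  failing.recover { case _: NoSuchElementException => -1 }

// recoverWith: the handler returns another Future, e.g. a fallback lookup
val withFallback: Future[Int] =
  failing.recoverWith { case _: NoSuchElementException => Future.successful(0) }

println(Await.result(withDefault, 3.seconds))  // -1
println(Await.result(withFallback, 3.seconds)) // 0
```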

Applying this approach would see the getIdByEmail and getPostsById methods take the following form:

def getIdByEmail(email:String): Future[Int] = {
 if (email.endsWith("@gmail.com")) {
   Future.successful(1)
 } else if (email.endsWith("@yahoo.com")) {
   Future.successful(2)
 } else {
   Future.failed(new Exception("No User with given email"))
 }
}

and

def getPostsById(id:Int): Future[List[String]] = {
 if (id == 1) {
   Future.successful(List("Post title 1", "Post title 2"))
 } else {
   Future.failed(new Exception("No posts found"))
 }
}

And the code that makes use of these two methods would now look like this:

val result = {
  for {
    id <- getIdByEmail("dadepo@outlook.com")
    posts <- getPostsById(id)
  } yield posts
} recover {
    // recover expects a plain value; returning a Future would need recoverWith
    case e: Exception => e.getMessage
}

println(Await.result(result, 3.seconds))
// updated: initially was 'books <- getPostsById(id)'

Or better still, to make it more readable, break into two steps like this:

val futureResult = for {
   id <- getIdByEmail("dadepo@outlook.com")
   posts <- getPostsById(id)
} yield posts

val result = futureResult recover {
  // recover expects a plain value; returning a Future would need recoverWith
  case e: Exception => e.getMessage
}

println(Await.result(result, 3.seconds))
// updated: initially was 'books <- getPostsById(id)'

I personally think this approach of encoding the error in the Future results in more readable code.

The only drawback, though, is that this approach undermines type safety, since it effectively removes the error from the type system. That makes it easier for a caller to forget to handle, in a "recover" step, the exceptions that can emanate from the Future, which leads to things blowing up at runtime.
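A small sketch of that failure mode, using a hypothetical findId function with the same shape as the Future-only getIdByEmail above. Try is used here only so that the sketch can show the exception without crashing.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.util.Try

// Hypothetical stand-in for the Future-only getIdByEmail
def findId(email: String): Future[Int] =
  if (email.endsWith("@gmail.com")) Future.successful(1)
  else Future.failed(new Exception("No User with given email"))

// Nothing in the type Future[Int] tells the caller that this can fail,
// and the compiler does not complain about the missing recover.
val unhandled: Future[Int] = findId("dadepo@outlook.com")

// Await.result rethrows the exception stored in the failed Future
val outcome: Try[Int] = Try(Await.result(unhandled, 3.seconds))
println(outcome.isFailure) // true
```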

I am still inclined to adopt this approach rather than deal with the cumbersomeness of nested contexts within Future.

The second approach, which uses Monad Transformers, is explained next.

Use Monad Transformers

The other approach that helps in dealing with a nested context within Future is the use of Monad Transformers.

Types like Future[Either[A, B]] can be more accurately described as a monad inside another monad, or a stack of monads. And due to the nature of monads, having a stack of them this way does not lend itself to convenient usage within a for comprehension.

The way to make them less cumbersome is to make use of Monad Transformers, which effectively create a monad from the combination of two or more monads. The resulting monad can then, for all intents and purposes, be treated as a single standalone monad.
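To give a feel for what such a combination does, here is a minimal hand-rolled sketch for the specific pair of Future and Either, using only the standard library. The FutureEither name and its two methods are hypothetical and written for illustration only; a library transformer is more general than this. The idFor and postsFor functions are stand-ins with the same return types as getIdByEmail and getPostsById.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical wrapper: treats Future[Either[E, A]] as one context over A
final case class FutureEither[E, A](value: Future[Either[E, A]]) {
  def map[B](f: A => B)(implicit ec: ExecutionContext): FutureEither[E, B] =
    FutureEither[E, B](value.map {
      case Right(a) => Right(f(a))
      case Left(e)  => Left(e)
    })

  // On Left, skip f entirely and carry the error forward
  def flatMap[B](f: A => FutureEither[E, B])(implicit ec: ExecutionContext): FutureEither[E, B] =
    FutureEither[E, B](value.flatMap {
      case Right(a) => f(a).value
      case Left(e)  => Future.successful(Left(e))
    })
}

// Stand-ins with the same shape as getIdByEmail and getPostsById
def idFor(email: String): Future[Either[String, Int]] = Future {
  if (email.endsWith("@gmail.com")) Right(1) else Left("No User with given email")
}

def postsFor(id: Int): Future[Either[String, List[String]]] = Future {
  if (id == 1) Right(List("Post title 1", "Post title 2")) else Left("No posts found")
}

// Having map and flatMap is enough for a flat for comprehension
val combined: FutureEither[String, List[String]] = for {
  id    <- FutureEither(idFor("mee@gmail.com"))
  posts <- FutureEither(postsFor(id))
} yield posts

println(Await.result(combined.value, 3.seconds))
```

All the pattern matching on Right and Left now lives in one place, inside the wrapper, instead of being repeated at every call site.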

The Scala standard library gives us monads (types like Future, List, Option, Try etc are all monadic if not monads in the true, law-abiding sense of what a monad is) but does not give us the tools to effectively deal with a stack of monads.

To make use of Monad Transformers, we would need to bring in libraries like Scalaz or Cats.

In this post, I will make use of Cats. The code snippets in the rest of the post show how Monad Transformers, as provided by Cats, make programming with types like Future[Either[E, A]] less cumbersome.

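Adding Cats to your project, if you use sbt, would look like this (the -Ypartial-unification flag applies to Scala 2.11 and 2.12; on Scala 2.13 the behaviour is always on and the flag should be left out):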

libraryDependencies +=
  "org.typelevel" %% "cats-core" % "current.version.number"

scalacOptions ++= Seq(
  "-Xfatal-warnings",
  "-Ypartial-unification"
)

After Cats is added, we can then make use of the EitherT monad transformer.

Let us have the getIdByEmail and getPostsById functions in their initial form, where they return an Either inside a Future. That is:

// getIdByEmail

import scala.concurrent.{Await, Future}
import scala.language.implicitConversions
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import cats.data.EitherT
import cats.instances.future._

def getIdByEmail(email:String): Future[Either[String, Int]] = Future {
 if (email.endsWith("@gmail.com")) {
   Right(1)
 } else if (email.endsWith("@yahoo.com")) {
   Right(2)
 } else {
   Left("No User with given email")
 }
}

and

// getPostsById

import scala.concurrent.{Await, Future}
import scala.language.implicitConversions
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import cats.data.EitherT
import cats.instances.future._


def getPostsById(id:Int): Future[Either[String, List[String]]] = Future {
 if (id == 1) {
   Right(List("Post title 1", "Post title 2"))
 } else {
   Left("No posts found")
 }
}

With the EitherT monad transformer, we can then make use of these two methods in a for comprehension in the following manner:

val result = for {
    id <- EitherT(getIdByEmail("mee@gmail.com"))
    posts <- EitherT(getPostsById(id))
} yield posts

println(Await.result(result.value, 3.seconds))

As you can see, this removes all the boilerplate and clunkiness of working with nested Either in a Future. Problem solved!

Conclusion

The Monad Transformer approach is definitely the more robust way to deal with a stack of monads, which is what a type like Future[Either[E, A]] is. It makes working with them easier without sacrificing type safety.

I definitely recommend using Monad transformers over the other approach of encoding the failure scenarios in the Future itself.

In the case where you cannot use libraries like Scalaz or Cats, I think encoding the exceptional scenarios in the Future would be the way to go, and it would lead to more readable code when compared to all the nested map/flatMap and pattern matching that would be required otherwise.

So far, these are the two approaches I can think of when it comes to dealing with having another monad nested in the Future monad in Scala. If, by chance, you know of any other best practice for dealing with a stack of monads in Scala, then please do share in the comment section.👇

Updates
Published Rolling Your Own Monad To Deal With Nested Monads In Scala which looks at the approach of creating a wrapper Monad as a means for making working with nested monads easier.


7 comments:

martin said...

First, thanks for working out a great example for code that mixes several monads, and presenting it so clearly! The advantage of concise examples is that it lets us evaluate the alternatives quickly and that can lead to new insights.

I am usually quite skeptical about monad transformers so thought that maybe this example would let me see the light. But in the end I come to a different conclusion than you do. Here's how I would write the task "look up an Id by email, use the found Id to lookup associated posts":


for (idOpt <- getIdByEmail("mee@gmail.com"))
yield for (id <- idOpt)
yield getPostsById(id:Int)


This is exactly as long as your monad transformer code, without any of the boilerplate to get the monad transformers set up.

How did I get there? I took your second solution (the one with the three chained fors, about which you said "If this is the kind code that Functional programming in Scala leads to, then I do not want to have anything to do with it!") and noted simply that your last for clause read

for (books <- ...) yield books

which is just a fancy way to write the identity. So it can simply be dropped.

It's actually quite beautiful because it shows the usefulness of chaining fors in a nutshell. And sometimes grabbing for more complicated abstractions makes us blind for obvious and elegant simpler solutions.

Note: I don't doubt that there is code that can be simplified using monad transformers. Any abstraction has its place somewhere. But I believe that for monad transformers that place is actually quite limited and Scala 3 will introduce better solutions for many of the remaining use cases around effects.

martin said...

Caveat: If you look at the types, they are actually different. The nested `for` solution gives a `Future[Either[String, Future[...]]]`, which is the same as the three-stage for you provide, but probably not what you want. But the following solution is just as simple, and gives the right type:

getIdByEmail("mee@gmail.com").flatMap {
case Right(id) => getPostsById(id)
case Left(error) => Future(Left(error))
}

jamest said...

I could be missing something, but wouldn't the code

for (idOpt <- getIdByEmail("mee@gmail.com"))
yield for (id <- idOpt)
yield getPostsById(id:Int)

give us a Future[Either[String, Future[Either[String, List[String]]]]] instead of the required Future[Either[String, List[String]]]? I think the example in the article with the chained for comprehensions would have the same problem.

I can't think of another solution that doesn't involve some sort of explicit handling of success/failure between the calls to getIdByEmail and getPostsById (which I suppose is what EitherT is effectively doing for us). It would be interesting to see other possible approaches.

dade said...

Thanks @Martin for stopping by to chime in and for offering your suggestion. It’s appreciated!

That being said, I can see some issues in the approach you suggested.

I’ll explain:

The first issue is regarding the return type of your suggested approach. As noticed by @jamest in his comment, the return type ends up being:
Future[Either[String, Future[Either[String, List[String]]]]]

See https://scastie.scala-lang.org/dadepo/Dyb36sGXS9W8wOzfjFDjEA

That is quite horrendous! By the way, this issue also applies to the more verbose version included in the post.

So even though the "for yield for yield" expression is compact, not as verbose as what is in the post, and might be said to look beautiful, same cannot be said of its output.

This, right here, exemplifies some of the main issues my team and I have been having with Scala: how easy it is to create something as unwieldy as a Future[Either[String, Future[Either[String, List[String]]]]]. And before you know it, instead of spending brain cycles on implementing the needed business logic, you are wasting time playing type-tetris and trying to wrest values out of these nested contexts.

The second issue regarding your suggestion is about having readable code that clearly reflects the business logic. I explain.

The problem statement in the example can be outlined as:

1. look up an Id by email
2. use the found Id to lookup associated posts

The "business logic" using Monad Transformers directly maps to this

for {
// look up an Id by email
id <- EitherT(getIdByEmail("mee@gmail.com"))
// use the found Id to lookup associated posts
posts <- EitherT(getPostsById(id))
} yield posts

This, in my opinion, makes for code that is clear to read and easy to discuss with colleagues.

This cannot be said of the "for yield for yield" approach you suggested.

// look up an Id by email
for (idOpt <- getIdByEmail("mee@gmail.com"))
yield for (id <- idOpt) // what is this doing here?
// use the found Id to lookup associated posts
yield getPostsById(id:Int)

What exactly is the second line doing there?

That line exposes the machinery of the embedded monadic context we are dealing with and clutters the ability to crisply express the business logic in code. This kind of clutter multiplies as the intermediate steps in the business logic add up. In this example we have just two steps; in real-life scenarios, it is not uncommon to have logic that spans more steps.

These are the two issues I could identify in your suggested approach.

Now to your rebuttal of the Monad Transformer approach. In your criticism, you cited the presence of boilerplate as one of its drawbacks. But I do not see any boilerplate being required in the approach. What was needed was to add a library dependency and import EitherT. If that is boilerplate, then almost every single line of Scala code ever written suffers from this kind of boilerplate.

Yes, the actual machinery for implementing the transformers might require some boilerplate, but the user of the library is shielded from this. The price has already been paid by the library, saving the user from having to incur this boilerplate cost themselves.

So in conclusion, as it stands, I would still reach for Monad Transformers. Yes, they do come with some performance cost, but I would rather pay for whatever performance hit they bring and have clear, crisp code that easily maps to the business logic than, in an attempt to eschew abstractions, end up with code that is unwieldy, either due to nested flatMap/map/pattern matching, or due to nested for comprehensions that lead to even more nested return types.

I am continually learning and on the lookout for best practices on how to deal with scenarios like this, so if I find a better approach, which might be based on machinery provided in the upcoming Scala 3, or on another abstraction or pattern, I would definitely update this post.

Unknown said...

Hi Aderemi,

As we discussed after your talk it is also possible to do the Monad Transformation by hand. True, this must be done for every pair of nested types, but if there are not many of them this is doable. See https://gist.github.com/devlaam/bdce0b64fe3c03254a481309a86dfd43 as example for Future[Option[A]].

Joost Heijkoop said...

I demonstrate a possible other solution that should require less machinery, while still reducing some of the boilerplate. Sadly it doesn't allow usage in a for comprehension though: https://github.com/jheijkoop/nested-monad-helper

Unknown said...

Hey Dadepo Aderemi,

Thanks for the clearly exposed blogpost, it was super helpful!

We've been trying to tackle this very same problem at work but we got stuck when combining errors using ADTs. Basically, the problem we have is about how Either is covariant while EitherT is not, so that for example if we have an error ADT such as:

sealed trait UserError

object UserNotFound extends UserError
object UserIsScammer extends UserError

Then we can't easily combine two eithers one of each different error type into a EitherT (we would assume it would give the common error, UserError, but it seems it's impossible to achieve this).

How do you currently solve this problem?

EitherT currently solves the happy path super well for us, it just fails at that error part sadly.