Developing an Akka edge

Chapter 5: Routers

Up to this point in the book, we've looked at some uses of actors. While you can certainly build complicated chains of actors cooperating via message passing, you haven't really seen the components that allow for building more interesting and complex actor topologies. This is where routers (and dispatchers, covered in the next chapter) enter the picture.

Routers are a mechanism Akka provides to handle the flow of messages to actors. They allow for grouping actors together and sending messages to different actors based upon a variety of different rules. Dispatchers, on the other hand, are more concerned with the actual management of the actors' execution. In fact, they are themselves instances of ExecutionContext, which we briefly looked at earlier in the context of futures.

It's easy to get confused about whether a given problem calls for a router or a dispatcher. Hopefully, by the end of this chapter you will have begun to build an intuitive sense of when each of these provides an appropriate solution.

The basics of routers

The basic purpose of a router is to determine which actor, out of a group of possible actors, each message is sent to. You can think of a router as being similar to a load balancer in front of a typical web application. In fact, one of the most common use cases for routers in Akka is to balance load across a set of actors. This can be quite effective in situations that involve limited or costly resources, such as a pool of database connections or connections to other external dependencies.

It's important right from the start to understand that, while routers are internally implemented as actors, they possess certain properties and behaviors that are not always comparable to normal actors. For instance, they do not actually use the same mailbox and message handling scheme used by other actors. Akka specifically bypasses these to make routing efficient and simple. There is a cost, if you're actually implementing a router, but as we don't plan to cover that here, you can safely ignore it for now.

One of the consequences of routers being actors is that they are part of the actor hierarchy, so the path used to address them is based on the name given to the router, rather than the names of the actors used by the router. This also has implications for how responses are sent when an actor behind a router is communicating with another actor. Normally, when a routed actor sends a message to another actor, that actor's sender reference points at the routed actor itself, so any replies go straight back to that routee rather than being routed back through the router. Depending upon the scenario, this might not be ideal. You can override this behavior and have any responses sent back through the router by using a variation on the normal message sending syntax (this is the method to which ! is effectively aliased behind the scenes). This form tells the actor on the other end to use the current actor's parent, which is the router, as the sender:

// within an actor that is behind a router
sender.tell(SomeResponseMessage(), context.parent)

Built-in router types

Akka provides a variety of routers to use in your application.

RoundRobinRouter

The RoundRobinRouter is one of the simplest routers you'll encounter, but it is nonetheless very useful. In very simple terms, it sends each message in turn to the next actor, based upon the ordered sequence of available routees. You cannot dictate which actor it begins with.

This router works quite well for simple scenarios where you want to spread tasks across a set of actors, but where there is not likely to be a large variation of time spent performing those tasks and where mailbox backlogs are not a primary concern. The reason for this is straightforward. If the tasks have a significant variance in time incurred, the likelihood of ending up with one or more actors backing up rises as the rate of messages being sent increases. You might end up with a number of messages that were sent early in the sequence of events, but which end up sitting unattended to for long periods of time while other messages sent much later are handled after only a short delay.
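To make the selection rule concrete, here's a minimal sketch in plain Scala. It is not how Akka implements the router. The RoundRobinSketch class, its pick method, and the RoundRobinDemo object are names we've made up for illustration, and the routees are just strings standing in for actors:

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for a router's routee list; not Akka's implementation.
final class RoundRobinSketch[A](routees: Vector[A]) {
  require(routees.nonEmpty, "at least one routee is required")

  private val next = new AtomicLong(0)

  // Each call returns the next routee in order, wrapping around at the end.
  def pick(): A = {
    val index = (next.getAndIncrement() % routees.size).toInt
    routees(index)
  }
}

object RoundRobinDemo {
  // The first n routees chosen from a three-element pool.
  def firstPicks(n: Int): Vector[String] = {
    val router = new RoundRobinSketch(Vector("a", "b", "c"))
    Vector.fill(n)(router.pick())
  }
}
```

Notice that the rule never looks at how busy a routee is, which is exactly why uneven task durations can leave some mailboxes backed up.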

SmallestMailboxRouter

The SmallestMailboxRouter is potentially useful for the situation described in the previous section, where you want to avoid having messages sit unhandled simply because they were sent to a busy actor while another actor has no messages to process. The SmallestMailboxRouter looks at the available routees and selects whichever one has the smallest (possibly empty) mailbox. The trade-off is that the mailbox sizes collected from the routees could be stale by the time the router uses them to route a message. It's not a cure-all, though. Even if incoming messages are always sent to the smallest mailbox, there is no way to predict whether that mailbox holds messages that will actually take longer to process than a much larger mailbox full of messages awaiting a different actor. Also, this router cannot view the mailbox size of remote actors (remote actors are covered in chapter 8). Given this limitation, remote actors are given the lowest priority in the selection algorithm.
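The selection rule can be sketched in a few lines of plain Scala. This is a simplified model rather than Akka's actual algorithm. RouteeInfo and SmallestMailboxSketch are hypothetical names, and we're assuming an unknown mailbox size (as with a remote actor) is represented as None:

```scala
// Hypothetical model of a routee: mailboxSize is None when it cannot be
// observed, as is the case for remote actors.
final case class RouteeInfo(name: String, mailboxSize: Option[Int])

object SmallestMailboxSketch {
  // Prefer the routee with the fewest queued messages. Routees whose mailbox
  // size is unknown sort last, so they are chosen only when nothing else exists.
  def pick(routees: Seq[RouteeInfo]): RouteeInfo = {
    require(routees.nonEmpty, "at least one routee is required")
    routees.minBy(_.mailboxSize.getOrElse(Int.MaxValue))
  }
}
```

The sizes passed in are a snapshot, so by the time a message is delivered the real mailboxes may already look different.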

BroadcastRouter

BroadcastRouter is a handy utility for cases where you need to send a message to all of its associated actors. For example, you might be building a monitoring system in which you send a message to this router to detect whether the nodes in the environment are alive or busy. When the router receives a health check message, it broadcasts the request to its actors, which in turn send a TCP ping to the nodes they are responsible for, and the return times are collated for storage.

RandomRouter

This router simply sends messages randomly to its routees. There's not much more to it than that. A use case for this router might be serving requests that return a value for a particular key. The router would create a pool of routees, each of which queries a datastore (an in-memory cache, for example) holding key-value pairs and returns the value if found, or some other message if not. It doesn't matter which actor serves the request, only that one of them does.

ScatterGatherFirstCompletedRouter

The ScatterGatherFirstCompletedRouter is a very special router that behaves quite differently from the other routers described here. Like the BroadcastRouter, it sends any received message to all of its routees, but it's intended to be used with futures and the ask pattern (so it does not burden the router up front). A response will be returned, and that response will be whichever reply arrives first from the routees.

This router essentially wraps together a common usage of futures, which is to handle precisely this scatter-gather pattern. This is useful when response time matters most, since the router returns whichever response arrives first.
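The underlying pattern can be sketched with nothing but the standard library's futures. This is only an illustration of "first reply wins", not the router itself. The askRoutee function is a hypothetical stand-in for sending an ask to a routee, and the delays are made-up values:

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FirstCompletedSketch {
  // Hypothetical stand-in for asking a routee: it replies after delayMillis.
  def askRoutee(name: String, delayMillis: Long): Future[String] =
    Future {
      blocking { Thread.sleep(delayMillis) }
      s"reply from $name"
    }

  // Scatter the request to every routee, then keep whichever reply arrives first.
  def firstReply(routees: Seq[(String, Long)]): Future[String] =
    Future.firstCompletedOf(routees.map { case (name, delay) => askRoutee(name, delay) })

  // One deliberately slow routee and one fast one.
  def demo(): String =
    Await.result(firstReply(Seq("slow" -> 800L, "fast" -> 20L)), 5.seconds)
}
```

Every routee still does the work, so you are trading extra load for lower latency on the reply.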

ConsistentHashingRouter

A chapter alone could be dedicated to describing consistent hashing in depth, but in essence it's a means of mapping hash keys such that the addition of new slots to the hash table results in a minimal remapping of keys. This can be very useful, for instance, when you want to determine which of a set of servers to send a given request to. If the mappings were to change each time a new server was added, there would be a very high cost to any such additions. But with consistent hashing, this is minimized and thus there's a largely predictable mapping of a given request to a target server or resource. This exact technique is used in a number of popular distributed data storage services.
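To give a feel for the technique itself (not for Akka's router), here's a small hash ring in plain Scala. Everything in it is an assumption made for illustration: the HashRing class, the node names, the choice of MurmurHash3, and the replica count of 100:

```scala
import scala.collection.immutable.TreeMap
import scala.util.hashing.MurmurHash3

// A simplified hash ring. The class name, node names, and replica count are
// illustrative assumptions, not part of Akka's API.
final class HashRing(nodes: Set[String], replicas: Int = 100) {
  require(nodes.nonEmpty, "at least one node is required")

  // Place each node at several points on the ring to even out the spread.
  private val ring: TreeMap[Int, String] = {
    val points = for {
      node <- nodes.toSeq.sorted
      i    <- 0 until replicas
    } yield MurmurHash3.stringHash(s"$node#$i") -> node
    points.foldLeft(TreeMap.empty[Int, String]) {
      case (acc, (hash, node)) => acc.updated(hash, node)
    }
  }

  // A key belongs to the first point at or after its hash, wrapping around.
  def nodeFor(key: String): String = {
    val fromHash = ring.iteratorFrom(MurmurHash3.stringHash(key))
    if (fromHash.hasNext) fromHash.next()._2 else ring.head._2
  }
}

object HashRingDemo {
  val keys: Seq[String] = (1 to 1000).map(i => s"key-$i")

  // Keys whose owner changes when node-d joins the ring.
  def movedKeys(): Seq[String] = {
    val before = new HashRing(Set("node-a", "node-b", "node-c"))
    val after  = new HashRing(Set("node-a", "node-b", "node-c", "node-d"))
    keys.filter(k => before.nodeFor(k) != after.nodeFor(k))
  }
}
```

When node-d joins, the only keys that change owner are the ones node-d takes over. Keys never shuffle between the existing nodes, which is the property that keeps additions cheap.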

The actual use of this router is complicated enough to be beyond the scope of this book, but it's a useful facility to be aware of when designing distributed systems. It can be of particular use for cases where you are caching or memoizing data within stateful actors and you need to reduce the need to refresh that cached data.

Others

The routers described above are just the existing routers provided as part of Akka's library, but it's not difficult to create custom routers if need be. Reading the source code for the built-in routers will give you perhaps the best indication of what's involved and, of course, the Akka docs include good material on this subject.

Using routers

There are two mechanisms to specify how a router instance should be created: one is purely programmatic and the other uses the Akka configuration. Both have their uses and it's important to understand the reasons to choose one over the other, which we'll cover as we look at each approach.

Configuration-based router creation

Creating routers via configuration is very simple and makes it easy to adjust the runtime routing strategy without needing to make code changes and push out a new build:

akka.actor.deployment {
  /configBasedRouter {
    router = round-robin
    nr-of-instances = 3
  }
}

Assuming you have an actor called PooledActor, you would then use this by adding the following code. Note that the Props object is still given the type of the actor you intend as your routee, but then you modify the Props by way of the withRouter method call:

import akka.actor._
import akka.routing.FromConfig

val router = context.actorOf(Props[PooledActor].withRouter(FromConfig()),
  name = "configBasedRouter")
router ! SomeMessage()

All routers available within Akka allow for resizable pools of routees. You can set this via code, but since we're focusing on configuration-based router definition, let's look at some of the options available:

akka.actor.deployment {
  /resizableRouter {
    router = smallest-mailbox
    resizer = {
      lower-bound = 2
      upper-bound = 20
      messages-per-resize = 20
    }
  }
}

The configuration defined here specifies a starting size of 2 routees, in addition to ensuring that we never drop below 2 routees in the pool. Further, we'll never have more than 20 routees and the router will only try to resize, if necessary, after every 20 messages. This option is useful for ensuring that the router is not spending an excessive amount of time trying to resize the pool.

There are a handful of other configuration parameters available for the resizer and they can get a bit confusing on your first encounter with them, so here's a brief summary of each and how they are used:

pressure-threshold is used to define how the resizer determines how many actors are currently busy. If this value is set to 0, then it uses the number of actors currently processing a message. If the value is set to the default value of 1, it uses the number of actors which are processing messages and have one or more messages in their mailbox. If the number is greater than 1, it will only include actors which have at least that number of messages in their mailbox.
rampup-rate is used to determine how much to increase the routee pool size by when all current routees are considered to be busy (based on the pressure-threshold). This value is a ratio and defaults to 0.2, which means that when the resizer determines that it needs to create more routees, it will attempt to increase the pool size by 20%.
backoff-threshold is used for reducing the size of the pool. This is another ratio, defaulting to 0.3, which means that fewer than 30% of the current routees must be busy before the pool is shrunk.
backoff-rate is essentially the inverse of rampup-rate, since it determines how much to decrease the pool size when that is called for. The default rate of 0.1 means that it will be decreased by 10% when needed.
stop-delay is used to provide a small delay before a PoisonPill message is sent to the routees that are being removed from the pool, in order to shut them down. The delay, defaulting to 1s, is provided to allow some time for messages to be placed into the routees' mailboxes before sending them the message to terminate.
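To see how these ratios interact, it can help to work through the arithmetic. The following is a simplified model in plain Scala, written from the descriptions above rather than taken from Akka's source. ResizerSketch and its resize method are hypothetical names, and the rounding (up when growing, down when shrinking) is our assumption:

```scala
// A simplified model of the resizer decision. Field names mirror the
// configuration keys, but this is not Akka's actual implementation.
final case class ResizerSketch(
    lowerBound: Int = 1,
    upperBound: Int = 10,
    rampupRate: Double = 0.2,
    backoffThreshold: Double = 0.3,
    backoffRate: Double = 0.1) {

  // Returns the proposed pool size given the current size and the number of
  // routees considered busy under the chosen pressure-threshold.
  def resize(current: Int, busy: Int): Int = {
    val delta =
      if (busy >= current)
        math.ceil(rampupRate * current).toInt          // everyone is busy: grow
      else if (busy.toDouble / current < backoffThreshold)
        -(math.floor(backoffRate * current).toInt)     // mostly idle: shrink
      else
        0                                              // in between: leave as-is
    // Never leave the configured bounds.
    math.max(lowerBound, math.min(upperBound, current + delta))
  }
}
```

With the default rates and bounds of 2 and 20, a fully busy pool of 10 would grow to 12, while a pool of 10 with only 2 busy routees would shrink to 9.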

You can easily spend a huge amount of time just trying to get the perfect configuration, but we'd advise against spending excessive effort on this when you're first building your system. It takes careful testing to evaluate these changes adequately, so working with the defaults is often a good place to start.

Programmatic router creation

Creating a router in code is also quite simple. If you choose to use this approach, you can always redefine the configuration using the methods specified previously, assuming you've given the router a name you can use to reference it in the configuration:

val randomRouter = context.actorOf(Props[MyActor].withRouter(
  RandomRouter(nrOfInstances = 100)),
  name = "randomlyRouted")

As an alternative to configuration-driven pool sizing (either using a fixed nr-of-instances or by defining a resizer), you can pass a collection of existing actors into your router. This can be very useful when you need to perform more complex setup of each actor instance than the Props factory-based approach allows. A very simple example of this would be setting a specific name for each actor. It's notable that using this approach obviates the use of nr-of-instances or any sort of resizing:

// create the routees ourselves so that each one gets a specific name
val namedActors = Vector[ActorRef](
  context.actorOf(Props[MyActor], name = "i-am-number-one"),
  context.actorOf(Props[MyActor], name = "i-am-number-two")
)
val router = context.actorOf(Props().withRouter(
  SmallestMailboxRouter(routees = namedActors)),
  "smallestMailboxRouter")

Routers and supervision

Routers, like other parts of your actor hierarchy, should generally be considered in light of supervision and failure handling. By default, all routers escalate any errors thrown by their routees, which can lead to some unexpected behavior. For example, if the actor that creates the router has a policy of restarting all of its children on an exception, then a single routee encountering an error will cause every child of the router's parent to be restarted. This is not a good thing. Thankfully, you can override the strategy used by the router easily enough, as the following example demonstrates:

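import akka.actor.{OneForOneStrategy, Props, SupervisorStrategy}
import akka.routing.RoundRobinRouter

// restart the failing routee on a DomainException rather than escalating it
val router = context.actorOf(Props[MyActor].withRouter(
  RoundRobinRouter(nrOfInstances = 20,
    supervisorStrategy = OneForOneStrategy() {
      case _: DomainException => SupervisorStrategy.Restart
    }
  )
))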

Continuing our example application

In the second chapter, we showed you a very simple example of using a router to both load balance across requests to a database and to provide fault handling. We then changed the fault handling mechanism in the last chapter to use a special intermediate actor to provide a supervisor for our routed actors. Let's expand this a bit further with what we've learned here to make it more adaptive to changing load-handling needs.

First, we make a fairly simple change in Bookmarker, removing a bit more code we placed there earlier:

import org.eclipse.jetty.server.Server
import org.eclipse.jetty.servlet.{ServletHolder, ServletContextHandler}
import java.util.UUID
import akka.actor.{Props, ActorSystem}
import akka.routing.FromConfig

object Bookmarker extends App {
  val system = ActorSystem("bookmarker")
  val database = Database.connect[Bookmark, UUID]("bookmarkDatabase")

  // the router type and pool size come from the /bookmarkStore deployment config
  val bookmarkStore =
    system.actorOf(Props(new BookmarkStore(database)).withRouter(FromConfig()),
      name = "bookmarkStore")

  val bookmarkStoreGuardian =
    system.actorOf(Props(new BookmarkStoreGuardian(bookmarkStore)))

  val server = new Server(8080)
  val root = new ServletContextHandler(ServletContextHandler.SESSIONS)
  root.addServlet(new ServletHolder(new BookmarkServlet(system,
    bookmarkStoreGuardian)), "/")

  server.setHandler(root)
  server.start()
  server.join()
}

The primary change to note is that we now create the BookmarkStore router using the FromConfig call to tell Akka to load the router settings from the application.conf file. Here's the minimal configuration used here:

akka.actor.deployment {
  /bookmarkStore {
    router = round-robin
    nr-of-instances = 10
    resizer {
      lower-bound = 5
      upper-bound = 50
      pressure-threshold = 0
      rampup-rate = 0.1
      backoff-threshold = 0.4
    }
  }
}

We're making some assumptions here that we should explain. Of course, this is a semi-imaginary example, given that we're using a mock database interface. Even with a real database or other external data store, determining the settings to use here would take a bit of analysis. In this case, we're assuming that at minimum, we might want to have a collection of 5 BookmarkStore actors to interact with the database, ramping up to 50 in times of peak load. Further, we're setting the pressure-threshold to 0 based upon the understanding that calls to an external system are expensive, so having an actor currently processing a message is enough to consider it busy. However, we might not want to ramp up too quickly, so we've set the rampup-rate to only increase the pool size by 10% at a time. We also want to shrink it back down quickly so that idling resources are returned to the system, tending towards a small pool, so we've set the backoff-threshold to drop the pool size when fewer than 40% of our routees are busy.

Wrap-up

This whirlwind tour of routers has hopefully given you an idea of the flexibility Akka gives you for creating robust configurations that can handle very different types of workflow, depending upon your needs. There is a huge range of choices available to you, so you might feel overwhelmed, but it's generally best to start with the minimum you need to get things working. Then, through watching the performance and profiling under real workloads, you can get a better sense of where to apply these different tools and understand how they might impact your overall system.