Guide to Implementing Custom Monadic Effects in Issue-Wanted

2019-07-20

Over the past few weeks I’ve been making steady progress on issue-wanted, the project I’m working on for Google Summer of Code 2019. Today I would like to present my work on adding a custom monad/effect to issue-wanted and what I learned during the process. Before I go into what I added, let’s explore what I started with.

The App Monad and ReaderT Design Pattern

As I said in my last post, there is still an ongoing debate on what is the best way to structure Haskell programs. As far as I know the most popular method used today is monad transformers, also known as “mtl-style”. Monad transformers give the programmer the ability to combine, or stack, different monads together to create a monad that suits their specific needs. You’ve probably learned of the idea that monads can be thought of as computations or actions, each with a specific set of rules on how to compose them[1]. I like to think of monads as powerups in a video game; each one gives the program you’re writing a unique ability. At the core of issue-wanted is the App monad. It is a basic monad stack defined as follows:

-- | Main application monad.
newtype App a = App
    { unApp :: ReaderT AppEnv IO a
    } deriving (Functor, Applicative, Monad, MonadIO, MonadReader AppEnv, MonadUnliftIO)

I’m using the ReaderT monad transformer in combination with the IO monad. This is just the Reader monad with the addition of being able to use IO in its computations. The ReaderT transformer is wrapped in a newtype, App in this case, so we can inherit useful methods from other typeclasses as well.

In short, the type definition above means App is a monad that can read from an environment of type AppEnv and return values of type a in an IO context. Also note the various typeclasses that App derives. App has access to methods ranging from the useful basics like fmap (from Functor), to more powerful methods like liftIO (from MonadIO) and ask (from MonadReader AppEnv).
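To make this concrete, here is a minimal, self-contained sketch of the same shape. The AppEnv record, its fields, and describeApp are simplified stand-ins I made up for illustration; they are not the real issue-wanted definitions.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}

import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Reader (MonadReader, ReaderT, asks, runReaderT)

-- Hypothetical, simplified environment (not the project's real AppEnv).
data AppEnv = AppEnv
    { envAppName :: String
    , envPort    :: Int
    }

-- Same structure as the App monad above: ReaderT over IO in a newtype.
newtype App a = App
    { unApp :: ReaderT AppEnv IO a
    } deriving (Functor, Applicative, Monad, MonadIO, MonadReader AppEnv)

-- | Run an 'App' action by supplying the environment it reads from.
runApp :: AppEnv -> App a -> IO a
runApp env app = runReaderT (unApp app) env

-- | Uses 'asks' (from MonadReader) and 'liftIO' (from MonadIO).
describeApp :: App String
describeApp = do
    name <- asks envAppName
    port <- asks envPort
    let summary = name <> " listening on port " <> show port
    liftIO $ putStrLn summary
    pure summary
```

Running `runApp (AppEnv "issue-wanted" 8080) describeApp` prints the summary and returns it, which shows both "powerups" (reading the environment and doing IO) working in one monad.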

This way of structuring an application around the ReaderT monad transformer is based on the ReaderT design pattern[2] made popular by Michael Snoyman at FPComplete. If you think about it, an application runs in a context where:

It has access to an environment (in our case, the AppEnv type) where it can retrieve necessary resources like a database connection, a configuration file, a TLS manager, etc.

and

It can perform input/output (e.g. writing to a database or downloading a file from the Internet).

The ReaderT monad transformer in combination with the IO monad allows us to express this in our programs.

Handling Errors

Wait, how does the App monad handle errors? There are multiple ways to throw and catch errors in Haskell, all with different tradeoffs. One option is to add the ExceptT monad transformer to the App monad stack, but this complicates the code. If possible, it is best to avoid adding another layer to your monad stack. Using ExceptT also prevents the App monad from deriving important typeclasses like MonadUnliftIO that we need for concurrency.

Instead, an instance of MonadError is implemented for the App monad:

{- | Exception wrapper around 'AppError'. Useful when you need to throw/catch
'AppError' as 'Exception'.
-}
newtype AppException = AppException
    { unAppException :: AppError
    } deriving (Show)
      deriving anyclass (Exception)

instance MonadError AppError App where
    throwError :: AppError -> App a
    throwError = liftIO . throwIO . AppException
    {-# INLINE throwError #-}

    catchError :: App a -> (AppError -> App a) -> App a
    catchError action handler = App $ ReaderT $ \env -> do
        let ioAction = runApp env action
        ioAction `catch` \(AppException e) -> runApp env $ handler e
    {-# INLINE catchError #-}

The AppException newtype, which derives the Exception typeclass, is only used to make the MonadError implementation easier and more performant.

The reasons for choosing this technique to handle errors in the App monad are more complicated than I describe here, but I will save this lesson for another blog post.
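If you want to play with the idea in isolation, here is a small sketch that compiles on its own. AppError is reduced to a hypothetical String wrapper, the environment is just (), and greet is an invented example function; only the shape of the MonadError instance mirrors the code above.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}

import Control.Exception (Exception, catch, throwIO)
import Control.Monad.Except (MonadError (..))
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Reader (ReaderT (..))

-- Hypothetical stand-ins for the project's real error types.
newtype AppError = AppError String
    deriving (Show, Eq)

newtype AppException = AppException AppError
    deriving (Show)

instance Exception AppException

-- Simplified App monad with a unit environment.
newtype App a = App
    { unApp :: ReaderT () IO a
    } deriving (Functor, Applicative, Monad, MonadIO)

runApp :: App a -> IO a
runApp app = runReaderT (unApp app) ()

instance MonadError AppError App where
    -- Throwing wraps the error in a runtime exception inside IO.
    throwError = liftIO . throwIO . AppException
    -- Catching runs the action in IO and handles only 'AppException'.
    catchError action handler = App $ ReaderT $ \env ->
        runReaderT (unApp action) env
            `catch` \(AppException e) -> runReaderT (unApp (handler e)) env

-- | Fails for an empty name, then recovers with a default greeting.
greet :: String -> App String
greet name = validate `catchError` \(AppError _) -> pure "Hello, stranger"
  where
    validate
        | null name = throwError (AppError "empty name")
        | otherwise = pure ("Hello, " <> name)
```

Here `runApp (greet "")` takes the error path and still returns a value, without ExceptT appearing anywhere in the stack.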

AppEnv and The Has Typeclass Pattern

To get a better understanding of how the ReaderT monad is being utilized in issue-wanted, let’s take a look at the AppEnv type that defines the environment the App monad has access to:

-- 1
type AppEnv = Env App

-- 2
type DbPool = Pool Connection

-- 3
data Env (m :: Type -> Type) = Env
    { envDbPool    :: !DbPool
    , envLogAction :: !(LogAction m Message)
    }

-- 4
instance HasLog (Env m) Message m where
    getLogAction :: Env m -> LogAction m Message
    getLogAction = envLogAction

    setLogAction :: LogAction m Message -> Env m -> Env m
    setLogAction newAction env = env { envLogAction = newAction }

-- 5
class Has field env where
    obtain :: env -> field

-- 6
instance Has DbPool                (Env m) where obtain = envDbPool
instance Has (LogAction m Message) (Env m) where obtain = envLogAction

-- 7
grab :: forall field env m . (MonadReader env m, Has field env) => m field
grab = asks $ obtain @field

OK, let’s break this down step by step:

1. We have the type synonym AppEnv that you saw in the definition of the App monad above. It’s just the Env type parameterized by the App type. One interesting thing to notice is that AppEnv is used to define App and App is used to define AppEnv, making these definitions mutually recursive.
2. DbPool is the type synonym we’re using to represent a pool of database connections. This will be used for connecting to our PostgreSQL database.
3. This is the Env type. It’s a record type that represents the different resources our App monad has access to. Interestingly, it has a type variable of kind Type -> Type, meaning m must be filled in by a single-parameter type constructor. This type variable is primarily used for the LogAction type from the co-log library.
4. An instance of the HasLog typeclass from the co-log library for our Env type.
5. The Has typeclass. It is parameterized by two type variables: the field of interest and the env it is retrieved from. It only has one method, obtain, which takes a value of type env and gives back a value of type field. This is one example of the Has typeclass pattern. This pattern is commonly used in conjunction with the ReaderT design pattern to make it even more clean and extensible.
6. Here are our instances of the Has typeclass. We add one for each field in the Env type. The two definitions above mean we have a way of obtaining DbPool and LogAction values from our Env type.
7. This interesting function, grab, does exactly what its name implies. It “grabs” a value of the specified type from our environment. It can be used to grab our database pool for example:
```
-- | Perform action that needs database connection.
withPool
    :: ( MonadReader env m
       , Has DbPool env
       , MonadIO m
       )
    => (Sql.Connection -> IO b)
    -> m b
withPool f = do
    pool <- grab @DbPool
    liftIO $ Pool.withResource pool f
{-# INLINE withPool #-}
```
@DbPool is an example of a type application, a feature enabled by GHC’s TypeApplications extension. I like to think of the grab function as a more convenient version of ask that allows you to specify exactly what you want from the environment.
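Here is a stripped-down sketch of Has and grab that runs without a database or logger. The Port and AppName fields and the banner function are hypothetical; Has and grab are written exactly as in the snippet above.

```haskell
{-# LANGUAGE FlexibleContexts      #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables   #-}
{-# LANGUAGE TypeApplications      #-}

import Control.Monad.Reader (MonadReader, asks, runReader)

-- Hypothetical environment fields, each with its own type.
newtype Port    = Port Int       deriving (Show, Eq)
newtype AppName = AppName String deriving (Show, Eq)

data Env = Env
    { envPort    :: Port
    , envAppName :: AppName
    }

class Has field env where
    obtain :: env -> field

instance Has Port    Env where obtain = envPort
instance Has AppName Env where obtain = envAppName

grab :: forall field env m . (MonadReader env m, Has field env) => m field
grab = asks $ obtain @field

-- | Asks only for the parts of the environment it needs,
-- so it works with any env that has a Port and an AppName.
banner :: (MonadReader env m, Has Port env, Has AppName env) => m String
banner = do
    AppName name <- grab @AppName
    Port port    <- grab @Port
    pure $ name <> ":" <> show port

-- | Run 'banner' in a plain Reader over the concrete Env.
exampleBanner :: String
exampleBanner = runReader banner (Env (Port 8080) (AppName "issue-wanted"))
```

Notice that banner never mentions Env. Its signature only lists the fields it needs, which is what makes the pattern extensible.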

The Download Monad

With the power to mix and match different monads comes the responsibility of building your own custom monad. When we need to operate within a certain context but the available monads don’t describe it well enough, we need to create our own.

If you don’t know already, issue-wanted is a web application that programmers can use to quickly find open-source Haskell projects to contribute to. My mentors and I want to make sure that people are able to find the right project for them, and so we thought displaying the Haskell repository along with the category names from the project’s .cabal file would be a nice feature. Creators of Haskell projects have the option to add a comma-separated list of values to the category field of their .cabal file. Here’s the category field for the issue-wanted project for example:

category: Web, Application
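As a rough illustration of what "comma-separated" means for us, the sketch below splits the value of a category field into individual names. parseCategories is a hypothetical helper written with base only; it is not necessarily how issue-wanted parses .cabal files.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Remove leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Split a string at every comma.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
    (chunk, [])       -> [chunk]
    (chunk, _ : rest) -> chunk : splitOnComma rest

-- | Turn the value of a category field into a list of category names,
-- dropping empty entries.
parseCategories :: String -> [String]
parseCategories = filter (not . null) . map trim . splitOnComma

-- parseCategories "Web, Application" == ["Web", "Application"]
```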

Our challenge was to figure out how to fetch this metadata and store it in our database. An easy solution would’ve been to just use the GitHub API to retrieve the file contents, but we already use the GitHub API pretty extensively throughout other parts of the application and thought we might risk going over the API rate-limit. We decided to implement a function to download the file directly from the URL.

We knew we could construct a URL for the .cabal file given a RepoOwner and RepoName:

-- | This function returns a @Url@ for downloading a @Repo@'s @.cabal@ file.
repoCabalUrl :: RepoOwner -> RepoName -> Url
repoCabalUrl (RepoOwner repoOwner) (RepoName repoName) = Url $
    "https://raw.githubusercontent.com/"
    <> repoOwner
    <> "/"
    <> repoName
    <> "/master/"
    <> repoName
    <> ".cabal"

This solution isn’t one hundred percent correct because there will be cases where the .cabal file isn’t stored on GitHub at all, the .cabal file isn’t in the master branch, or we’re dealing with a multi-package Haskell project. For now, an error is thrown when one of these cases occurs and the file isn’t downloaded, but we definitely plan on adding support for these kinds of Haskell projects in the future to ensure that we have as much useful metadata as possible.
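To see what the function produces, here is a standalone version you can load into GHCi. The three newtypes are simplified to wrap String, which is an assumption made to avoid extra dependencies, and the owner and repository names are made-up example values.

```haskell
-- Simplified String-based wrappers, assumed for this sketch only.
newtype RepoOwner = RepoOwner { unRepoOwner :: String }
newtype RepoName  = RepoName  { unRepoName  :: String }
newtype Url       = Url       { unUrl       :: String }
    deriving (Show, Eq)

-- | Build the raw.githubusercontent.com URL of a repo's @.cabal@ file,
-- assuming it lives at the root of the master branch.
repoCabalUrl :: RepoOwner -> RepoName -> Url
repoCabalUrl (RepoOwner repoOwner) (RepoName repoName) = Url $
    "https://raw.githubusercontent.com/"
    <> repoOwner
    <> "/"
    <> repoName
    <> "/master/"
    <> repoName
    <> ".cabal"

-- | Example call with placeholder names.
exampleUrl :: Url
exampleUrl = repoCabalUrl (RepoOwner "example-owner") (RepoName "example-repo")
-- unUrl exampleUrl ==
--   "https://raw.githubusercontent.com/example-owner/example-repo/master/example-repo.cabal"
```

The hard-coded "/master/" segment and the assumption that the file is named after the repository are exactly the limitations described above.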

Once we figured out how to construct the URLs, we needed to implement a function for downloading the contents. A function that takes a Url and gives back a ByteString representing the file contents:

downloadFile :: Url -> ByteString

But downloading a file from the Internet is not a pure function! downloadFile needs to operate within some sort of monadic context:

downloadFile :: SomeContext m => Url -> m ByteString

We named this context MonadDownload. The class definition looks like this:

-- | Describes a monad that can download files from a given @Url@.
class Monad m => MonadDownload m where
    downloadFile :: Url -> m ByteString

Typeclasses like the one above are a great way to decouple our code and take advantage of things like mock instances for testing.

instance MonadDownload Test where
    downloadFile :: Url -> Test ByteString
    downloadFile url = ...

I haven’t done this for issue-wanted because we are using other techniques to test the downloadFile function, but the fact that I can implement this, or any other version of the interface, if I need to is very nice.

More importantly, I needed to implement an instance of MonadDownload for the App monad, the heart of our application:
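For the curious, a mock instance could look something like the sketch below. The Test monad here is a hypothetical Reader over a map of canned responses, and downloadedSize is an invented function standing in for code under test; none of this exists in issue-wanted.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.Reader (Reader, asks, runReader)
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BS8
import qualified Data.Map.Strict as Map

-- Simplified Url wrapper, assumed for this sketch.
newtype Url = Url String
    deriving (Show, Eq, Ord)

-- | Describes a monad that can download files from a given 'Url'.
class Monad m => MonadDownload m where
    downloadFile :: Url -> m ByteString

-- Hypothetical test monad: reads canned responses instead of the network.
newtype Test a = Test
    { unTest :: Reader (Map.Map Url ByteString) a
    } deriving (Functor, Applicative, Monad)

instance MonadDownload Test where
    -- Unknown URLs yield an empty body in this sketch.
    downloadFile url = Test $ asks (Map.findWithDefault BS8.empty url)

-- | Run a 'Test' action against a fixed set of responses.
runTest :: Map.Map Url ByteString -> Test a -> a
runTest responses test = runReader (unTest test) responses

-- | Code under test depends only on the 'MonadDownload' interface.
downloadedSize :: MonadDownload m => Url -> m Int
downloadedSize url = BS8.length <$> downloadFile url
```

Because downloadedSize only asks for MonadDownload, the same function runs unchanged against the real App instance or against this pure mock.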

-- 1
instance MonadDownload App where
    downloadFile = downloadFileImpl

-- 2
type WithError m = (MonadError AppError m, HasCallStack)
type WithLog env msg m = (MonadReader env m, HasLog env msg m, HasCallStack)

type WithDownload env m = (MonadIO m, MonadReader env m, WithError m, WithLog env Message m)

-- 3
downloadFileImpl :: WithDownload env m => Url -> m ByteString
downloadFileImpl url@Url{..} = do
    man <- newTlsManager
    let req = fromString $ toString unUrl
    log I $ "Attempting to download file from " <> unUrl <> " ..."
    response <- liftIO $ httpLbs req man
    let status = statusCode $ responseStatus response
    let body = responseBody response
    log D $ "Received a status code of " <> show status <> " from " <> unUrl
    case status of
        200 -> do
            log I $ "Successfully downloaded file from " <> unUrl
            pure $ toStrict body
        _   -> do
            log E $ "Couldn't download file from " <> unUrl
            throwError $ urlDownloadFailedError url

Let’s break this down:

1. You may be wondering why downloadFile is being implemented via the downloadFileImpl function below it. The type of downloadFile is constrained to Url -> App ByteString so we can only use it in the App context. downloadFileImpl on the other hand can operate within a polymorphic context m, as long as it abides by the WithDownload constraint. This is useful because it helps with code readability and reusability. The type signature tells the programmer exactly what minimal context[3] is required for downloadFileImpl, and other instances of MonadDownload (like the one for the Test monad we saw above) can reuse downloadFileImpl if we want to keep the same logic for downloading files.
2. Something that stands out in this snippet of code is the WithError, WithLog, and WithDownload type synonyms. We use type synonyms to combine multiple constraints that are used together often. This helps with readability. Prior to working on issue-wanted I didn’t know that you could define type synonyms for constraints (it requires the ConstraintKinds extension), but it turns out to be a pretty useful technique.
3. Finally, the actual implementation of downloadFile. After asking around on r/haskell for a library that would allow us to idiomatically download files from URLs, we decided to use the http-client and http-client-tls libraries. We’re using newTlsManager for loading up a new TLS manager and httpLbs for making the request. We use the response status code for logging messages and return the response body.
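Constraint synonyms are easy to try out on their own. In the sketch below, WithLimit, checkLength, and runCheck are hypothetical names; the point is only that one synonym can bundle several constraints and that the polymorphic function then runs in any stack that satisfies them.

```haskell
{-# LANGUAGE ConstraintKinds  #-}
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.Except (MonadError, throwError)
import Control.Monad.Reader (MonadReader, ask, runReaderT)

-- Hypothetical synonym bundling two constraints that travel together.
type WithLimit m = (MonadReader Int m, MonadError String m)

-- | Works in any monad that satisfies 'WithLimit'.
checkLength :: WithLimit m => String -> m String
checkLength s = do
    limit <- ask
    if length s > limit
        then throwError ("too long: " <> s)
        else pure s

-- | One concrete stack that satisfies the synonym: ReaderT over Either.
runCheck :: Int -> String -> Either String String
runCheck limit s = runReaderT (checkLength s) limit

-- runCheck 5 "abc" == Right "abc"
-- runCheck 2 "abc" == Left "too long: abc"
```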

Awesome! We have our very own monad for downloading stuff from URLs! But wait…

downloadFileImpl :: WithDownload env m => Url -> m ByteString
downloadFileImpl url@Url{..} = do
    man <- newTlsManager
    ...

A new TLS manager has to be created each time we call downloadFileImpl. This is unnecessary and can lead to suboptimal performance, so we decided to add a field for the manager to our Env record. This way, we only need to create the manager once when the application is started, and grab the same one each time we need to download a file.

data Env (m :: Type -> Type) = Env
    { envDbPool    :: !DbPool
    , envManager   :: !Manager
    , envLogAction :: !(LogAction m Message)
    }

instance Has Manager (Env m) where obtain = envManager

Now downloadFileImpl looks like this:

instance MonadDownload App where
    downloadFile = downloadFileImpl

type WithError m = (MonadError AppError m, HasCallStack)
type WithLog env msg m = (MonadReader env m, HasLog env msg m, HasCallStack)

type WithDownload env m = (MonadIO m, MonadReader env m, WithError m, WithLog env Message m, Has Manager env)

downloadFileImpl :: WithDownload env m => Url -> m ByteString
downloadFileImpl url@Url{..} = do
    man <- grab @Manager
    let req = fromString $ toString unUrl
    log I $ "Attempting to download file from " <> unUrl <> " ..."
    response <- liftIO $ httpLbs req man
    let status = statusCode $ responseStatus response
    let body = responseBody response
    log D $ "Received a status code of " <> show status <> " from " <> unUrl
    case status of
        200 -> do
            log I $ "Successfully downloaded file from " <> unUrl
            pure $ toStrict body
        _   -> do
            log E $ "Couldn't download file from " <> unUrl
            throwError $ urlDownloadFailedError url

We simply add Has Manager env to the WithDownload constraint and now downloadFileImpl can grab the manager from the environment anytime it’s needed to download a file, without creating a new one each time. Nice!
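The effect of this change can be simulated without any networking. In the sketch below, FakeManager stands in for the real TLS manager and a counter records how many times one is created; every name here is hypothetical and nothing is actually downloaded.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.Reader (MonadReader, asks, runReaderT)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for an expensive resource such as a TLS manager.
newtype FakeManager = FakeManager Int

-- | "Expensive" constructor that records how many managers were created.
newFakeManager :: IORef Int -> IO FakeManager
newFakeManager counter = do
    modifyIORef' counter (+ 1)
    FakeManager <$> readIORef counter

newtype Env = Env { envManager :: FakeManager }

-- | Reuses the manager stored in the environment instead of creating one.
fakeDownload :: MonadReader Env m => String -> m String
fakeDownload url = do
    FakeManager ident <- asks envManager
    pure $ url <> " via manager #" <> show ident

-- | Creates the manager once at startup, then performs two "downloads".
-- Returns the number of managers created and the download results.
demo :: IO (Int, [String])
demo = do
    counter <- newIORef 0
    manager <- newFakeManager counter
    results <- runReaderT (mapM fakeDownload ["a.cabal", "b.cabal"]) (Env manager)
    created <- readIORef counter
    pure (created, results)
```

Both downloads report the same manager and the counter stays at one, which is the behaviour we want from the real Manager stored in Env.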

Conclusion

Issue-wanted is looking better and better as the summer goes on and I’m excited to see how everything comes together. Up until the past couple of weeks, much of the structure of issue-wanted was borrowed from the Holmusk three-layer template project, and I haven’t really had the opportunity to implement anything unique to issue-wanted until now. I’m starting to move past building the foundation of the project and work my way into the more interesting features like the one I wrote about in this post. I look forward to implementing interesting solutions to the problems I will encounter as I continue to work on issue-wanted.

Footnotes:

[1] From what I’ve heard, the analogy between monads and “computations” or “actions” isn’t 100% accurate, especially in the mathematical sense, but this is how I reason about monads. It has been a useful model for me so far.

[2] The ReaderT pattern used in issue-wanted and the three-layer template project is a slightly modified version of the classic ReaderT pattern by Michael Snoyman.

[3] We’re also building the project with the -Wredundant-constraints flag, which warns about constraints a function doesn’t actually need and so helps us keep each function’s context minimal.