Guide to Implementing Custom Monadic Effects in Issue-Wanted
2019-07-20
Over the past few weeks I’ve been making steady progress on the issue-wanted project I’m working on for Google Summer of Code 2019. Today I would like to present my work on adding a custom monad/effect to issue-wanted and what I learned during the process. Before I go on about what I added, lets explore what I started with.
The App Monad and ReaderT Design Pattern
As I said in my last post, there is still an ongoing debate on what is the best way to structure Haskell programs. As far as I know the most popular method used today is monad transformers, also known as “mtl-style”. Monad transformers give the programmer the ability to combine, or stack, different monads together to create a monad that suits their specific needs. You’ve probably learned of the idea that monads can be thought of as computations or actions, each with a specific set of rules on how to compose them[1]. I like to think of monads as powerups in a video game; each one gives the program you’re writing a unique ability. At the core of issue-wanted is the App
monad. It is a basic monad stack defined as follows:
-- | Main application monad.
newtype App a = App
{ unApp :: ReaderT AppEnv IO a
} deriving (Functor, Applicative, Monad, MonadIO, MonadReader AppEnv, MonadUnliftIO)
I’m using the ReaderT
monad transformer in combination with the IO
monad. This is just the Reader
monad with the addition of being able to use IO
in its computations. The ReaderT
transformer is wrapped in a newtype, App
in this case, so we can inherit useful methods from other typeclasses as well.
In short, the type definition above means App
is a monad that can read from an environment of type AppEnv
and return values of type a
in an IO
context. Also note the various typeclasses that App
derives. App
has access to methods ranging from the useful basics like fmap
(from Functor
), to more powerful methods like liftIO
(from MonadIO
) and ask
(from MonadReader AppEnv
).
This way of structuring an application around the ReaderT
monad transformer is based on the ReaderT design pattern[2] made popular by Michael Snoyman at FPComplete. If you think about it, an application runs in a context where:
- It has access to an envrionment (in our case, the
AppEnv
type) where it can retrieve neccessary resources like a database connection, a configuration file, TLS manager, etc.
and
- Can perform input/output (e.g. writing to a database or downloading a file from the Internet)
The ReaderT
monad transfomer in combination with the IO
monad allows us to express this in our programs.
Handling Errors
Wait, how does the App
monad handle errors? There are multiple ways to throw and catch errors in Haskell, all with different tradeoffs. One option is to add the ExceptT
monad transformer to the App
monad stack, but this complicates the code. If possible, it is best to try and avoid adding another layer to your monad stack. Using ExceptT
also prevents the App
monad from derivng important typeclasses like MonadUnliftIO
that we need for concurrency.
Instead, an instance of MonadError
is implemented for the the App
monad.
{- | Exception wrapper around 'AppError'. Useful when you need to throw/catch
'AppError' as 'Exception'.
-}
newtype AppException = AppException
{ unAppException :: AppError
} deriving (Show)
deriving anyclass (Exception)
instance MonadError AppError App where
throwError :: AppError -> App a
throwError = liftIO . throwIO . AppException
{-# INLINE throwError #-}
catchError :: App a -> (AppError -> App a) -> App a
catchError action handler = App $ ReaderT $ \env -> do
let ioAction = runApp env action
ioAction `catch` \(AppException e) -> runApp env $ handler e
{-# INLINE catchError #-}
The AppException
newtype deriving the Exception
typeclass is only used to make implementing MonadError
a lot easier and performant.
The reasons for choosing this technique to handle errors in the App
monad is more complicated than I describe here, but I will save this lesson for another blog post.
AppEnv and The Has Typeclass Pattern
To get a better understanding of how the ReaderT
monad is being utilized in issue-wanted, lets take a look at the AppEnv
type that defines the environment the App
monad has access to:
-- 1
type AppEnv = Env App
-- 2
type DbPool = Pool Connection
-- 3
data Env (m :: Type -> Type) = Env
{ envDbPool :: !DbPool
, envLogAction :: !(LogAction m Message)
}
-- 4
instance HasLog (Env m) Message m where
getLogAction :: Env m -> LogAction m Message
getLogAction = envLogAction
setLogAction :: LogAction m Message -> Env m -> Env m
setLogAction newAction env = env { envLogAction = newAction }
-- 5
class Has field env where
obtain :: env -> field
-- 6
instance Has DbPool (Env m) where obtain = envDbPool
instance Has (LogAction m Message) (Env m) where obtain = envLogAction
-- 7
grab :: forall field env m . (MonadReader env m, Has field env) => m field
grab = asks $ obtain @field
Ok, lets break this down step by step:
We have the type synonym
AppEnv
that you saw in the definition of theApp
monad above. It’s just theEnv
type parameterized by theApp
type. One interesting thing to notice is thatAppEnv
is used to defineApp
andApp
is used to defineAppEnv
making these definitions mutually recursive.DbPool
is the type synonym we’re using to represent a pool of database connections. This will be used for connecting to our PostgreSQL database.This is the
Env
type. It’s a record type that represents the different resources ourApp
monad has access to. Interestingly, it has a type variable of kindType -> Type
meaningm
must be filled in by a single-parameter type constructor. This type variable is primarily used for theLogAction
type from theco-log
library.An instance of the
HasLog
typeclass from theco-log
library for ourEnv
type.The
Has
typeclass. It is parameterized by two type variables; thefield
of interest and theenv
it is retrieved from. It only has one method,obtain
, which takes a value of typeenv
and gives back a value of typefield
. This is one example of the Has typeclass pattern. This pattern is commonly used in conjuction with the ReaderT design pattern to make it even more clean and extensible.Here are our instances of the
Has
typeclass. We add one for each field in theEnv
type. The two definintions above mean we have a way of obtainingDbPool
andLogAction
values from ourEnv
type.This interesting function,
grab
, does exactly what its name implies. It “grabs” a value of the specified type from our environment. It can be used to grab our database pool for example:-- | Perform action that needs database connection. withPool :: ( MonadReader env m , Has DbPool env , MonadIO m ) => (Sql.Connection -> IO b) -> m b withPool f = do pool <- grab @DbPool liftIO $ Pool.withResource pool f {-# INLINE withPool #-}
@DbPool
is an example of a type application which you can read more about here. I like to think of thegrab
function as a more convenient version ofask
that allows you to specify exactly what you want from the environment.
The Download Monad
With the power to mix and match different monads comes the responsibility of building your own custom monad. When we need to operate within a certain context but the available monads don’t describe it well enough, we need to create our own.
If you don’t know already, issue-wanted is a web application that programmers can use to quickly find open-source Haskell projects to contribute to. Me and my mentors want to make sure that people are able to find the right project for them, and so we thought displaying the Haskell repository along with the category names that populate the project’s .cabal
file would be a nice feature. Creators of Haskell projects have the option to add a comma-seperated list of values to the category
field of their .cabal
file. Here’s the category
field for the issue-wanted project for example:
Our challenge was to figure out how to fetch this metadata and store it in our database. An easy solution would’ve been to just use the GitHub API to retrieve the file contents, but we already use the GitHub API pretty extensively throughout other parts of the application and thought we might risk going over the API rate-limit. We decided to implement a function to download the file directly from the URL.
We knew we could construct a URL for the .cabal
file given a RepoOwner
and RepoName
:
-- | This function returns a @Url@ for downloading a @Repo@'s @.cabal@ file.
repoCabalUrl :: RepoOwner -> RepoName -> Url
repoCabalUrl (RepoOwner repoOwner) (RepoName repoName) = Url $
"https://raw.githubusercontent.com/"
<> repoOwner
<> "/"
<> repoName
<> "/master/"
<> repoName
<> ".cabal"
This solution isn’t one hundered percent correct because there will be cases where the .cabal
file isn’t stored on GitHub at all, the .cabal
file isn’t in the master
branch, or we’re dealing with a multi-package Haskell project. For now, an error is thrown when one of these cases occurs and the file isn’t downloaded, but we definitely plan on adding support for these kinds of Haskell projects in the future to ensure that we have as much useful metadata as possible.
Once we figured out how to construct the URLs, we needed to implement a function for downloading the contents. A function that takes a Url
and gives back a ByteString
representing the file contents:
But downloading a file from the Internet is not a pure function! downloadFile
needs to operate within some sort of monadic context:
We named this context MonadDownload
. The class definition looks like this:
-- | Describes a monad that can download files from a given @Url@.
class Monad m => MonadDownload m where
downloadFile :: Url -> m ByteString
Typeclasses like the one above are a great way to decouple our code and take advantage of things like mock instances for testing.
I haven’t done this for issue-wanted becasue we are using other techniques to test the downloadFile
function, but the fact that I can implement this, or any other version of the interface, if I need to is very nice.
More importantly, I needed to implement an instance of MonadDownload
for the App
monad; the heart of our application.
-- 1
instance MonadDownload App where
downloadFile = downloadFileImpl
-- 2
type WithError m = (MonadError AppError m, HasCallStack)
type WithLog env msg m = (MonadReader env m, HasLog env msg m, HasCallStack)
type WithDownload env m = (MonadIO m, MonadReader env m, WithError m, WithLog env m)
-- 3
downloadFileImpl :: WithDownload env m => Url -> m ByteString
downloadFileImpl url@Url{..} = do
man <- newTlsManager
let req = fromString $ toString unUrl
log I $ "Attempting to download file from " <> unUrl <> " ..."
response <- liftIO $ httpLbs req man
let status = statusCode $ responseStatus response
let body = responseBody response
log D $ "Recieved a status code of " <> show status <> " from " <> unUrl
case status of
200 -> do
log I $ "Successfully downloaded file from " <> unUrl
pure $ toStrict body
_ -> do
log E $ "Couldn't download file from " <> unUrl
throwError $ urlDownloadFailedError url
Let’s break this down:
You may be wondering why
downloadFile
is being implemented via thedownloadFileImpl
function below it. The type ofdownloadFile
is constrained toUrl -> App ByteString
so we can only use it in theApp
context.downloadFileImpl
on the other hand can operate within a polymorphic contextm
, as long as it abides by theWithDownload
constraint. This is useful because it helps with code readability and reusablity. The type signature tells the programmer exactly what minimal context[3] is required fordownloadFileImpl
, and other instances ofMonadDownload
(like the one for theTest
monad we saw above) can reusedownloadFileImpl
if we want to keep the same logic for downloading files.Something that stands out in this snippet of code is the
WithError
,WithLog
, andWithDownload
type synonyms. We use type synonyms to combine multiple constraints that are used together often. This helps with readability. Prior to working on issue-wanted I didn’t know that that you could define type synonyms for constraints, but it turns out to be a pretty useful technique.Finally, the actual implementation of
downloadFile
. After asking around on r/haskell for a library that would allow us to idiomatically download files from URLs, we decided to use the http-client and http-client-tls libraries. We’re usingnewTlsManager
for loading up a new TLS manager andhttpLbs
for making the request. We use the response status code for logging messages and return the response body.
Awesome! We have our very own monad for downloading stuff from URLs! But wait…
downloadFileImpl :: WithDownload env m => Url -> m ByteString
downloadFileImpl url@Url{..} = do
man <- newTlsManager
...
A new TLS manager has to be created each time we call downloadFileImpl
. This is unneccessary and can lead to suboptimal performance, so we decided to add a field for the manager to our Env
record. This way, we only need to create the manager once when the application is started, and grab the same one each time we need to download a file.
data Env (m :: Type -> Type) = Env
{ envDbPool :: !DbPool
, envManager :: !Manager
, envLogAction :: !(LogAction m Message)
}
instance Has Manager (Env m) where obtain = envManager
Now downloadFileImpl
looks like this:
instance MonadDownload App where
downloadFile = downloadFileImpl
type WithError m = (MonadError AppError m, HasCallStack)
type WithLog env msg m = (MonadReader env m, HasLog env msg m, HasCallStack)
type WithDownload env m = (MonadIO m, MonadReader env m, WithError m, WithLog env m, Has Manager env)
downloadFileImpl :: WithDownload env m => Url -> m ByteString
downloadFileImpl url@Url{..} = do
man <- grab @Manager
let req = fromString $ toString unUrl
log I $ "Attempting to download file from " <> unUrl <> " ..."
response <- liftIO $ httpLbs req man
let status = statusCode $ responseStatus response
let body = responseBody response
log D $ "Recieved a status code of " <> show status <> " from " <> unUrl
case status of
200 -> do
log I $ "Successfully downloaded file from " <> unUrl
pure $ toStrict body
_ -> do
log E $ "Couldn't download file from " <> unUrl
throwError $ urlDownloadFailedError url
We simply add Has Manager env
to the WithDownload
constraint and now downloadFileImpl
can grab the manager from the envrionment anytime it’s needed to download a file, without creating a new one each time. Nice!
Conclusion
Issue-wanted is looking better and better as the Summer goes on and I’m excited to see how everything comes together. Up until the past couple of weeks, much of the structure of issue-wanted was borrowed from the Holmusk three-layer template project and I haven’t really got the opportunity to implement anything unique to issue-wanted until now. I’m starting to move past building the foundation of the project and work my way into the more interesting features like the one I wrote about in this post. I look forward to implementing interesting solutions to the problems I will encounter as I continue to work on issue-wanted.
Footnotes:
[1] From what I’ve heard, the analogy between monads and “computations” or “actions” isn’t 100% accurate, especially in the mathematical sense, but this is how I reason about monads. It has been a useful model for me so far.
[2] The ReaderT pattern used in issue-wanted and the three-layer template project is a slightly modified version of the classic ReaderT pattern by Michael Snoyman.
[3] We’re also building the project with the -Wredundant-constraints
flag which guarantees that we are using the minimum context needed for our functions to operate.