Wednesday, April 3, 2013

The Constrained-Type-Class Problem

In Haskell, there are some data types that you want to make an instance of a standard type class, but are unable to do so because of class constraints on the desired class methods. The classic example is that the Set type (from Data.Set) cannot be made an instance of Monad because of an Ord constraint on its desired binding operation:

returnSet :: a -> Set a
returnSet = singleton

bindSet :: Ord b => Set a -> (a -> Set b) -> Set b
bindSet sa k = unions (map k (toList sa))

However, despite being the classic example, in some ways it's not a very good example, because the constraint appears only on the second type parameter of bindSet, not on the first type parameter, nor on returnSet.

Another example of the problem also arises in the context of embedded domain-specific languages. When constructing a deep embedding of a computation that will later be compiled, it is often necessary to restrict the involved types to those that can be reified to the target language. For example:

data EDSL :: * -> * where
  Value  :: Reifiable a => a -> EDSL a
  ...
  Return :: Reifiable a => a -> EDSL a
  Bind   :: (Reifiable a, Reifiable b) =>
            EDSL a -> (a -> EDSL b) -> EDSL b

While we can construct a computation using Return and Bind, we cannot declare a Monad instance using those constructors because of the Reifiable class constraint.

(Note: if you want to try out the code in this post, you'll need the following:

{-# LANGUAGE GADTs, MultiParamTypeClasses, KindSignatures,
    ConstraintKinds, TypeFamilies, RankNTypes,
    InstanceSigs, ScopedTypeVariables #-}

import GHC.Exts (Constraint)
import Data.Set hiding (map)

)

Restricted Type Classes


There have been numerous solutions proposed to address this problem. John Hughes suggested extending Haskell with Restricted Data Types: data types with attached class constraints. In the same paper, Hughes also suggested defining Restricted Type Classes: type classes that take a constraint as a parameter and impose it on all polymorphic type variables in the class methods. This latter approach was simulated several times (by Oleg Kiselyov and Ganesh Sittampalam, amongst others), before the constraint-kinds extension made it possible to encode it directly:

class RMonad (c :: * -> Constraint) (m :: * -> *) where
  return :: c a        => a                 -> m a
  (>>=)  :: (c a, c b) => m a -> (a -> m b) -> m b

It is then straightforward to define instances that require class constraints:

instance RMonad Reifiable EDSL where
  return = Return
  (>>=)  = Bind

However, restricted type classes are new type classes: using them doesn't allow compatibility with existing type classes. If restricted type classes were already used everywhere instead of the original type classes then there would be no problem, but this is not the case. A variant of restricted type classes (suggested by Orchard and Schrijvers is to use an associated type function with a default instance:

class Monad (m :: * -> *) where
  type Con m (a :: *) :: Constraint
  type Con m a = ()

  return :: Con m a            => a                 -> m a
  (>>=)  :: (Con m a, Con m b) => m a -> (a -> m b) -> m b

instance Monad EDSL where
  type Con EDSL a = Reifiable a

  return = Return
  (>>=)  = Bind

An attraction of this approach is that this type class could replace the existing Monad class in the standard libraries, without breaking any existing code. EDIT: Edward Kmett points out that this claim is not true (see comment below). Any code that is polymorphic in an arbitrary monad m would be broken, as the unknown constraint Con m will need to be satisfied.

Normality can be Constraining


If we don't want to modify the type class, then the alternative is to modify the data type. Specifically, we need to modify it in such a way that we can declare the type-class instance we want, but such that the operations of that type class will correspond to the operations we desired on the original data type. For monads, one way to do this is to use continuations, as demonstrated by Persson et al. An alternative (and, in our opinion, more intuitive) way to achieve the same effect is to construct a deep embedding of the computation, and restructure it into a normal form. The normal form we use is the same one used by Unimo and Operational, and consists of a sequence of right-nested >>=s terminating with a return:





The first argument to each >>= is a value of the original data type, which we will call primitive operations (a.k.a. "non-proper morphisms", "effect basis", or "instructions sets").

The key feature of the normal form is that every type either appears as a type parameter on a primitive operation, or appears as the top-level type parameter of the computation. Consequently, if we enforce that all primitives have constrained type parameters, then only the top-level type parameter can remain unconstrained (which is easy to deal with, as we will show later). We can represent this using the following deep embedding:

data NM :: (* -> Constraint) -> (* -> *) -> * -> * where
  Return :: a                             -> NM c t a
  Bind   :: c x => t x -> (x -> NM c t a) -> NM c t a

The t parameter is the type of the primitive operations (e.g. Set), and c is the class constraint (e.g. Ord).

We can define a Monad instance for this deep embedding, which applies the monad laws to restructure the computation into the normal form during construction (just like the Operational package.)

instance Monad (NM c t) where
  return :: a -> NM c t a
  return = Return

  (>>=) :: NM c t a -> (a -> NM c t b) -> NM c t b
  (Return a)  >>= k = k a                        -- left identity
  (Bind ta h) >>= k = Bind ta (\ a -> h a >>= k) -- associativity

Primitive operations can be lifted into the NM type by applying the remaining monad law:

liftNM :: c a => t a -> NM c t a
liftNM ta = Bind ta Return -- right identity

Notice that only primitive operations with constrained type parameters can be lifted, thereby preventing any unconstrained types infiltrating the computation.

Once a computation has been constructed, it can then be interpreted in whatever way is desired. In many cases (e.g. the Set monad), we want to interpret it as the same type as the primitive operations. This can be achieved by the following lowering function, which takes interpretations for return and >>= as arguments.

lowerNM :: forall a c t. (a -> t a) ->
  (forall x. c x => t x -> (x -> t a) -> t a) -> NM c t a -> t a
lowerNM ret bind = lowerNM'
  where
    lowerNM' :: NM c t a -> t a
    lowerNM' (Return a)  = ret a
    lowerNM' (Bind tx k) = bind tx (lowerNM' . k)

Because the top-level type parameter of the computation is visible, we can (crucially) also constrain that type. For example, we can lower a monadic Set computation as follows:

lowerSet :: Ord a => NM Ord Set a -> Set a
lowerSet = lowerNM singleton bindSet

This approach is essentially how the AsMonad transformer from the RMonad library is implemented.

The idea of defining a deep embedding of a normal form that only contains constrained types is not specific to monads, but can be applied to any type class with a normal form such that all types appears as parameters on primitive operations, or as a top-level type parameter. We've just written a paper about this, which is available online along with accompanying code. The code for our principal solution is also available on Hackage.

9 comments:

Edward Kmett said...

"An attraction of this approach is that this type class could replace the existing Monad class in the standard libraries, without breaking any existing code."

This is unfortunately just not true. Any code that relies on polymorphic recursion is irredeemably broken by this change. It doesn't break simple code, but it makes harder cases all but impossible to write.

In the presence of polymorphic recursion using constrained monads by default means you have to limit yourself to a particular monad -- or you have to provide explicit witnesses using something like the constraints package that can be used to derive the type for the constraint for the polymorphically recursive case from the previous recursion level, but that is quite horrific and its particular to each case. =(

neil said...

EDIT: Ah, I see. Yes, my claim was bogus. Any code that works with an arbitrary Monad m would break because the constraints may not, in general, be empty. Example:
newtype Kleisli m a b = Kleisli (a -> m b)

instance RMonad m => Category (Kleisli m) where
id :: {- Con m a => -} Kleisli m a a
id = Kleisli return

(.) :: {- (Con m b, Con m c) => -} Kleisli m b c -> Kleisli m a b -> Kleisli m a c
(Kleisli f) . (Kleisli g) = Kleisli (\ a -> g a >>= f)

Edward Kmett said...

You can fix Category by upgrading it to take a constraint of its own, but this doesn't extend to anything that recurses polymorphically:

e.g. Consider how to write the moral equivalent of a Traversable instance for

data Grow a = Grow (Complete (a,a)) | Stop a

in the presence of this kind of constraint.

Worse, you can't even do normal Traversable instances. =(

James said...

Hi Neil,

I think this is related:

http://cs.ioc.ee/~tarmo/papers/fossacs10.pdf

Regards,

James

neil said...

@Edward

Did you mean:

data Grow a = Complete (Grow (a,a)) | Stop a

neil said...

Because if so, then I only half agree. Assuming a "restricted traversable" class:

class RTraversable (t :: * -> *) where
type ConT t (a :: *) :: Constraint
type ConT t a = ()

traverseR :: (ConT t a, ConT t b, Applicative f) =>
(a -> f b) -> t a -> f (t b)


Then we can define a "normal" instance that imposes no constraints and the polymorphic recursion does not pose a problem:

instance RTraversable Grow where
traverseR :: (ConT Grow a, ConT Grow b, Applicative f) =>
(a -> f b) -> Grow a -> f (Grow b)
traverseR g (Stop a) = Stop <$> g a
traverseR g (Complete aa) = Complete <$>
traverseR (\ (a1,a2) -> (,) <$> g a1 <*> g a2) aa

Or is this not what you meant? I agree that an instance that actually imposes a constraint would not work due to the polymorphic recursion, e.g.

class C a where ...

instance RTraversable Grow where
type ConT Grow a = C a
traverseR = -- as above

"Could not deduce (C (b, b)) arising from a use of `traverseR' from the context (ConT Grow a, ConT Grow b, Applicative f)"

neil said...

@James: I hadn't seen that paper; thanks for the link.

Edward Kmett said...

neil: Yes I meant the Complete (Grow (a,a)) case. Thats what I get trying to type an example without preview. =)

But, yes, in general the problem with RTraversable is it can't handle the polymorphic recursive case, and it can only handle cases that are non-polymorphically recursive by leaking the arguments for every intermediate state that gets threaded through the applicative.

This means that now the Traversable instance is even less canonical, because the constraint set can be written several different ways, e.g. by building up a tuple of the args, a level at a time and finally map them into the form they need to be in to (almost) avoid ever putting functions in, etc, but that gets hairy fast.

RTraversable is pretty much useless in practice as idiomatic haskell today, because as the API for applicative currently works you need to round trip almost everything with functions.

traverse f (Foo a b) = Foo traverse f a f b

incurs constraints that you are almost assuredly not going to be able to meet. Extending Applicative with a liftA2.. liftAn would partially alleviate the problem, but working in that system doesn't appeal to me at all. =)

Andy Gill said...

Trying to add a comment