Friday, June 8, 2012

The Kansas University Rewrite Engine (KURE)

The Kansas University Rewrite Engine (KURE) is a Haskell-hosted DSL for strategic programming. We`ve just released the third version of KURE, which adds lenses for navigation and a variant set of combinators to make change detection easier.

This post just overviews the basics, and gives a simple example of usage.

KURE Basics


KURE is based around the following data type:
data Translate c m a b = Translate {apply :: c -> a -> m b}

translate :: (c -> a -> m b) -> Translate c m a b
translate = Translate

There`s a lot of type parameters, but the essential idea is that Translate represents a transformation that can be applied to a value of type a in a context c, and produces a value of type b in the monad m. Actually, we require m to be a MonadPlus, as this allows us to encode notions of success and failure, which are integral to strategic programming. Specifically, mzero represents failure and mplus is a "catch" for both mzero and fail. To avoid clutter we`ll omit the class constraints, but just imagine that wherever you see an m there`s a (MonadPlus m => ...) to go with it.

We also define a synonym for the special case when the result and argument type coincide:
type Rewrite c m a = Translate c m a a

Translate itself forms a monad (and an arrow, and a bunch of other structures besides), which provides us with a lot of combinators for free. Two key definitions are composition and bind:
(>>>) :: Translate c m a b -> Translate c m b d -> Translate c m a d
t1 >>> t2 = translate $ \ c -> apply t1 c >=> apply t2 c

(>>=) :: Translate c m a b -> (b -> Translate c m a d) -> Translate c m a d
t >>= f = translate $ \ c a -> do b <- apply t c a
apply (f b) c a

Observe the difference: composition takes the result of the first translation as the argument to the second translation, whereas bind uses the result to determine the second translation, but then applies that second translation to the original argument.

Another useful combinator is <+> (from the ArrowPlus class), which acts as a catch for Translate:
(<+>) :: Translate c m a b -> Translate c m a b -> Translate c m a b
t1 <+> t2 = translate $ \ c a -> apply t1 c a `mplus` apply t2 c a

We can now write strategic programming code, such as the classic try combinator:
tryR :: Rewrite c m a -> Rewrite c m a
tryR r = r <+> idR

Where idR is the identity rewrite:
idR : Rewrite c m a
idR = translate $ \ _ -> return

Finally, one combinator new to this version of KURE is sequential composition of rewrites that allows one rewrite to fail:
(>+>) :: Rewrite c m a -> Rewrite c m a -> Rewrite c m a

Example: Arithmetic Expressions with Fibonacci


Now let`s consider an example. Take a data type of arithmetic expressions augmented with a Fibonacci primitive:
data Arith = Lit Int | Add Arith Arith | Sub Arith Arith | Fib Arith

To keep things simple, we`ll work with an empty context, and use Maybe as our MonadPlus:
type RewriteA = Rewrite () Maybe Arith

Let`s start with some rewrites that perform basic arithmetic simplification:
addLitR :: RewriteA
addLitR = do Add (Lit m) (Lit n) <- idR
return (Lit (m + n))

subLitR :: RewriteA
subLitR = do Sub (Lit m) (Lit n) <- idR
return (Lit (m - n))

We`re exploiting the fact that Translate is a monad to use do-notation - something we have found extremely convenient. If the pattern match fails, this will just trigger the fail method of the monad, which we can then catch as desired.

Using >+>, we can combine these two rewrites into a single rewrite for arithmetic simplification:
arithR :: RewriteA
arithR = addLitR >+> subLitR

Next a more interesting rewrite, unfolding the definition of Fibonacci:
fibLitR :: RewriteA
fibLitR = do Fib (Lit n) <- idR
case n of
0 -> return (Lit 0)
1 -> return (Lit 1)
_ -> return (Add (Fib (Sub (Lit n) (Lit 1)))
(Fib (Sub (Lit n) (Lit 2)))
)

Tree Traversals


Thus far, we`ve only discussed rewrites that apply to the entire data structure we`re working with. But a key feature of KURE (and strategic programming) is the ability to traverse a structure applying rewrites to specific locations. For example, the anyR combinator applies a rewrite to each immediate child of a node, succeeding if any of those rewrites succeed:
anyR :: RewriteA -> RewriteA

At first glance this might sound simple, but there are a number of issues. Most notably, what if the children have distinct types from each other and their parent? How should such a combinator be typed? This isn`t an issue in this simple Fibonacci example, as there is only one type (Arith), but in general you could have an AST with multiple mutually recursive non-terminals. KURE solves this by constructing a sum data type of all non-terminals in the AST, and having traversal combinators operate over this data type (using Associated Types to specify the sum type for each non-terminal). This is the most significant feature of KURE, but it`d take too long to explain the details here. You can read about it in either of the following papers:

Using the anyR combinator (amongst others), KURE defines various traversal strategies (we just give the specialised types here):
anybuR :: RewriteA -> RewriteA
anybuR r = anyR (anybuR r) >+> r

innermostR :: RewriteA -> RewriteA
innermostR r = anybuR (r >>> tryR (innermostR r))

anybuR traverses a tree in a bottom-up manner, applying the rewrite to every node, whereas innermostR performs a fixed-point traversal, continuing until no more rewrites can be successfully applied. For example, we can define an evaluator for Arith using this strategy:
evalR :: RewriteA
evalR = innermostR (arithR >+> fibLitR)

Release


KURE 2.0.0 is now available on Hackage.

You can find this Fibonacci example, and several others, bundled with the source package. For a non-trivial example, KURE is being used as the underlying rewrite engine for the HERMIT tool. HERMIT hasn`t been released yet, but you can read about it in this paper:

Introducing the HERMIT Equational Reasoning Framework

The paper also describes how KURE uses lenses.

No comments: