Sunday, July 19, 2015

New Papers

Two papers from the University of Kansas Functional Programming Group have been accepted for publication at Haskell'15! One about using HERMIT for equational reasoning, and the other about a monad-based design pattern for remote control that externalizes monadic execution. We've put the preprints on our webpage.

Reasoning with the HERMIT: Tool support for equational reasoning on GHC core programs

(by A. Farmer, N. Sculthorpe, and A. Gill)

A benefit of pure functional programming is that it encourages equational reasoning. However, the Haskell language has lacked direct tool support for such reasoning. Consequently, reasoning about Haskell programs is either performed manually, or in another language that does provide tool support (e.g. Agda or Coq). HERMIT is a Haskell-specific toolkit designed to support equational reasoning and user-guided program transformation, and to do so as part of the GHC compilation pipeline. This paper describes HERMIT’s recently developed support for equational reasoning, and presents two case studies of HERMIT usage: checking that type-class laws hold for specific instance declarations, and mechanising textbook equational reasoning.

The Remote Monad Design Pattern

(by A. Gill, N. Sculthorpe, J. Dawson, A. Eskilson, A. Farmer, M. Grebe, J. Rosenbluth, R. Scott, and J. Stanton)

Remote Procedure Calls are expensive. This paper demonstrates how to reduce the cost of calling remote procedures from Haskell by using the remote monad design pattern, which amortizes the cost of remote calls. This gives the Haskell community access to remote capabilities that are not directly supported, at a surprisingly inexpensive cost.

We explore the remote monad design pattern through six models of remote execution patterns, using a simulated Internet of Things toaster as a running example. We consider the expressiveness and optimizations enabled by each remote execution model, and assess the feasibility of our approach. We then present a full-scale case study: a Haskell library that provides a Foreign Function Interface to the JavaScript Canvas API. Finally, we discuss existing instances of the remote monad design pattern found in Haskell libraries.
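The batching idea at the heart of the pattern can be sketched in a few lines of plain Haskell. This is our own toy illustration, not the paper's library or API: a monad that queues commands and ships the whole queue in a single simulated round trip (the `Command` names echo the paper's toaster example, but are ours).

```haskell
module Main where

-- Toy commands for an "IoT toaster" (hypothetical names).
data Command = Down | Toast Int | Up deriving Show

-- A remote computation accumulates the commands it wants to run.
newtype Remote a = Remote ([Command] -> ([Command], a))

instance Functor Remote where
  fmap f (Remote g) = Remote $ \cs ->
    let (cs', a) = g cs in (cs', f a)

instance Applicative Remote where
  pure a = Remote $ \cs -> (cs, a)
  Remote ff <*> Remote fa = Remote $ \cs ->
    let (cs1, f) = ff cs
        (cs2, a) = fa cs1
    in (cs2, f a)

instance Monad Remote where
  Remote fa >>= k = Remote $ \cs ->
    let (cs1, a)  = fa cs
        Remote fb = k a
    in fb cs1

-- Queue a single command (an "async" primitive with no result).
cmd :: Command -> Remote ()
cmd c = Remote $ \cs -> (cs ++ [c], ())

-- "Send" the computation: all queued commands travel in one packet,
-- amortizing the per-call network cost.
send :: Remote a -> IO a
send (Remote f) = do
  let (cs, a) = f []
  putStrLn ("one round trip: " ++ show cs) -- stand-in for a network call
  return a

main :: IO ()
main = send $ do
  cmd Down
  cmd (Toast 120)
  cmd Up
```

Real remote monads must also handle procedures, i.e. primitives that return results and therefore force the queue to be flushed; the paper's six models explore exactly that design space.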

Monday, August 18, 2014

Bluetooth on Haskell

I'm presenting a very early draft of my bluetooth library. As its name suggests, bluetooth is a Haskell frontend to low-level Bluetooth APIs, making it similar in spirit to Python's PyBluez and Java's BlueCove.

What it can do

Currently, bluetooth only supports Linux. It has the capability of running an RFCOMM server and client. Theoretically, it should also support L2CAP servers and clients, although this has not been tested yet.

What it will eventually do

I plan to have bluetooth support each of the GHC Tier 1 platforms—that is, Windows, OS X, Linux, and FreeBSD. I want to have the capability to run the full gamut of L2CAP and RFCOMM-related functions on each OS, as well as any additional OS-specific functionality.

Motivation

Bluetooth programming on Haskell is currently in a very primitive state. As of now, there are only two packages on Hackage that make any mention of Bluetooth (as far as I can tell):

  1. network hints in its Network.Socket module that there is an AF_BLUETOOTH socket address family. However, trying to use it with network will fail, since there is no corresponding Bluetooth SockAddr.
  2. simple-bluetooth by Stephen Blackheath offers some of what my own bluetooth package offers (namely, RFCOMM client capability on Windows and Linux).
However, there is currently no comprehensive, cross-platform Haskell Bluetooth library à la PyBluez or BlueCove. I want bluetooth to fill that niche.

How bluetooth works

bluetooth can be thought of as a wrapper around network. After all, Bluetooth programming is socket-based, so Network.Socket already provides most of what one needs to implement a Bluetooth server or client. There are several gotchas with Bluetooth programming, however:
  • Every major OS has a completely different Bluetooth software stack. For example, Linux uses BlueZ, and Windows has several different stacks, including Winsock and Widcomm. Therefore, bluetooth is not likely to work identically on every OS.
  • Windows in particular is challenging to support since several Winsock functions do not work correctly on the version of MinGW-w64 that is currently shipped with GHC for Windows (only the 64-bit version, no less). For this reason, I probably won't develop a Windows version of bluetooth until this issue is resolved.
It is recommended that you have a basic understanding of Bluetooth programming before attempting to use bluetooth. I recommend this introduction by Albert Huang.

Examples

The following are abridged versions of the RFCOMM client and server examples from the bluetooth repo.

RFCOMM server

module Main where
import Data.Set (singleton)

import Network.Bluetooth
import Network.Socket

main :: IO ()
main = withSocketsDo $ do
    let uuid     = serviceClassToUUID SerialPort
        proto    = RFCOMM
        settings = defaultSDPInfo {
            sdpServiceName = Just "Roto-Rooter Data Router"
          , sdpProviderName = Just "Roto-Rooter"
          , sdpDescription = Just "An experimental plumbing router"
          , sdpServiceClasses = singleton SerialPort
          , sdpProfiles = singleton SerialPort
        }

    handshakeSock <- bluetoothSocket proto
    btPort <- bluetoothBindAnyPort handshakeSock anyAddr
    bluetoothListen handshakeSock 1
    service <- registerSDPService uuid settings proto btPort
    (connSock, connAddr) <- bluetoothAccept handshakeSock
    putStrLn $ "Established connection with address " ++ show connAddr

    message <- recv connSock 4096
    putStrLn $ "Received message! [" ++ message ++ "]"
    let response = reverse message
    respBytes <- send connSock response
    putStrLn $ "Sent response! " ++ show respBytes ++ " bytes."

    close connSock
    close handshakeSock
    closeSDPService service

RFCOMM client

module Main where

import Network.Bluetooth
import Network.Socket

import System.IO

main :: IO ()
main = withSocketsDo $ do
    let addr = read "12:34:56:78:90:00"
        port = 1
    
    sock <- bluetoothSocket RFCOMM
    bluetoothConnect sock addr port
    putStrLn "Established a connection!"
    
    putStr "Please enter a message to send: "
    hFlush stdout
    message <- getLine
    
    messBytes <- send sock message
    response <- recv sock 4096
    putStrLn $ "Received response! [" ++ response ++ "]"

    close sock

Sunday, June 1, 2014

Haskell, DLLs, and You

Haskell libraries are suitable for the large majority of a functional programmer's use cases. Sometimes, however, there are needs that Haskell alone cannot fulfill. One common example is interfacing with low-level OS utilities. Since many of these tools are written in C, Haskell provides a useful -XForeignFunctionInterface (FFI) extension to use C functions and data directly. Below is an example file (Main1.hsc) that utilizes the FFI to call srand() and rand() on Linux:
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types

#include <stdlib.h>

foreign import ccall "srand"
    c_srand :: CUInt -> IO ()

foreign import ccall "rand"
    c_rand :: IO CInt

main :: IO ()
main = readLn >>= c_srand >> c_rand >>= print
Notice that the file extension is .hsc, not .hs. This is because it uses a special #include construct that must be processed by the hsc2hs program (which comes with the Haskell Platform). If you use cabal, this is done automatically. Otherwise, hsc2hs can be invoked like so:
$ hsc2hs Main1.hsc
$ runhaskell Main1.hs
42
71876166
$ runhaskell Main1.hs
27
1416980517
Generally, this process is pretty straightforward for Linux header files. What about other operating systems? One OS in particular, Windows, also provides C headers for its API, but they are much more difficult for hsc2hs to use successfully. Here is a simple example (Main2.hsc) that uses the Windows API function timeGetTime(), which retrieves the current system time in milliseconds:
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

#include <timeapi.h>

import System.Win32.Types

foreign import ccall "timeGetTime"
    c_timeGetTime :: IO DWORD

main :: IO ()
main = putStrLn . show =<< c_timeGetTime
If you attempt to simply run hsc2hs Main2.hsc, you will get an error resembling Main2.hsc:4:21: fatal error: timeapi.h: No such file or directory. That's not surprising, since we're referencing a Windows API header that isn't shipped with the version of MinGW that the Haskell Platform for Windows uses. To get around this, we'll try including the necessary header location manually (using Cygwin):
$ hsc2hs Main2.hsc -I"C:\Program Files (x86)\Windows Kits\8.1\Include\um"
In file included from Main2.hsc:4:0:
C:\Program Files (x86)\Windows Kits\8.1\Include\um/timeapi.h:17:20: fatal error: apiset.h: No such file or directory
compilation terminated.
Oh no! timeapi.h depends on a header file located in a different directory. That means we should only need to include that additional directory to resolve the issue, right?
$ hsc2hs Main2.hsc -I"C:\Program Files (x86)\Windows Kits\8.1\Include\um" -I"C:\Program Files (x86)\Windows Kits\8.1\Include\shared"
In file included from C:\Program Files (x86)\Windows Kits\8.1\Include\um/timeapi.h:21:0,
                 from Main2.hsc:4:
C:\Program Files (x86)\Windows Kits\8.1\Include\um/mmsyscom.h:94:21: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'MMVERSION'
C:\Program Files (x86)\Windows Kits\8.1\Include\um/mmsyscom.h:98:32: error: expected declaration specifiers or '...' before 'return'
C:\Program Files (x86)\Windows Kits\8.1\Include\um/mmsyscom.h:98:9: error: function definition declared 'typedef'
C:\Program Files (x86)\Windows Kits\8.1\Include\um/mmsyscom.h: In function '_Return_type_success_':
C:\Program Files (x86)\Windows Kits\8.1\Include\um/mmsyscom.h:98:45: error: expected declaration specifiers before 'UINT'
C:\Program Files (x86)\Windows Kits\8.1\Include\um/mmsyscom.h:102:14: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'FAR'
...
Ack! As far as I can tell, Windows-style headers are simply not compatible with MinGW. It is clear that using Windows-style header files directly in hsc2hs is a fool's errand.

What can be done about this? One approach is to compile the code into a dynamically linked library (DLL) and pass it to the Haskell compiler's linker. This has the advantage of not needing the #include construct. As an example, we can create a simple DLL project in Microsoft Visual Studio (I used Visual Studio 2013 to perform the following steps):
  1. Create a new Win32 Console Application project (I will give it the name dll_example). Make sure its application type is "DLL" in the corresponding wizard menu. For good measure, check "Empty project" and uncheck "Security Development Lifecycle (SDL) checks".
  2. In the Solution Explorer, right-click on "Header Files" and click Add > New Item. Select "Header file (.h)" and name it dll_example.h.
  3. Use the following code for dll_example.h:
    #ifndef DLL_EXAMPLE_H
    #define DLL_EXAMPLE_H
    
    #include <windows.h>
    
    #ifdef DLL_EXAMPLE_DLL_EXPORTS
    #define DLL_EXAMPLE_DLL_API __declspec(dllexport)
    #else
    #define DLL_EXAMPLE_DLL_API __declspec(dllimport)
    #endif
    
    DLL_EXAMPLE_DLL_API DWORD time_get_time();
    
    #endif
    
    Note that we use windows.h since it automatically brings all of the definitions of timeapi.h into scope, as well as other needed definitions (such as DWORD).
  4. In the Solution Explorer, right-click on "Source Files" and click Add > New Item. Select "C++ File (.cpp)" and name it dll_example.cpp.
  5. Use the following code for dll_example.cpp:
    #define DLL_EXAMPLE_DLL_EXPORTS // must come before the include so the header uses dllexport
    #include "dll_example.h"
    #pragma comment(lib, "Winmm.lib")
    
    DWORD time_get_time() {
     return timeGetTime();
    }
    
  6. In the Solution Explorer, right-click on "Resource Files" and click Add > New Item. In the left sidebar, click Visual C++ > Code, then click "Module-Definition File (.def)" and name it dll_example.def.
  7. Give dll_example.def the following definition:
    LIBRARY dll_example
    EXPORTS
     time_get_time
    
  8. Click Build > Build Solution. When it is finished, copy the newly created dll_example.dll (it should be in a directory similar to <project directory>/Debug) to the same directory where Main2.hsc is located.
Now we can dynamically link the DLL file by removing the #include <timeapi.h> line from Main2.hsc entirely, changing "timeGetTime" to "time_get_time", renaming it to Main2.hs, and compiling it like so:
$ ghc --make Main2.hs -L. -lWinmm -ldll_example
[1 of 1] Compiling Main             ( Main2.hs, Main2.o )
Linking Main2.exe ...
$ ./Main2.exe
95495217
$ ./Main2.exe
95496824
Great! We seem to have successfully used the DLL. But what happens when we attempt to execute Main2.exe independently of dll_example.dll?
$ cp Main2.exe ../Main2.exe
$ cd ..
$ ./Main2.exe
.../Main2.exe:  error while loading shared libraries: ?: cannot open shared object file: No such file or directory
Urgh. As it turns out, Windows needs to look up the dynamically linked libraries every time the executable is run. Notable locations that Windows searches include the executable's directory and the directories defined in the PATH environment variable.

This is rather inconvenient for a Haskell programmer, as cabal installs all of its compiled Haskell executables in %APPDATA%/cabal/bin, far away from the custom DLL files it needs. What is the best solution to this problem? This GHC wiki page suggests several approaches, but since I am a fan of straightforward fixes, I prefer to simply copy the needed DLL files directly to %APPDATA%/cabal/bin. Since that's tedious to do manually, we can configure cabal to automate this process.

During my attempts to get cabal to use DLLs during compilation, I discovered that cabal's extra-lib-dirs field only accepts absolute paths. This is a problem for us, since we need to use a custom DLL file whose location is relative to the package's root directory. (There are claims that you can use ${pkgroot} to retrieve this location, but I was not able to get it to work). This solution should resolve both of the aforementioned cabal issues:
  1. Create a new cabal project (i.e., cabal init). Put Main2.hs in the project, and put dll_example.dll in <package root>/lib.
  2. In Main2.cabal, make sure that extra-source-files includes lib/dll_example.dll, and that extra-libraries includes Winmm and dll_example.
  3. Adapt Setup.hs to use this code:
    import Control.Monad
    
    import Debug.Trace
    
    import Distribution.PackageDescription
    import Distribution.Simple
    import Distribution.Simple.LocalBuildInfo
    import Distribution.Simple.Setup
    
    import System.Directory
    import System.FilePath
    
    dllFileName :: FilePath
    dllFileName = "dll_example" <.> "dll"
    
    dllSourceDir :: IO FilePath
    dllSourceDir = do
        curDir <- getCurrentDirectory
        return $ curDir </> "lib"
    
    dllSourcePath :: IO FilePath
    dllSourcePath = do
        sourceDir <- dllSourceDir
        return $ sourceDir </> dllFileName
    
    copyDll :: String -> FilePath -> FilePath -> IO ()
    copyDll message sourcePath destPath = do
        putStrLn message
        putStr "Copying... "
        copyFile sourcePath destPath
        putStrLn "Done."
    
    patchDesc :: FilePath -> PackageDescription -> PackageDescription
    patchDesc sourceDir desc = let Just lib = library desc
                                   lbi = libBuildInfo lib
                                   newlbi = lbi { extraLibDirs = sourceDir : extraLibDirs lbi }
                               in desc { library = Just $ lib { libBuildInfo = newlbi } }
        
    customBuild :: FilePath -> PackageDescription -> LocalBuildInfo -> UserHooks -> BuildFlags -> IO ()
    customBuild sourceDir desc linfo hooks flags = do
        let installDir = bindir $ absoluteInstallDirs desc linfo NoCopyDest
            destPath = installDir </> dllFileName
        sourcePath <- dllSourcePath
        dllExists <- doesFileExist destPath
        
        when (not dllExists) $ copyDll (dllFileName ++ " is not in application data.") sourcePath destPath
        
        destTime <- getModificationTime destPath
        sourceTime <- getModificationTime sourcePath
        
        when (destTime < sourceTime) $ copyDll (dllFileName ++ " is out-of-date.") sourcePath destPath
        
        buildHook simpleUserHooks (patchDesc sourceDir desc) linfo hooks flags
    
    customInstall :: FilePath -> PackageDescription -> LocalBuildInfo -> UserHooks -> InstallFlags -> IO ()
    customInstall sourceDir desc = instHook simpleUserHooks $ patchDesc sourceDir desc
    
    customPostConf :: FilePath -> Args -> ConfigFlags -> PackageDescription -> LocalBuildInfo -> IO ()
    customPostConf sourceDir args conf desc linfo = postConf simpleUserHooks args conf (patchDesc sourceDir desc) linfo
    
    main :: IO ()
    main = do
        sourceDir <- dllSourceDir
        defaultMainWithHooks $ simpleUserHooks
            { buildHook = customBuild sourceDir
            , instHook = customInstall sourceDir
            , postConf = customPostConf sourceDir
            }
    
That code should be sufficient to tell cabal where dll_example.dll is during compilation, where dll_example.dll should be copied to, and when it needs to be copied (i.e., if it doesn't exist or is out-of-date). You should now be able to compile the project simply with cabal install without worrying about ugly GCC flags.

The downside to this approach is that there are now extra files to manage if the user ever wants to uninstall Main2. One way to resolve this is to provide users with a makefile containing an uninstall command that automatically removes Main2.exe and dll_example.dll from %APPDATA%/cabal/bin. If you want to see an example of this, check out the hermit-bluetooth repo, the project in which I encountered all of these problems (and which motivated me to write this blog post in the hope of saving other people some time).

Working with DLLs in Haskell tends to be quite gruesome, and I'd recommend avoiding it whenever possible. In some cases, though, dynamic linking is the only feasible solution to one's problems (especially on Windows), so it's nice to know that the infrastructure for interfacing with DLLs exists (even if actually using that interface is a tad unpleasant).

Friday, April 12, 2013

Sunroof: Clockwork Progress

In this article, we are going to generate a JavaScript application. Last year, we wrote a blog post about using monad reification to implement a JavaScript compiler. The compiler, called Sunroof, is now in a state where we can make the first public release. By way of a non-trivial example, this blog entry illustrates how to construct an analog clock as a self-contained JavaScript application that renders the clock using the HTML5 canvas element.

[Image: The Sunroof clock example]

The JavaScript API for the HTML5 canvas element is already provided by Sunroof in the module Language.Sunroof.JS.Canvas. Let's look at how we can render one line of the clock face using Sunroof:
c # save
-- Draw one of the indicator lines
c # beginPath
c # moveTo (0, -u * 1.0)
ifB (n `mod` 5 ==* 0)
    (c # lineTo (0, -u * 0.8)) -- Minute line
    (c # lineTo (0, -u * 0.9)) -- Hour line
ifB (n `mod` 15 ==* 0)
    (c # setLineWidth 8) -- Quarter line
    (c # setLineWidth 3) -- Non-Quarter line
c # stroke
c # closePath
-- Draw the hour numbers
ifB (n `mod` 5 ==* 0)
    (do c # translate (-u * 0.75, 0)
        c # rotate (-2 * pi / 4)
        c # fillText (cast $ n `div` 5) (0, 0))
    (return ())
c # restore

The monadic do-notation is used for sequencing JavaScript statements in a neat fashion.

The first few lines probably look familiar to people who have written JavaScript before.
c # save
-- Draw one of the indicator lines
c # beginPath
c # moveTo (0, -u * 1.0)

The #-operator is used instead of the .-operator in JavaScript. u represents the radius of the clock. Knowing this, you can see that we are calling methods on the JavaScript object c (our canvas context). The methods without parameters do not require empty parentheses, as a Haskell programmer would expect. The tuple used in the call of moveTo is only there to indicate that this parameter is a coordinate, not two single numbers. You can also see that JavaScript numbers are neatly embedded using the Num class and can be used naturally.

The next few lines show a branch.
ifB (n `mod` 5 ==* 0)
    (c # lineTo (0, -u * 0.8)) -- Minute line
    (c # lineTo (0, -u * 0.9)) -- Hour line

Haskell lacks the ability to deeply embed branches and boolean expressions. For that reason, we use the Data.Boolean package. Instead of if-then-else, you are required to use ifB when writing JavaScript.
ifB (n `mod` 5 ==* 0)
    (do c # translate (-u * 0.75, 0)
        c # rotate (-2 * pi / 4)
        c # fillText (cast $ n `div` 5) (0, 0))
    (return ())

Note the cast operation in line five. As Haskell's type system is more restrictive than the one used in JavaScript, we sometimes have to cast one value to another. This may seem more complicated than writing JavaScript by hand, but when the API is used correctly (by not working around it), compile-time errors can reveal mistakes in the code early.

Getting back to the initial code block: How do we render the other 59 lines of the clock face? We just wrap this code into a function. Of course, we do this at JavaScript level.
renderClockFaceLine <- function $
    \(c :: JSCanvas, u :: JSNumber, n :: JSNumber) -> do
        ...

We have just created the JavaScript function renderClockFaceLine with three parameters. So let's render the complete clock face using the forEach-method provided by arrays.
c # save
c # rotate (2 * pi / 4) -- 0 degrees is at the top
-- Draw all hour lines.
lines <- array [1..60::Int]
lines # forEach $ \n -> do
    c # save
    c # rotate ((2 * pi / 60) * n)
    renderClockFaceLine $$ (c, u, n)
    c # restore
c # restore -- Undo all the rotation.
The array combinator converts the list into a JavaScript array. The supplied function for the loop body takes the current element as a parameter. In the loop body you can see how the $$-operator is used just as the $-operator in Haskell, to apply a JavaScript function to arguments. As the usefulness of partial application is questionable in the context of deeply embedded JavaScript, we only allow uncurried functions.

Using these techniques we can render the clock with about 90 lines of Haskell.
clockJS :: JS A (JSFunction () ())
clockJS = function $ \() -> do
    -- Renders a single line (with number) of the clock face.
    renderClockFaceLine <- function $
        \(c :: JSCanvas, u :: JSNumber, n :: JSNumber) -> do
            c # save
            -- Draw one of the indicator lines
            c # beginPath
            c # moveTo (0, -u * 1.0)
            ifB (n `mod` 5 ==* 0)
                (c # lineTo (0, -u * 0.8)) -- Minute line
                (c # lineTo (0, -u * 0.9)) -- Hour line
            ifB (n `mod` 15 ==* 0)
                (c # setLineWidth 8) -- Quarter line
                (c # setLineWidth 3) -- Non-Quarter line
            c # stroke
            c # closePath
            -- Draw the hour numbers
            ifB (n `mod` 5 ==* 0)
                (do c # translate (-u * 0.75, 0)
                    c # rotate (-2 * pi / 4)
                    c # fillText (cast $ n `div` 5) (0, 0))
                (return ())
            c # restore
    -- Renders a single clock pointer.
    renderClockPointer <- function $
        \(c :: JSCanvas, u :: JSNumber, angle :: JSNumber,
          width :: JSNumber, len :: JSNumber) -> do
            c # save
            c # setLineCap "round"
            c # rotate angle
            c # setLineWidth width
            c # beginPath
            c # moveTo (0, u * 0.1)
            c # lineTo (0, -u * len)
            c # stroke
            c # closePath
            c # restore
    -- Renders the clock's pointers for hours, minutes and seconds.
    renderClockPointers <- function $ \(c :: JSCanvas, u :: JSNumber) -> do
        (h, m, s) <- currentTime
        c # save
        c # setLineCap "round"
        -- Hour pointer
        renderClockPointer $$
            (c, u, (2 * pi / 12) * ((h `mod` 12) + (m `mod` 60) / 60), 15, 0.4)
        -- Minute pointer
        renderClockPointer $$
            (c, u, (2 * pi / 60) * ((m `mod` 60) + (s `mod` 60) / 60), 10, 0.7)
        -- Second pointer
        c # setStrokeStyle "red"
        renderClockPointer $$ (c, u, (2 * pi / 60) * (s `mod` 60), 4, 0.9)
        -- Restore everything
        c # restore
    -- Renders the complete face of the clock, without pointers.
    renderClockFace <- function $ \(c :: JSCanvas, u :: JSNumber) -> do
        c # save
        c # rotate (2 * pi / 4) -- 0 degrees is at the top
        -- Draw all hour lines.
        lines <- array [1..60::Int]
        lines # forEach $ \n -> do
            c # save
            c # rotate ((2 * pi / 60) * n)
            renderClockFaceLine $$ (c, u, n)
            c # restore
        c # restore -- Undo all the rotation.
    -- Renders the complete clock.
    renderClock <- continuation $ \() -> do
        u <- clockUnit
        (w, h) <- canvasSize
        c <- context
        -- Basic setup
        c # save
        c # setFillStyle "black"
        c # setStrokeStyle "black"
        c # setLineCap "round"
        c # setTextAlign "center"
        c # setFont ((cast $ u * 0.1) <> "px serif")
        c # setTextBaseline "top"
        c # clearRect (0, 0) (w, h)
        c # translate (w / 2, h / 2)
        -- Draw all hour lines.
        renderClockFace $$ (c, u)
        -- Draw the clock pointers
        renderClockPointers $$ (c, u)
        c # restore
        return ()
    window # setInterval (goto renderClock) 1000
    -- and draw one now, rather than wait till later
    goto renderClock ()

    return ()

Using the sunroofCompileJSA function we can compile the deep embedded JavaScript into a string of actual JavaScript.
sunroofCompileJSA def "main" clockJS >>= writeFile "main.js"

The compiled string will contain a function main that executes our JavaScript; this function is then called from the HTML file.

There are a few small utilities used in the code. The current time is obtained by currentTime, which uses the JavaScript date API provided by the module Language.Sunroof.JS.Date.
currentTime :: JS A (JSNumber, JSNumber, JSNumber)
currentTime = do
    date <- newDate ()
    h <- date # getHours
    m <- date # getMinutes
    s <- date # getSeconds
    return (h, m, s)

Note that this will literally copy the JavaScript produced by currentTime to where it is used, because it is not abstracted to a function in JavaScript. Every time you write Sunroof code that is not wrapped in a function, the Haskell binding will work like a macro.

The other helpers are just shortcuts to get certain values:
canvas :: JS A JSObject
canvas = document # getElementById "canvas"

context :: JS A JSCanvas
context = canvas >>= getContext "2d"

clockUnit :: JS A JSNumber
clockUnit = do
    (w, h) <- canvasSize
    return $ (maxB w h) / 2

canvasSize :: JS A (JSNumber, JSNumber)
canvasSize = do
    c <- jQuery "#canvas"
    w <- c # invoke "innerWidth" ()
    h <- c # invoke "innerHeight" ()
    return (w, h)

You can see the clock in action here.

As you can see Sunroof mirrors JavaScript closely, and allows access to the capabilities a browser provides. But is this Haskell for Haskell's sake? We do not think so:

  • Sunroof is a deeply embedded DSL, so it is easy to write functions that generate custom code.

  • Sunroof provides some level of type safety on top of JavaScript, including typed arrays, finite maps, functions and continuations.

  • Sunroof also offers an abstraction over the JavaScript threading model, by providing two types of threads, atomic and (cooperatively) blocking. On top of this, Sunroof provides some Haskell concurrency patterns like MVar or Chan (JSMVar and JSChan).

  • Furthermore, the sunroof-server package offers a ready to use web-server to deploy generated JavaScript on the fly. It enables you to interleave Haskell and JavaScript computations as needed, through synchronous or asynchronous remote procedure calls.


A number of examples and a tutorial are provided on GitHub; their Haskell sources are part of the sunroof-examples package.

Wednesday, April 3, 2013

The Constrained-Type-Class Problem

In Haskell, there are some data types that you want to make an instance of a standard type class, but are unable to do so because of class constraints on the desired class methods. The classic example is that the Set type (from Data.Set) cannot be made an instance of Monad because of an Ord constraint on its desired binding operation:

returnSet :: a -> Set a
returnSet = singleton

bindSet :: Ord b => Set a -> (a -> Set b) -> Set b
bindSet sa k = unions (map k (toList sa))

However, despite being the classic example, in some ways it's not a very good example, because the constraint appears only on the second type parameter of bindSet, not on the first type parameter, nor on returnSet.
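Even without a Monad instance, these two functions already give Set a workable bind. A quick self-contained example of chaining them by hand (the `pairs` example is ours, not from the post):

```haskell
module Main where

import Data.Set (Set)
import qualified Data.Set as Set

returnSet :: a -> Set a
returnSet = Set.singleton

bindSet :: Ord b => Set a -> (a -> Set b) -> Set b
bindSet sa k = Set.unions (map k (Set.toList sa))

-- All pairings of two small sets, written monad-style by hand.
pairs :: Set (Int, Int)
pairs = Set.fromList [1, 2] `bindSet` \x ->
        Set.fromList [3, 4] `bindSet` \y ->
        returnSet (x, y)

main :: IO ()
main = print pairs
```

Running main prints fromList [(1,3),(1,4),(2,3),(2,4)]: exactly the cartesian-product behavior we would expect from a Set monad. What we cannot do is write this with do-notation, because the Ord constraint blocks the Monad instance.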

Another example of the problem also arises in the context of embedded domain-specific languages. When constructing a deep embedding of a computation that will later be compiled, it is often necessary to restrict the involved types to those that can be reified to the target language. For example:

data EDSL :: * -> * where
  Value  :: Reifiable a => a -> EDSL a
  ...
  Return :: Reifiable a => a -> EDSL a
  Bind   :: (Reifiable a, Reifiable b) =>
            EDSL a -> (a -> EDSL b) -> EDSL b

While we can construct a computation using Return and Bind, we cannot declare a Monad instance using those constructors because of the Reifiable class constraint.

(Note: if you want to try out the code in this post, you'll need the following:

{-# LANGUAGE GADTs, MultiParamTypeClasses, KindSignatures,
    ConstraintKinds, TypeFamilies, RankNTypes,
    InstanceSigs, ScopedTypeVariables #-}

import GHC.Exts (Constraint)
import Data.Set hiding (map)

)

Restricted Type Classes


There have been numerous solutions proposed to address this problem. John Hughes suggested extending Haskell with Restricted Data Types: data types with attached class constraints. In the same paper, Hughes also suggested defining Restricted Type Classes: type classes that take a constraint as a parameter and impose it on all polymorphic type variables in the class methods. This latter approach was simulated several times (by Oleg Kiselyov and Ganesh Sittampalam, amongst others), before the constraint-kinds extension made it possible to encode it directly:

class RMonad (c :: * -> Constraint) (m :: * -> *) where
  return :: c a        => a                 -> m a
  (>>=)  :: (c a, c b) => m a -> (a -> m b) -> m b

It is then straightforward to define instances that require class constraints:

instance RMonad Reifiable EDSL where
  return = Return
  (>>=)  = Bind
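The EDSL instance above cannot run without the rest of the deep embedding, but the same class makes Set a restricted monad directly. A self-contained sketch: we hide the Prelude operators to reuse the class's method names, and we add a functional dependency (m -> c) so that uses of return and >>= can infer the constraint; the class as written in the post leaves that off.

```haskell
{-# LANGUAGE ConstraintKinds, KindSignatures, MultiParamTypeClasses,
             FunctionalDependencies #-}
module Main where

import Prelude hiding (return, (>>=))
import GHC.Exts (Constraint)
import Data.Set (Set)
import qualified Data.Set as Set

-- The class from the post, plus a functional dependency (our addition)
-- so the constraint c is determined by the monad m at use sites.
class RMonad (c :: * -> Constraint) (m :: * -> *) | m -> c where
  return :: c a        => a                 -> m a
  (>>=)  :: (c a, c b) => m a -> (a -> m b) -> m b

-- Set becomes a restricted monad with c = Ord.
instance RMonad Ord Set where
  return   = Set.singleton
  sa >>= k = Set.unions (map k (Set.toList sa))

doubled :: Set Int
doubled = Set.fromList [1, 2, 3] >>= \x -> return (x * 10)

main :: IO ()
main = print doubled
```

Running main prints fromList [10,20,30], so the restricted bind behaves as the Set monad should.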

However, restricted type classes are new type classes: using them doesn't allow compatibility with existing type classes. If restricted type classes were already used everywhere instead of the original type classes then there would be no problem, but this is not the case. A variant of restricted type classes (suggested by Orchard and Schrijvers) is to use an associated type function with a default instance:

class Monad (m :: * -> *) where
  type Con m (a :: *) :: Constraint
  type Con m a = ()

  return :: Con m a            => a                 -> m a
  (>>=)  :: (Con m a, Con m b) => m a -> (a -> m b) -> m b

instance Monad EDSL where
  type Con EDSL a = Reifiable a

  return = Return
  (>>=)  = Bind

An attraction of this approach is that this type class could replace the existing Monad class in the standard libraries, without breaking any existing code. EDIT: Edward Kmett points out that this claim is not true (see comment below). Any code that is polymorphic in an arbitrary monad m would be broken, as the unknown constraint Con m will need to be satisfied.

Normality can be Constraining


If we don't want to modify the type class, then the alternative is to modify the data type. Specifically, we need to modify it in such a way that we can declare the type-class instance we want, but such that the operations of that type class will correspond to the operations we desired on the original data type. For monads, one way to do this is to use continuations, as demonstrated by Persson et al. An alternative (and, in our opinion, more intuitive) way to achieve the same effect is to construct a deep embedding of the computation, and restructure it into a normal form. The normal form we use is the same one used by Unimo and Operational, and consists of a sequence of right-nested >>=s terminating with a return:

p1 >>= \x1 -> p2 >>= \x2 -> ... >>= \xn -> return e
The first argument to each >>= is a value of the original data type, which we will call primitive operations (a.k.a. "non-proper morphisms", "effect basis", or "instructions sets").

The key feature of the normal form is that every type either appears as a type parameter on a primitive operation, or appears as the top-level type parameter of the computation. Consequently, if we enforce that all primitives have constrained type parameters, then only the top-level type parameter can remain unconstrained (which is easy to deal with, as we will show later). We can represent this using the following deep embedding:

data NM :: (* -> Constraint) -> (* -> *) -> * -> * where
  Return :: a                             -> NM c t a
  Bind   :: c x => t x -> (x -> NM c t a) -> NM c t a

The t parameter is the type of the primitive operations (e.g. Set), and c is the class constraint (e.g. Ord).

We can define a Monad instance for this deep embedding, which applies the monad laws to restructure the computation into the normal form during construction (just like the Operational package):

instance Monad (NM c t) where
  return :: a -> NM c t a
  return = Return

  (>>=) :: NM c t a -> (a -> NM c t b) -> NM c t b
  (Return a)  >>= k = k a                        -- left identity
  (Bind ta h) >>= k = Bind ta (\ a -> h a >>= k) -- associativity

Primitive operations can be lifted into the NM type by applying the remaining monad law:

liftNM :: c a => t a -> NM c t a
liftNM ta = Bind ta Return -- right identity

Notice that only primitive operations with constrained type parameters can be lifted, thereby preventing any unconstrained types from infiltrating the computation.

Once a computation has been constructed, it can then be interpreted in whatever way is desired. In many cases (e.g. the Set monad), we want to interpret it as the same type as the primitive operations. This can be achieved by the following lowering function, which takes interpretations for return and >>= as arguments.

lowerNM :: forall a c t. (a -> t a) ->
  (forall x. c x => t x -> (x -> t a) -> t a) -> NM c t a -> t a
lowerNM ret bind = lowerNM'
  where
    lowerNM' :: NM c t a -> t a
    lowerNM' (Return a)  = ret a
    lowerNM' (Bind tx k) = bind tx (lowerNM' . k)

Because the top-level type parameter of the computation is visible, we can (crucially) also constrain that type. For example, we can lower a monadic Set computation as follows:

lowerSet :: Ord a => NM Ord Set a -> Set a
lowerSet = lowerNM singleton bindSet

This approach is essentially how the AsMonad transformer from the RMonad library is implemented.
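Putting the pieces together, here is a self-contained sketch of the Set example. The Functor and Applicative instances are additions required by current GHC (the post predates the Applicative-Monad proposal), and bindSet is assumed to be the obvious monadic bind for Set, needing Ord only on the result type:

```haskell
{-# LANGUAGE GADTs, ConstraintKinds, RankNTypes, KindSignatures,
             ScopedTypeVariables #-}

import Data.Kind (Constraint, Type)
import Data.Set (Set)
import qualified Data.Set as Set

-- The deep embedding of the normal form, as in the post.
data NM :: (Type -> Constraint) -> (Type -> Type) -> Type -> Type where
  Return :: a                             -> NM c t a
  Bind   :: c x => t x -> (x -> NM c t a) -> NM c t a

-- Bind applies the associativity and left-identity laws during
-- construction, maintaining the normal form.
bindNM :: NM c t a -> (a -> NM c t b) -> NM c t b
bindNM (Return a)  k = k a
bindNM (Bind ta h) k = Bind ta (\a -> bindNM (h a) k)

instance Functor (NM c t) where
  fmap f m = bindNM m (Return . f)

instance Applicative (NM c t) where
  pure      = Return
  mf <*> ma = bindNM mf (\f -> fmap f ma)

instance Monad (NM c t) where
  (>>=) = bindNM

-- Lifting a primitive applies the right-identity law.
liftNM :: c a => t a -> NM c t a
liftNM ta = Bind ta Return

lowerNM :: forall a c t. (a -> t a)
        -> (forall x. c x => t x -> (x -> t a) -> t a)
        -> NM c t a -> t a
lowerNM ret bnd = go
  where
    go :: NM c t a -> t a
    go (Return a)  = ret a
    go (Bind tx k) = bnd tx (go . k)

-- The obvious bind for Set (an assumption of this sketch):
-- flatten the results of applying k to each element.
bindSet :: Ord b => Set a -> (a -> Set b) -> Set b
bindSet s k = Set.unions [ k x | x <- Set.toList s ]

lowerSet :: Ord a => NM Ord Set a -> Set a
lowerSet = lowerNM Set.singleton bindSet

-- A small computation in the constrained Set monad.
example :: NM Ord Set Int
example = do
  x <- liftNM (Set.fromList [1, 2, 3])
  liftNM (Set.fromList [x, x * 10])
```

Lowering `example` yields the set {1, 2, 3, 10, 20, 30}: the duplicates and ordering are handled by Set, which is only possible because every type in the computation carries an Ord constraint.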

The idea of defining a deep embedding of a normal form that only contains constrained types is not specific to monads: it can be applied to any type class with a normal form in which every type appears either as a parameter of a primitive operation or as the top-level type parameter. We've just written a paper about this, which is available online along with accompanying code. The code for our principal solution is also available on Hackage.

Friday, June 8, 2012

The Kansas University Rewrite Engine (KURE)

The Kansas University Rewrite Engine (KURE) is a Haskell-hosted DSL for strategic programming. We've just released the third version of KURE, which adds lenses for navigation and a variant set of combinators to make change detection easier.

This post just overviews the basics, and gives a simple example of usage.

KURE Basics


KURE is based around the following data type:
data Translate c m a b = Translate {apply :: c -> a -> m b}

translate :: (c -> a -> m b) -> Translate c m a b
translate = Translate

There's a lot of type parameters, but the essential idea is that Translate represents a transformation that can be applied to a value of type a in a context c, and produces a value of type b in the monad m. Actually, we require m to be a MonadPlus, as this allows us to encode notions of success and failure, which are integral to strategic programming. Specifically, mzero represents failure and mplus is a "catch" for both mzero and fail. To avoid clutter we'll omit the class constraints, but just imagine that wherever you see an m there's a (MonadPlus m => ...) to go with it.

We also define a synonym for the special case when the result and argument type coincide:
type Rewrite c m a = Translate c m a a

Translate itself forms a monad (and an arrow, and a bunch of other structures besides), which provides us with a lot of combinators for free. Two key definitions are composition and bind:
(>>>) :: Translate c m a b -> Translate c m b d -> Translate c m a d
t1 >>> t2 = translate $ \ c -> apply t1 c >=> apply t2 c

(>>=) :: Translate c m a b -> (b -> Translate c m a d) -> Translate c m a d
t >>= f = translate $ \ c a -> do b <- apply t c a
                                  apply (f b) c a

Observe the difference: composition takes the result of the first translation as the argument to the second translation, whereas bind uses the result to determine the second translation, but then applies that second translation to the original argument.

Another useful combinator is <+> (from the ArrowPlus class), which acts as a catch for Translate:
(<+>) :: Translate c m a b -> Translate c m a b -> Translate c m a b
t1 <+> t2 = translate $ \ c a -> apply t1 c a `mplus` apply t2 c a

We can now write strategic programming code, such as the classic try combinator:
tryR :: Rewrite c m a -> Rewrite c m a
tryR r = r <+> idR

Where idR is the identity rewrite:
idR :: Rewrite c m a
idR = translate $ \ _ -> return

Finally, one combinator new in this version of KURE is >+>, a sequential composition of rewrites that succeeds provided at least one of the two rewrites succeeds:
(>+>) :: Rewrite c m a -> Rewrite c m a -> Rewrite c m a

Example: Arithmetic Expressions with Fibonacci


Now let's consider an example. Take a data type of arithmetic expressions augmented with a Fibonacci primitive:
data Arith = Lit Int | Add Arith Arith | Sub Arith Arith | Fib Arith

To keep things simple, we'll work with an empty context, and use Maybe as our MonadPlus:
type RewriteA = Rewrite () Maybe Arith

Let's start with some rewrites that perform basic arithmetic simplification:
addLitR :: RewriteA
addLitR = do Add (Lit m) (Lit n) <- idR
             return (Lit (m + n))

subLitR :: RewriteA
subLitR = do Sub (Lit m) (Lit n) <- idR
             return (Lit (m - n))

We're exploiting the fact that Translate is a monad to use do-notation, something we have found extremely convenient. If the pattern match fails, this will just trigger the fail method of the monad, which we can then catch as desired.
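The same mechanism can be seen with plain Maybe (headOf is a helper invented for this illustration, not from KURE): a refutable pattern in do-notation desugars to a call to fail, which for Maybe yields Nothing.

```haskell
-- A failed pattern match in do-notation calls the monad's fail
-- method; Maybe's MonadFail instance returns Nothing.
headOf :: [Int] -> Maybe Int
headOf xs = do (x:_) <- Just xs   -- fails (gives Nothing) on []
               return x
```

So `headOf [5, 6]` is `Just 5`, while `headOf []` is `Nothing`; KURE's rewrites exploit exactly this behaviour, with the failure then caught by combinators such as <+>.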

Using >+>, we can combine these two rewrites into a single rewrite for arithmetic simplification:
arithR :: RewriteA
arithR = addLitR >+> subLitR

Next a more interesting rewrite, unfolding the definition of Fibonacci:
fibLitR :: RewriteA
fibLitR = do Fib (Lit n) <- idR
             case n of
               0 -> return (Lit 0)
               1 -> return (Lit 1)
               _ -> return (Add (Fib (Sub (Lit n) (Lit 1)))
                                (Fib (Sub (Lit n) (Lit 2)))
                           )

Tree Traversals


Thus far, we've only discussed rewrites that apply to the entire data structure we're working with. But a key feature of KURE (and strategic programming) is the ability to traverse a structure applying rewrites to specific locations. For example, the anyR combinator applies a rewrite to each immediate child of a node, succeeding if any of those rewrites succeed:
anyR :: RewriteA -> RewriteA

At first glance this might sound simple, but there are a number of issues. Most notably, what if the children have distinct types from each other and their parent? How should such a combinator be typed? This isn't an issue in this simple Fibonacci example, as there is only one type (Arith), but in general you could have an AST with multiple mutually recursive non-terminals. KURE solves this by constructing a sum data type of all non-terminals in the AST, and having traversal combinators operate over this data type (using Associated Types to specify the sum type for each non-terminal). This is the most significant feature of KURE, but it'd take too long to explain the details here. You can read about it in either of the following papers:

Using the anyR combinator (amongst others), KURE defines various traversal strategies (we just give the specialised types here):
anybuR :: RewriteA -> RewriteA
anybuR r = anyR (anybuR r) >+> r

innermostR :: RewriteA -> RewriteA
innermostR r = anybuR (r >>> tryR (innermostR r))

anybuR traverses a tree in a bottom-up manner, attempting the rewrite at every node, whereas innermostR performs a fixed-point traversal, continuing until no more rewrites can be successfully applied. For example, we can define an evaluator for Arith using this strategy:
evalR :: RewriteA
evalR = innermostR (arithR >+> fibLitR)
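Specialised to Arith with an empty context and Maybe, the pieces above can be assembled into a self-contained sketch. The rewriteA helper, the concrete anyR, and the implementation of >+> are assumptions of this sketch (KURE defines them generically over contexts, monads, and node types), and the rewrites are written with explicit case expressions rather than the Translate monad's do-notation to keep the code short:

```haskell
import Data.Maybe (fromMaybe)

-- The core KURE type, as in the post.
data Translate c m a b = Translate { apply :: c -> a -> m b }

translate :: (c -> a -> m b) -> Translate c m a b
translate = Translate

type Rewrite c m a = Translate c m a a

data Arith = Lit Int | Add Arith Arith | Sub Arith Arith | Fib Arith
  deriving (Eq, Show)

type RewriteA = Rewrite () Maybe Arith

-- Helper for this sketch: build a rewrite from a partial function.
rewriteA :: (Arith -> Maybe Arith) -> RewriteA
rewriteA f = translate (const f)

idR :: RewriteA
idR = rewriteA Just

(<+>) :: RewriteA -> RewriteA -> RewriteA
t1 <+> t2 = translate $ \ c a -> maybe (apply t2 c a) Just (apply t1 c a)

tryR :: RewriteA -> RewriteA
tryR r = r <+> idR

(>>>) :: RewriteA -> RewriteA -> RewriteA
t1 >>> t2 = translate $ \ c a -> apply t1 c a >>= apply t2 c

-- Sequential composition that succeeds if either rewrite succeeds.
(>+>) :: RewriteA -> RewriteA -> RewriteA
r1 >+> r2 = translate $ \ c a ->
  case apply r1 c a of
    Just b  -> Just (fromMaybe b (apply r2 c b))
    Nothing -> apply r2 c a

-- Apply a rewrite to each immediate child; succeed if any child does.
anyR :: RewriteA -> RewriteA
anyR r = translate $ \ c e ->
  let child = apply r c
      any2 con a b = case (child a, child b) of
        (Nothing, Nothing) -> Nothing
        (ma, mb)           -> Just (con (fromMaybe a ma) (fromMaybe b mb))
  in case e of
       Lit _   -> Nothing          -- no children, so anyR fails
       Fib a   -> Fib <$> child a
       Add a b -> any2 Add a b
       Sub a b -> any2 Sub a b

anybuR, innermostR :: RewriteA -> RewriteA
anybuR r     = anyR (anybuR r) >+> r
innermostR r = anybuR (r >>> tryR (innermostR r))

addLitR, subLitR, fibLitR, arithR, evalR :: RewriteA
addLitR = rewriteA $ \ e -> case e of
  Add (Lit m) (Lit n) -> Just (Lit (m + n))
  _                   -> Nothing
subLitR = rewriteA $ \ e -> case e of
  Sub (Lit m) (Lit n) -> Just (Lit (m - n))
  _                   -> Nothing
fibLitR = rewriteA $ \ e -> case e of
  Fib (Lit 0) -> Just (Lit 0)
  Fib (Lit 1) -> Just (Lit 1)
  Fib (Lit n) -> Just (Add (Fib (Sub (Lit n) (Lit 1)))
                           (Fib (Sub (Lit n) (Lit 2))))
  _           -> Nothing
arithR = addLitR >+> subLitR
evalR  = innermostR (arithR >+> fibLitR)
```

Running `apply evalR () (Fib (Lit 5))` repeatedly applies the innermost redex until none remains, reducing the whole tree to `Just (Lit 5)`.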

Release


KURE 2.0.0 is now available on Hackage.

You can find this Fibonacci example, and several others, bundled with the source package. For a non-trivial example, KURE is being used as the underlying rewrite engine for the HERMIT tool. HERMIT hasn't been released yet, but you can read about it in this paper:

Introducing the HERMIT Equational Reasoning Framework

The paper also describes how KURE uses lenses.

Sunday, June 3, 2012

Introducing the HERMIT Equational Reasoning Framework



Imagine you are sitting at your terminal wishing your Haskell program would go faster. The optimization flag has been turned on, and you know of an unimplemented transformation that could help. What do you do? You could add a new optimization pass to GHC, taking part in the dark art of tuning heuristics to allow it to play well with others. Or you could experiment, using HERMIT.


First Example of HERMIT


As an example we use the Fibonacci function, not because it is interesting, but because it is so well known.


module Main where

fib :: Int -> Int
fib n = if n < 2 then 1 else fib (n-1) + fib (n-2)

Compiling with -O2, and using Criterion to average over 100 tests, we observe that fib 35 runs in 124.0ms±2.6ms on our development laptop.


To enable further optimization of fib, we want to try unrolling the recursive calls. Like the GHC optimizer, we want to do this without changing the source, which is clear and concise.


To do so, we fire up HERMIT, choosing to use the command-line interface (because the GUI version is not written yet). HERMIT uses the GHC plugin mechanism to insert itself into the optimization pipeline as a rather non-traditional compiler pass, capturing programs mid-compilation and allowing the user to manipulate them.



ghc -fplugin=Language.HERMIT.Plugin \
    -fplugin-opt=Language.HERMIT.Plugin:main:Main:mode=i \
    -fforce-recomp -O2 -dcore-lint Main.hs
[1 of 1] Compiling Main ( Main.hs, Main.o )
...
module main:Main where
fib :: Int -> Int
...
hermit> _



GHC has compiled our program into its intermediate form, called GHC Core, and HERMIT is asking for input. At this point we can start exploring our captured program. There are several bindings, but let us consider just fib.



hermit> consider 'fib
rec fib = λ n →
      case (<) $fOrdInt n (I# 2) of wild
        False →
          (+) $fNumInt
            (fib ((-) $fNumInt n (I# 1)))
            (fib ((-) $fNumInt n (I# 2)))
        True → I# 1

The consider command moves the focus onto the specified function, and the REPL displays the function's abstract syntax. This is HERMIT's "compact" pretty printer for GHC Core. Core programs necessarily contain a substantial amount of information, and in this render engine HERMIT uses a placeholder symbol for type arguments. Notice that we refer to a Haskell syntactic name using the Template Haskell convention of prefixing it with a tick.


Next we want to unroll fib, to see if this improves performance. Specifically, we want to inline all calls to fib (but nothing else) in a bottom-up manner. HERMIT uses strategic programming as its basis, so we build a strategy using two HERMIT functions.



hermit> any-bu (inline 'fib)
rec fib = λ n →
      case (<) $fOrdInt n (I# 2) of wild
        False →
          (+) $fNumInt
            ((λ n →
                case (<) $fOrdInt n (I# 2) of wild
                  False →
                    (+) $fNumInt
                      (fib ((-) $fNumInt n (I# 1)))
                      (fib ((-) $fNumInt n (I# 2)))
                  True → I# 1)
               ((-) $fNumInt n (I# 1)))
            ((λ n →
                case (<) $fOrdInt n (I# 2) of wild
                  False →
                    (+) $fNumInt
                      (fib ((-) $fNumInt n (I# 1)))
                      (fib ((-) $fNumInt n (I# 2)))
                  True → I# 1)
               ((-) $fNumInt n (I# 2)))
        True → I# 1


Each inner instance of fib has been replaced with its definition, effectively unrolling fib. There are many more simplifications that can be done, such as β-reduction, but GHC is good at this. For now, we just want to test this version; so we type exit and GHC resumes compiling our program.
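At the source level, the effect of the inlining corresponds to something like the following hand-written sketch (fibRef and fibU are names invented here; the nested ifs are what remains after the β-reductions that, as noted above, we leave to GHC):

```haskell
-- The original definition, for reference.
fibRef :: Int -> Int
fibRef n = if n < 2 then 1 else fibRef (n-1) + fibRef (n-2)

-- The unrolled version: each recursive call has been replaced by the
-- body of fib, with the argument substituted and arithmetic simplified.
fibU :: Int -> Int
fibU n = if n < 2 then 1
         else (if n - 1 < 2 then 1 else fibU (n-2) + fibU (n-3))
            + (if n - 2 < 2 then 1 else fibU (n-3) + fibU (n-4))
```

The two definitions compute the same function; the unrolled one halves the number of recursive calls per level, which is the source of the speedup measured below.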


Now we time this new version. Again, taking the average over 100 tests using Criterion, we observe that the post-HERMIT fib 35 runs in about 76.7ms±3.6ms. We have modified GHC's internal representation of our program mid-compilation, improving it such that the resulting program runs measurably faster. Furthermore, we did not need to alter the original program to do so.


More Information


If you are interested in learning more about HERMIT, we've just submitted a paper to the Haskell Symposium, called Introducing the HERMIT Equational Reasoning Framework, written by Andrew Farmer, Ed Komp, Neil Sculthorpe, and myself. A first release of HERMIT will be available via hackage in the near future.


Andy Gill