Functional Programming at the University of Kansas: Tutorial

Showing posts with label Tutorial. Show all posts

Sunday, June 1, 2014

Haskell, DLLs, and You

Haskell libraries are suitable for the large majority of a functional programmer's use cases. Sometimes, however, there are needs that Haskell alone cannot fulfill. One common example is interfacing with low-level OS utilities. Since many of these tools are written in C, Haskell provides a useful -XForeignFunctionInterface (FFI) extension to use C functions and data directly. Below is an example file (Main1.hsc) that utilizes the FFI to call srand() and rand() on Linux:

{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types

#include <stdlib.h>

foreign import ccall "srand"
    c_srand :: CUInt -> IO ()

foreign import ccall "rand"
    c_rand :: IO CInt

main :: IO ()
main = readLn >>= c_srand >> c_rand >>= print

Notice that the file extension is .hsc, not .hs. This is because it uses a special #include construct that must be processed by the hsc2hs program (which comes with the Haskell Platform). If you use cabal, this is done automatically. Otherwise, hsc2hs can be invoked like so:

$ hsc2hs Main1.hsc
$ runhaskell Main1.hs
42
71876166
$ runhaskell Main1.hs
27
1416980517

Generally, this process is pretty straightforward for Linux header files. What about other operating systems? One OS in particular, Windows, also provides C headers for its API, but they are much more difficult for hsc2hs to use successfully. Here is a simple example (Main2.hsc) that uses the Windows API function timeGetTime(), which retrieves the current system time in milliseconds:

{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

#include <timeapi.h>

import System.Win32.Types

foreign import ccall "timeGetTime"
    c_timeGetTime :: IO DWORD

main :: IO ()
main = putStrLn . show =<< c_timeGetTime

If you attempt to simply run hsc2hs Main2.hsc, you will get an error resembling Main2.hsc:4:21: fatal error: timeapi.h: No such file or directory. That's not surprising, since we're referencing a Windows API header that isn't shipped with the version of MinGW that the Haskell Platform for Windows uses. To get around this, we'll try including the necessary header location manually (using Cygwin):

$ hsc2hs Main2.hsc -I"C:\Program Files (x86)\Windows Kits\8.1\Include\um"
In file included from Main2.hsc:4:0:
C:\Program Files (x86)\Windows Kits\8.1\Include\um/timeapi.h:17:20: fatal error: apiset.h: No such file or directory
compilation terminated.

Oh no! timeapi.h depends on a header file located in a different directory. That means we should only need to link that additional directory to resolve the issue, right?

$ hsc2hs Main2.hsc -I"C:\Program Files (x86)\Windows Kits\8.1\Include\um" -I"C:\Program Files (x86)\Windows Kits\8.1\Include\shared"
In file included from C:\Program Files (x86)\Windows Kits\8.1\Include\um/timeapi.h:21:0,
                 from Main2.hsc:4:
C:\Program Files (x86)\Windows Kits\8.1\Include\um/mmsyscom.h:94:21: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'MMVERSION'
C:\Program Files (x86)\Windows Kits\8.1\Include\um/mmsyscom.h:98:32: error: expected declaration specifiers or '...' before 'return'
C:\Program Files (x86)\Windows Kits\8.1\Include\um/mmsyscom.h:98:9: error: function definition declared 'typedef'
C:\Program Files (x86)\Windows Kits\8.1\Include\um/mmsyscom.h: In function '_Return_type_success_':
C:\Program Files (x86)\Windows Kits\8.1\Include\um/mmsyscom.h:98:45: error: expected declaration specifiers before 'UINT'
C:\Program Files (x86)\Windows Kits\8.1\Include\um/mmsyscom.h:102:14: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'FAR'
...

Ack! As far as I can tell, Windows-style headers are simply not compatible with MinGW. It is clear that using Windows-style header files directly in hsc2hs is a fool's errand.

What can be done about this? One approach is to compile the code into a dynamically linked library (DLL) and pass it to the Haskell compiler's linker. This has the advantage of not needing the #include construct. As an example, we can create a simple DLL project in Microsoft Visual Studio (I used Visual Studio 2013 to perform the following steps):

Create a new Win32 Console Application project (I will give it the name dll_example). Make sure its application type is "DLL" in the correspding wizard menu. For good measure, check "Empty project" and uncheck "Security Development Lifecycle (SDL) checks".
In the Solution Explorer, right-click on "Header Files" and click Add > New Item. Select "Header file (.h)" and name it dll_example.h.

Use the following code for dll_example.h:

#ifndef DLL_EXAMPLE_H
#define DLL_EXAMPLE_H

#include <windows.h>

#ifdef DLL_EXAMPLE_DLL_EXPORTS
#define DLL_EXAMPLE_DLL_API __declspec(dllexport)
#else
#define DLL_EXAMPLE_DLL_API __declspec(dllimport)
#endif

DLL_EXAMPLE_DLL_API DWORD time_get_time();

#endif

Note that we use windows.h since it automatically brings all of the definitions of timeapi.h into scope, as well as other needed definitions (such as DWORD).

In the Solution Explorer, right-click on "Source Files" and click Add > New Item. Select "C++ File (.cpp)" and name it dll_example.cpp.

Use the following code for dll_example.cpp:

#include "dll_example.h"
#define DLL_EXAMPLE_API
#pragma comment(lib, "Winmm.lib")

DWORD time_get_time() {
 return timeGetTime();
}

In the Solution Explorer, right-click on "Resource Files" and click Add > New Item. In the left sidebar, click Visual C++ > Code, then click "Module-Definition File (.def)" and name it dll_example.def.

Give dll_example.def the following definition:

LIBRARY dll_example
EXPORTS
 time_get_time

Click Build > Build Solution. When it is finished, copy the newly created dll_example.dll (it should be in a directory similar to <project directory>/Debug) to the same directory where Main2.hsc is located.

Now we can dynamically link the DLL file by removing the #include <timeapi.h> line from Main2.hsc entirely, changing "timeGetTime" to "time_get_time", renaming it to Main2.hs, and compiling it like so:

$ ghc --make Main2.hs -L. -lWinmm -ldll_example
[1 of 1] Compiling Main             ( Main2.hs, Main2.o )
Linking Main2.exe ...
$ ./Main2.exe
95495217
$ ./Main2.exe
95496824

Great! We seemed to have successfully used the DLL. But what happens when we attempt to execute Main2.exe independently of dll_example.dll?

$ cp Main2.exe ../Main2.exe
$ cd ..
$ ./Main2.exe
.../Main2.exe:  error while loading shared libraries: ?: cannot open shared object file: No such file or directory

Urgh. As it turns out, Windows needs to look up the dynamically linked libraries every time the executable is run. Notable locations that Windows searches include the executable's directory and the directories defined in the PATH environment variable.

This is rather inconvenient for a Haskell programmer, as cabal installs all of its compiled Haskell executables in %APPDATA%/cabal/bin, far away from the custom DLL files it needs. What is the best solution to this problem? This GHC wiki page suggest several approaches, but since I am a fan of straightforward fixes, I prefer to simply copy the needed DLL files directly to %APPDATA%/cabal/bin. Since that's tedious to do manually, we can configure cabal to automate this process.

During my attempts to get cabal to use DLLs during compilation, I discovered that cabal's extra-lib-dirs field only accepts absolute paths. This is a problem for us, since we need to use a custom DLL file whose location is relative to the package's root directory. (There are claims that you can use ${pkgroot} to retrieve this location, but I was not able to get it to work). This solution should resolve both of the aforementioned cabal issues:

Create a new cabal project (i.e., cabal init). Put all of the Main2.hs in the project, and put dll_example.dll in <package root>/lib.
In Main2.cabal, make sure that extra-source-files includes lib/dll_example.dll, and that extra-libraries includes Winmm and dll_example.

Adapt Setup.hs to use this code:

import Control.Monad

import Debug.Trace

import Distribution.PackageDescription
import Distribution.Simple
import Distribution.Simple.LocalBuildInfo
import Distribution.Simple.Setup

import System.Directory
import System.FilePath

dllFileName :: FilePath
dllFileName = "dll_example" <.> "dll"

dllSourceDir :: IO FilePath
dllSourceDir = do
    curDir <- getCurrentDirectory
    return $ curDir </> "lib"

dllSourcePath :: IO FilePath
dllSourcePath = do
    sourceDir <- dllSourceDir
    return $ sourceDir </> dllFileName

copyDll :: String -> FilePath -> FilePath -> IO ()
copyDll message sourcePath destPath = do
    putStrLn message
    putStr "Copying... "
    copyFile sourcePath destPath
    putStrLn "Done."

patchDesc :: FilePath -> PackageDescription -> PackageDescription
patchDesc sourceDir desc = let Just lib = library desc
                               lbi = libBuildInfo lib
                               newlbi = lbi { extraLibDirs = sourceDir : extraLibDirs lbi }
                           in desc { library = Just $ lib { libBuildInfo = newlbi } }
    
customBuild :: FilePath -> PackageDescription -> LocalBuildInfo -> UserHooks -> BuildFlags -> IO ()
customBuild sourceDir desc linfo hooks flags = do
    let installDir = bindir $ absoluteInstallDirs desc linfo NoCopyDest
        destPath = installDir </> dllFileName
    sourcePath <- dllSourcePath
    dllExists <- doesFileExist destPath
    
    when (not dllExists) $ copyDll (dllFileName ++ " is not in application data.") sourcePath destPath
    
    destTime <- getModificationTime destPath
    sourceTime <- getModificationTime sourcePath
    
    when (destTime < sourceTime) $ copyDll (dllFileName ++ " is out-of-date.") sourcePath destPath
    
    buildHook simpleUserHooks (patchDesc sourceDir desc) linfo hooks flags

customInstall :: FilePath -> PackageDescription -> LocalBuildInfo -> UserHooks -> InstallFlags -> IO ()
customInstall sourceDir desc = instHook simpleUserHooks $ patchDesc sourceDir desc

customPostConf :: FilePath -> Args -> ConfigFlags -> PackageDescription -> LocalBuildInfo -> IO ()
customPostConf sourceDir args conf desc linfo = postConf simpleUserHooks args conf (patchDesc sourceDir desc) linfo

main :: IO ()
main = do
    sourceDir <- dllSourceDir
    defaultMainWithHooks $ simpleUserHooks
        { buildHook = customBuild sourceDir
        , instHook = customInstall sourceDir
        , postConf = customPostConf sourceDir
        }

That code should be sufficient to tell cabal where dll_example.dll is during compilation, where dll_example.dll should be copied to, and when it needs to be copied (i.e., if it doesn't exist or is out-of-date). You should now be able to compile the project simply with cabal install without worrying about ugly GCC flags.

The downside to this approach is now there are extra files to manage if the user ever wants to uninstall Main2. One way to resolve this is to provide users with a makefile containing an uninstall command that automatically removes Main2.exe and dll_example.dll from %APPDATA%/cabal/bin. If you want to see an example of this, check out the hermit-bluetooth repo, the project in which I encountered all of these problems (and motivated me to make this blog post so that maybe I can save other people some time).

Working with DLLs in Haskell tends to be quite gruesome, and I'd recommend avoiding it whenever possible. In same cases, though, dynamic linking is the only feasible solution to one's problems (especially on Windows), so it's nice to know that the infrastructure for interfacing with DLLs exists (even if actually using that interface is a tad unpleasant).

Saturday, May 19, 2012

Monad Reification in Haskell and the Sunroof Javascript compiler

It is possible to reify a monad in Haskell. This took us by surprise; we did not think this was possible. Here is the story of how reification on a monad works.

Reification

Reification is the observation of underlying structure. One reason for reification is to cross-compile a Haskell expression to execute on a different platform. This has been done many times, and a great reference for this is Conal Elliott, Sigbjorn Finne and Oege de Moor's Compiling Embedded Languages

The trick is to build a structure when a direct interpretation would be typically used. A canonical example is arithmetic.

data Expr where
  Lit :: Int          -> Expr
  Add :: Expr -> Expr -> Expr

instance Num Expr where
  e1 + e2 = Add e1 e2
  fromInteger i = Lit (fromInteger i)

Now we do things like

GHCi> 1 + 2
Add (Lit 1) (Lit 2)

and get the structure of the computation of the expression, not just the value. This is called a deeply embedded language; the structure is deeply embedded inside the expression.

From here we can use this structure to, among other things, compile and execute this expression in a different setting. In "Compiling Embedded Languages", the target was executing C graphics code.

Reification of Functions

It is possible to reify a function. The trick here is to provide a prototypical version of the input argument, and observe how it used in the result.

data Expr where
  -- new constructor
  Var :: String       -> Expr
  -- as before
  Lit :: Int          -> Expr
  Add :: Expr -> Expr -> Expr

reifyFn :: (Expr -> Expr) -> Expr
reifyFn fn = fn (Var "x")

Now we can reify the function.

> let f x = x + 2 :: Expr
> reifyFn f
Add (Var "x") (Lit 2)

We use this trick extensively in Kansas Lava, and other DSLs.

Reification of Monads

Providing a deep embedding of a Monad is straightforward. Consider:

data M :: * -> * where
   Return :: a -> M a
   Bind :: M a -> (a -> M b) -> M b
   GetChar :: M Char
   PutChar :: Char -> M ()

The issue is how do we reify Bind? Or more specifically, how do we provide the prototypical variable of type 'a', to reify the 2nd argument of Bind? For a long time, I assumed this was not possible, without a post-hoc constraint on Bind for a type that provided the appropriate Var "..".

However, there is a way of doing this, by normalizing structure of the monad. This trick was introduced by Chuan-kai Lin in Unimo and is used by the Heinrich Apfelmus in his operational hackage package. We work with operational, because (1) Heinrich Apfelmus contacted us, pointing out that his library could simplify our ad-hoc unrolling mechanism, and (2) it was available on hackage.

operational uses the left identity and associativity monad rules to normalize a monadic program into a stream of primitive instructions terminated by a return.

Program ::= Primitive >>= Program
         |  return a

Using operational is easy, you define the primitive(s), and then you can view your program.

import Control.Monad.Operational

data MI :: * -> * where
       GetChar :: MI Char
       PutChar :: Char -> MI ()

-- bind and return provided by operational
type M a = Program MI a

compile :: M a -> ...
compile = eval . view
   where
     eval :: ProgramView MI a -> ...
     eval (PutChar ch :>>= g) = ...
     eval (GetChar :>>= g) = ...
     eval (Return b) = ..

This effectively gives you a deep embedding. Neat stuff.

Sunroof: The Javascript Monad

Javascript implementations on all major browsers provide a powerful API for building interactive web pages. We want to use Javascript libraries, but program in Haskell, by using a Javascript monad.

A usable model can be built using a simple translation of a fixed set of function calls into Javascript commands. With careful construction, we can combine commands before sending them, optimizing network usage. The challenging part is having the Javascript return values in an efficient manner. Consider this Haskell code:

c <- getContext "my-canvas"
... some use of c ...

In a simple transaction model, getContext invokes a Javascript command on the client, returning the response as c. However, we would prefer the whole code fragment to be compiled to Javascript such that the binding and use of c are performed on the client directly, with no intermediate client<->server communication. And thanks to the ideas inside Unimo and operational we can!

We do this by constraining the returned values of all the primitives to be reifiable via a constraint on GADT constructors. In (the simplified version of) our Javascript compiler, Javascript function calls are implemented with the JS_Call primitive.

data JSInst :: * -> * where
  JS_Call     :: (Sunroof a) => String -> [JSValue] -> JSInst a
  ...

This is the key step, the Sunroof constraint provides the ability to generate a prototypical a. The Unimo trick works for constraint types as well as monomorphic types.

So, from our list of primitives, the operational package allows us to build our Javascript monad, with the monad instance for JSM is provided Program.

type JSM a = Program JSInst a

For technical reasons, Program is abstract in operational, so the library provides view to give a normalized form of the monadic code. In the case of JS_Call, bind corresponds to normal sequencing, where the result of the function call is assigned to a variable, whose name has already been passed to the rest of the computation for compilation. newVar, assignVar and showVar are provided by the Sunroof class.

compile :: Sunroof c => JSM c -> CompM String
compile = eval . view
  where
    showArgs :: [JSValue] -> String
    showArgs = intercalate "," . map show
    eval :: Sunroof b
         => ProgramView JSInst b -> CompM String
    eval (JS_Call nm args :>>= g) = do
       a <- newVar
       code <- compile (g a)
       return $ assignVar a ++ nm ++ "("
                ++ showArgs args ++ ");" ++ code
    ...
    eval (Return b) = return $ showVar b

This critically depends on the type-checking extensions used for compiling GADTs, and scales to additional primitives, provided they are constrained on their polymorphic result, like JS_Call.

Using compile, we compile our Sunroof Javascript DSL to Javascript, and now a bind in Haskell results in a value binding in Javascript. A send command compiles the Javascript expressed in monadic form and sends it to the browser for execution.

send :: (Sunroof a) => JSM a -> IO a

The Javascript code then responds with the return value, which can be used as an argument to future calls to send.

We can write a trivial example which draws a circle that follows the mouse:

drawing_app :: Document -> IO ()
drawing_app doc = do
  ...
  send doc $ loop $ do
        event <- waitFor "mousemove"
        let (x,y) = (event ! "x",event ! "y")
        c <- getContext "my-canvas"
        c <$> beginPath()
        c <$> arc(x, y, 20, 0, 2 * pi, false)
        c <$> fillStyle := "#8ED6FF"
        c <$> fill()

The following code is generated by Sunroof (on the Haskell server) and then executed entirely on the client:

var loop0 = function(){
  waitFor("mousemove",function(v1){
    var v2=getContext("my-canvas");
    (v2).beginPath();
    (v2).arc(v1["x"],v1["y"],20,0,2*Math.PI,false);
    (v2).fillStyle = "#8ED6FF";
    (v2).fill();
    loop0();
  })
}; loop0();

Volia! A Haskell-based Javascript monad reified and transmitted to a browser.

Close

This blog article is adapted from the short paper Haskell DSLs for Interactive Web Services, submitted to XLDI 2012, written by Andrew Farmer and myself.

The lesson, I suppose, is never assume something is not possible in Haskell. We only stumbled onto this when we were experimenting with a variant of monadic bind with class constraints, and managed to remove the constraints. We've never seen an example of using operational or Unimo that constraints the primitives to be able to generate specifically a prototypical value, aka the function reification trick above. If anyone has seen this, please point it out, and we'll be happy to cite it.

We would like to thank Heinrich Apfelmus for pointing out that we could rework our compiler to use operational, and providing us with suitable template of its usage.

Let the reification of monads begin!

Andy Gill