Saturday, October 6, 2012

An example use of Haskell Views

In this post I'm going to show how I'm using Haskell Views in a small Scheme implementation to deal with syntactic sugar.

One example of syntactic sugar in Scheme is the way to define functions. You can do it the following way:


(define square (lambda (x) (* x x))
But you can also use the following shortcut:

(define (square x) (* x x))

I'm going to take advantage of Haskell Views to simplify the implementation of this feature. The following code shows the data type representing Scheme code.


       data Expr a where
               ScSymbol :: String -> Expr a
               ScString :: String -> Expr a
               ScNumber :: Integer -> Expr a
               ScDouble :: Double -> Expr a 
               ScCons :: (Expr a) -> (Expr a) -> (Expr a)
               ScNil :: Expr a
               ScBool :: Bool -> Expr a
               ScQuote :: (Expr a) -> Expr a
               ScEnv ::  (Expr a)
               ScChar :: Char -> Expr a
               ScPort :: Handle -> Expr a
               ScClosure :: ScExecutable a =>  [String] -> a -> (Env a) -> Expr a
               ScPrimitive ::  ([Expr a] -> ScInterpreterMonad (Expr a))  -> Expr a

As a side note,I'm using ScCons/ScNil to represent lists which is different from the way "Write Yourself a Scheme in 48 Hours" defines them (a wrapper around a Haskell list). This decision turned out to be incorrect since manipulating a Haskell list is easier.

Given Scheme datatype definition we can represent the following function definition:

(define (id x) x)

As the following structure:


  (ScCons
     (ScSymbol "define")
     (ScCons
        (ScCons
           (ScSymbol "id")   
           (ScCons
              (ScSymbol "x")
              ScNil))
        (ScCons
           (ScSymbol "x")
           ScNil)))

And the following code:

(define id (lambda (x) x))

Using the following structure


  (ScCons
     (ScSymbol "define")
     (ScCons
        (ScSymbol "id")
        (ScCons
           (ScCons
              (ScSymbol "lambda")
              (ScCons
                 (ScCons
                    (ScSymbol "x")
                    ScNil)
                 (ScCons
                    (ScSymbol "x")
                    ScNil)))
           ScNil)))

Using Haskell Views we can create a view that returns the desired parts:


defineView :: (Expr a) -> Maybe (String, Expr a)
defineView  (ScCons 
             (ScSymbol "define")
             (ScCons 
                 name@(ScSymbol symbolToDefine)
                 (ScCons
                    value
                    ScNil))) = Just (symbolToDefine, value)
defineView  (ScCons 
                (ScSymbol "define")
                (ScCons 
                   (ScCons 
                      name@(ScSymbol symbolToDefine)
                      args)
                   (ScCons
                       value
                       ScNil))) = Just (symbolToDefine, 
                                        (ScCons (ScSymbol "lambda") 
                                                (ScCons args 
                                                        (ScCons value ScNil))))
defineView _ = Nothing

As shown above, this view will extract the parts of the original expression or create the necessary elements in the case of the function definition. Also it will fail in case of non supported syntax.

This view is used in the following code:


prepare (defineView -> Just (symbolToDefine, value)) = 
         do
            preparedValue <- prepare value
            return $ ScSeqDefine symbolToDefine preparedValue 

Code for this post is available in the ToyScheme repo.

Saturday, September 22, 2012

A quick look at Haskell Views

Haskell views is a language extension that allows calling functions as part of the pattern matching process. A simple example looks like this:

 
square x = x*x

doSomething :: Integer -> Integer -> Integer
doSomething (square -> squared) other = squared + other

The real power of this feature is that you can combine it with other pattern matching elements. For example, in the following code the 'rgroups' function is used to match a regular expression and return a list with the groups resulting of this matching process. This function is used as part of the 'classify' function.

rgroups :: String -> String -> [String]
rgroups regex text = result
  where
     matchResult :: (String, String, String, [String])
     matchResult = text =~ regex
     (_, _, _, result) = matchResult
   

data Record =
  Person String Integer
  | Company String
  | Unknown String
 deriving Show

classify  ((rgroups "([a-z]*):[ ]*([a-z]*)[ ]*([0-9]+)") -> ["person", personName, age]) =
  Person personName (read age)
classify  ((rgroups "([a-z]*):[ ]*([a-z]*)") -> ["company", personName]) =
  Company personName
classify line =
  Unknown line

We can use this function as follows:


txtlines = ["company:  testcomp",
            "person: someone    30",
            "oasd",
            "person:another    40"]

*Main>  map classify txtlines
[Company "testcomp",Person "someone" 30,Unknown "oasd",Person "another" 40]

Since this is a language extension we need to activate it by using a GHC command line argument or by adding the following code at the top of the file.


{-# LANGUAGE ViewPatterns #-}

More information can be found in the ViewPatterns section of the GHC wiki. This feature is an inspiration for the Scala extractors and F# active patterns which I talked about in previous posts.

Friday, February 10, 2012

Printing Scheme-like expressions using pretty printing combinators

In this post I'm going to show a small printer for Scheme-like expressions using pretty printing combinators in Haskell.

The data type definition for these expressions looks like this:

data Expr = ScSymbol String
            | ScString String
            | ScNumber Integer
            | ScDouble Double 
            | ScCons Expr Expr
            | ScNil
            | ScBool Bool
            | ScQuote Expr
     deriving Show

The pretty printing combinator library is located in the Text.PrettyPrint.HughesPJ package.

import Text.PrettyPrint.HughesPJ 

toStringP :: Expr -> Doc

The Doc data type represents a document to be printed. The definition for printing the atomic expressions looks pretty straightforward:

toStringP (ScSymbol name) = text name
toStringP (ScNumber num) = integer num
toStringP (ScDouble num) = double num
toStringP ScNil = text "()"
toStringP (ScQuote exp) = text "'" <> (toStringP exp)
toStringP (ScBool value) = text $ if value then "#t" else "#f"
toStringP (ScString str) = doubleQuotes $ text str

This code uses the text, integer and double functions to generate documents for each atomic part.

Now, the interesting part is defining the way to print ScCons which represent linked lists. The definition looks like this:

toStringP (ScCons first rest) = parens ( hang (toStringP first) 3  (sep $ toStringPCons rest))
  where
     toStringPCons (ScCons part rest) =  ((toStringP part) : (toStringPCons rest))
     toStringPCons ScNil = []
     toStringPCons other = [(text ".") , (toStringP other) ]

Here we use just three combinator functions:

  1. parens: which creates a document that will be rendered inside parentheses
  2. sep: which combines document parts depending on the context
  3. hang: which will allow us to add a nice effect to composed expressions by nesting the contents when necessary. In this case we specify indentation by 3 characters.

The following function is used to parse the expressions using a parser from a previous post and then print the result with a given style.

    parseAndRender style text =
       case (parseIt text) of
         Right parsed -> putStr $ renderStyle style  $ toStringP parsed
         _ -> putStr "ParseError"

Now we can print an example:

    smallTest style =
      parseAndRender style "(html (head (title \"Hello world\")) (body (p \"Hi\") (ol (li \"First\") (li \"Second\"))))"

We can print using the default style like using the style definition:

*SCPPrint> smallTest style
(html
    (head (title "Hello world"))
    (body (p "Hi") (ol (li "First") (li "Second"))))

The default style definition specifies a line width of 100 characters. We can modify it to be shorter, for example:

*SCPPrint> smallTest style { lineLength =  40 }
(html
    (head
        (title "Hello world"))
    (body
        (p "Hi")
        (ol
            (li "First")
            (li "Second"))))

It is impressive what can be accomplished with such a small number of definitions. For more information about this pretty printing combinator library and its design process see: The Design of a Pretty-printing Library by John Hughes

Tuesday, January 31, 2012

A quick look at Haskell records

While reading the Parsec documentation I learned about a nice technique used to create language definitions. For example:
idSymbol = oneOf ":!$%&*+/<=>?@\\^|-~"

schemeLanguageDef :: LanguageDef st
schemeLanguageDef = emptyDef { 
                   commentLine = ";;"
                   , identStart = letter <|> idSymbol
                   , identLetter = alphaNum <|> idSymbol
                   , opStart = parserZero
                   , opLetter = parserZero
                 }

In this case we are creating a small language definition based on emptyDef. We're saying that we want all what emptyDef has but changing the values of commentLine, identStart, identLetter, opStart and opLetter.

The Text.Parsec.Language package contains several predefined language definitions which you can use to create yours. Some of these definitions are based on each other, for example the haskell98Def and haskellStyle. This is accomplished using Haskell record syntax for defining data types.

Records in Haskell allow the creation of abstract data type definitions that contain named fields . For example, here's a small definition of a data type to store the information of an editor color theme:
data Theme = ColorTheme {
                     keywordsColor :: Color,
                     backgroundColor :: Color,
                     fontSize :: Int,
                     operatorsColor :: Color,
                     literalsColor :: Color
                }
        deriving Show
An instance of this record can be created as follows:
   baseTheme = ColorTheme { keywordsColor = Black,
                            backgroundColor = White,
                            fontSize = 10,
                            operatorsColor = Black,
                            literalsColor = Black }

One nice feature of Haskell records is that it allows the creation of a new record instance based on a existing one. Back to our artificial example, say that we want to create a dark theme which is the same as the base but with colors inverted.
   darkTheme = baseTheme {
                           keywordsColor = White,
                           backgroundColor = Black,
                           operatorsColor = White,
                           literalsColor = White
                         } 
Here we're saying that we want a copy of baseTheme with keywordsColor, backgroundColor, operatorsColor and literalsColor changed to another value.

Pattern matching can be used to extract parts of the record. For example, say that we want to define a function to increase the current font size of a theme:
  increaseFontSize :: Int -> Theme -> Theme
  increaseFontSize amount theme@ColorTheme { fontSize = currentFontSize} =
        theme { fontSize = currentFontSize + amount } 

We can use this definition :

*Tests> increaseFontSize 5 baseTheme
ColorTheme {keywordsColor = Black, backgroundColor = White, fontSize = 15, operatorsColor = Black, literalsColor = Black}
Also accesor functions are defined for each part of the record. For example:
*Tests> backgroundColor baseTheme
White

For future posts I'm going to explore similar features that exist in other programming languages.

Monday, January 23, 2012

Creating a small parser using Parsec

In this post I'm going to show a small parser for a Scheme-like language written in Haskell using the Parsec parsing combinator library.

We're going to start by defining an AST for Scheme-like expressions:

data Expr = ScSymbol String
            | ScString String
            | ScNumber Integer
            | ScDouble Double 
            | ScCons Expr Expr
            | ScNil
            | ScBool Bool
            | ScQuote Expr
     deriving Show

For this experiment I'm going to use the Text.Parsec.Token and Text.Parsec.Language packages which have useful functions for creating language parsers.

idSymbol = oneOf ":!$%&*+/<=>?@\\^|-~"

schemeLanguageDef :: LanguageDef st
schemeLanguageDef = emptyDef { 
                   commentLine = ";;"
                   , identStart = letter <|> idSymbol
                   , identLetter = alphaNum <|> idSymbol
                   , opStart = parserZero
                   , opLetter = parserZero
                 }

schemeTokenParser = makeTokenParser schemeLanguageDef

TokenParser {
   identifier = idParser,
   reservedOp = opParser,
   stringLiteral = stringLiteralParser,
   parens = parParser,
   lexeme = lexParser,
   naturalOrFloat = naturalOrFloatParser
} = schemeTokenParser


Given these definitions we get parsers for identifiers, numbers, string literals "for free".

Boolean literals are parsed using the following parser:

boolLiteral = lexParser (
                 do 
                   char '#'
                   val <- (char 't') <|> (char 'f')
                   return $ ScBool $ val == 't'
             )


Basic quotations are parsed as follows:

quoteParser = lexParser (
              do 
                char '\''
                val <- expressionParser
                return $ ScQuote val
          )


Atomic expressions are parsed as follows:

atomParser =
   (do 
      id <- idParser
      return $ ScSymbol id)
   <|>
    (do 
       fnumber <- naturalOrFloatParser
       return $ case fnumber of
                Left num -> ScNumber num
                Right num -> ScDouble num)
   <|>        
    (do str <- stringLiteralParser
        return $ ScString str)
   <|> boolLiteral
   <|> quoteParser



Next we need to consider S-Expressions. This kind of expressions are combinations of the above elements surrounded by parentheses.

For example:
(+ 1 2 3)

dottedSuffixParser =     
   do 
      dotParser
      finalExpr <- expressionParser
      return finalExpr

parExpressionParser = 
   do (exprs, last) <- parParser 
                (do 
                   seq <- many expressionParser
                   dottedSuffix <- optionMaybe dottedSuffixParser 
                   return (case dottedSuffix of
                            Just lastExpr -> (seq, lastExpr)
                            Nothing -> (seq, ScNil)))
      return $ foldr ScCons last exprs

As shown above, the parser need to verify if there was a dot ('.') before the last element for example:
(1 . 2)
More details on the dot syntax can be found here.
Finally we define expressions:

       expressionParser =
            atomParser <|> parExpressionParser


       parseIt input = parse expressionParser "" input


We can use this function to parse this expressions as follows:

*SCParser> parseIt "(- x 1)"
Right (ScCons (ScSymbol "-") (ScCons (ScSymbol "x") (ScCons (ScNumber 1) ScNil)))


There's a great a series of articles on implementing a Scheme interpreter in Haskell called "Write Yourself a Scheme in 48 Hours/Parsing" which is a great source of information on this topic.

Monday, January 2, 2012

A call-with-current-continuation example

The call-with-current-continuation (or call/cc for short) function in Scheme allows the creation of powerful control structures. It basically allows you to capture the current state of a computation as a first class value. This value could be assigned to a variable and invoked later.

For a great introduction to this function see the call/cc entry in the Community-Scheme-Wiki .

The following toy example shows how this function is used to do an incremental search inside a tree. Say that you have the following tree structure definition:


(define tree-data
'(html
(head (title "test"))
(body
(p "foo" (i "goo"))
(ol
(li "xoo" (i "moo"))
(li (i "loo")))
(p))))


We want to define a function to search for elements identified by a predicate. For example, if looking for a "i" element we can write:


(define (is-italic? element)
(and (list? element)
(not (null? element))
(eq? 'i (car element))))


We want to create a function to perform an incremental search inside a tree structure. That is, we want to find the first element and then request the next element if needed.

Here's a demo of the function we want to create:

> (define hit (search is-italic? tree-data))

> hit
((i "goo") #<continuation>)
> (set! hit (get-next-result hit))
> hit
((i "moo") #<continuation>)
> (set! hit (get-next-result hit))
> hit
((i "loo") #<continuation>)
> (set! hit (get-next-result hit))
> hit
(() #<continuation>)


This function is defined using call-with-current-continuation as follows:

(define (search pred? tree)
(call-with-current-continuation
(lambda (c)
(searching pred? tree c))))

(define (searching pred? tree return)
(set! return
(call-with-current-continuation
(lambda (continuation)
(if (pred? tree)
(return (list tree continuation))
return))))
(if (list? tree)
(for-each
(lambda (item)
(set! return (cadr (searching pred? item return))))
tree))
(list '() return))



The entry point is the search function which captures the current continuation in the return variable. This continuation value allows us to jump to the point where call-with-current-continuation was called. We pass this value to the searching function which contains the logic to search inside the tree.

The first expression of the searching function seems confusing since it contains another call to call-with-current-continuation and an assignment of return. This code is used to do the following things:
  1. Return a search result (remember that return contains a continuation)
  2. Capture the current state of the tree traversal (this is done by the call-with-current-continuation call)



...
(set! return
(call-with-current-continuation
(lambda (continuation)
(if (pred? tree)
(return (list tree continuation))
return))))
...


Notice that a search result contains two things:

  1. The tree fragment that makes the predicate true
  2. The continuation of the point where the element was found (which can be used to restore the traversal)



One key thing to understand is that we change the value of return to the result of the call/cc call. This is done this way because we will change the return continuation to retrieve new results. This is done using the get-next-result function as follows:


(define (get-next-result result)
(if (not (null? (car result)))
(call-with-current-continuation (cadr result))
result))


Calling call-with-current-continuation with the last captured continuation will result on setting it to the return variable in searching.