Friday, February 10, 2012

Printing Scheme-like expressions using pretty printing combinators

In this post I'm going to show a small printer for Scheme-like expressions using pretty printing combinators in Haskell.

The data type definition for these expressions looks like this:

data Expr = ScSymbol String
            | ScString String
            | ScNumber Integer
            | ScDouble Double 
            | ScCons Expr Expr
            | ScNil
            | ScBool Bool
            | ScQuote Expr
     deriving Show

The pretty printing combinator library is located in the Text.PrettyPrint.HughesPJ package.

import Text.PrettyPrint.HughesPJ 

toStringP :: Expr -> Doc

The Doc data type represents a document to be printed. The definition for printing the atomic expressions looks pretty straightforward:

toStringP (ScSymbol name) = text name
toStringP (ScNumber num) = integer num
toStringP (ScDouble num) = double num
toStringP ScNil = text "()"
toStringP (ScQuote exp) = text "'" <> (toStringP exp)
toStringP (ScBool value) = text $ if value then "#t" else "#f"
toStringP (ScString str) = doubleQuotes $ text str

This code uses the text, integer and double functions to generate documents for each atomic part.

Now, the interesting part is defining the way to print ScCons which represent linked lists. The definition looks like this:

toStringP (ScCons first rest) = parens ( hang (toStringP first) 3  (sep $ toStringPCons rest))
  where
     toStringPCons (ScCons part rest) =  ((toStringP part) : (toStringPCons rest))
     toStringPCons ScNil = []
     toStringPCons other = [(text ".") , (toStringP other) ]

Here we use just three combinator functions:

  1. parens: which creates a document that will be rendered inside parentheses
  2. sep: which combines document parts depending on the context
  3. hang: which will allow us to add a nice effect to composed expressions by nesting the contents when necessary. In this case we specify indentation by 3 characters.

The following function is used to parse the expressions using a parser from a previous post and then print the result with a given style.

    parseAndRender style text =
       case (parseIt text) of
         Right parsed -> putStr $ renderStyle style  $ toStringP parsed
         _ -> putStr "ParseError"

Now we can print an example:

    smallTest style =
      parseAndRender style "(html (head (title \"Hello world\")) (body (p \"Hi\") (ol (li \"First\") (li \"Second\"))))"

We can print using the default style like using the style definition:

*SCPPrint> smallTest style
(html
    (head (title "Hello world"))
    (body (p "Hi") (ol (li "First") (li "Second"))))

The default style definition specifies a line width of 100 characters. We can modify it to be shorter, for example:

*SCPPrint> smallTest style { lineLength =  40 }
(html
    (head
        (title "Hello world"))
    (body
        (p "Hi")
        (ol
            (li "First")
            (li "Second"))))

It is impressive what can be accomplished with such a small number of definitions. For more information about this pretty printing combinator library and its design process see: The Design of a Pretty-printing Library by John Hughes