Since I was playing with fparsec last week, I decided to redo (or mostly) the parser for my homebrew language that I’ve previously posted about. Using fparsec made the parser surprisingly succinct and expressive. In fact I was able to do most of this in an afternoon, which is impressive consideringmy last C# attempt took 2 weeks to hammer out.

As always, it starts with the data

  
type Op =  
 | Plus  
 | Minus  
 | GreaterThan  
 | LessThan  
 | Mult  
 | Divide  
 | Carrot

type Ast =  
 | Statement of Ast  
 | Expression of Ex  
 | Function of string option \* Argument list option \* Ast  
 | Scope of Ast list option  
 | Class of Ex \* Ast  
 | Conditional of Ex \* Ast \* Ast option  
 | WhileLoop of Ex \* Ast  
 | ForLoop of Ast \* Ex \* Ex \* Ast  
 | Call of string \* Argument list option  
 | Assign of Ex \* Ex  
and Ex =  
 | Single of Ast  
 | Full of Ex \* Op \* Ex  
 | Float of float  
 | Int of int  
 | Literal of string  
 | Variable of string  
and Argument =  
 | Element of Ex  

Operators

Parsing operators is trivial

  
let plus = pstring "+" \>\>% Plus

let minus = pstring "-" \>\>% Minus

let divide = pstring "/" \>\>% Divide

let mult = pstring "\*" \>\>% Mult

let carrot = pstring "^" \>\>% Carrot

let gt = pstring "\>" \>\>% GreaterThan

let lt = pstring "\<" \>\>% LessThan

let op = spaces \>\>. choice[plus;minus;divide;mult;carrot;gt;lt]  

Expressions

But what was great was parsing expressions. These were complicated because I had to avoid left recursion, and in my C# parser I had a lot of edge conditions and had to deal with special backtracking. It was a nightmare. With FParsec you can create forward recursive parsers, basically you create a dummy variable that you use as the recursive parser. Later you populate a tied reference to it with what are the available recursive parser implementations.

  
// create a forward reference  
// the expr is what we'll use in our parser combinators  
// the exprImpl we'll populate with all the recursive options later  
let expr, exprImpl = createParserForwardedToRef()

let expression1 = spaces \>\>? choice[floatNum;intNum;literal;variable]

let between a b p = pstring a \>\>. p .\>\> pstring b

let bracketExpression = expr |\> between "(" ")"

let lhExpression = choice[expression1; bracketExpression]

let expressionOperation = lhExpression \>\>=? fun operandL -\>  
 op \>\>=? fun operator -\>  
 choice[expr;bracketExpression] \>\>=? fun operandR -\>  
 preturn (operandL, operator, operandR) |\>\> Full

do exprImpl := spaces \>\>. choice[attempt expressionOperation;  
 attempt bracketExpression;  
 expression1]  

expression1 is a type of expression that is just a single element. expressionOperation is an expression that has an operator inbetween two expressions. To avoid left recursion, the left hand side of an expression is limited to either expressions of single elements, or expressions encapsulated in parenthesis. Then the right hand side can be either a parenthesis expression, or a regular expression. You’ll notice that expr isn’t actually defined yet when its used here on line 16. It’s a placeholder for the recursive parser, which is tied to the mutable reference cell (exprImpl) that I populate after I’ve defined all the parsers. This lets you define a parser that can actually call itself recursively! Neat.

The >>=? operator applies the result to the following function, but if it fails, backtracks to the beginning of that parsers state.

Scope

I defined a scope as any valid statements between two curly brackets.

  
let funcInners, funcInnersImpl = createParserForwardedToRef()

let scope = parse{  
 do! spaces  
 do! skipStringCI "{"  
 do! spaces  
 let! text = opt funcInners  
 do! spaces  
 do! skipStringCI "}"  
 do! spaces  
 return Scope(text)  
}  

The forward parser here is going to be populated at the very end of my parser, since it will allow for any kind of statement like while loops, for loops, conditionals, assignment, etc. All the scope parser cares about is that it got stuff between curly brackets. Using this we can leverage it anywhere else that has curly brackets. Later on in the program I set this up:

  
(\* things that can be in functions \*)

do funcInnersImpl := many1 (spaces \>\>? choice [scope; func; statement])  

Which allows scopes, statements (delineated by semicolons), or other functions, to appear inside of a function or scope. So you can see how a scope and function can be recursive (by containing other scopes and functions inside of them).

Anyways, lets parse a function

  
let innerArgs = sepEndBy1 (expr |\>\> Element) (pstring ",")  
let arguments = innerArgs |\> between "(" ")"

let func = parse {  
 do! skipStringCI "func"  
 do! spaces  
 let! name = opt (many1Chars (satisfy isAsciiLetter))  
 let! arguments = opt arguments  
 do! spaces  
 do! skipStringCI "-\>"  
 let! scope = scope  
 return Function(name, arguments, scope)  
}  

Conditionals

Conditionals were fun, because you can have an if statement, an if/else, or an if/elseif, or an if/elseif/…/else combo. In my previous C# parser I covered each type independently so I had a lot of extra overlap, but this time I wanted to see if I could create an aggregate parser combinator to handle all these scenarios for me in one.

  
let conditionalParser, conditionalParserImpl = createParserForwardedToRef()

let ifBlock = parse{  
 do! skipStringCI "if"  
 let! condition = expr |\> between "(" ")"  
 do! spaces  
 let! onTrue = scope  
 do! spaces

let elseKeyword = skipStringCI "else" .\>\> spaces

let elseParse = parse{  
 do! elseKeyword  
 let! onFalse = scope  
 return (condition, onTrue, Some(onFalse)) |\> Conditional  
 }

let elseIfParse = parse{  
 do! elseKeyword  
 let! onFalse = conditionalParser  
 return (condition, onTrue, Some(onFalse)) |\> Conditional  
 }

let noElseParse = parse{  
 return (condition, onTrue, None) |\> Conditional  
 }

let! result = choice[attempt elseIfParse;elseParse;noElseParse]  
 return result  
}  

This time, I created a recursive parser that optionally removes an else statement and captures the scope, or removes the else element and calls back into the if parser, or just terminates (so an if with no else). The final result is a 3 way alternative that the if block can evaluate.

Loops

Here is a while loop

  
let whileLoop = (pstring "while" \>\>. spaces) \>\>. (expr |\> between "(" ")") \>\>= fun predicate -\>  
 scope \>\>= fun body -\>  
 preturn (WhileLoop(predicate, body))  

And here is a for loop

  
let assign = parse{  
 let! ex = expr  
 do! spaces  
 do! skipStringCI "="  
 do! spaces  
 let! assignEx = expr  
 do! spaces  
 return (ex, assignEx) |\> Assign  
}

let forLoop =  
 let startCondition = assign .\>\> pstring ";"  
 let predicate = expr .\>\> pstring ";"  
 let endCondition = expr  
 let forKeyword = pstring "for" .\>\> spaces

let forItems = tuple3 startCondition predicate endCondition |\> between "(" ")"

forKeyword \>\>. forItems .\>\>. scope \>\>= fun ((start, predicate, end), body) -\>  
 preturn (start, predicate, end, body) |\>\> ForLoop  

Which gives you a result like this

  
test "for(x=1;y\<z;y+1){}";;  
val it : Ast list =  
 [ForLoop  
 (Assign (Variable "x",Float 1.0),  
 Full (Variable "y",LessThan,Variable "z"),  
 Full (Variable "y",Plus,Float 1.0),Scope null)]  

Conclusion

My fiance is probably pissed that I spent 4th of july working on parser combinators, but fparsec is just too fun not to. I really can’t wait for an opportunity to use this for some production code, since I’m extremely happy with fparsecs abilities and the experience of working in it.

For the full parser check out this fsharp snippet.