|
Java Parser Combinator (JPC) |
||||||||||||||||||||||
|
||||||||||||||||||||||
These pages written, maintained, and © by Atze Dijkstra. |
What?The package Note: The previous version of the package is based on upon the combinators written for Hugs used in the course Grammatica's en Ontleden, including some of the examples used in the course. It is not supported but left here for the time being. Why?Ever used yacc + lex or equivalents? And thrilled with the speed or integration with the used language (like C, C++, Java) but frustrated by the difficulty of manipulating higher-order concepts like abstract syntax trees? Or you did use a functional programming language like Haskell in which parsercombinators can be used easily to create and manipulate grammars and the parsing of text described by a parser. But you could not find a way to incorporate a parser combinator in a production program written in an imperative language? This package hopes to combine best of both worlds: parser combinators written in Java. Writing a ParserThe basic idea of writing a parser in Java using the parsercombinator package is to write a parser using the errorcorrecting parsercombinators written in Haskell, checking them for correctness and then handcompile the resulting parser to Java. The functional behavior is simulated using the JFC library, existing parsers from the errorcorrecting parsercombinators are found in the JPC library. As an example the code for a (very) simple expression parser is used: module Expr1Simple where import UU_Parsing instance Symbol Char pDigit = (\d -> ord d - ord '0') <$> pAnyOf ['0'..'9'] pNat = foldl (\a b -> a*10 + b) 0 <$> pSome pDigit pParens = pPacked (pSym '(') (pSym ')') pFact = pNat <|> pParens pExpr pTerm = pChainl ( (*) <$ pSym '*' <|> div <$ pSym '/' ) pFact pExpr = pChainl ( (+) <$ pSym '+' <|> (-) <$ pSym '-' ) pTerm t p inp = let (r, rest, errors) = parse p null inp in do putStr (show r) putStr (show errors) t1 = t pExpr "3*4-6" t2 = t pExpr "3**4-6" main :: IO () main = do putStr "Enter expression: " inp <- getLine let (r, rest, errors) = parse pExpr null inp putStr (show r) putStr (show errors) main Code in a functional language consists of writing down function applications. The code for pDigit is handcompiled to something similar: Eval pDigit = UUParsing.pApp.apply2 ( StdFuncs.charToDigit , UUParsing.pAnyOf.apply1( Str.valueOf( "0123456789" ) ) ) ; Each application is written down as an invocation of applyX(). The applyX() returns an Apply object, subclass of Eval. The applyX() is invoked on a predefined Eval object, which may be a Function (subclass of Eval) or another Apply.
The parser pNat, for natural numbers
is defined in a similar way, however, it is passed an explicitly defined
lambda expression. A lambda expression is written down as a subclass of
a Function:
The newly defined function is passed as a parameter to the foldl. The remaining definitions are written down in a similar fashion. The demo applet includes source code of a bit more complicated parser and its translation to Java. The mechanics and internal workings are to be explained in a forthcoming paper. The final version is not yet
available. For the time being one can download or look at the following:
|