SMLYACC(1) | General Commands Manual | SMLYACC(1) |
smlyacc - the parser generator for SML#
smlyacc [-s] [-p output_prefix] filename
SMLYacc is a parser generator in the style of ML-Yacc. It accepts ML-Yacc grammar files, but the generated programs and their usage are not compatible with those of ML-Yacc. The generated programs can be compiled by the SML# compiler.
By default, for an input file X.grm, smlyacc generates X.grm.sml containing the generated parser, X.grm.sig containing the signature of tokens, and optionally X.grm.desc describing the LALR parser construction. To compile the generated program with SML#, you need to write an interface file X.grm.smi yourself, following the generated signature X.grm.sig.
- -s
- Insert the token signature at the beginning of the generated .sml file, instead of a separate .sig file.
- -p output_prefix
- Set the prefix of the output file names. When output_prefix is set to X, smlyacc generates X.sml, X.sig, and X.desc. The default is the same as the input file name.
The following is a minimal example of an input file ex.grm:
%%
%term LPAREN | RPAREN | EOF
%nonterm start of word | exp of word
%pos int
%eop EOF
%name Example
%%
start : exp                    (exp)
exp :                          (0w0)
    | LPAREN exp RPAREN exp    (exp1 + exp2)
By running smlyacc on this file,
smlyacc ex.grm
you obtain two files, ex.grm.sml and ex.grm.sig. Only ex.grm.sml needs to be compiled. To compile it, you need to create the following ex.grm.smi file yourself:
_require "basis.smi"
_require local "ml-yacc-lib.smi"
_require local "./ex.grm.sig"
structure ExampleLrVals =
struct
  structure Parser =
  struct
    type token (= boxed)
    type stream (= ref)
    type result = word
    type pos = int
    type arg = unit
    val makeStream : {lexer : unit -> token} -> stream
    val consStream : token * stream -> stream
    val getStream : stream -> token * stream
    val sameToken : token * token -> bool
    val parse : {lookahead : int,
                 stream : stream,
                 error : string * pos * pos -> unit,
                 arg : arg}
                -> result * stream
  end
  structure Tokens =
  struct
    type pos = Parser.pos
    type token = Parser.token
    val EOF: pos * pos -> token
    val RPAREN: pos * pos -> token
    val LPAREN: pos * pos -> token
  end
end
The types of the token constructors (EOF, RPAREN, and LPAREN) are copied by hand from the generated signature file ex.grm.sig.
The parse function in the generated program is the parser. To invoke it, an imperative lexer function of type unit -> token is needed. When SMLYacc is combined with SMLLex, the lexer is generated by SMLLex. Suppose that SMLLex generates a lexer with the following interface:
structure ExampleLex =
struct
  exception LexError
  val makeLexer : (int -> string)
                  -> unit -> ExampleLrVals.Parser.token
end
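The parser only requires some imperative function of type unit -> token, so a hand-written lexer can stand in for an SMLLex-generated one. The following is a minimal sketch of that idea, not part of SMLYacc: makeLexer and its record argument are hypothetical names, the token constructors are passed in as parameters so that the block is self-contained, and positions are assumed to be character offsets into the input string.

```sml
(* Hypothetical hand-written lexer for the LPAREN/RPAREN/EOF token set.
   The three constructors are parameters so this compiles on its own;
   each takes a (left, right) pair of integer positions. *)
fun makeLexer {lparen, rparen, eof} (input : string) =
  let
    (* Mutable cursor: this is what makes the lexer imperative. *)
    val pos = ref 0
    fun next () =
      if !pos >= size input then eof (!pos, !pos)
      else
        let
          val p = !pos
          val c = String.sub (input, p)
        in
          pos := p + 1;
          case c of
            #"(" => lparen (p, p + 1)
          | #")" => rparen (p, p + 1)
          | _ => next ()   (* skip any other character *)
        end
  in
    next
  end
```

With the interface file shown earlier, the record fields would be filled with ExampleLrVals.Tokens.LPAREN, ExampleLrVals.Tokens.RPAREN, and ExampleLrVals.Tokens.EOF, and the resulting function passed as the lexer field of makeStream.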
Typical code joining SMLLex and SMLYacc looks like the following:
fun inputN n = TextIO.inputN (instream, n)
val lexer = ExampleLex.makeLexer inputN
val stream = ExampleLrVals.Parser.makeStream {lexer = lexer}
val (result, stream) =
    ExampleLrVals.Parser.parse
      {lookahead = 0, stream = stream,
       error = errorFn, arg = parserArg}
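The code above leaves errorFn and parserArg undefined. One possible definition, given as a sketch under assumptions: the error callback has type string * pos * pos -> unit with pos = int, and arg = unit, as declared in the interface file. The helper formatError is a hypothetical name, and reading the two positions as the left and right ends of the offending token is assumed from ML-Yacc convention.

```sml
(* Hypothetical helper: render an error message with its position range.
   The two integers are assumed to be left and right positions. *)
fun formatError (msg : string, left : int, right : int) : string =
  "parse error at " ^ Int.toString left ^ "-" ^ Int.toString right
  ^ ": " ^ msg

(* Matches error : string * pos * pos -> unit, with pos = int. *)
fun errorFn (msg : string, left : int, right : int) : unit =
  TextIO.output (TextIO.stdErr, formatError (msg, left, right) ^ "\n")

(* The example grammar declares no %arg, so arg is unit. *)
val parserArg = ()
```

Keeping the formatting separate from the output makes it possible to redirect diagnostics elsewhere, for example by collecting them in a list instead of writing to standard error.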
SMLYacc is a derivative of ML-Yacc, which was originally developed by David R. Tarditi and Andrew W. Appel. When ML-Yacc was ported to SML#, the source code was restructured to replace functor applications with SML#'s separate compilation and linking. See the SML# document for major changes from the original ML-Yacc.
ML-Yacc User's Manual, available at
SML# Document, available at