220 lines
3.6 KiB
HTML
220 lines
3.6 KiB
HTML
|
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
|
|
<HTML><HEAD><TITLE>Man page of Genlex</TITLE>
|
|
</HEAD><BODY>
|
|
<H1>Genlex</H1>
|
|
Section: OCaml library (3o)<BR>Updated: 2020-01-30<BR><A HREF="#index">Index</A>
|
|
<A HREF="/cgi-bin/man/man2html">Return to Main Contents</A><HR>
|
|
|
|
<A NAME="lbAB"> </A>
|
|
<H2>NAME</H2>
|
|
|
|
Genlex - A generic lexical analyzer.
|
|
<A NAME="lbAC"> </A>
|
|
<H2>Module</H2>
|
|
|
|
Module Genlex
|
|
<A NAME="lbAD"> </A>
|
|
<H2>Documentation</H2>
|
|
|
|
<P>
|
|
Module
|
|
<B>Genlex</B>
|
|
|
|
<BR> :
|
|
<B>sig end</B>
|
|
|
|
<P>
|
|
<P>
|
|
A generic lexical analyzer.
|
|
<P>
|
|
This module implements a simple 'standard' lexical analyzer, presented
|
|
as a function from character streams to token streams. It implements
|
|
roughly the lexical conventions of OCaml, but is parameterized by the
|
|
set of keywords of your language.
|
|
<P>
|
|
Example: a lexer suitable for a desk calculator is obtained by
|
|
<B>let lexer = make_lexer [+;-;*;/;let;=; (; )] </B>
|
|
|
|
<P>
|
|
The associated parser would be a function from
|
|
<B>token stream</B>
|
|
|
|
to, for instance,
|
|
<B>int</B>
|
|
|
|
, and would have rules such as:
|
|
<P>
|
|
<P>
|
|
<B>let rec parse_expr = parser</B>
|
|
|
|
|
|
|
|
<B>| [< n1 = parse_atom; n2 = parse_remainder n1 >] -> n2</B>
|
|
|
|
<B>and parse_atom = parser</B>
|
|
|
|
<B>| [< 'Int n >] -> n</B>
|
|
|
|
<B>| [< 'Kwd (; n = parse_expr; 'Kwd ) >] -> n</B>
|
|
|
|
<B>and parse_remainder n1 = parser</B>
|
|
|
|
<B>| [< 'Kwd +; n2 = parse_expr >] -> n1+n2</B>
|
|
|
|
<B>| [< >] -> n1</B>
|
|
|
|
<B><P>
|
|
</B>
|
|
|
|
One should notice that the use of the
|
|
<B>parser</B>
|
|
|
|
keyword and associated
|
|
notation for streams are only available through camlp4 extensions. This
|
|
means that one has to preprocess its sources e. g. by using the
|
|
<B>-pp</B>
|
|
|
|
command-line switch of the compilers.
|
|
<P>
|
|
<P>
|
|
<P>
|
|
<P>
|
|
<P>
|
|
<I>type token </I>
|
|
|
|
=
|
|
<BR> | Kwd
|
|
<B>of </B>
|
|
|
|
<B>string</B>
|
|
|
|
<BR> | Ident
|
|
<B>of </B>
|
|
|
|
<B>string</B>
|
|
|
|
<BR> | Int
|
|
<B>of </B>
|
|
|
|
<B>int</B>
|
|
|
|
<BR> | Float
|
|
<B>of </B>
|
|
|
|
<B>float</B>
|
|
|
|
<BR> | String
|
|
<B>of </B>
|
|
|
|
<B>string</B>
|
|
|
|
<BR> | Char
|
|
<B>of </B>
|
|
|
|
<B>char</B>
|
|
|
|
<BR>
|
|
<P>
|
|
The type of tokens. The lexical classes are:
|
|
<B>Int</B>
|
|
|
|
and
|
|
<B>Float</B>
|
|
|
|
for integer and floating-point numbers;
|
|
<B>String</B>
|
|
|
|
for
|
|
string literals, enclosed in double quotes;
|
|
<B>Char</B>
|
|
|
|
for
|
|
character literals, enclosed in single quotes;
|
|
<B>Ident</B>
|
|
|
|
for
|
|
identifiers (either sequences of letters, digits, underscores
|
|
and quotes, or sequences of 'operator characters' such as
|
|
<B>+</B>
|
|
|
|
,
|
|
<B>*</B>
|
|
|
|
, etc); and
|
|
<B>Kwd</B>
|
|
|
|
for keywords (either identifiers or
|
|
single 'special characters' such as
|
|
<B>(</B>
|
|
|
|
,
|
|
<B>}</B>
|
|
|
|
, etc).
|
|
<P>
|
|
<P>
|
|
<P>
|
|
<I>val make_lexer </I>
|
|
|
|
:
|
|
<B>string list -> char Stream.t -> token Stream.t</B>
|
|
|
|
<P>
|
|
Construct the lexer function. The first argument is the list of
|
|
keywords. An identifier
|
|
<B>s</B>
|
|
|
|
is returned as
|
|
<B>Kwd s</B>
|
|
|
|
if
|
|
<B>s</B>
|
|
|
|
belongs to this list, and as
|
|
<B>Ident s</B>
|
|
|
|
otherwise.
|
|
A special character
|
|
<B>s</B>
|
|
|
|
is returned as
|
|
<B>Kwd s</B>
|
|
|
|
if
|
|
<B>s</B>
|
|
|
|
belongs to this list, and cause a lexical error (exception
|
|
<B>Stream.Error</B>
|
|
|
|
with the offending lexeme as its parameter) otherwise.
|
|
Blanks and newlines are skipped. Comments delimited by
|
|
<B>(*</B>
|
|
|
|
and
|
|
<B>*)</B>
|
|
|
|
are skipped as well, and can be nested. A
|
|
<B>Stream.Failure</B>
|
|
|
|
exception
|
|
is raised if end of stream is unexpectedly reached.
|
|
<P>
|
|
<P>
|
|
<P>
|
|
|
|
<HR>
|
|
<A NAME="index"> </A><H2>Index</H2>
|
|
<DL>
|
|
<DT id="1"><A HREF="#lbAB">NAME</A><DD>
|
|
<DT id="2"><A HREF="#lbAC">Module</A><DD>
|
|
<DT id="3"><A HREF="#lbAD">Documentation</A><DD>
|
|
</DL>
|
|
<HR>
|
|
This document was created by
|
|
<A HREF="/cgi-bin/man/man2html">man2html</A>,
|
|
using the manual pages.<BR>
|
|
Time: 00:05:44 GMT, March 31, 2021
|
|
</BODY>
|
|
</HTML>
|