these are some quick notes for my own reference of the peglib readme
- PEG uses both / and |. A/B indicates a prioritized choice, meaning A takes precedence over B. | is only used as a delimiter between terminal character strings
- ", ', and ` are used to encase literals
- the ?, *, and + operators exist and consume maximally (they are greedy)
- . is a wild card
- character classes are denoted by []
- & and ! are syntactic predicates. &A succeeds if pattern A is matchable, !A succeeds if pattern A isn't matchable. Neither one consumes any input, they only look ahead
- () can be used for grouping (embedded options), i.e. (A/B)C matches A C and B C
- <> is used to denote token boundaries, for example if you want to use multiple regexp tokens
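To make the operator semantics concrete, here is a small hand-rolled sketch in plain C++ (it does not use peglib at all). The names `Matcher`, `Pos`, `lit`, `seq`, `choice`, `star`, `and_pred` and `not_pred` are made up for this note; a matcher returns the position after a successful match, or nothing on failure.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper types for this sketch (not part of peglib).
using Pos = std::optional<std::size_t>;
using Matcher = std::function<Pos(std::string_view, std::size_t)>;

// Literal: 'abc'
Matcher lit(std::string s) {
  return [s](std::string_view in, std::size_t pos) -> Pos {
    if (in.substr(pos, s.size()) == s) return pos + s.size();
    return std::nullopt;
  };
}

// Sequence: A B
Matcher seq(Matcher a, Matcher b) {
  return [a, b](std::string_view in, std::size_t pos) -> Pos {
    auto mid = a(in, pos);
    if (!mid) return std::nullopt;
    return b(in, *mid);
  };
}

// Prioritized choice: A / B. B is only tried when A fails at this position.
Matcher choice(Matcher a, Matcher b) {
  return [a, b](std::string_view in, std::size_t pos) -> Pos {
    if (auto r = a(in, pos)) return r;
    return b(in, pos);
  };
}

// Greedy repetition: A*. Consumes as many matches as it can, never gives any back.
Matcher star(Matcher a) {
  return [a](std::string_view in, std::size_t pos) -> Pos {
    while (auto next = a(in, pos)) {
      if (*next == pos) break;  // stop if A matched the empty string
      pos = *next;
    }
    return pos;
  };
}

// And-predicate: &A. Succeeds if A matches, but leaves the position unchanged.
Matcher and_pred(Matcher a) {
  return [a](std::string_view in, std::size_t pos) -> Pos {
    if (a(in, pos)) return pos;
    return std::nullopt;
  };
}

// Not-predicate: !A. Succeeds if A fails, and leaves the position unchanged.
Matcher not_pred(Matcher a) {
  return [a](std::string_view in, std::size_t pos) -> Pos {
    if (a(in, pos)) return std::nullopt;
    return pos;
  };
}
```

With these, `choice(lit("a"), lit("ab"))` on "ab" stops after the first character because the first alternative wins, and `seq(star(lit("a")), lit("a"))` fails on "aaa" because the star has already eaten every a. Both predicates return the position they were given, so nothing is consumed.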
using the library
there is a parser type that takes a grammar string in its constructor, usually written as a raw string literal R"(...)"
every rule consists of a line of the form foo <- bar
then for every variable in the grammar you can set up a lambda
// variable <- A / B
parser["variable"] = [](const SemanticValues& vs) {
    switch (vs.choice()) {
    case 0: // the first alternative, A
        break;
    default: // B
        break;
    }
};
you can set up an error handler with
parser.set_logger([](size_t line, size_t col, const string& msg, const string& rule) { ... });
it will then be called when parsing fails. You can set a specific error message by adding {error_message "foo"} after a rule. error messages have the %t and %c placeholders for the token and the character that the parser fails on.
to make an ast
use
parser.enable_ast();
shared_ptr<peg::Ast> ast;
parser.parse(text, ast);
There is also an ast_to_s(ast) function that renders the tree as a string. By default the ast includes a node for every variable the parser passes through while matching the input. To make the tree more minimal we can use ast = parser.optimize_ast(ast) (it returns the optimized tree), and we can flag certain variables in our grammar not to be optimized out using a {no_ast_opt} flag.
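As a rough mental model of what the optimization does, the sketch below collapses chains of nodes that have exactly one child, unless the node's name is in a keep list (standing in for {no_ast_opt}). This is an assumption about the general idea, not the library's code: `MiniAst`, `make_node` and `collapse` are hypothetical names, and the real optimizer also tracks things like the original rule name.

```cpp
#include <memory>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical, stripped-down stand-in for an AST node (not peg::Ast).
struct MiniAst {
  std::string name;
  std::vector<std::shared_ptr<MiniAst>> nodes;
};

// Convenience constructor for building small trees by hand.
std::shared_ptr<MiniAst> make_node(std::string name,
                                   std::vector<std::shared_ptr<MiniAst>> children = {}) {
  auto n = std::make_shared<MiniAst>();
  n->name = std::move(name);
  n->nodes = std::move(children);
  return n;
}

// Returns a new tree in which every node with exactly one child is replaced
// by that child, except for nodes whose name appears in `keep`.
std::shared_ptr<MiniAst> collapse(const std::shared_ptr<MiniAst>& node,
                                  const std::unordered_set<std::string>& keep) {
  if (node->nodes.size() == 1 && keep.count(node->name) == 0) {
    return collapse(node->nodes[0], keep);
  }
  auto copy = make_node(node->name);
  for (const auto& child : node->nodes) {
    copy->nodes.push_back(collapse(child, keep));
  }
  return copy;
}
```

For a chain like Additive, Multiplicative, Primary, Number, collapsing with an empty keep list leaves just the Number node, while putting "Primary" in the keep list leaves Primary with Number underneath it.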
using the AST
each ast node has the following members:
const std::string path;
const size_t line = 1;
const size_t column = 1;
const std::string name;
size_t position;
size_t length;
const size_t choice_count;
const size_t choice;
const std::string original_name;
const size_t original_choice_count;
const size_t original_choice;
const unsigned int tag;
const unsigned int original_tag;
const bool is_token;
const std::string_view token;
std::vector<std::shared_ptr<AstBase<Annotation>>> nodes;
std::weak_ptr<AstBase<Annotation>> parent;
std::string token_to_string() const {
    assert(is_token);
    return std::string(token);
}
template <typename T> T token_to_number() const {
    return token_to_number_<T>(token);
}
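A typical way to use these members is a recursive walk over nodes, checking is_token to tell leaves from inner nodes. The sketch below uses a hypothetical `AstNode` struct that copies only name, is_token, token and nodes from the list above, so that it compiles without the library. `leaf`, `branch`, `collect_tokens` and `dump` are names invented for this note, and the same pattern should carry over to the real node type.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in with a subset of the members listed above.
struct AstNode {
  std::string name;
  bool is_token = false;
  std::string_view token;
  std::vector<std::shared_ptr<AstNode>> nodes;
};

// Builds a token (leaf) node. The token text must outlive the node.
std::shared_ptr<AstNode> leaf(std::string name, std::string_view token) {
  auto n = std::make_shared<AstNode>();
  n->name = std::move(name);
  n->is_token = true;
  n->token = token;
  return n;
}

// Builds an inner node with the given children.
std::shared_ptr<AstNode> branch(std::string name,
                                std::vector<std::shared_ptr<AstNode>> children) {
  auto n = std::make_shared<AstNode>();
  n->name = std::move(name);
  n->nodes = std::move(children);
  return n;
}

// Collects the text of every token node, left to right.
void collect_tokens(const AstNode& node, std::vector<std::string>& out) {
  if (node.is_token) {
    out.emplace_back(node.token);
    return;
  }
  for (const auto& child : node.nodes) collect_tokens(*child, out);
}

// Renders the tree as indented text, one node per line.
std::string dump(const AstNode& node, int depth = 0) {
  std::string s(static_cast<std::size_t>(depth) * 2, ' ');
  s += node.name;
  if (node.is_token) {
    s += " (" + std::string(node.token) + ")";
  }
  s += "\n";
  for (const auto& child : node.nodes) s += dump(*child, depth + 1);
  return s;
}
```

For example, `dump(*branch("Additive", {leaf("Number", "1"), leaf("Number", "2")}))` gives one line for Additive followed by two indented Number lines with their token text in parentheses.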