These are some quick notes, for my own reference, on the peglib README.
- PEG uses both / and |. A / B indicates a prioritized choice, meaning A takes precedence over B. | is only used as a delimiter between terminal character strings.
- ", ' and ` are used to encase literals.
- The ?, *, + operators exist and consume maximally.
- . is a wildcard.
- Character classes are denoted by [].
- & and ! are syntactic predicates. Neither consumes input: &A succeeds if pattern A matches at the current position, and !A succeeds if pattern A does not match there.
- () can be used for grouping, i.e. (A / B) C matches A C and B C.
- <> is used to denote token boundaries, for example if you want to use multiple regexp tokens.
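To make the choice and predicate rules concrete, here is a minimal hand-rolled sketch of a few PEG operators. It does not use peglib at all: the Matcher type and the lit, seq, choice, and_pred and not_pred helpers are hypothetical names made up for these notes. It only illustrates that A / B tries B solely when A fails, and that & and ! never advance the input position.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical matcher type (not part of peglib): given the input and a start
// position, return the position after the match, or std::nullopt on failure.
using Matcher =
    std::function<std::optional<std::size_t>(std::string_view, std::size_t)>;

// Literal: succeeds only if `text` appears at `pos`.
Matcher lit(std::string text) {
    return [text](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        assert(pos <= in.size());
        if (in.substr(pos, text.size()) == text) return pos + text.size();
        return std::nullopt;
    };
}

// Sequence "A B": B starts where A stopped. There is no backtracking into A.
Matcher seq(Matcher a, Matcher b) {
    return [a, b](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        auto mid = a(in, pos);
        if (!mid) return std::nullopt;
        return b(in, *mid);
    };
}

// Prioritized choice "A / B": B is tried only if A fails at this position.
Matcher choice(Matcher a, Matcher b) {
    return [a, b](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        if (auto r = a(in, pos)) return r;
        return b(in, pos);
    };
}

// And-predicate "&A": succeeds if A matches, but leaves the position unchanged.
Matcher and_pred(Matcher a) {
    return [a](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        if (a(in, pos)) return pos;
        return std::nullopt;
    };
}

// Not-predicate "!A": succeeds if A fails, and also leaves the position unchanged.
Matcher not_pred(Matcher a) {
    return [a](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        if (a(in, pos)) return std::nullopt;
        return pos;
    };
}
```

With these helpers, choice(lit("a"), lit("ab")) applied to "ab" stops after one character, because the first alternative wins even though the second would match more. For the same reason seq(choice(lit("a"), lit("ab")), lit("c")) fails on "abc", while swapping the alternatives makes it succeed. Ordering alternatives from most to least specific is the usual way around this.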
using the library
There is a parser type whose constructor takes a grammar string, usually written as a raw string literal R"(...)". Every rule consists of a line of the form foo <- bar.
Then for every variable in the grammar you can set up a lambda that runs when that variable is matched:
// variable <- A / B
parser["variable"] = [](const SemanticValues& vs) {
    switch (vs.choice()) {
    case 0: // the first alternative, A
        break;
    default: // B
        break;
    }
};
you can set up an error handler with
parser.set_logger([](size_t line, size_t col, const string& msg, const string& rule) { ... });
It will then be called when parsing fails. You can set a specific error message by adding { error_message "foo" } after a rule. Error messages support the %t and %c placeholders, which stand for the unexpected token and the unexpected character that the parser failed on.
to make an ast
use
parser.enable_ast();
shared_ptr<peg::Ast> ast;
parser.parse(text, ast);
There is also an ast_to_s(ast) function that renders the tree as a string. By default the AST includes every variable the parser goes through while parsing the provided source text. To make the tree more minimal we can use ast = parser.optimize_ast(ast), which returns the simplified tree, and we can flag certain variables in our grammar not to be optimized out using a {no_ast_opt} flag.
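As a rough picture of what this kind of optimization does, here is a self-contained sketch. It assumes that "redundant" means a node with exactly one child, which gets replaced by that child unless its name is on a keep list (standing in for {no_ast_opt}). Node, make_node and collapse are hypothetical names for these notes, not peglib types or functions, and the real optimizer may differ in detail.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an AST node; this is not peg::Ast.
struct Node {
    std::string name;
    std::string token;                         // non-empty only for leaves
    std::vector<std::shared_ptr<Node>> nodes;  // children
};

// Small helper for building trees by hand.
std::shared_ptr<Node> make_node(std::string name, std::string token = "",
                                std::vector<std::shared_ptr<Node>> nodes = {}) {
    auto n = std::make_shared<Node>();
    n->name = std::move(name);
    n->token = std::move(token);
    n->nodes = std::move(nodes);
    return n;
}

// Returns a simplified copy of the tree. A node with exactly one child is
// replaced by that (recursively simplified) child, unless its name is in
// `keep`, which plays the role of the {no_ast_opt} flag in this sketch.
std::shared_ptr<Node> collapse(const std::shared_ptr<Node>& node,
                               const std::set<std::string>& keep) {
    assert(node != nullptr);
    if (node->nodes.size() == 1 && keep.count(node->name) == 0) {
        return collapse(node->nodes[0], keep);
    }
    auto copy = std::make_shared<Node>();
    copy->name = node->name;
    copy->token = node->token;
    for (const auto& child : node->nodes) {
        copy->nodes.push_back(collapse(child, keep));
    }
    return copy;
}
```

For a chain Expr -> Term -> Number, collapse with an empty keep set returns just the Number leaf. With "Term" in the keep set, Expr is still dropped but Term survives with Number as its only child.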
using the AST
each ast node has the following members:
const std::string path;
const size_t line = 1;
const size_t column = 1;
const std::string name;
size_t position;
size_t length;
const size_t choice_count;
const size_t choice;
const std::string original_name;
const size_t original_choice_count;
const size_t original_choice;
const unsigned int tag;
const unsigned int original_tag;
const bool is_token;
const std::string_view token;
std::vector<std::shared_ptr<AstBase<Annotation>>> nodes;
std::weak_ptr<AstBase<Annotation>> parent;
std::string token_to_string() const {
    assert(is_token);
    return std::string(token);
}
template <typename T> T token_to_number() const {
    return token_to_number_<T>(token);
}
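Most code that uses the AST is a recursive walk over nodes that checks name and is_token and reads token on the leaves. The sketch below shows that pattern on a cut-down, hypothetical Node struct that only mirrors the name, is_token, token and nodes members listed above. It is not peg::Ast, and leaf, branch, dump and sum_numbers are made-up helpers for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, cut-down node carrying only the members used below.
struct Node {
    std::string name;
    bool is_token = false;
    std::string token;
    std::vector<std::shared_ptr<Node>> nodes;
};

// Builds a token (leaf) node.
std::shared_ptr<Node> leaf(std::string name, std::string token) {
    auto n = std::make_shared<Node>();
    n->name = std::move(name);
    n->is_token = true;
    n->token = std::move(token);
    return n;
}

// Builds an inner node with the given children.
std::shared_ptr<Node> branch(std::string name,
                             std::vector<std::shared_ptr<Node>> children) {
    auto n = std::make_shared<Node>();
    n->name = std::move(name);
    n->nodes = std::move(children);
    return n;
}

// Depth-first walk that appends one line per node, indented two spaces per level.
void dump(const Node& node, int depth, std::string& out) {
    assert(depth >= 0);
    out.append(static_cast<std::size_t>(depth) * 2, ' ');
    out += node.name;
    if (node.is_token) out += " (" + node.token + ")";
    out += '\n';
    for (const auto& child : node.nodes) dump(*child, depth + 1, out);
}

// Adds up every token in the subtree, assuming all tokens are decimal integers.
long sum_numbers(const Node& node) {
    if (node.is_token) return std::stol(node.token);
    long total = 0;
    for (const auto& child : node.nodes) total += sum_numbers(*child);
    return total;
}
```

A real evaluator would switch on name (or on tag, which avoids string comparisons) instead of treating every token as a number, but the traversal has the same shape.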