func makeSynthRules(l lexer.Lexer, synthSymbols map[string]bool, userFn UserFn) ([]Rule, error) {
	var rules []Rule
	if userFn == nil {
		userFn = DefaultUserFn
	}
	for tok := range synthSymbols {
		if !isSynthName(tok) {
			tlog.Panic("should be a synth name but it's not: ", tok)
		}
		lsid, err := l.GetSymbolSet().GetByName(tok)
		if err != nil {
			return rules, err
		}
		end := tok[len(tok)-1:]
		base := tok[:len(tok)-1] // strip the trailing ?, * or +
		rsid, err := rhsCase(l, base)
		if err != nil {
			return rules, err
		}
		switch end {
		case "?": // zero or one
			rules = append(rules, Rule{lsid, []syms.SymbolID{syms.EMPTY}, userFn})
			rules = append(rules, Rule{lsid, []syms.SymbolID{rsid}, userFn})
		case "*": // zero or more
			rules = append(rules, Rule{lsid, []syms.SymbolID{syms.EMPTY}, userFn})
			rules = append(rules, Rule{lsid, []syms.SymbolID{lsid, rsid}, userFn})
		case "+": // one or more
			rules = append(rules, Rule{lsid, []syms.SymbolID{rsid}, userFn})
			rules = append(rules, Rule{lsid, []syms.SymbolID{lsid, rsid}, userFn})
		}
	}
	return rules, nil
}
func (p *Parser) nextToken(lex lexer.Lexer) (lexer.Token, error) {
	tok, err := lex.NextToken()
	if err == io.EOF {
		// Map end of input to the EOF pseudo-terminal instead of an error.
		err = nil
		tok.ID = syms.EOF
	}
	return tok, err
}
func lhsCase(l lexer.Lexer, lhs string) (syms.SymbolID, error) {
	symbols := l.GetSymbolSet()
	if sid, err := symbols.GetByName(lhs); err == nil {
		return sid, nil
	}
	// Unknown symbol: register it as a nonterminal.
	return symbols.Add(lhs, false)
}
func rhsCase(l lexer.Lexer, tok string) (syms.SymbolID, error) {
	if tok == "" {
		return syms.ERROR, errors.New("empty rhs identifier")
	}
	symbols := l.GetSymbolSet()
	if sid, err := symbols.GetByName(tok); err == nil {
		return sid, nil
	}
	lt := len(tok)
	if isIdentName(tok) {
		// Strip the surrounding marker characters and register an ident terminal.
		return l.Ident(tok[1 : lt-1]), nil
	} else if isOperatorName(tok) {
		return l.Operator(tok[1 : lt-1]), nil
	}
	if !tokenRe.MatchString(tok) {
		return syms.ERROR, fmt.Errorf("%s is not a valid rhs identifier", tok)
	}
	return symbols.Add(tok, false)
}
func (p *Parser) Parse(lex lexer.Lexer) (interface{}, error) {
	defer tlog.FuncLog(tlog.Func("Parse"))
	stateStack := []ItemSetID{ItemSetID(0)}
	dataStack := make([]interface{}, 1) // first element stays nil
	tok, err := p.nextToken(lex)
	if err != nil {
		return tok, err
	}
	if p.debug {
		fmt.Println("\nInitial token", tok.String(p.Syms), p.fmtStacks(stateStack, dataStack))
	}
	for {
		topState := stateStack[len(stateStack)-1]
		action, found := p.parsetable[topState][tok.ID]
		if !found {
			// Try an empty transition before declaring a parse error.
			action, found = p.parsetable[topState][syms.EMPTY]
			if found {
				lex.Unread()
				tok = lexer.Token{ID: syms.EMPTY, Str: ""}
			}
		}
		if !found { // parsing error
			tok, stateStack, dataStack, err = p.errorRecovery(lex, tok, stateStack, dataStack)
			if err != nil {
				return dataStack, err
			}
			topState = stateStack[len(stateStack)-1]
			action, found = p.parsetable[topState][tok.ID]
		}
		if action.actiontype == Shift {
			stateStack = append(stateStack, action.state)
			dataStack = append(dataStack, tok)
			if p.debug {
				fmt.Println("\nShifted token", tok.String(p.Syms), p.fmtStacks(stateStack, dataStack))
			}
			tok, err = p.nextToken(lex)
			if err != nil {
				dataStack = append(dataStack, tok)
				return dataStack, err
			}
		} else if action.actiontype == Reduce {
			rule := p.Rules.Get(action.ruleID)
			cutoff := len(stateStack) - len(rule.RHS)
			if cutoff < 0 {
				tlog.Panic("should reduce, but not enough elements on stack.",
					"\nReduce before token ", tok.String(p.Syms),
					"\nRule: ", rule.String(*p.Syms, -1),
					p.fmtStacks(stateStack, dataStack))
			}
			userFn := rule.UserFn
			if userFn == nil {
				userFn = grammar.NilUserFn
			}
			newdata, err := userFn(p.Grammar, rule, dataStack[cutoff:])
			if err != nil {
				return newdata, UserCallbackParseError(fmt.Sprint(
					"user callback error on reduce.",
					"\nReduce before token ", tok.String(p.Syms),
					"\nRule: ", rule.String(*p.Syms, -1),
					p.fmtStacks(stateStack, dataStack)))
			}
			dataStack = dataStack[:cutoff]
			dataStack = append(dataStack, newdata)
			stateStack = stateStack[:cutoff]
			topState = stateStack[len(stateStack)-1]
			if topState == 0 && len(stateStack) == 1 && tok.ID == syms.EOF {
				// Reduced back to the start state with all input consumed: accept.
				if p.debug {
					fmt.Println("\nFinal Reduce, before token ", tok.String(p.Syms),
						"\nRule: ", rule.String(*p.Syms, -1),
						"\nReduced to state:", topState,
						p.fmtStacks(stateStack, dataStack))
				}
				return newdata, nil
			}
			action2, found := p.parsetable[topState][rule.LHS]
			if !found {
				tlog.Panic("internal parser error, no goto state.",
					"\nAfter Reduce, before token ", tok.String(p.Syms),
					"\nRule: ", rule.String(*p.Syms, -1),
					p.fmtStacks(stateStack, dataStack))
			}
			if action2.actiontype != Goto {
				tlog.Panic("internal parser error, expected goto action, instead got: ", action2,
					"\nAfter Reduce, before token ", tok.String(p.Syms),
					"\nRule: ", rule.String(*p.Syms, -1),
					p.fmtStacks(stateStack, dataStack))
			}
			stateStack = append(stateStack, action2.state) // a reduce does not consume the terminal
			if p.debug {
				fmt.Println("\nAfter Reduce, before token: ", tok.String(p.Syms),
					"\nRule: ", rule.String(*p.Syms, -1),
					"\nReduced to state:", topState, "and GOTO state", action2.state,
					p.fmtStacks(stateStack, dataStack))
			}
		}
	}
}
func (p *Parser) errorRecovery(lex lexer.Lexer, tok lexer.Token, stateStack []ItemSetID, dataStack []interface{}) (lexer.Token, []ItemSetID, []interface{}, error) {
	parseErr := ParseError{
		Invalid:  []lexer.Token{tok},
		Location: lex.Location(),
	}
	topState := stateStack[len(stateStack)-1]
	for expectedTok := range p.parsetable[topState] {
		parseErr.Expected = append(parseErr.Expected, p.Syms.Get(expectedTok))
	}
	// Rewind the state stack, searching for a state with an error rule.
	action, found := p.parsetable[topState][syms.ERROR]
	for !(found && action.actiontype == Shift) {
		stateStack = stateStack[:len(stateStack)-1]
		if len(stateStack) == 0 {
			break
		}
		topState = stateStack[len(stateStack)-1]
		action, found = p.parsetable[topState][syms.ERROR]
	}
	// Record the data discarded along with the popped states.
	parseErr.Valid = append(parseErr.Valid, dataStack[len(stateStack):]...)
	dataStack = dataStack[:len(stateStack)]
	if len(stateStack) == 0 {
		err := UnexpectedTerminalParseError("parse error; could not find a suitable error rule")
		dataStack = append(dataStack, parseErr)
		if p.debug {
			fmt.Println("\nParse error, no error recovery possible", p.fmtStacks(stateStack, dataStack))
		}
		return tok, stateStack, dataStack, err
	}
	stateStack = append(stateStack, action.state)
	// Now search for the next action by discarding tokens that do not fit.
	actionMap := p.parsetable[action.state]
	action, found = actionMap[tok.ID]
	parseErr.Invalid = nil
	for !found {
		parseErr.Invalid = append(parseErr.Invalid, tok)
		if tok.ID == syms.EOF {
			err := UnexpectedTerminalParseError("parse error; could not find a suitable error rule before EOF")
			dataStack = append(dataStack, parseErr)
			if p.debug {
				fmt.Println("\nParse error, EOF while recovering",
					"\nRecovery state expected one of:", p.symNames(actionMap),
					p.fmtStacks(stateStack, dataStack))
			}
			return tok, stateStack, dataStack, err
		}
		var err error
		tok, err = p.nextToken(lex)
		if err != nil {
			dataStack = append(dataStack, parseErr)
			return tok, stateStack, dataStack, err
		}
		action, found = actionMap[tok.ID]
	}
	dataStack = append(dataStack, parseErr)
	if p.debug {
		fmt.Println("\nParse error, stack unwound", p.fmtStacks(stateStack, dataStack))
	}
	return tok, stateStack, dataStack, nil
}
func NewGrammarBlock(l lexer.Lexer, ruleBlock string, funcMap map[string]UserFn) (Grammar, error) {
	// defer tlog.FuncLog(tlog.Func("NewGrammarBlock"))
	var rules []Rule
	arrowSep := "→"
	defineSep := "::="
	subruleSep := "|"
	ruleHandlerPrefix := "@"
	ruleList := strings.Split(ruleBlock, "\n")
	synthSymbols := make(map[string]bool)
	var lhsID syms.SymbolID
	var err error
	for lineno, r := range ruleList {
		r = strings.TrimSpace(r)
		if r == "" || strings.HasPrefix(r, "//") {
			continue
		}
		var rhs string
		if lhsID != syms.ERROR && strings.HasPrefix(r, subruleSep) {
			// New subrule: the lhs stays the same.
			rhs = r[len(subruleSep):]
		} else {
			// New rule: split on whichever separator appears first.
			idxArrow := strings.Index(r, arrowSep)
			idxDefine := strings.Index(r, defineSep)
			var lhs string
			if idxArrow < 0 && idxDefine < 0 {
				return Grammar{}, fmt.Errorf("rule on line %d has no '→' or '::=' symbol", lineno+1)
			} else if idxArrow >= 0 && (idxDefine < 0 || idxArrow < idxDefine) {
				lhs = r[:idxArrow]
				rhs = r[idxArrow+len(arrowSep):]
			} else {
				lhs = r[:idxDefine]
				rhs = r[idxDefine+len(defineSep):]
			}
			lhsTrim := strings.TrimSpace(lhs)
			if lhsTrim == "" || !identRe.MatchString(lhsTrim) {
				return Grammar{}, fmt.Errorf("line %d: '%s' is not a valid lhs identifier\n%s", lineno+1, lhs, r)
			}
			lhsID, err = lhsCase(l, lhsTrim)
			if err != nil {
				return Grammar{}, err
			}
		}
		var rhsa []syms.SymbolID
		var funcDecl string
		for _, tok := range strings.Fields(rhs) {
			if tok == subruleSep {
				if len(rhsa) == 0 {
					return Grammar{}, fmt.Errorf("line %d: empty partial rule\n%s", lineno+1, r)
				}
				rules = append(rules, Rule{lhsID, rhsa, funcMap[funcDecl]})
				rhsa = nil
				funcDecl = ""
			} else if strings.HasPrefix(tok, ruleHandlerPrefix) {
				if funcDecl != "" {
					return Grammar{}, fmt.Errorf("line %d: multiple function decls\n%s", lineno+1, r)
				}
				funcDecl = tok[len(ruleHandlerPrefix):]
			} else {
				sid, err := rhsCase(l, tok)
				if err != nil {
					return Grammar{}, fmt.Errorf("line %d: %s\n%s", lineno+1, err, r)
				}
				rhsa = append(rhsa, sid)
				if isSynthName(tok) {
					synthSymbols[tok] = true
				}
			}
		}
		if len(rhsa) == 0 {
			return Grammar{}, fmt.Errorf("line %d: empty rhs\n%s", lineno+1, r)
		}
		rules = append(rules, Rule{lhsID, rhsa, funcMap[funcDecl]})
	}
	newRules, err := makeSynthRules(l, synthSymbols, funcMap["synth"])
	if err != nil {
		return Grammar{}, err
	}
	rules = append(rules, newRules...)
	return NewGrammar(l.GetSymbolSet(), rules)
}