This is going require some background knowledge. I’m assuming you know what a context-free grammar is and have some understanding of what the job of a parser is. This very specifically is going to be using a simple version of an LR parser I’m going to jump right into this to help explain this.
Start off: Defining Grammar
lunch ::= adjective lunch
lunch ::= food
adjective ::= "tasty" | "delicious" | "savory" | "spicy"
food ::= "hamburger" | "pizza" | "sushi"
These rules are meant to be pretty simple and demonstrate how a bottom-up
parser might work. Let’s start with a string:
spicy savory hamburger and tasty sushi
LR(k) Parsing
An LR parser works by going through with Left to right scanning and Rightmost derivation in reverse. Basically, it has a subset of the list of tokens being used, builds a tree off that, and then after building the tree up as high as possible, another token is added to the tree and the tree is further built. This is continued on until the list of elements is exhausted and tree has built up to a single root element.
The k in LR(k) parsing is for lookahead and the number of tokens which are looked ahead at. Looking ahead can speed up parsing decisions but increases the complexity of the parser. For simplicity, we’re gonna have a lookahead of zero. In other words, this is an LR(0) parser.
Parsing time
* spicy savory hamburger
The little asterisk will represent our dot to which represents where we currently are in the string during our parsing. Everything to left of the string has been gone through while everything to the right is yet to be seen.
Let’s start off with our first token in our string.
spicy * savory hamburger
Notice how we push the dot to the right to indicate that we’ve seen it. Once something is to the left of our parser, we want to update our tree to reflect the new addition.
In this case, spicy is an adjective according to our grammar so we can mark
it as such for our parse tree.
adjective
|
|
spicy * savory hamburger
We can’t simplify the current tree any further so we can now push the dot and include our next token.
adjective
|
|
spicy savory * hamburger
Working our way backwards once again, savory is another adjective.
adjective adjective
\ /
\ /
spicy savory * hamburger
There’s again no way to simplify this any further so we can now push the dot and include our next token.
adjective adjective food
\ / |
\ / |
spicy savory hamburger *
Pushed the dot and we got the hamburger token labeled. However, this time, we
can simplify this. As per our rule lunch ::= food, we can make the following
update to our hamburger:
lunch
|
adjective adjective food
\ / |
\ / |
spicy savory hamburger *
We’re not done just yet simplifying! savory hamburger satisfies the rule of
lunch ::= adjective lunch.
lunch
/ \
/ \
/ lunch
/ \
adjective adjective food
\ / /
\ / /
spicy savory hamburger *
We can use the same rule as before to simplify even further and reach our final result.
lunch
/ \
/ \
/ \
/ \
/ lunch
/ / \
/ / \
/ / lunch
/ / \
adjective adjective food
\ / /
\ / /
spicy savory hamburger *
As you can see, we managed to work our way from the bottom which is our string and all the way up to the root of the parse tree. With this finished parse tree, we can go on to create our stack.