第五部分LL1文法及其分析程序教学课件.ppt

资源描述

《第五部分LL1文法及其分析程序教学课件.ppt》由会员分享，可在线阅读，更多相关《第五部分LL1文法及其分析程序教学课件.ppt（43页珍藏版）》请在三一文库上搜索。

1、1,第五章 LL(1)文法及其分析程序,5.1 预测分析程序 5.2 LL(1)文法 FIRST和FOLLOW集定义和计算 LL(1) 文法定义 LL(1)分析程序的生成 5.3 非LL(1)文法的改造,2,自上而下分析算法,要点： .由根向下构造语法树 .构造最左推导 .推导出的终结符是否与当前输入符匹配 S aaab A B a A,S AB A aA | B b | bB aaab. S AB S AB aAB A aA aaAB A aA aaaAB A aA aaa B A aaab B b,3,带回溯的自上而下分析,S AB A aA | B b | bB a a a b b.

2、S (1) A. S AB (2) aA. A aA (3) aaA. A aA (4) aaaA. A aA (5) aaa B. A (6) aaab B b,aaabb. S (1) A. S AB (2) aA. A aA (3) aaA. A aA (4) aaaA. A aA (5) aaa B A (6) aaa b B B bB (7) aaabb B b,4,预测分析程序Predictive parser无回溯的自顶向下分析程序,特征根据下一个输入符号为当前要处理的非终结符选择产生式要求文法是LL(1)的第一个L 从左到右扫描输入串第二个L 生成的是最左推导 1 向前

3、看一个输入符号（lookahead) 预测分析程序的实现技术 1 递归下降子程序 2 表驱动分析程序,5,PL/0语言的EBNF,程序=分程序. 分程序=常量说明部分变量说明部分过程说明部分语句常量说明部分=CONST常量定义部分，常量定义；变量说明部分=VAR标识符，标识符；过程说明部分= PROCEDURE 标识符分程序；过程说明部分；语句= 标识符：=表达式 |IF 条件 then语句|CALL|READ|BEGIN 语句；语句 END|WHILE|,6,begin(*statement*) if sym=ident then (*parsing ass.st.*) beg

4、in getsym; if sym=becomes then getsym else error(13); expression(fsys); end else if sym=readsym then (* parsing read st.*),begin getsym; if symlparen then error(34) else repeat getsym; if sym ident then error(35) else getsym until symcomma; if symrparen then error(33); end,7,递归下降子程序,program function

5、_list function_list function function_list | function FUNC identifier ( parameter_list ) statement void ParseFunction() MatchToken(T_FUNC); ParseIdentifier(); MatchToken(T_LPAREN); ParseParameterList(); MatchToken(T_RPAREN); ParseStatement(); ,8,void MatchToken(int expected) if (lookahead != expecte

6、d) printf(“syntax error n“); exit(0); else / if match, consume token and move on lookahead = yylex(); ,9,例：递归子程序实现表达式的语法分析,表达式的EBNF 表达式=+|-项（+|-）项项=因子（*|/）因子因子=ident|number|（表达式）,10,procedure expr; begin if sym in plus, minus then begin getsym; term; end else term; while sym in plus, minus do beg

7、in getsym; term; end end;,11,Procedure term; begin factor; while sym in times,slash do begin getsym;factor end end;,12,Procedure factor; begin if sym=ident then getsym else if sym=number then getsym else if sym=( then begin getsym; expr; if sym=) then getsym else error end else error end;,13,表驱动予测分析

8、程序模型,Input,#,总控程序,预测分析表,stack,14,带 a0 a1 a2 a3 a4 a5 a6 a7 a8 an-1 an,有限控制器,磁头,识别程序的数学模型下推自动机,15,上下文无关语言句型分析（识别）程序的数学模型,下推自动机Pda=(K,f,H,h0,S,Z) H:下推栈符号的有穷字母表 h0 :H中的初始符号 f: K （) H K H* Pda的一个组态是K * H 中的一个（k,w,) k:当前状态，w:余留输入串，：栈中符号，最左边的符号在栈顶。Pda的一次移动用组态表示终止和接受的条件: 1.到达输入串结尾时，处在Z中的一个状态或 2.某个动作序列导致

9、栈空时,16,例：Pda P=(A,B,C),a,b,c),f,h,i,i A, ),f(A,a,i) = (B,h) f(B,a,h) = (B,hh) f(C,b,h) = (C, ) f(A,c,i) = (A, ) f(B,c,h) = (C,h) 接受输入串aacbb的过程 (A,aacbb,i) 读a, pop i, push h, goto B (B,acbb,h) 读a, pop h, push hh, goto B (B,cbb,hh) 读c, pop h, push h , goto C (C,bb,hh) 读b, pop h, push , goto C (C,b,h)

10、读b ,pop h, push , goto C (C, , ),17,G E: (1) E TE (2) E +TE (3) E (4) T FT (5) T *FT (6) T (7) F (E) (8) F a,18,G E: (1) E TE (2) E +TE (3) E (4) T FT (5) T *FT (6) T (7) F (E) (8) F a,19,分析算法,BEGIN 首先把#然后把文法开始符号推入栈；把第一个输入符号读进b; FLAG：=TRUE； WHILE FLAG DO BEGIN 把栈顶符号上托出去并放在中； IF X Vt THEN IF X=b THEN

11、把下一个输入符号读进a ELSE ERROR ELSE IF X=# THEN IF X=b THEN FLAG:=FALSE ELSE ERROR ELSE IF X,b=X X1X2XK THEN 把XK，X K-1,X1一一推进栈 ELSE ERROR END OF WHILE; STOP/*分析成功，过程完毕* END,20,分析输入串#a+a#,栈内容栈顶符号当前输入余留串 MX,b 1 #E E a +a# E TE 2 #ET T a +a# T FT 3 #ETF F a +a# F a 4 #ETa a a +a# 5 # ET T + a# T 6 #E E + a

12、# E +TE 7 #ET+ + + a# 8 # ET T a # T FT 9 #ETF F a # F a 10 #ETa a a # 11 #ET T # T 12 #E E # E 13 # # #,21,FIRST集和FOLLOW集的定义设G=(VT,VN,P，S)是上下文无关文法 FIRST（）=a| =* a,aVT, , V* 若 =* 则规定FRIST（） FOLLOW（A）=aS =* A 且a FRIST（）， V*， V+ 若S =* u A ,且 =* ，则#FOLLOW(A),LL (1) 文法,22,计算FIRST集,1.若XV,则FIRST(X)=X 2.若

13、XVN,且有产生式Xa，则把a加入到FIRST(X)中;若X也是一条产生式,则把也加到FIRST(X)中. 3.若XY是一个产生式且YVN,则把FIRST(Y)中的所有非元素都加到FIRST(X)中;若X Y1Y2YK 是一个产生式,Y1,Y2,Y(i-1)都是非终结符,而且,对于任何j,1j i-1, FIRST(Yj)都含有 (即Y1Y(i-1) =* ),则把FIRST(Yj)中的所有非元素都加到FIRST(X)中;特别是,若所有的FIRST(Yj , j=1,2,K)均含有,则把加到FRIST(X)中.,23,计算FOLLOW集,1.对于文法的开始符号S,置#于FOLLOW(S) 中;

14、 2.若 B 是一个产生式,则把 FIRST()加至FOLLOW(B)中; 3.若 B是一个产生式,或 B是一个产生式而 =* (即FIRST()），则把FOLLOW（A）加至FOLLOW（B）中,24,一个文法G是LL（1）的，当且仅当对于G的每一个非终结符的任何两个不同产生式，下面的条件成立： FIRST（）FIRST()=,也就是和推导不出以同一个终结符a为首的符号串；它们不应该都能推出空字 .假若 =* ，那么， FIRST（)FOLLOW（A）. 也就是，若 =* .则所能推出的串的首符号不应在FOLLOW(A）中 ,25,G E: (1) E TE (2) E +TE (3

15、) E (4) T FT (5) T *FT (6) T (7) F (E) (8) F a,各非终结符的FIRST集合如下： FIRST(E)=(,i FIRST(E)=+, FIRST(T)=(,i FIRST(T)=*, FIRST(F)=(,i,各非终结符的FOLLOW集合为： FOLLOW(E)=), FOLLOW(E)=)， FOLLOW(T)=，)， FOLLOW(T)=,)，# FOLLOW(F)=*,，),#,26,G E: (1) E TE (2) E +TE (3) E (4) T FT (5) T *FT (6) T (7) F (E) (8) F a,E +TE |

16、FIRST(+TE)=+ FOLLOW(E)=)， T *FT | FIRST(*FT)=* FOLLOW(T)=+,)， F (E) | a FIRST( (E)=( FIRST( a)=a 所以GE是LL（1）的,27,予测分析表构造算法,1.对文法G的每个产生式执行第二步和第三步； 2.对每个终结符aFIRST()，把加至A,a中， 3.若 FIRST()，则对任何bFOLLOW(A) 把加至A,b中， 4.把所有无定义的A,a标上“出错标志”。可以证明，一个文法G的予测分析表不含多重入口，当且仅当该文法是LL(1)的,28,LL(1)文法的性质： LL(1)文法是无二义的

17、LL(1)文法不含左递归非LL(1)文法的改造消除左递归提左公因子将产生式 | 变换为： B B |,29,EE+TT TT*FF Fi(E) FIRST(E)=(,i FIRST(T)=(,i FIRST(F)=(,i 消左递归 E TE E +TE E ,S if C t S | if C t S e S C b 提左因子 S if C t S A A e S | First集 Follow集 S if #,e A e, #, e C b t MA,e=A e S A ,30,LL(1)分析中的一种错误处理办法,发现错误 1栈顶的终结符与当前输入符不匹配 2非终结符A于栈顶，面临的

18、输入符为a，但分析表M的MA,a为空 “应急”恢复策略跳过输入串中的一些符号直至遇到“同步符号”为止。同步符号的选择 1把FOLLOW(A)中的所有符号作为A的同步符号。跳过输入串中的一些符号直至遇到这些“同步符号”，把A从栈中弹出，可使分析继续 2把FIRST(A)中的符号加到A的同步符号集，当FIRST(A)中的符号在输入中出现时，可根据A恢复分析,31,review-parsing,The syntax analysis phase of a compiler verifies that the sequence of tokens returned from the scanner

19、 represent valid sentences in the grammar of the programming language. There are two major parsing approaches: top-down and bottom-up. In top-down parsing, you start with the start symbol and apply the productions until you arrive at the desired string. In bottom-up parsing, you start with the strin

20、g and reduce it to the start symbol by applying the productions backwards.,32,In the top-down parsing,we begin with the start symbol and at each step, expand one of the remaining nonterminals by replacing it with the right side of one its productions. We repeat until only terminals remain. The top-d

21、own parse prints a leftmost derivation of the sentence.,A bottom-up parse works in reverse. We begin with the sentence of terminals and each step applies a production in reverse, replacing a substring that matches the right side with the nonterminal on the left. We continue until we have substituted

22、 our way back to the start symbol. If you read from the bottom to top, the bottom-up parse prints out a rightmost derivation of the sentence.,33,lookahead symbol The lookahead symbol is the next symbol coming up in the input. backtracking. Based on the information the parser currently has about the

23、input, a decision is made to go with one particular production. If this choice leads to a dead end, the parser would have to backtrack to that decision point, moving backwards through the input, and start again making a different choice and so on until it either found the production that was the app

24、ropriate one or ran out of choices.,34,predictive parser and LL(1)grammar,Predictive parser is a non-backtracking top-down parser. A predictive parser is characterized by its ability to choose the production to apply solely on the basis of the next input symbol and the current nonterminal being proc

25、essed. To enable this, the grammar must take a particular form. We call such a grammar LL(1). The first “L” means we scan the input from left to right; the second “L” means we create a leftmost derivation; and the 1 means one input symbol of lookahead.,35,recursive-descent,The first technique for im

26、plementing a predictive parser is called recursive-descent. A recursive descent parser consists of several small functions(procedures), one for each nonterminal in the grammar. As we parse a sentence, we call the functions (procedures) that correspond to the left side nonterminal of the productions

27、we are applying. If these productions are recursive, we end up calling the functions recursively.,36,Table-driven LL(1) parsing,In a recursive-descent parser, the production information is embedded in the individual parse functions for each nonterminal and the run-time execution stack is keeping tra

28、ck of our progress through the parse. There is another method for implementing a predictive parser that uses a table to store that production along with an explicit stack to keep track of where we are in the parse.,37,How a table-driven predictive parser works,We push the start symbol on the stack a

29、nd read the first input token. As the parser works through the input, there are the following possibilities for the top stack symbol X and the input token nonterminal a: 1. If X = a and a = end of input (#): parser halts and parse completed successfully 2. If X = a and a != #: successful match, pop

30、X and advance to next input token. This is called a match action. 3. If X != a and X is a nonterminal, pop X and consult table at X,a to see which production applies, push right side of production on stack. This is called a predict action. 4. If none of the preceding cases applies or the table entry

31、 from step 3 is blank, there has been a parse error,38,The first set of a sequence of symbols u, written as First(u ) is the set of terminals which start all the sequences of symbols derivable from u. A bit more formally, consider all strings derivable from u by a leftmost derivation. If u =* v , wh

32、ere v begins with some terminal, that terminal is in First(u). If u =* , then is in First(u ).,39,The follow set of a nonterminal A is the set of terminal symbols that can appear immediately to the right of A in a valid sentence. A bit more formally, for every valid sentence S =*uAv , where v begins

33、 with some terminal, that terminal is in Follow(A).,40,Computing first,To calculate First(u) where u has the form X1X2.Xn, do the following: 1. If X1 is a terminal, then add X1 to First(u), otherwise add First(X1) - to First(u ) . 2. If X1 is a nullable nonterminal, i.e., X1 =* , add First(X2) - to

34、First(u). Furthermore, if X2 can also go to , then add First(X3) - and so on, through all Xn until the first nonnullable one. 3. If X1X2.Xn =* , add to the first set.,41,Calculating follow sets. For each nonterminal in the grammar, do the following:,1. Place# in Follow(S) where S is the start symbol

35、 and # is the inputs right endmarker.The endmarker might be end of file, it might be newline, it might be a special symbol, whatever is the expected end of input indication for this grammar. We will typically use # as the endmarker. 2. For every production A uBv where u and v are any string of gramm

36、ar symbols and B is a nonterminal, everything in First(v) except is placed in Follow(B). 3. For every production A uB, or a production A u Bv where First(v ) contains (i.e. v is nullable), then everything in Follow(A) is added to Follow(B).,42,Constructing the parse table,1. For each production A u

37、of the grammar, do steps 2 and 3 2. For each terminal a in First(u), add A u to MA,a 3. If in First(u), (i.e. A is nullable) add A u to MA,b for each terminal b in Follow(A), If in First(u), and # is in Follow(A), add A u to MA,# 4. All undefined entries are errors,43,LL(1) grammar,A grammar G is LL

38、(1) iff whenever A u | v are two distinct productions of G, the following conditions hold: - for no terminal a do both u and v derive strings beginning with a (i.e. first sets are disjoint) - at most one of u and v can derive the empty string - if v =* then v does not derive any string beginning with a terminal in Follow(A),

展开阅读全文