Compiler Principles: An ANTLR Tutorial

Published: 2023/12/29

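I. Installing and Configuring ANTLR

Before installing and configuring ANTLR, make sure a Java environment is already installed.

1. Download ANTLR 4

Download page: https://www.antlr.org/download/

Under "Tool and Java runtime lib", download antlr-4.7.2-complete.jar.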

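2. Create the batch files

In the directory containing antlr-4.7.2-complete.jar, create two .bat files: antlr4.bat and grun.bat.

The directory then holds three files: antlr-4.7.2-complete.jar, antlr4.bat, and grun.bat.

In antlr4.bat, write:

java org.antlr.v4.Tool %*

In grun.bat, write:

java org.antlr.v4.gui.TestRig %*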

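3. Set the environment variables

Steps (Windows 10): Settings -> System -> About -> Advanced system settings (top right) -> Environment Variables -> System variables.

In the system variable CLASSPATH, add the full path of antlr-4.7.2-complete.jar (the jar file itself, not only its folder). Keep . (the current directory) in CLASSPATH as well, so that the classes you compile later can be found. To call antlr4 and grun from any directory, the folder containing the two .bat files also has to be on Path.

The ANTLR environment is now configured.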
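A quick way to confirm the CLASSPATH setting is to ask the JVM whether it can find the two classes that antlr4.bat and grun.bat launch. The following is a minimal sketch using only the Java standard library; the class name ClasspathCheck and the method isLoadable are hypothetical names chosen for this illustration.

```java
// Sketch: checks whether given classes can be located on the current CLASSPATH.
final class ClasspathCheck {

    /** Returns true if the named class can be found by the current class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The two entry points used by antlr4.bat and grun.bat.
        String[] required = { "org.antlr.v4.Tool", "org.antlr.v4.gui.TestRig" };
        for (String name : required) {
            String status = isLoadable(name) ? "found" : "NOT found, check CLASSPATH";
            System.out.println(name + " : " + status);
        }
    }
}
```

Compile it with javac ClasspathCheck.java and run it with java ClasspathCheck from a newly opened command prompt, so that the updated environment variables are picked up.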

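II. Using ANTLR

1. Write the .g4 file

The .g4 file is what ANTLR generates the lexer rules and parser rules from; it is the written form of the language's grammar. A complete grammar is the foundation of the entire compiler course experiment.

Below is the C grammar file used in my experiment, named MyCGrammer.g4.

It is adapted from an existing C grammar, as the license header at the top of the file indicates.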

/*
 [The "BSD licence"]
 Copyright (c) 2013 Sam Harwell
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
 1. Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
 3. The name of the author may not be used to endorse or promote products
    derived from this software without specific prior written permission.

 THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
 IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
 INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
 THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

/** C 2011 grammar built from the C11 Spec */
grammar MyCGrammer;

primaryExpression
    : tokenId //Identifier
    | tokenConstant //Constant
    | tokenStringLiteral //StringLiteral+
    | '(' expression ')'
    | genericSelection
    | '__extension__'? '(' compoundStatement ')' // Blocks (GCC extension)
    | '__builtin_va_arg' '(' unaryExpression ',' typeName ')'
    | '__builtin_offsetof' '(' typeName ',' unaryExpression ')'
    ;

tokenId : Identifier;
tokenConstant : Constant;
tokenStringLiteral : StringLiteral+;

genericSelection
    : '_Generic' '(' assignmentExpression ',' genericAssocList ')'
    ;

genericAssocList
    : genericAssociation
    | genericAssocList ',' genericAssociation
    ;

genericAssociation
    : typeName ':' assignmentExpression
    | 'default' ':' assignmentExpression
    ;

postfixExpression
    : primaryExpression #postfixExpression_pass
    | postfixExpression '[' expression ']' #postfixExpression_arrayaccess
    | postfixExpression '(' argumentExpressionList? ')' #postfixExpression_funcall
    | postfixExpression '.' Identifier #postfixExpression_member
    | postfixExpression '->' Identifier #postfixExpression_point
    | postfixExpression '++' #postfixExpression_
    | postfixExpression '--' #postfixExpression_
    | '(' typeName ')' '{' initializerList '}' #postfixExpression_pass
    | '(' typeName ')' '{' initializerList ',' '}' #postfixExpression_pass
    | '__extension__' '(' typeName ')' '{' initializerList '}' #postfixExpression_pass
    | '__extension__' '(' typeName ')' '{' initializerList ',' '}' #postfixExpression_pass
    ;

argumentExpressionList
    : assignmentExpression
    | argumentExpressionList ',' assignmentExpression
    ;

unaryExpression
    : postfixExpression #unaryExpression_pass
    | '++' unaryExpression #unaryExpression_
    | '--' unaryExpression #unaryExpression_
    | unaryOperator castExpression #unaryExpression_
    | 'sizeof' unaryExpression #unaryExpression_pass
    | 'sizeof' '(' typeName ')' #unaryExpression_pass
    | '_Alignof' '(' typeName ')' #unaryExpression_pass
    | '&&' Identifier #unaryExpression_pass
    ;

unaryOperator
    : '&' | '*' | '+' | '-' | '~' | '!'
    ;

castExpression
    : unaryExpression #castExpression_pass
    | '(' typeName ')' castExpression #castExpression_
    | '__extension__' '(' typeName ')' castExpression #castExpression_
    ;

multiplicativeExpression
    : castExpression #multiplicativeExpression_pass
    | multiplicativeExpression '*' castExpression #multiplicativeExpression_
    | multiplicativeExpression '/' castExpression #multiplicativeExpression_
    | multiplicativeExpression '%' castExpression #multiplicativeExpression_
    ;

additiveExpression
    : multiplicativeExpression #additiveExpression_pass
    | additiveExpression '+' multiplicativeExpression #additiveExpression_
    | additiveExpression '-' multiplicativeExpression #additiveExpression_
    ;

shiftExpression
    : additiveExpression #shiftExpression_pass
    | shiftExpression '<<' additiveExpression #shiftExpression_
    | shiftExpression '>>' additiveExpression #shiftExpression_
    ;

relationalExpression
    : shiftExpression #relationalExpression_pass
    | relationalExpression '<' shiftExpression #relationalExpression_
    | relationalExpression '>' shiftExpression #relationalExpression_
    | relationalExpression '<=' shiftExpression #relationalExpression_
    | relationalExpression '>=' shiftExpression #relationalExpression_
    ;

equalityExpression
    : relationalExpression #equalityExpression_pass
    | equalityExpression '==' relationalExpression #equalityExpression_
    | equalityExpression '!=' relationalExpression #equalityExpression_
    ;

andExpression
    : equalityExpression #andExpression_pass
    | andExpression '&' equalityExpression #andExpression_
    ;

exclusiveOrExpression
    : andExpression #exclusiveOrExpression_pass
    | exclusiveOrExpression '^' andExpression #exclusiveOrExpression_
    ;

inclusiveOrExpression
    : exclusiveOrExpression #inclusiveOrExpression_pass
    | inclusiveOrExpression '|' exclusiveOrExpression #inclusiveOrExpression_
    ;

logicalAndExpression
    : inclusiveOrExpression #logicalAndExpression_pass
    | logicalAndExpression '&&' inclusiveOrExpression #logicalAndExpression_
    ;

logicalOrExpression
    : logicalAndExpression #logicalOrExpression_pass
    | logicalOrExpression '||' logicalAndExpression #logicalOrExpression_
    ;

conditionalExpression
    : logicalOrExpression ('?' expression ':' conditionalExpression)?
    ;

assignmentExpression
    : conditionalExpression #assignmentExpression_pass
    | unaryExpression assignmentOperator assignmentExpression #assignmentExpression_
    ;

assignmentOperator
    : '=' | '*=' | '/=' | '%=' | '+=' | '-=' | '<<=' | '>>=' | '&=' | '^=' | '|='
    ;

expression
    : assignmentExpression #expression_
    | expression ',' assignmentExpression #expression_pass
    ;

constantExpression
    : conditionalExpression
    ;

declaration
    : declarationSpecifiers initDeclaratorList? ';'
    | staticAssertDeclaration
    ;

declarationSpecifiers
    : declarationSpecifier+
    ;

declarationSpecifiers2
    : declarationSpecifier+
    ;

declarationSpecifier
    : storageClassSpecifier
    | typeSpecifier
    | typeQualifier
    | functionSpecifier
    | alignmentSpecifier
    ;

initDeclaratorList
    : initDeclarator
    | initDeclaratorList ',' initDeclarator
    ;

initDeclarator
    : declarator
    | declarator '=' initializer
    ;

storageClassSpecifier
    : 'typedef'
    | 'extern'
    | 'static'
    | '_Thread_local'
    | 'auto'
    | 'register'
    ;

typeSpecifier
    : 'void' #typeSpecifier_
    | 'char' #typeSpecifier_
    | 'short' #typeSpecifier_
    | 'int' #typeSpecifier_
    | 'long' #typeSpecifier_
    | 'float' #typeSpecifier_
    | 'double' #typeSpecifier_
    | 'signed' #typeSpecifier_
    | 'unsigned' #typeSpecifier_
    ;

structOrUnionSpecifier
    : structOrUnion Identifier? '{' structDeclarationList '}'
    | structOrUnion Identifier
    ;

structOrUnion
    : 'struct'
    | 'union'
    ;

structDeclarationList
    : structDeclaration
    | structDeclarationList structDeclaration
    ;

structDeclaration
    : specifierQualifierList structDeclaratorList? ';'
    | staticAssertDeclaration
    ;

specifierQualifierList
    : typeSpecifier specifierQualifierList?
    | typeQualifier specifierQualifierList?
    ;

structDeclaratorList
    : structDeclarator
    | structDeclaratorList ',' structDeclarator
    ;

structDeclarator
    : declarator
    | declarator? ':' constantExpression
    ;

enumSpecifier
    : 'enum' Identifier? '{' enumeratorList '}'
    | 'enum' Identifier? '{' enumeratorList ',' '}'
    | 'enum' Identifier
    ;

enumeratorList
    : enumerator
    | enumeratorList ',' enumerator
    ;

enumerator
    : enumerationConstant
    | enumerationConstant '=' constantExpression
    ;

enumerationConstant
    : Identifier
    ;

atomicTypeSpecifier
    : '_Atomic' '(' typeName ')'
    ;

typeQualifier
    : 'const'
    | 'restrict'
    | 'volatile'
    | '_Atomic'
    ;

functionSpecifier
    : ('inline'
    | '_Noreturn'
    | '__inline__' // GCC extension
    | '__stdcall')
    | gccAttributeSpecifier
    | '__declspec' '(' Identifier ')'
    ;

alignmentSpecifier
    : '_Alignas' '(' typeName ')'
    | '_Alignas' '(' constantExpression ')'
    ;

declarator
    : pointer? directDeclarator gccDeclaratorExtension*
    ;

directDeclarator
    : Identifier #directDeclarator_pass
    | '(' declarator ')' #directDeclarator_pass
    | directDeclarator '[' typeQualifierList? assignmentExpression? ']' #directDeclarator_array
    | directDeclarator '[' 'static' typeQualifierList? assignmentExpression ']' #directDeclarator_array
    | directDeclarator '[' typeQualifierList 'static' assignmentExpression ']' #directDeclarator_array
    | directDeclarator '[' typeQualifierList? '*' ']' #directDeclarator_array
    | directDeclarator '(' parameterTypeList ')' #directDeclarator_func
    | directDeclarator '(' identifierList? ')' #directDeclarator_func
    ;

gccDeclaratorExtension
    : '__asm' '(' StringLiteral+ ')'
    | gccAttributeSpecifier
    ;

gccAttributeSpecifier
    : '__attribute__' '(' '(' gccAttributeList ')' ')'
    ;

gccAttributeList
    : gccAttribute (',' gccAttribute)*
    | // empty
    ;

gccAttribute
    : ~(',' | '(' | ')') // relaxed def for "identifier or reserved word"
      ('(' argumentExpressionList? ')')?
    | // empty
    ;

nestedParenthesesBlock
    : ( ~('(' | ')')
      | '(' nestedParenthesesBlock ')'
      )*
    ;

pointer
    : '*' typeQualifierList?
    | '*' typeQualifierList? pointer
    | '^' typeQualifierList? // Blocks language extension
    | '^' typeQualifierList? pointer // Blocks language extension
    ;

typeQualifierList
    : typeQualifier
    | typeQualifierList typeQualifier
    ;

parameterTypeList
    : parameterList
    | parameterList ',' '...'
    ;

parameterList
    : parameterDeclaration
    | parameterList ',' parameterDeclaration
    ;

parameterDeclaration
    : declarationSpecifiers declarator
    | declarationSpecifiers2 abstractDeclarator?
    ;

identifierList
    : Identifier
    | identifierList ',' Identifier
    ;

typeName
    : specifierQualifierList abstractDeclarator?
    ;

abstractDeclarator
    : pointer
    | pointer? directAbstractDeclarator gccDeclaratorExtension*
    ;

directAbstractDeclarator
    : '(' abstractDeclarator ')' gccDeclaratorExtension*
    | '[' typeQualifierList? assignmentExpression? ']'
    | '[' 'static' typeQualifierList? assignmentExpression ']'
    | '[' typeQualifierList 'static' assignmentExpression ']'
    | '[' '*' ']'
    | '(' parameterTypeList? ')' gccDeclaratorExtension*
    | directAbstractDeclarator '[' typeQualifierList? assignmentExpression? ']'
    | directAbstractDeclarator '[' 'static' typeQualifierList? assignmentExpression ']'
    | directAbstractDeclarator '[' typeQualifierList 'static' assignmentExpression ']'
    | directAbstractDeclarator '[' '*' ']'
    | directAbstractDeclarator '(' parameterTypeList? ')' gccDeclaratorExtension*
    ;

initializer
    : assignmentExpression
    | '{' initializerList '}'
    | '{' initializerList ',' '}'
    ;

initializerList
    : designation? initializer
    | initializerList ',' designation? initializer
    ;

designation
    : designatorList '='
    ;

designatorList
    : designator
    | designatorList designator
    ;

designator
    : '[' constantExpression ']'
    | '.' Identifier
    ;

staticAssertDeclaration
    : '_Static_assert' '(' constantExpression ',' StringLiteral+ ')' ';'
    ;

statement
    : labeledStatement
    | compoundStatement
    | expressionStatement
    | selectionStatement
    | iterationStatement
    | jumpStatement
    | ('__asm' | '__asm__') ('volatile' | '__volatile__') '(' (logicalOrExpression (',' logicalOrExpression)*)? (':' (logicalOrExpression (',' logicalOrExpression)*)?)* ')' ';'
    ;

labeledStatement
    : Identifier ':' statement
    | 'case' constantExpression ':' statement
    | 'default' ':' statement
    ;

compoundStatement
    : '{' blockItemList? '}'
    ;

blockItemList
    : blockItem
    | blockItemList blockItem
    ;

blockItem
    : declaration
    | statement
    ;

expressionStatement
    : expression? ';'
    ;

selectionStatement
    : 'if' '(' expression ')' statement ('else' statement)? #selectionStatement_if
    | 'switch' '(' expression ')' statement #selectionStatement_switch
    ;

iterationStatement
    : 'while' '(' expression ')' statement #iterationStatement_while
    | 'do' statement 'while' '(' expression ')' ';' #iterationStatement_dowhile
    | 'for' '(' expression? ';' expression? ';' expression? ')' statement #iterationStatement_for
    | 'for' '(' declaration expression? ';' expression? ')' statement #iterationStatement_forDeclared
    ;

jumpStatement
    : 'goto' Identifier ';' #jumpStatement_goto
    | 'continue' ';' #jumpStatement_continue
    | 'break' ';' #jumpStatement_break
    | 'return' expression? ';' #jumpStatement_return
    | 'goto' unaryExpression ';' #jumpStatement_ // GCC extension
    ;

compilationUnit
    : translationUnit? EOF
    ;

translationUnit
    : externalDeclaration
    | translationUnit externalDeclaration
    ;

externalDeclaration
    : functionDefinition
    | declaration
    | ';' // stray ;
    ;

functionDefinition
    : declarationSpecifiers? declarator declarationList? compoundStatement
    ;

declarationList
    : declaration
    | declarationList declaration
    ;

functionCall
    : tokenId '(' argumentExpressionList? ')' #functionCall_
    ;

Auto : 'auto';
Break : 'break';
Case : 'case';
Char : 'char';
Const : 'const';
Continue : 'continue';
Default : 'default';
Do : 'do';
Double : 'double';
Else : 'else';
Enum : 'enum';
Extern : 'extern';
Float : 'float';
For : 'for';
Goto : 'goto';
If : 'if';
Inline : 'inline';
Int : 'int';
Long : 'long';
Register : 'register';
Restrict : 'restrict';
Return : 'return';
Short : 'short';
Signed : 'signed';
Sizeof : 'sizeof';
Static : 'static';
Struct : 'struct';
Switch : 'switch';
Typedef : 'typedef';
Union : 'union';
Unsigned : 'unsigned';
Void : 'void';
Volatile : 'volatile';
While : 'while';

Alignas : '_Alignas';
Alignof : '_Alignof';
Atomic : '_Atomic';
Bool : '_Bool';
Complex : '_Complex';
Generic : '_Generic';
Imaginary : '_Imaginary';
Noreturn : '_Noreturn';
StaticAssert : '_Static_assert';
ThreadLocal : '_Thread_local';

LeftParen : '(';
RightParen : ')';
LeftBracket : '[';
RightBracket : ']';
LeftBrace : '{';
RightBrace : '}';

Less : '<';
LessEqual : '<=';
Greater : '>';
GreaterEqual : '>=';
LeftShift : '<<';
RightShift : '>>';

Plus : '+';
PlusPlus : '++';
Minus : '-';
MinusMinus : '--';
Star : '*';
Div : '/';
Mod : '%';

And : '&';
Or : '|';
AndAnd : '&&';
OrOr : '||';
Caret : '^';
Not : '!';
Tilde : '~';

Question : '?';
Colon : ':';
Semi : ';';
Comma : ',';

Assign : '=';
// '*=' | '/=' | '%=' | '+=' | '-=' | '<<=' | '>>=' | '&=' | '^=' | '|='
StarAssign : '*=';
DivAssign : '/=';
ModAssign : '%=';
PlusAssign : '+=';
MinusAssign : '-=';
LeftShiftAssign : '<<=';
RightShiftAssign : '>>=';
AndAssign : '&=';
XorAssign : '^=';
OrAssign : '|=';

Equal : '==';
NotEqual : '!=';

Arrow : '->';
Dot : '.';
Ellipsis : '...';

Identifier
    : IdentifierNondigit
      ( IdentifierNondigit
      | Digit
      )*
    ;

fragment IdentifierNondigit
    : Nondigit
    | UniversalCharacterName
    //| // other implementation-defined characters...
    ;

fragment Nondigit : [a-zA-Z_];

fragment Digit : [0-9];

fragment UniversalCharacterName
    : '\\u' HexQuad
    | '\\U' HexQuad HexQuad
    ;

fragment HexQuad
    : HexadecimalDigit HexadecimalDigit HexadecimalDigit HexadecimalDigit
    ;

Constant
    : IntegerConstant
    | FloatingConstant
    //| EnumerationConstant
    | CharacterConstant
    ;

fragment IntegerConstant
    : DecimalConstant IntegerSuffix?
    | OctalConstant IntegerSuffix?
    | HexadecimalConstant IntegerSuffix?
    | BinaryConstant
    ;

fragment BinaryConstant : '0' [bB] [0-1]+;

fragment DecimalConstant : NonzeroDigit Digit*;

fragment OctalConstant : '0' OctalDigit*;

fragment HexadecimalConstant : HexadecimalPrefix HexadecimalDigit+;

fragment HexadecimalPrefix : '0' [xX];

fragment NonzeroDigit : [1-9];

fragment OctalDigit : [0-7];

fragment HexadecimalDigit : [0-9a-fA-F];

fragment IntegerSuffix
    : UnsignedSuffix LongSuffix?
    | UnsignedSuffix LongLongSuffix
    | LongSuffix UnsignedSuffix?
    | LongLongSuffix UnsignedSuffix?
    ;

fragment UnsignedSuffix : [uU];

fragment LongSuffix : [lL];

fragment LongLongSuffix : 'll' | 'LL';

fragment FloatingConstant
    : DecimalFloatingConstant
    | HexadecimalFloatingConstant
    ;

fragment DecimalFloatingConstant
    : FractionalConstant ExponentPart? FloatingSuffix?
    | DigitSequence ExponentPart FloatingSuffix?
    ;

fragment HexadecimalFloatingConstant
    : HexadecimalPrefix HexadecimalFractionalConstant BinaryExponentPart FloatingSuffix?
    | HexadecimalPrefix HexadecimalDigitSequence BinaryExponentPart FloatingSuffix?
    ;

fragment FractionalConstant
    : DigitSequence? '.' DigitSequence
    | DigitSequence '.'
    ;

fragment ExponentPart
    : 'e' Sign? DigitSequence
    | 'E' Sign? DigitSequence
    ;

fragment Sign : '+' | '-';

fragment DigitSequence : Digit+;

fragment HexadecimalFractionalConstant
    : HexadecimalDigitSequence? '.' HexadecimalDigitSequence
    | HexadecimalDigitSequence '.'
    ;

fragment BinaryExponentPart
    : 'p' Sign? DigitSequence
    | 'P' Sign? DigitSequence
    ;

fragment HexadecimalDigitSequence : HexadecimalDigit+;

fragment FloatingSuffix : 'f' | 'l' | 'F' | 'L';

fragment CharacterConstant
    : '\'' CCharSequence '\''
    | 'L\'' CCharSequence '\''
    | 'u\'' CCharSequence '\''
    | 'U\'' CCharSequence '\''
    ;

fragment CCharSequence : CChar+;

fragment CChar
    : ~['\\\r\n]
    | EscapeSequence
    ;

fragment EscapeSequence
    : SimpleEscapeSequence
    | OctalEscapeSequence
    | HexadecimalEscapeSequence
    | UniversalCharacterName
    ;

fragment SimpleEscapeSequence : '\\' ['"?abfnrtv\\];

fragment OctalEscapeSequence
    : '\\' OctalDigit
    | '\\' OctalDigit OctalDigit
    | '\\' OctalDigit OctalDigit OctalDigit
    ;

fragment HexadecimalEscapeSequence : '\\x' HexadecimalDigit+;

StringLiteral
    : EncodingPrefix? '"' SCharSequence? '"'
    ;

fragment EncodingPrefix
    : 'u8'
    | 'u'
    | 'U'
    | 'L'
    ;

fragment SCharSequence : SChar+;

fragment SChar
    : ~["\\\r\n]
    | EscapeSequence
    | '\\\n' // Added line
    | '\\\r\n' // Added line
    ;

ComplexDefine
    : '#' Whitespace? 'define' ~[#]*
      -> skip
    ;

// ignore the following asm blocks:
/*
    asm
    {
        mfspr x, 286;
    }
 */
AsmBlock
    : 'asm' ~'{'* '{' ~'}'* '}'
      -> skip
    ;

// ignore the lines generated by c preprocessor
// sample line : '#line 1 "/home/dm/files/dk1.h" 1'
LineAfterPreprocessing
    : '#line' Whitespace* ~[\r\n]*
      -> skip
    ;

LineDirective
    : '#' Whitespace? DecimalConstant Whitespace? StringLiteral ~[\r\n]*
      -> skip
    ;

PragmaDirective
    : '#' Whitespace? 'pragma' Whitespace ~[\r\n]*
      -> skip
    ;

Whitespace
    : [ \t]+
      -> skip
    ;

Newline
    : ( '\r' '\n'?
      | '\n'
      )
      -> skip
    ;

BlockComment
    : '/*' .*? '*/'
      -> skip
    ;

LineComment
    : '//' ~[\r\n]*
      -> skip
    ;

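2. Generate the lexer and parser with ANTLR

Open a command prompt in the directory that contains MyCGrammer.g4.

Enter:

antlr4 MyCGrammer.g4 -visitor

The -visitor option additionally generates the visitor classes, which are not generated by default. ANTLR offers two ways of traversing a parse tree, listeners (generated by default) and visitors. For the steps in this tutorial it makes little difference whether the visitor is generated.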
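The difference between the two traversal styles can be sketched without ANTLR itself. The following is a minimal illustration on a hand-built toy tree; Node, Listener, walk, and evaluate are hypothetical names for this sketch and are not ANTLR classes. In the listener style a walker drives the traversal and fires enter/exit callbacks; in the visitor style your own code decides which children to visit and can return a value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TraversalSketch {

    /** Toy parse-tree node; NOT ANTLR's ParseTree, just enough to show the idea. */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    /** Listener style: callbacks fired by a walker, no return values. */
    interface Listener {
        void enter(Node n);
        void exit(Node n);
    }

    /** The walker visits every node depth-first and notifies the listener. */
    static void walk(Node n, Listener listener) {
        listener.enter(n);
        for (Node child : n.children) {
            walk(child, listener);
        }
        listener.exit(n);
    }

    /** Visitor style: the method itself recurses into children and returns a result. */
    static int evaluate(Node n) {
        switch (n.label) {
            case "+": return evaluate(n.children.get(0)) + evaluate(n.children.get(1));
            case "*": return evaluate(n.children.get(0)) * evaluate(n.children.get(1));
            default:  return Integer.parseInt(n.label); // leaf: integer literal
        }
    }

    /** Collects the enter/exit events produced by the listener-style walk. */
    static List<String> trace(Node root) {
        final List<String> events = new ArrayList<>();
        walk(root, new Listener() {
            @Override public void enter(Node n) { events.add("enter " + n.label); }
            @Override public void exit(Node n)  { events.add("exit " + n.label); }
        });
        return events;
    }

    /** Tree for the expression 1 + 2 * 3. */
    static Node sample() {
        return new Node("+", new Node("1"), new Node("*", new Node("2"), new Node("3")));
    }

    public static void main(String[] args) {
        System.out.println(trace(sample()));    // ten enter/exit events
        System.out.println(evaluate(sample())); // prints 7
    }
}
```

With the grammar above, the generated counterparts would be MyCGrammerBaseListener and, when -visitor is given, MyCGrammerBaseVisitor, which you subclass instead of writing the tree classes yourself.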

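The directory will then contain the generated files: the lexer and parser sources (MyCGrammerLexer.java, MyCGrammerParser.java), the listener and visitor sources (MyCGrammerListener.java, MyCGrammerBaseListener.java, MyCGrammerVisitor.java, MyCGrammerBaseVisitor.java), and the accompanying .tokens and .interp files.

Next, compile them. On the command line, enter:

javac MyCGrammer*.java

The lexer and parser for C are now ready.

3. Test

On the command line, enter:

grun MyCGrammer compilationUnit -tokens

Here MyCGrammer is the grammar name and compilationUnit is the start rule. Then type a piece of C code and press Ctrl+Z followed by Enter to end the input. The lexical analysis result (the token stream) for that code is printed.

On the command line, enter:

grun MyCGrammer compilationUnit -gui

Again, type a piece of C code and press Ctrl+Z followed by Enter to end the input. The parse tree for that code is displayed.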

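The available options are:

-tokens: print the token stream.

-tree: print the parse tree in LISP format.

-gui: display the parse tree visually in a dialog window.

-ps file.ps: generate a visual representation of the parse tree in PostScript and store it in file.ps.

-encoding encodingname: specify the encoding of the test rig's input file when the current locale cannot read the input correctly.

-trace: print the rule name and the current token on entering and leaving each rule.

-diagnostics: turn on diagnostic messages during parsing.

-SLL: use a faster but slightly weaker parsing strategy.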
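As a rough illustration of what the LISP-style output of -tree looks like, the sketch below prints a hand-built toy tree in the same parenthesized shape: a leaf prints as its text, and an inner node prints as (name child child ...). The Node class and the sample tree are assumptions made for this illustration; a real tree produced from MyCGrammer would be much more deeply nested.

```java
import java.util.Arrays;
import java.util.List;

final class LispTreeSketch {

    /** Toy tree node (hypothetical, not an ANTLR class). */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    /** Leaf -> its label; inner node -> "(label child1 child2 ...)". */
    static String toLisp(Node n) {
        if (n.children.isEmpty()) {
            return n.label;
        }
        StringBuilder sb = new StringBuilder("(").append(n.label);
        for (Node child : n.children) {
            sb.append(' ').append(toLisp(child));
        }
        return sb.append(')').toString();
    }

    /** Simplified tree for the statement "x;". */
    static Node sample() {
        return new Node("expressionStatement",
                new Node("expression", new Node("x")),
                new Node(";"));
    }

    public static void main(String[] args) {
        System.out.println(toLisp(sample())); // prints (expressionStatement (expression x) ;)
    }
}
```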
 

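III. Packaging the Lexer and Parser

Everything so far was done on the command line. To use the lexer and parser inside a project, the generated files need to be packaged.

In the directory with the generated files, create two folders: lib and MyCGrammer.

Copy the downloaded antlr-4.7.2-complete.jar into the lib folder.

Open the directory in IDEA (I used IntelliJ IDEA Community Edition 2021.1; other versions should be similar, or look up the packaging steps for yours).

Find the generated .java files and add this line at the top of each:

package MyCGrammer;

Then move them into the MyCGrammer folder.

In the top-left menu, click File -> Project Structure -> Artifacts -> JAR -> From modules with dependencies -> copy to...

Then click Build -> Build Artifacts.

The jar is generated in the corresponding location under the out directory.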
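To confirm that the jar actually contains the packaged classes, one option is to list its entries with the standard java.util.jar API. This is a minimal sketch; the class name JarContentsCheck and the default path out/artifacts/MyCGrammer.jar are hypothetical, so pass the real location of your jar as the first argument.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

final class JarContentsCheck {

    /** Returns the names of all .class entries in the jar whose path starts with the prefix. */
    static List<String> classesUnder(String jarPath, String prefix) throws IOException {
        List<String> found = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith(prefix) && name.endsWith(".class")) {
                    found.add(name);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default path; an IOException is thrown if the file does not exist.
        String jarPath = args.length > 0 ? args[0] : "out/artifacts/MyCGrammer.jar";
        // Classes declared in "package MyCGrammer;" are stored under MyCGrammer/ in the jar.
        for (String name : classesUnder(jarPath, "MyCGrammer/")) {
            System.out.println(name);
        }
    }
}
```

If entries such as MyCGrammer/MyCGrammerLexer.class and MyCGrammer/MyCGrammerParser.class are listed, the package declaration and the artifact settings took effect.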

———————————————————————————————————————————

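That concludes the ANTLR tutorial. Later posts will build on it to perform lexical analysis, syntax analysis, intermediate code generation, and target code generation for C.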
