Using ANTLR and generating Java code

Reposted

mob64ca13f83523 2024-09-14 18:40:06


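The language

The language will allow us to define variables and expressions. We are going to support:

Integer and decimal literals
Variable definition and assignment
The basic mathematical operations (addition, subtraction, multiplication, division)
The use of parentheses

An example of a valid file:

var a = 10 / 3
var b = (5 + 3) * 2
var c = a / b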

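The tools we will use

We will use:

ANTLR to generate the lexer and the parser
Gradle as our build system
Kotlin to write the code. Since I have only just started learning it, it will be very basic Kotlin.

Setting up the project

Our build.gradle file will look like this: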

buildscript {
    ext.kotlin_version = '1.3.70'
    repositories {
        mavenCentral()
        maven {
            name 'JFrog OSS snapshot repo'
            url 'https://oss.jfrog.org/oss-snapshot-local/'
        }
        jcenter()
    }
    dependencies {
        classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:$kotlin_version"
    }
}

apply plugin: 'kotlin'
apply plugin: 'java'
apply plugin: 'idea'
apply plugin: 'antlr'

repositories {
    mavenLocal()
    mavenCentral()
    jcenter()
}

dependencies {
    antlr "org.antlr:antlr4:4.8"
    compile "org.antlr:antlr4-runtime:4.8"
    compile "org.jetbrains.kotlin:kotlin-stdlib:$kotlin_version"
    compile "org.jetbrains.kotlin:kotlin-reflect:$kotlin_version"
    testCompile "org.jetbrains.kotlin:kotlin-test:$kotlin_version"
    testCompile "org.jetbrains.kotlin:kotlin-test-junit:$kotlin_version"
    testCompile 'junit:junit:4.13'
}

// Generate the lexer and parser into the me.tomassetti.langsandbox package
generateGrammarSource {
    maxHeapSize = "64m"
    arguments += ['-package', 'me.tomassetti.langsandbox']
    outputDirectory = new File("generated-src/antlr/main/me/tomassetti/langsandbox".toString())
}
compileJava.dependsOn generateGrammarSource

// Compile the generated sources together with the main sources
sourceSets {
    generated {
        java.srcDir 'generated-src/antlr/main/'
    }
}
compileJava.source sourceSets.generated.java, sourceSets.main.java

clean {
    delete "generated-src"
}

idea {
    module {
        sourceDirs += file("generated-src/antlr/main")
    }
}

We can run:

./gradlew idea to generate the IDEA project files
./gradlew generateGrammarSource to generate the ANTLR lexer and parser

Implementing the lexer

We will build the lexer and the parser in two separate files. This is the lexer:

lexer grammar SandyLexer;

// Whitespace
NEWLINE           : '\r\n' | '\r' | '\n' ;
WS                : [\t ]+ ;

// Keywords
VAR               : 'var' ;

// Literals
INTLIT            : '0' | [1-9][0-9]* ;
// Parentheses make the integer part one unit, so that "0.5" is a decimal literal too
DECLIT            : ('0' | [1-9][0-9]*) '.' [0-9]+ ;

// Operators
PLUS              : '+' ;
MINUS             : '-' ;
ASTERISK          : '*' ;
DIVISION          : '/' ;
ASSIGN            : '=' ;
LPAREN            : '(' ;
RPAREN            : ')' ;

// Identifiers
ID                : [_]*[a-z][A-Za-z0-9_]* ;
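To get a feel for what the literal and identifier rules accept, they can be approximated with plain Kotlin regular expressions. This is only a rough sketch and not how ANTLR works internally. It assumes that DECLIT is meant to be an integer part, a dot and at least one digit, and that ID is meant to be optional underscores followed by a lowercase letter. The `classify` helper is hypothetical, and it only checks whether a whole string matches a single rule.

```kotlin
// Illustrative regex equivalents of the INTLIT, DECLIT and ID lexer rules.
val intLit = Regex("0|[1-9][0-9]*")
val decLit = Regex("(0|[1-9][0-9]*)\\.[0-9]+")
val id = Regex("_*[a-z][A-Za-z0-9_]*")

// Hypothetical helper: returns the name of the rule that matches the whole
// input, or null when none of the three rules matches it entirely.
fun classify(text: String): String? = when {
    intLit.matches(text) -> "INTLIT"
    decLit.matches(text) -> "DECLIT"
    id.matches(text) -> "ID"
    else -> null
}
```

With these definitions, `classify("42")` gives "INTLIT", `classify("1.23")` gives "DECLIT" and `classify("_count")` gives "ID". Both `classify("007")` and `classify("Count")` give null: an integer literal may not have leading zeros, and an identifier must start with a lowercase letter after any underscores.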

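Now we can simply run ./gradlew generateGrammarSource, and the lexer will be generated for us from the previous definition.

Testing the lexer

Testing is always important, but when building a language it is absolutely critical: if the tools supporting your language are not correct, every program you build with them may be affected. So let's start by testing the lexer: we just need to verify that the sequence of tokens the lexer produces is the one we expect.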

package me.tomassetti.sandy

import me.tomassetti.langsandbox.SandyLexer
import org.antlr.v4.runtime.CharStreams
import java.util.*
import kotlin.test.assertEquals
import org.junit.Test as test

class SandyLexerTest {

    fun lexerForCode(code: String) = SandyLexer(CharStreams.fromString(code))

    // Loads <resourceName>.sandy from the root of the test classpath
    fun lexerForResource(resourceName: String) =
            SandyLexer(CharStreams.fromStream(this.javaClass.getResourceAsStream("/${resourceName}.sandy")))

    // Returns the rule name of every token, skipping whitespace and ending with "EOF"
    fun tokens(lexer: SandyLexer): List<String> {
        val tokens = LinkedList<String>()
        do {
            val t = lexer.nextToken()
            when (t.type) {
                -1 -> tokens.add("EOF")
                else -> if (t.type != SandyLexer.WS) tokens.add(lexer.ruleNames[t.type - 1])
            }
        } while (t.type != -1)
        return tokens
    }

    @test fun parseVarDeclarationAssignedAnIntegerLiteral() {
        assertEquals(listOf("VAR", "ID", "ASSIGN", "INTLIT", "EOF"),
                tokens(lexerForCode("var a = 1")))
    }

    @test fun parseVarDeclarationAssignedADecimalLiteral() {
        assertEquals(listOf("VAR", "ID", "ASSIGN", "DECLIT", "EOF"),
                tokens(lexerForCode("var a = 1.23")))
    }

    @test fun parseVarDeclarationAssignedASum() {
        assertEquals(listOf("VAR", "ID", "ASSIGN", "INTLIT", "PLUS", "INTLIT", "EOF"),
                tokens(lexerForCode("var a = 1 + 2")))
    }

    @test fun parseMathematicalExpression() {
        assertEquals(listOf("INTLIT", "PLUS", "ID", "ASTERISK", "INTLIT", "DIVISION", "INTLIT", "MINUS", "INTLIT", "EOF"),
                tokens(lexerForCode("1 + a * 3 / 4 - 5")))
    }

    @test fun parseMathematicalExpressionWithParenthesis() {
        assertEquals(listOf("INTLIT", "PLUS", "LPAREN", "ID", "ASTERISK", "INTLIT", "RPAREN", "MINUS", "DECLIT", "EOF"),
                tokens(lexerForCode("1 + (a * 3) - 5.12")))
    }
}
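To see why the tests expect those token sequences, it may help to imitate the two rules ANTLR lexers use to pick a token: the longest match wins, and on a tie the rule defined first wins. The sketch below is a hand-written stand-in, not the generated SandyLexer, and it does not use the ANTLR runtime. The `rules` list and the `tokenNames` function are hypothetical names, and the regular expressions are approximations of the grammar rules that assume DECLIT means an integer part, a dot and at least one digit.

```kotlin
// Rule names paired with approximate regex equivalents, in grammar order.
val rules = listOf(
    "NEWLINE" to Regex("\r\n|\r|\n"),
    "WS" to Regex("[\t ]+"),
    "VAR" to Regex("var"),
    "INTLIT" to Regex("0|[1-9][0-9]*"),
    "DECLIT" to Regex("(0|[1-9][0-9]*)\\.[0-9]+"),
    "PLUS" to Regex("\\+"),
    "MINUS" to Regex("-"),
    "ASTERISK" to Regex("\\*"),
    "DIVISION" to Regex("/"),
    "ASSIGN" to Regex("="),
    "LPAREN" to Regex("\\("),
    "RPAREN" to Regex("\\)"),
    "ID" to Regex("_*[a-z][A-Za-z0-9_]*")
)

// Returns the rule names of the tokens in `code`, skipping WS and ending with "EOF",
// which mirrors what the tokens() helper in the test class collects.
fun tokenNames(code: String): List<String> {
    val names = mutableListOf<String>()
    var pos = 0
    while (pos < code.length) {
        var bestName: String? = null
        var bestLength = 0
        for ((name, regex) in rules) {
            val match = regex.find(code, pos)
            // Only matches starting exactly at pos count; the strict '>' keeps
            // the earlier rule when two rules match the same number of characters.
            if (match != null && match.range.first == pos && match.value.length > bestLength) {
                bestName = name
                bestLength = match.value.length
            }
        }
        val name = bestName ?: throw IllegalArgumentException("No rule matches at offset $pos")
        if (name != "WS") names.add(name)
        pos += bestLength
    }
    names.add("EOF")
    return names
}
```

In this sketch `tokenNames("var a = 1.23")` returns VAR, ID, ASSIGN, DECLIT, EOF. The input `var` matches both VAR and ID with the same length, so the earlier rule VAR is chosen, and `1.23` becomes a single DECLIT because that match is longer than the INTLIT match on `1`.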