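Writing a VM translator in Java: translating nand2tetris VM code (a stack-based bytecode similar in spirit to Java's) into Hack assembly
Stack arithmetic commands
Basic approach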
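The translator consists mainly of two classes. Parser handles the input .vm file and parses its VM commands; CodeWriter translates the parsed VM commands into assembly instructions and writes the .asm file.
Parser comes first. It has six member functions: the constructor opens the file stream and prepares to read VM commands; hasMoreCommands reports whether any commands remain; advance prepares the current command for use, with whitespace and comments removed; commandType returns the type of the command; arg1 and arg2 return the command's components. (In the code below, blank lines and comments are skipped inside hasMoreCommands, and advance splits the command into its parts.)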
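The cleanup step described above can be sketched in isolation. `CommandCleaner`, `clean`, and `parts` are hypothetical names used only for illustration, and the sketch assumes that comments always start with `//`.

```java
// Hypothetical helper for illustration; not part of the Parser class in this post.
final class CommandCleaner {
    // Returns the line with any "//" comment and surrounding whitespace removed.
    static String clean(String line) {
        int comment = line.indexOf("//");
        String code = comment >= 0 ? line.substring(0, comment) : line;
        return code.trim();
    }

    // Splits a cleaned command on runs of whitespace, e.g. ["push", "constant", "7"].
    static String[] parts(String cleaned) {
        return cleaned.isEmpty() ? new String[0] : cleaned.split("\\s+");
    }
}
```

Using `indexOf` rather than `split("//")` avoids an empty array when a line consists of nothing but `//`.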
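Next comes CodeWriter. Its constructor opens the output file for writing assembly instructions, writeArithmetic writes the assembly for arithmetic and logical commands, and writePushPop writes the assembly for push and pop commands.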
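Now for the crucial part: how do we get from a VM command to assembly instructions?
Start with the arithmetic and logical commands, taking binary operations as the example. The two operands sit on the stack, in the two slots just below the stack pointer SP (RAM[SP-2] and RAM[SP-1]). Once the A register is holding a stack address, only M and D are left to compute with.
So for every binary operation we first get the two operands into M and D: decrement SP, copy the top value from M into D, then move A down one more slot so that M refers to the other operand.
With both operands in place, the operation itself is a single instruction whose result overwrites the first operand.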
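The fetch-then-compute pattern can be sketched as a small generator. `BinaryOpSketch` and `binary` are hypothetical names; the `compute` argument is assumed to be a valid Hack computation such as `D+M` or `M-D`.

```java
// Hypothetical sketch: builds the Hack assembly for one binary VM command.
final class BinaryOpSketch {
    // Pops y into D and leaves A pointing at x, so that M is x and D is y.
    static final String FETCH = String.join("\n",
            "@SP",    // A = address of the SP register
            "M=M-1",  // SP--, SP now points at y
            "A=M",    // A = address of y
            "D=M",    // D = y
            "A=A-1",  // A = address of x, so M = x
            "");      // trailing element gives the final newline

    // The result of the computation overwrites x, which becomes the new stack top.
    static String binary(String compute) {
        return FETCH + "M=" + compute + "\n";
    }

    public static void main(String[] args) {
        System.out.print(binary("D+M")); // assembly for the VM command "add"
        System.out.print(binary("M-D")); // assembly for the VM command "sub" (x - y)
    }
}
```

Because SP is decremented only once, the two operands are replaced by a single result without any further stack-pointer bookkeeping.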
Unary operations are simpler: point A at the top value (address SP-1) and operate on M in place. SP does not change.
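A minimal sketch of the unary case, with hypothetical names `UnaryOpSketch` and `unary`; `compute` is assumed to be `-M` for neg or `!M` for not.

```java
// Hypothetical sketch: builds the Hack assembly for one unary VM command.
final class UnaryOpSketch {
    static String unary(String compute) {
        return "@SP\n"                  // A = address of the SP register
             + "A=M-1\n"                // A = address of the top stack value
             + "M=" + compute + "\n";   // overwrite that value in place
    }
}
```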
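The comparison commands gt, eq, and lt are more involved because they need jumps. They are still binary operations, so we fetch the two operands into M and D as above, compute M-D, and jump depending on how the difference compares with 0. The value written back is -1 for true and 0 for false, and each comparison needs its own pair of labels so that repeated comparisons do not collide.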
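The comparison pattern can be sketched as follows. `CompareSketch` and `compare` are hypothetical names; the `jump` argument is assumed to be one of the Hack jump mnemonics `JGT`, `JEQ`, or `JLT`. The sketch builds the labels directly from a counter instead of patching a template string.

```java
// Hypothetical sketch: builds the Hack assembly for gt, eq, or lt.
final class CompareSketch {
    private int labelId = 0; // keeps the labels of each comparison unique

    String compare(String jump) {
        String t = "TRUE" + labelId;
        String e = "END" + labelId;
        labelId++;
        return "@SP\nM=M-1\nA=M\nD=M\nA=A-1\n"   // D = y, M = x
             + "D=M-D\n"                         // D = x - y
             + "@" + t + "\nD;" + jump + "\n"    // jump if the test on x - y holds
             + "@SP\nA=M-1\nM=0\n"               // false: replace x with 0
             + "@" + e + "\n0;JMP\n"             // skip the true branch
             + "(" + t + ")\n"
             + "@SP\nA=M-1\nM=-1\n"              // true: replace x with -1
             + "(" + e + ")\n";
    }
}
```

Calling `compare` twice on the same instance yields the label pairs `TRUE0`/`END0` and `TRUE1`/`END1`, so the assembler never sees a duplicate label.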
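The push constant x command, which pushes a constant onto the stack, is simple by comparison.
Load the constant, write it to the memory location that SP points to, then increment SP.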
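A minimal sketch of push constant, with hypothetical names `PushConstantSketch` and `pushConstant`. The range check is an addition of this sketch: a Hack A-instruction can only load values from 0 to 32767.

```java
// Hypothetical sketch: builds the Hack assembly for "push constant value".
final class PushConstantSketch {
    static String pushConstant(int value) {
        if (value < 0 || value > 32767) {
            throw new IllegalArgumentException("constant out of range: " + value);
        }
        return "@" + value + "\nD=A\n"   // D = the constant
             + "@SP\nA=M\nM=D\n"         // RAM[SP] = D
             + "@SP\nM=M+1\n";           // SP++
    }
}
```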
Core code
Parser
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.Objects;
import java.util.Scanner;

public class Parser {
    private String command = null;
    private Scanner scanner = null;
    private String cmd0 = null;
    private String cmd1 = null;
    private int cmd2;

    // Open the .vm file and get ready to read commands.
    public Parser(File file) throws FileNotFoundException {
        scanner = new Scanner(new FileReader(file));
    }

    // Read ahead to the next real command, skipping blank lines and comments.
    public boolean hasMoreCommands() {
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            int comment = line.indexOf("//");
            if (comment >= 0) {
                line = line.substring(0, comment); // drop the comment
            }
            command = line.trim();
            if (!command.isEmpty()) {
                return true;
            }
        }
        return false;
    }

    // Split the current command into its parts.
    public void advance() {
        String[] cmd = command.split("\\s+");
        cmd0 = cmd[0];
        if (cmd.length > 1) {
            cmd1 = cmd[1];
            if (cmd.length > 2) {
                cmd2 = Integer.parseInt(cmd[2]);
            }
        }
    }

    // At this stage only push and arithmetic/logical commands are recognized.
    public String commandType() {
        if (Objects.equals(cmd0, "push")) {
            return "C_PUSH";
        } else {
            return "C_ARITHMETIC";
        }
    }

    // For an arithmetic command, the command itself; otherwise the segment name.
    public String arg1() {
        if (Objects.equals(commandType(), "C_ARITHMETIC"))
            return cmd0;
        return cmd1;
    }

    // The index of a push command.
    public int arg2() {
        return cmd2;
    }

    public void close() {
        scanner.close();
    }
}
Code Writer
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.HashMap;
import java.util.Objects;

public class CodeWriter {
    private FileWriter asm = null;
    private String asmCommand;
    private final HashMap<String, String> vmToAsm = new HashMap<>();
    private int jump = 0; // counter that keeps comparison labels unique

    public CodeWriter(File file) throws IOException {
        asm = new FileWriter(file);
        // Pop y into D and leave A pointing at x, so that M = x and D = y.
        String fetch = "@SP\nM=M-1\nA=M\nD=M\nA=A-1\n";
        vmToAsm.put("add", fetch + "M=M+D\n");
        vmToAsm.put("sub", fetch + "M=M-D\n");
        vmToAsm.put("and", fetch + "M=M&D\n");
        vmToAsm.put("or", fetch + "M=M|D\n");
        // Comparisons write -1 (true) or 0 (false); TRUE and END are renamed per use.
        vmToAsm.put("gt", fetch + "D=M-D\n@TRUE\nD;JGT\n@SP\nA=M-1\nM=0\n@END\n0;JMP\n(TRUE)\n@SP\nA=M-1\nM=-1\n(END)\n");
        vmToAsm.put("eq", fetch + "D=M-D\n@TRUE\nD;JEQ\n@SP\nA=M-1\nM=0\n@END\n0;JMP\n(TRUE)\n@SP\nA=M-1\nM=-1\n(END)\n");
        vmToAsm.put("lt", fetch + "D=M-D\n@TRUE\nD;JLT\n@SP\nA=M-1\nM=0\n@END\n0;JMP\n(TRUE)\n@SP\nA=M-1\nM=-1\n(END)\n");
        // Unary operations modify the top of the stack in place.
        vmToAsm.put("neg", "D=0\n@SP\nA=M-1\nM=D-M\n");
        vmToAsm.put("not", "@SP\nA=M-1\nM=!M\n");
    }

    public void writeArithmetic(String vmCommand) throws IOException {
        asmCommand = vmToAsm.get(vmCommand);
        if (Objects.equals(vmCommand, "gt") || Objects.equals(vmCommand, "eq") || Objects.equals(vmCommand, "lt")) {
            // Append the counter so every comparison gets its own labels.
            asmCommand = asmCommand.replaceAll("TRUE", "TRUE" + jump);
            asmCommand = asmCommand.replaceAll("END", "END" + jump);
            jump++;
        }
        asm.write(asmCommand);
    }

    public void writePushPop(String cmd, String segment, int index) throws IOException {
        if (Objects.equals(cmd, "C_PUSH") && Objects.equals(segment, "constant")) {
            // D = index; RAM[SP] = D; SP++
            asmCommand = "@" + index + "\nD=A\n@SP\nA=M\nM=D\n@SP\nM=M+1\n";
        } else {
            // Only push constant is supported at this stage.
            throw new IllegalArgumentException("unsupported command: " + cmd + " " + segment + " " + index);
        }
        asm.write(asmCommand);
    }

    public void close() throws IOException {
        asm.close();
    }
}
Main
import java.io.File;
import java.io.IOException;
import java.util.Objects;

public class Main {
    public static void main(String[] args) throws IOException {
        // Input .vm file and output .asm file (backslashes must be escaped in Java strings).
        Parser parser = new Parser(new File("C:\\Users\\Yezi\\Desktop\\Java程序设计\\nand2tetris\\projects\\07\\StackArithmetic\\StackTest\\StackTest.vm"));
        CodeWriter codeWriter = new CodeWriter(new File("C:\\Users\\Yezi\\Desktop\\Java程序设计\\nand2tetris\\projects\\07\\StackArithmetic\\StackTest\\StackTest.asm"));
        while (parser.hasMoreCommands()) {
            parser.advance();
            if (Objects.equals(parser.commandType(), "C_ARITHMETIC")) {
                codeWriter.writeArithmetic(parser.arg1());
            } else {
                codeWriter.writePushPop(parser.commandType(), parser.arg1(), parser.arg2());
            }
        }
        codeWriter.close();
        parser.close();
    }
}
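Results: verification with SimpleAdd and StackTest
We load the .tst file into the CPU Emulator and compare the .out file produced by the run with the supplied .cmp file. The SimpleAdd comparison is shown in the figure below; the translation succeeded.
The StackTest result is shown in the figure below; the first stage of the translator works.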
Memory access commands
Basic approach
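First, be clear about the direction of each operation: push takes a value from memory and pushes it onto the stack, while pop removes the top value of the stack and stores it in memory.
The constant segment was already handled in the first stage.
For the local, argument, this, and that segments, we read from or write to the address each segment maps to: the segment's base address plus the index i.
For push, read the value stored at the address of segment i, push it onto the stack, and increment SP.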
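The push case for the four base-pointer segments can be sketched with one lookup table. `PushSegmentSketch`, `BASE`, and `push` are hypothetical names, and `Map.of` assumes Java 9 or later.

```java
import java.util.Map;

// Hypothetical sketch: push for local, argument, this, and that.
final class PushSegmentSketch {
    // VM segment name -> Hack symbol that holds the segment's base address.
    static final Map<String, String> BASE = Map.of(
            "local", "LCL", "argument", "ARG", "this", "THIS", "that", "THAT");

    static String push(String segment, int index) {
        String base = BASE.get(segment);
        if (base == null) {
            throw new IllegalArgumentException("not a base-pointer segment: " + segment);
        }
        return "@" + base + "\nD=M\n"             // D = base address
             + "@" + index + "\nA=D+A\nD=M\n"     // D = RAM[base + index]
             + "@SP\nA=M\nM=D\n"                  // RAM[SP] = D
             + "@SP\nM=M+1\n";                    // SP++
    }
}
```

The table replaces four near-identical branches that differ only in the base symbol.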
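Pop is more complicated. Only the A, M, and D registers are available, and pop first has to compute the address of segment i, so that address must be parked somewhere while the value is popped. R0 to R12 are already taken by SP, the segment pointers, and temp, so the code below holds the address temporarily in RAM[255]; the general-purpose registers R13 to R15 would serve equally well.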
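The park-the-address pattern can be sketched with the scratch location as a parameter. `PopSegmentSketch` and `pop` are hypothetical names; passing `"255"` reproduces the choice made in this post, while `"R13"` would be the alternative using a general-purpose register.

```java
// Hypothetical sketch: pop for a base-pointer segment (LCL, ARG, THIS, or THAT).
final class PopSegmentSketch {
    static String pop(String base, int index, String scratch) {
        return "@" + base + "\nD=M\n"             // D = base address
             + "@" + index + "\nD=D+A\n"          // D = base + index (target address)
             + "@" + scratch + "\nM=D\n"          // park the target address
             + "@SP\nM=M-1\nA=M\nD=M\n"           // SP--, D = popped value
             + "@" + scratch + "\nA=M\nM=D\n";    // RAM[target] = D
    }
}
```

The address has to be parked because computing it uses D, and D is needed again to carry the popped value.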
The temp segment is much simpler to push and pop.
The address to read or write is simply 5 + i.
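A minimal sketch of the temp mapping, with hypothetical names `TempSketch`, `address`, `push`, and `pop`. The bounds check is an addition of this sketch: the temp segment has eight entries, RAM[5] through RAM[12].

```java
// Hypothetical sketch: temp i lives at the fixed address 5 + i.
final class TempSketch {
    static final int TEMP_BASE = 5;

    static int address(int index) {
        if (index < 0 || index > 7) {
            throw new IllegalArgumentException("temp index must be 0..7: " + index);
        }
        return TEMP_BASE + index;
    }

    static String push(int index) {
        return "@" + address(index) + "\nD=M\n@SP\nA=M\nM=D\n@SP\nM=M+1\n";
    }

    static String pop(int index) {
        // No scratch location is needed because the target address is a constant.
        return "@SP\nM=M-1\nA=M\nD=M\n@" + address(index) + "\nM=D\n";
    }
}
```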
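The pointer segment pushes the value of THIS or THAT onto the stack, or pops a value from the stack into THIS or THAT.
Index 0 operates on THIS and index 1 operates on THAT; the data is read or written at the address of THIS or THAT itself.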
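The pointer mapping can be sketched as follows. `PointerSketch`, `symbol`, `push`, and `pop` are hypothetical names, and rejecting indices other than 0 and 1 is a choice of this sketch.

```java
// Hypothetical sketch: pointer 0 is THIS and pointer 1 is THAT.
final class PointerSketch {
    static String symbol(int index) {
        switch (index) {
            case 0: return "THIS";
            case 1: return "THAT";
            default: throw new IllegalArgumentException("pointer index must be 0 or 1: " + index);
        }
    }

    static String push(int index) {
        // Pushes the pointer's own value, not the value it points to.
        return "@" + symbol(index) + "\nD=M\n@SP\nA=M\nM=D\n@SP\nM=M+1\n";
    }

    static String pop(int index) {
        return "@SP\nM=M-1\nA=M\nD=M\n@" + symbol(index) + "\nM=D\n";
    }
}
```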
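The static segment differs from the previous ones only in the address range it uses.
The assembly is essentially the same as before, except that the addresses start at 16, so static i lives at RAM[16 + i].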
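The fixed-address mapping can be sketched next to a symbolic variant. `StaticSketch` and its methods are hypothetical names. The `symbolic` form is not what this post does: it emits a `FileName.i` symbol and relies on the Hack assembler allocating new variable symbols from RAM[16] upward, which keeps static variables from different .vm files apart.

```java
// Hypothetical sketch: two ways to address "static i".
final class StaticSketch {
    // The approach used in this post: a fixed address starting at 16.
    static String fixedAddress(int index) {
        return "@" + (16 + index) + "\n";
    }

    // Alternative (an assumption of this sketch): a per-file symbol resolved by the assembler.
    static String symbolic(String fileName, int index) {
        return "@" + fileName + "." + index + "\n";
    }

    // target is the A-instruction produced by one of the methods above.
    static String push(String target) {
        return target + "D=M\n@SP\nA=M\nM=D\n@SP\nM=M+1\n";
    }

    static String pop(String target) {
        return "@SP\nM=M-1\nA=M\nD=M\n" + target + "M=D\n";
    }
}
```

For a single .vm file both forms behave the same as long as the static indices are first used in ascending order.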
Core code
Parser
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.Objects;
import java.util.Scanner;

public class Parser {
    private String command = null;
    private final Scanner scanner;
    private String cmd0 = null;
    private String cmd1 = null;
    private int cmd2;

    // Open the .vm file and get ready to read commands.
    public Parser(File file) throws FileNotFoundException {
        scanner = new Scanner(new FileReader(file));
    }

    // Read ahead to the next real command, skipping blank lines and comments.
    public boolean hasMoreCommands() {
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            int comment = line.indexOf("//");
            if (comment >= 0) {
                line = line.substring(0, comment); // drop the comment
            }
            command = line.trim();
            if (!command.isEmpty()) {
                return true;
            }
        }
        return false;
    }

    // Split the current command into its parts.
    public void advance() {
        String[] cmd = command.split("\\s+");
        cmd0 = cmd[0];
        if (cmd.length > 1) {
            cmd1 = cmd[1];
            if (cmd.length > 2) {
                cmd2 = Integer.parseInt(cmd[2]);
            }
        }
    }

    // pop is now recognized in addition to push and arithmetic/logical commands.
    public String commandType() {
        if (Objects.equals(cmd0, "push")) {
            return "C_PUSH";
        } else if (Objects.equals(cmd0, "pop")) {
            return "C_POP";
        } else {
            return "C_ARITHMETIC";
        }
    }

    // For an arithmetic command, the command itself; otherwise the segment name.
    public String arg1() {
        if (Objects.equals(commandType(), "C_ARITHMETIC"))
            return cmd0;
        return cmd1;
    }

    // The index of a push or pop command.
    public int arg2() {
        return cmd2;
    }

    public void close() {
        scanner.close();
    }
}
Code Writer
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.HashMap;
import java.util.Objects;

public class CodeWriter {
    // Push D onto the stack and increment SP.
    private static final String PUSH_D = "@SP\nA=M\nM=D\n@SP\nM=M+1\n";
    // Decrement SP and pop the top of the stack into D.
    private static final String POP_D = "@SP\nM=M-1\nA=M\nD=M\n";

    private final FileWriter asm;
    private String asmCommand;
    private final HashMap<String, String> vmToAsm = new HashMap<>();
    private int jump = 0; // counter that keeps comparison labels unique

    public CodeWriter(File file) throws IOException {
        asm = new FileWriter(file);
        // Pop y into D and leave A pointing at x, so that M = x and D = y.
        String fetch = "@SP\nM=M-1\nA=M\nD=M\nA=A-1\n";
        vmToAsm.put("add", fetch + "M=M+D\n");
        vmToAsm.put("sub", fetch + "M=M-D\n");
        vmToAsm.put("and", fetch + "M=M&D\n");
        vmToAsm.put("or", fetch + "M=M|D\n");
        // Comparisons write -1 (true) or 0 (false); TRUE and END are renamed per use.
        vmToAsm.put("gt", fetch + "D=M-D\n@TRUE\nD;JGT\n@SP\nA=M-1\nM=0\n@END\n0;JMP\n(TRUE)\n@SP\nA=M-1\nM=-1\n(END)\n");
        vmToAsm.put("eq", fetch + "D=M-D\n@TRUE\nD;JEQ\n@SP\nA=M-1\nM=0\n@END\n0;JMP\n(TRUE)\n@SP\nA=M-1\nM=-1\n(END)\n");
        vmToAsm.put("lt", fetch + "D=M-D\n@TRUE\nD;JLT\n@SP\nA=M-1\nM=0\n@END\n0;JMP\n(TRUE)\n@SP\nA=M-1\nM=-1\n(END)\n");
        // Unary operations modify the top of the stack in place.
        vmToAsm.put("neg", "D=0\n@SP\nA=M-1\nM=D-M\n");
        vmToAsm.put("not", "@SP\nA=M-1\nM=!M\n");
    }

    public void writeArithmetic(String vmCommand) throws IOException {
        asmCommand = vmToAsm.get(vmCommand);
        if (Objects.equals(vmCommand, "gt") || Objects.equals(vmCommand, "eq") || Objects.equals(vmCommand, "lt")) {
            // Append the counter so every comparison gets its own labels.
            asmCommand = asmCommand.replaceAll("TRUE", "TRUE" + jump);
            asmCommand = asmCommand.replaceAll("END", "END" + jump);
            jump++;
        }
        asm.write(asmCommand);
    }

    public void writePushPop(String cmd, String segment, int index) throws IOException {
        if (Objects.equals(cmd, "C_PUSH")) {
            if (Objects.equals(segment, "constant")) {
                asmCommand = "@" + index + "\nD=A\n" + PUSH_D;
            } else if (Objects.equals(segment, "local")) {
                asmCommand = "@LCL\nD=M\n@" + index + "\nA=D+A\nD=M\n" + PUSH_D;
            } else if (Objects.equals(segment, "argument")) {
                asmCommand = "@ARG\nD=M\n@" + index + "\nA=D+A\nD=M\n" + PUSH_D;
            } else if (Objects.equals(segment, "this")) {
                asmCommand = "@THIS\nD=M\n@" + index + "\nA=D+A\nD=M\n" + PUSH_D;
            } else if (Objects.equals(segment, "that")) {
                asmCommand = "@THAT\nD=M\n@" + index + "\nA=D+A\nD=M\n" + PUSH_D;
            } else if (Objects.equals(segment, "temp")) {
                asmCommand = "@" + (5 + index) + "\nD=M\n" + PUSH_D;     // temp i is RAM[5 + i]
            } else if (Objects.equals(segment, "pointer")) {
                if (index == 0) {
                    asmCommand = "@THIS\nD=M\n" + PUSH_D;
                } else {
                    asmCommand = "@THAT\nD=M\n" + PUSH_D;
                }
            } else if (Objects.equals(segment, "static")) {
                asmCommand = "@" + (16 + index) + "\nD=M\n" + PUSH_D;    // static i is RAM[16 + i]
            } else {
                throw new IllegalArgumentException("unknown segment: " + segment);
            }
        } else {
            // Pop: for base-pointer segments the target address is parked in RAM[255].
            if (Objects.equals(segment, "local")) {
                asmCommand = "@LCL\nD=M\n@" + index + "\nD=D+A\n@255\nM=D\n" + POP_D + "@255\nA=M\nM=D\n";
            } else if (Objects.equals(segment, "argument")) {
                asmCommand = "@ARG\nD=M\n@" + index + "\nD=D+A\n@255\nM=D\n" + POP_D + "@255\nA=M\nM=D\n";
            } else if (Objects.equals(segment, "this")) {
                asmCommand = "@THIS\nD=M\n@" + index + "\nD=D+A\n@255\nM=D\n" + POP_D + "@255\nA=M\nM=D\n";
            } else if (Objects.equals(segment, "that")) {
                asmCommand = "@THAT\nD=M\n@" + index + "\nD=D+A\n@255\nM=D\n" + POP_D + "@255\nA=M\nM=D\n";
            } else if (Objects.equals(segment, "temp")) {
                asmCommand = POP_D + "@" + (5 + index) + "\nM=D\n";
            } else if (Objects.equals(segment, "pointer")) {
                if (index == 0) {
                    asmCommand = POP_D + "@THIS\nM=D\n";
                } else {
                    asmCommand = POP_D + "@THAT\nM=D\n";
                }
            } else if (Objects.equals(segment, "static")) {
                asmCommand = POP_D + "@" + (16 + index) + "\nM=D\n";
            } else {
                throw new IllegalArgumentException("cannot pop to segment: " + segment);
            }
        }
        asm.write(asmCommand);
    }

    public void close() throws IOException {
        asm.close();
    }
}
Main
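The main function is unchanged.
Results: verification with BasicTest, PointerTest, and StaticTest, comparing the generated output files
We load the .tst file into the CPU Emulator and compare the .out file produced by the run with the supplied .cmp file. The BasicTest comparison is shown in the figure below; the translation succeeded.
The PointerTest comparison is shown in the figure below; the translation succeeded.
The StaticTest result is shown in the figure below; the second stage of the translator works.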