`while` and `break` Statements
This section introduces the `while` and `break` statements.
`while` Statement
Compared with the simple form of the `if` statement (without `elseif` and `else` branches), the `while` statement only adds an unconditional jump bytecode at the end of the inner block, jumping back to the beginning of the statement. This is the jump drawn on the left of the figure below:
/--->+----------------------+
|    | while condition do   |---\ skip the block if $condition is false
|    +----------------------+   |
|                               |
|        block                  |
\<----                          |
     +-----+                    |
     | end |                    |
     +-----+                    |
     <--------------------------/
The generated bytecode sequence has the following layout, where `...` represents the bytecode sequence of the inner block:
/-->  Test --\  `if` branch
|     ...    |
\---  Jump   |
        <----/  The end of the entire `while` statement
The parsing code likewise only adds an unconditional jump bytecode to what the `if` statement already does, so we skip it here. One thing does need to change: this unconditional jump goes backward. The parameter of the existing `Jump` bytecode was of type `u16`, which can only represent a forward jump. We now change it to `i16` and use a negative value to represent a backward jump:
pub enum ByteCode {
    Jump(i16),
Correspondingly, the execution part of the virtual machine needs to be modified as follows:
// unconditional jump
ByteCode::Jump(jmp) => {
    pc = (pc as isize + jmp as isize) as usize;
}
Rust is stricter about type conversions than C, so every cast has to be written out and the code looks more verbose.
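To see the signed offset at work, here is a minimal, self-contained sketch of a dispatch loop. It is not the interpreter's actual code: the `TestLess` and `Incr` bytecodes, the `run` function, and the single `counter` register are hypothetical stand-ins. One deliberate difference: the sketch increments `pc` right after fetching, so offsets are relative to the next bytecode. Assuming the interpreter increments `pc` after executing each bytecode, this yields the same jump targets, while keeping the intermediate value from going negative when a loop starts at index 0.

```rust
// Hypothetical, simplified bytecode set used only for this illustration.
#[derive(Debug, Clone, Copy)]
enum ByteCode {
    // Skip `offset` bytecodes forward when `counter < limit` is false.
    TestLess { limit: i64, offset: u16 },
    // Stand-in for the loop body: counter = counter + 1.
    Incr,
    // Unconditional relative jump; a negative offset jumps backward.
    Jump(i16),
}

// Runs the bytecode and returns the final counter value.
fn run(code: &[ByteCode]) -> i64 {
    let mut counter = 0i64;
    let mut pc = 0usize;
    while pc < code.len() {
        let instruction = code[pc];
        // Advance first, so offsets are relative to the next bytecode.
        pc += 1;
        match instruction {
            ByteCode::TestLess { limit, offset } => {
                if counter >= limit {
                    pc += offset as usize;
                }
            }
            ByteCode::Incr => counter += 1,
            ByteCode::Jump(jmp) => {
                // Same cast chain as in the text: widen to isize, add, convert back.
                pc = (pc as isize + jmp as isize) as usize;
            }
        }
    }
    counter
}

fn main() {
    // Roughly: while counter < 3 do counter = counter + 1 end
    let code = [
        ByteCode::TestLess { limit: 3, offset: 2 }, // 0: leave the loop when false
        ByteCode::Incr,                             // 1: loop body
        ByteCode::Jump(-3),                         // 2: back to index 0
    ];
    println!("counter = {}", run(&code));
}
```

The backward jump at index 2 uses `-3` rather than `-2` because, by the time the offset is applied, `pc` already points past the jump itself.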
`break` Statement
The `while` statement itself is simple, but it brings along another statement: `break`. The `break` statement is also simple in itself: it unconditionally jumps to the end of a block. The catch is that not every block supports `break`. For example, the block inside the `if` statement introduced earlier does not; only the block of a loop statement does. To be precise, `break` jumps out of the nearest enclosing loop block. Consider the following example:
while 123 do  -- outer loop block, supports `break`
  while true do  -- middle loop block, supports `break`
    a = a + 1
    if a < 10 then  -- inner block, does not support `break`
      break  -- `break` out of the `while true do` loop
    end
  end
end
The code contains three nested blocks. The outer and middle `while` blocks support `break`, while the inner `if` block does not, so the `break` here jumps out of the middle block.
If a `break` statement is not inside any loop block, it is a syntax error.
To implement this, a parameter could be added to the `block()` function to indicate the nearest loop block across the recursive calls. When the jump bytecode is generated, the block has not ended yet and the jump target is not known, so the jump bytecode is generated first with its parameter left blank, and the parameter is fixed up once the block ends. The parameter of `block()` would therefore be the list of indices of the `break` jump bytecodes in the nearest loop block. When calling the `block()` function:
- If it is a loop block, create a new index list as the call parameter, and after the call returns, use the current address (that is, the end position of the block) to fix up the bytecodes in the list;
- If it is not a loop block, pass the current list (that is, the list of the nearest enclosing loop block) as the call parameter.
However, `block()` does not call itself directly but through indirect recursion. Passing the parameter this way would require adding it to every parsing function, which is too complicated. So we put the index list into the global `ParseProto` instead, sacrificing locality for coding convenience.
Now let's look at the implementation. First, add the `break_blocks` field to `ParseProto`. Its type is a list of jump bytecode index lists, one list per enclosing loop block:
pub struct ParseProto<R: Read> {
    break_blocks: Vec<Vec<usize>>,
When parsing the `while` statement, push a new list before calling the `block()` function; after the call returns, fix up the jump bytecodes recorded in the list:
fn while_stat(&mut self) {
    // The handling of the condition is omitted here

    // Before calling block(), push a new list for this loop's `break` jumps
    self.break_blocks.push(Vec::new());

    // Call block()
    assert_eq!(self.block(), Token::End);

    // After block() returns, pop the list just pushed and fix up the jump
    // bytecodes in it. `iend` is defined in the omitted code.
    for i in self.break_blocks.pop().unwrap() {
        self.byte_codes[i] = ByteCode::Jump((iend - i) as i16);
    }
}
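The offset `iend - i` is easier to check with concrete numbers. The sketch below makes two assumptions that the excerpt does not state: that `iend` is the index of the loop's last bytecode (the backward jump), and that the virtual machine adds the offset to `pc` and then increments `pc` by one. The `Nop` bytecode and the `jump_target` and `patch_breaks` helpers are hypothetical names introduced only for this illustration.

```rust
// Hypothetical, simplified bytecode set used only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ByteCode {
    // Stand-in for any bytecode that is not a jump.
    Nop,
    // Unconditional relative jump; a negative offset jumps backward.
    Jump(i16),
}

// Index where execution continues after a `Jump(offset)` located at index `i`,
// assuming the VM adds the offset to `pc` and then increments `pc` by one.
fn jump_target(i: usize, offset: i16) -> usize {
    (i as isize + offset as isize + 1) as usize
}

// Overwrite every `break` placeholder so that it leaves the loop ending at `iend`.
fn patch_breaks(byte_codes: &mut [ByteCode], breaks: &[usize], iend: usize) {
    for &i in breaks {
        byte_codes[i] = ByteCode::Jump((iend - i) as i16);
    }
}

fn main() {
    let mut byte_codes = vec![
        ByteCode::Nop,      // 0: start of the loop
        ByteCode::Nop,      // 1
        ByteCode::Jump(0),  // 2: `break` placeholder
        ByteCode::Nop,      // 3
        ByteCode::Jump(0),  // 4: `break` placeholder
        ByteCode::Jump(-6), // 5: backward jump to index 0 (this is `iend`)
        ByteCode::Nop,      // 6: first bytecode after the loop
    ];
    let iend = 5;
    patch_breaks(&mut byte_codes, &[2, 4], iend);

    // Both patched jumps continue at the first bytecode after the loop.
    for i in [2, 4] {
        if let ByteCode::Jump(offset) = byte_codes[i] {
            println!("break at {} -> continues at {}", i, jump_target(i, offset));
        }
    }
}
```

Under these assumptions, every patched `break` continues at index `iend + 1`, regardless of where in the loop body it was emitted.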
With the loop block prepared, the `break` statement can be implemented:
fn break_stat(&mut self) {
    // Get the jump index list of the nearest loop block
    if let Some(breaks) = self.break_blocks.last_mut() {
        // Generate a jump bytecode placeholder; its parameter is filled in later
        self.byte_codes.push(ByteCode::Jump(0));
        // Record the placeholder's index in the list
        breaks.push(self.byte_codes.len() - 1);
    } else {
        // Syntax error if there is no enclosing loop block
        panic!("break outside loop");
    }
}
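To show how the stack of index lists resolves nesting, here is a self-contained sketch that generates bytecode from a tiny hand-built statement tree instead of parsing Lua source. Everything in it is hypothetical scaffolding rather than the interpreter's code: the `Stmt` tree, the `Gen` struct, the simplified `Test` and `Other` bytecodes, and the `generate` function. It also reports a misplaced `break` through `Result` instead of panicking, and assumes a VM that adds the offset to `pc` and then increments it by one.

```rust
// Hypothetical, simplified bytecode set used only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ByteCode {
    Test(u16), // stand-in condition check: jump forward when false
    Other,     // stand-in for any ordinary statement
    Jump(i16), // unconditional relative jump
}

// Hypothetical statement tree replacing the real parser's token stream.
enum Stmt {
    While(Vec<Stmt>),
    If(Vec<Stmt>),
    Break,
    Other,
}

struct Gen {
    byte_codes: Vec<ByteCode>,
    // One list of `break` placeholder indices per enclosing loop block.
    break_blocks: Vec<Vec<usize>>,
}

impl Gen {
    fn block(&mut self, stmts: &[Stmt]) -> Result<(), String> {
        for stmt in stmts {
            match stmt {
                Stmt::Other => self.byte_codes.push(ByteCode::Other),
                Stmt::If(body) => {
                    // Not a loop block: inner `break`s still use the current list.
                    let itest = self.byte_codes.len();
                    self.byte_codes.push(ByteCode::Test(0));
                    self.block(body)?;
                    let iend = self.byte_codes.len() - 1;
                    self.byte_codes[itest] = ByteCode::Test((iend - itest) as u16);
                }
                Stmt::While(body) => {
                    let istart = self.byte_codes.len();
                    self.byte_codes.push(ByteCode::Test(0));
                    // Loop block: open a fresh list for its `break`s.
                    self.break_blocks.push(Vec::new());
                    self.block(body)?;
                    // Backward jump to the condition check.
                    let iend = self.byte_codes.len();
                    self.byte_codes.push(ByteCode::Jump(-((iend - istart) as i16) - 1));
                    self.byte_codes[istart] = ByteCode::Test((iend - istart) as u16);
                    // Close the list and patch every recorded placeholder.
                    for i in self.break_blocks.pop().expect("pushed above") {
                        self.byte_codes[i] = ByteCode::Jump((iend - i) as i16);
                    }
                }
                Stmt::Break => match self.break_blocks.last_mut() {
                    Some(breaks) => {
                        breaks.push(self.byte_codes.len());
                        self.byte_codes.push(ByteCode::Jump(0));
                    }
                    None => return Err("break outside loop".to_string()),
                },
            }
        }
        Ok(())
    }
}

fn generate(program: &[Stmt]) -> Result<Vec<ByteCode>, String> {
    let mut generator = Gen { byte_codes: Vec::new(), break_blocks: Vec::new() };
    generator.block(program)?;
    Ok(generator.byte_codes)
}

fn main() {
    // Same shape as the Lua example: while { while { stmt; if { break } } }
    let program = vec![Stmt::While(vec![Stmt::While(vec![
        Stmt::Other,
        Stmt::If(vec![Stmt::Break]),
    ])])];
    println!("{:?}", generate(&program));
    println!("{:?}", generate(&[Stmt::Break]));
}
```

In the nested program, the `break` placeholder at index 4 is recorded in the middle loop's list even though it sits inside the `if` block, so it is patched when the middle loop closes and continues at the outer loop's backward jump.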
`continue` Statement?
With the `break` statement implemented, the `continue` statement naturally comes to mind. Its implementation is similar to that of `break`; the difference is that one jumps to the end of the loop and the other jumps to the beginning, so adding it would be easy. But Lua does not support the `continue` statement! A small part of the reason has to do with the `repeat..until` statement. We discuss the `continue` statement in more detail after introducing the `repeat..until` statement in the next section.