while and break Statements

This section introduces the while and break statements.

while Statement

Compared with the simple form of the if statement (without elseif and else branches), the while statement only adds an unconditional jump bytecode at the end of the inner block, which jumps back to the beginning of the statement. This is the jump on the left in the figure below:

/--->+----------------------+
|    | while condition then |---\ skip the block if $condition is false
|    +----------------------+   |
|                               |
|        block                  |
\<----                          |
     +-----+                    |
     | end |                    |
     +-----+                    |
     <--------------------------/

The format of the final generated bytecode sequence is as follows, where ... represents the bytecode sequence of the inner code block:

/-->  Test --\  `if` branch
|     ...    |
\---  Jump   |
        <----/ The end of the entire `while` statement

Parsing likewise only adds an unconditional jump bytecode to what the if statement already does, so we skip the code here. One thing does need to change: this unconditional jump goes backward. The parameter of the Jump bytecode was previously of type u16, which can only jump forward. We now change it to i16 and use a negative number to represent a backward jump:

pub enum ByteCode {
     Jump(i16),

Correspondingly, the execution part of the virtual machine needs to be modified as follows:

         // unconditional jump
         ByteCode::Jump(jmp) => {
             pc = (pc as isize + jmp as isize) as usize;
         }

Compared with C, Rust's type handling is stricter: the conversions between usize, isize and i16 must be written out explicitly, so the code looks more verbose.
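The effect of the signed offset can be seen in a minimal, self-contained sketch of a dispatch loop. This is not the interpreter's actual code: the `Op` enum, the `run` and `countdown_loop` functions, and the `JumpIfZero` and `Dec` instructions are hypothetical names made up for illustration. The sketch also assumes that a jump offset is applied to the index of the following instruction.

```rust
// Hypothetical miniature instruction set, for illustration only.
#[derive(Clone, Copy)]
enum Op {
    // Stands in for `Test`: skip forward by the offset when the counter is 0.
    JumpIfZero(i16),
    // Stands in for the loop body: decrement the counter.
    Dec,
    // Unconditional jump; a negative offset jumps backward.
    Jump(i16),
}

// Executes `code` and returns how many times the loop body ran.
fn run(code: &[Op], mut counter: u32) -> u32 {
    let mut pc: usize = 0;
    let mut iterations = 0;
    while pc < code.len() {
        // Assumption: offsets are applied to the index of the following instruction.
        let mut next = pc + 1;
        match code[pc] {
            Op::JumpIfZero(jmp) => {
                if counter == 0 {
                    next = (next as isize + jmp as isize) as usize;
                }
            }
            Op::Dec => {
                counter -= 1;
                iterations += 1;
            }
            Op::Jump(jmp) => {
                next = (next as isize + jmp as isize) as usize;
            }
        }
        pc = next;
    }
    iterations
}

// Bytecode for the equivalent of `while counter ~= 0 do counter = counter - 1 end`.
fn countdown_loop() -> Vec<Op> {
    vec![
        Op::JumpIfZero(2), // index 0: when the counter is 0, continue at 1 + 2 = 3 (the end)
        Op::Dec,           // index 1: loop body
        Op::Jump(-3),      // index 2: continue at 3 - 3 = 0 (the condition)
    ]
}
```

Calling `run(&countdown_loop(), 3)` runs the body three times and then leaves the loop through the forward jump. Applying the offset to `pc + 1` is a choice made in this sketch; it keeps the intermediate value non-negative even when the loop starts at index 0.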

break Statement

The while statement itself is simple, but it brings in another statement: break. The break statement is also simple: it unconditionally jumps to the end of a block. The complication is that not every block supports break. The block inside the if statement introduced earlier does not; only the block of a loop statement does. To be precise, break jumps out of the nearest enclosing loop block. Consider the following example:

while 123 do -- outer loop block, support `break`
     while true do -- middle-level loop block, support `break`
         a = a + 1
         if a < 10 then -- inner block, does not support `break`
             break -- break out of the `while true do` loop
         end
     end
end

The code has 3 nested blocks: the outer and middle while blocks support break, and the inner if block does not. The break here therefore jumps out of the middle block.

If the break statement is not within a loop block, it is a syntax error.

To implement this, a parameter could be added to the block() function that identifies the nearest loop block during recursive calls. When the jump bytecode for a break is generated, the block has not ended yet, so the jump destination is not known. The jump bytecode can therefore only be generated as a placeholder with its parameter left blank, and the parameter is fixed up at the end of the block. So the parameter of the block() function would be the list of indexes of the break jump bytecodes in the nearest loop block. When calling the block() function:

  • If it is a loop block, create a new index list as the call parameter, and after the call returns, use the current address (that is, the end position of the block) to fix up the bytecodes in the list;
  • If it is not a loop block, pass the current list (that is, the list of the nearest enclosing loop block) as the call parameter.

However, block() does not call itself directly; the recursion is indirect. Passing the list as a parameter would mean adding it to every parsing function, which is too cumbersome. So the index list is put into the global ParseProto instead. This sacrifices locality for coding convenience.

Let's look at the implementation. First, add the break_blocks field to ParseProto. Its type is a list of "jump bytecode index lists":

pub struct ParseProto<R: Read> {
     break_blocks: Vec<Vec<usize>>,

When parsing the while statement, push a new list before calling the block() function; after the call, fix up the jump bytecodes in the list:

     fn while_stat(&mut self) {

         // The processing of the condition expression is omitted here

         // Before calling block(), push a new list for this loop
         self.break_blocks.push(Vec::new());

         // Call block()
         assert_eq!(self.block(), Token::End);

         // After calling block(), pop the list just pushed and fix up the jump
         // bytecodes in it. `iend` is defined in code omitted from this excerpt.
         for i in self.break_blocks.pop().unwrap().into_iter() {
             self.byte_codes[i] = ByteCode::Jump((iend - i) as i16);
         }
     }
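The operands written during this fix-up are plain index arithmetic, which can be sketched in isolation. The `jump_offset` helper below is hypothetical and not part of the interpreter. It assumes that the virtual machine advances `pc` by one after executing every bytecode, jumps included, so a jump stored at index `from` continues at index `to` when its operand is `to - from - 1`. It also uses a checked conversion, because an `as i16` cast silently wraps when the distance does not fit.

```rust
use std::convert::TryFrom;

// Hypothetical helper: operand for a jump stored at index `from` that should
// continue execution at index `to`. Assumes the VM adds 1 to `pc` after every
// bytecode, including jumps. Returns None when the distance does not fit in i16.
fn jump_offset(from: usize, to: usize) -> Option<i16> {
    let delta = to as i64 - from as i64 - 1;
    i16::try_from(delta).ok()
}
```

Under that assumption, a break placeholder at index 5 in a loop whose closing bytecode sits at index 9 must continue at index 10, and `jump_offset(5, 10)` yields `Some(4)`, the same value as `iend - i` in the loop above if `iend` is 9. A backward jump from index 2 to index 0 yields `Some(-3)`.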

With the loop block prepared, the break statement can be implemented:

     fn break_stat(&mut self) {
         // Get the bytecode list of the nearest loop block
         if let Some(breaks) = self.break_blocks.last_mut() {
             // Generate a jump bytecode placeholder, the parameter is left blank
             self.byte_codes.push(ByteCode::Jump(0));
             // Append its index to the list
             breaks.push(self.byte_codes.len() - 1);
         } else {
             // Syntax error if there is no loop block
             panic!("break outside loop");
         }
     }
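How the stack of lists behaves with nested loops can be tried out in a small, self-contained sketch. The `Emitter` type, its methods, the `Nop` bytecode and the `nested_example` function are hypothetical stand-ins and not the parser's real API. Loop conditions are left out, and jump operands assume that the virtual machine adds 1 to `pc` after every bytecode. Unlike the code above, the sketch returns an error instead of panicking.

```rust
#[derive(Debug, PartialEq)]
enum ByteCode {
    Nop, // stands in for any non-jump bytecode
    Jump(i16),
}

// Hypothetical stand-in for the parts of ParseProto used here.
struct Emitter {
    byte_codes: Vec<ByteCode>,
    break_blocks: Vec<Vec<usize>>,
}

impl Emitter {
    fn new() -> Self {
        Emitter { byte_codes: Vec::new(), break_blocks: Vec::new() }
    }

    fn nop(&mut self) {
        self.byte_codes.push(ByteCode::Nop);
    }

    // Opens a loop block and returns the index where the loop starts.
    fn enter_loop(&mut self) -> usize {
        self.break_blocks.push(Vec::new());
        self.byte_codes.len()
    }

    // Emits a placeholder jump and records it in the nearest loop's list.
    // Blocks that are not loops (such as `if`) never push a list, so a
    // `break` inside them lands in the list of the enclosing loop.
    fn emit_break(&mut self) -> Result<(), &'static str> {
        match self.break_blocks.last_mut() {
            Some(breaks) => {
                self.byte_codes.push(ByteCode::Jump(0));
                breaks.push(self.byte_codes.len() - 1);
                Ok(())
            }
            None => Err("break outside loop"),
        }
    }

    // Emits the backward jump to `istart` and fixes up this loop's breaks.
    fn leave_loop(&mut self, istart: usize) {
        let iend = self.byte_codes.len();
        self.byte_codes.push(ByteCode::Jump(-((iend - istart) as i16) - 1));
        let breaks = self.break_blocks.pop().expect("leave_loop without enter_loop");
        for i in breaks {
            self.byte_codes[i] = ByteCode::Jump((iend - i) as i16);
        }
    }
}

// Two nested loops, each containing one `break`.
fn nested_example() -> Vec<ByteCode> {
    let mut e = Emitter::new();
    let outer = e.enter_loop();
    e.nop();                           // index 0
    let inner = e.enter_loop();
    e.nop();                           // index 1
    e.emit_break().unwrap();           // index 2: leaves the inner loop
    e.nop();                           // index 3
    e.leave_loop(inner);               // index 4: jumps back to index 1
    e.emit_break().unwrap();           // index 5: leaves the outer loop
    e.leave_loop(outer);               // index 6: jumps back to index 0
    e.byte_codes
}
```

In `nested_example`, the break at index 2 is patched to `Jump(2)` and continues at index 5, just past the inner loop, while the break at index 5 is patched to `Jump(1)` and continues at index 7, the end of the code. Calling `emit_break` on a fresh `Emitter` returns `Err("break outside loop")`.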

continue Statement?

After implementing the break statement, the continue statement naturally comes to mind. Its implementation would be similar to break: the difference is that break jumps to the end of the loop while continue jumps to its beginning, so adding it would be easy. But Lua does not support the continue statement! A small part of the reason has to do with the repeat..until statement. We discuss continue in more detail after introducing repeat..until in the next section.