repeat..until and continue Statements

This section introduces the repeat..until statement, and then discusses, and attempts to add, the continue statement, which the Lua language does not support.

repeat..until Statement

The repeat..until statement is similar to the while statement, except that the condition is placed after the block, which guarantees that the block is executed at least once.

     +--------+
     | repeat |
     +--------+
/--->
|        block
|
|    +-----------------+
\----| until condition |
     +-----------------+

The format of the final generated bytecode sequence is as follows, where ... represents the bytecode sequence of the inner code block:

     ... <--\
     Test ---/ `until` judgment condition

Compared with the bytecode sequence of the while statement, it looks as if the Test has simply been moved to the end, taking the place of the original Jump bytecode. But the situation is not that simple! Placing the condition after the block introduces a big problem: local variables defined in the block may be used in the condition. Take the following example:

-- keep retrying until the request succeeds
repeat
     local ok = request_xxx()
until ok

The variable ok in the until condition on the last line is obviously intended to refer to the local variable defined inside the loop body. However, the block-parsing function block() from the previous sections removes the locals defined inside a block before it returns. That is, with the existing parsing logic, by the time until is parsed the local variable ok is already out of scope and cannot be used. This is clearly unacceptable.

To allow the until condition to read the block's local variables, the original block() function has to be modified (code always gets messed up by strange requirements like this) so that the scope control of local variables is separated out. A new block_scope() function is added which only does the syntax analysis, while the outer block() function takes care of the scope of the internal local variables. This way, the existing callers of block() (the if statement, the while statement, and so on) do not need to change, and only the special repeat..until statement calls block_scope() directly for finer control. The code is as follows:

     fn block(&mut self) -> Token {
         let nvar = self.locals.len();
         let end_token = self.block_scope();
         self.locals.truncate(nvar); // expire internal local variables
         return end_token;
     }
     fn block_scope(&mut self) -> Token {
         ... // The original block parsing process
     }

Then, the analysis code of the repeat..until statement is as follows:

     fn repeat_stat(&mut self) {
         let istart = self.byte_codes.len();

         self.push_break_block();

         let nvar = self.locals.len(); // Internal local variable scope control!

         assert_eq!(self.block_scope(), Token::Until);

         let icond = self.exp_discharge_top();

         // expire internal local variables AFTER condition exp.
         self.locals.truncate(nvar); // Internal local variable scope control!

         let iend = self.byte_codes.len();
         self.byte_codes.push(ByteCode::Test(icond as u8, -((iend - istart + 1) as i16)));

         self.pop_break_block();
     }

In the above code, the two lines marked with the "Internal local variable scope control!" comment do the scope handling that block() normally does. The local variables defined in the block are removed only after exp_discharge_top() has parsed the condition expression.
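
To make the offset arithmetic concrete, here is a minimal sketch of a toy virtual machine, not this project's real one. The Incr and GreaterEq bytecodes, the integer registers, and the build_repeat() and run() helpers are all made up for illustration; only Test and the offset formula come from the code above. It assumes that the VM adds 1 to pc after every instruction, including a taken jump.

```rust
// Toy bytecodes. Only `Test` mirrors the text; the others are hypothetical.
#[derive(Debug, Clone, Copy)]
enum ByteCode {
    Incr(usize),                  // loop body stand-in: regs[r] += 1
    GreaterEq(usize, usize, i64), // condition stand-in: regs[dst] = regs[src] >= limit
    Test(usize, i16),             // if regs[r] is false (0), jump by the offset
}

// Emit `repeat <body> until <cond>` the way repeat_stat() lays it out.
fn build_repeat(limit: i64) -> Vec<ByteCode> {
    let mut codes = Vec::new();
    let istart = codes.len(); // first bytecode of the body
    codes.push(ByteCode::Incr(0)); // body
    codes.push(ByteCode::GreaterEq(1, 0, limit)); // condition, result in register 1
    let iend = codes.len(); // index of the Test itself
    codes.push(ByteCode::Test(1, -((iend - istart + 1) as i16)));
    codes
}

// Run the program and return how many times the body was executed.
fn run(codes: &[ByteCode]) -> i64 {
    let mut regs = [0i64; 2];
    let mut pc: isize = 0;
    while (pc as usize) < codes.len() {
        match codes[pc as usize] {
            ByteCode::Incr(r) => regs[r] += 1,
            ByteCode::GreaterEq(dst, src, limit) => regs[dst] = (regs[src] >= limit) as i64,
            ByteCode::Test(r, jmp) => {
                if regs[r] == 0 {
                    pc += jmp as isize; // lands one slot before the body...
                }
            }
        }
        pc += 1; // ...and this step brings pc to the body's first bytecode
    }
    regs[0]
}
```

With these definitions, run(&build_repeat(3)) executes the body three times, and run(&build_repeat(0)) still executes it once, because the condition is only tested after the body.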

continue statement

So much space was spent on variable scope in the repeat..until statement because it has a lot to do with the continue statement, which does not exist in Lua.

When the break statement was implemented in the previous section, it was mentioned that Lua does not support the continue statement. This has been debated a lot, and there is strong demand for adding continue to Lua. As early as 2012 there was a proposal that listed in detail the advantages and disadvantages of adding the continue statement, along with the related discussions. About ten years have passed, and even though the stubborn Lua added the goto statement in version 5.2, it still has not added the continue statement.

The "Unofficial FAQ" explains this:

  • The continue statement is just one of many control statements; similar ones include goto, break with a label, and so on. There is nothing special about continue, so there is no need to add it;
  • It conflicts with the existing repeat..until statement.

In addition, an email from Roberto, the author of Lua, better represents the official attitude. The reason it gives is the first point above: the continue statement is just one of many control statements. Interestingly, the email gives two examples: one is continue, and the other happens to be repeat..until. The unofficial FAQ above also mentions that these two statements conflict.

The two statements conflict because a continue statement inside a repeat..until block jumps to the until condition. If the condition uses a local variable defined in the block, and the continue statement skips that definition on its way to until, then the variable is meaningless in the condition. For example:

repeat
     continue -- jump to until, skip the definition of `ok`
     local ok = request_xxx()
until ok -- how to deal with `ok` here?

In contrast, the C equivalent of repeat..until, the do..while statement, does support continue. This is because in C's do..while statement the condition after while is outside the scope of the loop body. For example, the following code fails to compile:

     do {
         bool ok = request_xx();
     } while (ok); // error: 'ok' undeclared

Such a rule (the condition is outside the scope of the loop body) is inconvenient in some cases (such as the example above), but there is a very simple workaround (move the definition of ok outside the loop), and the syntax analysis is simpler; for example, there is no need to split out a block_scope() function. Then why does Lua specify that the condition is inside the body's scope? My guess is as follows. If Lua followed C's practice (the condition is outside the scope of the loop body) and the user wrote the following Lua code, the ok after until would be parsed as a global variable, and no error would be reported as it is in C! That is not what the user intended, and it would cause a serious bug.

repeat
     local ok = request_xxx()
until ok

To sum up, the repeat..until statement has to put the condition after until inside the scope of the loop body in order to avoid a very likely bug; but then a continue statement that jumps to the condition may skip the definition of a local variable, and that is the conflict.
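
The difference between the two scoping rules can be sketched with a toy name resolver. This is a simplified illustration, not the interpreter's real code: the Var enum and the resolve() and until_condition() functions are hypothetical, and the locals list holds plain strings. It assumes, as in Lua, that a name which is not a known local is treated as a global.

```rust
#[derive(Debug, PartialEq)]
enum Var {
    Local(usize),   // index into the locals list
    Global(String), // not a known local, so looked up as a global by name
}

// Search the locals from innermost to outermost; unknown names become globals.
fn resolve(locals: &[String], name: &str) -> Var {
    match locals.iter().rposition(|v| v == name) {
        Some(i) => Var::Local(i),
        None => Var::Global(name.to_string()),
    }
}

// Model parsing `repeat local ok = ... until ok` and report how the `ok`
// in the condition is resolved. `expire_before_cond` selects the C-like
// rule, where the body's locals are dropped before the condition is parsed.
fn until_condition(expire_before_cond: bool) -> Var {
    let mut locals: Vec<String> = Vec::new();
    let nvar = locals.len();

    locals.push("ok".to_string()); // `local ok = request_xxx()`

    if expire_before_cond {
        locals.truncate(nvar); // C-like: the body's scope ends here
    }
    let var = resolve(&locals, "ok"); // parse the `until ok` condition
    locals.truncate(nvar); // Lua: the body's scope ends after the condition
    var
}
```

Here until_condition(false) yields Var::Local(0), which is what the user meant, while until_condition(true) silently yields a global named ok.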

Try Adding continue Statement

Lua's official reason for not supporting the continue statement is mainly that continue is considered too rarely used to be worth supporting. But in my personal programming experience, in Lua or in other languages, continue is used quite frequently. It may not be as common as break, but it is far more common than goto or break with labels, and even more common than the repeat..until statement. Besides, the existing ways to emulate continue in Lua (repeat..until true plus break, or goto) are more verbose than using continue directly. So can we add a continue statement to our interpreter?

First of all, we have to resolve the conflict with repeat..until mentioned above. There are several solutions:

  • Make a rule that repeat..until does not support continue, just as the if statement does not. But this easily leads to misunderstanding. For example, suppose a piece of code has two nested loops, an outer while loop and an inner repeat loop. The user writes a continue statement in the inner loop, intending it to apply to the inner repeat loop; but because repeat does not support continue, it applies to the outer while loop instead and continues that loop. This is a serious potential bug.

  • Make a rule that the continue statement is prohibited in repeat..until. If there is continue, an error will be reported. This can avoid the potential bugs of the above scheme, but this prohibition is too strict.

  • Make a rule that the continue statement is prohibited if repeat..until defines any internal local variable. This is a little more relaxed than the previous solution, but it can be relaxed further.

  • Make a rule that no local variable may be defined in a repeat..until block after a continue statement; in other words, continue must not jump over a local variable definition. This is similar to the restriction on the goto statement, which cannot jump into the scope of a local variable. However, it can be relaxed even further.

  • On top of the previous solution, prohibit only the case where a local variable defined after the continue statement is actually used in the until condition. However, determining whether the condition uses such a local variable is very complicated, and once function closures and Upvalues are supported later, it becomes basically impossible. So this solution is not feasible.

In the end, I chose the second-to-last solution. As for the implementation, ParseProto already has break_blocks to record break statements; now a similar continue_blocks is added, whose elements are (icode, nvar) pairs. The first field, icode, is the same as the elements of break_blocks: it records the position of the Jump bytecode generated for the continue statement so that it can be fixed up later. The second field, nvar, is the number of local variables at the point of the continue statement, which is used later to check whether the jump skips a new local variable definition.

Second, adding a continue statement must not break existing code. The straightforward way to support continue is to make it a keyword (like break), but then a lot of existing Lua code that uses continue as a label, or even as a variable or function name (a function name is essentially a variable name), would fail to parse. A tricky workaround is to not make continue a keyword; instead, when parsing a statement, if it starts with continue and is followed by a block-ending Token (such as end), it is treated as a continue statement. Everywhere else, continue is still interpreted as an ordinary Name.

In block_scope(), in the branch for statements starting with Token::Name, the newly added code is as follows:

         loop {
             match self.lex.next() {
                 // Omit parsing of other types of statements
                 t@Token::Name(_) | t@Token::ParL => {
                     // this is not standard!
                     if self.try_continue_stat(&t) { // !! New !!
                         continue;
                     }

                     // The following omits the parsing of standard 
                     // function calls and variable assignment statements
                 }

The try_continue_stat() function is defined as follows:

     fn try_continue_stat(&mut self, name: &Token) -> bool {
         if let Token::Name(name) = name {
             if name.as_str() != "continue" { // the statement must start with `continue`
                 return false;
             }
             if !matches!(self.lex.peek(), Token::End | Token::Elseif | Token::Else) {
                 return false; // and be followed by one of these 3 block-ending Tokens
             }

             // Then, it's the `continue` statement. The following processing
             // is similar to the break statement processing
             if let Some(continues) = self.continue_blocks.last_mut() {
                 self.byte_codes.push(ByteCode::Jump(0));
                 continues.push((self.byte_codes.len() - 1, self.locals.len()));
             } else {
                 panic!("continue outside loop");
             }
             true
         } else {
             false
         }
     }

Before the loop body is parsed, push_loop_block() must be called to prepare. After the block ends, pop_loop_block() fixes up the breaks and continues. A break jumps to the end of the block, that is, the current position. The target of a continue depends on the kind of loop (for example, a while loop jumps to the beginning of the loop, while a repeat loop jumps to the condition at the end of the loop), so it is passed in as a parameter. In addition, when processing the continues, it is necessary to check whether any new local variable has been defined since, by comparing the current number of local variables with the number recorded at the continue statement.

     // before entering loop block
     fn push_loop_block(&mut self) {
         self.break_blocks.push(Vec::new());
         self.continue_blocks.push(Vec::new());
     }

     // after leaving loop block, fix `break` and `continue` Jumps
     fn pop_loop_block(&mut self, icontinue: usize) {
         // breaks
         let iend = self.byte_codes.len() - 1;
         for i in self.break_blocks.pop().unwrap().into_iter() {
             self.byte_codes[i] = ByteCode::Jump((iend - i) as i16);
         }

         // continues
         let end_nvar = self.locals.len();
         for (i, i_nvar) in self.continue_blocks.pop().unwrap().into_iter() {
             if i_nvar < end_nvar {
                 // i_nvar is the number of local variables at the
                 // `continue` statement, end_nvar is the number of
                 // current local variables
                 panic!("`continue` jump into local scope");
             }
             self.byte_codes[i] = ByteCode::Jump((icontinue as isize - i as isize) as i16 - 1);
         }
     }
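
As a minimal, self-contained sketch of the continue fix-up, the function below reimplements the second half of pop_loop_block() on plain slices. The name patch_continues(), the Nop bytecode, and returning a Result instead of panicking are assumptions made for illustration. It also assumes that end_nvar is measured while the loop body's locals are still in scope, and that the VM adds 1 to pc after a jump.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum ByteCode {
    Nop,       // hypothetical stand-in for any ordinary instruction
    Jump(i16), // relative jump; the VM is assumed to add 1 to pc afterwards
}

// Patch the placeholder Jump of every recorded `continue`.
// `continues` holds (icode, nvar) pairs: the index of the placeholder and
// the number of locals that existed at the `continue` statement.
// `icontinue` is the index the loop wants `continue` to land on.
fn patch_continues(
    byte_codes: &mut [ByteCode],
    continues: &[(usize, usize)],
    icontinue: usize,
    end_nvar: usize,
) -> Result<(), String> {
    for &(i, i_nvar) in continues {
        if i_nvar < end_nvar {
            // A local was defined after this `continue`, so the jump
            // would skip its definition.
            return Err(format!("continue at {} jumps into local scope", i));
        }
        // The extra -1 compensates for the VM's pc += 1 after the jump.
        byte_codes[i] = ByteCode::Jump((icontinue as isize - i as isize) as i16 - 1);
    }
    Ok(())
}
```

For a while loop whose condition starts at index 0, a continue recorded as (2, 0) is patched to Jump(-3). For a repeat loop, a continue recorded before a local definition, such as (1, 0) with end_nvar equal to 1, is rejected, which corresponds to the last test case below.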

So far, we have implemented the continue statement while ensuring backward compatibility! You can use the following code to test:

-- validate compatibility
continue = print -- continue as global variable name, and assign it a value
continue(continue) -- call continue as function

-- continue in while loop
local c = true
while c do
    print "hello, while"
    if true then
      c = false
      continue
    end
    print "should not print this!"
end

-- continue in repeat loop
repeat
    print "hello, repeat"
    local ok = true
    if true then
      continue -- continue after local
    end
    print "should not print this!"
until ok

-- continue skip local in repeat loop
-- PANIC!
repeat
    print "hello, repeat again"
    if true then
      continue -- skip `ok`!!! error in parsing
    end
    local ok = true
until ok

Should repeat..until Exist?

As can be seen above, because the scope of the local variables defined in the block has to extend into the until part, the existence of the repeat..until statement introduces two problems:

  • The implementation needs a separate block_scope() function;
  • It conflicts with the continue statement.

I personally think that introducing these two problems just to support a rarely used statement like repeat..until is not worth it. If I were designing the Lua language, I would not include this statement.

Section 8.4 (Exercises) of the official book "Programming in Lua (4th Edition)" poses the following question:

Exercise 8.3: Many people think that, because repeat-until is rarely used, it should not be present in a minimalistic programming language like Lua. What do you think?

I really want to know the author's answer to this question, but unfortunately, none of the exercises in this book give an answer.