Logical Operations in Conditional Judgment

Logical operations include 3: and, or, and not. The last not is a unary operation, which has been introduced in the previous section Unary Operation. This chapter only introduces the first two and and or.

Then why not introduce and and or in the previous binary operation section? Because of "short circuit"! In mainstream programming languages (such as C, Rust), logical operations are short-circuited. For example, for the AND operation, if the first operand is false, then there is no need (and cannot) to execute or check the second operand. For example, the statement is_valid() and count(), if the return value of is_valid() is false, then the subsequent count() cannot be executed. Therefore, the execution process of logical operations is: 1. First judge the left operand, 2. If it is false, exit, 3. Otherwise judge the right operand. While the execution process of the binary arithmetic operation is: 1. First find the left operand, 2. Then find the right operand, 3. Finally calculate. It can be seen that the flow of logical operations is different from that of arithmetic operations, so the previous methods cannot be applied.

Before introducing the logic operation in detail, let's look at two usage scenarios of logic operations:

  1. As a judgment condition, such as the judgment condition statement in if, while and other statements in the previous chapter, such as if t and t.k then ... end;
  2. Evaluation, such as print(v>0 and v or -v).

In fact, the first scenario can be regarded as a special case of the second scenario. For example, the above if statement example is equivalent to the following code:

local tmp = t and t.k
if tmp then
     ...
end

It is to first evaluate the operation statement t and t.k, then put the value into a temporary variable, and finally judge whether the value is true or false to decide whether to jump. However, here we don't actually care whether the specific evaluation result is t or t.k, but only care about true or false, so we can save the temporary variable! As you can see below, the omission of temporary variables can save a bytecode, which is a great optimization. Since most applications of logical operations are in the first scenario, it is worthwhile to separate this scenario from the second general scenario for special optimization, by omitting temporary variables and directly judging whether to jump based on the evaluation result.

As the title of this section indicates, this section only introduces the first scenario; while the next section will introduce the second scenario.

Jump Rules

The short-circuit characteristics of logic operations are introduced above. After each operand is judged, a jump may occur and the next operand is skipped. The bytecode corresponding to the logical operation is to jump according to each operand. Different operation combinations will lead to various jump combinations. Now it is necessary to summarize jump rules from various jump combinations, so as to be used as subsequent parsing rules. This is probably the most convoluted part of the whole interpreter.

The following uses the simplest if statement as the application scenario, and first looks at the most basic and and or operations. The following two figures are the jump schematic diagrams of if A and B then ... end and if X or Y then ... end respectively:

 A and B                      X or Y

+-------+                    +-------+
|   A   +-False-\    /--True-+   X   |
+---+---+       |    |       +---+---+
    |True       |    |           |False
    V           |    |           V
+-------+       |    |       +-------+
|   B   +-False>+    |       |   Y   +-False-\
+---+---+       |    |       +---+---+      |
    |True       |    \---------->|True      |
    V           |                V          |
  block         |              block        |
    |           |                |          |
    +<----------/                +<--------/
    V                            V

The left figure is the AND operation. The processing after the judgment of the two operands A and B is the same: if True, continue to execute; if False, jump to the end of the code block.

The figure on the right is the OR operation. The processing flow of the two operands is different. The processing of the first operand X is: False continues execution, and True jumps to the following code block to start. While the processing of the second operand Y is the same as the processing of A and B before.

However, just looking at these two examples is not able to sum up the general law. Also need to look at some complex:

A and B and C               X or Y or Z                 (A and B) or Y               A and (X or Y)

+-------+                    +-------+                    +-------+                    +-------+
|   A   +-False-\    /--True-+   X   |                    |   A   |-False-\            |   A   +-False-\
+---+---+       |    |       +---+---+                    +---+---+       |            +---+---+       |
    |True       |    |           |False                       |True       |                |True       |
    V           |    |           V                            V           |                V           |
+-------+       |    |       +-------+                    +-------+       |            +-------+       |
|   B   +-False>+    +<-True-+   Y   |            /--True-+   B   |       |    /--True-+   X   |       |
+---+---+       |    |       +---+---+            |       +---+---+       |    |       +---+---+       |
    |True       |    |           |False           |      False|<---------/     |           |False      |
    V           |    |           V                |           V                |           V           |
+-------+       |    |       +-------+            |       +-------+            |       +-------+       |
|   C   +-False>+    |       |   Z   +-False-\    |       |   Y   +-False-\    |       |   Y   +-False>+
+---+---+       |    |       +---+---+       |    |       +---+---+       |    |       +---+---+       |
    |True       |    \---------->|True       |    \---------->|True       |    \---------->|True       |
    V           |                V           |                V           |                V           |
  block         |              block         |              block         |              block         |
    |           |                |           |                |           |                |           |
    +<---------/                 +<----------/                +<---------/                 +<---------/
    V                            V                            V                            V

According to these 4 diagrams, the following rules can be summarized (the specific steps of induction are omitted here. In practice, more examples may be needed to summarize, but too many examples are too bloated):

  • The jump condition depends on the logical operator (that is, and or or) behind the statement (such as A, B, X, Y, etc. in the above example):

    • If it is followed by and operation, False jumps and True continues execution. For example, A and B in the first picture are followed by and operations, so they are all False jumps.

    • If it is followed by an or operation, True jumps and False continues. For example, X and Z in the second picture are followed by or operations, so they are all True jumps.

    • If there is no logical operator behind, that is, the entire judgment statement ends, False jumps and True continues to execute. This rule is the same as for and above. This is true for the last judgment statement in the above four figures.

  • Rules for jump target positions:

    • If the same jump condition continues, jump to the same position. For example, there are 3 consecutive False jumps in the first picture, and 2 consecutive True jumps in the second picture; and the two False jumps in the third picture are not continuous, so the jump positions are different. Then during syntax analysis, if the two operands have the same jump condition, the jump list is merged.

    • If different jump conditions are encountered, terminate the previous jump list and jump to the end of the current judgment statement. For example, the False of Z in the second figure terminates the previous two True jump lists and jumps to the end of the Z statement; another example is the False jump list before the termination of B’s True in the third figure, and jumps to After the B statement.

    • However, the fourth picture does not seem to comply with the above two rules. The two False jumps are not continuous but connected, or the True jump of X does not end the False jump list of A. This is because A does not operate with X, but with (X or Y); you need to ask (X or Y) first, and the True jump of X is brand new at this time, and you don’t know the previous The False jump list of A; and then when asking A and (X or Y), the two jump lists of True and False coexist; the False at the end of the final statement merges the False jump list of A before, and Termination of X's True jump list.

    • The end of the judgment statement corresponds to the False jump, so the True jump list will be terminated and the False jump list will continue. After the end of the block, terminate the False jumpGo to the end of the block list. This is the case in the 4 figures above.

So far, the preparation knowledge has been introduced. Let's start coding.

Bytecode

Several conditional judgment statements in the control structure in the previous chapter, including if, while, and repeat..until, etc., all deal with the judgment conditions and jump on False, so there is only one bytecode for testing and jumping, namely Test. But now we need 2 kinds of jumps, jump on False and jump on True. For this reason, we remove the previous Test and add 2 bytecodes:

pub enum ByteCode {
     TestAndJump(u8, i16), // If Test is True, then Jump.
     TestOrJump(u8, i16), // Jump if Test is False. Same function as `Test` in the previous chapter.

The "And" and "Or" in the naming have nothing to do with the logical operations introduced in this section, but are derived from the method names of the Option and Error types in the Rust language, meaning "and then" and "otherwise then" respectively. However, in the two examples at the beginning of this section, t and t.k can be described as: if t exists "then then" take t.k, t.k or 100 can be described as: if t.k exists then take its value "otherwise then" Take 100. It can also be said to be related.

It’s just that the first jump rule introduced above, if it is followed by and operation, False jumps, corresponding to TestOrJump. The and and Or here do not correspond, but it doesn't matter much.

In the official Lua implementation, there is still only one bytecode TEST, which is associated with two parameters: the stack address of the judgment condition (same as ours), and the jump condition (True jump or False jump). For the specific jump position, you need to add a JUMP bytecode for an unconditional jump. It seems that 2 bytecodes are not very efficient. This is done for another application scenario, which will be introduced in the next section.

ExpDesc

When parsing logical operators to generate jump bytecodes, the destination of the jump is not yet known. Only one bytecode placeholder can be generated first, and the parameter of the jump position is left blank. The parameters are filled in after the destination location is determined later. This approach is the same as when we introduced control structures in the previous chapter. The difference is that there was only one jump bytecode in the previous chapter, but this time there may be multiple bytecode zippers, such as the first picture above, 3 bytecode jumps Go to the same location. This zipper may be a True jump or a False jump, or these two chains may exist at the same time, such as when Y is resolved in the fourth figure above. So a new ExpDesc type is needed to save the jump list. To this end, a new Test type is defined as follows:

enum ExpDesc {
     Test(usize, Vec<usize>, Vec<usize>), // (condition, true-list, false-list)

Associate 3 parameters. The first one is to determine the position of the condition on the stack. No matter what type (constant, variable, table index, etc.) it will be discharged to the stack first, and then the true or false will be judged. The next two parameters are the two jump lists of True and False, and the contents are the positions of the bytecodes that need to be completed.

In the official implementation of Lua, the jump list is implemented by jumping to the blank parameters in the bytecode. For example, if there are three consecutive False jumps in the first figure above, the bytecodes generated by judging A, B, and C are JUMP 0, JUMP $A, JUMP $B, and then save them in ExpDesc $C. In this way, $B can be found through $C, $A can be found through $B, and the parameter 0 indicates the end of the linked list. Finally, while traversing, it is uniformly fixed as JUMP $end. This design is very efficient, without additional storage, and the zipper can be realized by using the Jump parameter that is temporarily left blank. At the same time, it is also slightly obscure and error-prone. This kind of full use of resources and micro-manipulation of memory according to bits is a very typical practice of C language projects. The Rust language standard library provides a list Vec, although it will generate memory allocation on the heap, which slightly affects performance, but the logic is much clearer and clear at a glance. As long as it is not a performance bottleneck, obscure and dangerous practices should be avoided as much as possible, especially when using the safety-oriented Rust language.

Syntax Analysis

Now it is finally ready to parse. Start with the binary operation part of the exp() function. Before introducing the evaluation order of binary numerical operations, the first operand must be processed first. It is also introduced at the beginning of this section that for the processing order of logical operations, due to the short-circuit characteristics, the first operation and possible jumps must be processed first, and then the second operand can be parsed. So, before continuing to parse the second operand, the jump is handled:

     fn preprocess_binop_left(&mut self, left: ExpDesc, binop: &Token) -> ExpDesc {
         match binop {
             Token::And => ExpDesc::Test(0, Vec::new(), self. test_or_jump(left)),
             Token::Or => ExpDesc::Test(0, self. test_and_jump(left), Vec::new()),

             _ => // Omit the part of other types of discharge
         }
     }

In this function, the processing part of logical operation is added. Take and as an example, generate ExpDesc::Test type, temporarily save the processed 2 jump lists, and the associated first parameter is useless, fill in 0 here. Call the test_or_jump() function to process the jump list. According to the rules introduced above, the and operator corresponds to the False jump, which will terminate the previous True jump list, so the test_or_jump() function will terminate the previous True jump list and return only the False jump list. Then create a new list Vec::new() here as the True jump list.

Look at the specific implementation of test_or_jump():

     fn test_or_jump(&mut self, condition: ExpDesc) -> Vec<usize> {
         let (icondition, true_list, mut false_list) = match condition {
             // It is a constant of True, no need to test or jump, skip it directly.
             // Example: while true do ... end
             ExpDesc::Boolean(true) | ExpDesc::Integer(_) | ExpDesc::Float(_) | ExpDesc::String(_) => {
                 return Vec::new();
             }

             // The first operand is already of type Test, indicating that this
             // is not the first logical operator.
             // Just return the existing two jump lists directly.
             ExpDesc::Test(icondition, true_list, false_list) =>
                 (icondition, Some(true_list), false_list),

             // The first operand is another type, indicating that this is the
             // first logical operator.
             // Only need to discharge the first operand to the stack.
             // There was no True jump list before, so return None.
             // There was no False jump list before, so create a new list to save
             // this jump instruction.
             _ => (self. discharge_any(condition), None, Vec::new()),
         };

         // generate TestOrJump, but leave the second parameter blank
         self.byte_codes.push(ByteCode::TestOrJump(icondition as u8, 0));

         // Put the newly generated bytecode, if it is in the False jump list,
         // for subsequent repair
         false_list.push(self.byte_codes.len() - 1);

         // Finalize the previous True jump list and jump here, if any
         if let Some(true_list) = true_list {
             self.fix_test_list(true_list);
         }

         // return False jump list
         false_list
     }

For the or operator and the corresponding test_and_jump() function, it is similar, just flip the True and False jump lists. It will not be introduced here.

After processing the first operand and the jump, it is very simple to process the second operand, just connect the jump list:

     fn process_binop(&mut self, binop: Token, left: ExpDesc, right: ExpDesc) -> ExpDesc {
         match binop {
             // omit other binary operator processing
             Token::And | Token::Or => {
                 // The first operand has been converted to ExpDesc::Test in preprocess_binop_left() above
                 if let ExpDesc::Test(_, mut left_true_list, mut left_false_list) = left {
                     let icondition = match right {
                         // If the second operand is also of Test type, such as the example
                         // of `A and (X or Y)` in the fourth figure above in this section,
                         // Then connect the two jump lists separately.
                         ExpDesc::Test(icondition, mut right_true_list, mut right_false_list) => {
                             left_true_list.append(&mut right_true_list);
                             left_false_list.append(&mut right_false_list);
                             icondition
                         }
                         // If the second operand is another type, there is no need to deal with the jump list
                         _ => self.discharge_any(right),
                     };

                     // After returning to the connection, I want to create a new jump list
                     ExpDesc::Test(icondition, left_true_list, left_false_list)
                 } else {
                     panic!("impossible");
                 }
             }

After dealing with the binary operation part, the next step is the application scenario. This section only introduces the application scenarios used as judgment conditions, and the evaluation will be introduced in the next section. Several control structure statements (if, while, repeat..until, etc.) directly process the jump bytecode, and the code logic is similar. In the jump rules introduced at the beginning of this section, the judgment statement of the entire logical operation ends, which is a False jump, so calling the test_or_jump() function just introduced can replace and simplify the code that directly processes bytecodes in the previous chapter logic. Here we still use the if statement as an example:

     fn do_if_block(&mut self, jmp_ends: &mut Vec<usize>) -> Token {
         let condition = self. exp();

         // In the previous chapter, here is to generate Test bytecode.
         // Now, replace and simplify to the test_or_jump() function.
         // Terminate the True jump list and return a new False jump list.
         let false_list = self. test_or_jump(condition);

         self.lex.expect(Token::Then);

         let end_token = self. block();

         if matches!(end_token, Token::Elseif | Token::Else) {
             self.byte_codes.push(ByteCode::Jump(0));
             jmp_ends.push(self.byte_codes.len() - 1);
         }

         // In the last chapter, here is to fix a Test bytecode just generated.
         // Now, a False jump list needs to be modified.
         self.fix_test_list(false_list);

         end_token
     }

This completes the syntax analysis part.

Virtual Machine Execution

The execution part of the virtual machine first needs to process the newly added 2 bytecodes, which are very simple and will be ignored here. What needs to be said is the details of a stack operation. The function when assigning a value to the stack before is as follows:

     fn set_stack(&mut self, dst: u8, v: Value) {
         let dst = dst as usize;
         match dst.cmp(&self.stack.len()) {
             Ordering::Equal => self. stack. push(v),
             Ordering::Less => self.stack[dst] = v,
             Ordering::Greater => panic!("fail in set_stack"),
         }
     }

First determine whether the target address dst is within the range of the stack:

  • If it is, assign it directly;
  • If it is not and it is just the next position, use push() to push it onto the stack;
  • If not, and past the next position, it was impossible to appear before, so call panic!().

However, the short-circuit characteristics of logic operations may lead to the above-mentioned third situation. For example the following statement:

if (g1 or g2) and g3 then
end

According to our analysis method, the following temporary variables will be generated, occupying the position on the stack:

|      |
+------+
|  g1  |
+------+
|  g2  |
+------+
|  g3  |
+------+
|      |

But during execution, if g1 is true, the processing of g2 will be skipped, and g3 will be processed directly. At this time, the position of g2 in the above figure is not set, then g3 will exceed the top of the stack position, as shown in the figure below:

|      |
+------+
|  g1  |
+------+
|      |
:      :
:      : <-- set g3, beyond the top of the stack

Therefore, it is necessary to modify the above set_stack() function to support setting elements beyond the top of the stack. This can be achieved by calling set_vec().

Test

So far, the application scenario of logical operation in conditional judgment has been completed. This can be tested with the examples in the figures at the beginning of this section. omitted here.