Logical Operations in Evaluation

The previous section introduced the logical operations in conditional judgment. This section introduces another scenario, that is, the evaluation.

In the previous section, the syntax analysis process of logical operations in conditional judgment scenarios can be divided into two parts:

Process the logical operation itself, specifically, after encountering the and or or operator in the exp() function, generate the corresponding bytecode and process the True and False jump lists;
After the entire logic operation statement is parsed, put the parsing result into the conditional judgment scene of the if statement, first terminate the True jump list, and then terminate the False jump list after the end of the block.

In the evaluation scenario to be introduced in this section, it is also divided into two parts:

Dealing with the logical operation itself, this part is exactly the same as the previous section;
After the entire logical operation statement is parsed, the statement is evaluated, which is the part to be introduced in this section.

As shown in the figure below, the previous section completed parts (a) and (b), and this section implements part (c) on the basis of (a).

                                               +------------------------+
+--------------------+                    /--->| (b) Condition judgment |
| (a) Process        |   ExpDesc::Test   |     +------------------------+
| logical operations |------------------>+
+--------------------+                   |     +-----------------+
                                          \--->| (c) Evaluation  |
                                               +-----------------+

Result Type

Logical operations in Lua are different from those in C and Rust. The results of logical operations in C and Rust languages are Boolean types, which only distinguish between true and false. For example, the following C language code:

int i=10, j=11;
printf("%d\n", i && j); // output: 1

Will output 1, because the && operator will first convert the two operands to Boolean type (both are true in this example), and then execute the && operation, the result is true, which is 1 in C language. The Rust language is stricter, both operands of && must be of Boolean type, so the result is also of Boolean type.

But logical operations in Lua evaluate to the last evaluated operand. For example, the following are very common usages:

print(t and t.k), first judge whether t exists, and then find the index of t. If t does not exist, then there is no need to judge t.k, so the result is t which is nil; otherwise, it is t.k.
print(t.k or 100), index the table and provide a default value. First judge whether there is k in t, if there is, then there is no need to judge 100, so the result is t.k; otherwise it is 100.
print(v>0 and v or -v), find the absolute value. The result is v if positive, and -v otherwise. Simulates the ?: ternary operator in C.

Evaluation Rules

In order to understand the sentence "the evaluation result of a logical operation is the last evaluated operand" more clearly, some examples are shown below. Here we still use the flowchart at the beginning of the previous section as an example. Let's look at the most basic operations first:

 A and B                      X or Y

+-------+                    +-------+
|   A   +-False-\    /--True-+   X   |
+---+---+       |    |       +---+---+
    |True       |    |           |False
    V           |    |           V
+-------+       |    |       +-------+
|   B   |       |    |       |   Y   |
+---+---+       |    |       +---+---+
    |<----------/    \---------->|
    V                            V

In the figure on the left, if A is False, the evaluation result is A; otherwise, when B is evaluated, since B is the last operand, there is no need to make a judgment, and B is the evaluation result.

In the figure on the right, if X is True, the evaluation result is X; otherwise, when Y is evaluated, since Y is the last operand, there is no need to make a judgment, and Y is the evaluation result.

Let's look at a few more complex examples:

A and B and C               X or Y or Z                 (A and B) or Y               A and (X or Y)

+-------+                    +-------+                    +-------+                    +-------+
|   A   +-False-\    /--True-+   X   |                    |   A   |-False-\            |   A   +-False-\
+---+---+       |    |       +---+---+                    +---+---+       |            +---+---+       |
    |True       |    |           |False                       |True       |                |True       |
    V           |    |           V                            V           |                V           |
+-------+       |    |       +-------+                    +-------+       |            +-------+       |
|   B   +-False>+    +<-True-+   Y   |            /--True-+   B   |       |    /--True-+   X   |       |
+---+---+       |    |       +---+---+            |       +---+---+       |    |       +---+---+       |
    |True       |    |           |False           |      False|<---------/     |           |False      |
    V           |    |           V                |           V                |           V           |
+-------+       |    |       +-------+            |       +-------+            |       +-------+       |
|   C   |       |    |       |   Z   |            |       |   Y   |            |       |   Y   |       |
+---+---+       |    |       +---+---+            |       +---+---+            |       +---+---+       |
    |<---------/     \---------->|                \---------->|                \---------->|<---------/
    V                            V                            V                            V

The process of summarizing based on these 4 figures is omitted here, and the evaluation rules are directly given:

The last operand does not need to be judged, as long as the previous judgment does not skip the last operand, then the last operand is the final evaluation result. For example, in the first figure above, if both A and B are True, then C will be executed, and C is the evaluation result of the entire statement. C itself does not need to make judgments.
In the syntax analysis stage, after the parsing of the entire logical operation statement is completed, the operands on the unterminated jump list may be used as the final evaluation result. This statement is rather convoluted, and the following example illustrates it. For example, in the first figure above, the True jump lists of A and B end in B and C respectively, but the False jump lists are not terminated, then both A and B may be the final evaluation results, for example, if A is False Then A is the final evaluation result. As another counter-example, for example, the two jump lists of A’s True and False in the third figure above are terminated in B and Y respectively, that is to say, when the entire statement is parsed, the jump lists of A are terminated. , then A cannot be the evaluation result, and in either case A will not reach the end of the statement. Except for the third figure, all judgment conditions in other figures may be used as the final evaluation result.

After summarizing the evaluation rules, let's start coding.

ExpDesc

A new ExpDesc type representing logical operations was introduced in the previous section and is defined as follows:

enum ExpDesc {
     Test(usize, Vec<usize>, Vec<usize>), // (condition, true-list, false-list)

The latter two parameters respectively represent two jump linked lists, which will not be introduced here, and focus on the first parameter: the position of the judgment conditional statement on the stack. As mentioned in the previous section, all statements (such as variables, constants, table indexes, etc.) must be discharged to the stack first to determine whether they are true or false, so here we can use the stack index of usize type to represent the statement. This is no problem in the previous section, but in the evaluation scenario in this section, as mentioned above, the last operand does not need to be judged, so it may not need to be discharged to the stack. Like the following example:

local x = t and t.k

According to the current practice, first discharge the second operand t.k to a temporary variable on the stack; if t is true, assign the temporary variable to x through Move bytecode. Obviously this temporary variable is unnecessary, and t.k can be directly assigned to x. To do this, we need to delay the evaluation of the conditional statement, or delay the discharge. Then you need to transform the ExpDesc::Test type.

Lua's official approach is to assign two jump lists to all types of ExpDesc:

typedef struct expdesc {
   expkind k; // type tag
   union {
     // Data associated with various expkinds, omitted here
   } u;
   int t; /* patch list of 'exit when true' */
   int f; /* patch list of 'exit when false' */
} expdesc;

t and f in the above code are the jump lists of True and False respectively. But it is a bit inconvenient to define it in the Rust language. Because Rust's enum includes tags and associated data, corresponding to k and u above, one enum can define ExpDesc; but if you add two jump lists, you need to encapsulate a layer of struct outside Defined. And the struct variable is defined in the Rust languageWhen all members must be explicitly initialized, then in all places where ExpDesc is defined in the code, t and f must be initialized to Vec::new(). It's not worth it to affect other types for this one type.

Our approach is to define ExpDesc::Test recursively. Change the first parameter type of ExpDesc::Test from usize to ExpDesc. Of course, it cannot be defined directly, but it needs to encapsulate a layer of Box pointer:

enum ExpDesc {
     Test(Box<ExpDesc>, Vec<usize>, Vec<usize>), // (condition, true-list, false-list)

This definition has no effect on other types of ExpDesc in the existing code. For the Test type in the existing code, it is only necessary to remove the discharge processing.

Bytecode

The functions of the two new bytecodes TestAndJump and TestOrJump in the previous section are both: "test" + "jump". And the function we need now is: "test" + "assignment" + "jump". To this end, we add 2 more bytecodes:

pub enum ByteCode {
     Jump(i16),
     TestAndJump(u8, i16),
     TestOrJump(u8, i16),
     TestAndSetJump(u8, u8, u8), // add
     TestOrSetJump(u8, u8, u8), // add

The function of TestAndSetJump is: if the value of the first parameter is tested to be true, it is assigned to the stack position of the second parameter and jumps to the bytecode position of the third parameter. Similar to TestOrSetJump.

Here comes a problem. In the previous jump bytecodes (the first 3 in the above code), the jump parameters are all 2 bytes, i16 type, and the range of jumps can be large. And the newly added 2 bytecodes are associated with 3 parameters, so there is only one byte left for the jump parameter.

This is why, as mentioned in the previous section, in the official implementation of Lua, 2 bytecodes are used to represent conditional jump instructions. For example, as opposed to TestAndJump(t, jmp), it is TEST(t, 0); JUMP(jmp); and in the evaluation scenario introduced in this section, it is necessary to add a target address parameter dst, which is TESTSET (dst, t, 0); JUMP(jmp). This ensures that the jump parameter has 2 bytes of space. Moreover, although there are 2 bytecodes, during the execution of the virtual machine, when the TEST or TESTSET bytecode is executed, if a jump is required, the parameter of the next bytecode JUMP can be directly removed And execute the jump without having to do another instruction dispatch for the JUMP. It is equivalent to 1 bytecode, and JUMP is only used as an extended parameter, so it does not affect the performance during execution.

But we still use 1 byte code here, and use 1 byte to represent the jump parameter. In the conditional judgment scenario in the previous section, the judgment of the last operand is to jump to the end of the entire block, and the jump distance may be very long, requiring 2 bytes of space. In the evaluation scenario in this section, only jumps are made within the logic operation statement. You can refer to the above 6 figures, and the jump distance will not be very long; and since it only jumps forward, there is no need to represent negative numbers. So 1 byte u8 type means that 256 distances are enough to cover. When conditions permit, 1 bytecode is always better than 2.

Syntax Analysis

After introducing the above modification points, now start the syntax analysis. The so-called evaluation is discharge. So we only need to complete the ExpDesc::Test type in the discharge() function. In the previous section, this is not complete. The specific discharge method is: first discharge the recursively defined conditional statement, and then repair the judgment bytecodes in the two jump lists.

     fn discharge(&mut self, dst: usize, desc: ExpDesc) {
         let code = match desc {
             // omit other types
             ExpDesc::Test(condition, true_list, false_list) => {
                 // fix TestSet list after discharging

                 // first discharge the recursively defined conditional statement
                 self.discharge(dst, *condition);

                 // Fix the judgment bytecode in the True jump list
                 self.fix_test_set_list(true_list, dst);
                 // Fix the judgment bytecode in the False jump list
                 self.fix_test_set_list(false_list, dst);
                 return;
             }

Fixing the jump list fix_test_set_list() function needs to do 2 things:

fill jump parameters that were left blank before;
Replace the previously generated TestAndJump and TestOrJump bytecodes with TestAndSetJump and TestOrSetJump respectively.

The specific code is as follows:

     fn fix_test_set_list(&mut self, list: Vec<usize>, dst: usize) {
         let here = self.byte_codes.len();
         let dst = dst as u8;
         for i in list.into_iter() {
             let jmp = here - i - 1; // should not be negative
             let code = match self. byte_codes[i] {
                 ByteCode::TestOrJump(icondition, 0) =>
                     if icondition == dst {
                         // If the conditional statement is just at the target position,
                         // there is no need to change it to TestAndSetJump
                         ByteCode::TestOrJump(icondition, jmp as i16)
                     } else {
                         // Modify to TestAndSetJump bytecode
                         ByteCode::TestOrSetJump(dst as u8, icondition, jmp as u8)
                     }
                 ByteCode::TestAndJump(icondition, 0) =>
                     if icondition == dst {
                         ByteCode::TestAndJump(icondition, jmp as i16)
                     } else {
                         ByteCode::TestAndSetJump(dst as u8, icondition, jmp as u8)
                     }
                 _ => panic!("invalid Test"),
             };
             self.byte_codes[i] = code;
         }
     }

Test

So far, the application scenario of logical operations in evaluation has been completed. This can be tested with the examples in the figures at the beginning of this section. omitted here.

Build a Lua Interpreter in Rust