generic-for Statement

Closures were introduced in the previous sections of this chapter. The most typical application scenario of a closure is an iterator, and the most common place for an iterator is a for statement. So much so that "Lua Programming" and "Rust Programming Language" both put Closures, iterators, and the for statement introduced together. The for statement in the Lua language has two formats, numeric and generic. Previously has introduced the numerical-for statement, this section introduces the generic-for statement using iterators.

After introducing closures, the concept of an iterator itself is straightforward. The counter closure that has been used as an example in the previous sections of this chapter can be regarded as an iterator, which generates an incremented number each time. Let's look at a slightly more complex iterator to traverse the array part of a table. This is also the function of the ipairs() function that comes with the Lua language:

function ipairs(t)
     local i = 0
     return function ()
         i = i + 1
         local v = t[i]
         if v then
             return i, v
         end
     end
end

In the above code, ipairs() is a factory function that creates and returns a closure as an iterator. This iterator has 2 upvalues, one is the fixed table t, and the other is the traversed position i. We can call these two upvalues as "iteration environments". During the traversal process, the iterator returns the index and value of the array; when the traversal ends, it does not return a value, and it can also be considered as returning nil.

This iterator can be called directly, but is more commonly used in a generic-for statement:

-- call the iterator directly
local iter = ipairs(t)
while true do
     local i, v = iter()
     if not i then break end
     block -- do something
end

-- used in a generic-for statement
for i, v in ipairs(t) do
     block -- do something
end

The use of iterators is certainly very convenient, but the previous sections also introduced that creating a closure requires additional overhead compared to creating a normal function, that is, both Lua closures and Rust closures require 2 extra times Memory allocation and 2 pointer jumps. Therefore, the generic for statement in the Lua language is specially optimized for this purpose, that is, the generic for statement itself replaces the closure to save the "iteration environment" (that is, 2 upvalues). Since there is no need for upvalue, the iterator does not need to use closures, but only ordinary functions.

Specifically, the grammar of the generic for statement is as follows:

stat ::= for namelist in explist do block end
namelist ::= Name {‘,’ Name}

Its execution process is as follows:

  • At the beginning of the loop, explist is evaluated to get 3 values: the iteration function, the immutable state, and the control variable. In most cases, explist is a function call statement, so the evaluation follows the evaluation rules of the function return value, that is, if there are less than 3, it will be filled with nil, and if it exceeds 3, the excess will be discarded. Of course, instead of using a function call, you can directly list 3 values.

  • Then, before each execution of the loop, use the latter two values (immutable state and control variables) as parameters to call the iteration function, and judge the first return value: if it is nil, terminate the loop; otherwise, assign the return value to namelist, and additionally assign the first return value to the control variable as a parameter for subsequent calls to the iteration function.

It can be seen that the three values returned by explist are put together to form a closure function: the iteration function is the function prototype, and the latter two are upvalues. It's just that the generic-for statement helps maintain these two upvalues. Using this feature of the generic-for statement, reimplement the above iterator for traversing the array as follows:

local function iter(t, i) -- both t and i are changed from upvalue to parameter
     i = i + 1
     local v = t[i]
     if v then
         return i, v
     end
end

function ipairs(t)
     return iter, t, 0
end

Compared with the closure version above, here both t and i have changed from upvalue to parameters, and iter has become an ordinary function.

From this point of view, the generic-for statement can be completed without the need for a closure (such as after the function was introduced in the previous chapter). But after all, this is an optimization based on closures. Only by mastering closures can we understand why we do this. That's why we implement the generic-for statement after introducing closures.

In addition, the function call statement ipairs(t) here only returns 3 variables, and these 3 variables can also be listed directly in the generic for statement:

for i, v in ipairs(t) do ... end
for i, v in iter, t, 0 do ... end -- directly list 3 variables

The following direct list method omits a function call, but it is inconvenient. So the first one is more common.

After introducing the characteristics of the generic-for statement in Lua, let's start to implement it.

Accomplish

According to the above introduction, the generic-for statement saves and maintains the iteration environment by itself. Where is that saved? Naturally, it is still on the stack. Just like the numerical-for statement will automatically create 3 variables (1 count variable and 2 anonymous variables) on the stack, the generic-for statement also needs to automatically create 3 anonymous variables, corresponding to the above iteration environment: iteration function, immutable state, control variables. These 3 variables are obtained after evaluating explist, as shown in the stack diagram shown in the left figure below:

|           |        |           |     |           |
+-----------+        +-----------+     +-----------+
| iter func |entry   | iter func |     | iter func |
+-----------+        +-----------+     +-----------+
| state     |\       | state     |     | state     |
+-----------+ 2args  +-----------+     +-----------+
| ctrl var  |/       | ctrl var  |     | ctrl var  |<--first return value
+-----------+        +-----------+   ->+-----------+
|           |        :           :  /  | name-     | i
                     +-----------+ /   | list      | v
                     | return-   |-    |           |
                     | values    |
                     |           |

Next, the loop is executed, including three steps: calling the iterative function, judging whether to continue the loop, and controlling variable assignment.

First, the iteration function iter func is called with the immutable state state and the control variable ctrl var as two parameters. Look at the stack diagram in the left figure, it has just been arranged into a function call formation, so the function can be called directly;

Secondly, after calling the function (the middle picture above), judge whether the first return value is nil, if so, exit the loop; if there is no return value, also exit the loop; otherwise, continue to execute the loop. Before executing the loop body, the return value needs to be processed (right picture above):

  • Assign the first return value to the control variable (ctrl-var in the above figure) as the parameter for the next call to the iterative function;

  • Assign the return value to the variable list, which is namelist in the above BNF. For example, in the above example of traversing the array, it is i, v. If the number of return values is less than the variable list, it will be filled with nil. This padding operation is consistent with ordinary function calls. The inconsistency is that the return value of an ordinary function call will be moved to the function entry, which is the position of iter func in the above figure; but here it is shifted down by 3 positions.

One thing that needs to be explained here is that the control variable ctrl-var is the first name of namelist. So in fact, there is no need to reserve a place for ctrl-var on the stack; after calling the iterative function each time, just move all the return values directly to the place of ctrl-var in the figure, so that the first return value happens to be where ctrl-var is. The figure below is a comparison of the two schemes. The left picture is the original plan, which specially reserved the position for ctrl-var; the right picture is the new plan, which only needs 2 anonymous variables to save the iteration environment, and ctrl-var overlaps with the first name:

 |           |      |           |
 +-----------+      +-----------+
 | iter func |      | iter func |
 +-----------+      +-----------+
 | state     |      | state     |
 +-----------+      +-----------+
 | ctrl var  |     i| name-     | <--ctrl var
 +-----------+     v| list      |
i| name-     |      |           |
v| list      |
 |           |

The solution on the right is simpler, with one less variable assignment. And under normal circumstances, the functions of the two programs are the same. But in one case, the function will be different, that is, when the control variable is modified in the loop body. For example, the following sample code:

for i, v in ipairs(t) do
     i = i + 1 -- modify the control variable `i`
     print(i)
end

According to the scheme in the left figure above, the i modified here is a variable exposed to programmers, and the control variable ctrl var is a hidden anonymous variable, and these two variables are independent. So modifications to i do not affect the control variable ctrl var. So this loop can still traverse the entire array.

According to the scheme on the right, i and ctrl var are the same value, and modifying i means modifying ctrl var, which affects the next iteration function call, and eventually leads to the failure to traverse the entire array normally.

Which behavior is more reasonable? Lua Manual explains: You should not change the value of the control variable during the loop. In other words, there is no clear definition of this behavior, so any solution is fine. However, the official implementation of Lua is based on the behavior in the left picture. In order to maintain consistency, we also choose the left picture here.

Bytecode

The above describes the operation of the generic-for statement in the loop. These operations require a new bytecode ForCallLoop to complete.

Before defining this bytecode, let's see where this bytecode should be placed? Is it the beginning of the loop, or the end of the loop? If you follow the Lua code, it should be placed at the beginning of the loop, and then generate a Jump bytecode at the end of the loop to jump back and continue the loop, like this:

ForCallLoop # If calling the iteration function returns nil, jump to the end
... block ...
Jump (back-to-ForCallLoop)

But in this case, 2 bytecodes are executed for each loop, ForCallLoop at the beginning and Jump at the end. To reduce the bytecode once, we can put ForCallLoop at the end of the loop, so that only 2 bytecodes need to be executed in the first loop, and only 1 bytecode needs to be executed in each subsequent loop:

Jump (forward-to-ForCallLoop)
... block ...
ForCallLoop # If the return of calling the iteration function is not nil, then jump to the above block

After determining the bytecode location, let's look at the definition. This bytecode needs to be associated with 3 parameters:

  • the stack index of the iteration function iter func;
  • The number of variables used for the assignment of the return value. If the number of return values is less than the number of variables, you need to fill in nil;
  • Jump distance.

Both the first two parameters can be represented by 1 byte, so there is only 1 byte left for the final jump distance, which can only represent a distance of 255, which is not enough. For this reason, we can only add a Jump bytecode to complete this function. However, in most cases, the loop body is not large, and the distance does not exceed 255. It is a bit wasteful to add another bytecode for a small number of large loop bodies. The best thing to do in this situation is to:

  • For the small loop body, the jump distance can be programmed into ForCallLoop bytecode, only this 1 bytecode is used;
  • For the large loop body, set the jump distance of the third parameter in the ForCallLoop bytecode to 0, and add a Jump bytecode to cooperate.

In this way, when the virtual machine is executed:

  • In the case where the loop needs to jump backwards: for a small loop body, jump directly according to the third parameter; for a large loop body, the third parameter is 0, actually jump to the next Jump bytecode, and then Execute the jump of the loop body again.
  • In the case where the loop needs to continue to execute forward: for small loop bodies, no special processing is required; for large loop bodies, the next Jump bytecode needs to be skipped.

In summary, the bytecode ForCallLoop is defined as follows:

pub enum ByteCode {
     ForCallLoop(u8, u8, u8),

The specific syntax analysis and virtual machine execution code are omitted here.

At this point, the generic-for statement is completed. We also finished all the syntax of Lua! (Applause please!)