Arguments

The previous section introduced the definition and calling process of Lua functions. This section describes the arguments of the function.

The word "argument" is often used loosely for two distinct concepts:

  • "parameter" refers to a variable in the function prototype, including information such as the parameter name and parameter type;
  • "argument" refers to the actual value passed when the function is called.

The discussion of syntax analysis and virtual machine execution later in this section sometimes depends on keeping "parameter" and "argument" clearly distinguished.

A very important point: in Lua, the parameters of a function are local variables. During syntax analysis, the parameters are placed at the start of the local variable table, so any later reference to a parameter in the function body is resolved through that table. During virtual machine execution, the arguments are loaded onto the stack immediately after the function entry, followed by the local variables, which matches the order of the local variable table built during syntax analysis. For example, consider the following function:

local function foo(a, b)
     local x, y = 1, 2
end

When the foo() function is executed, the stack layout is as follows (numbers 0-3 on the right side of the stack are relative indices):

|     |
+-----+
| foo |
+=====+ <---base
|  a  | 0  \
+-----+     + arguments
|  b  | 1  /
+-----+
|  x  | 2  \
+-----+     + local variables
|  y  | 3  /
+-----+
|     |

The only difference between parameters and local variables is that a parameter's value is passed in by the caller, while a local variable is assigned inside the function.
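The idea that parameters simply occupy the first slots of the local variable table can be sketched as follows. This is a minimal illustration, not the interpreter's code: the helper names `new_locals`, `local_index` and `foo_locals` are hypothetical, and the table is modeled as a plain `Vec<String>`.

```rust
/// Toy model of the parser's local variable table (hypothetical helper).
/// Parameters are stored first, so their indices match their stack slots.
fn new_locals(params: &[&str]) -> Vec<String> {
    params.iter().map(|p| p.to_string()).collect()
}

/// Resolve a name to its slot, searching from the most recently declared
/// local backwards so that a later declaration shadows an earlier one.
fn local_index(locals: &[String], name: &str) -> Option<usize> {
    locals.iter().rposition(|l| l == name)
}

/// Build the table for:
///     local function foo(a, b)
///         local x, y = 1, 2
///     end
fn foo_locals() -> Vec<String> {
    let mut locals = new_locals(&["a", "b"]);
    locals.push("x".to_string());
    locals.push("y".to_string());
    locals
}
```

With this table, `a` and `b` resolve to slots 0 and 1, and `x` and `y` to slots 2 and 3, which are the relative indices shown in the stack diagram.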

Syntax Analysis of Parameters

The syntax analysis of parameters is part of the syntax analysis of the function definition. When the function definition was introduced in the previous section, the parameter part of the syntax analysis was omitted; it is added now. The BNF rule for the function definition is funcbody, defined as follows:

    funcbody ::= `(` [parlist] `)` block end
    parlist ::= namelist [`,` `...`] | `...`
    namelist ::= Name {`,` Name}

As you can see, the parameter list consists of two optional parts:

  • The optional list of Names gives the fixed parameters. In the previous section, when parsing a new function and creating the FuncProto structure, the locals field (the local variable table) was initialized to an empty list. Now it is initialized to the parameter list instead. This puts the parameters at the front of the local variable table, with newly created local variables following them, which is consistent with the stack layout diagram at the beginning of this section. In addition, Lua allows the number of arguments in a call to differ from the number of parameters: extra arguments are discarded, and missing ones are filled with nil. Therefore, the number of parameters must also be recorded in FuncProto, the result of parsing, for comparison during virtual machine execution.

  • The optional trailing ... indicates that the function supports variable arguments. If it is present, then ... can be used in the function body to refer to the variable arguments during subsequent syntax analysis, and the virtual machine also needs special handling for them at execution time. Therefore, a flag is needed in FuncProto to indicate whether the function supports variable parameters.
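The parlist rule above can be sketched as a small standalone parser. This is only an illustration under stated assumptions: the `Token` enum here is a simplified stand-in for the lexer's token type, `parse_parlist` is a hypothetical name, and the opening `(` is assumed to have been consumed already.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Name(String),
    Comma,
    Dots,
    ParR,
}

/// Parse `[parlist] ')'` and return the fixed parameter names together
/// with a flag telling whether the list ends with `...`.
fn parse_parlist(tokens: &[Token]) -> Result<(Vec<String>, bool), String> {
    let mut params = Vec::new();
    let mut it = tokens.iter();
    loop {
        match it.next() {
            // `)` directly after `(`: empty parameter list.
            Some(Token::ParR) if params.is_empty() => return Ok((params, false)),
            // `...` must be the last item, followed by `)`.
            Some(Token::Dots) => {
                return match it.next() {
                    Some(Token::ParR) => Ok((params, true)),
                    other => Err(format!("expected ')' after '...', got {other:?}")),
                };
            }
            // A fixed parameter, followed by `,` or `)`.
            Some(Token::Name(name)) => {
                params.push(name.clone());
                match it.next() {
                    Some(Token::ParR) => return Ok((params, false)),
                    Some(Token::Comma) => continue,
                    other => return Err(format!("expected ',' or ')', got {other:?}")),
                }
            }
            other => return Err(format!("unexpected token {other:?}")),
        }
    }
}
```

The two results map directly onto the two pieces of information the text says FuncProto needs: the length of the name list becomes the parameter count, and the flag records whether `...` was present.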

In summary, there are three changes in total. The first two add fields to FuncProto:

pub struct FuncProto {
     // Whether to support variable parameters.
     // Used in both parsing and virtual machine execution.
     pub has_varargs: bool,

     // The number of fixed parameters. 
     // Used in virtual machine execution.
     pub nparam: usize,
     
     pub constants: Vec<Value>,
     pub byte_codes: Vec<ByteCode>,
}

In addition, when initializing the ParseProto structure, use the parameter list to initialize the locals field. The code is as follows:

impl<'a, R: Read> ParseProto<'a, R> {
     // New parameters: `has_varargs` and `params`
     fn new(lex: &'a mut Lex<R>, has_varargs: bool, params: Vec<String>) -> Self {
         ParseProto {
             fp: FuncProto {
                 has_varargs: has_varargs, // Whether to support variable parameters
                 nparam: params.len(), // number of parameters
                 constants: Vec::new(),
                 byte_codes: Vec::new(),
             },
             sp: 0,
             locals: params, // Initialize the locals field with the parameter list
             break_blocks: Vec::new(),
             continue_blocks: Vec::new(),
             gotos: Vec::new(),
             labels: Vec::new(),
             lex: lex,
         }
     }

This completes the syntax analysis of parameters. The parts that involve variable parameters and virtual machine execution are described in detail below.

Syntax Analysis of Arguments

Syntax analysis of arguments is the syntax analysis of function calls. This was implemented in the previous chapter along with prefixexp: the argument list is read by the explist() function, and the values are loaded in order onto the stack right after the function entry. This is consistent with the stack layout diagram at the beginning of this section and is equivalent to assigning values to the parameters. The actual number of arguments is determined here and written into the Call bytecode, to be compared with the number of parameters during virtual machine execution.

However, that implementation was incomplete and did not support variable arguments. More details follow later in this section.

Virtual Machine Execution

In the syntax analysis of arguments above, the arguments are already loaded onto the stack, which is equivalent to assigning values to the parameters, so the virtual machine does not need to copy anything when executing a function call. However, in Lua the number of arguments may differ from the number of parameters. If there are more arguments than parameters, nothing needs to be done: the extra values are treated as temporary variables that occupy stack slots but are never used. If there are fewer arguments than parameters, the missing ones must be set to nil; otherwise, a later bytecode that references such a parameter would cause an invalid access to the Lua stack. Beyond this, executing the Call bytecode requires no other processing of parameters.

As mentioned in the syntax analysis above, the number of parameters is stored in the nparam field of FuncProto, and the number of arguments is the associated parameter of the Call bytecode. The virtual machine code for a function call is therefore as follows:

     ByteCode::Call(func, narg) => { // `narg` is the actual number of arguments passed in
         self.base += func as usize + 1;
         match &self.stack[self.base - 1] {
             Value::LuaFunction(f) => {
                 let narg = narg as usize;
                 let f = f.clone();
                 if narg < f.nparam { // `f.nparam` is the number of parameters in the function definition
                     self.fill_stack(narg, f.nparam - narg); // fill nil
                 }
                 self.execute(&f);
             }
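
The nil-filling step can be illustrated in isolation with a toy stack. The following is a sketch under assumptions, not the body of fill_stack(): `pad_missing_params` and the reduced `Value` enum are hypothetical, and `base` is taken to be the index of the first argument slot.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Nil,
    Integer(i64),
    Function(&'static str),
}

/// Make sure every fixed parameter has a slot holding a defined value.
/// `narg` is the number of arguments actually passed and `nparam` the
/// number of parameters declared by the callee.
fn pad_missing_params(stack: &mut Vec<Value>, base: usize, narg: usize, nparam: usize) {
    if narg >= nparam {
        return; // enough arguments; any extras are simply never referenced
    }
    let end = base + nparam;
    if stack.len() < end {
        stack.resize(end, Value::Nil); // grow the stack with nil
    }
    // Overwrite stale temporaries that may sit in the missing parameter slots.
    stack[base + narg..end].fill(Value::Nil);
}
```

For a callee with two parameters called with a single argument, the second parameter slot ends up holding nil whether or not the slot previously existed on the stack.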

This completes the fixed-parameter part, which is relatively simple. The variable-parameter part, introduced next, is more involved.

Variable Parameters

In Lua, the ... expression is used for both variable parameters and variable arguments: in the parameter list of a function definition it declares variable parameters, and anywhere else it refers to the variable arguments.

Variable parameters were covered in the syntax analysis of parameters above. Their role is simple: they only indicate that the function supports variable arguments. The rest of this section mainly covers how variable arguments are handled when a function call is executed.

At the beginning of this section, parameters were introduced as local variables and the stack layout was drawn accordingly. That description holds only for fixed parameters, not for variable parameters. As an example, add variable parameters to the previous foo() function:

local function foo(a, b, ...)
     local x, y = 1, 2
     print(x, y, ...)
end
foo(1, 2, 3, 4, 5)

What should the stack layout look like once variable parameters are added? In other words, where are the variable arguments stored? When foo() is called in the last line of the code above, 1 and 2 correspond to the parameters a and b, and 3, 4 and 5 are the variable arguments. Before the call starts, the stack layout is as follows:

|     |
+-----+
| foo |
+=====+ <-- base
|  1  |  \
+-----+   + Fixed arguments, corresponding to `a` and `b`
|  2  |  /
+-----+
|  3  |  \
+-----+   |
|  4  |   + Variable arguments, corresponding to `...`
+-----+   |
|  5  |  /
+-----+
|     |

After entering foo(), where should the last three arguments be stored? The most direct idea is to keep the layout above unchanged, that is, to leave the variable arguments right after the fixed arguments. However, this does not work, because it would occupy the space of the local variables: x and y in the example would be pushed back by the number of variable arguments. That number cannot be determined during syntax analysis, so the stack positions of the local variables could not be determined either, and they could not be accessed.

The official Lua implementation ignores the variable arguments during syntax analysis, so the local variables still follow the fixed parameters. When the virtual machine enters the function, however, the variable arguments are moved in front of the function entry and their number is recorded. Variable arguments can then be located from the position of the function entry and that number, namely stack[self.base - 1 - num_varargs .. self.base - 1]. The stack layout is as follows:

|     |
+-----+
|  3  | -4 \
+-----+     |                                  num_varargs: usize  // record the #variable arguments
|  4  | -3  + move the variable arguments           +-----+
+-----+     | to the front of the function entry    |  3   |
|  5  | -2 /                                        +-----+
+-----+
| foo | <-- function entry
+=====+ <-- base
| a=1 | 0  \
+-----+     + fixed arguments, corresponding to `a` and `b`
| b=2 | 1  /
+-----+
|  x  | 2  \
+-----+     + local variables
|  y  | 3  /  following the fixed arguments
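
The layout in this diagram can be reproduced on a toy stack. The sketch below is not Lua's actual code: it models the move with a slice rotation, the function names `move_varargs` and `varargs_of` are hypothetical, and it assumes that at least `nparam` arguments were passed.

```rust
/// Move the variable arguments in front of the function entry.
/// `func` is the index of the function entry and `nparam` the number of
/// fixed parameters. Returns the new base and the number of varargs.
fn move_varargs<T>(stack: &mut [T], func: usize, nparam: usize) -> (usize, usize) {
    let narg = stack.len() - (func + 1);
    let num_varargs = narg.saturating_sub(nparam);
    // [foo, a, b, v1, v2, v3] becomes [v1, v2, v3, foo, a, b]
    stack[func..].rotate_right(num_varargs);
    (func + num_varargs + 1, num_varargs)
}

/// Locate the variable arguments from the base and their recorded number.
fn varargs_of<T>(stack: &[T], base: usize, num_varargs: usize) -> &[T] {
    &stack[base - 1 - num_varargs..base - 1]
}
```

After the move, the fixed arguments start at the new base and the local variables can follow them at positions known during syntax analysis, while the variable arguments are found by counting back from the function entry.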

Since this scheme has to record extra information at run time (the number of variable arguments) and also move values around on the stack, it is simpler to store the variable arguments themselves in a separate place:

|     |
+-----+
| foo | <-- function entry                 varargs: Vec<Value>  // record variable arguments
+=====+                                       +-----+-----+-----+
| a=1 | 0  \                                  |  3  |  4  |  5  |
+-----+     + fixed arguments                 +-----+-----+-----+
| b=2 | 1  /
+-----+
|  x  | 2  \
+-----+     + local variables
|  y  | 3  /


Compared with the official Lua implementation, this method stores the variable arguments in a Vec instead of on the stack, which costs an extra heap allocation but is more intuitive and clear.
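
A minimal sketch of this storage choice, assuming the top of the stack is exactly the end of the arguments when the function is entered. The helper `take_varargs` and the reduced `Value` enum are hypothetical and only illustrate moving the extra arguments off the stack into a `Vec`.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Integer(i64),
    Function(&'static str),
}

/// On entry to a function that accepts `...`, move every argument after
/// the fixed parameters off the stack and into a separate vector.
fn take_varargs(stack: &mut Vec<Value>, base: usize, nparam: usize) -> Vec<Value> {
    let first = base + nparam;
    if first < stack.len() {
        stack.drain(first..).collect()
    } else {
        Vec::new() // no extra arguments were passed
    }
}
```

For the call `foo(1, 2, 3, 4, 5)` with two fixed parameters, the stack keeps the function entry followed by 1 and 2, and the returned vector holds 3, 4 and 5, so the slots after `b` are free for the local variables.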

After determining the storage method of the variable arguments, we can perform syntax analysis and virtual machine execution.

ExpDesc::VarArgs and Application Scenarios

The above covers how variable arguments are passed when a function is called. This part covers how they are accessed in the function body.

Access to the variable arguments is an independent expression with the syntax ..., parsed in the exp_limit() function. A new expression type, ExpDesc::VarArgs, is added for it; this type has no associated parameters.

Reading this expression is simple: first check whether the current function supports variable parameters (that is, whether ... appears in the function prototype), and then return ExpDesc::VarArgs. The code is as follows:

     fn exp_limit(&mut self, limit: i32) -> ExpDesc {
         let mut desc = match self.lex.next() {
             Token::Dots => {
                 if !self.fp.has_varargs { // Check whether the current function supports variable parameters
                     panic!("no varargs");
                 }
                 ExpDesc::VarArgs // New expression type
             }

But what should be done with the ExpDesc::VarArgs that was read? To answer this, first sort out the three scenarios in which variable arguments are used:

  1. When ... is the last argument of a function call, the last value of a return statement, or the last list member of a table constructor, it represents all of the variable arguments passed in. For example:

    print("hello: ", ...) -- last argument
    local t = {1, 2, ...} -- last list member
    return a+b, ... -- the last return value
    
  2. When ... is the last expression after the equal sign = in a local variable definition or an assignment statement, the number of values is expanded or reduced as required. For example:

    local x, y = ... -- Take the first 2 arguments and assign them to x and y respectively
    t.k, t.j = a, ... -- Take the first argument and assign it to t.j
    
  3. Anywhere else, ... represents only the first variable argument passed in. For example:

    local x, y = ..., b -- not the last expression, only take the first argument and assign it to x
    t.k, t.j = ..., b -- not the last expression, only take the first argument and assign it to t.k
    if ... then -- conditional judgment
        t[...] = ... + f -- table index, and operands of binary operations
    end
    

Among them, the first scenario is the most basic, but it is also the most complicated to implement; the latter two scenarios are special cases and relatively simple to implement. The three scenarios are analyzed in turn below.
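
Before going through the implementation, the intended run-time result of each scenario can be modeled on a plain slice. This is a behavioral sketch only: the functions `expand_all`, `expand_want` and `expand_first` and the reduced `Value` enum are hypothetical and are not part of the interpreter.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Nil,
    Integer(i64),
}

/// Scenario 1: `...` in the last position expands to all variable arguments.
fn expand_all(varargs: &[Value]) -> Vec<Value> {
    varargs.to_vec()
}

/// Scenario 2: exactly `want` values are needed; extra arguments are
/// dropped and missing ones are filled with nil.
fn expand_want(varargs: &[Value], want: usize) -> Vec<Value> {
    let mut values: Vec<Value> = varargs.iter().take(want).cloned().collect();
    values.resize(want, Value::Nil);
    values
}

/// Scenario 3: only the first variable argument is used, or nil if
/// there is none.
fn expand_first(varargs: &[Value]) -> Value {
    varargs.first().cloned().unwrap_or(Value::Nil)
}
```

Seen this way, the third scenario is the second one with a count of one, which is part of why the latter two scenarios are simpler to implement than the first.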

Scenario 1: All Variable Arguments

The first scenario, loading all variable arguments, is introduced first. The three statements in this scenario are as follows:

  1. The last argument of a function call. Here the variable arguments of the current function are passed on as arguments of the called function. Two functions are involved, which makes this case a little confusing to describe;

  2. The last value of a return statement. Return values are not supported yet and will be introduced in the next section;

  3. The last list member of a table constructor.

The three statements are implemented in a similar way. While parsing the expression list, every expression except the last is discharged immediately. After the whole list has been parsed, the last expression is checked separately to see whether it is ExpDesc::VarArgs:

  • If not, it is discharged normally. In this case the number of expressions is known during parsing and can be encoded into the corresponding bytecode.

  • If so, the newly added bytecode VarArgs is used to load all variable arguments. Their number is not known during syntax analysis and only becomes known when the virtual machine executes, so the total number of expressions cannot be encoded into the corresponding bytecode. It has to be handled with a special value or a new bytecode.

Of the three statements, the third one, table construction, is the simplest, so it is introduced first.

The previous syntax analysis of table construction worked as follows: while reading all members in a loop, each array member was discharged onto the stack as soon as it was parsed. After the loop, all array members were on the stack in order, and a SetList bytecode was generated to add them to the table's array part. The second associated parameter of SetList is the number of array members. For simplicity, the batch loading used when there are more than 50 members is ignored here.

Now modify the process: to handle the last expression separately, the discharge of each array member must be delayed by one step. The method is simple but awkward to describe in words, so refer to the following code. It is excerpted from the table_constructor() function, keeping only the parts relevant to this section.

     // Add this variable to save the last read array member
     let mut last_array_entry = None;

     // Loop to read all members
     loop {
         let entry = // omit the code to read members
         match entry {
             TableEntry::Map((op, opk, key)) => // omit the code of the member part of the dictionary
             TableEntry::Array(desc) => {
                 // Use replace() to store the new member `desc` in
                 // last_array_entry as the current "last member", and
                 // discharge the member that was stored there before, if any.
                 if let Some(last) = last_array_entry.replace(desc) {
                     self.discharge(sp0, last);
                 }
             }
         }
     }

     // process the last expression, if any
     if let Some(last) = last_array_entry {
         let num = if self.discharge_expand(last) {
             // variable arguments. It is impossible to know the specific number
             // of arguments in the syntax analysis stage, so `0` is used to
             // represent all arguments on the stack.
             0
         } else {
             // not variable arguments, so we can calculate the total number of members
             (self.sp - (table + 1)) as u8
         };
         self.fp.byte_codes.push(ByteCode::SetList(table as u8, num));
     }

The flow of the code above is fairly simple, so it is not explained line by line. A few details about handling the last expression are worth covering:

  • A discharge_expand() method is added to handle ExpDesc::VarArgs expressions specially. This function will also be used by the other two statements (the return statement and the function call) later. Its code is as follows:
     fn discharge_expand(&mut self, desc: ExpDesc) -> bool {
         match desc {
             ExpDesc::VarArgs => {
                 self.fp.byte_codes.push(ByteCode::VarArgs(self.sp as u8));
                 true
             }
             _ => {
                 self.discharge(self.sp, desc);
                 false
             }
         }
     }
  • If the last expression is the variable arguments, the second associated parameter of the SetList bytecode is set to 0. Previously (before variable argument expressions were supported), this parameter could never be 0: if there were no array members, no SetList bytecode was generated at all. So 0 can be used as a special value here. In contrast, the other two statements in this scenario (the return statement and the function call) already allow 0 expressions, meaning no return values or no arguments, so 0 cannot be used directly as a special value there, and another approach is needed.

    Of course, instead of the special value 0, a new bytecode such as SetListAll could be added for this case. The two approaches are similar; here the special value 0 is used.

  • When the virtual machine executes SetList and the second associated parameter is 0, it takes all values above the table on the stack. That is, everything from the slot after the table up to the top of the stack is treated as an initialization expression. The code is as follows, with the check for 0 added:

     ByteCode::SetList(table, n) => {
         let ivalue = self.base + table as usize + 1;
         if let Value::Table(table) = self.get_stack(table).clone() {
             let end = if n == 0 { // 0, variable arguments, means all expressions up to the top of stack
                 self.stack.len()
             } else {
                 ivalue + n as usize
             };
             let values = self.stack.drain(ivalue .. end);
             table.borrow_mut().array.extend(values);
         } else {
             panic!("not table");
         }
     }
  • Since the actual number of expressions can be obtained from the top of the stack at run time in the variable argument case, could the same be done in the fixed case? Would that make the second associated parameter of SetList unnecessary? The answer is no, because there may be temporary variables on the stack. For example:

    t = { g1+g2 }

The two operands of the expression g1+g2 are global variables. Before the expression is evaluated, each must be loaded onto the stack, which takes two temporary slots. The stack layout is as follows:

|       |
+-------+
|   t   |
+-------+
| g1+g2 | load the global variable g1 here temporarily; overwritten by g1+g2 later
+-------+
|   g2  | load the global variable g2 here temporarily
+-------+
|       |

At this point, the top of the stack is g2. If everything from the slot after the table up to the top of the stack were taken, g2 would also be treated as a list member. Therefore, in the fixed case, the number of expressions still has to be determined during syntax analysis.

  • Then why can the top of the stack be used to determine the number of expressions in the variable argument case? Because the virtual machine cleans up temporary variables when it executes the bytecode that loads the variable arguments. This point is very important. The code is as follows:
     ByteCode::VarArgs(dst) => {
         self.stack.truncate(self.base + dst as usize); // Clean up temporary variables!!!
         self.stack.extend_from_slice(&varargs); // load variable parameters
     }

This completes the handling of variable arguments as the last expression of a table constructor. There is not much code involved, but sorting out the approach and the details takes some care.
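
The interplay between the two bytecodes can be checked on a toy stack. The following sketch is an assumption-laden model rather than the virtual machine's code: `exec_varargs`, `exec_setlist` and `build_table` are hypothetical free functions, the table slot is a placeholder variant, and the array part is kept in a separate `Vec`. It runs `{1, 2, ...}` with one stale temporary left on the stack.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Table, // placeholder for the table being constructed
    Integer(i64),
}

/// VarArgs(dst): drop everything from `dst` upwards, then load all varargs.
fn exec_varargs(stack: &mut Vec<Value>, base: usize, dst: usize, varargs: &[Value]) {
    stack.truncate(base + dst);
    stack.extend_from_slice(varargs);
}

/// SetList(table, n): move the values after the table slot into the array.
/// `n == 0` means "everything up to the top of the stack".
fn exec_setlist(stack: &mut Vec<Value>, array: &mut Vec<Value>, base: usize, table: usize, n: usize) {
    let ivalue = base + table + 1;
    let end = if n == 0 { stack.len() } else { ivalue + n };
    array.extend(stack.drain(ivalue..end));
}

/// Build `{1, 2, ...}` with or without cleaning up temporaries first.
fn build_table(varargs: &[Value], clean_temporaries: bool) -> Vec<Value> {
    use Value::{Integer, Table};
    // Slot 0: the table; slots 1 and 2: the members; slot 3: a stale temporary.
    let mut stack = vec![Table, Integer(1), Integer(2), Integer(99)];
    let mut array = Vec::new();
    if clean_temporaries {
        exec_varargs(&mut stack, 0, 3, varargs);
    } else {
        stack.extend_from_slice(varargs); // deliberately wrong: keeps the temporary
    }
    exec_setlist(&mut stack, &mut array, 0, 0, 0);
    array
}
```

With cleanup, the array receives 1, 2 and then the variable arguments. Without it, the stale value 99 ends up in the table as well, which is the failure the truncation prevents.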

Scenario 1: All Variable Arguments (continued)

The table construction statement of the first scenario was introduced above. This part covers variable arguments used as the last argument of a function call. The two statements handle variable arguments in the same way, so only the differences are described here.

As explained in Syntax Analysis of Arguments above, all arguments are loaded onto the top of the stack in order by the explist() function, and the number of arguments is written into the Call bytecode. That implementation did not support variable arguments. To support them, the last expression needs special treatment, so explist() is modified to keep and return the last expression and to discharge only the preceding expressions onto the stack. The code is simple and is skipped here. Recall that in the assignment statement, when reading the expression list on the right side of the equal sign =, the last expression also has to be kept without being discharged. With this change, explist() can be used in the assignment statement as well.

With explist() modified, and following the approach used for table construction above, variable arguments in a function call can be implemented. The code is as follows:

     fn args(&mut self) -> ExpDesc {
         let ifunc = self.sp - 1;
         let narg = match self.lex.next() {
             Token::ParL => { // argument list wrapped in parentheses ()
                 if self.lex.peek() != &Token::ParR {
                     // Read the argument list. Keep and return the last expression
                     // `last_exp`, and discharge the previous expressions onto the
                     // stack in turn and return their number `nexp`.
                     let (nexp, last_exp) = self.explist();
                     self.lex.expect(Token::ParR);

                     if self.discharge_expand(last_exp) {
                         // Variable arguments !!!
                         // Generate the newly added `VarArgs` bytecode
                         // and read all variable arguments.
                         None
                     } else {
                         // Fixed arguments. `last_exp` is also discharged onto the stack as the last argument.
                         Some(nexp + 1)
                     }
                 } else { // no arguments
                     self.lex.next();
                     Some(0)
                 }
             }
             Token::CurlyL => { // table constructor without parentheses
                 self.table_constructor();
                 Some(1)
             }
             Token::String(s) => { // string constant without parentheses
                 self.discharge(ifunc+1, ExpDesc::String(s));
                 Some(1)
             }
             t => panic!("invalid args {t:?}"),
         };

         // For `n` fixed arguments, convert to `n+1`;
         // Converts to `0` for variable arguments.
         let narg_plus = if let Some(n) = narg { n + 1 } else { 0 };

         ExpDesc::Call(ifunc, narg_plus)
     }

The difference from the table construction statement is this. The bytecode for table construction is SetList, and with a fixed number of members its count parameter is never 0, so 0 can serve as a special value meaning a variable number of members. A function call, however, may have no arguments at all, so the argument count in the Call bytecode can already be 0, and 0 cannot simply be reused as a special value. There are two options:

  • Choose another special value, such as u8::MAX (255);
  • Still use 0 as the special value, but add 1 to the count in the fixed case. For example, with 5 arguments, write 6 into the Call bytecode; with N arguments, write N+1. This guarantees that the parameter is greater than 0 whenever the number of arguments is fixed.

The first option seems slightly better to me: it is clearer and less error-prone. But the official Lua implementation uses the second option, so we use it too. It corresponds to the two variables in the code above:

  • narg: Option<usize> indicates the actual number of arguments, None indicates variable arguments, Some(n) indicates that there are n fixed arguments;
  • narg_plus: usize is the corrected value to be written into Call bytecode.
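
The mapping between the two variables can be written as a pair of small functions. This is a sketch for illustration: `encode_narg` and `decode_narg` are hypothetical names, and the decoder takes the stack length and base as plain numbers instead of reading them from the virtual machine.

```rust
/// Parser side: turn the argument count into the value stored in `Call`.
/// `None` (variable arguments) becomes 0, and `Some(n)` becomes n + 1.
fn encode_narg(narg: Option<usize>) -> u8 {
    match narg {
        Some(n) => {
            assert!(n < u8::MAX as usize, "too many arguments for a u8 operand");
            (n + 1) as u8
        }
        None => 0,
    }
}

/// Virtual machine side: recover the actual number of arguments.
/// For the special value 0 it is the distance from `base` to the stack top.
fn decode_narg(narg_plus: u8, stack_len: usize, base: usize) -> usize {
    if narg_plus == 0 {
        stack_len - base
    } else {
        narg_plus as usize - 1
    }
}
```

A call with no arguments is encoded as 1, so it can never be confused with the special value 0 that marks variable arguments.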

What is the same as in the table construction statement is that, since the special value 0 represents variable arguments, the virtual machine needs a way to find the actual number of arguments at run time. That number can only be computed as the distance between the top of the stack and the function entry, so everything above the function entry must be an argument, with no temporary variables. There are two cases:

  • The last argument is the variable arguments ..., that is, VarArgs. For example, the call is foo(1, 2, ...). The execution of VarArgs described earlier already cleans up temporary variables, so no further cleanup is needed in this case;
  • The arguments are fixed. For example, the call is foo(g1+g2). In this case possible temporary variables need to be cleaned up.

Accordingly, the execution of the Call bytecode in the virtual machine needs the following changes:

  • Decode the associated parameter narg_plus to get the actual number of arguments;
  • Clean up possible temporary variables on the stack when needed.

The code is as follows:

     ByteCode::Call(func, narg_plus) => { // `narg_plus` is the corrected number of arguments
         self.base += func as usize + 1;
         match &self.stack[self.base - 1] {
             Value::LuaFunction(f) => {
                 let narg = if narg_plus == 0 {
                     // Variable arguments. As mentioned above, the execution
                     // of VarArgs bytecode will clean up possible temporary
                     // variable, so the top of the stack can be used to determine
                     // the actual number of arguments.
                     self.stack.len() - self.base
                 } else {
                     // Fixed arguments. Need to subtract 1 for correction.
                     narg_plus as usize - 1
                 };

                 if narg < f.nparam { // fill nil, original logic
                     self.fill_stack(narg, f.nparam - narg);
                 } else if f.has_varargs && narg_plus != 0 {
                     // If the called function supports variable arguments, and the
                     // call is a fixed argument, then we need to clean up possible
                     // temporary variables on the stack
                     self.stack.truncate(self.base + narg);
                 }

                 self.execute(&f);
             }

So far, we have completed the part of the first scenario of variable arguments. This part is the most basic and also the most complex. Two other scenarios are described below.
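
To see why the cleanup in the fixed-argument case matters, the preparation steps can be combined in one toy function. This sketch is not the virtual machine's Call handler: `prepare_call` is a hypothetical helper that merges the argument adjustment with collecting the variable arguments into a `Vec`, and the `Value` enum is reduced to what the example needs.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Nil,
    Integer(i64),
    Function(&'static str),
}

/// Adjust the stack for a call and return the callee's variable arguments.
fn prepare_call(
    stack: &mut Vec<Value>,
    base: usize,
    narg_plus: u8,
    nparam: usize,
    has_varargs: bool,
) -> Vec<Value> {
    let narg = if narg_plus == 0 {
        stack.len() - base // variable arguments: measure up to the stack top
    } else {
        narg_plus as usize - 1 // fixed arguments: undo the +1 correction
    };

    if narg < nparam {
        // Too few arguments: drop anything above them and fill with nil.
        stack.truncate(base + narg);
        stack.resize(base + nparam, Value::Nil);
    } else if has_varargs && narg_plus != 0 {
        // Fixed call to a varargs function: remove stale temporaries so
        // that they are not mistaken for variable arguments.
        stack.truncate(base + narg);
    }

    if has_varargs && stack.len() > base + nparam {
        stack.drain(base + nparam..).collect()
    } else {
        Vec::new()
    }
}
```

If a function declared with one fixed parameter and `...` is called with one argument while a temporary is still sitting above that argument, the truncation ensures that the callee sees no variable arguments instead of the stale temporary.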

Scenario 2: The First N Variable Arguments

Now for the second scenario of variable arguments, which needs a fixed number of them. Since the number of arguments required is fixed, it can be compiled into the bytecode, which makes this scenario much simpler than the previous one.

This scenario covers two statements: the local variable definition and the assignment statement. When the variable arguments are the last expression after the equal sign =, the number of values is expanded or reduced as required. For example:

     local x, y = ... -- Take the first 2 arguments and assign them to x and y respectively
     t.k, t.j = a, ... -- Take the first argument and assign it to t.j

These two statements are handled in basically the same way, so only the local variable definition is described here.

Previously, this statement first loaded the expressions on the right side of = onto the stack in order, which completed the assignment of the local variables. If there were fewer expressions on the right than local variables on the left, a LoadNil bytecode was generated to assign nil to the remaining local variables; otherwise no further processing was needed.

Now the last expression needs special treatment: if there are fewer expressions than local variables and the last expression is the variable arguments ..., then as many arguments as needed are read from it; if it is not the variable arguments, the original approach of filling with LoadNil is used. The explist() function modified earlier comes in handy again. The code is as follows:

     let want = vars.len();

     // Read the list of expressions.
     // Keep and return the last expression `last_exp`, discharge the preceding
     // expressions onto the stack in turn, and return their number `nexp`.
     let (nexp, last_exp) = self.explist();
     match (nexp + 1).cmp(&want) {
         Ordering::Equal => {
             // If the number of expressions matches the number of local variables,
             // the last expression is also discharged onto the stack.
             self.discharge(self.sp, last_exp);
         }
         Ordering::Less => {
             // If there are fewer expressions than local variables,
             // the last expression needs special treatment.
             self.discharge_expand_want(last_exp, want - nexp);
         }
         Ordering::Greater => {
             // If there are more expressions than local variables,
             // adjust the stack top pointer; the last expression
             // does not need to be handled.
             self.sp -= nexp - want;
         }
     }

The new logic in the code above is the discharge_expand_want() function, which loads want - nexp values onto the stack. The code is as follows:

     fn discharge_expand_want(&mut self, desc: ExpDesc, want: usize) {
         debug_assert!(want > 1);
         let code = match desc {
             ExpDesc::VarArgs => {
                 // variadic expression
                 ByteCode::VarArgs(self.sp as u8, want as u8)
             }
             _ => {
                 // For other types of expressions, still use the previous method, that is, use LoadNil to fill
                 self.discharge(self.sp, desc);
                 ByteCode::LoadNil(self.sp as u8, want as u8 - 1)
             }
         };
         self.fp.byte_codes.push(code);
     }

This function is similar to the discharge_expand() function in the first scenario above, but there are two differences:

  • The previous function requires all of the variable arguments available at execution time, whereas this function requires a specific number of them, so it takes an additional parameter want;

  • The previous function needs to return whether the expression is a variable arguments expression so that the caller can distinguish the cases; this function has no return value because the requirement is clear and the caller does not need to make that distinction.
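The parser-side decision in this scenario can be summarized in a small standalone sketch. The LastExpPlan enum and the plan_last_exp() function are hypothetical names introduced only for illustration; the real parser emits bytecode and adjusts self.sp directly instead of returning a value.

```rust
use std::cmp::Ordering;

/// Hypothetical summary of what the parser does with the last expression.
#[derive(Debug, PartialEq)]
enum LastExpPlan {
    /// Counts match: discharge the last expression as a single value.
    Discharge,
    /// Too few expressions: the last one must supply this many values
    /// (via VarArgs if it is `...`, otherwise one value plus LoadNil).
    Expand(usize),
    /// Too many expressions: drop this many already-loaded values
    /// by lowering the stack top pointer.
    Drop(usize),
}

/// `nexp` is the number of expressions already discharged (all but the last);
/// `want` is the number of variables on the left side of `=`.
fn plan_last_exp(nexp: usize, want: usize) -> LastExpPlan {
    match (nexp + 1).cmp(&want) {
        Ordering::Equal => LastExpPlan::Discharge,
        Ordering::Less => LastExpPlan::Expand(want - nexp),
        Ordering::Greater => LastExpPlan::Drop(nexp - want),
    }
}
```

Under this model, local x, y = ... gives plan_last_exp(0, 2), which is Expand(2); local x, y, z = a, ... gives Expand(2) as well; and local x = a, b, c gives Drop(1). The Expand count is always at least 2, which matches the debug_assert!(want > 1) in discharge_expand_want().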

Compared with the first scenario above, another important change is that the VarArgs bytecode gains an associated parameter indicating how many arguments need to be loaded onto the stack. In this scenario the parameter is always at least 2, and in the next scenario it is fixed at 1, so 0 is otherwise unused and can serve as a special value representing the first scenario above: all arguments at execution time.

The virtual machine's execution code for this bytecode changes accordingly:

     ByteCode::VarArgs(dst, want) => {
         self.stack.truncate(self.base + dst as usize);

         let len = varargs.len(); // actual number of arguments
         let want = want as usize; // number of arguments needed
         if want == 0 { // All arguments are required, and the process remains unchanged
             self.stack.extend_from_slice(&varargs);
         } else if want > len {
             // Need more than actual, fill `nil` with fill_stack()
             self.stack.extend_from_slice(&varargs);
             self.fill_stack(dst as usize + len, want - len);
         } else {
             // Need as many as or fewer than actual
             self.stack.extend_from_slice(&varargs[..want]);
         }
     }
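The three branches above can be tried out in isolation with a minimal model. This is a sketch under stated assumptions, not the interpreter's code: the Value enum is a stand-in for the interpreter's value type, load_varargs() is a hypothetical free function, dst is treated as an absolute stack index (the real code adds self.base), and padding with nil is done with Vec::resize instead of fill_stack().

```rust
/// Stand-in for the interpreter's value type (hypothetical, for illustration).
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Nil,
    Integer(i64),
}

/// Model of `VarArgs(dst, want)`.
/// `want == 0` means "all arguments"; any other value means "exactly `want`".
/// Assumes `stack.len() >= dst`.
fn load_varargs(stack: &mut Vec<Value>, dst: usize, want: usize, varargs: &[Value]) {
    // Everything from `dst` upward is overwritten.
    stack.truncate(dst);
    if want == 0 {
        // All arguments are required.
        stack.extend_from_slice(varargs);
    } else {
        // Take at most `want` values, then pad with nil up to `want`.
        let take = want.min(varargs.len());
        stack.extend_from_slice(&varargs[..take]);
        stack.resize(dst + want, Value::Nil);
    }
}
```

With three actual arguments 1, 2, 3 and dst pointing just past the existing stack contents, want = 2 appends 1 and 2, want = 5 appends 1, 2, 3 followed by two nils, and want = 0 appends all three. The two non-zero branches of the original code collapse into one here because resize() is a no-op when enough values were already copied.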

This completes the second scenario of variable arguments.

Scenario 3: Only Take the First Variable Argument

The two scenarios above each occur in a specific statement context, where the variable arguments are loaded onto the stack by the discharge_expand() or discharge_expand_want() function respectively. The third scenario is everywhere else, outside those specific statement contexts. From this perspective, it can be regarded as the general scenario, so a general loading method must be used. Before this section introduced the variable arguments expression, all other expressions were loaded onto the stack by calling the discharge() function, which can be regarded as the general loading method. So in this scenario, the variable arguments expression should also be loaded through the discharge() function.

In fact, this scenario has already been encountered above. For example, in the second scenario above, if the number of expressions on the right side of = is equal to the number of local variables, the last expression is processed by the discharge() function:

     let (nexp, last_exp) = self.explist();
     match (nexp + 1).cmp(&want) {
         Ordering::Equal => {
             // If the number of expressions matches the number of
             // local variables, the last expression is also
             // discharged onto the stack as usual.
             self.discharge(self.sp, last_exp);
         }

Here the last expression passed to discharge() may itself be the variable arguments expression ..., in which case it falls under the current scenario.

As another example, both scenarios above call the explist() function to process the expression list. Every expression except the last is loaded onto the stack by that function through discharge(). If one of those earlier expressions is the variable arguments expression ..., as in foo(a, ..., b), it also falls under the current scenario.

In addition, the examples listed earlier of variable arguments expressions in other statements all belong to the current scenario.

Since this is the general scenario, no changes are needed in the syntax analysis stage; it is only necessary to handle the variable arguments expression ExpDesc::VarArgs in the discharge() function. This is also very simple: use the VarArgs bytecode introduced above and load only the first argument onto the stack:

     fn discharge(&mut self, dst: usize, desc: ExpDesc) {
         let code = match desc {
             ExpDesc::VarArgs => ByteCode::VarArgs(dst as u8, 1), // 1 means only load the first argument

This completes the third scenario.
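The effect of position on ... in a call such as foo(a, ..., b) can be modeled with a short standalone sketch. It assumes that the last item of a call's argument list belongs to the first scenario (all arguments) and every other position to the third (first argument only). The Value and Arg enums and the evaluate_args() function are hypothetical names for illustration; the real interpreter works on the stack through bytecode, not on vectors.

```rust
/// Stand-in for the interpreter's value type (hypothetical, for illustration).
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Nil,
    Integer(i64),
}

/// One item of an argument list: an ordinary value or the `...` expression.
enum Arg {
    Single(Value),
    VarArgs,
}

/// Compute the values a call receives, given the enclosing function's
/// variable arguments `varargs`.
fn evaluate_args(args: &[Arg], varargs: &[Value]) -> Vec<Value> {
    let mut out = Vec::new();
    let last = args.len().saturating_sub(1);
    for (i, arg) in args.iter().enumerate() {
        match arg {
            Arg::Single(v) => out.push(v.clone()),
            // Last position: expands to all variable arguments.
            Arg::VarArgs if i == last => out.extend_from_slice(varargs),
            // Any other position: only the first argument, or nil if none.
            Arg::VarArgs => out.push(varargs.first().cloned().unwrap_or(Value::Nil)),
        }
    }
    out
}
```

With variable arguments 1, 2, the list (10, ..., 20) evaluates to 10, 1, 20, while (10, ...) evaluates to 10, 1, 2. With no variable arguments at all, (10, ..., 20) evaluates to 10, nil, 20.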

At this point, all the scenarios of variable arguments have finally been covered.

Summary

This section began by introducing the mechanisms of parameters and arguments respectively. For parameters, syntax analysis places them in the local variable table, where they are used as local variables. For arguments, the caller loads them onto the stack, which is equivalent to assigning values to the parameters.

Most of the rest of the section covered the processing of variable arguments in three scenarios: all arguments, a fixed number of arguments, and only the first argument in the general scenario.