Table Construction

This section describes the construction of tables. The construction supports 3 types: list type, record type, and general type. See the following sample codes respectively:

local key = "kkk"
print { 100, 200, 300; -- list style
        x="hello", y="world"; -- record style
        [key]="vvv"; -- general style
}

Let's first look at how the official implementation of Lua handles the construction of tables. The output of luac is as follows:

$ luac -l test_lua/table.lua

main <test_lua/table.lua:0,0> (14 instructions at 0x600001820080)
0+ params, 6 slots, 1 upvalue, 1 local, 7 constants, 0 functions
    1	[1]	VARARGPREP	0
    2	[1]	LOADK    	0 0	; "kkk"
    3	[2]	GETTABUP 	1 0 1	; _ENV "print"
    4	[2]	NEWTABLE 	2 3 3	; 3
    5	[2]	EXTRAARG 	0
    6	[2]	LOADI    	3 100
    7	[2]	LOADI    	4 200
    8	[2]	LOADI    	5 300
    9	[3]	SETFIELD 	2 2 3k	; "x" "hello"
    10	[3]	SETFIELD 	2 4 5k	; "y" "world"
    11	[4]	SETTABLE 	2 0 6k	; "vvv"
    12	[5]	SETLIST  	2 3 0
    13	[2]	CALL     	1 2 1	; 1 in 0 out
    14	[5]	RETURN   	1 1 1	; 0 out

The bytecodes related to the construction of the table are lines 4 to 12:

  • Line 4, NEWTABLE, is used to create a table. There are 3 parameters in total, which are the position of the new table on the stack, the length of the array part, and the part length of the hash table.
  • Line 5, I don't understand it, ignore it for now.
  • Lines 6, 7, and 8, three LOADIs, respectively load the values 100, 200, and 300 of the array part to the stack for later use.
  • Lines 9 and 10, bytecode SETFIELD, insert x and y into the hash table part respectively.
  • Line 11, bytecode SETTABLE, inserts the key into the hash table.
  • Line 12, SETLIST, loads the data loaded on the stack in lines 6-8 above, and inserts it into the array at one time.

The stack situation corresponding to the execution of each bytecode is as follows:

           |       |        /<--- 9.SETFILED
           +-------+        |<---10.SETFILED
4.NEWTABLE |  { }  |<----+--+<---11.SETTABLE
           +-------+     |
   6.LOADI |  100  |---->|
           +-------+     |12.SETLIST
   7.LOADI |  200  |---->|
           +-------+     |
   8.LOADI |  300  |---->/
           +-------+
           |       |

First of all, it can be seen that the table is constructed in real time by inserting members one by one during the execution of the virtual machine. This is a bit beyond my expectation (although I didn't think about the process before). I have previously written code similar to the following:

local function day_of_week(day)
     local days = {
         "Sunday"=0, "Monday"=1, "Tuesday"=2,
         "Wednesday"=3, "Thursday"=4, "Friday"=5,
         "Saturday"=6,
     }
     return days[day]
end

It is natural to put days inside the day_of_week() function, because this variable is only used inside this function. However, according to the realization of the above table structure, every time this function is called, the table will be constructed in real time, that is, the 7 dates will be inserted into the table. This cost is a bit high (8 string hashes and 1 string are required In comparison, at least 9 bytecodes are required, and there is more than one memory allocation brought about by creating a table). It feels not even as fast as comparing week by week names (an average of 4 string comparisons are required, and 2 bytecodes are compared for a total of 8). A better way is to put the days variable outside the function (that is UpValue introduced later), and there is no need to construct a table every time you enter the function, but in this way it is not a good programming practice to put variables inside a function outside. Another approach (not supported by Lua's official implementation) is to construct a table composed of all constants in the parsing stage, and then just quote it later, but this will bring some complexity. No energy to finish by now.

Back to the construction of the table, the processing methods for the array part and the hash table part are different:

  • The array part is to first load the values onto the stack in sequence, and finally insert them into the array at one time;
  • The hash table part is directly inserted into the hash table each time.

One is batch and one is sequential. The reasons for the different methods are speculated as follows:

  • If the array part is also inserted one by one, then inserting certain types of expressions requires 2 bytecodes. For example, for global variables, you need to use GetGlobal bytecode to load it on the stack first, and then use a bytecode similar to AppendTable to insert into the array, then inserting N values requires at most 2N bytecodes . If you insert in batches, only N+1 bytecodes are needed for N values. So bulk insert is better for the array part.

  • As for the hash table part, each piece of data has two values of key and value. If the batch method is also used, 2 bytecodes are required to load both values onto the stack. And if it is inserted one by one, only one bytecode is needed in many cases. For example, the last three items in the above sample code only correspond to one bytecode. In this way, the batch method requires more bytecodes, so inserting one by one is more suitable for the hash table part.

In this section, according to the official Lua implementation method, the following 4 bytecodes are correspondingly added:

pub enum ByteCode {
     NewTable(u8, u8, u8),
     SetTable(u8, u8, u8), // key is on the stack
     SetField(u8, u8, u8), // key is a string constant
     SetList(u8, u8),

However, the two bytecodes in the middle do not support the case where the value is a constant, only the index on the stack is supported. We'll add optimizations to constants in a later section.

Syntax Analysis

After introducing the principle of table construction, let's look at the specific implementation. Look at the syntax analysis section first. The code is very long, but it is just according to the above introduction, the logic is very simple. The code is posted here for reference only, readers who are not interested can skip here.

fn table_constructor(&mut self, dst: usize) {
     let table = dst as u8;
     let inew = self.byte_codes.len();
     self.byte_codes.push(ByteCode::NewTable(table, 0, 0)); // Create a new table

     let mut narray = 0;
     let mut nmap = 0;
     let mut sp = dst + 1;
     loop {
         match self. lex. peek() {
             Token::CurlyR => { // `}`
                 self. lex. next();
                 break;
             }
             Token::SqurL => { // `[` exp `]` `=` exp, general formula
                 nmap += 1;
                 self. lex. next();

                 self.load_exp(sp); // key
                 self.lex.expect(Token::SqurR); // `]`
                 self.lex.expect(Token::Assign); // `=`
                 self. load_exp(sp + 1); // value

                 self.byte_codes.push(ByteCode::SetTable(table, sp as u8, sp as u8 + 1));
             },
             Token::Name(_) => { // Name `=` exp | Name
                 nmap += 1;
                 let key = if let Token::Name(key) = self. lex. next() {
                     self. add_const(key)
                 };
                 if self.lex.peek() == &Token::Assign { // Name `=` exp, recorded
                     self. lex. next();
                     self. load_exp(sp); // value
                     self.byte_codes.push(ByteCode::SetField(table, key as u8, sp as u8));
                 } else {
                     narray += 1;
                     self.load_exp_with_ahead(sp, Token::Name(key)); // exp, list

                     sp += 1;
                     if sp - (dst + 1) > 50 { // too many, reset it
                         self.byte_codes.push(ByteCode::SetList(table, (sp - (dst + 1)) as u8));
                         sp = dst + 1;
                     }
                 }
             },
             _ => { // exp, list
                 narray += 1;
                 self. load_exp(sp);

                 sp += 1;
                 if sp - (dst + 1) > 50 { // too many, reset it
                     self.byte_codes.push(ByteCode::SetList(table, (sp - (dst + 1)) as u8));
                     sp = dst + 1;
                 }
             },
         }

         match self. lex. next() {Token::SemiColon | Token::Comma => (),
             Token::CurlyR => break,
             t => panic!("invalid table {t:?}"),
         }
     }

     if sp > dst + 1 {
         self.byte_codes.push(ByteCode::SetList(table, (sp - (dst + 1)) as u8));
     }

     // reset narray and nmap
     self.byte_codes[inew] = ByteCode::NewTable(table, narray, nmap);
}

The NewTable bytecode is generated at the beginning of the function, but since the number of members of the array and the hash table is not yet known, the latter two parameters are temporarily filled with 0. And write down the position of this bytecode, and modify the parameters at the end of the function.

The intermediate loop is to traverse all members of the table. There are 3 syntax types in total:

  • General type, [ exp ] = exp, key and value are both expressions, respectively loaded to the sp and sp+1 positions of the stack through the load_exp() function, and then generate SetTable bytecode;

  • Record type, Name = exp, key is Name, which is a string constant, added to the constant table, value is an expression, and finally generates SetField bytecode. There is a place here that is related to Rust's ownership mechanism, that is, the key obtained by matching the pattern branch Token::Name(key) of match self.lex.peek() cannot be directly passed through add_const(* key) added to the constant table. This is because peek() returns not Token itself, but a reference to Token, which is returned by self.lex.peek(), so the associated self.lex and self are also in the referenced state; calling self.add_const() is also a mut reference to self, which violates the reference rules. The correct way is to abandon the return value of peek(), but call self.lex.next() to return Token and re-match. At this time, Rust's inspection is too strict, because the Token reference returned by self.lex.peek() does not affect self.add_const(). It should be that Rust has no ability to determine that there is no influence between the two.

  • List type, exp, is loaded to the sp position of the stack, and sp is updated, waiting for the last SetList to perform insertion. But you can't load data on the stack infinitely, because this will cause the stack to reallocate memory all the time, so if the current data on the stack exceeds 50, generate a SetList bytecode to clean up the stack.

What needs to be explained here is that when the Name is parsed, it may be either a record type or a list type. We need to peek the next Token to distinguish between the two: if the next Token is =, it is a record type , otherwise it is tabular. The problem here is that Name is already peeked, and the lexical analysis only supports peek one Token because of [using Peekable](./ch03-03.read_input.md#use peekable), so it can only Modify the expression parsing function load_exp() to support a Token read in advance, and add load_exp_with_ahead() function for this purpose. In the entire Lua grammar, there is only one place that needs to look forward to two Tokens.

This kind of behavior that needs to look forward to two Tokens to determine the expression, I wonder if it is called LL(2)?

Virtual Machine Execution

The following is the virtual machine execution code of the newly added 4 bytecodes, which is also very simple and can be skipped:

     ByteCode::NewTable(dst, narray, nmap) => {
         let table = Table::new(narray as usize, nmap as usize);
         self.set_stack(dst, Value::Table(Rc::new(RefCell::new(table))));
     }
     ByteCode::SetTable(table, key, value) => {
         let key = self.stack[key as usize].clone();
         let value = self.stack[value as usize].clone();
         if let Value::Table(table) = &self. stack[table as usize] {
             table.borrow_mut().map.insert(key, value);
         } else {
             panic!("not table");
         }
     }
     ByteCode::SetField(table, key, value) => {
         let key = proto.constants[key as usize].clone();
         let value = self.stack[value as usize].clone();
         if let Value::Table(table) = &self. stack[table as usize] {
             table.borrow_mut().map.insert(key, value);
         } else {
             panic!("not table");
         }
     }
     ByteCode::SetList(table, n) => {
         let ivalue = table as usize + 1;
         if let Value::Table(table) = self.stack[table as usize].clone() {
             let values = self. stack. drain(ivalue .. ivalue + n as usize);
             table.borrow_mut().array.extend(values);
         } else {
             panic!("not table");
         }
     }

The first bytecode NewTable is very simple and will not be introduced. The latter two bytecodes SetTable and SetField are similar, and both need to get the mut reference of the table through borrow_mut(). The final bytecode SetList encounters Rust’s ownership problem again, and needs to explicitly call the clone() function on the list on the stack to create a pointer to an independent list. If clone() is not called, then the table variable obtained by the first line if let statement matching is a reference to the member on the stack, that is, a reference to the stack, and this reference needs to continue until the third line, so it cannot be released in advance; the second line calling stack.drain() needs to obtain the variable reference of the stack, which conflicts with the reference obtained by the table variable in the first line. Therefore, clone() needs to generate a pointer to an independent table, so that the table variable matched in the first line is only a reference to the table, and is separated from the reference to the stack, thereby avoiding conflicts.

The mandatory clone() here increases performance consumption, but also avoids potential bugs. For example, the stack location where the table is located may be included in the subsequent stack.drain(), so the address becomes invalid, and then the operation of inserting data into the table in the third subsequent line will be abnormal. Of course, in the scenario of SetList, the syntax analysis will ensure that the stack location cleaned by stack.drain() does not include the table, but the Rust compiler does not know, and there is no guarantee that it will not be included in the future. So clone() here completely eliminates this hidden danger, and it is worthwhile.

So far, we have completed the construction of the table, and the following sections will introduce the reading and writing of the table.