Wednesday, November 2, 2022

Implementing WHILE in a toy BASIC interpreter

While working on a toy BASIC implementation, I ran into an unexpected challenge implementing the WHILE instruction.

The implementation of this instruction seems simple. Here is an example :

10 X = 1
20 WHILE X <> 10
30 PRINT X
40 X = X + 1
50 WEND

This program is going to print the numbers from 1 to 9. The WHILE statement is really a combination of WHILE and WEND. The WEND statement indicates the end of the 'block'.

The first challenge is to create a relation between these statements. This relation is going to be used by the interpreter to identify where to 'jump' when evaluating WHILE and WEND. One interesting challenge is that BASIC (GWBASIC) supports `nested` WHILE blocks:

10 X = 10
20 Y = 10
30 WHILE X <> 0
40 PRINT "X = ", X
50 Y = 10
60 WHILE Y <> 0
70 PRINT "y =", Y
80 Y = Y - 1
90 WEND
100 X = X - 1
110 WEND

Since the source of the original GW-BASIC implementation is published in GitHub, we can take a look at it here: https://github.com/microsoft/GW-BASIC. There is code in GWMAIN.ASM used to search for the correspoing WEND of a WHILE (a code block called WNDSCN). This also seems to be used to locate the FOR=/=NEXT pair of instructions. It seems that the interpreter tries to find the matching instruction by scanning the instructions that follow the WHILE. A counter is used to keep track of nested WHILE/WEND blocks:

..
PUBLIC	WNDSCN
WNDSCN: MOV	CL,LOW OFFSET ERRWH	;SCAN FOR MATCHING WEND THIS IS ERROR IF FAIL
.
.
.
FORINC: INC	CH		;INCREMENT THE COUNT WHENEVER "FOR" IS SEEN
.
.
.
        JZ	SHORT NXTLOK	;FOR/NEXT SEARCHING
        CMP	AL,LOW OFFSET $WHILE	;ANOTHER WHILE/WEND NEST?
        JZ	SHORT FORINC
        CMP	AL,LOW OFFSET $WEND
        JNZ	SHORT FNLOP
        DEC	CH
.
.
.

1.2 Strategy

One way to start implementing this feature in the interpreter was to add a table (pair_instruction_table) to the interpreter context. This table is going to keep the relation between instructions. In particular it will be used to relate a WHILE and WEND pair. In the near future it will keep the relation between FOR and NEXT.

The evaluation context will now look like this:

pub struct EvaluationContext<'a> {
    pub variables: HashMap<String, ExpressionEvalResult>,
    pub array_variables: HashMap<String, GwArray>,
    pub jump_table: HashMap<i16, i16>,
    pub underlying_program: Option<&'a mut GwProgram>,   
    pub pair_instruction_table: HashMap<i16, i16>,
}

With this information we can create a search function that emulates the same behavior as WNDSCN . Something like this:

fn find_wend(line: i16, real_lines: &Vec<&Box<dyn GwInstruction>>) -> i16{
    let mut curr_line = line + 1;
    let mut while_end_balance = 0;
    loop {
        if curr_line >= real_lines.len() as i16 {
            break;
        } else if let Some(ref instr) = real_lines.get(curr_line as usize) {

            if instr.is_while() {
                while_end_balance += 1;
            }
            if instr.is_wend() {
                if while_end_balance  == 0 {
                    return curr_line as i16;
                } else {
                    while_end_balance -= 1;
                }
            }
        }
        curr_line += 1;
    }
    return -1;
}

Notice that we solve the problem of nested WHILE/WEND blocks by keeping a counter while_end_balance.

With this utility we can create the implementation of the WHILE statement with the following code:

impl GwInstruction for GwWhile {
    fn eval (&self, line: i16, context : &mut EvaluationContext) -> InstructionResult {
        let mut wend_line : i16 = 0;

        // Find the cached corresponding line for this WHILE statement
        if let Some(corresponding_wend) =  context.pair_instruction_table.get(&line) {      
            wend_line = *corresponding_wend;
        } else if let Some(ref real_lines) = context.real_lines {
            // Try to look for the WEND statement in the program lines
            let index_of_wend = find_wend(line, real_lines);
            if index_of_wend == -1 {
                return InstructionResult::EvaluateToError(String::from("WHILE WITHOUT WEND"));
            } else {
                context.pair_instruction_table.insert(line, index_of_wend);
                context.pair_instruction_table.insert(index_of_wend, line);
            }       
            wend_line = index_of_wend;
        }

        // Evaluate the condition and move the following line
        let condition_evaluation = self.condition.eval(context);
        match condition_evaluation {
            ExpressionEvalResult::IntegerResult(result) if result == 0 => {
                InstructionResult::EvaluateLine(wend_line + 1)
            }
            ExpressionEvalResult::IntegerResult(_) => {
                InstructionResult::EvaluateNext
            }       
            _ => {
                InstructionResult::EvaluateToError(String::from("Type mismatch"))
            }
        }
    }

1.3 Additional problems

While working in this implementation I found an additional problem. In BASIC you can have several statements in one line separated by colons (:) . For example you can have a complete WHILE block in just one line:

10 X = 1 : WHILE X < 10 : PRINT X : X = X + 1 :WEND

Or you can have a WHILE in one line and the WEND inside another line:

10  x = 1
20 WHILE x <> 5
30 print x
40 x = x + 1 : print "a" : WEND
50 print "END"

To support his scenario the program is "flattened" before executing it. For example, the previous snippet becomes is converted to:

0  x = 1
1 WHILE x <> 5
2 print x
3 x = x + 1 
4 print "a" 
5 WEND
6 print "END"

This way it is easy to implement a jump between instructions that didn't exist in as explicit lines in the original program. This transformation is performed before running the program:

...
   let real_lines = &mut context.real_lines.as_mut().expect("instr vector");
   for e in self.lines.iter() {
      real_lines.push(&e.instruction);
      if let Some(ref rest) = e.rest_instructions {
         for nested in rest {
            real_lines.push(&nested);
         }
      }	    
   }
...

Something that needs to be improved is the way to identify existing WHILE and WEND instructions. Sadly, right now this is implemented using a pair of methods is_while and is_wend. These methods are defined in the GwInstruction trait and overriden on GwWhile and GwWend. This is ugly since these methods are very specific to this problem. It doesn't seem right to have them in GwInstruction.

One alternative to solve this problem is to redesign the code representation to include a way to retrieve the original instruction from a GwInstruction. This is one of the things that will be implemented next.

Code for the interpreter is here: https://github.com/ldfallas/rgwbasic .