A JavaScript Implementation Of The Logo Programming Language - Part 2

In this article we continue developing the Logo Programming language further by writing a simple TokenCollection class to help traverse the array of tokens returned from the lexer we created in part 1. This class will be used in Part 3, when we develop the parser.


In part 1, we developed a simply lexer function that returns an array of tokens given an input string. We now need to create a class that will allow us to traverse the tokens with the ability to move back and forth. To understand why we need this class, let's take a look at how we can use a loop to iterate through the tokens.

Listing 1

let tokens = tokenize("FORWARD 100");

for(let i=0; i < tokens.length; i++){
    let token = token[i];
}

On first iteration the variable token is FORWARD and on the second iteration the token variable is 100. The token 100 is an argument of the FORWARD command and as a result, we would need to look ahead at the next token while currently viewing the FORWARD command. The loop also needs to account for white space tokens and in some cases we would need to look ahead on more than one token.

A more efficient way to iterate over the tokens, is to encapsulate the tokens in a class and call methods such as getNextToken() which would return a token and advance an internal array pointer. Let's take a look at some code for a better understanding.

Listing 2

class TokenCollection{

    constructor(tokens){
        this._tokens = tokens;
        this._index = -1;
    }

    getNextToken(){
        this._index ++;

        if(this._index < this._tokens.length){
            return this._tokens[this._index];
        }
        return false;
    }
}

let str = "FORWARD 100";

let tokens = tokenize(str);

let tokenCollection = new TokenCollection(tokens);

console.log(tokenCollection.getNextToken());
console.log(tokenCollection.getNextToken());
console.log(tokenCollection.getNextToken());
console.log(tokenCollection.getNextToken());

The TokenCollection class above is initialized with an array of tokens that is returned from the tokenize() function. Notice the _index variable is initialized with the value -1. This value is incremented each time the getNextToken() method is called. The final call to getNextToken() will return false as there are only three tokens.

The second call to getNextToken() returns a white space token. White space tokens are problematic because it's not known in advance how many white space tokens exists between commands and command arguments. We can eliminate this problem by writing a function that consumes all white space tokens until a non white space token is encountered. This function needs to inspect the next token without advancing the internal index. If the next token is of type T_SPACE then we advance to the next token and repeat the process again.

Let's first create a function that can inspect the next token without advancing the internal pointer.

Listing 3

class TokenCollection{
...
    inspectNextToken(){
        if((this._index + 1) < this._tokens.length){
            return this._tokens[this._index + 1];
        }
        return false;
    }
...
}

Using the function above we can now write a function that will consume all white space tokens as shown below.

Listing 4

...
    consumeSpaces(){
        let next = this.inspectNextToken();
        if(next && next.type == 'T_SPACE'){
            this._index++;
            this.consumeSpaces();
        }
    }
...


Notice that the function is recursive. We only advance to the next token if the current inspecting token is of type T_SPACE.

We now have a class that we can use to traverse an array of tokens, getting and inspecting tokens. In part 3, we take a look at the Recursive Descent Parser, an algorithm used in parsing.

HANA DB

Rust

Java

SAP Business One

Node.js