# A journey of 603 pages...

That feeling of "I have no idea what this is" is exciting for me; it's what led me to pick up [Crafting Interpreters by Robert Nystrom](https://craftinginterpreters.com/).

I already have a *decent* understanding of what an [interpreter](https://en.wikipedia.org/wiki/Interpreter_(computing)) is, what it does, why we need it, and how it differs from a [compiler](https://en.wikipedia.org/wiki/Compiler).

It turns out that the differences between interpreted and compiled languages are somewhat fuzzy. I might dive into this topic in a later post...

What's more important to understand is how languages are **implemented**.

That is, how do we take a language's human-readable code and convert it into something a CPU can understand?

## The Map of a Language Implementation

In the first two chapters of his book, Robert took me through the high-level components of a language's **implementation**.

You've got the **lexer**, which takes in the raw source text of a program and *tokenizes* it. These tokens are like the individual words and punctuation of a natural language.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1680558550115/bc914e21-712a-4793-9141-b661f25e0925.png align="center")
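To make the lexer step concrete, here's a minimal sketch in Rust. It is not the book's scanner and not Lox's real token set: the `Token` enum and the `tokenize` function are names I made up for a tiny arithmetic language, just to show raw text turning into tokens.

```rust
/// Token kinds for a tiny arithmetic language (illustrative only, not Lox's token set).
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Number(f64),
    Identifier(String),
    Plus,
    Minus,
    Star,
    Slash,
    LeftParen,
    RightParen,
}

/// Splits raw source text into tokens, skipping whitespace.
pub fn tokenize(source: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = source.chars().peekable();

    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_ascii_digit() {
            // Collect digits (and dots) into one number literal.
            let mut text = String::new();
            while let Some(&d) = chars.peek() {
                if !(d.is_ascii_digit() || d == '.') {
                    break;
                }
                text.push(d);
                chars.next();
            }
            let value: f64 = text
                .parse()
                .map_err(|_| format!("invalid number '{}'", text))?;
            tokens.push(Token::Number(value));
        } else if c.is_alphabetic() || c == '_' {
            // Collect letters, digits, and underscores into one identifier.
            let mut text = String::new();
            while let Some(&d) = chars.peek() {
                if !(d.is_alphanumeric() || d == '_') {
                    break;
                }
                text.push(d);
                chars.next();
            }
            tokens.push(Token::Identifier(text));
        } else {
            // Everything else must be a single-character token.
            let token = match c {
                '+' => Token::Plus,
                '-' => Token::Minus,
                '*' => Token::Star,
                '/' => Token::Slash,
                '(' => Token::LeftParen,
                ')' => Token::RightParen,
                other => return Err(format!("unexpected character '{}'", other)),
            };
            tokens.push(token);
            chars.next();
        }
    }
    Ok(tokens)
}
```

With this sketch, `tokenize("1 + 2")` yields `Number(1.0)`, `Plus`, `Number(2.0)`, while a character it doesn't know (say `$`) comes back as an error instead of a token.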

Then the **parser** takes these tokens and builds an **abstract syntax tree** that captures how the tokens nest and relate to one another (i.e., the structure of the program). Essentially, the parser structures the tokens according to the grammar of the language.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1680558865622/ded027b0-ebab-48c2-862e-5344e68df782.png align="center")
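Here's a rough sketch of what "structuring tokens according to a grammar" can look like in Rust. I'm assuming a toy grammar with only numbers, `+`, `*`, and parentheses, and I'm using recursive descent, which is one parsing technique among many. The `Token`, `Expr`, and `Parser` types are hypothetical names for this example.

```rust
/// A deliberately tiny token set (hypothetical, for illustration).
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Number(f64),
    Plus,
    Star,
    LeftParen,
    RightParen,
}

/// The abstract syntax tree: a number, or an operator with two operands.
#[derive(Debug, PartialEq)]
pub enum Expr {
    Number(f64),
    Binary { left: Box<Expr>, op: Token, right: Box<Expr> },
}

pub struct Parser {
    tokens: Vec<Token>,
    pos: usize,
}

impl Parser {
    pub fn new(tokens: Vec<Token>) -> Self {
        Parser { tokens, pos: 0 }
    }

    /// Parses the whole token stream as one expression.
    pub fn parse(&mut self) -> Result<Expr, String> {
        let expr = self.sum()?;
        match self.peek() {
            None => Ok(expr),
            Some(extra) => Err(format!("unexpected token {:?}", extra)),
        }
    }

    // sum -> product ( "+" product )*
    fn sum(&mut self) -> Result<Expr, String> {
        let mut left = self.product()?;
        while matches!(self.peek(), Some(Token::Plus)) {
            self.pos += 1;
            let right = self.product()?;
            left = Expr::Binary { left: Box::new(left), op: Token::Plus, right: Box::new(right) };
        }
        Ok(left)
    }

    // product -> primary ( "*" primary )*
    fn product(&mut self) -> Result<Expr, String> {
        let mut left = self.primary()?;
        while matches!(self.peek(), Some(Token::Star)) {
            self.pos += 1;
            let right = self.primary()?;
            left = Expr::Binary { left: Box::new(left), op: Token::Star, right: Box::new(right) };
        }
        Ok(left)
    }

    // primary -> NUMBER | "(" sum ")"
    fn primary(&mut self) -> Result<Expr, String> {
        match self.peek().cloned() {
            Some(Token::Number(n)) => {
                self.pos += 1;
                Ok(Expr::Number(n))
            }
            Some(Token::LeftParen) => {
                self.pos += 1;
                let inner = self.sum()?;
                if matches!(self.peek(), Some(Token::RightParen)) {
                    self.pos += 1;
                    Ok(inner)
                } else {
                    Err("expected ')'".to_string())
                }
            }
            other => Err(format!("expected a number or '(', found {:?}", other)),
        }
    }

    fn peek(&self) -> Option<&Token> {
        self.tokens.get(self.pos)
    }
}
```

Because `product` sits below `sum` in the call chain, the tokens for `1 + 2 * 3` come out as a tree where `2 * 3` is the right-hand child of the `+` node. That is how the grammar encodes operator precedence.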

Next comes **static analysis**. This is where we take the tree produced by the **parser**, resolve each name in the code to the declaration it refers to, perform type checks if the language calls for them, and finally store this context somewhere so that it can be used in later steps.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1680559893291/072987d1-512e-45eb-a090-a21dc22c83bb.png align="center")
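Name resolution is easier to picture with a small example. The sketch below is my own simplification, not the book's resolver: it assumes a made-up `Node` type with only declarations, uses, and blocks, and it simply reports names that are used without a declaration in an enclosing scope.

```rust
use std::collections::HashSet;

/// A stripped-down program tree (hypothetical, for illustration).
#[derive(Debug)]
pub enum Node {
    Declare(String),  // something like `var x;`
    Use(String),      // a reference to `x`
    Block(Vec<Node>), // `{ ... }` opens a new scope
}

/// Returns every name that is used where no declaration is visible.
pub fn find_undefined(program: &[Node]) -> Vec<String> {
    // The first set is the global scope.
    let mut scopes = vec![HashSet::new()];
    let mut undefined = Vec::new();
    visit(program, &mut scopes, &mut undefined);
    undefined
}

fn visit(nodes: &[Node], scopes: &mut Vec<HashSet<String>>, undefined: &mut Vec<String>) {
    for node in nodes {
        match node {
            Node::Declare(name) => {
                // Declarations land in the innermost scope.
                if let Some(scope) = scopes.last_mut() {
                    scope.insert(name.clone());
                }
            }
            Node::Use(name) => {
                // A use is fine if any enclosing scope declares the name.
                if !scopes.iter().any(|scope| scope.contains(name)) {
                    undefined.push(name.clone());
                }
            }
            Node::Block(body) => {
                scopes.push(HashSet::new());
                visit(body, scopes, undefined);
                scopes.pop(); // names declared in the block go out of scope
            }
        }
    }
}
```

A real resolver would also record *which* declaration each use points to, so later stages can look it up. This sketch only covers the "is this name visible here?" half of the job.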

Once **static analysis** is complete, we need to represent the code in a way that is agnostic about the CPU architecture it will run on. We need an [**intermediate representation (IR)**](https://en.wikipedia.org/wiki/Intermediate_representation).

This way, the earlier stages don't need to know anything about the target machine: the same IR can be run by an interpreter or translated for x86, ARM, or any other architecture.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1680560102288/32b7aa60-dd5f-40e3-863d-3f0f5f6aacea.png align="center")

With our code expressed as an **IR**, we can apply optimizations to it. For example, we can use **constant folding** to find calculations whose operands are all constants (like `2 + 3`) and replace them with the resulting value ahead of time. This saves the program from needing to re-calculate the value each time it runs.
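As a sketch of constant folding, here's a small Rust function that walks an expression tree and collapses any operation whose two operands are both numbers. The `Expr` and `Op` types and the `fold_constants` name are assumptions for this example; a real IR would look different, and I'm folding directly on a tree to keep things short.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Add,
    Mul,
}

/// A tiny expression tree standing in for an IR (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(f64),
    Variable(String),
    Binary { op: Op, left: Box<Expr>, right: Box<Expr> },
}

/// Replaces operations on two constants with their computed result.
pub fn fold_constants(expr: Expr) -> Expr {
    match expr {
        Expr::Binary { op, left, right } => {
            // Fold the children first so nested constants bubble up.
            let left = fold_constants(*left);
            let right = fold_constants(*right);

            if let (Expr::Number(a), Expr::Number(b)) = (&left, &right) {
                let value = match op {
                    Op::Add => *a + *b,
                    Op::Mul => *a * *b,
                };
                return Expr::Number(value);
            }

            // At least one side is not a constant, so keep the operation.
            Expr::Binary { op, left: Box::new(left), right: Box::new(right) }
        }
        // Numbers and variables are already as simple as they get.
        other => other,
    }
}
```

Given the tree for `x * (2 + 3)`, this returns the tree for `x * 5`: the addition is computed once up front, while the multiplication stays because `x` isn't known until the program runs.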

Finally, we **generate code** that can be read by the CPU (or a virtual one in the case of a VM) and implement a **runtime** that includes features like garbage collection, type checking, and exception handling.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1680561105619/2e996830-67d6-4ead-ae80-2bf8e526f766.png align="center")
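To picture the "virtual CPU" case, here's a minimal sketch of generating code for a stack-based VM and then running it. Everything here is an assumption for illustration: the `Instr` instruction set, the `compile` and `run` functions, and the choice of a stack machine (register-based VMs are another common design).

```rust
/// The tree we generate code from (hypothetical, numbers and two operators only).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(f64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Instructions for a made-up stack machine.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Instr {
    Push(f64),
    Add,
    Mul,
}

/// Code generation: emit operands first, then the operator (post-order walk).
pub fn compile(expr: &Expr, code: &mut Vec<Instr>) {
    match expr {
        Expr::Number(n) => code.push(Instr::Push(*n)),
        Expr::Add(left, right) => {
            compile(left, code);
            compile(right, code);
            code.push(Instr::Add);
        }
        Expr::Mul(left, right) => {
            compile(left, code);
            compile(right, code);
            code.push(Instr::Mul);
        }
    }
}

/// The "virtual CPU": executes instructions against a value stack.
pub fn run(code: &[Instr]) -> Result<f64, String> {
    let mut stack: Vec<f64> = Vec::new();
    for instr in code {
        match instr {
            Instr::Push(n) => stack.push(*n),
            Instr::Add => {
                let (left, right) = pop_two(&mut stack)?;
                stack.push(left + right);
            }
            Instr::Mul => {
                let (left, right) = pop_two(&mut stack)?;
                stack.push(left * right);
            }
        }
    }
    stack.pop().ok_or_else(|| "nothing left on the stack".to_string())
}

fn pop_two(stack: &mut Vec<f64>) -> Result<(f64, f64), String> {
    let right = stack.pop().ok_or_else(|| "stack underflow".to_string())?;
    let left = stack.pop().ok_or_else(|| "stack underflow".to_string())?;
    Ok((left, right))
}
```

Compiling the tree for `1 + 2 * 3` produces `Push(1.0)`, `Push(2.0)`, `Push(3.0)`, `Mul`, `Add`, and running that sequence leaves `7.0` on the stack. The runtime features mentioned above, like garbage collection, would live alongside a loop like `run` in a real implementation.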

Great! So now I know all of the high-level components I need to build into the interpreter.

## So, what am I building exactly?

With the help of Nystrom's book, I'll build an interpreter for Lox, a small language designed for the book. To build this interpreter, I'll use the Rust programming language!

But before I start, I need to understand the intended structure, syntax, and grammar of Lox.

In the next installment, I'll share what I think of Lox and what I expect the challenges will be when building the interpreter for it.

Until then!