Learn to Code via Tutorials on Repl.it!

← Back to all posts
How To Make A Language: Parsing
CSharpIsGud (436)

The last tutorial didn't really give an in depth explanation as to what everything was doing aside from some comments, however for parsing I feel I should explain how it works.

Markdown was really weird on this post.

We are going to be using a recursive descent parser, written all by hand and with no dependencies.

Lets start with an expression 8 + 5 * 4 / 2

The first thing we are going to do is tokenize it, so this expression will tokenize to this

1: ("Number", 8)
2: ("Operator", "+")
3: ("Number", 5)
4: ("Operator", "*")
5: ("Number", 4)
6: ("Operator", "/")
7: ("Number", 2)

Im going to do a little experiment here, and if it works you should be able to read the above stream of tokens just like an rdp parser. When you see Add(), Sub(), Mul(), Div() and Literal() in the text below jump to the corresponding label!

Start with Add.

Add:
Left = Sub()

Is the next token a + operator? If so then:
       Right = Sub()
       Return (Left, "+", Right)

Otherwise return the left side

Sub:
Left = Mul()

Is the next token a - operator? If so then:
       Right = Mul()
       Return (Left, "-", Right)

Otherwise return the left side

Mul:
Left = Div()

Is the next token a * operator? If so then:
       Right = Div()
       Return (Left, "*", Right)

Otherwise return the left side

Div:
Left = Literal()

Is the next token a / operator? If so then:
       Right = Literal()
       Return (Left, "/", Right)

Otherwise return the left side

Literal:
Return the current token and advance to the next one

Now, if you went through everything above and kept track of where you were in the sequence you will notice that you end with a tree that looks roughly like this:

Commentshotnewtop
[deleted]

I like this

Soz about the 10yr old making Python classes rather than a programming language

LoganSpong (44)

@BXRStudios I get it already! I took down the post!

[deleted]

@LoganSpong I don’t think you are @CSharpIsGud but if you are hello otherwise I was talking to @CSharpIsGud

LoganSpong (44)

@BXRStudios No, I'm the 10 year old that made the uhh... thing.

CodeLongAndPros (875)

@BXRStudios

<rant>

Can people just

Get over it!

It is not a big deal! Point it out once, then be done!

You can ask them to change the name, but really?

This kind of behavior is downright mean. Some people just like to make stuff and show off what they made!

</rant>

CSharpIsGud (436)

@CodeLongAndPros Mean is kind of relative, since people will call calmly pointing out that importing a module doesn't make a new language mean while also ones that say the same thing in an insulting tone mean so there are 2 kinds of mean on replit

CodeLongAndPros (875)

@CSharpIsGud There are some people that don’t like it because it is “not canonical”. I mean, sure, you can point it out, but don’t fixate on it. I’ve pointed some things to that effect, but I don’t focus on them. You could say that some people are stating a fact, and others are fixated on that same fact.

[deleted]

Well let's just stop here then and all just take a break. :)

bramley (188)

Please check the Repl.it TOS, @LoganSpong

LoganSpong (44)

@bramley I did. I could'nt find the section at which you're talking about. Which one is it?

LoganSpong (44)

@bramley Is it the copyright one? Like this: You may be held accountable for damages (including costs and attorneys' fees) for misrepresentation or bad-faith claims on the infringement of any User Content found on and/or through the Service on your copyright.

Or is it this one?
If we learn of any user that is under the age of 13, we will immediately terminate the user’s Account and delete any and all information we may have about that user.

PythonPrograms (2)

Wow I like this, this is pretty cool and it’s very well explained

Viper2211 (54)

Ummmm, is that supposed to happen?

CSharpIsGud (436)

@Viper2211 oh oops, I didn't implement negative numbers

Viper2211 (54)

@CSharpIsGud That makes more sense. I thought the parser was broken or something...

CSharpIsGud (436)

@Viper2211 negative numbers work now

LoganSpong (44)

Cool! I wanted to learn how to make a parser, now I know! But what is the node.type thing?

CSharpIsGud (436)

@LoganSpong It could be a Binary Operator like 1 + 2, it could be an assignment like var = 15 or it could just be a Literal like "Hello World"

LoganSpong (44)

@CSharpIsGud cool. I'm making an CLI. It's pretty helpful

[deleted]

just make a bunch of functions and classes /s

DynamicSquid (1968)

This was really helpful!

StudentFires (327)

All this really does is parse an expression and then call Python's built-in operator, which probably calls C's built-in operator, which already sounds inefficient. Now if this were compiled into some binary that performed this, and maybe held some variables...

CSharpIsGud (436)

@StudentFires Thats basically how any interpreted language does it, this tutorial isn't going to cover a compiler but I might make a separate one after this series is finished.

StudentFires (327)

@CSharpIsGud Gonna show of the process of how you're making Qyl? If you do it might hype up Qyl so when it arrives it might get noticed.

VulcanWM (1709)

How do I add a parser to this: https://repl.it/@medcho/VainSmoggyApplicationsuite
I still don’t get what you mean?

VulcanWM (1709)

Thanks man, I seriously needed this

TheForArkLD (607)

I like these tutorial, can you make language with me by nodejs?

JadenGarcia (36)

Isn't recursion like, the worst way to do anything? Aren't bottom-up non-recursive parsers usually more efficient and fast?

CSharpIsGud (436)

@JadenGarcia Yes but I find rdp easier to understand and I haven't looked into other algorithms yet

JadenGarcia (36)

@CSharpIsGud Well, it's always better to look at all of your options, actually, I take that back. Humans suck at making decisions with too many choices.