This page will introduce a way of writing down what is and is not allowed in a computer program. I will start by applying the tool to something a little more familiar: the language this page appears in.

The longish first part may seem utterly pointless. The idea of the first part is to make the second part easier to grasp. And the point of the second part is to give you a way of viewing a language which will be an enormous help when you're working on a program and the compiler is complaining that you've done something that causes it to refuse to do what you've asked it to do. (That claim is repeated, on purpose, near the end of this!)

It is Just Wrong, in English, to write something like "ae23qfcq23.as\ q23 qqq2bv." That's gibberish. It is wrong to write "cat plane green". And then there are all those things about making sentences and paragraphs which our primary school teachers tried to help us understand.

We won't finish the job of saying what's allowed in English in this essay! But we will make a start, before going on to do something similar for what's allowed in a computer language. (The tools we will be talking about can be applied to any computer language. Or to any "natural" language, for that matter, but doing a full description of a natural language would be an impossibly large job.)

I'm going to shirk everything connected with using capital letters. We'll pretend they don't exist. And we won't do a lot with punctuation rules.

The process starts from the bottom up....

First Rule: A digit is any one of the following characters: 0,1,2...8,9

Second Rule: A letter is any one of the following characters: a,b,c... x,y,z

That wasn't so bad!

But it wasn't quite done in our "finished form". We're going to say such things as follows. We enclose descriptions of anything in curly brackets ({...}). It is more usual to use < and >, but using those in a web page is a PAIN for the author. (We will just have to do without the thing the curly brackets are usually used for!)

I should stress something here: "5" is a digit. It is also a number. "55" is a number, made up of two digits. Use those words carefully in what follows.

The "::=" means "the thing to the left of me is defined as follows..."

The "|" means "or".

Whew! And so far we're just trying to say "What is a letter?" and "What is a digit?"! Hang in there... it WILL be worth it, and is NOT as bad as it may seem at the moment.

{digit}::='0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'

{letter}::= 'a'|'b'|'c'|'d' .. and so on up to... 'x'|'y'|'z'

(I cheated with my "... and so on up to..."... but if you can't understand and forgive, you might as well give up now. But try to avoid such cheating in your own use of the system!)
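If it helps to see those two rules as something a computer could check, here is a minimal sketch in plain C++ (the language underneath Arduino programs). The names isDigitRule and isLetterRule are my own inventions for illustration, not part of any library. Each one simply asks "is this character one of the alternatives listed in the rule?"

```cpp
#include <string>

// {digit}::='0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
// True when c is one of the ten alternatives in the rule.
bool isDigitRule(char c) {
    const std::string digits = "0123456789";
    return digits.find(c) != std::string::npos;
}

// {letter}::='a'|'b'|'c'| ... |'x'|'y'|'z'
// Lower case only, because this essay pretends capitals don't exist.
bool isLetterRule(char c) {
    const std::string letters = "abcdefghijklmnopqrstuvwxyz";
    return letters.find(c) != std::string::npos;
}
```

With these, isDigitRule('5') is true and isLetterRule('Q') is false, because a capital Q is not among the alternatives the rule lists.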

Now that we have said what a digit and a letter are, we have made a great start!

Now we can say what numbers and words are! Well.. we can give a rule that gives some of the numbers that are possible, and we can give a rule that gives real words... and things that only look like words! To properly describe English is possible... but a huge task. To properly describe a computer language is not a huge task (relative to the benefits it gives us to have a way to speak about what is allowed).

A number is one or more digits; a word is one or more letters. We can say that as follows...

{number}::={digit}+

The "+" says "one or more of these"

{word}::={letter}+

Remember: I've already conceded that some of the numbers we allow in everyday use (for instance "three and a half") are not provided for. And that some "words" can be made by our rule which we wouldn't think were words. But all the words we DO use CAN be made by it!
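Here is a rough sketch, in plain C++, of what "one or more of these" amounts to when a program does the checking. The helper name oneOrMore, and the names isNumber and isWord, are made up for this illustration. The only assumption is the one the essay already makes: digits are 0 to 9 and letters are lower case a to z.

```cpp
#include <string>

// True when s is not empty and every character of s appears in allowed.
// The "not empty" test is what the "+" (one or more) asks for.
bool oneOrMore(const std::string& s, const std::string& allowed) {
    if (s.empty()) return false;
    return s.find_first_not_of(allowed) == std::string::npos;
}

// A number is one or more digits.
bool isNumber(const std::string& s) {
    return oneOrMore(s, "0123456789");
}

// A word is one or more letters.
bool isWord(const std::string& s) {
    return oneOrMore(s, "abcdefghijklmnopqrstuvwxyz");
}
```

Notice that isWord("qqq") is true. The rule lets through things that only look like words, just as conceded above, and isNumber("3.5") is false because the full stop is not a digit.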

Digression starts: The "+" is a little messy. If you want to be very clever, you can say that a number is...

{number}::={digit}|{number}{digit}

... i.e. "A number is... a) a digit... or...
b) a number followed by a digit."

Thinking about that: How, by our rules, is "123" a number?

Well, by our rules, all by themselves, no help from elsewhere, 1 is a digit. 2 is a digit. 3 is a digit.

So 1 is also a number.

Using...

{number}::={number}{digit}

.... we can say 12 is a number, because 12 is the number "1" followed by the digit "2".

That makes 12 a number. Now we can say 123 is a number because it is 12 followed by a 3!

Digression ends.
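For the curious, the walk through "123" above can be turned into a small recursive check. This is only a sketch in plain C++, and the names isDigitChar and isNumberRecursive are mine, invented for the illustration. It follows the "a number followed by a digit" reading: peel off the last digit, then ask whether what remains is a number.

```cpp
#include <string>

// A digit is one of the characters '0' to '9'.
bool isDigitChar(char c) {
    return c >= '0' && c <= '9';
}

// A number is a) a single digit, or b) a number followed by a digit.
bool isNumberRecursive(const std::string& s) {
    if (s.empty()) return false;                 // nothing at all is not a number
    if (!isDigitChar(s.back())) return false;    // both cases end with a digit
    if (s.size() == 1) return true;              // case a): one lone digit
    // case b): everything before the last digit must itself be a number
    return isNumberRecursive(s.substr(0, s.size() - 1));
}
```

Asked about "123", the function checks the 3, then asks itself about "12", then about "1", which is a lone digit. That is the same chain of reasoning as in the digression, run backwards.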

So... if we want to say "one or more of {these}", we put a + after it, as in what we saw a moment ago...

{number}::={digit}+

Good news! That's the end, for a moment, of hard stuff. Don't think too deeply about what follows. Read it. Make sure you've "got it"... but don't look too deeply for profundities.

Time for four new rules. (A "period" is what people who speak American, not English, call a "full stop", by the way, for my international audience.)

{noun}::='dog'|'book'|'wind'|'light'
{verb-intrans}::='barks'|'blows'|'shines'
{period}::='.'
{space}::=' '

(In the last, there is a "space" between the 's. You knew that.)

With those, we can make the following rules....

{sentence}::={noun}{space}{verb-intrans}{period}

.. and with that, we can say that the following are valid sentences...

VALID sentences. They don't break our very simple, very limited rules for English. (Or American.)
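Because the {sentence} rule is so small, a program can list every sentence it permits. The sketch below, in plain C++, does just that. The function name allSentences is invented for this illustration. With four nouns and three verbs, the rule allows exactly twelve sentences.

```cpp
#include <string>
#include <vector>

// Builds every string allowed by
// {sentence}::={noun}{space}{verb-intrans}{period}
std::vector<std::string> allSentences() {
    const std::vector<std::string> nouns = {"dog", "book", "wind", "light"};
    const std::vector<std::string> verbs = {"barks", "blows", "shines"};
    std::vector<std::string> out;
    for (const std::string& n : nouns) {
        for (const std::string& v : verbs) {
            out.push_back(n + " " + v + ".");   // noun, space, verb, period
        }
    }
    return out;
}
```

The list includes sensible things like "dog barks." and also "book barks.", which obeys the rule perfectly well even though no book ever barked. Syntax rules say what is allowed, not what is sensible.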

Anyone trying to describe a natural language has taken on a huge task, but there are ways to do it. Remember: We're only doing this as a prelude to using the same system to describe computer languages. When applied to computer languages, you will get valid and sensible (up to a point) "sentences", etc.

"Up to a point"?? There are some errors which you cannot prevent, just with these tools. Suppose your computer program was to control a burglar alarm system, and you "said", in the program, "When a burglar is detected, ring all of the bells". But you forgot to say "... if the system is armed at the time." Our tools... they're called "syntax rule definitions"... won't help you with that sort of error. But they will catch many! It would be possible to write a language for programming alarm systems, and, just as we have defined {noun}, the language would have dictionaries of allowed words. If the programmer mis-typed, say, "burglar", then the syntax checking would catch it. Or if the programmer tried to get away with "Burglar detected: ring bells", the syntax checking might say it isn't enough. The "when" might be part of how the language knows what to put together for the computer which will run the program.

Time to stop reading quickly. Another "thing" to master. It is a bit like the "+" we talked about earlier...

{number}::={digit}+

Meant "A number is one or more digits", remember?

If we put a * after the }, it means NONE or more of the thing in the brackets...

{number}::={digit}{digit}*

.. would actually be the same rule as before... in a less "tidy" form. The most recent rule says...

"A number is a digit followed by none or more further digits"

Same result! Just "the hard way".

Here's an improved version of {sentence}...

{sentence}::={noun}{','{space}{noun}}*{space}{verb-intrans}{period}

NOW we can make the following!...

Hang in there! This IS leading to something useful. If you think reading it is tedious, imagine how much fun writing it was! (Yes, by the way, I know that the second should be "dog, light bark" (not barkS). Etc. Did I SAY that this is a rough example?)

A couple of things to notice:

In the rule, a comma appears... as a comma. We didn't do the usual thing of making a rule, "{comma}::=','". We did that for {space} because it isn't easy to SEE a space. There was no need to, for commas.

One part of the rule is as follows...

{','{space}{noun}}*

The OUTER brackets, and the fact that the asterisk is outside them, say that ALL OF ",{space}{noun}" appears zero, one, or more times after the first noun, which must always be present.
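To see the "zero or more times" group at work, here is a sketch of a checker for the improved {sentence} rule, in plain C++. The names take and isSentence are my own, invented for the illustration. The while loop is the asterisk: it keeps going for as long as another comma turns up, and each comma must be followed by a space and a noun.

```cpp
#include <string>
#include <vector>

// If text, starting at pos, begins with one of the options,
// step pos past that option and report success.
bool take(const std::string& text, std::size_t& pos,
          const std::vector<std::string>& options) {
    for (const std::string& opt : options) {
        if (text.compare(pos, opt.size(), opt) == 0) {
            pos += opt.size();
            return true;
        }
    }
    return false;
}

// {sentence}::={noun}{','{space}{noun}}*{space}{verb-intrans}{period}
bool isSentence(const std::string& text) {
    const std::vector<std::string> nouns = {"dog", "book", "wind", "light"};
    const std::vector<std::string> verbs = {"barks", "blows", "shines"};
    std::size_t pos = 0;
    if (!take(text, pos, nouns)) return false;       // the first noun, always required
    while (take(text, pos, {", "})) {                // the group: comma and space...
        if (!take(text, pos, nouns)) return false;   // ...then another noun
    }
    if (!take(text, pos, {" "})) return false;       // {space}
    if (!take(text, pos, verbs)) return false;       // {verb-intrans}
    if (!take(text, pos, {"."})) return false;       // {period}
    return pos == text.size();                       // nothing may be left over
}
```

So isSentence("dog barks.") and isSentence("dog, light barks.") are both true, while isSentence("dog,light barks.") is false: the rule insists on the space after the comma.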

(If you are really keen, work on the rule a bit more, to make it require the usual "and", e.g. make sentences like...

book and light barks.

... but there's no need to do that.)

Hurrah! That's enough!

That's enough to let us describe any computer language extremely well.

In this essay, I will be describing the language used with Arduinos.

My description will not cover everything that is allowed in Arduino programming. But it will be built up in stages. Stages which will allow novices to do novice things after reading a little.

Basic tenets...

We'll start with some things we had in our attempt at syntax rules for English!...

{digit}::='0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
{letter}::= 'a'|'b'|'c'|'d' .. and so on up to... 'x'|'y'|'z'
{number}::={digit}+
{space}::=' '

-------------------
Watch it... next will take some reading and re-reading...

RESERVED WORDS: The language has many words which are "reserved". They already have a meaning. You are not allowed to use them for other things. Some you may have come across already: pinMode, digitalWrite, delay. Note that case matters. "PINmode" is "a different word" from "pinMode", as far as the Arduino language is concerned. (Even though you CAN, it is a Bad Idea to use something like PINmode, an unreserved word, when pinMode, a reserved word, exists.)

Reserved words are quite easy to spot... the editor gives them colors. I've cheated in not doing rules to say all of that. Sorry. But not as sorry as you would have been if I'd done it "properly".

I'm going to "cheat" again here too, and not use our system rigorously to tell you that we're going to use {id-user} to mean "a letter, followed by any mix of letters*, digits*, and underscores*, which, together, do not make up a RESERVED WORD". (The *'s ARE from our rule system, to say that you can have none or more of any of those.) That shortcut saved you a very tedious "proper", "with the rules" block of text to say the same thing the hard way with an "{id-user}::=" rule.

Some sample {id-user}s: myVariable, bTmp, bTmpByJS, bHowLongOn
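Since I dodged writing {id-user} out as a proper rule, here is a sketch of the same idea as a plain C++ check. Treat it as an illustration under stated assumptions: the function name isIdUser is invented, the reserved list is a tiny sample rather than the language's full set, and capital letters are accepted because the sample names above use them.

```cpp
#include <cctype>
#include <string>
#include <vector>

// A letter, then any mix of letters, digits and underscores,
// which together do not spell a reserved word.
bool isIdUser(const std::string& s) {
    // ASSUMPTION: a deliberately short sample, not the real, full list.
    const std::vector<std::string> reserved =
        {"pinMode", "digitalWrite", "delay", "void", "int"};

    if (s.empty()) return false;
    if (!std::isalpha(static_cast<unsigned char>(s[0]))) return false;  // must start with a letter
    for (char c : s) {
        const bool ok = std::isalnum(static_cast<unsigned char>(c)) || c == '_';
        if (!ok) return false;                   // only letters, digits, underscores
    }
    for (const std::string& r : reserved) {
        if (s == r) return false;                // exact match only: case matters
    }
    return true;
}
```

By this check "PINmode" passes and "pinMode" does not, which matches the point made earlier: they are different words as far as the language is concerned, even if using the first is a Bad Idea.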

In a moment, I am going to use {block-stmts}. We haven't said what that is yet. We should have done. We will in a little while.

I already warned you to read carefully. Consider that warning doubled.... Now we come to...

{decl-func}::=
     'void'{space}{{'setup'}|{'loop'}|{id-user}}'()'
               '{'
                   {block-stmts}
               '}'

Examples first, still being vague about the {block-stmts} part...

For all of these, we will use "int x=5;" for the {block-stmts} required by the rule...

void setup()
   {
     int x=5;
   }

void loop()
   {
     int x=5;
   }

void OneNamedByMe()
   {
     int x=5;
   }

Right.. back to reading just "hard", not "doubly hard".

How does our rule tell us that's what a {decl-func} is? ("decl-func" for "the declaration of a function", by the way. Whatever a "function" might be! (We'll come to it!))

{decl-func}::=
     'void'{space}{{'setup'}|{'loop'}|{id-user}}'()'
               '{'
                   {block-stmts}
               '}'

The rule says (for now) that every {decl-func} begins with "void", followed by a space. So far so good! Then we have either: "setup", or (because of the |, remember) "loop", or an {id-user}. "OneNamedByMe" was an {id-user}... an id created by the programmer, the "user".

THEN, according to the rule (which is right) we have two normal brackets... ()

THEN a "{"... that actual character, in our program. HERE it has nothing to do with how we use it in our rules, say for things like "{id-user}".

Re-read that last one. Sorry. Sigh.

Then the mysterious block of statements. In the examples, the {block-stmts} was the "int x=5" line.

AND LASTLY, a "}".

And THAT's a {decl-func}!!

Take a deep breath. Make a cup of coffee. And go back to where I said "Watch it" and read the section again. You won't "get" all of it the first two times... but it will prepare you to start using the ideas, and with use they will become good friends.

(A small digression: {decl-func}s don't always start with "void". What else can go there, why, is a story for another time. End of digression!)

So what's a {block-stmts} ?

It's a block! Of statements!

So we start with "What's a statement?", and for now, I am just going to show you a few examples of "a statement", in other words, a {stmt}...

Sample statements:

int x=5

pinMode(13,OUTPUT)

delay(200)

Not too terrible?

NOW I can say what a {block-stmts} is!...

{block-stmts}::={{stmt}';'}+

In other words, a block of statements is one or more statements, with a semicolon after each one, including the last.

That may seem a simple enough idea... and it is. But you'll be amazed at how many times a missing... or extra... semicolon makes a nuisance of itself.
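Here is a sketch, in plain C++, of a check for just the semicolon part of that rule. It is a simplification: it treats any run of characters up to a semicolon as "a statement" without asking whether it is a good one, and the name isBlockStmts is invented for the illustration.

```cpp
#include <string>

// One or more statements, each followed by ';', including the last.
// SIMPLIFICATION: any non-blank text before a ';' counts as a statement.
bool isBlockStmts(const std::string& text) {
    const std::string blanks = " \t\r\n";
    std::size_t pos = text.find_first_not_of(blanks);   // skip leading blanks
    int count = 0;
    while (pos != std::string::npos) {
        const std::size_t semi = text.find(';', pos);
        if (semi == std::string::npos) return false;    // missing ';' after the last statement
        if (semi == pos) return false;                  // extra ';' with no statement before it
        ++count;
        pos = text.find_first_not_of(blanks, semi + 1); // move to the next statement, if any
    }
    return count >= 1;                                  // the "+" : at least one statement
}
```

Both nuisances mentioned above show up: "int x=5; delay(200)" fails for the missing semicolon, and "int x=5;; delay(200);" fails for the extra one. (The real compiler happens to be more forgiving about a stray extra semicolon than the rule as written here.)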

HURRAH!!! THAT's the back of this problem broken. You have a way to go yet, but the hardest bits are behind us!

Allowed statements

Now that you have the framework for writing programs in this language, there are a bunch... not hundreds, and only a few essential ones... of statements to master.

But they're all pretty straightforward.

Take the "delay" statement... it is defined as follows:

{stmt-DELAY}::='delay('{number}')'

... so "delay(200)" is a legal version of the delay statement. (Remember that, according to our rules, a number can't be negative, and can't have a fraction part. That's not the "everyday" meaning of "number", but that's what our rules HERE mean!)

I've called that {stmt-DELAY} with delay in caps to say that this is what the Arduino language uses the RESERVED WORD "delay" for.
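The {stmt-DELAY} rule is short enough to check by hand, but here it is as a plain C++ sketch anyway. The name isStmtDelay is invented for the illustration, and the check follows the rule as written in this essay, nothing more: the literal text "delay(", then one or more digits, then ")".

```cpp
#include <string>

// 'delay(' then a {number} (one or more digits) then ')'
bool isStmtDelay(const std::string& s) {
    const std::string open = "delay(";
    if (s.size() < open.size() + 2) return false;             // room for a digit and ')'
    if (s.compare(0, open.size(), open) != 0) return false;   // must start with delay(
    if (s.back() != ')') return false;                        // must end with )
    // Whatever sits between the brackets has to be a {number}.
    const std::string inside = s.substr(open.size(), s.size() - open.size() - 1);
    return !inside.empty() &&
           inside.find_first_not_of("0123456789") == std::string::npos;
}
```

isStmtDelay("delay(200)") is true. "delay(-5)" and "delay(2.5)" are false, since neither "-5" nor "2.5" is a number by our rules, and "delay(200);" is false too, because the semicolon belongs to {block-stmts}, not to the statement itself.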

Another {stmt}...

{stmt-PINMODE}::='pinMode('{number}','
        {'OUTPUT'|'INPUT'|'INPUT_PULLUP'}')'

From that rule, the following is okay...

pinMode(13,OUTPUT)
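A sketch of the same kind of check for {stmt-PINMODE}, again in plain C++ with an invented name, isStmtPinMode. The interesting part is the last line, where the three alternatives separated by "|" in the rule become three "or" tests. It follows the rule exactly as written in this essay, so it is stricter than the real compiler, which would also put up with spaces after the comma.

```cpp
#include <string>

// 'pinMode(' {number} ',' then OUTPUT or INPUT or INPUT_PULLUP, then ')'
bool isStmtPinMode(const std::string& s) {
    const std::string open = "pinMode(";
    if (s.compare(0, open.size(), open) != 0) return false;   // must start with pinMode(
    const std::size_t comma = s.find(',', open.size());
    if (comma == std::string::npos) return false;             // the ',' is required
    // Between the bracket and the comma: a {number}.
    const std::string pin = s.substr(open.size(), comma - open.size());
    if (pin.empty() ||
        pin.find_first_not_of("0123456789") != std::string::npos) return false;
    // After the comma: one of the three allowed words, then ')'.
    const std::string rest = s.substr(comma + 1);
    return rest == "OUTPUT)" || rest == "INPUT)" || rest == "INPUT_PULLUP)";
}
```

isStmtPinMode("pinMode(13,OUTPUT)") is true, while "pinMode(13,output)" is false: case matters, and "output" is not one of the alternatives.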

Two more statements...

{comment-OneLine}::='//' followed by whatever you like, for one line only

{comment-MultiLine}::='/*' followed by what you like, then '*/'

Examples....

//This is a one line comment.

/*This is a multi-line
*  comment even though the
*  lines are quite short.
*The asterisk at the start
*  of the middle lines is
*  not required by the
*  language, but helps
*  the reader */

For reasons that will become obvious in a moment, I hope, and to demonstrate something powerful in this system of saying EXACTLY what is allowed/ required...

{stmt-COMMENT}::={comment-MultiLine}|{comment-OneLine}

(This one is a little unusual in that we don't use the word "comment" on a comment statement line, as we DID use "pinMode" on a {stmt-PINMODE} line.)

Earlier, I said...

{block-stmts}::={{stmt}';'}+

.. but I hadn't said what a {stmt} was! Bad!

A {stmt} is any one of the things we have given names starting {stmt-...

... and there are a whole bunch more to show you, in due course. But first! Ta da!....

What goes in an Arduino program?

We can now say, comprehensively, unambiguously, elegantly, what an Arduino Program Is!! (Some people call them "sketches". I think I see why. I just don't feel the need to make the distinction I think they are trying for. But you will see "sketch", so I wanted to explain.)

An Arduino program is...

{ArduinoProgram}::=
        {stmt-COMMENT}*

        {decl-func}

        {stmt-COMMENT}*

        {decl-func}

        {stmt-COMMENT}*

        {decl-func}*

From the above rule: You may... or may not... have one or more comment lines anywhere in the code. Or you may have none at all. (The program will run... but not having comments is a bad idea. Writing good comments is an art. It takes time to learn.)

There will always be at least two function declarations ({decl-func}s).

There must be one with the word "setup", and one with the word "loop". They should be in that order. (There are ways to add those requirements to our "rules" system, but I thought I'd spare you, when it can be said in English quite clearly.)

That's not all

That, as you may have guessed, is not all there is to writing Arduino programs.

But it is a very good... and a solid start.

There are other things that can go in {ArduinoProgram}::=. I was tempted to say....

{ArduinoProgram}::={Stuff}+

... and that {Stuff} had to include one setup() {decl-func} and one loop() {decl-func} in that order, and that other {Stuff} in the program could be {stmt-COMMENT}s.. and other stuff that we will discover as we proceed.

That... my simple way of saying what an {ArduinoProgram} is... does say the same thing as the "fancy", "correct" version says. But it hardly does so "by the rules".

Ah well... it is what you understand that matters... and we'll find out what that is as we try to use what wrestling with all of that may have given you.

A good grasp of this view of a language is of enormous help when you're working on a program, and the compiler is complaining that you've done something that has made it refuse to do what you've asked it to do. (Repeated from earlier on purpose!)


Here is how you can contact this page's author, Tom Boyd.