Bookmark this on Delicious Recommend toStumbleUpon

The beauty of XOR

We're going to start gently, and *then* get "tricky".

We're going to pull a few things together, to achieve a whole perhaps greater than its parts.

What I will have described by the end of this page is not, by itself, a complete system of encryption. But this page should give you ideas of things to go into such a system.

We are going to be using a mathematical system here, so it will be best if we can first turn any letters we want to encrypt into numbers.

We could just call "A" 1, "B" 2, "C" 3, etc... but if we do, how do we show an "a" (vs. an "A") when we want to? How do we show digits? Punctuation marks?

No. Far better to use the scheme called the ASCII "code"(-WP-). (Not a secret "code", and a little different... though not much... from the strict meaning of "code" in a cryptological context).

The ASCII code is widely used in computer work. It assigns a number to stand for every letter (with different numbers for the letter's upper case form and its lower case form), every decimal digit, and many punctuation marks.

As it is widely used in computer work, any sensible programming language has functions for going from the character to the number or back the other way.

For what we are going to do here, it will suffice to know the following...

Number Stands for 65 A 66 B 67 C . . 84 T

We're going to encrypt the plaintext "CAT".

We start by converting each letter to its ASCII form...

C A T becomes... 67 65 84

Harder to read as "67 65 84", but not very "secret" yet!

Remember "random" number generators? If you haven't read the Flat Earth Academy page on random numbers, this would be a good... but not essential... time to do it.

We learned that the "random" number generators (RNGs) in computer programming languages turn out a series of *seemingly* random numbers, but if you use the program language's tools for *seeding* the generator, then each time you run it after using a particular seed, you set the same series of numbers. Ah.

So. Let's pretend that our RNG, when seeded with, say, 42, turns out 3, 2 and 5 as the first three numbers in the series.

To encrypt "CAT", we add those numbers to the ASCII representations...

67 65 84, plus.... 3 2 5, making.... ------------- 70 67 89

That string of numbers is the encrypted version of "CAT" which you would send to the person you wanted to send "CAT" to, without sending it in plaintext.

Your recipient needs to know what RNG you were using (and have access to a copy of it), and he needs to know what seed you used. He would put the seed into his RNG, and generate three numbers. They *would* be 3, 2, 5, because of the way these things work. He would then take the ciphertext, and, one number at a time, *subtract* 3, 2 and 5, thus reversing the encryption step....

70 67 89, MINUS.... 3 2 5, making.... ------------- 67 65 84

That's "all" there is to it! (Almost)

You *can* do it exactly as above, if you wish. It will work fine... but you can end up with some "too big" numbers, depending on the range of numbers you have set the RNG to create.

It is reasonable to assume that if you are using this cipher, you will be getting a computer to do the arithmetic for you.

And it is easier for the computer if the only numbers involved are in the range 0-255 inclusive.

The basic ASCII meets this requirement quite comfortably.

If you set up your RNG to create only 0s to 9s, you probably won't end up with an encrypted number of more than 255... but your encryption won't be as strong as it easily could be.

Bring on the magic of "XOR".

And before we do that, we need to understand that a number still stands for whatever it stands for, no matter how we write it. "Twelve" and "12" and "A Dozen" all mean the same thing, don't they? You know those "codes". How about 00001100? That's another way of writing 12. That's 12 in binary, with leading zeros.

"Leading zeros"... that bit is really easy. We usually drop leading zeros, but think of the odometer (how-far-it-has-driven-thingie) in a new car. It may well say 000025 when the car's gone twenty five miles. You don't let the zeros in front of the 25 confuse you, do you?

I'm going to be using a lot of leading zeros, because we just do, in this context.

Remember I said that computers "like" to have numbers going just from 0-255? Why 255??? Well, 255, in binary, just happens to be 11111111. (Eight 1s).

Now. We come to the tricky bit. (The rest were just easy bits, to warm you up.)

There's a "thing" called XOR. It is a rule for combining two numbers to make a third number. It is a bit like adding... but easier: there is no "carrying". XOR is really easy.... If I XOR the following two numbers, "Top number" and "Bottom number", I get the answer shown....

11110011 ("Top number") 01010101 ("Bottom number").... if "XOR'd", become... -------- 10100110 (The answer to (Top number) XOR (Bottom number))

To check my work, you look at each column in turn, in isolation. If there's a 1 (in a given column) in EITHER "Top number" **or** "B", then the digit in that column of the answer is a 1. But! If there is an 1 in that column of "Top number" *and* a 1 in that column of "Bottom number", then that's the e**X**ception to the OR, (hence XOR), and you have a 0 in that column of the answer. (When there are no 1s, when it's a 0 in both "Top number" and "Bottom number", it's a 0 in the answer too.)

So far, so dull. Why bother?

Let me tell you how we're going to use this for encrypting.

Remember the first letter we wanted to encrypt was "C".

And the ASCII for that was 67.

67, in binary, is 01000011.

THAT will be our "Top number".

And the first number from the RNG was a 3.

In binary: 00000011
THAT will be our "Bottom number"....

01000011 ("Top number", a way of "saying" "C") 00000011 ("Bottom number", the 3 from the RNG) .... when "XOR'd", they become... -------- 01000000

... and THAT (01000000) is what you send to the person you want to send "CAT" to. (Along with the encrypted versions of the A and the T, of course.

As before, your recipient knows that the numbers used to encrypt the message were 3, 2 and 5. And he uses them on the encrypted text. How? He just uses XOR again! Here's the first one done, so you can check you really understood....

01000000 (The encrypted "C") a way of "saying" "C") 00000011 (The 3 from the RNG) .... when "XOR'd", they become... -------- 01000011

... which is the ASCII for "C"! Rather neat don't you think? Same mathematical rule both ENcrupts each character and DEcrypts it.

Why better than adding and subtracting the number from the RNG?

If you do that, there will be case where the result goes over 255, which can be a nuisance. If we use XOR, we can tell the RNG to give us numbers from 0-255 (inclusive) to XOR with the number we want to send, in encrypted form, and even if the number to be sent is 255 and the number from the RNG is 255, we still end up with a number in the range 0-255. And that will be true regardless of what ASCII number we want to encrypt, regardless of what number (in range 0-255) we have been given by the RNG.(Using a big range of numbers from the RNG is A Good Idea, and makes our encryption harder to break.

It is all just rather neat.

One approach to encryption, and it can of course be "mixed in" with other barriers to decryption, is to....

- Convert each character of the plaintext to a number
- Obtain a repeatable string of "random" numbers
- XOR the character's numbers with the random numbers

The result is your text encrypted. The recipient merely...

- Uses the same string of "random" numbers
- XORs each number in the encrypted text with the next "random" number
- Turns the resulting number back into a character.

Try writing a pair of computer programs to use this approach to prepare messages for secure transmission, and then decrypt them. Not very hard, really...

You probably are not yet asking the question I am about to answer... but bear with me. I'll explain the question, and two answers, and then you'll be ready for the question when it arises in what you are doing yourself!

Trust me... there are times, not easily explained, but they exist, when you want to keep the numbers you are creating within certain limits.

Let's say that you've decided you only need to encrypt A-Z... only upper case, no punctuation marks, no digits.

You might well "go with" a plan that made A zero, B one, and so on, up to Z made 25.

And then you might "do things" with the numbers, to make your cipher robust... say:....

Add 1 to the first letter in the message 2 to the second, 3 to the third, 1 to the 4th, 2 to the 5th, 3 to the 6th, 1 to the 7th... ... and so on.

Now... if the message was CAT, there would be no problem with this.

But if the message was BUZZ, and you had reasons for wanting only to use 0-25 in your encrypted text, as well as your plaintext, you'd be in trouble, as I'll explain in a moment.

(Just before explaining, let me promise you that there will be times when you DO want to stay within a limited range of numbers.)

In trouble? Why? Well, when you go to do the first step of encrypting the first "Z", you're supposed to add 3 to 25. That.. normally... would give you 28. And we said we wanted everything to stay in the range 0-25, inclusive.

Now... one solution is to use XOR (explained above) instead of the solution I'm going to give in a moment. Actually, that will only keep you "inside" your desired range if your desired range is 0 to some power of 2 minus 1... e.g. 0-3, 0-7, 0-15... But when you WANT to stay inside a specific range, you usually DO want to be within a range of 0 to some power of 2 minus 1, so that's okay then, isn't it?!

But if you Just Don't Like XOR (Because you didn't understand it? Go back and figure it out... it is worth it!), there's something else called modular arithmetic which can also do the job, in many of the cases that might arise as you try to understand encryption and cryptanalysis. (The Wikipedia article, by the way, goes WAY beyond what you need to know for this!!)

First: What "modular arithmetic" is NOT: It is NOT "arithmetic in modules".

So... what IS it...

The simplest example of modular arithmetic is the odometer of a car. (The thing that says how many miles it has gone.) I saw a very old car with just 5 wheels on the odometer once. It didn't stop and die at the end of the 99999th mile.... the odometer just "went around" and started a zero again.

And that's what you do in modular arithmetic.

Take the case of my "Z" (25) plus 3. In "ordinary" arithmetic, that would, of course, be 28. But if we are working "modulo 26", then 25 + 3 is 2. We start at 25. When we go "up" 1, that puts us at ZERO. Then we go up two more times, so we've gone up (added 1) three times.

Another example, still working modulo 26. (We can work with other numbers, and sometimes will. But 26 is good here, because that's the number of letters in the alphabet.)

As I was saying... "Another example..."

20 +6 would be zero.

More...

20+7=1 20+8=2, etc. 24+1=25 24+2=0 24+3=1

Now... so far so simple, but seemingly pointless. But look at this...

What happens when we subtract?....

3-1=2 3-2=1 3-3=0 3-4=25 << ! 3-5=24 3-6=23, etc. 2-1=1 2-2=0 2-3=25 2-4=24, etc

Think back to when we added 3 to 25, but did it with modulo 26 so that we got 2 instead of 28. If we want to reverse that "adding on", it's no problem 25 plus 3 came to 2.

2 (what we got) minus 3 (what we added) gets us back to 25!

Shezam! This IS useful. You'll see why, one day... if you manage to remember this stuff! Or maybe you work out why it is useful, now that you know it is?!

Go make up some code system, and then ask yourself how it could be broken!

Have you heard of Flattr? Great new idea to make it easy for you to send small thank you$ to people who provide Good Stuff on the web. If you want to send $$erious thank yous, there are better ways, but for a small "tip" here and there, Flattr ticks a lot of boxes which no one else has found a way to do yet. Please at least check out my introduction to Flattr, if you haven't heard of it? "No obligation", as they say!

Encryption using XOR and a string of 'random' numbers... from Flat Earth AcademySearch across all my sites with the Google search button at the top of the page the link will take you to.

Or...

Search just this site without using forms,

Or... again to search just this site, use...

The search engine merely looks for the words you type, so....

*! Spell them properly !*

Don't bother with "How do I get rich?" That will merely return pages with "how", "do", "I", "get" and "rich".

I have other sites. My Google custom search button will include things from them....

One of my SheepdogGuides pages.

My site at Arunet.

This page's editor, Tom Boyd, will be pleased if you get in touch by email.

Page tested for compliance with INDUSTRY (not MS-only) standards, using the free, publicly accessible validator at validator.w3.org. Mostly passes. There were two "unknown attributes" in Google+ button code, two further "wrong" things in the Google Translate code, and similar in Flattr code. Sigh.