# Making and breaking Ciphers on the Commodore 64 Part 7 - Pseudo Random with Linear Congruential Generators

So far in this 7 part (shocked I've made it through 7!) series on doing weird things with Ciphers on old computers we've:

- Reversed a string
- Shifted bits left and right (mostly right)
- Brute forced a known simple cipher
- Channeled our inner John Connor by cracking ATM machines
- Searched through a large number looking for matches
- Learned how to use XOR to recover data

### Author's Note:

*This one stretched my skills quite a bit, but I kept it under 2000 words (by 89). I'm not sure when I started this I knew I'd be doing a Linear Congruential Generator on a 40 year old machine, but here we are ;-)*

# Pseudo Random

This our the first dive into creating randomness. It will require several explorations, and this isn't THE discussion we'll have about this. Several volumes could be written about this, so we'll break it down a little at a time.

The Commodore SID random number method isn't suitable for serious use at the time I'm writing this in mid 2021, but it's perfectly serviceable to study how encryption works, and the fact that it's insecure will actually help us to be able to demonstrate the insecurities in Pseudo random easier. Who wants to wait around a couple hundred years to break a SHA-512 key? Not this guy.

Much of what I knew about Commodore Random numbers before preparing to write this is from this article:

https://www.atarimagazines.com/compute/issue72/random_numbers.php

and this breakdown of the SID details:

http://www.oxyron.de/html/registers_sid.html

Many cryptographic algorithms require the use of one or more random numbers, typically to initialize a process with a seed, generate keys, create salts, or randomize something.

We use a mathematical function to generate a sequence of pseudo-random numbers. This is called a pseudo-random number generator (PRNG).

We seed the PRNG with a random number and then it recursively generates a deterministic set of numbers, always the same for a given seed.

A good pseudo-random number generator is awfully difficult to distinguish from a sequence of random numbers.

In our case, we'll use a pretty good PRNG and seed it with the SID random number generator. This should be adequate for our purposes here.

## Linear Congruential Generator (LCG)

The linear congruential generator (LCG) is a commonly used PRNG. It has a simple definition [^1] [^2]

```
Xn+1 = (m*xn + a) mod P
P >= 2 is an integer. P is the period of the PRNG.
The integer coefficient m is called the multiplier; m > 0 and m < P.
a is an additive integer constant where a > 0 and a < P.
The initial value of the generator x0 is the seed.
xn is the nth pseudo-random number generated by the function.
```

An LCG is recursive and is both fast and pretty easy to implement in code. The statistical randomness of LCG is sensitive to the modulus value `P`

and the seed used.

### Let's do a python version first

It's important to note that with the same parameters (seed included) this will always produce the same sequence, so choose your seeds wisely. Better yet, make them random.

Obviously in the above python code we're just checking a sane example to make sure our Assembly version is working later. A more sensible use would be the GNU C library glibc's standard:

glibc uses a period of 2^{45}, a multiplier value of 25,214,903,917 and a seed value of 11, in the specification of its LCG.

```
generateLCG(25214903917,37, 2**45, 11, 10)
Results:
277363943124 0x40942de6d4
12373672706153 0xb40f85dc069
16644769661658 0xf23699542da
25034216526583 0x16c4bb9552f7
31583928204624 0x1cb9b4d23d50
9976241847093 0x912c659fb35
6630739383222 0x607d6dd93b6
27812309592227 0x194b8ed568a3
17765547376524 0x10285d23ff8c
25722508515009 0x1764fcfb96c1
```

These period and multiplier values are pretty difficult on the Commodore at 1Mhz. As we've seen before, it takes a whole day for these machines to just count to 4 billion, let alone do a couple rounds of shifts and math to get there. If you want to give them a try, knock yourself out.

## How is an LFSR different from an LCG and which is better?

The SID on the Commodore 64 uses a 23 bit LFSR. On shifting, bit 0 is filled with bit 22 EXOR bit 17.

An LFSR is different than a LCG in its approach: it does its shifting with Exclusive OR(usually) instead of integer addition and multiplication.

Neither is better and both suffer from different problems. Applied Cryptography: Protocols, Algorithms, and Source Code in C is a good resource to really get into the weeds on these two methods.

One "common" solution to generate better randomness is to use two PRNGs of different types, adding the results together. [^5]

We'll do that here by seeding the LCG with a set of bytes from the Commodore SID Random number generator at `$d41b`

, which is an LFSR.

Remember that these are probably not suitable for cryptography, especially with the limits of an 8 bit processor that can only address 64k of memory at a time. And by probably not suitable, I mean they are not suitable at all. We're studying them as an evolution over a series. I promise there's a point to the whole endeavor. Probably.

# Getting the LCG on the Commodore

By far the easiest way to get this simple algorithm on the Commodore is to do it in BASIC. Just like the XOR post we are going to use Simons' BASIC to get a mod function without having to implement it ourselves.

## Basic Version

# LCR in Assembly

I'll be using Turbo Macro Pro as usual. Because I don't use macros or scripting of any kind, this should port very easily to your assembler of choice.

### Multiply and Divide

We need to be able to multiply and divide larger numbers for the rest of this series, so let's work on routines for that. These are pretty well known routines so I didn't try to reinvent the wheel. Both Codebase64 and 6502 assembly language programming by Lance Leventhal can help you explore these further.

Let's start with a multiplier that can handle 32 bit results (2^{32} or 4,294,967,296)

And now a 24 bit divider

### The LCG

Just a reminder, this is written for clarity over efficiency. I'd love to hear how you'd optimize. Definitely some bytes and cycles could be saved here at a cost of readability.

A couple of these label locations could definitely go in zero page to save some cycles. Have fun experimenting with that.

## Run it

Remember from our Python and BASIC proof of concepts above we're expecting the hex values:

```
0x25
0x1bc
0x399
0xa8
0x375
0x304
0x211
0x358
0x1c5
0x14
```

It works! I definitely did a little dance move when I ran this the first time because it actually worked the first time without any bugs. I can't remember ever having a first run that didn't have at least something that needed to be adjusted.

## Combine the LCG and LFSR

This is all great, but how are we going to get random numbers out of it in a more reasonable fashion?

We know we can get a random value from

via the SID's 23 Bit LFSR, so if we seed the LCR with the SID's LFSR, we're now combining two methods to produce more random results. I almost said results to make cryptographers and mathematicians happy, but let's be honest, those two groups of people make the world better because they're NEVER happy with the results. `$d41b`

Even without increasing the multiplier and period dramatically (glibc uses 1103515245 as the multiplier) seeding this VERY simple example does make this much more random than just the SID LFSR on it's own.

## Linear Feedback Shift Register (LFSR) in Assembly

Since I know someone will ask, here's an example Linear Feedback Shift Register from Codebase64 [^3].

```
random
asl random
rol random+1
rol random+2
rol random+3
bcc nofeedback
lda random
eor #$B7
sta random
lda random+1
eor #$1D
sta random+1
lda random+2
eor #$C1
sta random+2
lda random+3
eor #$04
sta random+3
nofeedback
rts
random: .byte $FF,$FF,$FF,$FF
```

# Extra Credit

- Change these routines to handle the very large periods common in PRNGs
- Optimize the routines to run faster

## Side project

As a side project I've been working on trying to find the repeat period for the Commodore SID 23 Bit LFSR in real life (find the repeat point). Counting and storing to 2^{23} requires an REU and I've not quite cracked it yet.

But here's some code running that shows the distribution of each of the possible 256 random bytes over 500k samples.

Baby steps, this takes about 30 seconds to actually run, so we skip the blank screen part.

### Footnotes

[^1] Excerpt From: Karan Singh Garewal. “Practical Blockchains and Cryptocurrencies.” Apple Books. p 65

[^2] https://en.wikipedia.org/wiki/Linear_congruential_generator

[^3] https://en.wikipedia.org/wiki/Linear-feedback_shift_register

[^4] https://codebase64.org/doku.php?id=base:32bit_galois_lfsr

[^5] https://en.wikipedia.org/wiki/Cyclic_redundancy_check#CRC-32_algorithm

[^6] Direct from the LFSR Wikipedia page