Making and breaking Ciphers on the Commodore 64 Part 7 - Pseudo Random with Linear Congruential Generators

Making and breaking Ciphers on the Commodore 64 Part 7 - Pseudo Random  with Linear Congruential Generators

So far in this 7 part (shocked I've made it through 7!) series on doing weird things with Ciphers on old computers we've:

Author's Note:

This one stretched my skills quite a bit, but I kept it under 2000 words (by 89). I'm not sure when I started this I knew I'd be doing a Linear Congruential Generator on a 40 year old machine, but here we are ;-)

Pseudo Random

This our the first dive into creating randomness. It will require several explorations, and this isn't THE discussion we'll have about this. Several volumes  could be written about this, so we'll break it down a little at a time.

The Commodore SID random number method isn't suitable for serious use at the time I'm writing this in mid 2021, but it's perfectly serviceable to study how encryption works, and the fact that it's insecure will actually help us to be able to demonstrate the insecurities in Pseudo random easier. Who wants to wait around a couple hundred years to break a SHA-512 key? Not this guy.

Much of what I knew about Commodore Random numbers before preparing to write this is from this article:

https://www.atarimagazines.com/compute/issue72/random_numbers.php

and this breakdown of the SID details:

http://www.oxyron.de/html/registers_sid.html

Many cryptographic algorithms require the use of one or more random numbers, typically to initialize a process with a seed, generate keys, create salts, or randomize something.

We use a mathematical function to generate a sequence of pseudo-random numbers. This is called a pseudo-random number generator (PRNG).

We seed the PRNG with a random number and then it recursively generates a deterministic set of numbers, always the same for a given seed.

A good pseudo-random number generator be difficult to distinguish from a sequence of random numbers.

In our case, we'll use a pretty good PRNG and seed it with the SID random number generator. This should be adequate for our purposes here.

Linear Congruential Generator (LCG)

The linear congruential generator (LCG) is a commonly used PRNG. It has a simple definition [^1] [^2]

Xn+1 = (m*xn + a) mod P
P >= 2 is an integer. P is the period of the PRNG.
The integer coefficient m is called the multiplier; m > 0 and m < P.
a is an additive integer constant where a > 0 and a < P.
The initial value of the generator x0 is the seed.
xn  is the nth pseudo-random number generated by the function.

An LCG is recursive and is both fast and pretty easy to implement in code. The statistical randomness of LCG is sensitive to the modulus value P and the seed used.

Let's do a python version first

def generateLCG(multiplier, addr, period, seed,iterations):
    count = 0
    lastValue = seed
    while ct < iterations:
        lastValue = (multiplier*lastValue + addr) % period
        count += 1
        print(lastValue, ' ', hex(lastValue))
    
generateLCG(11, 37, 1000, 0, 10)

Results:
37    0x25
444   0x1bc
921   0x399
168   0xa8
885   0x375
772   0x304
529   0x211
856   0x358
453   0x1c5
20    0x14
Decimal and hex to compare it's working later

It's important to note that with the same parameters (seed included) this will always produce the same sequence, so choose your seeds wisely. Better yet, make them random.

Obviously in the above python code we're just checking a sane example to make sure our Assembly version is working later. A more sensible use would be the GNU C library glibc's standard:

glibc uses a period of 245, a multiplier value of 25,214,903,917 and a seed value of 11, in the specification of its LCG.

generateLCG(25214903917,37, 2**45, 11, 10)

Results:

277363943124     0x40942de6d4
12373672706153   0xb40f85dc069
16644769661658   0xf23699542da
25034216526583   0x16c4bb9552f7
31583928204624   0x1cb9b4d23d50
9976241847093    0x912c659fb35
6630739383222    0x607d6dd93b6
27812309592227   0x194b8ed568a3
17765547376524   0x10285d23ff8c
25722508515009   0x1764fcfb96c1

These period and multiplier values are pretty difficult on the Commodore at 1Mhz. As we've seen before, it takes a whole day for these machines to just count to 4 billion, let alone do a couple rounds of shifts and math to get there. If you want to give them a try, knock yourself out.

How is an LFSR different from an LCG and which is better?

The SID on the Commodore 64 uses a 23 bit LFSR. On shifting, bit 0 is filled with bit 22 EXOR bit 17.

An LFSR is different than a LCG in its approach: it does its shifting with Exclusive OR(usually) instead of integer addition and multiplication.

Neither is better and both suffer from different problems. Applied Cryptography: Protocols, Algorithms, and Source Code in C is a good resource to really get into the weeds on these two methods.

One "common" solution to generate better randomness is to use two PRNGs of different types, adding the results together. [^5]

We'll do that here by seeding the LCG with a set of bytes from the Commodore SID Random number generator at $d41b, which is an LFSR.

Remember that these are probably not suitable for cryptography, especially with the limits of an 8 bit processor that can only address 64k of memory at a time. And by probably not suitable, I mean they are not suitable at all.  We're studying them as an evolution over a series. I promise there's a point to the whole endeavor. Probably.

Getting the LCG on the Commodore

By far the easiest way to get this simple algorithm on the Commodore is to do it in BASIC. Just like the XOR post we are going to use Simons' BASIC to get a mod function without having to implement it ourselves.

Basic Version

Written for readability over efficiency

LCR in Assembly

I'll be using Turbo Macro Pro as usual. Because I don't use macros or scripting of any kind, this should port very easily to your assembler of choice.

Multiply and Divide

We need to be able to multiply and divide larger numbers for the rest of this series, so let's work on routines for that. These are pretty well known routines so I didn't try to reinvent the wheel. Both Codebase64 and 6502 assembly language programming by Lance Leventhal can help you explore these further.

Let's start with a multiplier that can handle 32 bit results (232 or 4,294,967,296)

multiply16
         .block
         lda #0
         sta product+2
         sta product+3

         ldx #16
shiftr
         lsr multiplier+1
         ror multiplier
         bcc rotater
         lda product+2
         clc
         adc multiplicand
         sta product+2
         lda product+3
         adc multiplicand+1

rotater
         ror a
         sta product+3
         ror product+2
         ror product+1
         ror product
         dex
         bne shiftr
         rts
         .bend
         
;multiply16
multiplier .byte $ff,$ff
multiplicand .byte $ff,$ff
product  .byte 0,0,0,0
16 Bit Multiplier that will handle 32 bit result

And now a 24 bit divider

divide24
         ;dividend/divisor
         .block
         lda #0
         sta remainder
         sta remainder+1
         sta remainder+2
         ldx #24 ;every bit

loop
         asl dividend
         rol dividend+1
         rol dividend+2
         rol remainder
         rol remainder+1
         rol remainder+2
         lda remainder
         sec
         sbc divisor
         tay
         lda remainder+1
         sbc divisor+1
         sta zp1
         lda remainder+2
         sta divisor+2
         bcc skip

         sta remainder+2
         lda zp1
         sta remainder+1
         sty remainder
         inc answer
skip
         dex
         bne loop
         rts

         .bend



;divide24
remainder .byte 0,0,0
dividend .byte 0,0,0
divisor  .byte 0,0,0

answer   = dividend
24 Bit Divider

The LCG

Just a reminder, this is written for clarity over efficiency. I'd love to hear how you'd optimize. Definitely some bytes and cycles could be saved here at a cost of readability.

dolcg
         .block
loop
         lda multip
         sta multiplier
         lda lastvalue
         sta multiplicand
         lda lastvalue+1
         sta multiplicand+1
         jsr multiply16
         lda product
         clc
         adc addr
         sta dividend
         lda product+1
         adc #0
         sta dividend+1
         lda period
         sta divisor
         lda period+1
         sta divisor+1
         jsr divide24
         lda remainder
         sta lastvalue
         lda remainder+1
         sta lastvalue+1
printresult
         lda lastvalue+1
         jsr printbyte
         lda lastvalue
         jsr printbyte
         lda #13
         jsr $ffd2
         inc count
         lda count
         cmp iterations
         bne loop
         
         rts
         .bend
         
;lcg
count    .byte 0

multip   .byte 11
addr     .byte 37
period   .byte $e8,$03
seed     .byte 0,0,0
lastvalue = seed
iterations .byte 10
lcgvalue .byte 0,0
I included the code to give us the first 10 in the sequence here, you might not want this if you reuse it somewhere else.

A couple of these label locations could definitely go in zero page to save some cycles. Have fun experimenting with that.

Run it

Remember from our Python and BASIC proof of concepts above we're expecting the hex values:

0x25
0x1bc
0x399
0xa8
0x375
0x304
0x211
0x358
0x1c5
0x14

It works! I definitely did a little dance move when I ran this the first time because it actually worked the first time without any bugs. I can't remember ever having a first run that didn't have at least something that needed to be adjusted.

Combine the LCG and LFSR

This is all great, but how are we going to get random numbers out of it in a more reasonable fashion?

We know we can get a random value from $d41b via the SID's 23 Bit LFSR, so if we seed the LCR with the SID's LFSR, we're now combining two methods to produce more random results. I almost said results to make cryptographers and mathematicians happy, but let's be honest, those two groups of people make the world better because they're NEVER happy with the results.

Even without increasing the multiplier and period dramatically (glibc uses 1103515245 as the multiplier) seeding this VERY simple example does make this much more random than just the SID LFSR on it's own.

Linear Feedback Shift Register (LFSR) in Assembly

Since I know someone will ask, here's an example Linear Feedback Shift Register from Codebase64 [^3].

random
        asl random
        rol random+1
        rol random+2
        rol random+3
        bcc nofeedback
        lda random
        eor #$B7
        sta random
        lda random+1
        eor #$1D
        sta random+1
        lda random+2
        eor #$C1
        sta random+2
        lda random+3
        eor #$04
        sta random+3
nofeedback
        rts

random: .byte $FF,$FF,$FF,$FF

Extra Credit

  • Change these routines to handle the very large periods common in PRNGs
  • Optimize the routines to run faster

Side project

As a side project I've been working on trying to find the repeat period for the Commodore SID 23 Bit LFSR in real life (find the repeat point). Counting and storing to 223 requires an REU and I've not quite cracked it yet.

But here's some code running that shows the distribution of each of the possible 256 random bytes over 500k samples.

Baby steps, this takes about 30 seconds to actually run, so we skip the blank screen part.

Footnotes

[^1] Excerpt From: Karan Singh Garewal. “Practical Blockchains and Cryptocurrencies.” Apple Books. p 65

[^2] https://en.wikipedia.org/wiki/Linear_congruential_generator

[^3] https://en.wikipedia.org/wiki/Linear-feedback_shift_register

[^4] https://codebase64.org/doku.php?id=base:32bit_galois_lfsr

[^5] https://en.wikipedia.org/wiki/Cyclic_redundancy_check#CRC-32_algorithm

[^6] Direct from the LFSR Wikipedia page