10 PRINT in Rust vs C

By Michael Doornbos

July 30, 2024 - 5 minutes read - 939 words

Introduction

We’ve done 10PRINT on a lot of machines.

But we haven’t done a comparison between Rust and C.

For no particular reason, I thought it would be fun to compare the two languages side by side.

A race? Yes, please.

10 PRINT Quick review

10 PRINT is a one-liner program that generates a maze-like pattern using the characters / and \.

The program is based on a one-liner BASIC program from the late 70s and 80s. There’s even been a book written about it called: “10 PRINT CHR$(205.5+RND(1)); : GOTO 10”.
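
If the arithmetic in that title looks odd, here is a minimal Rust sketch of what I understand it to do. RND(1) gives a value from 0 up to (but not including) 1, and CHR$ drops the fractional part, so 205.5 + RND(1) lands on 205 or 206 with equal odds. Those are the character codes of the two diagonal graphics characters. The name petscii_code is made up for this illustration and is not part of any library.

```rust
/// Mimic CHR$(205.5 + RND(1)): add the random fraction, then drop the
/// fractional part. `r` plays the role of RND(1) and is assumed to be in [0, 1).
fn petscii_code(r: f64) -> u8 {
    // `as u8` truncates toward zero: 205 when r < 0.5, otherwise 206.
    (205.5 + r) as u8
}

fn main() {
    // A few hand-picked stand-ins for RND(1).
    for r in [0.0, 0.25, 0.5, 0.99] {
        println!("RND = {r:.2} -> CHR$({})", petscii_code(r));
    }
}
```

Everything below 0.5 truncates to 205 and everything from 0.5 up truncates to 206, which is the whole coin flip.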

On a Commodore PET (1977 ish), it looks like this:

Neat, right?

Rust

My “one-liner” in Rust looks like this:

```
use rand::Rng; fn main() { let mut rng = rand::thread_rng(); loop { print!("{}", if rng.gen::<bool>() { '/' } else { '\\' }); } }
```

This is cool and very fast, but the program is designed to run indefinitely, which makes it difficult to measure its performance or compare it with anything. For a speed comparison, we need to modify both programs to run for a fixed number of iterations, so we can measure the time taken for a known amount of work.

```
use rand::Rng;

const ITERATIONS: u64 = 100_000_000;

fn main() {
    let mut rng = rand::thread_rng();
    for _ in 0..ITERATIONS {
        print!("{}", if rng.gen::<bool>() { '/' } else { '\\' });
    }
}
```

So, if we count to 100 million, we can measure the time it takes to run the program. This should take long enough to get a good idea of how fast the program runs.

C

```
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main() { srand(time(NULL)); while (1) printf("%c", rand() % 2 ? '/' : '\\'); return 0; }
```

This is the C version of the program. It’s a little more verbose than the Rust version, but it’s still pretty simple.

And our modified version with 100 million iterations:

```
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define ITERATIONS 100000000

int main() {
    srand(time(NULL));
    for (long i = 0; i < ITERATIONS; i++) {
        putchar(rand() % 2 ? '/' : '\\');
    }
    return 0;
}
```

Benchmarking

Now that we have our modified programs, we can compile and run them to measure their performance.

Steps to Benchmark

Compile the Programs:

C Program:
```
gcc -O3 -o ten_print_c ten_print_c.c
```
Rust Program:
```
cargo build --release
```

Run and Measure Execution Time:

C Program:
```
time ./ten_print_c > /dev/null
```

Rust Program:

```
time ./target/release/tenprint > /dev/null
```

With a large but finite number of iterations (100 million here), each program does a known amount of work, so the times can be compared directly.

We’re redirecting the output to /dev/null to avoid the overhead of printing to the console.
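
The shell's time also counts process startup and teardown. As a rough sketch (not what produced the numbers in this post), you could time just the loop from inside the program with std::time::Instant. The helper name time_it is hypothetical, the loop alternates characters instead of picking them at random so it needs no extra crates, and io::sink() stands in for the /dev/null redirect.

```rust
use std::io::{self, Write};
use std::time::{Duration, Instant};

/// Run `work` once and return its result together with the elapsed wall-clock time.
fn time_it<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let result = work();
    (result, start.elapsed())
}

fn main() {
    const ITERATIONS: u64 = 1_000_000; // smaller than the post's 100 million

    let (written, elapsed) = time_it(|| {
        let mut out = io::sink(); // discards bytes, much like `> /dev/null`
        let mut written = 0u64;
        for i in 0..ITERATIONS {
            // Alternating characters keep the sketch free of the rand crate.
            let byte = if i % 2 == 0 { b'/' } else { b'\\' };
            out.write_all(&[byte]).expect("writing to a sink cannot fail");
            written += 1;
        }
        written
    });

    // Report on stderr so the timing never mixes with the pattern output.
    eprintln!("{written} characters in {elapsed:?}");
}
```

Because the sink throws everything away, an optimized build may remove much of this loop, so treat it as a shape for the measurement and not as a benchmark in its own right.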

How did they do?

I ran the benchmark on my 2020 MacBook Pro (2 GHz Quad-Core Intel Core i5), and here are the results:

The Rust version is about 16% faster than the C version in this benchmark.

Hmm… I’m not sure what to make of this. I was expecting the C version to be faster, but it looks like the Rust version is actually faster in this case.

Improving the C version

I tried profile-guided optimization: build an instrumented binary, run it to record where the C version spends most of its time, then rebuild using that profile.

```
clang -fprofile-instr-generate -O3 -o ten_print_c ten_print_c.c
./ten_print_c > /dev/null
llvm-profdata merge -output=default.profdata *.profraw
clang -fprofile-instr-use=default.profdata -O3 -o ten_print_c ten_print_c.c
```

That actually made it slower…

Then, I tried some flags to optimize the code even more.

ALL THE FLAGS.

```
gcc -O3 -march=native -flto -funroll-loops -ffast-math -s -DNDEBUG -o output_program source_file.c
```

That made it slower, too…

I’m sure there’s a way to make the C version faster on my Mac. If you have any ideas, I’d love to hear them!

This highlights one of the conveniences of the Rust build system. A single --release flag turns on aggressive optimization (opt-level 3 by default), and the compiler optimizes the code to the best of its ability. The C compiler is also very good at optimizing code, but getting the most out of it can take more tuning. This may be part of why the Rust version is faster in this benchmark on my Mac.

Maybe. As we’ll soon see, this may just be a “problem” with the C compiler on my Intel Mac.

Other Platforms

I ran the same code on an Intel Xeon E-2278GEL @ 2.00GHz running Debian 12. These results are interesting:

Here, the C version is quite a bit faster than the Rust version. Whoever wrote the default C compiler for Debian 12 knew what they were doing.

On my M1 MacBook Air (ARM), both results were significantly faster than the Intel machines. But here, the C version is still much faster than the Rust version.

Compilers and systems matter when coding at a low level!
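
One thing that might be worth trying on the Rust side: as far as I know, print! takes the stdout lock on every call, so the loop pays that cost 100 million times. Below is a minimal sketch, assuming a recent Rust toolchain, that locks stdout once and wraps it in a BufWriter. The XorShift64 type and the ten_print function are hypothetical stand-ins (the programs in this post use the rand crate), and I have not measured whether this changes the results.

```rust
use std::io::{self, BufWriter, Write};

/// Tiny xorshift64 generator, used here only so the sketch needs no crates.
/// It is NOT the generator the benchmarked program uses. Seed must be nonzero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_bool(&mut self) -> bool {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0 >> 63 == 1 // use the top bit as the coin flip
    }
}

/// Write `iterations` random slashes to any writer, then flush it.
fn ten_print<W: Write>(out: &mut W, rng: &mut XorShift64, iterations: u64) -> io::Result<()> {
    for _ in 0..iterations {
        let byte = if rng.next_bool() { b'/' } else { b'\\' };
        out.write_all(&[byte])?;
    }
    out.flush()
}

fn main() -> io::Result<()> {
    let mut rng = XorShift64(0x2545_F491_4F6C_DD1D); // arbitrary nonzero seed
    // Lock stdout once and buffer the writes instead of calling print! per character.
    let mut out = BufWriter::new(io::stdout().lock());
    ten_print(&mut out, &mut rng, 1_000)
}
```

Taking any Write means the same function can target stdout, a file, or a Vec<u8> when you want to check the output.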

Python, just for fun

I also ran the same benchmark in Python for fun. Here’s the Python version of the program:

```
import random

ITERATIONS = 100_000_000

def main():
    for _ in range(ITERATIONS):
        print('/' if random.choice([True, False]) else '\\', end='')

if __name__ == "__main__":
    main()
```

Go get your hot beverage of choice, because this one is going to take a while…

```
time python3 ten_print.py > /dev/null
```

Extra Credit

I’m looking forward to the hate mail on this one because I’m sure you’re an amazing programmer and I’m doing it wrong.

Try and have some fun, eh?

If you have a better way to do this, SHOW YOUR WORK! I want to hear about how systems and compilers impact performance and how and why there are better ways to do this in both languages.

BUT SHOW YOUR WORK!

Words are cheap, and code and results are what we’re after.

Happy 10PRINTing!