March 15, 2022

Decompiling Zelda: An Introduction to Understanding Machine Code and Decompiling

Holland & Knight IP/Decode Blog
Jacob W. S. Schneider

Like a lot of kids growing up in the 1980s, I had Nintendo as a big part of my childhood. A game that stands out for a lot of us is the original Legend of Zelda. That game put you in a very large, 8-bit world where you roamed around fighting monsters and entering dungeons, where you had to solve logic puzzles. The game came in a gold case in 1987 – Nintendo already knew it had a huge hit. I have to add that it took me 31 years to finish the original Legend of Zelda. Although I loved these games, I was never very good at them.

Since that first entry, the Zelda franchise has spawned dozens of follow-on games, many of which are considered among the all-time best video games. In 1998, Zelda made the big leap into 3D with The Legend of Zelda: Ocarina of Time. It only took me 23 years to finish this one. While I was working on it, a developer community was working to "decompile" the entire game into its source code.

This post first takes a deep dive into decompiling, then examines whether the developers are running afoul of any agreement or the U.S. copyright regime.

What Is Decompiling?

As the name indicates, decompiling is the act of reversing compiling, so we should start with compiling. Compiling is when human-readable code gets transformed into machine code. Human-readable code is what software developers write as they code software. A simple example:

for (int i = 1; i <= 5; ++i) {
    printf("Number %d\n", i);
}

This simple example would output the following:

Number 1
Number 2
Number 3
Number 4
Number 5

While not a paradigm of clarity, humans can work out the logic of this human-readable code to imagine what it will do.

Before creating that output, however, the code must be compiled into machine code. Machine code is what computers actually process as they execute software. While it is possible to write code at the machine-code level, it would be a headache because it is nearly indecipherable to humans. (There are some brilliant people who can code at the machine level, and it is a reasonable guess that they most likely took fewer than three decades to beat old Nintendo games.) Run through a disassembler, here is what the example above looks like in machine code:

0000000100003f40 <_main>:
100003f40: ff 83 00 d1    sub    sp, sp, #32
100003f44: fd 7b 01 a9    stp    x29, x30, [sp, #16]
100003f48: fd 43 00 91    add    x29, sp, #16
100003f4c: bf c3 1f b8    stur   wzr, [x29, #-4]
100003f50: 28 00 80 52    mov    w8, #1
100003f54: e8 0b 00 b9    str    w8, [sp, #8]
100003f58: e8 0b 40 b9    ldr    w8, [sp, #8]
100003f5c: 08 15 00 71    subs   w8, w8, #5
100003f60: 8c 01 00 54    b.gt   0x100003f90 <_main+0x50>
100003f64: e9 0b 40 b9    ldr    w9, [sp, #8]
100003f68: e8 03 09 aa    mov    x8, x9
100003f6c: 00 00 00 90    adrp   x0, 0x100003000 <_main+0x2c>
100003f70: 00 b0 3e 91    add    x0, x0, #4012
100003f74: e9 03 00 91    mov    x9, sp
100003f78: 28 01 00 f9    str    x8, [x9]
100003f7c: 09 00 00 94    bl     0x100003fa0 <_printf+0x100003fa0>
100003f80: e8 0b 40 b9    ldr    w8, [sp, #8]
100003f84: 08 05 00 11    add    w8, w8, #1
100003f88: e8 0b 00 b9    str    w8, [sp, #8]
100003f8c: f3 ff ff 17    b      0x100003f58 <_main+0x18>
100003f90: 00 00 80 52    mov    w0, #0
100003f94: fd 7b 41 a9    ldp    x29, x30, [sp, #16]
100003f98: ff 83 00 91    add    sp, sp, #32
100003f9c: c0 03 5f d6    ret

Disassembly of section __TEXT,__stubs:

0000000100003fa0 <__stubs>:
100003fa0: 10 00 00 b0    adrp   x16, 0x100004000 <__stubs+0x4>
100003fa4: 10 02 40 f9    ldr    x16, [x16]
100003fa8: 00 02 1f d6    br     x16

(To be precise, the above is "assembly code," which rests one level higher than pure-binary machine code but corresponds completely to machine code.) The above code is clearly less user-friendly, but what is important is that it is much more computer-friendly.
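To make that correspondence concrete, here is a minimal sketch in C that takes the four bytes printed next to one instruction in the listing (08 05 00 11, at address 100003f84) and recovers the text add w8, w8, #1 from them. The function and type names are hypothetical, chosen only for illustration, and the sketch understands just one instruction form: a 32-bit add with an unshifted immediate, following the field layout in Arm's A64 instruction encoding. A real disassembler handles hundreds of forms.

```c
#include <stdint.h>
#include <stdio.h>

/* Combine four bytes, in the order the listing prints them, into one
   32-bit instruction word. AArch64 instructions are stored little-endian,
   so the first byte printed is the least significant. */
static uint32_t instruction_word(const uint8_t bytes[4]) {
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* Fields of a 32-bit "add Wd, Wn, #imm" instruction (hypothetical type). */
typedef struct {
    unsigned rd;    /* destination register number, bits 0-4 */
    unsigned rn;    /* source register number, bits 5-9 */
    unsigned imm12; /* 12-bit immediate, bits 10-21 */
} AddImmediate;

/* Return 1 and fill *out if word is a 32-bit ADD (immediate) with no
   shift; return 0 for anything else. The top ten bits identify the
   instruction form. */
static int decode_add_immediate(uint32_t word, AddImmediate *out) {
    if ((word & 0xFFC00000u) != 0x11000000u) {
        return 0;
    }
    out->rd = word & 0x1Fu;
    out->rn = (word >> 5) & 0x1Fu;
    out->imm12 = (word >> 10) & 0xFFFu;
    return 1;
}

/* Write the assembly text for the word into buf. Simplification: register
   number 31 (the stack pointer in this form) is printed as "w31". */
static int format_add_immediate(uint32_t word, char *buf, size_t size) {
    AddImmediate fields;
    if (!decode_add_immediate(word, &fields)) {
        return 0;
    }
    snprintf(buf, size, "add w%u, w%u, #%u",
             fields.rd, fields.rn, fields.imm12);
    return 1;
}
```

Passing the bytes 08 05 00 11 to instruction_word yields 0x11000508, and format_add_immediate turns that word back into add w8, w8, #1, matching the listing. Nothing is lost in either direction, which is what it means for assembly to correspond completely to machine code.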

Machine code is often what software companies distribute to end users. Because it is indecipherable, machine code adds a layer of security for those companies: whatever trade secrets lie inside the code remain secret. There is, however, a way to reverse the process and convert machine code back to human-readable source code: decompiling.

Below is a decompiled version of our original example's machine code, which I produced with Hopper Disassembler:

int _main() {
    r31 = r31 - 0x20;
    saved_fp = r29;
    stack[-8] = r30;
    r29 = &saved_fp;
    var_4 = 0x0;
    var_8 = 0x1;
    do {
        r8 = var_8 - 0x5;
        if (r8 > 0x0) {
            break;
        }
        stack[-32] = var_8;
        r0 = printf("Number %d\n", r1);
        var_8 = var_8 + 0x1;
    } while (true);
    r0 = 0x0;
    r29 = saved_fp;
    r30 = stack[-8];
    r31 = r31 + 0x20;
    return 0x0;
}

Set side by side, it is obvious which of the two versions is easier to decipher. That said, the two blocks of code are logically equivalent: both describe a program that produces the same output, that string of numbers.
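One way to check that equivalence is to put both loops into ordinary C functions and compare what they produce. The sketch below is an illustration under stated assumptions, not the decompiler's actual output: the function names are hypothetical, the register bookkeeping (r29, r30, r31) is dropped, the text is written into a caller-supplied buffer instead of the screen so the results can be compared, and the decompiler's r1 placeholder is replaced with var_8, since the assembly passes the loop counter to printf.

```c
#include <stdio.h>

/* The loop as originally written. Assumes the buffer is large enough
   for all five lines (46 bytes including the terminator). */
static size_t original_loop(char *out, size_t size) {
    size_t used = 0;
    for (int i = 1; i <= 5; ++i) {
        used += (size_t)snprintf(out + used, size - used, "Number %d\n", i);
    }
    return used;
}

/* The loop in the shape the decompiler recovered: a counter named var_8,
   an endless do/while, and an explicit exit test on (var_8 - 5) > 0.
   Same buffer-size assumption as above. */
static size_t decompiled_loop(char *out, size_t size) {
    size_t used = 0;
    int var_8 = 0x1;
    do {
        int r8 = var_8 - 0x5;
        if (r8 > 0x0) {
            break;
        }
        used += (size_t)snprintf(out + used, size - used, "Number %d\n", var_8);
        var_8 = var_8 + 0x1;
    } while (1);
    return used;
}
```

Calling each function with its own 64-byte buffer fills both buffers with the same five lines, Number 1 through Number 5. The control flow looks different, but the behavior is identical.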

To put the amount of work necessary to decompile Zelda into perspective, imagine undertaking the arduous task of rewriting the decompiled code above into the original, simple code. Now imagine doing that for an entire video game's source code. That is precisely what the collection of developers has done for Ocarina of Time.

How They Decompiled the Nintendo Classic

A group of developers apparently took about 21 months to work backward from decompiled code to something more human-readable. Some details of their work are explained in this article. They were motivated in part to figure out how to beat the game faster, but also because they have such affection for it. (People compete to beat games as quickly as possible in "speedruns." The record for Ocarina of Time (by exploiting a glitch) is under 7 minutes . . . not 23 years.) The group had to make a copy of the game cartridge's contents, which is referred to as a ROM (because the games were stored on read-only memory). The group next had to uncover the secrets of the game's original compiler before developing a way to run the process backward and decompile the game's machine code. After that, they had to identify which blocks of code expressed themselves in which parts of the game, then rewrite logically equivalent blocks that were human-readable. This was done block-by-block until complete – a mind-numbingly difficult task.

Potential Violations of Agreements, U.S. Copyright Law

The developers' actions raise a whole host of legal questions.

First, to move the game's ROM onto a computer, they had to extract the ROM from a cartridge (or download it from someone who did). Whether it is illegal to extract a ROM from your own copy of a game (e.g., creating a personal backup) is a murky area. Drawing an analogy to music, it is long-settled that you can, for example, make a copy of a song you purchased for personal – not commercial – use. It is less clear whether you can do so with a game you own. However, the developers did more than just play a copy of their game – they deconstructed it and made it available to the public. Those actions have the effect of depriving Nintendo of payment for the game and control over its intellectual property.

Second, the deconstruction runs afoul of the license that appeared in the original game box. The game booklet that Nintendo included with Ocarina of Time said, in part:

WARNING: Copying of any Nintendo game is illegal and is strictly prohibited by domestic and international copyright laws. "Back-up" or "archival" copies are not authorized and are not necessary to protect your software. Violators will be prosecuted.

As such, the developers' activities violated any license Nintendo offered with the game.

Third, the Digital Millennium Copyright Act (DMCA) includes an "anti-circumvention" provision that prohibits workarounds to extract copyrighted material. 17 U.S.C. Sec. 1201 states that, "No person shall circumvent a technological measure that effectively controls access to a work protected under this title." (Emphasis added.) The phrase "circumvent a technological measure" means "to descramble a scrambled work, to decrypt an encrypted work, or otherwise to avoid, bypass, remove, deactivate, or impair a technological measure, without the authority of the copyright owner[,]" and "a technological measure 'effectively controls access to a work' if the measure, in the ordinary course of its operation, requires the application of information, or a process or a treatment, with the authority of the copyright owner, to gain access to the work." (An example of a violation under this provision would be intentionally cracking digital-rights management restrictions on media, such as songs or movies.) In view of the statute, it is arguable that the developers bypassed two levels of technical measures that Nintendo employed to obfuscate its source code: reverse engineering the original, proprietary compiler, then using it to decompile the machine code.

Fourth, and perhaps in an attempt to avoid copyright issues, at least one group of developers has plans to produce a version that includes only the game's logic and controls. This version would not include items such as art design, characters, music and other elements that clearly fall on the "creative" side of the line. (The developers suggest that these creative elements could be extracted from a user's personal ROM backup of the game to reconstruct the original.) This attempt to separate non-copyrightable functional elements from copyrightable creative elements will almost certainly be difficult. For example, players have to solve puzzles to advance in the game. Those puzzles certainly use game logic, but one could argue that they are creative regardless.

The developers' work on the project is no doubt an impressive engineering feat, but they most likely will face a host of legal issues. Moreover, there is a much easier way to enjoy this classic – it is currently (and legally) available on the Nintendo Switch Online platform, and it is still a complete blast to play.
