Golang on the PlayStation 2 (part 3)
By Ricardo

Holy shit (again!)
I’m really glad to see this topic is interesting to a lot more people than I anticipated. I’ve been working on this in my spare time just for fun, without any actual goal - and yet people seem to dig it.
As a matter of fact, I’m giving two talks this year about this topic - exactly, two opportunities for everyone to join me in this crazy saga! I’ll be talking and explaining this project (possibly with live demos!) at both GoLab (Oct 5-7) and GambiConf (Nov 29-30) this year. So grab your tickets and join me on this rollercoaster of quirks, bugs, Sony weirdness and a bunch of hacks!

Fun fact: for GoLab you can use my SP20FR coupon for 20% off on the conference ticket!
Ok, with all that said, let’s dive into part 3!
Disclaimer
Once again, a lot of what you’ll see here is based on my crazy weekends trying to hack things together. I am not, by any chance in the world, an expert in any of the subjects discussed in this post and/or series. The information provided here is as-is and may be inaccurate and/or wrong, and I am not responsible for any harm to your consoles or sanity.
You have been warned.
Our plan
There are multiple things I’d like to do at this point in the project. The first one, however, is to reduce the number of hacks on TinyGo itself, especially as they are very platform-specific. If I had to choose, I’d prefer to have them in the LLVM codebase, leaving TinyGo generating normal, bog-standard IR code. For that, I need to dive deeper into that world.
The general process for compiling things on TinyGo works like this:

Right now we’re hacking things on the TinyGo section, generating valid LLVM IR code for our platform. We want to move our hacks into the LLVM part instead, so that TinyGo has as few modifications as possible and LLVM has to deal with our weird, quirky CPU. Easy, right?
To make our lives easier from now on, I’ve created a small test application that goes through a series of tests, validating each relevant type, some of its variants, and even running through some standard code to see if anything breaks (spoiler: it will!). At the time of writing, this is what I was testing:
- 8/16/32/64-bit integers (signed and unsigned) for addition, subtraction, multiplication, division and remainder operations
- 32/64-bit floats for addition, subtraction, multiplication, division and remainder operations
- Some basic string operations and formatting
- String formatting for all the integer types from above
- String formatting for all the float types from above
Here’s a sneak peek of how some of its tests are defined:
type numberTest[T comparable] struct {
    name   string
    left   T
    right  T
    result T
    fn     func(T, T) T
}

var (
    int64Tests = []numberTest[int64]{
        {"add", 1234567890, 9876543210, 11111111100, func(a, b int64) int64 { return a + b }},
        {"sub", 9876543210, 1234567890, 8641975320, func(a, b int64) int64 { return a - b }},
        {"mul", 123456, 789012, 97408265472, func(a, b int64) int64 { return a * b }},
        {"div", 11111111100, 3, 3703703700, func(a, b int64) int64 { return a / b }},
        {"mod", 11111111100, 1000, 100, func(a, b int64) int64 { return a % b }},
    }
    uint64Tests = []numberTest[uint64]{
        {"add", 1234567890, 9876543210, 11111111100, func(a, b uint64) uint64 { return a + b }},
        {"sub", 10000000000, 1234567890, 8765432110, func(a, b uint64) uint64 { return a - b }},
        {"mul", 123456, 789012, 97408265472, func(a, b uint64) uint64 { return a * b }},
        {"div", 11111111100, 3, 3703703700, func(a, b uint64) uint64 { return a / b }},
        {"mod", 11111111100, 1000, 100, func(a, b uint64) uint64 { return a % b }},
    }
)
With this, we can build our TinyGo test program and see if the changes in the LLVM fixed the problems or not. And I can also use this for other tests in the future. Plus, since we saw in part 2 that the emulator and the actual hardware deal with things differently, I can actually use this to validate all the code changes against the real thing.
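For the curious, here’s a minimal sketch of how a table like this could be driven. The runNumberTests helper is hypothetical (it is not the runner from my repository), and it prints with fmt.Printf, whereas on the console you’d go through whatever debug output you have available.

```go
package main

import "fmt"

// numberTest mirrors the struct shown above.
type numberTest[T comparable] struct {
	name   string
	left   T
	right  T
	result T
	fn     func(T, T) T
}

// runNumberTests is a hypothetical helper: it runs every case in the table,
// prints one line per case, and returns how many cases failed.
func runNumberTests[T comparable](label string, tests []numberTest[T]) int {
	failures := 0
	for _, tc := range tests {
		got := tc.fn(tc.left, tc.right)
		status := "ok"
		if got != tc.result {
			status = "FAIL"
			failures++
		}
		fmt.Printf("%s/%s: %s (got %v, want %v)\n", label, tc.name, status, got, tc.result)
	}
	return failures
}

func main() {
	int64Tests := []numberTest[int64]{
		{"add", 1234567890, 9876543210, 11111111100, func(a, b int64) int64 { return a + b }},
		{"mul", 123456, 789012, 97408265472, func(a, b int64) int64 { return a * b }},
	}
	runNumberTests("int64", int64Tests)
}
```

Returning the failure count (instead of bailing out on the first mismatch) makes it easy to see the whole picture on screen, which matters when the only output device is the console’s framebuffer.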
Disclaimer: I need to reiterate that I am not an expert in any of this. This post may give inaccurate and/or plain wrong information. There was a lot of ChatGPT and just hacking through code from now on, so fasten your seatbelts because this will be a wild ride!
Faking it: the int64 solution
The first hack I’d like to remove is the MUL (*) and QUO (/) operations I’ve hacked in part 1. This is easy enough: for the relevant operations in the LLVM IR code, I want to force it to use a library call instead (ie. do the operation in software).
You see, when the LLVM is trying to convert its IR code into actual machine code, it goes through a few steps, and one of them is called “lowering”. In this step, as far as I understand, it needs to know what is legal and what is not on a CPU. This is important as not all CPUs are equal: a Pentium 3 doesn’t support all the features of a Pentium 4, for example. The LLVM is responsible for knowing what is allowed and what is not, and translating it into the appropriate valid instructions if possible, or doing the entire thing in software.
The LLVM has a few possible actions for legalizing instructions on a target:
/// This enum indicates whether operations are valid for a target, and if not,
/// what action should be used to make them valid.
enum LegalizeAction : uint8_t {
  Legal,   // The target natively supports this operation.
  Promote, // This operation should be executed in a larger type.
  Expand,  // Try to expand this to other ops, otherwise use a libcall.
  LibCall, // Don't try to expand this to other ops, always use a libcall.
  Custom   // Use the LowerOperation hook to implement custom lowering.
};
I won’t go into the details of this, but we want to target LibCall in our scenarios. This basically tells the LLVM that we want to fake the instruction: it’s not supported on our target, so emit a call to a function that performs the operation for us instead.
I think we could probably get this solved by either using Expand (to expand this into multiple supported instructions) or Custom (and manually implement such instructions), but I’m lazy as hell and want this to get done in software. Plus, we were doing it this way anyway, just in a slightly worse manner!
For that to happen in our case, we need to look at the MipsSETargetLowering class. Its constructor defines what is supported and what is not based on the target and subtarget machines. At the end of it, we want to tell LLVM that any multiplication/division/similar operation with 64-bit integers, signed or not, is not allowed in our target and must be performed in software - a library call (LibCall). Since the simple MUL and QUO tokens can be lowered into a bunch of instructions depending on the use case (also known as: I have no clue why just a few didn’t work), I’ve opted to force all of the int64-related things into a library call. This is what it looks like:
MipsSETargetLowering::MipsSETargetLowering(const MipsTargetMachine &TM,
                                           const MipsSubtarget &STI)
    : MipsTargetLowering(TM, STI) {
  // (...)
  setOperationAction(ISD::MUL, MVT::i64, LibCall);
  setOperationAction(ISD::SDIV, MVT::i64, LibCall);
  setOperationAction(ISD::UDIV, MVT::i64, LibCall);
  setOperationAction(ISD::SREM, MVT::i64, LibCall);
  setOperationAction(ISD::UREM, MVT::i64, LibCall);
  setOperationAction(ISD::SDIVREM, MVT::i64, LibCall);
  setOperationAction(ISD::UDIVREM, MVT::i64, LibCall);
  setOperationAction(ISD::SMUL_LOHI, MVT::i64, LibCall);
  setOperationAction(ISD::UMUL_LOHI, MVT::i64, LibCall);
  setOperationAction(ISD::MULHS, MVT::i64, LibCall);
  setOperationAction(ISD::MULHU, MVT::i64, LibCall);
  // (...)
}
Mind you that, since we’re doing this the hacky way, I didn’t even bother adding a flag, but we might want to do so in the future so that we can keep other parts of this LLVM functional.
Anyway, with these calls, we’re essentially telling the LLVM that IR instructions such as MUL, SDIV and UDIV for 64-bit integers are not allowed, and must be performed in software. This way, even if TinyGo outputs something like %47 = mul i64 2, %40, !dbg !25370, which is a 64-bit integer multiplication in LLVM IR, our compiler will not try to write that as a MULT or DMULT instruction, but as a call to __muldi3 instead.
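To give a feeling for what “doing it in software” means, here’s a rough sketch in Go of a 64-bit multiplication built from 32-bit halves. This is only an illustration of the idea and not the actual __muldi3 implementation shipped with the compiler runtime; mul64Soft is a name I made up for this example, and the partial products are written with Go’s native uint64 multiply just to keep the code readable.

```go
package main

import "fmt"

// mul64Soft (hypothetical name) multiplies two 64-bit values by splitting
// each operand into 32-bit halves and combining the partial products.
// The result is the low 64 bits of the full product, which is the same
// value a native 64-bit multiply would produce.
func mul64Soft(a, b uint64) uint64 {
	aLo, aHi := a&0xFFFFFFFF, a>>32
	bLo, bHi := b&0xFFFFFFFF, b>>32

	// Low halves contribute directly to the result.
	lo := aLo * bLo
	// Cross terms land 32 bits higher; anything above bit 63 is discarded.
	cross := aLo*bHi + aHi*bLo
	// aHi*bHi would be shifted by 64 bits, so it never reaches the result.
	return lo + (cross << 32)
}

func main() {
	a, b := uint64(123456), uint64(789012)
	fmt.Println(mul64Soft(a, b) == a*b)

	// Signed multiplication can reuse the same routine thanks to
	// two's complement arithmetic.
	x, y := int64(-1234567890), int64(3)
	fmt.Println(int64(mul64Soft(uint64(x), uint64(y))) == x*y)
}
```

The point is simply that the compiler swaps one unsupported instruction for a function call like this one, so the CPU only ever executes operations it actually implements.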
And sure enough, after fixing all the multiplication and division instructions for 64-bit integers, our arithmetic problems seem to be fixed:

Note that we didn’t change any addition or subtraction: those instructions are supported by the PS2. Only multiplication and division are not. Weird. Anyway, now we can restore TinyGo’s original code for handling these operations to its former glory and not have to worry about this anymore (hopefully).
The FPU is single
Quick thing to explain before we dive into this: floating-point operations on the PS2 are done inside a coprocessor called COP1. It handles only single-precision floating-point numbers (ie. float and not double).
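In Go terms, that’s float32 versus float64. One thing worth keeping in mind is that Go picks float64 for any untyped floating-point constant, so it’s very easy to end up on the double-precision path without ever writing the word float64. A small sketch to illustrate (kindOf is just a helper made up for this example):

```go
package main

import "fmt"

// kindOf (hypothetical helper) reports which floating-point type a value has.
func kindOf(v any) string {
	switch v.(type) {
	case float32:
		return "float32"
	case float64:
		return "float64"
	default:
		return "other"
	}
}

func main() {
	a := 0.12345             // untyped constant: defaults to float64
	var b float32 = 0.12345  // explicitly single precision
	c := float32(a) * 10     // stays float32 after the conversion

	fmt.Println(kindOf(a), kindOf(b), kindOf(c))
}
```

So any loop variable declared with a plain := and a float literal is a double as far as the compiler is concerned.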
During my (definitely not exhaustive) testing, I’ve noticed that floating point numbers are not behaving correctly - especially here:
func main() {
    debug.Init()
    debug.Printf("Start\n\n")
    for i := 0.12345; i <= 12345; i = i * 10 {
        debug.Printf("%.02f\n", i)
    }
    debug.Printf("\n\nEnd")
    for {}
}
This is what it prints:

Ok, this is definitely not normal. If we disassemble the main.main function, we get this:
00054d50 <main.main>:
54d50: 27bdffd0 addiu sp,sp,-48
54d54: ffbf0028 sd ra,40(sp)
54d58: 0c000000 jal 0 <(internal/gclayout.Layout).AsPtr>
54d5c: 00000000 nop
54d60: 3c010000 lui at,0x0
54d64: 24210000 addiu at,at,0
54d68: 00202025 move a0,at
54d6c: 64050007 daddiu a1,zero,7
54d70: 64080000 daddiu a4,zero,0
54d74: 01003025 move a2,a4
54d78: 01003825 move a3,a4
54d7c: 0c000000 jal 0 <(internal/gclayout.Layout).AsPtr>
54d80: 00000000 nop
54d84: 3c010000 lui at,0x0
54d88: d4200000 ldc1 $f0,0(at)
54d8c: f7a00020 sdc1 $f0,32(sp)
54d90: 08000000 j 0 <(internal/gclayout.Layout).AsPtr>
54d94: 00000000 nop
54d98: d7a00020 ldc1 $f0,32(sp)
54d9c: f7a00018 sdc1 $f0,24(sp)
54da0: 3c010000 lui at,0x0
54da4: d4210000 ldc1 $f1,0(at)
54da8: 46210036 c.ole.d $f0,$f1
54dac: 00000000 nop
54db0: 45000026 bc1f 54e4c <main.main+0xfc>
54db4: 00000000 nop
54db8: 08000000 j 0 <(internal/gclayout.Layout).AsPtr>
54dbc: 00000000 nop
54dc0: 64040008 daddiu a0,zero,8
54dc4: ffa40008 sd a0,8(sp)
54dc8: 640500c5 daddiu a1,zero,197
54dcc: 0c000000 jal 0 <(internal/gclayout.Layout).AsPtr>
54dd0: 00000000 nop
54dd4: dfa40008 ld a0,8(sp)
54dd8: afa20014 sw v0,20(sp)
54ddc: 64050000 daddiu a1,zero,0
54de0: 0c000000 jal 0 <(internal/gclayout.Layout).AsPtr>
54de4: 00000000 nop
54de8: d7a00018 ldc1 $f0,24(sp)
54dec: 00401825 move v1,v0
54df0: 8fa20014 lw v0,20(sp)
54df4: f4600000 sdc1 $f0,0(v1)
54df8: 3c010000 lui at,0x0
54dfc: 24210000 addiu at,at,0
54e00: ac430004 sw v1,4(v0)
54e04: ac410000 sw at,0(v0)
54e08: 00403025 move a2,v0
54e0c: 3c010000 lui at,0x0
54e10: 24210000 addiu at,at,0
54e14: 00202025 move a0,at
54e18: 64050006 daddiu a1,zero,6
54e1c: 64080001 daddiu a4,zero,1
54e20: 01003825 move a3,a4
54e24: 0c000000 jal 0 <(internal/gclayout.Layout).AsPtr>
54e28: 00000000 nop
54e2c: d7ac0018 ldc1 $f12,24(sp)
54e30: 3c010000 lui at,0x0
54e34: d42d0000 ldc1 $f13,0(at)
54e38: 0c000000 jal 0 <(internal/gclayout.Layout).AsPtr>
54e3c: 00000000 nop
54e40: f7a00020 sdc1 $f0,32(sp)
54e44: 08000000 j 0 <(internal/gclayout.Layout).AsPtr>
54e48: 00000000 nop
54e4c: 3c010000 lui at,0x0
54e50: 24210000 addiu at,at,0
54e54: 00202025 move a0,at
54e58: 64050005 daddiu a1,zero,5
54e5c: 64080000 daddiu a4,zero,0
54e60: 01003025 move a2,a4
54e64: 01003825 move a3,a4
54e68: 0c000000 jal 0 <(internal/gclayout.Layout).AsPtr>
54e6c: 00000000 nop
54e70: 08000000 j 0 <(internal/gclayout.Layout).AsPtr>
54e74: 00000000 nop
54e78: 08000000 j 0 <(internal/gclayout.Layout).AsPtr>
54e7c: 00000000 nop
At first glance things look normal… until you look into some of those instructions:
ldc1 $f0,0(at)
sdc1 $f0,32(sp)
You see, LDC1 and SDC1 are instructions for loading and storing double words to and from coprocessor 1 (the FPU). However, Sony didn’t implement these instructions in the PS2 CPU, as seen in their own manual (EE Core User’s Manual, page 50):

To fix this, we can do the same LLVM hack as before: for the floating point instructions, force anything that is using float64 to use software instead of hardware operations:
setOperationAction(ISD::FABS, MVT::f64, LibCall);
setOperationAction(ISD::FADD, MVT::f64, LibCall);
setOperationAction(ISD::FSUB, MVT::f64, LibCall);
setOperationAction(ISD::FMUL, MVT::f64, LibCall);
setOperationAction(ISD::FDIV, MVT::f64, LibCall);
setOperationAction(ISD::FREM, MVT::f64, LibCall);
setOperationAction(ISD::FP_ROUND, MVT::f64, LibCall);
setOperationAction(ISD::BITCAST, MVT::f64, LibCall);
setOperationAction(ISD::LOAD, MVT::f64, LibCall);
setOperationAction(ISD::STORE, MVT::f64, LibCall);
With all of that, we finally get working float64 operations:

… or do we?
Why can’t we have nice things?
Great! Now that we’ve sorted out the core of the arithmetic functions, we need to make sure some other stuff is functional. Do you wanna know a really cool way to test this? Formatting strings.
I’m not sure why exactly, but formatting strings has been a freaking nightmare in this project. It always breaks if something isn’t just right. This is ok though, as this also becomes a cool way of testing our code!
Let’s do some tests:
type genericTest[T comparable] struct {
    name     string
    fn       func() T
    expected T
}

var (
    // Some very simple tests
    stringTests = []genericTest[string]{
        {"emp", func() string { return "" }, ""},
        {"cst", func() string { return "Hello, World!" }, "Hello, World!"},
        {"app", func() string { return "Hello, " + "World!" }, "Hello, World!"},
    }
    // Formatting strings
    formatStringTests = []genericTest[string]{
        {" %%s", func() string { return fmt.Sprintf("%s!", "abc") }, "abc!"},
        {" %%v", func() string { return fmt.Sprintf("%v!", "abc") }, "abc!"},
    }
    // Formatting integers
    formatIntegerTests = []genericTest[string]{
        {" s8", func() string { return fmt.Sprintf("%d", int8(42)) }, "42"},
        {" u8", func() string { return fmt.Sprintf("%d", uint8(42)) }, "42"},
        {"s16", func() string { return fmt.Sprintf("%d", int16(1234)) }, "1234"},
        {"u16", func() string { return fmt.Sprintf("%d", uint16(1234)) }, "1234"},
        {"s32", func() string { return fmt.Sprintf("%d", int32(123456)) }, "123456"},
        {"u32", func() string { return fmt.Sprintf("%d", uint32(123456)) }, "123456"},
        {"s64", func() string { return fmt.Sprintf("%d", int64(1234567890)) }, "1234567890"},
        {"u64", func() string { return fmt.Sprintf("%d", uint64(1234567890)) }, "1234567890"},
    }
    // Formatting floats
    formatFloatTests = []genericTest[string]{
        {"32f", func() string { return fmt.Sprintf("%.5f", float32(1.234)) }, "1.23400"},
        {"32v", func() string { return fmt.Sprintf("%v", float32(1.234)) }, "1.234"},
        {"64f", func() string { return fmt.Sprintf("%.5f", float64(123456789.123456789)) }, "123456789.12346"},
        {"64v", func() string { return fmt.Sprintf("%v", float64(123456789.123456789)) }, "1.2345678912345679e+08"},
    }
)
And this is where our new nightmare begins:

Ok, it can’t format even float32 now. Great. Let’s look at the instructions it is generating for float32 and float64:
0005af28 <main.validateAllNumberTests[float32]>:
// ...
5b084: c4800008 lwc1 $f0,8(a0)
5b088: c481000c lwc1 $f1,12(a0)
5b08c: c4820010 lwc1 $f2,16(a0)
// ...
5b0a0: e7a200d8 swc1 $f2,216(sp)
5b0a4: e7a100d4 swc1 $f1,212(sp)
5b0a8: e7a000d0 swc1 $f0,208(sp)
// ...
0005b650 <main.validateAllNumberTests[float64]>:
// ...
5b7ac: d4800008 ldc1 $f0,8(a0)
5b7b0: d4810010 ldc1 $f1,16(a0)
5b7b4: d4820018 ldc1 $f2,24(a0)
// ...
5b7c8: f7a20108 sdc1 $f2,264(sp)
5b7cc: f7a10100 sdc1 $f1,256(sp)
5b7d0: f7a000f8 sdc1 $f0,248(sp)
// ...
So, for float32, it is generating the correct instructions: LWC1 and SWC1. However, for float64 it is still using LDC1 and SDC1, even though we told it that it had to use a LibCall for that. Interesting, and quite odd, as the operations we tried before seem to be working just fine? This might be because the numbers I’m using are lower precision, but that’s something we can fix later.
Fun fact: while investigating this, I noticed that I already tell the compiler to not use LDC1 and SDC1 through the -mno-ldc1-sdc1 flag, but it seems to ignore it. My theory is that the way TinyGo calls the LLVM is a bit different and that flag gets lost, or something rewrites it along the way. Anyway.
After careful consideration (aka asking ChatGPT a few things), I’ve noticed that if I try to force single float through a flag in our ps2.json configuration file, it breaks the LLVM:
Stack dump:
0. Program arguments: /Users/ricardo/dev/tinygo/llvm-build/bin/clang -fno-pic -c --target=mips64el -mcpu=mips3 -fno-inline-functions -mabi=n32 -mhard-float -mxgot -mlittle-endian -v -o build/test.o build/test.ll
1. Code generation
2. Running pass 'Function Pass Manager' on module 'build/test.ll'.
3. Running pass 'MIPS DAG->DAG Pattern Instruction Selection' on function '@"(*sync.Once).Do"'
Going back and forth with it, it eventually told me about some constraints being added by the defer code generated by TinyGo. This is how it handles a defer checkpoint on mips:
case "mips":
// $4 flag (zero or non-zero)
// $5 defer frame
asmString = `
.set noat
move $$4, $$zero
jal 1f
1:
addiu $$ra, 8
sw $$ra, 4($$5)
.set at`
constraints = "={$4},{$5},~{$1},~{$2},~{$3},~{$5},~{$6},~{$7},~{$8},~{$9},~{$10},~{$11},~{$12},~{$13},~{$14},~{$15},~{$16},~{$17},~{$18},~{$19},~{$20},~{$21},~{$22},~{$23},~{$24},~{$25},~{$26},~{$27},~{$28},~{$29},~{$30},~{$31},~{memory}"
if !strings.Contains(b.Features, "+soft-float") {
// Using floating point registers together with GOMIPS=softfloat
// results in a crash: "This value type is not natively supported!"
// So only add them when using hardfloat.
constraints += ",~{$f0},~{$f1},~{$f2},~{$f3},~{$f4},~{$f5},~{$f6},~{$f7},~{$f8},~{$f9},~{$f10},~{$f11},~{$f12},~{$f13},~{$f14},~{$f15},~{$f16},~{$f17},~{$f18},~{$f19},~{$f20},~{$f21},~{$f22},~{$f23},~{$f24},~{$f25},~{$f26},~{$f27},~{$f28},~{$f29},~{$f30},~{$f31}"
}
Funny enough, disabling the constraints for the FPU registers on single float mode makes it… work? Do I understand that? Not at all. Do I care? Also not at all!
Yes, I know, I wanted to make as few code changes to TinyGo as I can, but unfortunately it is what it is. 🤷♂️
// ...
if !strings.Contains(b.Features, "+soft-float") && !strings.Contains(b.Features, "+single-float") {
// ...
}

And sure enough, disassembling the code I no longer see the invalid instructions:
$ mips64r5900el-ps2-elf-objdump -d test.o | grep -i ldc1 | wc -l
0
$ mips64r5900el-ps2-elf-objdump -d test.o | grep -i sdc1 | wc -l
0
$ mips64r5900el-ps2-elf-objdump -d test.o | grep -i lwc1 | wc -l
82
$ mips64r5900el-ps2-elf-objdump -d test.o | grep -i swc1 | wc -l
56
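If you’d rather automate this check than eyeball grep output, here’s a small sketch of the same idea in Go. It assumes the disassembly is already available as text in the usual objdump layout (address, encoding, mnemonic, operands), and countMnemonics is a made-up helper for this example rather than something from my repositories.

```go
package main

import (
	"fmt"
	"strings"
)

// countMnemonics (hypothetical helper) counts how often each of the given
// instruction mnemonics shows up in objdump-style disassembly text.
func countMnemonics(disasm string, mnemonics ...string) map[string]int {
	counts := make(map[string]int, len(mnemonics))
	for _, m := range mnemonics {
		counts[strings.ToLower(m)] = 0
	}
	for _, line := range strings.Split(disasm, "\n") {
		fields := strings.Fields(line)
		// Instruction lines look like "<addr>: <encoding> <mnemonic> <operands>".
		// Symbol headers such as "00054d50 <main.main>:" have fewer fields.
		if len(fields) < 3 || !strings.HasSuffix(fields[0], ":") {
			continue
		}
		m := strings.ToLower(fields[2])
		if _, tracked := counts[m]; tracked {
			counts[m]++
		}
	}
	return counts
}

func main() {
	sample := `00054d50 <main.main>:
54d88: d4200000 ldc1 $f0,0(at)
54d8c: f7a00020 sdc1 $f0,32(sp)
5b084: c4800008 lwc1 $f0,8(a0)`

	counts := countMnemonics(sample, "ldc1", "sdc1", "lwc1", "swc1")
	fmt.Println(counts)
}
```

Unlike a plain grep, this only looks at the mnemonic column, so a symbol name that happens to contain "ldc1" wouldn’t be counted as an invalid instruction.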
Disclaimer: this will definitely bite me back in the future, but I’m ok with that for now. Let’s tackle one problem at a time. Plus, we haven’t used defer on live hardware exactly because it breaks things.
Fun fact, with these changes, we no longer need to modify the LLVM to use LibCall for the f64 instructions. As such, I’ve disabled that change for now.
The final piece in the puzzle is now that the floating point values do not match. This is fine, we can implement some code around it on our tests to consider “close enough” numbers to be equal:
switch any(got).(type) {
case float32:
    const eps = 1e-5
    equal = math.Abs(float64(any(got).(float32))-float64(any(expected).(float32))) <= eps
case float64:
    const eps = 1e-9
    equal = math.Abs(any(got).(float64)-any(expected).(float64)) <= eps
default:
    equal = got == expected
}
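For reference, here’s the same idea wrapped into a self-contained function you can run on your desktop. The name approxEqual and the way it’s packaged are just for this example; the epsilon values are the ones from the snippet above.

```go
package main

import (
	"fmt"
	"math"
)

// approxEqual (hypothetical name) compares two values of the same type.
// Floats are considered equal when they differ by at most a small absolute
// epsilon; every other type falls back to plain equality.
func approxEqual[T comparable](got, expected T) bool {
	switch g := any(got).(type) {
	case float32:
		const eps = 1e-5
		return math.Abs(float64(g)-float64(any(expected).(float32))) <= eps
	case float64:
		const eps = 1e-9
		return math.Abs(g-any(expected).(float64)) <= eps
	default:
		return got == expected
	}
}

func main() {
	fmt.Println(approxEqual(float32(1.234), float32(1.2340001)))
	fmt.Println(approxEqual(float64(1.0), float64(1.1)))
	fmt.Println(approxEqual(int64(3), int64(3)))
}
```

One caveat with an absolute epsilon: the gap between neighbouring float32 values grows with magnitude, so for large numbers this check becomes effectively an exact comparison. A relative tolerance would be a possible alternative if that ever becomes a problem.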
And, finally:

Plus, the best part in my opinion is this - it works on real hardware:
Finally! Ok, now we’re done!
The Source Code
This project has arrived at a point where I’m comfortable enough to release its source code. The whole thing consists of 3 repositories.
The first repository is my code with all the demos, test applications, etc. It’s all very much hacked together and suffers from terrible coding standards as this is all for weekend fun.
The second and third repositories are the TinyGo and LLVM forks, used to maintain all the code hacks and changes I’ve done to make things work together.
Since this is all a bunch of hacks that I managed to make work together, I do not intend to send pull requests upstream (to TinyGo’s repository) at this moment. If things get more stable and cleaner in the future, this might change. As such, for now, you’ll have to rely on my repositories to get things to work.
That said, as usual, things are provided AS IS, meaning that no support is provided and I’m not responsible if this breaks your machine and/or console. So… use it at your own discretion!
Have fun!