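@ ARM version of trin.S, q.v.  The point here is to see how
@ much of a slowdown dumbjit.c causes.  This takes 4.6 seconds
@ on my Ryzen laptop, while the equivalent program compiled
@ with dumbjit takes 14-15 seconds, roughly 3x slower.

        .equ N, 100*1000        @ outer iterations (calls to tri)
        .equ M, 10*1000         @ argument passed to tri on each call

        .globl main
        .thumb_func
        .thumb
main:   push {r4, lr}           @ r4 is callee-saved; it holds the outer counter
        ldr r4, =N
1:      ldr r0, =M
        bl tri                  @ r0 = tri(M)
        sub r4, #1              @ 16-bit Thumb sub sets the flags
        bne 1b                  @ repeat until the counter reaches zero
        mov r1, r0              @ second printf argument: the last result
        ldr r0, =format
        bl printf
        movs r0, #0             @ return 0 from main (was r1, which left printf's result in r0)
        pop {r4, pc}
        .ltorg                  @ literal pool for the ldr =... loads above
format: .asciz "%d\n"

@ tri: return r0 + (r0 - 1) + ... + 1, the r0th triangular number.
@ Clobbers r1.
tri:    movs r1, #0             @ running sum
        tst r0, r0
        beq 1f                  @ tri(0) = 0; skip the loop
2:      add r1, r0
        sub r0, #1
        bne 2b
1:      mov r0, r1              @ result goes back in r0
        bx lr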