@@ -*- asm -*-
@@ Simple Tetris in ARM assembly.
@@ arm-linux-gnueabi-gcc -nostdlib tetris.S
@@ ASCIIcast at

@@ A couple of global variables are assigned to registers:
@@ r7: a pointer to the in-memory board state
@@ r8: the current y-location of the current piece, with 0 being
@@     “starting on the top row”, 1 being “starting on the
@@     next-to-top row”, and 19 being “starting on the bottom row”.

@@ This code otherwise adheres to the ARM EABI: r0-3 are
@@ argument or temporary registers, r4-6 and r9-12 are
@@ preserved across calls but can be used for local variables.

@@ One game in emulation went as follows:
@@ real 4m51,596s
@@ user 0m0,080s
@@ sys 0m0,090s
@@
@@ It would be a lot faster if it didn’t redraw the screen
@@ after every input byte...

@@ Problems that should be fixed:
@@ - failing to rotate pieces on the right border (and crashing) (fixed)
@@ - still walking pieces to the left as they rotate
@@ - redrawing so much
@@ - not keeping score
@@ - not speeding up
@@ - echoing input (fixed)
@@ - not restoring tty settings on SIGINT exit
@@ - the version shrunk with sstrip misaligns .bss (?!),
@@   causing a bus error in the stmia in npiece (but not in
@@   emulation)

        .syntax unified
        .thumb
        .cpu cortex-m4

        .bss
board:  .fill 4*20              @ One word per row, one bit per cell
buf:    .fill 80*22             @ Terminal output buffer.
piece:  .fill 4*4               @ same format as board and xored into it

        .data
pieces: .byte 0b1111            @ two bytes per possible piece, in binary.
        .byte 0                 @ this is I.
        .byte 0b111             @ T
        .byte 0b010
        .byte 0b111             @ L
        .byte 0b100
        .byte 0b111             @ J
        .byte 0b001
        .byte 0b110             @ Z
        .byte 0b011
        .byte 0b011             @ S
        .byte 0b110
        .byte 0b11              @ O
        .byte 0b11

        .text
        .thumb_func
        .globl _start
_start: ldr r7, =board          @ initialize global board pointer register
        movs r0, #2             @ which piece?
                                @ the L piece
        bl npiece               @ get new piece, also initializing r8
        movs r0, #1
        bl cbreak               @ set tty for char-at-a-time input (-icanon)
        @@ (note that icanon is not restored if you exit with ^C,
        @@ so hopefully your shell has command-line editing)
1:      movs r0, #500           @ main loop
        bl waitis
4:      bl xor_piece
        bl draw_board
        bl xor_piece
        bl iwait                @ wait for input or timeout in r0
        cbz r0, 2f              @ if it timed out, skip down to moving piece down
        bl key
        b 4b                    @ after key-handling, redraw without timeout reset
2:      adds r8, #1             @ move piece down the board
        bl hit                  @ does this new position hit something?
        cbz r0, 3f              @ if so,
        subs r8, #1
        bl xor_piece            @ redraw it in the previous pos, and
        bl clear_lines          @ clear any full lines, and
        ands r0, r8, #7         @ choose a new piece (for now just by line XXX)
        cmp r0, #7              @ illegal piece?
        it eq                   @ if so,
        moveq r0, #3            @ pick a J (index 3 in pieces);
        bl npiece               @ get a new piece, and
        bl hit                  @ if the new piece hits,
        cbnz r0, 1f             @ leave the loop
3:      b 1b
1:      movs r0, #0
        bl cbreak               @ restore cooked echo mode (r0=0)
        movs r0, #0
        bl exit

@@ Handle the keystroke in r0
        .thumb_func
key:    push {r4, lr}
        cmp r0, #'j             @ from tetris-bsd
        beq 1f
        cmp r0, #'J
        beq 1f
        cmp r0, #'D             @ VT100 left is ^[[D
        beq 1f
        cmp r0, #'l             @ from tetris-bsd
        beq 3f
        cmp r0, #'L
        beq 3f
        cmp r0, #'C             @ VT100 right is ^[[C
        beq 3f
        cmp r0, #'k             @ yup, you guessed it
        beq 4f
        cmp r0, #'K
        beq 4f
        cmp r0, #'A             @ VT100 up is ^[[A
        beq 4f
        cmp r0, #'n             @ recent versions of tetris-bsd
        beq 4f
        cmp r0, #'N
        beq 4f
        cmp r0, #'B             @ VT100 down is ^[[B
        beq 5f
2:      pop {r4, pc}
5:      movs r0, #0             @ to drop, set wait before dropping to zero
        bl waitis
        b 2b
4:      bl rotate
        b 2b
3:      bl right
        b 2b
1:      bl left
        b 2b

@@ Rotate the current piece counterclockwise, but only if
@@ that’s possible without hitting anything.
        .thumb_func
rotate: push {r4, r9-r12, lr}   @ r12 is IP
        ldr r1, =piece
        ldmia r1, {r9-r12}      @ load piece
        ldmia r1, {r0-r3}       @ another copy!
        bl ror0r3               @ rotate r0-r3
        ldr r4, =piece
        stmia r4!, {r0-r3}
        bl hit                  @ check to see if proposed rotation collides
        cbz r0, 1f
        ldr r1, =piece
        stmia r1!, {r9-r12}     @ restore previous state if so
1:      pop {r4, r9-r12, pc}

@@ Rotate the piece in registers r0-r3 counterclockwise,
@@ returning it in the same registers. This almost works, but
@@ consistently produces wrong answers.
        .thumb_func
ror0r3: push {r4, r5, r6, r9, r10, r11, r12, lr}
        @@ Start by shifting the piece to the right edge and
        @@ remembering how far.
        movs r4, #0             @ Shift count
1:      orrs r5, r0, r1         @ See if we can shift further safely.
        orrs r5, r2
        orrs r5, r3
        tst r5, #1
        bne 1f                  @ Exit the loop if we’re at the right margin
        adds r4, #1
        lsrs r0, #1
        lsrs r1, #1
        lsrs r2, #1
        lsrs r3, #1
        b 1b
        @@ Now we have the piece right-justified in r0-3, occupying no
        @@ more than the last four bits. Let’s pack it into one
        @@ register.
1:      orrs r0, r1, r0, lsl #4
        orrs r0, r2, r0, lsl #4
        orrs r0, r3, r0, lsl #4
        @@ Now we’re going to use a nested loop to build up the
        @@ rotated piece in r5. r1-3 are free now, so we’re going to
        @@ use r1 for an outer loop counter and r3 for a bitmask which
        @@ doubles as an inner loop counter.
        movs r1, #4
        movs r5, #0
1:      movs r3, #0x800         @ Bitmask to select the bit of interest
        lsls r3, r1
2:      lsls r5, #1             @ make space for a new bit in r5
        tst r0, r3              @ Is this bit set?
        it ne                   @ if not zero,
        addne r5, #1            @ put a 1 bit in r5
        lsrs r3, #4             @ and either way go to the next nibble
        bne 2b                  @ and repeat unless we’re out of nibbles.
        subs r1, #1             @ Outer loop.
        bne 1b
        @@ Now we have the rotated piece in r5, but we need to return
        @@ it in r0-r3.
        ands r0, r5, #0xf
        lsrs r5, #4
        ands r1, r5, #0xf
        lsrs r5, #4
        ands r2, r5, #0xf
        movs r3, r5, lsr #4
        @@ But we also need to shift it back. I’m not sure exactly
        @@ why r4 is always one shift too many, but I’m decrementing
        @@ it here as a hack.
        cbz r4, 1f
        subs r4, #1
        lsls r3, r4
        lsls r2, r4
        lsls r1, r4
        lsls r0, r4
1:      pop {r4, r5, r6, r9, r10, r11, r12, pc}

@@ Move the piece to the right, returning 0 on success or 1 on
@@ failure. On failure it undoes its effects.
        .thumb_func
right:  push {r4, r5, r6, lr}
        ldr r4, =piece
        ldmia r4, {r0-r3}       @ load piece
        @@ First, check to see if we’re at the right wall, by ORing
        @@ all the rows of the piece together.
        orrs r6, r2, r3
        orrs r6, r1
        orrs r6, r0
        ands r6, #1
        bne 1f                  @ If so, leave without updating; otherwise,
        lsrs r3, #1             @ shift each row right and store.
        lsrs r2, #1
        lsrs r1, #1
        lsrs r0, #1
        stmia r4!, {r0-r3}
        bl hit                  @ Does the new position hit something?
        cbz r0, 1f
        bl left                 @ If so, undo the move.
1:      pop {r4, r5, r6, pc}

@@ Similarly, but moving left. The left wall is detected by the
@@ hit test rather than by a separate check.
        .thumb_func
left:   push {r4, r5, r6, lr}
        ldr r1, =piece
        ldmia r1, {r2, r3, r4, r5} @ load piece
        lsls r2, #1
        lsls r3, #1
        lsls r4, #1
        lsls r5, #1
        stmia r1!, {r2, r3, r4, r5}
        bl hit                  @ Does the new position hit something? (Incl. left wall.)
        cbz r0, 1f
        bl right                @ If so, undo the move. (Potential iloop.)
1:      pop {r4, r5, r6, pc}

@@ Choose the piece indexed by r0 (0-6).
        .thumb_func
npiece: push {r4, lr}
        ldr r1, =piece
        ldr r2, =pieces
        adds r2, r2, r0, lsl #1 @ compute pointer to piece
        ldrb r0, [r2]           @ load first byte of piece
        ldrb r2, [r2, #1]       @ load second byte of piece
        lsls r0, #3             @ shift piece to center
        lsls r2, #3
        movs r3, #0
        movs r4, #0
        stmia r1, {r0, r2, r3, r4}
        movs r8, #0             @ new piece is at top
        pop {r4, pc}

@@ Set sleep for iwait to r0 milliseconds.
@@ (r0 must be under 1000)
        .thumb_func
waitis: ldr r2, =wait           @ struct timeval
        movs r3, #0             @ 0 sec
        str r3, [r2]            @ .tv_sec = 0
        ldr r1, =1000           @ multiplier for ms
        mul r0, r1
        str r0, [r2, #4]        @ set .tv_usec
        bx lr

        .bss
wait:   .fill 8                 @ the struct timeval
        .text

@@ Wait for input for up to time set by waitis. Uses Linux
@@ (SysV) semantics for select(2) to keep track of leftover
@@ time as long as waitis isn’t called again.
        .thumb_func
iwait:  push {r4, r5, r7, lr}
        .bss
1:      .long 0                 @ fd_set for input
        .text
        movs r0, #1             @ stdin only (used as both bitmask and count)
        ldr r1, =1b             @ input fds
        str r0, [r1]            @ must re-initialize input fd_set each time
        movs r2, #0             @ no output fds
        movs r3, #0             @ no exceptions
        ldr r4, =wait           @ struct timeval set up by waitis
        movs r7, #0x8e          @ _newselect() system call number
        svc 0
        cbz r0, 1f              @ if we have input on stdin (select returned 1),
        movs r0, #0             @ read from stdin
        ldr r1, =1b             @ reuse fd_set as a character buffer
        movs r2, #1             @ one byte!
        movs r7, #3             @ read() syscall number from
        svc 0
        @ ldr r1, =1b
        ldrb r0, [r1]           @ read into r0 (which is 0 if we’re skipping this)
1:      pop {r4, r5, r7, pc}

        .thumb_func
xor_piece:
        push {r4, lr}
        adds r1, r7, r8, lsl #2 @ calculate address of top row
        ldr r2, =piece
        movs r0, #12
1:      ldr r3, [r2, r0]
        cbz r3, 2f              @ xoring empty rows may overflow buffer
        ldr r4, [r1, r0]
        eors r3, r4
        str r3, [r1, r0]
2:      subs r0, #4
        bpl 1b                  @ repeat if index not yet negative
        pop {r4, pc}

@@ Returns nonzero in r0 iff the current piece would collide
@@ with something on the board at the current height, or with
@@ the left wall, or the floor.
        .thumb_func
hit:    push {r4, r5, r6, lr}
        adds r1, r7, r8, lsl #2 @ pointer to row coinciding with top of piece
        adds r5, r7, #80        @ pointer to end of board
        ldr r2, =piece
        movs r0, #12
1:      ldr r3, [r2, r0]        @ load row of piece
        cbz r3, 2f              @ (but disregard empty rows)
        adds r6, r1, r0         @ Calculate pointer to row of blocks.
        cmp r6, r5              @ If a nonempty row is below the floor,
        bge 1f                  @ we have a hit; jump to fail handler.
        ldr r4, [r6]
        tst r3, r4              @ is there a hit on a block?
        bne 1f                  @ jump to fail handler if so
        cmp r3, #0x400          @ Also, is it hitting the left wall?
        bge 1f
2:      subs r0, #4
        bpl 1b
        movs r0, #0             @ finished loop without finding a hit
        pop {r4, r5, r6, pc}
1:      movs r0, #1
        pop {r4, r5, r6, pc}

        .thumb_func
exit:   movs r7, #1
        svc 0
        b .

        @ 🟦🟥🟩🟧🟪🟨 ? ⬛⬜?
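
@@ left and right above undo a failed move by calling each other,
@@ which is where the "potential iloop" comes from. What follows
@@ is only a sketch of an alternative: save the old rows in
@@ preserved registers and store them back on a collision, the
@@ way rotate does. left_safe is a hypothetical name and nothing
@@ in this file calls it. It assumes hit leaves r9 and r10 alone
@@ (it only touches r0-r6). Returns 0 in r0 on success, 1 on
@@ failure.
        .thumb_func
left_safe:
        push {r4, r5, r6, r9, r10, lr}
        ldr r10, =piece
        ldmia r10, {r4, r5, r6, r9}     @ old rows, kept across the call to hit
        lsls r0, r4, #1                 @ shift each row left by one column
        lsls r1, r5, #1
        lsls r2, r6, #1
        lsl r3, r9, #1
        stmia r10, {r0-r3}              @ store the proposed position
        bl hit                          @ hit also reports the left wall
        cbz r0, 1f                      @ no collision: keep the move
        stmia r10, {r4, r5, r6, r9}     @ collision: put the old rows back
1:      pop {r4, r5, r6, r9, r10, pc}
@@ To try it, the "bl left" in key could become "bl left_safe".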
        .thumb_func
draw_board:
        push {r4, r5, r6, lr}
        ldr r0, =buf            @ output pointer
        .data
1:      .asciz "\033[H"         @ ANSI home cursor sequence
        .text
        ldr r1, =1b
        bl putstr
1:      movs r5, #0             @ row
1:      ldr r1, [r7, r5]        @ offset from base pointer to board
        bl draw_line            @ returns updated pointer in r0
        adds r5, #4
        cmp r5, #(4*20)
        bne 1b
        @@ draw floor
        movw r1, #0x3ff
        bl draw_line
        @@ set up write(2) args
        ldr r1, =buf
        subs r2, r0, r1         @ calculate size
        movs r0, #1             @ stdout
        bl write
        pop {r4, r5, r6, pc}

@@ Append nul-terminated string at r1 to buffer pointer at r0.
@@ Returns updated buffer pointer in r0. This is basically
@@ strcpy without copying the terminator.
        .thumb_func
putstr: ldrb r2, [r1], #1
        cbz r2, 1f
        strb r2, [r0], #1
        b putstr
1:      bx lr

        .thumb_func
draw_line:
        push {r4, r5, r6, lr}
        movs r5, #(1<<11)       @ column bit mask
        movs r4, r1
        lsls r4, #1
        adds r4, r5             @ left wall
        adds r4, #1             @ right wall
2:      tst r4, r5
        .data
block:  .asciz "🟪"             @ Unicode purple block
space:  .asciz "⬜"             @ Unicode white block
        .text
        ldr r1, =space
        beq 3f                  @ skip forward to draw space if zero
        subs r1, #(space-block)
3:      bl putstr
        lsrs r5, #1
        bne 2b                  @ exit if mask bit right-shifted out of word
        movs r5, #'\n
        strb r5, [r0], #1
        pop {r4, r5, r6, pc}

        .thumb_func
write:  push {r7, lr}
        movs r7, #4             @ write system call number
        svc 0
        pop {r7, pc}

        .thumb_func
clear_line:
1:      subs r2, r0, #4         @ r0 is initially the offset to clear
        ldr r3, [r7, r2]
        str r3, [r7, r0]
        mov r0, r2
        bne 1b                  @ loop unless new r0 is 0
        str r0, [r7]            @ make top line blank
        bx lr

        .thumb_func
clear_lines:
        push {r4, r5, r6, lr}
        movs r4, #76            @ offset of candidate line to examine
        movw r5, #0x3ff         @ what a full line looks like
1:      ldr r2, [r7, r4]
        cmp r2, r5
        bne 2f
        mov r0, r4
        bl clear_line
        b 1b                    @ after clearing a line check it again
2:      subs r4, #4
        bpl 1b                  @ repeat if r4 >= 0
        pop {r4, r5, r6, pc}

@@ Code to turn on or off ICANON and ECHO in Linux (see
@@ cbreak.c), turning them off if r0 is nonzero and on if it's
@@ zero.
@@ Unix tty handling is horrible, and so is ioctl, and
@@ so is your mother’s face. Note that if you ^C this won’t
@@ restore them.
        .bss
tios:   .fill 60                @ sizeof(struct termios)
        .text
        .thumb_func
cbreak: push {r4, r7, r9, lr}
        mov r4, r0              @ input argument says whether to set cbreak (r0=1) or restore (r0=0)
        movs r0, #0             @ stdin
        movw r1, #0x5401        @ TCGETS
        ldr r2, =tios           @ char *argp
        movs r7, #0x36          @ ioctl system call number
        svc 0
        ldr r2, =tios
        ldr r3, [r2, #12]       @ tios.c_lflag
        bic r3, #0xa            @ &= ~(ICANON | ECHO)
        cbnz r4, 1f             @ or if r4 (original arg r0) is zero,
        orr r3, #0xa            @ |= (ICANON | ECHO)
1:      str r3, [r2, #12]       @ update c_lflag
        movs r0, #0             @ stdin again
        movw r1, #0x5402        @ TCSETS
        ldr r2, =tios           @ XXX is this really necessary?
        movs r7, #0x36          @ ioctl again; XXX or this?
        svc 0
        pop {r4, r7, r9, pc}
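
@@ The main loop in _start picks the next piece from the row
@@ number (the XXX there). One possible replacement, as a sketch
@@ only, is a xorshift32 generator. The names rand7 and seed are
@@ hypothetical and nothing in this file calls them. The seed is
@@ an arbitrary nonzero constant, so every game would get the
@@ same sequence unless something like the time is mixed into
@@ seed first. Returns a piece index 0-6 in r0; clobbers r1, r2.
        .data
        .balign 4               @ the strings above leave .data unaligned
seed:   .word 0x2545f491        @ assumption: any nonzero value will do
        .text
        .thumb_func
rand7:  ldr r1, =seed
        ldr r0, [r1]
1:      eor r0, r0, r0, lsl #13 @ x ^= x << 13
        eor r0, r0, r0, lsr #17 @ x ^= x >> 17
        eor r0, r0, r0, lsl #5  @ x ^= x << 5
        lsrs r2, r0, #29        @ top three bits give 0-7
        cmp r2, #7              @ 7 is not a piece,
        beq 1b                  @ so step the generator again
        str r0, [r1]            @ save the state for the next call
        mov r0, r2
        bx lr
@@ To try it, "bl rand7" could stand in for the ands/cmp/it/moveq
@@ sequence that precedes the second "bl npiece" in _start.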