I’m still having fun with my From Nand to Tetris toolchain (https://github.com/mossprescott/pynand).

Just merged two experimental CPU variants: one that adds complexity to do more in fewer cycles, and another that’s also more complex and runs slower, but can fit a much larger program in the same ROM space.

Still wondering if there’s a way to make a smaller CPU, but inspiration is lacking so far.