Cycle counting
It is often useful to delay for a specific number of CPU cycles, for example when timing raster effects or generating PCM audio. This article outlines a few relevant techniques.
Short delays
Here are a few ways to create short delays without side effects. As the shortest instruction takes 2 cycles, it is not possible to delay by exactly 1 cycle on its own.
NOP
- 2 cycles, 1 byte, no side effects
JMP *+3
- 3 cycles, 3 bytes, no side effects
Bxx *+2
- 3 cycles, 2 bytes, no side effects but requires a known flag state (e.g. BCC if carry is known to be clear)
IGN zp
- 3 cycles, 2 bytes, only side effect is a read, unofficial instruction
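Since 2-cycle and 3-cycle instructions can be chained, any delay of 2 cycles or more can be built from them. The arithmetic can be sketched in Python as follows. This is only an illustrative model, not assembler output: the function name `short_delay` is hypothetical, and it assumes that only NOP and JMP *+3 are used.

```python
def short_delay(cycles):
    """Return a list of (mnemonic, cycles, bytes) tuples that together
    burn exactly `cycles` CPU cycles using only NOP and JMP *+3."""
    if cycles < 2:
        raise ValueError("the shortest instruction takes 2 cycles")
    plan = []
    if cycles % 2 == 1:
        # An odd total needs exactly one 3-cycle instruction.
        plan.append(("JMP *+3", 3, 3))
        cycles -= 3
    # The remainder is even, so it can be filled with 2-cycle NOPs.
    plan.extend([("NOP", 2, 1)] * (cycles // 2))
    return plan


if __name__ == "__main__":
    for mnemonic, cost, size in short_delay(7):
        print(f"{mnemonic:8} ; {cost} cycles, {size} byte(s)")
```

Where the flag state is known, swapping JMP *+3 for a Bxx *+2 would save one byte without changing the cycle count.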
Clockslide
A clockslide[1] is a sequence of instructions that wastes a small constant number of cycles plus one cycle per executed byte, regardless of whether it is entered at an odd or even address.
With official instructions, one can construct a clockslide from CMP instructions:
... C9 C9 C9 C9 C5 EA
Disassemble from the start and you get CMP #$C9, CMP #$C9, CMP $EA (6 bytes, 7 cycles).
Disassemble one byte in and you get CMP #$C9, CMP #$C5, NOP (5 bytes, 6 cycles).
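The "one cycle per executed byte" property can be checked for every entry point with a small Python model. This is a sketch under stated assumptions: the names `OPCODES`, `SLIDE` and `slide_cost` are hypothetical, and the table covers only the three opcodes used in the byte sequence above.

```python
# opcode: (mnemonic, size in bytes, cycles)
OPCODES = {
    0xC9: ("CMP #imm", 2, 2),
    0xC5: ("CMP zp", 2, 3),
    0xEA: ("NOP", 1, 2),
}

# The six bytes shown above.
SLIDE = bytes([0xC9, 0xC9, 0xC9, 0xC9, 0xC5, 0xEA])


def slide_cost(code, entry):
    """Decode `code` starting at byte offset `entry` and return
    (bytes executed, cycles spent) up to the end of the slide."""
    pc = entry
    cycles = 0
    while pc < len(code):
        _name, size, cost = OPCODES[code[pc]]
        if pc + size > len(code):
            raise ValueError("instruction runs past the end of the slide")
        cycles += cost
        pc += size
    return len(code) - entry, cycles


if __name__ == "__main__":
    for entry in range(len(SLIDE)):
        executed, cycles = slide_cost(SLIDE, entry)
        print(f"entry +{entry}: {executed} bytes, {cycles} cycles")
```

In this model every entry offset costs the number of bytes executed plus one cycle, so moving the entry point one byte later shortens the delay by exactly one cycle.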
The entry point can be controlled with an indirect jump or the RTS Trick to precisely control raster effect or sample playback timing.
CMP has the side effect of destroying most of the flags, but unofficial instructions that skip one byte can be used to preserve them. For example, replace $C9 (CMP) with $89 or $80, which skip one immediate byte, and replace $C5 with $04, $44, or $64, which read a byte from zero page and ignore it.
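The substitution described above is a plain byte-for-byte replacement, which can be sketched in Python. The names `SUBSTITUTE` and `preserve_flags` are hypothetical, and the sketch arbitrarily picks $80 and $04 from the alternatives listed; it assumes the replacement opcodes have the same size and timing as the ones they replace (2 bytes and 2 cycles for the immediate form, 2 bytes and 3 cycles for the zero page form).

```python
# Official opcode -> unofficial opcode with the same size and timing.
SUBSTITUTE = {
    0xC9: 0x80,  # CMP #imm -> unofficial NOP #imm (skips one byte)
    0xC5: 0x04,  # CMP zp   -> unofficial NOP zp (reads and ignores a byte)
}


def preserve_flags(slide):
    """Return a copy of a CMP-based clockslide with every CMP opcode
    replaced by its flag-preserving unofficial counterpart."""
    return bytes(SUBSTITUTE.get(b, b) for b in slide)


if __name__ == "__main__":
    cmp_slide = bytes([0xC9, 0xC9, 0xC9, 0xC9, 0xC5, 0xEA])
    print(preserve_flags(cmp_slide).hex(" ").upper())
```

Because every byte of the slide can be decoded either as an opcode or as an operand, replacing all occurrences keeps the slide valid from any entry point.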