PPU frame timing
The following behavior is tested by the ppu_vbl_nmi_timing test ROMs. Only the NTSC PPU is covered, though most probably applies to the PAL PPU.
Even/Odd Frames
- The PPU has an even/odd flag that is toggled every frame, regardless of whether rendering is enabled or disabled.
- With rendering disabled (background and sprites disabled in PPUMASK ($2001)), each PPU frame is 341*262=89342 PPU clocks long. There is no skipped clock every other frame.
- With rendering enabled, each odd PPU frame is one PPU clock shorter than normal. This is done by skipping the first idle tick on the first visible scanline (by jumping directly from (339,261) on the pre-render scanline to (0,0) on the first visible scanline and doing the last cycle of the last dummy nametable fetch there instead; see this diagram).
- By keeping rendering disabled until after the time when the clock is skipped on odd frames, you can get a different color dot crawl pattern than normal (it looks more like that of interlace, where colors flicker between two states rather than the normal three). Presumably Battletoads (and others) encounter this, since it keeps the BG disabled until well after this time each frame.
CPU-PPU Clock Alignment
The NTSC PPU runs at 3 times the CPU clock rate, so for a given power-up PPU events can occur on one of three relative alignments with the CPU clock they fall within. Since the PPU divides the master clock by four, there are actually more than just three alignments possible: The beginning of a CPU tick could be offset by 0-3 master clock ticks from the nearest following PPU tick. The results below only cover one particular set of alignments, namely the one which gives the fewest number of special cases, where a read will see a change to a flag if and only if it starts at or after the PPU tick where the flag changes. (Other alignments might cause the change to be visible 1 PPU tick earlier or later; see this thread.)
Synchronizing the CPU and PPU clocks
If rendering is off, each frame will be 341*262/3 = 29780 2/3 CPU clocks long. If the CPU checks the VBL flag in a loop every 29781 clocks, the read will occur one PPU tick later relative to the start of the frame each frame, until at some point the CPU "catches up" to the location where the flag gets set. At this point, the CPU and PPU synchronization is known down the PPU tick.
PPU 29780.7 29780.7 29780.7 29780.7 -C--P-------C--P-------C-P-------CP-------* CPU 29781 29781 29781 29781 * The above synchronization will result in an exact PPU/CPU alignment. -|--.--V--|- -|--V--.--|- -V--.--.--|- Read Read Read Loop will stop here
VBL Flag Timing
- See also: NMI
- Reading $2002 within a few PPU clocks of when VBL is set results in special-case behavior. Reading one PPU clock before reads it as clear and never sets the flag or generates NMI for that frame. Reading on the same PPU clock or one later reads it as set, clears it, and suppresses the NMI for that frame. Reading two or more PPU clocks before/after it's set behaves normally (reads flag's value, clears it, and doesn't affect NMI operation).
- On an NTSC machine, the VBL flag is cleared 6820 PPU clocks, or exactly 20 scanlines, after it is set. In other words, it's cleared at the start of the pre-render scanline. (TO DO: confirmation on PAL NES and common PAL famiclone)