PPU registers: Difference between revisions

From NESdev Wiki
(→‎Color Control: Added % for each RGB component for color emphasis in the RGB color space.)
m (Fixes the anchors; apparently HTML escape codes aren't acceptable.)
 
(54 intermediate revisions by 16 users not shown)
Line 1: Line 1:
The PPU exposes eight memory-mapped registers to the CPU. These nominally sit at $2000 through $2007 in the CPU's address space, but because they're incompletely decoded, they're [[Mirroring|mirrored]] in every 8 bytes from $2008 through $3FFF, so a write to $3456 is the same as a write to $2006.
The [[PPU]] exposes eight memory-mapped registers to the CPU. These nominally sit at $2000 through $2007 in the CPU's address space, but because their addresses are incompletely decoded, they're [[Mirroring#Memory Mirroring|mirrored]] in every 8 bytes from $2008 through $3FFF. For example, a write to $3456 is the same as a write to $2006.
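The mirroring described above can be sketched in a few lines of Python. This is only an illustration of incomplete address decoding, assuming that just the low three address bits select a register; the function names are hypothetical and not part of any emulator API.

```python
def ppu_register_index(cpu_addr):
    """Return the register number (0-7) selected by a CPU address in $2000-$3FFF."""
    if not 0x2000 <= cpu_addr <= 0x3FFF:
        raise ValueError("address is outside the PPU register range")
    # Only the low three address bits are decoded, so the registers repeat every 8 bytes.
    return cpu_addr & 0x0007


def canonical_ppu_address(cpu_addr):
    """Map any mirror in $2000-$3FFF back to its nominal $2000-$2007 address."""
    return 0x2000 | ppu_register_index(cpu_addr)
```

For instance, `canonical_ppu_address(0x3456)` evaluates to `0x2006`, matching the example in the text.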
 
Immediately after powerup, the PPU isn't necessarily in a usable state.
The program needs to do a few things to get it going; see [[PPU power up state]] and [[Init code]].  


The PPU starts rendering immediately after power-on or reset, but ignores writes to most registers (specifically $2000, $2001, $2005 and $2006) until reaching the pre-render scanline of the next frame; more specifically, for around 29658 NTSC CPU cycles or 33132 PAL CPU cycles, assuming the CPU and PPU are reset at the same time. See [[PPU power up state]] and [[Init code]] for details.
<noinclude>
<noinclude>
__TOC__
__TOC__
</noinclude>
</noinclude>
== Summary ==
== Summary ==
{| class="tabular"
{| class="tabular"
! Common Name
! Common Name
! Address
! Address
! Bits
! Bits
! Type
! Notes
! Notes
|-
|-
! [[#PPUCTRL|PPUCTRL]]
! [[#PPUCTRL|PPUCTRL]]
! $2000
! $2000
| <tt style="white-space: nowrap">VPHB SINN</tt> || NMI enable (V), PPU master/slave (P), sprite height (H), background tile select (B), sprite tile select (S), increment mode (I), nametable select (NN)
| style="text-align: right" | <tt style="white-space: nowrap">VPHB SINN</tt> || W || [[NMI]] enable (V), PPU master/slave (P), sprite height (H), background tile select (B), sprite tile select (S), increment mode (I), nametable select / X and Y scroll bit 8 (NN)
|-
|-
! [[#PPUMASK|PPUMASK]]
! [[#PPUMASK|PPUMASK]]
! $2001
! $2001
| <tt style="white-space: nowrap">BGRs bMmG</tt> || color emphasis (BGR), sprite enable (s), background enable (b), sprite left column enable (M), background left column enable (m), greyscale (G)
| style="text-align: right" | <tt style="white-space: nowrap">BGRs bMmG</tt> || W || color emphasis (BGR), sprite enable (s), background enable (b), sprite left column enable (M), background left column enable (m), greyscale (G)
|-
|-
! [[#PPUSTATUS|PPUSTATUS]]
! [[#PPUSTATUS|PPUSTATUS]]
! $2002
! $2002
| <tt style="white-space: nowrap">VSO- ----</tt> || vblank (V), sprite 0 hit (S), sprite overflow (O), read resets write pair for $2005/2006
| style="text-align: right" | <tt style="white-space: nowrap">VSO- ----</tt> || R || vblank (V), sprite 0 hit (S), sprite overflow (O); read resets write pair for $2005/$2006
|-
|-
! [[#OAMADDR|OAMADDR]]
! [[#OAMADDR|OAMADDR]]
! $2003
! $2003
| <tt style="white-space: nowrap">aaaa aaaa</tt> || OAM read/write address
| style="text-align: right" | <tt style="white-space: nowrap">AAAA AAAA</tt> || W || [[PPU OAM|OAM]] read/write address
|-
|-
! [[#OAMDATA|OAMDATA]]
! [[#OAMDATA|OAMDATA]]
! $2004
! $2004
| <tt style="white-space: nowrap">dddd dddd</tt> || OAM data read/write
| style="text-align: right" | <tt style="white-space: nowrap">DDDD DDDD</tt> || RW || OAM data read/write
|-
|-
! [[#PPUSCROLL|PPUSCROLL]]
! [[#PPUSCROLL|PPUSCROLL]]
! $2005
! $2005
| <tt style="white-space: nowrap">xxxx xxxx</tt> || fine scroll position (two writes: X, Y)
| style="text-align: right" | <tt style="white-space: nowrap">XXXX XXXX YYYY YYYY</tt> || Wx2 || X and Y scroll bits 7-0 (two writes: X scroll, then Y scroll)
|-
|-
! [[#PPUADDR|PPUADDR]]
! [[#PPUADDR|PPUADDR]]
! $2006
! $2006
| <tt style="white-space: nowrap">aaaa aaaa</tt> || PPU read/write address (two writes: MSB, LSB)
| style="text-align: right" | <tt style="white-space: nowrap">..AA AAAA AAAA AAAA</tt> || Wx2 || VRAM address (two writes: most significant byte, then least significant byte)
|-
|-
! [[#PPUDATA|PPUDATA]]
! [[#PPUDATA|PPUDATA]]
! $2007
! $2007
| <tt style="white-space: nowrap">dddd dddd</tt> || PPU data read/write
| style="text-align: right" | <tt style="white-space: nowrap">DDDD DDDD</tt> || RW || VRAM data read/write
|-
|-
! [[#OAMDMA|OAMDMA]]
! [[#OAMDMA|OAMDMA]]
! $4014
! $4014
| <tt style="white-space: nowrap">aaaa aaaa</tt> || OAM DMA high address
| style="text-align: right" | <tt style="white-space: nowrap">AAAA AAAA</tt> || W || OAM DMA high address
|}
|}


== Ports ==
Register types:
* '''R''' - Readable
* '''W''' - Writeable
* '''x2''' - Internal 2-byte state accessed by two 1-byte accesses
 
{{Anchor|Ports}}
== MMIO registers ==


The PPU has an internal data bus that it uses for communication with the CPU.
The PPU has an internal data bus that it uses for communication with the CPU.
Line 60: Line 63:
Reading any readable port (PPUSTATUS, OAMDATA, or PPUDATA) also fills the latch with the bits read.
Reading any readable port (PPUSTATUS, OAMDATA, or PPUDATA) also fills the latch with the bits read.
Reading a nominally "write-only" register returns the latch's current value, as do the unused bits of PPUSTATUS.
Reading a nominally "write-only" register returns the latch's current value, as do the unused bits of PPUSTATUS.
This value begins to decay after a frame or so, faster once the PPU has warmed up, and it is likely that values with alternating bit patterns (such as $55 or $AA) will decay faster.<ref>[http://forums.nesdev.org/viewtopic.php?p=143801#p143801 Reply to "Riding the open bus"] by lidnariq</ref>
This value begins to decay after a frame or so, faster once the PPU has warmed up, and it is likely that values with alternating bit patterns (such as $55 or $AA) will decay faster.<ref>[//forums.nesdev.org/viewtopic.php?p=143801#p143801 Reply to "Riding the open bus"] by lidnariq</ref>


=== <span id="PPUCTRL"><span id="Reg2000">Controller ($2000) > write</span></span> ===  
{{Anchor|PPUCTRL}}{{Anchor|Reg2000}}{{Anchor|Controller_($2000)_>_write}}
 
=== PPUCTRL - Miscellaneous settings ($2000 write) ===
* Common name: '''PPUCTRL'''
----
* Description: PPU control register
* Access: write
 
Various flags controlling PPU operation
  7  bit  0
  7  bit  0
  ---- ----
  ---- ----
Line 80: Line 79:
  ||||      (0: $0000; 1: $1000; ignored in 8x16 mode)
  ||||      (0: $0000; 1: $1000; ignored in 8x16 mode)
  |||+------ Background pattern table address (0: $0000; 1: $1000)
  |||+------ Background pattern table address (0: $0000; 1: $1000)
  ||+------- Sprite size (0: 8x8; 1: 8x16)
  ||+------- [[Sprite size]] (0: 8x8 pixels; 1: 8x16 pixels – see [[PPU OAM#Byte 1]])
  |+-------- PPU master/slave select
  |+-------- PPU master/slave select
  |          (0: read backdrop from EXT pins; 1: output color on EXT pins)
  |          (0: read backdrop from EXT pins; 1: output color on EXT pins)
  +--------- Generate an [[NMI]] at the start of the
  +--------- [[wikipedia:Vertical blanking interval|Vblank]] [[NMI]] enable (0: off, 1: on)
            [[wikipedia:Vertical blanking interval|vertical blanking interval]] (0: off; 1: on)
 
PPUCTRL (the "control" or "controller" register) contains a mix of settings related to rendering, scroll position, vblank NMI, and dual-PPU configurations. [[PPU power up state|After power/reset]], writes to this register are ignored until the first pre-render scanline.
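As a rough illustration of the bit layout above, the following Python sketch splits a PPUCTRL value into its fields. The function and key names are hypothetical, chosen only for this example, and the increment bit is reported as a plain flag without modelling its effect.

```python
def decode_ppuctrl(value):
    """Split a PPUCTRL byte (VPHB SINN) into named fields."""
    return {
        "nmi_enable": bool(value & 0x80),                 # V: vblank NMI enable
        "ext_output": bool(value & 0x40),                 # P: 1 = output color on EXT pins
        "sprite_size": (8, 16) if value & 0x20 else (8, 8),  # H: sprite size in pixels
        "background_pattern_table": 0x1000 if value & 0x10 else 0x0000,  # B
        # S is ignored by the PPU in 8x16 mode; it is still decoded here.
        "sprite_pattern_table": 0x1000 if value & 0x08 else 0x0000,
        "increment_mode": bool(value & 0x04),             # I: increment mode flag
        "nametable_select": value & 0x03,                 # NN: scroll bit 8 for Y and X
    }
```

For example, `decode_ppuctrl(0x90)` reports NMI enabled and the background pattern table at $1000, with all other fields at their zero settings.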
 
==== Vblank NMI ====


Equivalently, bits 0 and 1 are the most significant bit of the scrolling coordinates (see [[PPU_nametables|Nametables]] and [[#PPUSCROLL|PPUSCROLL]]):
Enabling NMI in PPUCTRL causes the NMI handler to be called at the start of vblank (scanline 241, dot 1). This provides a reliable time source for software so it can run at the display's frame rate, and it signals vblank to the software. Vblank is the only time with rendering enabled that the software can send data to VRAM and OAM, and this NMI is the ''only'' reliable way to detect vblank; polling the vblank flag in [[#PPUSTATUS|PPUSTATUS]] can miss vblank entirely.
 
Changing NMI enable from 0 to 1 while the vblank flag in [[#PPUSTATUS|PPUSTATUS]] is 1 will immediately trigger an NMI. This happens during vblank if the PPUSTATUS register has not yet been read. It can result in graphical glitches by making the NMI routine execute too late in vblank to finish on time, or cause the game to handle more frames than have actually occurred. To avoid this problem, it is prudent to read PPUSTATUS first to clear the vblank flag before enabling NMI in PPUCTRL.
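The immediate-NMI behavior can be illustrated with a small Python model. It is a deliberately simplified sketch: it tracks only the vblank flag and the NMI enable bit, ignores all dot-level timing, and the class and method names are hypothetical.

```python
class NmiModel:
    """Toy model of the interaction between PPUCTRL bit 7 and the vblank flag."""

    def __init__(self):
        self.vblank_flag = False
        self.nmi_enable = False

    def read_ppustatus(self):
        """Return the vblank flag in bit 7, then clear it (as a $2002 read does)."""
        value = 0x80 if self.vblank_flag else 0x00
        self.vblank_flag = False
        return value

    def write_ppuctrl(self, value):
        """Update NMI enable; return True if this write triggers an immediate NMI."""
        new_enable = bool(value & 0x80)
        # An NMI fires when enable goes 0 -> 1 while the vblank flag is still set.
        fires = new_enable and not self.nmi_enable and self.vblank_flag
        self.nmi_enable = new_enable
        return fires
```

In this model, calling `read_ppustatus()` before `write_ppuctrl(0x80)` clears the flag, so the write no longer reports an immediate NMI, which mirrors the recommended ordering.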
 
==== Scrolling ====
The current nametable bits in PPUCTRL bits 0 and 1 can equivalently be considered the most significant bits of the scroll coordinates, which are 9 bits wide (see [[PPU_nametables|Nametables]] and [[#PPUSCROLL|PPUSCROLL]]):
  7  bit  0
  7  bit  0
  ---- ----
  ---- ----
  .... ..YX
  .... ..YX
         ||
         ||
         |+- 1: Add 256 to the X scroll position
         |+- X scroll position bit 8 (i.e. add 256 to X)
         +-- 1: Add 240 to the Y scroll position
         +-- Y scroll position bit 8 (i.e. add 240 to Y)


Another way of seeing the explanation above is that when you reach the end of a nametable, you must switch to the next one, hence, changing the nametable address.
These two bits go to the same [[#Internal registers|internal t register]] as the values written to [[#PPUSCROLL|PPUSCROLL]], and they must be written alongside it in order to fully specify the scroll position.
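A minimal Python sketch of how the two PPUCTRL bits combine with the PPUSCROLL values, following the "add 256 to X" and "add 240 to Y" description above. The function name is hypothetical, and the sketch assumes a Y scroll value below 240; it does not model the internal t register.

```python
def scroll_position(ppuctrl, scroll_x, scroll_y):
    """Combine PPUCTRL bits 0-1 with the two PPUSCROLL bytes into (x, y)."""
    # Bit 0 of PPUCTRL is X scroll bit 8: it adds 256 to X.
    x = ((ppuctrl & 0x01) << 8) | (scroll_x & 0xFF)
    # Bit 1 of PPUCTRL selects the lower nametable: it adds 240 to Y.
    y = (240 if ppuctrl & 0x02 else 0) + (scroll_y & 0xFF)
    return x, y
```

With `ppuctrl = 0x01` and an X scroll of `0x10`, the resulting X coordinate is 272, i.e. 16 pixels into the right-hand nametable.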


[[PPU power up state|After power/reset]], writes to this register are ignored for about 30000 cycles.
==== Master/slave mode and the EXT pins ====
 
Bit 6 of PPUCTRL should never be set on stock consoles because it may damage the PPU.
When turning on the NMI flag in bit 7, if the PPU is currently in vertical blank and the [[#PPUSTATUS|PPUSTATUS]] ($2002) vblank flag is set, an NMI will be generated immediately.
This can result in graphical errors (most likely a misplaced scroll) if the NMI routine is executed too late in the blanking period to finish on time.
To avoid this problem it is prudent to read $2002 immediately before writing $2000 to clear the vblank flag.


==== Master/slave mode and the EXT pins ====
When this bit is clear (the usual case), the PPU gets the [[PPU_palettes|palette index]] for the backdrop color from the EXT pins. The stock NES grounds these pins, making palette index 0 the backdrop color as expected. A secondary picture generator connected to the EXT pins would be able to replace the backdrop with a different image using colors from the background palette, which could be used for features such as parallax scrolling.
When bit 6 of PPUCTRL is clear (the usual case), the PPU gets the [[PPU_palettes|palette index]] for the background color from the EXT pins. The stock NES grounds these pins, making palette index 0 the background color as expected. A secondary picture generator connected to the EXT pins would be able to replace the background with a different image using colors from the background palette, which could be used e.g. to implement parallax scrolling.


Setting bit 6 causes the PPU to output the lower four bits of the palette memory index on the EXT pins for each pixel (in addition to normal image drawing) - since only four bits are output, background and sprite pixels can't normally be distinguished this way. As the EXT pins are grounded on an unmodified NES, setting bit 6 is discouraged as it could potentially damage the chip whenever it outputs a non-zero pixel value (due to it effectively shorting Vcc and GND together). Looking at the relevant circuitry in [[Visual 2C02]], it appears that the [[PPU palettes|background palette hack]] would not be functional for output from the EXT pins; they would always output index 0 for the background color.
Setting bit 6 causes the PPU to output the lower four bits of the palette memory index on the EXT pins for each pixel. Since only four bits are output, background and sprite pixels can't normally be distinguished this way. Setting this bit does not affect the image in the PPU's composite video output. As the EXT pins are grounded on an unmodified NES, setting bit 6 is discouraged as it could potentially damage the chip whenever it outputs a non-zero pixel value (due to it effectively shorting Vcc and GND together). Note that EXT output for transparent pixels is not a backdrop color as normal, but rather entry 0 of that background sliver's palette. When rendering is disabled, EXT output is always index 0 regardless of [[PPU palettes|backdrop override]].


==== Bit 0 bus conflict ====
==== Bit 0 race condition ====
Be very careful when writing to this register outside vertical blanking if you are using vertical mirroring (horizontal arrangement) or 4-screen VRAM.
Be careful when writing to this register outside vblank if using a horizontal nametable arrangement (a.k.a. vertical mirroring) or 4-screen VRAM.
For specific CPU-PPU alignments, [http://forums.nesdev.org/viewtopic.php?p=112424#p112424 a write near the end of a visible scanline] may cause only the next scanline to be erroneously drawn from the left nametable.
For specific CPU-PPU alignments, [//forums.nesdev.org/viewtopic.php?p=112424#p112424 a write that starts] on [[PPU scrolling#At dot 257 of each scanline|dot 257]] will cause only the next scanline to be erroneously drawn from the left nametable.
This can cause a visible glitch.
This can cause a visible glitch, and it can also interfere with sprite 0 hit for that scanline (by being drawn with the wrong background).
Worse, it can theoretically cause a sprite 0 hit to fail, which may crash a game using a sprite 0 spin loop that's not resilient.


Only writes at the exact moment between active picture and horizontal blanking cause this glitch; well-timed mid-scanline writes do not, nor do writes that land well within horizontal blanking.
The glitch has no effect in horizontal or one-screen mirroring because the left and right nametables are identical.
The glitch has no effect in horizontal or one-screen mirroring.
Only writes that start on dot 257 and continue through dot 258 can cause this glitch: any other horizontal timing is safe.
It also does not appear if bit 0 of the written value is 0; this always correctly sets the left nametable.
The glitch specifically writes the value of open bus to the register, which will almost always be the upper byte of the address. Writing to this register or the mirror of this register at $2100 according to the desired nametable appears to be a [//forums.nesdev.org/viewtopic.php?p=230434#p230434 functional workaround].


This produces an occasionally [[Game bugs|visible glitch]] in ''Super Mario Bros.'' when the program writes to PPUCTRL at the end of game logic.
This produces an occasionally [[Game bugs|visible glitch]] in ''Super Mario Bros.'' when the program writes to PPUCTRL at the end of game logic.
It appears to be turning NMI off during game logic and then turning NMI back on once the game logic has finished in order to prevent the NMI handler from being called again before the game logic finishes.
It appears to be turning NMI off during game logic and then turning NMI back on once the game logic has finished in order to prevent the NMI handler from being called again before the game logic finishes.
To work around this in new productions, have your game logic set a flag that your NMI handler checks.
Another workaround is to use a software flag to prevent NMI reentry, instead of using the PPU's NMI enable.
 
=== <span id="PPUMASK"><span id="Reg2001">Mask ($2001) > write</span></span> ===
 
* Common name: '''PPUMASK'''
* Description: PPU mask register
* Access: write
 
This register controls the rendering of sprites and backgrounds, as well as colour effects.


{{Anchor|PPUMASK}}{{Anchor|Reg2001}}{{Anchor|Mask_($2001)_>_write}}
=== PPUMASK - Rendering settings ($2001 write) ===
----
  7  bit  0
  7  bit  0
  ---- ----
  ---- ----
  BGRs bMmG
  BGRs bMmG
  |||| ||||
  |||| ||||
  |||| |||+- Greyscale (0: normal color, 1: produce a greyscale display)
  |||| |||+- Greyscale (0: normal color, 1: greyscale)
  |||| ||+-- 1: Show background in leftmost 8 pixels of screen, 0: Hide
  |||| ||+-- 1: Show background in leftmost 8 pixels of screen, 0: Hide
  |||| |+--- 1: Show sprites in leftmost 8 pixels of screen, 0: Hide
  |||| |+--- 1: Show sprites in leftmost 8 pixels of screen, 0: Hide
  |||| +---- 1: Show background
  |||| +---- 1: Enable background rendering
  |||+------ 1: Show sprites
  |||+------ 1: Enable sprite rendering
  ||+------- Emphasize red*
  ||+------- Emphasize red (green on PAL/Dendy)
  |+-------- Emphasize green*
  |+-------- Emphasize green (red on PAL/Dendy)
  +--------- Emphasize blue*
  +--------- Emphasize blue
 
PPUMASK (the "mask" register) controls the rendering of sprites and backgrounds, as well as color effects. [[PPU power up state|After power/reset]], writes to this register are ignored until the first pre-render scanline.


<nowiki>*</nowiki> NTSC colors. PAL and Dendy swaps green and red<ref>PAL PPU swaps green and red emphasis bits: http://forums.nesdev.org/viewtopic.php?p=131889#p13188</ref><ref>Dendy PPU swaps green and red emphasis bits: http://forums.nesdev.org/viewtopic.php?p=155513#p155513</ref>.
Most commonly, PPUMASK is set to $00 outside of gameplay to allow transferring a large amount of data to VRAM, and $1E during gameplay to enable all rendering with no color effects.
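To make the two common values concrete, here is a small Python sketch that decodes a PPUMASK byte according to the bit diagram above. The dictionary keys and function names are made up for this example; emphasis bits are named using the NTSC order.

```python
PPUMASK_BITS = {
    "greyscale": 0x01,
    "show_background_left": 0x02,
    "show_sprites_left": 0x04,
    "background_enable": 0x08,
    "sprite_enable": 0x10,
    "emphasize_red": 0x20,    # green on PAL/Dendy
    "emphasize_green": 0x40,  # red on PAL/Dendy
    "emphasize_blue": 0x80,
}


def decode_ppumask(value):
    """Return a dict mapping each PPUMASK flag name to True or False."""
    return {name: bool(value & mask) for name, mask in PPUMASK_BITS.items()}


def rendering_enabled(value):
    """Rendering is on if background or sprite rendering (bit 3 or 4) is enabled."""
    return bool(value & 0x18)
```

Decoding `0x1E` shows both rendering components and both left-column flags set, with greyscale and emphasis clear, while `rendering_enabled(0x00)` is `False`.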


==== Render Control ====
==== Rendering control ====


* Bits 3 and 4 enable the rendering of background and sprites, respectively.
Rendering is the PPU's process of actively fetching memory and drawing an image to the screen. Rendering as a whole is enabled as long as one or both of sprite and background rendering is enabled in PPUMASK. If one component is enabled and the other is not, the disabled component is simply treated as transparent; the rendering process is otherwise unaffected. When both components are disabled via bits 3 and 4, the rendering process stops and the PPU displays the backdrop color.


* Bits 1 and 2 enable rendering of the background and sprites in the leftmost 8 pixel columns. Setting these bits to 0 will mask these columns, which is often useful in horizontal scrolling situations where you want partial sprites or tiles to scroll in from the left.
During rendering, the PPU is actively using VRAM and OAM. This prevents the CPU from being able to access VRAM via [[#PPUDATA|PPUDATA]] or OAM via [[#OAMDATA|OAMDATA]], so these accesses must be done outside of rendering: either during vblank (for data transfers during gameplay) or with rendering turned off (for large data transfers, such as when loading a level). To avoid numerous hardware bugs and limitations, it is generally recommended that rendering be turned on or off only during vblank. This can be done by writing the desired PPUMASK value to a variable rather than the register itself and then only copying that variable to PPUMASK during vblank in the NMI handler.


* A value of $1E enables all rendering, with no color effects. A value of $00 disables all rendering. It is usually best practice to write this register only during vblank, to prevent partial-frame visual artifacts.
The PPU can optionally hide sprites and backgrounds in just the leftmost 8 pixels of the screen, making them transparent and thus drawing the backdrop color there. For sprites, this can be useful to avoid sprite pop-in, a limitation where sprites cannot partially hang off the left edge of the screen like they can off the right edge. For backgrounds, this can eliminate tile artifacts and reduce attribute artifacts when scrolling horizontally with either a vertical or one-screen nametable arrangement, as these arrangements do not allow hiding the scroll seam off-screen. Note that the backdrop color may not match the color used by the art for the background, so disabling the left column may be more distracting than minor artifacts.


* If either of bits 3 or 4 is enabled, at any time outside of the vblank interval the PPU will be making continual use of the PPU address and data bus to fetch tiles to render, as well as internally fetching sprite data from the OAM. If you wish to make changes to PPU memory outside of vblank (via '''$2007'''), you must set ''both'' of these bits to 0 to disable rendering and prevent conflicts.
Notes:
* Writing to PPUDATA during rendering can corrupt VRAM, so writes must be done in vblank or with rendering disabled in PPUMASK bits 3 and 4.
* Sprite 0 hit does not trigger in any area where the background or sprites are disabled.
* Toggling rendering takes effect approximately 3-4 dots after the write. This delay is required by Battletoads to avoid a crash.
* Toggling rendering mid-screen often corrupts 1 row of OAM and draws incorrect sprites for the current and next scanline. (See: [[Errata#OAM and Sprites|Errata]])
* Turning rendering off mid-screen can corrupt palette RAM if the low 14 bits of the [[#Internal registers|internal v register]] have a value between $3C00-$3FFF.
* Turning rendering on late causes the dot at the end of pre-render to never be skipped, which can cause dot crawl on stationary screens.
* Turning rendering on late causes the PPU to have an incorrect scroll value unless it is [[PPU_scrolling#Split_X/Y_scroll|set manually with a complicated series of writes]].


* Disabling rendering (clear both bits 3 and 4) during a visible part of the frame can be problematic. It can cause a corruption of the sprite state, which will display incorrect sprite data on the next frame. (See: [[Errata]]) It is, however, perfectly fine to mask sprites but leave the background on (set bit 3, clear bit 4) at any time in the frame.
==== Color control ====


* Sprite 0 hit does not trigger in any area where the background or sprites are hidden.
Greyscale mode forces all colors to be a shade of grey or white. This is done by bitwise ANDing the color with $30, causing all colors to come from the grey column ($00, $10, $20, $30), which notably lacks a black color. Note that this AND behavior means that RGB PPUs with scrambled colors (the 2C04 series) do not actually get shades of grey, but rather whatever colors are in the $x0 column. When reading from palette RAM, the returned value reflects this AND behavior, but the underlying data is preserved. Palette writes function normally regardless of greyscale mode.
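The AND behavior can be shown with a short Python sketch. The function name is hypothetical, and the sketch only models the masking of a 6-bit palette value; it does not model palette RAM itself.

```python
def displayed_color(palette_entry, ppumask):
    """Return the color index actually used, applying greyscale if PPUMASK bit 0 is set."""
    color = palette_entry & 0x3F  # palette entries are 6 bits wide
    if ppumask & 0x01:
        # Greyscale keeps only the brightness bits, selecting the $x0 column.
        color &= 0x30
    return color
```

Under this model the black entry `$0F` becomes `$00` in greyscale mode, which is why the grey column "notably lacks a black color".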


==== Color Control ====
[[Color emphasis]] causes a color tint effect that works by darkening the other two color components, making the selected component comparatively brighter and thus emphasized. Emphasizing all 3 components simply dims all colors. This works independently of greyscale, allowing greys to be tinted. Note that PAL and Dendy PPUs have a different emphasis bit order, so ports and dual-region games should reorder the bits. Furthermore, emphasis on RGB PPUs is completely different, instead maximizing the brightness of the emphasized component and producing a completely white screen when all components are emphasized. RGB emphasis is far less useful and generally best avoided.


* Bit 0 controls a greyscale mode, which causes the palette to use only the colors from the grey column: $00, $10, $20, $30. This is implemented as a bitwise AND with $30 on any value read from PPU $3F00-$3FFF, both on the display and through [[#PPUDATA|PPUDATA]]. Writes to the palette through [[#PPUDATA|PPUDATA]] are not affected. Also note that black colours like $0F will be replaced by a non-black grey $00.
{{Anchor|PPUSTATUS}}{{Anchor|Reg2002}}{{Anchor|Status_($2002)_<_read}}
=== PPUSTATUS - Rendering events ($2002 read) ===
----
7  bit  0
---- ----
VSOx xxxx
|||| ||||
|||+-++++- ([[Open_bus_behavior#PPU_open_bus|PPU open bus]] or 2C05 PPU identifier)
||+------- [[PPU_sprite_evaluation#Sprite_overflow_bug|Sprite overflow]] flag
|+-------- [[PPU_OAM#Sprite_zero_hits|Sprite 0 hit]] flag
+--------- Vblank flag, cleared on read. <u>'''Unreliable'''</u>; see below.


* Bits 5,6,7 control a color "emphasis" or "tint" effect. Each bit emphasizes 1 color while darkening the other two. Setting all three emphasis bits will darken all colors.
PPUSTATUS (the "status" register) reflects the state of rendering-related events and is primarily used for timing. The three flags in this register are automatically cleared on dot 1 of the prerender scanline; see [[PPU rendering]] for more information on the set and clear timing.
** Bit 5 emphasizes red on the NTSC PPU, and green on the PAL & Dendy PPUs.
** Bit 6 emphasizes green on the NTSC PPU, and red on the PAL & Dendy PPUs.
** Bit 7 emphasizes blue on the NTSC, PAL, & Dendy PPUs.
** See [[NTSC video]] for a description of how bits 5-7 work on NTSC and PAL PPUs.
** The [[Vs. System|RGB PPU]] used by PlayChoice and some other systems treat the emphasis bits differently. Instead of darkening other RGB components, it forces one component to maximum brightness. [[Colour-emphasis games|A few games]], which set all three tint bits to darken all colors, are unplayable on these PPUs.


* The emphasis bits are applied independently of greyscale, so they will still tint the color of the grey image.
Reading this register has the side effect of clearing the PPU's [[#Internal registers|internal w register]]. It is commonly read before writes to [[#PPUSCROLL|PPUSCROLL]] and [[#PPUADDR|PPUADDR]] to ensure the writes occur in the correct order.
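The effect of the w register on two-write registers can be sketched as follows. This is a simplified Python model with hypothetical names: it keeps the two PPUADDR bytes in separate fields rather than modelling the real internal t and v registers.

```python
class AddressLatch:
    """Toy model of the write toggle (w) shared by PPUSCROLL and PPUADDR."""

    def __init__(self):
        self.w = 0      # 0: next write is the first of the pair, 1: the second
        self.high = 0
        self.low = 0

    def read_ppustatus(self):
        """A PPUSTATUS read resets w, so the next write is treated as the first."""
        self.w = 0

    def write_ppuaddr(self, value):
        if self.w == 0:
            self.high = value & 0x3F  # VRAM addresses are 14 bits wide
            self.w = 1
        else:
            self.low = value & 0xFF
            self.w = 0

    def address(self):
        return (self.high << 8) | self.low
```

If a stray first write has left `w` at 1, calling `read_ppustatus()` before writing `$3F` then `$00` still yields the address `$3F00`; without the read, the bytes would land in the wrong halves.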


* From a RGB color space, these values are used to create the color emphasis effect. Note that emphasis red+green becomes yellow, and so on.
==== Vblank flag ====


$2001:$E0      %Red  %Green  %Blue
The vblank flag is set at the start of vblank (scanline 241, dot 1). Reading PPUSTATUS will return the current state of this flag and then clear it. If the vblank flag is not cleared by reading, it will be cleared automatically on dot 1 of the prerender scanline.
001 Red        1.239  .915    .743
010 Green      .794  1.086    .882
011 Yellow    1.019  .980    .653
100 Blue        .905  1.026  1.277
101 Magenta    1.023  .908    .979
110 Cyan        .741  .987  1.001
111 Black      .750  .750    .750


=== <span id="PPUSTATUS"><span id="Reg2002">Status ($2002) < read</span></span> ===
<u>'''Reading the vblank flag is not a reliable way to detect vblank. [[NMI thread|NMI]] should be used instead.'''</u> Reading the flag on the dot before it is set (scanline 241, dot 0) causes it to read as 0 and be cleared, so polling PPUSTATUS for the vblank flag can miss vblank and cause games to stutter. NMI is also suppressed when this occurs, and may even be suppressed by reads landing on the following dot or two. On NTSC and PAL, it is guaranteed that the flag cannot be dropped two frames in a row, but on Dendy, it is possible for it to [[PPU_power_up_state#Dendy|happen every frame]], crashing the game. Using NMI ensures that software correctly detects vblank every frame. It is also required by PlayChoice-10, which will reject the game if NMI is disabled for too long. Polling the vblank flag is still required while booting up the console, but timing at this point is not critical (see [[Init code]] for more information on booting safely).


* Common name: '''PPUSTATUS'''
The vblank flag is used in the generation of NMI, and enabling NMI while this flag is 1 will cause an immediate NMI (see [[PPU_registers#Vblank_NMI|PPUCTRL]]).
* Description: PPU status register
* Access: read


This register reflects the state of various functions inside the PPU.
==== Sprite 0 hit flag ====
It is often used for determining timing.
<span id="Sprite_0">To determine when the PPU has reached a given pixel of the screen, put an opaque pixel of sprite 0 there.</span>


7  bit  0
[[PPU_OAM#Sprite_zero_hits|Sprite 0 hit]] is a hardware collision detection feature that detects pixel-perfect collision between the first sprite in OAM (sprite 0) and the background. The sprite 0 hit flag is immediately set when any opaque pixel of sprite 0 overlaps any opaque pixel of background, regardless of sprite priority. 'Opaque' means that the pixel is not 'transparent' — that is, its [[PPU_pattern_tables|two pattern bits]] are not %00. The flag stays set until dot 1 of the prerender scanline; thus, it can only detect one collision per frame.
---- ----
 
VSO. ....
Although this flag detects collision, it is primarily used for timing. Many games place sprite 0 at a fixed location on the screen and poll this flag until it becomes set. This allows the CPU to know its approximate location on the screen so it can time mid-screen writes to hardware registers. Commonly, this is used to change the scroll position mid-screen to allow for a background-based HUD, like in ''Super Mario Bros.'' However, some modern homebrew games use this for actual collision, such as [https://forums.nesdev.org/viewtopic.php?t=15850 ''Lunar Limit''] and [https://fiskbit.itch.io/irritating-ship ''Irritating Ship''].
|||| ||||
 
|||+-++++- Least significant bits previously written into a PPU register
Sprite 0 hit cannot detect collision at X=255, nor in any area where either sprites or backgrounds are disabled via [[#PPUMASK|PPUMASK]]. This includes X=0..7 when the leftmost 8 pixels are hidden. However, it is not affected by the cropping on the left and right edges on PAL.
|||        (due to register not being updated for this address)
||+------- Sprite overflow. The intent was for this flag to be set
||        whenever more than eight sprites appear on a scanline, but a
||        hardware bug causes the actual behavior to be more complicated
||        and generate false positives as well as false negatives; see
||        [[PPU sprite evaluation]]. This flag is set during sprite
||        evaluation and cleared at dot 1 (the second dot) of the
||        pre-render line.
|+-------- Sprite 0 Hit. Set when a nonzero pixel of sprite 0 overlaps
|          a nonzero background pixel; cleared at dot 1 of the pre-render
|          line. Used for raster timing.
+--------- Vertical blank has started (0: not in vblank; 1: in vblank).
            Set at dot 1 of line 241 (the line *after* the post-render
            line); cleared after reading $2002 and at dot 1 of the
            pre-render line.


==== Notes ====
There are some important considerations when using this flag for timing:
* Reading the status register will clear D7 mentioned above and also the address latch used by [[#PPUSCROLL|PPUSCROLL]] and [[#PPUADDR|PPUADDR]]. It does not clear the sprite 0 hit or overflow bit.
* Because sprite 0 hit is not cleared until the prerender scanline, software can potentially mistake the previous frame's hit as being from the current frame. Therefore, it may be necessary to poll the flag until it becomes clear before then polling for it to be set again.
* Once the sprite 0 hit flag is set, it will not be cleared until the end of the next vertical blank. If attempting to use this flag for raster timing, it is important to ensure that the sprite 0 hit check happens outside of vertical blank, otherwise the CPU will "leak" through and the check will fail.  The easiest way to do this is to place an earlier check for D6 = 0, which will wait for the pre-render scanline to begin.
* If a game expects sprite 0 hit to occur and it does not, this often results in a crash. If there is any risk that the hit may not occur (perhaps because an overlap may not happen when scrolling or because it relies on precise mid-screen timings that may vary across power cycles, consoles, or emulators), it can be critical to have another way to exit the poll loop. For example, this may be done by also polling the vblank flag or having the NMI handler check if the game is still polling for sprite 0 hit.
* If using sprite 0 hit to make a bottom scroll bar below a vertically scrolling or freely scrolling playfield, be careful to ensure that the tile in the playfield behind sprite 0 is opaque.
* Games often don't handle sprite 0 hit on lag frames, preventing the mid-screen event from occurring. A common result of this is HUD flickering during lag. Handling sprite 0 hit in the NMI handler, at least on lag frames, can work around this.
* Sprite 0 hit is not detected at x=255, nor is it detected at x=0 through 7 if the background or sprites are hidden in this area.
* See: [[PPU rendering]] for more information on the timing of setting and clearing the flags.
* Some [[Vs. System]] PPUs return a constant value in D4-D0 that the game checks.
* '''Caution:''' Reading PPUSTATUS at the exact start of vertical blank will return 0 in bit 7 but clear the latch anyway, causing the program to miss frames. See [[NMI]] for details.
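
The polling pattern described in the notes above can be sketched roughly as follows. This is a minimal ca65-style sketch rather than a definitive routine: the label <tt>wait_sprite0_hit</tt> and the carry-flag return convention are hypothetical, and it assumes the routine is called after sprite 0 has been positioned over an opaque background pixel. The first loop waits for the previous frame's hit to clear, and the second loop also watches bit 7 so that the CPU is not trapped if the hit never happens.

```asm
PPUSTATUS = $2002

; Hypothetical routine: returns with carry set if sprite 0 hit was seen,
; carry clear if vblank started first.
wait_sprite0_hit:
@wait_clear:
  bit PPUSTATUS       ; V flag <- bit 6 (sprite 0 hit); the read also clears bit 7
  bvs @wait_clear     ; still set from the previous frame: keep waiting
@wait_set:
  bit PPUSTATUS       ; N flag <- bit 7 (vblank), V flag <- bit 6 (sprite 0 hit)
  bmi @missed         ; vblank began before any hit: bail out
  bvc @wait_set       ; no hit yet: keep polling
  sec                 ; hit seen
  rts
@missed:
  clc                 ; no hit this frame
  rts
```

Because a read that lands exactly at the start of vertical blank can return 0 in bit 7, the vblank escape here is a safety net rather than a guarantee; having the NMI handler detect a stuck poll loop, as suggested above, is a more robust alternative.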


=== <span id="OAMADDR"><span id="Reg2003">OAM address ($2003) > write</span></span> ===  
==== Sprite overflow flag ====


* Common name: '''OAMADDR'''
The sprite overflow flag was intended to be set any time there are more than 8 sprites on a scanline. Unfortunately, the logic for detecting this does not work correctly, resulting in the PPU checking incorrect indices in OAM when searching for a 9th sprite. This produces both false positives and false negatives. See [[PPU sprite evaluation#Sprite_overflow_bug|PPU sprite evaluation]] for details on its incorrect behavior. In practice, sprite overflow is usually used for timing like sprite 0 hit, but because of its buggy behavior and its cost of 9 sprite tiles, it is generally only used when more than one timing source is required. Like sprite 0 hit, this flag is cleared at the start of the prerender scanline and can only be set once per frame.
* Description: OAM address port
* Access: write


Write the address of [[PPU OAM|OAM]] you want to access here.  Most games just write $00 here and then use [[#OAMDMA|OAMDMA]]. (DMA is implemented in the 2A03/7 chip and works by repeatedly writing to [[#OAMDATA|OAMDATA]])
Using sprite overflow is often a last resort. When mapper IRQs are not available, the [[APU_DMC#Usage_of_DMC_for_syncing_to_video|DMC IRQ]] can be an effective alternative for timing, albeit complicated to use.


==== Values during rendering ====
==== 2C05 identifier ====


OAMADDR is set to 0 during each of ticks 257-320 (the sprite tile loading interval) of the pre-render and visible scanlines.
The 2C05 series of arcade PPUs returns an identifier in bits 4-0 instead of PPU open bus. This value is checked by games as a form of copy protection. Note that this does not apply to the consumer 2C05-99, which returns open bus as usual. While we haven't yet collected data directly from the PPUs, 2C05 games expect the following values:


The value of OAMADDR when sprite evaluation starts at tick 65 of the visible scanlines will determine where in OAM sprite evaluation starts, and hence which sprite gets treated as sprite 0. The first OAM entry to be checked during sprite evaluation is the one starting at <tt>OAM[OAMADDR]</tt>. If OAMADDR is unaligned and does not point to the y position (first byte) of an OAM entry, then whatever it points to (tile index, attribute, or x coordinate) will be reinterpreted as a y position, and the following bytes will be similarly reinterpreted. No more sprites will be found once the end of OAM is reached, effectively hiding any sprites before <tt>OAM[OAMADDR]</tt>.
{| class="tabular"
! PPU
! Mask
! Value
|-
| 2C05-02
| $3F
| $3D
|-
| 2C05-03
| $1F
| $1C
|-
| 2C05-04
| $1F
| $1B
|}


==== OAMADDR precautions ====
{{Anchor|OAMADDR}}{{Anchor|Reg2003}}{{Anchor|OAM_address_($2003)_>_write}}
=== OAMADDR - Sprite RAM address ($2003 write) ===
----
7  bit  0
---- ----
AAAA AAAA
|||| ||||
++++-++++- OAM address


On the 2C02, writes to OAMADDR reliably corrupt OAM<!--, copying a pair of sprites from the old address value to the open-bus address high-byte of the OAMADDR write-->.<ref name = "OAMglitch">Manual OAM write glitchyness: http://forums.nesdev.org/viewtopic.php?f=2&t=10189</ref> This can then be worked around by writing all 256 bytes of OAM.
Write the address of [[PPU OAM|OAM]] you want to access here. Most games just write $00 here and then use [[#OAMDMA|OAMDMA]]. (DMA is implemented in the 2A03/7 chip and works by repeatedly writing to [[#OAMDATA|OAMDATA]])


It is also the case that if OAMADDR is not less than eight when rendering starts, the eight bytes starting at <tt>OAMADDR & 0xF8</tt> are copied to the first eight bytes of OAM; it seems likely that this is related. The former bug is known to have been fixed in the 2C07; the latter is suspected to be. On the Dendy, the latter bug is required for 2C02 compatibility.
==== Values during rendering ====


=== <span id="OAMDATA"><span id="Reg2004">OAM data ($2004) <> read/write</span></span> ===
OAMADDR is set to 0 during each of ticks 257–320 (the sprite tile loading interval) of the pre-render and visible scanlines. This also means that at the end of a normal complete rendered frame, OAMADDR will always have returned to 0.


* Common name: '''OAMDATA'''
If rendering is enabled mid-scanline<ref name="OAMADDR Clarification"/>, there are further consequences of an OAMADDR that was not set to 0 before OAM sprite evaluation begins at tick 65 of the visible scanline. The value of OAMADDR at this tick determines the starting address for sprite evaluation for this scanline, which can cause the sprite at OAMADDR to be treated as if it were sprite 0, both for [[sprite-0 hit]] and priority. If OAMADDR is unaligned and does not point to the Y position (first byte) of an OAM entry, then whatever it points to (tile index, attribute, or X coordinate) will be reinterpreted as a Y position, and the following bytes will be similarly reinterpreted. No more sprites will be found once the end of OAM is reached, effectively hiding any sprites before the starting OAMADDR.
* Description: OAM data port
* Access: read, write


Write OAM data here. Writes will increment [[#OAMADDR|OAMADDR]] after the write; reads during vertical or forced blanking return the value from OAM at that address but do not increment.
==== OAMADDR precautions ====


Because changes to OAM should normally be made only during vblank, writing through OAMDATA is only effective for partial updates (it is too slow). Most games will use the DMA feature through [[#OAMDMA|OAMDMA]] instead.
On the 2C02G, writes to OAMADDR corrupt OAM. The exact corruption isn't fully described, but this usually seems to copy sprites 8 and 9 (address $20) over the 8-byte row at the target address. The source address for this copy seems to come from the previous value on the CPU bus (most often $20 from the $2003 operand).<ref name="OAMADDR Clarification">[//forums.nesdev.org/viewtopic.php?p=285674#p285674 OAMDATA $2003 corruption clarification?] - forum thread</ref><ref name = "OAMglitch">[//forums.nesdev.org/viewtopic.php?t=10189 Manual OAM write glitchyness] thread by blargg</ref> There may be other possible behaviors as well. This can be worked around by writing all 256 bytes of OAM, though because of the limited time before [[PPU OAM#Dynamic RAM decay|OAM decay]] begins, this should normally be done through OAMDMA.


* Reading OAMDATA while the PPU is rendering will expose internal OAM accesses during sprite evaluation and loading; Micro Machines does this.
It is also the case that if OAMADDR is not less than eight when rendering starts, the eight bytes starting at <tt>OAMADDR & 0xF8</tt> are copied to the first eight bytes of OAM; it seems likely that this is related. On the Dendy, the latter bug is required for 2C02 compatibility.


* Writes to OAMDATA during rendering (on the pre-render line and the visible lines 0-239, provided either sprite or background rendering is enabled) do not modify values in OAM, but do perform a glitchy increment of [[#OAMADDR|OAMADDR]], bumping only the high 6 bits (i.e., it bumps the ''[n]'' value in [[PPU sprite evaluation]] - it's plausible that it could bump the low bits instead depending on the current status of sprite evaluation). This extends to DMA transfers via [[#OAMDMA|OAMDMA]], since that uses writes to $2004. For emulation purposes, it is probably best to completely ignore writes during rendering.
It is known that in the 2C03, 2C04, 2C05<ref name="noOAMglitch">[//forums.nesdev.org/viewtopic.php?p=179676#p179676 Writes to $2003 appear to not cause OAM corruption] post by lidnariq</ref>, and 2C07, OAMADDR works as intended. It is not known whether this bug is present in all revisions of the 2C02.


* It used to be thought that reading from this register wasn't reliable<ref>$2004 reading reliable? http://forums.nesdev.org/viewtopic.php?f=2&t=6424</ref>, however more recent evidence seems to suggest that this is solely due to corruption by [[#OAMADDR|OAMADDR]] writes.
{{Anchor|OAMDATA}}{{Anchor|Reg2004}}{{Anchor|OAM_data_($2004)_<>_read/write}}
=== OAMDATA - Sprite RAM data ($2004 read/write) ===
----
7  bit  0
---- ----
DDDD DDDD
|||| ||||
++++-++++- OAM data


* In the oldest instantiations of the PPU, as found on earlier Famicoms and NESes, this register is not readable<ref>$2003 not readable on early revisions: http://forums.nesdev.org/viewtopic.php?p=62137#p62137</ref>. The readability was added on the RP2C02G, found on most NESes and later Famicoms.<ref>hardware revisions and $2003 reads: http://forums.nesdev.org/viewtopic.php?f=2&t=12958&start=45#p150926</ref>
Write OAM data here. Writes will increment [[#OAMADDR|OAMADDR]] after the write; reads do not. Reads during vertical or forced blanking return the value from OAM at that address.


* In the 2C07, sprite evaluation can ''never'' be fully disabled, and will always start 20 scanlines after the start of vblank<ref>2C07 PPU sprite evaluation notes: http://forums.nesdev.org/viewtopic.php?f=9&t=11041</ref> (same as when the prerender scanline would have been on the 2C02). As such, you must upload anything to OAM that you intend to within the first 20 scanlines after the 2C07 signals vertical blanking.
'''Do not write directly to this register in most cases.''' Because changes to OAM should normally be made only during vblank, writing through OAMDATA is only effective for partial updates, as it is too slow to update all of OAM within one vblank interval, and as described above, partial writes cause corruption. Most games use the DMA feature through [[#OAMDMA|OAMDMA]] instead.


=== <span id="PPUSCROLL"><span id="Reg2005">Scroll ($2005) >> write x2</span></span> ===
* Reading OAMDATA while the PPU is rendering will expose internal OAM accesses during sprite evaluation and loading; ''Micro Machines'' does this.
* Writes to OAMDATA during rendering (on the pre-render line and the visible lines 0–239, provided either sprite or background rendering is enabled) do not modify values in OAM, but do perform a glitchy increment of [[#OAMADDR|OAMADDR]], bumping only the high 6 bits (i.e., it bumps the ''[n]'' value in [[PPU sprite evaluation]] – it's plausible that it could bump the low bits instead depending on the current status of sprite evaluation). This extends to DMA transfers via [[#OAMDMA|OAMDMA]], since that uses writes to $2004. For emulation purposes, it is probably best to completely ignore writes during rendering.
* It used to be thought that reading from this register wasn't reliable<ref>[//forums.nesdev.org/viewtopic.php?t=6424 $2004 reading reliable?] thread by blargg</ref>, however more recent evidence seems to suggest that this is solely due to corruption by [[#OAMADDR|OAMADDR]] writes.
* In the oldest instantiations of the PPU, as found on earlier Famicoms and NESes, this register is not readable<ref>[//forums.nesdev.org/viewtopic.php?p=62137#p62137 $2004 not readable on early revisions] reply by jsr</ref>. The readability was added on the RP2C02G, found on most NESes and later Famicoms.<ref>[//forums.nesdev.org/viewtopic.php?p=150926#p150926 hardware revisions and $2004 reads] reply by Great Hierophant</ref>
* In the 2C07, sprite evaluation can ''never'' be fully disabled, and will always start 24 scanlines after the start of vblank<ref>[//forums.nesdev.org/viewtopic.php?t=11041 2C07 PPU sprite evaluation notes] thread by thefox</ref> (same as when the prerender scanline would have been on the 2C02). As such, any updates to OAM should be done within the first 24 scanlines after the 2C07 signals vertical blanking.


* Common name: '''PPUSCROLL'''
{{Anchor|PPUSCROLL}}{{Anchor|Reg2005}}{{Anchor|Scroll_($2005)_>>_write_x2}}
* Description: PPU scrolling position register
=== PPUSCROLL - X and Y scroll ($2005 write) ===
* Access: write twice
----
1st write
7  bit  0
---- ----
XXXX XXXX
|||| ||||
++++-++++- X scroll bits 7-0 (bit 8 in PPUCTRL bit 0)
2nd write
7  bit  0
---- ----
YYYY YYYY
|||| ||||
++++-++++- Y scroll bits 7-0 (bit 8 in PPUCTRL bit 1)
 
This register is used to change the [[PPU scrolling|scroll position]], telling the PPU which pixel of the nametable selected through [[#PPUCTRL|PPUCTRL]] should be at the top left corner of the rendered screen. PPUSCROLL takes two writes: the first is the X scroll and the second is the Y scroll. Whether this is the first or second write is tracked internally by the [[#Internal_registers|w register]], which is shared with [[#PPUADDR|PPUADDR]]. Typically, this register is written to during vertical blanking to make the next frame start rendering from the desired location, but it can also be modified during rendering in order to split the screen. Changes made to the vertical scroll during rendering will only take effect on the next frame. Together with the nametable bits in PPUCTRL, the scroll can be thought of as 9 bits per component, and PPUCTRL must be updated along with PPUSCROLL to fully specify the scroll position.


This register is used to change the [[PPU scrolling|scroll position]], that is, to tell the PPU which pixel of the nametable selected through [[#PPUCTRL|PPUCTRL]] should be at the top left corner of the rendered screen. Typically, this register is written to during vertical blanking, so that the next frame starts rendering from the desired location, but it can also be modified during rendering in order to split the screen. Changes made to the vertical scroll during rendering will only take effect on the next frame.
{{mbox
| type = warning
| text = <font size=+1>The PPU scroll registers [[PPU_scrolling#PPU_internal_registers|share internal state]] with the PPU address registers. Because of this, PPUSCROLL and the nametable bits in PPUCTRL should be written ''after'' any writes to PPUADDR.</font>}}


After reading [[#PPUSTATUS|PPUSTATUS]] to reset the address latch, write the horizontal and vertical scroll offsets here just before turning on the screen:
After reading [[#PPUSTATUS|PPUSTATUS]] to clear [[#Internal_registers|w (the write latch)]], write the horizontal and vertical scroll offsets to PPUSCROLL just before turning on the screen:


  ; Set the high bit of X and Y scroll.
  lda ppuctrl_value
  ora current_nametable
  sta PPUCTRL
  ; Set the low 8 bits of X and Y scroll.
   bit PPUSTATUS
   bit PPUSTATUS
  ; possibly other code goes here
   lda cam_position_x
   lda cam_position_x
   sta PPUSCROLL
   sta PPUSCROLL
Line 279: Line 312:
   sta PPUSCROLL
   sta PPUSCROLL


Horizontal offsets range from 0 to 255. "Normal" vertical offsets range from 0 to 239, while values of 240 to 255 are treated as -16 through -1 in a way, but tile data is incorrectly fetched from the attribute table.
Horizontal offsets range from 0 to 255. "Normal" vertical offsets range from 0 to 239, while values of 240 to 255 cause the attribute data at the end of the current nametable to be used incorrectly as tile data. The PPU normally skips from 239 to 0 of the next nametable automatically, so these "invalid" scroll positions only occur if explicitly written.
 
By changing the values here across several frames and writing tiles to newly revealed areas of the nametables, one can achieve the effect of a camera panning over a large background.


=== <span id="PPUADDR"><span id="Reg2006">Address ($2006) >> write x2</span></span> ===
By changing the scroll values here across several frames and writing tiles to newly revealed areas of the nametables, one can achieve the effect of a camera panning over a large background.


* Common name: '''PPUADDR'''
{{Anchor|PPUADDR}}{{Anchor|Reg2006}}{{Anchor|Address_($2006)_>>_write_x2}}
* Description: PPU address register
=== PPUADDR - VRAM address ($2006 write) ===
* Access: write twice
----
1st write 2nd write
15 bit  8  7  bit  0
---- ----  ---- ----
..AA AAAA  AAAA AAAA
  || ||||  |||| ||||
  ++-++++--++++-++++- VRAM address


Because the CPU and the PPU are on separate buses, neither has direct access to the other's memory.
Because the CPU and the PPU are on separate buses, neither has direct access to the other's memory. The CPU writes to VRAM through a pair of registers on the PPU by first loading an address into [[#PPUADDR|PPUADDR]] and then writing data repeatedly to [[#PPUDATA|PPUDATA]]. The VRAM address only needs to be set once for every series of data writes because each PPUDATA access automatically increments the address by 1 or 32, as configured in [[#PPUCTRL|PPUCTRL]].
The CPU writes to VRAM through a pair of registers on the PPU.
First it loads an address into [[#PPUADDR|PPUADDR]], and then it writes repeatedly to [[#PPUDATA|PPUDATA]] to fill VRAM.


After reading [[#PPUSTATUS|PPUSTATUS]] to reset the address latch, write the 16-bit address of VRAM you want to access here, upper byte first.
The 16-bit address is written to PPUADDR one byte at a time, high byte first. Whether this is the first or second write is tracked by the PPU's [[#Internal_registers|internal w register]], which is shared with [[#PPUSCROLL|PPUSCROLL]]. If w is not 0 or its state is not known, it must be cleared by reading [[#PPUSTATUS|PPUSTATUS]] before writing the address. For example, to set the VRAM address to $2108 after w is known to be 0:
For example, to set the VRAM address to $2108:


   lda #$21
   lda #$21
Line 301: Line 335:
   sta PPUADDR
   sta PPUADDR


Valid addresses are $0000-$3FFF; higher addresses will be [[mirroring|mirrored]] down.
The [[PPU_memory_map|PPU address space]] is 14-bit, spanning $0000–$3FFF. Bits 14 and 15 of the value written to this register are ignored. However, bit 14 of the [[#Internal_registers|internal t register]] that holds the data written to PPUADDR is forced to 0 when writing the PPUADDR high byte. This detail doesn't matter when using PPUADDR to set a VRAM address, but is an important limitation when using it to control mid-screen scrolling (see [[PPU scrolling]] for more information).
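
The address-then-data sequence described above can be sketched as a short copy loop. This is only an illustrative ca65-style fragment: the routine name <tt>write_tiles</tt>, the <tt>tiles</tt> table, and its contents are hypothetical placeholders. It assumes the code runs during vertical or forced blanking, that PPUCTRL selects an increment of 1, and that the scroll position is rewritten afterwards.

```asm
PPUSTATUS = $2002
PPUADDR   = $2006
PPUDATA   = $2007

TILES_LEN = 4         ; number of bytes in the hypothetical table below

; Hypothetical routine: copy a short run of tile indices to VRAM at $2108.
write_tiles:
  bit PPUSTATUS       ; clear w so the next PPUADDR write is the high byte
  lda #$21
  sta PPUADDR         ; high byte
  lda #$08
  sta PPUADDR         ; low byte: the VRAM address is now $2108
  ldx #0
@loop:
  lda tiles,x
  sta PPUDATA         ; each write advances the VRAM address automatically
  inx
  cpx #TILES_LEN
  bne @loop
  rts

tiles:
  .byte $01, $02, $03, $04   ; placeholder tile indices
```

The address is set only once; the loop relies on the automatic increment after each PPUDATA write to step through consecutive nametable bytes.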


==== note ====
==== Note ====
Access to [[#PPUSCROLL|PPUSCROLL]] and [[#PPUADDR|PPUADDR]] during screen refresh produces interesting raster effects; the starting position of each scanline can be set to any pixel position in nametable memory. For more information, see [[PPU scrolling]] and tokumaru's sample code on the BBS.<ref>PPU synchronization from NMI: http://forums.nesdev.org/viewtopic.php?p=64111#p64111</ref>
Access to [[#PPUSCROLL|PPUSCROLL]] and [[#PPUADDR|PPUADDR]] during screen refresh produces interesting raster effects; the starting position of each scanline can be set to any pixel position in nametable memory. For more information, see [[PPU scrolling]].


''' Editor's note:''' Last comment about external page should be re-directed to the getting started section instead.
==== Palette corruption ====
In specific circumstances, entries of the PPU's palette can be corrupted. It's unclear exactly how or why this happens, but all revisions of the NTSC PPU seem to be at least somewhat susceptible.<ref>[//forums.nesdev.org/viewtopic.php?t=23209 Problem with palette discoloration when PPU is turned off during rendering] thread by N·K</ref>


=== <span id="PPUDATA"><span id="Reg2007">Data ($2007) <> read/write</span></span> ===
When done writing to palette memory, the workaround is to always:
# Update the address, if necessary, so that it's pointing at $3F00, $3F10, $3F20, or any other mirror.
# Only then change the address to point outside of palette memory.


* Common name: '''PPUDATA'''
A code fragment to implement this workaround is present in vast numbers of games:<ref>[//forums.nesdev.org/viewtopic.php?p=280899#p280899 Weird PPU writes] thread by Fiskbit</ref>
* Description: PPU data port
* Access: read, write


VRAM read/write data register. After access, the video memory address will increment by an amount determined by $2000:2.
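  lda #$3F
  sta PPUADDR   ; high byte: address is now $3Fxx
  lda #0
  sta PPUADDR   ; low byte: address is $3F00 (palette memory)
  sta PPUADDR   ; high byte $00
  sta PPUADDR   ; low byte $00: address is $0000, outside palette memory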
 
==== Bus conflict ====
During raster effects, if the second write to PPUADDR happens at specific times, at most one axis of scrolling will be set to the bitwise AND of the written value and the current value. The only safe time to finish the second write is during blanking; see [[PPU scrolling]] for more specific timing. [//forums.nesdev.org/viewtopic.php?p=230391#p230391]
 
{{Anchor|PPUDATA}}{{Anchor|Reg2007}}{{Anchor|Data_($2007)_<>_read/write}}
=== PPUDATA - VRAM data ($2007 read/write) ===
----
7  bit  0
---- ----
DDDD DDDD
|||| ||||
++++-++++- VRAM data
 
VRAM read/write data register. After access, the video memory address will increment by an amount determined by bit 2 of $2000.
 
When rendering is turned off by clearing the background and sprite enable flags in [[#PPUMASK|PPUMASK]], or during vertical blank, data can be read from or written to VRAM through this port. Since accessing this register increments the VRAM address, it should not be accessed outside vertical or forced blanking: doing so causes graphical glitches and, for writes, stores the data at an unpredictable VRAM address. However, a handful of games are known to read from PPUDATA during rendering, causing scroll position changes. See [[PPU scrolling#$2007 reads and writes|PPU scrolling]] and [[Tricky-to-emulate games]].
 
VRAM reading and writing shares the same internal address register that rendering uses. Therefore, after loading data into video memory, the program should reload the scroll position afterwards with [[#PPUSCROLL|PPUSCROLL]] and [[#PPUCTRL|PPUCTRL]] (bits 1-0) writes in order to avoid wrong scrolling.
 
{{Anchor|The PPUDATA read buffer (post-fetch)}}
==== The PPUDATA read buffer ====
 
Reading from PPUDATA does not directly return the value at the current VRAM address, but instead returns the contents of an internal read buffer. This read buffer is updated on every PPUDATA read, but only '''after''' the previous contents have been returned to the CPU, effectively delaying PPUDATA reads by one. This is because PPU bus reads are too slow and cannot complete in time to service the CPU read. Because of this read buffer, after the VRAM address has been set through PPUADDR, one should first read PPUDATA to prime the read buffer (ignoring the result) before then reading the desired data from it.


When the screen is turned off by disabling the background/sprite rendering flag with the [[#PPUMASK|PPUMASK]] or during vertical blank, you can read or write data from VRAM through this port. Since accessing this register increments the VRAM address, it should not be accessed outside vertical or forced blanking because it will cause graphical glitches, and if writing, write to an unpredictable address in VRAM. However, two games are known to [[Reading 2007 during rendering|read from PPUDATA during rendering]]: see [[Tricky-to-emulate games]].
Note that the read buffer is updated '''only''' on PPUDATA reads. It is not affected by writes or other PPU processes such as rendering, and it maintains its value indefinitely until the next read.
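
The priming read described above might look like the following in practice. This is a minimal ca65-style sketch under stated assumptions: the routine name <tt>read_vram_byte</tt> is hypothetical, the target address $2108 is just an example outside palette RAM, and the code is assumed to run during vertical or forced blanking.

```asm
PPUSTATUS = $2002
PPUADDR   = $2006
PPUDATA   = $2007

; Hypothetical routine: return the byte at VRAM $2108 in A.
read_vram_byte:
  bit PPUSTATUS       ; clear w so the next PPUADDR write is the high byte
  lda #$21
  sta PPUADDR         ; high byte
  lda #$08
  sta PPUADDR         ; low byte: the VRAM address is now $2108
  lda PPUDATA         ; priming read: returns the stale buffer contents,
                      ; then the buffer is loaded with the byte at $2108
  lda PPUDATA         ; returns the byte from $2108; the buffer now holds
                      ; the byte at the next (incremented) address
  rts
```

When reading a run of consecutive bytes, only the first read needs to be discarded; each later read returns the byte fetched by the read before it.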


VRAM reading and writing shares the same internal address register that rendering uses. So after loading data into video memory, the program should reload the scroll position afterwards with [[#PPUSCROLL|PPUSCROLL]] writes in order to avoid wrong scrolling.
==== Reading palette RAM ====


==== The PPUDATA read buffer (post-fetch) ====
Later PPUs added an unreliable feature for reading palette data from $3F00-$3FFF. These reads work differently than standard VRAM reads, as palette RAM is a separate memory space internal to the PPU that is overlaid onto the PPU address space. The referenced 6-bit palette data is returned immediately instead of going to the internal read buffer, and hence no priming read is required. Simultaneously, the PPU also performs a normal read from PPU memory at the specified address, "underneath" the palette data, and the result of this read goes into the read buffer as normal. The old contents of the read buffer are discarded when reading palettes, but by changing the address to point outside palette RAM and performing one read, the contents of this shadowed memory ([[PPU memory map|usually mirrored nametables]]) can be accessed. On PPUs that do not support reading palette RAM, this memory range behaves the same as the rest of PPU memory.


When reading while the VRAM address is in the range 0-$3EFF (i.e., before the palettes), the read will return the contents of an internal read buffer. This internal buffer is updated '''only''' when reading [[#PPUDATA|PPUDATA]], and so is preserved across frames. After the CPU reads and gets the contents of the internal buffer, the PPU will immediately update the internal buffer with the byte at the current VRAM address. Thus, after setting the VRAM address, one should first read this register and discard the result.
This feature is supported by the 2C02G, 2C02H, and PAL PPUs. The byte returned when reading palettes contains [[Open_bus_behavior#PPU_open_bus|PPU open bus]] in the top 2 bits, and the value is returned after it is modified by greyscale mode, which clears the bottom 4 bits if enabled. Unfortunately, on some consoles, palette reads can be corrupted on one of the 4 CPU/PPU alignments relative to the master clock. This corruption depends on when the [[PPU pinout|PPU /CS]] signal that indicates register access is deasserted, which varies by console. Combined with this feature not being present in all PPUs, developers should not rely on reading from palette RAM.


Reading palette data from $3F00-$3FFF works differently. The palette data is placed immediately on the data bus, and hence no dummy read is required. Reading the palettes still updates the internal buffer though, but the data placed in it is the mirrored nametable data that would appear "underneath" the palette. (Checking the [[PPU memory map]] should make this clearer.)
==== Read conflict with DPCM samples ====


=== <span id="Reg4014"><span id="OAMDMA">OAM DMA ($4014) > write</span></span> ===
If DPCM samples are playing, an APU sample fetch that coincides with an instruction reading $2007 can cause an extra read cycle. This causes an extra increment and a byte to be skipped, resulting in the wrong data being read. See [[APU_DMC#Conflict with controller and PPU read|APU DMC]].


* Common name: '''OAMDMA'''
{{Anchor|OAMDMA}}{{Anchor|Reg4014}}
* Description: OAM DMA register (high byte)
=== OAMDMA - Sprite DMA ($4014 write) ===
* Access: write
----
7  bit  0
---- ----
AAAA AAAA
|||| ||||
++++-++++- Source page (high byte of source address)


This port is located on the CPU. Writing $XX will upload 256 bytes of data from CPU page $XX00-$XXFF to the internal PPU OAM. This page is typically located in internal RAM, commonly $0200-$02FF, but cartridge RAM or ROM can be used as well.
OAMDMA is a CPU register that suspends the CPU so it can quickly copy a page of CPU memory to PPU OAM using [[DMA]]. It always copies 256 bytes and the source address always starts page-aligned (ending in $00). The value written to this register is the high byte of the source address, and the copy begins on the cycle immediately after the write. The copy takes 513 or 514 cycles and is implemented as 256 pairs of a read from CPU memory and a write to [[#OAMDATA|OAMDATA]]. Because vblank is so short and because changing [[#OAMADDR|OAMADDR]] often corrupts OAM, OAM DMA is normally the only realistic option for updating sprites each frame. 0 should be written to OAMADDR before initiating DMA to ensure the data is properly aligned and [[Errata|to avoid corruption]].<ref name = "OAMglitch" /> While OAM DMA is possible to do mid-frame while rendering is disabled, it is normally only done in vblank.


* The CPU is suspended during the transfer, which will take 513 or 514 cycles after the $4014 write tick. (1 dummy read cycle while waiting for writes to complete, +1 if on an odd CPU cycle, then 256 alternating read/write cycles.)
OAM consists of dynamic RAM (DRAM) which decays if not refreshed often enough, and this requires different considerations on NTSC and PAL. Refresh happens automatically any time a row of DRAM is read or written, so it is refreshed every scanline during rendering by the sprite evaluation process. On NTSC, vblank is short enough that OAM will not decay before rendering begins again, so OAM DMA can be done anytime in vblank. On PAL, vblank is much longer, so to avoid decay during that time, the PPU automatically performs a forced refresh starting 24 scanlines after NMI, during which OAM cannot be written. This means that OAM DMA is limited to the start of vblank on PAL. Note that NTSC vblank is shorter than 24 PAL scanlines, so NTSC-compatible NMI handlers will finish before the forced refresh and therefore should work on PAL regardless of their OAM DMA timing. In either case, OAM does not decay if it is not updated during vblank, and in fact it should generally not be updated on lag frames (frames where the CPU did not finish its work before vblank) to avoid copying incomplete sprite data to the PPU.
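
A typical per-frame sprite upload, including the lag-frame check mentioned above, could be sketched like this. It is a hedged ca65-style example, not a canonical routine: <tt>upload_sprites</tt>, <tt>oam_buffer</tt> (assumed to be a page-aligned shadow copy of OAM at $0200), and the zero-page flag <tt>sprites_ready</tt> are all hypothetical names chosen for illustration, and the fragment is assumed to be called early in the NMI handler.

```asm
OAMADDR = $2003
OAMDMA  = $4014

oam_buffer    = $0200   ; assumed page-aligned shadow copy of OAM in CPU RAM
sprites_ready = $10     ; hypothetical zero-page flag set by the main loop

; Hypothetical fragment intended to be called early in the NMI handler.
upload_sprites:
  lda sprites_ready
  beq @skip           ; lag frame: the buffer may be incomplete, so leave OAM alone
  lda #$00
  sta OAMADDR         ; start the copy at OAM address 0
  lda #>oam_buffer    ; high byte of the source page ($02)
  sta OAMDMA          ; the CPU is suspended for 513 or 514 cycles
  lda #$00
  sta sprites_ready   ; consume the flag until the main loop sets it again
@skip:
  rts
```

Doing the upload at the start of the NMI handler keeps it inside NTSC vblank and, per the timing described above, ahead of the forced refresh on PAL.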


* The OAM DMA is the only effective method for initializing all 256 bytes of OAM. Because of the decay of OAM's dynamic RAM when rendering is disabled, the initialization should take place within vblank. Writes through [[#OAMDATA|OAMDATA]] are generally too slow for this task.
== Internal registers ==


* The DMA transfer will begin at the current OAM write address. It is common practice to initialize it to 0 with a write to [[#OAMADDR|OAMADDR]] before the DMA transfer. Different starting addresses can be used for a simple OAM cycling technique, to alleviate sprite priority conflicts by flickering. If using this technique, after the DMA [[#OAMADDR|OAMADDR]] should be set to 0 before the end of vblank to prevent potential OAM corruption (See: [[Errata]]). However, due to OAMADDR writes also having a "corruption" effect<ref name = "OAMglitch" /> this technique is not recommended.
The PPU also has 4 internal registers, described in detail on [[PPU scrolling#PPU internal registers|PPU scrolling]]:
* '''v''': During rendering, used for the scroll position. Outside of rendering, used as the current VRAM address.
* '''t''': During rendering, specifies the starting coarse-x scroll for the next scanline and the starting y scroll for the screen. Outside of rendering, holds the scroll or VRAM address before transferring it to v.
* '''x''': The fine-x position of the current scroll, used during rendering alongside v.
* '''w''': Toggles on each write to either [[#PPUSCROLL|PPUSCROLL]] or [[#PPUADDR|PPUADDR]], indicating whether this is the first or second write. Clears on reads of [[#PPUSTATUS|PPUSTATUS]]. Sometimes called the 'write latch' or 'write toggle'.


== References ==
== References ==
<references />
<references />

Latest revision as of 19:52, 24 October 2024

The PPU exposes eight memory-mapped registers to the CPU. These nominally sit at $2000 through $2007 in the CPU's address space, but because their addresses are incompletely decoded, they're mirrored in every 8 bytes from $2008 through $3FFF. For example, a write to $3456 is the same as a write to $2006.

The PPU starts rendering immediately after power-on or reset, but ignores writes to most registers (specifically $2000, $2001, $2005 and $2006) until reaching the pre-render scanline of the next frame; more specifically, for around 29658 NTSC CPU cycles or 33132 PAL CPU cycles, assuming the CPU and PPU are reset at the same time. See PPU power up state and Init code for details.

Summary

Common Name | Address | Bits | Type | Notes
PPUCTRL | $2000 | VPHB SINN | W | NMI enable (V), PPU master/slave (P), sprite height (H), background tile select (B), sprite tile select (S), increment mode (I), nametable select / X and Y scroll bit 8 (NN)
PPUMASK | $2001 | BGRs bMmG | W | color emphasis (BGR), sprite enable (s), background enable (b), sprite left column enable (M), background left column enable (m), greyscale (G)
PPUSTATUS | $2002 | VSO- ---- | R | vblank (V), sprite 0 hit (S), sprite overflow (O); read resets write pair for $2005/$2006
OAMADDR | $2003 | AAAA AAAA | W | OAM read/write address
OAMDATA | $2004 | DDDD DDDD | RW | OAM data read/write
PPUSCROLL | $2005 | XXXX XXXX YYYY YYYY | Wx2 | X and Y scroll bits 7-0 (two writes: X scroll, then Y scroll)
PPUADDR | $2006 | ..AA AAAA AAAA AAAA | Wx2 | VRAM address (two writes: most significant byte, then least significant byte)
PPUDATA | $2007 | DDDD DDDD | RW | VRAM data read/write
OAMDMA | $4014 | AAAA AAAA | W | OAM DMA high address

Register types:

  • R - Readable
  • W - Writeable
  • x2 - Internal 2-byte state accessed by two 1-byte accesses

MMIO registers

The PPU has an internal data bus that it uses for communication with the CPU. This bus, called _io_db in Visual 2C02 and PPUGenLatch in FCEUX,[1] behaves as an 8-bit dynamic latch due to capacitance of very long traces that run to various parts of the PPU. Writing any value to any PPU port, even to the nominally read-only PPUSTATUS, will fill this latch. Reading any readable port (PPUSTATUS, OAMDATA, or PPUDATA) also fills the latch with the bits read. Reading a nominally "write-only" register returns the latch's current value, as do the unused bits of PPUSTATUS. This value begins to decay after a frame or so, faster once the PPU has warmed up, and it is likely that values with alternating bit patterns (such as $55 or $AA) will decay faster.[2]

PPUCTRL - Miscellaneous settings ($2000 write)


7  bit  0
---- ----
VPHB SINN
|||| ||||
|||| ||++- Base nametable address
|||| ||    (0 = $2000; 1 = $2400; 2 = $2800; 3 = $2C00)
|||| |+--- VRAM address increment per CPU read/write of PPUDATA
|||| |     (0: add 1, going across; 1: add 32, going down)
|||| +---- Sprite pattern table address for 8x8 sprites
||||       (0: $0000; 1: $1000; ignored in 8x16 mode)
|||+------ Background pattern table address (0: $0000; 1: $1000)
||+------- Sprite size (0: 8x8 pixels; 1: 8x16 pixels – see PPU OAM#Byte 1)
|+-------- PPU master/slave select
|          (0: read backdrop from EXT pins; 1: output color on EXT pins)
+--------- Vblank NMI enable (0: off, 1: on)

PPUCTRL (the "control" or "controller" register) contains a mix of settings related to rendering, scroll position, vblank NMI, and dual-PPU configurations. After power/reset, writes to this register are ignored until the first pre-render scanline.

Vblank NMI

Enabling NMI in PPUCTRL causes the NMI handler to be called at the start of vblank (scanline 241, dot 1). This provides a reliable time source for software so it can run at the display's frame rate, and it signals vblank to the software. Vblank is the only time with rendering enabled that the software can send data to VRAM and OAM, and this NMI is the only reliable way to detect vblank; polling the vblank flag in PPUSTATUS can miss vblank entirely.

Changing NMI enable from 0 to 1 while the vblank flag in PPUSTATUS is 1 will immediately trigger an NMI. This happens during vblank if the PPUSTATUS register has not yet been read. It can result in graphical glitches by making the NMI routine execute too late in vblank to finish on time, or cause the game to handle more frames than have actually occurred. To avoid this problem, it is prudent to read PPUSTATUS first to clear the vblank flag before enabling NMI in PPUCTRL.
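The precaution above can be sketched as follows. This is a minimal example in ca65 syntax, not code taken from any particular game; the `enable_nmi` label and the choice to leave every other PPUCTRL bit at 0 are assumptions made for illustration.

```asm
; ca65 syntax assumed. Register names follow this article.
PPUCTRL   = $2000
PPUSTATUS = $2002

enable_nmi:
  bit PPUSTATUS     ; reading $2002 clears the vblank flag (bit 7)
  lda #%10000000    ; V = 1; all other PPUCTRL bits are 0 in this sketch
  sta PPUCTRL       ; NMI enable goes 0 -> 1 with the flag already cleared
  rts
```

A real program would normally OR the NMI bit into its own copy of the PPUCTRL value rather than writing a constant, so that the nametable, increment, and pattern table bits are preserved.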

Scrolling

The nametable select bits in PPUCTRL bits 0 and 1 can equivalently be considered the most significant bits of the X and Y scroll coordinates, each of which is 9 bits wide (see Nametables and PPUSCROLL):

7  bit  0
---- ----
.... ..YX
       ||
       |+- X scroll position bit 8 (i.e. add 256 to X)
       +-- Y scroll position bit 8 (i.e. add 240 to Y)

These two bits go to the same internal t register as the values written to PPUSCROLL, and they must be written alongside PPUSCROLL in order to fully specify the scroll position.

Master/slave mode and the EXT pins

Bit 6 of PPUCTRL should never be set on stock consoles because it may damage the PPU.

When this bit is clear (the usual case), the PPU gets the palette index for the backdrop color from the EXT pins. The stock NES grounds these pins, making palette index 0 the backdrop color as expected. A secondary picture generator connected to the EXT pins would be able to replace the backdrop with a different image using colors from the background palette, which could be used for features such as parallax scrolling.

Setting bit 6 causes the PPU to output the lower four bits of the palette memory index on the EXT pins for each pixel. Since only four bits are output, background and sprite pixels can't normally be distinguished this way. Setting this bit does not affect the image in the PPU's composite video output. As the EXT pins are grounded on an unmodified NES, setting bit 6 is discouraged as it could potentially damage the chip whenever it outputs a non-zero pixel value (due to it effectively shorting Vcc and GND together). Note that EXT output for transparent pixels is not a backdrop color as normal, but rather entry 0 of that background sliver's palette. When rendering is disabled, EXT output is always index 0 regardless of backdrop override.

Bit 0 race condition

Be careful when writing to this register outside vblank if using a horizontal nametable arrangement (a.k.a. vertical mirroring) or 4-screen VRAM. For specific CPU-PPU alignments, a write that starts on dot 257 will cause only the next scanline to be erroneously drawn from the left nametable. This can cause a visible glitch, and it can also interfere with sprite 0 hit for that scanline (by being drawn with the wrong background).

The glitch has no effect in horizontal or one-screen mirroring because the left and right nametables are identical. Only writes that start on dot 257 and continue through dot 258 can cause this glitch: any other horizontal timing is safe. The glitch specifically writes the value of open bus to the register, which will almost always be the upper byte of the address ($20 for a write to $2000). Choosing between $2000 and the mirror of this register at $2100, so that bit 0 of the address's upper byte matches the desired nametable bit 0, appears to be a functional workaround.
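One way the mirror-selection workaround might look is sketched below in ca65 syntax. The `ppuctrl_value` variable, the `PPUCTRL_MIRROR` name, and the routine label are hypothetical; the only facts taken from the text are that $2100 mirrors $2000 and that the glitch substitutes the upper address byte.

```asm
; ca65 syntax assumed. All names except PPUCTRL are hypothetical.
PPUCTRL        = $2000
PPUCTRL_MIRROR = $2100   ; same register; upper address byte $21 has bit 0 set

.zeropage
ppuctrl_value: .res 1    ; hypothetical copy of the value to be written

.code
write_ppuctrl_midframe:
  lda ppuctrl_value
  lsr a                  ; carry = bit 0 (X nametable select)
  lda ppuctrl_value      ; reload the full value; LDA leaves carry unchanged
  bcs @bit0_set
  sta PPUCTRL            ; bit 0 = 0 agrees with bit 0 of $20
  rts
@bit0_set:
  sta PPUCTRL_MIRROR     ; bit 0 = 1 agrees with bit 0 of $21
  rts
```

With this arrangement, if the glitch does substitute the upper address byte for one dot, the substituted nametable bit 0 is the same as the intended one.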

This produces an occasionally visible glitch in Super Mario Bros. when the program writes to PPUCTRL at the end of game logic. It appears to be turning NMI off during game logic and then turning NMI back on once the game logic has finished in order to prevent the NMI handler from being called again before the game logic finishes. Another workaround is to use a software flag to prevent NMI reentry, instead of using the PPU's NMI enable.
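A software reentry guard of the kind mentioned above could be sketched like this (ca65 syntax). The `nmi_busy` variable and the handler layout are assumptions for illustration and are not how Super Mario Bros. itself is written.

```asm
; ca65 syntax assumed. nmi_busy is a hypothetical variable.
.zeropage
nmi_busy: .res 1        ; bit 7 set while a frame is still being processed

.code
nmi_handler:
  bit nmi_busy          ; N = bit 7 of nmi_busy; A, X and Y are untouched
  bmi @reentered        ; previous frame still running: skip this vblank
  pha                   ; save A (save X and Y as well if the body uses them)
  lda #$80
  sta nmi_busy          ; mark the handler as busy
  ; ... VRAM and OAM updates, then game logic, would go here ...
  lda #$00
  sta nmi_busy          ; done: allow the next NMI to run the body
  pla
@reentered:
  rti                   ; RTI restores the flags that BIT changed
```

Because NMI stays enabled in PPUCTRL the whole time, no PPUCTRL write is needed at the end of game logic, which avoids the mid-frame write that triggers the race.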

PPUMASK - Rendering settings ($2001 write)


7  bit  0
---- ----
BGRs bMmG
|||| ||||
|||| |||+- Greyscale (0: normal color, 1: greyscale)
|||| ||+-- 1: Show background in leftmost 8 pixels of screen, 0: Hide
|||| |+--- 1: Show sprites in leftmost 8 pixels of screen, 0: Hide
|||| +---- 1: Enable background rendering
|||+------ 1: Enable sprite rendering
||+------- Emphasize red (green on PAL/Dendy)
|+-------- Emphasize green (red on PAL/Dendy)
+--------- Emphasize blue

PPUMASK (the "mask" register) controls the rendering of sprites and backgrounds, as well as color effects. After power/reset, writes to this register are ignored until the first pre-render scanline.

Most commonly, PPUMASK is set to $00 outside of gameplay to allow transferring a large amount of data to VRAM, and $1E during gameplay to enable all rendering with no color effects.

Rendering control

Rendering is the PPU's process of actively fetching memory and drawing an image to the screen. Rendering as a whole is enabled as long as one or both of sprite and background rendering is enabled in PPUMASK. If one component is enabled and the other is not, the disabled component is simply treated as transparent; the rendering process is otherwise unaffected. When both components are disabled via bits 3 and 4, the rendering process stops and the PPU displays the backdrop color.

During rendering, the PPU is actively using VRAM and OAM. This prevents the CPU from being able to access VRAM via PPUDATA or OAM via OAMDATA, so these accesses must be done outside of rendering: either during vblank (for data transfers during gameplay) or with rendering turned off (for large data transfers, such as when loading a level). To avoid numerous hardware bugs and limitations, it is generally recommended that rendering be turned on or off only during vblank. This can be done by writing the desired PPUMASK value to a variable rather than the register itself and then only copying that variable to PPUMASK during vblank in the NMI handler.
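The shadow-variable approach described above might look like the following sketch in ca65 syntax. The variable name `ppumask_shadow` and both routine labels are hypothetical.

```asm
; ca65 syntax assumed. ppumask_shadow and the labels are hypothetical.
PPUMASK = $2001

.zeropage
ppumask_shadow: .res 1   ; desired PPUMASK value, written by the main thread

.code
; Called from game logic at any time: only the variable changes.
request_rendering_on:
  lda #%00011110         ; $1E: background and sprites on, left column shown
  sta ppumask_shadow
  rts

; Called from the NMI handler (with A already saved), i.e. during vblank.
apply_ppumask:
  lda ppumask_shadow
  sta PPUMASK            ; the register itself is only touched in vblank
  rts
```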

The PPU can optionally hide sprites and backgrounds in just the leftmost 8 pixels of the screen, making them transparent and thus drawing the backdrop color there. For sprites, this can be useful to avoid sprite pop-in, a limitation where sprites cannot partially hang off the left edge of the screen like they can off the right edge. For backgrounds, this can eliminate tile artifacts and reduce attribute artifacts when scrolling horizontally with either a vertical or one-screen nametable arrangement, as these arrangements do not allow hiding the scroll seam off-screen. Note that the backdrop color may not match the color used by the art for the background, so disabling the left column may be more distracting than minor artifacts.

Notes:

  • Writing to PPUDATA during rendering can corrupt VRAM, so writes must be done in vblank or with rendering disabled in PPUMASK bits 3 and 4.
  • Sprite 0 hit does not trigger in any area where the background or sprites are disabled.
  • Toggling rendering takes effect approximately 3-4 dots after the write. This delay is required by Battletoads to avoid a crash.
  • Toggling rendering mid-screen often corrupts 1 row of OAM and draws incorrect sprites for the current and next scanline. (See: Errata)
  • Turning rendering off mid-screen can corrupt palette RAM if the low 14 bits of the internal v register hold a value from $3C00 through $3FFF.
  • Turning rendering on late causes the dot at the end of pre-render to never be skipped, which can cause dot crawl on stationary screens.
  • Turning rendering on late causes the PPU to have an incorrect scroll value unless it is set manually with a complicated series of writes.

Color control

Greyscale mode forces all colors to be a shade of grey or white. This is done by bitwise ANDing the color with $30, causing all colors to come from the grey column ($00, $10, $20, $30), which notably lacks a black color. Note that this AND behavior means that RGB PPUs with scrambled colors (the 2C04 series) do not actually get shades of grey, but rather whatever colors are in the $x0 column. When reading from palette RAM, the returned value reflects this AND behavior, but the underlying data is preserved. Palette writes function normally regardless of greyscale mode.

Color emphasis causes a color tint effect that works by darkening the other two color components, making the selected component comparatively brighter and thus emphasized. Emphasizing all 3 components simply dims all colors. This works independently of greyscale, allowing greys to be tinted. Note that PAL and Dendy PPUs have a different emphasis bit order, so ports and dual-region games should reorder the bits. Furthermore, emphasis on RGB PPUs is completely different, instead maximizing the brightness of the emphasized component and producing a completely white screen when all components are emphasized. RGB emphasis is far less useful and generally best avoided.
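For a dual-region game, the bit reordering could be done with a small helper like the sketch below (ca65 syntax). The routine name is hypothetical, and it assumes the game has already determined that it is running on a PAL or Dendy PPU.

```asm
; ca65 syntax assumed. swap_emphasis_rg is a hypothetical helper.
; In:  A = PPUMASK value with emphasis bits in NTSC order
; Out: A = same value with bits 5 and 6 exchanged
swap_emphasis_rg:
  pha                  ; keep the original value
  and #%01100000       ; isolate bits 5 and 6
  beq @same            ; neither set: swapping changes nothing
  cmp #%01100000
  beq @same            ; both set: swapping changes nothing
  pla
  eor #%01100000       ; exactly one set: flipping both exchanges them
  rts
@same:
  pla
  rts
```

The result can be stored to PPUMASK (or to a shadow copy of it) in place of the NTSC-ordered value.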

PPUSTATUS - Rendering events ($2002 read)


7  bit  0
---- ----
VSOx xxxx
|||| ||||
|||+-++++- (PPU open bus or 2C05 PPU identifier)
||+------- Sprite overflow flag
|+-------- Sprite 0 hit flag
+--------- Vblank flag, cleared on read. Unreliable; see below.

PPUSTATUS (the "status" register) reflects the state of rendering-related events and is primarily used for timing. The three flags in this register are automatically cleared on dot 1 of the prerender scanline; see PPU rendering for more information on the set and clear timing.

Reading this register has the side effect of clearing the PPU's internal w register. It is commonly read before writes to PPUSCROLL and PPUADDR to ensure the writes occur in the correct order.

Vblank flag

The vblank flag is set at the start of vblank (scanline 241, dot 1). Reading PPUSTATUS will return the current state of this flag and then clear it. If the vblank flag is not cleared by reading, it will be cleared automatically on dot 1 of the prerender scanline.

Reading the vblank flag is not a reliable way to detect vblank; NMI should be used instead. Reading the flag on the dot before it is set (scanline 241, dot 0) causes it to read as 0 and be cleared, so polling PPUSTATUS for the vblank flag can miss vblank and cause games to stutter. NMI is also suppressed when this occurs, and may even be suppressed by reads landing on the following dot or two. On NTSC and PAL, it is guaranteed that the flag cannot be dropped two frames in a row, but on Dendy, it is possible for it to happen every frame, crashing the game. Using NMI ensures that software correctly detects vblank every frame. It is also required by PlayChoice-10, which will reject the game if NMI is disabled for too long. Polling the vblank flag is still required while booting up the console, but timing at this point is not critical (see Init code for more information on booting safely).

The vblank flag is used in the generation of NMI, and enabling NMI while this flag is 1 will cause an immediate NMI (see PPUCTRL).
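The boot-time polling mentioned above can be sketched as follows (ca65 syntax). Waiting for the flag twice is one common way to cover the warm-up period during which register writes are ignored; the routine name is hypothetical, and a missed flag here only delays startup by a frame.

```asm
; ca65 syntax assumed. wait_two_vblanks is a hypothetical label.
PPUSTATUS = $2002

wait_two_vblanks:
  bit PPUSTATUS      ; discard a possibly stale vblank flag
@first:
  bit PPUSTATUS      ; N = bit 7 (vblank flag)
  bpl @first         ; loop until the flag reads as set
@second:
  bit PPUSTATUS      ; the read above cleared the flag, so wait again
  bpl @second
  rts
```

This style of loop is only appropriate before NMI is enabled; once the game is running, the NMI handler should be the vblank signal.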

Sprite 0 hit flag

Sprite 0 hit is a hardware collision detection feature that detects pixel-perfect collision between the first sprite in OAM (sprite 0) and the background. The sprite 0 hit flag is immediately set when any opaque pixel of sprite 0 overlaps any opaque pixel of background, regardless of sprite priority. 'Opaque' means that the pixel is not 'transparent' — that is, its two pattern bits are not %00. The flag stays set until dot 1 of the prerender scanline; thus, it can only detect one collision per frame.

Although this flag detects collision, it is primarily used for timing. Many games place sprite 0 at a fixed location on the screen and poll this flag until it becomes set. This allows the CPU to know its approximate location on the screen so it can time mid-screen writes to hardware registers. Commonly, this is used to change the scroll position mid-screen to allow for a background-based HUD, like in Super Mario Bros. However, some modern homebrew games use this for actual collision, such as Lunar Limit and Irritating Ship.

Sprite 0 hit cannot detect collision at X=255, nor anywhere where either sprites or backgrounds are disabled via PPUMASK. This includes X=0..7 when the leftmost 8 pixels are hidden. However, it is not affected by the cropping on the left and right edges on PAL.

There are some important considerations when using this flag for timing:

  • Because sprite 0 hit is not cleared until the prerender scanline, software can potentially mistake the previous frame's hit as being from the current frame. Therefore, it may be necessary to poll the flag until it becomes clear before then polling for it to be set again.
  • If a game expects sprite 0 hit to occur and it does not, this often results in a crash. If there is any risk that the hit may not occur (perhaps because an overlap may not happen when scrolling or because it relies on precise mid-screen timings that may vary across power cycles, consoles, or emulators), it can be critical to have another way to exit the poll loop. For example, this may be done by also polling the vblank flag or having the NMI handler check if the game is still polling for sprite 0 hit.
  • Games often don't handle sprite 0 hit on lag frames, preventing the mid-screen event from occurring. A common result of this is HUD flickering during lag. Handling sprite 0 hit in the NMI handler, at least on lag frames, can work around this.
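The first two considerations above can be combined into a polling routine along these lines (ca65 syntax). The routine name and the carry-based return convention are assumptions for illustration. Note that it reads PPUSTATUS near the start of vblank, which, as described under the vblank flag, can suppress NMI on that frame; having the NMI handler break the loop instead avoids that.

```asm
; ca65 syntax assumed. wait_sprite0 and its return convention are hypothetical.
PPUSTATUS = $2002

; Out: carry clear = sprite 0 hit seen, carry set = vblank arrived first
wait_sprite0:
@wait_clear:
  bit PPUSTATUS       ; V = bit 6 (sprite 0 hit), N = bit 7 (vblank)
  bvs @wait_clear     ; previous frame's hit is still set: keep waiting
@wait_hit:
  bit PPUSTATUS
  bvs @hit            ; hit for the current frame
  bpl @wait_hit       ; no hit and no vblank yet: keep polling
  sec                 ; vblank flag set without a hit: give up
  rts
@hit:
  clc
  rts
```

The caller would perform its mid-screen register writes only when carry is clear, and skip them for the frame otherwise.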

Sprite overflow flag

The sprite overflow flag was intended to be set any time there are more than 8 sprites on a scanline. Unfortunately, the logic for detecting this does not work correctly, resulting in the PPU checking incorrect indices in OAM when searching for a 9th sprite. This produces both false positives and false negatives. See PPU sprite evaluation for details on its incorrect behavior. In practice, sprite overflow is usually used for timing like sprite 0 hit, but because of its buggy behavior and its cost of 9 sprite tiles, it is generally only used when more than one timing source is required. Like sprite 0 hit, this flag is cleared at the start of the prerender scanline and can only be set once per frame.

Using sprite overflow is often a last resort. When mapper IRQs are not available, the DMC IRQ can be an effective alternative for timing, albeit complicated to use.

2C05 identifier

The 2C05 series of arcade PPUs returns an identifier in bits 4-0 instead of PPU open bus. This value is checked by games as a form of copy protection. Note that this does not apply to the consumer 2C05-99, which returns open bus as usual. While we haven't yet collected data directly from the PPUs, 2C05 games expect the following values:

PPU Mask Value
2C05-02 $3F $3D
2C05-03 $1F $1C
2C05-04 $1F $1B

OAMADDR - Sprite RAM address ($2003 write)


7  bit  0
---- ----
AAAA AAAA
|||| ||||
++++-++++- OAM address

Write the OAM address you want to access here. Most games just write $00 here and then use OAMDMA. (DMA is implemented in the 2A03/7 chip and works by repeatedly writing to OAMDATA.)

Values during rendering

OAMADDR is set to 0 during each of ticks 257–320 (the sprite tile loading interval) of the pre-render and visible scanlines. This also means that at the end of a normal complete rendered frame, OAMADDR will always have returned to 0.

If rendering is enabled mid-scanline[3], there are further consequences of an OAMADDR that was not set to 0 before OAM sprite evaluation begins at tick 65 of the visible scanline. The value of OAMADDR at this tick determines the starting address for sprite evaluation for this scanline, which can cause the sprite at OAMADDR to be treated as if it were sprite 0, both for sprite 0 hit and priority. If OAMADDR is unaligned and does not point to the Y position (first byte) of an OAM entry, then whatever it points to (tile index, attribute, or X coordinate) will be reinterpreted as a Y position, and the following bytes will be similarly reinterpreted. No more sprites will be found once the end of OAM is reached, effectively hiding any sprites before the starting OAMADDR.

OAMADDR precautions

On the 2C02G, writes to OAMADDR corrupt OAM. The exact corruption isn't fully described, but this usually seems to copy sprites 8 and 9 (address $20) over the 8-byte row at the target address. The source address for this copy seems to come from the previous value on the CPU bus (most often $20 from the $2003 operand).[3][4] There may be other possible behaviors as well. This can be worked around by writing all 256 bytes of OAM, though because of the limited time before OAM decay begins, this should normally be done through OAMDMA.

It is also the case that if OAMADDR is not less than eight when rendering starts, the eight bytes starting at OAMADDR & 0xF8 are copied to the first eight bytes of OAM; it seems likely that this is related. On the Dendy, the latter bug is required for 2C02 compatibility.

It is known that in the 2C03, 2C04, 2C05[5], and 2C07, OAMADDR works as intended. It is not known whether this bug is present in all revisions of the 2C02.

OAMDATA - Sprite RAM data ($2004 read/write)


7  bit  0
---- ----
DDDD DDDD
|||| ||||
++++-++++- OAM data

Write OAM data here. Writes will increment OAMADDR after the write; reads do not. Reads during vertical or forced blanking return the value from OAM at that address.

Do not write directly to this register in most cases. Because changes to OAM should normally be made only during vblank, writing through OAMDATA is only effective for partial updates, as it is too slow to update all of OAM within one vblank interval, and as described above, partial writes cause corruption. Most games use the DMA feature through OAMDMA instead.

  • Reading OAMDATA while the PPU is rendering will expose internal OAM accesses during sprite evaluation and loading; Micro Machines does this.
  • Writes to OAMDATA during rendering (on the pre-render line and the visible lines 0–239, provided either sprite or background rendering is enabled) do not modify values in OAM, but do perform a glitchy increment of OAMADDR, bumping only the high 6 bits (i.e., it bumps the [n] value in PPU sprite evaluation – it's plausible that it could bump the low bits instead depending on the current status of sprite evaluation). This extends to DMA transfers via OAMDMA, since that uses writes to $2004. For emulation purposes, it is probably best to completely ignore writes during rendering.
  • It used to be thought that reading from this register wasn't reliable[6]; however, more recent evidence suggests that this is solely due to corruption by OAMADDR writes.
  • In the oldest instantiations of the PPU, as found on earlier Famicoms and NESes, this register is not readable[7]. The readability was added on the RP2C02G, found on most NESes and later Famicoms.[8]
  • In the 2C07, sprite evaluation can never be fully disabled, and will always start 24 scanlines after the start of vblank[9] (same as when the prerender scanline would have been on the 2C02). As such, any updates to OAM should be done within the first 24 scanlines after the 2C07 signals vertical blanking.

PPUSCROLL - X and Y scroll ($2005 write)


1st write
7  bit  0
---- ----
XXXX XXXX
|||| ||||
++++-++++- X scroll bits 7-0 (bit 8 in PPUCTRL bit 0)

2nd write
7  bit  0
---- ----
YYYY YYYY
|||| ||||
++++-++++- Y scroll bits 7-0 (bit 8 in PPUCTRL bit 1)

This register is used to change the scroll position, telling the PPU which pixel of the nametable selected through PPUCTRL should be at the top left corner of the rendered screen. PPUSCROLL takes two writes: the first is the X scroll and the second is the Y scroll. Whether this is the first or second write is tracked internally by the w register, which is shared with PPUADDR. Typically, this register is written to during vertical blanking to make the next frame start rendering from the desired location, but it can also be modified during rendering in order to split the screen. Changes made to the vertical scroll during rendering will only take effect on the next frame. Together with the nametable bits in PPUCTRL, the scroll can be thought of as 9 bits per component, and PPUCTRL must be updated along with PPUSCROLL to fully specify the scroll position.

After reading PPUSTATUS to clear w (the write latch), write the horizontal and vertical scroll offsets to PPUSCROLL just before turning on the screen:

 ; Set the high bit of X and Y scroll.
 lda ppuctrl_value
 ora current_nametable
 sta PPUCTRL

 ; Set the low 8 bits of X and Y scroll.
 bit PPUSTATUS
 lda cam_position_x
 sta PPUSCROLL
 lda cam_position_y
 sta PPUSCROLL

Horizontal offsets range from 0 to 255. "Normal" vertical offsets range from 0 to 239, while values of 240 to 255 cause the attribute data at the end of the current nametable to be used incorrectly as tile data. The PPU normally skips from 239 to 0 of the next nametable automatically, so these "invalid" scroll positions only occur if explicitly written.

By changing the scroll values here across several frames and writing tiles to newly revealed areas of the nametables, one can achieve the effect of a camera panning over a large background.

PPUADDR - VRAM address ($2006 write)


1st write  2nd write
15 bit  8  7  bit  0
---- ----  ---- ----
..AA AAAA  AAAA AAAA
  || ||||  |||| ||||
  ++-++++--++++-++++- VRAM address

Because the CPU and the PPU are on separate buses, neither has direct access to the other's memory. The CPU writes to VRAM through a pair of registers on the PPU by first loading an address into PPUADDR and then writing data repeatedly to PPUDATA. The VRAM address only needs to be set once for every series of data writes because each PPUDATA access automatically increments the address by 1 or 32, as configured in PPUCTRL.

The 16-bit address is written to PPUADDR one byte at a time, high byte first. Whether this is the first or second write is tracked by the PPU's internal w register, which is shared with PPUSCROLL. If w is not 0 or its state is not known, it must be cleared by reading PPUSTATUS before writing the address. For example, to set the VRAM address to $2108 after w is known to be 0:

  lda #$21       ; high byte of $2108 (first write, w = 0)
  sta PPUADDR
  lda #$08       ; low byte of $2108 (second write, w = 1)
  sta PPUADDR
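To show the address being set once and then reused for a run of PPUDATA writes, here is a minimal sketch in ca65 syntax that fills the first nametable with zero bytes. The routine name is hypothetical. It assumes rendering is disabled (1024 writes do not fit in one vblank) and that PPUCTRL bit 2 is 0, so each write advances the address by 1.

```asm
; ca65 syntax assumed. clear_nametable0 is a hypothetical label.
PPUSTATUS = $2002
PPUADDR   = $2006
PPUDATA   = $2007

clear_nametable0:
  bit PPUSTATUS     ; reset w so the next PPUADDR write is the high byte
  lda #$20
  sta PPUADDR       ; high byte of $2000
  lda #$00
  sta PPUADDR       ; low byte of $2000
  ldx #4            ; 4 pages of 256 bytes = $2000-$23FF
  tay               ; Y = 0 as the inner counter; A stays 0 as the fill byte
@loop:
  sta PPUDATA       ; write one byte; the VRAM address increments by 1
  iny
  bne @loop         ; 256 writes per pass
  dex
  bne @loop
  rts
```

Since PPUADDR and PPUDATA share state with the scroll registers, the scroll position should be set again before rendering is turned back on.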

The PPU address space is 14-bit, spanning $0000–$3FFF. Bits 14 and 15 of the value written to this register are ignored. However, bit 14 of the internal t register that holds the data written to PPUADDR is forced to 0 when writing the PPUADDR high byte. This detail doesn't matter when using PPUADDR to set a VRAM address, but is an important limitation when using it to control mid-screen scrolling (see PPU scrolling for more information).

Note

Access to PPUSCROLL and PPUADDR during screen refresh produces interesting raster effects; the starting position of each scanline can be set to any pixel position in nametable memory. For more information, see PPU scrolling.

Palette corruption

In specific circumstances, entries of the PPU's palette can be corrupted. It's unclear exactly how or why this happens, but all revisions of the NTSC PPU seem to be at least somewhat susceptible.[10]

When done writing to palette memory, the workaround is to always

  1. Update the address, if necessary, so that it's pointing at $3F00, $3F10, $3F20, or any other mirror.
  2. Only then change the address to point outside of palette memory.

A code fragment to implement this workaround is present in vast numbers of games:[11]

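  lda #$3F
  sta PPUADDR    ; high byte: $3Fxx
  lda #0
  sta PPUADDR    ; low byte: address is now $3F00
  sta PPUADDR    ; high byte: $00xx
  sta PPUADDR    ; low byte: address is now $0000, outside palette memory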

Bus conflict

During raster effects, if the second write to PPUADDR happens at specific times, at most one axis of scrolling will be set to the bitwise AND of the written value and the current value. The only safe time to finish the second write is during blanking; see PPU scrolling for more specific timing. [1]

PPUDATA - VRAM data ($2007 read/write)


7  bit  0
---- ----
DDDD DDDD
|||| ||||
++++-++++- VRAM data

VRAM read/write data register. After access, the video memory address will increment by an amount determined by bit 2 of $2000.

When rendering is turned off by clearing both the background and sprite enable bits in PPUMASK, or during vertical blank, data can be read from or written to VRAM through this port. Because accessing this register increments the VRAM address, it should not be accessed outside vertical or forced blanking: doing so causes graphical glitches, and a write lands at an unpredictable address in VRAM. However, a handful of games are known to read from PPUDATA during rendering, causing scroll position changes. See PPU scrolling and Tricky-to-emulate games.

VRAM reads and writes share the internal address register that rendering uses. Therefore, after loading data into video memory, the program should reload the scroll position with writes to PPUSCROLL and PPUCTRL (bits 1-0) to avoid incorrect scrolling.

The PPUDATA read buffer

Reading from PPUDATA does not directly return the value at the current VRAM address, but instead returns the contents of an internal read buffer. This read buffer is updated on every PPUDATA read, but only after the previous contents have been returned to the CPU, effectively delaying PPUDATA reads by one. This is because PPU bus reads are too slow and cannot complete in time to service the CPU read. Because of this read buffer, after the VRAM address has been set through PPUADDR, one should first read PPUDATA to prime the read buffer (ignoring the result) before then reading the desired data from it.

Note that the read buffer is updated only on PPUDATA reads. It is not affected by writes or other PPU processes such as rendering, and it maintains its value indefinitely until the next read.
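The priming read can be sketched as follows (ca65 syntax), reusing the $2108 address from the PPUADDR example. The routine name is hypothetical, and the code assumes it runs during vertical or forced blanking and that the address is outside palette RAM.

```asm
; ca65 syntax assumed. read_vram_2108 is a hypothetical label.
PPUSTATUS = $2002
PPUADDR   = $2006
PPUDATA   = $2007

; Out: A = byte stored at VRAM address $2108
read_vram_2108:
  bit PPUSTATUS     ; reset w
  lda #$21
  sta PPUADDR       ; high byte
  lda #$08
  sta PPUADDR       ; low byte
  lda PPUDATA       ; priming read: returns the old buffer contents (ignored)
  lda PPUDATA       ; returns the byte from $2108 that the first read buffered
  rts
```

When reading a run of consecutive bytes, only the first read needs to be discarded; each later read returns the byte fetched by the read before it.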

Reading palette RAM

Later PPUs added an unreliable feature for reading palette data from $3F00-$3FFF. These reads work differently than standard VRAM reads, as palette RAM is a separate memory space internal to the PPU that is overlaid onto the PPU address space. The referenced 6-bit palette data is returned immediately instead of going to the internal read buffer, and hence no priming read is required. Simultaneously, the PPU also performs a normal read from PPU memory at the specified address, "underneath" the palette data, and the result of this read goes into the read buffer as normal. The old contents of the read buffer are discarded when reading palettes, but by changing the address to point outside palette RAM and performing one read, the contents of this shadowed memory (usually mirrored nametables) can be accessed. On PPUs that do not support reading palette RAM, this memory range behaves the same as the rest of PPU memory.

This feature is supported by the 2C02G, 2C02H, and PAL PPUs. The byte returned when reading palettes contains PPU open bus in the top 2 bits, and the value is returned after it is modified by greyscale mode, which clears the bottom 4 bits if enabled. Unfortunately, on some consoles, palette reads can be corrupted on one of the 4 CPU/PPU alignments relative to the master clock. This corruption depends on when the PPU /CS signal that indicates register access is deasserted, which varies by console. Combined with this feature not being present in all PPUs, developers should not rely on reading from palette RAM.

Read conflict with DPCM samples

While DPCM samples are playing, there is a chance that an interruption from the APU's sample fetch will cause an extra read cycle if it happens at the same time as an instruction that reads $2007. This causes an extra increment and a byte to be skipped, resulting in the wrong data being read. See: APU DMC

OAMDMA - Sprite DMA ($4014 write)


7  bit  0
---- ----
AAAA AAAA
|||| ||||
++++-++++- Source page (high byte of source address)

OAMDMA is a CPU register that suspends the CPU so it can quickly copy a page of CPU memory to PPU OAM using DMA. It always copies 256 bytes and the source address always starts page-aligned (ending in $00). The value written to this register is the high byte of the source address, and the copy begins on the cycle immediately after the write. The copy takes 513 or 514 cycles and is implemented as 256 pairs of a read from CPU memory and a write to OAMDATA. Because vblank is so short and because changing OAMADDR often corrupts OAM, OAM DMA is normally the only realistic option for updating sprites each frame. 0 should be written to OAMADDR before initiating DMA to ensure the data is properly aligned and to avoid corruption.[4] While OAM DMA is possible to do mid-frame while rendering is disabled, it is normally only done in vblank.
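A typical per-frame sprite upload, as described above, could be sketched like this in ca65 syntax. The `oam_buffer` location at $0200 and the routine name are assumptions for illustration; any page-aligned 256-byte region of CPU memory would serve.

```asm
; ca65 syntax assumed. oam_buffer and upload_sprites are hypothetical.
OAMADDR    = $2003
OAMDMA     = $4014
oam_buffer = $0200    ; page-aligned 256-byte sprite buffer in CPU RAM

; Intended to be called from the NMI handler, i.e. during vblank.
upload_sprites:
  lda #$00
  sta OAMADDR         ; start the transfer at OAM address 0
  lda #>oam_buffer    ; high byte of the source address ($02)
  sta OAMDMA          ; CPU is suspended while all 256 bytes are copied
  rts
```

Game logic then only ever edits `oam_buffer`, and the copy to the PPU happens once per frame.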

OAM consists of dynamic RAM (DRAM) which decays if not refreshed often enough, and this requires different considerations on NTSC and PAL. Refresh happens automatically any time a row of DRAM is read or written, so it is refreshed every scanline during rendering by the sprite evaluation process. On NTSC, vblank is short enough that OAM will not decay before rendering begins again, so OAM DMA can be done anytime in vblank. On PAL, vblank is much longer, so to avoid decay during that time, the PPU automatically performs a forced refresh starting 24 scanlines after NMI, during which OAM cannot be written. This means that OAM DMA is limited to the start of vblank on PAL. Note that NTSC vblank is shorter than 24 PAL scanlines, so NTSC-compatible NMI handlers will finish before the forced refresh and therefore should work on PAL regardless of their OAM DMA timing. In either case, OAM does not decay if it is not updated during vblank, and in fact it should generally not be updated on lag frames (frames where the CPU did not finish its work before vblank) to avoid copying incomplete sprite data to the PPU.

Internal registers

The PPU also has 4 internal registers, described in detail on PPU scrolling:

  • v: During rendering, used for the scroll position. Outside of rendering, used as the current VRAM address.
  • t: During rendering, specifies the starting coarse-x scroll for the next scanline and the starting y scroll for the screen. Outside of rendering, holds the scroll or VRAM address before transferring it to v.
  • x: The fine-x position of the current scroll, used during rendering alongside v.
  • w: Toggles on each write to either PPUSCROLL or PPUADDR, indicating whether this is the first or second write. Clears on reads of PPUSTATUS. Sometimes called the 'write latch' or 'write toggle'.

References