PPU OAM
OAM (Object Attribute Memory) contains a display list of up to 64 sprites, where each sprite's information occupies 4 bytes.
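To make the layout concrete, here is a minimal sketch that splits a 256-byte OAM image into its 64 four-byte entries. The names `Sprite` and `unpack_oam` are hypothetical and exist only for this illustration; the field order follows the byte descriptions in this article.

```python
from typing import NamedTuple

class Sprite(NamedTuple):
    y: int      # byte 0: Y position of top of sprite (stored minus 1)
    tile: int   # byte 1: tile index number
    attr: int   # byte 2: attributes
    x: int      # byte 3: X position of left side of sprite

def unpack_oam(oam):
    """Split a 256-byte OAM image into 64 Sprite entries, in OAM order."""
    if len(oam) != 256:
        raise ValueError("OAM holds exactly 64 sprites * 4 bytes = 256 bytes")
    return [Sprite(*oam[i:i + 4]) for i in range(0, 256, 4)]
```

For example, `unpack_oam(shadow_oam)[0]` would be sprite zero, assuming `shadow_oam` is the program's 256-byte copy of OAM.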
Byte 0
Y position of top of sprite
Sprite data is delayed by one scanline; you must subtract 1 from the sprite's Y coordinate before writing it here. Hide a sprite by writing any value in the range $EF-$FF here.
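The subtract-by-one rule and the hiding range can be sketched as two small helpers. The names `oam_y_for`, `is_hidden` and `HIDDEN_Y` are hypothetical, and the choice of $FF as the hiding value is arbitrary within $EF-$FF. The sketch assumes a 240-line picture (scanlines 0 to 239).

```python
HIDDEN_Y = 0xFF  # any value in $EF-$FF hides the sprite; $FF is an arbitrary pick

def oam_y_for(screen_y):
    """Return the OAM byte 0 value that puts the sprite's top row on screen_y.

    Sprite data is delayed by one scanline, so the stored value is
    screen_y - 1. Scanline 0 would need a stored value of -1, which
    does not exist, so it is rejected here.
    """
    if not 1 <= screen_y <= 239:
        raise ValueError("sprite top must be on scanlines 1..239")
    return screen_y - 1

def is_hidden(oam_y):
    """True when the stored Y byte lies in the $EF-$FF hiding range."""
    return 0xEF <= oam_y <= 0xFF
```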
Byte 1
Tile index number
For 8x8 sprites, this is the tile number of this sprite within the pattern table selected in bit 3 of PPUCTRL ($2000).
For 8x16 sprites, the PPU ignores the pattern table selection and selects a pattern table from bit 0 of this number.
76543210
||||||||
|||||||+- Bank ($0000 or $1000) of tiles
+++++++-- Tile number of top of sprite (0 to 254; bottom half gets the next tile)
Thus, the pattern table memory map for 8x16 sprites looks like this:
- $00: $0000-$001F
- $01: $1000-$101F
- $02: $0020-$003F
- $03: $1020-$103F
- $04: $0040-$005F
[...]
- $FE: $0FE0-$0FFF
- $FF: $1FE0-$1FFF
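The memory map above follows directly from the bit layout of byte 1. A minimal sketch, assuming the standard 16 bytes per 8x8 tile (the function name `tile_addresses_8x16` is hypothetical):

```python
def tile_addresses_8x16(tile_index):
    """Return (top, bottom) pattern table start addresses for an 8x16 sprite."""
    bank = 0x1000 if tile_index & 0x01 else 0x0000  # bit 0 selects the pattern table
    top_tile = tile_index & 0xFE                    # bits 7-1: tile number of the top half
    top = bank + top_tile * 16                      # each 8x8 tile occupies 16 bytes
    return top, top + 16                            # bottom half is the next tile
```

Each half covers 16 bytes, so tile index $02 yields $0020 and $0030, which together span the $0020-$003F range listed above.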
Byte 2
Attributes
76543210
||||||||
||||||++- Palette (4 to 7) of sprite
|||+++--- Unimplemented
||+------ Priority (0: in front of background; 1: behind background)
|+------- Flip sprite horizontally
+-------- Flip sprite vertically
Flipping does not change the position of the sprite's bounding box, just the position of pixels within the sprite. If, for example, a sprite covers (120, 130) through (127, 137), it'll still cover the same area when flipped. In 8x16 mode, vertical flip flips each of the subtiles and also exchanges their position; the odd-numbered tile of a vertically flipped sprite is drawn on top. This behavior differs from the behavior of the unofficial 16x32 and 32x64 pixel sprite sizes on the Super NES, which will only vertically flip each square sub-region.
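As an illustration of the attribute bits and of the 8x16 flipping rule, the sketch below decodes byte 2 and maps a pixel inside the sprite's bounding box back to the tile, column and row it is fetched from. Both function names are hypothetical. Columns are counted from the left edge of the unflipped tile, and the pattern table bank (bit 0 of the tile index) is left out for brevity.

```python
def decode_attributes(attr):
    """Decode OAM byte 2 into its documented fields."""
    return {
        "palette": 4 + (attr & 0x03),           # sprite palettes are numbered 4 to 7
        "behind_background": bool(attr & 0x20),
        "flip_h": bool(attr & 0x40),
        "flip_v": bool(attr & 0x80),
    }

def source_pixel_8x16(x, y, attr, tile_index):
    """Map box-relative (x, y), 0<=x<8 and 0<=y<16, to (tile number, column, row).

    The bounding box never moves; only the lookup inside it changes.
    """
    col = 7 - x if attr & 0x40 else x       # horizontal flip mirrors columns
    row = 15 - y if attr & 0x80 else y      # vertical flip mirrors all 16 rows,
                                            # which also swaps the two subtiles
    tile = (tile_index & 0xFE) + (row >> 3) # rows 8-15 come from the next tile
    return tile, col, row & 7
```

With vertical flip set, the top-left pixel of the box comes from row 7 of the odd-numbered tile, matching the description above.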
Byte 3
X position of left side of sprite
X position values of $F9-$FF place part of the sprite past the right edge of the screen; they do NOT result in the sprite wrapping around to the left side of the screen.
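A small sketch of the clipping at the right edge, assuming a sprite is 8 pixels wide and the picture is 256 pixels wide (the function name `visible_columns` is hypothetical):

```python
def visible_columns(oam_x):
    """Screen columns (0-255) covered by an 8-pixel-wide sprite at oam_x.

    Columns past 255 are simply dropped; nothing wraps to the left edge.
    """
    return list(range(oam_x, min(oam_x + 8, 256)))
```

Under these assumptions a sprite at X = $F9 shows seven of its eight columns, and one at X = $FF shows only its leftmost column.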
DMA
Most programs write to a copy of OAM somewhere in CPU addressable RAM (often $0200-$02FF) and then copy it to OAM each frame using the OAM_DMA ($4014) register. (This register triggers repeated writes to OAMDATA ($2004).) The transfer takes 513 CPU cycles (+1 if it starts on an odd CPU cycle) to copy 256 bytes from this memory into $2004, whereas an unrolled LDA/STA loop would usually take about four times as long.
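For emulator authors, the transfer can be modelled roughly as below. This is a simplified sketch, not a cycle-accurate implementation: it assumes the OAM address is 0 when the transfer starts, that the value written to $4014 selects the source page (so $02 means $0200-$02FF), and that the unrolled loop uses absolute-addressed LDA and STA at 4 cycles each. The names `oam_dma` and `UNROLLED_COPY_CYCLES` are hypothetical.

```python
def oam_dma(cpu_ram, page, oam, starts_on_odd_cycle=False):
    """Copy one 256-byte CPU page into OAM and return the CPU cycles consumed."""
    base = (page & 0xFF) << 8          # e.g. page $02 -> $0200
    for i in range(256):
        oam[i] = cpu_ram[base + i]     # stands in for 256 writes to OAMDATA ($2004)
    return 513 + (1 if starts_on_odd_cycle else 0)

# Cost of copying the same 256 bytes with an unrolled LDA abs / STA abs pair per byte.
UNROLLED_COPY_CYCLES = 256 * (4 + 4)
```

Under these assumptions the unrolled loop costs 2048 cycles against 513 for DMA, which is where the "four times as long" figure comes from.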
Sprite zero hit flag
The sprite zero hit flag ($2002:6) is raised when background and sprite rendering are both enabled (via $2001:4-3), at the point where an opaque background pixel overlaps an opaque pixel of sprite zero during image drawing ("opaque" means that the pixel's two pattern bits are not both zero). Sprite zero hits do not register at x=255 (independent of sprite zero's x coordinate), but sprites are still drawn there.
The sprite zero flag is automatically set to 0 at dot 1 of the pre-render line. There is no way to manually reset it, which means that you only get one sprite zero hit per frame.
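The conditions above can be collected into a small state holder. This is a sketch of only the rules stated in this section; other hardware details, such as left-edge clipping, are deliberately omitted. The class and method names are hypothetical.

```python
class SpriteZeroFlag:
    """Models $2002 bit 6 using only the rules described in this section."""

    def __init__(self):
        self.hit = False

    def on_pixel(self, x, bg_pattern_bits, sprite0_pattern_bits,
                 show_bg, show_sprites):
        """Call once per drawn pixel; pattern bits are the 2-bit values 0-3."""
        if (show_bg and show_sprites           # both rendering enables must be on
                and x != 255                   # hits never register at x=255
                and bg_pattern_bits != 0       # background pixel is not transparent
                and sprite0_pattern_bits != 0):  # sprite zero pixel is not transparent
            self.hit = True                    # stays set; software cannot clear it

    def on_prerender_dot1(self):
        """The only reset: dot 1 of the pre-render line."""
        self.hit = False
```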
Sprite overlapping
Priority between sprites is determined by their address inside OAM: where two sprites overlap on a scanline, the sprite whose data occurs first in OAM is displayed in front of any sprites after it. For example, when sprites at OAM $0C and $28 overlap, the sprite at $0C will appear in front.
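A minimal sketch of this rule for a single pixel is shown below. It adds one assumption not spelled out above: a sprite only covers a pixel where its pattern bits are nonzero, so a transparent pixel of a front sprite lets a later sprite show through. The function name `front_sprite` is hypothetical.

```python
def front_sprite(candidates):
    """Pick the sprite shown at one pixel.

    candidates: iterable of (oam_address, pattern_bits) for every sprite
    whose bounding box covers the pixel. Returns the OAM address of the
    winning sprite, or None if every sprite is transparent there.
    """
    opaque = [addr for addr, bits in candidates if bits != 0]
    return min(opaque) if opaque else None   # lowest OAM address is in front
```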
Notes
Each OAM entry is 29 bits wide. The unimplemented bits of each sprite's byte 2 do not exist in the PPU. On PPU revisions that allow reading PPU OAM through $2004, the unimplemented bits of each sprite's byte 2 always read back as 0. This can be emulated by ANDing byte 2 with $E3, either when writing to OAM or when reading back. It has not been determined whether the PPU actually drives these bits low or whether this is the effect of data bus capacitance from reading the last byte of the instruction (LDA $2004, which assembles to AD 04 20).
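The masking approach mentioned above can be sketched as follows, here applied on the write side. `ATTR_MASK` and `write_oam` are hypothetical names; an emulator could equally apply the mask when reading back.

```python
ATTR_MASK = 0xE3  # %11100011: clears the three unimplemented bits (4-2)

def write_oam(oam, addr, value):
    """Store one byte into a 256-byte OAM array, dropping unimplemented bits."""
    addr &= 0xFF
    if addr & 0x03 == 2:       # byte 2 of each 4-byte entry is the attribute byte
        value &= ATTR_MASK
    oam[addr] = value & 0xFF
```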
Internal operation
In addition to the primary OAM memory, the PPU contains 32 bytes (enough for 8 sprites) of secondary OAM memory that is not directly accessible by the program. During each visible scanline this secondary OAM is first cleared, and then a linear search of the entire primary OAM is carried out to find sprites that are within y range for the next scanline (the sprite evaluation phase). The OAM data for each sprite found to be within range is copied into the secondary OAM, which is then used to initialize eight internal sprite output units.
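The evaluation phase can be approximated as below. This is a simplified sketch rather than a model of the hardware's cycle-by-cycle behaviour. It assumes that secondary OAM is cleared to $FF, and that the range test compares the scanline currently being drawn with the stored Y byte, so the selected sprites appear one line later. That second assumption is consistent with the one-scanline delay described under Byte 0. Sprite overflow handling is left out, and the function name `evaluate_sprites` is hypothetical.

```python
def evaluate_sprites(oam, scanline, sprite_height=8):
    """Select up to 8 sprites for the line after `scanline`.

    oam: 256-byte primary OAM. Returns (secondary_oam, sprites_found).
    """
    secondary = bytearray([0xFF] * 32)           # step 1: clear secondary OAM
    found = 0
    for n in range(64):                          # step 2: linear search of all 64 entries
        entry = oam[4 * n:4 * n + 4]
        in_range = 0 <= scanline - entry[0] < sprite_height
        if in_range and found < 8:               # only the first 8 matches are kept
            secondary[4 * found:4 * found + 4] = entry
            found += 1
    return secondary, found
```

Pass `sprite_height=16` when the PPU is in 8x16 sprite mode.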
For the precise timing, see this timing diagram.