Visual circuit tutorial: Difference between revisions

From NESdev Wiki
(Add section on shift registers)
 
(111 intermediate revisions by 5 users not shown)
Line 1: Line 1:
This is a crash course on making sense of the circuit displays in [http://visual6502.org/ Visual 6502]/2C02/2A03, written for people without much low-level electronics
This is a crash course on making sense of the [[#Terms|NMOS]] circuit displays in [http://visual6502.org/ Visual 6502]/2C02/2A03, written for people without much low-level electronics
experience (like the author). It aims to present the information needed to read
experience (like the primary author). It aims to present the information needed to read
the diagrams at a basic level in simple language, omitting details that are
the diagrams at a basic level in simple language, omitting details that are
unimportant when starting out.
less important when starting out.


You might want to read [http://visual6502.org/wiki/index.php?title=JssimUserHelp the Visual 6502 user's guide] and the [[Visual 2C02]] page first.
You might want to read [[visual6502wiki/JssimUserHelp|the Visual 6502 user's guide]] and the [[Visual 2C02]] page first.


== What the different colored areas are ==
== What the different colored areas are ==
Line 12: Line 12:
[[File:vis_areas.png|none]]
[[File:vis_areas.png|none]]


* Green areas are diffusion (explained below) connected to ground.
* Green areas are diffusion connected to ground.
* Red areas are diffusion connected to VCC (power).
* Red areas are diffusion connected to VCC (power).
* Yellow areas are diffusion that is neither connected directly to ground nor directly to VCC.
* Yellow areas are diffusion that is neither connected directly to ground nor directly to VCC.
Line 18: Line 18:
* Purple areas are polysilicon (often shortened to just "poly").
* Purple areas are polysilicon (often shortened to just "poly").


At the level presented here, diffusion, metal, and polysilicon can be thought
At the level presented here, diffusion, metal, and polysilicon can be thought of as roughly equivalent when viewed in isolation; they all conduct current. The important difference is in how they interact with each other, which is explained below.
of as roughly equivalent when viewed in isolation; they all conduct current. The
important difference is in how they interact with each other, which is
explained below.


== Basic building blocks ==
== Basic building blocks ==
Line 27: Line 24:
=== Transistors ===
=== Transistors ===


When a piece of polysilicon is sandwiched between two areas of diffusion, it
When a piece of polysilicon is sandwiched between two areas of diffusion, it acts as a ''gate'', only letting current through when the polysilicon is connected to power (or, equivalently, is ''on'', ''conducting'', ''high'', or ''1''). The diffusion area from which current flows when the gate is high is called the ''source''. The diffusion area into which current flows is called the ''drain''. The gate together with the source and drain is what makes a [[wikipedia:Field-effect transistor|''transistor'']].
acts as a ''gate'', only letting current through when the polysilicon is powered
(or, equivalently, ''on'', ''conducting'', ''high'', or ''1''). The diffusion area from which
current will flow when the gate is high is called the ''source''. The diffusion
area into which current will flow is called the ''drain''. The gate together with
the source and drain is what makes a [[wikipedia:Field-effect transistor|''transistor'']].


[[File:vis_transistor.png|none]]
[[File:vis_transistor.png|none]]


The transistor here is an [[wikipedia:Depletion_and_enhancement_modes|enhancement-mode transistor]].
The transistor here is an [[wikipedia:Depletion_and_enhancement_modes|enhancement-mode transistor]]. All the "ordinary" selectable (see the [[#Nodes|nodes section]]) transistors have this type.
All the "ordinary" selectable (see the [[#Nodes|nodes section]]) transistors have this type.


=== Power sources ===
=== Power sources ===


Around an area of powered diffusion we will often see something like the following
Around areas of powered diffusion we often see something like the following (note the distinctive "hook" in the polysilicon):
(note the distinctive "hook" in the polysilicon):


[[File:vis_power.png|none]]
[[File:vis_power.png|none]]


Here the polysilicon acts roughly like a resistor (or more specifically a [[#Terms|pull-up resistor]]), preventing a short from VCC to
Here the polysilicon acts roughly like a resistor (or more specifically a [[#Terms|pull-up resistor]]). This prevents a direct short from VCC to ground when some path of high gates connects the two. In the simulators, this entire configuration is simply modeled as a power source.
ground when the power source would otherwise have a direct connection to ground along
some path of high gates.


The transistor here is a [[wikipedia:Depletion_and_enhancement_modes|depletion-mode transistor]], a different type of
The transistor here is a [[wikipedia:Depletion_and_enhancement_modes|depletion-mode transistor]], a different type of transistor compared to above (though it appears the same visually).
transistor compared to above (though it appears the same visually). In the simulators, this configuration is simply modelled as a power source.
 
=== An example that brings together transistors and power sources ===
 
In the following circuit, the gate will be high and ''A'' powered/pulled to VCC:
 
<pre>
                      VCC
                      |
                      |
                      |
[power source]------[gate]
                      |
                      |
                      |
                      A
</pre>
 
No current will ever flow from the power source to ''A'' (ideally). The voltage on the gate only controls whether there's a conductive path between VCC and ''A''.
 
Similarly, the gate will be low and ''A'' ''not'' powered/pulled to VCC in the following circuit:
 
<pre>
                      VCC
                      |
                      |
                      |
GND-----------------[gate]
                      |
                      |
                      |
                      A
</pre>
 
Using a switch analogy for transistors (which is common in digital electronics), the above circuits can be viewed as follows:
 
<pre>
                                          A
                                          |
                                          |
                                          |
                                          \
B----------[remote switch controller]      \  <- controlled switch
                                            \
                                          |
                                          |
                                          |
                                          C
</pre>
 
When ''B'' is high, the switch is closed, connecting ''A'' and ''C''. When ''B'' is low, it is open (like in the figure).
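
The switch analogy can be written down in a few lines of Python. This is only a minimal sketch for illustration and is not taken from any of the simulators; the function name <code>pass_through</code> and the use of <code>None</code> for "not connected" are assumptions made for this example.

```python
def pass_through(gate_high, source_value):
    """Model an enhancement-mode transistor as a controlled switch.

    gate_high:    True when the gate (B in the figure) is high.
    source_value: the value on one side of the switch (A in the figure).

    Returns the value seen on the other side (C), or None when the
    switch is open and the two sides are not connected.
    (Hypothetical helper, for illustration only.)
    """
    return source_value if gate_high else None


# Gate tied to a power source: the switch is closed, so C follows A.
closed = pass_through(True, 1)
# Gate tied to ground: the switch is open, so C is not connected to A.
opened = pass_through(False, 1)
```

Note that the gate value only decides whether the two sides are connected; it never appears on the output itself.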


== Nodes ==
== Nodes ==


Electrically common areas are called ''nodes'' in Visual 6502/2C02/2A03. Clicking
Electrically common areas are called ''nodes'' in Visual 6502/2C02/2A03. Clicking on a node highlights it, making it easier to see how things are connected. (Clicking on powered or grounded diffusion won't work; these only modify properties of other nodes and are not themselves nodes.) When a node is highlighted, a numeric ID unique to the node is displayed in the upper right, along with a name for the node if it has one. Node names are defined in '''nodenames.js'''.
on a node will highlight it, making it easier to see how things are connected
(clicking on powered or grounded diffusion won't work; these only modify
properties of other nodes and are not themselves nodes). When a node is
highlighted, a numeric ID unique to the node will be displayed in the upper
right, along with a name for the node if it has one. Node names are defined in
'''nodenames.js'''.


Transistors can be selected separately by clicking on the gate (the part of the
Transistors can be selected separately by clicking on the gate (the part of the polysilicon between the diffusion areas). They have names that start with 't', followed by a numeric ID.
polysilicon between the diffusion areas). They have names that start with "t",
followed by a numeric ID.


The '''Find:''' edit field can be used to locate nodes, either by numeric ID or by
The '''Find:''' edit field can be used to locate nodes, either by numeric ID or by name. Numeric IDs can also be used to trace the values of nodes without an assigned name.
name. Numeric IDs can also be used to trace the values of nodes without an assigned
name.


== Basic logic elements ==
== Basic logic elements ==
Line 79: Line 106:
[[File:vis_inverter.png|none]]
[[File:vis_inverter.png|none]]


When the input gate is low, current flows into the output wire. When the input
Note that there is a hook in the gate to the left, meaning the left part of the circuit is a [[#Power sources|power source]] instead of a "normal" transistor.
gate is high, current flows into ground, driving the output wire low. The
output wire is hence the inverse of the input wire.


When one node is the inverse of another, it is said that it ''inverts into'' the
When the input gate is low, current from the power source flows into the output wire, pulling the voltage high. When the input gate is high, current from the power source instead flows into ground, driving the voltage on the output wire low. The output wire is hence the inverse of the input wire.
other node.
 
When one node is the inverse of another, we will say that it ''inverts into'' the other node.
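
At the purely digital level, the inverter can be sketched as follows. This is a simplified model that ignores currents and timing; the function name <code>inverter</code> is an assumption made for this example.

```python
def inverter(input_high):
    """Digital model of the inverter described above.

    The power source pulls the output high by default. When the input
    gate is high, the output is connected to ground, which wins over
    the pull-up and drives the output low.
    (Hypothetical helper, for illustration only.)
    """
    pulled_to_ground = input_high
    return not pulled_to_ground
```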


=== NOR gates ===
=== NOR gates ===


Below is an example of a NOR gate taken from Visual 2A03, related to
Below is an example of a NOR gate taken from Visual 2A03, related to controlling when the first square channel is silenced:
controlling when the first square channel is silenced:


[[File:vis_nor.png|none]]
[[File:vis_nor.png|none]]


If any of the gates in red circles are high, the voltage of the
If any of the gates in red circles are high, the voltage of the highlighted node is pulled to ground instead of pulled high (as current will flow from the power source on the left into ground through any high gates). The value that reaches the gate in the blue circle is hence the NOR of the values on the gates in the red circles.
highlighted node will be pulled to ground instead of pulled high (current will
flow to ground through the gates in red circles that are high).  
The value that reaches the gate in the blue circle is hence the NOR of
the values on the gates in the red circles.


The gate in the blue circle is part of a ''pass transistor'', so called because
Note that the circles represent the only transistors in this image (except for the depletion-mode transistors on the power sources). There are polysilicon traces passing above (or in reality, below) metal traces in a few spots, but this does not form a transistor. The highlighting (which was activated by clicking on the node) shows how things are connected.
it passes current between two nodes rather than driving or grounding a node.
 
The gate in this case is [[#APU_clock_signals|'''apu_clk1''']], and we say that value is "buffered on
The gate in the blue circle is part of a ''pass transistor'', so called because it passes current between two nodes rather than driving or grounding a node. The gate in this case is [[#APU_clock_signals|'''apu_clk1''']], and we say that value is "buffered on '''apu_clk1'''".
'''apu_clk1'''".
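
The NOR behavior can be sketched the same way: the node is high only when none of the gates connecting it to ground is high. This is a minimal digital model, and the function name <code>nmos_nor</code> is an assumption made for this example.

```python
def nmos_nor(*gates_high):
    """Digital model of a NOR gate built from pull-down transistors.

    Each argument is True when the corresponding gate (a red circle in
    the picture) is high. Any high gate opens a path to ground, which
    wins over the pull-up, so the node is high only if no gate is high.
    (Hypothetical helper, for illustration only.)
    """
    return not any(gates_high)
```

With a single input this reduces to the inverter, and adding more pull-down transistors to the same node adds more inputs to the NOR.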


== Storage elements ==
== Storage elements ==


=== Cross-coupled inverters ===
=== Wire capacitance as storage ===


Two cross-coupled inverters will make a latch (an element that stores a single
This is the simplest form of storage, and so is covered first.
bit). This arrangement is often used for latches that are set or cleared by
specific logic rather than by having a value copied into them (from e.g. a data
bus line).


Below is the VBlank flag from Visual 2C02. To the left the '''vbl_flag''' node is
If a wire is "closed off" so that it is no longer connected to either power or ground, it retains its value for a while through capacitance. This is used to store some short-lived data "on the wire". As an example, here's the read buffer for the 2C02's '''VBlank''' flag, which lets its value be read even though reading [[PPU_registers|$2002]] immediately clears the '''VBlank''' flag:
highlighted, and to the right its inverse is highlighted. (We would label the
 
inverse '''/vbl_flag''', where "'''/'''" denotes "inverse" or "active low"). As can be
[[File:vis_vblbuf.png|none]]
seen by the two gates in gray circles, each inverts into the other, forming
 
two cross-coupled inverters.
The circled gate is controlled by the '''/read_2002_output_vblank_flag''' signal (shortened to '''ov''' from here on). While '''ov''' is high, the value of '''vbl_flag''' (or rather '''/vbl_flag''' in this case) is connected to the highlighted wire. When '''ov''' goes low, the value on the wire is held.
 
A '/' denotes 'inverse', meaning the signal is the inverse of another signal with the same name but without the '/'. A '/' can also mean 'active low', meaning the signal is considered "active" when low. Visual 6502 has a slightly different convention – see [[#Node_names_in_Visual_6502|Node names in Visual 6502]].
 
While a node or wire is isolated from both VCC and ground in the above fashion, it is said to be ''floating''. For bus lines, a floating line is said to be ''tri-stated'', as the floating state can be viewed as a third state in addition to 0 and 1. This third state allows other devices to use the bus without interference.
 
Using capacitance as storage in the above fashion is an instance of [[wikipedia:Dynamic_logic_%28digital_electronics%29|dynamic logic]], so called since it has time-dependent behavior beyond just the input clock. Chips that make use of dynamic logic techniques tend to have a minimum clock speed at which they function correctly, as values stored via capacitance degrade to zero over time.
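
A purely digital sketch of this kind of storage is shown below. The class name <code>FloatingNode</code> and its method are assumptions made for this example, and the model deliberately leaves out the gradual loss of charge, so the held value never degrades here as it would on a real chip.

```python
class FloatingNode:
    """Digital model of a wire that holds its value through capacitance.

    (Hypothetical class, for illustration only. Charge leakage is not
    modeled, so a held value is kept forever.)
    """

    def __init__(self):
        self.value = None  # unknown until the node is first driven

    def update(self, pass_gate_high, driver_value):
        # While the pass gate is high, the node follows the driver.
        if pass_gate_high:
            self.value = driver_value
        # While the pass gate is low, the node floats and keeps its value.
        return self.value


node = FloatingNode()
node.update(True, 1)          # gate high: the node is driven to 1
held = node.update(False, 0)  # gate low: the driver changed, the node did not
```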
 
=== Latches (cross-coupled inverters) ===
 
Two cross-coupled inverters make a ''latch'' – an element that stores a single bit.
 
Below is the VBlank flag from Visual 2C02. In the left-most picture the '''vbl_flag''' node is highlighted, and in the middle picture its inverse ('''/vbl_flag''') is highlighted. As can be seen by the two gates in gray circles, each node inverts into the other, forming two cross-coupled inverters.


[[File:vis_crossreg.png|none]]
[[File:vis_crossreg.png|none]]


The gates marked "set" and "clear" set and clear the latch, respectively. To clear
The gates marked ''set'' and ''clear'' set and clear the latch, respectively. To clear the latch, '''vbl_flag''' is driven low. To set the latch, '''/vbl_flag''' is driven low.
the latch, '''vbl_flag''' is driven low. To set the latch, '''/vbl_flag''' is driven low.


This circuit is an example of an ''[[wikipedia:Flip-flop (electronics)#SR_NOR_latch|SR Latch]]'', where ''S'' stands for ''set'' and ''R'' for ''reset'', corresponding
This circuit is an example of an ''[[wikipedia:Flip-flop (electronics)#SR_NOR_latch|SR Latch]]'', where ''S'' stands for ''set'' and ''R'' for ''reset'', corresponding to the ''set'' and ''clear'' gates above. It is more specifically an SR NOR Latch, as it can be viewed as being built of NOR gates. The corresponding schematic using NOR gates is shown in the right-most picture.
to the ''set'' and ''clear'' gates above. It is more specifically an SR NOR Latch, as it can be viewed as being
built of NOR gates, where e.g. the values on the ''set'' gate and the upper gate in the gray circle constitute the inputs to a NOR gate. The corresponding schematic using NOR gates is shown on the
right.
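
The SR NOR latch can be sketched as two NOR gates that are re-evaluated until their outputs stop changing. This is a minimal digital model, not simulator code; the names <code>nor</code> and <code>sr_latch_settle</code> are assumptions made for this example, with <code>q</code> standing for '''vbl_flag''' and <code>nq</code> for '''/vbl_flag'''.

```python
def nor(a, b):
    """Two-input NOR on 0/1 values."""
    return int(not (a or b))


def sr_latch_settle(q, nq, s, r):
    """Settle a latch made of two cross-coupled NOR gates.

    q, nq: current values of the two nodes (nq is the inverse of q).
    s, r:  values on the 'set' and 'clear' gates.
    Returns the (q, nq) pair once the outputs stop changing.
    (Hypothetical helper, for illustration only.)
    """
    for _ in range(4):  # a few passes are enough for the cases shown here
        new_q = nor(r, nq)       # 'clear' or a high nq pulls q low
        new_nq = nor(s, new_q)   # 'set' or a high q pulls nq low
        if (new_q, new_nq) == (q, nq):
            break
        q, nq = new_q, new_nq
    return q, nq


state = sr_latch_settle(0, 1, s=1, r=0)   # set the latch
state = sr_latch_settle(*state, s=0, r=0)  # both gates low: value is held
```

Setting ''s'' and ''r'' high at the same time is not meaningful for this kind of latch and is not handled by the sketch.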


=== Clocked latches ===
=== Clocked latches ===


When a latch can be set directly from the value of some line, e.g. a data bus
When a latch can be set directly from the value of some line, e.g. a data bus line, an arrangement involving a clock is often used. The motivation is to avoid having to form both '''data_line''' and '''/data_line''' and route them to the ''set'' and ''clear'' terminals of the latch, which would use more logic. The clock is already routed all around the chip, so mixing it in usually isn't as much of a problem.
line, an arrangement involving a clock is often used. The motivation is to
avoid having to form both '''data_line''' and '''/data_line''' and route them to the
respective terminals of the latch, which would use more logic. (The clock is
already routed all around the chip, so mixing it in usually isn't as much of a
problem.)


As an example, here's the '''noi_lfsrmode''' node (the "Loop noise" flag from
As an example, here's the '''noi_lfsrmode''' node (the ''Loop noise'' flag from [[APU Noise|$400E]]):
[[APU Noise|$400E]]):


[[File:vis_clockedreg.png|none]]
[[File:vis_clockedreg.png|none]]


When [[#APU_clock_signals|'''apu_clk1''']] is high, '''noi_lfsrmode''' will flow into the second highlighted node,
While [[#APU_clock_signals|'''apu_clk1''']] is high, '''noi_lfsrmode''' will flow into the floating node (so called because it will [[#Wire_capacitance_as_storage|float]] when both '''apu_clk1''' and '''w400e''' are low), which then inverts into '''noi_/lfsrmode''', forming a cross-coupled inverter latch. While '''apu_clk1''' is low, the loop will be broken momentarily, and during this phase a new value can be copied into the latch through the gate controlled by the '''w400e''' signal (which goes high on writes to $400E). The value let through by the pass transistor is the '''db7''' node, corresponding to the 8th bit of the data bus. (There's a [[#Terms|via]] between the diffusion and the metal '''db7''' line, which is easier to see if the node is highlighted.) If the loop were not broken during the write operation, the old value in the latch would interfere with setting a new value.
which then inverts into '''/noi_lfsrmode''', forming a cross-coupled inverter latch.
While '''apu_clk1''' is low, the loop will be broken momentarily, and during this
phase a new value can be copied into the latch through the gate controlled by the '''w400e'''
signal (which goes high on writes to $400E). The value let through by the pass transistor is the
'''_db7''' node, corresponding to the seventh bit of the data bus. (There's a [[#Terms|via]] between
the diffusion and the metal '''_db7''' line - easier to see if the node is highlighted.) If the loop was not
broken during the write operation, the old value in the latch would interfere with setting a new value.


=== Wire capacitance as storage ===
For another, less cluttered view of the same type of circuit, see [http://www.ece.unm.edu/~jimp/vlsi/slides/chap5_2-21.gif this image] (substitute "'''apu_clk1'''" for '''"/φ₁"''' and "'''w400e'''" for '''"φ₁"''').


If a wire is "closed off" so that it is no longer connected to neither power
(The circuitry in the lower-right corner is a ''multiplexer'', which selects between one of two inputs depending on whether '''noi_lfsrmode''' or '''noi_/lfsrmode''' is high; i.e., depending on whether '''noi_lfsrmode''' is 0 or 1. The output of the multiplexer is on the left side.)
nor ground, it will retain its value for a while through capacitance. This is
used to store some short-lived data "on the wire" without requiring a latch
(this is called [[wikipedia:Dynamic_logic_%28digital_electronics%29|dynamic logic]],
since it has time-dependent behavior beyond just the input clock). As an example,
here's the read buffer for the 2C02's VBlank flag, which lets its value be read even though
reading [[PPU_registers|$2002]] immediately clears the VBlank flag:
 
[[File:vis_vblbuf.png|none]]
 
When the circled gate (controlled by the '''/read_2002_output_vblank_flag''' signal) goes low, the
value on the wire is held. When the circled gate is high, the value of '''vbl_flag'''
(or rather '''/vbl_flag''' in this case) is connected to the wire.
 
The clocked latch, described above, also makes use of wire capacitance when both the clock and the write enable are low. Chips which make use of this technique tend to have a minimum clock speed at which they can function correctly.
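
The three situations for the clocked latch (loop closed, write in progress, floating) can be summarized in a small sketch. This is a simplified digital model, not simulator code: the function name <code>clocked_latch_step</code> is an assumption made for this example, and the case where the clock and the write enable are high at the same time is simply treated as "keep the old value", which glosses over the interference described above.

```python
def clocked_latch_step(stored, apu_clk1, w400e, db7):
    """Return the value held by the clocked latch after one step.

    stored:   value currently held (0 or 1).
    apu_clk1: True while the clock is high (feedback loop closed).
    w400e:    True while the write enable is high.
    db7:      value on the data bus line (0 or 1).
    (Hypothetical helper, for illustration only.)
    """
    if not apu_clk1 and w400e:
        # Loop broken and write gate open: the new value is copied in.
        return db7
    # Clock high: the cross-coupled inverters hold the value.
    # Clock low and write enable low: the node floats and holds the
    # value through capacitance (charge leakage is not modeled).
    return stored
```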


=== DRAM (Dynamic RAM) ===
=== DRAM (Dynamic RAM) ===
Line 178: Line 176:
In the left and right pictures the two sides of the cell are highlighted (with a different highlight color on the right due to the node being high). The two nodes are always inverses of each other, with the node highlighted in the left picture corresponding to the value held in the cell (low for 0 and high for 1).
In the left and right pictures the two sides of the cell are highlighted (with a different highlight color on the right due to the node being high). The two nodes are always inverses of each other, with the node highlighted in the left picture corresponding to the value held in the cell (low for 0 and high for 1).


Note that this is ''not'' an instance of cross-coupled inverters, as neither node is directly connected to a power source. Rather, DRAM depends on capacitance to hold the value, which will fade unless the capacitor is regularly ''refreshed'' (the high side recharged). (This is the "dynamic" part of DRAM.)
Note that this is ''not'' an instance of cross-coupled inverters, as neither node is directly connected to a power source. Rather, DRAM depends on capacitance to hold the value, which will fade unless the capacitor is regularly ''refreshed'' (the high side recharged). This is the "dynamic" part of DRAM.


Below is a picture of the upper edge of the PPU OAM DRAM array:
Below is a picture of the upper edge of the PPU OAM DRAM array:
Line 184: Line 182:
[[File:vis_oam.png|none]]
[[File:vis_oam.png|none]]


(The "column" and "row" labels are conventional memory terminology; they confusingly happen to get the opposite orientation in Visual 2C02. "Row" and "column" below will refer to this terminology.)
The "column" and "row" labels are conventional memory terminology; they confusingly happen to get the opposite orientation in Visual 2C02. "Row" and "column" below will refer to this terminology.


The '''spr_row''x''''' lines (sometimes called ''word lines'') are used to connect a row of memory cells to the horizontal ''bit lines'' (by opening up each cell to a pair of [[#Terms|vias]]); this is called ''opening'' that row. For example, '''spr_row16''' opens the highlighted row, while '''spr_row0''' opens the row on its right side. As can be guessed from the node names, the memory layout is not as straightforward as consecutive memory locations being stored in consecutive rows.
The '''spr_row''x''''' lines (sometimes called ''word lines'') are used to connect a row of memory cells to the horizontal ''bit lines'' (by opening up each cell to a pair of [[#Terms|vias]]); this is called ''opening'' that row. For example, '''spr_row16''' opens the highlighted row, while '''spr_row0''' opens the row on its right side. As can be guessed from the node names, the memory layout is not as straightforward as consecutive memory locations being stored in consecutive rows. (Interestingly, we do get consecutive rows if we reverse the bits in the part of the sprite address that selects the row. It is unknown why the row selection bits were not wired to the DRAM in this "correct" configuration instead.)


On the left side of OAM we see pass transistors on the '''spr_col1''' and '''spr_col3''' lines select the bit lines from the first and second columns of the memory array, respectively (there are other, similar, lines next to them). Each such '''spr_col''x''''' line is connected to eight different columns (16 bit lines), corresponding to the eight bits of the byte to be read or written (increasing bit positions are not stored in consecutive columns either). One notable exception to this pattern is that two columns only connect to '''five''' sets of bit lines; these columns correspond to the "flags" bytes in OAM, where the middle 3 bits don't actually exist.
On the left side of OAM we see pass transistors on the '''spr_col1''' and '''spr_col3''' lines select the bit lines from the first and second columns of the memory array, respectively (there are other, similar, lines next to them). Each such '''spr_col''x''''' line is connected to eight different columns (16 bit lines), corresponding to the eight bits of the byte to be read or written (increasing bit positions are not stored in consecutive columns either). One notable exception to this pattern is that two columns only connect to '''five''' sets of bit lines; these columns correspond to the "flags" bytes in OAM, where the middle 3 bits don't actually exist.
Line 194: Line 192:
At the right side in the picture above we see [[#PPU_clock_signals|'''pclk0''']] running down the edge of OAM, connected to [[#Terms|pull-up transistors]] for each bit line. During '''pclk0''', these are used to ''precharge'' the bit lines, after which the pull-up transistors are disabled but the lines remain charged through capacitance. When the selected row is opened after '''pclk0''', it will be exposed to the precharged bit lines, which has the effect of charging up the high side of the cell. On the low side of the cell, the precharge current will simply drain to ground, as the gate on that side will be driven high.
At the right side in the picture above we see [[#PPU_clock_signals|'''pclk0''']] running down the edge of OAM, connected to [[#Terms|pull-up transistors]] for each bit line. During '''pclk0''', these are used to ''precharge'' the bit lines, after which the pull-up transistors are disabled but the lines remain charged through capacitance. When the selected row is opened after '''pclk0''', it will be exposed to the precharged bit lines, which has the effect of charging up the high side of the cell. On the low side of the cell, the precharge current will simply drain to ground, as the gate on that side will be driven high.


In a typical DRAM circuit, the rows are automatically and periodically refreshed to prevent values from fading. In the PPU, no such logic exists, and rows are only refreshed when accessed. The reason the PPU (usually) gets away with this is that [[PPU_sprite_evaluation|sprite evaluation]] will access the entire OAM (provided rendering is enabled), refreshing the rows as a side effect.
In a typical DRAM circuit, the rows are automatically and periodically refreshed to prevent values from fading. In the PPU, no such logic exists, and rows are only refreshed when explicitly accessed. The reason the PPU (usually) gets away with this is that [[PPU_sprite_evaluation|sprite evaluation]] will access the entire OAM (provided rendering is enabled), refreshing the rows as a side effect.
 
In Visual 2C02, the precharge logic has been disconnected (clicking on the gates of the pull-up transistors will show that there are no transistors there, even though the display draws them as if they were present), as it is not necessary in a purely digital simulator and causes timing glitches.


=== SRAM (Static RAM) ===
=== SRAM (Static RAM) ===
Line 208: Line 204:
== Miscellaneous circuitry ==
== Miscellaneous circuitry ==


=== PLAs (Programmable Logic Arrays) ===
=== Decoders and mask ROMs ===


[[wikipedia:Programmable_logic_array|PLA]]s are [[wikipedia:Combinational_logic|combinational]] circuits implementing Boolean functions, often
*A [[wikipedia:Decoder|decoder]] is a circuit that maps input values to output values. A decoder that maps ''m'' input lines to ''n'' output lines is called an ''m-to-n decoder''.
used for decoding and "lookup table" functionality, constructed with an AND gate plane that feeds into an OR gate plane.
*A [[wikipedia:Mask_ROM|mask ROM]] is a type of read-only memory constructed by masking off parts of a circuit grid.


Pictured below is the PLA that acts as the lookup table for initialization of the [[APU_Length_Counter|length counters]] in the APU:
The two elements are covered together since their implementation turns out to be similar in this case.


[[File:vis_len_pla.png|none]]
Pictured below is the decoder and mask ROM that act as the lookup table for initialization of the [[APU_Length_Counter|length counters]] in the APU:


The length is set by writing bits 7-3 of e.g. '''$4003''' (in the case of the
[[File:vis_len_rom.png|none]]
first pulse channel), so the inputs to the PLA are bits 7-3 of the data bus.
The output is the corresponding length from the lookup table, which is used to
initialize a counter that counts down to zero before silencing the channel.


The picture below shows a zoomed-in view of the lower part of the PLA:
The length is set by writing bits 7-3 of e.g. '''$4003''' (in the case of the first pulse channel), so the inputs to the decoder are bits 7-3 of the data bus. The output from the decoder feeds into the mask ROM, and the output from the mask ROM is the length from the lookup table. The length is used to initialize a counter that counts down to zero before silencing the channel.
 
The picture below shows a zoomed-in view of the lower part of the decoder and mask ROM:


[[File:vis_len_pla_zoom.png]]
[[File:vis_len_pla_zoom.png]]


The spots of yellow diffusion in the AND and OR planes are connections to the
The spots of yellow diffusion in the decoder and mask ROM are connections to the metal wires, which run horizontally in the decoder and vertically in the mask ROM. By setting the gates connected to the diffusion high, the wires can be driven low.
metal wires, which run horizontally in the AND plane and vertically in the OR
plane. By setting the gates connected to the diffusion high, the wires can be driven
low.


In the AND plane (right part) of the PLA the input lines and their inverses run
In the decoder (right part) the input lines and their inverses run vertically ('''/db7''' has been highlighted to show its connection). By looking carefully at the bottom-most horizontal row in the decoder, we see that it is powered on the right side, and that the condition for it to remain high as it passes into the mask ROM is <span style="border:dotted thin gray">'''/db7''' AND '''/db6''' AND '''/db5''' AND '''/db4''' AND '''/db3'''</span>. Another way to put this condition is '''db7'''-'''db3''' = $00.
vertically ('''/db7''' has been highlighted to show its connection). By looking
carefully at the bottom-most horizontal row in the AND plane, we see that it is
powered on the right side, and that the condition for it to remain high as it
passes into the OR plane is <span style="border:dotted thin gray">'''/db7''' AND '''/db6''' AND '''/db5''' AND '''/db4''' AND '''/db3'''</span>.
Another way to put this condition is '''db7'''-'''db3''' = $00.


Similarly, the condition for the second row from the bottom to be high is
Similarly, the condition for the second row from the bottom to be high is <span style="border:dotted thin gray">'''/db7''' AND '''/db6''' AND '''/db5''' AND '''db4''' AND '''/db3'''</span>, which translates to '''db7'''-'''db3''' = $02. The conditions for the third and fourth rows from the bottom are '''db7'''-'''db3''' = $04 and '''db7'''-'''db3''' = $06, respectively.
<span style="border:dotted thin gray">'''/db7''' AND '''/db6''' AND '''/db5''' AND '''db4''' AND '''/db3'''</span>,
which translates to '''db7'''-'''db3''' = $02. The conditions for the third and fourth rows from the bottom are '''db7'''-'''db3''' = $04
and '''db7'''-'''db3''' = $06, respectively.


The AND plane is set up so that '''db''x''''' and '''/db''x''''' will never both drive the same
The decoder is set up so that '''db''x''''' and '''/db''x''''' will never both drive the same horizontal line low (which would make it impossible for that line to ever be high), and in this case each row has a unique bit pattern that activates it. (It would also be possible to insert a "don't care" condition in the decoder by having ''neither'' '''db''x''''' nor '''/db''x''''' drive the line low.)
horizontal line low (which would make it impossible for that line to ever be
high), and in this case each row has a unique bit pattern that activates it. (It would also be
possible to insert a "don't care" condition in the PLA by having ''neither'' '''db''x''''' nor
'''/db''x''''' drive the line low.)


In the OR plane (so called because each vertical line makes a (N)OR with
The decoder here is a 5-to-32 decoder, with 32 rows corresponding to the 32 possible bit patterns made with five bits. This type of decoder is said to ''fully decode'' its inputs, and is an instance of an ''n-to-2<sup>n</sup> decoder''.
horizontal lines from the AND plane outputs), we see that each horizontal line
from the AND plane when high will cause a particular pattern to appear on the
'''len''x''''' outputs. Reading off the bottom row, this pattern is '''len7'''-'''0''' = 00001001b = 9.
Reading off the remaining rows from bottom to top, we get the values
00010011b = 19, 00100111b = 39, and 01001111b = 79.


Putting together the above, we have the following incomplete map from inputs to
In the mask ROM (this one in particular being a NOR ROM), we see that each horizontal line from the decoder when high will cause a particular pattern to appear on the '''len''x''''' outputs. Reading off the bottom row, this pattern is '''len7'''-'''0''' = 00001001b = 9. Reading off the remaining rows from bottom to top, we get the values 00010011b = 19, 00100111b = 39, and 01001111b = 79.
outputs:
 
Putting together the above, we have the following incomplete map from inputs to outputs:


{|
{|
|}
|}


By checking against the [[APU_Length_Counter|APU length counter table]], we see
By checking against the [[APU_Length_Counter|APU length counter table]], we see that these indeed are the length values corresponding to those indices (minus one, due to details of how the length counter works).
that these indeed are the length values corresponding to those indices (minus one,
due to details of how the length counter works).
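The decoder and mask ROM described above can be mimicked with a small behavioral sketch. This is not taken from the simulator source: the function names are invented for illustration, and only the four rows read off the picture are filled in (all other rows are left out, so they read back as 0 here).

```python
# Partial ROM contents read off the picture: db7-db3 index -> len7-0 value.
ROM_ROWS = {
    0x00: 0b00001001,  # 9
    0x02: 0b00010011,  # 19
    0x04: 0b00100111,  # 39
    0x06: 0b01001111,  # 79
}

def row_is_high(pattern, index, bits=5):
    """A decoder row stays high only if none of its pull-down transistors conducts."""
    for bit in range(bits):
        wanted = (pattern >> bit) & 1
        actual = (index >> bit) & 1
        # A row that wants a 0 has its pull-down gated by db_x;
        # a row that wants a 1 has it gated by /db_x instead.
        gate = actual if wanted == 0 else 1 - actual
        if gate:
            return False
    return True

def read_rom(index):
    """OR together the patterns of all rows that are high (at most one here)."""
    value = 0
    for pattern, contents in ROM_ROWS.items():
        if row_is_high(pattern, index):
            value |= contents
    return value

if __name__ == "__main__":
    for index in sorted(ROM_ROWS):
        print(f"db7-db3 = ${index:02X} -> len = {read_rom(index)}")
```

Because every row checks all five bits, exactly one of the 32 possible patterns is high for any input, which is what "fully decoded" means.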


Some PLAs omit the OR gate plane. As an example, the picture below shows the
To give an example of a decoder that does not feed into a mask ROM, the picture below shows the internal 2A03 address decoder for the address range '''$4000'''-'''$4017''', where signals such as '''r4017''' (read 4017) and '''w4004''' (write 4004) are generated.
internal 2A03 address decoder for the address range '''$4000'''-'''$4017''', where signals
such as '''r4017''' (read 4017) and '''w4004''' (write 4004) come directly from the AND
plane outputs.


[[File:vis_addr_pla.png]]
[[File:vis_addr_pla.png]]
The theory behind the decoder and mask ROM seen here is closely related to that of [[wikipedia:Programmable_logic_array|PLAs]] (Programmable Logic Arrays), where we could view the decoder as the AND plane and the mask ROM as the OR plane (both implemented with NOR gates). [http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Comb/pla.html This introduction to PLAs] is helpful.


=== Adders ===
=== Adders ===


Pictured below is part of the [[wikipedia:Adder_(electronics)|adder]] used by
Pictured below is part of the [[wikipedia:Adder_(electronics)|adder]] used by the [[APU_Sweep|sweep units]] in the 2A03 to calculate the target period for sweep period updates to the second square channel (the first square channel is identical except for a small quirk related to subtraction; see below). The pictured part calculates the second bit (bit 1) of the sum, along with the carry for that bit position.
the [[APU_Sweep|sweep units]] in the 2A03 to calculate the target period for sweep
period updates to the second square channel (the first square channel is identical
except for a small quirk related to subtraction; see below). The pictured part
calculates the second bit (bit 1) of the sum, along with the carry for that bit
position.


[[File:vis_adder.png]]
[[File:vis_adder.png]]


The adder is split into two parts. The left-most part (having four columns)
The adder is split into two parts. The left-most part (having four columns) calculates bit 1 of the sum. The right-most part (with three columns) calculates the carry. Both '''/sum1 out''' and '''/carry out''' are powered, and can be forced low by certain combinations of the input signals being high. (For the left-most column, for example, this combination is <span style="border:dotted thin gray">'''addend1''' AND '''carry in''' AND '''sq1_p1'''</span>.) The essential information is captured in the following truth table:
calculates bit 1 of the sum. The right-most part (with three columns)
calculates the carry. Both '''/sum1 out''' and '''/carry out''' are powered,
and can be forced low by certain combinations of the input signals being high
(for e.g. the left-most column, this combination is
<span style="border:dotted thin gray">'''addend1''' AND '''carry in''' AND '''sq1_p1'''</span>).
The essential information is captured in the following truth table:


{| class="wikitable"
{| class="wikitable"
|}
|}


As expected, this corresponds to an addition operation (with the sum and carry
As expected, this corresponds to an addition operation (with the sum and carry inverted).
inverted).


The same logic is used to perform subtraction, by inverting each bit of the
The same logic is used to perform subtraction, by inverting each bit of the addend (using separate logic) and setting the carry in for the zeroth bit to 1. This corresponds to the usual invert-bits-and-add-one operation for negating a number in two's complement.
addend (using separate logic) and setting the carry in for the zeroth bit to 1.
This corresponds to the usual invert-bits-and-add-one operation for negating a
number in two's complement.


For unknown reasons, the carry in for the zeroth bit is not connected on the
For unknown reasons, the inverted carry input for the zeroth bit of the first square channel is connected to VCC instead of the inverted sweep direction flag (as it is in the other square channel), making the carry input unconditionally zero. As a result, that channel only adds the inverted bits without the extra one, so its result is one lower than a true subtraction (period − value − 1).
first square channel, making it always zero. This leads to the value ''minus one''
being subtracted instead on that channel.
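The arithmetic described above can be checked with a short behavioral sketch. It models the logic only, not the transistor columns; the function names are invented for illustration, and the 11-bit width is an assumption made for the example.

```python
def full_adder_bit(a, b, carry_in):
    """One bit position. Returns (/sum, /carry): the outputs are inverted,
    as in the pictured circuit."""
    total = a + b + carry_in
    return 1 - (total & 1), 1 - (total >> 1)

def ripple_add(x, y, carry_in, width=11):
    """Chain the one-bit adders, undoing the inversion between stages."""
    result = 0
    carry = carry_in
    for bit in range(width):
        n_sum, n_carry = full_adder_bit((x >> bit) & 1, (y >> bit) & 1, carry)
        result |= (1 - n_sum) << bit
        carry = 1 - n_carry
    return result

def sweep_subtract(period, amount, carry_in, width=11):
    """Subtract by inverting every bit of the addend and adding."""
    mask = (1 << width) - 1
    return ripple_add(period, ~amount & mask, carry_in, width)

if __name__ == "__main__":
    print(sweep_subtract(100, 30, carry_in=1))  # carry in of 1: true subtraction
    print(sweep_subtract(100, 30, carry_in=0))  # carry in stuck at 0: one lower
```

With the carry in set to 1 the sketch yields an ordinary two's complement subtraction; with the carry in held at 0, the result is one lower.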


=== Barrel shifters ===
=== Barrel shifters ===


The below circuitry forms part of a [[wikipedia:Barrel_shifter|barrel shifter]], used to shift the inputs to
The circuitry below forms part of a [[wikipedia:Barrel_shifter|barrel shifter]], used in this case to shift the inputs to the adders for sweep unit period updates.
the adders for sweep unit period updates in this case.


[[File:vis_barrel_shifter.png]]
[[File:vis_barrel_shifter.png]]


As a side note, the bit inversion for subtraction by the sweep units happens ''before'' the bits
As a side note, the bit inversion for subtraction by the sweep units happens ''before'' the bits enter the barrel shifter.
enter the barrel shifter.


=== Shift registers ===
=== Shift registers ===
''(This section might be considered "advanced" on a first reading. I just wanted an example that made more complex use of clocks.)''
''(This section might be considered "advanced" on a first reading. I just wanted an example that made more complex use of clocks.)''


The picture below shows the 16-bit [[wikipedia:Shift_register|shift register]] that holds the high bits
The picture below shows the 16-bit [[wikipedia:Shift_register|shift register]] that holds the high bits for background tiles (see the [[PPU_rendering|PPU rendering]] page). The upper eight bits can be reloaded from PPU VRAM data bus lines, and the output is taken from the lower eight bits (in this case, the particular bit to use is selected by the '''[[PPU scrolling|fine x scroll]]'''). Bits flow clockwise through the shift register.
for background tiles (see the [[PPU_rendering|PPU rendering]] page). The upper eight bits can be
reloaded from PPU VRAM data bus lines, and the output is taken from the lower
eight bits (in this case, the particular bit to use is selected by '''fine_x''').
Bits flow clockwise through the shift register.


[[File:vis_shift_reg.png|none]]
[[File:vis_shift_reg.png|none]]


Below is a zoomed-in view of three bits ('''tile_h15'''-'''13''') from the upper-left part
Below is a zoomed-in view of three bits ('''tile_h15'''-'''13''') from the upper-left part of the shift register:
of the shift register:


[[File:vis_shift_reg_zoom.png|none]]
[[File:vis_shift_reg_zoom.png|none]]
==== Control signals ====
==== Control signals ====


The following signals control the shifting and reloading of the register (the
The following signals control the shifting and reloading of the register (the names used were invented for the article and are not standard terminology):
names used were invented for the article and are not standard terminology):


* The '''Invert''' signal corresponds to '''pclk0''', which is high during the initial half-cycle of a PPU cycle (see the [[#Clocks|Clocks]] section).
* The '''Invert''' signal corresponds to '''pclk0''', which is high during the initial half-cycle of a PPU cycle (see the [[#Clocks|Clocks]] section).
* The '''Parallel load''' signal controls pass transistors connected to '''_db0'''-'''7''', used to load the upper eight bits of the shift register.
* The '''Parallel load''' signal controls pass transistors connected to '''_db0'''-'''7''', used to load the upper eight bits of the shift register.


'''Shift''' does not always exactly mirror '''pclk1''', as explained below, which is
'''Shift''' does not always exactly mirror '''pclk1''', as explained below, which is the reason for the ''~'' notation.
the reason for the ''~'' notation.


==== Shifting ====
==== Shifting ====
Shifting the register is a two-step process:
Shifting the register is a two-step process:


# During '''pclk0''', '''Invert''' is driven high, making the value of '''(1)''' flow through the pass transistor into the node in the blue circle, which causes '''(1)''' to invert into '''(2)'''.
# During '''pclk0''', '''Invert''' is driven high, making the value of '''(1)''' flow through the pass transistor into the red-highlighted node, which causes '''(1)''' to invert into '''(2)'''.
# During '''pclk1''', '''Shift''' is driven high, which causes the node marked '''(2)''' to invert into the node marked '''(3)''' (the next bit of the shift register). '''Invert''' is low during this phase, and the value on the node in the blue circle is held via [[#Wire_capacitance_as_storage|wire capacitance]], which makes this a ''[[wikipedia:Dynamic_logic_(digital_electronics)|dynamic]] shift register''.
# During '''pclk1''', '''Shift''' is driven high, which causes the node marked '''(2)''' to invert into the node marked '''(3)''' (the next bit of the shift register). '''Invert''' is low during this phase, and the value on the red-highlighted node is held via [[#Wire_capacitance_as_storage|wire capacitance]], which makes this a ''[[wikipedia:Dynamic_logic_(digital_electronics)|dynamic]] shift register''.


Due to the bit of powered diffusion circled in red, the default value shifted
Due to the bit of powered diffusion circled in red, the default value shifted into '''(1)''' is 1. However, as the value is held on the inverted side '''(2)''', this means that zeroes are being shifted in.
into '''(1)''' is 1. However, as the value is held on the inverted side '''(2)''', this
means that zeroes are being shifted in.


==== Parallel load ====
==== Parallel load ====


To perform a parallel load of the register, step '''(2)''' from above is modified so
To perform a parallel load of the register, the second step of the shifting process above is modified so that '''Shift''' remains low during '''pclk1''' and '''Parallel load''' goes high instead, causing the new value for each cell to come from the data bus lines instead of from the previous cell.
that '''Shift''' remains low during '''pclk1''' and '''Parallel load''' goes high instead,
causing the new value for each cell to come from the data bus lines instead of
from the previous cell.


The diagram below might clarify how the control signals are related. Each row
The diagram below might clarify how the control signals are related. Each row is a PPU half-cycle.
is a PPU half-cycle.


<pre>
{| class="wikitable"
pclk0 Invert Shift Parallel load
! pclk0 !! Invert !! Shift !! Parallel load
1     1       0     0
|-
0     0       1     0
| 1 || 1 || 0 || 0
1     1       0     0
|-
0     0       1     0
| 0 || 0 || 1 || 0
1     1       0     0
|-
0     0       0     1 <-- Reloaded here
| 1 || 1 || 0 || 0
1     1       0     0
|-
0     0       1     0
| 0 || 0 || 1 || 0
1     1       0     0
|-
0     0       1     0
| 1 || 1 || 0 || 0
1     1       0     0
|-
0     0       1     0
| 0 || 0 || 0 || 1 ← Reloaded here
</pre>
|-
| 1 || 1 || 0 || 0  
|-
| 0 || 0 || 1 || 0
|-
| 1 || 1 || 0 || 0
|-
| 0 || 0 || 1 || 0
|-
| 1 || 1 || 0 || 0
|-
| 0 || 0 || 1 || 0
|}
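The two-phase behavior in the table above can be sketched as follows. This is a behavioral model only, and several things in it are assumptions for illustration: the class and method names are invented, only the eight reloadable upper cells are modeled, and bit ''i'' of the data bus is assumed to load cell ''i''.

```python
class UpperTileBits:
    """Sketch of the eight reloadable cells of the dynamic shift register."""

    def __init__(self):
        self.cells = [0] * 8  # cells[7] stands in for the top bit of the register
        self.held = [1] * 8   # inverted copies, held on wire capacitance

    def half_cycle(self, invert, shift, parallel_load, data_bus=0):
        if invert:
            # pclk0 phase: each cell value inverts into its holding node.
            self.held = [1 - bit for bit in self.cells]
        elif shift:
            # pclk1 phase: each held value inverts into the next cell down.
            # The top cell receives 0, since zeroes are shifted in.
            self.cells = [1 - h for h in self.held[1:]] + [0]
        elif parallel_load:
            # pclk1 phase with Shift low: cells come from the data bus instead.
            self.cells = [(data_bus >> i) & 1 for i in range(8)]

    def value(self):
        return sum(bit << i for i, bit in enumerate(self.cells))

if __name__ == "__main__":
    reg = UpperTileBits()
    reg.half_cycle(invert=1, shift=0, parallel_load=0)
    reg.half_cycle(invert=0, shift=0, parallel_load=1, data_bus=0b10110000)
    reg.half_cycle(invert=1, shift=0, parallel_load=0)
    reg.half_cycle(invert=0, shift=1, parallel_load=0)
    print(f"{reg.value():08b}")
```

The double inversion (cell to holding node, holding node to next cell) is why a shift moves each bit along unchanged.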


=== Digital-to-analog conversion (DAC) ===
=== Digital-to-analog conversion (DAC) ===
[[File:vis_vid_dac.png|none]]
[[File:vis_vid_dac.png|none]]


The upper-left end is actually connected to VCC, and the lower-right to ground. This is a [[wikipedia:voltage ladder|''voltage ladder'']], and works by tapping the wire at different points along the run to get different voltages. As the simulator is purely digital, this circuit is not directly used in the simulation, and some parts that would otherwise interfere with it have been disconnected.
The upper-left end is actually connected to VCC, and the lower-right to ground. This is a [[wikipedia:voltage ladder|''voltage ladder'']], and works by tapping the wire (which behaves as a resistor) at different points along the run to get different voltages. As the simulator is purely digital, this circuit is not directly used in the simulation, and some parts that would otherwise interfere with it have been disconnected.
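As a rough illustration of how tapping a resistive wire yields different voltages, the sketch below treats the run as a chain of resistor segments between VCC and ground. The 5 V supply and the equal segment resistances are assumptions made for the example; the real segment values are not given here.

```python
def tap_voltages(vcc, segment_resistances):
    """Voltage at each tap between segments, walking from the VCC end to ground."""
    total = sum(segment_resistances)
    remaining = total
    voltages = []
    for resistance in segment_resistances[:-1]:
        remaining -= resistance  # resistance left between this tap and ground
        voltages.append(vcc * remaining / total)
    return voltages

if __name__ == "__main__":
    # Four equal segments give three evenly spaced taps.
    print(tap_voltages(5.0, [1.0, 1.0, 1.0, 1.0]))
```

Each tap sees the supply voltage scaled by the fraction of the total resistance that lies between it and ground.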


=== Output drivers ===
=== Output drivers ===


These are found on pins capable of doing output, which need to be able to source (generate) and sink large currents to drive the line high or low. Large clusters of [[#Terms|pull-up and pull-down]] transistors like these are sometimes called ''superbuffers''. The polysilicon wire that would cause the pin to source current is highlighted below.
These are found on pins capable of doing output, which need to be able to source (generate) and sink large currents to drive the line high or low. The polysilicon wire that would cause the pin to source current is highlighted below.


[[File:vis_output_driver.png|none]]
[[File:vis_output_driver.png|none]]
Large clusters of [[#Terms|pull-up and pull-down]] transistors like these are sometimes called ''superbuffers''. They also appear in some internal circuits that need to source or sink larger currents, e.g. due to having a large [[wikipedia:Fan-out|''fan-out'']] – a large number of connections from the logic gate's output to inputs of other gates.
On lines that are capable of being tri-stated, this is done by activating neither the pull-up nor the pull-down transistors, so that the pin neither sources nor sinks current. This is also done for reads on bidirectional lines, to prevent the output driver from interfering.


=== Cut-off connections ===
=== Cut-off connections ===


The way diffusion is powered or grounded is through vias to large areas of metal that are either grounded or powered.
The way diffusion is powered or grounded is through vias to large areas of metal that are either grounded or powered.
== Transistor dimensions ==
Visual6502, Visual2A03, and Visual2C02 are purely digital simulators, so the effects of transistor dimensions don't matter. But you will often notice locations in the simulators where transistors are different shapes.
Here's the inverter from the beginning of this tutorial, now annotated with dimensions:
[[Image:Vis_inverter_dimensions.png]]
Because the layer of substrate has uniform thickness, everything can be calculated in terms of [[wikipedia:Sheet resistance|sheet resistance]]. In the above annotated picture, two transistors are shown: one significantly wider than long, and the other the opposite. The aspect ratio (length divided by width) of the depletion mode pull-up transistor on the left is approximately 3.47, while the aspect ratio of the enhancement mode pull-down transistor on the right is 0.23. As a result, the pull-down transistor is approximately 15 times more effective at sinking current (which is good: it has to be able to override the pull-up).
The 2A03 uses these analog effects in its audio path:
[[Image:Vis_2a03_pcmout012.png]]
Shown are the three least significant bits of the 2A03's APU's PCM channel. '''pcmout1''' drives a single transistor (with some resistance R) and '''pcmout2''' drives two (resulting in a resistance R÷2). To give '''pcmout0''' a resistance of 2·R, the designers would have had to make its transistor either half as wide or twice as long. Halving the width wasn't an option because the diffusion areas are already as narrow as this manufacturing technology allows. As a result, the gate for the least significant bit is longer.
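The scaling argument in this section can be put into a few lines of arithmetic. The sketch assumes that channel resistance is simply proportional to length divided by width and that identical transistors in parallel divide the resistance; the function name and the unit dimensions are made up for illustration.

```python
def relative_resistance(length, width, parallel=1):
    """Resistance in arbitrary units: proportional to length / width,
    divided by the number of identical transistors in parallel."""
    return (length / width) / parallel

# Inverter: aspect ratios quoted in the text.
pull_up = 3.47
pull_down = 0.23
print(round(pull_up / pull_down))  # how much stronger the pull-down is

# PCM output bits, using a unit-sized transistor as the reference R.
r_pcmout1 = relative_resistance(1, 1)              # R
r_pcmout2 = relative_resistance(1, 1, parallel=2)  # R / 2
r_pcmout0 = relative_resistance(2, 1)              # 2R (gate twice as long)

# Conductance is the reciprocal of resistance; normalize to the lowest bit.
weights = [r_pcmout0 / r for r in (r_pcmout0, r_pcmout1, r_pcmout2)]
print(weights)
```

The resulting conductances double from one bit to the next, matching the binary weights of the three bits.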


== Clocks ==
== Clocks ==
; apu_/clk2
; apu_/clk2
:  Like '''apu_clk1''', but ticks on the opposite phase, and is also inverted so that it has a 75% duty cycle.
:  Like '''apu_clk1''', but ticks on the opposite phase, and is also inverted so that it has a 75% duty cycle.
; apu_clk2''x'', where ''x'' is ''a'', ''b'', ''c'', etc.
:  Inverses of '''apu_/clk2''', used internally in various components.


This clock arrangement helps to ensure that timed events (various counters being decremented or reloaded) do not conflict with writes from the CPU (which only happen when '''φ2''' is high).
This clock arrangement helps to ensure that timed events (various counters being decremented or reloaded) do not conflict with writes from the CPU (which only happen when '''φ2''' is high).
!apu_/clk2
!apu_/clk2
| 1 || 1 || '''0''' || 1 || 1 || 1 || '''0''' || 1
| 1 || 1 || '''0''' || 1 || 1 || 1 || '''0''' || 1
|-
!apu_clk2''x''
| 0 || 0 || '''1''' || 0 || 0 || 0 || '''1''' || 0
|}
|}




; clk0
; clk0
:  The input clock, fed from the [[Clock_rate|master clock]]. Used directly in video waveform generation.
:  The input clock, fed from the [[Cycle_reference_chart#Clock_rates|master clock]]. Used directly in video waveform generation.


; _clk0
; _clk0


; pclk0
; pclk0
:  The '''p'''ixel clock. Derived from clk0 by dividing by four (NTSC) or five (PAL). One cycle corresponds to a rendered dot, with '''pclk0''' being high during the first phase (half-cycle).
:  The '''p'''ixel clock. Derived from '''clk0''' by dividing by four (NTSC) or five (PAL). One cycle corresponds to a rendered dot, with '''pclk0''' being high during the first phase (half-cycle).


; pclk1
; pclk1
:  The inverse of '''pclk0'''. High during the second phase of a pixel clock.
:  The inverse of '''pclk0'''. High during the second phase of a pixel clock.
=== Master clock and CPU/PPU clock alignment ===
The clock divider in the PPU is clocked on a zero-to-one transition of the master clock, while the clock divider in the CPU is clocked on a one-to-zero transition. Diagrammatically, this might look like the following for NTSC (in the CPU clock row, "!" denotes when the 2A03's M2 line goes high but the 6502's internal clock is still low).
<pre>
Master clock    | 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 ...
PPU pixel clock | 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 ...
CPU clock      | 0 0 0 0 0 0 0 0 0 ! ! ! 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 ! ! ! 1 ...
</pre>
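The PPU pixel clock row of the diagram above can be reproduced with a simple divide-by-four sketch that toggles on every second rising edge of the master clock. The function name is invented, the starting phase is chosen to match the diagram, and only the NTSC case is modeled; the CPU row is not, since its "!" states depend on details not covered here.

```python
def pixel_clock(master_levels):
    """Divide-by-four for NTSC: toggle on every second zero-to-one
    transition of the master clock."""
    pclk = 0
    rising_edges = 0
    previous = master_levels[0]
    output = []
    for level in master_levels:
        if previous == 0 and level == 1:
            rising_edges += 1
            if rising_edges % 2 == 0:
                pclk = 1 - pclk
        output.append(pclk)
        previous = level
    return output

if __name__ == "__main__":
    master = [i % 2 for i in range(16)]  # 0 1 0 1 ...
    print(" ".join(str(level) for level in master))
    print(" ".join(str(level) for level in pixel_clock(master)))
```

Each entry is one half-cycle of the master clock, so a full pixel clock period spans eight entries.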
== 6502 cycle and phase timing ==
During each active cycle (i.e., while the RDY line of the 6502 has not been pulled low), the CPU either reads from or writes to memory; there are no "idle" cycles w.r.t. the data bus in the 6502. Each such read or write cycle is split up into two equally long phases, called '''φ1''' (phase 1) and '''φ2''' (phase 2), corresponding to the clock signals above. '''φ1''' takes place while the clock input is low, '''φ2''' while it is high.
During each cycle, the [[CPU_pin_out_and_signal_description|R/W signal]] and the address bus lines are updated during '''φ1'''. In the simulators we see them change right away, but in a real 6502 there will be some delay. At the end of '''φ2''', values are read from or written to the data bus lines. The IRQ and NMI interrupt lines appear to be sampled on the falling edge of '''φ2''' (as indicated in [http://forum.6502.org/viewtopic.php?f=1&t=2532 this thread] and also from observed behavior relating to the '''VBlank''' flag).
[http://nesdev.org/6502_cpu.txt This document] lists data and address bus contents during each cycle of an instruction. For more detailed timing information, see [http://users.telenet.be/kim1-6502/6502/hwman.html the MOS hardware manual].


== Terms ==
== Terms ==


; [[wikipedia:Wire bonding|Bond wire]]
; [[wikipedia:Wire bonding|Bond wire]]
:  A wire that connects an internal pad to an external pin on the chip package; see e.g. [http://farm4.staticflickr.com/3270/4562350638_7bd4f62a76_z.jpg].
:  A wire that connects an internal pad to an external pin on the chip package; see e.g. [[commons:File:Wirebond-ballbond.jpg|this photo]].
 
; Buried contact
; Buried contact
:  A connection between diffusion and polysilicon.
:  A connection between diffusion and polysilicon.
 
; [[wikipedia:NMOS logic|NMOS]]
; NMOS
:  The technology used for the transistors in the 2A03 and 2C02. In NMOS, transistors are made by creating regions of n-doped semiconductor that become the source and drain ("n-doped" because this doping increases the mobility of electrons and their '''n'''egative charge). This type of transistor is good at sinking current to ground (this is what causes a 0 bit to usually "win" in [[Bus conflict|bus conflicts]]), and worse at pulling up. The transistors used in NMOS are more precisely called ''n(-type )MOSFETs''. NMOS transistors are "active" when their gate is connected to a ''high'' voltage (i.e. VCC). See [http://www.youtube.com/watch?v=IcrBqCFLHIY this YouTube video] for another high-level overview of how transistors work.
:  The technology used for the transistors in the 2A03 and 2C02. In NMOS, transistors are made by creating regions of n-doped semiconductor that become the source and drain ("n-doped" because this doping increases the mobility of electrons and their '''n'''egative charge). This type of transistor is good at sinking current to ground (this is what causes a 0 bit to usually "win" in [[Bus conflict|bus conflicts]]), and worse at pulling up. PMOS is the opposite. The transistors used in NMOS and PMOS are more precisely called ''n(-type )MOSFETs'' and ''p(-type )MOSFETs'', respectively.  
; [[wikipedia:PMOS logic|PMOS]]
 
:  The counterpart to NMOS, PMOS is an older and slower (but initially easier to manufacture) technology used for making integrated circuits. In PMOS, transistors are made by creating regions of p-doped semiconductor that become the source and drain ("p-doped" because this doping increases the mobility of electron holes and their '''p'''ositive charge). This type of transistor is good at sourcing current from VCC, and worse at pulling down. The transistors used in PMOS are more precisely called ''p(-type )MOSFETs''. PMOS transistors are "active" when their gate is connected to a ''low'' voltage (i.e. GND).
; [[wikipedia:CMOS logic|CMOS]]
:  The ''combination'' of NMOS and PMOS within a single design, CMOS chips make use of both n-type '''and''' p-type MOSFETs in order to form logic gates. While PMOS requires weak pull-down resistors (often permanently enabled transistors connected to GND) and NMOS requires weak pull-up resistors (originally permanently enabled transistors connected to VCC, but later replaced with [[wikipedia:Depletion-load NMOS logic|depletion loads]]), CMOS makes use of p-type MOSFETs as strong pull-ups and n-type MOSFETs as strong pull-downs, arranged in pairs such that exactly one transistor in each pair is active at any given moment, made even simpler by the fact that a single input signal can connect directly to both transistors (when it is high it activates the NMOS pull-down, and when it is low it activates the PMOS pull-up).
; Open drain
; Open drain
:  A type of output that works by sinking current from an external pull-up resistor instead of generating current on its own. An example is the PPU's INT pin. The pull-up resistor is denoted "RM1" in [[media:neswires.jpg|this wiring diagram]].
:  A type of output that works by sinking current from an external pull-up resistor instead of generating current on its own. An example is the PPU's INT pin. The pull-up resistor is denoted "RM1" in [https://www.nesdev.org/Ntd_8bit.jpg this Famicom wiring diagram].
 
; Pull-up resistor
; Pull-up resistor
:  A resistor connected to power. "Pull-up" comes from pulling the wire to a high state.
:  A resistor connected to power. "Pull-up" comes from pulling the wire to a high state.
; Pull-up transistor
; Pull-up transistor
:  A transistor whose gate when high causes current to flow from a power source.
:  A transistor whose gate when high causes current to flow from a power source.
; Pull-down transistor
; Pull-down transistor
:  The analogue of a pull-up transistor for sinking to ground.
:  The analogue of a pull-up transistor for sinking to ground.
; Via
; Via
:  A connection between polysilicon/diffusion and metal.
:  A connection between polysilicon/diffusion and metal.


== Tips for working with the simulators ==
== Tips for working with the simulators ==
=== Node names in Visual 6502 ===
A hash (#) or tilde (~) on a node name signifies active low or negation in Visual 6502. Due to problems passing hashes in URLs, aliases were automatically introduced that use tildes instead (hence the "automatic alias replacing hash with tilde" comments).


=== Clearing highlighting ===
=== Clearing highlighting ===


Being able to add node names to '''nodenames.js''' can be very helpful when figuring out a circuit. To do this, a local version of the simulator can be downloaded with e.g. '''$ wget --convert-links''' on a *nix system. Please watch the recursion level and avoid downloading data needlessly, as at least Visual 2C02 and Visual 2A03 are hosted on a limited uplink.
Being able to add node names to '''nodenames.js''' can be very helpful when figuring out a circuit. To do this, a local version of the simulator can be downloaded with e.g. '''$ wget --convert-links''' on a *nix system. Please watch the recursion level and avoid downloading data needlessly, as at least Visual 2C02 and Visual 2A03 are hosted on a limited uplink.
=== Extra node names ===
Many additional node names for Visual 2C02 can be found [https://github.com/ulfalizer/Visual-2C02-nodes in this repository]. The repo is maintained separately since it is updated often and not all nodes have been confirmed.
=== PPU chip layout overview ===
A high-level overview of the layout of the PPU [[media:ppuareas.png|can be found here]]. Another, lower-level analysis [http://breaknes.com/files/PPU/tilemap can be found here].

Latest revision as of 16:03, 6 March 2024

This is a crash course on making sense of the NMOS circuit displays in Visual 6502/2C02/2A03, written for people without much low-level electronics experience (like the primary author). It aims to present the information needed to read the diagrams at a basic level in simple language, omitting details that are less important when starting out.

You might want to read the Visual 6502 user's guide and the Visual 2C02 page first.

What the different colored areas are

Let's start by defining what the different colors mean:

Vis areas.png
  • Green areas are diffusion connected to ground.
  • Red areas are diffusion connected to VCC (power).
  • Yellow areas are diffusion that is neither connected directly to ground nor directly to VCC.
  • Gray areas are metal.
  • Purple areas are polysilicon (often shortened to just "poly").

At the level presented here, diffusion, metal, and polysilicon can be thought of as roughly equivalent when viewed in isolation; they all conduct current. The important difference is in how they interact with each other, which is explained below.

Basic building blocks

Transistors

When a piece of polysilicon is sandwiched between two areas of diffusion, it acts as a gate, only letting current through when the polysilicon is connected to power (or, equivalently, is on, conducting, high, or 1). The diffusion area from which current flows when the gate is high is called the source. The diffusion area into which current flows is called the drain. The gate together with the source and drain is what makes a transistor.

Vis transistor.png

The transistor here is an enhancement-mode transistor. All the "ordinary" selectable (see the nodes section) transistors have this type.

Power sources

Around areas of powered diffusion we often see something like the following (note the distinctive "hook" in the polysilicon):

Vis power.png

Here the polysilicon acts roughly like a resistor (or more specifically a pull-up resistor). This prevents there from ever being a short from VCC to ground (through some path of high gates). In the simulators, this entire configuration is simply modeled as a power source.

The transistor here is a depletion-mode transistor, a different type of transistor compared to above (though it appears the same visually).

An example that brings together transistors and power sources

In the following circuit, the gate will be high and A powered/pulled to VCC:

                      VCC
                       |
                       |
                       |
[power source]------[gate]
                       |
                       |
                       |
                       A

No current will ever flow from the power source to A (ideally). The voltage on the gate only controls whether there's a conductive path between VCC and A.

Similarly, the gate will be low and A not powered/pulled to VCC in the following circuit:

                      VCC
                       |
                       |
                       |
GND-----------------[gate]
                       |
                       |
                       |
                       A

Using a switch analogy for transistors (which is common in digital electronics), the above circuits can be viewed as follows:

                                           A
                                           |
                                           |
                                           |
                                           \
B----------[remote switch controller]       \   <- controlled switch
                                             \
                                           |
                                           |
                                           |
                                           C

When B is high, the switch is closed, connecting A and C. When B is low, it is open (like in the figure).

Nodes

Electrically common areas are called nodes in Visual 6502/2C02/2A03. Clicking on a node highlights it, making it easier to see how things are connected. (Clicking on powered or grounded diffusion won't work; these only modify properties of other nodes and are not themselves nodes.) When a node is highlighted, a numeric ID unique to the node is displayed in the upper right, along with a name for the node if it has one. Node names are defined in nodenames.js.

Transistors can be selected separately by clicking on the gate (the part of the polysilicon between the diffusion areas). They have names that start with 't', followed by a numeric ID.

The Find: edit field can be used to locate nodes, either by numeric ID or by name. Numeric IDs can also be used to trace the values of nodes without an assigned name.

Basic logic elements

Inverters

An inverter is constructed like in the image below:

Vis inverter.png

Note that there is a hook in the gate to the left, meaning the left part of the circuit is a power source instead of a "normal" transistor.

When the input gate is low, current from the power source flows into the output wire, pulling the voltage high. When the input gate is high, current from the power source instead flows into ground, driving the voltage on the output wire low. The output wire is hence the inverse of the input wire.

When a node is the input of an inverter whose output is another node, we will say that the first node inverts into the second.
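The inverter's behavior can be sketched in a few lines of Python. This is a purely digital toy model in the spirit of the switch analogy, not how the simulators are implemented; the function names powered_node and inverter are made up for illustration.

```python
def powered_node(pull_downs):
    """Return the value of a node fed by a power source (depletion pull-up).

    pull_downs: list of gate values (0 or 1) of the transistors that can
    connect the node to ground. Any high gate sinks the current to ground,
    overriding the weak pull-up.
    """
    return 0 if any(pull_downs) else 1


def inverter(input_gate):
    # One pull-down transistor: the output is high only while the gate is low.
    return powered_node([input_gate])


for value in (0, 1):
    print(f"in={value} out={inverter(value)}")
```

The same powered_node idea extends to gates with several pull-down transistors on one node.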

NOR gates

Below is an example of a NOR gate taken from Visual 2A03, related to controlling when the first square channel is silenced:

Vis nor.png

If any of the gates in red circles are high, the voltage of the highlighted node is pulled to ground instead of pulled high (as current will flow from the power source on the left into ground through any high gates). The value that reaches the gate in the blue circle is hence the NOR of the values on the gates in the red circles.
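As a rough sketch of the paragraph above, a NOR gate can be modeled as a powered node with one pull-down transistor per input. The three inputs used for the truth table are an arbitrary choice for illustration and are not meant to match the number of circled gates in the image.

```python
from itertools import product


def nor_gate(*gates):
    """Output of a powered node with one pull-down transistor per input."""
    # Any high gate opens a path to ground and wins over the pull-up.
    return 0 if any(gates) else 1


# Truth table for a three-input NOR gate.
for inputs in product((0, 1), repeat=3):
    print(inputs, "->", nor_gate(*inputs))
```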

Note that the circles represent the only transistors in this image (except for the depletion-mode transistors on the power sources). There are polysilicon traces passing above (or in reality, below) metal traces in a few spots, but this does not form a transistor. The highlighting (which was activated by clicking on the node) shows how things are connected.

The gate in the blue circle is part of a pass transistor, so called because it passes current between two nodes rather than driving or grounding a node. The gate in this case is apu_clk1, and we say that the value is "buffered on apu_clk1".

Storage elements

Wire capacitance as storage

This is the simplest form of storage, and so is covered first.

If a wire is "closed off" so that it is no longer connected to either power or ground, it retains its value for a while through capacitance. This is used to store some short-lived data "on the wire". As an example, here's the read buffer for the 2C02's VBlank flag, which lets its value be read even though reading $2002 immediately clears the VBlank flag:

Vis vblbuf.png

The circled gate is controlled by the /read_2002_output_vblank_flag signal (shortened to ov from here on). While ov is high, the value of vbl_flag (or rather /vbl_flag in this case) is connected to the highlighted wire. When ov goes low, the value on the wire is held.
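The hold behavior can be sketched as below. This is an idealized model (no leakage, so the value never fades), and the class name HeldWire and the exact sequence of events are assumptions made for illustration.

```python
class HeldWire:
    """A wire behind a pass transistor; holds its last value when isolated."""

    def __init__(self, value=0):
        self.value = value

    def update(self, gate, source):
        # Gate high: the wire follows the source node.
        # Gate low: the wire floats and keeps its charge (ideal, no leakage).
        if gate:
            self.value = source
        return self.value


wire = HeldWire()
not_vbl_flag = 0                                  # /vbl_flag low: flag is set
wire.update(gate=1, source=not_vbl_flag)          # ov high: wire tracks /vbl_flag
not_vbl_flag = 1                                  # the read clears the flag
held = wire.update(gate=0, source=not_vbl_flag)   # ov low: old value is held
print(held)
```

The point of the sketch is that the held value no longer follows the source once the gate goes low.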

A '/' denotes 'inverse', meaning the signal is the inverse of another signal with the same name but without the '/'. A '/' can also mean 'active low', meaning the signal is considered "active" when low. Visual 6502 has a slightly different convention – see Node names in Visual 6502.

While a node or wire is isolated from both VCC and ground in the above fashion, it is said to be floating. For bus lines, a floating line is said to be tri-stated, as the floating state can be viewed as a third state in addition to 0 and 1. This third state allows other devices to use the bus without interference.

Using capacitance as storage in the above fashion is an instance of dynamic logic, so called since it has time-dependent behavior beyond just the input clock. Chips that make use of dynamic logic techniques tend to have a minimum clock speed below which they no longer function correctly, as values stored via capacitance degrade over time.

Latches (cross-coupled inverters)

Two cross-coupled inverters make a latch – an element that stores a single bit.

Below is the VBlank flag from Visual 2C02. In the left-most picture the vbl_flag node is highlighted, and in the middle picture its inverse (/vbl_flag) is highlighted. As can be seen by the two gates in gray circles, each node inverts into the other, forming two cross-coupled inverters.

Vis crossreg.png

The gates marked set and clear set and clear the latch, respectively. To clear the latch, vbl_flag is driven low. To set the latch, /vbl_flag is driven low.

This circuit is an example of an SR Latch, where S stands for set and R for reset, corresponding to the set and clear gates above. It is more specifically an SR NOR Latch, as it can be viewed as being built of NOR gates. The corresponding schematic using NOR gates is shown in the right-most picture.
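A minimal sketch of the SR NOR latch, assuming ideal gates: the two NOR gates are re-evaluated until their outputs stop changing. The names nor and settle are hypothetical helpers, and q and nq stand in for vbl_flag and /vbl_flag.

```python
def nor(a, b):
    return 0 if (a or b) else 1


def settle(q, nq, set_gate, clear_gate):
    """Re-evaluate both NOR gates until the outputs stop changing."""
    for _ in range(4):  # a few passes are enough for this two-gate loop
        new_q = nor(clear_gate, nq)      # clear drives q (vbl_flag) low
        new_nq = nor(set_gate, new_q)    # set drives /q (/vbl_flag) low
        if (new_q, new_nq) == (q, nq):
            break
        q, nq = new_q, new_nq
    return q, nq


state = (0, 1)                                            # latch starts cleared
state = settle(*state, set_gate=1, clear_gate=0)          # set
after_set = settle(*state, set_gate=0, clear_gate=0)      # hold
state = settle(*after_set, set_gate=0, clear_gate=1)      # clear
after_clear = settle(*state, set_gate=0, clear_gate=0)    # hold
print(after_set, after_clear)
```

With both inputs low the latch simply keeps whatever state it was last driven into.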

Clocked latches

When a latch can be set directly from the value of some line, e.g. a data bus line, an arrangement involving a clock is often used. The motivation is to avoid having to form both data_line and /data_line and route them to the set and clear terminals of the latch, which would use more logic. The clock is already routed all around the chip, so mixing it in usually isn't as much of a problem.

As an example, here's the noi_lfsrmode node (the Loop noise flag from $400E):

Vis clockedreg.png

While apu_clk1 is high, noi_lfsrmode will flow into the floating node (so called because it will float when both apu_clk1 and w400e are low), which then inverts into noi_/lfsrmode, forming a cross-coupled inverter latch. While apu_clk1 is low, the loop will be broken momentarily, and during this phase a new value can be copied into the latch through the gate controlled by the w400e signal (which goes high on writes to $400E). The value let through by the pass transistor is the db7 node, corresponding to the 8th bit of the data bus. (There's a via between the diffusion and the metal db7 line – easier to see if the node is highlighted.) If the loop were not broken during the write operation, the old value in the latch would interfere with setting a new value.
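The clocked latch can be sketched as a toy model like the one below. The class ClockedLatch is hypothetical, and treating "apu_clk1 and w400e both high" as an error is a modeling choice made here to highlight the conflict described above, not something the simulators do.

```python
class ClockedLatch:
    """Toy model of the noi_lfsrmode latch described above."""

    def __init__(self, value=0):
        self.floating = value            # the floating node

    @property
    def inverted(self):                  # noi_/lfsrmode
        return 1 - self.floating

    @property
    def value(self):                     # noi_lfsrmode
        return 1 - self.inverted

    def step(self, apu_clk1, w400e, db7):
        if apu_clk1 and w400e:
            raise ValueError("loop closed during write: values would conflict")
        if w400e:
            self.floating = db7          # new value enters via the pass transistor
        elif apu_clk1:
            self.floating = self.value   # loop closed: stored value is refreshed
        # Both low: the node floats and keeps its charge.
        return self.value


latch = ClockedLatch(0)
print(latch.step(apu_clk1=1, w400e=0, db7=1))  # loop closed, db7 is ignored
print(latch.step(apu_clk1=0, w400e=1, db7=1))  # write lets db7 through
print(latch.step(apu_clk1=1, w400e=0, db7=0))  # new value recirculates
```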

For another, less cluttered view of the same type of circuit, see this image (substitute "apu_clk1" for "/φ₁" and "w400e" for "φ₁").

(The circuitry in the lower-right corner is a multiplexer, which selects one of two inputs depending on whether noi_lfsrmode or noi_/lfsrmode is high; i.e., depending on whether noi_lfsrmode is 1 or 0. The output of the multiplexer is on the left side.)

DRAM (Dynamic RAM)

Below is an example of a DRAM cell, taken from the internal PPU OAM memory:

Vis dram cell.png

In the left and right pictures the two sides of the cell are highlighted (with a different highlight color on the right due to the node being high). The two nodes are always inverses of each other, with the node highlighted in the left picture corresponding to the value held in the cell (low for 0 and high for 1).

Note that this is not an instance of cross-coupled inverters, as neither node is directly connected to a power source. Rather, DRAM depends on capacitance to hold the value, which will fade unless the capacitor is regularly refreshed (the high side recharged). This is the "dynamic" part of DRAM.

Below is a picture of the upper edge of the PPU OAM DRAM array:

Vis oam.png

The "column" and "row" labels are conventional memory terminology; they confusingly happen to get the opposite orientation in Visual 2C02. "Row" and "column" below will refer to this terminology.

The spr_rowx lines (sometimes called word lines) are used to connect a row of memory cells to the horizontal bit lines (by opening up each cell to a pair of vias); this is called opening that row. For example, spr_row16 opens the highlighted row, while spr_row0 opens the row on its right side. As can be guessed from the node names, the memory layout is not as straightforward as consecutive memory locations being stored in consecutive rows. (Interestingly, we do get consecutive rows if we reverse the bits in the part of the sprite address that selects the row. It is unknown why the row selection bits were not wired to the DRAM in this "correct" configuration instead.)

On the left side of OAM we see that pass transistors on the spr_col1 and spr_col3 lines select the bit lines from the first and second columns of the memory array, respectively (there are other, similar lines next to them). Each such spr_colx line is connected to eight different columns (16 bit lines), corresponding to the eight bits of the byte to be read or written (increasing bit positions are not stored in consecutive columns either). One notable exception to this pattern is that two columns only connect to five sets of bit lines; these columns correspond to the "flags" bytes in OAM, where the middle 3 bits don't actually exist.

DRAM refresh

At the right side in the picture above we see pclk0 running down the edge of OAM, connected to pull-up transistors for each bit line. During pclk0, these are used to precharge the bit lines, after which the pull-up transistors are disabled but the lines remain charged through capacitance. When the selected row is opened after pclk0, it will be exposed to the precharged bit lines, which has the effect of charging up the high side of the cell. On the low side of the cell, the precharge current will simply drain to ground, as the gate on that side will be driven high.

In a typical DRAM circuit, the rows are automatically and periodically refreshed to prevent values from fading. In the PPU, no such logic exists, and rows are only refreshed when explicitly accessed. The reason the PPU (usually) gets away with this is that sprite evaluation will access the entire OAM (provided rendering is enabled), refreshing the rows as a side effect.

SRAM (Static RAM)

SRAM uses cross-coupled inverters for storage and is accessed using a row/column scheme similar to DRAM. Compared to DRAM, SRAM does not need to be refreshed, tends to be faster, uses more die area per memory cell, and (in its NMOS version) draws more power.

Below is a picture of SRAM memory cells used to store the PPU's palette (in this case the rows do go horizontally):

Vid sram.png

Miscellaneous circuitry

Decoders and mask ROMs

  • A decoder is a circuit that maps input values to output values. A decoder that maps m input lines to n output lines is called an m-to-n decoder.
  • A mask ROM is a type of read-only memory constructed by masking off parts of a circuit grid.

The two elements are covered together since their implementation turns out to be similar in this case.

Pictured below are the decoder and mask ROM that act as the lookup table for initialization of the length counters in the APU:

Vis len rom.png

The length is set by writing bits 7-3 of e.g. $4003 (in the case of the first pulse channel), so the inputs to the decoder are bits 7-3 of the data bus. The output from the decoder feeds into the mask ROM, and the output from the mask ROM is the length from the lookup table. The length is used to initialize a counter that counts down to zero before silencing the channel.

The picture below shows a zoomed-in view of the lower part of the decoder and mask ROM:

Vis len pla zoom.png

The spots of yellow diffusion in the decoder and mask ROM are connections to the metal wires, which run horizontally in the decoder and vertically in the mask ROM. By setting the gates connected to the diffusion high, the wires can be driven low.

In the decoder (right part) the input lines and their inverses run vertically (/db7 has been highlighted to show its connection). By looking carefully at the bottom-most horizontal row in the decoder, we see that it is powered on the right side, and that the condition for it to remain high as it passes into the mask ROM is /db7 AND /db6 AND /db5 AND /db4 AND /db3. Another way to put this condition is db7-db3 = $00.

Similarly, the condition for the second row from the bottom to be high is /db7 AND /db6 AND /db5 AND db4 AND /db3, which translates to db7-db3 = $02. The conditions for the third and fourth rows from the bottom are db7-db3 = $04 and db7-db3 = $06, respectively.

The decoder is set up so that dbx and /dbx will never both drive the same horizontal line low (which would make it impossible for that line to ever be high), and in this case each row has a unique bit pattern that activates it. (It would also be possible to insert a "don't care" condition in the decoder by having neither dbx nor /dbx drive the line low.)

The decoder here is a 5-to-32 decoder, with 32 rows corresponding to the 32 possible bit patterns made with five bits. This type of decoder is said to fully decode its inputs, and is an instance of an n-to-2^n decoder.
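A full decoder of this kind can be sketched as below, assuming each row is a powered line that is pulled low by any high gate connected to it. The function decoder_rows is a hypothetical helper, and the placement of gates on dbx versus /dbx is inferred from the conditions given above rather than read off the die.

```python
def decoder_rows(value, n_bits):
    """Return the 2**n_bits row outputs of a NOR-style full decoder."""
    rows = []
    for row in range(2 ** n_bits):
        pulled_low = False
        for bit in range(n_bits):
            line = (value >> bit) & 1      # dbx
            inverse = 1 - line             # /dbx
            # A row that expects a 1 in this position has its gate on /dbx;
            # a row that expects a 0 has its gate on dbx.
            gate = inverse if (row >> bit) & 1 else line
            if gate:
                pulled_low = True
        rows.append(0 if pulled_low else 1)
    return rows


rows = decoder_rows(0b00010, 5)   # db7-db3 = $02
print(len(rows), rows.index(1))
```

Exactly one row stays high for any input, which is what "fully decoded" means here.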

In the mask ROM (this one in particular being a NOR ROM), we see that each horizontal line from the decoder when high will cause a particular pattern to appear on the lenx outputs. Reading off the bottom row, this pattern is len7-0 = 00001001b = 9. Reading off the remaining rows from bottom to top, we get the values 00010011b = 19, 00100111b = 39, and 01001111b = 79.

Putting together the above, we have the following incomplete map from inputs to outputs:

Index Length
$00 9
$02 19
$04 39
$06 79

By checking against the APU length counter table, we see that these indeed are the length values corresponding to those indices (minus one, due to details of how the length counter works).
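The comparison can be done mechanically, as in the sketch below. The bit patterns are the ones read off the mask ROM above; the LENGTH_TABLE entries are the corresponding values from the APU length counter table, entered here by hand (treat them as an assumption if you have not checked the table yourself).

```python
# Rows read off the mask ROM, keyed by the index (db7-db3).
ROM_ROWS = {
    0x00: "00001001",
    0x02: "00010011",
    0x04: "00100111",
    0x06: "01001111",
}

# Matching entries from the APU length counter table (entered by hand).
LENGTH_TABLE = {0x00: 10, 0x02: 20, 0x04: 40, 0x06: 80}


def rom_output(index):
    """Convert the len7-0 bit pattern for a row into a number."""
    return int(ROM_ROWS[index], 2)


for index in sorted(ROM_ROWS):
    stored = rom_output(index)
    print(f"${index:02X}: ROM holds {stored}, table says {LENGTH_TABLE[index]}")
```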

To give an example of a decoder that does not feed into a mask ROM, the picture below shows the internal 2A03 address decoder for the address range $4000-$4017, where signals such as r4017 (read 4017) and w4004 (write 4004) are generated.

Vis addr pla.png

The theory behind the decoder and mask ROM seen here is closely related to that of PLAs (Programmable Logic Arrays), where we could view the decoder as the AND plane and the mask ROM as the OR plane (both implemented with NOR gates). This introduction to PLAs is helpful.

Adders

Pictured below is part of the adder used by the sweep units in the 2A03 to calculate the target period for sweep period updates to the second square channel (the first square channel is identical except for a small quirk related to subtraction; see below). The pictured part calculates the second bit (bit 1) of the sum, along with the carry for that bit position.

Vis adder.png

The adder is split into two parts. The left-most part (having four columns) calculates bit 1 of the sum. The right-most part (with three columns) calculates the carry. Both /sum1 out and /carry out are powered, and can be forced low by certain combinations of the input signals being high. (For the left-most column, for example, this combination is addend1 AND carry in AND sq1_p1.) The essential information is captured in the following truth table:

sq1_p1  addend1  carry in  /sum1 out  /carry out
  0        0        0          1          1
  0        0        1          0          1
  0        1        0          0          1
  0        1        1          1          0
  1        0        0          0          1
  1        0        1          1          0
  1        1        0          1          0
  1        1        1          0          0

As expected, this corresponds to an addition operation (with the sum and carry inverted).
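To double-check the table, it can be compared against an ordinary full adder with both outputs inverted. This is only a sketch; full_adder and matches_table are names invented for the example.

```python
from itertools import product

# (sq1_p1, addend1, carry in) -> (/sum1 out, /carry out), copied from the table.
TABLE = {
    (0, 0, 0): (1, 1),
    (0, 0, 1): (0, 1),
    (0, 1, 0): (0, 1),
    (0, 1, 1): (1, 0),
    (1, 0, 0): (0, 1),
    (1, 0, 1): (1, 0),
    (1, 1, 0): (1, 0),
    (1, 1, 1): (0, 0),
}


def full_adder(a, b, carry_in):
    total = a + b + carry_in
    return total & 1, total >> 1       # (sum, carry out)


def matches_table():
    for a, b, c in product((0, 1), repeat=3):
        s, carry_out = full_adder(a, b, c)
        if TABLE[(a, b, c)] != (1 - s, 1 - carry_out):
            return False
    return True


print(matches_table())
```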

The same logic is used to perform subtraction, by inverting each bit of the addend (using separate logic) and setting the carry in for the zeroth bit to 1. This corresponds to the usual invert-bits-and-add-one operation for negating a number in two's complement.

For unknown reasons, the inverted carry input for the zeroth bit of the first square channel is connected to VCC instead of the inverted sweep direction flag (as it is in the other square channel), making the carry input unconditionally zero. On that channel the adder therefore adds the inverted bits without the extra one, so the value plus one is subtracted instead.
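The arithmetic can be sketched as follows. Adding the inverted bits with a carry in of 1 gives period - amount, while a carry in of 0 gives period - amount - 1. The function name sweep_subtract and the 11-bit width (chosen to match the APU's period registers) are assumptions for the example.

```python
def sweep_subtract(period, amount, carry_in, width=11):
    """Subtract by adding the inverted addend plus a carry in."""
    mask = (1 << width) - 1
    inverted = ~amount & mask          # invert each bit of the addend
    return (period + inverted + carry_in) & mask


# Carry in of 1: ordinary two's complement subtraction.
print(sweep_subtract(100, 20, carry_in=1))
# Carry in of 0 (the first square channel): the result is one lower.
print(sweep_subtract(100, 20, carry_in=0))
```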

Barrel shifters

The circuitry below forms part of a barrel shifter, which in this case shifts the inputs to the adders for sweep unit period updates.

Vis barrel shifter.png

As a side note, the bit inversion for subtraction by the sweep units happens before the bits enter the barrel shifter.

Shift registers

(This section might be considered "advanced" on a first reading. I just wanted an example that made more complex use of clocks.)

The picture below shows the 16-bit shift register that holds the high bits for background tiles (see the PPU rendering page). The upper eight bits can be reloaded from PPU VRAM data bus lines, and the output is taken from the lower eight bits (in this case, the particular bit to use is selected by the fine x scroll). Bits flow clockwise through the shift register.

Vis shift reg.png

Below is a zoomed-in view of three bits (tile_h15-13) from the upper-left part of the shift register:

Vis shift reg zoom.png

The value of each bit corresponds to the value on the (2) side.

Control signals

The following signals control the shifting and reloading of the register (the names used were invented for the article and are not standard terminology):

  • The Invert signal corresponds to pclk0, which is high during the initial half-cycle of a PPU cycle (see the Clocks section).
  • The Shift signal corresponds to pclk1, which is high during the second half-cycle.
  • The Parallel load signal controls pass transistors connected to _db0-7, used to load the upper eight bits of the shift register.

Shift does not always exactly mirror pclk1, as explained below, which is why it is given a separate name.

Shifting

Shifting the register is a two-step process:

  1. During pclk0, Invert is driven high, making the value of (1) flow through the pass transistor into the red-highlighted node, which causes (1) to invert into (2).
  2. During pclk1, Shift is driven high, which causes the node marked (2) to invert into the node marked (3) (the next bit of the shift register). Invert is low during this phase, and the value on the red-highlighted node is held via wire capacitance, which makes this a dynamic shift register.

Due to the bit of powered diffusion circled in red, the default value shifted into (1) is 1. However, as the value is held on the inverted side (2), this means that zeroes are being shifted in.

Parallel load

To perform a parallel load of the register, step (2) from above is modified so that Shift remains low during pclk1 and Parallel load goes high instead, causing the new value for each cell to come from the data bus lines instead of from the previous cell.

The diagram below might clarify how the control signals are related. Each row is a PPU half-cycle.

pclk0  Invert  Shift  Parallel load
  1      1       0         0
  0      0       1         0
  1      1       0         0
  0      0       1         0
  1      1       0         0
  0      0       0         1         ← Reloaded here
  1      1       0         0
  0      0       1         0
  1      1       0         0
  0      0       1         0
  1      1       0         0
  0      0       1         0
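The relationship between the control signals can also be written as a small generator. This sketch only reproduces the diagram above; control_signals is a hypothetical helper, and which cycles trigger a reload is passed in by the caller rather than derived from PPU state.

```python
def control_signals(n_cycles, reload_cycles):
    """Return one (pclk0, Invert, Shift, Parallel load) row per half-cycle."""
    rows = []
    for cycle in range(n_cycles):
        # First half-cycle: pclk0 is high, so Invert is high.
        rows.append((1, 1, 0, 0))
        # Second half-cycle: either shift or reload, never both.
        if cycle in reload_cycles:
            rows.append((0, 0, 0, 1))
        else:
            rows.append((0, 0, 1, 0))
    return rows


for row in control_signals(n_cycles=6, reload_cycles={2}):
    print(*row)
```

With six cycles and a reload in the third one, the output has the same twelve rows as the diagram.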

Digital-to-analog conversion (DAC)

The Visual 2A03 circuitry below controls the volume on the output pin for the two square channels (the triangle, noise, and DMC channels use a separate pin). Note that each successive bit has twice the weight of the preceding one in terms of the amount of powered diffusion connected to it.

Vis da conversion.png

This is an example of a binary-weighted DAC. A different type of DAC is used for the video output from the PPU (found in the upper-left of Visual 2C02, rotated 90 degrees here):

Vis vid dac.png

The upper-left end is actually connected to VCC, and the lower-right to ground. This is a voltage ladder, and works by tapping the wire (which behaves as a resistor) at different points along the run to get different voltages. As the simulator is purely digital, this circuit is not directly used in the simulation, and some parts that would otherwise interfere with it have been disconnected.

Output drivers

These are found on pins capable of doing output, which need to be able to source (generate) and sink large currents to drive the line high or low. The polysilicon wire that would cause the pin to source current is highlighted below.

Vis output driver.png

Large clusters of pull-up and pull-down transistors like these are sometimes called superbuffers. They also appear in some internal circuits that need to source or sink larger currents, e.g. due to having a large fan-out – a large number of connections from the logic gate's output to inputs of other gates.

On lines that are capable of being tri-stated, this is done by activating neither the pull-up nor the pull-down transistors, so that the pin neither sources nor sinks current. This is also done for reads on bidirectional lines, to prevent the output driver from interfering.

Cut-off connections

Some parts of the chips, especially outside the 6502 core, were designed using a copy-and-paste process called "standard cell", leading to some seemingly nonsensical and cut-off connections. These carry no special significance. The image below contains an example.

The 6502 core inside the 2A03 is a substantially tighter block of NMOS (having been designed by hand), but it still has a few cut-off connections remaining from removal of the original output drivers.

Vis cutoff.png

Layers

(This information is not essential to reading the diagrams.)

The layers that make up the chip are as follows, in order from bottom to top: substrate, diffusion, oxide (with holes for buried contacts and vias), polysilicon, more oxide (with holes for vias), metal, and passivation (or "overglass", containing holes where bond wires connect).

The way diffusion is powered or grounded is through vias to large areas of metal that are either grounded or powered.

Transistor dimensions

Visual 6502, Visual 2A03, and Visual 2C02 are purely digital simulators, so the effects of transistor dimensions don't matter there. But you will often notice locations in the simulators where transistors have different shapes.

Here's the inverter from the beginning of this tutorial, now annotated with dimensions:

Vis inverter dimensions.png

Because the layer of substrate is of uniform thickness, everything is calculable in terms of sheet resistance. In the above annotated picture, two transistors are shown: one significantly wider than long, and the other the opposite. The aspect ratio (length divided by width) of the depletion mode pull-up transistor on the left is approximately 3.47, while the aspect ratio of the enhancement mode pull-down transistor on the right is 0.23. As a result, the pull-down transistor is approximately 15 times more effective at sinking current (which is good: it has to be able to override the pull-up).
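The "approximately 15 times" figure follows directly from the two aspect ratios, as the short calculation below shows. It treats both transistors as plain resistors proportional to length divided by width, which is a simplification (it ignores the other differences between depletion mode and enhancement mode transistors).

```python
pull_up_aspect = 3.47     # length / width, depletion mode pull-up
pull_down_aspect = 0.23   # length / width, enhancement mode pull-down

# Resistance scales with length / width, so the ratio of the two aspect
# ratios says how much more easily the pull-down conducts.
strength_ratio = pull_up_aspect / pull_down_aspect
print(round(strength_ratio, 1))
```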

The 2A03 uses these analog effects in its audio path:

Vis 2a03 pcmout012.png

Shown are the three least significant bits of the PCM channel in the 2A03's APU. pcmout1 drives a single transistor (with some resistance R) and pcmout2 drives two (resulting in a resistance R÷2). To give pcmout0 a resistance of 2·R, the designers would have had to make the transistor either half as wide or twice as long. Halving the width wasn't an option because the diffusion areas are already as narrow as possible using this manufacturing technology. As a result, the gate for the least significant bit is longer.
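The resulting binary weighting can be sketched with exact fractions. The unit lengths and widths below are illustrative stand-ins (only their ratios matter), and relative_resistance is a made-up helper that treats each transistor as a resistor proportional to length divided by width.

```python
from fractions import Fraction


def relative_resistance(length, width, count=1):
    """Resistance in units of R for `count` identical transistors in parallel."""
    return Fraction(length, width) / count


pcmout = {
    "pcmout0": relative_resistance(length=2, width=1),           # longer gate
    "pcmout1": relative_resistance(length=1, width=1),           # one transistor
    "pcmout2": relative_resistance(length=1, width=1, count=2),  # two in parallel
}

# Current contribution is proportional to 1 / resistance.
weights = {name: 1 / r for name, r in pcmout.items()}
for name in sorted(weights):
    print(name, "resistance", pcmout[name], "weight", weights[name])
```

Each bit ends up with twice the weight of the one below it, which is what a binary-weighted output needs.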

Clocks

This section lists node names for various clocks that sequence operations within the chips. Some of the 6502 pin signals might have gained a "c_" prefix in Visual 2A03 compared to Visual 6502.

6502 core pins

clk0
The φ0 clock input pin. Goes low at the beginning of a CPU cycle.
clk1out, clk2out
The φ1 and φ2 output pins. φ2 is used to form M2 in the 2A03, which has a modified duty cycle.

6502 internal clock signals

cp1
High during the first phase (half-cycle) of a CPU cycle. The inverse of clk0.
cclk
High during the second phase of a CPU cycle. Roughly equivalent to clk0, but modified slightly to never overlap with cp1 (though that won't be visible in the simulators).

APU clock signals

apu_clk1
This clock signal has a 25% duty cycle. It ticks at half the rate of the CPU clock, and is high only when φ2 is low.
apu_/clk2
Like apu_clk1, but ticks on the opposite phase, and is also inverted so that it has a 75% duty cycle.
apu_clk2x, where x is a, b, c, etc.
Inverses of apu_/clk2, used internally in various components.

This clock arrangement helps to ensure that timed events (various counters being decremented or reloaded) do not conflict with writes from the CPU (which only happen when φ2 is high).

φ1        | 1 0 1 0 1 0 1 0
φ2        | 0 1 0 1 0 1 0 1
apu_clk1  | 1 0 0 0 1 0 0 0
apu_/clk2 | 1 1 0 1 1 1 0 1
apu_clk2x | 0 0 1 0 0 0 1 0
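The rows above can be regenerated from a half-cycle counter, which makes the phase relationships explicit. This sketch just reproduces the diagram; apu_clocks is a hypothetical helper, and starting the count at a φ1 half-cycle where apu_clk1 is high is an arbitrary choice that matches the first column.

```python
def apu_clocks(half_cycles):
    """Return the clock rows as lists, one entry per CPU half-cycle."""
    rows = {"φ1": [], "φ2": [], "apu_clk1": [], "apu_/clk2": [], "apu_clk2x": []}
    for i in range(half_cycles):
        phi1 = 1 if i % 2 == 0 else 0
        not_clk2 = 0 if i % 4 == 2 else 1     # low once every two CPU cycles
        rows["φ1"].append(phi1)
        rows["φ2"].append(1 - phi1)
        rows["apu_clk1"].append(1 if i % 4 == 0 else 0)
        rows["apu_/clk2"].append(not_clk2)
        rows["apu_clk2x"].append(1 - not_clk2)
    return rows


for name, values in apu_clocks(8).items():
    print(f"{name:<9} | " + " ".join(map(str, values)))
```

Note that neither apu_clk1 nor apu_clk2x is ever high while φ2 is high, which is the property the arrangement is meant to guarantee.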

PPU clock signals

clk0
The input clock, fed from the master clock. Used directly in video waveform generation.
_clk0
The inverse of clk0.
pclk0
The pixel clock. Derived from clk0 by dividing by four (NTSC) or five (PAL). One cycle corresponds to a rendered dot, with pclk0 being high during the first phase (half-cycle).
pclk1
The inverse of pclk0. High during the second phase of a pixel clock.

Master clock and CPU/PPU clock alignment

The clock divider in the PPU is clocked on a zero-to-one transition of the master clock, while the clock divider in the CPU is clocked on a one-to-zero transition. Diagrammatically, this might look like below for NTSC (for CPU clock, "!" denotes when the 2A03's M2 line goes high but the 6502's internal clock is still low).

Master clock    | 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 ...
PPU pixel clock | 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 ...
CPU clock       | 0 0 0 0 0 0 0 0 0 ! ! ! 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 ! ! ! 1 ...
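The diagram can be regenerated with a few index formulas, which shows the divide-by-four pixel clock and the twelve-master-cycle CPU period side by side. The offsets below are simply read off the diagram above, not derived from the divider circuits, and alignment_rows is a hypothetical helper.

```python
def alignment_rows(n):
    """Return the three diagram rows for n master clock half-cycles."""
    master, ppu, cpu = [], [], []
    for i in range(n):
        master.append(str(i % 2))
        # Pixel clock: toggles on every second rising edge of the master clock.
        ppu.append(str(((i + 1) // 4) % 2))
        # CPU clock: 24 half-cycles per period; "!" marks M2 high, core clock low.
        phase = i % 24
        cpu.append("0" if phase < 9 else "!" if phase < 12 else "1")
    return " ".join(master), " ".join(ppu), " ".join(cpu)


master_row, ppu_row, cpu_row = alignment_rows(37)
print("Master clock    |", master_row)
print("PPU pixel clock |", ppu_row)
print("CPU clock       |", cpu_row)
```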

6502 cycle and phase timing

During each active cycle (i.e., while the RDY line of the 6502 has not been pulled low), the CPU either reads from or writes to memory; there are no "idle" cycles w.r.t. the data bus in the 6502. Each such read or write cycle is split up into two equally long phases, called φ1 (phase 1) and φ2 (phase 2), corresponding to the clock signals above. φ1 takes place while the clock input is low, φ2 while it is high.

During each cycle, the R/W signal and the address bus lines are updated during φ1. In the simulators we see them change right away, but in a real 6502 there will be some delay. At the end of φ2, values are read from or written to the data bus lines. The IRQ and NMI interrupt lines appear to be sampled on the falling edge of φ2 (as indicated in this thread and also from observed behavior relating to the VBlank flag).

This document lists data and address bus contents during each cycle of an instruction. For more detailed timing information, see the MOS hardware manual.

Terms

Below are various terms you might run into:

Bond wire
A wire that connects an internal pad to an external pin on the chip package; see e.g. this
Buried contact
A connection between diffusion and polysilicon.
NMOS
The technology used for the transistors in the 2A03 and 2C02. In NMOS, transistors are made by creating regions of n-doped semiconductor that become the source and drain ("n-doped" because this doping adds free electrons, which carry negative charge). This type of transistor is good at sinking current to ground (this is what causes a 0 bit to usually "win" in bus conflicts), and worse at pulling up. The transistors used in NMOS are more precisely called n(-type )MOSFETs. NMOS transistors are "active" when their gate is connected to a high voltage (i.e. VCC). See this YouTube video for another high-level overview of how transistors work.
PMOS
The counterpart to NMOS, PMOS is an older and slower (but initially easier to manufacture) technology used for making integrated circuits. In PMOS, transistors are made by creating regions of p-doped semiconductor that become the source and drain ("p-doped" because this doping adds electron holes, which behave as positive charge carriers). This type of transistor is good at sourcing current from VCC, and worse at pulling down. The transistors used in PMOS are more precisely called p(-type )MOSFETs. PMOS transistors are "active" when their gate is connected to a low voltage (i.e. GND).
CMOS
The combination of NMOS and PMOS within a single design. CMOS chips make use of both n-type and p-type MOSFETs in order to form logic gates. PMOS requires weak pull-down resistors (often permanently enabled transistors connected to GND), and NMOS requires weak pull-up resistors (originally permanently enabled transistors connected to VCC, but later replaced with depletion loads). CMOS instead makes use of p-type MOSFETs as strong pull-ups and n-type MOSFETs as strong pull-downs, arranged in pairs such that exactly one transistor in each pair is active at any given moment. This is made even simpler by the fact that a single input signal can connect directly to both transistors: when it is high it activates the NMOS pull-down, and when it is low it activates the PMOS pull-up.
Open drain
A type of output that works by sinking current from an external pull-up resistor instead of generating current on its own. An example is the PPU's INT pin. The pull-up resistor is denoted "RM1" in this Famicom wiring diagram.
Pull-up resistor
A resistor connected to power. "Pull-up" comes from pulling the wire to a high state.
Pull-up transistor
A transistor whose gate when high causes current to flow from a power source.
Pull-down transistor
The analogue of a pull-up transistor for sinking to ground.
Via
A connection between polysilicon/diffusion and metal.

Tips for working with the simulators

Node names in Visual 6502

A hash (#) or tilde (~) on a node name signifies active low or negation in Visual 6502. Due to problems passing hashes in URLs, aliases were automatically introduced that use tildes instead (hence the "automatic alias replacing hash with tilde" comments).

Clearing highlighting

When the simulator is loaded and after it has been run with "animate during simulation" enabled, nodes that are high will be highlighted. To get rid of this highlighting, click the "clear highlighting" button.

Local copies of the simulator

Being able to add node names to nodenames.js can be very helpful when figuring out a circuit. To do this, a local version of the simulator can be downloaded with e.g. $ wget --convert-links on a *nix system. Please watch the recursion level and avoid downloading data needlessly, as at least Visual 2C02 and Visual 2A03 are hosted on a limited uplink.

Extra node names

Many additional node names for Visual 2C02 can be found in this repository. The repo is maintained separately since it is updated often and not all nodes have been confirmed.

PPU chip layout overview

A high-level overview of the layout of the PPU can be found here. Another, lower-level analysis can be found here.