Random number generator/Linear feedback shift register (advanced)
Further commentary on the linear feedback shift register example at random number generator.
Basic version
prng: ldx #8 ; iteration count lda seed+0 : asl ; shift the register rol seed+1 bcc :+ eor #$2D ; apply XOR feedback whenever a 1 bit is shifted out : dex bne :-- sta seed+0 cmp #0 ; reload flags rts
Sacrifice entropy for speed
The iteration count stored in X can be reduced to speed up the generator, at the expense of quality of randomness. Each iteration effectively generators one more bit of entropy, so 8 iterations are needed for an 8-bit random number. If you intend to use fewer bits of the result (e.g. use AND to mask the result), or if you are satisfied with less randomness, you can reduce X, or even parameterize it:
; X as a parameter specifies number of random bits to generate (1 to 8) prng: lda seed+0 @bitloop: asl rol seed+1 bcc :+ eor #$2D : dex bne @bitloop sta seed+0 cmp #0 rts
Alternatively this loop could be unrolled with 8 entry points, saving the need to use X or load it as a parameter.
24 and 32 bit LFSR
By adding an extra byte or two to the seed variable, and choosing an appropriate polynomial to XOR with, we can extend the sequence length significantly with only one additonal ROL per byte (+40 cycles).
This 24-bit version has a sequence length of 16777215:
.zeropage seed: .res 3 ; 24-bit .code prng: ldx #8 lda seed+0 : asl rol seed+1 rol seed+2 bcc :+ eor #$1B : dex bne :-- sta seed+0 cmp #0 rts
This 32-bit version has a sequence length of 4294967295:
.zeropage seed: .res 4 ; 32-bit .code prng: ldx #8 lda seed+0 : asl rol seed+1 rol seed+2 rol seed+3 bcc :+ eor #$C5 : dex bne :-- sta seed+0 cmp #0 rts
Even longer sequences are possible, but it's not likely to be practical, as it would already take approximately 7 days for an NTSC NES to complete this 32 bit LFSR cycle.
A note about the chosen polynomials, $2D, and $1B are the lowest 1-byte polynomials that give a maximal sequence for 16 and 24-bit generators. $AF is the lowest for a 32-bit generator, but $C5 was preferred because it has less crowded taps (i.e. "better" behaviour if you're lowering the iteration count). There are other maximal sequence polynomials that can be used for all of these, the choice is somewhat arbitrary.
Additional Resource
http://users.ece.cmu.edu/~koopman/lfsr/index.html - a source for maximal polynomials for LFSRs of many lengths.