8-bit Multiply: Difference between revisions
Rainwarrior (talk | contribs) (→frantik: it seems more responsible just to fix the infinite loop case than add a warning about it) |
Rainwarrior (talk | contribs) m (→frantik: :) |
||
Line 61: | Line 61: | ||
sta temp | sta temp | ||
lda value2 | lda value2 | ||
bne +end | bne +end" | ||
jmp +start: | jmp +start: | ||
Revision as of 10:46, 12 October 2017
Since the 6502 CPU has no multiplication instruction, this feature has to be written in software.
Bregalad
This routine by Bregalad does binary long multiplication (a.k.a. "the Russian Peasant method").
;8-bit multiply ;by Bregalad ;Enter with A,Y, numbers to multiply ;Output with YA = 16-bit result (X is unchanged) Multiply: sty Factor ;Store input factor ldy #$00 sty Res sty Res2 ;Clear result ldy #$08 ;Number of shifts needed - lsr A ;Shift right input number bcc + ;Check if bit is set pha lda Res2 clc adc Factor sta Res2 ;If so add number to result pla + lsr Res2 ;Shift result right ror Res dey bne - lda Res ldy Res2 rts
An optimization for efficiency is made here; binary long multiplication requires adding one multiplicand to the result at various bit-shifts (i.e. multiply by each power of 2). The naive approach might maintain the value to add as a 16-bit value, left shifting it once each iteration to reach the next power of 2. This one, however, takes advantage of the input being only 8-bits wide, and instead pre-multiplies the result by 256 (8 bits), and each iteration instead right-shifts the result. After 8 iterations the pre-multiply is undone, and the advantage gained is that only the shift is 16-bit; adding the multiplicand remains an efficient 8-bit add.
Assuming no page crossings and zero page, this routine takes 184-320 cycles, not counting the JSR to call it. (Each non-zero bit in A adds 17 cycles.)
frantik
This routine by frantik is another binary long multiplication.
; Multiply two bytes in memory using Russian peasant algorithm ; by frantik ; Accepts: value1 and value2, labels for bytes in memory ; value2 should ideally be the lesser of the two input values ; for increased efficiency ; Uses: $00, $01, $02 for temporary variables ; Returns: 16 bit value in $00 and $01 .macro multiply value1, value2 ret = $00 ; return value temp = $02 ; temp storage lda #$00 ; clear temporary variables sta ret sta ret+1 sta temp lda value2 bne +end" jmp +start: -loop: asl value1 ; double first value rol temp ; using 16bit precision lsr value2 ; halve second vale +start: lda value2 ; and #01 ; is new 2nd value an odd number? beq -loop: ; clc ; if so, add new 1st value to running total lda ret ; adc value1 ; sta ret ; lda ret+1 ; adc temp ; sta ret+1 ; lda value2 ; cmp #01 ; is 2nd value 1? if so, we're done bne -loop: ; otherwise, loop +end: .endm
Assuming no page crossings and zero page, this routine takes 17-403 cycles, though it is an in-place macro generating 48 bytes of new code each time it is used.
External Links
- Forum post: Fast multi, ... - Faster multiplication routine using lookup tables, by keldon.