
6502 Killer hacks


141 replies to this topic

#1 djmips OFFLINE  

djmips

    Dragonstomper

  • 591 posts
  • scrolling
  • Location:Seattle

Posted Sun Jun 12, 2005 2:20 AM

As I was growing up, I kept a notebook full of cool code snippets and ideas. The notebook had been misplaced, but I ran across it recently. Here is one of the pages, taken from a 1987 Dr. Dobb's Journal article by Mark S. Ackerman: "6502 Killer Hacks".

Post your own 6502 Killer Hacks and share them with the rest of us!

I also looked up Mark S. Ackerman with our trusty tool Google and found his 'vita'.

I'm pretty sure it's the same guy, as he worked at GCC from 1982 to 1984 and was the lead on Ms. Pac-Man, Galaxian and Moon Patrol. Time to update the AtariAge database, as these games are empty when it comes to staff.

He has a patent on the Galaxian kernel.

Well here is the killer hack. This one is to scrimp on RAM.

Incrementing only the lower 4 bits of a byte (with wrap)

...
    lda word      ; original byte
    and #$0f      ; retrieve lower nybble
    tay            ; index
    lda word
    clc            ; might not be needed
    adc nextinc,y  ; could be ora or sbc
    sta word
...

nextinc .byte 1,2,3,4,5,6,7,8
         .byte 9,10,11,12,13,14,15,0


Well, funny thing is - maybe I didn't transcribe it properly back in '87 - because it doesn't seem like it would work.

Seems like it needs an AND #$F0 after the second LDA word


So I thought I'd take a shot at a working version...

...
    lda word      ; original byte
    and #$0f      ; retrieve lower nybble
    tay            ; index
    lda word
    clc
    adc nextinc,y
    sta word
...

nextinc .byte 1,1,1,1,1,1,1,1
         .byte 1,1,1,1,1,1,1,-15


who knows if that one works either. :-)
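For anyone who wants to check the tables without a 6502 at hand, here is a minimal Python sketch (not 6502 code) that mirrors the arithmetic of the listings above for all 256 byte values. The function names are made up for this check, and it assumes the carry is clear before the ADC and that results wrap at 8 bits.

```python
# Python model of the table-driven routines above (8-bit arithmetic only).

NEXT_VALUE = list(range(1, 16)) + [0]    # notebook table: 1..15, then 0
NEXT_DELTA = [1] * 15 + [-15 & 0xFF]     # reworked table: +1, or -15 ($F1) on wrap


def step_as_transcribed(word):
    """lda word / and #$0f / tay / lda word / clc / adc nextinc,y"""
    return (word + NEXT_VALUE[word & 0x0F]) & 0xFF


def step_with_and_f0(word):
    """Same routine with the suggested AND #$F0 after the second LDA."""
    return ((word & 0xF0) + NEXT_VALUE[word & 0x0F]) & 0xFF


def step_delta(word):
    """Reworked routine: add 1, or add -15 when the low nybble is 15."""
    return (word + NEXT_DELTA[word & 0x0F]) & 0xFF


def expected(word):
    """Increment the low nybble with wrap, leave the high nybble alone."""
    return (word & 0xF0) | ((word + 1) & 0x0F)


as_transcribed_ok = all(step_as_transcribed(w) == expected(w) for w in range(256))
with_and_ok = all(step_with_and_f0(w) == expected(w) for w in range(256))
delta_ok = all(step_delta(w) == expected(w) for w in range(256))
print(as_transcribed_ok, with_and_ok, delta_ok)  # False True True
```

Under these assumptions the transcribed listing fails, while both the AND #$F0 fix and the 1,...,1,-15 table behave as intended.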

If someone has the original article from the February 1987 Dr. Dobb's Journal, I'd be curious to see the code.

Also, post your own 6502 Killer Hacks and share them with the rest of us!

- David

Edited by djmips, Sun Jun 12, 2005 2:22 AM.


#2 Cybergoth OFFLINE  

Cybergoth

    Quadrunner

  • 8,207 posts
  • This is Sparta!
  • Location:Bavaria

Posted Sun Jun 12, 2005 3:26 AM

Hi there!

djmips, on Sun Jun 12, 2005 8:20 AM, said:

...
    lda word      ; original byte
    and #$0f      ; retrieve lower nybble
    tay            ; index
    lda word
    clc
    adc nextinc,y
    sta word
...

nextinc .byte 1,1,1,1,1,1,1,1
         .byte 1,1,1,1,1,1,1,-15



Yours should work a lot better than the other version. So you're going for speed here? A version without table would certainly waste less ROM space:

    LAX word
    INX
    AND #$F0
    STA temp
    TXA
    AND #$0F
    ORA temp
    STA word

As many cycles, but 14 bytes saved...
(Also this can count n bits, it's not fixed to 4)

Greetings,
Manuel

#3 djmips OFFLINE  

Posted Sun Jun 12, 2005 11:09 AM

Cybergoth, on Sun Jun 12, 2005 2:26 AM, said:

Hi there!

djmips, on Sun Jun 12, 2005 8:20 AM, said:

...
    lda word     ; original byte
    and #$0f     ; retrieve lower nybble
    tay           ; index
    lda word
    clc
    adc nextinc,y
    sta word
...

nextinc .byte 1,1,1,1,1,1,1,1
         .byte 1,1,1,1,1,1,1,-15



Yours should work a lot better than the other version. So you're going for speed here? A version without table would certainly waste less ROM space:

    LAX word
    INX
    AND #$F0
    STA temp
    TXA
    AND #$0F
    ORA temp
    STA word

As many cycles, but 14 bytes saved...
(Also this can count n bits, it's not fixed to 4)

Greetings,
Manuel



I like that version, nice use of LAX, but you gotta be brave and not use the temp ;)

    LAX word
    INX
    AND #$F0
    STA word
    TXA
    AND #$0F
    ORA word
    STA word
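A rough Python model of the two LAX listings (with and without the temp byte) can confirm that they agree. This is only a sketch: the function names and the `mask` parameter are hypothetical, added to illustrate the "can count n bits" remark, and LAX is modelled simply as loading A and X with the same value.

```python
# Python model of the LAX-based nybble increment, with and without temp.

def lax_with_temp(word, mask=0x0F):
    a = x = word                 # LAX word
    x = (x + 1) & 0xFF           # INX
    temp = a & ~mask & 0xFF      # AND #$F0 / STA temp
    a = x & mask                 # TXA / AND #$0F
    return a | temp              # ORA temp / STA word


def lax_in_place(word, mask=0x0F):
    a = x = word                 # LAX word
    x = (x + 1) & 0xFF           # INX
    word = a & ~mask & 0xFF      # AND #$F0 / STA word (word is its own temp)
    a = x & mask                 # TXA / AND #$0F
    return a | word              # ORA word / STA word


def expected(word, mask=0x0F):
    """Increment only the masked low bits, with wrap."""
    return (word & ~mask & 0xFF) | ((word + 1) & mask)


all_match = all(
    lax_with_temp(w, m) == lax_in_place(w, m) == expected(w, m)
    for m in (0x0F, 0x07, 0x03)   # 4-bit, 3-bit and 2-bit counters
    for w in range(256)
)
print(all_match)  # True
```

Dropping the temp works in this model because `word` is free to hold the masked high bits once A and X both carry the original value.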


#4 Cybergoth OFFLINE  

Posted Sun Jun 12, 2005 12:26 PM

Hi there!

djmips, on Sun Jun 12, 2005 5:09 PM, said:

I like that version, nice use of LAX, but you gotta be brave and not use the temp  ;)



Uihjah... so my kung fu is ok, just needs some work on the finishing move... :lolblue:

Greetings,
Manuel

#5 Alex H OFFLINE  

Alex H

    Chopper Commander

  • 120 posts
  • Mangooooo!
  • Location:UK

Posted Sun Jun 12, 2005 5:08 PM

djmips, on Sun Jun 12, 2005 9:20 AM, said:

Post your own 6502 Killer Hacks and share them with the rest of us!

Here's one I like:

; unsigned divide by 3

    sta    temp
    lsr
    lsr
    clc
    adc    temp
    ror
    lsr
    clc
    adc    temp
    ror
    lsr
    clc
    adc    temp
    ror
    lsr
    clc
    adc    temp
    ror
    lsr
:)
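A quick off-target check of this routine, as a Python sketch under one modelling assumption: each `clc / adc temp / ror / lsr` group is treated as a 9-bit sum shifted right twice, since ROR pulls the carry back into bit 7. In this model the routine is close to, but not always equal to, floor(n/3): exact multiples of 3 come out one too low (3 gives 0, 255 gives 84), and every other input matches. The function name is made up for the check.

```python
# Python model of the "unsigned divide by 3" listing above.

def div3_routine(n):
    temp = n                  # sta temp
    a = n >> 2                # lsr / lsr
    for _ in range(4):        # four groups of clc / adc temp / ror / lsr
        total = a + temp      # 9-bit sum; bit 8 is the carry
        a = total >> 2        # ror (carry into bit 7), then lsr
    return a


mismatches = [n for n in range(256) if div3_routine(n) != n // 3]
print(mismatches[:5])         # [3, 6, 9, 12, 15]
print(div3_routine(255))      # 84, while 255 // 3 is 85
```

If an exact quotient is needed, the result would have to be corrected for multiples of 3; whether that matters depends on what the routine is used for.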

#6 batari OFFLINE  

batari

    )66]U('=I;B$*

  • 6,237 posts
  • begin 644 contest

Posted Sun Jun 12, 2005 5:43 PM

djmips, on Sun Jun 12, 2005 8:20 AM, said:

    LAX word
    INX
    AND #$F0
    STA word
    TXA
    AND #$0F
    ORA word
    STA word


I think I can save a byte... I am not totally sure if this will work though, I've been wrong before.

    inc word
    lax word
    and #$0f
    bne no
    txa
    sbx #$10
    stx word
no
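The SBX version can be modelled the same way. The sketch below is Python, not 6502 code; it assumes SBX behaves as described later in the thread, X = (A AND X) - immediate with the incoming carry ignored, and the function name is hypothetical.

```python
# Python model of the INC / LAX / SBX nybble increment.

def sbx_version(word):
    word = (word + 1) & 0xFF            # inc word
    a = x = word                        # lax word
    a &= 0x0F                           # and #$0f
    if a == 0:                          # bne no: falls through only on nybble wrap
        a = x                           # txa
        x = ((a & x) - 0x10) & 0xFF     # sbx #$10: undo the carry into the high nybble
        word = x                        # stx word
    return word


def expected(word):
    """Increment the low nybble with wrap, leave the high nybble alone."""
    return (word & 0xF0) | ((word + 1) & 0x0F)


sbx_ok = all(sbx_version(w) == expected(w) for w in range(256))
print(sbx_ok)  # True
```

Counting zero-page operands, the listing comes to 13 bytes against 14 for the temp-less LAX version, which matches the "save a byte" claim.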


#7 Cybergoth OFFLINE  

Posted Mon Jun 13, 2005 2:34 AM

Hi there!

The last 2 instructions should rather be SBC and STA I think :)

And, to make it totally failproof, you'd probably need to add a SEC before the subtraction.

Greetings,
Manuel

#8 batari OFFLINE  

Posted Mon Jun 13, 2005 3:09 AM

Cybergoth, on Mon Jun 13, 2005 3:34 AM, said:

Hi there!

The last 2 instructions should rather be SBC and STA I think :)

And, to make it totally failproof, you'd probably need to add a SEC before the subtraction.

Greetings,
Manuel


Actually, SBX is an illegal opcode that stores to X the result of (A&X)-Immediate (and it ignores the carry!) For this reason I've found it useful to save a byte here and there.

#9 Cybergoth OFFLINE  

Posted Mon Jun 13, 2005 3:37 AM

Hi there!

batari, on Mon Jun 13, 2005 9:09 AM, said:

Actually, SBX is an illegal opcode that stores to X the result of (A&X)-Immediate (and it ignores the carry!)

Oh... I thought that opcode was called AXS :?

Anyway, clever usage! :)

Greetings,
Manuel

#10 batari OFFLINE  

Posted Mon Jun 13, 2005 4:37 AM

Cybergoth, on Mon Jun 13, 2005 4:37 AM, said:

Hi there!

batari, on Mon Jun 13, 2005 9:09 AM, said:

Actually, SBX is an illegal opcode that stores to X the result of (A&X)-Immediate (and it ignores the carry!)

Oh... I thought that opcode was called AXS :?

Anyway, clever usage! :)

Greetings,
Manuel


I've noticed that different documents use different mnemonics for the illegals, so maybe SBX=AXS. Though SBX works in dasm, maybe AXS does too :?

#11 Cybergoth OFFLINE  

Posted Mon Jun 13, 2005 6:24 AM

Hi there!

To contribute something myself, here's an IMO very useful bit I wrote for Gunfight back then and which I'm using one way or another in Star Fire and Crazy Balloon as well.

It checks whether a point is within a rectangle (software collision detection!):

   LDA rect.right
   SBC point.x
   BMI NoHit
   SBC rect.width
   BPL NoHit
   LDA rect.top
   SBC point.y
   BMI NoHit
   SBC rect.height
   BPL NoHit
   ;BANG!
NoHit
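One way to read the listing is that each axis passes when 0 <= edge - p < size, tested through the sign bit of two chained subtractions. The Python sketch below models that reading; it assumes the carry is set on entry (the listing has no SEC) and that all differences fit in a signed byte. The helper names are hypothetical.

```python
# Python model of the point-in-rectangle test, one axis at a time.

def to_signed8(value):
    """Interpret the low 8 bits as a signed byte, as BMI/BPL do."""
    value &= 0xFF
    return value - 256 if value & 0x80 else value


def axis_hit(edge, size, p):
    d = to_signed8(edge - p)      # LDA rect.right / SBC point.x
    if d < 0:                     # BMI NoHit
        return False
    d = to_signed8(d - size)      # SBC rect.width
    return d < 0                  # BPL NoHit when the result is >= 0


def point_in_rect(right, width, top, height, px, py):
    return axis_hit(right, width, px) and axis_hit(top, height, py)


print(point_in_rect(50, 8, 100, 10, 45, 95))  # True
print(point_in_rect(50, 8, 100, 10, 42, 95))  # False
```

In this model a point hits when it lies in the half-open range (edge - size, edge] on both axes.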

Greetings,
Manuel

#12 vdub_bobby OFFLINE  

vdub_bobby

    Quadrunner

  • 5,831 posts
  • Boom bam.
  • Location:Seattle, WA

Posted Mon Jun 13, 2005 9:57 AM

Cybergoth, on Mon Jun 13, 2005 5:24 AM, said:

Hi there!

To contribute something myself, here's an IMO very useful bit I wrote for Gunfight back then and which I'm using one way or another in Star Fire and Crazy Balloon as well.

It checks whether a point is within a rectangle (software collision detection!):

   LDA rect.right
   SBC point.x
   BMI NoHit
   SBC rect.width
   BPL NoHit
   LDA rect.top
   SBC point.y
   BMI NoHit
   SBC rect.height
   BPL NoHit
  ;BANG!
NoHit

Greetings,
Manuel


Here's the software collision routine I worked up for Go Fish!
A little different - I use it to check if two boxes overlap. Call it once for X values, then call it again with Y values.

CheckBoundaries
        lda rect1.leftortop
        cmp rect2.leftortop
        bmi Check2
        cmp rect2.rightorbottom
        bmi InsideBoundingBox
Check2
        lda rect2.leftortop
        cmp rect1.leftortop
        bmi NotInsideBoundingBox
        cmp rect1.rightorbottom
        bpl NotInsideBoundingBox
InsideBoundingBox
        sec
        rts
NotInsideBoundingBox
        clc
        rts
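The branch structure of CheckBoundaries can be compared against the textbook interval-overlap test with a small Python sketch. It assumes each box satisfies low < high and that coordinates are close enough for BMI/BPL after CMP to act as a plain less-than test; the function names are made up.

```python
# Python model of CheckBoundaries for one axis, compared with the usual test.

def routine_overlap(lo1, hi1, lo2, hi2):
    if lo2 <= lo1 < hi2:          # first two compares: rect1 starts inside rect2
        return True               # InsideBoundingBox (sec)
    return lo1 <= lo2 < hi1       # Check2: rect2 starts inside rect1


def textbook_overlap(lo1, hi1, lo2, hi2):
    return lo1 < hi2 and lo2 < hi1


coords = range(12)
same = all(
    routine_overlap(a, b, c, d) == textbook_overlap(a, b, c, d)
    for a in coords for b in coords if a < b
    for c in coords for d in coords if c < d
)
print(same)  # True
```

Two boxes collide when the check passes for both the X call and the Y call.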


#13 Cybergoth OFFLINE  

Posted Mon Jun 13, 2005 10:08 AM

Hi there!

vdub_bobby, on Mon Jun 13, 2005 3:57 PM, said:

Here's the software collision routine I worked up for Go Fish!
A little different - I use it to check if two boxes overlap.  Call it once for X values, then call it again with Y values.



Do you use it for fish:fish collisions? Probably not, as the hardware detection should be good enough for that, right?

Greetings,
Manuel

#14 vdub_bobby OFFLINE  

Posted Mon Jun 13, 2005 10:58 AM

Cybergoth, on Mon Jun 13, 2005 9:08 AM, said:

Hi there!

vdub_bobby, on Mon Jun 13, 2005 3:57 PM, said:

Here's the software collision routine I worked up for Go Fish!
A little different - I use it to check if two boxes overlap.  Call it once for X values, then call it again with Y values.



Do you use it for fish:fish collisions? Probably not, as the hardware detection should be good enough for that, right?

Greetings,
Manuel


I use it for collisions between the playerLEFT-controlled fish and the playerRIGHT-controlled fish in the two-player game. Flicker. :)

#15 batari OFFLINE  

Posted Mon Jun 13, 2005 4:47 PM

What this does is set a number of random bits in a memory location. X is defined before calling this routine. Actually, it's not "killer" yet - I think this can be improved for cycles as well as space... but I'm at a loss for ideas... Anyone?

EDIT: oops, "bits" should be zero page, not immediate.

makemines
        lda bits
        sta TEMPVAR
loop    JSR randomize; returns random value in accumulator
        AND #7
        TAY
        LDA maskbit,y
        ORA minefield-1,x; x is defined outside this routine
        STA minefield-1,x
        dec TEMPVAR
        BPL loop
        rts

maskbit
        .byte %00000001
        .byte %00000010
        .byte %00000100
        .byte %00001000
        .byte %00010000
        .byte %00100000
        .byte %01000000
        .byte %10000000

Edited by batari, Mon Jun 13, 2005 5:12 PM.


#16 djmips OFFLINE  

Posted Tue Jun 14, 2005 1:40 AM

batari, on Mon Jun 13, 2005 3:47 PM, said:

What this does is set a number of random bits in a memory location.  X is defined before calling this routine.  Actually, it's not "killer" yet - I think this can be improved for cycles as well as space... but I'm at a loss for ideas...  Anyone?

EDIT: oops, "bits" should be zero page, not immediate.

makemines
        lda bits
        sta TEMPVAR
loop    JSR randomize; returns random value in accumulator
        AND #7
        TAY
        LDA maskbit,y
        ORA minefield-1,x; x is defined outside this routine
        STA minefield-1,x
        dec TEMPVAR
        BPL loop
        rts

maskbit
        .byte %00000001
        .byte %00000010
        .byte %00000100
        .byte %00001000
        .byte %00010000
        .byte %00100000
        .byte %01000000
        .byte %10000000



Hmmm, so let me see: bits + 1 is the maximum number of bits you want set. If bits is 2, for example, legitimate output has anywhere from 1 to 3 bits set, because your random routine could return the same result each time. Is that what you really want, or does it matter whether the routine always sets exactly the requested number of bits?

I thought about something where you generate a bit per loop and the following is an idea for the inner loop.

loop:   jsr random
        cmp threshold
        rol temp

I don't think this approach will result in an improvement over your version but maybe it sparks an idea.
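Both ideas can be sketched in Python to see how many bits end up set. This is a model only: `rand8` stands in for the randomize routine, `from_values` is a hypothetical helper that replays fixed values so the result is repeatable, and the loop count of 8 in the threshold version is an assumption, since the snippet shows only the inner loop.

```python
# Python model of makemines and of the cmp/rol idea.

def from_values(values):
    """Hypothetical stand-in for the random routine: replays fixed bytes."""
    it = iter(values)
    return lambda: next(it)


def make_mines(field, bits, rand8):
    for _ in range(bits + 1):           # dec TEMPVAR / bpl loop runs bits+1 times
        field |= 1 << (rand8() & 7)     # and #7 / lda maskbit,y / ora minefield-1,x
    return field & 0xFF


def threshold_bits(threshold, rand8, count=8):
    temp = 0
    for _ in range(count):
        carry = 1 if rand8() >= threshold else 0   # cmp sets carry when A >= operand
        temp = ((temp << 1) | carry) & 0xFF        # rol temp shifts the carry in
    return temp


print(bin(make_mines(0, 2, from_values([5, 5, 5]))))   # 0b100000
print(bin(make_mines(0, 2, from_values([0, 1, 2]))))   # 0b111
```

With repeated random values the first routine sets fewer than bits + 1 bits, which is the 1-to-3 range described above; the threshold version instead sets each bit independently with a probability controlled by the threshold.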

#17 batari OFFLINE  

Posted Wed Jun 15, 2005 5:04 PM

djmips, on Tue Jun 14, 2005 2:40 AM, said:

loop:   jsr random
        cmp threshold
        rol temp

I don't think this approach will result in an improvement over your version but maybe it sparks an idea.


I thought of doing something similar; it would save bytes but add cycles... But it does spark an idea. This routine is only called every 12 frames, so I should use something like the above and spread it out over several frames; then I'll have cycles to spare.

#18 batari OFFLINE  

Posted Wed Jun 15, 2005 5:18 PM

I like little code snippets like the ones posted here... I don't want this thread to die, so I'll post another one of my hacks that saved a few bytes.

In 2600 games, there's often tons of STA WSYNCs, so I wondered if there was a way to get basically the same effect while saving space. So I came up with this, which works in cases where you have some kernel timing to spare, and the stack pointer is constant (let's assume $FF). Basically you replace all STA WSYNCs with BRK, but don't add the extra byte after the BRK, by using this short BRK routine:
brkroutine
        DEC $FE         ; correct return address to eliminate the byte after the BRK
                        ; ONLY works when low byte of return address is not zero!
        STA WSYNC
        RTI
If you replace 6 or more STA WSYNCs, you start saving space...
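The break-even point follows from the instruction sizes. As a small Python sketch of that arithmetic, assuming zero-page addressing for STA WSYNC and DEC $FE and that the BRK vector bytes are already present in the ROM (so they cost nothing extra):

```python
# Byte accounting for replacing STA WSYNC with BRK.

STA_WSYNC_BYTES = 2            # STA zero page
BRK_BYTES = 1                  # BRK with no padding byte
HANDLER_BYTES = 2 + 2 + 1      # DEC $FE, STA WSYNC, RTI


def bytes_saved(replacements):
    return replacements * (STA_WSYNC_BYTES - BRK_BYTES) - HANDLER_BYTES


for n in (4, 5, 6, 10):
    print(n, bytes_saved(n))   # 4 -1, 5 0, 6 1, 10 5
```

Five replacements break even and the sixth is the first to save a byte.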

Edited by batari, Wed Jun 15, 2005 5:20 PM.


#19 Bruce Tomlin OFFLINE  

Bruce Tomlin

    River Patroller

  • 3,531 posts
  • CD C9 01
  • Location:Austin, TX

Posted Wed Jun 15, 2005 5:40 PM

...but you better not be tight for cycles!

That adds 6 cycles after the STA WSYNC, and 10 before it.

In Red Box/Blue Box there were only a few places that I could JSR to a copy of my DoSound macro, and it saved a LOT of bytes when I did. There were only a few places because I only had 15 cycles to spare on each scan line after the DoSound macro. It would have only been 11 except that I saved 4 cycles from using LAX.

#20 batari OFFLINE  

Posted Wed Jun 15, 2005 5:55 PM

Bruce Tomlin, on Wed Jun 15, 2005 6:40 PM, said:

...but you better not be tight for cycles!

That adds 6 cycles after the STA WSYNC, and 10 before it.

In Red Box/Blue Box there were only a few places that I could JSR to a copy of my DoSound macro, and it saved a LOT of bytes when I did.  There were only a few places because I only had 15 cycles to spare on each scan line after the DoSound macro.  It would have only been 11 except that I saved 4 cycles from using LAX.


Yeah, there were lots of places where I couldn't use this trick.

But if you're already using this trick, I think you could use it to create an 8 byte VSYNC!

Now, this assumes that BRK/RTI will restore flags on returning. It does, right? If so, then this should work when SP=$FF and you use it right after your INTIM loop, like this:
.1
LDX INTIM
BNE .1

so you are certain X=0 and the Z flag is 1. Anyway:

; 8 byte VSYNC!
 BRK;STA WSYNC, plus restore flags (?)
 TXS;stack pointer = 0, which is VSYNC
 PHP;Z=1, which writes a 1 to bit 1 of VSYNC
 BRK
 BRK
 BRK
 STX VSYNC


#21 djmips OFFLINE  

Posted Thu Jun 16, 2005 3:54 PM

Here's a killer hack from the Stella archives. It's near and dear to me because it is a key component in most of the modern moving 48-pixel-wide sprite code, like my Amiga Boing demo 2.0 (derived from R. Kudla / E. Stolberg). It is also used in the various Fu Kung demos from A. Davies.

Also, it is very cool. Definitely a killer hack.

It was originally posted by the late Jim Nitchals on Mar 18 1998

Quote

Hi,

Here's a way to implement single cycle resolution without the use of the
carry flag (which adds overhead in the setup and at the end):

; A is assumed to hold the delay value plus the offset address of JumpTable.
; Or, you can align JumpTable to a page boundary.

  sta indjmp
  jmp (indjmp) ; point indjmp+1 to JumpTable somewhere in your init code

JumpTable:
  dc.b $C9
  dc.b $C9
; repeat as many $C9's as you need for the maximum number of cycles
; you need to delay by.
  dc.b $C9     ; opcode: CMP immediate (4 cycles: uses the $C5, executes
               ; the NOP below.)
  dc.b $C5     ; opcode: CMP zero page (3 cycles, uses up the NOP as a
               ; destination address of $EA)
  nop          ; opcode: NOP (2 cycles by itself)

You may find the reduced overhead of this technique useful.
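To see why every entry point differs by exactly one cycle, the table can be decoded from each offset in a tiny Python model. This is a sketch: it knows only the three opcodes in the table, uses their documented base timings (CMP immediate 2 cycles, CMP zero page 3, NOP 2), and leaves out the fixed overhead of the STA and the indirect JMP. The function names are hypothetical.

```python
# Python model of the $C9 ... $C9 $C5 $EA delay table.

OPCODES = {
    0xC9: (2, 2),   # CMP #imm : 2 bytes, 2 cycles
    0xC5: (2, 3),   # CMP zp   : 2 bytes, 3 cycles
    0xEA: (1, 2),   # NOP      : 1 byte,  2 cycles
}


def build_table(extra):
    """`extra` additional $C9 bytes in front of the final $C9 $C5 $EA."""
    return [0xC9] * extra + [0xC9, 0xC5, 0xEA]


def delay_cycles(table, entry):
    pc, cycles = entry, 0
    while pc < len(table):
        length, cost = OPCODES[table[pc]]
        pc += length
        cycles += cost
    return cycles


table = build_table(5)
print([delay_cycles(table, e) for e in range(len(table))])
# [9, 8, 7, 6, 5, 4, 3, 2]
```

Entering one byte earlier always costs one cycle more, because the operand byte of one CMP is the opcode of the next instruction when entered from the other parity.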


#22 cd-w OFFLINE  

cd-w

    Stargunner

  • 1,195 posts
  • Juno First!
  • Location:Glasgow, UK

Posted Sun Jun 19, 2005 6:25 AM

This is a fairly obvious hack, but here goes ...

If you want to display more than 2 sprites, you can use the missile and ball graphics to construct pseudo-sprites. Obviously you can only display a limited number of shapes, but if you are clever, you can obtain the appearance of extra flicker-free sprites. The following code fragment illustrates how to draw a man sprite using only missile 0:

Kernel
        sta WSYNC
        sta HMOVE               ; [0] + 3
                
        ; Draw Sprite (SwitchDraw Variant)
        cpy PSWITCH             ; [3] + 3 
        bpl PSwitch             ; [6] + 2/3
        lda (PPTR),Y            ; [8] + 5
        sta ENAM0               ; [13] + 3
        sta HMM0                ; [16] + 3
        asl                     ; [19] + 2
        asl                     ; [21] + 2
        sta NUSIZ0              ; [23] + 3
PContinue
        dey
        bpl Kernel

        ; SwitchDraw Routines
PSwitch
        bne PWait               ; [9] + 2/3
        lda PEND                ; [11] + 3      
        sta PSWITCH             ; [14] + 3
        SLEEP 6                 ; [17] + 6
        bcs PContinue           ; [23] + 3
PWait
        sta HMCLR               ; [12] + 3
        SLEEP 8                 ; [15] + 8
        bpl PContinue           ; [23] + 3      

; Player Data
; Bit 7-4 = HMove 
; Bit 3-2 = Missile Width (1, 2, 4, or 8 pixels)
; Bit 1-0 = Missile Enable
Player1
        DC.B    %00000000
        DC.B    %00000110
        DC.B    %00000010               
        DC.B    %00000110
        DC.B    %11111010
        DC.B    %00001010
        DC.B    %00001010
        DC.B    %00001010       
        DC.B    %00001010       
        DC.B    %00010010
        DC.B    %00000110
        DC.B    %00000010
        DC.B    %00000110
        DC.B    %11111010
        DC.B    %00001010               
        DC.B    %00010110             
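The packed format in the Player Data comments can be unpacked with a short Python sketch, shown here only to make the bit layout concrete. The function name is made up; it assumes the horizontal-motion nybble is read as a signed 4-bit value and that only bit 1 matters for ENAM0, with the two ASLs moving bits 3-2 into the missile-size field of NUSIZ0.

```python
# Decode one byte of the missile "sprite" data above.

def decode_missile_byte(b):
    motion = b >> 4                   # bits 7-4 go to HMM0
    if motion >= 8:
        motion -= 16                  # signed 4-bit value
    width = 1 << ((b >> 2) & 0x03)    # bits 3-2: 1, 2, 4 or 8 pixels
    enabled = bool(b & 0x02)          # bit 1 is the ENAM0 enable bit
    return motion, width, enabled


print(decode_missile_byte(0b11111010))  # (-1, 4, True)
print(decode_missile_byte(0b00000110))  # (0, 2, True)
print(decode_missile_byte(0b00000000))  # (0, 1, False)
```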

I have attached the full code to this message which allows you to move the sprite around the screen.

Chris




#23 supercat OFFLINE  

supercat

    Quadrunner

  • 6,367 posts

Posted Mon Jun 20, 2005 5:59 PM

batari, on Mon Jun 13, 2005 4:09 AM, said:

Actually, SBX is an illegal opcode that stores to X the result of (A&X)-Immediate (and it ignores the carry!)  For this reason I've found it useful to save a byte here and there.



What's the best resource for finding a list of such opcodes, DASM's preferred mnemonics for them, and any side-effects or weirdnesses?

#24 batari OFFLINE  

Posted Mon Jun 20, 2005 6:12 PM

supercat, on Mon Jun 20, 2005 6:59 PM, said:

batari, on Mon Jun 13, 2005 4:09 AM, said:

Actually, SBX is an illegal opcode that stores to X the result of (A&X)-Immediate (and it ignores the carry!)  For this reason I've found it useful to save a byte here and there.



What's the best resource for finding a list of such opcodes, DASM's preferred mnemonics for them, and any side-effects or weirdnesses?


Emulator source code is the most comprehensive resource. Second best is probably this.

#25 djmips OFFLINE  

Posted Thu Jul 14, 2005 1:09 AM

Not a killer hack but I'd like to share a link to an interview with William Mensch, 6502 designer.

Real format video of interview



