
6502 Killer hacks


djmips

As I was growing up, I kept a notebook full of cool code snippets and ideas. The notebook had been misplaced, but I ran across it recently. Here is one of its pages, taken from a 1987 Dr. Dobb's Journal article by Mark S. Ackerman, "6502 Killer Hacks".

Post your own 6502 Killer Hacks and share them with the rest of us!

I also checked into Mark S. Ackerman with our trusty tool Google and found his 'vita' -

Pretty sure it's the same guy, as he worked at GCC from 1982 to 1984 and was the lead on Ms. Pac-Man, Galaxian and Moon Patrol. Time to update the AtariAge database, as these games have no staff listed.

He has a patent on the Galaxian kernel.

Well here is the killer hack. This one is to scrimp on RAM.

Incrementing only the lower 4 bits of a byte (with wrap)
 

...
   lda word      ; original byte
   and #$0f      ; retrieve lower nybble
   tay            ; index
   lda word
   clc            ; might not be needed
   adc nextinc,y  ; could be ora or sbc
   sta word
...

nextinc .byte 1,2,3,4,5,6,7,8
        .byte 9,10,11,12,13,14,15,0


Well, funny thing is - maybe I didn't transcribe it properly back in '87 - because it doesn't seem like it would work.

Seems like it needs an AND #$F0 after the second LDA word


So I thought I'd take a shot at a working version...

 

 

 

...
   lda word      ; original byte
   and #$0f      ; retrieve lower nybble
   tay            ; index
   lda word
   clc
   adc nextinc,y
   sta word
...

nextinc .byte 1,1,1,1,1,1,1,1
        .byte 1,1,1,1,1,1,1,-15

 


Who knows if that one works either?

If someone has the original article from the Feb 1987 Dr. Dobb's Journal, I'd be curious to see the code.
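For anyone who wants to check the corrected table without an assembler, here is a minimal Python model of it. The function name and the 8-bit masking are scaffolding added for illustration, not part of the article; the model only mirrors the lda/and/tay/adc sequence above.

```python
# Hypothetical model of the table-driven nybble increment (names are illustrative).
NEXTINC = [1] * 15 + [-15]   # same values as the corrected nextinc table

def inc_low_nybble(word):
    """Add nextinc[low nybble] to the byte, keeping the result to 8 bits."""
    index = word & 0x0F                       # and #$0f / tay
    return (word + NEXTINC[index]) & 0xFF     # clc / adc nextinc,y

# The high nybble is preserved and the low nybble wraps from $F to $0.
for value in (0x30, 0x3E, 0x3F, 0xFF):
    print(f"${value:02X} -> ${inc_low_nybble(value):02X}")
```

The -15 entry is what undoes the carry into the high nybble: adding it to a byte ending in $F lands back on the same high nybble with a low nybble of 0.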

Also, post your own 6502 Killer Hacks and share them with the rest of us!

- David

Updated 2017: Just came across the original PDF of the article by Mark S. Ackerman and confirmed that I did transcribe it incorrectly but my fixed version is the same as the published version.

http://archive.6502.org/publications/dr_dobbs_journal_selected_articles/6502_hacks.pdf

 

 

See the following post for a better version of this hack.

 

 


Hi there!

 


Yours should work a lot better than the other version. So you're going for speed here? A version without table would certainly waste less ROM space:

 

   LAX word
   INX
   AND #$F0
   STA temp
   TXA
   AND #$0F
   ORA temp
   STA word

 

As many cycles, but 14 bytes saved...

(Also this can count n bits, it's not fixed to 4)
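The "n bits" remark can be sketched in Python as follows. This is only a model of the mask-and-merge idea; `inc_low_bits` and its `nbits` parameter are names invented here, and on the 6502 the two masks would simply be different immediate constants.

```python
def inc_low_bits(word, nbits=4):
    """Increment only the low `nbits` bits of a byte, with wrap."""
    low_mask = (1 << nbits) - 1          # $0F when nbits is 4
    high_mask = 0xFF & ~low_mask         # $F0 when nbits is 4
    x = (word + 1) & 0xFF                # LAX word / INX
    temp = word & high_mask              # AND #$F0 / STA temp
    return (x & low_mask) | temp         # TXA / AND #$0F / ORA temp

print(hex(inc_low_bits(0x3F)))           # 4-bit counter in the low nybble
print(bin(inc_low_bits(0b10100111, 3)))  # 3-bit counter in the low three bits
```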

 

Greetings,

Manuel


I like that version, nice use of LAX, but you gotta be brave and not use the temp ;)

 

   LAX word
   INX
   AND #$F0
   STA word
   TXA
   AND #$0F
   ORA word
   STA word


Post your own 6502 Killer Hacks and share them with the rest of us!

Here's one I like:

 

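; unsigned divide by 3 (value in A on entry, result in A)
; note: as written, the truncating shifts leave nonzero multiples of 3 low (3 -> 0, 6 -> 1)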

   sta    temp
   lsr
   lsr
   clc
   adc    temp
   ror
   lsr
   clc
   adc    temp
   ror
   lsr
   clc
   adc    temp
   ror
   lsr
   clc
   adc    temp
   ror
   lsr

:)
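To see what the sequence actually computes, here is a small Python model of it. It is a sketch under the assumption that the value to divide is in A on entry; `div3_model` is a made-up name. Each CLC/ADC/ROR/LSR round replaces A with (A + x) / 4, which approaches x / 3 from below.

```python
def div3_model(x):
    """Mirror the posted sequence: two LSRs, then four CLC/ADC/ROR/LSR rounds."""
    a = x >> 2                      # lsr / lsr
    for _ in range(4):
        total = a + x               # clc / adc temp (9-bit sum, carry included)
        a = total >> 2              # ror brings the carry back in, then lsr
    return a

mismatches = [x for x in range(256) if div3_model(x) != x // 3]
print(len(mismatches), mismatches[:5])
```

In this model the rounding is always downward, so an exact multiple of 3 never reaches its true quotient (3 gives 0, 6 gives 1, 255 gives 84), while values such as 4, 7 and 100 come out right. If exact results matter, some rounding correction would be needed.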

I think I can save a byte... I am not totally sure if this will work though, I've been wrong before.

 

   inc word
   lax word
   and #$0f
   bne no
   txa
   sbx #$10
   stx word
no


Hi there!

 

The last 2 instructions should rather be SBC and STA I think :)

 

And, to make it totally failproof, you'd probably need to add a SEC before the subtraction.

 

Greetings,

Manuel

873284[/snapback]

Actually, SBX is an illegal opcode that stores to X the result of (A&X)-Immediate (and it ignores the carry!) For this reason I've found it useful to save a byte here and there.
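A quick way to convince yourself that the SBX version is right is to model it. The sketch below assumes exactly the SBX behaviour described in this post, X = (A AND X) - immediate with the incoming carry ignored; the function names are illustrative.

```python
def sbx(a, x, imm):
    """Illegal opcode SBX/AXS as described above: X = (A & X) - imm, carry-in ignored."""
    return ((a & x) - imm) & 0xFF

def inc_low_nybble_sbx(word):
    word = (word + 1) & 0xFF        # inc word
    a = x = word                    # lax word
    if a & 0x0F:                    # and #$0f / bne no
        return word                 # low nybble did not wrap, nothing to fix
    a = x                           # txa (the AND clobbered A)
    return sbx(a, x, 0x10)          # sbx #$10 / stx word

print(hex(inc_low_nybble_sbx(0x3F)), hex(inc_low_nybble_sbx(0xFF)))
```

When the low nybble wraps, INC has already bumped the high nybble, and subtracting $10 puts it back. No SEC is needed because SBX does not use the carry as a borrow.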


Hi there!

 


Oh... I thought that opcode was called AXS :?

 

Anyway, clever usage! :)

 

Greetings,

Manuel

873290[/snapback]

I've noticed that different documents use different mnemonics for the illegals, so maybe SBX=AXS. Though SBX works in dasm, maybe AXS does too :?


Hi there!

 

To contribute something myself, here's an IMO very useful bit I wrote for Gunfight back then and which I'm using one way or another in Star Fire and Crazy Balloon as well.

 

It checks whether a point is within a rectangle (software collision detection!):

 

  LDA rect.right
  SBC point.x
  BMI NoHit
  SBC rect.width
  BPL NoHit
  LDA rect.top
  SBC point.y
  BMI NoHit
  SBC rect.height
  BPL NoHit
  ;BANG!
NoHit
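Here is a rough Python model of one axis of this test, to show which positions count as a hit. The snippet does not set the carry before the first SBC, so the model takes the incoming carry as a parameter; that parameter and the names `sbc` and `in_span` are assumptions added for illustration.

```python
def sbc(a, m, carry):
    """8-bit SBC in binary mode: returns (result, carry_out)."""
    diff = a - m - (1 - carry)
    return diff & 0xFF, 1 if diff >= 0 else 0

def in_span(far_edge, size, p, carry=1):
    """One axis of the test: LDA edge / SBC p / BMI / SBC size / BPL."""
    a, carry = sbc(far_edge, p, carry)
    if a & 0x80:                    # BMI NoHit: point lies beyond the edge
        return False
    a, carry = sbc(a, size, carry)
    return bool(a & 0x80)           # BPL NoHit: only a negative result is a hit

print([p for p in range(160) if in_span(50, 10, p)])
```

With the carry set on entry, an edge of 50 and a size of 10 accept positions 41 to 50. With the carry clear the window shifts down by one to 40 to 49, which is the price of skipping the SEC.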

 

Greetings,

Manuel

Here's the software collision routine I worked up for Go Fish!

A little different - I use it to check if two boxes overlap. Call it once for X values, then call it again with Y values.

CheckBoundaries
       lda rect1.leftortop
       cmp rect2.leftortop
       bmi Check2
       cmp rect2.rightorbottom
       bmi InsideBoundingBox
Check2
       lda rect2.leftortop
       cmp rect1.leftortop
       bmi NotInsideBoundingBox
       cmp rect1.rightorbottom
       bpl NotInsideBoundingBox
InsideBoundingBox
       sec
       rts
NotInsideBoundingBox
       clc
       rts
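The branch structure above can be modelled in Python like this. It is a sketch, not the Go Fish! source: the function names are invented, and it assumes coordinates close enough together that the 8-bit difference still has a meaningful sign, since BMI/BPL after CMP only look at bit 7.

```python
def n_after_cmp(a, m):
    """N flag after CMP: bit 7 of the 8-bit difference."""
    return bool((a - m) & 0x80)

def check_boundaries(lo1, hi1, lo2, hi2):
    """Mirror of CheckBoundaries for one axis; True stands for carry set."""
    if not n_after_cmp(lo1, lo2) and n_after_cmp(lo1, hi2):
        return True                      # lo2 <= lo1 < hi2
    # Check2
    if n_after_cmp(lo2, lo1):
        return False                     # lo2 < lo1
    return n_after_cmp(lo2, hi1)         # inside only if lo2 < hi1

def boxes_overlap(box1, box2):
    """Boxes are (left, right, top, bottom); call the axis check twice."""
    return (check_boundaries(box1[0], box1[1], box2[0], box2[1])
            and check_boundaries(box1[2], box1[3], box2[2], box2[3]))

print(boxes_overlap((10, 20, 10, 20), (15, 25, 18, 30)))
```

For boxes with positive width, the axis check agrees with the usual interval test `lo1 < hi2 and lo2 < hi1`.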


Hi there!

 


Do you use it for fish:fish collisions? Probably not, as the hardware detection should be good enough for that, or?

 

Greetings,

Manuel

Use it for playerLEFT-controlled-fish to playerRIGHT-controlled-fish collisions in two-player game. Flicker. :)


What this does is set a number of random bits in a memory location. X is defined before calling this routine. Actually, it's not "killer" yet - I think this can be improved for cycles as well as space... but I'm at a loss for ideas... Anyone?

 

EDIT: oops, "bits" should be zero page, not immediate.

 

makemines
       lda bits
       sta TEMPVAR
loop    JSR randomize; returns random value in accumulator
       AND #7
       TAY
       LDA maskbit,y
       ORA minefield-1,x; x is defined outside this routine
       STA minefield-1,x
       dec TEMPVAR
       BPL loop
       rts

maskbit
       .byte %00000001
       .byte %00000010
       .byte %00000100
       .byte %00001000
       .byte %00010000
       .byte %00100000
       .byte %01000000
       .byte %10000000


Hmmm. So let me see: bits + 1 is the maximum number of bits you want set. If bits is 2, for example, a legitimate output has anywhere from 1 to 3 bits set, because your random routine could return the same result each time. Is that what you really want, or would it matter if the routine always set exactly the requested number of bits?
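The point about duplicates is easy to see in a small model. The sketch below stands in for makemines with a list of pre-chosen "random" values instead of a call to randomize; `make_mines` and its arguments are illustrative names.

```python
def make_mines(bits, random_values, field=0):
    """OR (bits + 1) randomly chosen single-bit masks into one byte."""
    draws = iter(random_values)
    for _ in range(bits + 1):            # dec TEMPVAR / BPL loop runs bits + 1 times
        y = next(draws) & 7              # AND #7 / TAY
        field |= 1 << y                  # LDA maskbit,y / ORA / STA
    return field

print(bin(make_mines(2, [0, 1, 2])))     # three different draws
print(bin(make_mines(2, [5, 5, 5])))     # the same draw three times
```

Three distinct draws set three bits, but three identical draws set only one, so bits + 1 is an upper bound rather than a guaranteed count.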

 

I thought about something where you generate a bit per loop and the following is an idea for the inner loop.

 

loop:	jsr random
 cmp threshold
 rol temp

 

I don't think this approach will result in an improvement over your version but maybe it sparks an idea.
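As a sketch of how that inner loop behaves, here is a Python model. The posted fragment has no loop counter, so the eight rounds (one per bit of temp) are an assumption, as are the function name and the list standing in for the random routine.

```python
def random_bits(random_values, threshold, rounds=8):
    """Shift one compare result per round into an 8-bit value."""
    temp = 0
    for r in list(random_values)[:rounds]:
        carry = 1 if r >= threshold else 0      # cmp threshold sets C when A >= threshold
        temp = ((temp << 1) | carry) & 0xFF     # rol temp shifts C into bit 0
    return temp

print(bin(random_bits([0, 255, 0, 255, 0, 255, 0, 255], 128)))
```

With a uniform random byte, each bit ends up set with probability (256 - threshold) / 256, so the threshold controls the expected density rather than an exact count.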

I thought of doing something similar, it would save bytes but add cycles... But it does spark an idea. This routine is only called every 12 frames, so I should use something like the above and spread it out over several frames, then I'll have cycles to spare.


I like little code snippets like the ones posted here... I don't want this thread to die, so I'll post another one of my hacks that saved a few bytes.

 

In 2600 games, there's often tons of STA WSYNCs, so I wondered if there was a way to get basically the same effect while saving space. So I came up with this, which works in cases where you have some kernel timing to spare, and the stack pointer is constant (let's assume $FF). Basically you replace all STA WSYNCs with BRK, but don't add the extra byte after the BRK, by using this short BRK routine:

brkroutine
DEC $FE; correct return address to eliminate the byte after the BRK
           ; ONLY works when low byte of return address is not zero!
STA WSYNC
RTI

If you replace 6 or more STA WSYNCs, you start saving space...
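The byte accounting and the return-address caveat can be checked with a few lines of Python. This is only a sketch: it assumes the BRK vector bytes are present in the ROM anyway, and the function names are made up for the example.

```python
STA_WSYNC_BYTES = 2      # STA zero page
BRK_BYTES = 1            # BRK without the usual padding byte
HANDLER_BYTES = 5        # DEC $FE (2) + STA WSYNC (2) + RTI (1)

def bytes_saved(replacements):
    """Net ROM bytes saved after paying for the handler."""
    return replacements * (STA_WSYNC_BYTES - BRK_BYTES) - HANDLER_BYTES

def resume_address(brk_address):
    """Where RTI resumes after the handler's DEC on the stacked low byte."""
    pushed = (brk_address + 2) & 0xFFFF          # BRK stacks its own address + 2
    low = ((pushed & 0xFF) - 1) & 0xFF           # DEC $FE touches the low byte only
    return (pushed & 0xFF00) | low

print(bytes_saved(5), bytes_saved(6))
print(hex(resume_address(0xF010)), hex(resume_address(0xF0FE)))
```

Five replacements only break even and the sixth is the first real saving. A BRK at $F0FE shows the caveat from the comment: the stacked low byte is $00, the DEC wraps it to $FF without borrowing from the high byte, and RTI would resume at $F1FF instead of $F0FF.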


...but you better not be tight for cycles!

 

That adds 6 cycles after the STA WSYNC (the RTI), and 12 before it (7 for the BRK plus 5 for the DEC).

 

In Red Box/Blue Box there were only a few places that I could JSR to a copy of my DoSound macro, and it saved a LOT of bytes when I did. There were only a few places because I only had 15 cycles to spare on each scan line after the DoSound macro. It would have only been 11 except that I saved 4 cycles from using LAX.

Yeah, there were lots of places where I couldn't use this trick.

 

But if you're already using this trick, I think you could use it to create an 8 byte VSYNC!

 

Now, this assumes that BRK/RTI will restore flags on returning. It does, right? If so, then this should work when SP=$FF and you use it right after your INTIM loop, like this:

.1
 LDX INTIM
 BNE .1

 

so you are certain X=0 and the Z flag is 1. Anyway:

 

; 8 byte VSYNC!
BRK;STA WSYNC, plus restore flags (?)
TXS;stack pointer = 0, which is VSYNC
PHP;Z=1, which writes a 1 to bit 1 of VSYNC
BRK
BRK
BRK
STX VSYNC


Here's a killer hack from the Stella archives. It's near and dear to me because it is a key component used in most of the modern moving 48-wide sprite code, like my Amiga Boing demo 2.0 (derived from R. Kudla / E. Stolberg). It is also used in the various Fu Kung demos from A. Davies.

 

Also, it is very cool. Definitely a killer hack.

 

It was originally posted by the late Jim Nitchals on Mar 18 1998

 

Hi,

 

Here's a way to implement single cycle resolution without the use of the carry flag (which adds overhead in the setup and at the end):

 

; A is assumed to hold the delay value plus the offset address of JumpTable.
; Or, you can align JumpTable to a page boundary.

 sta indjmp
 jmp (indjmp) ; point indjmp+1 to JumpTable somewhere in your init code

JumpTable:
 dc.b $C9
 dc.b $C9
; repeat as many $C9's as you need for the maximum number of cycles
; you need to delay by.
 dc.b $C9     ; opcode: CMP immediate (4 cycles: uses the $C5, executes
              ; the NOP below.)
 dc.b $C5     ; opcode: CMP zero page (3 cycles, uses up the NOP as a
              ; destination address of $EA)
 nop          ; opcode: NOP (2 cycles by itself)

 

You may find the reduced overhead of this technique useful.
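To see why entering the table one byte earlier costs exactly one more cycle, here is a small Python model that decodes the table from a given entry offset and adds up the cycles. It only covers the three opcodes in the table and leaves out the fixed cost of the STA and the indirect JMP; the names are illustrative.

```python
CYCLES = {0xC9: 2, 0xC5: 3, 0xEA: 2}   # CMP #imm, CMP zero page, NOP
LENGTH = {0xC9: 2, 0xC5: 2, 0xEA: 1}   # instruction sizes in bytes

def delay_cycles(table, entry):
    """Cycles spent from the entry offset until execution leaves the table."""
    pc, total = entry, 0
    while pc < len(table):
        opcode = table[pc]
        total += CYCLES[opcode]
        pc += LENGTH[opcode]
    return total

table = [0xC9] * 6 + [0xC5, 0xEA]
print([delay_cycles(table, i) for i in range(len(table))])
# prints [9, 8, 7, 6, 5, 4, 3, 2]
```

Entries at even and odd distances from the end decode as two different instruction streams: one ends with CMP #$C5 followed by the NOP, the other ends with CMP $EA, which swallows the NOP as its operand. That alternation is what gives the one-cycle steps.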


This is a fairly obvious hack, but here goes ...

 

If you want to display more than 2 sprites, you can use the missile and ball graphics to construct pseudo-sprites. Obviously you can only display a limited number of shapes, but if you are clever, you can obtain the appearance of extra flicker-free sprites. The following code fragment illustrates how to draw a man sprite using only missile 0:

 

Kernel
       sta WSYNC
       sta HMOVE               ; [0] + 3
               
       ; Draw Sprite (SwitchDraw Variant)
       cpy PSWITCH             ; [3] + 3 
       bpl PSwitch             ; [6] + 2/3
       lda (PPTR),Y            ; [8] + 5
       sta ENAM0               ; [13] + 3
       sta HMM0                ; [16] + 3
       asl                     ; [19] + 2
       asl                     ; [21] + 2
       sta NUSIZ0              ; [23] + 3
PContinue
       dey
       bpl Kernel

       ; SwitchDraw Routines
PSwitch
       bne PWait               ; [9] + 2/3
       lda PEND                ; [11] + 3      
       sta PSWITCH             ; [14] + 3
       SLEEP 6                 ; [17] + 6
       bcs PContinue           ; [23] + 3
PWait
       sta HMCLR               ; [12] + 3
       SLEEP 8                 ; [15] + 8
       bpl PContinue           ; [23] + 3      

; Player Data
; Bit 7-4 = HMove 
; Bit 3-2 = Missile Width (1, 2, 4, or 8 pixels)
; Bit 1-0 = Missile Enable
Player1
       DC.B    %00000000
       DC.B    %00000110
       DC.B    %00000010               
       DC.B    %00000110
       DC.B    %11111010
       DC.B    %00001010
       DC.B    %00001010
       DC.B    %00001010       
       DC.B    %00001010       
       DC.B    %00010010
       DC.B    %00000110
       DC.B    %00000010
       DC.B    %00000110
       DC.B    %11111010
       DC.B    %00001010               
       DC.B    %00010110             
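The data format packs three register values into one byte, which the kernel stores to ENAM0 and HMM0 directly and to NUSIZ0 after two ASLs. A minimal Python decoder, with names invented for the example, shows how one row is read:

```python
def decode_row(value):
    """Split one data byte the way the kernel's three stores use it."""
    enable = bool(value & 0b10)                  # ENAM0 looks at bit 1
    motion = value >> 4                          # HMM0 looks at bits 7-4
    if motion >= 8:
        motion -= 16                             # treat the nybble as signed
    width = 1 << ((value >> 2) & 0b11)           # after two ASLs these bits reach NUSIZ0 bits 5-4
    return enable, motion, width

for row in (0b11111010, 0b00000110, 0b00010010, 0b00000000):
    print(f"{row:08b}", decode_row(row))
```

For example, %11111010 enables the missile, gives a motion value of -1 and selects a width of 4 pixels, while %00000000 switches the missile off for that line.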

 

I have attached the full code to this message which allows you to move the sprite around the screen.

 

Chris

msprite.zip


What's the best resource for finding a list of such opcodes, DASM's preferred mnemonics for them, and any side-effects or weirdnesses?

Emulator source code is the most comprehensive resource. Second best is probably this.
