6502 Killer hacks


141 replies to this topic

#126 johncl OFFLINE  

johncl

    Space Invader

  • 11 posts

Posted Mon Dec 29, 2008 7:01 AM

djmips, on Sun Jun 12, 2005 6:09 PM, said:

I like that version, nice use of LAX, but you gotta be brave and not use the temp ;)

    LAX word
    INX
    AND #$F0
    STA word
    TXA
    AND #$0F
    ORA word
    STA word

A late reply to the starting hack of this thread. If you have a free byte of memory and can initialise it before the loop, this should be faster:

At an init stage store the top 4 bits in a variable:

lda word
and #$f0
sta hi

And in your loop when you are iterating and need the wraparound increment on lower 4 bits:

ldx word ; 3
inx      ; 2
txa      ; 2
and #$0f ; 2
ora hi   ; 3
sta word ; 3

Uses a total of 15 cycles, 5 fewer than the 20 of the one quoted, assuming word and hi are in zero page. :)

Well, if you can assume that bit 4 of your "word" variable is always zero (so the counter starts from one of $00,$20,$40,$60,$80,$a0,$c0,$e0), you could also do this:

ldx word ; 3
inx      ; 2
txa      ; 2
and #$ef ; 2
sta word ; 3

Which is only 12 cycles! :P
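
For anyone who wants to sanity-check the two sequences above without an emulator, here is a rough Python model (not 6502 code). The function names are made up for illustration, and registers are simply treated as 8-bit integers. It compares both variants against a plain reference for every byte value.

```python
def reference(word):
    """Increment the low 4 bits with wraparound, leave the high 4 bits alone."""
    return (word & 0xF0) | ((word + 1) & 0x0F)


def inc_low_nibble_with_hi(word, hi):
    """Model of: ldx word / inx / txa / and #$0f / ora hi / sta word."""
    x = (word + 1) & 0xFF       # ldx word, inx (8-bit wrap)
    a = x & 0x0F                # txa, and #$0f
    return a | hi               # ora hi, sta word


def inc_low_nibble_bit4_clear(word):
    """Model of: ldx word / inx / txa / and #$ef / sta word.
    Only meaningful when bit 4 of word is known to be zero."""
    x = (word + 1) & 0xFF       # ldx word, inx
    return x & 0xEF             # txa, and #$ef, sta word


for word in range(256):
    hi = word & 0xF0            # the value stored once at the init stage
    assert inc_low_nibble_with_hi(word, hi) == reference(word)
    if (word & 0x10) == 0:
        assert inc_low_nibble_bit4_clear(word) == reference(word)
```

The second variant is only checked for inputs with bit 4 clear, since that is the stated precondition.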

Edited by johncl, Mon Dec 29, 2008 7:46 AM.


#127 bogax OFFLINE  

bogax

    Star Raider

  • 61 posts

Posted Tue Feb 24, 2009 1:00 AM

djmips, on Sun Jun 12, 2005 3:20 AM, said:

As I was growing up, I kept a notebook full of cool code snippets and ideas. My notebook had been misplaced but I ran across it recently and here is one of the pages which is from a 1987 Dr. Dobbs article by Mark S. Ackerman. "6502 Killer Hacks".

Post your own 6502 Killer Hacks and share them with the rest of us!
.
.
.
Well here is the killer hack. This one is to scrimp on RAM.

Incrementing only the lower 4 bits of a byte (with wrap)
.
.
.
- David

Just joined these forums so sorry if I'm a little late to this party ;)

Here's a couple of my favorites

First the counter

eor something with itself and you get 0
eor something with 0 and you get itself

 lda counter
 inc counter
 eor counter
 and #$F0
 eor counter
 sta counter
Of course you can insert bits from one byte into another
byte (not just from a changed version of itself).
Used e.g. for setting pixels
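
The EOR/AND/EOR trick above can be written as a generic masked merge. A small Python sketch of the idea follows; `merge_bits` and `inc_low_nibble` are names invented here, and the loop just confirms that the counter sequence does what it says for all 256 values.

```python
def merge_bits(keep_from, take_from, mask):
    """Bits selected by `mask` come from keep_from, all others from take_from.
    Mirrors: lda keep_from / eor take_from / and #mask / eor take_from."""
    return ((keep_from ^ take_from) & mask) ^ take_from


def inc_low_nibble(counter):
    """Model of: lda counter / inc counter / eor counter / and #$F0 / eor counter."""
    old = counter                   # lda counter
    new = (counter + 1) & 0xFF      # inc counter
    return merge_bits(old, new, 0xF0)


for c in range(256):
    assert inc_low_nibble(c) == ((c & 0xF0) | ((c + 1) & 0x0F))
```

For the pixel case, `merge_bits(screen_byte, new_byte, mask)` keeps the masked bits of `screen_byte` and takes the rest from `new_byte`; those three argument names are placeholders, not anything from the posts.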

=========

Parity is just an xoring of bits

A simple sum is just an xoring of bits

0+0=0
0+1=1
1+0=1
1+1=0

Disregarding the carry obviously

Carry is a way of propagating bits across a byte (sort of)

   000a
  +0111
  =a???
We can combine the two to get parity and collect "bits" across a byte

;parity of A

 sta temp
 asl
 eor temp
 and #b10101010
 adc #b01100110
 and #b10001000
 adc #b01111000
;now the parity is in the sign bit
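
A quick way to convince yourself the parity sequence works is to model it step by step. The Python below is only a sketch of the routine above, assuming binary-mode ADC; the helper `adc` and the function name are invented for this check.

```python
def adc(a, operand, carry):
    """8-bit binary-mode ADC: returns (result, carry_out)."""
    total = a + operand + carry
    return total & 0xFF, total >> 8


def parity_in_sign_bit(value):
    """Returns bit 7 of A after the sequence (1 = odd number of set bits)."""
    temp = value                            # sta temp
    carry = value >> 7                      # asl: bit 7 goes to carry
    a = (value << 1) & 0xFF
    a ^= temp                               # eor temp: bits 1,3,5,7 = pair parities
    a &= 0b10101010                         # keep only the pair parities
    a, carry = adc(a, 0b01100110, carry)    # carries fold them into bits 3 and 7
    a &= 0b10001000
    a, carry = adc(a, 0b01111000, carry)    # carry folds bit 3 into bit 7
    return a >> 7


for v in range(256):
    assert parity_in_sign_bit(v) == bin(v).count("1") % 2
```

The carry left over from the ASL only ever lands in bit positions that are masked off afterwards, so no CLC is needed in this model.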

=========

Already posted this to a different thread

Rotate left by two bits (an 8-bit rotate, using the carry as scratch)

 asl
 adc #$80
 rol
Do it twice to swap nibbles
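
Here is a small Python model of the three-instruction rotate, as a sketch assuming binary-mode ADC (the name `rol2` is made up). It checks that one pass is an 8-bit rotate left by two and that two passes swap the nibbles.

```python
def rol2(value):
    """Model of: asl / adc #$80 / rol."""
    carry = value >> 7                   # asl: old bit 7 to carry
    a = (value << 1) & 0xFF
    total = a + 0x80 + carry             # adc #$80: bit 7 out to carry, old bit 7 into bit 0
    a, carry = total & 0xFF, total >> 8
    return ((a << 1) | carry) & 0xFF     # rol: carry into bit 0


for v in range(256):
    assert rol2(v) == (((v << 2) | (v >> 6)) & 0xFF)
    assert rol2(rol2(v)) == (((v << 4) | (v >> 4)) & 0xFF)
```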

============

Kernighan's method for counting set bits in a byte

This code lifted directly from dclxvi in the 6502.org
programming forum

http://forum.6502.or...highlight=#6993

   TAX 
   BEQ L2 
   LDX #0 
   SEC 
L1 INX 
   STA SCRATCH 
   SBC #1 
   AND SCRATCH 
   BNE L1 
   TXA 
L2 RTS
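
The same idea in Python, as a sketch (function name invented here): `a & (a - 1)` clears the lowest set bit, so the loop runs once per set bit rather than once per bit position.

```python
def popcount_kernighan(a):
    """Count set bits by repeatedly clearing the lowest one."""
    count = 0
    while a:
        scratch = a                 # STA SCRATCH
        a = (a - 1) & scratch       # SBC #1 (carry set) / AND SCRATCH
        count += 1                  # INX
    return count


for v in range(256):
    assert popcount_kernighan(v) == bin(v).count("1")
```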


#128 batari OFFLINE  

batari

    )66]U('=I;B$*

  • 6,237 posts
  • begin 644 contest

Posted Tue Feb 24, 2009 1:54 AM

bogax, on Tue Feb 24, 2009 2:00 AM, said:

Kernighan's method for counting set bits in a byte

This code lifted directly from dclxvi in the 6502.org
programming forum

http://forum.6502.or...highlight=#6993

   TAX 
   BEQ L2 
   LDX #0 
   SEC 
L1 INX 
   STA SCRATCH 
   SBC #1 
   AND SCRATCH 
   BNE L1 
   TXA 
L2 RTS
A simple shifting approach is more efficient in terms of size (and in many cases, cycles) than any other routine I saw on that thread.

They all start with the value passed in the accumulator and return the accumulator, so I'll do the same:
  sta temp
  lda #0
loop
  asl temp
  beq done
  adc #0
  bcc loop
done
  adc #0   ; add the last bit shifted out, which is still in the carry
I'm sure this can be improved somehow.

Edited by batari, Tue Feb 24, 2009 1:56 AM.


#129 Nukey Shay OFFLINE  

Nukey Shay

    Sheik Yerbouti

  • 20,458 posts
  • Location:The land of Gorch

Posted Tue Feb 24, 2009 2:42 AM

How about this?
A bit smaller using the X register to count:

;A=byte value
   ldx #-1
Bump_Count
   inx
Next_Bit
   lsr
   bcs Bump_Count
   bne Next_Bit
;X=number of set bits, A=0
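
To check the control flow of the LSR loop above, here is a rough Python model (a sketch only; `popcount_lsr` is an invented name). The nested loops mirror the two branch targets.

```python
def popcount_lsr(a):
    """Model of: ldx #-1 / inx / lsr / bcs Bump_Count / bne Next_Bit."""
    x = 0xFF                        # ldx #-1
    while True:
        x = (x + 1) & 0xFF          # Bump_Count: inx
        while True:
            carry = a & 1           # Next_Bit: lsr (bit 0 to carry, 0 into bit 7)
            a >>= 1
            if carry:               # bcs Bump_Count
                break
            if a == 0:              # bne Next_Bit not taken: finished
                return x


for v in range(256):
    assert popcount_lsr(v) == bin(v).count("1")
```

LSR always shifts a zero into bit 7 regardless of the incoming carry, which is why the model never needs to clear the carry first.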


#130 Nukey Shay OFFLINE  

Nukey Shay

    Sheik Yerbouti

  • 20,458 posts
  • Location:The land of Gorch

Posted Tue Feb 24, 2009 2:48 AM

:ponder:

Maybe precede the LSR with CLC, but it's still smaller.

#131 batari OFFLINE  

batari

    )66]U('=I;B$*

  • 6,237 posts
  • begin 644 contest

Posted Tue Feb 24, 2009 3:40 AM

Nukey Shay, on Tue Feb 24, 2009 3:48 AM, said:

:ponder:

Maybe precede the LSR with CLC, but it's still smaller.
Why would you need to do that?

But anyway, this further proves my point that the code linked in the thread above isn't ideal in terms of size and (with the probable exception of the 256-byte table) cycles.

#132 bogax OFFLINE  

bogax

    Star Raider

  • 61 posts

Posted Tue Feb 24, 2009 4:52 AM

batari, on Tue Feb 24, 2009 2:54 AM, said:

A simple shifting approach is more efficient in terms of size (and in many cases, cycles) than any other routine I saw on that thread.
Yes
I just think it's a clever hack (OK, so it's not a killer hack..)
I presume it was originally in C

#133 grafixbmp OFFLINE  

grafixbmp

    Dragonstomper

  • 659 posts
  • Location:South Central US

Posted Mon Mar 2, 2009 4:16 AM

Anyone have a slick hack for taking a byte and separating it into 5 bits on one side and the other 3 bits on the other?

I saw where some were talking about swapping nibbles. This is also useful for taking 11111111 and producing 00001111 and 11110000, with the latter shifted down to 00001111.

The 5/3 one would be like taking 11111111 and producing 00011111 and 00000111 out of it with minimal cycles used.

Just curious.

#134 Nukey Shay OFFLINE  

Nukey Shay

    Sheik Yerbouti

  • 20,458 posts
  • Location:The land of Gorch

Posted Mon Mar 2, 2009 7:45 AM

5 bits? Is this for audio frequency? If so, are you aware that the upper 3 bits are irrelevant for these registers? Likewise, the upper 4 bits are irrelevant for distortion and volume registers.

So you could use the original merged value to update frequency, then drop the upper 3 bits down (5xLSR) for one of the other registers.

#135 grafixbmp OFFLINE  

grafixbmp

    Dragonstomper

  • 659 posts
  • Location:South Central US

Posted Mon Mar 2, 2009 2:56 PM

Nukey Shay, on Mon Mar 2, 2009 7:45 AM, said:

5 bits? Is this for audio frequency? If so, are you aware that the upper 3 bits are irrelevant for these registers? Likewise, the upper 4 bits are irrelevant for distortion and volume registers.

So you could use the original merged value to update frequency, then drop the upper 3 bits down (5xLSR) for one of the other registers.
Yes, but I was more interested in getting those last 3 bits ready ASAP for audio control. The other byte is used for sustain and rest duration, i.e. how long the volume is held and how long it is off. I was going to organize those 3 bits to cover the most usable distortion settings on the audio control register.

Edited by grafixbmp, Mon Mar 2, 2009 2:58 PM.


#136 SpiceWare OFFLINE  

SpiceWare

    Quadrunner

  • 5,993 posts
  • Medieval Mayhem
  • Location:Planet Houston

Posted Mon Mar 2, 2009 2:58 PM

Nukey Shay, on Mon Mar 2, 2009 7:45 AM, said:

drop the upper 3 bits down (5xLSR) for one of the other registers.
4 ROLs

#137 Nukey Shay OFFLINE  

Nukey Shay

    Sheik Yerbouti

  • 20,458 posts
  • Location:The land of Gorch

Posted Mon Mar 2, 2009 3:02 PM

List 'em first. %sssssccc. It's only 2 cycles to AND off the upper bits...and by using LAX, you don't need to reload (the original value is still in X).

LAX tablevalue
AND #7
STA AUDCn
TXA
LSR
LSR
LSR
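
In plain terms, the %sssssccc layout above splits like this. A minimal Python sketch, where `split_5_3` is an invented name and the field order (5 upper bits, 3 lower bits) is taken from the post:

```python
def split_5_3(table_value):
    """Model of: LAX tablevalue / AND #7 / STA AUDCn / TXA / LSR / LSR / LSR."""
    x = table_value                 # LAX: same value in A and X
    control = table_value & 7       # AND #7: low 3 bits, stored to AUDCn
    upper = x >> 3                  # TXA + 3x LSR: upper 5 bits moved down
    return upper, control


for v in range(256):
    upper, control = split_5_3(v)
    assert (upper << 3) | control == v
    assert upper < 32 and control < 8
```

Since LSR shifts zeros in from the top, the upper field needs no extra AND after the three shifts.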

#138 grafixbmp OFFLINE  

grafixbmp

    Dragonstomper

  • 659 posts
  • Location:South Central US

Posted Mon Mar 2, 2009 3:50 PM

Nukey Shay, on Mon Mar 2, 2009 3:02 PM, said:

List 'em first. %sssssccc. It's only 2 cycles to AND off the upper bits...and by using LAX, you don't need to reload (the original value is still in X).

LAX tablevalue
AND #7
STA AUDCn
TXA
LSR
LSR
LSR

How quick then would it be to do the others from X? Remove the low 3 bits and shift down. Or somehow keep the carry at 0 while ROR 3 times

#139 fox OFFLINE  

fox

    Chopper Commander

  • 189 posts
  • Location:Poland

Posted Tue Mar 31, 2009 10:27 AM

batari, on Sun May 21, 2006 9:34 PM, said:

After research, I came up with something really short (17 bytes)
; Binary in A

   sed
   sta temp1
   lda #0
   ldx #8
loop
   asl temp1
   sta temp2
   adc temp2
   dex
   bne loop
   cld

; BCD in A
What's cool about this one is that it actually will do 8-bit binary -> 9-bit BCD, with the 9th bit contained in the carry! Can this be improved any more, though?

Faster, one byte shorter and not using X:
	sec
	rol	@
	sta	bin
	lda	#0
	sed
do_bit
	sta	bcd
	adc	bcd
	asl	bin
	bne	do_bit
	cld
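
Here is a Python sketch of the sentinel-bit loop above, for checking it by hand. It assumes decimal-mode ADC behaves as plain packed-BCD addition when both operands are valid BCD, which is the only case this loop produces; `adc_decimal` and `bin_to_bcd` are names invented for the model.

```python
def adc_decimal(a, operand, carry):
    """Decimal-mode ADC for valid packed-BCD operands: (result, carry_out)."""
    lo = (a & 0x0F) + (operand & 0x0F) + carry
    hi = (a >> 4) + (operand >> 4)
    if lo > 9:
        lo -= 10
        hi += 1
    carry_out = 0
    if hi > 9:
        hi -= 10
        carry_out = 1
    return (hi << 4) | lo, carry_out


def bin_to_bcd(value):
    """Model of: sec / rol @ / sta bin / lda #0 / sed / loop / cld."""
    carry = value >> 7                        # rol @: top bit to carry
    bin_ = ((value << 1) | 1) & 0xFF          # sec: sentinel 1 enters at bit 0
    a = 0                                     # lda #0
    while True:
        bcd = a                               # sta bcd
        a, _ = adc_decimal(a, bcd, carry)     # adc bcd: A = 2*A + carry in BCD
        carry = bin_ >> 7                     # asl bin: next bit to carry
        bin_ = (bin_ << 1) & 0xFF
        if bin_ == 0:                         # bne do_bit: sentinel shifted out
            return a


for v in range(100):
    assert bin_to_bcd(v) == (((v // 10) << 4) | (v % 10))
```

The sentinel bit replaces the X counter: the loop ends when it has been shifted out and `bin` is zero. In this model the carry left after the loop is that sentinel rather than a hundreds digit, so the check only covers 0 to 99.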


#140 vdub_bobby OFFLINE  

vdub_bobby

    Quadrunner

  • 5,831 posts
  • Boom bam.
  • Location:Seattle, WA

Posted Thu Mar 3, 2011 1:17 PM

Just saw this today:

Quote

Average of Integers
This is actually an extension of the "well known" fact that for binary integer values x and y, (x+y) equals ((x&y)+(x|y)) equals ((x^y)+2*(x&y)).

Given two integer values x and y, the (floor of the) average normally would be computed by (x+y)/2; unfortunately, this can yield incorrect results due to overflow. A very sneaky alternative is to use (x&y)+((x^y)/2). If we are aware of the potential non-portability due to the fact that C does not specify if shifts are signed, this can be simplified to (x&y)+((x^y)>>1). In either case, the benefit is that this code sequence cannot overflow.
http://aggregate.ee....age of Integers

In 6502 assembly:
lda a
and b
sta temp

lda a
eor b
lsr
clc
adc temp
Next question: extend to more than 2 integers, and is it possible to do without temp RAM?
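
The identity behind the routine above is easy to check exhaustively for bytes. A Python sketch (the function name is invented) that confirms both the result and that the final add never leaves 8 bits:

```python
def average_no_overflow(a, b):
    """Model of: lda a / and b / sta temp / lda a / eor b / lsr / clc / adc temp."""
    temp = a & b                    # bits set in both: each contributes in full
    half_diff = (a ^ b) >> 1        # bits set in only one: each contributes half
    return temp + half_diff         # clc / adc temp


for a in range(256):
    for b in range(256):
        result = average_no_overflow(a, b)
        assert result == (a + b) // 2
        assert result <= 0xFF       # the final ADC cannot carry out
```

The CLC matters on the real thing because LSR leaves the dropped bit in the carry.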

#141 Thomas Jentzsch OFFLINE  

Thomas Jentzsch

    Thrust, Jammed, SWOOPS!

  • 16,745 posts
  • Always left from right here!
  • Location:Düsseldorf, Germany

Posted Thu Mar 3, 2011 3:57 PM

Why not
  clc
  lda a 
  adc b
  ror
?
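
For comparison, a Python sketch of the ADC/ROR version (function name invented): the carry out of the add is the ninth bit of the sum, and ROR pulls it back into bit 7, so no temp RAM is needed.

```python
def average_adc_ror(a, b):
    """Model of: clc / lda a / adc b / ror."""
    total = a + b                           # clc / lda a / adc b
    acc, carry = total & 0xFF, total >> 8   # 9-bit sum split into A and carry
    return (carry << 7) | (acc >> 1)        # ror: carry into bit 7, A shifted right


for a in range(256):
    for b in range(256):
        assert average_adc_ror(a, b) == (a + b) // 2
```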

#142 djmips OFFLINE  

djmips

    Dragonstomper

  • 591 posts
  • scrolling
  • Location:Seattle

Posted Fri Mar 4, 2011 2:53 AM

bogax, on Tue Feb 24, 2009 1:00 AM, said:

djmips, on Sun Jun 12, 2005 3:20 AM, said:

As I was growing up, I kept a notebook full of cool code snippets and ideas. My notebook had been misplaced but I ran across it recently and here is one of the pages which is from a 1987 Dr. Dobbs article by Mark S. Ackerman. "6502 Killer Hacks".

Post your own 6502 Killer Hacks and share them with the rest of us!
.
.
.
Well here is the killer hack. This one is to scrimp on RAM.

Incrementing only the lower 4 bits of a byte (with wrap)
.
.
.
- David

Just joined these forums so sorry if I'm a little late to this party ;)

Here's a couple of my favorites

First the counter

eor something with itself and you get 0
eor something with 0 and you get itself

 lda counter
 inc counter
 eor counter
 and #$F0
 eor counter
 sta counter
Of course you can insert bits from one byte into another
byte (not just from a changed version of itself)
Used eg for setting pixels


=========

Haven't read this thread for a while (thanks to vdub for resurrecting it so I would actually see some of the cool additions).

This is more likely the original Ackerman 'hack' for incrementing only the low 4 bits of a byte without requiring any additional memory. I think the other 'bad' version must have been my own idle mind playing around with other ideas. Thanks bogax.

Edited by djmips, Fri Mar 4, 2011 2:54 AM.




