Anyone think Ballblazer is possible on the 2600?


772 replies to this topic

#751 GroovyBee OFFLINE  

GroovyBee

    7800 Developer

  • 5,779 posts
  • Busy bee!
  • Location:North, England

Posted Wed Dec 14, 2011 4:21 AM

roland p, on Wed Dec 14, 2011 3:59 AM, said:

The most time consuming part is the calculation of the 2d coordinates. That uses about 400 cycles.
Maybe this is a good candidate for speed optimisation?

#752 roland p ONLINE  

roland p

    Stargunner

  • 1,413 posts
  • RLA
  • Location:The Netherlands

Posted Wed Dec 14, 2011 4:47 AM

GroovyBee, on Wed Dec 14, 2011 4:21 AM, said:

roland p, on Wed Dec 14, 2011 3:59 AM, said:

The most time consuming part is the calculation of the 2d coordinates. That uses about 400 cycles.
Maybe this is a good candidate for speed optimisation?
Maybe, I have to study it a bit more. I'm already glad it works :)

 ;CALC_2d
 ;
 ;input:  OBJECT_X3d OBJECT_Y3d
 ;output: OBJECT_X2d OBJECT_Y2d OBJECT_SIZE
 
 ;First calc Y2d:
 ;Y2d = (128 + <Y3d >> 1) >> >Y3d
 ;
CALC_2d
 LDA #0	  ;2
 STA OBJECT_Y2d	;3
 
 LDX OBJECT_Y3d + 1   ;3
 CPX #11	  ;2
 BCC NOT_TOO_FAR	;2/3
 
 LDA #-5	  ;2 object is too far away.
 STA OBJECT_Y2d + 1   ;3
 LDA #34
 STA OBJECT_SIZE
 RTS	   ;6
NOT_TOO_FAR
 LDA DIV_JUMP_TABLE_HIGH,X ;4
 PHA	   ;3
 LDA DIV_JUMP_TABLE_LOW,X ;4
 PHA	   ;3
 LDY #4	  ;OBJECT_SIZE
 
 LDA OBJECT_Y3d	;3
 LSR	   ;2
 EOR #$FF	 ;2  a = 255 - (<Y3d >> 1)
 
 RTS	   ;6 jump to DIV_...
 
DIV_FAR	   ;$0000...$0080
 SEC
 SBC SUBSTRACTION_TABLE,X
 STA OBJECT_Y2d
 LDA #0
 STA OBJECT_Y2d + 1
 LDA OBJECT_SIZES,X
 STA OBJECT_SIZE
 LDA CORRECTION_TABLE,X
 STA CORRECTION
 
 JMP CALC_X2d
DIV_5	   ;$0080...$0100
 STA OBJECT_Y2d
 LDA #0
 STA OBJECT_Y2d + 1
 
 LDA #5
 STA OBJECT_SIZE
 LDA #14
 STA CORRECTION
 
 JMP CALC_X2d
 
DIV_4
 LSR	  ;2  128...255  >> 7  = $0100...$0200
 ROR OBJECT_Y2d   ;5
DIV_3
 LSR	  ;2  128...255  >> 6  = $0200...$0400
 ROR OBJECT_Y2d   ;5
DIV_2
 LSR	  ;2  128...255  >> 5  = $0400...$0800
 ROR OBJECT_Y2d   ;5
DIV_1
 LSR	  ;2  128...255  >> 4  = $0800...$1000
 ROR OBJECT_Y2d   ;5
DIV_0
 LSR	  ;2  128...255  >> 3  = $1000...$2000
 ROR OBJECT_Y2d   ;5
 LSR	  ;2  128...255  >> 2  = $2000...$4000
 ROR OBJECT_Y2d   ;5
 LSR	  ;2  $8000...$FF00  >> 1 = $4000...$8000
 ROR OBJECT_Y2d   ;5
 STA OBJECT_Y2d + 1  ;3
 
 LDA OBJECT_Y2d
 SEC
 SBC #$80
 LDA OBJECT_Y2d + 1
 SBC #0
 CLC
 ADC #5
 STA OBJECT_SIZE
 
 LDA #14
 STA CORRECTION
 
 ;
 ;X2d = X3d * (Y2d * 3 + correction)
 ;
CALC_X2d
 LDX OBJECT_Y2d + 1  ;3
 LDA OBJECT_Y2d   ;3
 ASL	  ;2
 BCC NO_ADD	;2/3
 INX	  ;2
NO_ADD
 CLC	  ;2
 ADC OBJECT_Y2d   ;3
 STA TEMP	;3 TEMP = <OBJECT_Y2d * 3
 
 STA FAC_LOW_RESULT_LOW_PLUS	 ;set zp addresses
 STA FAC_LOW_RESULT_HIGH_PLUS
 EOR #$ff
 STA FAC_LOW_RESULT_LOW_MINUS
 STA FAC_LOW_RESULT_HIGH_MINUS
 
 TXA	  ;2
 ADC OBJECT_Y2d + 1  ;3
 ADC OBJECT_Y2d + 1  ;3
 ADC CORRECTION   ;3
 STA TEMP + 1   ;3 TEMP + 1 = >OBJECT_Y2d * 3 + CORRECTION
 
 STA FAC_HIGH_RESULT_LOW_PLUS	 ;set zp addresses
 STA FAC_HIGH_RESULT_HIGH_PLUS
 EOR #$ff
 STA FAC_HIGH_RESULT_LOW_MINUS
 STA FAC_HIGH_RESULT_HIGH_MINUS
 
 ;	   AB  (TEMP+1,TEMP)
 ;	   CD * (OBJECT_X3d+1, OBJECT_X3d)
 ;   ------
 ;	   HL    (B*D) (TEMP * OBJECT_X3d)
 ;	  HL	 (A*D) (TEMP + 1 * OBJECT_X3d)
 ;	  HL	 (B*C) (TEMP * OBJECT_X3d + 1)
 ;	 HL	  (A*C) (TEMP + 1 * OBJECT_X3d + 1)
 ;TEMP * OBJECT_X3d   = AAaa
 ;TEMP + 1 * OBJECT_X3d  = BBbb
 ;TEMP * OBJECT_X3d + 1  = CCcc
 ;TEMP + 1 * OBJECT_X3d + 1 = DDdd
 
 ; 
 ;    AAaa
 ;  BBbb
 ;  CCcc
 ;DDdd	    +
 ;   
 ;TEMP * OBJECT_X3d   = AAaa
 LDY OBJECT_X3d
 SEC
; LDA (FAC_LOW_RESULT_LOW_PLUS),y	;Lowest byte of result not needed
; SBC (FAC_LOW_RESULT_LOW_MINUS),y
; STA PRODUCT		
 LDA (FAC_LOW_RESULT_HIGH_PLUS),y
 SBC (FAC_LOW_RESULT_HIGH_MINUS),y
 STA AA
 
 ;TEMP + 1 * OBJECT_X3d  = BBbb
 SEC
 LDA (FAC_HIGH_RESULT_LOW_PLUS),y
 SBC (FAC_HIGH_RESULT_LOW_MINUS),y
 STA bb
 LDA (FAC_HIGH_RESULT_HIGH_PLUS),y
 SBC (FAC_HIGH_RESULT_HIGH_MINUS),y
 STA BB
 
 LDY OBJECT_X3d + 1
 ;TEMP * OBJECT_X3d + 1  = CCcc
 SEC
 LDA (FAC_LOW_RESULT_LOW_PLUS),y
 SBC (FAC_LOW_RESULT_LOW_MINUS),y
 STA cc
 LDA (FAC_LOW_RESULT_HIGH_PLUS),y
 SBC (FAC_LOW_RESULT_HIGH_MINUS),y
 STA CC
 
 ;TEMP + 1 * OBJECT_X3d + 1 = DDdd
 SEC
 LDA (FAC_HIGH_RESULT_LOW_PLUS),y
 SBC (FAC_HIGH_RESULT_LOW_MINUS),y
 STA dd
 LDA (FAC_HIGH_RESULT_HIGH_PLUS),y
 SBC (FAC_HIGH_RESULT_HIGH_MINUS),y
 STA PRODUCT + 3
 
 clc				   
 lda AA
 adc bb
 sta PRODUCT+1
 lda BB
 adc CC
 sta PRODUCT+2							 
 bcc SKIP1
 inc PRODUCT+3						 
 clc								   
SKIP1
 lda cc
 adc PRODUCT+1							 
 sta PRODUCT+1							 
 lda dd
 adc PRODUCT+2							 
 sta PRODUCT+2
 bcc SKIP2
 inc PRODUCT+3						 
SKIP2
 ;Take care of signed OBJECT_X3d
 LDA OBJECT_X3d + 1
 bpl NOT_NEG
 sec
 lda PRODUCT+2
 sbc TEMP+0
 sta PRODUCT+2
 lda PRODUCT+3
 sbc TEMP+1
 sta PRODUCT+3
NOT_NEG
 CLC	   ;MAYBE THIS PART CAN BE REMOVED
 LDA PRODUCT+1
 ADC #$80
 STA OBJECT_X2d
 
 LDA PRODUCT+2
 ADC #0
 STA OBJECT_X2d + 1
 
 LDA PRODUCT+3
 ADC #0
 STA OBJECT_X2d + 2
 
 RTS
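
For readers who don't read 6502 fluently, here is a rough Python model of the multiplication scheme in the listing above: four 8x8 partial products summed at byte offsets, followed by the fix-up for a negative OBJECT_X3d. It is only a sketch. The function names are made up for illustration, and unlike the listing it keeps the lowest product byte instead of dropping it.

```python
def mul16_partial(t, x):
    """Multiply unsigned 16-bit t (TEMP) by signed 16-bit x (OBJECT_X3d).

    Names are hypothetical; this models the AB * CD scheme, not the exact
    cycle-optimised routine.
    """
    assert 0 <= t <= 0xFFFF and -0x8000 <= x <= 0x7FFF
    xu = x & 0xFFFF                # two's-complement bit pattern of x
    a, b = t >> 8, t & 0xFF        # A = high byte of t, B = low byte
    c, d = xu >> 8, xu & 0xFF      # C = high byte of x, D = low byte
    # Four 8x8 products, each shifted to its byte position.
    product = (b * d) + ((a * d) << 8) + ((b * c) << 8) + ((a * c) << 16)
    if x < 0:
        # Reading x as unsigned added t * 65536 too much, so subtract t
        # from the upper 16 bits (the SBC TEMP+0 / SBC TEMP+1 step).
        product -= t << 16
    return product & 0xFFFFFFFF    # 32-bit two's-complement result


def to_signed32(v):
    """Interpret a 32-bit pattern as a signed integer."""
    return v - (1 << 32) if v & 0x80000000 else v
```

For example, `to_signed32(mul16_partial(300, -1234))` gives the same value as `300 * -1234`.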

I already see I could replace "PRODUCT" (result of multiplication) with OBJECT_X2d.
I remember I was pretty exhausted when I finished this one...

FAC_LOW_RESULT_LOW_PLUS etc. are pointers to tables with squares. I got this from an online C64 magazine.
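
The post does not say which square-table method is used. The PLUS/MINUS pointer pairs look like the well-known quarter-square trick, so the sketch below assumes that: a * b = floor((a+b)^2 / 4) - floor((a-b)^2 / 4). The table layout and names here are illustrative and not taken from the actual ROM.

```python
# Table of floor(n*n / 4) for n = 0..510, enough for the sum of two bytes.
# (Assumed layout; the real routine splits this into low/high byte tables.)
QUARTER_SQUARES = [(n * n) // 4 for n in range(511)]


def mul8_quarter_square(a, b):
    """8-bit x 8-bit unsigned multiply using two table lookups and one subtract."""
    assert 0 <= a <= 255 and 0 <= b <= 255
    # a+b and a-b always have the same parity, so the two floor() errors cancel.
    return QUARTER_SQUARES[a + b] - QUARTER_SQUARES[abs(a - b)]
```

On the 6502 the two lookups become indexed loads through the zero-page pointers, which is why a multiply costs only a handful of LDA/SBC instructions.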

Edited by roland p, Wed Dec 14, 2011 4:48 AM.


#753 GroovyBee OFFLINE  

GroovyBee

    7800 Developer

  • 5,779 posts
  • Busy bee!
  • Location:North, England

Posted Thu Dec 15, 2011 4:55 AM

Could you provide the theory behind the code?

#754 roland p ONLINE  

roland p

    Stargunner

  • 1,413 posts
  • RLA
  • Location:The Netherlands

Posted Thu Dec 15, 2011 6:54 AM

I'm not that good at explaining things, but I'll try.
This routine calculates the screen coordinates (X2d and Y2d) and an OBJECT_SIZE from the 3d coordinates of an object (X3d, Y3d). These coordinates are relative to the player. X3d = 0 means the object is exactly in front of or behind the player.
This is all very 'pseudo 3d'.

Calculation of the y2d coordinate (this is the weirdest part):
The y2d calculation only needs the y3d coordinate. The y3d coordinate is a 16-bit value. The y2d value is also 16-bit. The low byte is used for precision when calculating x2d. The high byte indicates the scanline (0 is the horizon, 23 is the last line of the checkerboard kernel).
It first takes the low byte of y3d, negates it, divides it by 2 and adds $80 to it (in the code: LSR, then EOR #$FF).
In other words, the 0...$FF range becomes the $FF...$80 range.
An object at y3d = $0000 would be displayed at scanline $FF. This value is too big to display, so I always LSR it 3 times to make it smaller. The range then becomes $1F...$10 ($1F = 31, the checkerboard is 24 scanlines high).
For objects further away, it uses the high byte of y3d to LSR this value even more, so farther objects get a lower value.
Some examples of y3d to y2d coordinates:
y3d            y2d
0000...00FF    $1F00...$1000
0100...01FF    $0F00...$0800
0200...02FF    $0700...$0400
0300...03FF    $0300...$0200
0400...04FF    $0100...$0000
If an object is further away, it stops LSR'ing and just returns an OBJECT_SIZE value to indicate the object should be drawn smaller.
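
The y2d steps described above can be modelled in a few lines of Python. This is a sketch only: it covers y3d high bytes 0 to 4, and it assumes the jump table maps high byte n to n extra shifts (the table itself is not shown in the thread). The function name is made up.

```python
def y2d_near(y3d):
    """Model of the y2d calculation for y3d in $0000...$04FF (assumed range)."""
    hi, lo = y3d >> 8, y3d & 0xFF
    assert 0 <= hi <= 4, "farther objects go through the table-driven paths"
    a = 0xFF - (lo >> 1)          # LSR, then EOR #$FF: $00..$FF becomes $FF..$80
    # Treat a as the high byte of a 16-bit value and shift right:
    # always 3 times, plus once more per step of distance.
    return (a << 8) >> (3 + hi)


def scanline(y2d):
    """The high byte of y2d is the scanline below the horizon."""
    return y2d >> 8
```

The model keeps the exact low byte, so `y2d_near(0x0000)` is $1FE0 where the table above rounds to $1F00. For the 0400...04FF row it gives $01FE...$0100, which follows the halving pattern of the rows before it.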

Calculation of the x2d coordinate:
This is done by the formula X2d = X3d * (Y2d * 3 + correction).
It multiplies y2d by 3 because the diagonals of the checkerboard grow 3 pixels each scanline.
The correction is added because otherwise all values at scanline 0 would result in x2d = 0. The correction is the width of a tile at the horizon. For objects behind the horizon, this value becomes smaller.
Thanks for reading :)
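
A small Python sketch of the x2d formula, to show what the correction term does. It assumes, as in the listing, that the correction is added to the high byte of the 16-bit Y2d * 3 value, and it uses 14 as the default because that is the value the near paths store in CORRECTION. The function names are made up, and the result is left as a raw product without the byte dropping and rounding the real routine does.

```python
def x2d_scale(y2d, correction):
    """Per-scanline scale factor: Y2d * 3 + correction (correction in the high byte)."""
    return y2d * 3 + (correction << 8)


def project_x(x3d, y2d, correction=14):
    """X2d = X3d * (Y2d * 3 + correction), as a raw fixed-point product."""
    return x3d * x2d_scale(y2d, correction)
```

With `correction=0`, every object at scanline 0 collapses to x2d = 0 (`project_x(100, 0, correction=0)` is 0), while with the correction the same object still gets a horizontal offset.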

#755 GroovyBee OFFLINE  

GroovyBee

    7800 Developer

  • 5,779 posts
  • Busy bee!
  • Location:North, England

Posted Thu Dec 15, 2011 7:00 AM

Looks like you could get some of this into tables if you have the ROM space.

#756 roland p ONLINE  

roland p

    Stargunner

  • 1,413 posts
  • RLA
  • Location:The Netherlands

Posted Thu Dec 15, 2011 7:14 AM

GroovyBee, on Thu Dec 15, 2011 7:00 AM, said:

Looks like you could get some of this into tables if you have the ROM space.
It uses 2kB for the multiplication.
I already reduced the precision of the multiplication so it's already getting faster :)
I hope the lsr/ror'ing can be optimised some more.

#757 GroovyBee OFFLINE  

GroovyBee

    7800 Developer

  • 5,779 posts
  • Busy bee!
  • Location:North, England

Posted Thu Dec 15, 2011 7:46 AM

In the formula :- X2d = X3d * (Y2d * 3 + correction)

Can "Y2d*3 + correction" not be simplified to the addition/subtraction of a 16 bit variable every time you update Y2d? That'd save the multiplication and add.
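
As a sketch of what I mean (Python for readability, names made up, and assuming Y2d only ever changes by a known delta and the correction stays fixed): keep a running copy of the scale and adjust it by three times the delta, instead of recomputing it from Y2d each time.

```python
class ScaleTracker:
    """Keeps scale == y2d * 3 + correction without multiplying on every update."""

    def __init__(self, y2d, correction):
        self.y2d = y2d
        self.scale = y2d * 3 + correction   # computed once up front

    def move(self, delta):
        # Update both values with additions only.
        self.y2d += delta
        self.scale += 3 * delta
```

This only pays off if Y2d is updated incrementally; if it is recomputed from scratch each frame there is no delta to work with.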

#758 roland p ONLINE  

roland p

    Stargunner

  • 1,413 posts
  • RLA
  • Location:The Netherlands

Posted Sun Dec 18, 2011 4:20 PM

GroovyBee, on Thu Dec 15, 2011 7:46 AM, said:

In the formula :- X2d = X3d * (Y2d * 3 + correction)

Can "Y2d*3 + correction" not be simplified to the addition/subtraction of a 16 bit variable every time you update Y2d? That'd save the multiplication and add.
But in the routine above, y2d is updated (RORed) 7 times worst case.

I'm thinking of dropping the framerate to 30fps. So with the screen running at 60fps, one frame will be dedicated to game logic, the other frame will be dedicated to on screen calculations.

#759 theloon OFFLINE  

theloon

    Stargunner

  • 1,012 posts

Posted Mon Dec 19, 2011 3:29 PM

Full disclosure: I have no idea what I'm talking about.

That being said, instead of halving the framerate, how about one frame be a "guess" at the next rendering values needed thus saving time for logic? I bet the 2600 would be overburdened just coming up with the predicted values needed for the next frame though.. ugh.

#760 roland p ONLINE  

roland p

    Stargunner

  • 1,413 posts
  • RLA
  • Location:The Netherlands

Posted Tue Dec 20, 2011 2:01 PM

theloon, on Mon Dec 19, 2011 3:29 PM, said:

Full disclosure: I have no idea what I'm talking about.

That being said, instead of halving the framerate, how about one frame be a "guess" at the next rendering values needed thus saving time for logic? I bet the 2600 would be overburdened just coming up with the predicted values needed for the next frame though.. ugh.

I've now changed the game logic to the following (you probably triggered the idea :) ):

even frames: calculate all game logic, acceleration, friction, collisions, add speed to current position of drone.
odd frames: skip game logic, only add speed to current position of drone. Calculate positions etc. of objects for display.

That way, the drone moves every 1/60s and you still get 60fps movement of the checkerboard. 30fps movement of the other objects.
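
Roughly, in Python-style pseudocode that actually runs (the function name and the log format are made up for illustration, and the real work per phase is left out):

```python
def run_frames(frame_count, speed):
    """Alternate logic and display work per frame; move the drone every frame."""
    position = 0
    log = []
    for frame in range(frame_count):
        if frame % 2 == 0:
            phase = "logic"      # acceleration, friction, collisions
        else:
            phase = "display"    # positions etc. of objects for display
        position += speed        # drone position is updated on every frame
        log.append((frame, phase, position))
    return log
```

So the expensive work alternates, but the position (and with it the checkerboard) still changes on all 60 frames per second.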

Edited by roland p, Tue Dec 20, 2011 2:02 PM.


#761 roland p ONLINE  

roland p

    Stargunner

  • 1,413 posts
  • RLA
  • Location:The Netherlands

Posted Sat Dec 24, 2011 2:19 AM

I've now added 2 goalbeams. I optimised the gamelogic a bit (including more border-collision-detection now) so it fits in the overscan area.
The pseudo-3d-calculations are not 'optimised' now, optimisation made it look less smooth so I want to wait with that.
Sprites look a bit screwed up when moving too much to the left/right, that's because it takes a few days to correct it and I'm lazy.

All is still running at 60fps, about 1250 cycles left in screenblanking area.

Attached Thumbnails

  • ballblazer_20111224.bin.png


Edited by roland p, Sat Dec 24, 2011 3:08 AM.


#762 Stephen OFFLINE  

Stephen

    River Patroller

  • 3,234 posts
  • A8 Gear Head
  • Location:Akron, Ohio

Posted Sat Dec 24, 2011 8:38 PM

Still looking great! I hope you can keep the 60Hz screen, but if not, totally understandable. It's hard to believe this is running on a stock 2600!

#763 enthusi OFFLINE  

enthusi

    Space Invader

  • 20 posts

Posted Thu Jan 5, 2012 10:10 AM

Ah, VERY nice and smooth.
However, I would drop 60 FPS anytime if that allows for better progress/gameplay etc ;-)
But it appears to be very far already - splendid!
Best of luck with that gemstone of yours.
Really looking forward to it.
enthusi

#764 roland p ONLINE  

roland p

    Stargunner

  • 1,413 posts
  • RLA
  • Location:The Netherlands

Posted Tue Jan 10, 2012 2:48 PM

enthusi, on Thu Jan 5, 2012 10:10 AM, said:

Ah, VERY nice and smooth.
However, I would drop 60 FPS anytime if that allows for better progress/gameplay etc ;-)
But it appears to be very far already - splendid!
Best of luck with that gemstone of yours.
Really looking forward to it.
enthusi
Thanks!

- I've now dropped the sprites and the game-logic to 30fps. But I now interpolate the playfield so it still runs in that smooth 60Hz!

- I also updated the pseudo-3d routines. The horizontal lines of the checkerboard moved too linearly. I've corrected this with a table that makes the movement more parabolic. It now has more of the smoothness of the 7800 version.

- Rotofoil in the lower viewport is now displayed correctly.

- More goalbeams!

- I moved some of the sprite-position code to the vblank area. So less spare-cycles there, but more screen real estate.

- masking of rotofoil not 100% finished yet...

Attached Thumbnails

  • rom.bin_9.png



#765 enthusi OFFLINE  

enthusi

    Space Invader

  • 20 posts

Posted Wed Jan 11, 2012 3:07 AM

Weee ;-)
Keep it coming hehe.
Let me/us know when you can need tests or bug-reports.
From own experience they'd sure be annoying currently :)
Cheers,
enthusi

#766 roland p ONLINE  

roland p

    Stargunner

  • 1,413 posts
  • RLA
  • Location:The Netherlands

Posted Wed Jan 11, 2012 1:34 PM

enthusi, on Wed Jan 11, 2012 3:07 AM, said:

Let me/us know when you can need tests or bug-reports.
From own experience they'd sure be annoying currently :)
At this moment, they aren't really needed because I know there are a lot of bugs in it :D Reports/critique about gameplay/speed/etc. is welcome.

I now want to put the 'plasmorb' in it. I considered using the Playfield registers but that's probably way too time consuming. At this moment, I have 30 (possibly more) spare cycles left in the sky-kernel, where the ball goes. So I'll probably use the ball sprite for the plasmorb. It will be a bit smaller than the plasmorb in the original Ballblazer, unless I use flickering, but it's nicer not to have flicker...

#767 enthusi OFFLINE  

enthusi

    Space Invader

  • 20 posts

Posted Wed Jan 11, 2012 1:41 PM

Though the plasma(!) orb could do with some flicker, I'd guess as well that a smaller ball would be nice. Atari-gfx is rather clean and well colored, so the ball would not easily be missed during gameplay.
A certain flicker might give a nice effect, however. Maybe some slight horizontal jitter left/right with huge overlap at 60 Hz?

#768 Ed Fries OFFLINE  

Ed Fries

    Star Raider

  • 52 posts
  • Location:Seattle, WA

Posted Wed Jan 11, 2012 5:28 PM

This is looking amazing. Please keep working on it!

#769 wvoutlaw2k OFFLINE  

wvoutlaw2k

    Combat Commando

  • 6 posts

Posted Wed Apr 18, 2012 11:30 PM

Wow!

I just tried this in Stella on my MacBook. Looks great! Plus, it seems like it could be played with the Trak-Ball.

#770 Godzilla OFFLINE  

Godzilla

    Quadrunner

  • 6,394 posts
  • Location:Jacksonville, Fl

Posted Thu Apr 19, 2012 5:54 PM

i can't wait to see where this goes, it really is impressive imho

#771 roland p ONLINE  

roland p

    Stargunner

  • 1,413 posts
  • RLA
  • Location:The Netherlands

Posted Fri Apr 20, 2012 2:29 AM

Thanks for the comments! I'll pick up the project again soon; I took a sort of break.

#772 RevEng OFFLINE  

RevEng

    River Patroller

  • 2,010 posts
  • bit shoveler
  • Location:Canada

Posted Fri Apr 20, 2012 5:43 AM

Take the break you need and then a little more; coding burnout isn't pretty, and we can be patient.

But please do eventually return, because this is a masterpiece and it would be a shame for it to not be finished.

#773 Keatah OFFLINE  

Keatah

    River Patroller

  • 3,448 posts

Posted Fri Apr 20, 2012 5:47 AM

Agreed! Take the summer off (or winter), and come back fresh with new optimization ideas.



