
Anyone think Ballblazer is possible on the 2600?



The most time-consuming part is the calculation of the 2d coordinates, which uses about 400 cycles.

Maybe this is a good candidate for speed optimisation?

Maybe, I have to study it a bit more. I'm already glad it works :)

 

;CALC_2d
;
;input:  OBJECT_X3d OBJECT_Y3d
;output: OBJECT_X2d OBJECT_Y2d OBJECT_SIZE

;First calc Y2d:
;Y2d = ((255 - (<Y3d >> 1)) << 8) >> (3 + >Y3d)
;
CALC_2d
LDA #0	  ;2
STA OBJECT_Y2d	;3

LDX OBJECT_Y3d + 1   ;3
CPX #11	  ;2
BCC NOT_TOO_FAR	;2/3

LDA #-5	  ;2 object is too far away.
STA OBJECT_Y2d + 1   ;3
LDA #34
STA OBJECT_SIZE
RTS	   ;6
NOT_TOO_FAR
LDA DIV_JUMP_TABLE_HIGH,X ;4
PHA	   ;3
LDA DIV_JUMP_TABLE_LOW,X ;4
PHA	   ;3
LDY #4	  ;OBJECT_SIZE

LDA OBJECT_Y3d	;3
LSR	   ;2
EOR #$FF	 ;2  a = 255 - (<Y3d >> 1)

RTS	   ;6 jump to DIV_...

DIV_FAR	   ;$0000...$0080
SEC
SBC SUBSTRACTION_TABLE,X
STA OBJECT_Y2d
LDA #0
STA OBJECT_Y2d + 1
LDA OBJECT_SIZES,X
STA OBJECT_SIZE
LDA CORRECTION_TABLE,X
STA CORRECTION

JMP CALC_X2d
DIV_5	   ;$0080...$0100
STA OBJECT_Y2d
LDA #0
STA OBJECT_Y2d + 1

LDA #5
STA OBJECT_SIZE
LDA #14
STA CORRECTION

JMP CALC_X2d

DIV_4
LSR	  ;2  128...255  >> 7  = $0100...$0200
ROR OBJECT_Y2d   ;5
DIV_3
LSR	  ;2  128...255  >> 6  = $0200...$0400
ROR OBJECT_Y2d   ;5
DIV_2
LSR	  ;2  128...255  >> 5  = $0400...$0800
ROR OBJECT_Y2d   ;5
DIV_1
LSR	  ;2  128...255  >> 4  = $0800...$1000
ROR OBJECT_Y2d   ;5
DIV_0
LSR	  ;2  128...255  >> 3  = $1000...$2000
ROR OBJECT_Y2d   ;5
LSR	  ;2  128...255  >> 2  = $2000...$4000
ROR OBJECT_Y2d   ;5
LSR	  ;2  $8000...$FF00  >> 1 = $4000...$8000
ROR OBJECT_Y2d   ;5
STA OBJECT_Y2d + 1  ;3

LDA OBJECT_Y2d
SEC
SBC #$80
LDA OBJECT_Y2d + 1
SBC #0
CLC
ADC #5
STA OBJECT_SIZE

LDA #14
STA CORRECTION

;
;X2d = X3d * (Y2d * 3 + correction)
;
CALC_X2d
LDX OBJECT_Y2d + 1  ;3
LDA OBJECT_Y2d   ;3
ASL	  ;2
BCC NO_ADD	;2/3
INX	  ;2
NO_ADD
CLC	  ;2
ADC OBJECT_Y2d   ;3
STA TEMP	;3 TEMP = <OBJECT_Y2d * 3

STA FAC_LOW_RESULT_LOW_PLUS	 ;set zp addresses
STA FAC_LOW_RESULT_HIGH_PLUS
EOR #$ff
STA FAC_LOW_RESULT_LOW_MINUS
STA FAC_LOW_RESULT_HIGH_MINUS

TXA	  ;2
ADC OBJECT_Y2d + 1  ;3
ADC OBJECT_Y2d + 1  ;3
ADC CORRECTION   ;3
STA TEMP + 1   ;3 TEMP + 1 = >OBJECT_Y2d * 3 + CORRECTION

STA FAC_HIGH_RESULT_LOW_PLUS	 ;set zp addresses
STA FAC_HIGH_RESULT_HIGH_PLUS
EOR #$ff
STA FAC_HIGH_RESULT_LOW_MINUS
STA FAC_HIGH_RESULT_HIGH_MINUS

;	   AB  (TEMP+1,TEMP)
;	   CD * (OBJECT_X3d+1, OBJECT_X3d)
;   ------
;	   HL    (B*D) (TEMP * OBJECT_X3d)
;	  HL	 (A*D) (TEMP + 1 * OBJECT_X3d)
;	  HL	 (B*C) (TEMP * OBJECT_X3d + 1)
;	 HL	  (A*C) (TEMP + 1 * OBJECT_X3d + 1)
;TEMP * OBJECT_X3d   = AAaa
;TEMP + 1 * OBJECT_X3d  = BBbb
;TEMP * OBJECT_X3d + 1  = CCcc
;TEMP + 1 * OBJECT_X3d + 1 = DDdd

; 
;    AAaa
;  BBbb
;  CCcc
;DDdd	    +
;   
;TEMP * OBJECT_X3d   = AAaa
LDY OBJECT_X3d
SEC
; LDA (FAC_LOW_RESULT_LOW_PLUS),y	;Lowest byte of result not needed
; SBC (FAC_LOW_RESULT_LOW_MINUS),y
; STA PRODUCT		
LDA (FAC_LOW_RESULT_HIGH_PLUS),y
SBC (FAC_LOW_RESULT_HIGH_MINUS),y
STA AA

;TEMP + 1 * OBJECT_X3d  = BBbb
SEC
LDA (FAC_HIGH_RESULT_LOW_PLUS),y
SBC (FAC_HIGH_RESULT_LOW_MINUS),y
STA bb
LDA (FAC_HIGH_RESULT_HIGH_PLUS),y
SBC (FAC_HIGH_RESULT_HIGH_MINUS),y
STA BB

LDY OBJECT_X3d + 1
;TEMP * OBJECT_X3d + 1  = CCcc
SEC
LDA (FAC_LOW_RESULT_LOW_PLUS),y
SBC (FAC_LOW_RESULT_LOW_MINUS),y
STA cc
LDA (FAC_LOW_RESULT_HIGH_PLUS),y
SBC (FAC_LOW_RESULT_HIGH_MINUS),y
STA CC

;TEMP + 1 * OBJECT_X3d + 1 = DDdd
SEC
LDA (FAC_HIGH_RESULT_LOW_PLUS),y
SBC (FAC_HIGH_RESULT_LOW_MINUS),y
STA dd
LDA (FAC_HIGH_RESULT_HIGH_PLUS),y
SBC (FAC_HIGH_RESULT_HIGH_MINUS),y
STA PRODUCT + 3

clc				   
lda AA
adc bb
sta PRODUCT+1
lda BB
adc CC
sta PRODUCT+2							 
bcc SKIP1
inc PRODUCT+3						 
clc								   
SKIP1
lda cc
adc PRODUCT+1							 
sta PRODUCT+1							 
lda dd
adc PRODUCT+2							 
sta PRODUCT+2
bcc SKIP2
inc PRODUCT+3						 
SKIP2
;Take care of signed OBJECT_X3d
LDA OBJECT_X3d + 1
bpl NOT_NEG
sec
lda PRODUCT+2
sbc TEMP+0
sta PRODUCT+2
lda PRODUCT+3
sbc TEMP+1
sta PRODUCT+3
NOT_NEG
CLC	   ;MAYBE THIS PART CAN BE REMOVED
LDA PRODUCT+1
ADC #$80
STA OBJECT_X2d

LDA PRODUCT+2
ADC #0
STA OBJECT_X2d + 1

LDA PRODUCT+3
ADC #0
STA OBJECT_X2d + 2

RTS

 

I already see I could replace "PRODUCT" (result of multiplication) with OBJECT_X2d.

I remember I was pretty exhausted when I finished this one...

 

FAC_LOW_RESULT_LOW_PLUS etc. are pointers into tables of squares. I got this from an online C64 magazine.
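As a rough illustration of how a table of squares can replace a multiply, here is a small Python model. It is not the 6502 routine: it assumes the tables hold quarter squares, floor(n*n/4), which is the usual form of this trick but is not stated in the post. The names `SQ4`, `mul8`, `mul16` and `mul16_signed_x` are made up for the sketch. The listing splits the table into low/high byte halves and uses EOR #$FF pointers for the "minus" lookup; the model skips those details.

```python
# Assumed table contents: quarter squares, SQ4[n] = floor(n*n / 4), n = 0..510.
SQ4 = [(n * n) // 4 for n in range(511)]


def mul8(a, b):
    """Unsigned 8x8 -> 16 bit product from two lookups and one subtraction.

    a*b = floor((a+b)^2 / 4) - floor((a-b)^2 / 4); this is exact because
    a+b and a-b are always both even or both odd.
    """
    return SQ4[a + b] - SQ4[abs(a - b)]


def mul16(factor, x):
    """Unsigned 16x16 -> 32 bit product from four 8x8 partial products."""
    a, b = factor >> 8, factor & 0xFF   # A = high byte, B = low byte
    c, d = x >> 8, x & 0xFF             # C = high byte, D = low byte
    return (mul8(b, d)
            + (mul8(a, d) << 8)
            + (mul8(b, c) << 8)
            + (mul8(a, c) << 16))


def mul16_signed_x(factor, x_raw):
    """Same product, with x_raw read as a two's-complement 16-bit value.

    If x is negative, the unsigned product is too large by factor * 65536,
    so the factor is subtracted from the upper 16 bits (as the listing
    does with TEMP and PRODUCT+2 / PRODUCT+3).
    """
    product = mul16(factor, x_raw)
    if x_raw & 0x8000:
        product -= factor << 16
    return product & 0xFFFFFFFF
```

For example, `mul16(0x0E00, 0x0100)` gives `0x0E0000`, and `mul16_signed_x(0x0E00, 0xFF00)` gives the 32-bit two's-complement form of -256 * 0x0E00.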


I'm not that good at explaining things but I'll try.

This routine calculates the screen coordinates (X2d & Y2d) and an OBJECT_SIZE for the 3d coordinates of an object (X3d Y3d). These coordinates are relative to the player. X3d=0 would indicate the object is exactly in front of or behind the player.

This is all very 'pseudo 3d'.

 

Calculation of y2d coordinate (this is the weirdest):

The y2d calculation only needs the y3d coordinate. The y3d coordinate is a 16-bit value. The y2d value is also 16-bit. The low byte is used for precision when calculating x2d. The high byte indicates the scanline (0 is the horizon, 23 is the last line of the checkerboard kernel).

It first takes the low byte of y3d, divides it by 2 and inverts it (LSR, then EOR #$FF), which gives 255 minus half the low byte.

In other words, the 0...$FF range becomes the $FF...$80 range.

An object at y3d = $0000 would be displayed at scanline $FF. This value is too big to display, so I always LSR it 3 times to make it smaller, and the range becomes $1F...$10 ($1F = 31; the checkerboard is 24 scanlines high).

For objects further away, it uses the high byte of y3d to LSR this value even more, so farther objects get a lower value.

Some examples of y3d to y2d coordinates:

y3d              y2d
$0000...$00FF    $1F00...$1000
$0100...$01FF    $0F00...$0800
$0200...$02FF    $0700...$0400
$0300...$03FF    $0300...$0200
$0400...$04FF    $0100...$0000

If an object is further away, it stops LSR'ing and just returns an OBJECT_SIZE value to indicate the object should be drawn smaller.
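The Y2d steps described above can be sketched as a small Python model (again, not the 6502 routine itself). It assumes that a y3d high byte of 0 to 4 selects DIV_0 to DIV_4, which is what the comments in the listing and the table above suggest. `y2d_near` and `scanline` are made-up names, and the far cases (DIV_5, DIV_FAR, "too far away") are left out.

```python
def y2d_near(y3d):
    """Model of the near-object Y2d path (y3d high byte 0..4).

    Assumption: high byte n adds n right shifts on top of the three
    that are always done (label DIV_n in the listing).
    """
    hi, lo = y3d >> 8, y3d & 0xFF
    if hi > 4:
        raise ValueError("far objects use DIV_5 / DIV_FAR, not modelled here")
    a = 0xFF - (lo >> 1)          # LSR, EOR #$FF: 0..$FF becomes $FF..$80
    return (a << 8) >> (3 + hi)   # A and OBJECT_Y2d shifted right as one 16-bit value


def scanline(y2d):
    """High byte of Y2d; 0 is the horizon."""
    return y2d >> 8
```

For example, `y2d_near(0x0000)` gives `0x1FE0` (scanline 31) and `y2d_near(0x00FF)` gives `0x1000` (scanline 16), which matches the first row of the table up to the rounding in the low byte.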

 

Calculation of x2d coordinate.

This is done by the formula X2d = X3d * (Y2d * 3 + correction)

It multiplies y2d by 3 because the diagonals of the checkerboard grow 3 pixels each scanline.

The correction is added because otherwise all values at scanline 0 would result in x2d = 0. The correction is the width of a tile at the horizon. For objects behind the horizon, this value becomes smaller.
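To make the formula a bit more concrete, here is a hedged Python model of the arithmetic as I read it from the listing. It assumes X3d is a signed 16-bit value with the tile number in the high byte, Y2d has the scanline in the high byte, and the correction (14 for near objects) is added to the high byte of Y2d * 3. The final `+ 0x80` mirrors the `ADC #$80` at the end of the routine, whose purpose is not explained in the post. The function name `x2d` is made up for the sketch.

```python
def x2d(x3d, y2d, correction=14):
    """Model of X2d = X3d * (Y2d * 3 + correction), result as a 24-bit value.

    x3d        signed, tile position in the high byte (assumption)
    y2d        unsigned, scanline in the high byte
    correction tile width in pixels at the horizon (14 for near objects)
    """
    factor = y2d * 3 + (correction << 8)    # correction lands in the high byte
    product = x3d * factor                  # Python ints are signed already
    # The listing drops the lowest product byte and adds $80 to the rest.
    return ((product >> 8) + 0x80) & 0xFFFFFF
```

With these assumptions, one tile (`x3d = 0x0100`) at the horizon comes out 14 pixels wide in the second byte, and one scanline lower it is 17 pixels wide, i.e. 3 pixels more per scanline: `x2d(0x0100, 0x0000) >> 8` is 14 and `x2d(0x0100, 0x0100) >> 8` is 17.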

Thanks for reading :)


In the formula :- X2d = X3d * (Y2d * 3 + correction)

 

Can "Y2d*3 + correction" not be simplified to the addition/subtraction of a 16 bit variable every time you update Y2d? That'd save the multiplication and add.

But in the routine above, y2d is updated (RORed) 7 times worst case.

 

I'm thinking of dropping the framerate to 30fps. So with the screen running at 60fps, one frame will be dedicated to game logic, the other frame will be dedicated to on screen calculations.


Full disclosure: I have no idea what I'm talking about.

 

That being said, instead of halving the framerate, how about one frame be a "guess" at the next rendering values needed thus saving time for logic? I bet the 2600 would be overburdened just coming up with the predicted values needed for the next frame though.. ugh.



 

I've now changed the game logic to this (you probably triggered the idea :) ):

 

even frames: calculate all game logic, acceleration, friction, collisions, add speed to current position of drone.

odd frames: skip game logic, only add speed to current position of drone. Calculate positions etc. of objects for display.

 

That way, the drone moves every 1/60s and you still get 60fps movement of the checkerboard. 30fps movement of the other objects.
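The even/odd split can be modelled in a few lines of Python. This is only a sketch of the scheduling idea: `run_frames` is a made-up name, and a single `speed += accel` stands in for all of the game logic (acceleration, friction, collisions).

```python
def run_frames(n_frames, pos=0, speed=0, accel=1):
    """Simulate n_frames and return a log of (frame, task, position)."""
    log = []
    for frame in range(n_frames):
        if frame % 2 == 0:
            speed += accel          # even frame: full game logic (stand-in)
            task = "logic"
        else:
            task = "display"        # odd frame: screen calculations instead
        pos += speed                # the position is advanced on every frame
        log.append((frame, task, pos))
    return log
```

The position changes on every frame even though the speed only changes on every other one, which is what keeps the movement at 60fps.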


I've now added 2 goalbeams. I optimised the gamelogic a bit (including more border-collision-detection now) so it fits in the overscan area.

The pseudo-3d calculations are not 'optimised' yet; optimisation made it look less smooth, so I want to hold off on that.

Sprites look a bit screwed up when moving too much to the left/right, that's because it takes a few days to correct it and I'm lazy.

 

All is still running at 60fps, about 1250 cycles left in screenblanking area.

ballblazer_20111224.bin



  • 2 weeks later...

Ah, VERY nice and smooth.

However, I would drop 60 FPS anytime if that allows for better progress/gameplay etc ;-)

But it appears to be very far already - splendid!

Best of luck with that gemstone of yours.

Really looking forward to it.

enthusi

Thanks!

 

- I've now dropped the sprites and the game-logic to 30fps. But I now interpolate the playfield so it still runs in that smooth 60Hz!

 

- I also updated the pseudo-3d routines. The movement of the horizontal lines of the checkerboard was too linear. I've corrected this with a table that makes the movement more parabolic. It now has more of the smoothness of the 7800 version.

 

- Rotofoil in the lower viewport is now displayed correctly.

 

- More goalbeams!

 

- I moved some of the sprite-position code to the vblank area. So less spare-cycles there, but more screen real estate.

 

- masking of rotofoil not 100% finished yet...


ballblazer_20120110.bin


Let me/us know when you need tests or bug reports.

From my own experience, they'd sure be annoying right now :)

At this moment, they aren't really needed because I know there are a lot of bugs in it :D Reports/critique about gameplay/speed/etc. are welcome.

 

I now want to put the 'plasmorb' in it. I considered using the Playfield registers but that's probably way too time-consuming. At this moment, I have 30 (possibly more) spare cycles left in the sky-kernel, where the ball goes, so I'll probably use the ball sprite for the plasmorb. It will be a bit smaller than the plasmorb in the original ballblazer, unless I use flickering, but it's nicer not to have flicker...


Though the plasma(!) orb could do with some flicker, I'd also guess that a smaller ball would be nice. Atari gfx is rather clean and well colored, so the ball would not easily be missed during gameplay.

A certain flicker might give a nice effect, however. Maybe some slight horizontal jitter left/right with huge overlap at 60 Hz?


  • 3 months later...
  • 7 months later...
  • 1 month later...

Sorry, not much of an update at the moment. I'm trying to pick up the project again and wrap my head around it.

 

Last thing I was working on is a sort of time management system. Sounds fancy, but it works more or less like this:

 

Frame 1: Calculate game logic (process speed/friction/collisions etc.)

Frame 2: Calculate Screen positions of rotofoil 1/2

Frame 3: Calculate Screen positions of goalbeams

Frame 4: Calculate Screen positions of plasmorb

 

So everything is updated every 4 frames, which will be choppy, so I'm now experimenting with delta values. When a rotofoil is at position 0 (which is calculated at frame 2) and the next time (at frame 6) it is at position 40, I calculate a delta value (pos2 - pos1)/4 = (40-0)/4 = 10. Then at every frame I can interpolate the on-screen position by adding 10. Of course, this consumes a lot of memory since I need precise values (at least 2 bytes for every 2 rotofoils, 4 goalbeams, 2 plasmorbs, 2 playfields x 2 axes = 48 bytes)...
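The slot scheme and the delta idea can be sketched in Python like this. It is a model under assumptions, not the actual code: frames are counted from 0 here (so slot 0 is the game logic), positions are kept in 8.8 fixed point to match the "2 bytes per value" remark, and `SLOTS`, `slot_for`, `make_delta` and `interpolate` are made-up names.

```python
SLOTS = ("game logic", "rotofoils", "goalbeams", "plasmorb")


def slot_for(frame):
    """Which job runs on a given frame (0-based, repeating every 4 frames)."""
    return SLOTS[frame % 4]


def make_delta(old_pos, new_pos, frames=4):
    """Per-frame step in 8.8 fixed point between two calculated positions."""
    return ((new_pos - old_pos) << 8) // frames


def interpolate(old_pos, delta, steps):
    """Whole-pixel positions shown on each of the next `steps` frames."""
    acc = old_pos << 8              # 16-bit accumulator: pixel.fraction
    shown = []
    for _ in range(steps):
        shown.append(acc >> 8)      # only the high byte goes to the screen
        acc += delta
    return shown
```

With the numbers from the post, `make_delta(0, 40)` is 2560 (10.0 in 8.8) and `interpolate(0, 2560, 5)` gives `[0, 10, 20, 30, 40]`. The fractional byte matters when the step is not a whole number: going from 0 to 10 gives `[0, 2, 5, 7, 10]`.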

 

Also, the game logic has to fit in one frame. It will mostly consist of collision detection: rotofoils vs. walls, and rotofoils vs. each other. It is possible that, in one frame, rotofoil a collides with rotofoil b, rotofoil b hits a wall, bounces back and hits rotofoil a again. This has to be checked for both axes.

 

 

Of course, this is all a top-down approach, which isn't exactly considered best practice. In real life (when creating web applications, which I do for a living) I would create the logic together with a simplistic view (GUI), make sure the application does what it needs to do, and make it pretty afterwards.

 

 

So if I would make ballblazer this way, I would create a simple 2d playfield by just using the playfield pixels, and use square dots for rotofoils. And afterwards, make it pretty and try to do it in 3d...
