One last question for a while - Timing


9 replies to this topic

#1 jbs30000

    Moonsweeper

  • 459 posts

Posted Sun Dec 4, 2011 9:10 PM

Sorry to start three threads in such a short time, but once I get help figuring this out, I'll be good for a while :).

OK, for each scanline there are 228 TIA clock cycles or 76 CPU cycles.

According to this Website, lda zero page,x takes four CPU cycles and sta zero page takes three, so a pair takes seven cycles.

For the basic routine to display a 40x192 image I have 6 pairs (three read/write for the left side of the playfield, and three read/write for the right side of the playfield). That makes 7 * 6 = 42 cycles.
I also have two SLEEP 4 statements, so that's an extra 8 cycles, or 50 cycles altogether.

The last part of the code is
sta WSYNC
inx
cpx #192
bne VisibleScreen

Now it seems to me that since the last three opcodes come after WSYNC, they're in between scanlines and shouldn't count towards the 76 cycles, right?
If so then the sta WSYNC should be 3 cycles, so that's a total of 53 cycles and I should have 76 - 53 = 23 cycles left.
If those opcodes do count, then that's an extra 6 or 7 cycles, so at most 53 + 7 = 60 and 76 - 60 = 16 cycles left.

Yet if I use anything more than 11 cycles in between the last sta PF2 and sta WSYNC, the screen goes wonky. So am I incorrect about the timing, doing bad math, or...?

Thank you.

#2 Rybags

    Quadrunner

  • 10,314 posts
  • Location:Australia

Posted Sun Dec 4, 2011 9:23 PM

You don't have "in between scanlines" in the sense of what scanline the CPU is executing on - the cycles per scanline on a given system also count stuff like HBlank and Sync time.
So, any given piece of your kernel is being executed on a given scanline, or spanning two.

WSync halts the CPU until a given cycle - with that bit of code you have you might be able to speed it up by putting either / both the INX / CPX #192 before the STA WSYNC.
That is of course assuming the instructions won't overrun to the point where the WSync occurs for the next scanline.

#3 jbs30000

    Moonsweeper

  • 459 posts

Posted Sun Dec 4, 2011 10:53 PM

OK, that makes sense, but I'm still coming up short. For example, counting the code that comes after the sta WSYNC I should be able to put in code that takes up to 16 cycles, yet I can only go up to 11 cycles (I tested this using SLEEP).

Edited by jbs30000, Sun Dec 4, 2011 10:58 PM.


#4 Andrew Davie

    Stargunner

  • 1,314 posts
  • Location:Tasmania

Posted Sun Dec 4, 2011 11:13 PM

You have 228 colour clocks for the entire scanline. 160 of these are actually displaying pixels. So there are 68 (228-160) 'colour clocks' of horizontal blank time -- time on a scanline when pixels aren't being drawn.
Divide 228 by 3 (colour clocks per CPU cycle) and you get that magical 76 cycles per scanline. Note that you don't have to have a WSYNC write; the TIA proceeds anyway. So that WSYNC effectively costs you 3 cycles out of those 76. But it also means you don't have to spend lots of fiddly code getting your timing exactly right. When you've finished your stuff on the line, you can sta WSYNC and you know the next thing is the start of the next line. On the other hand, if you don't want to do the store to WSYNC and you want to time out everything, you can use all 76 cycles on each line.
Now remember, after you do sta WSYNC, the 6507 *halts* until the start of the next scanline. So, your 'inx' is actually taking 2 cycles out of the next line. With the cpx and the branch, you've taken 7 cycles out of your 76 cycles right there. And the WSYNC as I said takes another 3. So you really only have 66 cycles to play with, the way this code is structured. Now, if your branch is over a page boundary, that takes another cycle to do. But, basically, 6 * lda/sta + a couple of sleeps, plus your looping and incrementing... should fit. Post your code so we can diagnose :)


6 playfield reads/writes @ 7 cycles each = 42 (unless your tables cross page boundaries, in which case +1 for each time)
+ 2 sleeps @ 4 cycles each = 8
+ wsync = 3
+ inx/cpx/bne = 7 (maybe 8)
+ your test buffer space = 11

= 71 (maybe 72)

So you have ~5 cycles to play with. I suspect you have some cross-page branching or indexing!
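
The tally above can be double-checked with a few lines of Python (used here only as a calculator, not as 2600 code). This is a rough sketch: `line_cost` and `page_cross_penalties` are names made up for illustration, the instruction timings are the standard 6502 ones quoted in this thread, and the figure of 5 penalty cycles is only an inference from the 11-versus-16 gap reported later in the thread.

```python
# Cycle budget for one scanline of the kernel discussed in this thread.
CYCLES_PER_LINE = 228 // 3   # 228 colour clocks / 3 clocks per CPU cycle = 76

def line_cost(test_cycles, page_cross_penalties=0):
    """Total CPU cycles used on one scanline (hypothetical helper)."""
    playfield = 6 * (4 + 3)   # six lda table,x (4) + sta PFx (3) pairs
    sleeps = 2 * 4            # two SLEEP 4 macros
    wsync = 3                 # sta WSYNC
    loop = 2 + 2 + 3          # inx, cpx #192, bne (branch taken, same page)
    return (playfield + sleeps + wsync + loop
            + test_cycles + page_cross_penalties)

for extra in (11, 16):
    for penalties in (0, 5):  # 5 is an assumed number of +1 penalties
        used = line_cost(extra, penalties)
        status = "fits" if used <= CYCLES_PER_LINE else "overruns"
        print(f"test={extra:2d} penalties={penalties}: {used} cycles, {status}")
```

With no penalties, 11 test cycles leave 5 cycles spare and 16 test cycles fill the line exactly. Five penalty cycles would use up that spare room, which would be consistent with the screen breaking at anything above 11.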


Cheers
A

#5 jbs30000

    Moonsweeper

  • 459 posts

Posted Mon Dec 5, 2011 12:19 AM

:dunce: It turned out to be 100% page-boundary crossings. I threw in a few ALIGN 256 statements and lo and behold, I have 16 cycles left instead of 11.
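
For anyone wondering why ALIGN 256 helps, here is a small Python sketch of the page-crossing rule. It assumes the tables are read with absolute,X addressing, which is where the 6502 adds a cycle when base + X lands in a different 256-byte page. The address 0xF0C0 and the helper names are made up for illustration.

```python
def crosses_page(base, index):
    """True if base + index lies in a different 256-byte page than base."""
    return (base & 0xFF00) != ((base + index) & 0xFF00)

def align_256(address):
    """Round up to the next multiple of 256, like ALIGN 256 in DASM."""
    return (address + 0xFF) & ~0xFF

TABLE_LEN = 192                  # one byte per visible scanline
unaligned = 0xF0C0               # hypothetical table address
aligned = align_256(unaligned)   # moves the table to the next page start

# Count how many of the 192 indexed reads would pay the extra cycle.
slow = sum(crosses_page(unaligned, x) for x in range(TABLE_LEN))
fast = sum(crosses_page(aligned, x) for x in range(TABLE_LEN))
print(hex(aligned), slow, fast)
```

A table that starts on a page boundary and is no longer than 256 bytes can never cross a page, so every read costs the same on every line.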

Well, thanks for your help everybody. I'm pretty sure I've got it figured out now. Well, displaying playfields anyway :).

#6 SeaGtGruff

    River Patroller

  • 4,545 posts
  • Location:Georgia, USA

Posted Mon Dec 5, 2011 5:48 PM

Although you've got your problem worked out, I thought I'd add a couple more comments about timing.

WSYNC is needed primarily to keep things starting at the same place on each line when there are bits of code that might take up different numbers of cycles-- such as if there are any comparisons and branching-- so the instructions will line up as expected on each line. Otherwise you might end up with a loop where the instructions slip back a cycle or more each time through the loop, or that creep forward a cycle or more each time through the loop. If you need to find a few more cycles, one method is to see if you can eliminate the WSYNC by carefully constructing your loop so the instructions always land on the same cycles each time through the loop, even if there are places where you use comparisons and branches.
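
To illustrate the slipping and creeping, here is a minimal Python sketch (a model only, with a made-up helper name) of where each pass of a WSYNC-free loop would start within the 76-cycle line:

```python
CYCLES_PER_LINE = 76

def start_cycles(loop_cycles, passes):
    """Scanline cycle at which each pass of a WSYNC-free loop begins."""
    return [(n * loop_cycles) % CYCLES_PER_LINE for n in range(passes)]

print(start_cycles(76, 4))  # exactly one line per pass: always cycle 0
print(start_cycles(75, 4))  # one cycle short: each pass starts a cycle earlier
print(start_cycles(77, 4))  # one cycle long: each pass starts a cycle later
```

Only a loop that takes exactly 76 cycles on every pass stays put without WSYNC, which is why every branch path has to be padded to the same length.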

Another method to gain more cycles for your code is to unroll the loop, meaning you get rid of the loop entirely and duplicate the code as many times as needed to draw the desired number of lines. This increases the ROM needed to draw the lines, so it can often work better for small sections of lines-- like maybe a score bar, status bar, or if your game lends itself to dividing the screen into horizontal bands. For example, you might have a loop that draws 10 horizontal bands, with each band containing 16 scan lines. You still need a loop to draw the 10 bands, but the code to draw one band of 16 lines could be unrolled. That way you can reclaim the cycles that would otherwise be needed for the comparison and branch instructions on a single line.
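
As a rough sketch of that trade-off in Python: the figures below reuse the kernel from earlier in this thread (53 cycles of drawing code per line including the sta WSYNC). The 36-byte body size, the helper name, and the idea that the loop-back costs the same 7 cycles as inx/cpx/bne are all assumptions for illustration; a real unrolled band would likely need a little more to advance the index by a whole band.

```python
CYCLES_PER_LINE = 76
LOOP_OVERHEAD = 2 + 2 + 3   # inx + cpx #imm + bne (taken), in cycles

def unrolled_band(body_cycles, body_bytes, lines_per_band,
                  band_overhead=LOOP_OVERHEAD):
    """Free cycles per line and ROM cost for one unrolled band.

    body_cycles and body_bytes describe the drawing code for one line,
    excluding the loop instructions. Only the last line of the band
    still pays band_overhead, to loop back for the next band.
    """
    free_inner = CYCLES_PER_LINE - body_cycles   # lines with no loop code
    free_last = free_inner - band_overhead       # line that loops back
    rom_bytes = body_bytes * lines_per_band      # body duplicated per line
    return free_inner, free_last, rom_bytes

# 53 cycles per line from this thread; 36 bytes is an assumed body size.
print(unrolled_band(body_cycles=53, body_bytes=36, lines_per_band=16))
```

Under these assumptions, 15 of the 16 lines in a band get the 7 loop cycles back, at the cost of storing the line body 16 times in ROM.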

#7 Ben_Larson

    Moonsweeper

  • 336 posts
  • Location:Columbus, OH, USA

Posted Mon Dec 5, 2011 6:38 PM

SeaGtGruff, on Mon Dec 5, 2011 5:48 PM, said:

Another method to gain more cycles for your code is to unroll the loop, meaning you get rid of the loop entirely and duplicate the code as many times as needed to draw the desired number of lines.
I did this in 'Incoming' for the main kernel loop actually - in order to get enough cycles for the asymmetrical PF, 2 player objects, missile, and PF/background color changes every 8 lines. It's unrolled into 8-line segments.

#8 jbs30000

    Moonsweeper

  • 459 posts

Posted Mon Dec 5, 2011 9:26 PM

SeaGTGruff thank you for your advice. Yeah, I did figure out that sometimes WSYNC is needed. And thank you for advice about unrolling loops. It's so simple, yet I didn't think of it. Sometimes it's the little things that slip my notice.

#9 SeaGtGruff

    River Patroller

  • 4,545 posts
  • Location:Georgia, USA

Posted Tue Dec 6, 2011 10:02 PM

jbs30000, on Mon Dec 5, 2011 9:26 PM, said:

SeaGTGruff thank you for your advice. Yeah, I did figure out that sometimes WSYNC is needed. And thank you for advice about unrolling loops. It's so simple, yet I didn't think of it. Sometimes it's the little things that slip my notice.
They weren't my ideas-- several 2600 programmers have used them. I just wanted to make sure you knew about them. :)

#10 jbs30000

    Moonsweeper

  • 459 posts

Posted Tue Dec 6, 2011 10:07 PM

Thank you.



