Luma Enhancement Module Development


96 replies to this topic

#26  

    Dragonstomper

  • 628 posts
  • Joined: 07-March 08
  • Location:Michigan

Posted Mon Oct 26, 2009 3:59 PM

It occurred to me that we don't need an 8x clock, just 4x. Here's why: Our highest resolution mode sends out 8 pixels per Phi2 cycle. Instead of clocking them out of a shift register 8 times per cycle, we can multiplex them out with 8 states per cycle. We can get 8 states by decoding 3 bits - the 1x, 2x, and 4x clocks. So we just need two clock doublers.

This is exactly how GTIA does its hi-res mode (ANTIC mode F). Those pixels come out at 7.2 MHz but GTIA only has a 3.6 MHz clock. It uses half a cycle to display one pixel and the other half to display the next pixel.
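To make the decoding idea concrete, here is a minimal Python sketch (a software model only, not the CPLD logic). It assumes ideal 50% duty clocks that all start high at the beginning of the Phi2 cycle, and an MSB-first pixel order; the function names are made up for illustration. The three clock levels form a 3-bit state that visits all 8 values once per cycle, which is enough to address an 8-to-1 multiplexer.

```python
def clock_level(phase, mult):
    """Level of an ideal 50%-duty clock running at mult x Phi2.

    phase is the position inside one Phi2 cycle, 0.0 <= phase < 1.0.
    Assumption: every clock starts high at phase 0.
    """
    return 1 if (phase * mult) % 1.0 < 0.5 else 0


def mux_state(phase):
    """Decode the 1x, 2x and 4x clock levels into a state 0..7."""
    bits = (
        (clock_level(phase, 1) << 2)
        | (clock_level(phase, 2) << 1)
        | clock_level(phase, 4)
    )
    # With clocks that start high, the raw bits count down 7..0,
    # so invert them to get an ascending pixel index.
    return 7 - bits


def select_pixel(byte, phase):
    """Pick one of 8 one-bit pixels, MSB first (assumed ordering)."""
    return (byte >> (7 - mux_state(phase))) & 1


# Sample the middle of each eighth of the cycle.
states = [mux_state((i + 0.5) / 8) for i in range(8)]
pixels = [select_pixel(0b10110010, (i + 0.5) / 8) for i in range(8)]
print(states)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(pixels)  # [1, 0, 1, 1, 0, 0, 1, 0]
```

No shift register is clocked here; the pixel is chosen purely from the current levels of the three clocks.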

#27  

    Dragonstomper

  • 628 posts
  • Joined: 07-March 08
  • Location:Michigan

Posted Tue Oct 27, 2009 7:41 AM

ClausB, on Mon Oct 26, 2009 3:59 PM, said:

So we just need two clock doublers.
A very simple clock doubler is merely a delay line and an XOR gate. You delay the input clock by 1/4 cycle and XOR both signals to get twice the frequency and 50% duty. Put two of those in series and you also get the 4x clock we need. So the first doubler needs 140 ns delay and the second needs 70 ns. I have found a triple 70 ns delay line for $9 from DDD. A bit pricey but we likely won't be mass producing!
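The delay-and-XOR doubler can be checked with a small Python simulation. This is a sketch under stated assumptions: time is stepped in whole nanoseconds, the Phi2 period is taken as a round 560 ns, the signals are ideal, and the helper names (`square`, `doubler`, `rising_edges`, `high_time`) are invented for the example.

```python
PHI2_PERIOD_NS = 560  # assumed round figure for the bus cycle


def square(t, period):
    """Ideal 50%-duty square wave, high during the first half period."""
    return 1 if (t % period) < period // 2 else 0


def doubler(signal, delay_ns):
    """Model one doubler stage: input XOR (input delayed by delay_ns)."""
    return lambda t: signal(t) ^ signal(t - delay_ns)


def rising_edges(signal, start, stop):
    """Count low-to-high transitions in [start, stop)."""
    return sum(
        1 for t in range(start, stop)
        if signal(t) == 1 and signal(t - 1) == 0
    )


def high_time(signal, start, stop):
    """Total time in ns the signal is high in [start, stop)."""
    return sum(signal(t) for t in range(start, stop))


def phi2(t):
    return square(t, PHI2_PERIOD_NS)


clk2x = doubler(phi2, 140)   # first stage: 1/4 of 560 ns
clk4x = doubler(clk2x, 70)   # second stage: 1/4 of 280 ns

print(rising_edges(phi2, 0, 560))   # 1
print(rising_edges(clk2x, 0, 560))  # 2
print(rising_edges(clk4x, 0, 560))  # 4
print(high_time(clk4x, 0, 560))     # 280, i.e. 50% duty
```

In this model, a delay that is not exactly a quarter of the input period still doubles the frequency; it only moves the duty cycle away from 50%.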

#28

    Stargunner

  • 1,377 posts
  • Joined: 28-July 07

Posted Tue Oct 27, 2009 9:22 AM

Yep - we could do that. Which IC were you looking at? They seem to have the ability to make the delays anything we want - that can't be a low volume project! Can it? If 70ns is a standard (low volume part) value, that should work. The duty cycle won't quite be 50% but that won't matter so much.


Bob






#29  

    Dragonstomper

  • 628 posts
  • Joined: 07-March 08
  • Location:Michigan

Posted Tue Oct 27, 2009 3:40 PM

bob1200xl, on Tue Oct 27, 2009 9:22 AM, said:

Yep - we could do that. Which IC were you looking at? They seem to have the ability to make the delays anything we want - that can't be a low volume project! Can it? If 70ns is a standard (low volume part) value, that should work. The duty cycle won't quite be 50% but that won't matter so much.
This is the email quote I got yesterday:

MOQ is 10 pieces
3D7323Z-70 $8.68 each 1 week to ship
MDU3C-70 $11.55 each 4-6 weeks

This is the part:
http://www.datadelay...eets/3d7323.pdf

The delay tolerance is 2%. The ideal delays are 69.8 ns for NTSC and 70.5 ns for PAL. They differ by less than the tolerance.
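The arithmetic behind those figures can be reproduced in a few lines of Python. The Phi2 frequencies below are assumed values (the commonly quoted NTSC and PAL machine clocks), not numbers taken from the quote or the datasheet; the ideal delay is one eighth of a Phi2 period, i.e. a quarter cycle of the 2x clock.

```python
NTSC_PHI2_HZ = 1_789_773  # assumed NTSC Phi2 frequency
PAL_PHI2_HZ = 1_773_447   # assumed PAL Phi2 frequency
NOMINAL_DELAY_NS = 70.0   # the -70 part
TOLERANCE = 0.02          # 2% delay tolerance


def ideal_delay_ns(phi2_hz):
    """One eighth of a Phi2 period, in nanoseconds."""
    return 1e9 / phi2_hz / 8


def within_tolerance(ideal_ns):
    """True if the ideal delay lies inside the part's tolerance band."""
    return abs(ideal_ns - NOMINAL_DELAY_NS) <= NOMINAL_DELAY_NS * TOLERANCE


ntsc = ideal_delay_ns(NTSC_PHI2_HZ)
pal = ideal_delay_ns(PAL_PHI2_HZ)
print(f"NTSC {ntsc:.1f} ns, PAL {pal:.1f} ns")  # NTSC 69.8 ns, PAL 70.5 ns
print(f"difference {pal - ntsc:.2f} ns, "
      f"tolerance band +/-{NOMINAL_DELAY_NS * TOLERANCE:.1f} ns")
print(within_tolerance(ntsc), within_tolerance(pal))  # True True
```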

#30

    Stargunner

  • 1,757 posts
  • Joined: 17-April 05
  • Location:Lublin, Poland

Posted Tue Oct 27, 2009 3:45 PM

can't you do this in cpld?

#31  

    Dragonstomper

  • 628 posts
  • Joined: 07-March 08
  • Location:Michigan

Posted Tue Oct 27, 2009 3:59 PM

candle, on Tue Oct 27, 2009 3:45 PM, said:

can't you do this in cpld?
Certainly the XORs will be in the CPLD. What about the delays?

I researched a bit on the Web and saw some things about the Xilinx Digital Clock Manager core and about Digital Locked Loops, but I could not find enough details to see if such a thing would fit into our smallish CPLD. Do you have details to share?

#32

    Stargunner

  • 1,757 posts
  • Joined: 17-April 05
  • Location:Lublin, Poland

Posted Tue Oct 27, 2009 4:04 PM

how about wait 70ns statement in VHDL code?
no need for domain synchronisers or a PLL inside a cpld, it's just the routing inside that matters - besides, those would only be available inside an fpga, not a cpld chip

#33  

    Dragonstomper

  • 628 posts
  • Joined: 07-March 08
  • Location:Michigan

Posted Tue Oct 27, 2009 4:23 PM

candle, on Tue Oct 27, 2009 4:04 PM, said:

how about wait 70ns statement in VHDL code?
Does that assume there is some high frequency clock in the FPGA to count out 70 ns? Nothing like that here. We have 1.8 MHz available and we're trying to make 7.2 MHz synchronized.

#34

    Stargunner

  • 1,757 posts
  • Joined: 17-April 05
  • Location:Lublin, Poland

Posted Tue Oct 27, 2009 4:26 PM

no it doesn't
it's based on timing equations and propagation delays inside the cells of the cpld

#35  

    Dragonstomper

  • 628 posts
  • Joined: 07-March 08
  • Location:Michigan

Posted Tue Oct 27, 2009 4:34 PM

I doubt there are enough spare gates in our CPLD to chain up to 140 ns and 70 ns delays. Maybe we should use the larger chip you recommended and devote 90% of it to the clock and 10% to the LEM.

#36

    Stargunner

  • 1,757 posts
  • Joined: 17-April 05
  • Location:Lublin, Poland

Posted Tue Oct 27, 2009 4:47 PM

i still think that an fpga chip is the way to go with this
costs are the same if you compare a small fpga and a large cpld

#37  

    Dragonstomper

  • 628 posts
  • Joined: 07-March 08
  • Location:Michigan

Posted Tue Oct 27, 2009 4:56 PM

You might be right. I'm old-school so I'm trying to design hardware, not software-on-a-chip. That's fine for large, complex designs like VBXE, but I don't think LEM needs it. I'll try to keep an open mind, though.

"I can change, if I have to." - Red Green

#38

    Stargunner

  • 1,757 posts
  • Joined: 17-April 05
  • Location:Lublin, Poland

Posted Tue Oct 27, 2009 4:59 PM

do you know that most cpld chips are in fact small fpga chips with bootloaders?

#39

    Stargunner

  • 1,377 posts
  • Joined: 28-July 07

Posted Tue Oct 27, 2009 7:52 PM

So, they make these custom at 70ns? wow....

Instead of spending $100 on delay lines, how about I just tweak a clock into the circuit and simulate 70ns? I would hate to want a 35ns delay down the road.

We're going to do at least three iterations of the boards, I expect. Maybe more.

Bob





#40  

    Dragonstomper

  • 628 posts
  • Joined: 07-March 08
  • Location:Michigan

Posted Tue Oct 27, 2009 9:03 PM

bob1200xl, on Tue Oct 27, 2009 7:52 PM, said:

how about I just tweak a clock into the circuit and simulate 70ns?
Not sure what you mean. How will you sync it to Phi2?

#41

    Stargunner

  • 1,757 posts
  • Joined: 17-April 05
  • Location:Lublin, Poland

Posted Tue Oct 27, 2009 9:12 PM

you could use a 74ls14 chip and an r/c circuit to delay the signal
not very controlled maybe, but still better than spending $100 on delay lines

#42

    Stargunner

  • 1,377 posts
  • Joined: 28-July 07

Posted Wed Oct 28, 2009 4:59 PM

I was thinking of a gated oscillator setup, actually. Phi2 would gate a series of clock pulses that would load registers from SRAM, or whatever. It would have to be manually adjusted on the prototypes, while the finished boards could use delay lines that wouldn't need adjusting ('tweaking').

Bob





#43

    Stargunner

  • 1,757 posts
  • Joined: 17-April 05
  • Location:Lublin, Poland

Posted Wed Oct 28, 2009 8:55 PM

so every 4 pixels would be a bit distorted, but within a controllable range
it may be a good idea to use a higher frequency than necessary, and then scale it down with a clock divider
it might reduce pixel skew in those 4 pixel chunks. if the falling edge of phi2 activated the clocking circuit, it would be in phase with phi2 all the time - and even if not, a higher starting frequency would give a smaller skew

#44

    Stargunner

  • 1,377 posts
  • Joined: 28-July 07

Posted Wed Oct 28, 2009 10:07 PM

If I make the initial delay and the data-to-data delay variable, I can adjust the pixels for best fit, can't I?

I'm not sure... it still isn't entirely clear what the sequence is for the process.

*Phi2 clock falls, indicating the start of a new cycle.
*S4 falls, indicating $8000-$9FFF data access. (it had better be ANTIC because that's our only clue)
*After an adjustable delay, (perhaps 0) SRAM is accessed for the first data byte/bits.
*SRAM data is latched into the CPLD data reg.
*Data is clocked out of the register at an adjustable clock rate. (after an adjustable delay?) **when does this happen? do we need two sets of data regs?**

Is that about right?

Would it be worthwhile to have a line counter and start/stop without requiring DLIs? We have the vertical and horizontal sync pulses in the LUMA input. Maybe implement two-line modes?

Bob





#45  

    Dragonstomper

  • 548 posts
  • Joined: 15-April 03
  • Location:California

Posted Thu Oct 29, 2009 12:08 AM

Thank you for keeping this alive, Claus. My family is more financially stable now, so I am eager and willing to put my money where my mouth is and support this. :D

#46  

    Quadrunner

  • 9,751 posts
  • Joined: 29-September 05
  • Location:Australia

Posted Thu Oct 29, 2009 12:49 AM

Couldn't we somehow automate the enable/disable process?

Just reserve an address which, if accessed, will enable the LEM, another for disable.

Since we're talking custom Display Lists anyway, we could have something like a dummy graphics line before the real display.

e.g.
2 x 8 Blank
1 x 7 Blank
LMS $BE00 Mode D - tell the LEM to enable itself. (reads from page $BE will return zeros; any access to $BE00 enables LEM mode)
LMS $9C40 Mode 2
23 x Mode 2
LMS $BE80 Mode D - tell the LEM to disable itself. (any access to $BE80 shuts off LEM mode)
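
For reference, the outline above could be written out as raw ANTIC display list bytes roughly as follows. This is only a Python sketch: the helper names `blank` and `mode` are made up, the addresses are the ones from the outline, and the closing jump instruction is left out, just as it is in the outline.

```python
LMS = 0x40  # Load Memory Scan bit in an ANTIC mode instruction


def blank(lines):
    """Blank-line instruction for 1..8 scan lines."""
    if not 1 <= lines <= 8:
        raise ValueError("ANTIC blank instructions cover 1 to 8 lines")
    return [(lines - 1) << 4]


def mode(number, lms_addr=None):
    """Mode-line instruction, optionally with an LMS address (low byte first)."""
    if lms_addr is None:
        return [number]
    return [number | LMS, lms_addr & 0xFF, lms_addr >> 8]


display_list = (
    blank(8) + blank(8) + blank(7)   # 2 x 8 Blank, 1 x 7 Blank
    + mode(0x0D, 0xBE00)             # dummy line: access enables the LEM
    + mode(0x02, 0x9C40)             # first Mode 2 line with LMS
    + mode(0x02) * 23                # 23 more Mode 2 lines
    + mode(0x0D, 0xBE80)             # dummy line: access disables the LEM
)

print(" ".join(f"{b:02X}" for b in display_list[:9]))
# 70 70 60 4D 00 BE 42 40 9C
print(len(display_list))  # 35
```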

#47  

    Dragonstomper

  • 628 posts
  • Joined: 07-March 08
  • Location:Michigan

Posted Thu Oct 29, 2009 4:06 PM

bob1200xl, on Wed Oct 28, 2009 10:07 PM, said:

I'm not sure... it still isn't entirely clear what the sequence is for the process.
It's been bouncing around in my head for a year, so it's pretty clear to me:

As far as the SRAM goes, the sequence is laid out in the timing diagrams I posted at the top of this thread. At the rising edge of Phi2, 8 bits of SRAM data get clocked into the first data register. 140 ns later, 8 bits from another bank go into the second register. (That's one reason why a 140 ns delay line on Phi2 would be ideal.)

As for the luma output, we must divide each 560 ns bus cycle into 8, 4, or 2 equal parts and select 1, 2, or 4 bits at a time per pixel using a variable width, variable period multiplexer. (A 70 ns delay helps generate the counter to address the mux).
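
A rough Python model of that variable-width, variable-period mux is sketched below. It looks at a single 8-bit register in isolation (how the two registers are combined is not modeled), assumes a round 560 ns bus cycle and an MSB-first pixel order, and the function name is invented for the example.

```python
BUS_CYCLE_NS = 560  # nominal Phi2 bus cycle


def split_pixels(byte, bits_per_pixel):
    """Split one byte into pixels of 1, 2 or 4 bits.

    Returns (pixel_values, pixel_period_ns). Pixels are taken MSB first,
    which is an assumed ordering.
    """
    if bits_per_pixel not in (1, 2, 4):
        raise ValueError("bits_per_pixel must be 1, 2 or 4")
    count = 8 // bits_per_pixel            # 8, 4 or 2 pixels per cycle
    mask = (1 << bits_per_pixel) - 1
    pixels = [
        (byte >> (8 - bits_per_pixel * (i + 1))) & mask
        for i in range(count)
    ]
    return pixels, BUS_CYCLE_NS // count


print(split_pixels(0xB4, 1))  # ([1, 0, 1, 1, 0, 1, 0, 0], 70)
print(split_pixels(0xB4, 2))  # ([2, 3, 1, 0], 140)
print(split_pixels(0xB4, 4))  # ([11, 4], 280)
```

The pixel periods of 70, 140, and 280 ns are the same intervals the 4x, 2x, and 1x clock edges mark out.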

#48  

    Dragonstomper

  • 628 posts
  • Joined: 07-March 08
  • Location:Michigan

Posted Thu Oct 29, 2009 4:12 PM

Rybags, on Thu Oct 29, 2009 12:49 AM, said:

Couldn't we somehow automate the enable/disable process?

Just reserve an address which, if accessed, will enable the LEM, another for disable.

Since we're talking custom Display Lists anyway, we could have something like a dummy graphics line before the real display.

e.g.
2 x 8 Blank
1 x 7 Blank
LMS $BE00 Mode D - tell the LEM to enable itself. (read to page $BE00 will return zeros, any access to $BE00 enables LEM mode)
LMS $9C40 Mode 2
23 x Mode 2
LMS $BE80 Mode D - tell the LEM to disable itself. (any access to $BE80 shuts off LEM mode)
Very interesting idea! A few details in your example need correcting:

Page $BE is outside the range we've selected, but $9E would work.

ANTIC mode 2 would not be useful. The design only works with single-line modes which use DMA on every line.

But the clever idea of using ANTIC to enable and disable the luma is worth considering.
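
As a sketch of how such an address-triggered enable could behave, here is a tiny Python model of a set/reset latch driven by bus addresses. The trigger addresses $9E00 and $9E80 are hypothetical, chosen only by moving the example's offsets into page $9E; the class and method names are likewise made up, and nothing here reflects a decided design.

```python
class LemEnableLatch:
    """Set/reset latch toggled by accesses to two reserved addresses."""

    ENABLE_ADDR = 0x9E00   # hypothetical enable trigger
    DISABLE_ADDR = 0x9E80  # hypothetical disable trigger

    def __init__(self):
        self.enabled = False

    def on_access(self, addr):
        """Update the latch for one bus access and return the new state."""
        if addr == self.ENABLE_ADDR:
            self.enabled = True
        elif addr == self.DISABLE_ADDR:
            self.enabled = False
        return self.enabled


latch = LemEnableLatch()
trace = [latch.on_access(a) for a in (0x8000, 0x9E00, 0x9C40, 0x9E80, 0x9C41)]
print(trace)  # [False, True, True, False, False]
```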

#49  

    Dragonstomper

  • 628 posts
  • Joined: 07-March 08
  • Location:Michigan

Posted Thu Oct 29, 2009 4:33 PM

ClausB, on Thu Oct 29, 2009 4:12 PM, said:

ANTIC mode 2 would not be useful. The design only works with single-line modes which use DMA on every line.
Wait a minute! Why shouldn't single-line character modes work? As long as the character set data are stored in the SRAM and the character codes are stored elsewhere, it should work. You would still get 40 or 20 characters across the screen but each character would be 16 bits wide and have the same hi-res luma options as the graphics modes. One complication however is the 8 color-clock delay between luma video and GTIA video. In graphics modes that is easily corrected by offsetting the plotting locations, but in character modes it gets more restrictive.

#50

    Stargunner

  • 1,377 posts
  • Joined: 28-July 07

Posted Thu Oct 29, 2009 9:25 PM

The timing diagram shows the SRAM but not the LUMA timing. (does it?) While we are loading one register, are we reading LUMA from the other? In the next/same cycle?

Bob









