Luma Enhancement Module Development
Started by ClausB, Oct 23 2009 11:21 AM
96 replies to this topic
#76
Posted Thu Dec 17, 2009 6:30 AM
How is the Lemmy coming along, Claus?
#78
Posted Tue Jan 5, 2010 11:44 PM
ClausB, on Tue Oct 27, 2009 3:59 PM, said:
I researched a bit on the Web and saw some things about the Xilinx Digital Clock Manager core and about Digital Locked Loops, but I could not find enough details to see if such a thing would fit into our smallish CPLD. Do you have details to share?
Thought about this today while doing some work with a dev board I have with an older Spartan 3 (XC3S1000, 1.8V core, 3.3V tolerant) that has four internal DCMs. I wired up a 3.3V buffer (74LVC244), fed PHI2 into it, and configured one of the DCMs to do CLKx16 (there's a minimum output frequency). From there I made some internal /4 and /8 counters to get the 7.16 MHz and 3.58 MHz clocks. It wouldn't be any problem generating 14.318 MHz if necessary either. The acquisition board in my HP logic analyzer only has a resolution of 10 ns, but it looks like the delay between the generated clocks and PHI2 is somewhere between 20 ns and 30 ns. The buffer adds about 8-10 ns, and I don't have PHI2 going into a dedicated clock port on the FPGA, which would also factor in. This particular FPGA is a 4 ns part. The smallest newer-generation device with on-chip flash is the XC3S50AN, which is under $10. Those come with two DCMs inside.
It's probably not much help if you're planning on going forward with the Atmel part, but I just thought I'd show you the result in a Xilinx part using the DCM for one of the purposes they were designed for.
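For anyone checking the numbers, here is a minimal Python sketch of the clock arithmetic described above. It assumes an NTSC machine, where PHI2 is the 14.31818 MHz master crystal divided by 8; the function name `derived_clocks` and the dictionary keys are made up for illustration and are not part of any Xilinx tool.

```python
# Assumption: NTSC timing, PHI2 = 14.31818 MHz / 8 (about 1.79 MHz).
NTSC_MASTER_HZ = 14_318_180
PHI2_HZ = NTSC_MASTER_HZ / 8

def derived_clocks(phi2_hz, multiplier=16):
    """Return the multiplied clock and the divided-down clocks, in Hz.

    `multiplier` models the CLKx16 setting mentioned in the post;
    the /2, /4 and /8 entries model simple internal counters.
    """
    base = phi2_hz * multiplier
    return {
        "x16": base,        # multiplied clock
        "div2": base / 2,   # about 14.318 MHz
        "div4": base / 4,   # about 7.16 MHz
        "div8": base / 8,   # about 3.58 MHz
    }

if __name__ == "__main__":
    for name, hz in derived_clocks(PHI2_HZ).items():
        print(f"{name}: {hz / 1e6:.3f} MHz")
```

The /2 tap lands back on the master crystal frequency, which is why generating 14.318 MHz from the same multiplied clock is no extra trouble.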
#79
Posted Wed Jan 6, 2010 11:27 AM
Very nice... These things (DCMs) are part of the hardware or you have an image that you load?
The bulk of what we (I) do is built into GALs and CPLDs - re-programming the Atari PAL and such. Small projects that work well under CUPL/ATMEL hardware. When something larger comes along that needs to scale up, it is more efficient to carry along the same tools/protocol as used in normal projects, rather than move into another line. If we did a lot of large projects or exclusively large projects, it would change our selection criteria. For now, we need small GAL/PAL support.
Bob
warerat, on Tue Jan 5, 2010 11:44 PM, said:
ClausB, on Tue Oct 27, 2009 3:59 PM, said:
I researched a bit on the Web and saw some things about the Xilinx Digital Clock Manager core and about Digital Locked Loops, but I could not find enough details to see if such a thing would fit into our smallish CPLD. Do you have details to share?
Thought about this today while doing some work with a dev board I have with an older Spartan 3 (XC3S1000, 1.8V core, 3.3V tolerant) that has four internal DCMs. I wired up a 3.3V buffer (74LVC244), fed PHI2 into it, and configured one of the DCMs to do CLKx16 (there's a minimum output frequency). From there I made some internal /4 and /8 counters to get the 7.16 MHz and 3.58 MHz clocks. It wouldn't be any problem generating 14.318 MHz if necessary either. The acquisition board in my HP logic analyzer only has a resolution of 10 ns, but it looks like the delay between the generated clocks and PHI2 is somewhere between 20 ns and 30 ns. The buffer adds about 8-10 ns, and I don't have PHI2 going into a dedicated clock port on the FPGA, which would also factor in. This particular FPGA is a 4 ns part. The smallest newer-generation device with on-chip flash is the XC3S50AN, which is under $10. Those come with two DCMs inside.
It's probably not much help if you're planning on going forward with the Atmel part, but I just thought I'd show you the result in a Xilinx part using the DCM for one of the purposes they were designed for.
#80
Posted Wed Jan 6, 2010 12:45 PM
The DCM is physical hardware that is part of the die. Picture a square: the perimeter has the interconnect matrix that connects the I/O pins to the logic blocks and LUTs in the interior, and the DCMs sit in the corners. To use one, you have to instantiate it so that when your definition is "compiled" the synthesizer knows you're referring to a primitive that is native to the device. This is done either with a VHDL/Verilog template or by generating one with the IP core tools, which provide GUIs for choosing which features and clock dividers/multipliers you want.
#81
Posted Sat Jan 9, 2010 6:50 AM
I don't know much about interfacing 3.3V logic with 5V systems. Are buffers necessary?
#82
Posted Sun Jan 10, 2010 2:00 PM
Good question... not sure. I think you can drive a 5v IC with a 3.3v chip, but not the other way around. TTL may not exceed the 3.3v input limit, but CMOS will.
On the large ATMEL CPLDs, you can set the outputs to 3.3v or 5v, so they can be run in a 3.3v system.
Bob
ClausB, on Sat Jan 9, 2010 6:50 AM, said:
I don't know much about interfacing 3.3V logic with 5V systems. Are buffers necessary?
#83
Posted Sun Jan 10, 2010 4:32 PM
that board looks a bit big for a standard size cart
#84
Posted Sun Jan 10, 2010 9:41 PM
bob1200xl, on Sun Jan 10, 2010 2:00 PM, said:
Good question... not sure. I think you can drive a 5v IC with a 3.3v chip, but not the other way around. TTL may not exceed the 3.3v input limit, but CMOS will.
On the large ATMEL CPLDs, you can set the outputs to 3.3v or 5v, so they can be run in a 3.3v system.
Bob
ClausB, on Sat Jan 9, 2010 6:50 AM, said:
I don't know much about interfacing 3.3V logic with 5V systems. Are buffers necessary?
For this particular part, yes, I have to drive it at 3.3V as the FPGA is not 5V tolerant. If I interfaced it directly, I could kiss that $50 FPGA IC goodbye. You can drive 5V TTL logic with 3.3V LVTTL, but not always the other way around. The CPLD family I use for largish designs is XC9500XL; they run at a 3.3V core but are 5V tolerant. Unfortunately, as time goes on there seem to be fewer parts (excluding legacy SPLD/CPLD) that are 5V tolerant.
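To make the "one way but not always the other" point concrete, here is a rough Python sketch of a logic-level compatibility check. The threshold numbers are typical textbook values that I am assuming for illustration (they are not taken from any specific datasheet in this thread), and the `Levels` class and `can_drive` function are hypothetical names. Always check the real datasheets before wiring anything.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Levels:
    name: str
    voh_min: float  # guaranteed minimum output-high voltage
    voh_max: float  # worst-case output-high voltage (assumed to reach the rail)
    vol_max: float  # guaranteed maximum output-low voltage
    vih_min: float  # minimum input voltage read as high
    vil_max: float  # maximum input voltage read as low
    vin_max: float  # highest input voltage the part tolerates

# Assumed typical values, for illustration only.
LVTTL_3V3 = Levels("3.3V LVTTL, not 5V tolerant", 2.4, 3.3, 0.4, 2.0, 0.8, 3.8)
TTL_5V = Levels("5V TTL-level input/output", 2.4, 5.0, 0.4, 2.0, 0.8, 5.5)
CMOS_5V = Levels("5V CMOS", 4.4, 5.0, 0.1, 3.5, 1.5, 5.5)

def can_drive(driver, receiver):
    """True if the driver's levels are recognized by, and safe for, the receiver."""
    high_ok = driver.voh_min >= receiver.vih_min
    low_ok = driver.vol_max <= receiver.vil_max
    safe = driver.voh_max <= receiver.vin_max
    return high_ok and low_ok and safe

if __name__ == "__main__":
    print(can_drive(LVTTL_3V3, TTL_5V))   # 3.3V part driving a 5V TTL input
    print(can_drive(LVTTL_3V3, CMOS_5V))  # 3.3V part driving a 5V CMOS input
    print(can_drive(TTL_5V, LVTTL_3V3))   # 5V part driving a non-tolerant 3.3V input
```

Under these assumed numbers, only the first case passes: the 3.3V output does not reach the 5V CMOS input-high threshold, and the 5V output exceeds what a non-tolerant 3.3V input can take, which is where the buffer comes in.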
#86
Posted Mon Jan 11, 2010 5:00 PM
would adding a dual antic (your previous upgrade) on the upgrade improve things further (more colours/colors and more PMs/sprites etc)
Something like that would be well cool and a decent competitor to the vbxe thing, whilst keeping faith with atari tech...
#87
Posted Thu Jan 14, 2010 10:16 AM
Well...
You could output two (or four) 'frames' that would simulate a dual (or quad) ANTIC, but only by adding a lot of complexity. LEM is a minimal, but elegant, hack.
Bob
carmel_andrews, on Mon Jan 11, 2010 5:00 PM, said:
would adding a dual antic (your previous upgrade) on the upgrade improve things further (more colours/colors and more PMs/sprites etc)
Something like that would be well cool and a decent competitor to the vbxe thing, whilst keeping faith with atari tech...
#88
Posted Thu Jan 14, 2010 10:26 PM
Can you post the .PLD file? I'll add the pinnodes and define some triggers and latches.
Make some video!
Bob
ClausB, on Fri Dec 11, 2009 2:17 PM, said:
bob1200xl, on Sun Dec 6, 2009 9:01 PM, said:
I have filled in the I/O pins that we know about. The pin assignment also takes care of the active level for the signal: a ! denotes NOT, so !S4 means S4 is active at a zero level. In the equations and such, all you will see is S4, which denotes S4 Active (even though it's a zero level in the hardware). !S4 in the equations section means S4 is Not Active. OK?
MAxx are the addresses going to the memory chip from the CPLD. Axx would be an address from the cart socket. MCS is MemoryChipSelect and MWE is MemoryWriteEnable. I tried to keep all the Atari signal names the same as we see them in the Atari documentation.
The field to the right of each line is a comment field. Good thing to have if you're going to make something complex.
OK. So, combining my pseudocode with your syntax gives us:
Name LEM ; PartNo 00 ; Date 12/11/2009 ; Revision 01 ; Designer Claus B. & Bob W.; Company Atari Rulz!! ; Assembly None ; Location ; Device f1504plcc44 ; /* *************** INPUT PINS *********************/ PIN 06 = !S4 ; /* Pages 80-9F select, active low (cart pin 1 and SRAM enable) */ PIN 11 = D4 ; /* Data bus */ PIN 12 = D5 ; /* */ PIN 13 = D2 ; /* */ PIN 14 = D1 ; /* */ PIN 16 = D3 ; /* */ PIN 17 = D7 ; /* */ PIN 18 = D0 ; /* */ PIN 21 = D6 ; /* */ PIN 25 = R/W ; /* Read not Write (cart pin R) */ PIN 26 = !CCTL ; /* Page D5 select, active low (cart pin 15) */ PIN 27 = B02 ; /* Bus clock (cart pin S) */ PIN ?? = B02D ; /* Bus clock thru 140 ns delay */ PIN ?? = C02D ; /* Doubled clock thru 70 ns delay */ PIN = ; /* */ PIN = ; /* */ PIN = ; /* */ PIN = ; /* */ PIN = ; /* */ /* *************** OUTPUT PINS *********************/ PIN ?? = MA13 ; /* SRAM upper address */ PIN ?? = MA14 ; /* */ PIN ?? = MA15 ; /* */ PIN ?? = MA16 ; /* */ PIN 09 = !MWE ; /* SRAM Write Enable, active low */ PIN ?? = C02 ; /* Doubled clock to 70 ns delay */ PIN ?? = LUM0 ; /* Luma outputs */ PIN ?? = LUM1 ; /* */ PIN ?? = LUM2 ; /* */ PIN ?? = LUM3 ; /* */ PIN = ; /* */ PIN = ; /* */ PIN = ; /* */ PIN = ; /* */ PIN = ; /* */ /* *************** PINNODES *********************/ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */ PINNODE = ; /* */
Note that we don't need S5 and RD5 because the banked SRAM will occupy pages $80 - $9F. We also don't need RD4 because the SRAM will always be active in that region of RAM, so RD4 should be tied high on the PCB. For the same reason, we don't need to generate MCS because the SRAM CS can be tied to S4 on the PCB.
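As a sanity check on the banking scheme, here is a small Python model of the address decode implied by the pin list. It is only a sketch under assumptions: that S4 covers $8000-$9FFF and CCTL covers $D500-$D5FF (as the pin comments say), that CPU address lines A0-A12 go straight to the SRAM, and that MA13-MA16 come from a 4-bit bank value held somewhere in the CPLD. The function names and the bank register are hypothetical and are not part of the .PLD file.

```python
S4_FIRST, S4_LAST = 0x8000, 0x9FFF   # pages $80-$9F
CCTL_PAGE = 0xD5                     # page $D5
BANK_BITS = 4                        # MA13..MA16 (assumed to hold a bank number)

def s4_selected(cpu_addr):
    """True when the cartridge S4 select would be active for this address."""
    return S4_FIRST <= cpu_addr <= S4_LAST

def cctl_selected(cpu_addr):
    """True when the cartridge control select (page $D5) would be active."""
    return (cpu_addr >> 8) == CCTL_PAGE

def sram_address(bank, cpu_addr):
    """Combine a bank number (driving MA13..MA16) with A0-A12 from the CPU."""
    if not s4_selected(cpu_addr):
        raise ValueError("address is outside the S4 window")
    if not 0 <= bank < (1 << BANK_BITS):
        raise ValueError("bank number does not fit in MA13..MA16")
    return (bank << 13) | (cpu_addr & 0x1FFF)

if __name__ == "__main__":
    print(hex(sram_address(3, 0x9FFF)))
```

With four upper address bits and an 8K window, this arrangement would reach 16 banks, or 128K of SRAM in total.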
#89
Posted Thu Feb 4, 2010 9:07 AM
Bob/Claus B...When you finish the mod/upgrade, any chance you could work with the maintainers of the various atari emulators (i.e, m.e.s.s, atari ++ or altirra ) and get your mod/upgrade emulated, you might get more people wanting the upgrade that way, also you might also get more games or software that uses your upgrade's enhanced atari features
just a thought, thats all
#90
Posted Sun Feb 7, 2010 3:07 PM
Sorry, Bob,
I forgot to get back to you. That is the .PLD file pasted into the codebox. Just copy it from the post (or from a reply page) and paste it into a text file.
I have not had time yet to study up on CUPL. Your help is very welcome!
Claus
bob1200xl, on Thu Jan 14, 2010 10:26 PM, said:
Can you post the .PLD file? I'll add the pinnodes and define some triggers and latches.
Make some video!
#91
Posted Sun Feb 7, 2010 3:10 PM
carmel_andrews, on Thu Feb 4, 2010 9:07 AM, said:
Bob/Claus B...When you finish the mod/upgrade, any chance you could work with the maintainers of the various atari emulators (i.e, m.e.s.s, atari ++ or altirra ) and get your mod/upgrade emulated, you might get more people wanting the upgrade that way, also you might also get more games or software that uses your upgrade's enhanced atari features.
#92
Posted Sun Feb 7, 2010 3:49 PM
welcome back claus b...was it nice in that 'better place', or were you snowed under with real life (tm&c)
#93
Posted Sat Feb 20, 2010 8:08 AM
carmel_andrews, on Sun Feb 7, 2010 3:49 PM, said:
welcome back claus b...was it nice in that 'better place', or were you snowed under with real life (tm&c)
#94
Posted Sun Feb 21, 2010 2:53 PM
Like the TI ones i saw back in the day.....I do remember the HP oblong/rectangular one's, thought they were nifty
#95
Posted Mon Feb 22, 2010 6:41 PM
Hi folks
I'm also working conceptually right now on a design for something like this.
I was checking some calc'd numbers here, and I'm getting different numbers.
It was mentioned it takes 8x PHI2 accuracy to match current antic resolution.
But I've come up with half that. I'm not trying to have anyone look silly
(and hopefully it won't be me, hehe). Here is what I found in my searching.
According to Mapping Atari, Appendix 3.
1 machine cycle 0.558 uS and is also one PHI2 cycle
color clocks 228 / scanline
machine cycles 114 / scanline
According to Mapping and also Antic mag v3n10 (PBI article) the PHI2 clock
cycle is one machine cycle, and one asm instruction takes two PHI2 cycles,
or more.
Mapping, pg 160, "The shortest 6502 instruction requires two [PHI2] cycles; during
that time the electron beam moves four color clocks."
According to this, a GR.8 scanline advances 8 pixels during the time it takes
a short 6502 instruction (e.g. CLC) to complete. But there are two PHI2
clocks in that timeframe, so each PHI2 cycle covers four GR.8 pixels, which
is two color clocks. Since 320x192 hi-res uses 0.5 color clock per pixel, one
color clock is equivalent to one pixel on the 160-pixel screen.
OK, so when the 6502 completes one instruction like CLC (CLear Carry), two
PHI2 cycles have passed and the display has moved 8 GR.8 pixels, or one GR.0
text character width. My point is that there are *TWO* PHI2 cycles during
this one character width.
Aha, if you run at PHI2 rate, you get GTIA-capable resolution. And if you
were to turn PHI2 into a double-data-rate (DDR) clock (trigger on both rise
and fall), you might get 160 pixels/scanline resolution, and this is without
clock multiplier parts, only a change in how PHI2 is used. A DDR-style PHI2
would give about 279 ns per half-cycle (half of 0.558 uS), and of course you
would need to get your video register index finished as early as possible there.
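The arithmetic behind those figures can be worked through in a few lines of Python. This is just a sketch using the Mapping the Atari numbers quoted in the post (0.558 uS per machine cycle, 228 color clocks and 114 machine cycles per scanline); the 160-color-clock standard playfield width and the helper name `pixels_across_playfield` are my own additions for illustration.

```python
PHI2_PERIOD_NS = 558.0          # one machine cycle, per Mapping the Atari
COLOR_CLOCKS_PER_LINE = 228
MACHINE_CYCLES_PER_LINE = 114
HIRES_PIXELS_PER_COLOR_CLOCK = 2     # GR.8 pixels are half a color clock wide
PLAYFIELD_COLOR_CLOCKS = 160         # assumed standard-width playfield

# Color clocks and hi-res pixels that pass during one PHI2 cycle.
color_clocks_per_phi2 = COLOR_CLOCKS_PER_LINE / MACHINE_CYCLES_PER_LINE
hires_pixels_per_phi2 = color_clocks_per_phi2 * HIRES_PIXELS_PER_COLOR_CLOCK

# Time available per update if both PHI2 edges are used (DDR style).
ddr_half_period_ns = PHI2_PERIOD_NS / 2

def pixels_across_playfield(updates_per_phi2):
    """Horizontal pixels available if the output changes N times per PHI2 cycle."""
    phi2_cycles = PLAYFIELD_COLOR_CLOCKS / color_clocks_per_phi2
    return int(phi2_cycles * updates_per_phi2)

if __name__ == "__main__":
    print(color_clocks_per_phi2, hires_pixels_per_phi2, ddr_half_period_ns)
    for n in (1, 2, 4):
        print(n, "updates per PHI2 ->", pixels_across_playfield(n), "pixels")
```

One update per PHI2 cycle matches the 80-pixel GTIA modes, two updates (both edges) give 160 pixels, and four are needed for the 320-pixel hi-res width.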
The LEM project is very fascinating! It's very similar to a trick I am thinking
about as well. I sure hope you folks won't mind if I make something with this
technique too?? I'm trying to work-out/design a coprocessor that can use this
video concept. Will I be stepping on any toes if I also complete my project??
sincerely
_falcon_
#96
Posted Mon Feb 22, 2010 7:46 PM
falcon_, on Mon Feb 22, 2010 6:41 PM, said:
It was mentioned it takes 8x PHI2 accuracy to match current antic resolution.
But I've come up with half that.
ClausB, on Mon Oct 26, 2009 3:59 PM, said:
It occurred to me that we don't need an 8x clock, just 4x. Here's why: Our highest resolution mode sends out 8 pixels per Phi2 cycle. Instead of clocking them out of a shift register 8 times per cycle, we can multiplex them out with 8 states per cycle. We can get 8 states by decoding 3 bits - the 1x, 2x, and 4x clocks. So we just need two clock doublers.
This is exactly how GTIA does its hi-res mode (ANTIC mode F). Those pixels come out at 7.2 MHz but GTIA only has a 3.6 MHz clock. It uses half a cycle to display one pixel and the other half to display the next pixel.
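Here is a small Python simulation of that multiplexing idea, as a sketch only. It assumes the 1x, 2x and 4x clocks are phase-aligned square waves that start low at the beginning of each Phi2 cycle, and that the 1x clock is the most significant select bit; the real phase relationships in the LEM hardware may differ. The names `clock_level`, `decode_states` and `mux_pixels` are invented for the example.

```python
def clock_level(multiple, t):
    """Level (0 or 1) of a square wave at `multiple` x Phi2.

    `t` is time in Phi2 periods (0 <= t < 1). Each wave is assumed to
    start low and be phase-aligned with Phi2.
    """
    return int(t * multiple * 2) % 2

def decode_states(slots=8):
    """Sample the three clocks mid-slot and decode them into a 3-bit state."""
    states = []
    for slot in range(slots):
        t = (slot + 0.5) / slots
        state = (clock_level(1, t) << 2) | (clock_level(2, t) << 1) | clock_level(4, t)
        states.append(state)
    return states

def mux_pixels(pixels):
    """Select one of 8 latched pixel values per state, with no shift register."""
    if len(pixels) != 8:
        raise ValueError("expected exactly 8 pixel values per Phi2 cycle")
    return [pixels[state] for state in decode_states()]

if __name__ == "__main__":
    print(decode_states())
    print(mux_pixels(list("ABCDEFGH")))
```

The three clock levels step through eight distinct states per Phi2 cycle, so the fastest clock needed is 4x rather than 8x, which is the point of using two doublers.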
falcon_, on Mon Feb 22, 2010 6:41 PM, said:
The LEM project is very fascinating! It's very similar to a trick I am thinking
about as well. I sure hope you folks won't mind if I make something with this
technique too??
#97
Posted Tue Feb 23, 2010 9:07 PM
ClausB, on Mon Feb 22, 2010 7:46 PM, said:
falcon_, on Mon Feb 22, 2010 6:41 PM, said:
It was mentioned it takes 8x PHI2 accuracy to match current antic resolution.
But I've come up with half that.
ClausB, on Mon Oct 26, 2009 3:59 PM, said:
It occurred to me that we don't need an 8x clock, just 4x. Here's why: Our highest resolution mode sends out 8 pixels per Phi2 cycle. Instead of clocking them out of a shift register 8 times per cycle, we can multiplex them out with 8 states per cycle. We can get 8 states by decoding 3 bits - the 1x, 2x, and 4x clocks. So we just need two clock doublers.
This is exactly how GTIA does its hi-res mode (ANTIC mode F). Those pixels come out at 7.2 MHz but GTIA only has a 3.6 MHz clock. It uses half a cycle to display one pixel and the other half to display the next pixel.
Oops, missed that I guess.
ClausB, on Mon Feb 22, 2010 7:46 PM, said:
falcon_, on Mon Feb 22, 2010 6:41 PM, said:
The LEM project is very fascinating! It's very similar to a trick I am thinking
about as well. I sure hope you folks won't mind if I make something with this
technique too??
Thanks Claus. I will be posting all about it once I have something in-hand.
I don't want to be a boaster-poster, and want solder to flow, soonest!
_falcon_