Here's my second revision of the board layout. Metalguy66 pointed out that I'd be bumping the keyboard on my 600XL since that front row is so close, so I assembled some spare perfboard, header pins, and a socketed 6502C and tested how it would all fit. This is what I came up with:
Also, I have what I believe are correct GAL equations, though I wouldn't mind if someone wanted to sanity-check them: http://www.flurg.com...d rev 2 GAL.txt
For memory <-> I/O DMA, the controller device only needs to do two things: (1) generate the RAM address to read from or write to, and (2) select the I/O device to write to or read from (with the appropriate device register address selected).
There are some devices which "know" about DMA already and for which you don't need (2), but I think for the most part it's unavoidable. What this means is that, in order to do DMA to an external I/O device, you need to be able to generate two addresses: one for the system RAM and one for the external device. The system RAM address has to come from a counter clocked on phi2's falling edge in order to do full-speed DMA.
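To make the two jobs concrete, here's a rough behavioral sketch in Python. It is not the actual circuit or the GAL logic; the names `DmaAddressCounter`, `dma_ram_to_device`, and `device_register` are made up for illustration, and it assumes one byte moves per CPU clock while the CPU is halted.

```python
class DmaAddressCounter:
    """16-bit RAM address counter, stepped once per phi2 falling edge."""

    def __init__(self, start):
        self.value = start & 0xFFFF

    def phi2_falling_edge(self):
        # One increment per CPU clock is what allows one byte per cycle.
        self.value = (self.value + 1) & 0xFFFF


def dma_ram_to_device(ram, start, count, device_register):
    """Model a RAM -> external device transfer of `count` bytes.

    Returns one (ram_address, device_register, data) tuple per cycle.
    `device_register` stands in for the separate select/register lines
    of a hypothetical external device.
    """
    counter = DmaAddressCounter(start)
    transfers = []
    for _ in range(count):
        ram_address = counter.value   # job (1): address on the system bus
        data = ram[ram_address]       # RAM answers on the data bus
        # job (2): the device is selected by its own lines, not the RAM address
        transfers.append((ram_address, device_register, data))
        counter.phi2_falling_edge()
    return transfers
```

The point of the sketch is that the RAM address and the device selection are independent: the counter owns the system address bus, while the device only ever sees its own select lines and the shared data bus.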
To do DMA to an internal I/O device, you have to keep the device address on the bus for each byte transferred (in many cases you'd just do one at a time), which means the other end of the transfer has to be external memory that doesn't need the system address bus.
One application I've thought of is using FIFO RAM, clocking it at some fixed rate (22 kHz or 15 kHz, whatever), and on every clock doing 4 transfers to load bytes from the FIFO into the POKEY volume registers. The CPU just loads up the FIFO every vblank period or so. In this way you basically have a hardware MOD player. Likewise, you could even have simple hardware waveform generators do this instead of pulling the data from RAM, and thus be able to generate nice instrument sounds. Since it's /HALT driven, if you sync to the hsync rate it might skew a few clocks at first, but eventually it would "fit in" to the spare non-ANTIC cycles on any mode line. You could have a POKEY tone generator which also allows concurrent usage of all POKEY timers... even SIO, probably.
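Here's a small Python model of the FIFO idea, just to show the data flow and how much the CPU would have to load per frame. The AUDC1-AUDC4 addresses and the volume-only bit are the standard POKEY ones; everything else (`sample_tick`, `bytes_per_frame`, the underrun behavior, the 15.7 kHz / 60 Hz figures in the usage note) is my own assumption for illustration, not part of the design above.

```python
from collections import deque

# POKEY audio control registers AUDC1..AUDC4 (low nibble = volume).
AUDC_REGISTERS = (0xD201, 0xD203, 0xD205, 0xD207)
VOLUME_ONLY = 0x10  # AUDCx bit 4: output the volume nibble directly


def sample_tick(fifo, pokey):
    """One sample clock: up to 4 transfers, FIFO -> AUDC1..AUDC4.

    The POKEY register address is what sits on the system address bus;
    the data comes from the FIFO, which needs no system address at all.
    `pokey` is a plain dict standing in for the chip's registers.
    """
    for register in AUDC_REGISTERS:
        if not fifo:
            break  # assumed underrun behavior: leave the channel unchanged
        sample = fifo.popleft()
        pokey[register] = VOLUME_ONLY | (sample & 0x0F)


def bytes_per_frame(sample_rate_hz, frame_rate_hz, channels=4):
    """FIFO bytes the CPU must supply per video frame (rounded down)."""
    return channels * sample_rate_hz // frame_rate_hz
```

For example, at roughly the NTSC hsync rate, `bytes_per_frame(15700, 60)` gives 1046, so the FIFO would need to hold about 1 KB if it is only refilled once per vblank.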
The application I'm really hoping to use this for, though, is networking, i.e. high-speed downloads via a DMA-enabled PBI Ethernet device. Anyway, it seems like there are lots of interesting things you could do with this type of circuit.