4A50 bank-switching spec (updated 9-17)

supercat · September 13, 2005

I've just drawn up a preliminary specification for a new bank-switching method I call $4A50. With luck, carts which implement this specification should be manufacturable at reasonable cost. The design as quoted here should fit within a Xilinx XC9536 chip, though it will be tight. I welcome any comments or suggestions, though I don't know if I'll be able to fit in much beyond what I'm planning. To my knowledge, if this works, it will be the first cart ever to allow the use of all instructions and addressing modes to read and write RAM, even including the read-modify-write instructions.

Old version: CARTFMT.HTM

Somewhat newer version: cartfmt.htm

Latest version: cartfmt.htm

Edited September 20, 2005 by supercat

potatohead · September 13, 2005

Very interesting. I love the hardware aspect of the 2600.

Any chance at allowing program execution from RAM (read / write)? I'm thinking of self-modifying code. I know it's supposed to be a no-no, but it had its merits on the 8-bitters. With all the room this scheme allows, self-modifying code would not be important for game logic stuff, but may well permit exotic kernels.

supercat · September 13, 2005

Dynamically-generated code may be used if the bank that holds the executing code is write-protected while the code is actually being executed from it. Note that it's possible to have the $1000-$10FF bank point to the same physical spot in RAM as part of the $1100-$17FF bank and have the latter write-protected while the former is not. Code execution within writable address space is not possible, however, because there is no direct way to know whether a particular memory operation is a read or a write.

Actually, there would be a way to make things really nice and easy for the programmer, but I don't know if it could be made robust: add some capacitance to the data bus and regard the first half of every RAM access cycle as a read and the second half as a write. If the 6507 attempts to perform a write, it would overdrive the capacitance and change the value. If it attempts a read, the capacitance would hold the value on the bus so it would get written back to RAM unmodified. Would be beautiful if it would actually work, but it might be flaky on some machines.

BTW, one thing I'd like emulators to trap on, if they support 4A50 bank switching, would be a 6507 read when a write is expected, followed by anything other than a write to the same address. Such an action would probably have no ill effect on a real machine "most of the time", but code which does such things could end up working on some 2600s and failing on others. Trapping such code in an emulator would help avoid difficult debugging sessions.
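A minimal sketch of how such a trap might look, written in Python rather than in any emulator's own language. Everything in it is hypothetical: the trace format, the `expects_write` predicate (which addresses count as "write expected" depends on the spec) and the function name are assumptions for illustration, not part of z26 or Stella.

```python
from typing import Callable, Iterable, List, Tuple

# One bus cycle: ("R" or "W", address). Hypothetical trace format.
BusCycle = Tuple[str, int]


def find_suspect_reads(
    trace: Iterable[BusCycle],
    expects_write: Callable[[int], bool],
) -> List[int]:
    """Return trace indices of reads that hit a write-expected address
    and are not immediately followed by a write to the same address."""
    cycles = list(trace)
    suspects = []
    for i, (kind, addr) in enumerate(cycles):
        if kind != "R" or not expects_write(addr):
            continue
        nxt = cycles[i + 1] if i + 1 < len(cycles) else None
        if nxt != ("W", addr):
            suspects.append(i)  # read was not completed by a matching write
    return suspects
```

An emulator would run this check incrementally, one cycle at a time, and drop into its debugger at the first suspect index instead of collecting a list.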

cd-w · September 13, 2005

This looks very interesting. I think I understand most of it, but would it be possible to add some code fragments to illustrate exactly how bank switching and mode selection are performed? I am not sure what the addresses in $0030-$003F are for, or whether they are required. Would it be possible not to use these addresses and allow the BIT/skip to work properly for them? I can't wait to see this in cart form, as it is going to be a lot easier to program writes than on the Supercharger, and the extra RAM should allow for some impressive games to be realised.

Chris

Edited September 13, 2005 by cd-w

supercat · September 14, 2005

The plan for $0030-$003F is to allow stores to them to set the banking mode (as a cycle-saving alternative to using e.g. "CMP $1400,x"). Not quite sure what all I'll be able to fit in the logic, though.

SeaGtGruff · September 14, 2005

I just downloaded the specs, and will comment when I've read and absorbed them. One thing, though-- are you writing any programs using this bankswitching method, and if so, how are you writing them? I'm not really a C programmer, and the only time I've tried to read the code for an emulator was when I was trying to figure out why z26 wasn't automatically detecting that my QFTLPOA WIP uses the M-Network bankswitching method. But how hard would it be to modify z26 or Stella so they'll emulate your bankswitching method? After all, if you can't design a game and get it working first, then you won't have anything to put on a chip. So how would I go about writing and testing a game that uses your bankswitching method? Because frankly, if it will do all that you've mentioned in other posts, then I should be designing QFTLPOA to use it.

Michael Rideout

supercat · September 15, 2005

Quest For The Last Pint Of Amstel?

I don't have any emulator support for this thing yet. I figured that before I bother emulator authors with it I should try to get the specs somewhat solid and then confirm that I'll actually be able to make the hardware work as described.

BTW, z26 uses a few heuristics to detect the most common types of bank switching: if a file is 8448 bytes, it's probably Supercharger; if the first 256 bytes are FF, it's probably safe to assume it's Superchip (if it isn't, such treatment is probably harmless). Beyond that, it has a table of hash values of common games and the bank-switch methods they require. If you wish to use something other than the common bank switching methods with your own game (or even with a hack of a pre-existing game) you must specify the bankswitch type on the command line.
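Those heuristics can be sketched roughly as follows. This is a Python illustration, not z26's actual code: the function name, the scheme labels, the use of CRC-32 as the hash, the empty table and the rule that a command-line override wins are all assumptions.

```python
import zlib
from typing import Dict, Optional

# Hypothetical table mapping an image checksum to a scheme name. The post
# only says z26 keeps "a table of hash values"; CRC-32 is a stand-in here.
KNOWN_IMAGES: Dict[int, str] = {}


def guess_bankswitch(image: bytes, forced: Optional[str] = None,
                     known: Dict[int, str] = KNOWN_IMAGES) -> Optional[str]:
    """Guess a bank-switching scheme for a cartridge image, or None."""
    if forced is not None:              # scheme given on the command line
        return forced
    if len(image) == 8448:              # 8448-byte file: probably Supercharger
        return "supercharger"
    if len(image) >= 256 and image[:256] == b"\xff" * 256:
        return "superchip"              # first 256 bytes are all $FF
    return known.get(zlib.crc32(image))  # otherwise consult the lookup table
```

A new scheme such as 4A50 would fall through all three checks and return `None`, which is why the type has to be given explicitly.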

supercat · September 15, 2005

BTW, I had an idea for an even nicer method of handling RAM writes. It would require adding a 74HCT373 and eight resistors. Were it not for trademark issues, I'd call it the "Nike" method. Amazingly simple, which makes me wonder why nobody's done it (unless it wouldn't work, in which case that'd be a pretty good reason).

supercat · September 15, 2005

What's the best way to track down ever-so-slightly flaky memory access? I have a test program that writes about 32,000 bytes of random data to RAM in random order, then reads it all back, and repeats. It seems to have about one memory failure per megabyte read/written from RAM.
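One common first step with a fault like this is to log every mismatch and look for a pattern in the addresses or data bits. The sketch below models that on a host machine in Python; it is not the test program described above. The `ram` object stands in for whatever interface reaches the real hardware, and both function names are made up.

```python
import random
from typing import List, MutableSequence, Tuple


def ram_test_pass(ram: MutableSequence[int], seed: int) -> List[Tuple[int, int, int]]:
    """Write pseudo-random bytes in a shuffled order, then read them back.

    Returns (address, expected, actual) for every mismatch.
    """
    rng = random.Random(seed)
    order = list(range(len(ram)))
    rng.shuffle(order)                           # random access order
    expected = {addr: rng.randrange(256) for addr in order}
    for addr in order:                           # write phase
        ram[addr] = expected[addr]
    return [(addr, expected[addr], ram[addr])    # read-back phase
            for addr in order if ram[addr] != expected[addr]]


def failing_bits(failures: List[Tuple[int, int, int]]) -> List[int]:
    """Count how often each data bit (0-7) differed across all failures."""
    counts = [0] * 8
    for _addr, want, got in failures:
        diff = want ^ got
        for bit in range(8):
            if diff & (1 << bit):
                counts[bit] += 1
    return counts
```

If the failures cluster on one data bit or one address line, that points at a specific signal; an even spread suggests a timing margin instead.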

supercat · September 17, 2005

In attempting to actually implement this thing, I've found some bad things and some good things:

BAD: I miscounted my available registers, and also ran out of product terms, so I had to cut back a little on some of the functionality.

GOOD: I think I'll be able to make RAM reads and writes work as they "naturally" should, which is to say that if you read an address, you get a read and if you write that address, you get a write [without needing to double up addressing, play goofy cycle-watching games, etc.]

QUESTIONS: Does this look like a nice bunch of functionality to have in a bank-switched RAM-plus cart, or is there anything people would like that isn't there?

The read/write access to 3F was actually something of a nuisance to get to fit on the chip, and I'm not sure I'll be able to keep it. It seems like it should be nice, though.

If I switch from the XC9536XL to the XC9572XL, I could add some more cool stuff, but the parts price would increase by about $1, which would translate into an increase of about $2 or so retail. I don't really want to have two versions of this thing, and I don't want to lose sight of my original goal (make the cheapest possible RAM-plus cart, and include as many goodies as possible without increasing the cost). If I wanted to add another $5 or so to this thing I could really make it 'sing', but that would really start to lose sight of the whole purpose of the project.

BTW, at present I have one latch and one pin left but not a whole lot of routing. Would anyone like a software-controlled blinkenlight?

potatohead · September 17, 2005

Blinkenlight? Yes. Somebody will use the darn light for something, just as they do everything else.

Cost -vs- functionality?

How much would a finished cart be?

Having reads/writes work as they should is a big deal; if that is traded for cost, it's a mistake IMHO.

supercat · September 18, 2005

I think I can get everything in the spec for a component cost, not including PCB, of under $10. The read/write magic should be a freebie, and that one is a big deal because it doubles the amount of simultaneously-available RAM.

Right now the only way to access the top 32K of flash is via the small page. The chip is too cluttered to accommodate access via other means, but being able to select all of the flash pages in any bank might be nice.

Some more interesting features:

-1- Opcode-watcher so that the "BIT abs" instruction could never hit a hitspot, even accidentally.

-2- Two 256-byte freely-locatable pages instead of one.

-3- Ability to configure a page for auto-zap mode: a read from any byte on the page would store a zero there (and read a zero). Useful for clearing out display buffers and such.

-4- Automatic read/modify/write operations for pixel plotting. Plotting a pixel at X,Y would become unbelievably fast and easy:

 cmp $7F00,x
 cmp $7E00,y

Note that that pixel-plotting routine wouldn't even touch the accumulator! Things like line drawing could thus happen much faster than one would normally think possible.

All those ideas seem pretty neat, but they start to get a bit beyond what the 2600 is about, IMHO, which is trying to make do with dirt-cheap hardware. Designing cool hardware to boost the 2600 is an interesting concept, but can easily get out of hand.

SeaGtGruff · September 18, 2005

I think I can get everything in the spec for a component cost, not including PCB, of under $10. The read/write magic should be a freebie, and that one is a big deal because it doubles the amount of simultaneously-available RAM.

That would be way cool!

All those ideas seem pretty neat, but they start to get a bit beyond what the 2600 is about, IMHO, which is trying to make do with dirt-cheap hardware. Designing cool hardware to boost the 2600 is an interesting concept, but can easily get out of hand.

I have to disagree with you. Keeping costs down was certainly a big factor in the original design of the 2600, although the same amount of money today could buy more advanced (faster/fancier/etc.) components, so the 2600's limitations are as much influenced by what sorts of components were *available* at the time as by what those things *cost* back then. However, once a computer or game console is designed and produced, such that its capabilities are more or less set in stone, the general practice in the industry seems to be to find every way possible to boost those capabilities through software tricks or hardware add-ons. Bankswitching is itself a method of boosting the 2600's builtin capabilities, by making more ROM (and/or RAM) available without having to reinvent the 2600. I think the only way you could truly get out of hand in your attempts to boost the 2600 (or other game console or computer) would be if the hardware or peripherals needed to boost the 2600 ended up costing so much that it would be cheaper to just buy a new console that had better capabilities from the get-go. In the industry, when that threshold is reached, the companies usually just come out with a new design anyway, and try to get the public to toss out their older units and buy the newer, spiffier units.

So please, feel free to add whatever cool and fancy boosts you can come up with, as long as the finished product is still affordable enough that homebrewers won't be disinclined to purchase it and use it for developing their games!

Michael Rideout

supercat · September 18, 2005

I have to disagree with you. Keeping costs down was certainly a big factor in the original design of the 2600, although the same amount of money today could buy more advanced (faster/fancier/etc.) components, so the 2600's limitations are as much influenced by what sorts of components were *available* at the time as by what those things *cost* back then.

Oh, better stuff was definitely available, for a price. Including 2K or even 8K (or even 32K for that matter) of RAM would not have been technologically difficult. Nor would it have been hard to use a 6502 and bring all the address and control wires out to the cartridge port. BTW, was the 6507 used in anything other than the 2600? Having the /RDY line but neither /IRQ nor /NMI seems like an odd combination, and even in retrospect I'm not totally sure I understand it. Since the TIA generates the 6502's clock signals, couldn't it have simply stalled the clock on a STA WSYNC, thus freeing up the package pin used by /RDY for another purpose, perhaps another address pin or /IRQ? RIOT interrupts wouldn't be very useful in-kernel, but they would allow games like Chess or Maze Craze to maintain a small 'Working...' display without too much CPU overhead: set an interrupt to hit just before VSync should happen, then do some computing. When the interrupt hits, draw the VSync, set an interrupt for mid-screen, and resume computing. When that interrupt hits, draw "WORKING...", set an interrupt for end-of-frame, and repeat.

Though I guess, thinking about it, that might not quite have worked unless the RIOT got its own non-stalled clock (which it could).

So please, feel free to add whatever cool and fancy boosts you can come up with, as long as the finished product is still affordable enough that homebrewers won't be disinclined to purchase it and use it for developing their games!

Even with simple bank-switch boards, homebrew games have a hard time in the marketplace (I'm sure a lot more would sell at $10 than $20, but nobody'd make any money on them). Would anyone buy an Atari homebrew that cost $50, even if it was really good? I know Thrust+ is $35, and I bought one, but that game is just incredibly awesome. My perception, though, is that cost is very much an object on homebrew carts, and an extra $5 could make a big difference as to how well a cart sells.

Or, to look at it another way, if you as a programmer could use a bit tighter code and get your game to run on a cheaper cart, would you like to have another $5 in your pocket for each cart that sold?

SeaGtGruff · September 19, 2005

The problem with homebrew games in the marketplace may be that there really isn't a very big marketplace for them. I mean, where are homebrew games sold, anyway? From web sites like this one? At expos? Unless there's a way to get the homebrew games into the "real" marketplace-- i.e., stores-- they will always be tough to sell. And yes, even if Atari suddenly started selling "Flashbacks" that had a real cartridge slot, the number of people that would actually buy one, and then buy homebrew games to play on it, might be very small.

The thing is, yes, a homebrew would absolutely cost a little more if the cartridge had more ROM and RAM, but the extra ROM-- and especially the extra RAM-- will make it possible to create more complex games. My "Quest for the Lost Pyramids of Atlantis" game trilogy (if I ever get back to working on the first game) uses the extra 4K RAM from the M-Network method to map the playfield and sprites to their own memory areas, similar to how the Atari 8-bit works, and that alone makes it easier for me to do certain things with the playfield and sprites. Yes, I might be able to create the same screen displays "the old way," but just being able to set up the screen memory and then simply load the data as needed for each new scan line is a godsend. And I only just found out about the Krokodile Cart and its 32K RAM bankswitching method, so I'll probably end up using that instead, since 4K RAM isn't really enough for what I want to do.

If you can get your bankswitching cart up and running, I'd definitely be interested in it. And once a few games get created for it, and people see what kinds of games are possible with it (such as large-scale RPGs), I think they would buy the games (at least, I hope they would!), and at least some of the homebrewers would start designing games on it. For example, it would be really cool if someone could write a good RPG-construction-kit, or some other kind of construction kit, which should be possible with 32K or more RAM.

I also love the possibility of having RAM that can be executed, for self-modifying code. One of the biggest drawbacks on the 2600 is the clock speed, and not having enough time to change things on a scan line. With decent amounts of RAM that can actually hold executable code, we could create kernels that changed things like colors or other graphics settings using immediate-mode commands, rather than zero-page commands or indexed-mode commands, and that would free up some machine cycles so that more could be done in the same amount of time.

Michael Rideout

Bruce Tomlin · September 19, 2005

BTW, was the 6507 used in anything other than the 2600?

I know that it was used in the 810 floppy drive.

+batari · September 19, 2005

I don't have any emulator support for this thing yet. I figured that before I bother emulator authors with it I should try to get the specs somewhat solid and then confirm that I'll actually be able to make the hardware work as described.

You might have to add the support to the emus yourself if you want it done in a timely manner. I don't know about z26, but I think a new scheme could be added to Stella without too much trouble if you know C++ pretty well.

supercat · September 19, 2005

My "Quest for the Lost Pyramids of Atlantis" game trilogy (if I ever get back to working on the first game) uses the extra 4K RAM from the M-Network method to map the playfield and sprites to their own memory areas, similar to how the Atari 8-bit works, and that alone makes it easier for me to do certain things with the playfield and sprites.... And I only just found out about the Krokodile Cart and its 32K RAM bankswitching method, so I'll probably end up using that instead, since 4K RAM isn't really enough for what I want to do.

I thought M-Network was 2K. Is the sizes.txt document on MiniDig incorrect? How does one access all the memory on the cart?

As for 3E bankswitching, IMHO it bites. Sure it gives you access to lots of ROM and RAM, but the top 2K of address space is an unmovable 2K chunk of ROM and the bottom 2K of address space is either a 2K chunk of ROM or 1K of RAM. If there were some way to switch banks in the top portion of the address space, this might be a decent scheme, but without that ability it is severely limited, IMHO, because each RAM bank can only be accessed from code running in either itself, zero page, or the 2K fixed block. Even though one could produce a cartridge with 64K of code, that 2K fixed block would severely bottleneck things.

Basically, my scheme is designed to be an enhanced M-Network with the following improvements:

RAM size is increased to 32K and ROM to 64K (if I could save money by using smaller RAM or flash chips, I would, down to probably 8K RAM and 32K flash).
The fixed bank shrinks to 256 bytes.
The lower 2K memory block can access any 2K of RAM, or any 2K from the first 32K of flash.
A 1.5K upper memory block has the same abilities (except that it only accesses the first 1.5K of a block).
The 256-byte memory page can access any page out of RAM or flash (including the other 32K)
RAM partitions are readable and writable through their whole range, without need for dedicated read/write addresses.
Some special features are included to allow rapid pixel plotting in a Stella-Sketch-style kernel.
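The layout in that list can be modeled as a simple address decoder. This Python sketch is an illustration only: the window addresses ($1000-$17FF, $1800-$1DFF, $1E00-$1EFF, fixed $1F00-$1FFF) are inferred from the sizes above and from the "$1E00 page" mentioned elsewhere in the thread, mapping the fixed bank to the last 256 bytes of flash is an assumption, and the class and field names are invented. Hotspots and mode selection are deliberately left out.

```python
from dataclasses import dataclass

RAM_SIZE = 32 * 1024      # 32K RAM
FLASH_SIZE = 64 * 1024    # 64K flash ("ROM")


@dataclass
class Slice:
    in_ram: bool = False   # True: window maps RAM, False: maps flash
    base: int = 0          # byte offset of the mapped block within that chip


class Mapper4A50Sketch:
    """Address decoding only; no hotspots, no write protection."""

    def __init__(self) -> None:
        self.lower = Slice()   # 2K window
        self.upper = Slice()   # 1.5K window
        self.page = Slice()    # 256-byte window

    def decode(self, addr: int):
        """Map a cartridge-space address to ("ram" | "flash", offset)."""
        a = addr & 0x0FFF                     # offset inside the 4K cart area
        if a < 0x0800:
            s, off = self.lower, a
        elif a < 0x0E00:
            s, off = self.upper, a - 0x0800
        elif a < 0x0F00:
            s, off = self.page, a - 0x0E00
        else:                                 # fixed 256-byte bank (assumed: end of flash)
            return ("flash", FLASH_SIZE - 0x100 + (a - 0x0F00))
        return ("ram" if s.in_ram else "flash", s.base + off)
```

The sketch does not enforce the restriction that the two larger windows reach only the first 32K of flash; a fuller model would check `base` when a window is reprogrammed.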

BTW, why is it that it seems bank-switching techniques are never designed by programmers? 3E could have been made a lot nicer with some slight changes but it's too late for that now. The segmentation scheme on Intel's 8x86 may be the subject of many jokes, but it's actually a lot better than many new designs that try to increase the addressing range of 8-16 bit micros.

vdub_bobby · September 19, 2005

I'm following this discussion with some interest, but it is somewhat over my head - I'll second the call for some sample code.

SeaGtGruff · September 20, 2005

I thought M-Network was 2K. Is the sizes.txt document on MiniDig incorrect? How does one access all the memory on the cart?

You're right, it is 2K RAM-- 1K that can be selected as the lower 2K of ROM space, and four 256-byte chunks that can be selected as the lower 512 bytes of the upper 2K ROM area. It's been a while since I've touched anything related to my QFTLPOA game, so I guess I forgot how small the M-Network scheme really is.

As for 3E bankswitching, IMHO it bites. Sure it gives you access to lots of ROM and RAM, but the top 2K of address space is an unmovable 2K chunk of ROM and the bottom 2K of address space is either a 2K chunk of ROM or 1K of RAM. If there were some way to switch banks in the top portion of the address space, this might be a decent scheme, but without that ability it is severely limited, IMHO, because each RAM bank can only be accessed from code running in either itself, zero page, or the 2K fixed block. Even though one could produce a cartridge with 64K of code, that 2K fixed block would severely bottleneck things.

The limitation you described isn't too bad, if you design the game for it. Obviously, any routine that needs to access RAM will need to be in the fixed upper 2K ROM. It sounds a lot like the M-Network scheme, except M-Network also lets you access one of four 256-byte RAM blocks at the first 512 bytes of the upper 2K ROM area. I came up with an easy way for one bank to call a subroutine in another bank, but it's still necessary to keep anything that accesses two banks at the same time in the upper 2K area. I'm putting as much as I can in the lower 2K banks, then jumping to the upper 2K for routines that need to swap banks around. For example, to do the title screens (which I posted screenshots of a few weeks ago), I read the data from a 2K ROM bank and decompress it into the 1K RAM bank, then read the RAM in the kernel for the title. It sounds like a roundabout way to do it, and it would be nice if there was enough ROM so I could just read the ROM data without having to decompress it, so that's where having lots of extra ROM and RAM comes in handy, by letting us do things the "nice and easy" way instead of having to find all sorts of tricks and such to squeeze more out of less.

Basically, my scheme is designed to be an enhanced M-Network with the following improvements:

RAM size is increased to 32K and ROM to 64K (if I could save money by using smaller RAM or flash chips, I would, down to probably 8K RAM and 32K flash).

The fixed bank shrinks to 256 bytes.

The lower 2K memory block can access any 2K of RAM, or any 2K from the first 32K of flash.

A 1.5K upper memory block has the same abilities (except that it only accesses the first 1.5K of a block).

The 256-byte memory page can access any page out of RAM or flash (including the other 32K)

RAM partitions are readable and writable through their whole range, without need for dedicated read/write addresses.

Some special features are included to allow rapid pixel plotting in a Stella-Sketch-style kernel.

BTW, why is it that it seems bank-switching techniques are never designed by programmers? 3E could have been made a lot nicer with some slight changes but it's too late for that now. The segmentation scheme on Intel's 8x86 may be the subject of many jokes, but it's actually a lot better than many new designs that try to increase the addressing range of 8-16 bit micros.

That all sounds great, except I'm not so sure about the fixed area being only 256 bytes. That will probably be okay, though-- it just sounds small!

Michael Rideout

supercat · September 20, 2005

The limitation you described isn't too bad, if you design the game for it. Obviously, any routine that needs to access RAM will need to be in the fixed upper 2K ROM. It sounds a lot like the M-Network scheme, except M-Network also lets you access one of four 256-byte RAM blocks at the first 512 bytes of the upper 2K ROM area.

It's possible to design some games so they'll work well within the 3E limitations, but it often enforces a rigid segmentation upon things that's really icky.

For example, suppose you want to produce a game with a single-line-resolution (asymmetric) 32-pixel playfield and two 1LK colored players, plus two missiles and the Ball. Given enough readily-accessible RAM, such a kernel would actually be pretty easy:

 ldy #num_scan_lines-1
lp: ; Assume we're 31 cycles before the end of the scan line
 lda playshp0,y ; T-27
 sta GRP0       ; T-24 (assume vertical delay)
 ldx playshp1,y ; T-20
 lda pfcol2,y   ; T-16
 sta PF2        ; T-13
 lda pfcol0,y   ; T-9
 sta PF1        ; T-6
 lda plcol0,y   ; T-2
 stx GRP1       ; T+1
 sta COLUP0     ; T+4
 asl            ; T+6
 sta ENAM0      ; T+9
 lda plcol1,y   ; T+13
 sta COLUP1     ; T+16
 asl            ; T+18
 sta ENAM1      ; T+21
 lda pfcolors,y ; T+25
 sta COLUPF     ; T+28
 asl            ; T+30
 sta ENABL      ; T+33
; Now waste 7 cycles (end on T+40)
 dey            ; T+42
 bpl lp         ; T+45

Much more stuff than could normally be gotten into a 1LK, and with seven cycles to spare even (if I counted right). But there are nine bytes' worth of data that need to be fetched for each of those scan lines--where is all that data going to come from?
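Counts like these are easy to get off by one, so here is a small Python tally that can be used to double-check the loop. It assumes the standard 6502 timings for the addressing modes used (no page-boundary crossings on the indexed loads, branch taken) and a 76-cycle scan line; the op labels and function name are just for this sketch.

```python
# Cycle costs for the addressing modes used in the kernel, assuming no
# page-boundary crossings on the indexed loads and a taken branch.
CYCLES = {
    "lda abs,y": 4, "ldx abs,y": 4,
    "sta zp": 3, "stx zp": 3,
    "asl a": 2, "dey": 2, "bpl taken": 3,
}

KERNEL = (
    ["lda abs,y", "sta zp", "ldx abs,y"]                      # GRP0 store, preload player 1
    + ["lda abs,y", "sta zp"] * 2                             # PF2, PF1
    + ["lda abs,y", "stx zp", "sta zp", "asl a", "sta zp"]    # player 1 shape, COLUP0, ENAM0
    + ["lda abs,y", "sta zp", "asl a", "sta zp"] * 2          # COLUP1/ENAM1, COLUPF/ENABL
    + ["dey", "bpl taken"]                                    # loop control
)

SCANLINE = 76  # CPU cycles per scan line on the 2600


def spare_cycles(ops=KERNEL, budget=SCANLINE):
    """Cycles left over after one pass through the loop."""
    return budget - sum(CYCLES[op] for op in ops)
```

`spare_cycles()` gives the amount of padding the loop needs so that one pass takes exactly one scan line.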

From RAM, of course, but how is one going to access it? Fortunately, 4A50 bank switching provides a way. Just set the lower memory block to point to a 2K section of memory and the upper block to point to another 256 bytes, and that will provide nine pages of RAM readily available for easy and immediate access. If one wanted to go beyond that, one could do so by using the Stella helper bank addresses. Just replace one of the stores to a Stella register with a store to some magic number plus the Stella register, and in exchange for one extra cycle (using absolute instead of zero-page addressing mode) you get an almost-free bank switch.

It would be possible to do something like this in a 3E kernel, except that since each RAM page is only 1024 bytes there's no way to hold an entire screen. One would thus have to subdivide the screen into horizontal stripes. This works fine for some games like Boulderdash, but would be a real pain in others.

Further, how do you put the data into the screen? Under 4A50 bankswitching, you can easily have a bank of RAM for the screen data, a page of ROM (or RAM) for the character shape, and a bank of ROM (or RAM) for the executing code, all accessible at once. If the code is thrown into RAM, using 2:1 loop unrolling would yield:

 ldx #size/2 -1
lp:
 lda src,x
 sta dest,x
 lda src+size/2,x
 sta dest+size/2,x
 dex
 bpl lp

Twenty-three cycles for every two bytes processed, or 11.5 cycles/byte. Assuming a routine in RAM, the best one could do under 3E bank switching (if the shapes were not in the same bank as the screen buffer) would be:

 ldx #size/2 -1
lp:
 ldy #src_bank
 sty $3E ; or $3F
 lda src,x
 ldy #dest_bank
 sty $3E
 sta dest,x
 ldy #src_bank
 sty $3E ; or $3F
 lda src+size/2,x
 ldy #dest_bank
 sty $3E
 sta dest+size/2,x
 dex
 bpl lp

An extra twenty cycles per loop (ten per byte)--an increase of nearly 90% over the 23-cycle loop. Plus, this code has to use up space in either zero-page RAM or else in the RAM bank with the display buffers. I should mention, btw, that M-Network bankswitching has an edge here:

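 ldx #size/2 -1
lp:
 bit SRC_BANK_HOTSPOT  ; 4, map in the source bank
 lda src,x
 ldy src+size/2,x
 bit DEST_BANK_HOTSPOT ; 4, map in the destination bank
 sta dest,x
 sty dest+size/2,x     ; caveat: the 6502 has no STY abs,X, so real code needs another way to do this store
 dex
 bpl lp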

Here the penalty is only eight cycles per loop, or 4 per byte--less than half the penalty imposed by 3E.
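As a sanity check on those cycle counts, here is a quick Python tally. It is only a sketch: the per-instruction costs are the standard NMOS 6502 timings with no page crossings and the loop branch taken, and the operation lists are my own summary of the loops above rather than anything generated from the assembly.

```python
# Standard NMOS 6502 cycle costs for the addressing modes used in the copy loops
# (assumes no page crossings; the loop branch is counted as taken).
CYCLES = {
    "lda abs,x": 4,
    "sta abs,x": 5,
    "dex": 2,
    "bpl taken": 3,
    "ldy #imm": 2,
    "sty zp": 3,
    "bit abs": 4,
}

BYTES_PER_PASS = 2  # 2:1 unrolling copies two bytes per pass through the loop


def total(ops):
    """Sum the cycle cost of a list of operations."""
    return sum(CYCLES[op] for op in ops)


# Plain copy loop, as usable when source and destination are visible at once.
base_loop = ["lda abs,x", "sta abs,x"] * 2 + ["dex", "bpl taken"]

# Extra bank-switching work per pass under the other two schemes.
penalty_3e = ["ldy #imm", "sty zp"] * 4  # four bank selects per pass
penalty_mnet = ["bit abs"] * 2           # two hotspot reads per pass

base = total(base_loop)
for name, extra in (("4A50", []), ("3E", penalty_3e), ("M-Network", penalty_mnet)):
    extra_cycles = total(extra)
    cycles = base + extra_cycles
    print(f"{name}: {cycles} cycles/pass, {cycles / BYTES_PER_PASS} cycles/byte, "
          f"+{100 * extra_cycles / base:.0f}% over the base loop")
```

Under these assumptions the tally comes to 23, 43 and 31 cycles per pass for 4A50, 3E and M-Network respectively.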

That all sounds great, except I'm not so sure about the fixed area being only 256 bytes. That will probably be okay, though-- it just sounds small!

Given a choice between having 2K fixed and 2K bankable, versus having half as many 4K banks, the former might be better. But IMHO a section which is freely bankable independent of other sections is nicer than a fixed section almost any day of the week. The only objection to such a section would be the possibility of an unknown state on startup. Although one could achieve the same effect as a fixed bank by simply copying the same data into every bank, the fact that all the ROM can be accessed via the $1E00 page means that it could be useful to have contiguous areas of ROM many Kbytes long. Having to stick in a reset vector every 2K would seem rather annoying. Putting in the fixed bank for the last 256 bytes avoids this issue.

supercat · September 20, 2005

Added a few little code samples and changed a few hotspots.

cd-w · September 21, 2005

Thanks - this is a great help to the hardware-clueless like myself. I'm really looking forward to coding for this.

Chris

Paul Slocum · September 21, 2005

Personally I prefer reduced cost to extra features. It already does a lot of awesome stuff.

Would it be possible to include the ability to use some sort of EEPROM or flash memory for the ROM, and to program it using a serial cable that hooks up to 3 pins on the board? This would probably be worth it even if it adds $1-$2 to the parts cost, because programming and soldering on each EPROM is time-consuming, so it saves money in the long run.

Using this method, all the parts could be wave soldered at once, and it could be quickly programmed later with a special serial cable and software.

It also makes it useful as a dev tool, since you just have to build a special serial cable to program it, which is much cheaper than an EPROM burner and eraser.

-paul

+batari · September 21, 2005

You might have to add the support to the emus yourself if you want it done in a timely manner. I don't know about z26, but I think a new scheme could be added to Stella without too much trouble if you know C++ pretty well.

Based on recent posts on [stella], it looks like I may need to eat my words... The emu authors are very willing to add this scheme to emus, which is very good news.

Edited September 21, 2005 by batari
