karri, on Wed Sep 23, 2009 11:25 PM, said:
The Lynx cart is really a streaming device. There is no random access available. So I decided to allow just one file open ata time. Opening a new file automatically closes the old one.
Comments?
--
Karri
So, since the Lynx cart is a streaming device, why use the traditional "random" access fopen/fseek/fread/fwrite/fclose type of interface? I realize that it probably makes it easier to keep a paradigm programmer's are used to, but it seems to me like trying to juggle having only one file open would get cumbersome.
Let me add some more thoughts to this discussion. In my day job, I used to program/maintain the low level I/O code for a video game engine used in published games for the PC, PS2, PS3, XBox, and XBox 360. The engine was used in "big world" type games where only a small subset of the level data was in RAM at any given time and data was streamed in off of the DVD as needed. (The other design philosophy used by some games it to force all data for a level to fit into RAM and do a single load per level.)
It really helps to think of the data on the Lynx cart as a database. Sure there is an encrypted loader and an executable down at the beginning of the cart memory but the rest of it is a database of game data. Now that we're thinking of the cart as a big "database", all game "data units" (e.g. sprites, bitmaps, audio, scripts, etc) in this database can be fully described by a memory location (it's location on the cart) and a length (number of bytes).
Typically in a streaming game engine, each data unit has a unique ID assigned to it so that the ID can be used to refer to the asset. For instance if you have a text rendering engine that uses a bitmap of characters, the text engine would typically take an asset ID that refers to the character bitmap. This allows the same text rendering engine to be "data driven" in that it can be configured to use different character bitmaps. The point I'm trying to make is that having a unique ID per data unit is usually much easier to work with than to try to think of things in terms of files that you have to fopen, fseek, fread, fclose.
Once you commit to using unique ID's per data unit, you no longer need files. All you need is an index table that translates a data unit ID into the starting address and length data for the data unit. Typically, you would have a load queue that is full of load requests. Each load request consists of a data unit ID and a destination memory location and possibly a callback function pointer to call when the data is loaded. Then in the game run loop, a small portion of each frame is spent loading data. Since we know how many cycles it takes to do loads and our cart load state is preserved between across frames, we can easily limit the amount of time the loader takes each frame.
This design creates an asynchronous loading system. This is a good thing but can make your coding a little more tricky. For instance, let's say your game starts off by scrolling some text up the screen to tell a short back story. Your intro logic initializes the text rendering engine with the data unit ID of the character bitmap you want to use and when you call into the text rendering engine's render code it has to be able to "dereference" the data unit ID into a pointer to the character bitmap so that it can calculate the memory location of each character it is blit'ing to the screen. To dereference a data unit ID into a pointer, you pass the data unit ID to the loader. The loader checks to see if the data unit is already loaded, if it is, it immediately returns the pointer. If the data unit isn't loaded, it creates a load request, puts it in the load queue, and returns a value that mean "NOT YET" to the text rendering engine. The text rendering engine then needs to see that it doesn't have it's required data so it returns back out the run loop until next frame.
Now the run loop will call the loader which will take the load request, set up the upper address bit hardware and then spins doing dummy writes to the cart I/O register to get the bottom bits counter to the correct value before reading the data in and writing it to a free RAM location. Now the trick here is to figure out what the cycle budget is for the loader is so that you can "pause" a load and return to the run loop when the loader has used up its time slice. It's a primitive "cooperative multitasking" method. The next time through the run loop will call back into the loader which will continue it's load. This time sliced loading continues until the data unit is fully loaded. After that, the next call to the loader to dereference the data unit ID will return the address of the data unit in RAM. The text rendering engine then can proceed to blit the charaters to the screen buffer.
There are several cool tricks you can do with a streaming loader system like this:
1. You can partition the RAM into buffers for different types of data units. For instance, you can have a buffer for sounds, a buffer for bitmaps, a buffer for scripts, etc. Then if you store a monotonically incrementing number (think of it as a time stamp) associated with each data unit in a buffer, you can do least recently used data unit flushing when you need to free up memory in the buffer. For instance, let's say the text rendering engine was trying to load a new character bitmap. The loader sees that the bitmap buffer is full of loaded data units so it goes through the list of data unit timestamps and "flushes" the oldest one. It repeats the flushes until it has enough room for the new data unit to be loaded and then loads it into the buffer. Now this can be tricky because you can get into "thrashing" situations if you're not careful. Let's say your bitmap buffer only has room for two character bitmaps but you're level file requires text from three different character bitmaps. What will happen is that every time through the run loop, the character rendering engine will try to load three data units causing one of them to be flushed each frame. This thrashing will probably mean that one of the three character bitmaps will not be available every frame. It is likely to cause blinking text.
2. The buffers for data units can be partitioned by type or by region or however you think it is appropriate. The buffer sizes themselves can also be data driven. For instance, if you know that you need lots of sound effects in memory but fewer sprites, say during a boss battle, you could have your level definition file set the sound buffer bigger and the sprite buffer smaller.
3. The data unit ID system lets you organize data units so that they take fewer cycles to load. You could even put data units together that are logically associated. For instance, we almost alway need to load a palette with a sprite bitmap. Packing a sprite bitmap with its palettes probably makes sense so that you can load all of them at once.
4. The loader could be given some simple heuristics to scan its load queue and group together data units that are in the same 2048 byte segment on the cart. That way, it won't have to set the upper address bits and go into the offset loop to get the bottom 11-bit counter to the right offset. Instead, it just reads the first data unit, spins to skip the junk in between and then reads in the second data unit.
There are some details that I should lay out:
1. You will need an index data structure that maps data unit ID's to the cart memory locations and lengths. Let's assume you're using one of Karri's prototype carts that uses an additional 8-bit latch for address bits 19-27 giving you a 128 MB (65536 * 2048 Bytes) cart. A straightforward approach to indexing would mean that each index entry would have 3 bytes for the cart address and probably 2 bytes for the length. You could fit about 400 index records in a single 2048 byte segment. Now if your data units average 512 bytes in size, you could fit 262144 data units on a 128 MB cart. You would need 1.3 MB just to store the entire index. So clearly, some kind of hierarchical indexing scheme needs to be cooked up. That I leave up to you, but let me give you a hint: don't be afraid to have multiple copies of data units on the cart. You're going to have 128 MB to play with, having half a dozen copies of some common data units won't hurt anything. Also, think of how you could make assumptions about address bits. You could have a zero page byte that stores the upper 8 bits of cart addresses. Then you only need two bytes for address data per data unit. Each level would then store all of it's data units in a 512 KB block on the cart. When you make a level transition, you load initialize that variable with the 8-bits for the 512KB block for that level's data. Then you would load in the index from that 512 KB block and go with your level. All data unit ID translations then use that level's index and translate to addresses into that 512 KB block.
2. There will be times when your game might cause a brief period (a couple of frames) of thrashing that is harmless to the game play but might result in flushing data units that you absolutely don't want flushed (e.g. the boss a.i. script). You will want to have a locking mechanism on data units that overrides the LRU flush heuristic. You will want to use it sparingly, but I guarantee you will use it.
3. Instead of trying to make data driven A.I. or level logic where your game interprets some script, why not just code it up in C/asm, compile it, and then store it as a data unit that can be loaded when needed. The thing you have to worry about are hard coded branch destinations in the loadable executable data. The reason is that the chunk of executable could be loaded anywhere in memory and if a branch destination is not relative, then it likely branch of into nowhere and crash. You need to write "relocatable" code where all branches are relative to the program counter or are calls into permanent, non-moving engine code.
So Karri, writing a normal file interface may be good for casual programmers but if we want to really take full advantage of the new large carts, I think we need a generic way to package up assets as data units that can be streamed in as needed and an example streaming loader that people can use in their games. It would achieve the goal of abstracting the cart away from the programmer and be a better fit to the streaming cart interface.
--Wookie
Edited by Wookie, Thu Nov 12, 2009 5:54 PM.