thorfdbg Posted January 16, 2013

Something still isn't quite right here: there are two questions I currently cannot answer, or for which I only have a guess.

First, if ANTIC asserts NMI, when does it drop it? My current answer is: at the beginning of the next line, but that doesn't seem to be quite right. It probably doesn't matter anyhow, as the CPU reacts only to the edge in the first place, though NMI must be cleared early enough to allow ANTIC to generate a second edge.

Second, the acid test currently states that the ANTIC instruction word remains in the chip and continues to generate the display and the DLI, if bit 7 is set, even around a VBI. Of course, the critical case is when display list DMA is off. However, I do have one game here where this logic breaks something serious. The game disables display list DMA, but right at a time when ANTIC was fetching a blank line + DLI instruction, and thus it keeps generating DLIs all over. This, yet again, causes many DLIs to pile up and finally overflow the stack. However, what this program also does is write into the ANTIC display list pointers, even though display list DMA stays off. Could it be that this clears the instruction register?

phaeron (Author) Posted January 17, 2013

> First, if ANTIC asserts NMI, when does it drop it? My current answer is: at the beginning of the next line, but that doesn't seem to be quite right. It probably doesn't matter anyhow, as the CPU reacts only to the edge in the first place, though NMI must be cleared early enough to allow ANTIC to generate a second edge.

Some guys put a scope on the NMI line a while back and determined that ANTIC asserts it for two machine cycles. This, combined with a bug in the 6502, is what causes the ignored NMI problem. Two cycles is the minimum requirement in the MOS spec, which Atari made the mistake of believing. However, since the NMI is edge-triggered and ANTIC always asserts it at the same place in the scanline, where it is deasserted is not very important.

Note that this is not the same as when the DLI and VBI bits change in the NMIST register. DLI and VBI are set around the time that the NMI is asserted, but they don't get reset similarly. DLI only resets when VBI goes active or vice versa, or both get reset when NMIRES is strobed. If only one of them activates then that interrupt status bit never gets automatically reset.

> Second, the acid test currently states that the ANTIC instruction word remains in the chip and continues to generate the display and the DLI, if bit 7 is set, even around a VBI. Of course, the critical case is when display list DMA is off.

No, DLIs never happen in vertical blank -- at scanline 248, the DLIs will stop and the VBI will fire, and at scanline 8 the display list processing will resume. With only a couple of exceptions, stopping display list DMA is the same as repeating the last instruction over and over in a display list with display list DMA enabled. You can't get DLIs in vertical blank with a display list even if your display list is longer than can fit in scanlines 8-247.

This behavior is also why a $C1 opcode fires DLIs repeatedly: a JVB opcode is essentially a repeating jump opcode that disables display list DMA until vertical blank. Since it's only one scanline tall, it fires a DLI every scanline. Race in Space depends on this in its title screen.

> However, I do have one game here where this logic breaks something serious. The game disables display list DMA, but right at a time when ANTIC was fetching a blank line + DLI instruction, and thus it keeps generating DLIs all over. This, yet again, causes many DLIs to pile up and finally overflow the stack.

This is correct, and in fact was a bug we hit with His Dark Majesty 1.0: when you hit a button to end the opening animation, it stopped display list DMA without turning off DLIs, and sometimes this caused an NMI recursion lockup as you describe. We hit this first in emulation and confirmed that it happened on real hardware later.

> However, what this program also does is write into the ANTIC display list pointers, even though display list DMA stays off. Could it be that this clears the instruction register?

No. The DLI bit of the instruction register is never cleared by anything other than a DMA load. You can confirm this in the Ir7 output of the Instruction Register section of ijor's ANTIC re-schematic.

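The NMIST behavior described in the post above can be sketched as a tiny model. This is only an illustration, not code from Altirra or Atari++: the class and method names are made up, the 400/800 System Reset bit is left out, and the assumption that unused bits read back as 1 is taken from the "NMIST=$1F" value mentioned later in the thread.

```python
DLI_BIT = 0x80       # NMIST bit 7: display list interrupt
VBI_BIT = 0x40       # NMIST bit 6: vertical blank interrupt
UNUSED_BITS = 0x1F   # assumption: the unused low bits read back as 1


class NmiStatus:
    """Hypothetical model of the DLI/VBI status bits only."""

    def __init__(self):
        self.flags = 0

    def dli(self):
        # A DLI sets its own bit and resets the VBI bit.
        self.flags = DLI_BIT

    def vbi(self):
        # A VBI sets its own bit and resets the DLI bit.
        self.flags = VBI_BIT

    def strobe_nmires(self):
        # Any write to NMIRES resets both bits.
        self.flags = 0

    def read(self):
        return self.flags | UNUSED_BITS
```

With this model, a program that only ever gets DLIs sees the DLI bit stay set until it strobes NMIRES, and an NMI handler that runs right after an NMIRES strobe reads $1F, meaning no source is tagged.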
thorfdbg Posted January 17, 2013

> This is correct, and in fact was a bug we hit with His Dark Majesty 1.0: when you hit a button to end the opening animation, it stopped display list DMA without turning off DLIs, and sometimes this caused an NMI recursion lockup as you describe. We hit this first in emulation and confirmed that it happened on real hardware later.

> No. The DLI bit of the instruction register is never cleared by anything other than a DMA load. You can confirm this in the Ir7 output of the Instruction Register section of ijor's ANTIC re-schematic.

I checked here. The trouble is that the emulation currently samples the console keys only in the vertical blank. Unfortunately, this creates a timing situation where the display list gets interrupted at exactly the wrong place and breaks the program. )-: It's a matter of bad luck.

fox Posted February 20, 2013 (edited February 20, 2013)

> I do not know whether this works for missiles - I haven't tried. But it would be quite a bit more useful indeed (especially as missiles mix with Gr.9 and Gr.11 as player 4, regular players don't). Here is the code - it should hopefully work, I reconstructed it from memory (no disk drive for the Atari right now):
>
>         .OPT OBJ
>         *= $6000
>         LDA #$7C
>         STA $D000       ; HPOSP0: horizontal position of player 0
>         LDA #$FF
>         STA $D00D       ; GRAFP0: all eight player bits set
>         LDA #$34
>         STA $02C0       ; PCOLR0: player 0 color shadow
>         LDA #$00
>         STA $022F       ; SDMCTL: turn off screen DMA
>         LDY #$02        ; size code 2, the alternate 1x width
>     LOOP: LDA #$03      ; size code 3, 4x width
>         STA $D40A       ; WSYNC: wait for the next scanline
>         STA $D008       ; SIZEP0 = 4x
>         LDX #$0B        ; short delay loop
>     WT:
>         DEX
>         BNE WT
>         STY $D008       ; SIZEP0 = 2 in the middle of the player
>         JMP LOOP
>
> As said, it is pretty obviously just switching from 4x to 1x, but with 1x expressed by the value 2.

Thank you! On my 65 XE this is off by one cycle. It works if I, for example, replace $7C with $7A. Can someone verify that, please? It also works for missiles. The effect doesn't work in the latest Altirra or Atari++. Any chance of emulating it?

thorfdbg Posted February 21, 2013

> Thank you! On my 65 XE this is off by one cycle. It works if I, for example, replace $7C with $7A. Can someone verify that, please?

My 8-bit is currently out of reach, sorry. It will be available again in three weeks or so.

> The effect doesn't work in the latest Altirra or Atari++. Any chance of emulating it?

Chances are better than even, but I'm currently busy working on ATX format support, which only half works in my current beta.

drac030 Posted February 21, 2013

Speaking of Acid800, I have an observation to share: my 130XE, after a few hours of continuous running, stops passing the "CPU: Illegal instructions" test in Acid800. What happens is that the test runs for about 2 seconds, then Acid800 crashes. As far as I can tell, everything else (SDX + utilities, BASIC, TBXL, MAE, LW, Bomb Jack, Numen etc.) on this computer works with the usual stability and reliability. When the computer is cold, the test passes without any problems.

phaeron (Author) Posted February 22, 2013

Uuu, that's not good. Perhaps one of the instructions is acting differently after warmup and is corrupting memory. Could you try the attached version, which will scribble the current opcode at the top-left corner of the screen? If it's consistently a specific opcode that's causing the blowup then we should be able to identify it and then blacklist it.

Attachment: acid800-illopc.zip

thorfdbg Posted February 22, 2013 (edited February 22, 2013)

> Speaking of Acid800, I have an observation to share: my 130XE, after a few hours of continuous running, stops passing the "CPU: Illegal instructions" test in Acid800. What happens is that the test runs for about 2 seconds, then Acid800 crashes. As far as I can tell, everything else (SDX + utilities, BASIC, TBXL, MAE, LW, Bomb Jack, Numen etc.) on this computer works with the usual stability and reliability.

Please see above for a couple of extra instructions my 8-bit doesn't perform "as it is believed". There are quite a few extra instructions that are unstable and that you probably shouldn't depend upon. Maybe you will find similar problems on your system. Unfortunately, I currently have no access to my 8-bit, nor do I have a SIO2PC adapter for it. Note that sane programs (such as Basic, TBXL) do not use such instructions, so the fact that these programs work as advertised only says that your system works fine.

thorfdbg Posted February 24, 2013

> Thank you! On my 65 XE this is off by one cycle. It works if I, for example, replace $7C with $7A. Can someone verify that, please? It also works for missiles. The effect doesn't work in the latest Altirra or Atari++. Any chance of emulating it?

Added emulation. It's pretty experimental at this time, though. It requires more experiments on real hardware to check the precise conditions.

phaeron (Author) Posted February 25, 2013

Bleh... okay, here's what I've been able to dig up:

First, the specific cause. The player/missile width is controlled by a two-bit counter that causes a shift every cycle that it is in the 00 state. Bits 0 and 1 of SIZEPx or SIZEM enable bits 0 or 1 of this counter, thus giving rise to the following patterns:

    SIZE=00 (1x width): xx -> 00
    SIZE=01 (2x width): x0 -> 01, x1 -> 00
    SIZE=11 (4x width): 00 -> 01, 01 -> 10, 10 -> 11, 11 -> 00

So far, so good. However:

    SIZE=10 (broken 1x width): 00/11 -> 00, 01/10 -> 10

This mode causes the state machine to eventually lock up in the 10 state. This means that 50% of the time if you switch a sprite from size 3 to size 2 -- the 01 and 10 states -- it will jam on the last output bit value and stop shifting. Normally, this doesn't cause a problem because the shift register is already empty, but if you do it in the middle of the sprite you can get a '1' bit. What this means is that if you are just doing normal sprite programming, you should always prefer the 00 code for normal width instead of 10.

As for the specific timing, I've been able to tighten up the repro case to a single machine clock (two color clocks):

    mva #$80 grafp0
    mva #223 hposp0     ;223 or 224 works here
    mva #0 sizep0
    lda #$03
    sta wsync
    sta wsync
    sta sizep0          ;*, 105, 106, 107
    sty sizep0          ;108, 109, 110, 111

Now for the bad news. It turns out that while GTIA only records collisions during non-blank regions, it runs the sprite comparison and shift logic all the time, including horizontal and vertical blank. Again, this typically isn't an issue because for off-screen sprites the shifters empty out before you can see anything. However, this is pertinent when this bug is triggered and the horizontal position is then set >=228. Since the horizontal counter only counts 0-227, it will never match a sprite whose HPOS is set >=228, and if this bug has been triggered it means that the sprite will be continually stuck outputting a '1' bit without the need to touch it ever again. I'm not sure if there's a use for a sprite that covers the entire screen, but I suppose someone could think of some way to abuse it.

This means that in order to emulate this behavior properly an emulator has to run the sprite shift logic all the time and not just in the display regions.

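The size counter patterns listed in the post above can be written down as a small next-state table. This is a rough sketch rather than code from any emulator: the names `NEXT`, `step` and `shifts_per` are invented for the example, and it assumes the shift happens in the same cycle in which the counter reads 00.

```python
# NEXT[size][state] -> next counter state, copied from the patterns in the post.
NEXT = {
    0b00: (0b00, 0b00, 0b00, 0b00),  # 1x: every state goes to 00
    0b01: (0b01, 0b00, 0b01, 0b00),  # 2x: x0 -> 01, x1 -> 00
    0b11: (0b01, 0b10, 0b11, 0b00),  # 4x: plain two-bit count
    0b10: (0b00, 0b10, 0b10, 0b00),  # broken 1x: 00/11 -> 00, 01/10 -> 10
}


def step(state, size):
    """Advance the two-bit counter by one cycle for the given SIZE code."""
    return NEXT[size][state]


def shifts_per(cycles, state, size):
    """Return (number of shifts, final state) over `cycles` cycles.

    Assumption: the shift register moves in every cycle whose counter
    state is 00, as described in the post.
    """
    count = 0
    for _ in range(cycles):
        if state == 0b00:
            count += 1
        state = step(state, size)
    return count, state
```

Running eight cycles from state 00 gives 8, 4 and 2 shifts for sizes 00, 01 and 11. With size 10, a start in state 01 or 10 gives no shifts at all and the counter stays parked in 10, which is the lockup; a start in 00 or 11 behaves like normal 1x width. Writing a different size code lets the counter leave state 10 again, since only the size-10 row maps 10 onto itself.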
thorfdbg Posted February 25, 2013

> Now for the bad news. It turns out that while GTIA only records collisions during non-blank regions, it runs the sprite comparison and shift logic all the time, including horizontal and vertical blank. Again, this typically isn't an issue because for off-screen sprites the shifters empty out before you can see anything. However, this is pertinent when this bug is triggered and the horizontal position is then set >=228. Since the horizontal counter only counts 0-227, it will never match a sprite whose HPOS is set >=228, and if this bug has been triggered it means that the sprite will be continually stuck outputting a '1' bit without the need to touch it ever again. I'm not sure if there's a use for a sprite that covers the entire screen, but I suppose someone could think of some way to abuse it. This means that in order to emulate this behavior properly an emulator has to run the sprite shift logic all the time and not just in the display regions.

That is pretty much what I wondered about, namely when (or if) GTIA releases the "stuck" counter. Actually, while the player or missile remains stuck, you can reset it manually by writing into the size register again so the counter continues to count.

Actually, it is of some limited use. Remember that the GTIA priority adjustment for GTIA modes 40 and C0 only works for players, but not for the fifth player with the missiles all combined. A horizontal kernel would allow splitting the color of the screen horizontally, at up to single color clock resolution, at the cost of only a single missile. That's not so easy using the CPU alone.

Anyhow, it's a pretty obscure feature we have here, indeed.

+JAC! Posted February 25, 2013

>but I suppose someone could think of some way to abuse it

Looking at Fox' post you can be pretty sure about it, and thinking about it, I.... :-)

phaeron (Author) Posted February 26, 2013

Alright, I beefed up the player/missile resizing test in Acid800 to test for the lockup condition. I discovered in the process of doing this that the test didn't actually test the 1x-to-2x and 1x-to-4x cases, which I've added in addition to the new 2x-to-1xalt and 4x-to-1xalt cases. I'd appreciate it if people with XEs tried this version to double-check that this isn't machine-sensitive; I would check it myself except that right now it is too cold in the house for my XE to show alternate timing (no, I am not joking). Tomorrow I might try shoving my XE under a sofa cushion.

Other changes in this version are that I've reorganized the memory layout to raise the MEMLO requirement for the standalone tests from $1000 to $1A00 and sped up the annoying ANTIC: DMA pattern test so it doesn't blank out the screen for so long.

Attachment: Acid800-1.1beta2.7z.zip

thorfdbg Posted March 1, 2013

Thanks, tested and found my "p/m shift register stuck" emulation working.

Just one thing for the acid test: the test currently depends on the Os reacting to an NMI whose origin it cannot detect by triggering a VBI. I believe this is rather a side effect of how the Os was implemented, and it is not necessarily the best implementation choice. Os++ reacts differently: if the origin of the NMI is unclear, the safest thing to do is to ignore it and not to trigger *something* when you do not know what it is. If you want to test for "undiscovered" NMIs, probably the best option is to disable the Os ROM, define the NMI vector yourself, and check there what the hardware flags are.

phaeron (Author) Posted March 2, 2013

Acid800 runs on 400/800s as well as XL/XE machines, so most tests avoid using the IRQ and NMI vectors directly.

I disagree that ignoring the NMI is the best decision. NMIs can only come from three sources: DLIs, VBIs, and System Reset. The System Reset NMI is only on the 400/800 and the OS normally only runs with VBIs enabled. On an XL/XE, the normal situation is that an untagged NMI can always be assumed to come from the VBI. Failing to do this means that a program that did an unsynchronized STA NMIRES in mainline code could cause a VBI to be lost. Dispatching NMIST=$1F as a VBI avoids this situation, and I am unaware of any case in which this has caused problems.

thorfdbg Posted March 2, 2013

> On an XL/XE, the normal situation is that an untagged NMI can always be assumed to come from the VBI.

And that's exactly what is not correct. The trouble is that this way a simple DLI can accidentally trigger VBI execution in the Os, at which point the Os reloads a lot of hardware registers and increments clocks and timers, so the full state machine gets out of sync. Plus you're wasting a lot of cycles on an operation whose origin you do not know, which can be quite disastrous.

Actually, the very same handling is done in the IRQ handler: if the source of the IRQ is not known, it's ignored. Many Os'es have to take care of such "bogus interrupts" (it wasn't any different on the Amiga, for example), and the best solution is that if you don't know where it came from, you cannot handle it.

Even if I tried to be smart, let's say by using ANTIC VPOS to "guess" whether an unknown interrupt was a DLI or a VBI, the test wouldn't work correctly because it requires that a DLI is answered as a VBI. That's simply not a good assumption because you're depending on an undocumented implementation choice.

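The two NMI dispatch policies being argued about can be put side by side in a short sketch. This is only an illustration of the positions as stated in the thread, not the actual code of Atari's OS or of Os++: both function names are made up, and the 400/800 System Reset source is left out for brevity.

```python
DLI_BIT = 0x80  # NMIST bit 7
VBI_BIT = 0x40  # NMIST bit 6


def dispatch_fallback_vbi(nmist):
    """Policy phaeron describes: anything that is not a DLI runs the VBI."""
    if nmist & DLI_BIT:
        return "DLI"
    return "VBI"


def dispatch_strict(nmist):
    """Policy thorfdbg describes for Os++: an untagged NMI is ignored."""
    if nmist & DLI_BIT:
        return "DLI"
    if nmist & VBI_BIT:
        return "VBI"
    return None  # origin unknown, do nothing
```

The two only differ for an untagged status such as $1F: the first policy runs the VBI chain, the second returns without dispatching anything. Which of these is the safer choice is exactly the point of disagreement in the posts above and below.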
phaeron (Author) Posted March 2, 2013

The OS does not use DLIs unless the seldom-used fine scrolling feature is enabled. As such, the most common OS display environment by far is VBI enabled only and there is no question as to what triggered the NMI. It is much less likely that badly timed writes to NMIRES would be a problem in a program that uses DLIs as that will screw up the DLI pattern and that is commonly fatal in any program that has closely spaced DLIs.

IRQs are a different situation for two reasons. First, there are 12 sources of IRQs vs. the two or three sources of NMIs, with no one IRQ source being predominant. Second, unlike ANTIC, POKEY does issue spurious interrupts on its own that it spontaneously deasserts, specifically the serial output complete IRQ. The SEROC IRQ will deassert several dozen cycles after a write to SEROUT and this can cause the CPU to dispatch an IRQ that is not reflected in IRQST by the time the IRQ routine checks it. Koronis Rift actually depends on this behavior.

On the other hand, so far you have not named any program that benefits from the alternate NMI handling behavior you advocate, and without that you're merely deviating from the behavior of Atari's OS and reducing the compatibility of Os++ for no good reason.

thorfdbg Posted March 2, 2013

> The OS does not use DLIs unless the seldom-used fine scrolling feature is enabled.

This is not quite the right argument. The Os NMI handler is not only used by the Os. It is used by *any* game on the table, probably with the minor exception of those that disable the Os ROM on the XLs and later series. You usually cannot avoid having your DLI go through the Os NMI handler. Besides, Os++ has a couple of other uses for DLIs as well (graphics modes 9 to 11 with text window).

> As such, the most common OS display environment by far is VBI enabled only and there is no question as to what triggered the NMI.

It is not a matter of likelihood. It is a matter of how fatal it is if things go wrong. If I trigger a VBI instead of a DLI, things can go terribly wrong because the machine is tied up quite a bit. Games make heavy use of DLIs, and the VBIs they install are typically much longer than the harmless Os VBI handler. It usually doesn't happen very often, if at all, because IRQ usage in games is so low, but if it does happen (something Atari was likely not even aware of), it will create quite a mess.

> It is much less likely that badly timed writes to NMIRES would be a problem in a program that uses DLIs as that will screw up the DLI pattern and that is commonly fatal in any program that has closely spaced DLIs.

What is more fatal? Using up the tight stack space of the poor 6502 by triggering a long interrupt where no such interrupt should happen, or dropping a shorter interrupt that usually doesn't drive the game mechanics, but only the display process?

> On the other hand, so far you have not named any program that benefits from the alternate NMI handling behavior you advocate, and without that you're merely deviating from the behavior of Atari's OS and reducing the compatibility of Os++ for no good reason.

I gave good reasons. The situations in which it makes a difference are so incredibly rare that it would not matter (actually, I don't have a game here where it does). However, it is a matter of sane Os design to do the right thing in such a situation. Triggering an interrupt on a "good guess" basis is not a sane design. Atari simply didn't know, of course. If no DLI, then VBI or RESET. That the answer could be "none of the above" was simply not considered. And in reality, it is pretty hard to trigger this condition.

phaeron (Author) Posted March 2, 2013

There are lots of decisions that Atari made in their OS design that I would not characterize as sane. However, of the other alternative OSes that I have been able to find, all of which have altered the NMI dispatch path, none of them have made the decision to drop NMIs as you have. Not Qmeg, not 816-OS, not Ultimon XE, not MyBIOS. Furthermore, games that use DLIs are perfectly fine with this because they are written for Atari's OS. There's simply no reason to do this other than for OS design purity reasons.

As I've already said, Acid800 cannot take over the NMI vector on 400/800 machines so it has to rely on this quirk of the OS to test the NMIRES/NMIST behavior. If you want to keep Os++ as-is and reduce its compatibility, you're free to do so. Acid800 is a hardware test and when it comes down to being able to test more hardware behavior or working around the quirks of one alternative OS I will have to choose the former.

thorfdbg Posted March 3, 2013

> There are lots of decisions that Atari made in their OS design that I would not characterize as sane. However, of the other alternative OSes that I have been able to find, all of which have altered the NMI dispatch path, none of them have made the decision to drop NMIs as you have. Not Qmeg, not 816-OS, not Ultimon XE, not MyBIOS. Furthermore, games that use DLIs are perfectly fine with this because they are written for Atari's OS. There's simply no reason to do this other than for OS design purity reasons.

That is simply because the quirk wasn't known, and because these Os's aren't re-designs, but rather overwrote parts of the binary dump of the Atari Os with their own routines (most of them, at least). That's an option I do not and cannot pick.

> As I've already said, Acid800 cannot take over the NMI vector on 400/800 machines so it has to rely on this quirk of the OS to test the NMIRES/NMIST behavior. If you want to keep Os++ as-is and reduce its compatibility, you're free to do so. Acid800 is a hardware test and when it comes down to being able to test more hardware behavior or working around the quirks of one alternative OS I will have to choose the former.

And that's simply not true. I provided an option for testing the hardware instead of the Os, at least on the newer machines. It is easy to check whether the memory mapping via PIA port B works. It is also easy to override the NMI vector so you can see how the hardware (and not the Os) reacts.

St(r)yker Posted May 31, 2013

Hi! Does Acid800 not work on an Atari 400 with 16KB?

phaeron (Author) Posted May 31, 2013

Should work fine, as long as you have at least Acid800 v1.0 (you can get it from http://virtualdub.org/altirra.html). I reorg'd the memory map a while ago so the tests would fit in 16K of RAM.

St(r)yker Posted May 31, 2013

It works on an Atari 600XL with 16K, but it does not work on the A400 :( I tested on real hardware and in an emulator.

phaeron (Author) Posted May 31, 2013

Argh... it's a bug in OS-B. The screen editor clears up to one page above RAMTOP on a screen open. I'll have to put in a workaround for this.

thorfdbg Posted May 31, 2013

> Argh... it's a bug in OS-B. The screen editor clears up to one page above RAMTOP on a screen open. I'll have to put in a workaround for this.

Just as a side note: similar bugs exist even in the XL Os. Screen clear should be OK, but IIRC insert-line or delete-line were affected and moved more than required, including ROM space. Fixed in Os++.