TIA Sound


8 replies to this topic

#1 SeaGtGruff OFFLINE  

SeaGtGruff

    River Patroller

  • 4,545 posts
  • Location:Georgia, USA

Posted Thu Jul 7, 2011 1:29 AM

As I've mentioned in other threads, lately I've been preoccupied with TIA sound. I want to understand it from the ground up. So in addition to reading through all of the old [stella] mailing list posts about TIA sound, I've also been trying to learn whatever I can about sound and music in general-- sound waveforms, frequencies, tuning systems, synthesis, MIDI, etc. I've even gone so far as to start collecting books about computer sound, electronic music, and "soft synths." This has been going on to various extents for a while now-- months, if not years-- but just recently I've really been focusing on it more and more, almost obsessively (it's my newest "shiny object").

Last week I wrote a little program that runs through every combination of AUDC0 and AUDF0. It plays 7 seconds of AUDC0=0, AUDF0=0; then 7 seconds of AUDC0=1, AUDF0=0; then 7 seconds of AUDC0=2, AUDF0=0; etc. After it goes up through AUDC0=15, it starts over again with AUDC0=0, but with AUDF0=1; then with AUDF0=2; etc. I chose 7 seconds because I wanted to make sure each sound frequency had a chance to repeat a few times, even with the lower frequencies. It just so happened that it takes about 1 hour to cycle through all of the AUDC0/AUDF0 combinations. AUDV0 is set to a fixed value of 15 the whole time. I know it's pointless to play AUDC0=0 and AUDC0=11 this way, and that some AUDC0 settings yield the same results as each other (e.g., AUDC0=4 and AUDC0=5), but I really wanted to go through all the possible combinations, and I figured the occasional periods of silence would be useful when I recorded the output and analyzed it.
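For anyone who wants to check the "about 1 hour" figure, here is a rough Python sketch of the playback order described above. It is not the actual 2600 program, just the same loop order on a PC; the function and variable names are made up for illustration.

```python
# Sketch of the sweep order: AUDC0 cycles fastest, AUDF0 steps after each full AUDC0 pass.
SECONDS_PER_SETTING = 7

def sweep_order():
    """Yield (AUDC0, AUDF0) pairs in the order the test program plays them."""
    for audf in range(32):      # AUDF0 is a 5-bit register: 0..31
        for audc in range(16):  # AUDC0 is a 4-bit register: 0..15
            yield audc, audf

order = list(sweep_order())
total_seconds = len(order) * SECONDS_PER_SETTING
print(len(order), "settings,", total_seconds, "seconds,",
      round(total_seconds / 60, 1), "minutes")
```

That works out to 512 settings and just under 60 minutes, which fits the 1-hour DVD recording mode.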

So the other night I put the program on my Krokodile Cartridge, hooked up my Atari 2600 to my VCR, then to my DVD recorder, and recorded the program's output on a DVD at the highest quality (1 hour). Then I used the Any Audio Converter program to convert the DVD's soundtrack to WAV files, and started using the WavePad Sound Editor program to examine the WAV files. Unfortunately, the WAV files came out to be sampled at 48000 Hz, which doesn't jibe very well with the TIA's base audio frequency of about 31400 Hz, so that makes it tricky to examine the higher frequencies. Still, I've been able to use the lower frequencies to check the higher frequencies. For example, AUDC0=1 and AUDF0=0 is tough to analyze by itself, but I can look at AUDC0=1 and AUDF0=1, then AUDF0=2, then AUDF0=3, etc., to see how the waveform gets longer, and then the bit pattern of the waveform becomes easier to see.
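To put a number on why AUDF0=0 is hard to read at 48000 Hz, here is a small Python sketch. It assumes each waveform bit lasts (AUDF0 + 1) base audio clocks, and it uses the approximate 31400 Hz figure from above; the names are illustrative only.

```python
TIA_AUDIO_CLOCK_HZ = 31400   # approximate base audio frequency quoted above
WAV_SAMPLE_RATE_HZ = 48000   # sample rate of the converted WAV files

def wav_samples_per_bit(audf):
    """Approximate number of WAV samples covering one waveform bit.

    Assumes each bit lasts (AUDF0 + 1) base audio clocks.
    """
    return WAV_SAMPLE_RATE_HZ * (audf + 1) / TIA_AUDIO_CLOCK_HZ

for audf in (0, 1, 2, 3, 31):
    print(f"AUDF0={audf:2d}: about {wav_samples_per_bit(audf):.2f} WAV samples per bit")
```

At AUDF0=0 there are only about one and a half samples per bit, so individual bits smear together; by AUDF0=3 there are about six, which is enough to pick out the pattern by eye.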

I haven't finished my analysis, but right away I started freaking out about what I'm seeing. For example, where I expected to see pulse waveforms, instead I'm seeing something that's more like a cross between a sawtooth waveform and a pulse waveform. In other words, it jumps up to a peak, but then starts decreasing like a sawtooth wave, then jumps down to a valley, then starts climbing back up like an inverse sawtooth, then jumps back up again. I started to worry that there might be something seriously amiss with the way I'd gone about recording, converting, and analyzing the sounds, but then I realized it makes perfect sense. After all, a steady stream of 1s is silent, since there's no oscillation going on. So if you're alternating between 1s and 0s, but each value lasts for more than one occurrence (e.g., 1111000011110000 etc.), the change from 1 to 0 or 0 to 1 results in a peak (1) or valley (0), but then as the same value continues, the amplitude starts to move toward the center, where it would flatline if it continued long enough. So 10101010 jumps up and down as expected, but 1111000011110000 jumps up, then starts to flatline, then jumps down, then starts to flatline, etc., resulting in a waveform that looks like the following:

[Attached image: AUDC0=6, AUDF0=5]

And the longer the value stays the same, the more the waveform will flatline between each peak and valley:

[Attached image: AUDC0=6, AUDF0=12]

[Attached image: AUDC0=6, AUDF0=19]

Another thing I'm noticing is that the waveforms don't always follow the bit pattern I was expecting. AUDC0=1 (the 4-bit LFSR) is the best example. Apparently the TIASOUND.C code said that the pattern is 111100010011010. But then Adam Wozniak said that it's actually 000011101100101 (i.e., that the 1s and 0s should be reversed). I'd been relying on Adam's old [stella] posts, because he took actual samples and eventually figured out the logic for the more obscure waveforms. But what I'm seeing is that TIASOUND.C was actually correct-- it's 111100010011010.
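As a cross-check on the two candidate patterns, here is a minimal Python sketch of a generic 4-bit XOR LFSR. The tap choice and the all-1s starting state are assumptions picked because they reproduce the quoted sequence; this is not a claim about how the TIA's gates are actually wired.

```python
def lfsr4_bits(count, state=(1, 1, 1, 1)):
    """Generic 4-bit XOR LFSR: output the oldest bit, shift in oldest XOR next-oldest.

    The taps and seed are illustrative assumptions, not taken from the TIA schematics.
    """
    state = list(state)
    out = []
    for _ in range(count):
        out.append(state[0])
        new_bit = state[0] ^ state[1]
        state = state[1:] + [new_bit]
    return "".join(str(b) for b in out)

pattern = lfsr4_bits(15)
inverted = "".join("1" if b == "0" else "0" for b in pattern)
print(pattern)   # the TIASOUND.C version
print(inverted)  # the same sequence with 1s and 0s swapped
```

The first line printed matches the TIASOUND.C pattern, and swapping the bits gives Adam's version, so the two descriptions differ only by an inversion of the output.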

I'm also seeing that AUDC0=6 and AUDC0=10 are sometimes inverted-- these are the 31-bit waveforms that are supposed to have 13 highs followed by 18 lows. Sometimes it's actually 18 highs and 13 lows. I suppose the sounds sound the same either way, but I still think it's interesting that the "duty cycle" isn't consistent.

I've started working on a document that summarizes my understanding of TIA sound, and I'll post it when I'm done. I've been making some assumptions, and I don't know whether they're correct or not, but it seems like they should be. For example, my attempts to decipher the TIA schematics showed that the TIA generates an asymmetrical pattern of A-phi-1 and A-phi-2 pulses for each scan line. When I posted about that in [stella] a year or two ago, Eric Ball mentioned that he'd also noticed this, and had written about it in his AtariAge blog. At first I was sort of obsessed with how that might sound, but I've since concluded (or assumed) that the A-phi-2 clocks are the important ones-- or rather, the transitions from A-phi-1 to A-phi-2-- which yields a 28-29 waveform that divides the scan line as nearly in half as is possible using 57 counts per scan line. So I'm assuming that when you have a waveform bit pattern like 111100010011010 (the 4-bit LFSR with AUDF0 set to 0), each bit lasts for either 28 or 29 counts, rather than lasting for the exact same duration.
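Here is a small Python sketch of that assumption: successive bits alternately get 28 and 29 counts, so every pair of bits fills one 57-count scan line. Which of the two durations comes first is a guess, and the helper name is hypothetical.

```python
COUNTS_PER_SCANLINE = 57

def bit_durations(pattern, first=28):
    """Pair each bit with an alternating 28- or 29-count duration.

    Whether the 28-count or 29-count half comes first is an assumption.
    """
    second = COUNTS_PER_SCANLINE - first
    return [(bit, first if i % 2 == 0 else second) for i, bit in enumerate(pattern)]

durations = bit_durations("111100010011010")
total = sum(d for _, d in durations)
print(durations[:4])
print("one full 15-bit pass:", total, "counts")
```

Because 15 is odd, one pass through the pattern takes 7.5 scan lines, so the 28/29 assignment flips on every other pass and only lines up again after two passes.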

Anyway, I'm finding all of this to be very interesting (otherwise my current obsession wouldn't have lasted for very long before the next "shiny object" captured my attention!). One thing I'm getting out of this is that when David Crane mentioned using the "triangle" or "sawtooth" wave when he was programming Pitfall (I forget what he actually said), he might not have been speaking in error, or confusing Pitfall with Pitfall II as I'd originally assumed. :)

Michael




#2 Rybags OFFLINE  

Rybags

    Quadrunner

  • 10,313 posts
  • Location:Australia

Posted Thu Jul 7, 2011 2:05 AM

Recording the audio at the highest sampling rate possible is a good idea. A modern PC will often have inbuilt sound that can attain 192K samples/sec, so that's worth considering.

With the close-ups of the waveforms, the unexpected decay you see is a result of the external circuitry, not necessarily TIA.
We had similar discussion about this with Pokey (which essentially uses the same generation techniques as TIA).
Instead of a nice square wave, we get a steep ramp followed by gradual decay.

I've measured the Pokey's sound pin with oscilloscope - the output there is a proper square wave.

To further understand TIA sound, probably a good idea to look into LFSRs. Knowing how they work can help understand how the various noise types occur, as well as understanding how different frequencies of the same noise type can give entirely different sounds due to the way the samples are plucked when AUDF counts down.

Then you've got more advanced stuff like the interaction of HF waveforms but I don't know that it applies so much with TIA as it does with Pokey.

Edited by Rybags, Thu Jul 7, 2011 2:05 AM.


#3 Random Terrain ONLINE  

Random Terrain

    Visual batari Basic User

  • 20,923 posts
  • Controlled Randomness
    Replay Value
    Nonlinear
  • Location:North Carolina (USA)

Posted Thu Jul 7, 2011 2:10 AM

If you have it under the microscope long enough, maybe you'll figure out how to produce phonemes by only fiddling with AUDVx, AUDCx, and AUDFx since some tone frequencies almost sound like a human voice already.

I've been hoping that instead of using digitized speech, we could track down a handful of phonemes that already exist inside of the Atari 2600. We could have a collection of phonemes that programmers could play in a sequence to make words. Programmers would no longer have to waste space on digitized speech. They'd only need to use a relatively small bit of phoneme data to make words, so they'd have more space to use for their games.

I've tried to find phonemes by randomly trying stuff, but with no real understanding of how sound works on the Atari 2600, it could take a few thousand years. Maybe your growing knowledge can help.

#4 SeaGtGruff OFFLINE  

SeaGtGruff

    River Patroller

  • 4,545 posts
  • Location:Georgia, USA

Posted Thu Jul 7, 2011 3:11 AM

Rybags, on Thu Jul 7, 2011 2:05 AM, said:

Recording the audio at the highest sampling rate possible is a good idea. A modern PC will often have inbuilt sound that can attain 192K samples/sec, so that's worth considering.
Yeah, my DVD recorder's manual says it can record up to 192K, and I set it to record as high as possible, but I can't plug my Atari 2600 directly into the DVD recorder-- I have to go through the VCR coaxial, then VCR to DVD using standard A/V connections, so I don't know how that affects it. I also set up the Any Audio Converter program to use the highest WAV sampling rate possible-- but that was when I first installed it, and now I can't find where to set it (maybe it's only available during initial setup of the program???).

Rybags, on Thu Jul 7, 2011 2:05 AM, said:

With the close-ups of the waveforms, the unexpected decay you see is a result of the external circuitry, not necessarily TIA.
We had similar discussion about this with Pokey (which essentially uses the same generation techniques as TIA).
Instead of a nice square wave, we get a steep ramp followed by gradual decay.

I've measured the Pokey's sound pin with oscilloscope - the output there is a proper square wave.
Yeah, I figure the output from the pin would show a proper square or pulse wave, but it does make sense to me that the actual result when it goes to a speaker, and then to an ear, would have the decay, since a steady output of 1 would be "heard" as silence rather than a prolonged loud sound, seeing as how it's the vibrations of the speaker or eardrum that create the sound, and a steady output of the same value doesn't create any vibration or oscillation. The decay actually has a trigonometric-looking shape, because once it hits a peak or valley (as I'm calling it) it starts to decay following a curve, then when it jumps to the next peak or valley the wave curves again, as follows:

[Attached image: waveform with decay]

Sorry for the sloppy drawing, it was the best I could do freehand.

Rybags, on Thu Jul 7, 2011 2:05 AM, said:

To further understand TIA sound, probably a good idea to look into LFSRs. Knowing how they work can help understand how the various noise types occur, as well as understanding how different frequencies of the same noise type can give entirely different sounds due to the way the samples are plucked when AUDF counts down.

Then you've got more advanced stuff like the interaction of HF waveforms but I don't know that it applies so much with TIA as it does with Pokey.
Yeah, I'm familiar with how the LFSRs work-- at least, the horizontal sync counter. The 5-bit and 4-bit LFSR circuitry for the audio tone generators (or "noise" generators) look a bit different on the schematics-- especially the 4-bit LFSR-- but I know how to determine their output based on the bits that get compared. I think the 4-bit audio LFSR must work differently than the 6-bit HSYNC counter LFSR, which may be why the bits seem to be inverted from what we would expect. Or maybe it's like the AUDC0=6 situation (where the "duty cycle" of the pulse wave is inverted sometimes), so the 4-bit LFSR waveform might be inverted at times depending on some factor I haven't determined yet, like maybe the last bit value from the previous AUDC0 setting?

Anyway, I think the decay may have been introduced by the Any Audio Converter program, because I used the default choices when I converted the DVD recording, which included a setting that said something like "normalize the output" or some other phrase with "normalize" in it. I need to read up on that program to see what the settings do, because I just downloaded it and this was the first time I used it, aside from converting ELP's "Hoedown" to a WAV file to see if I could play it on the 2600 (which was a disaster).

Michael

#5 SeaGtGruff OFFLINE  

SeaGtGruff

    River Patroller

  • 4,545 posts
  • Location:Georgia, USA

Posted Thu Jul 7, 2011 3:21 AM

Random Terrain, on Thu Jul 7, 2011 2:10 AM, said:

If you have it under the microscope long enough, maybe you'll figure out how to produce phonemes by only fiddling with AUDVx, AUDCx, and AUDFx since some tone frequencies almost sound like a human voice already.

I've been hoping that instead of using digitized speech, we could track down a handful of phonemes that already exist inside of the Atari 2600. We could have a collection of phonemes that programmers could play in a sequence to make words. Programmers would no longer have to waste space on digitized speech. They'd only need to use a relatively small bit of phoneme data to make words, so they'd have more space to use for their games.

I've tried to find phonemes by randomly trying stuff, but with no real understanding of how sound works on the Atari 2600, it could take a few thousand years. Maybe your growing knowledge can help.
I think for speech, the AtariVox or whatever it's called (I don't have it handy right now) may be the best solution. Digitized sound takes up too much memory without compression-- I could get only a second or less of digitized sound for my ELP's "Hoedown" experiment, and that was with 4K, so even with 32K it would be only a few seconds of music at best, unless it were heavily compressed somehow. Of course, I was using a sample rate of 31400 Hz, and with speech it could be much lower than that. :)
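A back-of-the-envelope Python sketch of the memory problem. It assumes 4-bit samples (the width of the volume register) packed two per byte, and that the whole ROM holds nothing but sample data, which is more generous than a real program would allow.

```python
def playback_seconds(rom_bytes, sample_rate_hz=31400, samples_per_byte=2):
    """Seconds of uncompressed sample data that fit in rom_bytes.

    Assumes 4-bit samples packed two per byte and no space used for code.
    """
    return rom_bytes * samples_per_byte / sample_rate_hz

for kib in (4, 32):
    print(f"{kib}K: about {playback_seconds(kib * 1024):.2f} seconds at 31400 Hz")

# Speech can get by with a much lower rate; 8000 Hz here is only an example value.
print(f"32K: about {playback_seconds(32 * 1024, sample_rate_hz=8000):.2f} seconds at 8000 Hz")
```

Even under those assumptions, 4K holds roughly a quarter of a second and 32K roughly two seconds at the full rate.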

But I hear you as far as trying to use the TIA's native abilities to simulate complex sounds. The more I learn about TIA sound-- and it's coming in slow spurts-- the more amazed I am at just how versatile it is, considering how very primitive it is compared to more modern computer sound generation models.

Michael

#6 xxl OFFLINE  

xxl

    Moonsweeper

  • 465 posts
  • Location:KRAKOW/Poland

Posted Thu Jul 7, 2011 3:46 AM

PWM

http://www.atariage....for-atari-2600/

#7 SeaGtGruff OFFLINE  

SeaGtGruff

    River Patroller

  • 4,545 posts
  • Location:Georgia, USA

Posted Fri Jul 8, 2011 12:58 AM

xxl, on Thu Jul 7, 2011 3:46 AM, said:

PWM
Thank you, that was very helpful! :) I'm obviously still learning this stuff-- slowly-- and every puzzle piece helps me fill in the picture. :D

Michael

#8 Tjoppen OFFLINE  

Tjoppen

    Chopper Commander

  • 129 posts

Posted Fri Jul 8, 2011 1:59 AM

SeaGtGruff, on Thu Jul 7, 2011 1:29 AM, said:

I haven't finished my analysis, but right away I started freaking out about what I'm seeing. For example, where I expected to see pulse waveforms, instead I'm seeing something that's more like a cross between a sawtooth waveform and a pulse waveform. In other words, it jumps up to a peak, but then starts decreasing like a sawtooth wave, then jumps down to a valley, then starts climbing back up like an inverse sawtooth, then jumps back up again.
Just to chime in briefly on this: This happens to all square waves that go through DC unbiasing/decoupling, which for the VCS probably happens somewhere in the RF modulation circuitry or just before. For instance, you see this on the NES as well.
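A minimal Python sketch of the effect, using a one-pole digital high-pass filter as a stand-in for the decoupling stage. The filter coefficient is an arbitrary illustrative value, not something measured from VCS hardware.

```python
def high_pass(samples, alpha=0.9):
    """One-pole high-pass filter: y[n] = alpha * (y[n-1] + x[n] - x[n-1]).

    alpha is an arbitrary illustrative value, not derived from real circuitry.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out

# Three cycles of a square wave: 8 samples high, 8 samples low.
square = ([1.0] * 8 + [0.0] * 8) * 3
filtered = high_pass(square)
print([round(v, 2) for v in filtered[:16]])
```

Each edge produces a jump, and while the input holds steady the output drifts back toward zero, which is the sawtooth-like sag described in the quoted post. Longer runs of the same value give it more time to flatten out.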

#9 SeaGtGruff OFFLINE  

SeaGtGruff

    River Patroller

  • 4,545 posts
  • Location:Georgia, USA

Posted Thu Jul 14, 2011 2:55 AM

Just a brief update-- I decided to go back to the TIA schematics to see if I could verify some of what I'm seeing in my samples, and I think I've finally deciphered just about all of the audio stuff-- in pieces, anyway (I haven't tried to "combine" the pieces together per se yet, just worked out the poly-5 logic for all possible AUDC0 settings, and the poly-4 clock for all possible AUDC0 settings, and the poly-4 logic for all possible AUDC0 settings).

The only thing I haven't actually tried to tackle yet is the "frequency divider" logic, and I expect I'll be re-reading Eric Ball's blog entries carefully when I do, since he's already worked that part out.

The 4-bit LFSR was actually the worst part for me to understand, because I'm no engineer, and those flip-flop circuits were driving me bananas. In particular, since each flip-flop has two output lines, there's the question of which line is the "bit" for the 4-bit polynomial. Some of the nodes are labeled Q6, Q7, Q8, and Q9, which would lead you to believe that those nodes are the bits-- but then it's actually the "not Q9" line that goes to the volume circuits! So I decided to go with the "not node" lines for the bits (not Q6, not Q7, not Q8, and not Q9), since I thought the idea of LFSRs is that the last bit is the output bit, not the *negation* of the last bit-- but hey, why not, what do I know? ;)

I was also having trouble reconciling the logic for the 0 or 1 states until I looked up at the volume circuits. Duh! I kept thinking, "If AUDC0 = 0, then the output must always be 1," but the logic just didn't make sense-- until I realized that the output is always *0*, not 1, so the "nor" gates of the volume circuits can output 1s when they nor the "not D" lines with the 4-bit LFSR output.
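In Python terms, the NOR arrangement described above could be sketched like this. It is my reading of that one paragraph, not a gate-level model of the chip, and the function names are invented.

```python
def nor(a, b):
    """Two-input NOR on single bits."""
    return 0 if (a or b) else 1

def volume_out(audv, line):
    """NOR each inverted volume bit ("not D") with the tone generator's output line.

    audv is the 4-bit volume value; line is the 0/1 signal fed to the volume gates.
    """
    result = 0
    for i in range(4):
        d = (audv >> i) & 1
        not_d = d ^ 1
        result |= nor(not_d, line) << i
    return result

print(volume_out(15, 0))  # line low: volume bits pass through
print(volume_out(15, 1))  # line high: every gate is forced to 0
```

So a line that sits at 0 lets the volume value through unchanged, which is why the "always 0" reading makes the logic work out.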

There was also the problem of deciphering how those pull-downs determine what bit 0 will be (the value that gets shifted into bit 1), but I finally saw the light! :) So simple, yet so confusing until you finally "get it"-- and so much harder to get when you don't know squat about schematics and electronics to begin with. ;)

And there's also the issue of two different *kinds* of LFSRs-- those that use an XOR comparison, and those that use an XNOR comparison. Depending on the comparison, the illegal state will be either all 0s (XOR) or all 1s (XNOR). The 6-bit horizontal sync counter uses XNOR, so 111111 is the illegal value. The 5-bit LFSR uses XOR, so 00000 is the illegal value. I believe the 4-bit LFSR uses XNOR, so 1111 is the illegal value-- which means that for the 9-bit LFSR, the illegal value would be 000001111 (I think-- I haven't actually tried to work through all the poly-9 states to follow the sequence). But maybe it would be different if you take the labeled nodes as the bits, and say that the output of poly 4 is the negation of the last bit? I'm going to be working through everything again one more time to verify I didn't screw something up somewhere.
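The XOR versus XNOR lock-up states can be demonstrated with a generic 4-bit shift register in Python. The tap positions here are an arbitrary maximal-length choice for illustration and are not taken from the schematics.

```python
def step(state, use_xnor):
    """Shift a 4-bit state (tuple of bits); feedback compares the two oldest bits."""
    feedback = state[0] ^ state[1]
    if use_xnor:
        feedback ^= 1
    return state[1:] + (feedback,)

def period(state, use_xnor):
    """Number of steps before the register returns to its starting state."""
    start = state
    steps = 0
    while True:
        state = step(state, use_xnor)
        steps += 1
        if state == start:
            return steps

print("XOR  from 0000:", period((0, 0, 0, 0), use_xnor=False))  # stuck
print("XOR  from 1111:", period((1, 1, 1, 1), use_xnor=False))
print("XNOR from 1111:", period((1, 1, 1, 1), use_xnor=True))   # stuck
print("XNOR from 0000:", period((0, 0, 0, 0), use_xnor=True))
```

With XOR feedback the all-0s state maps to itself and everything else cycles through 15 states; with XNOR it is the all-1s state that gets stuck instead.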

Anyway, at least I figured out how the "divide by 3" works for AUDC0 = %11xx. I guess maybe Adam Wozniak had figured it out years ago in [stella], but Ron Fries was totally off. It's kind of neat, really. :) Also the "divide by 2" pure tones-- very simple, and so obvious in the schematics.

The 4-bit LFSR also has the additional complexity (in comparison to the 5-bit LFSR) that its clock changes depending on AUDC0 (or AUDC1), not to mention depending on poly-5. I worked out the poly-4 clock before I even considered trying to tackle the 4-bit LFSR itself, because those flip-flops had me so totally freaked out about trying to follow the logic. It's neat the way the "divide by 15" (as the TIA hardware manual calls it) works-- actually the "13 high/18 low" or "18 high/13 low" pattern that I think Ron Fries calls "divide by 31," although he does clarify that it follows a 13/18 pattern (which averages out to division by 15.5). And the AUDC0 = 3 logic that had stumped everyone until Adam Wozniak finally figured it out is also very plain to see in the schematics.
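A quick Python sketch of the 13/18 arithmetic. It assumes the pattern advances one step per base audio clock divided by (AUDF0 + 1), and uses the approximate 31400 Hz base rate; both are stated here as assumptions rather than readings from the schematics.

```python
BASE_AUDIO_HZ = 31400            # approximate base audio frequency
PATTERN = "1" * 13 + "0" * 18    # 13 highs followed by 18 lows

def tone_hz(audf, pattern_len=len(PATTERN)):
    """Approximate repeat rate of the 31-step pattern.

    Assumes one pattern step per (AUDF0 + 1) base audio clocks.
    """
    return BASE_AUDIO_HZ / (audf + 1) / pattern_len

print("pattern length:", len(PATTERN))
print("average half-period:", (13 + 18) / 2, "steps")
print("duty cycle:", round(PATTERN.count("1") / len(PATTERN), 3))
print("AUDF0=0:", round(tone_hz(0), 1), "Hz")
```

Inverting the pattern to 18 highs and 13 lows changes the duty cycle but leaves the 31-step period, and so the pitch, unchanged.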

It's just such a frigging pain in the ass to work through all those logic gates for the first time, but it turns out that it really isn't all that bad once you realize how the different AUDC0 settings affect the logic gates, since some of the stuff that looks horrendously complex actually gets "cancelled out" in certain situations (i.e., sometimes it makes a difference, sometimes it doesn't).

There are still things about the schematics that totally mystify me. I'm sure they're basic engineering things, but so far I haven't found any books or web pages that show anything similar to help me figure out what they're doing, although it's still sort of easy to figure out the overall results if I don't get too hung up on the details. For example, where a line (usually a clock line) looks like it's crossing another line, or maybe just "butting" up against it, and it looks kind of like a bridge or something, like the lines don't actually connect. If I just ignore the clock line and pretend the other line is just a normal line, I can work out the logic. I just wish I understood what the clock line is supposed to be doing to it. And I know some things are undoubtedly a matter of timing, or how a signal flows along a line over time rather than being transmitted "instantaneously"-- like all those places throughout the schematics where a line is fed into an inverter, but just before that it branches off and feeds into another inverter that then feeds back into the first one. Huh? I guess the output from the "final" inverter must switch quickly from low to high, or high to low, resulting in a pulse rather than a steady signal? But I really have no freaking clue.

Anyway, I actually feel pretty good about how much I've been able to figure out, considering how clueless I actually am when it comes to schematics! ;)

Michael






