
Another lossy audio compression experimentation


5 replies to this topic

#1 newcoleco ONLINE  

newcoleco

    Stargunner

  • 1,053 posts

Posted Mon Apr 19, 2010 1:11 PM

I did a little experiment. It's nothing concrete tied to a real project, but it looks clearly plausible whatever the vintage system. The idea is to keep only the most significant parts of a wavelet transform, which can be applied to digitized sound and video.

My main objective was to try an extreme case of compression that can cope with old processors like the Zilog Z80 (no hardware multiplication and no floating-point values). I wrote a paper (available as a PDF file) that briefly explains what a 1-dimensional Haar wavelet transform is and why I think it can be used for lossy compression adapted to vintage systems. Maybe someone will find it interesting enough to investigate the idea and push it into cool projects that use extreme lossy compression. With this in mind, I've decided to publish my paper.
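For readers who have not met the transform before, here is a minimal sketch of a 1-dimensional Haar transform that uses only additions, subtractions and shifts, which is the kind of arithmetic a Z80 can do cheaply. It is not taken from the paper: the function names, the coefficient layout and the rounding rule (an integer "average and difference" variant that round-trips exactly) are all assumptions made for illustration.

```python
def haar_forward(samples):
    """Multi-level integer Haar transform (illustrative layout, not the paper's).

    Output layout: [overall average, coarsest difference, ..., finest differences].
    Only +, - and >> are used, so no multiplication is needed.
    """
    data = list(samples)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    while n > 1:
        half = n // 2
        avgs, diffs = [], []
        for i in range(half):
            a, b = data[2 * i], data[2 * i + 1]
            d = a - b               # detail: difference of the pair
            s = b + (d >> 1)        # floor of the pair's average
            avgs.append(s)
            diffs.append(d)
        data[:n] = avgs + diffs     # averages first, then this level's details
        n = half
    return data


def haar_inverse(coeffs):
    """Exact inverse of haar_forward, again using only +, - and >>."""
    data = list(coeffs)
    n = 1
    while n < len(data):
        out = []
        for s, d in zip(data[:n], data[n:2 * n]):
            b = s - (d >> 1)
            out.append(b + d)       # first sample of the pair
            out.append(b)           # second sample of the pair
        data[:2 * n] = out
        n *= 2
    return data


signal = [10, 12, 14, 13, 9, 7, 6, 8]
assert haar_inverse(haar_forward(signal)) == signal
```

With nothing discarded the round trip is lossless; the compression only appears once some of the differences are dropped or coarsely quantized.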

Along with it, I've published an audio test file that lets you hear the effect of discarding the less significant wavelets from the transform (usually white noise and low-volume frequencies). The compression ratio can go beyond that of LPC (Linear Predictive Coding), but the quality is far worse because the method is not based on speech compression principles.
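As a rough illustration of what "discarding less significant wavelets" can mean, the sketch below zeroes every coefficient except the few with the largest magnitude. The selection rule (keep the `keep` largest, always keep the first coefficient as the overall level) and the name `keep_largest` are assumptions for this example, not necessarily the rule used for the published test file.

```python
def keep_largest(coeffs, keep):
    """Zero all but the `keep` largest-magnitude coefficients.

    Index 0 is assumed to hold the overall average and is always kept.
    """
    # Rank the remaining indices by magnitude, largest first.
    order = sorted(range(1, len(coeffs)),
                   key=lambda i: abs(coeffs[i]),
                   reverse=True)
    kept = set(order[:keep])
    return [c if (i == 0 or i in kept) else 0
            for i, c in enumerate(coeffs)]


coeffs = [50, 3, -1, 8, 0, 1, -6, 2]
pruned = keep_largest(coeffs, 3)
print(pruned)   # [50, 3, 0, 8, 0, 0, -6, 0]
```

Note that a real encoder would also have to record which positions survive, so the space saved is smaller than the count of zeroed coefficients suggests.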

http://newcoleco.dev...processors.html

#2 chjmartin2 OFFLINE  

chjmartin2

    Moonsweeper

  • 259 posts
  • Location:Massachusetts

Posted Wed May 4, 2011 8:36 PM

How would you go about playing back the sample? As an example, the Mattel Aquarius only has a speaker that can be on or off, so we have to use PWM, or sample at 1 bit and play that back. How would you go about implementing your compression?

#3 newcoleco ONLINE  

newcoleco

    Stargunner

  • 1,053 posts

Posted Thu May 5, 2011 8:19 AM

chjmartin2, on Wed May 4, 2011 8:36 PM, said:

How would you go about playing back the sample? As an example, the Mattel Aquarius only has a speaker that can be on or off, so we have to use PWM, or sample at 1 bit and play that back. How would you go about implementing your compression?

Quick answer: encoding a complex digital sound by reducing it to a 1-bit signal is already a lossy compression. You don't need Haar wavelets, unless you want to torture yourself trying to make the result make sense or to get any real compression at all.


I'm glad that at least someone here commented on my message about the possible application of wavelet compression to 8-bit systems.

Wavelet transforms are used these days as an alternative approach to lossy data compression. The most common use is for pictures, as in the JPEG2000 format. The idea is to apply mathematics to your data in order to transform it into numbers that show what is obviously important to encode and what can be ignored, depending on the level of detail you want to keep in the final result.

If the original data is like noise, then lossy compression will either be inefficient or produce unwanted visible or audible artefacts. But in a nice picture (jpeg) or piece of music (mp3), the data sequence is mostly smooth, with variations that follow a certain harmony and regions of pretty much the same colors or tones, which is what makes the data compressible with wavelets.

In my paper (PDF file) I talk about the Haar wavelet because its square shape makes it the best candidate for possible applications on 8-bit systems, including digital sound compression. However, it's not a no-brainer solution: it may or may not work for you, depending on the capabilities of the system and how you deal with them.

In your case, if the speaker can only be muted or not (1 bit: 0 or 1), so that no volume variation is possible to simulate a "smooth" wave, then a wavelet method of data compression will not work for you, because even the Haar wavelet implies at least 3 possible states (-1, 0, 1). If you try to use wavelets to compress multiple 1-bit values as 8-bit data, you'll get a result that will not match what you expected. And considering that the data is already encoded as 1-bit values, you'll either not know how to encode the transformed data in order to save space (compression), or you'll get a result that is even less interesting to use (too much lossy compression makes no sense).
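The "at least 3 possible states" point can be checked by brute force. The sketch below runs one Haar step over every possible pair of 1-bit samples, assuming the plain sum/difference form of the step (an assumption for illustration; the paper may normalize differently).

```python
from itertools import product


def haar_pair(a, b):
    """One unnormalized Haar step: (sum, difference) of a sample pair."""
    return a + b, a - b


# Every possible pair of 1-bit samples and what it transforms into.
pairs = {bits: haar_pair(*bits) for bits in product((0, 1), repeat=2)}
detail_values = sorted({d for _, d in pairs.values()})

print(pairs)          # {(0, 0): (0, 0), (0, 1): (1, -1), (1, 0): (1, 1), (1, 1): (2, 0)}
print(detail_values)  # [-1, 0, 1]
```

Two input bits turn into a sum with three possible values and a difference with three possible values, and all four input pairs still map to four distinct outputs, so the transform by itself saves nothing on 1-bit data.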

#4 retroclouds OFFLINE  

retroclouds

    Stargunner

  • 1,095 posts
  • Location:Germany

Posted Thu May 5, 2011 11:36 AM

So from a practical side, how would this work out on the ColecoVision? Can you give a rough estimate of how much ROM and RAM the player would require for playing a sample?
Considering the speech example with 75% compression, how much ROM space would the sample itself take? Are we talking about a few kilobytes, or would it be a lot more?

Also, do you think the player would require all available Z80 CPU power, or would it leave enough room for other tasks?
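For a sense of scale, the ROM question comes down to simple arithmetic once a sample format is fixed. The thread gives no sample rate, bit depth or duration, so the figures below (8000 Hz, 8-bit mono, 2 seconds) are purely hypothetical, and "75% compression" is read here as a 75% size reduction.

```python
import math


def compressed_size(duration_s, rate_hz, bits_per_sample, reduction):
    """Bytes needed after removing `reduction` (0..1) of the raw size."""
    raw_bytes = duration_s * rate_hz * bits_per_sample / 8
    return math.ceil(raw_bytes * (1 - reduction))


# Hypothetical clip: 2 s of 8-bit mono audio at 8000 Hz, reduced by 75%.
print(compressed_size(2.0, 8000, 8, 0.75))   # 4000
```

Under those assumptions the raw clip is 16000 bytes and the compressed one about 4000 bytes, so a short speech sample would be in the "few kilobytes" range; a different rate or duration scales the result linearly.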

#5 chjmartin2 OFFLINE  

chjmartin2

    Moonsweeper

  • 259 posts
  • Location:Massachusetts

Posted Thu May 5, 2011 7:34 PM

Quote

In your case, if the speaker can only be muted or not (1 bit: 0 or 1), so that no volume variation is possible to simulate a "smooth" wave, then a wavelet method of data compression will not work for you, because even the Haar wavelet implies at least 3 possible states (-1, 0, 1). If you try to use wavelets to compress multiple 1-bit values as 8-bit data, you'll get a result that will not match what you expected. And considering that the data is already encoded as 1-bit values, you'll either not know how to encode the transformed data in order to save space (compression), or you'll get a result that is even less interesting to use (too much lossy compression makes no sense).

Ok, hear me out. Right now, I can create audio using Pulse Width Modulation at 4 bits. (http://www.atariage....nd-on-aquarius/) Because I had to do shifts in order to lower the bit depth, I think I used up too many cycles downsampling the 8-bit sample I had stored. I couldn't find a program to reduce the bit depth below 8 bits... Anyway, I am off topic. My thought is this: I think I have enough cycles on the Aquarius (Z80 ~ 4 MHz) to do PWM at 6 bits (or at least 5 bits), but I still can't store any real amount of audio. (http://en.wikipedia....wiki/PC_speaker)

What I am thinking is that if I took a 6-bit audio file and used your compression technique, maybe I'd have enough cycles to decode the wavelets, encode the result as a 1-bit PWM stream, and play it back. I had thought about trying to implement GSM or something like that, but that looked too processor-intensive.
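The last stage of that pipeline, turning a decoded 6-bit sample into a 1-bit PWM stream, can be sketched as follows. This is only a model of the idea: the names `pwm_bits` and `pwm_stream` are made up for the example, and on a real Z80 the output would be a timed loop toggling the speaker rather than a list of bits.

```python
def pwm_bits(sample, depth=6):
    """One PWM period for a sample: `sample` high slots, then low slots.

    With depth=6 a period has 64 slots, so the duty cycle is sample/64.
    """
    period = 1 << depth
    if not 0 <= sample < period:
        raise ValueError("sample out of range for this bit depth")
    return [1] * sample + [0] * (period - sample)


def pwm_stream(samples, depth=6):
    """Concatenate one PWM period per sample into a 1-bit stream."""
    bits = []
    for s in samples:
        bits.extend(pwm_bits(s, depth))
    return bits


one_period = pwm_bits(16)
print(len(one_period), sum(one_period))   # 64 16
```

Each extra bit of depth doubles the number of speaker slots per sample, which is why the cycle budget limits how many bits are practical.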

#6 newcoleco ONLINE  

newcoleco

    Stargunner

  • 1,053 posts

Posted Thu May 5, 2011 9:31 PM

retroclouds, on Thu May 5, 2011 11:36 AM, said:

So from a practical side, how would this work out on the ColecoVision? Can you give a rough estimate of how much ROM and RAM the player would require for playing a sample?
Considering the speech example with 75% compression, how much ROM space would the sample itself take? Are we talking about a few kilobytes, or would it be a lot more?

Also, do you think the player would require all available Z80 CPU power, or would it leave enough room for other tasks?

I can't remember exactly what I had in mind back then.

If the decompression routine is well optimized, my guess was that you can keep the computations down to very simple +1 and -1 operations and decode the compressed data almost like a stream, avoiding the need for extra RAM to decompress the data before playing the result. So the ROM and RAM space really depends on how you want to implement what I've proposed in the paper.
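One way such a stream-like decoder could look is sketched below. It is a guess at the idea, not a tested ColecoVision routine: it assumes a simplified Haar layout in which each sample is the top-level average plus or minus one difference per level, and the name `stream_decode` and the coefficient order are hypothetical.

```python
def stream_decode(coeffs):
    """Yield samples one at a time from Haar coefficients, with no output buffer.

    Assumed layout: [average, level-0 difference, 2 level-1 differences, ...],
    where a pair is rebuilt as (parent + d, parent - d).
    """
    n = len(coeffs)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    levels = n.bit_length() - 1
    for i in range(n):
        value = coeffs[0]
        for level in range(levels):
            # Coefficient covering sample i at this level.
            d = coeffs[(1 << level) + (i >> (levels - level))]
            # The matching bit of i selects the +1 or -1 branch.
            if (i >> (levels - level - 1)) & 1:
                value -= d
            else:
                value += d
        yield value


print(list(stream_decode([10, 2, 1, -3])))   # [13, 11, 5, 11]
```

Each sample costs one addition or subtraction per level and reads the coefficients in place, so only a couple of counters are needed in RAM; discarded (zero) coefficients simply contribute nothing.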

I've never done the coding to test my idea on the ColecoVision. I made my audio examples by compressing and decompressing audio samples, based on what it might sound like in an ideal world.



