Noise shaped ADPCM encoding 
Sunday, October 22, 2006, 11:39 AM - Audio coding
ADPCM is a very basic audio coding scheme, and is quite fast to encode and decode. The drawback is that compression is quite limited (fixed to 4 bits per sample), and so is the quality.

As it's based on encoding a difference from the predicted sample, and that prediction is based on previous samples, I already thought about a modified ADPCM encoder that would work on a window of a few samples, in order to select the optimal encoding for this whole window, instead of only considering the current sample.

I've since found a blog post by M.Neidermayer about this, and his explanations are way better than mine.

Now, let's take this further: what about a noise shaped ADPCM encoding? Instead of a simple SNR optimisation we could shape the ADPCM noise in the frequency domain. We could base it on SMR (signal to mask ratio) values, but even just using the ATH (absolute threshold of hearing) would probably be a big improvement.
To do this, we could consider a bigger window of samples. 128 samples would be comfortable regarding frequency resolution, but down to 32 samples would probably be useable. Over this window, we would do a time to frequency transform, and do an election of the best ADPCM encoded vector based on the distance from the ideal frequency based distortion.

The problem is now the computationnal cost of this. Using a coarse 32 samples window, we would need to compute 4^32 quantizations and transforms, which would be prohibitive. (please note how I managed to change ADPCM from a very fast coding scheme to something that would take years to compute)

The open question is now: how to reduce the complexity of this?
  |  0 trackbacks   |  permalink   |  related link