Author Topic: On the fly data compression with 32 bit uC  (Read 7898 times)

0 Members and 1 Guest are viewing this topic.

Offline sgpeeTopic starter

  • Contributor
  • Posts: 24
On the fly data compression with 32 bit uC
« on: November 23, 2010, 04:47:04 pm »
Hi,

Do you guys have any experience with on the fly data compression? I am not talking about complex huffman encoding, something simpler but still gives me 20-30% savings and it can operate on fixed boundaries? (i.e. 256 bytes in x bytes out, but always 256 in)

FYI. My data is fairly linear, so the 256 byte will have a nice uniform pattern, should be pretty easy to compress with the right algo.




Thx,
sgpee
 

Offline Mechatrommer

  • Super Contributor
  • ***
  • Posts: 11713
  • Country: my
  • reassessing directives...
Re: On the fly data compression with 32 bit uC
« Reply #1 on: November 23, 2010, 05:00:40 pm »
use fpga.
Nature: Evolution and the Illusion of Randomness (Stephen L. Talbott): Its now indisputable that... organisms “expertise” contextualizes its genome, and its nonsense to say that these powers are under the control of the genome being contextualized - Barbara McClintock
 

Offline Zad

  • Super Contributor
  • ***
  • Posts: 1013
  • Country: gb
    • Digital Wizardry, Analogue Alchemy, Software Sorcery
Re: On the fly data compression with 32 bit uC
« Reply #2 on: November 23, 2010, 06:02:00 pm »
The simplest would be run-length encoding (if your data is suitable) http://en.wikipedia.org/wiki/Run_length_encoding

Lempel Ziv Welch (LZW) is pretty efficient in terms of CPU time too http://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Welch

Offline oPossum

  • Super Contributor
  • ***
  • Posts: 1451
  • Country: us
  • Very dangerous - may attack at any time
Re: On the fly data compression with 32 bit uC
« Reply #3 on: November 23, 2010, 06:24:36 pm »
You may be able to encode the change in data rather than the absolute value.

I did this with a data logger that read a 12 bit ADC and stored the samples as 8 bit deltas (-127 to +127). Each block of 1024 bytes had the absolute value of the first 12 bit sample, 999 8 bit deltas, and a timestamp.

Code: [Select]
delta = adc - prev_adc;
if(delta < -127) {
delta = -127;
} else if(delta > 127) {
delta = 127;
}
prev_adc += delta;
*buffer_ptr++ = (uint8_t) delta;

complete source code
 

Offline allanw

  • Frequent Contributor
  • **
  • Posts: 343
    • Electronoblog
Re: On the fly data compression with 32 bit uC
« Reply #4 on: November 23, 2010, 06:38:40 pm »
Have you thought about using SPI to write to an SD card? Then you can have virtually unlimited storage.
 

Offline sgpeeTopic starter

  • Contributor
  • Posts: 24
Re: On the fly data compression with 32 bit uC
« Reply #5 on: November 24, 2010, 01:23:29 am »
You rock.. this is a brilliant idea.. This works because my data is linear, no extreme changes.. I can actually even reduce this to 6 or 4 bits.. Thanks man, this is really really cool..


You may be able to encode the change in data rather than the absolute value.

I did this with a data logger that read a 12 bit ADC and stored the samples as 8 bit deltas (-127 to +127). Each block of 1024 bytes had the absolute value of the first 12 bit sample, 999 8 bit deltas, and a timestamp.

Code: [Select]
delta = adc - prev_adc;
if(delta < -127) {
delta = -127;
} else if(delta > 127) {
delta = 127;
}
prev_adc += delta;
*buffer_ptr++ = (uint8_t) delta;

complete source code

 

Offline sgpeeTopic starter

  • Contributor
  • Posts: 24
Re: On the fly data compression with 32 bit uC
« Reply #6 on: November 24, 2010, 01:25:13 am »
I have.. The size is an issue for me.. I am trying to squeeze everything i got to a 20x40 board.. Even uSD will screw the size..



Have you thought about using SPI to write to an SD card? Then you can have virtually unlimited storage.
 

Offline Mechatrommer

  • Super Contributor
  • ***
  • Posts: 11713
  • Country: my
  • reassessing directives...
Re: On the fly data compression with 32 bit uC
« Reply #7 on: November 24, 2010, 05:15:57 am »
You may be able to encode the change in data rather than the absolute value.

I did this with a data logger that read a 12 bit ADC and stored the samples as 8 bit deltas (-127 to +127). Each block of 1024 bytes had the absolute value of the first 12 bit sample, 999 8 bit deltas, and a timestamp.

Code: [Select]
delta = adc - prev_adc;
if(delta < -127) {
delta = -127;
} else if(delta > 127) {
delta = 127;
}
prev_adc += delta;
*buffer_ptr++ = (uint8_t) delta;
any adc value spikes will ruin the rest of data. nice algol though with compromise, can be even further compressed to 7-4bit per adc. carefull field study is needed for such a simple algol.
Nature: Evolution and the Illusion of Randomness (Stephen L. Talbott): Its now indisputable that... organisms “expertise” contextualizes its genome, and its nonsense to say that these powers are under the control of the genome being contextualized - Barbara McClintock
 

Offline mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 13998
  • Country: gb
    • Mike's Electric Stuff
Re: On the fly data compression with 32 bit uC
« Reply #8 on: November 24, 2010, 10:03:41 am »
For  data which is amenable to difference coding but may contain occasional spikes, variable-length coding can work well, e.g. reserve some code values or bits to indicate special-cases of absolute values or higher deltas.
If using an ARM, the barrel-shifter can make variable-length coding very efficient.
 
Youtube channel:Taking wierd stuff apart. Very apart.
Mike's Electric Stuff: High voltage, vintage electronics etc.
Day Job: Mostly LEDs
 

Offline oPossum

  • Super Contributor
  • ***
  • Posts: 1451
  • Country: us
  • Very dangerous - may attack at any time
Re: On the fly data compression with 32 bit uC
« Reply #9 on: November 24, 2010, 02:32:21 pm »
Storing log(delta) can also provide greater dynamic range if necessary. Use a LUT for log/antilog. The curve doesn't have to be perfectly log, it could be linear for small to medium values and log for larger.
 

Offline oPossum

  • Super Contributor
  • ***
  • Posts: 1451
  • Country: us
  • Very dangerous - may attack at any time
Re: On the fly data compression with 32 bit uC
« Reply #10 on: November 24, 2010, 02:35:32 pm »

any adc value spikes will ruin the rest of data.

No! There will be one or more wrong values, but the error is held over so it will correct when the data settles. It is essentially a low pass filter.

Note this line in the code:

prev_adc += delta;

That is very important.
Doing it this way...

prev_adc = adc;

would cause all readings following an overflow to be incorrect.
 

Offline Mechatrommer

  • Super Contributor
  • ***
  • Posts: 11713
  • Country: my
  • reassessing directives...
Re: On the fly data compression with 32 bit uC
« Reply #11 on: November 24, 2010, 03:56:42 pm »
No! There will be one or more wrong values, but the error is held over so it will correct when the data settles. It is essentially a low pass filter.
prev_adc += delta;
correct! i just re-created it, a self-correcting algol, compromise only in the vicinity of spikes. not a 100% data match though, but brilliant!
ps: i abandoned it long time ago due to its lossy nature :P
« Last Edit: November 24, 2010, 04:24:04 pm by shafri »
Nature: Evolution and the Illusion of Randomness (Stephen L. Talbott): Its now indisputable that... organisms “expertise” contextualizes its genome, and its nonsense to say that these powers are under the control of the genome being contextualized - Barbara McClintock
 

Offline TheDirty

  • Frequent Contributor
  • **
  • Posts: 440
  • Country: ca
Re: On the fly data compression with 32 bit uC
« Reply #12 on: November 24, 2010, 04:50:20 pm »
It would not be hard to make it lossless with an RLE type scheme, if you really needed it.

packet length | start value | delta | delta | delta...
Mark Higgins
 

Offline Mechatrommer

  • Super Contributor
  • ***
  • Posts: 11713
  • Country: my
  • reassessing directives...
Re: On the fly data compression with 32 bit uC
« Reply #13 on: November 24, 2010, 05:21:39 pm »
yes but more codes is needed to do the decision. if there are still more room for the mcu, it can as well do variable length coding if more compression is needed, where it try to do the smallest bit length first, and increase the bitlength as necessary based on behaviour of data coming. zad's link will be a good read.
Nature: Evolution and the Illusion of Randomness (Stephen L. Talbott): Its now indisputable that... organisms “expertise” contextualizes its genome, and its nonsense to say that these powers are under the control of the genome being contextualized - Barbara McClintock
 

Offline oPossum

  • Super Contributor
  • ***
  • Posts: 1451
  • Country: us
  • Very dangerous - may attack at any time
Re: On the fly data compression with 32 bit uC
« Reply #14 on: November 24, 2010, 07:18:59 pm »
It would not be hard to make it lossless with an RLE type scheme, if you really needed it.

packet length | start value | delta | delta | delta...

Hmm. Inspiration for this concept...

marker | absolute | delta | delta | delta...

The marker would be a special value outside the delta range. For example a signed 8 bit value can be -128 to 127. So use -127 to 127 for delta, and -128 for marker.

No limit on run length with this method. And no streaming problems (no need to know packet length at head of data block).
 

Offline sgpeeTopic starter

  • Contributor
  • Posts: 24
Re: On the fly data compression with 32 bit uC
« Reply #15 on: November 25, 2010, 09:17:16 am »
It is easy to make this algo quasi lossless. (one value will be gone)

My data is 16 bit and if I am willing to let go of one value say 0xff than that would act as a marker. In this case, if you are outside the boundary you insert marker and reset the start value to the extreme and go from there. I need to check the values to understand how well this would work but my hunch tells me, this would easily save 20-30%.


sgpee

It would not be hard to make it lossless with an RLE type scheme, if you really needed it.

packet length | start value | delta | delta | delta...

Hmm. Inspiration for this concept...

marker | absolute | delta | delta | delta...

The marker would be a special value outside the delta range. For example a signed 8 bit value can be -128 to 127. So use -127 to 127 for delta, and -128 for marker.

No limit on run length with this method. And no streaming problems (no need to know packet length at head of data block).

 

Offline sgpeeTopic starter

  • Contributor
  • Posts: 24
Re: On the fly data compression with 32 bit uC
« Reply #16 on: November 25, 2010, 09:18:30 am »
Mike,

I am not familiar with barrel-shifter, would you care to elaborate?

Thx,
sgpee


For  data which is amenable to difference coding but may contain occasional spikes, variable-length coding can work well, e.g. reserve some code values or bits to indicate special-cases of absolute values or higher deltas.
If using an ARM, the barrel-shifter can make variable-length coding very efficient.
 
 

Offline Mechatrommer

  • Super Contributor
  • ***
  • Posts: 11713
  • Country: my
  • reassessing directives...
Re: On the fly data compression with 32 bit uC
« Reply #17 on: November 25, 2010, 11:59:51 am »
It is easy to make this algo quasi lossless. (one value will be gone)
it is easy as well to do it purely lossless as previously discussed. just simple test of if greater than 127 or less than -127 will throw out -128 (marker) and absolute adc value. the problem will be just it will expanded instead of compressed for highly oscillating adc.

another way is using table based compression. i have here a fin table prediction compression method for pc file, not mcu unfortunately (not sure where i got the "fin" name, its been years ago). it is alot simpler compared to lzw. but it will more codes to do the setup and housekeeping and only suitable for highly stable and repeatative data. it is possible to compress up to 1bit per data for a highly predictable data.
« Last Edit: November 25, 2010, 12:05:58 pm by shafri »
Nature: Evolution and the Illusion of Randomness (Stephen L. Talbott): Its now indisputable that... organisms “expertise” contextualizes its genome, and its nonsense to say that these powers are under the control of the genome being contextualized - Barbara McClintock
 

Offline mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 13998
  • Country: gb
    • Mike's Electric Stuff
Re: On the fly data compression with 32 bit uC
« Reply #18 on: November 25, 2010, 10:51:03 pm »
Mike,

I am not familiar with barrel-shifter, would you care to elaborate?

Thx,
sgpee


For  data which is amenable to difference coding but may contain occasional spikes, variable-length coding can work well, e.g. reserve some code values or bits to indicate special-cases of absolute values or higher deltas.
If using an ARM, the barrel-shifter can make variable-length coding very efficient.
 

Most ARM instructions can shift or rotate one of the operands by any number of bits with no extra execution time, so doing things like packing variable-length bitfields into 32-bit words can be done very effciently. It can also shift data by a number of bits specified in a register.
e.g. in C, the following lines can each be done as a single ARM instruction:
a=b<<5|c;
b<<=c;

Assuming of course the compiler is suitably aware of the archtecture.
Youtube channel:Taking wierd stuff apart. Very apart.
Mike's Electric Stuff: High voltage, vintage electronics etc.
Day Job: Mostly LEDs
 

Offline Mechatrommer

  • Super Contributor
  • ***
  • Posts: 11713
  • Country: my
  • reassessing directives...
Re: On the fly data compression with 32 bit uC
« Reply #19 on: November 26, 2010, 02:31:15 am »
Most ARM instructions can shift or rotate one of the operands by any number of bits with no extra execution time, so doing things like packing variable-length bitfields into 32-bit words can be
pic and avr also have this bit shift feature iirc, only a bit at a time/op... asm'wise.
Nature: Evolution and the Illusion of Randomness (Stephen L. Talbott): Its now indisputable that... organisms “expertise” contextualizes its genome, and its nonsense to say that these powers are under the control of the genome being contextualized - Barbara McClintock
 


Share me

Digg  Facebook  SlashDot  Delicious  Technorati  Twitter  Google  Yahoo
Smf