Author Topic: Signal filtering - Could this work ? (Read 4529 times)

matb · « **on:** April 19, 2024, 12:27:51 pm »

Hi,
I am creating a product which will be a weighing loadcell for a client.

The client wants a moving average of the digital signal of around 30 seconds. But the twist is that they also want the ADC frequency to be 400Hz.

With each stored value being 4 bytes, that makes for a very big buffer!
4 bytes x 400 = 1600 bytes for 1 second; x30 is 48,000 bytes!
Maybe I could make this work for 1 signal, but I have 3 or 4 of them in the prototype software!

So the buffer size won't be doable...

So I was thinking of storing only one averaged value per, for example, 500ms.
That would be just 2 values per second in the moving average buffer!

But is that OK from a signal filtering point of view? Sensor data in Excel shows it working, but is my idea a red flag?

I am wondering whether the ADC running at 25Hz, 100Hz or 400Hz is going to change anything if we are averaging over such a long window.
And just for the record, marketing already said that "of course it's possible, we'll make it work".

eutectique · « **Reply #1 on:** April 19, 2024, 01:55:30 pm »

Did the client mention "oversampling"?

Say, with 64x oversampling, you can keep the wanted 400 Hz sampling rate while having 64 times fewer samples per time unit. And you can sell it to the client as "3 additional bits of resolution".

Here is one of the papers: https://www.ti.com/lit/an/sprad55/sprad55.pdf

eutectique · « **Reply #2 on:** April 19, 2024, 02:27:45 pm »

Also ...

Quote from: matb on April 19, 2024, 12:27:51 pm

With each stored value being 4 bytes, that makes for a very big buffer!

Why not 2 bytes?

nctnico · « **Reply #3 on:** April 19, 2024, 03:18:17 pm »

Moving average is a low-pass filter. The easiest way is to start with the value of the first sample and calculate the filter as new = (old * (filterfactor -1) + sample) / filterfactor. Filterfactor determines how much each new sample adds to the filter output. Note that due to the division, you'll need to shift the values to the left by the number of bits used for filterfactor. Otherwise you'll get large rounding errors. If filterfactor is a power of two, then the division is a simple right shift operation which is quick to perform.
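
A minimal C sketch of the formula above, assuming signed 32-bit samples and a filterfactor of 64 (a 6-bit shift). The names `ema_filter`, `ema_update` and `FILTER_SHIFT` are made up for illustration. The state is kept pre-shifted to the left, as described, so the division does not throw away the low bits:

```c
#include <stdint.h>

/* Hypothetical names. FILTER_SHIFT = 6 means filterfactor = 64. */
#define FILTER_SHIFT 6

typedef struct {
    int64_t state;   /* filter output scaled up by 2^FILTER_SHIFT */
    int     primed;  /* 0 until the first sample has been seen */
} ema_filter;

/* Feed one sample; returns the filtered value in the same units as the input.
   Assumes >> on a negative value is an arithmetic shift, which is
   implementation-defined in C but true for the usual MCU compilers. */
static int32_t ema_update(ema_filter *f, int32_t sample)
{
    if (!f->primed) {
        /* Start at the first sample so the output does not ramp up from zero. */
        f->state = (int64_t)sample * (1 << FILTER_SHIFT);
        f->primed = 1;
    } else {
        /* In scaled units, new = (old*(filterfactor-1) + sample)/filterfactor
           becomes state = state - state/filterfactor + sample. */
        f->state += sample - (f->state >> FILTER_SHIFT);
    }
    return (int32_t)(f->state >> FILTER_SHIFT);
}
```

The whole filter needs one 64-bit word of state per signal, regardless of how long the effective averaging time is; the price is that the response is exponential rather than a true 30-second window.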

xvr · « **Reply #4 on:** April 19, 2024, 03:42:28 pm »

Does the client require a 400Hz update rate? If not, you can size the sliding window by the update rate rather than the sampling rate. Just accumulate SamplingRate/UpdateRate samples into one value before putting it into the buffer.

NorthGuy · « **Reply #5 on:** April 19, 2024, 06:50:50 pm »

What does he want to do with these numbers? There's no point of just storing the numbers in memory, they must be consumed somehow. How?

MarkT · « **Reply #6 on:** April 19, 2024, 07:16:45 pm »

How much RAM is available? If there's enough then use it, and show them how bad it behaves (abrupt change will take 30 seconds to register fully).

I think what is more useful is adaptive filtering, where after an abrupt change a simple low-pass filter is used, but once the signal is stable a wide moving average can then take over to reduce noise. Something like that.

Perhaps it's best to figure out what the requirements actually are for performance.

radiolistener · « **Reply #7 on:** April 19, 2024, 07:34:56 pm »

Quote from: matb on April 19, 2024, 12:27:51 pm

But is that OK from a signal filtering point of view? Sensor data in Excel shows it working, but is my idea a red flag?

No, that is not OK. A smaller sample buffer gives less processing gain and, as a result, worse dynamic range and higher noise.

By the way, a moving average is simple, but it is not a good low-pass filter; it may make sense to use something better, depending on your needs.
For example, a FIR filter can give much better results, and with linear phase.

MarkT · « **Reply #8 on:** April 19, 2024, 07:41:07 pm »

Moving average filter is an example of a FIR filter and is linear phase. But much much less resource hungry!

Best to find the real requirements, I suspect adaptive filtering of some form is what is really wanted to get best of both worlds (low noise, rapid response to change).

RandallMcRee · « **Reply #9 on:** April 19, 2024, 07:46:32 pm »

What you want (most likely) is a streaming average. Easy to do in a single small buffer.

Here is a presentation of the concept: https://nestedsoftware.com/2018/03/20/calculating-a-moving-average-on-streaming-data-5a7k.22879.html

The exponential moving average is an approximation as someone mentioned, is very efficient, but again approximate.

Here is the Java code for the streaming algorithm (it uses only 5 elements); it is easily adapted to a microcontroller:

Code:

package arduino;
/*
* Copyright (c) 2003, the JUNG Project and the Regents of the University
* of California
* All rights reserved.
*
* This software is open-source under the BSD license.
* See http://jung.sourceforge.net/license.txt for a description.
*/

public class DataMoments {
    /**
     * A data structure representing the central moments of a distribution including: <ul>
     * <li> the mean </li>
     * <li> the variance </li>
     * <li> the skewness</li>
     * <li> the kurtosis </li></ul> <br>
     * Data values are each passed into this data structure via the accumulate(...) method
     * and the corresponding central moments are updated on each call
     *
     * @author Didier H. Besset (modified by Scott White and, subsequently, by Leland Wilkinson)
     */
    private double[] moments;

    public DataMoments() {
        moments = new double[5];
    }

    public void accumulate(double x) {
        if (Double.isNaN(x) || Double.isInfinite(x))
            return;
        double n = moments[0];
        double n1 = n + 1;
        double n2 = n * n;
        double delta = (moments[1] - x) / n1;
        double d2 = delta * delta;
        double d3 = delta * d2;
        double r1 = n / n1;
        moments[4] += 4 * delta * moments[3] + 6 * d2 * moments[2] + (1 + n * n2) * d2 * d2;
        moments[4] *= r1;
        moments[3] += 3 * delta * moments[2] + (1 - n2) * d3;
        moments[3] *= r1;
        moments[2] += (1 + n) * d2;
        moments[2] *= r1;
        moments[1] -= delta;
        moments[0] = n1;
    }

    public double mean() {
        return moments[1];
    }

    public double count() {
        return moments[0];
    }

    public double kurtosis() {
        if (moments[0] < 4)
            return Double.NaN;
        double kFact = (moments[0] - 2) * (moments[0] - 3);
        double n1 = moments[0] - 1;
        double v = variance();
        return (moments[4] * moments[0] * moments[0] * (moments[0] + 1) / (v * v * n1) - n1 * n1 * 3) / kFact;
    }

    public double skewness() {
        if (moments[0] < 3)
            return Double.NaN;
        double v = variance();
        return moments[3] * moments[0] * moments[0] / (Math.sqrt(v) * v * (moments[0] - 1) * (moments[0] - 2));
    }

    public double standardDeviation() {
        return Math.sqrt(variance());
    }

    public double variance() {
        if (moments[0] < 2)
            return Double.NaN;
        return moments[2] * moments[0] / (moments[0] - 1);
    }
}

radiolistener · « **Reply #10 on:** April 19, 2024, 08:04:22 pm »

Quote from: MarkT on April 19, 2024, 07:41:07 pm

But much much less resource hungry!

But it has a gentle cut-off slope in its frequency response and pretty mediocre rejection in the stopband, which makes it unsuitable for many DSP applications.

Nominal Animal · « **Reply #11 on:** April 19, 2024, 08:08:16 pm »

If your MCU supports SPI, you could always use e.g. Microchip 23LC512 (524288 bits = 65536 bytes).

For the moving average, you only read one word and write one word per ADC sample, i.e. 400 reads (56 SPI clock cycles each) and 400 writes (56 SPI clock cycles each) per second: no need for DMA or hardware external memory support at all. Essentially, you'd use the SRAM as a cyclic buffer of 16384 words. By controlling how many words are between the write/head and the read/tail addresses, you can choose any window size between 2 and 16384 (16383).

You do need a 32+14 = 46-bit accumulator, too, with each new ADC sample added to it (and each oldest sample subtracted from it when the desired window length has been reached). The running average is then the 46-bit accumulator sum divided by the number of samples summed. At 400 samples/second, the maximum window duration would be 40.96 seconds. Note that you do need a division operation, 46-bit unsigned integer divided by a 16-bit unsigned integer, yielding a 32-bit unsigned integer (with overflow impossible), but this should not be a problem in practice, as most compilers provide 64-bit support (uint64_t) which works perfectly well for this.
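
As a rough C sketch of that scheme: the ring buffer below lives in an ordinary array standing in for the external SRAM, and `sram_read_word` / `sram_write_word` are hypothetical helpers that would issue the SPI transfers on real hardware. The `boxcar` type and `boxcar_update` are also made-up names. Each sample costs exactly one read (once the window is full) and one write:

```c
#include <stdint.h>

#define RING_WORDS 16384u   /* 65536 bytes / 4 bytes per sample */

/* Stand-in for the external SRAM. On real hardware these two helpers
   (hypothetical names) would perform the SPI read and write transfers. */
static uint32_t fake_sram[RING_WORDS];
static uint32_t sram_read_word(uint16_t index)              { return fake_sram[index]; }
static void     sram_write_word(uint16_t index, uint32_t v) { fake_sram[index] = v; }

typedef struct {
    uint64_t sum;     /* running sum; 46 bits are enough for 2^14 32-bit samples */
    uint16_t head;    /* slot that receives the next sample */
    uint16_t count;   /* samples currently inside the window */
    uint16_t window;  /* window length in samples, 1..RING_WORDS */
} boxcar;

/* Add one sample and return the rounded average of the current window. */
static uint32_t boxcar_update(boxcar *b, uint32_t sample)
{
    if (b->count == b->window)
        b->sum -= sram_read_word(b->head);  /* slot at head holds the oldest sample */
    else
        b->count++;                         /* window is still filling up */

    sram_write_word(b->head, sample);
    b->sum += sample;

    if (++b->head >= b->window)
        b->head = 0;

    return (uint32_t)((b->sum + b->count / 2u) / b->count);
}
```

For a 30-second window at 400 samples/second, `window` would be set to 12000 and the other fields cleared to zero.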

gf · « **Reply #12 on:** April 19, 2024, 10:16:54 pm »

Quote from: radiolistener on April 19, 2024, 07:34:56 pm

But it has a gentle cut-off slope in its frequency response and pretty mediocre rejection in the stopband, which makes it unsuitable for many DSP applications.

Yes, the selectivity and stopband attenuation are not good.
However, you won't find a FIR filter of the same length that has a better (lower) equivalent noise bandwidth (ENBW).
If noise reduction is the only goal (and selectivity is not important), then boxcar averaging over the entire desired settling time is still the best choice.

Another advantage of a FIR filter is a bounded 0-100% step response settling time, while the settling of an IIR filter is asymptotic. Example: An exponential moving average filter which settles to (say) 99.9% within 30 seconds has an even worse selectivity and stop band attenuation than boxcar averaging over 30 seconds (whose step response settles to 100% in 30 seconds), and the ENBW of this exponential moving average filter is ~3.45 times higher, too. Frequency response is attached for comparison.
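
The ~3.45 figure can be reproduced with a short calculation. This is only a sketch under two assumptions not spelled out above: the noise is white, and the exponential filter is the usual one-pole form y[n] = (1-α)·y[n-1] + α·x[n]. Here N is the window length in samples and G is the noise power gain, i.e. the sum of the squared impulse response:

```latex
\begin{aligned}
G_{\text{boxcar}} &= \sum_{k=0}^{N-1} \frac{1}{N^{2}} = \frac{1}{N} \\
G_{\text{EMA}} &= \sum_{k=0}^{\infty} \alpha^{2} (1-\alpha)^{2k} = \frac{\alpha}{2-\alpha} \\
(1-\alpha)^{N} &= 10^{-3} \;\Rightarrow\; \alpha \approx \frac{\ln 1000}{N} \quad (N \gg 1) \\
\frac{G_{\text{EMA}}}{G_{\text{boxcar}}} &= \frac{N\alpha}{2-\alpha} \approx \frac{\ln 1000}{2} \approx 3.45
\end{aligned}
```

The ratio does not depend on N, so it holds for the 12000-sample window discussed here as well as for any other long window.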

nctnico · « **Reply #13 on:** April 20, 2024, 02:06:58 pm »

Quote from: gf on April 19, 2024, 10:16:54 pm

Quote from: radiolistener on April 19, 2024, 07:34:56 pm
But it has a gentle cut-off slope in its frequency response and pretty mediocre rejection in the stopband, which makes it unsuitable for many DSP applications.

Yes, the selectivity and stopband attenuation are not good.
However, you won't find a FIR filter of the same length that has a better (lower) equivalent noise bandwidth (ENBW).

True. Also keep in mind that you can have filtering in the analog domain which then adds to the boxcar filter response. With the boxcar filter having a steep roll-off, the real world results are typically pretty good.

matb · « **Reply #14 on:** April 20, 2024, 04:41:45 pm »

Thank you all for the feedback; I wasn't expecting signal filtering to be such a popular topic. I am no filtering pro, just out of college and creating my first product, and I hoped to get "senior" feedback on how to handle this request.

I will try to answer your questions.

Quote from: xvr on April 19, 2024, 03:42:28 pm

Does the client require a 400Hz update rate? If not, you can size the sliding window by the update rate rather than the sampling rate. Just accumulate SamplingRate/UpdateRate samples into one value before putting it into the buffer.

Actually no, the update interval for the communication would be ~200ms. What you're describing seems to fit what I originally planned on doing. What would be the downside of this? If I understand correctly, you would average all samples between two communication sends?

Quote from: NorthGuy on April 19, 2024, 06:50:50 pm

What does he want to do with these numbers? There's no point of just storing the numbers in memory, they must be consumed somehow. How?

The moving average filter requires that all values are stored ? Am I wrong about this ?

Quote from: MarkT on April 19, 2024, 07:16:45 pm

How much RAM is available?
If there's enough then use it, and show them how bad it behaves (abrupt change will take 30 seconds to register fully).

I think what is more useful is adaptive filtering, where after an abrupt change a simple low-pass filter is used, but once the signal is stable a wide moving average can then take over to reduce noise. Something like that.

Perhaps it's best to figure out what the requirements actually are for performance.

There is not enough memory for all the signals they wish to filter, won't go into details but the moving average of 30s is a requirement for at least 4 other sensors...
I will look at the adaptive filter in depth thanks for the tips.

Quote from: Nominal Animal on April 19, 2024, 08:08:16 pm

If your MCU supports SPI, you could always use e.g. Microchip 23LC512 (524288 bits = 65536 bytes).

For the moving average, you only read one word and write one word per ADC sample, i.e. 400 reads (56 SPI clock cycles each) and 400 writes (56 SPI clock cycles each) per second: no need for DMA or hardware external memory support at all. Essentially, you'd use the SRAM as a cyclic buffer of 16384 words. By controlling how many words are between the write/head and the read/tail addresses, you can choose any window size between 2 and 16384 (16383).

You do need a 32+14 = 46-bit accumulator, too, with each new ADC sample added to it (and each oldest sample subtracted from it when the desired window length has been reached). The running average is then the 46-bit accumulator sum divided by the number of samples summed. At 400 samples/second, the maximum window duration would be 40.96 seconds. Note that you do need a division operation, 46-bit unsigned integer divided by a 16-bit unsigned integer, yielding a 32-bit unsigned integer (with overflow impossible), but this should not be a problem in practice, as most compilers provide 64-bit support (uint64_t) which works perfectly well for this.

That looks very interesting but the hardware design is already locked.
Could I use the EEPROM of the µC for that ? Not sure about speeds ...

xvr · « **Reply #15 on:** April 20, 2024, 04:56:26 pm »

> Actually no the update rate for the communication would be ~200ms.

So, 30 sec * 5 data-pack/sec = 150 records. Size of sliding window buffer should be 150 records.

> If I understand correctly you would average all sample between 2 communication send

Yes.

> What would be the downside of this ?

No downside. Result will be absolutely the same, as with buffer for all samples (30*400 = 12000)

You are required to send only 1 in 80 of the sampled points, so there is no reason to store all 80 samples individually: 79 of them are never sent out. Sum them right at sampling time.
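
A small C sketch of that idea, assuming unsigned 32-bit samples, 80 samples per 200 ms block and 150 blocks per window. All names (`block_avg`, `block_avg_push`, `BLOCK`, `BLOCKS`) are made up for illustration. Only one sum per block is stored, yet at every update instant the reported value equals the average over the last 12000 raw samples:

```c
#include <stdint.h>

#define BLOCK   80u    /* samples per 200 ms update at 400 Hz */
#define BLOCKS  150u   /* 30 s / 0.2 s */

typedef struct {
    uint64_t block_sum[BLOCKS]; /* one finished sum per 200 ms block */
    uint64_t window_sum;        /* sum of all stored block sums */
    uint64_t acc;               /* sum of the block currently being filled */
    uint32_t acc_n;             /* samples in the block currently being filled */
    uint32_t next;              /* slot that receives the next finished block */
    uint32_t filled;            /* number of valid blocks, at most BLOCKS */
} block_avg;

/* Add one ADC sample. Returns 1 when a block was completed and *avg holds a
   fresh average (over fewer than 30 s while the window is still filling),
   otherwise returns 0 and leaves *avg untouched. */
static int block_avg_push(block_avg *b, uint32_t sample, uint32_t *avg)
{
    b->acc += sample;
    if (++b->acc_n < BLOCK)
        return 0;

    /* Block complete: replace the oldest block sum with the new one. */
    if (b->filled == BLOCKS)
        b->window_sum -= b->block_sum[b->next];
    else
        b->filled++;
    b->block_sum[b->next] = b->acc;
    b->window_sum += b->acc;
    b->next = (b->next + 1u) % BLOCKS;
    b->acc = 0;
    b->acc_n = 0;

    uint64_t n = (uint64_t)b->filled * BLOCK;
    *avg = (uint32_t)((b->window_sum + n / 2u) / n);
    return 1;
}
```

With 64-bit block sums this takes about 1.2 KB per signal instead of 48,000 bytes. The equality with the full buffer only holds at the 200 ms update instants, which is all that is ever transmitted.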

gf · « **Reply #16 on:** April 20, 2024, 06:21:42 pm »

Quote from: xvr on April 20, 2024, 04:56:26 pm

> Actually no the update rate for the communication would be ~200ms.

So, 30 sec * 5 data-pack/sec = 150 records. Size of sliding window buffer should be 150 records.

> If I understand correctly you would average all sample between 2 communication send

Yes.

> What would be the downside of this ?

No downside. Result will be absolutely the same, as with buffer for all samples (30*400 = 12000)

You are required to send only 1 in 80 of the sampled points, so there is no reason to store all 80 samples individually: 79 of them are never sent out. Sum them right at sampling time.

It is basically a 1st order CIC decimator as depicted in Figure 10 (a) in this article, with D=12000, R=80 and N=150.

Nominal Animal · « **Reply #17 on:** April 20, 2024, 07:41:39 pm »

Since each ADC sample is 32-bit, let's assume you use 48 bits (6 bytes) for each sample set, 8 bits for the number of 32-bit samples in the set, and 32+8=40 bits for the sample sum. One possibility in C is to use

Code:

#include <stdint.h>

#define  SETS  150

uint64_t  sum_total;
uint32_t  sum_count;
uint32_t  average;

volatile uint32_t  set_sum[SETS];
volatile uint8_t   set_overflow[SETS];
volatile uint8_t   set_count[SETS];
uint8_t  set_index;
uint8_t  set_size;

This takes 18+6×SETS = 918 bytes of RAM. Initially, all these are cleared to all zeroes, except set_size to SETS; it sets the window duration in number of updates (0.2 second units) between 2 and SETS, inclusive. It is okay to initialize everything again to all zeros (except set_size to the new window duration), so that previous measurements are completely ignored and a new averaging window is started from scratch.

Whenever a 32-bit ADC sample is acquired, you do

Code:

    if (__builtin_add_overflow(set_sum[set_index], SAMPLE, &(set_sum[set_index])))
        set_overflow[set_index]++;
    set_count[set_index]++;

or equivalent, i.e. add the sample value to the current sum, and increment overflow if the 32-bit value overflowed; and finally increment the current count. Normally, each set_count[] value will be between 0 and 80 (=400/5).

To update the average, you do something like

Code: [Select]

    uint32_t  state = begin_atomic();

    uint8_t  new_count = set_count[set_index];
    uint32_t  new_sum = set_sum[set_index];
    uint8_t  new_overflow = set_overflow[set_index];

    if (++set_index >= set_size)
        set_index = 0;

    uint8_t  old_count = set_count[set_index];
    uint32_t  old_sum = set_sum[set_index];
    uint8_t old_overflow = set_overflow[set_index];

    set_count[set_index] = 0;
    set_sum[set_index] = 0;
    set_overflow[set_index] = 0;

    end_atomic(state);

    sum_count += new_count;
    sum_count -= old_count;

    // Parentheses are needed: + binds tighter than << in C.
    sum_total += new_sum + ((uint64_t)(new_overflow) << 32);
    sum_total -= old_sum + ((uint64_t)(old_overflow) << 32);

    if (sum_count > 0)
        average = (sum_total + sum_count/2) / sum_count;
    else
        average = 0;  // Do not report an average

where the +sum_count/2 adds rounding halfway upwards, instead of truncation towards zero. It is somewhat important that this occurs at regular 0.2 second intervals.

Notice that the atomic (uninterruptible) part just updates the variables (to temporary variables), so it should not last for more than maybe two dozen clock cycles. All the math is done outside the critical/atomic section, and since it is only done about 5 times a second, won't need much resources even if you are using an 8-bit microcontroller.

Of course, there are many other approaches one could use, the above one is not the only possibility!

The key here is that for each sample set you maintain both the sum of samples, and the number of samples in the sum, separately. You need 30s / 0.2s = 150 such sets. That does not need to be a constant, either, as long as it is between 2 and the number of elements allocated to the arrays as above. (Changing it should always clear everything to zeroes, though; otherwise you'd need to move data around in the arrays to ensure the correct sets are used.)

If one has enough RAM (a few kilobytes) and a 32-bit MCU, and the output report format is not fixed yet, I would consider keeping both sum of samples and sum of squared samples in each set, so that in addition to the box-car average (windowed sample mean), you could also report the variance within the window. The minimum variance depends on the amount of noise in the ADC process, but problems possibly affecting the average like vibrations et cetera will increase the variance, and might be useful. The variance would be windowed exactly the same way as the data is.

You see, if say sum_total is the sum of sum_count samples, and sum_squared is the sum of those samples squared (each sample squared, then summed), then the variance of the samples is (sum_squared - sum_total*sum_total/sum_count)/sum_count or equivalently sum_squared/sum_count - (sum_total/sum_count)*(sum_total/sum_count). (Statistically, the unbiased estimate is (sum_squared - sum_total*sum_total/sum_count)/(sum_count - 1).)
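
As a sketch, the two variance formulas could look like this in C. The function names are made up, and double precision is used only to keep the example short; on a small MCU one would more likely stay in wide integers. Subtracting two large, nearly equal numbers loses precision, so with raw 32-bit samples it may help to subtract a fixed offset from each sample before squaring:

```c
#include <stdint.h>

/* Variance of 'count' samples from their sum and their sum of squares:
   sum_squared/count - (sum_total/count)^2. Returns 0 for an empty window. */
static double window_variance(double sum_total, double sum_squared, uint32_t count)
{
    if (count == 0)
        return 0.0;
    double mean = sum_total / count;
    return sum_squared / count - mean * mean;
}

/* Unbiased estimate: divides by (count - 1) instead of count.
   Returns 0 when fewer than two samples are available. */
static double window_variance_unbiased(double sum_total, double sum_squared, uint32_t count)
{
    if (count < 2)
        return 0.0;
    return (sum_squared - sum_total * sum_total / count) / (count - 1);
}
```

For example, the eight samples 2, 4, 4, 4, 5, 5, 7, 9 have sum 40 and sum of squares 232, giving a variance of 232/8 - 5² = 4.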

NorthGuy · « **Reply #18 on:** April 20, 2024, 08:57:10 pm »

Quote from: matb on April 20, 2024, 04:41:45 pm

Quote from: NorthGuy on April 19, 2024, 06:50:50 pm
What does he want to do with these numbers? There's no point of just storing the numbers in memory, they must be consumed somehow. How?
The moving average filter requires that all values are stored ? Am I wrong about this ?

Only if you are going to use every datapoint, for example if you're streaming them all out with 20 MHz SPI.

Nominal Animal · « **Reply #19 on:** April 20, 2024, 11:17:09 pm »

Quote from: matb on April 20, 2024, 04:41:45 pm

The moving average filter requires that all values are stored ?

No. The moving average or box-car filter requires that the values are stored in sets of the minimum window movement.

For example, if you know that the window will always move in multiples of 80 samples –– that would be 400 samples/second at 0.2 second intervals, 400×0.2=80 –– then it suffices to store the values in groups of 80 samples. (In my example C code above, I called such groups sets.)

I pointed out above that for each group, you should store the sum of the samples, plus the number of samples summed into the group.
(If you also store the sum of squared samples, you can report the variance of the samples within the window, too.)

The reason for this is that within the first 30 seconds from initialization, you do not have a full set of samples; but recording also the number of samples allows you to report the current filtered value (or mean) as the window extends. That is, the exact window size is limited by your buffer size, but can be varied at run time.

SiliconWizard · « **Reply #20 on:** April 20, 2024, 11:17:43 pm »

As it's not quite unlikely to be a bit of an X-Y problem, maybe the OP can elaborate a bit on what the customer really wants to achieve - and if the OP doesn't fully know, I think they should ask.

From just the requirements - 400 Hz sampling, and a "long-term" moving average - my guess: it may be part of a machine that will measure low-frequency vibrations/oscillations around a baseline (baseline which would be determined by the moving average). Which is something that could possibly be tackled with a slightly different approach in terms of DSP.

Either that, or are they just willing to use a highish sample rate as oversampling, and a relatively large averaging window, in order to just increase the SNR?

matb · « **Reply #21 on:** April 22, 2024, 06:21:49 am »

Quote from: SiliconWizard on April 20, 2024, 11:17:43 pm

As it's not quite unlikely to be a bit of an X-Y problem, maybe the OP can elaborate a bit on what the customer really wants to achieve - and if the OP doesn't fully know, I think they should ask.

From just the requirements - 400 Hz sampling, and a "long-term" moving average - my guess: it may be part of a machine that will measure low-frequency vibrations/oscillations around a baseline (baseline which would be determined by the moving average). Which is something that could possibly be tackled with a slightly different approach in terms of DSP.

Either that, or are they just willing to use a highish sample rate as oversampling, and a relatively large averaging window, in order to just increase the SNR?

The goal of the product is for the loadcell to report how much weight was lost in the field (imagine a fertilizer spreader), to know how much product was spread each second.
We want to know the "real" weight loss and not the motor, road etc noises.

From my understanding the 400Hz is due to the fear of missing "road" induced variations. But then we need a big filter to clean those.

SiliconWizard · « **Reply #22 on:** April 22, 2024, 06:49:54 am »

Alright, that makes the problem clear. I'll give it a thought.

Tation · « **Reply #23 on:** April 22, 2024, 03:49:04 pm »

As I understand it, this is a low pass filter. Its pass-band limit should be above the maximum speed of «desirable/interesting» variations in load, its stop-band limit below the lowest frequency of expected disturbances. Then, implementing it with a moving average, or FIR, or IIR or whatever, may be open to discussion, but not knowing such frequency limits will not allow one to properly design the filter. Even with the moving average, how long, what sampling freq???

A question: are low frequency disturbances really avoided? Think of the variation in load due to the vehicle (is it a movable device?) being slightly tilted by road camber or slope.

In quite different applications, but still requiring separation of low freqs from high ones, with a clear and wide separation between them, I have used simple 1st order IIR filters (really simple indeed) with good results.

Nominal Animal · « **Reply #24 on:** April 22, 2024, 08:49:01 pm »

Quote from: matb on April 22, 2024, 06:21:49 am

The goal of the product is for the loadcell to report how much weight was lost in the field (imagine a fertilizer spreader), to know how much product was spread each second.
We want to know the "real" weight loss and not the motor, road etc noises.

Do you have inertial sensors at the load cells so you could do 'sensor fusion' with the load cells?

Assuming you have an inertial sensor at each load cell, the direction parallel to the load cell direction describes the effective acceleration observed by the sensor; let's call this a. When there is no movement at all, this should be very close to g = 9.8 m/s² (one standard gravity), varying a bit depending on the latitude (due to Earth's rotation).

The load cell measures the force F the load applies to the load cell.

Because F = m a (Newtonian mechanics; suffices excellently for this), you could then obtain the instantaneous estimate of the mass at any point using m = F/a.

If you have a three-axis inertial sensor that reports a_x, a_y, a_z, and in that coordinate system the unit direction towards center of Earth (calibratable at run time, whenever the vehicle/device is not moving!) is n_x, n_y, n_z (and unit meaning it has length 1, i.e. n_x²+n_y²+n_z²=1), then
m = F / (a_x×n_x + a_y×n_y + a_z×n_z)
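
In C, that estimate might be sketched as follows. The `vec3` type, the `estimate_mass` name and the 1 m/s² cutoff are assumptions made for this example, not part of any sensor API; the cutoff simply rejects moments close to free fall, where dividing by a tiny acceleration would blow up the estimate:

```c
typedef struct { double x, y, z; } vec3;

/* m = F / (a . n), with F in newtons, a in m/s^2 and n the unit vector
   along the load cell axis. Returns 0 on success, or -1 if the effective
   acceleration along n is too small for a meaningful estimate. */
static int estimate_mass(double force_n, vec3 a, vec3 n, double *mass_kg)
{
    double a_eff = a.x * n.x + a.y * n.y + a.z * n.z;

    if (a_eff < 1.0)   /* hypothetical cutoff in m/s^2 */
        return -1;

    *mass_kg = force_n / a_eff;
    return 0;
}
```

A 10 kg load at rest (98.0665 N, 9.80665 m/s² along the axis) and the same load during a 2 g bump (196.133 N, 19.6133 m/s²) both come out as 10 kg.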

(If the load is not a single solid mass, then the relationship isn't that simple; one should apply a filter –– depending on the type of mass, be it small granules, large granules, single solid mass, nonviscous liquid, viscous liquid, etc. –– to adjust a to correspond to the acceleration of the load at the load cell. Consider sand: when you bump it up, it does not come back down as a solid mass, but somewhat dispersed in time. It is some kind of decay filter, I guess.)

By fitting a line m(t) = m₀ + m_Δ t to a set of estimates m(t), you can estimate the rate at which mass is lost (or acquired).
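
One common way to do such a fit is ordinary least squares. The sketch below assumes the mass estimates and their timestamps are already collected in two arrays; `fit_line` and its parameter names are made up for illustration:

```c
#include <stddef.h>

/* Least-squares fit of m(t) = m0 + slope * t to n points.
   Returns 0 on success, or -1 if there are fewer than two points or all
   timestamps are identical (the slope is then undefined). */
static int fit_line(const double *t, const double *m, size_t n,
                    double *m0, double *slope)
{
    if (n < 2)
        return -1;

    double st = 0.0, sm = 0.0, stt = 0.0, stm = 0.0;
    for (size_t i = 0; i < n; i++) {
        st  += t[i];
        sm  += m[i];
        stt += t[i] * t[i];
        stm += t[i] * m[i];
    }

    double denom = (double)n * stt - st * st;
    if (denom == 0.0)
        return -1;

    *slope = ((double)n * stm - st * sm) / denom;
    *m0 = (sm - *slope * st) / (double)n;
    return 0;
}
```

With t in seconds and m in kilograms, `slope` is the mass change rate in kg/s; a negative value means product is being spread.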

