Author Topic: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware  (Read 5839 times)

0 Members and 3 Guests are viewing this topic.

Offline Nominal AnimalTopic starter

  • Super Contributor
  • ***
  • Posts: 6742
  • Country: fi
    • My home page and email address
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #25 on: June 18, 2024, 07:51:50 pm »
By using more Taylor members you can decrease the error to any arbitrary value.
No, you cannot.  That only works if you have sufficient precision in your representation of individual terms, compared to that arbitrary error limit.

Essentially, each term needs to have less error (compared to the exact value) than the arbitrary limit; typically less than half the arbitrary limit value.
With floating point types, this means that you generally cannot get less than ±1 ULP of error, and to get within the standard 0.5 ULP of error, you need to use higher precision for the terms.  If you have N terms similar in magnitude, you can expect ±N of error.

Simply put, the problem is that with floats, 10000000 + 0.5 == 10000000.


 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3237
  • Country: ca
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #26 on: June 18, 2024, 09:00:08 pm »
By using more Taylor members you can decrease the error to any arbitrary value.
No, you cannot.  That only works if you have sufficient precision in your representation of individual terms, compared to that arbitrary error limit.

Essentially, each term needs to have less error (compared to the exact value) than the arbitrary limit; typically less than half the arbitrary limit value.
With floating point types, this means that you generally cannot get less than ±1 ULP of error, and to get within the standard 0.5 ULP of error, you need to use higher precision for the terms.  If you have N terms similar in magnitude, you can expect ±N of error.

It's a matter of technique. You start from the back (from last members) where floating point numbers have excess precision because the values are much smaller than the final result and then you move towards bigger members. This way the rounding error does not accumulate.
 
The following users thanked this post: SiliconWizard

Offline Postal2

  • Frequent Contributor
  • **
  • Posts: 389
  • Country: ru
 

Offline Nominal AnimalTopic starter

  • Super Contributor
  • ***
  • Posts: 6742
  • Country: fi
    • My home page and email address
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #28 on: June 19, 2024, 06:53:36 am »
It's a matter of technique. You start from the back (from last members) where floating point numbers have excess precision because the values are much smaller than the final result and then you move towards bigger members. This way the rounding error does not accumulate.
Again, no.  That, and similar techniques like Kahan summation, will help you get down to ±1 ULP or so, but no more.

This is because the terms largest in magnitude cannot have too much error for the sum to be within the acceptable error bounds.  Lesser terms cannot compensate for loss of information from the greater terms.

In mathematical terms, we are computing polynomials of form
$$S_n = \sum_{k=0}^n s_k x^k$$
but with floating point types, we actually calculate
$$S_n = \sum_{k=0}^n \left( s_k + \epsilon_{0,k} \right) \left( x^k + \epsilon_{1,k} \right)$$
where \$\epsilon_{0,k}\$ is the error in the coefficient \$s_k\$, and \$\epsilon_{1,k}\$ is error in \$x^k\$.  Expanding the sum, we can separate out the error terms:
$$\begin{aligned}
S_n &= \sum_{k=0}^n s_k x^k \\
~ &+ \sum_{k=0}^n \epsilon_{0,k} x^k \\
~ &+ \sum_{k=0}^n \epsilon_{1,k} s_k \\
~ &+ \sum_{k=0}^n \epsilon_{0,k} \epsilon_{1,k} \\
\end{aligned}$$
The product of the errors \$\epsilon_{0,k} \epsilon_{1,k}\$ is obviously very small, and is generally not a problem at all. 

Because \$s_k\$ are the constant coefficients, \$\epsilon_{0,k}\$ are constant as well; but since \$x^k\$ depends on \$x\$, so does \$\epsilon_{1,k}\$.

That means the error sum \$\sum_{k=0}^n \epsilon_{0,k} x^k\$ depends on \$x\$, as does the error sum \$\sum_{k=0}^n \epsilon_{1,k} s_k\$, but their dependence on \$x\$ is so different, you cannot make them cancel out.

Increasing \$n\$ will not cause the already accrued error to go away.

With fixed point arithmetic, summing \$n\$ terms of \$B\$ bits each will yield a result with \$B + \lceil \log_2 n \rceil\$ bits.  Only one additional bit is needed for proper half-ULP rounding, so generally, you can get the absolute error below any arbitrary limit by using just a few additional bits compared to the absolute error limit.
« Last Edit: June 19, 2024, 08:48:49 am by Nominal Animal »
 

Offline Nominal AnimalTopic starter

  • Super Contributor
  • ***
  • Posts: 6742
  • Country: fi
    • My home page and email address
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #29 on: June 19, 2024, 07:18:08 am »
NorthGuy: To clarify, the techniques you mentioned are excellent, and I too do recommend their use.  Thank you for pointing them out!

I am only arguing that "to an arbitrary limit with simply adding more terms" is not correct, when using limited precision representations like floating-point and fixed-point formats.  Because of the limited precision in each term, there is a limit to how many terms can affect the sum; and no term can "repair" the error in other terms.
 

Offline ali_asadzadeh

  • Super Contributor
  • ***
  • Posts: 1929
  • Country: ca
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #30 on: June 19, 2024, 08:43:17 am »
Nominal Animal, thanks for sharing, would you explain what ±1 ULP means?
ASiDesigner, Stands for Application specific intelligent devices
I'm a Digital Expert from 8-bits to 64-bits
 

Offline Picuino

  • Super Contributor
  • ***
  • Posts: 1027
  • Country: es
    • Picuino web
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #31 on: June 19, 2024, 08:59:43 am »
...
over all \$1,056,964,609\$ normal (non-subnormal) IEEE 754 single-precision (float aka Binary32) values between 0 and 1.
How do you manage to test all floating values between 0 and 1? (If you post the test program, the rest of us can test our routines with the same pattern).
 

Offline gf

  • Super Contributor
  • ***
  • Posts: 1308
  • Country: de
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #32 on: June 19, 2024, 09:05:19 am »
How do you manage to test all floating values between 0 and 1?

In C/C++, you can use nextafter() or nextafterf() to iterate over representable floating point numbers.
https://en.cppreference.com/w/c/numeric/math/nextafter
 

Offline Nominal AnimalTopic starter

  • Super Contributor
  • ***
  • Posts: 6742
  • Country: fi
    • My home page and email address
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #33 on: June 19, 2024, 09:43:06 am »
Nominal Animal, thanks for sharing, would you explain what ±1 ULP means?
±1 ULP = ±1 Unit in the Least Significant Place.  It is the smallest possible representable change in a floating-point value.

In C, nextafterf(a, HUGE_VALF]) increments a by 1 ULP, yielding the closest value greater than a.
nextafterf(a, -HUGE_VALF]) decrements a by 1 ULP, yielding the closest value greater than a.

The exact size of that change depends on the magnitude of a.

To find out the error in float value compared to the expected nonzero value expect, somewhat similar to value-expect, in units of expected value ULPs, you can use
Code: [Select]
int  float_ulps(const float value, float expect, const int limit) {
    if (expect > value) {
        int  n = 0;
        do {
            n++;
            expect = nextafterf(expect, -HUGE_VALF);
        } while (n < limit && expect > value);
        return -n; // value was below expected, so error is negative

    } else
    if (expect < value) {
        int  n = 0;
        do {
            n++;
            expect = nextafterf(expect, +HUGE_VALF);
        } while (n < limit && expect < value);
        return n; // value was above expected, so error is positive

    } else {
        return 0;
    }
}
This is the most precise way of expressing the error in a value compared to what its type could support.

While one can choose which one to use as the yardstick, I believe the correct or best-known-value is the right yardstick to use.  This means that the ULPs counts I show are with respect to the correct value, and not the calculated value.
Since one is usually only interested in relatively small error ranges, say -15..+15 ULPs, the limit parameter keeps the number of iterations down.

When the expected value is zero, there is no ULP range, and the above will not return anything sensible.  I report such cases separately.



Background:

IEEE 754 Binary32 (float) is a number of format
$$v = (-1)^s \cdot m \cdot 2^{p - 127}$$
where \$v\$ is the logical numerical value, \$s\$ is the sign bit (0 or 1), \$m\$ is the 24-bit mantissa, and \$p\$ is the 8-bit exponent (between 0 and 255).

When \$p = 255\$, \$v\$ is either infinity (\$m = 0\$) or a Not-a-Number (\$m \ne 0\$).  When \$p = 0\$ and \$m = 0\$, \$v\$ is zero; this is why float can represent both +0.0 and -0.0, although the two have equal value.  When \$p = 0\$ but \$m \ne 0\$, \$v\$ is a subnormal value.  For \$1 \le p \le 254\$, \$v\$ is a normal real number.

The most significant bit of the mantissa is 1 if \$p \ge 1\$, and 0 if \$p = 0\$, so it is implicit and not stored at all.  The decimal point is logically between this implicit bit and the 23 bits of the mantissa.

The Unit in Least Significant Place refers to the magnitude of the least significant bit of the mantissa.  Its sign is used to indicate the direction of the error (positive if overshoot, negative for undershoot).  For a float, the magnitude is
$$1\text{ ULP} = 2^{-23} \cdot 2^{p - 127} = 2^{p - 140}$$
where \$p\$ depends on the value –– always the expected value, not the computed/approximate value in my opinion! –– at hand.

If we have IEEE 754 Binary32 floats –– and Intel/AMD x86 and x86-64, ARM Cortex-M4F, Cortex M7F, and RISC-V with hardware floating-point support all do –– and the byte order is the same for float as it is to uint32_t (also, the mentioned architectures do), we can calculate the actual ULPs using the binary representation of the float value:
Code: [Select]
typedef union {
    float   f;
    uint32_t  u;
    int32_t   i;
} word32;

// Distance from target to value, in units of target ULP.
int  float_ulps(const float value, const float expect, int limit) {
    if (correct == value)
        return 0;
    else
    if (correct == 0.0f)
        return (value < 0.0f) ? -16777216 : +16777216;

    const word32  v = { .f = value  };
    const word32  c = { .f = correct };

    // Different signs?
    if ((v.u ^ c.u) & 0x80000000)
        return (v.u & 0x80000000) ? -16777216 : +16777216;

    // Grab the exponent bits.
    uint32_t  ve = v.u & 0x7F800000;
    uint32_t  ce = c.u & 0x7F800000;

    // OR all exponent bits into bit position 24,
    uint32_t  vm = (ve | (ve >> 4)) & 0x07800000;
              vm = (vm | (vm >> 2)) & 0x01800000;
              vm = (vm | (vm >> 1)) & 0x00800000;
    uint32_t  cm = (ce | (ce >> 4)) & 0x07F00000;
              cm = (cm | (cm >> 2)) & 0x01800000;
              cm = (cm | (cm >> 1)) & 0x00800000;

    // Difference in exponents.
    int32_t  shift = (int32_t)(ve >> 23) - (int32_t)(ce >> 23);

    if (shift > 0) {
        if (vm) {
            vm = (vm << shift) | (v.u & 0x007FFFFF);
        } else {
            vm = (v.u & 0x007FFFFF) << shift;
        }
        cm |= (c.u & 0x007FFFFF);
    } else
    if (shift < 0) {
        if (cm) {
            cm = (cm << (-shift)) | (c.u & 0x007FFFFF);
        } else {
            cm = (c.u & 0x007FFFFF) << (-shift);
        }
        vm |= (v.u & 0x007FFFFF);
    } else {
        vm |= (v.u & 0x007FFFFF);
        cm |= (c.u & 0x007FFFFF);
    }

    return (c.u & 0x80000000) ? (int)((int32_t)cm - (int32_t)vm) : (int)((int32_t)vm - (int32_t)cm);
}
This works by constructing the exact bit representation of the two values with the exponent of the expected value, and returning their difference.

The limit is a convenience for histogramming errors, into an array with 2*limit+1 entries, with zero ULPs entry at index limit.  Then, entry at index limit-limit=0 collects all negative errors with larger magnitudes, and entry at index limit+limit=2*limit collects all positive errors with larger magnitudes.

The two float_ulps() implementations yield the exact same answers, when value and expect have the same sign, and expect is nonzero.  Edit: Added the handling for negative value and expect, and fixed the case where 2 ULP difference was occasionally reported as 1 ULP difference due to initial ULP scale being larger.
« Last Edit: June 19, 2024, 06:02:42 pm by Nominal Animal »
 
The following users thanked this post: ali_asadzadeh

Offline Nominal AnimalTopic starter

  • Super Contributor
  • ***
  • Posts: 6742
  • Country: fi
    • My home page and email address
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #34 on: June 19, 2024, 01:26:21 pm »
How do you manage to test all floating values between 0 and 1? (If you post the test program, the rest of us can test our routines with the same pattern).
I only hesitate because it is horribly ugly throwaway code I do not want to be judged by.  :-[

I'll clean it up a bit first, okay?  ;)
 

Offline ali_asadzadeh

  • Super Contributor
  • ***
  • Posts: 1929
  • Country: ca
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #35 on: June 19, 2024, 03:18:53 pm »
Quote
±1 ULP = ±1 Unit in the Least Significant Place.  It is the smallest possible representable change in a floating-point value.
Thanks, I learned something new today. >:D
ASiDesigner, Stands for Application specific intelligent devices
I'm a Digital Expert from 8-bits to 64-bits
 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3237
  • Country: ca
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #36 on: June 19, 2024, 03:20:26 pm »
NorthGuy: To clarify, the techniques you mentioned are excellent, and I too do recommend their use.  Thank you for pointing them out!

I am only arguing that "to an arbitrary limit with simply adding more terms" is not correct, when using limited precision representations like floating-point and fixed-point formats.  Because of the limited precision in each term, there is a limit to how many terms can affect the sum; and no term can "repair" the error in other terms.

Of course, if you screwed up calculation of any one term, adding more terms will not fix that no matter how accurate they are. This is self-evident.

In real world tasks, there's a certain precision that you must attain with your calculation, so you do whatever it takes to ensure you maintain the desired precision throughout your calculation. For example, not long time ago I had to calculate sin()/cos() with high precision for coordinate transform using fractionals. To do so, I had to retain extra bits in the calculation and then chop them off at the end.

How did I know how many bits to retain? Purely empirical. I just had tested all the possible numbers and made sure the result is correct. This way I had determined how many bits to retain and how many terms to use as well.

Using more bits and using more terms are different things and cannot compensate for each other. No matter how many bits you retain, if you don't use enough terms, your calculation will lack precision. Similarly, if you don't use enough bits, your calculation will not be precise no matter how many terms you use.

 

Offline Nominal AnimalTopic starter

  • Super Contributor
  • ***
  • Posts: 6742
  • Country: fi
    • My home page and email address
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #37 on: June 19, 2024, 04:14:31 pm »
In the mean time, here is a command-line interface for examining values and storage formats of floats.  It only cursorily checks that the implementation uses IEEE 754 Binary32 float: it does not detect e.g. GCC -fno-signed-zeros and such.
Code: [Select]
// SPDX-License-Identifier: CC0-1.0
// Nominal Animal, 2024
//
// Compile using for example
//    gcc -Wall -O2 -std=c11  this.c  -lm -o  this.exe

#define  _ISOC99_SOURCE
#include <stdlib.h>
#include <inttypes.h>
#include <string.h>
#include <stdio.h>
#include <math.h>
#include <errno.h>

typedef union {
    uint32_t  u;
    int32_t   i;
    float     f;
} word32;

// Returns NULL float uses the expected storage format, error text otherwise.
static const char *verify_float_format(void) {

    // Size must match uint32_t.
    if (sizeof (float) != sizeof (uint32_t))
        return "Invalid float size";

    // Encoding.
    if (((word32){ .u = 0x3f5d7b99 }).f !=  0.8651672f ||
        ((word32){ .u = 0xc0490fdb }).f != -3.1415927f)
        return "Invalid storage format";

    // Subnormal reachable from zero.
    if (nextafterf(0.0f, +1.0f) != ((word32){ .u = 0x00000001 }).f ||
        nextafterf(0.0f, -1.0f) != ((word32){ .u = 0x80000001 }).f)
        return "nextafterf() does not support subnormals";

    // All tests passed.
    return NULL;
}

static const struct {
    const char      *name;
    const char      *desc;
    const word32     value;
} constant[] = {
    { .name = "nan",    .value = { .u = 0x7F800001 }, .desc = "   Not a Number" },
    { .name = "inf",    .value = { .u = 0x7F800000 }, .desc = "   Infinity" },
    { .name = "max",    .value = { .u = 0x7F7FFFFF }, .desc = " = 3.4028235e+38, largest finite float" },
    { .name = "twopi",  .value = { .u = 0x40C90FDB }, .desc = " = 6.2831855" },
    { .name = "pi",     .value = { .u = 0x40490FDB }, .desc = " = 3.1415927" },
    { .name = "e",      .value = { .u = 0x402DF854 }, .desc = " = 2.7182817" },
    { .name = "halfpi", .value = { .u = 0x3FC90FDB }, .desc = " = 1.5707964 = pi / 2" },
    { .name = "one",    .value = { .u = 0x3F800000 }, .desc = " = 1" },
    { .name = "inve",   .value = { .u = 0x3EBC5AB2 }, .desc = " = 0.36787945 = 1/e" },
    { .name = "min",    .value = { .u = 0x00800000 }, .desc = " = 1.1754944e-38, smallest normal float" },
    { .name = "submax", .value = { .u = 0x007FFFFF }, .desc = " = 1.1754942e-38, largest subnormal float" },
    { .name = "submin", .value = { .u = 0x00000001 }, .desc = " = 1e-45, smallest subnormal float" },
    { .name = "zero",   .value = { .u = 0x80000000 }, .desc = " = 0" },
};
#define  CONSTANTS  (sizeof constant / sizeof constant[0])

#ifndef  WHITESPACE
#define  WHITESPACE  "\t\n\v\f\r "
#endif

static inline int separator_at(const char *s) {
    if (!s)
        return 0;
    return (*s == '\0' || *s == '\t' || *s == '\n' ||
            *s == '\v' || *s == '\f' || *s == '\r' ||
            *s == ' '  ||
            *s == '+'  || *s == '-' ||
            *s == '*'  || *s == '/' ||
            *s == '^'  ||
            *s == '~'  || *s == '_');
}

static const char *matches(const char *from, const char *name) {
    if (!from || !*from || !name || !*name)
        return NULL;

    while (1) {
        const int  f = (*from >= 'A' && *from <= 'Z') ? (*from - 'A' + 'a') : *from;
        if (f == '\0')
            return (separator_at(name)) ? from : NULL;
        if (*name == '\0')
            return (separator_at(from)) ? from : NULL;
        if (f != *name)
            return NULL;
        from++;
        name++;
    }
}

static const char *parse_suffixes(const char *from, word32 *to) {
    if (!from)
        return NULL;

    while (1)
        switch (*from) {
        case '~': // Post-increment
            if (to)
                to->f = nexttowardf(to->f, HUGE_VAL);
            from++;
            break;

        case '_': // Post-decrement
            if (to)
                to->f = nexttowardf(to->f, -HUGE_VAL);
            from++;
            break;

        default:
            return from;
        }
}

const char *parse_int(const char *from, int *to) {
    const char *ends = NULL;
    long        val;

    if (!from)
        return NULL;

    errno = 0;
    val = strtol(from, (char **)&ends, 0);
    if (errno)
        return NULL;
    if (!ends || ends == from || !separator_at(ends))
        return NULL;

    if ((long)((int)val) != val)
        return NULL;

    if (to)
        *to = (int)val;
    return ends;
}

const char *parse_part32(const char *from, word32 *to) {
    const char *ends = NULL;
    word32      tmp;

    // Fail if nothing to parse.
    if (!from)
        return NULL;

    from += strspn(from, WHITESPACE);
    if (!*from)
        return NULL;

    // Hexadecimal storage representation?
    if (from[0] == '0' && (from[1] == 'x' || from[1] == 'X')) {
        do {
            errno = 0;
            unsigned long  val = strtoul(from + 2, (char **)&ends, 16);
            if (errno || !ends || ends == from + 2)
                break;
            if ((unsigned long)((uint32_t)val) != val)
                break;

            tmp.u = val;
            ends = parse_suffixes(ends, &tmp);
            if (!separator_at(ends))
                break;

            if (to)
                *to = tmp;
            return ends;

        } while (0);
    }

    // A known constant?
    for (size_t i = 0; i < CONSTANTS; i++) {
        if (!constant[i].name)
            continue;

        uint32_t  negative = 0x00000000;
        ends = from;
        while (*ends == '+' || *ends == '-')
            if (*(ends++) == '-')
                negative ^= 0x80000000;

        ends = matches(from, constant[i].name);
        if (!ends)
            continue;

        tmp.u = negative ^ constant[i].value.u;

        ends = parse_suffixes(ends, &tmp);
        if (!separator_at(ends)) {
            ends = NULL;
            continue;
        }

        if (to)
            *to = tmp;
        return ends;
    }
    ends = NULL;

    // A floating-point numeric value?
    errno = 0;
    double  val = strtod(from, (char **)&ends);
    if (errno)
        return NULL;
    if (!ends || ends == from)
        return NULL;

    // Value is nonzero, but rounds to zero?
    if (val != 0.0 && (float)val == 0.0f)
        return NULL;

    tmp.f = val;
    ends = parse_suffixes(ends, &tmp);
    if (!separator_at(ends))
        return NULL;

    if (to)
        *to = tmp;
    return ends;
}

int parse_word32(const char *from, word32 *const to, const char **errp) {
    const char  *dummy;
    word32       have, temp;
    int          p;

    // Clear error pointer
    if (errp)
        *errp = NULL;
    else
        errp = &dummy;

    if (!from || !*from)
        return -1; // Nothing to parse.

    *errp = from; from = parse_part32(from, &have);
    if (!from)
        return -1; // Failed to parse.

    while (1) {

        // Skip whitespace.
        from += strspn(from, WHITESPACE);
        *errp = from;

        switch (*from) {

        case '\0': // Completed.
            *errp = NULL;
            if (to)
                *to = have;
            return 0;

        case '+': // Addition
            from = parse_part32(from + 1, &temp);
            if (!from)
                return -1;

            have.f += temp.f;
            break;

        case '-': // Substraction
            from = parse_part32(from + 1, &temp);
            if (!from)
                return -1;

            have.f -= temp.f;
            break;

        case '*': // Multiplication
            from = parse_part32(from + 1, &temp);
            if (!from)
                return -1;

            have.f *= temp.f;
            break;

        case '/': // Division
            from = parse_part32(from + 1, &temp);
            if (!from)
                return -1;

            have.f /= temp.f;
            break;

        case '^': // Multiply by a power of two
            from = parse_int(from + 1, &p);
            if (!from)
                return -1;

            have.f = ldexpf(have.f, p);

            *errp = from;
            from = parse_suffixes(from, &have);
            if (!from)
                return -1;

            break;

        default:
            return -1;
        }
    }
}

typedef  struct {
    char buf[256];
} buffer;

const char *word32_exact_float(buffer *const where, const word32 value) {
    int len = snprintf(where->buf, sizeof where->buf, "%.128f", value.f);
    if (len < 1 || (size_t)len >= sizeof where->buf)
        return "(ERROR)";

    while (len > 1 && where->buf[len-1] == '0' && where->buf[len-2] >= '0' && where->buf[len - 2] <= '9')
        where->buf[--len] = '\0';

    return (const char *)(where->buf);
}

const char *word32_pattern(buffer *const where, const word32 value) {
    char     *p = where->buf + sizeof where->buf;
    uint32_t  u = value.u;

    *(--p) = '\0';

    // 23 mantissa bits
    for (int i = 0; i < 23; i++) {
        *(--p) = '0' + (u & 1);
        u >>= 1;
    }

    // Implicit bit.
    if ((value.u & 0x7F800000) == 0x00000000)
        *(--p) = '0';
    else
    if ((value.u & 0x7F800000) < 0x7F800000)
        *(--p) = '1';
    else
        *(--p) = ' ';  // Not shown for NaNs and Infs

    *(--p) = ' ';

    // 8 exponent bits
    for (int i = 0; i < 8; i++) {
        *(--p) = '0' + (u & 1);
        u >>= 1;
    }

    *(--p) = ' ';

    // Sign bit.
    *(--p) = '0' + (u & 1);
    return (const char *)p;
}

const char *word32_float(buffer *const where, const word32 value) {
    // Use scientific notation for very large and very small values.
    const char  *format = (value.f < -100000000.0f || value.f > +100000000.0f ||
                          (value.f >= -0.000000001f && value.f <= 0.000000001f) ) ? "%.*g" : "%.*f";
    int          decimals = 1;

    while (1) {

        int  len = snprintf(where->buf, sizeof where->buf, format, decimals, value.f);
        if (len < 1 || (size_t)len >= sizeof where->buf)
            return "(ERROR)";

        errno = 0;
        char *end = where->buf;
        float val = strtod(end, &end);
        if (errno || end == where->buf || *end)
            return "(ERROR)";

        if (val == value.f)
            return (const char *)(where->buf);

        decimals++;
        if (decimals >= (int)sizeof where->buf)
            return "(ERROR)";
    }
}

int usage(const char *arg0, int status) {
    fprintf(stderr, "\n");
    fprintf(stderr, "Usage: %s [ -h | --help ]\n", arg0);
    fprintf(stderr, "       %s FLOAT [ FLOAT ... ]\n", arg0);
    fprintf(stderr, "Where each FLOAT can be:\n");
    fprintf(stderr, "       VALUE [ OP VALUE ]\n");
    fprintf(stderr, "where OP is\n");
    fprintf(stderr, "       + for addition,\n");
    fprintf(stderr, "       - for substraction,\n");
    fprintf(stderr, "       * for multiplication,\n");
    fprintf(stderr, "       / for division, and\n");
    fprintf(stderr, "       ^ for multiplying with a power of two,\n");
    fprintf(stderr, "and each VALUE may be suffixed with zero or more of\n");
    fprintf(stderr, "       _ (underscore) to decrement by 1 ULP\n");
    fprintf(stderr, "       ~ (tilde) to increment by 1 ULP\n");
    fprintf(stderr, "possibly repeated. Order of operations is strictly left to right.\n");
    fprintf(stderr, "\n");
    fprintf(stderr, "Known constants:\n");
    for (size_t i = 0; i < CONSTANTS; i++)
        fprintf(stderr, "%15s %s\n", constant[i].name, constant[i].desc);
    fprintf(stderr, "\n");
    exit(status);
}

int main(int argc, char *argv[]) {
    const char *arg0 = (argc > 0 && argv && argv[0] && argv[0][0]) ? argv[0] : "(this)";

    const char *bad_format = verify_float_format();
    if (bad_format) {
        fprintf(stderr, "Detected non-IEEE 754 Binary32 'float' format: %s.\n", bad_format);
        exit(EXIT_FAILURE);
    }

    // If run without arguments, display usage.
    if (argc < 2)
        usage(arg0, EXIT_SUCCESS);

    // If any of the arguments asks for help, display usage.
    for (int arg = 1; arg < argc; arg++)
        if (!strcmp(argv[arg], "-h") || !strcmp(argv[arg], "--help"))
            usage(arg0, EXIT_SUCCESS);

    // Describe input float values.
    for (int arg = 1; arg < argc; arg++) {
        const char *err = NULL;
        buffer      buf;
        word32      val;

        if (parse_word32(argv[arg], &val, &err)) {
            fprintf(stderr, "%s: Invalid expression.\n", (err) ? err : argv[arg]);
            exit(EXIT_FAILURE);
        }
        const int   fpc = fpclassify(val.f);

        printf("Input: %s\n", argv[arg]);
        printf("Value: %s\n", word32_float(&buf, val));
        printf("Exact: %s\n", word32_exact_float(&buf, val));
        printf("Class:");
        if (signbit(val.f))
            printf(" negative");
        if (isfinite(val.f))
            printf(" finite");
        if (isnormal(val.f))
            printf(" normal");
        if (fpc == FP_ZERO)
            printf(" zero");
        if (fpc == FP_NAN)
            printf(" nan");
        if (fpc == FP_INFINITE)
            printf(" inf");
        if (fpc == FP_SUBNORMAL)
            printf(" subnormal");
        printf("\n");
        printf("Bits:  %s\n", word32_pattern(&buf, val));
        printf("Hex:   0x%08x\n\n", (unsigned int)(val.u));
        fflush(stdout);
    }

    return EXIT_SUCCESS;
}

For example, to find out what is the smallest float greater than zero, run it with "1~" as an argument, and it should print
    Input: 1~
    Value: 1.0000001
    Exact: 1.00000011920928955078125
    Class: finite normal
    Bits:  0 01111111 100000000000000000000001
    Hex:   0x3f800001

Note that the Bits: row includes the implicit mantissa bit for finite (normal and subnormal and zero) values; it is omitted for infinities and NaNs.

Run without parameters to see usage information.

I use this to examine the storage representation of specific float values (Bits: and Hex: lines), to see the minimum decimal representation matching them (Value: line), and examining the results of simple arithmetic operations.

The reason this is a command-line tool and not say a HTML+CSS tool one can use in a browser, is that browsers don't have IEEE-754 Binary32 environment (we'd have to synthesize one, yuk), but most desktops and servers and SBCs do.
 

Offline Nominal AnimalTopic starter

  • Super Contributor
  • ***
  • Posts: 6742
  • Country: fi
    • My home page and email address
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #38 on: June 19, 2024, 06:36:20 pm »
This float function/approximation accuracy checking code is ugly, but it will do the job.  Apologies for your eyes.

We put each tested function and known correct function in a separate file, say funcs.c:
Code: [Select]
// SPDX-License-identifier: CC0-1.0
// Nominal Animal, 2024
//
// sin(2*Pi*x) -- unit-period sin function
//

#include <math.h>

float expected_value(float x) {
    x -= (int)x;

    if (x == -1.0f || x == -0.5f || x == 0.0f || x == 0.5f || x == 1.0f)
        return 0.0f;
    else
    if (x == -0.75f || x == 0.25f)
        return 1.0f;
    else
    if (x == -0.25f || x == 0.75f)
        return -1.0f;
    else {
        // Identical results with (float)sin((long double)x * 6.283185307179586476925286766559005768394338798750211641949889185L);
        return (float)sin((double)x * 6.283185307179586);
    }
}

static inline float usinf_internal(float x) {
    float  xx = x * x;
    return x * (6.2831855f - xx * (41.3417f - xx * (81.60525f - xx * (76.70586f - xx * (42.058693f - xx * 15.094643f)))));
}

float computed_value(float x) {
    x -= (int)x;
    if (x < 0.0f)
        x += 1.0f;

    if (x >= 0.75f)
        return  usinf_internal(x - 1.0f);
    else
    if (x >= 0.25f)
        return -usinf_internal(x - 0.5f);
    else
        return  usinf_internal(x);
}
Function
    float  expected_value(float x);
must return the float we compare to, and function
    float  computed_value(float x);
is your own implementation of it.

The verification code is in a separate file, say bench.c:
Code: [Select]
// SPDX-License-Identifier: CC0-1.0
// Nominal Animal, 2024
//
// Compile using for example
//     gcc -Wall -O2 -c  bench.c
//     gcc -Wall -O2 -c  funcs.c
//     gcc -Wall -O2     bench.o funcs.o   -lm -o   bench.exe
//
// To verify the ULPs calculation, use
//     gcc -DCHECK_ULPS -Wall -O2 -c  this.c

#define _ISOC99_SOURCE
#include <stdlib.h>
#include <inttypes.h>
#include <stdio.h>
#include <math.h>

// Functions in the separate source file
extern float  expected_value(float x);
extern float  computed_value(float x);

// Range of ULPs that matters; determines how slow delta_ulps_slow() can be
#ifndef  MAX_ULPS
#define  MAX_ULPS  16777216
#endif

// Number of bins per direction in the histogram
#ifndef  ULP_RANGE
#define  ULP_RANGE  512
#endif

static uint32_t  histogram_bin_sum = 0;
static uint32_t  histogram_bin[2*ULP_RANGE+1] = { 0 };

// Add to histogram
static inline int  histogram_add(int ulp) {
    if (ulp < -ULP_RANGE) ulp = -ULP_RANGE;
    if (ulp > +ULP_RANGE) ulp = +ULP_RANGE;
    histogram_bin_sum++;
    histogram_bin[ulp + ULP_RANGE]++;
    return ulp;
}

// Number of hits in a specific bin in the histogram
static inline unsigned int  histogram_count(const int ulp) {
    if (ulp >= -ULP_RANGE && ulp <= +ULP_RANGE)
        return histogram_bin[ulp + ULP_RANGE];
    else
        return 0;
}

// Fraction of hits in a specific bin in the histogram
static inline double  histogram_fraction(const int ulp) {
    if (!histogram_bin_sum)
        return 0.0;
    else
    if (ulp >= -ULP_RANGE && ulp <= +ULP_RANGE)
        return (double)histogram_bin[ulp + ULP_RANGE] / (double)histogram_bin_sum;
    else
        return 0.0;
}

// Difference in units of ULP in correct
static inline int delta_ulps_slow(const float value, float correct) {
    if (correct == 0.0f) {
        if (value > 0.0f)
            return  MAX_ULPS;
        else
        if (value < 0.0f)
            return -MAX_ULPS;
        else
            return 0;
    } else
    if (correct > value) {
        int  n = 0;
        do {
            n++;
            correct = nextafterf(correct, -HUGE_VALF);
        } while (correct > value && n < MAX_ULPS);
        return -n;

    } else
    if (correct < value) {
        int  n = 0;
        do {
            n++;
            correct = nextafterf(correct, +HUGE_VALF);
        } while (correct < value && n < MAX_ULPS);
        return n;

    } else
        return 0;
}

typedef union {
    float       f;
    uint32_t    u;
    int32_t     i;
} word32;

static inline int32_t  delta_ulps_fast(const float value, const float correct) {
    if (correct == value)
        return 0;
    else
    if (correct == 0.0f)
        return (value < 0.0f) ? -MAX_ULPS : +MAX_ULPS;

    const word32  v = { .f = value  };
    const word32  c = { .f = correct };

    // Different signs?
    if ((v.u ^ c.u) & 0x80000000)
        return (v.u & 0x80000000) ? -MAX_ULPS : +MAX_ULPS;

    // Grab the exponent bits.
    uint32_t  ve = v.u & 0x7F800000;
    uint32_t  ce = c.u & 0x7F800000;

    // OR all exponent bits into bit position 24,
    uint32_t  vm = (ve | (ve >> 4)) & 0x07800000;
              vm = (vm | (vm >> 2)) & 0x01800000;
              vm = (vm | (vm >> 1)) & 0x00800000;
    uint32_t  cm = (ce | (ce >> 4)) & 0x07F00000;
              cm = (cm | (cm >> 2)) & 0x01800000;
              cm = (cm | (cm >> 1)) & 0x00800000;

    // Difference in exponents.
    int32_t  shift = (int32_t)(ve >> 23) - (int32_t)(ce >> 23);

    if (shift > 0) {
        if (vm) {
            vm = (vm << shift) | (v.u & 0x007FFFFF);
        } else {
            vm = (v.u & 0x007FFFFF) << shift;
        }
        cm |= (c.u & 0x007FFFFF);
    } else
    if (shift < 0) {
        if (cm) {
            cm = (cm << (-shift)) | (c.u & 0x007FFFFF);
        } else {
            cm = (c.u & 0x007FFFFF) << (-shift);
        }
        vm |= (v.u & 0x007FFFFF);
    } else {
        vm |= (v.u & 0x007FFFFF);
        cm |= (c.u & 0x007FFFFF);
    }

    return (c.u & 0x80000000) ? (int)((int32_t)cm - (int32_t)vm) : (int)((int32_t)vm - (int32_t)cm);
}

// Check argument x
static void check(const float x) {
    const float  ev = expected_value(x);
    const float   v = computed_value(x);

    const int     u = delta_ulps_fast(v, ev);

#ifdef CHECK_ULPS
    const int    fu = delta_ulps_slow(v, ev);
    if (u != fu)
        fprintf(stderr, "\r%d != %d ULPs, %.12g != %.12g\n", fu, u, v, ev);
#endif

    histogram_add(u);
}

int main(void) {

    if (sizeof (float) != 4 ||
        ((word32){ .u = 0x3f5d7b99 }).f !=  0.8651672f ||
        ((word32){ .u = 0xc0490fdb }).f != -3.1415927f ||
        nextafterf(0.0f, +1.0f) != ((word32){ .u = 0x00000001 }).f ||
        nextafterf(0.0f, -1.0f) != ((word32){ .u = 0x80000001 }).f) {
        fprintf(stderr, "Your 'float' type is not IEEE 754 Binary32 with native byte order.\n");
        exit(EXIT_FAILURE);
    }



    // n_per_cent = 100.0 / estimated_number_of_cases
    const double  n_per_cent = 100.0 / 1056964609.0;
    // n_step = estimated_number_of_cases / 100
    const unsigned int  n_delta = 10569646;
    unsigned int        n_next  = n_delta;
    unsigned int        n       = 0;

    fprintf(stderr, "Checking .. ");
    fflush(stderr);

    // Check zero separately.
    check(0.0f); n++;
    for (float x = 1.0f; isnormal(x); n++, x = nextafterf(x, -1.0f)) {
        check(x);

        // Progress update
        if (n >= n_next) {
            n_next += n_delta;
            fprintf(stderr, "\r%.1f%% checked .. ", n * n_per_cent);
            fflush(stderr);
        }
    }
    fprintf(stderr, "\rChecking complete.  \n");
    fflush(stderr);

    // Drop zero tails from the histogram
    int  ulp_min = -ULP_RANGE;
    while (ulp_min < 0 && !histogram_count(ulp_min)) ulp_min++;

    int  ulp_max = ULP_RANGE;
    while (ulp_max > 0 && !histogram_count(ulp_max)) ulp_max--;

    printf("Report on differences between calculated and expected results,\n");
    printf("among all normal floats between 0.0 and 1.0, inclusive.\n");
    printf("Subnormal floats (2**-149 to 2**-126) were not tested.\n");
    printf("\n");

    printf("ULPs       Count  Fraction\n");
    for (int ulp = ulp_max; ulp >= ulp_min; ulp--) {
        const char *more = (ulp == -ULP_RANGE) ? "+" :
                           (ulp == +ULP_RANGE) ? "+" : " ";

        printf("%+3d%-2s %10u  %5.1f%%\n", ulp, more,
               histogram_count(ulp), 100.0 * histogram_fraction(ulp));
    }

    return EXIT_SUCCESS;
}
This way, you can use the same bench.c for many different tests, by compiling all at once, then linking each test case into a separate program, to ensure fair testing code.  For testing positive and negative values, you can just add check(-0.0f); and check(-x); next to the corresponding unsigned ones.  For other ranges, take a look at the float tool in my previous post.

Compile both files, and link against the math library.  I use GCC, and used
   gcc -Wall -O2 -std=c11 -pedantic -DCHECK_ULPS -c bench.c
   gcc -Wall -O2 -std=c11 -pedantic -c funcs.c
   gcc -Wall -O2 -std=c11 -pedantic bench.o funcs.o -lm -o bench.exe
You can omit the -DCHECK_ULPS, as it just verifies the binary ULPS calculation matches the slow loop version.

For the implemented 6-term unit period sine function included in funcs.c, the expected output is
    Report on differences between calculated and expected results,
    among all normal floats between 0.0 and 1.0, inclusive.
    Subnormal floats (2**-149 to 2**-126) were not tested.
   
    ULPs       Count  Fraction
     +3         1995    0.0%
     +2       682383    0.1%
     +1    353023465   33.4%
     +0    698222628   66.1%
     -1      4785888    0.5%
     -2       247531    0.0%
     -3          720    0.0%

To keep the program as portable as possible, it is single-threaded, and contains no GNU/Linux-isms.  It does report progress to standard output, using carriage return to overwrite the line instead of printing consecutive lines.  On a single thread on Intel Core i5 7200U laptop, it takes about a minute to run.
 

Offline Tation

  • Regular Contributor
  • *
  • Posts: 78
  • Country: pt
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #39 on: June 19, 2024, 07:52:59 pm »
To find out the error in float value compared to the expected nonzero value expect, somewhat similar to value-expect, in units of expected value ULPs, you can use
Code: [Select]
int  float_ulps(const float value, float expect, const int limit) {
    if (expect > value) {
        int  n = 0;
        do {
            n++;
            expect = nextafterf(expect, -HUGE_VALF);
        } while (n < limit && expect > value);
        return -n; // value was below expected, so error is negative

    } else
    if (expect < value) {
        int  n = 0;
        do {
            n++;
            expect = nextafterf(expect, +HUGE_VALF);
        } while (n < limit && expect < value);
        return n; // value was above expected, so error is positive

    } else {
        return 0;
    }
}
This is the most precise way of expressing the error in a value compared to what its type could support.

The difference in ULPs between two floats can also be computed (instead of "walked") using the fact that the IEEE754 format is simply a count (in sign and magnitude) of ULPs, thus:
Code: [Select]
typedef union {
    float       f;
    uint32_t    u;
    int32_t     i;
} word32;

// Difference in units of ULP in correct
static inline int delta_ulps_slow(const float value, float correct) {
    if (correct == 0.0f) {
        if (value > 0.0f)
            return  MAX_ULPS;
        else
        if (value < 0.0f)
            return -MAX_ULPS;
        else
            return 0;
    } else {
      word32  v = {.f = value};
      word32  c = {.f = correct};
      // from sign + magnitude to two's complement (32 bit)
      int32_t iv = (v.u & 0x80000000U) ? 0x80000000U - v.u : v.u;
      int32_t ic = (c.u & 0x80000000U) ? 0x80000000U - c.u : c.u;
      return iv - ic;
    }
}

On my machine it gets the very same ULP histogram.

The same trick can be used to compute nextafterf(), simply increment (or decrement) the float bits as if they were an integer.
 

Offline Picuino

  • Super Contributor
  • ***
  • Posts: 1027
  • Country: es
    • Picuino web
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #40 on: June 19, 2024, 08:01:04 pm »
Functions with 4 multiplications:
Code: [Select]
#include <math.h>

float expected_value(float x) {
    x -= (int)x;

    if (x == -1.0f || x == -0.5f || x == 0.0f || x == 0.5f || x == 1.0f)
        return 0.0f;
    else
    if (x == -0.75f || x == 0.25f)
        return 1.0f;
    else
    if (x == -0.25f || x == 0.75f)
        return -1.0f;
    else {
        return (float)sin((double)x * 6.283185307179586);
    }
}


float cos_pol_r(float x) {
   float p0 = -8.32795518e+01;
   float p2 = +6.49167315e+01;
   float p4 = -1.97391497e+01;
   float p6 = +1.00000000e+00;
   float xx = x * x;
   return ((p0 * xx + p2) * xx + p4) * xx + p6;
}


float sin_pol_r(float x) {
   float p1 = +8.01233966e+01;
   float p3 = -4.13334793e+01;
   float p5 = +6.28317388e+00;
   float xx = x * x;
   return ((p1 * xx + p3) * xx + p5) * x;
}


float sinf(float x) {
    if (x > 0.10546875) {
        return cos_pol_r(0.25 - x);
    }
    if (x < -0.10546875) {
        return -cos_pol_r(0.25 + x);
    }
    return sin_pol_r(x);
}


float computed_value(float x) {
    x -= (int)x;
    if (x < 0.0f)
        x += 1.0f;

    if (x >= 0.75f)
        return  sinf(x - 1.0f);
    else {
        if (x >= 0.25f)
            return -sinf(x - 0.5f);
        else
            return  sinf(x);
    }
}


Errors:
Code: [Select]
ULPs       Count  Fraction
+31          178    0.0%
+30         4082    0.0%
+29        10772    0.0%
+28        18004    0.0%
+27        24576    0.0%
+26        24842    0.0%
+25        24612    0.0%
+24        24110    0.0%
+23        27634    0.0%
+22        62334    0.0%
+21        82810    0.0%
+20        71706    0.0%
+19        64390    0.0%
+18        60182    0.0%
+17        56298    0.0%
+16        54276    0.0%
+15        42968    0.0%
+14        15250    0.0%
+13          270    0.0%
+12        54024    0.0%
+11       233896    0.0%
+10        98582    0.0%
 +9        70436    0.0%
 +8        59416    0.0%
 +7       237589    0.0%
 +6      1807455    0.2%
 +5      1860700    0.2%
 +4      1466285    0.1%
 +3      2443116    0.2%
 +2      3438085    0.3%
 +1      5877224    0.6%
 +0      8416375    0.8%
 -1      5555953    0.5%
 -2      3939831    0.4%
 -3      1173737    0.1%
 -4       365389    0.0%
 -5       649810    0.1%
 -6       733024    0.1%
 -7       510590    0.0%
 -8       533905    0.1%
 -9       712209    0.1%
-10      1541060    0.1%
-11      3971406    0.4%
-12       922841    0.1%
-13         9051    0.0%
-14       545256    0.1%
-15     39775223    3.8%
-16     85320785    8.1%
-17     85611062    8.1%
-18     86179561    8.2%
-19     85710897    8.1%
-20     85607309    8.1%
-21     85997833    8.1%
-22     84990193    8.0%
-23     81698824    7.7%
-24     50861056    4.8%
-25     42608121    4.0%
-26     42566959    4.0%
-27     42791517    4.0%
-28     42557141    4.0%
-29     42043804    4.0%
-30     24464613    2.3%
-31       283173    0.0%
 

Offline Picuino

  • Super Contributor
  • ***
  • Posts: 1027
  • Country: es
    • Picuino web
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #41 on: June 19, 2024, 08:03:36 pm »
Functions with 5 multiplications and one coefficient more:
Code: [Select]
#include <math.h>

float expected_value(float x) {
    x -= (int)x;

    if (x == -1.0f || x == -0.5f || x == 0.0f || x == 0.5f || x == 1.0f)
        return 0.0f;
    else
    if (x == -0.75f || x == 0.25f)
        return 1.0f;
    else
    if (x == -0.25f || x == 0.75f)
        return -1.0f;
    else {
        return (float)sin((double)x * 6.283185307179586);
    }
}


float cos_pol_r(float x) {
   float p0 = +5.90141572e+01;
   float p2 = -8.54375962e+01;
   float p4 = +6.49392826e+01;
   float p6 = -1.97392086e+01;
   float p8 = +1.00000000e+00;
   float xx = x * x;
   return (((p0 * xx + p2) * xx + p4) * xx + p6) * xx + p8;
}


float sin_pol_r(float x) {
   float p1 = -7.56594865e+01;
   float p3 = +8.15965361e+01;
   float p5 = -4.13416753e+01;
   float p7 = +6.28318528e+00;
   float xx = x * x;
   return (((p1 * xx + p3) * xx + p5) * xx + p7) * x;
}


float sinf(float x) {
    if (x > 0.10546875) {
        return cos_pol_r(0.25 - x);
    }
    if (x < -0.10546875) {
        return -cos_pol_r(0.25 + x);
    }
    return sin_pol_r(x);
}


float computed_value(float x) {
    x -= (int)x;
    if (x < 0.0f)
        x += 1.0f;

    if (x >= 0.75f)
        return  sinf(x - 1.0f);
    else {
        if (x >= 0.25f)
            return -sinf(x - 0.5f);
        else
            return  sinf(x);
    }
}


Errors:
Code: [Select]
ULPs       Count  Fraction
 +2        78193    0.0%
 +1    348612641   33.0%
 +0    705099980   66.7%
 -1      3165782    0.3%
 -2         8014    0.0%
 
The following users thanked this post: Nominal Animal

Offline Tation

  • Regular Contributor
  • *
  • Posts: 78
  • Country: pt
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #42 on: June 19, 2024, 11:25:57 pm »
Functions with 5 multiplications and one coefficient more:

[...]

Errors:
Code: [Select]
ULPs       Count  Fraction
 +2        78193    0.0%
 +1    348612641   33.0%
 +0    705099980   66.7%
 -1      3165782    0.3%
 -2         8014    0.0%

Nice! The best I was able to achieve was ±3 ULP with 5 multiplications to directly compute sine (not sine via cosine, as you did).

My approach was using the Remez algorithm to fit a 9th order odd polinomial to sin(2·π·x) in (0: 0.25) using doubles on this calculation. The weighting function was also sin(2·π·x), so the error was «equiripple in ULPs». Then truncated the double coefficients to floats and then trimmed each coefficient manually ± some ULP in a kind of «manual simulated-annealing». This narrowed the error distribution, but was unable to enhance the initial ±3 ULP error of the approximation.

How did you arrive to such coeffs?
« Last Edit: June 19, 2024, 11:33:48 pm by Tation »
 

Online nimish

  • Regular Contributor
  • *
  • Posts: 179
  • Country: us
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #43 on: June 20, 2024, 03:34:10 am »
With Taylor you get to approximate a function in such a way that the approximation is more accurate the closer it is to x=0.
If what you want is the error to be constant and minimum in all the approximated range, it is better to use interpolating polynomials choosing the zeros of the polynomial with the chebyshev coefficients.

There are two considerations for numbers close to 0.

Typically, in embedded application you will be concerned with absolute (rather than relative) errors. The fact that the presentation resolution increases as the number gets smaller is an idiosyncrasy of floating point numbers. If you use integers instead, the problem disappears.

Second, as you get closer to zero, sine function gets closer and closer to sin(x) = x which you can use for any number lower than certain threshold.

You should be using remez exchange anyway since the domain is small and fixed.

Chebyshev nodes are optimal or close to it in this case anyway.

And no you do not need to exhaustively check all floats even if there are only 4 billion. You can bound the error via interval estimates.

How does the new interpolation compare to the optimal one from github?
 

Offline Nominal AnimalTopic starter

  • Super Contributor
  • ***
  • Posts: 6742
  • Country: fi
    • My home page and email address
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #44 on: June 20, 2024, 05:54:03 am »
Functions with 5 multiplications and one coefficient more:

Errors:
Code: [Select]
ULPs       Count  Fraction
 +2        78193    0.0%
 +1    348612641   33.0%
 +0    705099980   66.7%
 -1      3165782    0.3%
 -2         8014    0.0%
Confirmed: ±2 ULPs of error.  Excellent work, Picuino! :-+

Edited to add: The associated cosine variant,
Code: [Select]
float computed_value(float x) {
    if (x < 0.0f)
        x = -x;
    x -= (int)x;

    if (x >= 0.5f)
        return -internal_sinf(0.75f - x);
    else
        return  internal_sinf(0.25f - x);
}
performs even slightly better, the error distribution for it (vs. cos(2*Pi*x)) being
Code: [Select]
ULPs       Count  Fraction
 +2        15916    0.0%
 +1      6722522    0.6%
 +0   1045503626   98.9%
 -1      4715505    0.4%
 -2         7041    0.0%

The two internal approximation functions can also be written as
Code: [Select]
static inline float cos_pol_r(float x) {
    const float xx = x * x;
    return 1.0f - xx * (19.739208f - xx * (64.939285f - xx * (85.4376f - xx * 59.014156f)));
}

static inline float sin_pol_r(float x) {
    const float xx = x * x;
    return x * (6.2831855f - xx * (41.341675f - xx * (81.596535f - xx * 75.659485f)));
}
without affecting their results.  I do believe this form may be more familiar to others.  As expected, the coefficients are almost, but not exactly those I listed in reply #6, the series definition coefficients for sin(2*Pi*x) and cos(2*Pi*x).

I, too, would be interested in the procedure you used to obtain these, Picuino!

Also, care to add an explicit // SPDX-License-Identifier: identifier to your code, too, just to be absolutely clear?  The CC0-1.0 I use is basically a copyright waiver, dedicating the code to the public domain.  You don't have to pick that if you don't want to, but I think for this kind of stuff, we should.
« Last Edit: June 20, 2024, 07:25:58 am by Nominal Animal »
 

Offline Nominal AnimalTopic starter

  • Super Contributor
  • ***
  • Posts: 6742
  • Country: fi
    • My home page and email address
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #45 on: June 20, 2024, 07:51:43 am »
On my machine it gets the very same ULP histogram.
For small differences, yes.  The problem is when the exponents differ (by more than ±1).  The one used in the bench.c above should be at least as precise as walking, and when defining CHECK_ULPS (via e.g. -DCHECK_ULPS option for GCC or Clang), it uses both and reports if any counts differ.

The same trick can be used to compute nextafterf(), simply increment (or decrement) the float bits as if they were an integer.
See the float tool in reply #37.  I've also used this for radix-sorting floating point numbers (both Binary32 and Binary64). To convert all finite Binary32/Binary64 representations to unsigned integer order, one must flip the sign bit if clear, all bits (~) if sign bit set.

For vectorized checking, using a V8SI (8×int32 on AVX) with the integer representations shown by the above float tool, conversion to V8SF (8×float) is a no-op, and one can speed up the computation to a ridiculous degree.  When the exponents match and 23 mantissa bits are between L and 2²³-L-1, inclusive, direct subtraction will yield the difference in ULPs, up to ±L ulps.  For V8SI/V8SF, L=8 makes a lot of sense.  The arguments can also be treated as V8SI for all but one out of every 2²³/8 tests (as only one in 2²³ will require a change in exponent).

As each test is also utterly separate, distributing/parallelizing this is trivial –– unless the evaluated function has branches, in which case parallelizing is very difficult.

All interesting testing ranges can be described as one or more contiguous ranges in the storage representation, so a proper bench would support useful ranges as keywords (-1..+1, 0..+1, -Pi/2..+Pi/2, 0..+Pi/2, -Pi..+Pi, 0..+Pi, with and without subnormals), be able to handle arbitrary ranges (via storage representation), and use a runtime-specified number of threads.
« Last Edit: June 20, 2024, 08:04:12 am by Nominal Animal »
 

Offline Tation

  • Regular Contributor
  • *
  • Posts: 78
  • Country: pt
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #46 on: June 20, 2024, 09:23:23 am »
For small differences, yes.  The problem is when the exponents differ (by more than ±1).  The one used in the bench.c above should be at least as precise as walking, and when defining CHECK_ULPS (via e.g. -DCHECK_ULPS option for GCC or Clang), it uses both and reports if any counts differ.

I've checked (not exhaustively, though) both approaches (walking vs. sign+magnitude substract) with exponent differences up to 31 (so that the int counting walk steps doesn't overflow) and always get the same values.

Even with larger exponent differences (with the integer counter overflowing) both approaches get the same (incorrect) value for ULP difference (expected in a machine working in two's complement).

Why do you state that with exponents greater than ±1 the other approach is wrong? As stated IEEE754 is a sign+magnitude count of ULPs, to get the difference in ULP between 2 numbers, "count" (walk) from one to the other or, well, substract. Basically you are counting the number of times you call nextafterf() to reach the other number, but internally nextafterf() is basically integer ++/--.
« Last Edit: June 20, 2024, 03:03:32 pm by Tation »
 
The following users thanked this post: Nominal Animal

Offline Picuino

  • Super Contributor
  • ***
  • Posts: 1027
  • Country: es
    • Picuino web
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #47 on: June 20, 2024, 09:28:22 am »
I, too, would be interested in the procedure you used to obtain these, Picuino!

Also, care to add an explicit // SPDX-License-Identifier: identifier to your code, too, just to be absolutely clear?  The CC0-1.0 I use is basically a copyright waiver, dedicating the code to the public domain.  You don't have to pick that if you don't want to, but I think for this kind of stuff, we should.
I believe that such good results are also partly a matter of luck. What I like most about the method I used is that you can choose the precision you want to achieve with the desired number of multiplications (more or less depending on the desired computational load).

I have used a simple polynomial interpolation of the sine function and the cosine function around the zero point. The points chosen to perform the interpolation are the Chebyshev coefficients. These are obtained by dividing a semicircle into equispaced parts and then calculating the cosine of these angles.

I have implemented it with a Python program that I posted in reply 17.
https://www.eevblog.com/forum/microcontrollers/lightweight-sin(2x)-cos(2x)-for-mcus-with-float-hardware/msg5546027/#msg5546027
I divided the two series (sine and cosine) adjusting them by hand until I found that at the limit x < 27/256 = 0.10546875 the two series had similar errors.

For the license, I don't like CC0 because it doesn't force derivative works to be freely licensed as well. I prefer a license that is free and that forces derivative works to be free as well. For example, the MIT license is simple, not very restrictive and maintains that requirement, besides being more appropriate for programs.

Code: [Select]

"""
   Copyright 2024 Picuino

   Permission is hereby granted, free of charge, to any person
   obtaining a copy of this software and associated documentation
   files (the "Software"), to deal in the Software without
   restriction, including without limitation the rights to use,
   copy, modify, merge, publish, distribute, sublicense, and/or
   sell copies of the Software, and to permit persons to whom the
   Software is furnished to do so, subject to the following
   conditions:

   The above copyright notice and this permission notice shall be
   included in all copies or substantial portions of the Software.

   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
   OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
   HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
   WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
   FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
   OTHER DEALINGS IN THE SOFTWARE.
"""

import numpy as np
import matplotlib.pyplot as plt


def main():
    nodes = 9
    xmax = 27/256
    calc_sin(nodes, xmax)
    calc_cos(nodes, 0.25-xmax)
   

def calc_cos(nodes, xmax):
    x_nodes = cheby_nodes(-xmax, +xmax, nodes)
    y_nodes = cos(x_nodes)

    pol = np.polyfit(x_nodes, y_nodes, nodes-1)
    for i in range(1, nodes, 2):
        pol[i] = 0
    print()
    for i in range(0, nodes, 2):
        print(f"   float p{i} = {pol[i]:+11.8e};")

    xx = np.linspace(-xmax, xmax, 500)
    yy = np.polyval(pol, xx) - cos(xx)
   
    plt.plot(xx, yy)
    plt.xlabel('cos(x)')
    plt.show()


def calc_sin(nodes, xmax):
    x_nodes = cheby_nodes(-xmax, +xmax, nodes)
    y_nodes = sin(x_nodes)

    pol = np.polyfit(x_nodes, y_nodes, nodes-1)
    for i in range(0, nodes, 2):
        pol[i] = 0
    print()
    for i in range(1, nodes, 2):
        print(f"   float p{i} = {pol[i]:+11.8e};")

    xx = np.linspace(-xmax, xmax, 500)
    yy = np.polyval(pol, xx) - sin(xx)
   
    plt.plot(xx, yy)
    plt.xlabel('sin(x)')
    plt.show()


def cos(x):
    return np.cos(x * 2 * np.pi)

def sin(x):
    return np.sin(x * 2 * np.pi)


def cheby_nodes(a, b, nodes):
    i = np.array(range(nodes))
    x = np.cos((2 * i + 1) * np.pi / (2 * nodes))
    return 0.5 * (b - a) * x + 0.5 * (b + a)


main()


Code: [Select]
/*
   Copyright 2024 Picuino

   Permission is hereby granted, free of charge, to any person
   obtaining a copy of this software and associated documentation
   files (the "Software"), to deal in the Software without
   restriction, including without limitation the rights to use,
   copy, modify, merge, publish, distribute, sublicense, and/or
   sell copies of the Software, and to permit persons to whom the
   Software is furnished to do so, subject to the following
   conditions:

   The above copyright notice and this permission notice shall be
   included in all copies or substantial portions of the Software.

   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
   OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
   HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
   WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
   FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
   OTHER DEALINGS IN THE SOFTWARE.
*/

#include <math.h>

float expected_value(float x) {
    x -= (int)x;

    if (x == -1.0f || x == -0.5f || x == 0.0f || x == 0.5f || x == 1.0f)
        return 0.0f;
    else
    if (x == -0.75f || x == 0.25f)
        return 1.0f;
    else
    if (x == -0.25f || x == 0.75f)
        return -1.0f;
    else {
        return (float)sin((double)x * 6.283185307179586);
    }
}


float cos_pol_r(float x) {
   float p0 = +5.90141572e+01;
   float p2 = -8.54375962e+01;
   float p4 = +6.49392826e+01;
   float p6 = -1.97392086e+01;
   float p8 = +1.00000000e+00;
   float xx = x * x;
   return (((p0 * xx + p2) * xx + p4) * xx + p6) * xx + p8;
}


float sin_pol_r(float x) {
   float p1 = -7.56594865e+01;
   float p3 = +8.15965361e+01;
   float p5 = -4.13416753e+01;
   float p7 = +6.28318528e+00;
   float xx = x * x;
   return (((p1 * xx + p3) * xx + p5) * xx + p7) * x;
}


float sinf(float x) {
    if (x > 0.10546875) {
        return cos_pol_r(0.25 - x);
    }
    if (x < -0.10546875) {
        return -cos_pol_r(0.25 + x);
    }
    return sin_pol_r(x);
}


float computed_value(float x) {
    x -= (int)x;
    if (x < 0.0f)
        x += 1.0f;

    if (x >= 0.75f)
        return  sinf(x - 1.0f);
    else {
        if (x >= 0.25f)
            return -sinf(x - 0.5f);
        else
            return  sinf(x);
    }
}
« Last Edit: June 20, 2024, 09:37:21 am by Picuino »
 

Offline voltsandjolts

  • Supporter
  • ****
  • Posts: 2377
  • Country: gb
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #48 on: June 20, 2024, 09:55:45 am »
I, too, would be interested in the procedure you used to obtain these, Picuino!

Also, care to add an explicit // SPDX-License-Identifier: identifier to your code, too, just to be absolutely clear?  The CC0-1.0 I use is basically a copyright waiver, dedicating the code to the public domain.  You don't have to pick that if you don't want to, but I think for this kind of stuff, we should.
I believe that such good results are also partly a matter of luck. What I like most about the method I used is that you can choose the precision you want to achieve with the desired number of multiplications (more or less depending on the desired computational load).

I have used a simple polynomial interpolation of the sine function and the cosine function around the zero point. The points chosen to perform the interpolation are the Chebyshev coefficients. These are obtained by dividing a semicircle into equispaced parts and then calculating the cosine of these angles.

I have implemented it with a Python program that I posted in reply 17.
https://www.eevblog.com/forum/microcontrollers/lightweight-sin(2x)-cos(2x)-for-mcus-with-float-hardware/msg5546027/#msg5546027
I divided the two series (sine and cosine) adjusting them by hand until I found that at the limit x < 27/256 = 0.10546875 the two series had similar errors.

For the license, I don't like CC0 because it doesn't force derivative works to be freely licensed as well. I prefer a license that is free and that forces derivative works to be free as well. For example, the MIT license is simple, not very restrictive and maintains that requirement, besides being more appropriate for programs.

Code: [Select]

"""
   Copyright 2024 Picuino

   Permission is hereby granted, free of charge, to any person
   obtaining a copy of this software and associated documentation
   files (the "Software"), to deal in the Software without
   restriction, including without limitation the rights to use,
   copy, modify, merge, publish, distribute, sublicense, and/or
   sell copies of the Software, and to permit persons to whom the
   Software is furnished to do so, subject to the following
   conditions:

   The above copyright notice and this permission notice shall be
   included in all copies or substantial portions of the Software.

   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
   OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
   HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
   WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
   FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
   OTHER DEALINGS IN THE SOFTWARE.
"""

import numpy as np
import matplotlib.pyplot as plt


def main():
    nodes = 9
    xmax = 27/256
    calc_sin(nodes, xmax)
    calc_cos(nodes, 0.25-xmax)
   

def calc_cos(nodes, xmax):
    x_nodes = cheby_nodes(-xmax, +xmax, nodes)
    y_nodes = cos(x_nodes)

    pol = np.polyfit(x_nodes, y_nodes, nodes-1)
    for i in range(1, nodes, 2):
        pol[i] = 0
    print()
    for i in range(0, nodes, 2):
        print(f"   float p{i} = {pol[i]:+11.8e};")

    xx = np.linspace(-xmax, xmax, 500)
    yy = np.polyval(pol, xx) - cos(xx)
   
    plt.plot(xx, yy)
    plt.xlabel('cos(x)')
    plt.show()


def calc_sin(nodes, xmax):
    x_nodes = cheby_nodes(-xmax, +xmax, nodes)
    y_nodes = sin(x_nodes)

    pol = np.polyfit(x_nodes, y_nodes, nodes-1)
    for i in range(0, nodes, 2):
        pol[i] = 0
    print()
    for i in range(1, nodes, 2):
        print(f"   float p{i} = {pol[i]:+11.8e};")

    xx = np.linspace(-xmax, xmax, 500)
    yy = np.polyval(pol, xx) - sin(xx)
   
    plt.plot(xx, yy)
    plt.xlabel('sin(x)')
    plt.show()


def cos(x):
    return np.cos(x * 2 * np.pi)

def sin(x):
    return np.sin(x * 2 * np.pi)


def cheby_nodes(a, b, nodes):
    i = np.array(range(nodes))
    x = np.cos((2 * i + 1) * np.pi / (2 * nodes))
    return 0.5 * (b - a) * x + 0.5 * (b + a)


main()


Code: [Select]
/*
   Copyright 2024 Picuino

   Permission is hereby granted, free of charge, to any person
   obtaining a copy of this software and associated documentation
   files (the "Software"), to deal in the Software without
   restriction, including without limitation the rights to use,
   copy, modify, merge, publish, distribute, sublicense, and/or
   sell copies of the Software, and to permit persons to whom the
   Software is furnished to do so, subject to the following
   conditions:

   The above copyright notice and this permission notice shall be
   included in all copies or substantial portions of the Software.

   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
   OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
   HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
   WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
   FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
   OTHER DEALINGS IN THE SOFTWARE.
*/

#include <math.h>

float expected_value(float x) {
    x -= (int)x;

    if (x == -1.0f || x == -0.5f || x == 0.0f || x == 0.5f || x == 1.0f)
        return 0.0f;
    else
    if (x == -0.75f || x == 0.25f)
        return 1.0f;
    else
    if (x == -0.25f || x == 0.75f)
        return -1.0f;
    else {
        return (float)sin((double)x * 6.283185307179586);
    }
}


float cos_pol_r(float x) {
   float p0 = +5.90141572e+01;
   float p2 = -8.54375962e+01;
   float p4 = +6.49392826e+01;
   float p6 = -1.97392086e+01;
   float p8 = +1.00000000e+00;
   float xx = x * x;
   return (((p0 * xx + p2) * xx + p4) * xx + p6) * xx + p8;
}


float sin_pol_r(float x) {
   float p1 = -7.56594865e+01;
   float p3 = +8.15965361e+01;
   float p5 = -4.13416753e+01;
   float p7 = +6.28318528e+00;
   float xx = x * x;
   return (((p1 * xx + p3) * xx + p5) * xx + p7) * x;
}


float sinf(float x) {
    if (x > 0.10546875) {
        return cos_pol_r(0.25 - x);
    }
    if (x < -0.10546875) {
        return -cos_pol_r(0.25 + x);
    }
    return sin_pol_r(x);
}


float computed_value(float x) {
    x -= (int)x;
    if (x < 0.0f)
        x += 1.0f;

    if (x >= 0.75f)
        return  sinf(x - 1.0f);
    else {
        if (x >= 0.25f)
            return -sinf(x - 0.5f);
        else
            return  sinf(x);
    }
}

"For the license, I don't like CC0 because it doesn't force derivative works to be freely licensed as well. I prefer a license that is free and that forces derivative works to be free as well."

No. MIT doesn't force downstream to publish source code.
https://pitt.libguides.com/openlicensing/MIT
 
The following users thanked this post: newbrain

Offline Picuino

  • Super Contributor
  • ***
  • Posts: 1027
  • Country: es
    • Picuino web
Re: Lightweight sin(2·π·x), cos(2·π·x) for MCUs with float hardware
« Reply #49 on: June 20, 2024, 09:56:32 am »
Yes, I was wrong. Actually the MIT license is not very restrictive and allows the code to be used in proprietary programs. The GPL license would be more appropriate to ensure that it is free, but I think it is too much for a simple script. It is a discussion that would merit a thread of its own to discuss in the forum what would be most appropriate.
« Last Edit: June 20, 2024, 09:59:11 am by Picuino »
 


Share me

Digg  Facebook  SlashDot  Delicious  Technorati  Twitter  Google  Yahoo
Smf