Author Topic: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI  (Read 9418 times)


Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 22436
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #75 on: December 26, 2021, 03:54:24 am »
Yeah, note that lg2(x) can be reduced to counting the maximum bit position (which is simply the integer logarithm; some ISAs even have an instruction for this), then computing on the range [1.0, 2.0).  Also, presumably this is fixed point, since a floating-point number already comes with the first part (its exponent), and libraries cover the rest?
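
A minimal sketch of that first step, assuming GCC/Clang (my code; __builtin_clz() is the intrinsic for the count-leading-zeros instruction mentioned above, and this assumes a 32-bit unsigned int):

Code: [Select]
#include <stdint.h>

/* Integer log2: position of the highest set bit.
   Requires x != 0, since __builtin_clz(0) is undefined. */
static inline int ilog2_u32(uint32_t x)
{
    return 31 - __builtin_clz(x);
}

Shifting x right by that amount (or scaling by the matching power of two) leaves a mantissa in [1.0, 2.0) for the second step to approximate.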

Or also, if you only have basic arithmetic handy, just use Feynman's algorithm or the like; it's computed directly, and in not much more time than a few longhand multiplications or divisions.  Polynomials are only attractive when fast hardware multiplication is available, with bonus points if division is also available.

It's interesting that polynomial approximations are notoriously poor here (the natural series expansion for log goes as x^n/n); a rational series is a better fit (i.e. something like 1 - 1/x - 1/x^2 - 1/x^3 + ..., or some combination of p(x)/q(x) as a polynomial ratio, or whatever form you wish to represent it as).  Err, what is such a series even called?  I'm sure I've seen it before but I can't recall...  Not quite Padé approximants, though those generate similar expressions.

Also, offhand, a hybrid approach seems better..?
\[ 1 - \frac{1}{x} + \frac{(x-1)^2}{3} - \frac{(x-1)^3}{7.5} \]
does surprisingly well.  log has an infinite repeated pole at 0 (in effect; I think? But the answer from complex analysis is of course more interesting than a repeated pole, i.e., branch cut stuff) but if you aren't modeling arbitrarily close to 0, that really doesn't mean much to you, and any polynomial will do.
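
A quick sanity check (my arithmetic, not from the post): at x = 2 the expression gives
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{7.5} = 0.7000, \qquad \ln 2 \approx 0.6931, \]
so it is already within about 1% at the far end of [1, 2].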

I can probably do a lot more with this.  The above was just poking some numbers into WolframAlpha; perhaps for every positive power added, a complementary negative power gets added, too?  Hmm...

And as usual -- if division is impractical on your platform, disregard rational approaches; a higher-order Taylor series (or better-optimized alternative) will simply perform better.  And for that, we can use whatever transforms and tweaks as above.

Tim
« Last Edit: December 26, 2021, 03:59:25 am by T3sl4co1l »
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline emece67

  • Frequent Contributor
  • **
  • !
  • Posts: 614
  • Country: 00
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #76 on: December 26, 2021, 10:35:10 am »
.
« Last Edit: August 19, 2022, 04:55:42 pm by emece67 »
 

Offline Picuino

  • Super Contributor
  • ***
  • Posts: 1033
  • Country: es
    • Picuino web
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #77 on: December 26, 2021, 11:38:40 am »
This is a real log(x) implementation in C:

Code: [Select]
/*
   log returns the natural logarithm of its floating
   point argument.
   The coefficients are #2705 from Hart & Cheney. (19.38D)
   It calls frexp.
*/
#include <errno.h>
#include "cmath.h"

static const double p0 = -0.240139179559210510e2;
static const double p1 =  0.309572928215376501e2;
static const double p2 = -0.963769093368686593e1;
static const double p3 =  0.421087371217979714e0;
static const double q0 = -0.120069589779605255e2;
static const double q1 =  0.194809660700889731e2;
static const double q2 = -0.891110902798312337e1;

double log(double arg)
{
   double x,z, zz, temp;
   int exp;

   if(arg <= 0.0)
   {
      errno = EDOM;
      return(-HUGE);
   }

   x = frexp(arg, &exp);
 
   if(x < INV_SQRT2)
   {
      x *= 2;
      exp--;
   }

   z = (x-1)/(x+1);
   zz = z*z;
   temp = ((p3*zz + p2)*zz + p1)*zz + p0;
   temp = temp/(((1.0*zz + q2)*zz + q1)*zz + q0);
   temp = temp*z + exp*LN2;
   return(temp);
}

double log10(double arg)
{
   return(log(arg)*INV_LN10);
}

It makes the transformation z = (x-1)/(x+1) before computing the polynomial.

The interval of approximation is [0.7071067, 1.4142135]
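
For reference, the identity behind this transformation (standard math, spelled out here for context): with z = (x-1)/(x+1) we have x = (1+z)/(1-z), so
\[ \ln x = \ln\frac{1+z}{1-z} = 2\,\operatorname{artanh}(z) = 2\left(z + \frac{z^3}{3} + \frac{z^5}{5} + \cdots\right) \]
which is odd in z and converges much faster than the plain series for ln(1+u), since |z| ≤ 0.1716 over this interval.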


Edit:
Book Hart & Cheney https://books.google.es/books/about/Computer_Approximations.html?id=jPVQAAAAMAAJ&redir_esc=y&hl=es
« Last Edit: December 26, 2021, 11:55:33 am by Picuino »
 

Offline Picuino

  • Super Contributor
  • ***
  • Posts: 1033
  • Country: es
    • Picuino web
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #78 on: December 26, 2021, 12:38:57 pm »
Approximation of log(x) over [sqrt(2)/2, sqrt(2)]:

Code: [Select]
def log(x):
    a_1 = 2.0000004743492
    a_3 = 0.6664575555535
    a_5 = 0.41516419048219
    z = (x-1)/(x+1)
    zz = z * z
    return z * (a_1 + zz * (a_3 + zz * a_5))

Absolute error < 4e-8


Edit:
Python program to do the calculations:

Code: (Python) [Select]
import matplotlib.pyplot as plt
import numpy as np

def transf(x):
    return (x-1)/(x+1)

num_nodes = 6
interval = [transf(np.sqrt(2)/2), transf(np.sqrt(2))]

def f(x):
    return np.log(-(x+1)/(x-1))

def chebyshev_nodes(n, interval, closed_interval=False):
    """Return n chebyshev nodes over interval [a, b]"""
    i = np.arange(n)
    if closed_interval:
        x = np.cos((2*i)*np.pi/(2*(n-1)))  # nodes over closed interval [-1, 1]
    else:
        x = np.cos((2*i+1)*np.pi/(2*n))  # nodes over open interval (-1, 1)
    x = np.flip(x)
    a, b = interval
    return 0.5*(b - a)*x + 0.5*(b + a) # nodes over interval [a, b]

def print_coefs(coefs):
    print("\nCoefficients:")
    for i in range(len(coefs)):
        print(f"    a_{i} = {coefs[i]:.14g}")

def tics(interval, numtics):
    a, b = interval
    return np.linspace(a, b, numtics)

def plot_func(x, y, err_abs, err_rel):
    fig, ax = plt.subplots(3)
    fig.subplots_adjust(left=0.1, bottom=0.05, right=0.95, top=0.95, wspace=None, hspace=0.35)

    ax[0].plot(x, y, linewidth=1)
    ax[0].set_title('Function')
    #ax[0].spines['left'].set_position('zero')
    ax[0].spines['right'].set_color('none')
    ax[0].spines['bottom'].set_position('zero')
    ax[0].spines['top'].set_color('none') 

    ax[1].plot(x, err_abs, linewidth=1)
    ax[1].set_title('Polynomial absolute error')
    #ax[1].spines['left'].set_position('zero')
    ax[1].spines['right'].set_color('none')
    ax[1].spines['bottom'].set_position('zero')
    ax[1].spines['top'].set_color('none')   

    ax[2].plot(x, err_rel, linewidth=1)
    ax[2].set_title('Polynomial relative error')
    #ax[2].spines['left'].set_position('zero')
    ax[2].spines['right'].set_color('none')
    ax[2].spines['bottom'].set_position('zero')
    ax[2].spines['top'].set_color('none')

    plt.show()

def test_errors(interval, poly_coefs, num_dots=1000):
    x_dots = np.linspace(interval[0], interval[1], num_dots)
    y_dots = f(x_dots)
    y_poly_dots = np.polyval(poly_coefs, x_dots)

    # Compute errors
    err_abs = y_dots - y_poly_dots
    err_rel = np.arange(num_dots).astype(float)
    for i in range(len(x_dots)):
        if y_dots[i] == 0:
            if y_poly_dots[i] == 0:
                err_rel[i] = 0.0
            else:
                err_rel[i] = np.inf
        else:
            err_rel[i] = err_abs[i] / y_dots[i]

    # Show errors
    print(f"\nMax absolute error = {max(err_abs):.14g}")
    print(f"Min absolute error = {min(err_abs):.14g}")
    print(f"Max relative error = {max(err_rel):.14g}")
    print(f"Min relative error = {min(err_rel):.14g}")
    print(f"Max polynomial value = {max(y_poly_dots):.14g}")
    print(f"Min polynomial value = {min(y_poly_dots):.14g}")
    plot_func(x_dots, y_dots, err_abs, err_rel)

def main():
    x_dots = chebyshev_nodes(num_nodes, interval, closed_interval=True)
    y_dots = f(x_dots)
    print(f"x nodes = {x_dots}")
    print(f"y values = {y_dots}")
    degrees = np.arange(1, num_nodes, 2)
    poly_coefs = np.polynomial.polynomial.polyfit(x_dots, y_dots, degrees)
    print_coefs(poly_coefs)
    test_errors([interval[0], interval[1]], np.flip(poly_coefs))

main()
« Last Edit: December 26, 2021, 12:47:43 pm by Picuino »
 
The following users thanked this post: emece67

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15413
  • Country: fr
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #79 on: December 26, 2021, 06:27:44 pm »
There is more than one way of doing it of course, and interestingly, almost the only approach shown here so far has been some kind of polynomial approximation, which doesn't do all that well for log() unless you use a lot of coefficients.

T3sl4co1l's approach is not polynomial and requires a reciprocal (which may or may not be practical depending on your requirements), but anyway, I tried it and could not make it evaluate to anything close to a log? So could you elaborate?

One typical way of computing log2() is to first extract the integer part of the log2, then work on the fractional part. The integer part can be found using various approaches. But if you're dealing with IEEE 754 FP numbers, and can assume those are normalized, it's actually just a matter of extracting the exponent, which is trivial, and then setting the exponent to 0 to work on the fractional part. Oh yes, manipulating IEEE-754 is baddd, but it's actually what most libraries do under the hood.

Then, getting the fractional part can be done using a binary log approach, which allows you to control how many bits of precision you want. The more bits, the longer it takes of course.
Here is a fully functional approach which works on IEEE 754 double precision numbers; you can of course adapt it to single precision (just adjust the part dealing with the exponent), or use it as a starting point for any other number format (I wrote a variant of this for fixed-point numbers, for instance). Depending on the number of bits of precision it's not ultra-fast, but it can be made as accurate as you want, and it only requires FP multiplication, comparison and addition. And it will work for any positive x.

Code: [Select]
#include <stdint.h>
#include <float.h>

double MyLog2(double x)
{
    if (x <= 0.0)
        return DBL_MIN;

    // Assume x is normalized.
    double dX = x;
    uint64_t *px = (uint64_t *) &dX; // Oh yeah, that's not pretty.

    // Integer part.
    int16_t nExp = (int16_t)((*px >> 52) & 0x7FF) - 1023; // IEEE 754 exponent
    double dY = (double) nExp, dB = 0.5;

    // Fractional part: force the biased exponent to 1023, so dX is in [1, 2).
    *px |= 0x3FF0000000000000;
    *px &= 0xBFFFFFFFFFFFFFFF;

    const unsigned int nFracBits = 32;

    // Binary logarithm: each squaring reveals one more fractional bit.
    for (unsigned int i = 0; i < nFracBits; i++, dB *= 0.5)
    {
        dX = dX * dX;
        if (dX >= 2.0)
        {
            dY += dB;
            dX *= 0.5;
        }
    }

    return dY;
}
 

Offline Picuino

  • Super Contributor
  • ***
  • Posts: 1033
  • Country: es
    • Picuino web
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #80 on: December 26, 2021, 07:39:41 pm »
There is more than one way of doing it of course, and interestingly, almost the only approach shown here so far has been some kind of polynomial approximation, which doesn't do all that well for log() unless you use a lot of coefficients.
It doesn't take many coefficients to get a good approximation to log(x).

With only 3 coefficients you can obtain an approximation with a max relative error of 1.2e-7.

With 6 coefficients you can obtain an approximation with a max relative error of 3e-14:

Code: [Select]
#include <errno.h>
#include <math.h>

#define INV_SQRT2  (0.70710678118654752440084436210)      /* sqrt(1/2) */
#define LN2        (0.69314718055994530941723212146)      /* log(2)    */

static const double a_1 = 1.999999999999945;
static const double a_3 = 0.6666666667970299;
static const double a_5 = 0.3999999486999626;
static const double a_7 = 0.2857216790491805;
static const double a_9 = 0.2217418680272728;
static const double a_11 = 0.1960893014786098;

double logpol(double arg) {
   double x, z, zz, temp;
   int exp;

   if(arg <= 0.0) {
      errno = EDOM;
      return(-HUGE_VAL);
   }

   x = frexp(arg, &exp);
 
   if(x < INV_SQRT2) {
      x *= 2;
      exp--;
   }

   z = (x-1)/(x+1);
   zz = z*z;
   temp = ((((a_11*zz + a_9)*zz + a_7)*zz + a_5)*zz + a_3)*zz + a_1;
   temp = temp*z + exp*LN2;
   return(temp);
}
« Last Edit: December 26, 2021, 07:52:19 pm by Picuino »
 

Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 22436
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #81 on: December 26, 2021, 07:49:16 pm »
T3sl4co1l's approach is not polynomial and requires a reciprocal (which may or may not be practical depending on your requirements), but anyway, I tried it and could not make it evaluate to anything close to a log? So could you elaborate?

It wasn't supposed to; well, the display-math one was (fitted to ln(x) from 1 to 2, which... wait, that's not log2 anyway, that should be 1 to e -- well, it was a _very_ basic attempt, you see--).  The other suggested series were merely forms to try and fit coefficients to (emphasis on "something like"!).

I'm not very familiar with the (x-1)/(x+1) substitution, but it sounds like it's doing a much better job, so I'm not too bothered about [not] expanding on my ideas.  It's probably a better form of ultimately the same sort of thing. :-+

Tim
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15413
  • Country: fr
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #82 on: December 26, 2021, 08:07:16 pm »
OK no problem!

And yes, what Picuino showed is not polynomial either (in x!).
emece67 tried to use a polynomial approximation and could see for themselves that it was not the right way of tackling it.

I haven't checked the results myself for Picuino's code - but I do think it's a popular way of doing it (seems to be used in some C lib implementations), so I guess it does the job. But it requires a division, which may cost quite a bit depending on the architecture.

The one I suggested has the benefit of not requiring any division, and being as accurate as you want it to be. It's not as fast obviously. For the 32-bit fractional version, it's on average 8 times slower per log2() than the C lib one (GCC on x86_64). Of course most of the time is spent in the loop. There may be ways to optimize it.

Oh,  and you may find the pointer aliasing for extracting the exponent ugly, but if you're using frexp(), as in Picuino's code, then you're going to need to use the math library. And then you may as well use the log2() function directly. :)
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4533
  • Country: nz
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #83 on: December 26, 2021, 10:24:29 pm »
Oh,  and you may find the pointer aliasing for extracting the exponent ugly, but if you're using frexp(), as in Picuino's code, then you're going to need to use the math library. And then you may as well use the log2() function directly. :)

frexp() is very easy to do yourself, without the math library.

Ignoring zero, infinities, and NaNs it's just:

Code: [Select]
#include <stdint.h>

double my_frexp(double d, int *exp) {
  uint64_t d_bits = *(uint64_t*)&d;
  uint64_t exp_mask = (uint64_t)2047 << 52;  // exponent field, bits 52..62 (2047L would overflow 32-bit long)
  *exp = (int)((d_bits & exp_mask) >> 52) - 1022;
  d_bits = (d_bits & ~exp_mask) | ((uint64_t)1022 << 52);  // force mantissa into [0.5, 1)
  return *(double*)&d_bits;
}

On my Mac that's just...

Code: [Select]
0000000100003e94 _my_frexp:
100003e94: 08 00 66 9e                  fmov    x8, d0
100003e98: 09 f9 74 d3                  ubfx    x9, x8, #52, #11
100003e9c: 29 f9 0f 51                  sub     w9, w9, #1022
100003ea0: 09 00 00 b9                  str     w9, [x0]
100003ea4: 08 d1 41 92                  and     x8, x8, #0x800fffffffffffff
100003ea8: 08 21 4b b2                  orr     x8, x8, #0x3fe0000000000000
100003eac: 00 01 67 9e                  fmov    d0, x8
100003eb0: c0 03 5f d6                  ret
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15413
  • Country: fr
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #84 on: December 27, 2021, 12:32:25 am »
Oh,  and you may find the pointer aliasing for extracting the exponent ugly, but if you're using frexp(), as in Picuino's code, then you're going to need to use the math library. And then you may as well use the log2() function directly. :)

frexp() is very easy to do yourself, without the math library.

Yes, have you seen my code? :)
Apparently you did not quite get what I meant above.
 

Offline CatalinaWOW

  • Super Contributor
  • ***
  • Posts: 5454
  • Country: us
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #85 on: December 27, 2021, 01:47:37 am »
Excellent thread with many good suggestions.  For those looking for starting points on these types of problems, it is hard to do better than "Handbook of Mathematical Functions" by Abramowitz and Stegun.  It has tables of values for all functions normally encountered in mathematics, and suggested rational approximations for varying levels of accuracy.  There is a Dover reprint of the original US Government Printing Office version.
 

Offline emece67

  • Frequent Contributor
  • **
  • !
  • Posts: 614
  • Country: 00
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #86 on: December 27, 2021, 10:03:15 am »
.
« Last Edit: August 19, 2022, 05:58:00 pm by emece67 »
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6950
  • Country: fi
    • My home page and email address
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #87 on: December 27, 2021, 11:13:58 am »
Which, FWIW, are equivalent to fractions of:
3253427/432964489
2779529/16777216
8386063/8388608
Actually, the smallest term is 8068423/1073741824.
Put another way, they are 8068423*2**-30, 2779529*2**-24, and 8386063*2**-23.

If one uses truncating (and not rounding) math, I fully believe that the exact best solution is very near the Remez coefficients, but just some ULPs away, due to the quantization to single-precision floating-point and rounding differences.



If anyone is wondering:

When approximating logarithms, we are relying on the key observation: logb(x·y) = logb(x) + logb(y).
Although the base b=2 makes the most sense when dealing with binary-radix floating point numbers, it does extend to any base a via loga(x) = log2(x)/log2(a).  And since log2(e) ≃ 1.4426950408889634073599246810 and log2(10) ≃ 3.32192809488736234787031942948939, using any other base is just one division (or one multiplication by the precomputed reciprocal) away, so we don't even need to worry about natural or base-10 logarithms; we get them basically for free if we handle base 2.

Because log2(2) = 1 and log2(1/2) = -1, we can scale the argument by any integer power of two, writing x = m·2^y, and use log2(x) = log2(m) + y.  This is why we only really need to handle a small range of arguments.

The reason 0.7071 ≃ sqrt(1/2) ≤ x ≤ sqrt(2) ≃ 1.4142 is chosen as the range is that 2·sqrt(1/2) = sqrt(2) (and 1/sqrt(2) = sqrt(1/2), and 1/sqrt(1/2) = sqrt(2)).  That is, if you have a positive real number r, you can always multiply it by an integer power of two 2^y, where y is an integer, to get sqrt(1/2) ≤ r·2^y ≤ sqrt(2).  Of course you can pick an even larger range to approximate –– just like you don't need to restrict to |x| ≤ PI/2 for approximating sine –– but the approximation will be more complex than it has to be.
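
A concrete example (my numbers, not from the post): 10 = 1.25·2^3, and 1.25 lies inside [0.7071, 1.4142], so
$$\log_2 10 = \log_2\left(1.25 \cdot 2^3\right) = \log_2 1.25 + 3 \approx 0.321928 + 3 = 3.321928$$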

What I am wondering myself, is whether absolute or relative error is the "better" error metric here. log2(sqrt(1/2)) = -1/2, and log2(sqrt(2)) = +1/2, with log2(1) = 0.  Also, log2(x) = -log2(1/x), so we definitely have symmetry here.
 

Offline emece67

  • Frequent Contributor
  • **
  • !
  • Posts: 614
  • Country: 00
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #88 on: December 27, 2021, 01:54:09 pm »
.
« Last Edit: August 19, 2022, 04:56:03 pm by emece67 »
 

Offline Picuino

  • Super Contributor
  • ***
  • Posts: 1033
  • Country: es
    • Picuino web
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #89 on: December 27, 2021, 02:04:43 pm »
Here are some interesting links about the CORDIC method:
Elementary functions for embedded systems
https://www.hpl.hp.com/hpjournal/72jun/jun72a2.pdf
https://www.quinapalus.com/efunc.html

Does anybody know a good CORDIC algorithm for a single-precision log(x) function?
 

Offline Picuino

  • Super Contributor
  • ***
  • Posts: 1033
  • Country: es
    • Picuino web
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #90 on: December 27, 2021, 02:11:43 pm »
Program to generate CORDIC tables for the sine and cosine functions, for several word precision lengths:
http://www.dcs.gla.ac.uk/~jhw/cordic/
 

Offline emece67

  • Frequent Contributor
  • **
  • !
  • Posts: 614
  • Country: 00
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #91 on: December 27, 2021, 02:45:07 pm »
.
« Last Edit: August 19, 2022, 04:56:12 pm by emece67 »
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6950
  • Country: fi
    • My home page and email address
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #92 on: December 27, 2021, 03:13:02 pm »
Another "CORDIC"-like approach to logarithm calculation is Feynman's algorithm, which uses the fact that every real \$x\$, \$1 \lt x \lt 2\$, can be approximated with
$$x \approx \prod_{k=1}^{N} \left(1 + 2^{-k}\right)^{B_k}, \quad B_k \in \{0, 1\}$$
where the bits \$B_k\$ are chosen greedily (a factor is kept whenever the running product stays at or below \$x\$), and therefore
$$\log(x) = \sum_{k=1}^{N} B_k \log\left(1 + 2^{-k}\right)$$
Since single-precision floating point (IEEE 754 Binary32) has 24 significant bits in its mantissa, \$N = 24\$ should suffice.
That is, we only need a table of 24 single-precision constants, the values \$\log\left(1 + 2^{-k}\right)\$.

This is particularly well suited for IEEE 754 binary floating point types.
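
A minimal, unoptimized sketch of the idea (my code, not from the post; in a real fixed-point implementation the multiplication by (1 + 2^-k) becomes a shift-and-add, and the log(1 + 2^-k) values come from the precomputed table):

Code: [Select]
#include <stdio.h>
#include <math.h>

#define N 24

/* Feynman's algorithm: greedily factor x (1 <= x < 2) into a product
   of terms (1 + 2^-k), accumulating log(1 + 2^-k) for each factor
   kept.  The error is roughly on the order of 2^-N. */
double feynman_log(double x)
{
    double p = 1.0;  /* running product of accepted factors */
    double y = 0.0;  /* accumulated logarithm */
    for (int k = 1; k <= N; k++) {
        double factor = 1.0 + ldexp(1.0, -k);  /* 1 + 2^-k */
        if (p * factor <= x) {
            p *= factor;
            y += log(factor);  /* table lookup in a real implementation */
        }
    }
    return y;
}

int main(void)
{
    for (double x = 1.0; x < 2.0; x += 0.25)
        printf("x = %.2f   feynman = %.9f   libm = %.9f\n",
               x, feynman_log(x), log(x));
    return 0;
}
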
« Last Edit: December 27, 2021, 03:28:42 pm by Nominal Animal »
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15413
  • Country: fr
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #93 on: December 28, 2021, 12:22:51 am »
What I am wondering myself, is whether absolute or relative error is the "better" error metric here.

Yeah. I think that would really depend on what you're using the function for.
With that said, as always: before using any kind of "expensive" function or operation, make sure you really need it. Sometimes changing the computation approach lets you get rid of it entirely. Probably obvious for most people here, but I've seen too many people using log() or sqrt() when they could definitely have done without them.
 

Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 22436
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #94 on: December 28, 2021, 03:06:05 am »
Or the really low-hanging fruits, like integer-argument pow()... :D

Speaking of which, does anyone know how often GCC or other compilers are able to simplify those?  I don't know offhand whether they're smart enough to spot an integer literal (or one within 1 ULP of an integer, cast to float) and produce alternative code, whether it even matters much at all (because the argument has to be cast to float, so no assumptions can be made), or whether they can end-run around the semantics and detect that an integer literal has been used.  I would assume the latter not so much, but all are reasonable to varying degrees...

Tim
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15413
  • Country: fr
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #95 on: December 28, 2021, 03:25:07 am »
As far as I know, if you pass an integer to a function which has a FP parameter, the integer will get converted to FP, and then the function will get called. Given the conversion rules in C, I don't think a compiler would be allowed to optimize this out - at least not by substituting an integer version of the function, for instance.

Now you're talking about literals - meaning you have in mind calling such functions with literals (constants). So, in general, would the value of the literal matter? Could the compiler even compute the result at compile-time?

Now feel free to experiment. :)

A small test showed that common math functions seem to be evaluated at compile time if they are called with a literal argument. For instance, the simple:
Code: [Select]
#include <math.h>

double Foo(double x)
{
        return sqrt(2)*x;
}

Yields:
Code: [Select]
Foo:
        .seh_endprologue
        mulsd   .LC0(%rip), %xmm0
        ret
        .seh_endproc
        .section .rdata,"dr"
        .align 8
.LC0:
        .long   1719614413
        .long   1073127582

Whether that works with all standard math functions, I have no clue!
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4533
  • Country: nz
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #96 on: December 28, 2021, 03:38:05 am »
Certainly easy enough to experiment...

https://godbolt.org/z/xcaPnq5YW

I get the same results on this independent of ISA or gcc/clang.
 

Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 22436
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #97 on: December 28, 2021, 04:58:27 am »
Certainly easy enough to experiment...

https://godbolt.org/z/xcaPnq5YW

I get the same results on this independent of ISA or gcc/clang.

Ahh, interesting that it simplifies 2, but not 3...

Ah, that makes sense, -funsafe-math-optimizations will generate the [naively] expected code (but it might not distribute errors exactly the way you like, hence the name of the option).  I suppose squaring being a binary operation (x * x), order doesn't matter, but for other powers it does.  Also works for negative powers (emitting a division op, and whatever muls to get it there).
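
A concrete illustration (my example, along the lines of the godbolt link above):

Code: [Select]
#include <math.h>

/* With plain -O2, GCC emits a call to pow() here; adding
   -funsafe-math-optimizations lets it expand the constant integer
   power into multiplies (x*x*x), which may round differently --
   hence the "unsafe". */
double cube(double x)
{
    return pow(x, 3.0);
}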

Tim
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6950
  • Country: fi
    • My home page and email address
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #98 on: December 28, 2021, 05:19:25 am »
Since binary logarithm requires a large-ish number of squarings, I suppose the alternative is a piecewise cubic approximation:
Code: [Select]
#include <math.h>
#define   LOG2_PARTS  256
#define   LOG2_ERROR  -HUGE_VALF

float log2_poly[LOG2_PARTS][4];

float log2_approx(float x)
{
    if (x <= .0f)
        return LOG2_ERROR;

    int  p, i;

    x = 2 * LOG2_PARTS * (frexpf(x, &p) - 0.5f);
    i = (int)x;
    x -= (float)i;

    float  result = x * log2_poly[i][3];
    result += log2_poly[i][2];
    result *= x;
    result += log2_poly[i][1];
    result *= x;
    result += log2_poly[i][0];

    return result + p;
}
If we use Cn as shorthand for log2_poly[i][n], then i = 0..N-1 is the piece 0.5+i/(2N) ≤ x < 0.5+(i+1)/(2N), C0 is the base-2 logarithm at the lower boundary, and C0+C1+C2+C3 is the base-2 logarithm at the upper boundary, approximately.  The individual polynomial coefficients can be calculated using e.g. the Remez algorithm.  N = LOG2_PARTS is the number of polynomials in the entire interval 0.5 ≤ x < 1, and is one of the parameters we can control to tune the fit.  For obvious reasons, powers of two N are preferable if we need to extract the mantissa bits from the IEEE 754 Binary32 storage representation.

Of course, each polynomial takes 4×4 bytes = 16 bytes of ROM/Flash (4 KiB total for N = 256), so this is just another method of trading ROM/Flash for speed/accuracy, up to a limit.
« Last Edit: December 28, 2021, 11:35:05 am by Nominal Animal »
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2812
  • Country: nz
Re: Compact sin(x) which gets to within 0.1%, x = 0 to 2xPI
« Reply #99 on: December 28, 2021, 11:07:32 am »
Late to the party, but I knocked up a CORDIC implementation for you. It generates the tables needed; in an actual implementation they can be initialized with constants rather than by calling build_table(), avoiding all floating point operations.

mysin() takes a 32-bit integer representing the phase angle (0 = 0 degrees, 0x40000000 = 90 degrees, and so on) and returns the sine at that point. Adjust the value of "start" to scale the output range - the "CORDIC gain" is about 1.647, so the current value of 19813 gives a full scale of +/-32627.

Here's the output:

Code: [Select]
$ ./cordic
table[0] is 536870912
table[1] is 316933405
table[2] is 167458907
table[3] is 85004756
table[4] is 42667331
table[5] is 21354465
table[6] is 10679838
table[7] is 5340245
table[8] is 2670163
table[9] is 1335086
table[10] is 667544
table[11] is 333772
table[12] is 166886
table[13] is  83443
table[14] is  41721
table[15] is  20860
table[16] is  10430
table[17] is   5215
table[18] is   2607
table[19] is   1303
table[20] is    651
table[21] is    325
table[22] is    162
table[23] is     81

Actual  Calced  Error
     0      0    0
  6365   6365    0
 12485  12486    1
 18126  18126    0
 23070  23070    0
 27128  27128    0
 30143  30143    0
 32000  32000    0
 32627  32627    0
 32000  32000    0
 30143  30143    0
 27128  27128    0
 23070  23071    1
 18126  18126    0
 12485  12486    1
  6365   6365    0
     0      0    0
 -6365  -6365    0
-12485 -12486   -1
-18126 -18126    0
-23070 -23070    0
-27128 -27128    0
-30143 -30143    0
-32000 -32000    0
-32627 -32627    0
-32000 -32000    0
-30143 -30143    0
-27128 -27128    0
-23070 -23070    0
-18126 -18126    0
-12485 -12486   -1
 -6365  -6365    0

Here's the code:

Code: [Select]
#include <stdio.h>
#include <stdint.h>
#include <math.h>


static uint32_t table[24] = {0};

#define TABLE_SIZE (sizeof(table)/sizeof(table[1]))

int build_table(void) {
   double total = 0;
   for(int i = 0; i < TABLE_SIZE; i++) {
      double angle = pow(2.0,-i);
      double v = atan(angle);
      v /= 2*M_PI;
      v *= (1<<30);
      v *= 4;
      table[i] = (int)v;
      printf("table[%d] is %6d\n", i, table[i]);
      total += v;
   }
   return 0;
}

int32_t mysin(int32_t in) {
   int32_t start = 19813*16;
   int32_t out_s = 0;
   int32_t out_c = 0;
   // Get 'in' to within +/- PI/2
   if(in > 0x40000000) {
     out_c = -start;
     in -= 0x80000000;
   } else if(in <= -0x40000000) {
     out_c = -start;
     in += 0x80000000;
   } else {
     out_c = start;
   }
   for(int i = 0; i < TABLE_SIZE; i++) {
      int32_t next_c, next_s;
      if(in > 0) {
        in -= table[i];
        next_c = out_c - (out_s>>i);
        next_s = out_s + (out_c>>i);
      } else {
        in += table[i];
        next_c = out_c + (out_s>>i);
        next_s = out_s - (out_c>>i);
      }
      out_s = next_s;
      out_c = next_c;
   }
   return out_s/16;
}

int main(int argc, char *argv[]) {
   build_table();
   int test_points = 32;
   printf("\nActual  Calced  Error\n");
   for(int32_t i = 0; i < test_points; i++) {
      int actual =  (int)(sin(i*M_PI*2/test_points)*32627);
      int calc   =  mysin(0x10000000/test_points*16*i);
      printf("%6d %6d %4d\n", actual, calc, calc-actual);
   }
}
« Last Edit: December 28, 2021, 11:10:38 am by hamster_nz »
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 
The following users thanked this post: peter-h

