What is a pointer in C?

No, we're not interested in how the standards define them; we are only interested in getting a maximally useful and workable concept here. For the reasons outlined in the initial post, and because during the last two decades or so my own C work product has contained significantly fewer pointer-related bugs than average open source code while the only difference I can tell between my approach and others' is the concepts used, I must assume that the concepts are what make the difference.
The most useful definition I have seen is a three-way split:
function pointers,
void pointers,
object pointers.
- Function pointers are very powerful, but not bug-prone for whatever reason. So, the standards' definitions for these seem to work fine.
An important detail, however, is that systems like Linux do not actually use the bare C standard rules for function pointers, but a much stricter set of rules as defined by POSIX.
In particular, POSIX specifies compatibility between function pointers and non-function pointers, especially void pointers. This makes things like dlsym() work without silly casting shenanigans required by a strict reading of the C standard.
(You'll see similar stuff regarding unions and type punning. Some insist that things like struct sockaddr, as used by POSIX bind() to bind a socket to an address-family-specific address, are not compliant with the C standard and cannot work correctly according to a strict reading of it. Be that as it may, practice trumps theory and standards every time, and the massive amount of POSIX C that works very well indeed (with very few bugs related to struct sockaddr, I might add) indicates that what needs "fixing" is not struct sockaddr, but the C standards instead.)
(Note that in embedded environments where a curious mix of freestanding C and C++ is used –– Arduino, for example –– even though POSIX is nowhere near, the compilers –– GCC and LLVM/Clang in particular –– always provide the POSIX-compatible pointer behaviour instead of the strictly-by-the-standard you-may-not-do-that-even-if-it-would-work-well-in-practice silliness insisted upon by Language Lawyers. So, if you like, instead of "POSIX" you can substitute "GCC and LLVM/Clang" at least.)
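To make the dlsym() point concrete, here is a minimal sketch of mine (not from POSIX itself) showing the conversion in question: the void * returned by dlsym() is converted directly into a function pointer. The library and symbol names ("libm.so.6", "cos") are just assumptions for illustration.

#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>

int main(void)
{
    void *handle = dlopen("libm.so.6", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return EXIT_FAILURE;
    }

    /* POSIX guarantees this void-pointer-to-function-pointer conversion works;
       a strict reading of the C standard alone does not. */
    double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
    if (!cosine) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return EXIT_FAILURE;
    }

    printf("cos(0) = %f\n", cosine(0.0));
    dlclose(handle);
    return EXIT_SUCCESS;
}

(On older glibc you may need to link with -ldl.)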
- Void pointers are the idiot siblings of all pointers. Using the stricter POSIX definitions, void pointers –– void * –– really boil down to the address they point to, plus optional qualifiers for accessing that address (const, "I promise I won't try to modify"; volatile, "undetectable other things may modify this at any point, so make no assumptions about its value, compiler", in particular); and both object pointers and function pointers are compatible with void pointers.
In my experience, the best approach to void pointers is to consider them the most basic import/export form for function and object pointers.
You do not use them to access anything; you use them only when you need to convey the address to be accessed without any further assumptions about it (except those aforementioned qualifiers that involve how and not what).
Simply put, you use void * only when you cannot reasonably use an object pointer or a function pointer, and treat it as the transport format. (This has serious implications for what a programmer must consider when doing this; I'll discuss those a bit further down.)
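As a minimal sketch of that import/export idea (mine, with hypothetical names), here is the common void *context callback pattern: the void pointer is only transported through the generic code, and converted back to a proper object pointer before anything is accessed through it.

#include <stddef.h>

struct stats {
    size_t count;
    double sum;
};

/* Generic interface: this code only transports the void *, it never accesses it. */
typedef void (*visit_fn)(double value, void *context);

static void visit_all(const double *values, size_t count, visit_fn visit, void *context)
{
    for (size_t i = 0; i < count; i++)
        visit(values[i], context);
}

/* The callback converts the void * back into the object pointer it really is,
   and only then accesses the data through it. */
static void accumulate(double value, void *context)
{
    struct stats *st = context;
    st->count++;
    st->sum += value;
}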
- Object pointers are those that point to data.
An object pointer is not just an address one can use. If that is all it were, it would be a void pointer instead.
Object pointers have a non-void type. That type specifies the size of the memory region referred to, indirectly (due to alignment restrictions) implies assumptions about the pointer value itself, and carries type qualifiers like const and volatile that describe the manner in which the region may be accessed.
In visualization terms, only void pointers are reasonably described as arrows or spears or javelins; object pointers are more like buckets or boxes, with label stickers describing what the buckets can hold (think of "this bucket can hold acids" or "only for foodstuffs"), and handling stickers ("fragile") corresponding to const/volatile access qualifiers. Function pointers can be thought of as little sticky notes describing where to go, with optional notes as to how to present oneself (the types of the parameters passed) and what to expect to bring back (the return type).
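To put the bucket metaphor into code, here is a tiny sketch of mine showing what an object pointer's type carries compared to a bare void pointer: the size of the thing pointed to, and the qualifiers limiting how it may be accessed.

#include <stdio.h>

int main(void)
{
    double        value = 42.0;
    double       *dp = &value;   /* an 8-byte (on typical ABIs) object, read/write access */
    const double *cp = &value;   /* the same object, but "I promise not to modify it" */
    void         *vp = &value;   /* just the address; size and type information are gone */

    printf("sizeof *dp = %zu\n", sizeof *dp);  /* known purely from the pointer's type */
    /* sizeof *vp   -- constraint violation: void has no size */
    /* *cp = 1.0;   -- constraint violation: the const qualifier forbids modification */
    (void)cp;
    (void)vp;
    return 0;
}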
One can say that
almost all pointer bugs occur when object pointers point to something other than intended. To fix this, we find a way to think of and use object pointers, so that we more easily detect the cases where mis-pointing may occur.
The first step in this is to realize that the error does not occur when you
use ("dereference", is the term) such a pointer, even though this is exactly where your debugger and other tools will direct your focus at.
The true error occurred when the pointer value was first constructed. (Well, almost. The exception to the case is use-after-free bugs, since there the pointer was absolutely fine, just left forgotten when the region of memory it points to was freed, quite unexpectedly from this pointer's point of view. Leeroy Jenkins of pointers, shall we say? Except I lean towards thinking that the error did occur when this pointer was constructed even in this case, by not having a mechanism to mark this pointer invalid if/when the memory region is freed, or by not ensuring these pointers are no longer accessible when the memory region is freed. If you look at C garbage collectors like
Boehm GC, they do exactly this; and Boehm GC happens to have papers showing that it does this deterministically. So it is doable, and not as hard as one might think.)
Typical bugs when using pointers

There are several classes of bugs related to pointers, but the most common ones are
off by one,
out of bounds, and
use after free.
- Off by one
Perhaps the most common off-by-one bug is forgetting that strings in C are just sequences of non-nul char/unsigned char/signed char, terminated by a nul, '\0', and that nul terminator must be accounted for in the size needed to store that string.
Another common case occurs when backtracking, say when removing trailing whitespace from a string, and forgetting to check whether you are already at the very first char, thus backtracking past the beginning of the string.
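Here are two short sketches of mine matching those two cases (the function names are hypothetical): the first remembers the terminator in the allocation size, the second checks the start of the string before stepping backwards.

#include <stdlib.h>
#include <string.h>
#include <ctype.h>

/* Case 1: the nul terminator must be included in the allocated size. */
char *duplicate(const char *src)
{
    size_t len = strlen(src);       /* length WITHOUT the terminating '\0' */
    char *copy = malloc(len + 1);   /* the + 1 is the part people forget */
    if (copy)
        memcpy(copy, src, len + 1); /* copy the terminator too */
    return copy;
}

/* Case 2: when trimming trailing whitespace, check the start before backtracking. */
void trim_trailing(char *s)
{
    size_t len = strlen(s);
    while (len > 0 && isspace((unsigned char)s[len - 1]))   /* len > 0 comes first! */
        s[--len] = '\0';
}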
- Out of bounds
This is the nastiest class of pointer bugs in my opinion, because they are so hard to track down. Indeed, the core reason why proprietary kernel modules are marked "tainted" by the Linux kernel is not some ideological zealotry; it is because such modules can accidentally dereference pointers and scribble over any data structures at all. (Linux has a "monolithic" kernel architecture, one where kernel modules and drivers work within a single domain without boundaries; as opposed to "microkernel" architectures, where modules and drivers are protected from each other, paying a tiny performance cost, but more importantly tying human developers to inter-domain information passing techniques. The Linux kernel developers like to be free to experiment, update, even replace stuff; that's why they won't limit themselves to a fixed binary interface within the kernel itself, for example.) You see, even when you, after hours of work, discover that the machine or process crashed because "ah ha! This memory was garbled!", you have ZERO indication of exactly what code did the scribbling-over. You found the murder scene, but zero evidence of the murderer.
The simplest verifiable example that is obviously dangerous, yet which you cannot get your compiler to warn about, was posted in another thread here by ataradov. Tweaking it a tiny bit to more closely resemble the typical bug I see in real life, it can be written as
extern int bar(volatile char *p);

int foo(int i)
{
    volatile char buffer[20];
    return bar(buffer + i);
}
Above, buffer is volatile only to stop the compiler from making assumptions about it. The bug occurs because we construct a pointer buffer + i while knowing that i could be any value at all representable by an int, with no guarantee that it is between 0 and 19 inclusive, which is what the code as written implicitly assumes.
I can only assure you that if you have written sufficiently low-bug C code and/or debugged a sufficient amount of it, your eyes detect the possibility/likelihood of that code having a bug within a fraction of a second. It is just that common, you see; it doesn't even involve conscious thought. To me, it is analogous to seeing a pothole in the road.
If you do find a C compiler that supports a compilation/warning flag that makes it complain or warn about that code, let me know: I haven't found one for GCC or Clang (as of 2021-06-29, at least).
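Since no compiler flag catches it, the only real fix is to make the implicit assumption explicit at the point where the pointer is constructed. One possible way to do that (my sketch; the error-return convention is just an assumption here) is

extern int bar(volatile char *p);

int foo(int i)
{
    volatile char buffer[20];

    /* buffer + i is only meaningful for 0 <= i <= 19 (or 20 as a one-past-the-end
       pointer that must not be dereferenced); reject everything else. */
    if (i < 0 || i >= (int)sizeof buffer)
        return -1;   /* error convention assumed for this sketch */

    return bar(buffer + i);
}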
- Use after free
Unlike the above two bugs, this is kinda-sorta not the pointer's own fault. It occurs when the pointer points to memory that has been freed (or otherwise made inaccessible, say a memory mapping since unmapped), and is used ("dereferenced") afterwards.
The programming technique called poisoning is useful in helping to detect these bugs. It simply means that before the memory is freed, its contents are set to easily detected, preferably invalid, values – poison. That way, a use-after-free bug that does not lead to an immediate crash of the process can be detected by the accesses yielding "poisoned" values. (Remember, most C library implementations do not make freed memory inaccessible; they only mark it internally as available for subsequent allocations. Only much larger contiguous allocations on architectures providing virtual memory tend to use memory mapping techniques, so that freeing such an allocation causes a subsequent access to raise a segmentation violation – in Linux/Unix/BSDs, the SIGSEGV signal. So, poisoning is a simple, cheap, practical indicator tool – but not perfect: if the memory is re-used by subsequent allocations, it is likely the poison has been replaced with non-poison data, making the underlying bug much harder to detect. Since the allocation pattern often varies between debug and non-debug builds, these often belong to the Heisenbug class: those that vanish when you try to debug or instrument them.)
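A minimal sketch of the poisoning idea (mine; the 0xAA fill pattern is just one common choice, and the caller has to know the allocation size):

#include <stdlib.h>
#include <string.h>

void free_poisoned(void *ptr, size_t size)
{
    if (ptr)
        memset(ptr, 0xAA, size);   /* overwrite with an easily recognized pattern */
    free(ptr);                     /* free(NULL) is safe, so no separate check is needed */
}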
The practice of NULLing pointers when they are freed is derided by some as "wasteful" or "unnecessary", but it does turn a sub-class of use-after-free bugs into NULL pointer dereferences, and the latter are easily caught at run time on most architectures and operating systems (the access simply traps); and since the point where the dereference occurs tells us which pointer was used, we have immediately discovered the murder weapon, and are much further along in the debugging process than we would otherwise be. In practice, this looks like
free(ptr);
ptr = NULL;
Yes, the overall cost is clearing the pointer value, and in compiled machine code on architectures like x86, AMD64, and ARM64, the cost is effectively zero.
This does not help with the large subset of use-after-free bugs where the pointer used is a cached copy of an earlier value, however.
NULLing pointers is not just a bug-combating technique, however. In POSIX C, for example, a function reading and processing an input file line by line can be written as
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>
#include <errno.h>

#define SEE_ERRNO  -1
#define I_ATE_IT   -2

int for_each_line(FILE *src,
                  int (*handler)(long linenum,
                                 void *base, size_t size,
                                 char *line, size_t linelen,
                                 void *context),
                  long *linenum_at,
                  void *context)
{
    if (!src) {
        errno = EBADF;
        return SEE_ERRNO;
    } else
    if (ferror(src)) {
        errno = EIO;
        return SEE_ERRNO;
    }

    long    linenum = 0;
    char   *line = NULL;
    size_t  size = 0;

    /* If the caller is interested in the linenum, or has provided us with an initial number
       (lines consumed before the first one), use that. */
    if (linenum_at)
        linenum = *linenum_at;

    while (1) {
        /* Read a new line, without line length limitations (other than available memory). */
        ssize_t len = getline(&line, &size, src);

        /* len == 0 should not happen, but if it did, it would indicate end of input.
           len < -1 should not happen, but if it did, it would be a C library bug.
           We differentiate between all these cases by examining feof(src) and ferror(src).
        */
        if (len < 1)
            break;

        /* The logic here is that we increment linenum whenever we read a new line.
           So, if linenum_at == NULL or *linenum_at == 0, the first line read will be
           linenum == 1. This is just one possible convention; you do you. */
        linenum++;

        /* If the caller is interested, then keep them up to date.
           Volatile just tells the compiler to stop trying to be clever about it. */
        if (linenum_at)
            *(volatile long *)linenum_at = linenum;

        /* Both GCC and Clang – the C compilers I prefer to use – generate slightly
           better code for pointers as opposed to pointer+offset expressions.
           So, taking care to avoid off-by-one errors, we use pointers from here on. */
        char *ptr = line;
        char *end = line + len;

        /* Trim off trailing whitespace, including newline characters.
           Note the care taken to avoid off-by-one errors.
           Also, the cast *is* necessary for isspace(); see man 3 isspace for details.
        */
        while (end > ptr && isspace((unsigned char)(end[-1])))
            *(--end) = '\0';

        /* Trim off leading whitespace, if any. */
        while (ptr < end && isspace((unsigned char)(*ptr)))
            ptr++;

        /* Do not bother passing empty lines to the handler function;
           and if there is no handler function specified, we're done with this line. */
        if (ptr >= end || !handler)
            continue;

        /* If the first non-whitespace character of the line is a # or ;,
           we treat the entire line as a comment, and won't pass it to handler. */
        if (*ptr == '#' || *ptr == ';')
            continue;

        /* Let handler handle the line now. */
        int result = handler(linenum, line, size, ptr, (size_t)(end - ptr), context);
        if (result == SEE_ERRNO) {
            /* Error; abort. Return errno to the caller. */
            const int saved_errno = errno;
            free(line);
            errno = saved_errno;
            return SEE_ERRNO;
        } else
        if (result == I_ATE_IT) {
            /* The handler took responsibility for the dynamically allocated buffer. */
            line = NULL;
            size = 0;
        } else
        if (result) {
            /* Error; pass return value to caller. */
            free(line);
            return result;
        }
    }

    /* The line buffer is no longer needed.
       Also, free(NULL) is safe and does nothing, so no need to check it first. */
    free(line);

    /* We could add
           line = NULL;
           size = 0;
       here, but since the rest of the function does not look at them at all,
       I shall omit it here. Just to show that rules of thumb are just that. */

    /* Did we just get an I/O error instead of just an end of input? */
    if (ferror(src) || !feof(src)) {
        errno = EIO;
        return SEE_ERRNO;
    }

    /* No errors, everything fine. We return a nice zero. */
    return 0;
}
If the handler function returns I_ATE_IT, it means that it decided to reuse the entire dynamically allocated region containing the line, starting at base, having size size. (context is just a custom parameter not often needed; it is there only so that if the original caller has some context it wishes to pass to or share with the handler, it can do so without using global or thread-local variables.)
If we had passed the handler only ptr, it could not do that: you cannot pass ownership/responsibility of a dynamically allocated region of memory using a pointer that points to somewhere inside that region, because we don't have a way to determine the region (or even its base address) from the pointer. That's why we pass the base and size of the dynamically allocated region to the handler, too.
Because getline() allocates a new dynamic buffer whenever the pointer is NULL and the size is zero, we only need to NULL the pointer and set the size to zero to keep going. While this case may look very contrived if one thinks of it only as an example of why NULLing a pointer inside a function is sometimes necessary for correct operation, take a step back and look at how useful, simple, and yet powerful the function is. It tracks line numbers, gives the handler the ability to grab ownership of the buffer if it wants, handles comment lines, and removes leading and trailing whitespace.
I wrote the above function using the concepts I talked about earlier in this post, so to evaluate those concepts, look at the code. And try to find a scenario where it could bug out (except when given a bad FILE state to begin with, or by having handler() contain a bug).
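To show what calling it looks like, here is one possible caller (my sketch, not part of the function itself; the handler name print_line is hypothetical): a trivial handler that just echoes each non-empty, non-comment line with its line number.

#include <stdio.h>
#include <stdlib.h>

static int print_line(long linenum, void *base, size_t size,
                      char *line, size_t linelen, void *context)
{
    (void)base; (void)size; (void)context;   /* unused by this trivial handler */
    printf("%ld: %.*s\n", linenum, (int)linelen, line);
    return 0;                                /* 0 means "keep going" */
}

int main(void)
{
    long linenum = 0;
    int  result = for_each_line(stdin, print_line, &linenum, NULL);
    if (result) {
        fprintf(stderr, "Error reading standard input.\n");
        return EXIT_FAILURE;
    }
    fprintf(stderr, "Read %ld lines in total.\n", linenum);
    return EXIT_SUCCESS;
}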
How to avoid pointer bugs?

The key, I claim, is to realize that
the bug does not occur when you use ("dereference") a pointer, but when you construct the pointer in the first place.
(Ignore technicalities and language lawyerism for a second. The context of that statement is the human understanding of how pointers behave, and how to think in ways that help you create fewer bugs than you would if you relied on pure language lawyerism and only on technically correct, impeccable definitions and statements. Minds are squishy, and need to be treated on their own terms.)
That initializing/setting/constructing expression is what you need to examine. You cannot do "security" checks later on, because the information on what the pointer
should be, is already lost. Security cannot be bolted on: it is either integral to the process, or it does not exist.
Checks like
if (!src) in the above function do technically check if
src is NULL or not, but they are not bounds checks: they are just sanity checks, intended as an extra check to catch the most stupid pathological (as in "sick") cases. (A bounds check is one that is aware of the extent of the target to be accessed, and verifies that the access is within those bounds. A sanity check is a check against never-useful/workable/valid inputs; they only catch the insane cases that cannot ever work.)
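To illustrate the difference in code, here is a small sketch of mine (the function name get_element is hypothetical): the first check only rejects never-valid inputs (sanity), while the second actually knows the extent of the target (bounds).

#include <stddef.h>

int get_element(const int *array, size_t count, size_t index, int *out)
{
    if (!array || !out)     /* sanity check: rejects inputs that can never work */
        return -1;

    if (index >= count)     /* bounds check: aware of the extent of the target */
        return -1;

    *out = array[index];
    return 0;
}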
I myself use sanity checks mostly because they make
testing easier. (Okay, and because I like the belt-and-suspenders approach; but chalk that last one up to being willing to pay the tiny extra cost just to reduce the likelihood of getting bit by a bug. You could say I have a bit of the programming-bug equivalent of arachnophobia, I guess. It does not mean my house has fewer arachnids than anyone else's, it's just that I'm willing to go a bit further in trying to avoid encountering any than what most consider "normal" or "reasonable". Actually, I do have a bit of arachnophobia, but I actually like spiders, because where I live there are no spiders harmful to humans in any way, and they keep mosquitoes and gnats and other blood-sucking pests I hate in check. So, I like spiders being close by, I just have an unfortunate atavistic reaction to seeing them. With programming bugs, I may be overcompensating and overblowing the importance of trying to fix bugs and create fewer of them, and of trying to make one's code
detect and
deal with unexpected input or errors, instead of letting them be expressed as unwanted/unexpected/unexplainable behaviour – bugs.)
Remember, because pointer bugs are caused at the moment of their construction and not when the pointers are used/dereferenced, such sanity checks mean that if I do find a bug, that single, cheap check has already ruled out whole classes of causes in the preceding code. Furthermore, things like accidentally passing a never-valid value (say, a NULL pointer), always caught by the sanity check, suddenly transform from catching a bug into reporting to the caller that they used invalid/unacceptable parameters. No bugs necessarily involved: it turns, say, a NULL pointer from a bug into an explicitly ignored or rejected input value. Think of the
free(NULL) case before you make up your mind about that.
The extension of this basic idea is to be suspicious of any expression constructing a pointer value. The cases where pointers are used/dereferenced are
fait accompli: a done deal, a fact of life; and no matter how deeply we examined that part, we'd gain zero additional information about whether that access is safe and non-buggy or not.
Of particular interest is whenever we convert a void pointer to an object pointer or a function pointer. (This is something you do and consider if you start treating void pointers as an import/export method for pointers you cannot express with better fidelity, as I mentioned much earlier.)
There is not much we can do, programming-wise, at that point to check the pointer; the only thing I can think of is to make sure it is sufficiently aligned per the hardware and ABI alignment rules, but because C does not really have existing tools to express such checks in a portable manner (like say a built-in
is_unaligned(type-qualified pointer) operator or function), it is not useful to try and think of how to achieve that (beyond perhaps a simple binary AND mask on the least significant bits of the address, with the mask being a compile-time constant).
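For what it is worth, here is a sketch of that kind of alignment mask check (my formulation, using C11 _Alignof and uintptr_t, not any standard facility); note that even when it passes, it tells us nothing about whether the pointed-to object is actually valid.

#include <stdint.h>

/* Nonzero if ptr is not suitably aligned to be accessed as the given type.
   Relies on alignments being powers of two, which holds on common ABIs. */
#define IS_UNALIGNED_AS(ptr, type) \
    (((uintptr_t)(const void *)(ptr)) & ((uintptr_t)_Alignof(type) - 1u))

int looks_unaligned_for_double(const void *p)
{
    return IS_UNALIGNED_AS(p, double) != 0;
}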
Instead, we must turn to our fellow developers, and start discussing what non-program-expressed guarantees the caller (or whoever provides us with the void pointer in the first place) can provide us with
and whether they could instead pass us more information.
See how that logic meshes with the
for_each_line() function implementation above, especially what we provide to the callback
handler() function on every call?
A typical knee-jerk reaction from C programmers to that is fear/dislike of passing that much more data, and considering it a waste of resources. However, for example on AMD64 on Linux (using the AMD64 SysV Application Binary Interface, i.e. the hardware calling convention), all six arguments to the handler function are passed in registers (rdi, rsi, rdx, rcx, r8, and r9, to be specific, with the return value in eax, the 32 least significant bits of the rax register, which is also used to hold the function pointer value in most call patterns generated by clang/llvm and gcc), and the added register pressure does not affect the efficiency of the function to any measurable degree, since the bottleneck here is always the
getline() standard library function.
In other words, if you
felt any dislike at adding two seemingly superfluous parameters to a callback function, just so that the callback could *safely* grab complete ownership and responsibility for some dynamically allocated memory, you need to start fighting against your feelings or intuition, because it is demonstrably wrong here.
And herein lies the entire Mind Over Bugs trick: we don't look for outside assistance to help combat these bugs. We use our own minds to do so, by retraining ourselves to think in terms where such bugs become visible – even glaring! – and thus easier to deal with, and hopefully much, much rarer as a result. Assuming we do care about bugs in the first place; many programmers don't, because bugs don't affect their pay slip either way, and bugs are dull.
That does not mean we become immune, though. I am fully aware that there may be a pointer-related bug even in the
for_each_line() example function above; I am just confident that its likelihood is low, based on past experience and statistics. But, because I am not certain, I used lots of comments explaining my reasons for the key expressions, so that if anything gives me pause (or gives pause to any of my colleagues, collaborators, or cow-orkers, including Nominal Animal of the Future, who tends to see things a bit differently than Nominal Animal of Today, who sees things differently than Nominal Animal of Yesteryear did), I can start the examination by comparing whether the code matches the assumptions and reasoning explained in the comments.
Again, Mind Over Bugs, this time from the other side. We use comments to describe our reasoning, and this gives us two separate tools. Note that such comments are orthogonal to the code: you cannot reliably infer them from the code, and they definitely do not describe
what the code does; they only describe developer
intent and
reasoning.
One tool such comments provide is that we can now compare the reasoning to known rules and behaviour – for example, a comment above says
free(NULL) is safe, and the code does rely on this; so we can take a look at
man 3 free or a C standard, and check. (The C standard does explicitly say that for
free(ptr),
"if ptr is a null pointer, then no action occurs".)
The second is that now we can compare the code and the comments, to see if they actually match. Even the best C programmer on the planet has brainfarts – because they are human, and every single human with measurable brain function occasionally has those; perhaps more gently called "thinkos" or thinking equivalents of typos. You don't call a professor an idiot just because one in one hundred instances of "which" in their output is mistyped as "witch". They could be, but that's not a valid reason to make the categorization. Similarly with thinkos, because proper software design is complex work, and occasionally a human brain just stumbles over a detail. Often, those details are the smallest ones, so comfortable and well known that one is doubly ashamed of the error. One of my own stumbling blocks is
memset(), and the order of its fill value and size parameters. Some of my colleagues think less of me because I always have a terminal open, and I check –– I even have an alias for
man -s 2,3,7,5 function –– instead of being a True Professional Who Knows the Words of Power and Wisdom and Truthiness and Never Reads Manuals That Are For Lesser Beings Anyway, and winging it.
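(For the record, a tiny reminder sketch of mine of that particular thinko, in case it saves someone else a trip to the manual: the fill value comes before the size, and swapping them still compiles without a peep.)

#include <string.h>

void clear_buffer(char *buffer, size_t size)
{
    /* memset(buffer, size, 0);    thinko: fills zero bytes, silently does nothing useful */
    memset(buffer, 0, size);    /* correct: fill 'size' bytes with the value 0 */
}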
Combine this with expressions that construct pointers. If you have a comment that says that the caller is responsible for ensuring a parameter is within a specific interval, the human-readable description of that function better have that assumption/responsibility very visible, or most callers just won't know about the responsibility, and bugs will ensue.
Even if the entire codebase is written single-handedly by yourself, the person that wrote a piece of code two weeks, two months, or two years ago, does not have the same context and understanding they have right now. This is because humans only stay in the same context and understanding, if they are dead. Dead people do not often write code. The reason is not actually that they're dead, it is because
constructive creation is a process that affects the creator almost as much.
In less grandiose terms, when you create something new, you learn, and that changes how you do things.
Even in the minimal "I write this only once and will never look at it again" case, when most believe comments are not needed/useful/required, the comments are really almost the only way we can detect when we have learned something we previously did not know, something that caused us to generate buggy code.
A case in point is the
rubber duck debugging method, where you grab a rubber duck, any other inanimate object, or your favourite pet animal that likes to hear you talk but understands basically none of it, and describe the problem you are having. Because of how the human mind works, this act of expressing the problem in a different manner, a spoken language, affects how your mind processes the problem; and surprisingly often, about midway through your description, the parts start fitting together and you realize the solution.
So even in the case where the code is for you yourself only and is only written once and never ever read again, those comments are useful because they can provide the same functionality for your Mind that the rubber duck target does.
In a very real sense, all of this can be boiled down to the idea that your mind is just another tool, and the concepts and words it uses as its own sub-tools determine how that tool is used; so we-the-minds must redefine our concepts and use words that
help us solve problems and accomplish tasks. Free speech aside, relying on standards and other authorities to give us the concepts, and just using those, is to limit oneself to the preset toolset of those authorities. It is the intellectual equivalent of tying one's hands behind one's back.