What is a pointer in C?

No, we're not interested in how the standards define them; we are only interested in getting a maximally useful and workable concept here. For the reasons outlined in the initial post, and because during the last two decades or so my own C work product has contained significantly fewer pointer-related bugs than average open source code while the only difference I can tell between my approach and others' is the concepts used, I must assume that the concepts are what make the difference.
The most useful definition I have seen is a three-way split:
function pointers,
void pointers,
object pointers.
- Function pointers are very powerful, but not bug-prone for whatever reason. So, the standards' definitions for these seem to work fine.
An important detail, however, is that systems like Linux do not actually use the bare C standard rules for function pointers, but a much stricter set of rules as defined by POSIX.
In particular, POSIX specifies compatibility between function pointers and non-function pointers, especially void pointers. This makes things like dlsym() work without silly casting shenanigans required by a strict reading of the C standard.
(You'll see similar stuff regarding unions and type punning. Some insist that things like struct sockaddr, as used by POSIX bind() to bind a socket to an address-family-specific address, are not compliant with the C standard and cannot work correctly according to a strict reading of it. Be that as it may, practice trumps theory and standards every time, and the massive amount of POSIX C that works very well indeed (with very few bugs related to struct sockaddr, I might add) indicates that what needs "fixing" is not struct sockaddr, but the C standards instead.)
(Note that in embedded environments where a curious mix of freestanding C and C++ is used –– Arduino, for example –– even though POSIX is nowhere near, the compilers –– GCC and LLVM/Clang in particular –– always provide the POSIX-compatible pointer behaviour instead of the strictly-by-the-standard you-may-not-do-that-even-if-it-would-work-well-in-practice silliness insisted upon by Language Lawyers. So, if you like, instead of "POSIX" you can substitute "GCC and LLVM/Clang" at least.)
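To make the dlsym() point concrete, here is a minimal sketch of mine (not from POSIX itself) showing the conversion in question: the void * returned by dlsym() is converted directly into a function pointer. The library and symbol names ("libm.so.6", "cos") are just assumptions for illustration.

#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>

int main(void)
{
    void *handle = dlopen("libm.so.6", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return EXIT_FAILURE;
    }

    /* POSIX guarantees this void-pointer-to-function-pointer conversion works;
       a strict reading of the C standard alone does not. */
    double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
    if (!cosine) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return EXIT_FAILURE;
    }

    printf("cos(0) = %f\n", cosine(0.0));
    dlclose(handle);
    return EXIT_SUCCESS;
}

(On older glibc you may need to link with -ldl.)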
- Void pointers are the idiot siblings of all pointers. Using the stricter POSIX definitions, void pointers –– void * –– really boil down to the address they point to, plus optional qualifiers for accessing that address (const, "I promise I won't try to modify"; volatile, "undetectable other things may modify this at any point, so make no assumptions about its value, compiler", in particular); and both object pointers and function pointers are compatible with void pointers.
In my experience, the best approach to void pointers is to consider them the most basic import/export form for function and object pointers.
You do not use them to access anything; you use them only when you need to convey the address to be accessed without any further assumptions about it (except those aforementioned qualifiers that involve how and not what).
Simply put, you use void * only when you cannot reasonably use an object pointer or a function pointer, and treat it as the transport format. (This has serious implications for what a programmer must consider when doing this; I'll discuss those a bit further down.)
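As a minimal sketch of that import/export idea (mine, with hypothetical names), here is the common void *context callback pattern: the void pointer is only transported through the generic code, and converted back to a proper object pointer before anything is accessed through it.

#include <stddef.h>

struct stats {
    size_t count;
    double sum;
};

/* Generic interface: this code only transports the void *, it never accesses it. */
typedef void (*visit_fn)(double value, void *context);

static void visit_all(const double *values, size_t count, visit_fn visit, void *context)
{
    for (size_t i = 0; i < count; i++)
        visit(values[i], context);
}

/* The callback converts the void * back into the object pointer it really is,
   and only then accesses the data through it. */
static void accumulate(double value, void *context)
{
    struct stats *st = context;
    st->count++;
    st->sum += value;
}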
- Object pointers are those that point to data.
An object pointer is not just an address one can use. If that is all it were, it would be a void pointer instead.
Object pointers have a non-void type. That type specifies the size of the memory region referred to, indirectly (due to alignment restrictions) implies assumptions about the pointer value itself, and carries type qualifiers like const and volatile that describe the manner in which the region may be accessed.
In visualization terms, only void pointers are reasonably described as arrows or spears or javelins; object pointers are more like buckets or boxes, with label stickers describing what the buckets can hold (think of "this bucket can hold acids" or "only for foodstuffs"), and handling stickers ("fragile") corresponding to const/volatile access qualifiers. Function pointers can be thought of as little sticky notes describing where to go, with optional notes as to how to present oneself (the types of the parameters passed) and what to expect to bring back (the return type).
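To put the bucket metaphor into code, here is a tiny sketch of mine showing what an object pointer's type carries compared to a bare void pointer: the size of the thing pointed to, and the qualifiers limiting how it may be accessed.

#include <stdio.h>

int main(void)
{
    double        value = 42.0;
    double       *dp = &value;   /* an 8-byte (on typical ABIs) object, read/write access */
    const double *cp = &value;   /* the same object, but "I promise not to modify it" */
    void         *vp = &value;   /* just the address; size and type information are gone */

    printf("sizeof *dp = %zu\n", sizeof *dp);  /* known purely from the pointer's type */
    /* sizeof *vp   -- constraint violation: void has no size */
    /* *cp = 1.0;   -- constraint violation: the const qualifier forbids modification */
    (void)cp;
    (void)vp;
    return 0;
}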
One can say that
almost all pointer bugs occur when object pointers point to something other than intended. To fix this, we find a way to think of and use object pointers, so that we more easily detect the cases where mis-pointing may occur.
The first step in this is to realize that the error does not occur when you
use ("dereference", is the term) such a pointer, even though this is exactly where your debugger and other tools will direct your focus at.
The true error occurred when the pointer value was first constructed. (Well, almost. The exception to the case is use-after-free bugs, since there the pointer was absolutely fine, just left forgotten when the region of memory it points to was freed, quite unexpectedly from this pointer's point of view. Leeroy Jenkins of pointers, shall we say? Except I lean towards thinking that the error did occur when this pointer was constructed even in this case, by not having a mechanism to mark this pointer invalid if/when the memory region is freed, or by not ensuring these pointers are no longer accessible when the memory region is freed. If you look at C garbage collectors like
Boehm GC, they do exactly this; and Boehm GC happens to have papers showing that it does this deterministically. So it is doable, and not as hard as one might think.)
Typical bugs when using pointers

There are several classes of bugs related to pointers, but the most common ones are
off by one,
out of bounds, and
use after free.
- Off by one
Perhaps the most common off-by-one bug is forgetting that strings in C are just sequences of non-nul char/unsigned char/signed char, terminated by a nul, '\0', and that nul terminator must be accounted for in the size needed to store that string.
Another common case occurs when backtracking, say when removing trailing whitespace from a string, and forgetting to check whether you are already at the very first char, thus backtracking past the beginning of the string.
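Here are two short sketches of mine matching those two cases (the function names are hypothetical): the first remembers the terminator in the allocation size, the second checks the start of the string before stepping backwards.

#include <stdlib.h>
#include <string.h>
#include <ctype.h>

/* Case 1: the nul terminator must be included in the allocated size. */
char *duplicate(const char *src)
{
    size_t len = strlen(src);       /* length WITHOUT the terminating '\0' */
    char *copy = malloc(len + 1);   /* the + 1 is the part people forget */
    if (copy)
        memcpy(copy, src, len + 1); /* copy the terminator too */
    return copy;
}

/* Case 2: when trimming trailing whitespace, check the start before backtracking. */
void trim_trailing(char *s)
{
    size_t len = strlen(s);
    while (len > 0 && isspace((unsigned char)s[len - 1]))   /* len > 0 comes first! */
        s[--len] = '\0';
}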
- Out of bounds
This is the nastiest class of pointer bugs in my opinion, because they are so hard to track down. Indeed, the core reason why proprietary kernel modules are marked "tainted" by the Linux kernel is not some ideological zealotry; it is because such modules can accidentally dereference pointers and scribble over any data structures at all. (Linux has a "monolithic" kernel architecture, one where kernel modules and drivers work within a single domain without boundaries; as opposed to "microkernel" architectures, where modules and drivers are protected from each other, paying a tiny performance cost, but more importantly tying human developers to inter-domain information passing techniques. The Linux kernel developers like to be free to experiment, update, even replace stuff; that's why they won't limit themselves to a fixed binary interface within the kernel itself, for example.) You see, even when you, after hours of work, discover that the machine or process crashed because "ah ha! This memory was garbled!", you have ZERO indication of exactly what code did the scribbling-over. You found the murder scene, but zero evidence of the murderer.
The simplest verifiable example that is obviously dangerous, yet which you cannot get your compiler to warn about, was posted in another thread here by ataradov. Tweaking it a tiny bit to more closely resemble the typical bug I see in real life, it can be written as
extern int bar(volatile char *p);

int foo(int i)
{
    volatile char buffer[20];
    return bar(buffer + i);
}
Above, buffer is volatile only to stop the compiler from making assumptions about it. The bug occurs because we construct a pointer buffer + i while knowing that i could be any value at all representable by an int, with no guarantee that it is between 0 and 19 inclusive, which is what the code as written implicitly assumes.
I can only assure you that if you have written sufficiently low-bug C code and/or debugged a sufficient amount of it, your eyes detect the possibility/likelihood of that code having a bug within a fraction of a second. It is just that common, you see; it doesn't even involve conscious thought. To me, it is analogous to seeing a pothole in the road.
If you do find a C compiler that supports a compilation/warning flag that makes it complain or warn about that code, let me know: I haven't found one for GCC or Clang (as of 2021-06-29, at least).
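Since no compiler flag catches it, the only real fix is to make the implicit assumption explicit at the point where the pointer is constructed. One possible way to do that (my sketch; the error-return convention is just an assumption here) is

extern int bar(volatile char *p);

int foo(int i)
{
    volatile char buffer[20];

    /* buffer + i is only meaningful for 0 <= i <= 19 (or 20 as a one-past-the-end
       pointer that must not be dereferenced); reject everything else. */
    if (i < 0 || i >= (int)sizeof buffer)
        return -1;   /* error convention assumed for this sketch */

    return bar(buffer + i);
}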
- Use after free
Unlike the above two bugs, this is kinda-sorta not the pointer's own fault. It occurs when the pointer points to memory that has been freed (or otherwise made inaccessible, say a memory mapping since unmapped), and is used ("dereferenced") afterwards.
The programming technique called poisoning is useful in helping to detect these bugs. It simply means that before the memory is freed, its contents are set to easily detected, preferably invalid, values – poison. That way, a use-after-free bug that does not lead to an immediate crash of the process can be detected by the accesses yielding "poisoned" values. (Remember, most C library implementations do not make freed memory inaccessible; they only mark it internally as available for subsequent allocations. Only much larger contiguous allocations on architectures providing virtual memory tend to use memory mapping techniques, so that freeing such an allocation causes a subsequent access to raise a segmentation violation – in Linux/Unix/BSDs, the SIGSEGV signal. So, poisoning is a simple, cheap, practical indicator tool – but not perfect: if the memory is re-used by subsequent allocations, it is likely the poison has been replaced with non-poison data, making the underlying bug much harder to detect. Since the allocation pattern often varies between debug and non-debug builds, these often belong to the Heisenbug class: those that vanish when you try to debug or instrument them.)
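A minimal sketch of the poisoning idea (mine; the 0xAA fill pattern is just one common choice, and the caller has to know the allocation size):

#include <stdlib.h>
#include <string.h>

void free_poisoned(void *ptr, size_t size)
{
    if (ptr)
        memset(ptr, 0xAA, size);   /* overwrite with an easily recognized pattern */
    free(ptr);                     /* free(NULL) is safe, so no separate check is needed */
}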
The practice of NULLing pointers when they are freed is derided by some as "wasteful" or "unnecessary", but it does turn a sub-class of use-after-free bugs into NULL pointer dereferences, and the latter are easily caught at run time on most architectures and operating systems (the access simply traps); and since the point where the dereference occurs tells us which pointer was used, we have immediately discovered the murder weapon, and are much further along in the debugging process than we would otherwise be. In practice, this looks like
free(ptr);
ptr = NULL;
Yes, the overall cost is clearing the pointer value, and in compiled machine code on architectures like x86, AMD64, and ARM64, the cost is effectively zero.
This does not help with the large subset of use-after-free bugs where the pointer used is a cached copy of an earlier value, however.
NULLing pointers is not just a bug-combating technique, however. In POSIX C, for example, a function reading and processing an input file line by line can be written as
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>
#include <errno.h>

#define SEE_ERRNO  -1
#define I_ATE_IT   -2

int for_each_line(FILE *src,
                  int (*handler)(long linenum,
                                 void *base, size_t size,
                                 char *line, size_t linelen,
                                 void *context),
                  long *linenum_at,
                  void *context)
{
    if (!src) {
        errno = EBADF;
        return SEE_ERRNO;
    } else
    if (ferror(src)) {
        errno = EIO;
        return SEE_ERRNO;
    }

    long    linenum = 0;
    char   *line = NULL;
    size_t  size = 0;

    /* If the caller is interested in the linenum, or has provided us with an initial number
       (lines consumed before the first one), use that. */
    if (linenum_at)
        linenum = *linenum_at;

    while (1) {
        /* Read a new line, without line length limitations (other than available memory). */
        ssize_t len = getline(&line, &size, src);

        /* len == 0 should not happen, but if it did, it would indicate end of input.
           len < -1 should not happen, but if it did, it would be a C library bug.
           We differentiate between all these cases by examining feof(src) and ferror(src).
        */
        if (len < 1)
            break;

        /* The logic here is that we increment linenum whenever we read a new line.
           So, if linenum_at == NULL or *linenum_at == 0, the first line read will be
           linenum == 1. This is just one possible convention; you do you. */
        linenum++;

        /* If the caller is interested, then keep them up to date.
           Volatile just tells the compiler to stop trying to be clever about it. */
        if (linenum_at)
            *(volatile long *)linenum_at = linenum;

        /* Both GCC and Clang – the C compilers I prefer to use – generate slightly
           better code for pointers as opposed to pointer+offset expressions.
           So, taking care to avoid off-by-one errors, we use pointers from here on. */
        char *ptr = line;
        char *end = line + len;

        /* Trim off trailing whitespace, including newline characters.
           Note the care taken to avoid off-by-one errors.
           Also, the cast *is* necessary for isspace(); see man 3 isspace for details.
        */
        while (end > ptr && isspace((unsigned char)(end[-1])))
            *(--end) = '\0';

        /* Trim off leading whitespace, if any. */
        while (ptr < end && isspace((unsigned char)(*ptr)))
            ptr++;

        /* Do not bother passing empty lines to the handler function;
           and if there is no handler function specified, we're done with this line. */
        if (ptr >= end || !handler)
            continue;

        /* If the first non-whitespace character of the line is a # or ;,
           we treat the entire line as a comment, and won't pass it to handler. */
        if (*ptr == '#' || *ptr == ';')
            continue;

        /* Let handler handle the line now. */
        int result = handler(linenum, line, size, ptr, (size_t)(end - ptr), context);
        if (result == SEE_ERRNO) {
            /* Error; abort. Return errno to the caller. */
            const int saved_errno = errno;
            free(line);
            errno = saved_errno;
            return SEE_ERRNO;
        } else
        if (result == I_ATE_IT) {
            /* The handler took responsibility for the dynamically allocated buffer. */
            line = NULL;
            size = 0;
        } else
        if (result) {
            /* Error; pass return value to caller. */
            free(line);
            return result;
        }
    }

    /* The line buffer is no longer needed.
       Also, free(NULL) is safe and does nothing, so no need to check it first. */
    free(line);

    /* We could add
           line = NULL;
           size = 0;
       here, but since the rest of the function does not look at them at all,
       I shall omit it here. Just to show that rules of thumb are just that. */

    /* Did we just get an I/O error instead of just an end of input? */
    if (ferror(src) || !feof(src)) {
        errno = EIO;
        return SEE_ERRNO;
    }

    /* No errors, everything fine. We return a nice zero. */
    return 0;
}
If the handler function returns I_ATE_IT, it means that it decided to reuse the entire dynamically allocated region containing the line, starting at base, having size size. (context is just a custom parameter not often needed; it is there only so that if the original caller has some context it wishes to pass to or share with the handler, it can do so without using global or thread-local variables.)
If we had passed the handler only ptr, it could not do that: you cannot pass ownership/responsibility of a dynamically allocated region of memory using a pointer that points to somewhere inside that region, because we don't have a way to determine the region (or even its base address) from the pointer. That's why we pass the base and size of the dynamically allocated region to the handler, too.
Because getline() allocates a new dynamic buffer whenever the pointer is NULL and the size is zero, we only need to NULL the pointer and set the size to zero to keep going. While this case may look very contrived if one thinks of it only as an example of why NULLing a pointer inside a function is sometimes necessary for correct operation, take a step back and look at how useful, simple, and yet powerful the function is. It tracks line numbers, gives the handler the ability to grab ownership of the buffer if it wants, handles comment lines, and removes leading and trailing whitespace.
I wrote the above function using the concepts I talked about earlier in this post, so to evaluate those concepts, look at the code. And try to find a scenario where it could bug out (except when given a bad FILE state to begin with, or by having handler() contain a bug).
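To show what calling it looks like, here is one possible caller (my sketch, not part of the function itself; the handler name print_line is hypothetical): a trivial handler that just echoes each non-empty, non-comment line with its line number.

#include <stdio.h>
#include <stdlib.h>

static int print_line(long linenum, void *base, size_t size,
                      char *line, size_t linelen, void *context)
{
    (void)base; (void)size; (void)context;   /* unused by this trivial handler */
    printf("%ld: %.*s\n", linenum, (int)linelen, line);
    return 0;                                /* 0 means "keep going" */
}

int main(void)
{
    long linenum = 0;
    int  result = for_each_line(stdin, print_line, &linenum, NULL);
    if (result) {
        fprintf(stderr, "Error reading standard input.\n");
        return EXIT_FAILURE;
    }
    fprintf(stderr, "Read %ld lines in total.\n", linenum);
    return EXIT_SUCCESS;
}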
How to avoid pointer bugs?

The key, I claim, is to realize that
the bug does not occur when you use ("dereference") a pointer, but when you construct the pointer in the first place.
(Ignore technicalities and language lawyerism for a second. The context of that statement is the human understanding of how pointers behave, and how to think in ways that help you create fewer bugs than you would if you relied on pure language lawyerism and only on technically correct, impeccable definitions and statements. Minds are squishy, and need to be treated on their own terms.)
That initializing/setting/constructing expression is what you need to examine. You cannot do "security" checks later on, because the information on what the pointer
should be, is already lost. Security cannot be bolted on: it is either integral to the process, or it does not exist.
Checks like
if (!src) in the above function do technically check if
src is NULL or not, but they are not bounds checks: they are just sanity checks, intended as an extra check to catch the most stupid pathological (as in "sick") cases. (A bounds check is one that is aware of the extent of the target to be accessed, and verifies that the access is within those bounds. A sanity check is a check against never-useful/workable/valid inputs; they only catch the insane cases that cannot ever work.)
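To illustrate the difference in code, here is a small sketch of mine (the function name get_element is hypothetical): the first check only rejects never-valid inputs (sanity), while the second actually knows the extent of the target (bounds).

#include <stddef.h>

int get_element(const int *array, size_t count, size_t index, int *out)
{
    if (!array || !out)     /* sanity check: rejects inputs that can never work */
        return -1;

    if (index >= count)     /* bounds check: aware of the extent of the target */
        return -1;

    *out = array[index];
    return 0;
}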
I myself use sanity checks mostly because they make
testing easier. (Okay, and because I like the belt-and-suspenders approach; but chalk that last one up to being willing to pay the tiny extra cost just to reduce the likelihood of getting bit by a bug. You could say I have a bit of the programming-bug equivalent of arachnophobia, I guess. It does not mean my house has fewer arachnids than anyone else's, it's just that I'm willing to go a bit further in trying to avoid encountering any than what most consider "normal" or "reasonable". Actually, I do have a bit of arachnophobia, but I actually like spiders, because where I live there are no spiders harmful to humans in any way, and they keep mosquitoes and gnats and other blood-sucking pests I hate in check. So, I like spiders being close by, I just have an unfortunate atavistic reaction to seeing them. With programming bugs, I may be overcompensating and overblowing the importance of trying to fix bugs and create fewer of them, and of trying to make one's code
detect and
deal with unexpected input or errors, instead of letting them be expressed as unwanted/unexpected/unexplainable behaviour – bugs.)
Remember, because pointer bugs are caused at the moment of their construction and not when the pointers are used/dereferenced, such sanity checks mean that if I do find a bug, that single, cheap check has already ruled out whole classes of causes in the preceding code. Furthermore, things like accidentally passing a never-valid value (say, a NULL pointer), always caught by the sanity check, suddenly transform from catching a bug into reporting to the caller that they used invalid/unacceptable parameters. No bugs necessarily involved: it turns, say, a NULL pointer from a bug into an explicitly ignored or rejected input value. Think of the
free(NULL) case before you make up your mind about that.
The extension of this basic idea is to be suspicious of any expression constructing a pointer value. The cases where pointers are used/dereferenced are
fait accompli: a done deal, a fact of life; and no matter how deeply we examined that part, we'd gain zero additional information about whether that access is safe and non-buggy or not.
Of particular interest is whenever we convert a void pointer to an object pointer or a function pointer. (This is something you do and consider if you start treating void pointers as an import/export method for pointers you cannot express with better fidelity, as I mentioned much earlier.)
There is not much we can do, programming-wise, at that point to check the pointer; the only thing I can think of is to make sure it is sufficiently aligned per the hardware and ABI alignment rules, but because C does not really have existing tools to express such checks in a portable manner (like say a built-in
is_unaligned(type-qualified pointer) operator or function), it is not useful to try and think of how to achieve that (beyond perhaps a simple binary AND mask on the least significant bits of the address, with the mask being a compile-time constant).
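For what it is worth, here is a sketch of that kind of alignment mask check (my formulation, using C11 _Alignof and uintptr_t, not any standard facility); note that even when it passes, it tells us nothing about whether the pointed-to object is actually valid.

#include <stdint.h>

/* Nonzero if ptr is not suitably aligned to be accessed as the given type.
   Relies on alignments being powers of two, which holds on common ABIs. */
#define IS_UNALIGNED_AS(ptr, type) \
    (((uintptr_t)(const void *)(ptr)) & ((uintptr_t)_Alignof(type) - 1u))

int looks_unaligned_for_double(const void *p)
{
    return IS_UNALIGNED_AS(p, double) != 0;
}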
Instead, we must turn to our fellow developers, and start discussing what non-program-expressed guarantees the caller (or whoever provides us with the void pointer in the first place) can provide us with
and whether they could instead pass us more information.
See how that logic meshes with the
for_each_line() function implementation above, especially what we provide to the callback
handler() function on every call?
A typical knee-jerk reaction from C programmers to that is fear/dislike of passing that much more data, and considering it a waste of resources. However, for example on AMD64 on Linux (using the AMD64 SysV Application Binary Interface, i.e. the hardware calling convention), all six arguments to the handler function are passed in registers (rdi, rsi, rdx, rcx, r8, and r9, to be specific, with the return value in eax, the 32 least significant bits of the rax register, which is also used to hold the function pointer value in most call patterns generated by clang/llvm and gcc), and the added register pressure does not affect the efficiency of the function to any measurable degree, since the bottleneck here is always the
getline() standard library function.
In other words, if you
felt any dislike at adding two seemingly superfluous parameters to a callback function, just so that the callback could *safely* grab complete ownership and responsibility for some dynamically allocated memory, you need to start fighting against your feelings or intuition, because it is demonstrably wrong here.
And herein lies the entire Mind Over Bugs trick: we don't look for outside assistance to help combat these bugs. We use our own minds to do so, by retraining ourselves to think in terms where such bugs become visible – even glaring! – and thus easier to deal with, and hopefully much, much rarer as a result. Assuming we do care about bugs in the first place; many programmers don't, because bugs don't affect their pay slip either way, and bugs are dull.
That does not mean we become immune, though. I am fully aware that there may be a pointer-related bug even in the
for_each_line() example function above; I am just confident that its likelihood is low, based on past experience and statistics. But, because I am not certain, I used lots of comments explaining my reasons for the key expressions, so that if anything gives me pause (or gives pause to any of my colleagues, collaborators, or cow-orkers, including Nominal Animal of the Future, who tends to see things a bit differently than Nominal Animal of Today, who sees things differently than Nominal Animal of Yesteryear did), I can start the examination by comparing whether the code matches the assumptions and reasoning explained in the comments.
Again, Mind Over Bugs, this time from the other side. We use comments to describe our reasoning, and this gives us two separate tools. Note that such comments are orthogonal to the code: you cannot reliably infer them from the code, and they definitely do not describe
what the code does; they only describe developer
intent and
reasoning.
One tool such comments provide is that we can now compare the reasoning to known rules and behaviour – for example, a comment above says
free(NULL) is safe, and the code does rely on this; so we can take a look at
man 3 free or a C standard, and check. (The C standard does explicitly say that for
free(ptr),
"if ptr is a null pointer, then no action occurs".)
The second is that now we can compare the code and the comments, to see if they actually match. Even the best C programmer on the planet has brainfarts – because they are human, and every single human with measurable brain function occasionally has those; perhaps more gently called "thinkos" or thinking equivalents of typos. You don't call a professor an idiot just because one in one hundred instances of "which" in their output is mistyped as "witch". They could be, but that's not a valid reason to make the categorization. Similarly with thinkos, because proper software design is complex work, and occasionally a human brain just stumbles over a detail. Often, those details are the smallest ones, so comfortable and well known that one is doubly ashamed of the error. One of my own stumbling blocks is
memset(), and the order of its fill value and size parameters. Some of my colleagues think less of me because I always have a terminal open, and I check –– I even have an alias for
man -s 2,3,7,5 function –– instead of being a True Professional Who Knows the Words of Power and Wisdom and Truthiness and Never Reads Manuals That Are For Lesser Beings Anyway, and winging it.
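(For the record, a tiny reminder sketch of mine of that particular thinko, in case it saves someone else a trip to the manual: the fill value comes before the size, and swapping them still compiles without a peep.)

#include <string.h>

void clear_buffer(char *buffer, size_t size)
{
    /* memset(buffer, size, 0);    thinko: fills zero bytes, silently does nothing useful */
    memset(buffer, 0, size);    /* correct: fill 'size' bytes with the value 0 */
}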
Combine this with expressions that construct pointers. If you have a comment that says that the caller is responsible for ensuring a parameter is within a specific interval, the human-readable description of that function better have that assumption/responsibility very visible, or most callers just won't know about the responsibility, and bugs will ensue.
Even if the entire codebase is written single-handedly by yourself, the person that wrote a piece of code two weeks, two months, or two years ago, does not have the same context and understanding they have right now. This is because humans only stay in the same context and understanding, if they are dead. Dead people do not often write code. The reason is not actually that they're dead, it is because
constructive creation is a process that affects the creator almost as much.
In less grandiose terms, when you create something new, you learn, and that changes how you do things.
Even in the minimal "I write this only once and will never look at it again" case, when most believe comments are not needed/useful/required, the comments are really almost the only way we can detect when we have learned something we previously did not know, something that caused us to generate buggy code.
A case in point is the
rubber duck debugging method, where you grab a rubber duck, any other inanimate object, or your favourite pet animal that likes to hear you talk but understands basically none of it, and describe the problem you are having. Because of how the human mind works, this act of expressing the problem in a different manner, a spoken language, affects how your mind processes the problem; and surprisingly often, about midway through your description, the parts start fitting together and you realize the solution.
So even in the case where the code is for you yourself only and is only written once and never ever read again, those comments are useful because they can provide the same functionality for your Mind that the rubber duck target does.
In a very real sense, all of this can be boiled down to the idea that your mind is just another tool, and the concepts and words it uses as its own sub-tools determine how that tool is used; so we-the-minds must redefine our concepts and use words that
help us solve problems and accomplish tasks. Free speech aside, relying on standards and other authorities to give us the concepts, and just using those, is to limit oneself to the preset toolset of those authorities. It is the intellectual equivalent of tying one's hands behind one's back.