In addition to the systematic checks, I like to run tests based on pseudorandom number sequences, specifically the Xorshift family of PRNGs, because they're very fast and very 'random'.
The basic idea is to use whatever randomness sources you have to generate a valid initial seed state (for the Xorshift family, the only invalid seed state is all zeros), then fill the memory with the sequence starting at that state. You can then verify the memory contents by recalculating the same sequence, restarting at the same seed state.
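As a minimal sketch of the idea in C, assuming a 64-bit Xorshift variant (xorshift64) and a word-aligned test region; the prng_fill/prng_verify names are just illustrative, not from any particular library:

```c
#include <stdint.h>
#include <stddef.h>

/* Marsaglia xorshift64: any nonzero seed is a valid state. */
static inline uint64_t xorshift64(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/* Fill 'count' 64-bit words at 'mem' with the sequence starting from 'seed'.
 * 'volatile' keeps the compiler from optimizing the accesses away. */
static void prng_fill(volatile uint64_t *mem, size_t count, uint64_t seed)
{
    uint64_t state = seed;
    for (size_t i = 0; i < count; i++)
        mem[i] = xorshift64(&state);
}

/* Regenerate the same sequence from the same seed and compare;
 * returns the number of mismatching words. */
static size_t prng_verify(const volatile uint64_t *mem, size_t count, uint64_t seed)
{
    uint64_t state = seed;
    size_t errors = 0;
    for (size_t i = 0; i < count; i++)
        if (mem[i] != xorshift64(&state))
            errors++;
    return errors;
}
```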
The pseudorandom sequence tests should be considered statistical as opposed to deterministic, in the sense that they will never report a false error, but may not catch all possible errors, due to their pseudorandom nature and typically linear access pattern. An even more complex test uses another pseudorandom number sequence to 'randomize' the memory access pattern itself, so that both the memory address and the value written there are random, but repeatable and thus verifiable; see the sketch below.
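The text above describes driving the addresses with a second pseudorandom sequence; a simpler stand-in that still scatters the accesses while writing every word exactly once is an odd-stride walk over a power-of-two-sized region. This sketch reuses xorshift64() from above, and the function names are again just illustrative:

```c
#include <stdint.h>
#include <stddef.h>

/* Walk the region in a scattered but repeatable order: with 'count' a power
 * of two and 'stride' odd, the index (i * stride) & (count - 1) visits every
 * word exactly once. The data written still comes from the xorshift64 stream,
 * so both the address order and the values are "random" yet verifiable. */
static void prng_fill_scattered(volatile uint64_t *mem, size_t count,
                                uint64_t seed, size_t stride)
{
    uint64_t state = seed;
    for (size_t i = 0; i < count; i++)
        mem[(i * stride) & (count - 1)] = xorshift64(&state);
}

static size_t prng_verify_scattered(const volatile uint64_t *mem, size_t count,
                                    uint64_t seed, size_t stride)
{
    uint64_t state = seed;
    size_t errors = 0;
    for (size_t i = 0; i < count; i++)
        if (mem[(i * stride) & (count - 1)] != xorshift64(&state))
            errors++;
    return errors;
}
```

Any odd stride is coprime with a power-of-two count, so the index map is a bijection: every word gets written once during the fill and checked once during the verify, in the same scattered order.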
The benefit of pseudorandom sequence tests is that the longer you run them, the more confidence a clean result (no errors) gives you. It is also important to note that certain architectures (including x86-64, aka AMD64) have "nontemporal" load and store instructions that bypass all caches; on those, you should test both cached and nontemporal accesses, and preferably even mix them.
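As a rough sketch of what the store side of that could look like with GCC or Clang on x86-64, assuming SSE2 and an 8-byte-aligned buffer (note that on ordinary write-back memory the nontemporal hint mainly applies to stores, so the verify pass still goes through the cache):

```c
#include <stdint.h>
#include <stddef.h>
#include <immintrin.h>   /* _mm_stream_si64, _mm_sfence */

/* Same fill as before, but using nontemporal (cache-bypassing) stores.
 * Reuses xorshift64() from the earlier sketch. */
static void prng_fill_nontemporal(uint64_t *mem, size_t count, uint64_t seed)
{
    uint64_t state = seed;
    for (size_t i = 0; i < count; i++)
        _mm_stream_si64((long long *)&mem[i], (long long)xorshift64(&state));
    _mm_sfence();   /* make the streaming stores globally visible before verifying */
}
```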
For example, when commissioning new server hardware (or my own desktop machines), I habitually run similar memory tests (memtest86 and variants) for at least 48 hours straight (usually over a weekend!). If the hardware does not pass, I cannot rely on it, and will fix it or get something else instead. Before starting a big development effort based on a specific microcontroller, I wouldn't mind testing a few units in parallel for a week or more, just to get a better understanding of their reliability.
Finally, cosmic rays can occasionally flip bits or crash processors even when there is nothing physically wrong with the hardware. So, a single error on a single unit is not indicative of anything by itself, and one must take a rather statistical approach to reliability and robustness here.
(Apologies to everyone who already knows all of this. I just wanted to make sure these things were mentioned in the thread, since this thread pops up in web searches for "ram testing", "bit by bit", and "byte by byte".)