I am developing a straight packet sniffer like tcpdump, with a couple of additional threads that also implement a packet logger, since that is useful for network traffic debugging.
That sounds very much like a producer-consumer setup.
The main problem I've found in such cases is handling memory use sensibly. Logging a packet can take a relatively long time, and packets tend to arrive in bursts, so you do want to buffer them.
Each IP packet can be up to 65,535 bytes long, but typically they are much shorter. Depending on your read/receive mechanism, you will almost always do a memory copy from the receive buffer to a queue item structure (which will include additional fields, a CLOCK_REALTIME timestamp at minimum). In some cases you have a classifier thread that examines each received packet and either discards it or forwards it to a processing queue/heap. (A heap is often useful if the packets have different logging/processing priorities; a priority or arrival-time min-heap is surprisingly powerful for this.)
The memory problem is that since some packets need to be kept longer than others, you get severe memory fragmentation. (In particular, it is important to remember that in most operating systems, chunks of memory up to, say, 120k, allocated via malloc(), are not returned to the OS when free()d, just kept in the allocation pool for that particular process. This exacerbates the memory fragmentation and memory use for a logger that just mallocs and frees each packet.)
There are several approaches used to mitigate this effect. You can use memory "caches" or "slabs" that contain fixed-size units. For example, you could have 1.5k (covering typical Ethernet 1500-byte payload packets), 3k, 6k, and so on, with the earliest unused unit in each slab used first, and a separate index-to-slab-slot mapping, so that a unit can be moved within the slab if necessary to significantly shrink it. (Yes, to access a slab unit you'll end up dereferencing a "pointer" twice, but that cost is acceptable considering the memory use efficiency. IIRC, I first saw this in classic Mac OS, version 7.5, I believe.)
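A bare-bones sketch of one such slab, for a single unit size, might look like the following. The handle table is what makes the double dereference (handle → slot → memory) possible, and freeing compacts the slab by moving the last occupied unit into the hole; everything here is illustrative, and a real implementation would manage multiple slabs of different unit sizes:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SLAB_UNITS 256

struct slab {
    size_t unit_size;             /* e.g. 1536 for typical Ethernet frames */
    int    slot_of[SLAB_UNITS];   /* handle -> slot in 'units', -1 = free */
    int    handle_of[SLAB_UNITS]; /* slot -> handle, needed for compaction */
    int    used;                  /* slots 0..used-1 are densely packed */
    char  *units;                 /* unit_size * SLAB_UNITS bytes */
};

static struct slab *slab_new(size_t unit_size)
{
    struct slab *s = calloc(1, sizeof *s);
    if (!s) return NULL;
    s->unit_size = unit_size;
    s->units = malloc(unit_size * SLAB_UNITS);
    if (!s->units) { free(s); return NULL; }
    for (int i = 0; i < SLAB_UNITS; i++)
        s->slot_of[i] = -1;
    return s;
}

/* Returns a handle (not a pointer!), or -1 if the slab is full. */
static int slab_alloc(struct slab *s)
{
    if (s->used >= SLAB_UNITS) return -1;
    for (int h = 0; h < SLAB_UNITS; h++) {
        if (s->slot_of[h] == -1) {
            s->slot_of[h] = s->used;
            s->handle_of[s->used] = h;
            s->used++;
            return h;
        }
    }
    return -1;
}

/* The double dereference: handle -> slot -> memory. */
static void *slab_get(struct slab *s, int h)
{
    return s->units + (size_t)s->slot_of[h] * s->unit_size;
}

/* Free by moving the last occupied unit into the hole, keeping the
 * slab densely packed so it can be shrunk from the end. */
static void slab_free(struct slab *s, int h)
{
    int slot = s->slot_of[h];
    int last = s->used - 1;
    if (slot != last) {
        memcpy(s->units + (size_t)slot * s->unit_size,
               s->units + (size_t)last * s->unit_size, s->unit_size);
        int moved = s->handle_of[last];
        s->slot_of[moved] = slot;
        s->handle_of[slot] = moved;
    }
    s->slot_of[h] = -1;
    s->used--;
}
```

Because callers only ever hold handles, the compaction in slab_free is invisible to them, which is exactly the property that lets the slab shrink.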
(Many developers have powerful workstations with SSD storage, and often miss how their applications behave on a more typical machine, especially one with a spinny-disk HDD. Their application can become a swap-happy memory hog if nothing happens for long enough for the HDD to spin down, and then a stream of logged packets comes in -- and due to the memory fragmentation, the OS cannot keep up with the memory use and I/O load at the same time, bogging everything down and, in the worst case, invoking the OOM killer... Yuk.)
Compared to the memory management details (to make the logger truly robust), the threads are almost trivial!
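To show just how trivial: the hand-off between the capture (producer) thread and a logger (consumer) thread is a mutex-protected FIFO with a condition variable. This is a minimal sketch with illustrative names, not production code (a real logger would also want a shutdown flag and a bounded queue):

```c
#include <pthread.h>
#include <stddef.h>

struct work {
    struct work *next;
    /* ... packet data, e.g. a slab handle or a packet_item ... */
};

struct queue {
    pthread_mutex_t lock;
    pthread_cond_t  more;
    struct work    *head, *tail;
};

/* Producer side: append and wake one waiting consumer. */
static void queue_push(struct queue *q, struct work *w)
{
    w->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail)
        q->tail->next = w;
    else
        q->head = w;
    q->tail = w;
    pthread_cond_signal(&q->more);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side: block until there is work, then take the head. */
static struct work *queue_pop(struct queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (!q->head)
        pthread_cond_wait(&q->more, &q->lock);
    struct work *w = q->head;
    q->head = w->next;
    if (!q->head)
        q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return w;
}
```

Note the while loop around pthread_cond_wait: condition variables can wake spuriously, so the predicate must always be re-checked.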
If you are implementing a transparent filter/logger, say on a small Linux SBC with two 1GbE interfaces, then surprisingly the memory management becomes simpler, as the underlying protocol is allowed to drop packets due to congestion. (When logging, you really should try to avoid that happening.) Such bridge gadgets are especially useful for analyzing exactly what internet traffic some devices generate, and also for acting as hyper-paranoid firewalls/loggers for experimental networks (say, isolating a subnet with devices that use customized IP stacks, or IP stacks you developed yourself).
This is exactly the kind of task where software development really becomes software engineering. You cannot rely on standards and assumptions about what "should" happen or what almost always happens; your code needs to be careful and robust. For debugging reasons, you need to check for errors even if those errors rarely occur in practice; knowing where reality first diverged from expectations when things go b0rk is important. Otherwise, your service is like a voltmeter that draws a significant current from the circuit it is measuring, affecting the system being logged so badly that the results are completely irrelevant.
On the other hand, this is an excellent opportunity to exercise proper engineering principles, from source control (are you using Git yet?) to unit testing. For example, when you implement the packet management stuff (as header and source files), you can write a test program that stress-tests it using generated packets and measures the memory use/fragmentation and speed. If you later write a new implementation, you can use the same test program to verify that the new one is actually better. Similarly for the thread management and other stuff.
Do not forget to keep your documentation up to date! In the code, the comments should describe your intent, the overall purpose of the code, and not what the code does. We can read the code, so we can tell what it does; but we humans won't know why, unless the developer explains the reasons or describes the algorithm in comments.
Again, if you run into problems or out of ideas on how to implement some specific scenario, do ping me; I'd be happy to help anyone interested in writing robust POSIX C code.