If you only need to run a few 1MB files through it once, does it really matter if it takes 100ms longer and 2MB more RAM, in exchange for hours more work to program it?
Python's built-in CSV parser is particularly useful when interfacing with spreadsheet programs in a generic fashion; DictReader and DictWriter in particular can make short work of many kinds of problems.
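For instance, a minimal sketch of the DictReader/DictWriter workflow (the column names and data here are invented for illustration):

```python
import csv
import io

# A small in-memory CSV standing in for a spreadsheet export.
raw = "name,dept,hours\nAlice,R&D,37.5\nBob,Ops,40\n"

# DictReader keys each row by the header line, so the code reads
# like the spreadsheet, not like column indices.
rows = list(csv.DictReader(io.StringIO(raw)))

# Write back only selected columns; DictWriter handles the quoting,
# and extrasaction="ignore" drops the columns we don't list.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "hours"],
                        extrasaction="ignore")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

That is about as much code as a non-programmer can be asked to tweak: adding or removing a column is a one-line change.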
I do have roughly equivalent functionality in C code I've written myself (and it too handles RFC 4180's oddish double-quote rule: foo "bar" baz is quoted as "foo ""bar"" baz"), but it takes a lot more code; plus, the Python code is simple enough to hand to a new or non-programmer user, and they can make minor alterations to it themselves. Not so with the C code.
Plus, you can add a PyQt5/PySide2 GUI on top of it in just a couple of dozen lines of code, and it'll work as-is on all major OSes; and that code too is simple enough for non-programmers to understand.
That is a perfectly valid niche. I claim so because I've used it that way myself, successfully.
(A lot of web-based services nowadays can export (authorized) parts of their database in CSV format, by the way. A CSV helper program – one that can be run as-is on different OSes – can munge those into a format that makes it vastly easier for not-very-technical users to do their work on the data in a spreadsheet program. And that is not just papering over "lack of knowledge", either; it can apply policy, like anonymizing data, and even do so in a reversible way. Such things are sometimes required by politics or internal guidelines – if required by law, it must be built into the service itself – and being able to implement such quick tooling can make a big difference in day-to-day operations culture. Just remember to add Branding, so your work isn't forgotten the moment it is no longer needed.)
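A minimal sketch of the reversible-anonymization idea: replace a sensitive column with generated tokens and keep the token mapping separately, under access control. The "email" column name and the token format are my inventions for illustration; the real policy would follow whatever your guidelines demand.

```python
import csv
import io

def pseudonymize(reader, writer, column, mapping):
    """Replace `column` with stable tokens; `mapping` makes it reversible."""
    for row in reader:
        original = row[column]
        if original not in mapping:
            mapping[original] = f"user-{len(mapping) + 1:04d}"
        row[column] = mapping[original]
        writer.writerow(row)

# Invented sample data; note the repeated address gets the same token.
raw = ("email,score\n"
       "alice@example.com,7\n"
       "bob@example.com,3\n"
       "alice@example.com,9\n")
src = csv.DictReader(io.StringIO(raw))
out = io.StringIO()
dst = csv.DictWriter(out, fieldnames=src.fieldnames)
dst.writeheader()
mapping = {}
pseudonymize(src, dst, "email", mapping)
print(out.getvalue())
# Whoever holds `mapping` (and only them) can reverse the tokens.
```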
Many classical (as in using potential models, not quantum mechanics) molecular dynamics simulations produce snapshots of the simulated system in a text-based format. When simulating large systems, they can literally spew this data like a firehose. It turns out that with RAID storage systems, the bottleneck isn't the I/O itself, but the I/O parsing routines in the standard C library. I've written parser routines for that case too, working thus far at full I/O rates.
Of course, the proper solution to that was to have the simulators output the data in a more concise format, preferably using as little CPU time as possible. One such format I wrote for Fortran allowed storing the data in binary locally (thus reducing network traffic and speeding up the simulation itself slightly), with Fortran and C tooling that can slice that data (across multiple files) in time and space. No, I wasn't clever enough to name it Tardis. (Although "Istard" would've been even more my style.) The simulator was written in Fortran, and that particular simulation worked with almost 200 million atoms. That's roughly 5 GiB of data per "frame" (a snapshot of the simulation at one moment). I think the dataset, when compressed, did fit on a 2 TiB drive.
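The core idea behind that kind of binary tooling can be sketched in a few lines of Python (the actual tooling was Fortran/C and far more involved; the per-atom record layout and the atom count here are invented for illustration):

```python
import struct
import io

# Fixed-size binary records let a reader seek straight to frame k,
# atom i, with zero parsing of the data in between.
N_ATOMS = 4
REC = struct.Struct("<3d")          # x, y, z as little-endian doubles
FRAME_BYTES = N_ATOMS * REC.size

def write_frame(f, positions):
    for xyz in positions:
        f.write(REC.pack(*xyz))

def read_atom(f, frame, atom):
    # Constant-time slice in "time" (frame) and "space" (atom index).
    f.seek(frame * FRAME_BYTES + atom * REC.size)
    return REC.unpack(f.read(REC.size))

buf = io.BytesIO()                  # stands in for a file on disk
write_frame(buf, [(float(i), 0.0, 0.0) for i in range(N_ATOMS)])
write_frame(buf, [(float(i), 1.0, 0.0) for i in range(N_ATOMS)])
print(read_atom(buf, 1, 2))         # -> (2.0, 1.0, 0.0)
```

Compare that with a text format, where finding frame k means scanning and parsing everything before it.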
Sometimes you get to choose the language you use, and sometimes you don't, because you are maintaining or developing a mature project.
The cost of the time spent developing a solution is often overlooked, but it really is important, and it can cut either way. A funky addition of a couple of hundred lines of Fortran 95 code can make a project possible, or turn it from a month-long research project (per run) into a week-long one; but you might need a semi-crazy programmer to know whether it is even possible, and how. On the other hand, letting the programmer choose their favourite language instead of the most appropriate one may turn a one-day project into a multi-month one, because fun is fun, and sometimes the objective is forgotten when you're having too much fun.