I wonder if in this case a ground-referenced input would work OK. After all, he isn't trying to amplify an audio signal. He just needs to *detect* the presence of a 4 kHz audio carrier that could be sine, square, or whatever.
It makes no sense to design an inferior circuit, especially if a proper design does not require any additional engineering or expensive components. If you want to save on two resistors, then maybe simply reconsider the whole project? A clipped signal contains lots of harmonics. Such a "clipped" circuit could well detect not only 4 kHz but also the 2nd harmonic of a clipped 2 kHz signal.
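To put a rough number on the harmonic concern, here is a minimal sketch. It assumes the 0 V-referenced, single-supply input simply cuts off the negative half of the waveform, which is an idealisation and not a measurement of the actual circuit. It feeds a pure 2 kHz sine through that clipping and measures the 4 kHz content with a Goertzel filter. The sample rate, buffer length, and function names are illustrative choices, not anything from the original design.

```python
import math

def goertzel_amplitude(samples, sample_rate, target_hz):
    """Estimate the amplitude of the target_hz component (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)   # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    power = s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
    return 2.0 * math.sqrt(max(power, 0.0)) / n

SAMPLE_RATE = 48_000   # assumed sample rate, Hz
N = 4_800              # 0.1 s: a whole number of 2 kHz and 4 kHz cycles

# A clean 2 kHz sine, and the same sine with its negative half clipped at 0 V.
sine_2k = [math.sin(2.0 * math.pi * 2000 * i / SAMPLE_RATE) for i in range(N)]
clipped_2k = [max(x, 0.0) for x in sine_2k]

clean_4k = goertzel_amplitude(sine_2k, SAMPLE_RATE, 4000)
clipped_4k = goertzel_amplitude(clipped_2k, SAMPLE_RATE, 4000)

print(f"4 kHz content, clean 2 kHz sine:   {clean_4k:.4f}")
print(f"4 kHz content, clipped 2 kHz sine: {clipped_4k:.4f}")
```

For an ideal half-wave-clipped sine the second harmonic has an amplitude of 2/(3π), about 0.21 of the original peak, so a loud 2 kHz tone could plausibly look like a weak 4 kHz one. Symmetric clipping would produce mainly odd harmonics instead, so the effect depends on the clipping being one-sided.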
I agree biasing the input at 0 V is not optimal, but asking "why not?" is a valid question.
The design could be simplified greatly, especially if the dynamic range doesn't need to be great and the background noise levels are low. If no electronic filtering or AGC is required, it may work with the input referenced to 0V. A Helmholtz resonator could be placed in front of the microphone, to act as a basic filter, then a comparator could convert the signal directly from the mic (no amplifier required) to a nice square wave, which could be counted by the micro-controller.
Unfortunately, I don't think such a minimalist set-up would suit the original poster's requirements.
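For reference, the comparator-and-count idea could be sketched roughly like this. It is a simulation in Python rather than microcontroller firmware, and the function names, the 0.1 s gate time, and the ±200 Hz tolerance are all assumptions made for illustration. There is no hysteresis here, so it presumes a clean signal with low background noise, as described above.

```python
import math

def count_rising_edges(samples, threshold=0.0):
    """Count low-to-high threshold crossings, as a comparator plus counter would."""
    edges = 0
    above = samples[0] > threshold
    for x in samples[1:]:
        now_above = x > threshold
        if now_above and not above:
            edges += 1
        above = now_above
    return edges

def tone_present(samples, sample_rate, target_hz=4000.0, tolerance_hz=200.0):
    """Return True if the counted frequency falls within tolerance of target_hz."""
    gate_seconds = len(samples) / sample_rate
    measured_hz = count_rising_edges(samples) / gate_seconds
    return abs(measured_hz - target_hz) <= tolerance_hz

SAMPLE_RATE = 48_000   # assumed simulation rate, Hz
N = 4_800              # 0.1 s gate time

def make_sine(freq_hz):
    # Small phase offset so that no sample lands exactly on the threshold.
    return [math.sin(2.0 * math.pi * freq_hz * i / SAMPLE_RATE + 0.1)
            for i in range(N)]

print("4 kHz tone detected:", tone_present(make_sine(4000), SAMPLE_RATE))
print("2 kHz tone detected:", tone_present(make_sine(2000), SAMPLE_RATE))
```

A longer gate time gives finer frequency resolution at the cost of a slower response. On real hardware some comparator hysteresis would likely be needed to stop noise from adding spurious counts.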