@BrianHG, how would we get an accurate measurement of the audio envelope with only positive-responding inputs to an ADC in a uC? Especially when audio waveforms are notoriously asymmetrical. I completely agree that here in the 21st century we should do as much as possible in software. But I am not seeing how to do that with your average DC ADC?
And sampling the raw waveform would require a much higher sampling rate to accurately catch short peaks, which would require a faster (more expensive) ADC and/or uC, would it not?
OK, step #1, looking at the source signals: the OP wants to analyze at least 4 stereo sources (line out, headphones out, line in gain, second line in), so that's 8 channels. (OK, you may be forgiven for missing that, since I know this project is designed to fit into his pre-amp with headphone project...)
He wants to save space. How many op-amp/diode/filter bridges would be needed here? I still agree with using 2 low-voltage rail-to-rail quad JFET-input op-amps, biased to 1/2 Vref or VDD, in voltage follower mode, each input fed in series through a DC blocking cap and a 100k or 1 meg resistor.
Step #2, feeding the AC-coupled, mid-rail-biased audio into an 8 channel ADC just means that in software he only needs to subtract the mid-scale bias from each ADC reading and take the absolute value to get positive-only peak info, but we can obviously take this further.
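To make step #2 concrete, here is a rough sketch of the rectification in C. The absolute value has to be taken relative to the 1/2 Vref bias point, not on the raw reading. The names `rectify_sample` and `block_peak` are made up for illustration, and I am assuming a 12 bit ADC with the bias sitting exactly at mid-scale; in a real build you would probably measure the actual DC offset instead.

```c
#include <stddef.h>
#include <stdint.h>

#define ADC_BITS     12                     /* assumed: 12 bit converter */
#define ADC_MIDSCALE (1 << (ADC_BITS - 1))  /* assumed: bias sits at code 2048 */

/* Remove the mid-scale bias and return the magnitude (0..2048). */
static uint16_t rectify_sample(uint16_t raw)
{
    int32_t centered = (int32_t)raw - ADC_MIDSCALE;
    return (uint16_t)(centered < 0 ? -centered : centered);
}

/* Largest rectified value in a block of raw samples from one channel. */
static uint16_t block_peak(const uint16_t *raw, size_t n)
{
    uint16_t peak = 0;
    for (size_t i = 0; i < n; i++) {
        uint16_t mag = rectify_sample(raw[i]);
        if (mag > peak) {
            peak = mag;
        }
    }
    return peak;
}
```

Because both half-cycles are rectified in software, an asymmetrical waveform is handled the same way as a symmetrical one: whichever polarity swings further sets the peak.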
Now to the crux of your question, processing power. Take for example a Microchip ATSAM4S2AB in a 48 pin QFP: it has a 12 bit, 1 MSPS ADC with 8 inputs (definitely fast enough for 8 channels at 44.1 kHz sampling) and runs an ARM Cortex-M4 CPU at 120 MIPS for $3. With this chip, forget silly VU meters, he can run 8 parallel 256-point full spectrum analyzers on that small full color OLED display.
Now, if $3 is too expensive, he can go with Microchip's $1.50 ATSAMD20G14B, but it has a 48 MIPS core and a 350 ksps ADC, so say he can safely sample 6 channels at 44.1 kHz, or 8 channels at 40 kHz, and only produce a 2 channel 256-point FFT spectrum analysis.
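The channel counts above are just a throughput budget on a multiplexed ADC, which can be checked with a few lines of C. This is only a back-of-envelope sketch: `total_sample_rate` and `fits_adc` are illustrative names, and the check ignores mux settling and acquisition time, so treat a "fits" result as an upper bound.

```c
#include <stdint.h>

/* Aggregate conversions per second when one ADC is shared by all channels. */
static uint32_t total_sample_rate(uint32_t channels, uint32_t rate_hz)
{
    return channels * rate_hz;
}

/* Non-zero if the aggregate rate is within the ADC's rated maximum. */
static int fits_adc(uint32_t channels, uint32_t rate_hz, uint32_t adc_max_sps)
{
    return total_sample_rate(channels, rate_hz) <= adc_max_sps;
}
```

For example, 8 channels at 44.1 kHz need 352,800 samples/s, which is comfortable on a 1 MSPS ADC but just over a 350 ksps one. That is why the cheaper part ends up at 6 channels at 44.1 kHz (264,600 samples/s) or 8 channels at 40 kHz (320,000 samples/s).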
It even catches me off guard how much can be achieved today with a $3, 32 bit, 120 MIPS MCU, as I still look for analog simplification in some designs, thinking I only have a few hundred samples/second as a limit. Back in 1988, on my Amiga 1000 with a 7 MHz 16 bit 68000 averaging 4 clocks per instruction and an 8 bit sampler on the parallel port, the best we could do was a mono 64 point real-time FFT bar scope on the display at a 20 kHz sample rate. Today's $3 MCU, with many 32 bit instructions taking 1 clock at 120 MHz, consuming a few mA at 3.3 V, with 128k of flash ROM onboard, would run circles around my old favorite computer.
OK, look, enough said here. I know a $3 48 pin TQFP IC may be too much for some to deal with, as may sampling sound or using a cap as a DC filter to bias the incoming audio at 1/2 VDD. All I know is that the alternative, going with a similarly priced or more expensive, much slower MCU with a slower ADC and working out 8 copies of an analog front end so the MCU samples just the peak signals to drive a display, will take many more parts and more board space and give you less opportunity. The math involved, taking the ADC reading, converting it to an absolute value and doing any log computation to give you a true peak, can be adjusted so easily in software with nothing more than 1 gain figure and 1 decay time value for the meter's drop-down decay speed.
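For what it's worth, here is roughly what that "1 gain figure & 1 decay value" meter could look like in integer-only C. Everything here is an assumption for illustration: the `peak_meter_t` struct, the fixed-point formats, the 12 bit mid-scale bias, and the bar-graph output. Instead of calling a log function, the sketch walks a ladder of roughly 3 dB thresholds (each one 181/256, about 0.707, of the previous), which gives a log-scaled bar without any floating point.

```c
#include <stdint.h>

#define ADC_MIDSCALE 2048u  /* assumed: 12 bit ADC biased at 1/2 Vref */

typedef struct {
    uint32_t gain_q8;    /* linear gain, 256 = unity */
    uint32_t decay_q16;  /* per-sample fall-back multiplier, 65536 = no decay */
    uint32_t level;      /* held peak, in ADC counts after gain */
} peak_meter_t;

/* Feed one raw ADC sample; returns the current held peak. */
static uint32_t meter_update(peak_meter_t *m, uint16_t raw)
{
    int32_t centered = (int32_t)raw - (int32_t)ADC_MIDSCALE;
    uint32_t mag = (uint32_t)(centered < 0 ? -centered : centered);
    uint32_t scaled = (mag * m->gain_q8) >> 8;

    /* Let the held peak fall back, then jump up instantly on a new peak. */
    m->level = (uint32_t)(((uint64_t)m->level * m->decay_q16) >> 16);
    if (scaled > m->level) {
        m->level = scaled;
    }
    return m->level;
}

/* Number of bar segments to light, about 3 dB per segment below full scale. */
static unsigned meter_segments(uint32_t level, uint32_t full_scale,
                               unsigned n_segments)
{
    uint32_t threshold = full_scale;
    unsigned lit = n_segments;
    while (lit > 0 && level < threshold) {
        threshold = (threshold * 181u) >> 8;  /* step down by about 3 dB */
        lit--;
    }
    return lit;
}
```

Changing the meter's sensitivity or fall-back speed is then a matter of editing `gain_q8` or `decay_q16`, with no analog parts to swap. A `decay_q16` closer to 65536 gives a slower drop; the right value depends on the sample rate and the fall time you want.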