I have now analyzed almost all of the SDK jar that was provided to me (apparently the latest version mentioned here). In short, it is more of a demo-level library than a proper SDK. So any improved app would likely NOT want to use the official SDK as is, since it leaves few possibilities for improvement open. This applies at least to version 181027. (I also analyzed the older versions; they do less processing and allow reading the calibrated frames, but are otherwise not much better as an "SDK".)
Here are some notes, with comparisons to the versions from some years ago (from the reverse-engineering thread back then by frenky, mahony & co).
The SDK has gained (possibly some time earlier) 3 separate threads for processing: one that reads frames from the camera, and two that process alternating frames. Nice.
What is not so nice is how they orchestrate between the two processing threads.
As both use a shared array and a Bitmap for output, neither can write into those until it knows the other thread has finished (and the downstream use of those outputs has likely completed). They handle this by sleeping until the other thread has received its next frame, before computing and storing their own output data. That forces a delay of at least one full frame period, even when processing completed quicker. After producing its outputs, the thread sleeps again, long enough to ensure that at least 110 ms (~9 Hz) have passed since the other thread notified listeners of the previous ready frame. Only then does the thread notify listeners that its own outputs are ready for use.
If there is some hiccup during the final output processing, the last step of notifying and marking the completion time of the frame shifts forward, and so does the sleep in the other thread (which enforces that 110 ms minimum). That in turn delays the frame after next, and so on. Thus, if I understood it right, the "lag" from receiving a frame to displaying it will be the larger of (~one frame period) and (~one frame period + the longest hiccup encountered during the last steps of processing).
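To make the propagation concrete, here is a tiny model of the publish rule as I read it (my simplification and my names, not decompiled SDK code):

```java
// Simplified model of the orchestration described above: each thread may
// publish its frame no earlier than 110 ms after the OTHER thread's last
// publish, so a hiccup in one publish shifts every later publish as well.
public final class PacingModel {
    static final long MIN_GAP_MS = 110;

    /** Earliest time (ms) this thread may publish, given when it finished
     *  processing and when the other thread last published. */
    static long publishTime(long doneAt, long otherPublishedAt) {
        return Math.max(doneAt, otherPublishedAt + MIN_GAP_MS);
    }
}
```

With this rule, a thread that finishes at t=10 publishes at t=110; if a hiccup pushes one publish to t=160, the other thread's next publish slips to t=270 instead of t=220, and that 50 ms shift carries forward to every following frame.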
If it had been implemented properly, the lag could be as short as the time it takes to process the received frame, while still ensuring an average 9 Hz frame rate.
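For comparison, here is a sketch of deadline-based pacing (my own code, not from the SDK): each publish slot comes from a fixed schedule rather than from the other thread's last notify, so a one-off hiccup does not push every later frame back.

```java
// Minimal sketch: throttle output to a target period (e.g. 110 ms for ~9 Hz)
// against an absolute schedule, so a late frame does not delay later ones.
public final class FramePacer {
    private final long periodNanos;
    private long nextDeadline;   // absolute time the next frame may be published

    public FramePacer(long periodMillis) {
        this.periodNanos = periodMillis * 1_000_000L;
        this.nextDeadline = System.nanoTime();
    }

    /** Blocks (if needed) until the next slot, then schedules the one after. */
    public void awaitSlot() {
        long wait = nextDeadline - System.nanoTime();
        if (wait > 0) {
            try {
                Thread.sleep(wait / 1_000_000L, (int) (wait % 1_000_000L));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
        // Advance from the deadline, not from "now": an occasional late frame
        // leaves the schedule intact, keeping the average rate at the target.
        // If we are badly behind, reset to "now" instead of bursting.
        nextDeadline = Math.max(nextDeadline + periodNanos, System.nanoTime());
    }
}
```

A worker would process its frame as fast as it can, call `awaitSlot()`, and then notify listeners; display lag is then bounded by the processing time itself rather than by the other thread's notify timing.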
(I have not actually run/debugged it, so I could have misunderstood something.)
The amount of processing has increased a lot since the versions from a couple of years back. E.g. dead-pixel filling is now done on two separate sets of pixel data (i.e. two big tasks per frame). They seem to have added noise reduction and maybe edge enhancement (or some such), both of which run at all times and include two relatively heavy full-frame calculations with a 3x3 filter kernel (named bilateral and gaussian blur). (Note: while heavy, the algorithms are quite simple compared to, e.g., what a decent photo/image editing program would do on a PC.) Both algorithm implementations are quite (read: "horribly") unoptimized. (And, assuming they use the same code in their latest app, a bug I found in the code might explain the weird "leakage" effect between the left and right edges.)
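As for the left/right edge "leakage", my guess is an indexing bug of roughly this kind (illustration only; the method names and exact code here are mine, not decompiled from the SDK): clamping the flat index of a row-major array instead of the column makes column 0 read the previous row's last pixel.

```java
// Hypothetical demo of how a 3x3 filter over a flattened row-major image
// can "leak" between the left and right edges: the buggy version clamps
// the linear index, so x==0 fetches the previous row's rightmost pixel.
public final class EdgeLeakDemo {
    /** Buggy left-neighbour fetch: wraps across row ends. */
    static int leftBuggy(int[] img, int x, int y, int w) {
        int i = y * w + x;
        return img[Math.max(i - 1, 0)];   // x==0 -> previous row's last pixel
    }

    /** Fixed: clamp the column, not the flat index. */
    static int leftFixed(int[] img, int x, int y, int w) {
        return img[y * w + Math.max(x - 1, 0)];
    }
}
```

On a 4-pixel-wide image, the buggy fetch for the first pixel of row 1 returns the last pixel of row 0, which is exactly the kind of cross-edge bleed the app shows.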
(Since the bilateral mask table has public access and the gaussian mask table has package access, it might be possible to reduce their effects by manipulating those tables, but some of the code uses hard-coded coefficients, and the amount of processing would not change, so ...)
(This also means my statement in another thread about the app having no noise removal is likely incorrect. Need to go correct that one...)
Some parts of the AGC seem to be calculated for all frames, whether AGC is enabled or not (EDIT: it seems the AGC evaluation and scaling are always done, whether the related setting is true or false; if AGC is disabled, it appears to only do some temperature-based limiting at the min/max ends). The conversion to temperature values is always calculated for the full frame, not just for the selected pixels/area (as was done a couple of years ago). For the conversion they have added an option to use a table lookup instead of math, but the number of ops saved is minuscule because of all the scaling of large numbers still going on. At least the table version doesn't do a square root. Which way it is calculated depends on the unexplained "recog" configuration values.
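To illustrate the lookup-versus-math point (hypothetical constants, formula, and names throughout; the SDK's actual conversion is not reproduced here): the table removes the per-pixel square root, but if each pixel is still scaled and offset around the lookup, most of the per-frame work remains.

```java
// Illustration only: a made-up raw-to-temperature conversion, done either
// with per-pixel math (including a sqrt) or with a precomputed lookup table.
public final class TempLut {
    static final int RAW_RANGE = 1 << 14;            // assumed 14-bit raw counts
    static final float[] LUT = new float[RAW_RANGE];
    static {
        for (int raw = 0; raw < RAW_RANGE; raw++) {
            LUT[raw] = mathVersion(raw);             // precompute once
        }
    }

    /** Hypothetical math path: one sqrt per pixel. */
    static float mathVersion(int raw) {
        return (float) (Math.sqrt(raw) * 0.5 - 40.0);
    }

    /** Lookup path: the sqrt is gone, but the per-pixel scaling around the
     *  lookup (reportedly still present in the SDK) is not. */
    static float lutVersion(int raw, float gain, int offset) {
        int idx = Math.min(RAW_RANGE - 1, Math.max(0, (int) (raw * gain) + offset));
        return LUT[idx];
    }
}
```

With identity scaling the two paths agree exactly, which is the whole point of the table: the savings come only from dropping the sqrt, not from the surrounding arithmetic.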
And then there is a bunch of misc stuff, like unused framerate-calculation code, pointless array initializations, etc. etc.