I often write notes in a project journal to keep a history of what I am thinking and doing. Here is an example from a fairly recent piece of code that converts binary to decimal on the CH32V003, which has no multiply, divide, or DAA instruction.
Here are my journal notes:
Thinking about binary to decimal conversion on the non-M RISC-V MCUs (such as
the CH32V003), the usual way of using DAA and the full and half carries is not
going to work here.
Going back to the last library for the STM8, one function does the DAA in
multiple instructions. To start, it assumes that the DAA add-6 will be necessary
and subtracts it back out if it turns out not to be. Also, this is done before
the left shift, so actually 0x33 is added so that the 0-9 BCD digits will stay
in the 4 bit range. Also, adding the 3 to the 0-9 BCD value will set bit-3
(value of 8 or higher) when the DAA is needed (originally 5-9). We can use this
bit test to decide if the DAA is needed.
Idea: In the RISC-V code, we can add 0x33333333 to the 32-bit BCD result
register, assuming that all 8 digits will need DAA. Then check the high bit
of each BCD digit and undo the add if the bit is clear (original digit was
0 to 4). What if we AND the result with 0x88888888 to get the DAA decision
bits, XOR to invert, shift right by 2 to get correction*2, shift right by 1
more to get correction*1, then subtract both corrections to get a conditional
subtract 3 for each BCD digit?
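The correction idea above can be tried out in plain C before committing to assembly. This is only a sketch of that one step: the function name `bcd_adjust8` is made up for illustration, and it assumes every nibble of the input already holds a valid BCD digit (0 to 9).

```c
#include <stdint.h>

/* Pre-shift BCD adjust for eight packed digits at once.
 * Assumption: each nibble holds 0..9 on entry.
 * Digits 0..4 come back unchanged, digits 5..9 come back as digit + 3. */
static uint32_t bcd_adjust8(uint32_t bcd)
{
    const uint32_t add3 = 0x33333333u;
    const uint32_t bit3 = 0x88888888u;

    bcd += add3;                          /* assume every digit needs +3 */
    uint32_t undo = (bcd & bit3) ^ bit3;  /* 8 in each nibble whose bit 3 is clear */
    bcd -= undo >> 2;                     /* take back 2 ... */
    bcd -= undo >> 3;                     /* ... and 1 more, 3 in total */
    return bcd;
}
```

No nibble can borrow from its neighbor here, because a digit that gets the 3 taken back holds 3 to 7 at that point. A left shift by one after this step doubles every digit with the decimal carry already in place: a 5 becomes 8, and 8 shifted left is 0x10, which reads as BCD 10.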
RV32E has T0-T2 and A0 to A5 available. None of these need to be saved.
Since the BCD DAA is done in parallel, this will be the main conversion
function, and will serve wrapper functions with any number of input bits
and ASCII digits. Note that the binary input must be left aligned.
The "remove leading zero" function will be separate.
Also note that with only 8 ASCII digits, a full 32-bit conversion is not
possible with this code (limited to < 100 million).
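The separate "remove leading zero" step mentioned above is not shown in these notes. One possible shape for it, purely as an assumption, is a helper that leaves the digit buffer alone and returns a pointer to the first significant digit. The name `skip_leading_zeros` is hypothetical.

```c
/* Hypothetical helper, not part of the original code.
 * Returns a pointer to the first significant digit of a NUL-terminated
 * decimal string. At least one digit is always kept, so "0000" gives "0". */
static char *skip_leading_zeros(char *s)
{
    while (s[0] == '0' && s[1] != '\0')
        s++;
    return s;
}
```

Returning a pointer avoids moving any bytes, which keeps the helper small on a part with very little flash. A version that overwrites the zeros with spaces would suit right-aligned display fields better.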
For a binary to decimal conversion, this only needs 32+8 loops, and is probably
a lot faster than calling an integer divide (and remainder) function ten times
and then converting to ASCII and storing. YES! If the RISC-V MCU has a hardware
divide, that would be a better choice. But the CH32V003 does not, so this has
to be done with simple opcodes.
And here is the code in question:
/******************************************************************************
*
* Binary (up to 32 bits) to ASCII (up to 8 digits)
*
* in: binary, digit buffer, bit count, digit count
*/
void binx_decx(uint32_t bin, char *dec, int bits, int digits)
{
asm volatile(
"li t0, 32 \n"
"sub t0, t0, %2 \n" /* 32 minus bit count. */
"sll %0, %0, t0 \n" /* Left justify bits. */
"li t0, 0 \n" /* BCD digits */
"li t1, 0x33333333 \n" /* DAA add (pre-shift) */
"li t2, 0x88888888 \n" /* DAA correction bit mask */
"j 2f \n"
/* Do the left shift and DAA */
"1: \n"
"c.add t0, t1 \n" /* assume DAA */
"and a5, t0, t2 \n" /* get correction bits */
"xor a5, a5, t2 \n" /* invert correction bits */
"srli a5, a5, 2 \n" /* correction *2 */
"sub t0, t0, a5 \n"
"srli a5, a5, 1 \n" /* correction *1 */
"sub t0, t0, a5 \n"
"slli t0, t0, 1 \n"
/* Add the high binary bit to the BCD */
"2: \n"
"srli a5, %0, 31 \n"
"c.add t0, a5 \n" /* add 0 or 1 */
/* Shift in the next binary bit */
"slli %0, %0, 1 \n"
/* Loop until done */
"c.addi %2, -1 \n"
"bnez %2, 1b \n"
/* All BCD digits are done now. */
"c.add %1, %3 \n" /* Start with lowest digit. */
"sb zero, (%1) \n"
"3: \n"
"c.addi %1, -1 \n"
"andi t1, t0, 15 \n" /* Get BCD digit. */
"addi t1, t1, '0' \n" /* Make ASCII. */
"sb t1, (%1) \n"
"srli t0, t0, 4 \n" /* Rotate in next BCD digit. */
"c.addi %3, -1 \n"
"bnez %3, 3b \n"
: "+r" (bin), "+r" (dec), "+r" (bits), "+r" (digits) /* All four are modified. */
: /* No input-only operands. */
: "t0", "t1", "t2", "a5", "memory" /* The sb stores write to the buffer. */
);
}
/*
INPUTS:
A0: binary input
A1: pointer for ASCII decimal result
A2: number of bits to process (loop counter)
A3: number of BCD digits to process (loop counter)
Working registers:
T0: BCD building register
T1: 0x33333333
T2: 0x88888888
A5: BCD correction register
*/
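For checking the results on a host machine, the same steps can be modeled in portable C. This is a sketch and not the code that runs on the MCU: the name `binx_decx_model` is made up, and it assumes 1 <= bits <= 32, 1 <= digits <= 8, a buffer of at least digits + 1 bytes, and a value that fits in the requested number of digits.

```c
#include <stdint.h>

/* Host-side model of binx_decx(): same arguments, plain C.
 * Assumptions: 1 <= bits <= 32, 1 <= digits <= 8, dec has room for
 * digits + 1 bytes, and the value fits in the requested digit count. */
static void binx_decx_model(uint32_t bin, char *dec, int bits, int digits)
{
    uint32_t bcd = 0;

    bin <<= 32 - bits;                     /* left justify the input bits */
    for (int i = 0; i < bits; i++) {
        if (i != 0) {
            /* conditional +3 per digit, then double the BCD value */
            bcd += 0x33333333u;
            uint32_t undo = (bcd & 0x88888888u) ^ 0x88888888u;
            bcd -= undo >> 2;
            bcd -= undo >> 3;
            bcd <<= 1;
        }
        bcd += bin >> 31;                  /* bring in the next binary bit */
        bin <<= 1;
    }

    dec += digits;
    *dec = '\0';                           /* terminate the string */
    while (digits-- > 0) {                 /* lowest digit first, back to front */
        *--dec = (char)('0' + (bcd & 15u));
        bcd >>= 4;
    }
}
```

Calling `binx_decx_model(255, buf, 8, 3)` leaves the string 255 in `buf`, and a value of 7 with the same bit and digit counts gives 007, leading zeros included. Comparing this model against the assembly over a range of inputs is one way to test the MCU code without stepping through it in a debugger.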