I had to take a break from this project to work on other things, but I'm back now.
Message Queues

We need a way for the M4 and M7 processors to talk to each other, and for now that will be done via a pair of message queues: one for sending data M4->M7 and the other for M7->M4. We can use the hardware semaphore mechanism to control access so that only one processor is reading or writing a queue at a time.
files:
messageQueue.h,
messageQueue.cpp

Each queue can contain multiple Message objects. A Message is just a 32 bit header divided into a 16 bit MessageID enum that tells us what kind of data is being sent and a 16 bit data length field stating how many bytes are being sent (including the header). Just to bound the problem a bit, the max Message size is 0x600 (1536) bytes and the MessageQueue is sized to hold multiple max-length Messages.
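To make that layout concrete, here is a sketch of what the header and size constants amount to. MessageID and the 0x600 cap come from the description above; everything else (the example IDs, MQ_MAX_MESSAGE_SIZE, the multiple of four) is my own placeholder:

#include <cstdint>

// Sketch of the Message header described above. The example IDs and the
// queue-size multiple are placeholders, not necessarily the project's values.
enum MessageID : uint16_t {
    MSG_BLINK_LED  = 0,
    MSG_DEBUG_TEXT = 1,
};

struct Message {
    MessageID id;     // what kind of data is being sent
    uint16_t  length; // total bytes in the message, including this 4 byte header
    // payload bytes follow the header in the queue buffer
};

static constexpr uint32_t MQ_MAX_MESSAGE_SIZE   = 0x600;                   // 1536 bytes
static constexpr uint32_t MQ_MESSAGE_QUEUE_SIZE = 4 * MQ_MAX_MESSAGE_SIZE; // "multiple max-length Messages"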
The MessageQueues track how many unread messages they contain, how many bytes are used, and the max values either of those fields have ever seen. Data is read FIFO from a circular buffer. Finally, each MessageQueue is initialized with the ID of the hardware semaphore that controls access to its buffer:
struct MessageQueue {
    uint32_t pendingMessages;    // the number of messages in the queue waiting to be processed
    uint32_t maxPendingMessages; // the largest number of pending messages ever in the queue at once
    uint32_t bytesInQueue;       // number of bytes currently in the queue
    uint32_t maxBytesInQueue;    // the largest number of bytes ever contained in the queue
    uint32_t head;               // buffer index where the next byte should be written
    uint32_t tail;               // buffer index where the next byte should be read
    HSEM_ID  hsemID;             // hardware semaphore controlling access to this queue
    uint8_t  buffer[MQ_MESSAGE_QUEUE_SIZE]; // the queue data buffer
};
The way the MessageQueues get created feels a bit janky to me. First, I go to the linker file for each core and carve out part of the SRAM4 memory that is accessible to both processors, then export _sram4_mq as the start of that address space:
MEMORY
{
    FLASH (RX)     : ORIGIN = 0x08100000, LENGTH = 1M
    RAM_D2 (RWX)   : ORIGIN = 0x10000000, LENGTH = 288K
    RAM_D3 (RWX)   : ORIGIN = 0x18000000, LENGTH = 64K
    SRAM4 (RWX)    : ORIGIN = 0x38000000, LENGTH = 32K
    SRAM4_MQ (RWX) : ORIGIN = 0x38008000, LENGTH = 32K /* reserve half of SRAM4 for message queue */
}
_estack = ORIGIN(RAM_D2) + LENGTH(RAM_D2); /* 0x10048000 */
_sram4_mq = ORIGIN(SRAM4_MQ); /* 0x38008000 */
Next, each processor imports _sram4_mq and then decides that an array of MessageQueue objects lives there:
extern void* _sram4_mq;                               // symbol defined in the linker script
static MessageQueue* mq = (MessageQueue*)&_sram4_mq;  // treat that address as an array of MessageQueues
Then there is a MessageQueueID enum that is zero for the M4->M7 queue and one for M7->M4; that value is used to index our imaginary array of MessageQueues living at the _sram4_mq address. This feels like a strange way to set things up and I'm very open to other ideas, but it does have the advantage that each processor agrees on exactly where the MessageQueues live in memory, and the linker for each processor knows not to put anything else in that memory range.
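For concreteness, something like the following is how I read that. Only MessageQueueID and its 0/1 values come from the description; the member and function names are placeholders:

enum MessageQueueID : uint32_t {
    MQ_M4_TO_M7 = 0, // placeholder names; the 0/1 values are what matters
    MQ_M7_TO_M4 = 1,
};

static MessageQueue* getQueue(MessageQueueID qid) {
    return &mq[qid]; // index into the shared array at _sram4_mq
}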
The MessageQueue send operation spin-waits until it can obtain a hardware semaphore lock on the queue, then writes the message header and data into the buffer, increments the number of unread messages, updates the byte count, and finally unlocks the queue and exits.
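A minimal sketch of what send could look like, assuming the STM32 HAL HSEM calls (HAL_HSEM_FastTake / HAL_HSEM_Release) and the byte-at-a-time copy mentioned at the end of this post; the function name and the full-queue check are my additions:

#include "stm32h7xx_hal.h" // for HAL_HSEM_FastTake / HAL_HSEM_Release

// Sketch only: spin for the hardware semaphore, copy the message (header plus
// payload) into the circular buffer one byte at a time, update the counters,
// then release the lock.
bool mqSend(MessageQueueID qid, const uint8_t* msg, uint16_t length) {
    MessageQueue* q = &mq[qid];

    while (HAL_HSEM_FastTake(q->hsemID) != HAL_OK) { } // spin until we own the queue

    if (q->bytesInQueue + length > MQ_MESSAGE_QUEUE_SIZE) {
        HAL_HSEM_Release(q->hsemID, 0); // queue is full, give the lock back
        return false;
    }

    for (uint16_t i = 0; i < length; i++) { // byte-at-a-time copy for now
        q->buffer[q->head] = msg[i];
        q->head = (q->head + 1) % MQ_MESSAGE_QUEUE_SIZE;
    }

    q->pendingMessages++;
    q->bytesInQueue += length;
    if (q->pendingMessages > q->maxPendingMessages) q->maxPendingMessages = q->pendingMessages;
    if (q->bytesInQueue > q->maxBytesInQueue) q->maxBytesInQueue = q->bytesInQueue;

    HAL_HSEM_Release(q->hsemID, 0); // ProcessID 0 matches the FastTake
    return true;
}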
The MessageQueue read operation is pretty similar - spin-wait until it gets a hardware semaphore lock, copy the header and data into a private buffer, decrement the number of unread messages and update the byte count, then release the lock and exit.
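The corresponding read sketch, under the same assumptions plus little-endian byte order for the length field:

// Sketch only: lock, peek at the length field (the second 16 bits of the
// header), copy the whole message into the caller's private buffer, then
// update the counters and unlock.
bool mqRead(MessageQueueID qid, uint8_t* out) {
    MessageQueue* q = &mq[qid];

    while (HAL_HSEM_FastTake(q->hsemID) != HAL_OK) { } // spin until we own the queue

    if (q->pendingMessages == 0) {
        HAL_HSEM_Release(q->hsemID, 0); // nothing to read
        return false;
    }

    // length counts the header too, so this is the total number of bytes to copy
    uint16_t length = (uint16_t)(q->buffer[(q->tail + 2) % MQ_MESSAGE_QUEUE_SIZE]
                               | (q->buffer[(q->tail + 3) % MQ_MESSAGE_QUEUE_SIZE] << 8));

    for (uint16_t i = 0; i < length; i++) {
        out[i] = q->buffer[q->tail];
        q->tail = (q->tail + 1) % MQ_MESSAGE_QUEUE_SIZE;
    }

    q->pendingMessages--;
    q->bytesInQueue -= length;

    HAL_HSEM_Release(q->hsemID, 0);
    return true;
}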
Finally, there is a method that quickly checks the number of unread messages without waiting for a lock, so you know whether you need to do a read at all.
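That check presumably reduces to a single volatile read, something like:

// Sketch only: a single aligned 32 bit read is atomic on these cores, so no
// semaphore is needed just to see whether anything is waiting.
uint32_t mqPending(MessageQueueID qid) {
    return *(volatile uint32_t*)&mq[qid].pendingMessages;
}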
I tested the queue in both directions with some dummy Message types that let one processor ask the other to blink an LED or send some text out the debug port; it ran long enough to wrap around the data buffer multiple times with no problems. There are plenty of improvements to be made (switching from reading/writing one byte at a time to 32 bit transfers is the obvious one) but it's good enough for now.
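Pulling the sketches above together, the blink test boils down to something like this (again, all of these names are my placeholders):

// Hypothetical round trip: the M4 asks the M7 to blink an LED.
Message hdr = { MSG_BLINK_LED, sizeof(Message) }; // header only, no payload
mqSend(MQ_M4_TO_M7, (const uint8_t*)&hdr, hdr.length);

// ...and on the M7 side, polled from the main loop:
alignas(4) uint8_t rx[MQ_MAX_MESSAGE_SIZE];
if (mqPending(MQ_M4_TO_M7) > 0 && mqRead(MQ_M4_TO_M7, rx)) {
    // dispatch on ((Message*)rx)->id
}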