bin files are indeed a continuous memory dump without segmentation, and they don't have to start at 0x0. PIC32s classically have this problem too because of their bootloader FLASH: if you try to make a bin with the default GCC tools, you'll get one that's potentially 1GB in size. There is even a dedicated elf2hex executable in XC32, because hex does allow data to be placed at arbitrary memory addresses.
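To put a number on how big such a dump gets, here is a rough sketch of the arithmetic. The struct and function names are made up for illustration, and the layout is only an example: 16 KiB of code at the STM32 FLASH base 0x08000000 plus 256 bytes of load data that ended up at the SRAM base 0x20000000.

```c
#include <stddef.h>
#include <stdint.h>

/* One loadable region of the image: where it starts and how long it is.
   (Hypothetical type, for illustration only.) */
struct segment {
    uint32_t addr;
    uint32_t size;
};

/* Size of a flat .bin that has to cover every segment: the span from the
   lowest start address to the highest end address, gaps included. */
uint64_t flat_image_size(const struct segment *segs, size_t count)
{
    if (count == 0)
        return 0;

    uint64_t lowest = segs[0].addr;
    uint64_t highest = (uint64_t)segs[0].addr + segs[0].size;

    for (size_t i = 1; i < count; i++) {
        uint64_t start = segs[i].addr;
        uint64_t end = start + segs[i].size;
        if (start < lowest)
            lowest = start;
        if (end > highest)
            highest = end;
    }
    return highest - lowest;
}

/* Example layout (assumed): code in FLASH, plus a small block of
   load data at an SRAM address. */
const struct segment example_segments[] = {
    { 0x08000000u, 0x4000u }, /* 16 KiB in FLASH */
    { 0x20000000u, 0x100u },  /* 256 bytes at the SRAM base */
};
```

With only the FLASH segment the image is 0x4000 bytes. Adding the 256 bytes at the SRAM address stretches the span to 0x18000100 bytes, roughly 384 MiB, nearly all of it padding.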
Back to ARM. Here is part of a linker script (.ld) for an STM32L431 that I picked (note: this is ST stock code):
/* used by the startup to initialize data */
_sidata = LOADADDR(.data);
/* Initialized data sections goes into RAM, load LMA copy after code */
.data :
{
. = ALIGN(8);
_sdata = .; /* create a global symbol at data start */
*(.data) /* .data sections */
*(.data*) /* .data* sections */
. = ALIGN(8);
_edata = .; /* define a global symbol at data end */
} >RAM AT> FLASH
/* Uninitialized data section */
. = ALIGN(4);
.bss :
{
/* This is used by the startup in order to initialize the .bss section */
_sbss = .; /* define a global symbol at bss start */
__bss_start__ = _sbss;
*(.bss)
*(.bss*)
*(COMMON)
. = ALIGN(4);
_ebss = .; /* define a global symbol at bss end */
__bss_end__ = _ebss;
} >RAM
As you can see, .data is initialized, and the ">RAM AT> FLASH" part specifies that its initial RAM contents are stored in FLASH. .bss, however, is uninitialized, so it has no copy in FLASH. The startup code for loading the initial values from FLASH to RAM looks something like:
CopyDataInit:
ldr r3, =_sidata /* <<< FLASH source address pointer */
ldr r3, [r3, r1]
str r3, [r0, r1]
adds r1, r1, #4
LoopCopyDataInit:
ldr r0, =_sdata /* <<< RAM destination start */
ldr r3, =_edata /* <<< RAM destination end */
adds r2, r0, r1
cmp r2, r3
bcc CopyDataInit
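If C is easier to read than assembly, the loop above does roughly the following. This is only a sketch: the function name and parameter names are mine, and on the target the three pointers would come from the linker symbols _sidata, _sdata and _edata.

```c
#include <stdint.h>

/* Copy the initial values word by word from their load address (FLASH)
   to their run address (RAM), like the assembly loop does. */
void copy_init_data(const uint32_t *load_src, uint32_t *run_start,
                    const uint32_t *run_end)
{
    uint32_t *dst = run_start;

    while (dst < run_end) {
        *dst++ = *load_src++;
    }
}

/* On the target the call would look something like this (assumed usage):
 *
 *   extern uint32_t _sidata, _sdata, _edata;
 *   copy_init_data(&_sidata, &_sdata, &_edata);
 */
```

Note that the end pointer is exclusive, which matches how _edata is defined after the last .data byte in the linker script.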
So for your problem, the easiest 'fix' I guess is to relocate the buffers and not preinitialize them in C (no value assignments). This prevents the linker from putting data there. When you start using them, you could initialize them with memset().
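A minimal sketch of that approach, assuming a 256-byte receive buffer. The buffer name, its size and the section name in the comment are all made up. Depending on the toolchain, the output section in the linker script may also need the NOLOAD type so that nothing is emitted for it.

```c
#include <stdint.h>
#include <string.h>

#define RX_BUFFER_SIZE 256u

/* No initializer, so there are no initial values for the linker to store
   in the image. On the target this would also carry a placement
   attribute, e.g. __attribute__((section(".ram2_noinit"))), where
   ".ram2_noinit" is a hypothetical section name that the linker script
   would have to map to the other RAM region. */
uint8_t rx_buffer[RX_BUFFER_SIZE];

/* Give the buffer a known state at run time, before its first use. */
void rx_buffer_init(void)
{
    memset(rx_buffer, 0, sizeof rx_buffer);
}
```

Calling rx_buffer_init() once early in main(), before anything touches the buffer, replaces what the startup copy loop would otherwise have done.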
If you do need those buffers preloaded with initial values, you'll probably have to take note of the FLASH address of the data in the generated image (a symbol such as `_sidata`), use a similar "AT> FLASH" construct for that section, record the destination start/end RAM addresses, and write your own startup memcpy loop.
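One way to organise such a copy step, if you end up with more than one preloaded section, is a small table of regions. This is a sketch under assumptions: the struct, the function and the symbol names in the comment (_siram2data, _sram2data, _eram2data) are hypothetical and would have to match whatever your own linker script defines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One preloaded section: where its image sits in FLASH and where it runs. */
struct copy_region {
    const uint8_t *load_addr; /* FLASH address of the stored values */
    uint8_t *run_start;       /* first RAM byte of the section */
    uint8_t *run_end;         /* one past the last RAM byte */
};

/* Copy every region in the table from its load address to its run address. */
void copy_regions(const struct copy_region *regions, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        size_t len = (size_t)(regions[i].run_end - regions[i].run_start);
        memcpy(regions[i].run_start, regions[i].load_addr, len);
    }
}

/* On the target the table would be filled from linker symbols
 * (hypothetical names):
 *
 *   extern uint8_t _siram2data, _sram2data, _eram2data;
 *   const struct copy_region regions[] = {
 *       { &_siram2data, &_sram2data, &_eram2data },
 *   };
 *   copy_regions(regions, 1);
 */
```

This has to run before any code reads the preloaded buffers, so the natural place is right after the stock .data copy in the startup code.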
To be honest it's not hard, but it's a bit fiddly to get right.
For that STM32L431 example, I had modified the startup script to move the program code (sections .text and .rodata) from FLASH to RAM1: the code is copied over at startup and then booted from that address. By telling the linker that it must link addresses for RAM1 but store the data in FLASH, all memory loads, program jumps, etc. are also correctly linked for RAM1. By running code from SRAM, I was able to power off FLASH on the chip and save a considerable amount of power.