Author Topic: Compiling a program (sections)  (Read 2256 times)

Offline nForce (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 393
  • Country: ee
Compiling a program (sections)
« on: January 24, 2019, 10:13:32 am »
What are sections: https://www.ele.uva.es/~jesus/hardware_empotrado/Compiler.pdf ?

So, if I understand correctly, the compiler splits the whole source code into sections and places them in RAM?
 

Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 22307
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: Compiling a program (sections)
« Reply #1 on: January 24, 2019, 10:51:19 am »
Well, sort of?  What do you mean by "code"?

You should read up on linking, the process of taking code and data objects and pasting them together into a final executable image/file.  This is the final step, after the compiler is done with its work, and sections are very important here!

.text is the machine code, i.e., the compiled instructions resulting from program statements and all that.  The rest are where variables, arrays, data structures -- any data in general -- get allocated.

There are different data sections, because some shortcuts can be taken.  For example, .bss is all zeroes (on most platforms), so the memory can be wiped on program startup, and that's that -- no need to store a huge pile of zeroes.

Everything else, i.e. anything initialized to a nonzero value, has to be stored in the executable file (that's .data), so that those values can be copied into place at startup.

Say your code has a series of variable declarations, e.g.
Code: [Select]
int foo, bar = 1, baz;
char b = 0, c, d = 0x40;
In general, you can probably expect that foo, baz, c, and maybe b will be placed in .bss in the order they were declared, and that bar, d, and maybe b will be placed in .data in that order.  (b is explicitly initialized to zero, so which section it lands in depends on the compiler.)  Always verify this if you must rely on it, but please don't actually rely on it.  What you can't expect is that the variables will ALL be allocated in the order given, because that would be wasteful: the .bss zeroing is a simple for() loop in the compiler's startup code, or is run by the OS.  It's not going to zero random bytes in a patchwork section, and there's no savings to be had by throwing all the zeroes and initializers together.
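A minimal sketch of the above, using the same declarations.  The helper names (zero_objects_read_as_zero, print_addresses) are made up for illustration, and the addresses printed will differ between compilers and targets.  The only thing C itself guarantees is that objects with static storage and no initializer start out as zero; which section they land in is up to the toolchain.

```c
#include <stdio.h>

/* Same declarations as above, at file scope, so they have static storage. */
int foo, bar = 1, baz;
char b = 0, c, d = 0x40;

/* Objects without an initializer (and those initialized to 0) read as
   zero at startup; that is what allows a toolchain to put them in .bss. */
int zero_objects_read_as_zero(void)
{
    return foo == 0 && baz == 0 && b == 0 && c == 0;
}

/* Print the addresses so the grouping can be inspected by eye.
   Nothing about the ordering is guaranteed by the language. */
void print_addresses(void)
{
    printf("likely .bss : foo=%p baz=%p c=%p b=%p\n",
           (void *)&foo, (void *)&baz, (void *)&c, (void *)&b);
    printf("likely .data: bar=%p d=%p\n", (void *)&bar, (void *)&d);
}
```

Calling print_addresses() from main() and comparing against the linker's map file (or the output of a tool like nm) is the "always verify" step.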

The run-time function of sections depends on the targeted architecture.  For example, on platforms with memory protection, data sections will typically be tagged as no-execute (so that an attacker can't copy a buffer into the data section and have it executed as code), and code sections will typically be tagged as executable but not writable (data reads from code are usually still allowed).

Some platforms take this further: Harvard architectures, like AVR, fundamentally do not have* a common memory area: they have separate buses for code and data.  The CPU can't read data as code, period, because it's not physically wired to!

*Except they of course do, because it would be a massive wasted opportunity not to.  On AVR, this is the LPM instruction (load program memory).  The downside is, C assumes a flat (Von Neumann) architecture, so when you declare and access variables stored in this way, you have to use macros to do it, which then compile to the correct addresses and instructions.  It's messier than just having sections tagged and letting the OS set memory protections.
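A small sketch of what that looks like with avr-libc, whose <avr/pgmspace.h> provides PROGMEM and pgm_read_byte().  The table and the function name square_of are made up for illustration, and the #else branch is only a stand-in I've added so the same file also builds on a flat-memory host; it is not part of avr-libc.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>   /* avr-libc: PROGMEM, pgm_read_byte() */
#else
/* Host stand-in (assumption, for illustration only): on a flat-memory
   machine an ordinary load does the job. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

/* Lookup table kept in program memory (flash) on AVR instead of RAM. */
static const uint8_t squares[] PROGMEM = { 0, 1, 4, 9, 16, 25 };

/* Read one entry; on AVR the macro expands to an LPM-based access. */
uint8_t square_of(uint8_t i)
{
    return pgm_read_byte(&squares[i]);
}
```

Writing squares[i] directly would compile on AVR too, but it would read from the data address space at that numeric address, which is the mess described above.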

Harvard architectures are a bit of a special case, with memory protection being the general case, where any physical memory can be mapped to any logical address, and tagged as any type (code, read-only, read-write..).

On older platforms (like 8086), sections would be divided according to memory segmentation requirements and the memory model specified to the linker.  The 8086 is a real-mode processor: all memory can be trampled by any program.  (The only way multiple programs (including the OS itself) can coexist on an 8086 is if they all cooperate, without trampling each other's memory.)  A segment on 8086 holds 64 KiB.  Usually, a segment is allocated so that variables start at offset zero; I suppose if a segment isn't filled up, the next allocated segment might overlap it, in which case an unbounded array access (ever so easy to do in C) or buffer overflow would trample variables in both segments.  At start, the OS (MS-DOS, etc.) reads the EXE file and copies its data (including the code) into the sections listed in the EXE header, then jumps to the code's entry point (which is probably the compiler's initializer code, and then it's on to your code as such).
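To make the overlap concrete: in real mode the physical address is segment * 16 + offset, truncated to the 8086's 20 address lines, so segments that start less than 64 KiB apart share bytes.  A rough sketch of the arithmetic (the function names are mine, not from any toolchain):

```c
#include <stdint.h>

/* Physical address the 8086 forms for segment:offset.
   The segment is shifted left by 4 bits (x16) and the sum is
   truncated to the CPU's 20 address lines. */
uint32_t phys_addr(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}

/* Two segment:offset pairs alias when they reach the same byte. */
int same_byte(uint16_t seg_a, uint16_t off_a,
              uint16_t seg_b, uint16_t off_b)
{
    return phys_addr(seg_a, off_a) == phys_addr(seg_b, off_b);
}
```

For example, 0x1000:0x0010 and 0x1001:0x0000 are the same byte, so running off the end of an array addressed through the first segment lands in whatever was allocated in the second.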

Or if you're working in assembler, you might not have sections at all.  On the 8086, a .COM file is just a flat, <= 64kiB chunk of code and data, which is loaded and jumped into at offset 0x0100.  (Normally, you'd use sections to take advantage of even these basic memory management paradigms, even if you technically don't need to do anything at all.)  Or on a Harvard architecture, you necessarily have to use sections, to allocate variables and write code, respectively.  (Note that assembler goes through the same linking process, and these sections are just hints passed along to the linker -- this is why the linker is critical to understanding the memory model of an executable. :) )

Tim
 
The following users thanked this post: nForce

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4290
  • Country: us
Re: Compiling a program (sections)
« Reply #2 on: January 24, 2019, 11:32:24 am »
I'm going to go ahead and recommend an online class that I took some time ago: Introduction to Embedded Systems Software and Development Environments

I didn't think that this was a particularly great class, but it does talk about some subjects - particularly "compiler environments", "Makefiles", and "memory regions" - that are left out of many computer curricula (in favor of providing students with a pre-configured environment that "just works.")

You can "audit" it for free.   It might help.
 
The following users thanked this post: nForce

Offline legacy

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Compiling a program (sections)
« Reply #3 on: January 24, 2019, 01:33:41 pm »
What are sections?

directives for the linker script, from your point of view.
 
The following users thanked this post: nForce

Offline rsjsouza

  • Super Contributor
  • ***
  • Posts: 6047
  • Country: us
  • Eternally curious
    • Vbe - vídeo blog eletrônico
Re: Compiling a program (sections)
« Reply #4 on: January 24, 2019, 03:40:50 pm »
Vbe - vídeo blog eletrônico http://videos.vbeletronico.com

Oh, the "whys" of the datasheets... The information is there not to be an axiomatic truth, but instead each speck of data must be slowly inhaled while carefully performing a deep search inside oneself to find the true metaphysical sense...
 
The following users thanked this post: nForce

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3239
  • Country: ca
Re: Compiling a program (sections)
« Reply #5 on: January 24, 2019, 11:04:52 pm »
What are sections?

directives for the linker script, from your point of view.

Pieces of memory from mine.
 
The following users thanked this post: nForce

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4441
  • Country: nz
Re: Compiling a program (sections)
« Reply #6 on: January 24, 2019, 11:49:31 pm »
Another fun part is that the linker will concatenate together different sections with the same name, but can't split apart a section.

In traditional MacOS (maybe Windows?) every function is effectively compiled into its own section, and if a particular function (maybe from a library) is not used in the program then the linker will not include it. However, in traditional Unix land every C source file is compiled into a single code section, a single data section, and a single bss section. So if you want to minimise the size of your program you need to provide compile flags such as gcc's -ffunction-sections and -fdata-sections, and then tell the linker to discard the unreferenced sections (--gc-sections with GNU ld).
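A sketch of how that plays out, assuming GCC and GNU ld on an ELF target.  The file names (demo.c, main.o) and both functions are hypothetical; the flags are the real ones mentioned above plus the linker's --gc-sections.

```c
/* Build sketch (GCC with GNU ld assumed; file names are made up):
 *   gcc -Os -ffunction-sections -fdata-sections -c demo.c
 *   gcc -Wl,--gc-sections demo.o main.o -o demo
 * With -ffunction-sections each function below gets its own section
 * (.text.twice, .text.never_called) instead of sharing one .text.
 */

/* Referenced by the rest of the program, so its section is kept. */
int twice(int x)
{
    return 2 * x;
}

/* Not referenced anywhere: --gc-sections can drop its section, which
   would be impossible if both functions lived in one .text section,
   because the linker can't split a section apart. */
int never_called(int x)
{
    return x * x * x;
}
```

Without the per-function sections, using twice() alone would drag never_called() into the final image along with it.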
 

Offline legacy

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Compiling a program (sections)
« Reply #7 on: January 25, 2019, 12:01:19 am »
dead-code checkers can be your own personal Jesus :D
(I have recently implemented one for my C-builder)
 

Offline nForce (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 393
  • Country: ee
Re: Compiling a program (sections)
« Reply #8 on: January 25, 2019, 03:46:04 pm »
I have one question: Are sections the same thing as segments in C?

https://www.geeksforgeeks.org/memory-layout-of-c-program/ ?
 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3239
  • Country: ca
Re: Compiling a program (sections)
« Reply #9 on: January 25, 2019, 04:55:21 pm »
I have one question: Are sections the same thing as segments in C?

The linker combines the sections into segments. The loader then loads the segments without knowing anything about sections.

For an MCU, the ROM segments are programmed by the programmer, but you need to initialize some of the RAM segments at reset. In this case, the initializer (the loader, if you will) is a part of the program.
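A rough sketch of such an initializer.  In real startup code the bounds usually come from symbols defined in the linker script (the names vary per toolchain); here they are passed in as parameters, which is my own simplification so the routine can be tried on a host.

```c
#include <stdint.h>

/* What a reset handler typically does before calling main():
   copy the initial values of .data from ROM to RAM, then zero .bss. */
void init_ram(const uint32_t *data_load,              /* .data image in ROM */
              uint32_t *data_start, uint32_t *data_end, /* .data in RAM */
              uint32_t *bss_start, uint32_t *bss_end)   /* .bss in RAM */
{
    uint32_t *dst;

    /* Initialized variables: values are stored in ROM, used from RAM. */
    for (dst = data_start; dst < data_end; ++dst)
        *dst = *data_load++;

    /* Zero-initialized variables: nothing is stored, just wipe the range. */
    for (dst = bss_start; dst < bss_end; ++dst)
        *dst = 0;
}
```

This is also why .bss costs no ROM: only its start and end addresses are needed, not a pile of stored zeroes.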
 
The following users thanked this post: nForce

Offline nForce (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 393
  • Country: ee
Re: Compiling a program (sections)
« Reply #10 on: January 25, 2019, 07:54:20 pm »
Why are some sections named the same as some segments?
 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3239
  • Country: ca
Re: Compiling a program (sections)
« Reply #11 on: January 25, 2019, 08:07:22 pm »
Why are some sections named the same as some segments?

Not necessarily. Segments may not have names at all, for example, in ELF.
 
The following users thanked this post: nForce

