In addition to what I wrote: using this version of the toolchain, AVR-GNUToolchain 3.4.4.24, gives you the most compact firmware, which is especially important given the small amount of memory in the ATmega328.
I ran a short test on my Debian system with the standard 1.43m source and two gcc versions, 5.4.0 (standard Debian stable package 1:5.4.0+Atmel3.6.1-2) and 7.3.0 (provided by Arduino IDE 1.8.13), each built with and without the compiler option "-flto". The winner is:
gcc 7.3.0 with option -flto
AVR Memory Usage
----------------
Device: atmega328
Program: 26254 bytes (80.1% Full)
(.text + .data + .bootloader)
Data: 224 bytes (10.9% Full)
(.data + .bss + .noinit)
EEPROM: 749 bytes (73.1% Full)
(.eeprom)
followed by:
gcc 5.4.0 with option -flto
AVR Memory Usage
----------------
Device: atmega328
Program: 26652 bytes (81.3% Full)
(.text + .data + .bootloader)
Data: 224 bytes (10.9% Full)
(.data + .bss + .noinit)
EEPROM: 749 bytes (73.1% Full)
(.eeprom)
gcc 7.3.0 without -flto
AVR Memory Usage
----------------
Device: atmega328
Program:
26830 bytes (81.9% Full)(.text + .data + .bootloader)
Data: 226 bytes (11.0% Full)
(.data + .bss + .noinit)
EEPROM: 749 bytes (73.1% Full)
(.eeprom)
gcc 5.4.0 without -flto
AVR Memory Usage
----------------
Device: atmega328
Program:
27068 bytes (82.6% Full)(.text + .data + .bootloader)
Data: 226 bytes (11.0% Full)
(.data + .bss + .noinit)
EEPROM: 749 bytes (73.1% Full)
(.eeprom)
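To make the differences easier to compare, here is a small sketch that works out the bytes saved by each build relative to the largest one. The program sizes are copied from the four reports above. The 32 KiB flash size is my assumption for the ATmega328 (it reproduces the "% Full" figures), and the helper names are made up for illustration.

```python
# Program sizes in bytes, copied from the four avr-size reports.
sizes = {
    "gcc 7.3.0 -flto": 26254,
    "gcc 5.4.0 -flto": 26652,
    "gcc 7.3.0": 26830,
    "gcc 5.4.0": 27068,
}

# Assumption: 32 KiB of flash on the ATmega328 (matches the "% Full" values).
FLASH_BYTES = 32 * 1024


def saving(baseline, candidate):
    """Bytes of program memory saved by `candidate` relative to `baseline`."""
    return sizes[baseline] - sizes[candidate]


def percent_full(name):
    """Flash usage of a build in percent, rounded to one decimal place."""
    return round(100.0 * sizes[name] / FLASH_BYTES, 1)


if __name__ == "__main__":
    baseline = "gcc 5.4.0"  # the largest build, used as the reference
    for name, size in sizes.items():
        print(f"{name:16s} {size:6d} bytes  {percent_full(name):5.1f}% full  "
              f"{saving(baseline, name):4d} bytes saved vs {baseline}")
```

By these numbers -flto saves 576 bytes with gcc 7.3.0 and 416 bytes with gcc 5.4.0, and the best build is 814 bytes smaller than the worst.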
Apart from the added -flto line, all CFLAGS are unchanged from the 1.43m source:
CFLAGS = -mmcu=${MCU} -Wall -I. -Ibitmaps
CFLAGS += -DF_CPU=${FREQ}000000UL
CFLAGS += -DOSC_STARTUP=${OSC_STARTUP}
CFLAGS += -gdwarf-2 -std=gnu99 -Os -mcall-prologues
CFLAGS += -funsigned-char -funsigned-bitfields -fpack-struct -fshort-enums
CFLAGS += -flto
CFLAGS += -MD -MP -MT $(*F).o -MF dep/$(@F).d
Toolchain versions earlier than Atmel 3.6.1 (from the Atmel website) did not work under Debian stable, but at least the newer compiler version gives smaller code, so there is hope for the future.
Nevertheless, it would be interesting to see the corresponding Windows build sizes.