Author Topic: MPLAB X a PIC inhibitor! Alternatives ? (Read 87829 times)

@rt · « **Reply #150 on:** November 22, 2017, 09:02:17 pm »

Can you modify XC16 with a Mac?

@rt · « **Reply #151 on:** November 22, 2017, 10:38:33 pm »

I just updated XC16 (not the IDE) from 1.30 to 1.33 so I could try the 60 day trial,
and now the same project compiles to some 300 words more code with optimisation “s”.

Fortunately I can still select the older XC16 version, and it compiles as it used to!

cv007 · « **Reply #152 on:** November 22, 2017, 10:39:34 pm »

OK. This is simpler than I thought (and may work for Windows/Mac users, also). No need to change anything.
(that mchp_mafrlcsj symbol was bugging me, so I had to figure out what was going on)

Create a text file in the project (project root folder is where I put it), call it specs.txt

Code: [Select]

*cc1:+ -mafrlcsj

*cc1plus:+ -mafrlcsj

edit- cleaned up a little, notice the required blank line also (specs file format is a little odd)
(if you want to see the gcc specs output when compiling, add -v to the global additional options)

Now, in project properties, XC16 or XC32 (Global Options), additional options field add this -specs=specs.txt
Compile away with whatever options you want.

The XC16/32 front end is not allowing the -mafrlcsj option to get to cc1/cc1plus. By specifying your own specs files it appears the option gets to where it needs. The specs file above simply adds the -mafrlcsj option to what it already uses in its default specs.

I'm not the smartest bulb in the Christmas tree, so if I can figure this out, not too complicated.
(edit- that should be 'brightest bulb', which is more proof)

MCHP could change the next version, but they are required to release the source, so it's a futile game they would be playing. I would also add that it looks like the source they are using is not the source they are providing, otherwise that option would make it to cc1. I suspect they are changing the switch/case for that option- in the source we get, that global option is being set to true in the switch/case for the -mafrlcsj option, in their source I think they just set it to 0 as the option is otherwise recognized.

mikeselectricstuff · « **Reply #153 on:** November 22, 2017, 11:35:10 pm »

Just tried that on the HAD badge project (86K object code): -Os optimisation saved 10% of the code, -O3 doubled it (hopefully improved speed).
Not checked if it runs.

@rt · « **Reply #154 on:** November 23, 2017, 02:52:20 am »

Is it unlikely the program and data size for a large project would stay exactly the same no matter the optimisation option?
That’s what’s happening here. Also the same if optimisation is zero.

I remember when first installing MPLABX, there was some way I enabled it for each individual file in a project, and I saw a difference then.

https://photos.app.goo.gl/WuHzzvErHNUsqbb83

cv007 · « **Reply #155 on:** November 23, 2017, 04:58:18 am »

I'm not sure how great the optimizations are for pic24/33 or pic32 (I seem to recall it was not a big deal when I tried them). For the pic32, the greater benefit it would seem to me is the mips16 and c++ options. I also seem to recall that generating mips16 code reduced the size a bit, but I think there were some hoops to jump through like maybe not being able to use in interrupts (or something like that).

Also, there are more options than -Os/1/2/3: things like isolating each function in its own section (gcc) along with removing unused sections (ld), and others.

It looks like a few versions ago, they created that -mafrlcsj option to (I assume) free the developers from having to deal with the hassle of (their own) licensing. I'll bet they didn't realize they could not prevent that option from making its way to cc1/cc1plus/lto1.

Here are the relevant files for 'mafrlcsj'
binutils/bfd/pic32-options.c
gcc/gcc/config/pic32/mchp.opt
gcc/gcc/config/pic32/mchp.c

technix · « **Reply #156 on:** November 23, 2017, 05:20:27 am »

Quote from: @rt on November 22, 2017, 09:02:17 pm

Can you modify XC16 with a Mac?

Sure I can. macOS is UNIX too, being based on BSD instead of Linux though.

@rt · « **Reply #157 on:** November 23, 2017, 05:26:51 am »

I seem to recall when I first installed it, and was able to set optimisation on project files individually, that optimisation did make a difference with the same project.
It broke the main file, and I had to change a delay loop that would have appeared useless.

technix · « **Reply #158 on:** November 23, 2017, 05:43:48 am »

Quote from: @rt on November 23, 2017, 05:26:51 am

I seem to recall when I first installed it, and was able to set optimisation on project files individually, that optimisation did make a difference with the same project.
It broke the main file, and I had to change a delay loop that would have appeared useless.

Delay loops usually don't survive well with optimizers' DCE passes. If you have timers to spare it may be better to implement delays using one of those.
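A minimal sketch of why such a loop disappears, and one common way to keep it when no timer is free. It assumes a GCC-style compiler; the function names are made up for illustration and the loop counts are not calibrated to any clock:

```c
#include <stdint.h>

/* Empty loop with an ordinary counter: nothing observable happens, so an
 * optimizing build is free to delete the whole loop (dead code elimination). */
void delay_naive(uint16_t count)
{
    uint16_t i;
    for (i = 0; i < count; i++) {
        /* nothing */
    }
}

/* Same loop with a volatile counter: every read and write of i has to be
 * performed, so the loop survives. Returns the final counter value. */
uint16_t delay_volatile(uint16_t count)
{
    volatile uint16_t i;
    for (i = 0; i < count; i++) {
        /* nothing */
    }
    return i;
}
```

The volatile version still changes its timing between optimization levels, which is one more reason a hardware timer is the more predictable choice.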

@rt · « **Reply #159 on:** November 23, 2017, 06:05:03 am »

The only delay required was a call to a function with argument, so it was easily resolved, but that doesn’t explain the rest of it. I still find it hard to believe it’s working at all. I mean surely there should be one word difference at least.. between all five options. I bet if I wrote a new useless function full of setting an unused variable to zero, it still won’t change anything.

technix · « **Reply #160 on:** November 23, 2017, 06:15:34 am »

Quote from: @rt on November 23, 2017, 06:05:03 am

The only delay required was a call to a function with argument, so it was easily resolved, but that doesn’t explain the rest of it. I still find it hard to believe it’s working at all. I mean surely there should be one word difference at least.. between all five options. I bet if I wrote a new useless function full of setting an unused variable to zero, it still won’t change anything.

The function would be DCE'd into a single "return" instruction, and if the function is only used in the same file it would be entirely optimized out through inlining and further DCE.
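As a rough illustration of that point (all names here are hypothetical, not taken from anyone's project): a function that only sets an unused variable has no observable effect, so its size contribution at any level above -O0 can be close to nothing.

```c
/* Hypothetical global; volatile so that writes to it count as observable. */
volatile unsigned int sink;

/* Externally visible: the symbol must still exist, but the dead stores can
 * be removed, leaving little more than a return instruction. */
void useless_global(void)
{
    unsigned int unused;
    unused = 0;
    unused = 0;
    (void)unused;
}

/* File-local: after inlining into caller() there is nothing left to emit,
 * so the function can vanish from the output entirely. */
static void useless_file_local(void)
{
    unsigned int unused;
    unused = 0;
    (void)unused;
}

void caller(void)
{
    useless_file_local();   /* expected to compile to nothing when optimized */
    sink = 1;               /* volatile store: always kept */
}
```

Comparing the listing of this file at -O0 and -O1 is a quick way to confirm whether the selected level is really being applied.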

JPortici · « **Reply #161 on:** November 23, 2017, 06:24:23 am »

Quote from: cv007 on November 23, 2017, 04:58:18 am

I'm not sure how great the optimizations are for pic24/33 or pic32 (I seem to recall it was not a big deal when I tried them).

-O0 to -O1 (which is available in free mode) is already optimizing A LOT: the accumulators are used in a more intelligent way and a lot of unnecessary stack / frame pointers are not saved/restored between function calls

hans · « **Reply #162 on:** November 23, 2017, 08:45:05 am »

Optimization level 0 basically turns it off, creating rather large and slow code, but it is the easiest for the debugger to follow along.
At any other optimization level, except 'g' which I don't think XC16/XC32 support, you lose this support.

Also consider using optimization level 2. I find that optimizing for size makes the function inliner very lazy, especially in C++. It seems to compile functions as compact as it can in an isolated manner. That means the code still has lots of calls to tiny functions which it could have inlined.
If you use optimize level 3, you may find the compiler wants to unroll entire loops, which can make the code size explode. It is a bit faster, but at what cost.
Finally, you can always use attribute optimize (add __attribute__((optimize("Os"))) to function prototype) in GCC to hand pick which functions should be more aggressively optimized.
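A small sketch of the per-function attribute, assuming a GCC-based compiler that accepts it (worth checking with your XC16/XC32 version). The OPTIMIZE_AS macro and the checksum functions are invented for the example, and the attribute is placed on the definitions here for brevity:

```c
#include <stddef.h>
#include <stdint.h>

/* optimize is a GCC extension; expand to nothing on other compilers. */
#if defined(__GNUC__) && !defined(__clang__)
#define OPTIMIZE_AS(level) __attribute__((optimize(level)))
#else
#define OPTIMIZE_AS(level)
#endif

/* Rarely called: ask for the smallest code regardless of the global level. */
OPTIMIZE_AS("Os")
uint8_t checksum_small(const uint8_t *buf, size_t len)
{
    uint8_t sum = 0;
    size_t i;
    for (i = 0; i < len; i++) {
        sum = (uint8_t)(sum + buf[i]);
    }
    return sum;
}

/* Hot path: ask for speed-oriented optimization for this function only. */
OPTIMIZE_AS("O2")
uint8_t checksum_fast(const uint8_t *buf, size_t len)
{
    uint8_t sum = 0;
    size_t i;
    for (i = 0; i < len; i++) {
        sum = (uint8_t)(sum + buf[i]);
    }
    return sum;
}
```

Both functions return the same result; only the generated code should differ. GCC's own documentation describes this attribute as a debugging aid rather than something to rely on in production, so treat it as a tuning experiment and verify the listing.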

The remove unused sections option should IMO be on by default. Sometimes you get a library from a vendor and only call 5% of its functions, but meanwhile it compiles the other 95% of the code with it. If you don't isolate functions into separate sections, the linker usually cannot be as aggressive in doing this.
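To make the section idea concrete, here is a sketch with a made-up "vendor library" where only one function is called. The flags in the comment are the stock GCC and GNU ld ones for ELF targets; whether your XC16/XC32 install exposes them under the same names is an assumption to verify:

```c
/* Hypothetical vendor library: three functions in one source file. */
int lib_used(int x)     { return x + 1; }
int lib_unused_a(int x) { return x * 3; }
int lib_unused_b(int x) { return x - 7; }

/* Build sketch (stock GCC + GNU ld):
 *   gcc -Os -ffunction-sections -c lib.c     each function gets its own
 *                                            .text.<name> section
 *   gcc -Wl,--gc-sections -o app app.o lib.o unreferenced sections dropped
 *
 * Without -ffunction-sections all three functions share one .text section,
 * so the linker has to keep the whole thing if any one of them is used. */
int app_entry(void)
{
    return lib_used(41);   /* the only library function the program needs */
}
```

With both options on, lib_unused_a and lib_unused_b should not appear in the final map file.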

@rt · « **Reply #163 on:** November 23, 2017, 01:19:44 pm »

Looking at the output window for mine, changing the opt level does nothing.
I can change it for each file in “configurations.xml”.

andersm · « **Reply #164 on:** November 23, 2017, 04:02:25 pm »

Quote from: @rt on November 23, 2017, 01:19:44 pm

Looking at the output window for mine, changing the opt level does nothing.
I can change it for each file in “configurations.xml”.

Why not do the sane thing and change it in the IDE project options? You have no control over when the IDE reads/saves its files, or when the makefiles are generated.

@rt · « **Reply #165 on:** November 23, 2017, 11:29:26 pm »

If you read some posts above. It didn’t work, and still doesn’t. Images are also attached.

cv007 · « **Reply #166 on:** November 24, 2017, 12:19:11 am »

I tested the -mafrlcsj option via a specs file on a Windows pc, and it works also. I assume MacOS would work, too.

Only the latest XC16 (v1.33) and XC32 (v1.44) were tested, but the source code shows the option was added in XC16 ver 1.26, and in XC32 v1.42 so assume it will work starting with those versions.

3 lines of 'code' (1 blank) into 1 file in the project folder, 1 added global option, and you have no restrictions as it should be.

Disregard my previous scripts (unless they decide to remove the -mafrlcsj option in later builds).

Quote

If you read some posts above. It didn’t work, and still doesn’t. Images are also attached.

Your images don't really help. The gui may show that you selected the 's' optimization, but only the debug build output would show what is actually being applied- for instance if there is a problem with your trial license it would revert to free mode and the 's' optimization would not happen (you will also then get messages in the build debug output- which we cannot see).

@rt · « **Reply #167 on:** November 24, 2017, 02:54:18 am »

The output window shows optimisation level 1, no matter what is globally selected, which is what I originally applied in the free version.
MPLABX does appear to acknowledge the trial license (for the GPL software) by telling me I had 61 days remaining,
which displays as 60 days when I look at it today.
Having worked on the same project on & off for 18 months or so, I’d definitely notice anything odd in the output window.

It’s only when I changed the optimisation level in the configurations.xml file that they actually take effect,
and I am able to do that for individual project files.

I wouldn’t consider this an issue anymore, since the file is easy to edit if it does change.
I’ll see what I can do in the 60 days, and then fiddle with it again after that.

Same project again:
https://photos.app.goo.gl/EJ2inghXXUQoaXxP2

andersm · « **Reply #168 on:** November 24, 2017, 06:03:26 am »

Quote from: @rt on November 23, 2017, 11:29:26 pm

If you read some posts above. It didn’t work, and still doesn’t. Images are also attached.

Your images didn't show the build transcript, which would show that you built with the options you think you did.

@rt · « **Reply #169 on:** November 24, 2017, 06:11:19 am »

They show the generated code size, which doesn’t change until I manually edit the file. The output window shows optimisation level 1 for all project files no matter which of five options are set.

andersm · « **Reply #170 on:** November 24, 2017, 07:30:55 am »

Quote from: @rt on November 24, 2017, 06:11:19 am

The output window shows optimisation level 1 for all project files no matter which of five options are set.

So you've diddled with the project settings in a way that the IDE hasn't regenerated the makefiles, and haven't actually built the project with different settings.

JPortici · « **Reply #171 on:** November 24, 2017, 09:11:00 am »

Quote from: cv007 on November 22, 2017, 10:39:34 pm

OK. This is simpler than I thought (and may work for Windows/Mac users, also). No need to change anything.
(that mchp_mafrlcsj symbol was bugging me, so I had to figure out what was going on)

Create a text file in the project (project root folder is where I put it), call it specs.txt
Code: [Select]

*cc1:+ -mafrlcsj

*cc1plus:+ -mafrlcsj

edit- cleaned up a little, notice the required blank line also (specs file format is a little odd)
(if you want to see the gcc specs output when compiling, add -v to the global additional options)

Now, in project properties, XC16 or XC32 (Global Options), additional options field add this -specs=specs.txt
Compile away with whatever options you want.

The XC16/32 front end is not allowing the -mafrlcsj option to get to cc1/cc1plus. By specifying your own specs files it appears the option gets to where it needs. The specs file above simply adds the -mafrlcsj option to what it already uses in its default specs.

I'm not the smartest bulb in the Christmas tree, so if I can figure this out, not too complicated.
(edit- that should be 'brightest bulb', which is more proof)

MCHP could change the next version, but they are required to release the source, so it's a futile game they would be playing. I would also add that it looks like the source they are using is not the source they are providing, otherwise that option would make it to cc1. I suspect they are changing the switch/case for that option- in the source we get, that global option is being set to true in the switch/case for the -mafrlcsj option, in their source I think they just set it to 0 as the option is otherwise recognized.

hey, it did something

(XC16 v1.33 on windows)

tried with -O2 without additional options: "license file is required" and it compiled with -O1
with the additional options, it compiled.

-O0 and -O1 produce the same hex file with and without the additional option

Optimization level -O0
Data: 6950
Program: 8119

Optimization level -O1
Data: 6950
Program: 6045

Optimization level -O2
Data: 6950
Program: 7190 <--probably unrolling some loops

Optimization level -Os
Data: 6950
Program: 5799 <-- this actually worries me

Optimization level -O3
Data: 6950
Program: 9006 <-- hm...

So something is happening, but i have to investigate what exactly. i wonder why: in this project the code is written to be as optimized as it can get, and all number-crunching tasks are either assembly routines or use builtins for things like divisions (where the variables are bounded so they will never overflow)

JPortici · « **Reply #172 on:** November 24, 2017, 09:30:39 am »

oh oh.. i already see some of the tricks it adopted (that i can use with -O1 to force the compiler to optimize)

comparison #1
this routine produces the same code with -O1, -O2 and -O3

Code: [Select]

27:                void shift_sentbuf(sentbuf_t* sentbuf,unsigned int times) {
28:                  int idx;
29:                  while (times > 0) {
004EFE  E00001     CP0 W1
004F00  32000D     BRA Z, 0x4F1C
004F1A  3AFFF3     BRA NZ, 0x4F02
30:                    for (idx=0;idx<(_SENT_BUF_SIZE-1);idx++) {
31:                      sentbuf->buffer[idx] = sentbuf->buffer[idx+1];
004F02  900120     MOV [W0+4], W2
004F04  9001B0     MOV [W0+6], W3
004F06  BE8802     MOV.D W2, [W0]
004F08  900140     MOV [W0+8], W2
004F0A  9001D0     MOV [W0+10], W3
004F0C  980022     MOV W2, [W0+4]
004F0E  980033     MOV W3, [W0+6]
004F10  900160     MOV [W0+12], W2
004F12  9001F0     MOV [W0+14], W3
004F14  980042     MOV W2, [W0+8]
004F16  980053     MOV W3, [W0+10]
32:                    }
33:                    times--;
004F18  E90081     DEC W1, W1
34:                  }
35:                }
004F1C  060000     RETURN

with -Os, however

Code: [Select]

27:                void shift_sentbuf(sentbuf_t* sentbuf,unsigned int times) {
28:                  int idx;
29:                  while (times > 0) {
004D14  370008     BRA 0x4D26
004D26  E00001     CP0 W1
30:                    for (idx=0;idx<(_SENT_BUF_SIZE-1);idx++) {
004D20  518FE3     SUB W3, #0x3, [W15]
004D22  3AFFF9     BRA NZ, 0x4D16
31:                      sentbuf->buffer[idx] = sentbuf->buffer[idx+1];
004D16  E80183     INC W3, W3
004D18  900222     MOV [W2+4], W4
004D1A  9002B2     MOV [W2+6], W5
004D1C  BE8904     MOV.D W4, [W2]
004D1E  410164     ADD W2, #0x4, W2
32:                    }
33:                    times--;
004D24  E90081     DEC W1, W1
004D26  E00001     CP0 W1
004D28  320003     BRA Z, 0x4D30
004D2A  780100     MOV W0, W2
004D2C  EB0180     CLR W3
004D2E  37FFF3     BRA 0x4D16
004D30  060000     RETURN
34:                  }
35:                }

reason is simple: #define _SENT_BUF_SIZE 4
so the compiler decided to unroll the loop for small loops in every optimization level but -Os. i wouldn't have done that because in dsPIC33E branches have a high latency.. with 33F instead it would be fine but see it uses the MOV.D instructions!
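For reference, a sketch of what the unrolled listing corresponds to at source level. The type definitions are guesses reconstructed from the offsets in the listings (4-byte elements, idx at offset 16), not the real project headers, and shift_sentbuf_unrolled is a hypothetical name:

```c
#include <stdint.h>

#define _SENT_BUF_SIZE 4

/* Assumed layouts, inferred from the disassembly above. */
typedef struct {
    uint16_t Data_H;
    uint16_t Data_L;
} sent_t;

typedef struct {
    sent_t   buffer[_SENT_BUF_SIZE];
    uint16_t idx;
} sentbuf_t;

/* Rolled form, as posted: this is the shape the -Os listing keeps. */
void shift_sentbuf(sentbuf_t *sentbuf, unsigned int times)
{
    int idx;
    while (times > 0) {
        for (idx = 0; idx < (_SENT_BUF_SIZE - 1); idx++) {
            sentbuf->buffer[idx] = sentbuf->buffer[idx + 1];
        }
        times--;
    }
}

/* Hand-unrolled equivalent of what the -O1/-O2/-O3 listing does:
 * three element copies per pass and no inner-loop branch. */
void shift_sentbuf_unrolled(sentbuf_t *sentbuf, unsigned int times)
{
    while (times > 0) {
        sentbuf->buffer[0] = sentbuf->buffer[1];
        sentbuf->buffer[1] = sentbuf->buffer[2];
        sentbuf->buffer[2] = sentbuf->buffer[3];
        times--;
    }
}
```

Writing the unrolled form by hand only makes sense while the buffer size stays fixed at 4; the rolled form lets the compiler make that trade per optimization level.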

this however can be applied with every optimization.. with -O1

Code: [Select]

10:                sent_t pop_sentbuf(sentbuf_t* sentbuf) {
11:                  sent_t data;
12:                  unsigned int idx;
13:                  if (sentbuf->idx > 0) {
004ED0  900901     MOV [W1+16], W2
004ED2  E00002     CP0 W2
004ED4  320010     BRA Z, 0x4EF6
14:                    data = sentbuf->buffer[0];
004ED6  781831     MOV [W1++], [W0++]
004ED8  781021     MOV [W1--], [W0--]
15:                    for (idx=0;idx<_SENT_BUF_SIZE-1;idx++) {
16:                      sentbuf->buffer[idx] = sentbuf->buffer[idx+1];
004EDA  900221     MOV [W1+4], W4
004EDC  9002B1     MOV [W1+6], W5
004EDE  BE8884     MOV.D W4, [W1]
004EE0  900241     MOV [W1+8], W4
004EE2  9002D1     MOV [W1+10], W5
004EE4  9800A4     MOV W4, [W1+4]
004EE6  9800B5     MOV W5, [W1+6]
004EE8  900261     MOV [W1+12], W4
004EEA  9002F1     MOV [W1+14], W5
004EEC  9800C4     MOV W4, [W1+8]
004EEE  9800D5     MOV W5, [W1+10]
17:                    }
18:                    sentbuf->idx--;
004EF0  E90102     DEC W2, W2
004EF2  980882     MOV W2, [W1+16]
004EF4  370003     BRA 0x4EFC
19:                  }
20:                  else {
21:                    data.Data_H = 0;
004EF6  EB0080     CLR W1
004EF8  780801     MOV W1, [W0]
22:                    data.Data_L = 0;
004EFA  980011     MOV W1, [W0+2]
23:                  }
24:                  return data;
25:                }
004EFC  060000     RETURN

and with every other optimization level

Code: [Select]

10:                sent_t pop_sentbuf(sentbuf_t* sentbuf) {
11:                  sent_t data;
12:                  unsigned int idx;
13:                  if (sentbuf->idx > 0) {
0057C4  900901     MOV [W1+16], W2
0057C6  E00002     CP0 W2
0057C8  320010     BRA Z, 0x57EA
14:                    data = sentbuf->buffer[0];
0057CA  781831     MOV [W1++], [W0++]
0057CC  781021     MOV [W1--], [W0--]
15:                    for (idx=0;idx<_SENT_BUF_SIZE-1;idx++) {
16:                      sentbuf->buffer[idx] = sentbuf->buffer[idx+1];
0057CE  900221     MOV [W1+4], W4
0057D0  9002B1     MOV [W1+6], W5
0057D2  BE8884     MOV.D W4, [W1]
0057D4  900241     MOV [W1+8], W4
0057D6  9002D1     MOV [W1+10], W5
0057D8  9800A4     MOV W4, [W1+4]
0057DA  9800B5     MOV W5, [W1+6]
0057DC  900261     MOV [W1+12], W4
0057DE  9002F1     MOV [W1+14], W5
0057E0  9800C4     MOV W4, [W1+8]
0057E2  9800D5     MOV W5, [W1+10]
17:                    }
18:                    sentbuf->idx--;
0057E4  E90102     DEC W2, W2
0057E6  980882     MOV W2, [W1+16]
0057E8  060000     RETURN
19:                  }
20:                  else {
21:                    data.Data_H = 0;
0057EA  780802     MOV W2, [W0]
22:                    data.Data_L = 0;
0057EC  980012     MOV W2, [W0+2]
23:                  }
24:                  return data;
25:                }
0057EE  060000     RETURN

not much difference, right? but notice the end of the if branch: with every other level the compiler emits a RETURN right there, whereas with -O1 it emits a branch to the end of the routine, which would be the general case. so with -O1, at the end of an if (or of switch cases) that goes straight to the end of the routine, one can simply write return: add "return" where needed and spare yourself 3/5 cycles! (this may be a "duh" but i would have expected the compiler to do it already with -O1)
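The source-level change being described could look like this. The types are again assumed from the listings rather than copied from the project, and whether the branch really disappears at -O1 will depend on the compiler version, so check the listing:

```c
#include <stdint.h>

#define _SENT_BUF_SIZE 4

/* Assumed layouts, inferred from the disassembly above. */
typedef struct {
    uint16_t Data_H;
    uint16_t Data_L;
} sent_t;

typedef struct {
    sent_t   buffer[_SENT_BUF_SIZE];
    uint16_t idx;
} sentbuf_t;

sent_t pop_sentbuf(sentbuf_t *sentbuf)
{
    sent_t data;
    unsigned int idx;

    if (sentbuf->idx > 0) {
        data = sentbuf->buffer[0];
        for (idx = 0; idx < _SENT_BUF_SIZE - 1; idx++) {
            sentbuf->buffer[idx] = sentbuf->buffer[idx + 1];
        }
        sentbuf->idx--;
        return data;        /* explicit return instead of falling through */
    }

    /* Empty buffer: hand back a zeroed element. */
    data.Data_H = 0;
    data.Data_L = 0;
    return data;
}
```

The behaviour is identical to the original if/else version; only the exit path of the first branch is spelled out.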

JPortici · « **Reply #173 on:** November 24, 2017, 10:09:18 am »

strangeness...
-O1

Code: [Select]

54:                  UART2_TXBuf.dim = 6;
004224  200060     MOV #0x6, W0
004226  889390     MOV W0, 0x1272

vs -O2 (this may explain the 300 or so more instructions as i don't see much difference in the listing)

Code: [Select]

54:                  UART2_TXBuf.dim = 6;
0049A0  200060     MOV #0x6, W0
0049A2  889390     MOV W0, 0x1272
0049CA  200060     MOV #0x6, W0
0049CC  889390     MOV W0, 0x1272
0049F6  200060     MOV #0x6, W0
0049F8  889390     MOV W0, 0x1272

this happens many times..
and with routine calls too.. WTF?

Code: [Select]

60:                  UART2_TXBuf.buffer[5] = uart_chkCalc(&UART2_TXBuf);
0049B0  212500     MOV #0x1250, W0
0049B2  0706E3     RCALL _uart_chkCalc
0049B4  8892D0     MOV W0, 0x125A
0049DA  212500     MOV #0x1250, W0
0049DC  0706CE     RCALL _uart_chkCalc
0049DE  8892D0     MOV W0, 0x125A
004A06  212500     MOV #0x1250, W0
004A08  0706B8     RCALL _uart_chkCalc
004A0A  8892D0     MOV W0, 0x125A

@rt · « **Reply #174 on:** November 24, 2017, 11:06:33 am »

Quote from: andersm on November 24, 2017, 07:30:55 am

Quote from: @rt on November 24, 2017, 06:11:19 am
The output window shows optimisation level 1 for all project files no matter which of five options are set.
So you've diddled with the project settings in a way that the IDE hasn't regenerated the makefiles, and haven't actually built the project with different settings.

No, I’ve set the optimisation to 1 for every individual file through the MPLABX interface before updating XC16,
and either lost the ability, or forgotten how to set optimisation for individual project files.

