Here are my 2c (I did not read through the whole thread, so please forgive me if these are duplicates):
1. In C, arrays are quite problematic when passed to functions, because the length of the array is not implicitly passed as an argument. The new language should therefore pass the length of an array implicitly to the called function, and that length needs to be accessible inside the called function. Enabling/disabling run-time checks for validating array accesses should be supported globally / by compilation unit / by function, and during unit testing. Also, when passing a variable or an array as a void*, the caller should pass the total size of the item in bytes to the called function.
This is a language goal. The problem was solved years ago and is available to us straight from PL/I. Here's how I envisage it looking in the revised grammar (which is being actively developed, based on PL/I but slowly being revised as language goals solidify):
dcl matrix(10,20) string(64) varying;
func scan_for_text(table) returns bool
arg table(*,*) string(64);
var x = Dim(table,0); // i.e. 10
var y = Dim(table,1); // i.e. 20
var isup = is_uppercase(table(3,7));
end
func is_uppercase(text) returns bool
arg text string(*);
end
Enabling dynamic access to array and varying-string metadata is fundamental; I'm glad you raised it.
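For comparison, C can only approximate this by hand today. Below is a minimal sketch, assuming a hypothetical `int_span` struct and `span_at` / `span_sum` helpers (none of these come from PL/I or the proposed grammar). The length travels with the pointer, and the bounds check can be compiled out by defining NDEBUG:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the data pointer travels with its length. */
typedef struct {
    const int *data;
    size_t     len;
} int_span;

/* Build a span from a real array. This only works where the array type
   is still visible, i.e. before it has decayed to a pointer. */
#define SPAN_OF(arr) ((int_span){ (arr), sizeof(arr) / sizeof((arr)[0]) })

/* Checked element access; compiling with -DNDEBUG removes the check. */
static int span_at(int_span s, size_t i)
{
    assert(i < s.len);
    return s.data[i];
}

/* The callee can query the length, much as Dim() does in the sketch above. */
static long span_sum(int_span s)
{
    long total = 0;
    for (size_t i = 0; i < s.len; i++)
        total += span_at(s, i);
    return total;
}
```

A caller would write something like `span_sum(SPAN_OF(my_array))`. The point of the language goal is that none of this boilerplate should be needed.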
2. In C, handling endianness when declaring structs and bit fields requires some work to make code portable across different architectures. Keywords and language constructs for native / big endian / little endian should be provided, and the compiler should take care of producing code for the correct endianness.
This is interesting. I know that Arm supports both byte orders. How would you envisage this looking to a developer? What aspects of code are influenced by this?
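For context, this is roughly what portable C has to do by hand today when a field has a fixed byte order, and what a byte-order keyword would presumably let the compiler generate instead. It is only a sketch; `load_be32` and `store_be32` are hypothetical helper names, not from any library:

```c
#include <stdint.h>

/* Read a 32-bit big-endian field one byte at a time, so the result is
   the same on little-endian and big-endian hosts. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Write a 32-bit value as big-endian, most significant byte first. */
static void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}
```

The aspects of code this touches are mainly anything that crosses a boundary: wire protocols, file formats, and memory-mapped structures shared with hardware or another processor.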
3. In C the padding of structs requires careful design to get it right, especially when porting to different architectures. The compiler should provide simple and intuitive means for specifying and asserting the padding.
I agree. PL/I was richer in this area, and taking that and revising it is a goal: controlling alignment, physical member ordering and padding, with control at both the structure level and the member level.
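As a point of reference, C11 can at least assert a layout at compile time, even though it cannot specify one portably. A minimal sketch, assuming a hypothetical `packet_header` struct with its padding written out as explicit members:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical header with every padding byte spelled out. */
typedef struct {
    uint8_t  kind;      /* offset 0 */
    uint8_t  pad0[3];   /* offset 1: explicit padding up to a 4-byte boundary */
    uint32_t length;    /* offset 4 */
    uint16_t checksum;  /* offset 8 */
    uint8_t  pad1[2];   /* offset 10: explicit tail padding */
} packet_header;

/* The build fails if the compiler lays the struct out differently. */
static_assert(offsetof(packet_header, length) == 4,   "length must sit at offset 4");
static_assert(offsetof(packet_header, checksum) == 8, "checksum must sit at offset 8");
static_assert(sizeof(packet_header) == 12,            "header must be exactly 12 bytes");
```

This only detects a mismatch; it does not let the programmer dictate the layout, which is the gap the structure-level and member-level controls would fill.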
4. Design by contract (https://en.wikipedia.org/wiki/Design_by_contract) should be supported natively by the compiler so that verification of the contracts can be enabled at compile time and when running the unit tests. The run-time validation of contract asserts and pre- and postcondition contracts should be controllable globally / by compilation unit / by function when creating the actual production build. In C this can be done using macros, but it should be built into the language and supported by the run-time system.
This is a completely new area to me! I've read bits about it in Ada and SPARK. I need to study this more and am unsure how to quantify the scope, but it could well impact grammar design, so it should be thought about early on.
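To make the macro approach mentioned in point 4 concrete, here is a minimal sketch. The macro names `REQUIRE` / `ENSURE`, the `CONTRACTS_OFF` switch and the `isqrt` example are all hypothetical, chosen only for illustration:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical contract macros. Defining CONTRACTS_OFF (globally or for
   one compilation unit) compiles every check out. */
#ifdef CONTRACTS_OFF
#define REQUIRE(cond) ((void)0)
#define ENSURE(cond)  ((void)0)
#else
#define CONTRACT_CHECK(kind, cond)                                   \
    do {                                                             \
        if (!(cond)) {                                               \
            fprintf(stderr, "%s failed: %s (%s:%d)\n",               \
                    kind, #cond, __FILE__, __LINE__);                \
            abort();                                                 \
        }                                                            \
    } while (0)
#define REQUIRE(cond) CONTRACT_CHECK("precondition", cond)
#define ENSURE(cond)  CONTRACT_CHECK("postcondition", cond)
#endif

/* Integer square root: the largest r such that r*r <= n. */
static unsigned isqrt(unsigned n)
{
    REQUIRE(n <= 1000000u);   /* precondition: keeps (r+1)*(r+1) in range */
    unsigned r = 0;
    while ((r + 1) * (r + 1) <= n)
        r++;
    ENSURE(r * r <= n && (r + 1) * (r + 1) > n);   /* postcondition */
    return r;
}
```

Macros like these give global and per-compilation-unit control, but not per-function control, and the compiler knows nothing about the contracts. That is presumably where native support would differ.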
5. Modern C compilers complain when comparing an unsigned value to a signed value, which is good. But they allow assigning unsigned variables to signed variables, and vice versa. Such assignments should be flagged as an error by default. Literal values should be cast automatically by the compiler, of course. If the user wants to assign signed variables to unsigned variables, or vice versa, the conversion should be written explicitly by the programmer.
Well, first off, strict consistency is an absolute must; different policies in different circumstances are just asking for problems. Scenarios that can lead to unexpected, non-intuitive outcomes are a real problem. This falls into the exception support area too.
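To illustrate the kind of non-intuitive outcome at stake: in C, `unsigned u = -1;` is accepted and leaves `u` holding `UINT_MAX`. The explicit alternative could look something like the sketch below, where `int_to_uint` and `uint_to_int` are hypothetical helpers that refuse values the target type cannot represent:

```c
#include <limits.h>
#include <stdbool.h>

/* Convert signed to unsigned; fails (returns false) for negative input. */
static bool int_to_uint(int in, unsigned *out)
{
    if (in < 0)
        return false;
    *out = (unsigned)in;
    return true;
}

/* Convert unsigned to signed; fails for values above INT_MAX. */
static bool uint_to_int(unsigned in, int *out)
{
    if (in > (unsigned)INT_MAX)
        return false;
    *out = (int)in;
    return true;
}
```

Whether a failed conversion should return a status like this or raise an exception is exactly the kind of policy that needs to be consistent across the language.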
6. Native 8-bit unsigned and signed data types, and a separate 8-bit char data type which is not compatible with the integer types without proper casting.
Again, PL/I was attentive here: it had the 'char' data type, in essence a 'byte' with no concept of sign. For numeric use we can do this:
dcl counter bin(8) signed;
dcl totals bin(16) unsigned;
dcl rate bin(16,4) signed;
dcl interim bin(12); // defaults to signed (perhaps)
This shows the arbitrary fixed-point binary types; fixed decimal is included as well.
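For readers more used to C, a declaration such as `bin(16,4) signed` corresponds roughly to the hand-rolled fixed-point sketch below. This assumes that it means 16 bits in total with 4 of them fractional (the exact meaning in the new grammar may differ); the `fix16_4` type and helper names are hypothetical, and overflow is not handled:

```c
#include <stdint.h>

/* Assumption: 16-bit two's complement value with 4 fractional bits. */
typedef int16_t fix16_4;
#define FIX_FRAC_BITS 4
#define FIX_ONE (1 << FIX_FRAC_BITS)   /* 1.0 is stored as raw value 16 */

static fix16_4 fix_from_int(int whole)
{
    return (fix16_4)(whole * FIX_ONE);
}

/* Addition needs no rescaling: both operands share the same scale. */
static fix16_4 fix_add(fix16_4 a, fix16_4 b)
{
    return (fix16_4)(a + b);
}

/* The raw product carries 8 fractional bits, so divide by 2^4 to get
   back to 4. Integer division truncates toward zero. */
static fix16_4 fix_mul(fix16_4 a, fix16_4 b)
{
    return (fix16_4)(((int32_t)a * b) / FIX_ONE);
}

static double fix_to_double(fix16_4 v)
{
    return (double)v / FIX_ONE;
}
```

With a native type, the compiler would track the scale itself, so the rescaling in `fix_mul` would not have to be written by hand.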
7. Strict type checking. For example, the following should not be allowed without an explicit cast, because the variables are declared with different data types even though both are int8_t at the end of the day:
typedef int8_t apple_t;
typedef int8_t orange_t;
apple_t apples = 0;
orange_t oranges = 0;
apples = oranges; /* This should be flagged as an error without explicit cast */
8. More intuitive way to work with bits/bitmasks.
Type conversion is a big area here. PL/I never supported user-defined type names, but they should be supported. Whether that should simply be analogous to C's typedef, I'm not sure yet. It makes little sense to me, though, to allow a new type name to be defined for an existing type name; I have no idea why C allows that.
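For what it's worth, C can be coaxed into the strict behaviour asked for in point 7 by wrapping each type in its own struct, since two struct types are never assignment-compatible. This is only a sketch of that workaround, and `apple_from_orange` is a hypothetical name:

```c
#include <stdint.h>

/* Each struct is a distinct type, unlike two typedefs of int8_t. */
typedef struct { int8_t value; } apple_t;
typedef struct { int8_t value; } orange_t;

/* The only way across is a conversion that is spelled out by name. */
static apple_t apple_from_orange(orange_t o)
{
    apple_t a = { o.value };
    return a;
}
```

With these definitions, `apples = oranges;` is rejected by the compiler as an assignment between incompatible types, at the cost of writing `.value` everywhere. A language-level distinct type would avoid that cost.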
As for bit fields and the like, 'bit' is a native language data type here. One can declare bit strings (akin to our familiar character strings). I also envisage 'pad' as a type: one can declare multiple explicit padding members if desired, and all pad members can even share the same name, easing readability of code.
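The closest C comes to this is unnamed bit-fields, which act as anonymous padding, alongside the usual mask-and-shift idiom. A minimal sketch with a hypothetical status register layout (the `status_bits` type and `STATUS_*` names are made up for illustration):

```c
#include <stdint.h>

/* Bit-field ordering and packing are implementation-defined in C, so
   this struct is not a portable description of a hardware register. */
typedef struct {
    unsigned ready : 1;
    unsigned       : 3;   /* unnamed padding bits */
    unsigned mode  : 2;
    unsigned       : 2;   /* a second unnamed pad */
} status_bits;

/* Portable alternative: explicit masks on a fixed-width integer. */
#define STATUS_READY      (1u << 0)
#define STATUS_MODE_SHIFT 4
#define STATUS_MODE_MASK  (3u << STATUS_MODE_SHIFT)

/* Replace the 2-bit mode field, leaving all other bits untouched. */
static uint8_t status_set_mode(uint8_t reg, unsigned mode)
{
    return (uint8_t)((reg & ~STATUS_MODE_MASK) |
                     ((mode << STATUS_MODE_SHIFT) & STATUS_MODE_MASK));
}

static unsigned status_get_mode(uint8_t reg)
{
    return (reg & STATUS_MODE_MASK) >> STATUS_MODE_SHIFT;
}
```

The struct form is the readable one and the mask form is the portable one. A native 'bit' type with well-defined layout would not force a choice between the two.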
Thanks again for your detailed post!