Author Topic: Replacement for C standard library: your wishlist? (Read 8533 times)

Nominal Animal · « **Reply #50 on:** January 10, 2021, 05:05:14 pm »

Quote from: dunkemhigh on January 10, 2021, 04:13:24 pm

OK. I retract my objection, then.

It was a good point, though. Being able to use the higher-level "stream" interfaces on all types of descriptors is important.

Another important case is full duplex I/O. We need to make that simple, because so many file descriptor type interfaces are full duplex in Linux and BSDs.
I do have a few ideas (based on past experience) on how to implement this, but there are several ways of implementing it.

Many things boil down to exposing at least the receive/read buffer, so that the user can check whether a specific code or sequence has been received, and how much data there is in the read buffer, without consuming or discarding the buffered data. Exactly how best to do that is an open question for me.
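To make the idea concrete, here is a rough sketch of one possible shape, not a proposal for the final interface. All names (rbuf, rbuf_peek, rbuf_find, and so on) are hypothetical, the buffer is a fixed 256 bytes, and rbuf_fill() merely stands in for whatever read() into the buffer the real library would do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical read buffer: data[head..tail) holds received, unconsumed bytes. */
typedef struct {
    unsigned char data[256];
    size_t head;   /* index of the first unconsumed byte */
    size_t tail;   /* index one past the last buffered byte */
} rbuf;

/* How many bytes are buffered right now. */
size_t rbuf_available(const rbuf *rb)
{
    return rb->tail - rb->head;
}

/* Look at the buffered data without consuming it; *len receives the byte count. */
const unsigned char *rbuf_peek(const rbuf *rb, size_t *len)
{
    *len = rbuf_available(rb);
    return rb->data + rb->head;
}

/* Offset of seq within the buffered data, or -1 if it is not there (yet).
   Nothing is consumed or discarded. */
long rbuf_find(const rbuf *rb, const void *seq, size_t seqlen)
{
    size_t n = rbuf_available(rb);
    if (seqlen == 0 || seqlen > n)
        return -1;
    for (size_t i = 0; i + seqlen <= n; i++)
        if (memcmp(rb->data + rb->head + i, seq, seqlen) == 0)
            return (long)i;
    return -1;
}

/* Mark the first n buffered bytes as consumed. */
void rbuf_consume(rbuf *rb, size_t n)
{
    assert(n <= rbuf_available(rb));
    rb->head += n;
    if (rb->head == rb->tail)
        rb->head = rb->tail = 0;   /* buffer empty: reuse it from the start */
}

/* Stand-in for the library reading from a descriptor into the buffer. */
size_t rbuf_fill(rbuf *rb, const void *src, size_t n)
{
    size_t room = sizeof rb->data - rb->tail;
    if (n > room)
        n = room;
    memcpy(rb->data + rb->tail, src, n);
    rb->tail += n;
    return n;
}
```

With something like this, a caller can ask rbuf_find() whether a terminator such as "\r\n" has arrived, and only call rbuf_consume() once a complete message is in the buffer.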

SiliconWizard · « **Reply #51 on:** January 10, 2021, 05:48:17 pm »

Quote from: Nominal Animal on January 08, 2021, 05:18:57 pm

Quote from: SiliconWizard on January 08, 2021, 03:12:26 pm
I'd like to see standard hash tables/dictionaries.
Environment variables are an example of a dictionary every process has access to, and I have an idea on those.

Do you have any examples of interfaces you've found useful? Function prototypes would give a good idea, with example real-world use cases.
Compare to e.g. qsort_r() instead of qsort() for sorting: the comparison operation often needs external information, like offset or column within the string to start the comparison at, and passing an untyped user pointer to the comparison function makes that easy in a thread-safe manner, as one does not need to use global variables.

Dictionaries are useful for a very large set of applications.

I'm not sure about the interface. I think ideally you should deal with arbitrary keys and values, so keys and values would be "items", each for instance being a pointer to the item's data, and a size field. I would certainly not restrict it to "strings".

Then a function adding a key-value pair to the dictionary would take 2 "items" as parameters.
Another function returning the value from the key would for instance take 1 "item" as parameter (the key) and would return the value as a pointer to an "item". A returned NULL would mean no matching key found.
You could also add a function for removing an entry (by key).

The function creating a dictionary may take an optional parameter defining if we want a dynamically resizable one, or a fixed max size (number of entries). Optionally it could also allow the use of a statically allocated dictionary (if we want to avoid dynamic allocation for instance.)

As for the key searching, the exact match could be either implemented with memcmp() (or equivalent), or with an optionally user-defined compare function for cases where a key match would not be strict binary equality of the whole key's data.
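Roughly, the interface described above could look like the following sketch. All names (item, dict, dict_put, dict_get, dict_remove) are hypothetical. To keep it short, this version has a fixed maximum size, does a plain linear scan instead of real hashing, and stores only the item descriptors (pointer and size) without copying the data they point to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* An "item": arbitrary bytes plus a size; used for both keys and values. */
typedef struct {
    const void *data;
    size_t      size;
} item;

/* Optional user-defined key match; returns nonzero when the keys match. */
typedef int (*dict_match_fn)(const item *a, const item *b);

#define DICT_MAX 16   /* fixed maximum number of entries in this sketch */

typedef struct {
    item          keys[DICT_MAX];
    item          values[DICT_MAX];
    size_t        count;
    dict_match_fn match;   /* NULL means strict binary equality */
} dict;

void dict_init(dict *d, dict_match_fn match)
{
    assert(d != NULL);
    d->count = 0;
    d->match = match;
}

/* Default match is memcmp() over the whole key; otherwise ask the user function. */
static int keys_match(const dict *d, const item *a, const item *b)
{
    if (d->match)
        return d->match(a, b);
    return a->size == b->size &&
           (a->size == 0 || memcmp(a->data, b->data, a->size) == 0);
}

/* Index of the entry with a matching key, or d->count if there is none. */
static size_t dict_find(const dict *d, const item *key)
{
    for (size_t i = 0; i < d->count; i++)
        if (keys_match(d, &d->keys[i], key))
            return i;
    return d->count;
}

/* Add a key-value pair, or replace the value of an existing key.
   Returns 0 on success, -1 if the dictionary is full. */
int dict_put(dict *d, item key, item value)
{
    size_t i = dict_find(d, &key);
    if (i == d->count) {
        if (d->count == DICT_MAX)
            return -1;
        d->keys[d->count++] = key;
    }
    d->values[i] = value;
    return 0;
}

/* Returns a pointer to the value item, or NULL if no matching key was found. */
const item *dict_get(const dict *d, item key)
{
    size_t i = dict_find(d, &key);
    return (i < d->count) ? &d->values[i] : NULL;
}

/* Remove an entry by key. Returns 0 on success, -1 if the key was not found. */
int dict_remove(dict *d, item key)
{
    size_t i = dict_find(d, &key);
    if (i == d->count)
        return -1;
    d->count--;
    d->keys[i]   = d->keys[d->count];     /* move the last entry into the gap */
    d->values[i] = d->values[d->count];
    return 0;
}
```

Since the dict structure itself has a fixed size, it can be statically allocated; a resizable variant would need a create function that allocates the arrays instead.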

westfw · « **Reply #52 on:** January 15, 2021, 09:28:44 am »

I would really like a better set of string (text) functions. Maybe more capable of dealing with Unicode than the current stuff, but ... definitely more like the support in other languages.

I'm not sure what that would look like, exactly. One possibility is that strings could have their own garbage-collected memory management, without switching other things away from malloc/free.

Quote

rather than things like uint32_t

I find it a bit depressing how often people have used uint16_t, when what they really should have used was uint_fast16_t. Of course, if you didn't like uint16_t because of readability or typeability issues, you REALLY hate uint_fast16_t :-(

Nominal Animal · « **Reply #53 on:** January 15, 2021, 01:45:08 pm »

Quote from: westfw on January 15, 2021, 09:28:44 am

I would really like a better set of string (text) functions. Maybe more capable of dealing with Unicode than the current stuff, but ... definitely more like the support in other languages.

It is interesting to note that Unicode currently limits code points to 0 through 0x10FFFF, inclusive (1,114,112 code point values), which means that UTF-8 sequences are 1, 2, 3, or 4 bytes long. All newline conventions are either one or two bytes long. Commonly interesting escape/end sequences are two or three bytes long. And so on.

It seems to me that we really need string functions that, instead of single-byte characters, work on characters or character sequences that are 1 to 4 bytes long. This covers not only UTF-8, but other use cases as well. UTF-8 sequences are in many ways even easier, because the initial byte also describes the sequence length; this makes them relatively easy to support automagically when globbing or implementing regular expressions.
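As a small illustration of how the initial byte describes the sequence length: UTF-8 sequences for code points up to 0x10FFFF are one to four bytes long, and the lead byte alone tells which. The function name utf8_seq_len is made up for this sketch, and it only classifies the lead byte; it does not validate the continuation bytes that follow.

```c
#include <assert.h>
#include <stddef.h>

/* Length in bytes (1 to 4) of the UTF-8 sequence that starts with lead
   byte c, or 0 if c cannot start a valid sequence. */
size_t utf8_seq_len(unsigned char c)
{
    if (c < 0x80) return 1;   /* 0xxxxxxx: ASCII */
    if (c < 0xC2) return 0;   /* 10xxxxxx continuation byte, or overlong C0/C1 */
    if (c < 0xE0) return 2;   /* 110xxxxx */
    if (c < 0xF0) return 3;   /* 1110xxxx */
    if (c < 0xF5) return 4;   /* F0..F4: up to U+10FFFF */
    return 0;                 /* F5..FF would encode values above U+10FFFF */
}
```

A globbing or regular expression matcher can use this to step through a pattern or subject one sequence at a time, instead of one byte at a time.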

(I've worked quite a bit with wide character strings and wide I/O, and while they solve the individual character problem, they do not solve combined glyphs nor newline conventions nor escape sequences.)

For operations that are done on a limited-size buffer, the functions need to be able to report the case where the decisive sequence is cut short by the end of the buffer, so that the caller knows to resize/grow/move the buffer.
(So, the equivalent of strnstr() should be able to return one of three results: found here, not found, or cut short by the end of the buffer.)
However, even this is most important for those short sequences, a few characters at most; longer string matching is a much rarer operation, relatively speaking.
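A minimal sketch of such a three-way result, using a naive scan. The names (seq_status, buf_find) are hypothetical, and a real implementation would want something faster than memcmp() at every offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { SEQ_FOUND, SEQ_NOT_FOUND, SEQ_CUT_SHORT } seq_status;

/* Search buf[0..len) for seq[0..seqlen).
   SEQ_FOUND:     *pos is the offset of the first full match.
   SEQ_CUT_SHORT: no full match, but the buffer ends with a proper prefix
                  of seq starting at *pos, so more data is needed to decide.
   SEQ_NOT_FOUND: neither; *pos is left untouched. */
seq_status buf_find(const char *buf, size_t len,
                    const char *seq, size_t seqlen, size_t *pos)
{
    assert(seqlen > 0);
    for (size_t i = 0; i < len; i++) {
        size_t avail = len - i;
        size_t n = (avail < seqlen) ? avail : seqlen;
        /* Near the end of the buffer only the first n bytes can be compared. */
        if (memcmp(buf + i, seq, n) == 0) {
            *pos = i;
            return (n == seqlen) ? SEQ_FOUND : SEQ_CUT_SHORT;
        }
    }
    return SEQ_NOT_FOUND;
}
```

For example, searching "abc\r" for "\r\n" yields SEQ_CUT_SHORT at offset 3, which tells the caller to keep the data from that offset onward and read more before deciding.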

Making the common operations efficient is the key. After that, the rarer operations only need to be non-silly.

Quote from: westfw on January 15, 2021, 09:28:44 am

I'm not sure what that would look like, exactly. One possibility is that strings could have their own garbage-collected memory management, without switching other things away from malloc/free.

Definitely an intriguing option. I also like the underlying idea of modularity.

Perhaps, instead of a monolithic base library, it should be split into a core and optional sub-libraries?

Quote from: westfw on January 15, 2021, 09:28:44 am

I find it a bit depressing how often people have used uint16_t, when what they really should have used was uint_fast16_t. Of course, if you didn't like uint16_t because of readability or typeability issues, you REALLY hate uint_fast16_t :-(

That's exactly the reason I was mulling using u16 for uint_fast16_t and u16e for uint16_t.

Minimum-size fast types should be the most commonly used ones, so why not make them the easiest ones to use? The exact-sized types can then be thought of as stricter type variants (in the logical sense; computationally completely separate types), so a suffix denoting "exactly" seems logical to me.
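In code, the naming idea is just a pair of typedefs. The checksum16() function below is a made-up example of the practical consequence: because the fast type may be wider than 16 bits, any 16-bit wraparound has to be made explicit.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

typedef uint_fast16_t u16;    /* at least 16 bits, whatever is fastest */
typedef uint16_t      u16e;   /* exactly 16 bits */

static_assert(sizeof(u16e) * CHAR_BIT == 16, "u16e is exactly 16 bits");
static_assert(sizeof(u16) >= sizeof(u16e), "u16 is at least as wide as u16e");

/* Sum of 16-bit words, modulo 65536. The accumulator uses the fast type,
   so the wraparound is done with an explicit mask instead of relying on
   the type's width. */
u16e checksum16(const u16e *words, size_t count)
{
    u16 sum = 0;
    for (size_t i = 0; i < count; i++)
        sum = (sum + words[i]) & 0xFFFFu;
    return (u16e)sum;
}
```

The exact-size type appears only where the layout matters (the stored words and the result), while the computation itself uses the fast type.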

