Instead of flat text files, the source code of this language would be better stored as a token stream, along with token-to-text mappings for each language used in that project, say in a zip archive like Microsoft Office OOXML files.
Source code editors would handle lexical analysis, with the compiler, JIT, or interpreter processing the token stream instead.
Each token could be typographical (say, "Newline", "Indentation of one level", "Indentation of two levels"), syntactical (say, the beginning of a quoted string, the end of a quoted string, an object member reference), an operator (addition, subtraction, negation, multiplication, division, assignment, equality comparison), a name, a language keyword, and so on. This would allow one developer to see quoted strings as "Thus" while another sees «Thus» and yet another “Thus”; assignment could be := or = or even equals for each developer working on the same source, depending on personal preference (just by modifying their personal token mapping). The editor would be responsible for ensuring that whatever the developer writes is unambiguously tokenized.
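To make that concrete, here is a minimal Java sketch of one shared token stream being rendered two different ways from two personal token-to-text mappings. Every name in it is hypothetical and purely for illustration; a real editor would of course work with a much richer token set.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one shared token stream, rendered differently
// per developer according to a personal token-to-text mapping.
public class TokenRenderingDemo {

    enum Token { ASSIGN, EQUALS, STRING_BEGIN, STRING_END, NAME_X, INT_1 }

    // Developer A prefers ":=" and guillemets; developer B prefers "=" and straight quotes.
    static final Map<Token, String> DEV_A = Map.of(
            Token.ASSIGN, ":=", Token.EQUALS, "==",
            Token.STRING_BEGIN, "«", Token.STRING_END, "»",
            Token.NAME_X, "x", Token.INT_1, "1");

    static final Map<Token, String> DEV_B = Map.of(
            Token.ASSIGN, "=", Token.EQUALS, "equals",
            Token.STRING_BEGIN, "\"", Token.STRING_END, "\"",
            Token.NAME_X, "x", Token.INT_1, "1");

    // Naively joins tokens with single spaces; real layout handling would be richer.
    static String render(List<Token> stream, Map<Token, String> mapping) {
        StringBuilder out = new StringBuilder();
        for (Token t : stream) out.append(mapping.get(t)).append(' ');
        return out.toString().trim();
    }

    public static void main(String[] args) {
        List<Token> stored = List.of(Token.NAME_X, Token.ASSIGN, Token.INT_1);
        System.out.println(render(stored, DEV_A)); // x := 1
        System.out.println(render(stored, DEV_B)); // x = 1
    }
}
```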
Literal strings themselves could be handled by a similar mapping, although it should probably be kept separate from the developer token mapping, since it could then double as runtime localization (say, for multi-language user interfaces).
One option for these mappings would be to store the source as its Gödel number, assuming each token (and literal string) is assigned a unique natural number.
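For illustration only (the numbers grow enormous, so this is shown to make the idea concrete rather than as a practical storage format): a token sequence with codes c1, c2, …, cn would be encoded as 2^c1 · 3^c2 · 5^c3 · …, i.e. the i-th prime raised to the i-th code.

```java
import java.math.BigInteger;
import java.util.List;

// Illustrative sketch of Gödel-numbering a token sequence: each token has a
// positive integer code, and the stream is encoded as 2^c1 * 3^c2 * 5^c3 * ...
public class GoedelEncoding {

    static BigInteger encode(List<Integer> tokenCodes) {
        BigInteger result = BigInteger.ONE;
        BigInteger prime = BigInteger.TWO;
        for (int code : tokenCodes) {
            result = result.multiply(prime.pow(code));
            prime = prime.nextProbablePrime(); // next prime in the sequence (2, 3, 5, ...)
        }
        return result;
    }

    public static void main(String[] args) {
        // e.g. hypothetical codes: NAME_X = 3, ASSIGN = 1, INT_LITERAL = 7
        System.out.println(encode(List.of(3, 1, 7))); // 2^3 * 3^1 * 5^7 = 1875000
    }
}
```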
For security, the ZIP files could be encrypted with a per-project key pair: the public key used to encrypt the contents, the private key used to decrypt them. Or you could use centralized project key storage. The latter would be very interesting in the business sense: the development of both the editing environment and the toolchain obviously requires resources, so the vendor managing the project keys could ensure licensee validity even while letting toolchain and IDE downloads be freely available.
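In practice that key arrangement would most likely be implemented as hybrid encryption: a random AES key protects the archive bytes, and the project's RSA public key wraps that AES key. A hedged sketch using only standard JDK crypto, not tied to any particular product:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.SecureRandom;

// Sketch of hybrid encryption for a project archive: AES-GCM for the bulk
// data, RSA-OAEP to wrap the AES key with the per-project public key.
public class ProjectArchiveCrypto {

    public static void main(String[] args) throws Exception {
        byte[] archiveBytes = "zip archive contents".getBytes(StandardCharsets.UTF_8);

        // Per-project key pair: public key distributed, private key held by the key manager.
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair projectKeys = kpg.generateKeyPair();

        // A fresh AES session key encrypts the archive itself.
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256);
        SecretKey aesKey = kg.generateKey();
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);

        Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
        aes.init(Cipher.ENCRYPT_MODE, aesKey, new GCMParameterSpec(128, iv));
        byte[] encryptedArchive = aes.doFinal(archiveBytes);

        // The project public key wraps the AES key; only the private key can unwrap it.
        Cipher rsa = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
        rsa.init(Cipher.WRAP_MODE, projectKeys.getPublic());
        byte[] wrappedKey = rsa.wrap(aesKey);

        System.out.println("archive bytes: " + encryptedArchive.length
                + ", wrapped key bytes: " + wrappedKey.length);
    }
}
```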
Quite true; this has crossed my mind too. But then we would not be able to peruse raw text files; we'd need tooling to replace the abstract token codes with some real human vocabulary.
How else would multicultural and multilingual development teams cooperate?
Consider this the programming equivalent of personal pronouns: each subgroup gets to define how the code looks to them, without any oppressor forcing them to use a specific form. I'm quite sure this is the future of socially aware software development.
(This might also open up interesting possibilities for funding the development of such a programming language.)
Yes, good questions. I don't know exactly how a team might choose to work, so let me elaborate on what I've been doing and then try to answer you.
At this stage I've shown that a self-consistent, unambiguous grammar can be devised that is insensitive to the exact spelling of its keywords. I wasn't 100% sure of that, but I suspected it was possible for a grammar that has no reserved words (like PL/I, Fortran, etc.). I also wasn't sure whether current parser tools could do what I needed; I've used them before, and they can block progress once you try certain ideas in a grammar. Hand-crafted lexers and parsers are possible too (I've written them before), but they make experiments and changes very hard and very slow going, so slow that one ends up not experimenting much.
Anyway, Antlr has exceeded my highest expectations. It is truly very powerful, far beyond anything I could craft by hand: in a couple of seconds I can tweak a grammar rule, regenerate the parser source code, and test it. Very powerful indeed.
At parse time I specify the keyword lexicon code, "en" (English), "fr" (French), etc., and the generated lexer knows what the keywords should be for that language code. The parser code is agnostic; it has no knowledge of keyword spelling at all, other than in the abstract sense.
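The grammar itself lives in Antlr, but the underlying idea can be sketched in plain Java (the names and the French spellings below are invented for illustration, not the project's actual lexicon): keywords are abstract token kinds, and a lexicon selected by language code supplies the concrete spellings, so the parser only ever sees the abstract kinds.

```java
import java.util.Map;

// Illustrative sketch (not the Antlr-generated code): keywords are abstract
// token kinds; a lexicon chosen by language code ("en", "fr", ...) maps
// concrete spellings to those kinds, so the parser never sees spellings.
public class KeywordLexicon {

    enum Keyword { IF, ELSE, WHILE, RETURN }

    static final Map<String, Map<String, Keyword>> LEXICONS = Map.of(
            "en", Map.of("if", Keyword.IF, "else", Keyword.ELSE,
                         "while", Keyword.WHILE, "return", Keyword.RETURN),
            "fr", Map.of("si", Keyword.IF, "sinon", Keyword.ELSE,
                         "tantque", Keyword.WHILE, "retourner", Keyword.RETURN));

    // Returns null when the spelling is not a keyword in that lexicon
    // (i.e. it is just an ordinary identifier).
    static Keyword lookup(String lexiconCode, String spelling) {
        return LEXICONS.getOrDefault(lexiconCode, Map.of()).get(spelling);
    }

    public static void main(String[] args) {
        System.out.println(lookup("en", "while"));   // WHILE
        System.out.println(lookup("fr", "tantque")); // WHILE
        System.out.println(lookup("fr", "while"));   // null
    }
}
```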
So, how to use this?
Well, the language code could be a compiler option, a preprocessor setting within a source file, or inferred from the name of the file; there are several ways one could do that, none of them a huge effort (a filename-based version is sketched below).
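For instance, inferring the lexicon from a name like "test_1_abc.ru.ipl" could be as simple as this sketch; the actual naming convention is undecided, so treat it purely as an illustration.

```java
// Sketch: infer the keyword lexicon from a file name such as
// "test_1_abc.ru.ipl" -> "ru", falling back to a default when the
// inner suffix is absent ("test_1_abc.ipl" -> default).
public class LexiconFromFileName {

    static String inferLexicon(String fileName, String defaultLexicon) {
        String[] parts = fileName.split("\\.");
        // name . lexicon . ipl => at least three parts, lexicon is second-to-last
        if (parts.length >= 3 && parts[parts.length - 2].length() == 2) {
            return parts[parts.length - 2];
        }
        return defaultLexicon;
    }

    public static void main(String[] args) {
        System.out.println(inferLexicon("test_1_abc.ru.ipl", "en")); // ru
        System.out.println(inferLexicon("test_1_abc.ipl", "en"));    // en
    }
}
```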
Also, the parser code is a class library (Java or C#), so it can be used as part of a compiler or within another tool. For example, I'm looking at a simple command-line tool that can consume a source file in one language and create an output in another. That is not very hard to do; yes, there's code involved in recreating a text file from the parse tree (spaces, comments, line endings, etc.), but that is, in principle, not much of a problem. I asked the Antlr team about it; I've not looked at it in earnest yet, but I might do soon.
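Reduced to its essential idea, the heart of such a tool would look something like the sketch below. This is only a stand-in: the real version would drive the Antlr parse tree and preserve layout exactly, whereas here a flat token list and a hand-made keyword map take its place.

```java
import java.util.List;
import java.util.Map;

// Sketch of a "convert source from one lexicon to another" core: walk the
// tokens, swap keyword spellings via a source-to-target map, and pass
// everything else (identifiers, literals, comments, whitespace) through.
public class LexiconTranslator {

    record Tok(String text, boolean isKeyword) {}

    static String translate(List<Tok> tokens, Map<String, String> sourceToTarget) {
        StringBuilder out = new StringBuilder();
        for (Tok t : tokens) {
            out.append(t.isKeyword()
                    ? sourceToTarget.getOrDefault(t.text(), t.text())
                    : t.text());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "while" (en) -> "tantque" (illustrative fr); spacing and identifiers survive as-is.
        Map<String, String> enToFr = Map.of("while", "tantque", "if", "si");
        List<Tok> tokens = List.of(
                new Tok("while", true), new Tok(" ", false),
                new Tok("count", false), new Tok(" > 0", false));
        System.out.println(translate(tokens, enToFr)); // tantque count > 0
    }
}
```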
Anyway, it should be easy to build such power into an IDE or editor. There are numerous editors that support all kinds of extensibility (VS Code is very good), so one could just open and edit a *.ipl file, click a "Convert to..." dropdown, choose some target, and the tool would instantly refresh the file with the source code in that chosen language.
Or, just as easily, we can envisage "Save As...": we could save "test_1_abc.ipl" as (say) "test_1_abc.ru.ipl" (or any name, really) and specify "Russian" as the keyword lexicon, and the tool would regenerate the file being saved into its Russian equivalent using the same mechanism I describe above.
Or "Open As" could open a file in whatever language, into whatever language one wanted to edit the file in!
I've also been thinking about how to detect the lexicon used in a source file; there are ways to do this, and one simple approach is sketched below.
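One simple approach (again only a sketch, with invented keyword lists): score the file's words against each known lexicon's keyword set and pick the best match.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;

// Sketch of lexicon detection: count how many words in the source appear in
// each lexicon's keyword set and choose the lexicon with the highest score.
public class LexiconDetector {

    static final Map<String, Set<String>> KEYWORDS = Map.of(
            "en", Set.of("if", "else", "while", "return"),
            "fr", Set.of("si", "sinon", "tantque", "retourner"));

    static String detect(String source) {
        String best = "en";
        long bestScore = -1;
        for (var entry : KEYWORDS.entrySet()) {
            long score = Arrays.stream(source.split("\\W+"))
                    .filter(entry.getValue()::contains)
                    .count();
            if (score > bestScore) { bestScore = score; best = entry.getKey(); }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(detect("si x > 0 retourner x sinon retourner 0")); // fr
    }
}
```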
These ideas mean we could live in a world where we open any file in any language, see it only in our chosen language, and have it saved back in its original language when we save, all invisibly. These are all serious possibilities.
These and other reasons are why I've been focused on the grammar. One cannot start to implement semantic processing (and, really, code generation) until one commits to a grammar, and once that code is written it is very hard indeed to go back and adjust the grammar without complex rework on the semantic processor and so on. The goal has been to get to a grammar that can support the essential features, then start designing the semantic processing, and then the rest of the parts in a compiler's back end.
The back end is, to all intents and purposes, a relatively routine phase of a project like this; back ends are (largely) decoupled from the front-end grammar, but we do need the middle: the semantic phase, optimizers, etc.
Antlr does not generate (or assist with the generation of) an abstract syntax tree; that code is an essential part of a full compiler. As you can appreciate, the AST generator must consume the parse tree, so if the grammar had to change much, the parse tree would change too, and then we'd have to rework the AST generator along with any code we'd written that consumed that AST: lots of needless work and wasted time.
These and other real-world issues are being glossed over by some of the naïve detractors posting in this thread. They asked several times "why are you fixated on syntax rather than the nitty-gritty compiler and code generator?" I'm afraid such a question only reveals their naivety about real-world, "gloves off" compiler design.
Just to stress: the multiple keyword lexicon was not initially on my list of goals for the grammar; I only added it after realizing it was possible, with very little effort, when using powerful tooling like Antlr. An eventual compiler would let one work wholly in a single language, seamlessly, if that's all one wanted to do. There's no impact on simple, basic use from having this multi-language feature; if one doesn't care for it, one can simply disregard it.