If you are not aware of the hungry judge issue, look it up, and see if you can come back and repeat what you wrote with a straight face.
I am well aware that people often exhibit less than commendable behaviour.
ML systems work by copying existing, often suboptimal, behaviour.
You cannot instruct an ML to "do the same as is always done, except in special case X do Y", because X and Y are not stored anywhere you can point at; they are smeared across millions of trimpots.
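To make that concrete, here is a toy sketch of my own (Python plus numpy, no relation to any production system): train the smallest network that can do XOR, then try to "edit" it one trimpot at a time. There is no single weight whose adjustment cleanly changes one case and leaves the rest alone.

# Toy illustration: the learned behaviour is smeared across all the
# parameters, even in a network this small.
import numpy as np

rng = np.random.default_rng(0)

# XOR: the simplest function a single linear unit cannot represent.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer, sigmoid activations, plain gradient descent.
W1 = rng.normal(size=(2, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(20000):
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backpropagate squared error.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out
    b2 -= 0.5 * d_out.sum(axis=0, keepdims=True)
    W1 -= 0.5 * X.T @ d_h
    b1 -= 0.5 * d_h.sum(axis=0, keepdims=True)

print("trained outputs:", out.ravel().round(2))

# Now "edit" the network: zero one weight at a time and see what happens.
# No weight's removal cleanly deletes one case and leaves the rest intact;
# the rule lives in all of them at once.
for i in range(W1.shape[0]):
    for j in range(W1.shape[1]):
        saved = W1[i, j]
        W1[i, j] = 0.0
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        print(f"zeroing W1[{i},{j}]: outputs {out.ravel().round(2)}")
        W1[i, j] = saved

Scale that up by eight or nine orders of magnitude and "just patch the special case" stops being a meaningful instruction.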
The hungry judge study is really interesting. Judges adjudicating parole hearings were monitored for a wide range of things that might influence their decision making. The one factor that stood head and shoulders above all others was how long it had been since the judge last ate. I think the researchers were looking beyond actual case factors, for issues like how fresh or tired the judge was, but they finally realised the big deal was stomach contents. So, you could ask the judge how they came to their decision, and they could give you a nice balanced view of the factors they took into account, yet they would miss the tank in the night and day... I mean, just how sated they were at the time.
Back in the 80s there were many similar examples found while eliciting knowledge from medical experts. There were many reasons, e.g. experts not realising all the clues they were using when making rules, forgetting edge cases, and symptoms with causes that might lie inside or outside the medical sub-domain being investigated.
Fundamentally, knowledge elicitation is an art, an imperfect art.
I don't know what mechanisms they use, but when current AI systems don't give the results people want, whether being inaccurate or being politically unacceptable, their operators seem to be able to cook the system very quickly to heavily nudge the outcomes. We saw this with Google's recent mess, where every famous historical white figure was rendered as some weird slightly Asian, slightly African, slightly Middle Eastern, slightly Native American unisex figure. They responded to that quickly, if incompletely.
"Responded" is, of course, insufficient. There were famous problems with gorillas being included in the results; oops. Updates were confidently issued - and new failures were almost instantly discovered. Rinse and repeat several times.
That is exactly what you would expect from suck-it-and-see "fixes", where nobody can know what the changes really do, nor what results will spring from them.
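For flavour, a purely speculative sketch (mine; nobody outside Google knows the real mechanism, and every name below is invented) of the cheapest kind of "fix": suppress the embarrassing output instead of correcting the model. The one visible failure disappears; the behaviour that caused it does not, so the next failure arrives as a surprise.

# Hypothetical output-level patch; model_label stands in for an opaque
# classifier we cannot inspect or retrain.
BLOCKLIST = {"gorilla", "chimpanzee"}  # hypothetical suppressed labels

def model_label(image_id: str) -> str:
    # Fake outputs standing in for the real model's predictions.
    fake_outputs = {"img1": "cat", "img2": "gorilla", "img3": "dog"}
    return fake_outputs.get(image_id, "unknown")

def patched_label(image_id: str) -> str:
    label = model_label(image_id)
    # The "fix": hide the label rather than correct the model.
    return "unknown" if label in BLOCKLIST else label

for img in ("img1", "img2", "img3"):
    print(img, "->", patched_label(img))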
On the other hand, it's several years since Google was ridiculed for searches like "white couples" producing only results for couples who were not white. They fixed that for the obvious cases, but I still see searches like "white plastic sheeting" giving me results for people selling every kind of plastic except white. Google's search engine is really badly broken, and we don't really know what it consists of in 2024. Google have made a lot of TPUs, though, so I guess there is a big ML aspect to their current system.
If you can't predict how something will behave, then any time it gives the expected result, that has to be chance.
I noticed some videos responding to a recent paper from Apple about GSM8K (a grade-school maths benchmark) results from current AI systems. These systems have been achieving high marks, but Apple showed the systems have no analytical ability. If you take the standard GSM8K questions and modify them a little, major things happen to the accuracy of the results. Simply changing the names in questions of the "Jane does X and Fred does Y" type substantially reduces the accuracy of all the AI results. Presumably these questions, in some form, plus lots of commentary about the GSM8K tests, are among the training data.

When the researchers changed questions to add some irrelevant detail, which even a low-ability human would easily ignore, the AIs took that detail into account and produced strange results. So, they found no sense of any understanding, merely pattern matching. The researchers seemed shocked to find this. An elaborate pattern matching machine pattern matches quite well, but isn't sophisticated enough to fully generalise the patterns? Whoda thought?
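To show the flavour of the perturbation, here is my own toy version (not Apple's actual test harness): take a GSM-style template, swap the names and numbers, and optionally splice in an irrelevant clause. A solver that actually understood the arithmetic would be unaffected by any of it.

# Toy GSM-style question perturbation; the template, names and "no-op"
# clauses below are all invented for illustration.
import random

TEMPLATE = ("{name1} picks {n1} apples. {name2} picks {n2} more apples "
            "than {name1}.{noop} How many apples do they have together?")

NAMES = ["Jane", "Fred", "Priya", "Carlos", "Mei", "Olu"]
# "No-op" clauses: true-sounding but irrelevant to the arithmetic.
NOOPS = ["", " Five of the apples are slightly smaller than the rest.",
         " It was a cold Tuesday in October."]

def make_variant(rng: random.Random) -> tuple[str, int]:
    """Return a perturbed question and its correct answer."""
    name1, name2 = rng.sample(NAMES, 2)
    n1 = rng.randint(2, 50)
    n2 = rng.randint(2, 50)
    question = TEMPLATE.format(name1=name1, name2=name2,
                               n1=n1, n2=n2, noop=rng.choice(NOOPS))
    answer = n1 + (n1 + n2)  # name1's apples plus name2's (n1 + n2)
    return question, answer

rng = random.Random(42)
for _ in range(3):
    q, a = make_variant(rng)
    print(q, "->", a)

If accuracy collapses when only the names or the padding change, the system was matching the surface form of the question, not doing the sum.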
Do read comp.risks; it is one of the very few sources I recommend unreservedly to everybody.
http://catless.ncl.ac.uk/risks
Since it comes out infrequently (<1/week), I use the RSS feed
http://catless.ncl.ac.uk/risksrss2.xml
to catch the new infelicities.
So, I guess I have to apologise to tggzzz. I thought people had got past debacles like the "find the tanks among the trees" and similar messes of the 1980s. I guess a new generation of researchers and developers need to learn the old lessons again.
Unfortunately that is the case.
People don't change. Young people continue to think their forebears know nothing. The far-too-old quote: "When I was 14 I thought my father was a fool. When I was 21 I was amazed at how much he had learned in the past 7 years".
My simple attitude:
- technology X has fundamental limitation Y
- later on, technology X' is promoted as being better
- ask how it avoids Y
- salesman will normally assert X' doesn't have Y (for software, typically respond with "oh excellent, the Byzantine Generals' problem / split-brain problem / etc. has been solved at last; where can I see the published results?")
- ask how and why not
- the lack of an answer almost always reveals that X->Y has been ignored or forgotten
Works with hardware, works with programming languages, software products, software environments; works with politics, ...