Author Topic: AI parsing data sheets  (Read 1907 times)


Offline AndyC_772Topic starter

  • Super Contributor
  • ***
  • Posts: 4281
  • Country: gb
  • Professional design engineer
    • Cawte Engineering | Reliable Electronics
AI parsing data sheets
« on: September 16, 2024, 04:09:16 pm »
Has anyone had any success getting any real help from AI to read, parse, and answer questions about parts from their data sheets?

Given how relatively proficient ChatGPT seems to be at coding, I had reasonably high expectations that it might also be able to answer questions about components. In this instance, I found two ADCs from competing manufacturers which have completely different part numbers but actually look as though one may have been designed as a pin-compatible alternative to the other. They both come in the same package, have similar headline specs, and although many pins have different names (e.g. "VCC" vs "AVDD"), their functions seem similar. Certainly there are too many similarities to be a coincidence.

This seemed like a good opportunity to get some help from AI: ask it what the functional differences are, and whether the parts are in fact likely to be compatible.

Of course I can't trust the answers without checking, but if it notices that (say) one has a particular feature that the other lacks, that at least is something I can quickly look for in both data sheets. I often seem to find myself trying to compare and contrast similar parts, so any help in spotting differences is welcome.

To say I was unimpressed would be an understatement. ChatGPT (-o1 preview) failed to correctly identify the number of inputs or even the number of ADC cores in each part. Features clearly common to both were missed in one device, and highlighted as a point of difference. Each time I said "no, try again" it would apologise, correct its mistake, then make a bunch of new mistakes next time around. Stupid, obvious errors, like "pin X is called NNNN", where it just isn't.

In the end it was completely useless, and I gave up - but I do think this is an area where AI might one day help and save time.

Has anyone found an AI tool that does a better job of "understanding" data sheets and answering questions about them?

Offline tom66

  • Super Contributor
  • ***
  • Posts: 7020
  • Country: gb
  • Electronics Hobbyist & FPGA/Embedded Systems EE
Re: AI parsing data sheets
« Reply #1 on: September 16, 2024, 04:56:13 pm »
A while ago I attempted to use it to find pin-compatible alternative voltage regulators, but it just barfed some made up part numbers at me.  Unfortunately, at the edge of its training data, it's quite likely a lot of the output is garbage.  I can imagine an LLM would be quite good at this if trained specifically on electronics engineering datasets.
 

Offline kripton2035

  • Super Contributor
  • ***
  • Posts: 2683
  • Country: fr
    • kripton2035 schematics repository
Re: AI parsing data sheets
« Reply #2 on: September 16, 2024, 05:01:22 pm »
this one : https://docs.flux.ai/
claims to have been trained to do such a job of reading datasheets, but I've never been able to make it work correctly.
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 20650
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: AI parsing data sheets
« Reply #3 on: September 16, 2024, 05:28:41 pm »
Has anyone had any success getting any real help from AI to read, parse, and answer questions about parts from their data sheets?

Realise that LLMs are a mechanism for producing grammatically correct sentences of plausible words. They have zero understanding of the words and concepts.

There's a name for humans that manage to do that: bullshitters.

Is it any surprise that managers think LLMs work well?
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 
The following users thanked this post: amyk, tooki, 5U4GB

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15339
  • Country: fr
Re: AI parsing data sheets
« Reply #4 on: September 16, 2024, 07:38:22 pm »
But the marketing is apparently really good.
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 8408
Re: AI parsing data sheets
« Reply #5 on: September 17, 2024, 12:34:13 am »
Given that https://www.eevblog.com/forum/chatgptai/ai-generated-lies-in-datasheet-search-results/ has been all I've seen AI do with datasheets, I'm not surprised.
 

Offline Berni

  • Super Contributor
  • ***
  • Posts: 5026
  • Country: si
Re: AI parsing data sheets
« Reply #6 on: September 17, 2024, 05:21:39 am »
this one : https://docs.flux.ai/
claims to have been trained to do such a job of reading datasheets, but I've never been able to make it work correctly.

They are likely using OpenAI's GPT models via the API or something similar.

Just blindly asking these general-purpose LLM AIs about very specific things like chip part numbers or specs is a bad idea, because when they talk about a niche topic they hallucinate like there is no tomorrow, making up stuff as they go rather than saying that they don't know. To make things worse, they put the made-up information into nice, confident sentences that make it look like they know what they are talking about.

Feeding it a PDF of a datasheet likely works fairly well, since today's LLMs have enough context length to hold even a fairly sizable PDF in memory. Though most of the time Ctrl+F will get you the answer sooner in a datasheet, and you also have to understand that the AI doesn't read a PDF like a human does, so depending on the internal PDF structure it might not be able to see some diagrams, or might see them wrongly.
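As a rough sketch of what that kind of integration usually boils down to (purely illustrative - it assumes the third-party pypdf and openai Python packages and an API key in the environment; the model name, file name and question are placeholders, not anything flux.ai actually does):

Code: [Select]
# Minimal sketch: extract the text from a datasheet PDF and ask an LLM about it.
# Assumes `pip install pypdf openai` and an OPENAI_API_KEY environment variable.
from pypdf import PdfReader
from openai import OpenAI

def ask_about_datasheet(pdf_path: str, question: str, model: str = "gpt-4o") -> str:
    # Pull the raw text out of every page; diagrams and scanned pages are lost here,
    # which is exactly the "doesn't read a PDF like a human" caveat above.
    text = "\n".join(page.extract_text() or "" for page in PdfReader(pdf_path).pages)
    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "Answer only from the provided datasheet text. "
                        "If the answer is not in the text, say you don't know."},
            {"role": "user",
             "content": f"Datasheet text:\n{text}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

# Hypothetical usage:
# print(ask_about_datasheet("adc_datasheet.pdf", "How many ADC cores does this part have?"))

The system prompt telling it to answer only from the supplied text helps a little with the hallucination problem, but it is no guarantee.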
 

Offline ebastler

  • Super Contributor
  • ***
  • Posts: 6984
  • Country: de
Re: AI parsing data sheets
« Reply #7 on: September 17, 2024, 06:07:29 am »
Realise that LLMs are a mechanism for producing grammatically correct sentences of plausible words. They have zero understanding of the words and concepts.

That's a description on the same level as "Computers are really simple and really dumb. All they do is manipulate 0s and 1s." It's true on some level, but it leads you to totally underestimate the capabilities that can be obtained by piling on a few layers of abstraction and complexity.

(That computer quote was actually pretty commonly heard when computers became more visible in the 1970s. Typically used by people who were slightly nervous about them and had a very limited idea what they were and how they worked.)
 
The following users thanked this post: tom66

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 20650
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: AI parsing data sheets
« Reply #8 on: September 17, 2024, 07:49:41 am »
Realise that LLMs are a mechanism for producing grammatically correct sentences of plausible words. They have zero understanding of the words and concepts.

That's a description on the same level as "Computers are really simple and really dumb. All they do is manipulate 0s and 1s." It's true on some level, but it leads you to totally underestimate the capabilities that can be obtained by piling on a few layers of abstraction and complexity.

(That computer quote was actually pretty commonly heard when computers became more visible in the 1970s. Typically used by people who were slightly nervous about them and had a very limited idea what they were and how they worked.)

With computers you could always work out why they produced the result they did[1], and the limits within which the results were valid. With LLMs that is, ahem, "an active research topic" with no resolution in sight.

If you read comp.risks (which everybody should), real-world examples abound, from sentencing/custody decisions in the US to things which appear to "work" for spurious "reasons": https://catless.ncl.ac.uk/Risks/32/80/#subj4.1
  "Some AIs were found to be picking up on the *text font* that certain
  hospitals used to label the scans. As a result, fonts from hospitals with
  more serious caseloads became predictors of covid risk."

That's a recent version of an old, old problem. Igor Aleksander's 1983 WISARD, effectively the forerunner of today's LLMs, demonstrated a key property of modern LLMs: you didn't and couldn't predict/understand the result it would produce. WISARD correctly distinguished between cars and tanks in the lab, but failed dismally when taken to Lüneburg Heath in north Germany. Eventually they worked out the training set was tanks under grey skies and car adverts under sunny skies.

As for the "edge of the envelope" problem, consider that there are documented examples where changing one pixel in a photo of a road "stop" sign caused the ML system to interpret it as a "40 MPH" sign. The corollary is that if a bugfix is issued for that, there is no way of knowing whether it will cause a "slow" sign to be misinterpreted as a "50 mph" sign.
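To see that fragility in miniature, here is a toy sketch (plain Python/NumPy, a made-up two-class linear classifier - nothing to do with the actual documented stop-sign systems, where the perturbations are far subtler): changing a single input value is enough to flip the decision.

Code: [Select]
# Toy illustration only: a random linear classifier whose decision flips when one
# "pixel" is changed. Real adversarial examples on deep nets need far smaller changes.
import numpy as np

rng = np.random.default_rng(0)
labels = ["stop", "40 mph"]
weights = rng.standard_normal((2, 100))    # 2 classes, 100-"pixel" image
image = rng.random(100)

def predict(img):
    return labels[int((weights @ img).argmax())]

scores = weights @ image
current = int(scores.argmax())
delta = weights[1 - current] - weights[current]   # per-pixel influence on the margin
pixel = int(np.abs(delta).argmax())               # single most influential pixel
margin = scores[current] - scores[1 - current]
step = np.sign(delta[pixel]) * (margin / abs(delta[pixel]) + 1e-3)

perturbed = image.copy()
perturbed[pixel] += step                          # change exactly one pixel
print(predict(image), "->", predict(perturbed))   # the predicted label flips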


[1] That's less true now, because they have piled on so many layers of abstraction and complexity that not many people comprehend them all. Look at all the synchronous and asynchronous protocols layered on top of each other in a typical enterprise/telecom system. Too many practitioners think FSMs == parsers, and believe the salesmen's (implicit) claims that their software avoids the Byzantine Generals problem. Except that either they don't know that problem, or they have forgotten it and don't relate it to their distributed systems.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 
The following users thanked this post: tooki

Offline ebastler

  • Super Contributor
  • ***
  • Posts: 6984
  • Country: de
Re: AI parsing data sheets
« Reply #9 on: September 17, 2024, 08:12:43 am »
With computers you could always work out why they produced the result they did[1], and the limits within which the results were valid. With LLMs that is, ahem, "an active research topic" with no resolution in sight.

That is a very different concern from your prior over-simplification that "LLMs just produce grammatically correct sentences with plausible words." Which is all I had commented on.

Quote
If you read comp.risks (which everybody should), real-world examples abound, from sentencing/custody decisions in the US to things which appear to "work" for spurious "reasons": https://catless.ncl.ac.uk/Risks/32/80/#subj4.1
  "Some AIs were found to be picking up on the *text font* that certain
  hospitals used to label the scans. As a result, fonts from hospitals with
  more serious caseloads became predictors of covid risk."

That particular anecdote is a quote from a "quote" from a Technology Review article. The Driggs paper which Technology Review refers to does not seem to deal with LLMs at all, and it certainly does not contain the word "font". Which confirms my bias that Technology Review is just a collection of grammatically correct sentences with plausible words...  ::)
 

Offline AndyC_772Topic starter

  • Super Contributor
  • ***
  • Posts: 4281
  • Country: gb
  • Professional design engineer
    • Cawte Engineering | Reliable Electronics
Re: AI parsing data sheets
« Reply #10 on: September 17, 2024, 09:00:21 am »
Realise that LLMs are a mechanism for producing grammatically correct sentences of plausible words. They have zero understanding of the words and concepts.

I'm going to go out on a limb here and suggest you've not spent any time with a recent paid-for ChatGPT interface.

If you get the chance - and I strongly recommend subscribing for a month just so you can - ask it to explain a technical topic with which you're already somewhat familiar. Ask it to write some code in your preferred language to carry out a useful function, or upload code of your own and ask it to check for errors. I think you'll very soon realise it's far more capable than just a package that strings together plausible sounding words on a statistically likely basis.

The latest -o1 preview build is specifically crafted to follow chains of logical reasoning. It even describes each step as it works toward composing a response.
 
The following users thanked this post: bookaboo, thm_w, ebastler

Offline tom66

  • Super Contributor
  • ***
  • Posts: 7020
  • Country: gb
  • Electronics Hobbyist & FPGA/Embedded Systems EE
Re: AI parsing data sheets
« Reply #11 on: September 17, 2024, 09:07:55 am »
That's a recent version of an old, old problem. Igor Aleksander's 1983 WISARD, effectively the forerunner of today's LLMs, demonstrated a key property of modern LLMs: you didn't and couldn't predict/understand the result it would produce. WISARD correctly distinguished between cars and tanks in the lab, but failed dismally when taken to Lüneburg Heath in north Germany. Eventually they worked out the training set was tanks under grey skies and car adverts under sunny skies.

"AI was a bit shit in the 80s, so of course in 2024 it will still be shit."

Yes, there are problems with LLMs & NN image processing, but that does not mean that they are useless. Your tank example is a well-known problem in model training, where the model is overconstrained and the training data is too small. ChatGPT has been trained on some 40 terabytes of input data, with man-years of human filtering applied afterwards.

And, anecdotally, I've found that over the past two years it's become increasingly difficult to get ChatGPT to output bullshit.  It either says "I don't know the answer" or it answers. 

 

Offline Berni

  • Super Contributor
  • ***
  • Posts: 5026
  • Country: si
Re: AI parsing data sheets
« Reply #12 on: September 17, 2024, 10:51:14 am »
Even the GPT4o you get from a free ChatGPT account is impressively smart.

Yes, deep down these kinds of AIs are indeed glorified statistics. The actual technology behind this AI is in figuring out a computationally more efficient way to calculate these statistics, since there is no way you can store the probability of a word for all possible combinations of 10,000+ words. So a giant neural network is trained to output something as close as possible to the true word probabilities.
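To make the "glorified statistics" point concrete, here is a toy sketch (Python/NumPy, made-up numbers, nothing to do with real GPT internals): a brute-force lookup table of next-word probabilities is impossibly large, so the network instead produces one score per word and a softmax turns those scores into a probability distribution to sample from.

Code: [Select]
# Toy illustration of why the probabilities have to be computed, not stored.
import numpy as np

vocab_size = 10_000
context_length = 50
# A lookup table with one row per possible context would need
# vocab_size ** context_length entries -- hopelessly large:
print(f"table rows needed: {vocab_size ** context_length:.3e}")   # ~1e200

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

logits = np.random.randn(vocab_size)                # stand-in for the network's output scores
probs = softmax(logits)                             # next-word probabilities, sums to 1
next_word = np.random.choice(vocab_size, p=probs)   # sample the next token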

The intelligence of LLMs is an emergent behavior (like the patterns in flocks of birds), where simple actions come together to express a complex behavior. The neural net inside the LLM learns to encode various high-level concepts inside the network because that is the only way it can squeeze them in; there are not enough neurons to store everything brute force as a giant 'lookup table'.

You can make the same point with a biological brain. It is just a pile of neurons that fire off signals depending on their inputs; there is no real intelligence inside a neuron, not even in groups of neurons. But when you put a huge number of them together and teach them how to react to certain stimuli, the whole thing starts acting as something that appears to be intelligent. If the pile of biological neurons is trained badly, it also performs badly.

And yes, an AI like this is only as good as its training data. For common things, where a lot of training data could be scraped from the internet, books, etc., the AI performs remarkably well. For things where it did not have enough training data it will perform poorly. You need a LOT of data for a good training dataset. Even more importantly, you need high-quality training data, so that there are enough properly varied examples to isolate the concept of interest and there are no large biases. There also have to be intentional examples of garbage in the training data, so that it can learn to ignore the garbage.

The fact of the matter is that LLM AIs have enabled computers to do tasks they could never do before. If you train an AI for a task properly and only use it for that task, it performs great. The problem is that this new AI stuff has picked up a massive amount of hype, so now everyone is forcefully shoehorning the new-fangled AI tech into absolutely everything, regardless of whether it makes sense to do so.
 

Online daqq

  • Super Contributor
  • ***
  • Posts: 2314
  • Country: sk
    • My site
Re: AI parsing data sheets
« Reply #13 on: September 17, 2024, 04:51:59 pm »
I tried it on a Chinese datasheet for an LED driver - I was actually pleasantly surprised by the results. It probably wouldn't be very useful for an MCU, but it worked great for the SM16306SJ:

https://chatgpt.com/share/66e9b300-d3e0-8002-bf35-3a4c456b2a20
Believe it or not, pointy haired people do exist!
+++Divide By Cucumber Error. Please Reinstall Universe And Reboot +++
 
The following users thanked this post: thm_w

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 20650
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: AI parsing data sheets
« Reply #14 on: September 17, 2024, 07:31:37 pm »
That's a recent version of an old, old problem. Igor Aleksander's 1983 WISARD, effectively the forerunner of today's LLMs, demonstrated a key property of modern LLMs: you didn't and couldn't predict/understand the result it would produce. WISARD correctly distinguished between cars and tanks in the lab, but failed dismally when taken to Lüneburg Heath in north Germany. Eventually they worked out the training set was tanks under grey skies and car adverts under sunny skies.

"AI was a bit shit in the 80s, so of course in 2024 it will still be shit."

Er, no.

AI in the 1980s was a set of rules, based on knowledge elicited from experts. It was then easy to see which rules caused the results, and why.

WISARD was not like that; it was unique.

In the 2020s ML is shit for different reasons, as foreshadowed by WISARD. In particular it is impossible to know the chain that caused the results, because there are no explicit rules.

Quote
Yes, there are problems with LLMs & NN image processing, but that does not mean that they are useless. Your tank example is a well-known problem in model training, where the model is overconstrained and the training data is too small. ChatGPT has been trained on some 40 terabytes of input data, with man-years of human filtering applied afterwards.

That merely makes the problems more subtle.

Never forget Tony Hoare's wise words from his 1980 Turing Award Lecture[1]; Communications of the ACM 24 (2), (February 1981): pp. 75-83.
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.

Quote
And, anecdotally, I've found that over the past two years it's become increasingly difficult to get ChatGPT to output bullshit.  It either says "I don't know the answer" or it answers.

Are you one of those people who believes a piece of software works because it passes the unit tests?
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 20650
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: AI parsing data sheets
« Reply #15 on: September 17, 2024, 07:49:19 pm »
Even the GPT4o you get from a free ChatGPT account is impressively smart.

Yes, deep down these kinds of AIs are indeed glorified statistics. The actual technology behind this AI is in figuring out a computationally more efficient way to calculate these statistics, since there is no way you can store the probability of a word for all possible combinations of 10,000+ words. So a giant neural network is trained to output something as close as possible to the true word probabilities.

The intelligence of LLMs is an emergent behavior (like the patterns in flocks of birds), where simple actions come together to express a complex behavior. The neural net inside the LLM learns to encode various high-level concepts inside the network because that is the only way it can squeeze them in; there are not enough neurons to store everything brute force as a giant 'lookup table'.

You can make the same point with a biological brain. It is just a pile of neurons that fire off signals depending on their inputs; there is no real intelligence inside a neuron, not even in groups of neurons. But when you put a huge number of them together and teach them how to react to certain stimuli, the whole thing starts acting as something that appears to be intelligent. If the pile of biological neurons is trained badly, it also performs badly.

And yes, an AI like this is only as good as its training data. For common things, where a lot of training data could be scraped from the internet, books, etc., the AI performs remarkably well. For things where it did not have enough training data it will perform poorly. You need a LOT of data for a good training dataset. Even more importantly, you need high-quality training data, so that there are enough properly varied examples to isolate the concept of interest and there are no large biases. There also have to be intentional examples of garbage in the training data, so that it can learn to ignore the garbage.

The fact of the matter is that LLM AIs have enabled computers to do tasks they could never do before. If you train an AI for a task properly and only use it for that task, it performs great. The problem is that this new AI stuff has picked up a massive amount of hype, so now everyone is forcefully shoehorning the new-fangled AI tech into absolutely everything, regardless of whether it makes sense to do so.

The point about the similarity between wetware and LLMs is undeniable. After all, wetware was the inspiration for LLMs, starting with the earliest "perceptron" experiments in the 50s and 60s.

The output of brains and LLMs is indeed a consequence of "emergent behaviour". That makes it impossible to determine what the response will be and why. As an engineer (not a psychologist, not a politician, etc) I regard that as a key failing.

Bad training data is a key problem. People are already worrying about what happens when a lot of the content on the web is semi-accurate LLM output, and that is used as training data for the next round of LLMs. Experiments have demonstrated that is a realistic problem.

Saying "it works if you have good training data" is merely opium that I would expect salesmen to spout. Similarly, fracking salesmen say that "fracking is safe when done properly", which is obviously a meaningless truism, and ignores the probability of it not being done properly.

Some training data has, by definition, to be polluted. That doesn't stop users from claiming "the machine says so therefore it is right" (note the present tense). A classic example is the systems in the US that determine the risk associated with an accused person or a convicted felon, and therefore whether they should be incarcerated before/after trial. Since black people have a higher probability of being incarcerated, the LLMs can only perpetuate that injustice.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15339
  • Country: fr
Re: AI parsing data sheets
« Reply #16 on: September 17, 2024, 09:17:05 pm »
In other words, throwing stuff at the wall until it sticks, and calling that engineering.
 
The following users thanked this post: tooki

Offline Berni

  • Super Contributor
  • ***
  • Posts: 5026
  • Country: si
Re: AI parsing data sheets
« Reply #17 on: September 18, 2024, 06:59:59 am »
Yep, LLMs are most definitely not perfect and make mistakes (pretty much any AI chatbot warns about that on its first page).

But as imperfect as they are, they still perform certain tasks better than any of the prior non-AI solutions. A great example is translation between languages. We have had computer programs that can translate text into a different language for many decades; however, if you have ever used any of those you will know that they all produce sentences that sound broken, or sometimes translate things outright wrong because a word is similar. But even the first ChatGPT, running on GPT-3, was providing translations that seem to be as good as a human translator's. Not only that, it provided good translations into niche languages (for example, Slovenian is spoken by only 2 million people and is a difficult language, yet it has no issues with it, despite not even being an officially supported language). This is from a general-purpose chatbot AI that was not specifically trained to do translation; it just happened to have a mish-mash of languages in its training data and figured out the connections.

What these AIs excel at is interpreting fuzzy input. Computers with classical programming work great for interpreting very rigid languages; this is what programming languages are. There, everything follows a strict set of well-defined rules, whereas human languages, which have organically evolved over thousands of years, are very messy. This is why translating human-language text has always been so difficult for computers to get right.

When it comes to things like summarizing large piles of human text, or extracting a certain bit of information from them, we didn't really have any way of doing it with classical programming that follows strict rules. So even if the AI performs poorly, that is still way better than not performing at all. Similarly, classical computer-vision algorithms only really work in ideal conditions: in an industrial machine the camera is rigidly mounted to look exactly straight onto the scene, the target is the same object every time and is perfectly illuminated, etc. However, if you are going to be looking for traffic signs out of a car windshield you won't have those ideal conditions, so applying the classical machine-vision approaches would perform even worse, as they would either completely ignore most signs if set to be strict, or misidentify signs if set loose. This is a very difficult task to solve.

I dislike the unpredictable nature of large AI models just as much as you do. There is no way to 100% guarantee it won't generate the wrong output; you can only put more 9s on the 99.9...% by more testing. If you can solve the problem with classical programming then you should NOT be involving AI at all. But for tasks that can't be solved that way, having something that works 99.9% of the time is still way better than something that does not work at all, as long as you are aware that it will make a mistake at some point and have something in place that can handle the mistake.
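As a back-of-the-envelope illustration of what "putting more 9s on it by testing" costs (just the standard rule-of-three bound, assuming independent test cases and no failures observed - a simplification, since real-world inputs are rarely independent):

Code: [Select]
# Rule of three: with n failure-free independent tests you can claim, at ~95%
# confidence, that the failure rate is below about 3/n. So each extra "9" costs 10x.
for nines in range(1, 7):
    p = 10 ** -nines            # target failure rate: 0.1, 0.01, ...
    n = 3 / p                   # failure-free tests needed for ~95% confidence
    print(f"{1 - p:.6f} reliability -> ~{n:,.0f} clean test cases")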
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 20650
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: AI parsing data sheets
« Reply #18 on: September 18, 2024, 08:33:44 am »
Yep, LLMs are most definitely not perfect and make mistakes (pretty much any AI chatbot warns about that on its first page).

But as imperfect as they are, they still perform certain tasks better than any of the prior non-AI solutions. A great example is translation between languages. We have had computer programs that can translate text into a different language for many decades; however, if you have ever used any of those you will know that they all produce sentences that sound broken, or sometimes translate things outright wrong because a word is similar. But even the first ChatGPT, running on GPT-3, was providing translations that seem to be as good as a human translator's. Not only that, it provided good translations into niche languages (for example, Slovenian is spoken by only 2 million people and is a difficult language, yet it has no issues with it, despite not even being an officially supported language). This is from a general-purpose chatbot AI that was not specifically trained to do translation; it just happened to have a mish-mash of languages in its training data and figured out the connections.

What these AIs excel at is interpreting fuzzy input. Computers with classical programming work great for interpreting very rigid languages; this is what programming languages are. There, everything follows a strict set of well-defined rules, whereas human languages, which have organically evolved over thousands of years, are very messy. This is why translating human-language text has always been so difficult for computers to get right.

When it comes to things like summarizing large piles of human text, or extracting a certain bit of information from them, we didn't really have any way of doing it with classical programming that follows strict rules. So even if the AI performs poorly, that is still way better than not performing at all. Similarly, classical computer-vision algorithms only really work in ideal conditions: in an industrial machine the camera is rigidly mounted to look exactly straight onto the scene, the target is the same object every time and is perfectly illuminated, etc. However, if you are going to be looking for traffic signs out of a car windshield you won't have those ideal conditions, so applying the classical machine-vision approaches would perform even worse, as they would either completely ignore most signs if set to be strict, or misidentify signs if set loose. This is a very difficult task to solve.

I dislike the unpredictable nature of large AI models just as much as you do. There is no way to 100% guarantee it won't generate the wrong output; you can only put more 9s on the 99.9...% by more testing. If you can solve the problem with classical programming then you should NOT be involving AI at all. But for tasks that can't be solved that way, having something that works 99.9% of the time is still way better than something that does not work at all, as long as you are aware that it will make a mistake at some point and have something in place that can handle the mistake.

There's some validity in that, e.g. translating a restaurant menu is very convenient, and I expect LLMs would avoid the historic "out of sight, out of mind" -> "invisible idiot" and "the spirit is willing but the flesh is weak" -> "the vodka is strong but the meat is rotten". It is also possible to validate (to some extent) a translation by getting the machine (preferably a different LLM) to do the reverse translation.

However, any theoretical benefits can be overwhelmed by the way LLM systems are used in out in the wild...

You have no way of knowing whether adding one new training input will cause the weights to be recalculated so that previously acceptable outputs become dangerous. That is inherent in the way LLMs work, and cannot be avoided. "You can't test quality into a product".

LLMs aren't good at digesting large amounts of fuzzy input: https://hai.stanford.edu/news/hallucinating-law-legal-mistakes-large-language-models-are-pervasive

The "99% is better than nothing" argument is valid in some circumstances - but not in others. The 99% falls into an "uncanny valley": the user has to be aware that they have to correct/ignore the output 1% of the time, but probably isn't able to judge which 1%. Add to that the lazy/unscrupulous user (of which there are many) and it is a recipe for problems.

Whether or not detecting speed-limit signs is possible with an LLM or a conventional system is a red herring. No unscrupulous manufacturers have claimed a non-LLM system could do that; LLM manufacturers, on the other hand, do. Ray of hope: US transportation authorities are finally catching on to that, and have started demanding that manufacturers justify their claims; I wonder how that will pan out.

LLMs are being used for life-changing applications, without understanding of how they reach safe/dangerous conclusions. In some cases the decision making is even hidden behind "commercial confidentiality" clauses; in others, users actively want not to think about the conclusions; in others, legal liability for their illegal and false outputs is denied.

https://www.bbc.com/travel/article/20240222-air-canada-chatbot-misinformation-what-travellers-should-know
The medical problems I've previously mentioned.
The "lock them up and let God sort them out" applications that I've previously mentioned.
Any of the yoootoob vids showing Tesla cars trying to drive on the wrong side of the road, or drive down railway tracks, etc, etc.
Cruise and Waymo in the US.
Etc.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 20650
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: AI parsing data sheets
« Reply #19 on: September 18, 2024, 08:47:48 am »
And, anecdotally, I've found that over the past two years it's become increasingly difficult to get ChatGPT to output bullshit.  It either says "I don't know the answer" or it answers.

That, of course, has two major limitations...

What's the test that ChatGPT uses to determine that it doesn't "know" the answer? Why should the output of that test be any more reliable than its other output?

The problem, as with people, isn't when they don't "know" the answer, it is when they don't "know" that they don't know the answer. Human equivalents are bullshitters and those who cannot realise they suffer from the Dunning-Kruger effect.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Offline 5U4GB

  • Frequent Contributor
  • **
  • Posts: 476
  • Country: au
Re: AI parsing data sheets
« Reply #20 on: September 18, 2024, 08:50:00 am »
Realise that LLMs are a mechanism for producing grammatically correct sentences of plausible words. They have zero understanding of the words and concepts.
In particular things like ChatGPT are an implementation of the Chinese room:
Quote
Searle imagines himself alone in a room following a computer program for responding to Chinese characters slipped under the door. Searle understands nothing of Chinese, and yet, by following the program for manipulating symbols and numerals just as a computer does, he sends appropriate strings of Chinese characters back out under the door, and this leads those outside to mistakenly suppose there is a Chinese speaker in the room.
When you want to know whether an LLM can solve a particular type of problem, you need to consider whether a Chinese room could solve it.  If not then the LLM can't either, although it will hallucinate a plausible-sounding solution if pressed hard enough.
 

Offline tom66

  • Super Contributor
  • ***
  • Posts: 7020
  • Country: gb
  • Electronics Hobbyist & FPGA/Embedded Systems EE
Re: AI parsing data sheets
« Reply #21 on: September 18, 2024, 09:09:07 am »
That's a recent version of an old, old problem. Igor Aleksander's 1983 WISARD, effectively the forerunner of today's LLMs, demonstrated a key property of modern LLMs: you didn't and couldn't predict/understand the result it would produce. WISARD correctly distinguished between cars and tanks in the lab, but failed dismally when taken to Lüneburg Heath in north Germany. Eventually they worked out the training set was tanks under grey skies and car adverts under sunny skies.

"AI was a bit shit in the 80s, so of course in 2024 it will still be shit."

Er, no.

AI in the 1980s was a set of rules, based on knowledge elicited from experts. It was then easy to see which rules caused the results, and why.

WISARD was not like that; it was unique.

In the 2020s ML is shit for different reasons, as foreshadowed by WISARD. In particular it is impossible to know the chain that caused the results, because there are no explicit rules.

Yes, fundamentally, it is impossible to prove that a model is always accurate without testing every possible input (which is numerically infeasible for anything over a small set of input data).  But I am not sure that is as much of a 'gotcha' as you think it is.  In order to implement the level of object matching that a convolutional neural network performs, for instance, it is necessary to have a fuzzy statistical model, because the input data is not clean, and there are too many combinations to match against.

Take the problem of recognising an apple in an image.  How would you implement a purely algorithmic image matching process?  You could test against hundreds or thousands of candidate apple images, perhaps using an image-distance algorithm, but that gives a similar fuzzy output, a statistical probability of the image looking like an apple.  And really that's all a convolutional NN is doing, except it is far more effective with a far smaller input training data set.  You could use something like Haar classification, but that requires high input image contrast and well-defined features, so it doesn't work well in low-light or when looking at objects off-axis for instance.  There are a few other methods out there, but the majority of them have fallen out of favour compared to CNN/DNN models because those produce far better results.
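As a toy sketch of what that image-distance approach looks like (Python/NumPy, with made-up 32x32 greyscale templates) - note the output is still a fuzzy score that the caller has to threshold, not a hard yes/no:

Code: [Select]
# Naive template matching by mean-squared difference; purely illustrative.
import numpy as np

def image_distance(a, b):
    return float(np.mean((a.astype(float) - b.astype(float)) ** 2))

def best_match(image, templates):
    # Returns (label, score); choosing the acceptance threshold is the fuzzy,
    # statistical step that a CNN handles far more robustly.
    scores = {label: image_distance(image, t) for label, t in templates.items()}
    label = min(scores, key=scores.get)
    return label, scores[label]

rng = np.random.default_rng(1)
templates = {"apple": rng.random((32, 32)), "pear": rng.random((32, 32))}
test_image = templates["apple"] + 0.05 * rng.standard_normal((32, 32))   # noisy "apple"
print(best_match(test_image, templates))    # ('apple', small but non-zero distance)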

You don't need a neural network to be 100% accurate, you just need it to be sufficiently accurate to generate useful outputs.  Take the oft-noted example of the Tesla Autopilot seeing the moon as a traffic light.  The visual neural network does, indeed, perceive a yellow traffic signal.  The rest of the image processing algorithm clearly realises this data is garbage, so no action is taken on the input.  So the system is working correctly, although they might do better to filter the displayed output on the UI.

Are you one of those people who believes a piece of software works because it passes the unit tests?

If you design a sufficiently complex testbench then any program can be proven to be correct; I fail to see why this is a controversial statement. Now, whether such a testbench is feasible is another matter. Certainly in the world of FPGA and ASIC design it becomes essential to thoroughly testbench your hardware design, because debugging a failure is far more difficult than in software. Unit-testing individual components and writing a well-designed interface specification between them helps reduce the problem space.
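To illustrate the feasibility point with a trivial software stand-in (a hypothetical 8-bit adder model in Python, not any particular HDL testbench): exhaustive testing is easy for a tiny input space and hopeless beyond it.

Code: [Select]
# Exhaustive "testbench" for a tiny input space; illustrative only.
def add8(a: int, b: int) -> tuple[int, int]:
    """Model of an 8-bit adder: returns (sum mod 256, carry-out)."""
    s = a + b
    return s & 0xFF, s >> 8

for a in range(256):                 # all 2**16 input pairs -- trivially feasible
    for b in range(256):
        s, c = add8(a, b)
        assert s + (c << 8) == a + b
print("all 65,536 cases pass")

# The same brute-force approach for a 64-bit adder would need 2**128 cases:
print(f"64-bit exhaustive cases: {2 ** 128:.3e}")   # ~3.4e38, not feasible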

« Last Edit: September 18, 2024, 09:10:42 am by tom66 »
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 20650
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: AI parsing data sheets
« Reply #22 on: September 18, 2024, 09:14:13 am »
Realise that LLMs are a mechanism for producing grammatically correct sentences of plausible words. They have zero understanding of the words and concepts.
In particular things like ChatGPT are an implementation of the Chinese room:
Quote
Searle imagines himself alone in a room following a computer program for responding to Chinese characters slipped under the door. Searle understands nothing of Chinese, and yet, by following the program for manipulating symbols and numerals just as a computer does, he sends appropriate strings of Chinese characters back out under the door, and this leads those outside to mistakenly suppose there is a Chinese speaker in the room.
When you want to know whether an LLM can solve a particular type of problem, you need to consider whether a Chinese room could solve it.  If not then the LLM can't either, although it will hallucinate a plausible-sounding solution if pressed hard enough.

Agreed.

I wonder how many people (general public, not those with AI experience) have heard of that question, and whether they understand it. Or, being more cynical ("follow the money"), want to understand it.

Given the amount of unadulterated crap in the AI training material (i.e. the web) I have obvious doubts. That will become worse when LLMs ingest the LLM output already on the web. Meta/Farcebook has stated they will use Farcebook posts to train their LLM; no obvious problems in that, are there? :)
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Offline ebastler

  • Super Contributor
  • ***
  • Posts: 6984
  • Country: de
Re: AI parsing data sheets
« Reply #23 on: September 18, 2024, 09:18:53 am »
The problem, as with people, isn't when they don't "know" the answer, it is when they don't "know" that they don't know the answer. Human equivalents are bullshitters and those who cannot realise they suffer from the Dunning-Kruger effect.

Like AndyC in one of the early replies in this thread, I keep wondering: Do you prefer to philosophize about LLMs or have you actually used one lately? What did you use it for? How was your experience with it? What impressed you, what disappointed you, what specific problems did you run into?
 

Offline Berni

  • Super Contributor
  • ***
  • Posts: 5026
  • Country: si
Re: AI parsing data sheets
« Reply #24 on: September 18, 2024, 09:28:54 am »
Yes, exactly, the problem is that AI is being used in all sorts of applications without consideration of whether AI is an appropriate solution.

The advent of these big, powerful LLM AIs made it even worse, because they are general-purpose models that can seemingly do anything. Even worse, they take in generic text or images and produce any kind of text, so it is easy for developers to just shove a bunch of data into them as text and blindly hope the AI can make sense of it. This is where the shitty chatbot 'helpers' on websites come from; they are completely useless.

AI can be used in safety-critical or life-changing situations, but only when there are extra checks and safeguards around it; this effectively adds more layers to the 'Swiss cheese model', making it more and more improbable that a mistake makes it through everything due to all the holes lining up just right.

For example, Mobileye developed a camera-based vehicle safety system back in 2008 that manufacturers like BMW, Hyundai, Volvo, etc. integrated into their cars as part of an automatic emergency braking feature. I got a Volvo from 2014 with this feature and it works well. It takes in information from the camera, radar, ultrasonics, etc., letting it detect a dangerous situation, first warning the driver with sound and flashing (likely already enough to get the driver to look forward and react), then, if no action is taken, slamming the brakes on full at the very last moment. The whole thing works using a machine-learned model running on an FPGA. But the system is just one extra layer of cheese in the model, covering up any holes in the human driver's cheese slice (since humans make mistakes too) - be it because the driver is distracted by a billboard next to the road while someone slams on the brakes in front, or because you are going down the highway and can't see a stopped car through the fog, but the radar can see it.

Once you give AI the task of fully autonomous driving, things are different. That is a difficult problem, and I have no idea what the correct solution to it might be (apart from a dedicated driverless-car road network, but that is too expensive to be viable).

Quote
Searle imagines himself alone in a room following a computer program for responding to Chinese characters slipped under the door. Searle understands nothing of Chinese, and yet, by following the program for manipulating symbols and numerals just as a computer does, he sends appropriate strings of Chinese characters back out under the door, and this leads those outside to mistakenly suppose there is a Chinese speaker in the room.
When you want to know whether an LLM can solve a particular type of problem, you need to consider whether a Chinese room could solve it.  If not then the LLM can't either, although it will hallucinate a plausible-sounding solution if pressed hard enough.

Yep, this is a good explanation of how an LLM works.

« Last Edit: September 18, 2024, 09:31:15 am by Berni »
 

