Yup, hands are weird. And it gets weird with motion, too.
What Gigem posted is mostly correct*, but it's only part of the story. The models need training data. That can be anything from the entire World Wide Web to just specific data that you feed directly to it.
The larger and less curated the training data sample, the more likely you are to get weird outputs. Like the weird stadium, flag, and logos above, or the weird hands on images of humans.
And in the worst case, the LLMs trained on the entire web have a tendency to "hallucinate." They invent completely false facts when queried on a subject. The false facts are usually plausible, but they're absolutely 100% fabricated, because the training data contains so much ambiguity and incorrect information.
Of note: both the hallucinations and the weird hands are symptoms of one very important point. Artificial intelligence--or to be more precise, generative AI--isn't actually intelligent at this time. That makes what it's capable of all the more impressive, and it explains why it can also be spectacularly wrong.
Generative AI for text (i.e. based upon large language models and transformers) is, to oversimplify, just a predictive engine trying to figure out the next word(s) that follow the words it has already written (there's a toy sketch of this below). The quality of the prompt narrows down which portions of its language model to start from, and then the quality of the training data and the model guide where it goes from there.
For example, let's say it was trained on this site and I were to ask it: "How did utee94 end up in his career with [large computer manufacturer]?"
It could equally respond with:
- utee94 grew up in Austin, TX, chose to go to the University of Texas and study electrical engineering, before taking his job with [large computer manufacturer].
- utee94 grew up in Austin, TX, chose to go to Texas A&M University and study physics, before taking his job with [large computer manufacturer].
One is correct, and one just made @utee94 throw up in his mouth a little, but both are plausible: on this board "A&M" often follows "Texas"--sometimes in his own posts--and we've recently had discussions where utee's posts talk about physics classes.
The truth is that the model isn't intelligent. It doesn't know who utee94 is. It doesn't know what Texas is. It has no way to "self-correct," because it has no context with which to guide itself.
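To show how little machinery it takes to reproduce that failure, here's a deliberately tiny sketch in Python. It's a word-pair lookup table over made-up training sentences, not a real transformer, but the core loop--predict a plausible next word given the previous one, with no fact-checking anywhere--is the same basic idea:

```python
import random
from collections import defaultdict

# Toy "predictive engine": count which word follows which in a few
# fake forum sentences, then generate by sampling a likely next word.
# Real LLMs use transformers over tokens, not a lookup table, but the
# "predict the next word from the previous ones" core is the same idea.
training_text = (
    "utee94 grew up in Austin Texas and went to the University of Texas . "
    "Texas A&M University is in College Station . "
    "Texas A&M plays Texas in football . "
    "utee94 posted about physics classes ."
)

follows = defaultdict(list)
words = training_text.split()
for prev, nxt in zip(words, words[1:]):
    follows[prev].append(nxt)

def generate(start, max_words=8):
    out = [start]
    for _ in range(max_words):
        options = follows.get(out[-1])
        if not options:
            break
        out.append(random.choice(options))  # sample; nothing checks truth
    return " ".join(out)

print(generate("Texas"))
```

Run it a few times and you'll get fluent-looking output like "Texas A&M University is in College Station"--and occasionally "Texas A&M University is in football," a perfectly confident fabrication. Nothing in the loop knows or cares whether any of it is true.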
The same is true with generative AI for images. For example, utee posted the image down below in another thread.
A generative AI engine was given the prompt "show me a beautiful woman camping." And it did a darn good job... right up until it put the woman's campfire right in the middle of her tent. Oh, and it also seemingly got confused and ended up drawing one really deformed, elongated finger.
Fingers are hard for AI because AI doesn't know what a finger is, doesn't "understand" that it's drawing a finger, and doesn't have context telling it that humans (which it doesn't necessarily "know" what they are, either) have only 5 fingers on each hand. The training data is obviously going to have pictures of humans, and those humans are going to have 5 fingers, but the model is just trying to figure out "should I move on, or should I put another beige fleshy appendage or two on?" based on its predictive engine and training data.
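Purely as an analogy--image generators are diffusion models, not a literal draw-another-finger loop, and the probability below is invented for the sketch--the absence of any hard "hands have 5 fingers" rule looks roughly like this:

```python
import random

# Made-up numbers, purely illustrative: nothing here counts fingers or
# knows what a hand is. It just keeps sampling "one more?" until "stop"
# happens to win, which is how 6- and 7-finger hands occasionally appear.
P_ANOTHER_FINGER = 0.45  # hypothetical chance the model "continues the pattern"

fingers = 1  # one finger already drawn
while random.random() < P_ANOTHER_FINGER:
    fingers += 1

print(f"rendered a hand with {fingers} finger(s)")
```

With no counter and no concept of "hand," a 6th or 7th finger is just an unlucky run of "keep going" samples.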

What AI is capable of today is truly impressive... but we have not reached artificial general intelligence. We have not reached a point where these generative AI engines truly know what it is they're producing.
That said:
How hard would it be to completely duplicate a web site or email address to cull information from folks? I know that already exists--folks swap a look-alike symbol in for some letter somewhere, a difference that can't be seen.
You don't need to create artificial intelligence to overcome natural stupidity.
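On the look-alike address question above: that trick is real--it's called a homoglyph (or homograph) attack--and it's trivially cheap. A quick Python illustration; the domain is just a made-up example, not a real spoof:

```python
import unicodedata

real  = "apple.com"
spoof = "\u0430pple.com"  # first letter is Cyrillic "а", not Latin "a"

print(spoof)                       # looks identical on most screens
print(real == spoof)               # False -- different strings to a computer
print(unicodedata.name(real[0]))   # LATIN SMALL LETTER A
print(unicodedata.name(spoof[0]))  # CYRILLIC SMALL LETTER A
```

Modern browsers try to defend against this by displaying suspicious mixed-script domains in punycode ("xn--..."), but look-alikes within a single script and spoofed email display names still slip through.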