
Monday, April 29, 2024

Comments


Aleena

Microsoft Copilot is just GPT-4 Turbo, but with the added ability to access data you have given Microsoft. I suspect that when you tried your original query, you posed it to ChatGPT 3.5, which is the version OpenAI makes available to free users on its website.

Adeon Writer

AI text generation should not be expected to give factual information. It's understandable that people continue to expect it (that is how it's marketed), but expectations will never change reality. Please do your best to combat this pervasive misunderstanding.

Viki

>> human-type sentience that's simply not there at all.

exactly.

n explained really well that using an LLM as a substitute for Google, or as a database of factual information, is a misunderstanding of what the thing is doing. As Adeon reiterates: AI text generation should not be expected to give factual information.

I really recommend this article: https://medium.com/@colin.fraser/hallucinations-errors-and-dreams-c281a66f3c35. It's long, but it explains how the whole thing "is a dream". Also read the first response.

n

Thank you for listening and giving it a try, and I'm glad you see that it's better! However, I recommended using Copilot in Precise mode specifically because it's designed to be more accurate; in the screenshot you shared, it appears you used the default (Balanced) mode instead.

You can click this (see the linked image) to switch to Precise mode and give it another go:

https://i.postimg.cc/7hnLM3kK/precise.png

Here is what it returned for me when I used Precise mode:

https://i.postimg.cc/9Xdzhh7y/precise-response.png

As you can see, the results in Precise mode are even better, and it didn't make any mistakes. In the few cases where it does, you can still check the sources. This is indeed a much better approach than ChatGPT. Precise mode significantly mitigates "hallucinations", not only by using web search to stay grounded (it tries to pick the most reliable of the available sources), but also by keeping the response more concise; moreover, it likely has the "temperature" hyperparameter set to 0, which reduces the randomness of the output.

Hallucinations can be useful for creativity, so Copilot Creative's temperature is set high, which can be fun for brainstorming or suggesting ideas. Balanced mode, like most general-purpose applications, uses a moderate temperature setting (it is also powered by a less capable model). Precise mode, on the other hand, likely has its temperature set low or to zero for maximum accuracy. With local models, you have more flexibility to set the temperature as you please.
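To illustrate what temperature actually does during text generation, here is a minimal sketch of standard temperature sampling (this is the generic textbook mechanism, not Copilot's actual internals):

```python
import math
import random

def sample_with_temperature(logits, temperature):
    """Pick a token index from raw model scores ("logits"),
    scaled by temperature before converting to probabilities."""
    if temperature <= 0:
        # Temperature 0: greedy decoding, always the top-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]  # softmax numerators
    return random.choices(range(len(logits)), weights=weights)[0]

logits = [2.0, 1.0, 0.1]  # toy scores for three candidate tokens
sample_with_temperature(logits, 0)    # -> always 0 (deterministic, "Precise")
sample_with_temperature(logits, 1.5)  # flatter distribution, any index possible ("Creative")
```

Low temperature sharpens the distribution toward the single most likely token; high temperature flattens it, letting less likely (more "creative", and more error-prone) continuations through.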

n

As for "hallucinations", that's simply what they are called. If anything, it's "lie" that would imply (or suggest to the listener) awareness.

You wrote: «I'd quibble that "LLMs hallucinate" is more accurate than saying "LLMs lie", since a hallucination implies there's already a base stable awareness where none actually exists.»
But I think it's the other way around.

Lies are typically told deliberately, with the purpose of deceiving or misleading someone. That is not the case with LLMs, which generate an output based on patterns they have learned during training. The output may contain something inaccurate or nonexistent; in other words, they "hallucinate".

Hallucination, in the context of machine learning and AI research, is a term that has been used for decades, and it does not imply any form of awareness:
https://www.ibm.com/topics/ai-hallucinations
(Also, setting the metaphor aside, people often aren't aware they are hallucinating, e.g. with schizophrenia or dementia.)
At most, some researchers prefer to call them "confabulations".

n

As for the expectations: yeah, as I've repeatedly said, LLMs are not databases to retrieve factual information from, and to expect that is a misunderstanding of how they work. Conversely, they aren't useless just because they don't meet such expectations or fall short of sci-fi AI (I don't mean you said that).

You said: «So overall I still question the usefulness of LLMs beyond being a highly imperfect, unreliable assistant», when you were looking for factual information and accurate responses (and you used Balanced). Copilot's Creative mode would have been even worse than Balanced for that task, as it tends to generate fictional facts. This is not because it's flawed, but because it's not designed for such tasks: Creative mode is intended for generating imaginative content, hence the name. That task was better suited to Copilot Precise: the language model (GPT-4) processes your natural language input, calls the search engine, and quotes the results as precisely as possible. It's also a good thing (and I appreciate it a lot) that it provides links to the sources, so you can verify them.
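That flow (natural-language input → search → grounded, cited answer) can be sketched roughly like this; every name here is a hypothetical placeholder to show the idea, not Copilot's actual internals:

```python
def answer_with_grounding(question, search, generate):
    """Sketch of a search-grounded pipeline: retrieve sources first,
    then ask the model to answer only from them, with citations."""
    results = search(question)  # e.g. a web search call
    context = "\n".join(
        f"[{i + 1}] {r['snippet']} ({r['url']})" for i, r in enumerate(results)
    )
    prompt = (
        "Answer using ONLY the numbered sources below, citing them as [n].\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    # Return both the answer and the source links, so the user can verify.
    return generate(prompt), [r["url"] for r in results]

# Toy stand-ins so the sketch runs without any real search engine or model:
answer, sources = answer_with_grounding(
    "When did Second Life launch?",
    search=lambda q: [{"snippet": "Second Life launched in 2003.",
                       "url": "example.com/sl"}],
    generate=lambda prompt: "Second Life launched in 2003 [1].",
)
```

The point of the design is that the model is asked to quote retrieved sources rather than recall facts from its training data, which is what keeps the Precise-style output grounded and checkable.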

However, it's clear that there's a demand for assistants that provide factual information. LLMs, trained on enormous datasets, end up with a vast, but imperfect, knowledge base. Would you consider an expert useless simply because they can't recall every detail flawlessly? Obviously, you wouldn't rely entirely on memory, but would consult data and sources. Similarly, there are various ways in which LLMs can do a better job at this task. Even then they aren't always perfect, but neither are they so terribly imperfect and unreliable. And they can do many other things too.

