Beverly Millson, an emerging tech consultant and cyberculture enthusiast, was recently playing around with ChatGPT 3.5 when she decided to ask the AI about her Second Life avatar, Bettina Tizzy. Under that avatar name, she edited Not Possible in Real Life, which up until 2009 was a must-read site covering an emerging new art form: immersive expression created in metaverse platforms like Second Life.
But in reply, ChatGPT gave Beverly's avatar a small if not exactly accurate promotion, crediting her as the creator of the very term "metaverse":
As I've mentioned before, asking ChatGPT about a topic where you have deep expertise exposes just how much the program is mostly a glorified, easier-to-read version of Google search, and equally prone to coughing up false positives or extremely mediocre results. (ChatGPT's definition of the Metaverse isn't very faithful to the source material either, but leave that to one side.)
I also know enough to infer how ChatGPT probably went so wrong in Bettina's case: it has likely been trained on blog posts like this one, where she's been described as a "metaverse art maven", and it made the next illogical leap.
Amused, Beverly recently sent me this result for a chuckle. But it also leaves me somewhat concerned about how large language models like ChatGPT are trained on important but somewhat obscure subjects: in this case, metaverse history.
For instance (also via Beverly), here's how ChatGPT describes the SL artist known as "AM Radio":
Contrary to ChatGPT's output, AM Radio is mainly known for creating rural virtual scenes, which were even profiled in the New York Times, and his real identity has been reported: I wrote that profile myself for Polygon, another major news outlet.
Here's my concern: By now it's pretty well known that ChatGPT will often "hallucinate" bad data, so its output needs to be fact-checked. But because virtual world history and culture are a second (or third) layer of knowledge beneath real life, so to speak, I strongly suspect LLMs are not sophisticated enough to train on them well.
I'm not sure there's a good immediate solution to this problem, so I'd just be extra cautious about using ChatGPT in a metaverse context for anything more than some goofy fun. At least ChatGPT got metaverse artist Bryn Oh's bio right (above), likely because Bryn is still highly active online.
Googling "who is the real person behind the SL avatar am radio", by the way, gave me the right answer in under two seconds.
The answers you get out of ChatGPT can be very version dependent. Version 4 (indicated with a black icon in its responses) tends to be more accurate and less prone to hallucinating facts; 3.5 (indicated with a green icon in its responses) is a significantly less sophisticated model.
As a comparison, I asked similar questions of ChatGPT 4: https://imgur.com/a/ulFGlgB
Posted by: Aleena | Wednesday, May 03, 2023 at 04:35 PM
Thanks! Wow yeah that's considerably better. Though ChatGPT still thinks AM Radio is anonymous!
Posted by: Wagner James Au | Wednesday, May 03, 2023 at 06:49 PM
Bing Chat (GPT-4), too, thinks AM is anonymous, probably because even the Polygon article says: "We track down Second Life's most famous anonymous artist, AM Radio". But when you ask "Who is AM Radio in Second Life? And who is he in real life (name etc)?" Bing Chat tells you the name.
Posted by: Nade | Thursday, May 04, 2023 at 02:02 AM
Hamlet, as the journalist who has covered the metaverse the most and the longest, your informed surveillance and scrutiny of its history is essential. Thanks for looking into this further.
Aleena, your version comparison is encouraging. Version 4, here I come.
Posted by: Bettina Tizzy | Thursday, May 04, 2023 at 11:03 AM
I wouldn't use ChatGPT as a database to retrieve facts. Not even version 4.
When you open ChatGPT, the first thing you see is the disclaimer: it "May occasionally generate incorrect information". If you use it, expect hallucinations.
At most you get an idea of how probable the facts might be, more or less, because the model is estimating the probability of the next word (or rather the next token) based on the previous sequence of words. If I start singing "Happy birthday to", you would most likely expect "you" or a name to follow, wouldn't you? The human brain also works that way, except here that predictive ability is modeled with mathematical functions.
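If you want to see that next-token prediction for yourself, here is a minimal sketch (assuming Python with the torch and transformers packages installed) that uses the small open GPT-2 model to list the most probable continuations of "Happy birthday to":

```python
# Minimal sketch: inspect next-token probabilities with the small open GPT-2 model.
# Assumes the `torch` and `transformers` packages are installed.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Happy birthday to", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits          # shape: (1, sequence_length, vocab_size)

# Turn the logits for the last position into a probability distribution over the vocabulary.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, 5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id):>10}  {prob.item():.3f}")
# " you" comes out on top: everything such a model produces is generated this way,
# one likely token at a time.
```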
Whether ChatGPT (the application) is powered by GPT-3.5 or GPT-4 (the language models), it is NOT a reliable source of factual information (not even about itself!), nor "mostly a glorified, easier-to-read version of Google search".
To put it simply: language models are NOT the same as search engines or databases; they are not designed to store and retrieve information from a structured collection of data. They are artificial neural networks that mimic some aspects of biological brains, rather than traditional programs that follow precise step-by-step instructions (if, then, else, ...). The algorithm is what builds and trains the model; the resulting network itself doesn't work that way. They learn from large amounts of text (all of Wikipedia, books, web pages, the Common Crawl, etc.) and try to predict the next word or phrase (actually the next token) based on the previous ones. That's what they are designed for, although there are emergent capabilities, so I wouldn't dismiss them as just a glorified next-word predictor / autocomplete either.
Now think of this: how can they store all that data in a limited number of parameters (GPT-3 has 175 billion)? They somehow remember the immense amount of text they have seen, but only as a rough approximation; they do capture the patterns, though. What the model knows can be loosely compared to a lossy-compressed, somewhat blurry image of the Internet, or to a knowledgeable friend who sometimes forgets or makes up details.
If you want to use a language model to help you find facts on the Internet, you need one that can access a search engine and use its results to generate text. One example, and currently the most accessible such system, is Bing Chat, which is also powered by GPT-4 and uses the Bing search engine to answer questions and provide information. It is free and easy to use, but you should still verify its output (it provides links to its sources, so you can look at them), as it may still contain errors or inaccuracies.
On the other hand, a language model + search engine combination can do more than a traditional search engine, and more than just return facts: it can also summarize, synthesize, and analyze information from multiple sources and present it in a coherent and engaging way, answer follow-up questions, and much more.
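Under the hood, the pattern is straightforward to sketch. In rough Python pseudocode, where web_search and llm_complete are hypothetical placeholders for whatever real search API and language model you have access to (not actual APIs):

```python
# Rough sketch of the "search engine + language model" pattern (retrieval-augmented generation).
# `web_search` and `llm_complete` are hypothetical placeholders, not real APIs.

def web_search(query: str, max_results: int = 3) -> list[dict]:
    """Placeholder: call a real search API, return [{'title': ..., 'url': ..., 'snippet': ...}, ...]."""
    raise NotImplementedError

def llm_complete(prompt: str) -> str:
    """Placeholder: call a real language model and return its text completion."""
    raise NotImplementedError

def answer_with_sources(question: str) -> str:
    # 1. Retrieve a few relevant pages from the live web.
    results = web_search(question)
    sources = "\n".join(
        f"[{i + 1}] {r['title']} ({r['url']}): {r['snippet']}"
        for i, r in enumerate(results)
    )
    # 2. Ask the model to answer using only those retrieved sources, citing them by number.
    prompt = (
        "Answer the question using ONLY the sources below, citing them by number.\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )
    return llm_complete(prompt)
```

The key point of this design is that the facts come from the retrieved pages, while the language model only summarizes and cites them, which is why you can (and should) still check the linked sources.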
Posted by: Nadeja | Sunday, May 07, 2023 at 05:57 AM
As for the most accessible ones, I meant those powered by GPT-4, sorry.
GPT-4 aside, you can also try Bard (LaMDA + Google search), which also stays up to date with current events and cites its sources. It's not as good as GPT-4, but they plan to replace LaMDA with PaLM. However, it's only available in the USA for now.
You.com's YouChat is another one, and it has been around since even before Bing Chat. You can use it directly (it will ask you to register after a handful of interactions). When I tried it back in December I wasn't super impressed, but I see they've improved it recently.
Anyway, new things and ideas keep coming. Several papers have recently been published, on self-reflection and much more, that show how the responses of these models can be improved significantly, and some people are already experimenting with that:
https://www.youtube.com/watch?v=wVzuvf9D9BU
Posted by: Nadeja | Monday, May 08, 2023 at 09:13 AM