How ChatGPT and other LLMs work - and where they might head next | Wired

Estimated read time: 3 min


AI chatbots such as ChatGPT and Google Bard are having a moment: the next generation of chatbot tools promises to do everything from taking over our web searches, to producing an endless supply of creative literature, to remembering all the world's knowledge so we don't have to.

ChatGPT, Google Bard, and other bots like them are examples of large language models, or LLMs, and it's worth understanding how they work. That way you'll get better use out of them, and have a better appreciation of what they're good at (and what they really shouldn't be trusted with).

Like a lot of AI systems (such as those designed to recognize your voice or generate cat pictures), LLMs are trained on vast amounts of data. The companies behind them have been somewhat cautious about revealing exactly where that data comes from, but there are certain clues we can look at.

For example, the paper presenting the LaMDA (Language Model for Dialogue Applications) model, which Bard is built on, refers to Wikipedia, “public forums,” and “code documents from sites related to programming like Q&A sites, tutorials, etc.” Meanwhile, Reddit wants to start charging for access to its 18 years of text conversations, and Stack Overflow just announced plans to start charging as well. The implication here is that LLMs have been making extensive use of both sites up until this point as sources, entirely for free and on the backs of the people who built and used those resources. It's clear that a lot of what's publicly available on the web has been scraped and analyzed by LLMs.

LLMs use a combination of machine learning and human input.

OpenAI via David Nield

All this textual data, wherever it comes from, is processed through a neural network, a commonly used type of AI engine made up of many nodes and layers. These networks continually adjust the way they interpret and make sense of data based on a host of factors, including the results of previous trial and error. Most LLMs use a specific neural network architecture called a transformer, which has some tricks particularly suited to language processing. (That GPT after Chat stands for Generative Pre-trained Transformer.)
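The "trial and error" adjustment the article describes can be sketched in a few lines. This is a deliberately minimal illustration, not how a real transformer is trained: a single parameter is nudged repeatedly to shrink its prediction error, which is the basic loop that real networks repeat across billions of parameters.

```python
# Minimal sketch of learning by trial and error: fit y = w * x by
# repeatedly nudging the parameter w in the direction that reduces
# the prediction error (a one-parameter version of gradient descent).
def train(pairs, steps=1000, lr=0.01):
    """Fit y = w * x to (x, y) pairs by iterative error correction."""
    w = 0.0
    for _ in range(steps):
        for x, y in pairs:
            error = w * x - y      # how wrong the current guess is
            w -= lr * error * x    # nudge w to shrink that error
    return w

w = train([(1, 2), (2, 4), (3, 6)])  # the data follows y = 2x
```

After training, `w` settles very close to 2.0, the value that best explains the data; real networks do the same kind of nudging, just with vastly more parameters and data.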

Specifically, the transformer can read vast amounts of text, spot patterns in how words and phrases relate to each other, and then make predictions about which words should come next. You may have heard LLMs being compared to supercharged autocorrect engines, and that's actually not too far off the mark: ChatGPT and Bard don't really “know” anything, but they are very good at figuring out which word follows another, which starts to look like real thought and creativity when it gets to an advanced enough stage.
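The "guess the next word" framing can be made concrete with a toy predictor. This is emphatically not how a transformer works internally (it simply counts which word most often follows another in a tiny corpus), but it captures the same basic job: given the text so far, predict what comes next.

```python
# Toy next-word predictor: count which word most often follows each
# word in a small corpus, then predict the most frequent follower.
# Real LLMs learn far richer statistics, but the task is the same.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Map each word to a frequency count of the words that follow it.
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def predict_next(word):
    """Return the most frequent follower of `word`, or None."""
    candidates = follows.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("the"))  # "cat" follows "the" most often here
```

Chaining such predictions word after word is, at a very high level, how a chatbot's reply gets generated.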

One of the key innovations of these transformers is the self-attention mechanism. It's difficult to explain in a paragraph, but in essence it means that words in a sentence aren't considered in isolation, but also in relation to each other in a variety of sophisticated ways. It allows for a greater level of understanding than would otherwise be possible.
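For the curious, the core of self-attention can be sketched in plain Python. This is a stripped-down version with tiny hand-picked vectors (real models use learned query/key/value projections and much higher dimensions): each word's vector is compared against every other word's vector, and its new representation becomes a weighted blend of the whole sentence rather than standing alone.

```python
# Minimal self-attention sketch: each output vector is a softmax-
# weighted mix of all input vectors, so every word's representation
# incorporates context from the rest of the sentence.
import math

def softmax(scores):
    """Turn raw similarity scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(vectors):
    """Mix each vector with every other, weighted by similarity."""
    d = len(vectors[0])
    outputs = []
    for query in vectors:
        # Scaled dot-product similarity of this word to every word.
        weights = softmax([dot(query, key) / math.sqrt(d)
                           for key in vectors])
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors))
                 for i in range(d)]
        outputs.append(mixed)
    return outputs

# Three toy 2-d "word vectors" standing in for a 3-word sentence:
out = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Because each output is a blend of all the inputs, no word is processed in isolation, which is exactly the property the paragraph above describes.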

There is some randomness and variation built into the code, which is why you won't get the same response from a transformer chatbot every time. The autocorrect idea also explains how errors can creep in. On a fundamental level, ChatGPT and Google Bard don't know what's accurate and what isn't. They're looking for responses that seem plausible and natural, and that match up with the data they've been trained on.
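That built-in randomness is usually implemented by sampling the next word from the model's predicted probabilities instead of always picking the single most likely one. The sketch below assumes a "temperature" knob (a standard name in LLM APIs, though the exact mechanics vary by model) that controls how much randomness is allowed.

```python
# Sketch of temperature sampling: pick the next word at random from
# a probability distribution, reshaped by a temperature setting.
# Low temperature -> almost always the top word; high -> more varied.
import math
import random

def sample_next(word_probs, temperature=1.0):
    """Sample a word from {word: probability}, reshaped by temperature."""
    words = list(word_probs)
    # Dividing log-probabilities by the temperature sharpens the
    # distribution (T < 1) or flattens it toward uniform (T > 1).
    exps = [math.exp(math.log(word_probs[w]) / temperature)
            for w in words]
    r = random.random() * sum(exps)
    for w, e in zip(words, exps):
        r -= e
        if r <= 0:
            return w
    return words[-1]

probs = {"cat": 0.6, "dog": 0.3, "fish": 0.1}
# At near-zero temperature this almost always returns "cat"; at high
# temperature, "dog" and "fish" show up far more often.
```

Run it twice at the default temperature and you may well get different words, which is why two identical prompts to a chatbot can yield different replies.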
