Leading open-source large language models for conversational AI


Conversational AI refers to technologies, such as virtual agents and chatbots, that use large amounts of data and natural language processing to mimic human interaction and to recognize speech and text. The conversational AI landscape has grown rapidly in recent years, particularly since the launch of ChatGPT. Here are some open-source large language models (LLMs) that are reshaping conversational AI.

  • release date: February 24, 2023

LLaMA is a foundation large language model developed by Meta AI. It is designed to be more versatile and responsible than other models. Its release aims to democratize access for the research community and to promote responsible AI practices.

LLaMA is available in several sizes, ranging from 7B to 65B parameters. Access to the model is granted on a case-by-case basis to industry research laboratories, academic researchers, and similar groups.
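To give a feel for what these parameter counts mean in practice, here is a minimal sketch that estimates the memory needed just to hold the weights, as parameters times bytes per parameter. The helper name `weight_memory_gb` is hypothetical, the 7B and 65B figures are the nominal sizes quoted above, and the estimate ignores activations and other runtime overhead.

```python
# Rough weight-memory estimate: number of parameters x bytes per parameter.
# Byte widths are the usual sizes for these numeric formats; the helper is illustrative only.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Nominal sizes of the smallest and largest LLaMA models mentioned in the text.
for name, params in [("7B", 7e9), ("65B", 65e9)]:
    print(f"LLaMA {name}: ~{weight_memory_gb(params):.0f} GB of weights in fp16")
```

Under these assumptions the smallest model needs roughly 14 GB for its weights in half precision and the largest roughly 130 GB, which is one reason the smaller variants are popular starting points for fine-tuning.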


  • release date: March 8, 2023

Open Assistant is a project developed by LAION-AI to give everyone access to a capable chat-based large language model. Through extensive training on large amounts of text and code, it has gained the ability to perform a variety of tasks, including answering questions, generating text, translating languages, and producing creative content.

Although Open Assistant is still in development, it has already acquired many skills, including interacting with external systems such as Google Search to gather information. It is also an open-source initiative, so anyone can contribute to its progress.

  • release date: March 8, 2023

Dolly is an instruction-following LLM developed by Databricks. It is trained on the Databricks machine learning platform and is licensed for commercial use. Dolly is based on the Pythia 12B model and has been fine-tuned on approximately 15K instruction/response records covering a wide range of tasks. Although Dolly is not a state-of-the-art model, its instruction-following behavior is of surprisingly high quality.

  • release date: March 13, 2023

Alpaca is a small instruction-following model developed by Stanford University. It is based on Meta's LLaMA model (7B parameters). It is designed to perform well on many instruction-following tasks while remaining easy and cheap to reproduce.

Although it behaves similarly to OpenAI's text-davinci-003 model, it is much cheaper to produce (under $600). The model is open source and has been trained on a dataset of 52,000 instruction-following demonstrations.
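An instruction-following demonstration is typically a small record with an instruction, an optional input, and the desired output, which is rendered into a prompt before fine-tuning. The sketch below is modeled on the prompt template published in the Stanford Alpaca repository. Treat the exact wording of the template, the field names, and the `build_prompt` helper as assumptions for illustration, not as the project's definitive code.

```python
# Prompt templates modeled on the Stanford Alpaca format (wording assumed, not verified here).
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_prompt(record: dict) -> str:
    """Render one demonstration record into the text the model is prompted with."""
    if record.get("input"):
        return PROMPT_WITH_INPUT.format(
            instruction=record["instruction"], input=record["input"]
        )
    return PROMPT_NO_INPUT.format(instruction=record["instruction"])


# A made-up demonstration record; during fine-tuning the "output" field is the target text.
example = {
    "instruction": "Translate the sentence to French.",
    "input": "Good morning",
    "output": "Bonjour",
}
prompt = build_prompt(example)
print(prompt + example["output"])
```

During fine-tuning, the model learns to continue the rendered prompt with the record's output. At inference time the same template is used and the response is left for the model to generate.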

Vicuna was developed by a team from UC Berkeley, CMU, Stanford, and UC San Diego. It is a chatbot trained by fine-tuning the LLaMA model on user-shared conversations collected from ShareGPT.

Based on the transformer architecture, Vicuna is an autoregressive language model that provides natural and engaging conversational capabilities. With 13B parameters, it produces more detailed and better-structured answers than Alpaca, and its quality is reported to be comparable to that of ChatGPT.
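"Autoregressive" means the model produces text one token at a time, with each new token chosen based on what has been generated so far. The toy sketch below illustrates that loop with greedy decoding. The probability table is made up and stands in for a trained network, and unlike a real transformer it looks only at the previous token instead of the full prefix.

```python
# Hypothetical next-token probabilities standing in for a trained model.
# A real transformer conditions on the whole prefix, not just the last token.
NEXT_TOKEN_PROBS = {
    "<s>": {"hello": 0.9, "there": 0.1},
    "hello": {"there": 0.7, "</s>": 0.3},
    "there": {"</s>": 1.0},
}


def generate(max_tokens: int = 10) -> list[str]:
    """Greedy autoregressive decoding: repeatedly append the most likely next token."""
    tokens = ["<s>"]  # start-of-sequence marker
    for _ in range(max_tokens):
        probs = NEXT_TOKEN_PROBS[tokens[-1]]
        next_token = max(probs, key=probs.get)  # greedy choice
        if next_token == "</s>":  # end-of-sequence marker stops generation
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the start marker


print(" ".join(generate()))
```

Chat models typically replace the greedy choice with sampling so that answers vary, but the generate-append-repeat loop is the same.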

  • release date: April 3, 2023

The Berkeley Artificial Intelligence Research Lab (BAIR) has developed Koala, a dialogue model based on the LLaMA 13B model. It is intended to be safer and more easily interpretable than other LLMs. Koala has been fine-tuned on freely available interaction data, with a focus on data that includes interaction with highly capable closed-source models.

Koala is useful for studying language model safety and bias and for understanding the inner workings of dialogue language models. In addition, Koala is an open-source alternative to ChatGPT, released alongside EasyLM, a framework for training and fine-tuning LLMs.

EleutherAI has created Pythia, a suite of autoregressive language models designed to support scientific research. Pythia consists of 16 models ranging from 70M to 12B parameters. All models are trained using the same data and architecture, which allows for direct comparisons and for exploring how their behavior evolves with scale.
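The count of 16 can be read as eight model sizes, each trained on two versions of the training data (the original corpus and a deduplicated copy). The sketch below enumerates that grid. The size labels and the `-deduped` suffix are assumed to follow the naming of EleutherAI's public release, so check them against the official model listing before relying on them.

```python
from itertools import product

# Size labels assumed from EleutherAI's public Pythia release (70M up to 12B parameters).
SIZES = ["70m", "160m", "410m", "1b", "1.4b", "2.8b", "6.9b", "12b"]
# Each size is assumed to exist in a standard and a deduplicated-data variant.
VARIANTS = ["", "-deduped"]

model_names = [f"pythia-{size}{variant}" for size, variant in product(SIZES, VARIANTS)]

print(len(model_names))  # 8 sizes x 2 data variants
for name in model_names:
    print(name)
```

Because every model in the grid shares the same architecture family and training setup, differences between any two entries can be attributed to size or to deduplication, not to unrelated design choices.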

  • release date: April 5, 2023

Together developed OpenChatKit, an open-source chatbot development framework that aims to streamline and simplify the process of building conversational AI applications. The chatbot is designed for conversation and instruction following, and it excels at summarization, table generation, classification, and dialogue.

With OpenChatKit, developers get a robust, open-source foundation for creating both specialized and general-purpose chatbots for different applications. Its chat models are fine-tuned from openly available base models such as EleutherAI's GPT-NeoX, and the project offers more than one model size to accommodate diverse computational resources and application requirements.

  • release date: April 13, 2023

RedPajama is a project created by a team from Together, Ontocord.ai, ETH DS3Lab, Stanford CRFM, Hazy Research, and the MILA Québec AI Institute. Their goal is to develop leading open-source models, starting by reproducing the LLaMA training dataset of more than 1.2 trillion tokens.

The project aims to create a fully open, reproducible, and evolving language model with three key components: pre-training data, base models, and instruction-tuning data and models. The dataset is currently accessible through Hugging Face, and users can reproduce the results using Apache 2.0-licensed scripts, which are available on GitHub.

  • release date: April 19, 2023

StableLM is an open-source language model developed by Stability AI. The model is trained on an experimental dataset three times larger than The Pile and performs well on conversational and coding tasks despite its small size. The model comes in 3B and 7B parameter sizes, with larger models yet to come.

StableLM can generate both text and code, which makes it suitable for many downstream applications. Stability AI also provides a set of instruction fine-tuned research models, trained on a combination of five recent open-source datasets designed specifically for conversational agents. These fine-tuned models are intended for research use only and are available under a noncommercial CC BY-NC-SA 4.0 license.







