Large language models (LLMs) are constantly making headlines. With their extraordinary capabilities and applications across fields, new research papers and model updates appear almost every day. Existing LLMs have enormous parameter counts and are trained on trillions of tokens, which makes both training and inference very expensive.
In a recent paper, researchers from Stanford University and Cornell University proposed a method to address the high cost of running LLMs over large document collections. The team highlights how expensive processing large corpora with language models can be: at a price of roughly $0.002 per 1,000 tokens, running inference over the more than 55 million Wikipedia pages would cost over $100,000. The approach proposed by the authors reduces inference cost by a factor of 110 while also improving result quality compared to running the LLM directly on each document.
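As a rough illustration of that figure, here is a back-of-envelope estimate; the tokens-per-page value is an assumption for illustration, not a number from the paper.

```python
# Back-of-envelope inference cost estimate (illustrative assumptions only).
PRICE_PER_1K_TOKENS = 0.002      # USD, the per-1,000-token price cited above
NUM_PAGES = 55_000_000           # Wikipedia pages mentioned in the paper
AVG_TOKENS_PER_PAGE = 900        # assumed average page length; not from the paper

total_tokens = NUM_PAGES * AVG_TOKENS_PER_PAGE
cost = total_tokens / 1_000 * PRICE_PER_1K_TOKENS
print(f"~{total_tokens / 1e9:.0f}B tokens -> roughly ${cost:,.0f}")
# ~50B tokens -> roughly $99,000, i.e. on the order of $100,000
```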
The authors call their prototype system EVAPORATE and identify two distinct strategies for implementing it. The first is to prompt the LLM to extract values directly from the documents. The second is to prompt the LLM to synthesize code that performs the extraction. The team evaluated the two approaches and found a trade-off between cost and quality: synthesizing code is far cheaper, but it is also less accurate than processing each document directly with the LLM.
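To make the two strategies concrete, here is a minimal sketch of what each might look like with a generic chat-completion client; the prompts, the `llm()` helper, and the function names are hypothetical illustrations, not the paper's actual implementation.

```python
def llm(prompt: str) -> str:
    """Placeholder for a call to any LLM API; assumed here, not the paper's code."""
    raise NotImplementedError

# Strategy 1: direct extraction -- the LLM reads every document.
def extract_direct(document: str, attribute: str) -> str:
    prompt = f"Extract the value of '{attribute}' from the document below.\n\n{document}"
    return llm(prompt)

# Strategy 2: code synthesis -- the LLM writes a reusable extraction function once,
# which can then be run over all documents without further LLM calls.
def synthesize_extractor(sample_documents: list[str], attribute: str) -> str:
    prompt = (
        f"Write a Python function extract(doc: str) -> str that returns the value "
        f"of '{attribute}' from documents formatted like these examples:\n\n"
        + "\n---\n".join(sample_documents)
    )
    return llm(prompt)  # returns source code as a string
```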
EVAPORATE identifies and exploits redundancy across documents to improve efficiency. The team illustrates this with the task of extracting a device classification attribute from FDA reports for medical devices: instead of processing every semi-structured document with the LLM, the authors use the LLM to synthesize reusable functions that are then applied to each document.
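Continuing the hypothetical helpers above, a synthesized function might be reused across the corpus along these lines; the point is that LLM calls stay constant while the cheap generated code runs over every document. This is only a sketch of the idea, not the paper's code.

```python
def run_synthesized_extractor(extractor_source: str, documents: list[str]) -> list[str]:
    """Execute LLM-generated extraction code on every document without further LLM calls."""
    namespace: dict = {}
    exec(extractor_source, namespace)      # assumes the generated code defines extract(doc)
    extract = namespace["extract"]
    return [extract(doc) for doc in documents]

# Usage sketch: the LLM sees only a handful of sample FDA reports,
# but the resulting function runs over the entire corpus.
# code = synthesize_extractor(fda_reports[:5], "device classification")
# values = run_synthesized_extractor(code, fda_reports)
```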
To improve quality while keeping cost down, the team proposes an extended code-synthesis implementation called EVAPORATE-CODE+. This approach generates many candidate functions and aggregates their extractions using weak supervision. While weak supervision has traditionally been applied to human-generated functions, EVAPORATE-CODE+ operates over machine-generated functions and addresses the challenges of that setting to achieve its quality gains.
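A minimal sketch of the aggregation idea is shown below, reduced here to simple majority voting over the outputs of several candidate functions; the paper's actual weak-supervision estimator weights functions by estimated quality, which this toy version does not attempt.

```python
from collections import Counter

def aggregate_extractions(candidate_outputs: list[list[str | None]]) -> list[str | None]:
    """Combine per-document outputs of several candidate extraction functions by majority vote.

    candidate_outputs[i][j] is what candidate function i returned for document j
    (None if that function failed on the document). This is a toy stand-in for the
    weak-supervision aggregation used in EVAPORATE-CODE+.
    """
    num_docs = len(candidate_outputs[0])
    aggregated = []
    for j in range(num_docs):
        votes = [outputs[j] for outputs in candidate_outputs if outputs[j] is not None]
        aggregated.append(Counter(votes).most_common(1)[0][0] if votes else None)
    return aggregated
```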
EVAPORATE is evaluated on 16 sets of documents spanning a range of formats, topics, and attribute types. EVAPORATE-CODE+ outperforms state-of-the-art systems while making only a sublinear pass over the documents with the LLM, reducing the number of tokens the LLM must process by a factor of 110 on average across the 16 evaluation settings of 10,000 documents each.
In conclusion, the paper presents a promising approach to automating table extraction from semi-structured documents with LLMs. By characterizing the trade-off between direct extraction and code synthesis, and by proposing an extended implementation that achieves better quality at low cost, this work marks clear progress for the data management community.
Check out the paper and repo for more details.
Tania Malhotra is a final-year student at the University of Petroleum and Energy Studies, Dehradun, pursuing a BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is passionate about data science, has strong analytical and critical-thinking skills, and has a keen interest in acquiring new skills, leading groups, and managing work in an organized manner.