Does one size fit all or is there a need for domain-specific LLMs?
Prasenjit Dey, Senior Vice President, Merlyn Mind
Massive generic models such as the GPT series, Bard, PaLM, and others, trained on large generic corpora, have shown exceptional promise and capabilities beyond what they were originally designed for. GPT-4 can process multimodal input and perform complex reasoning using chain-of-thought prompting. So what happens when these generic models meet highly specific industry-vertical requirements? Can they perform as well as they do on general tasks?
We see numerous experiments with LLMs trained on domain-specific corpora in fields such as finance, medicine, law, and education. These models are smaller (on the order of 50B parameters) but are trained on far more tokens relative to their parameter counts than generic models. They have consistently outperformed generic models 5-10x their size on industry-specific tasks. So should every industry vertical have a domain-specific model, or are generic models good enough? What are the cost-performance tradeoffs? I will also share some thoughts on how the education industry, where we work, can benefit from the best of both worlds.
Get in touch with us
Join our Slack community: https://toloka.ai/community
Check out more of our events: https://toloka.ai/events
Read our Medium: https://medium.com/toloka
Follow us on social media so you won't miss any updates.
Twitter: https://twitter.com/TolokaAI
Facebook: https://facebook.com/globaltoloka
LinkedIn: https://linkedin.com/company/toloka/