Remember a year ago, back in November before we knew about ChatGPT, when machine learning was all about building models to solve a single task like loan approval or fraud detection? That approach may seem to have gone out the window with the advent of generalized LLMs, but the truth is that generalized models are not a perfect fit for every problem, and task-based models are still alive and well in the enterprise.
These task-based models have, until the advent of LLMs, been the foundation of most enterprise AI, and they are not going away. It’s what Amazon CTO Werner Vogels referred to as “good old-fashioned AI” in his keynote this week, and in his view, it’s the kind of AI that still solves a lot of real-world problems.
Atul Deo, general manager of Amazon’s Bedrock, a product introduced earlier this year as a way to connect to a variety of large language models via APIs, believes task models won’t simply disappear. Instead, they have become another AI tool in the arsenal.
“Before the advent of large language models, we were mostly in a task-specific world. The idea was to train a model from scratch to do a particular task,” Deo told TechCrunch. He says the main difference between a task model and an LLM is that one is trained on a narrowly defined task, while the other can deal with things outside the boundaries of the model.
The industry has been talking about emergent capabilities in large language models, such as out-of-domain reasoning, says Jon Turow, a partner at investment firm Madrona who spent nearly a decade at AWS. “This allows you to be able to expand beyond the narrow definition of what the model was initially expected to do,” he said. But he added that it is still up for debate how far these capabilities can go.
Like Deo, Turow says task models won’t suddenly disappear. “Clearly there is still a role for task-specific models because they can be smaller, they can be faster, they can be cheaper, and in some cases they can be more performant because they are designed for a specific task,” he said.
But the lure of a multi-purpose model is hard to ignore. “When you look at the aggregate level in the company, when there are hundreds of machine learning models being trained separately, it doesn’t make any sense,” Deo said. “Whereas if you choose a more capable large language model, you get the advantage of reuse right away, while allowing you to use a single model to address a range of different use cases.”
For Amazon, SageMaker, the company’s machine learning operations platform, remains a flagship product, and one aimed at data scientists rather than developers, as is the case with Bedrock. Amazon reports that tens of thousands of customers are building millions of models on it. It would be reckless to abandon that, and frankly, just because LLMs are the flavor of the moment doesn’t mean the technology that came before won’t remain relevant for some time to come.
Enterprise software in particular doesn’t work that way. No one simply abandons a large investment because something new comes along, even if it’s as powerful as the current crop of large language models. It’s worth noting that Amazon announced upgrades to SageMaker this week, aimed squarely at managing large language models.
Before these larger, more capable language models, the task model was really the only option, and that’s how companies approached it: by building a team of data scientists to help develop these models. So what is the role of a data scientist in the era of large language models, where tools are aimed at developers? Turow believes they still have a major job to do, even at companies that focus on LLMs.
“They will think critically about data, and that’s actually a growing role, not a shrinking one,” he said. Regardless of the model, Turow believes data scientists will help people understand the relationship between AI and data within large companies.
“I think all of us need to think critically about what AI is and isn’t capable of, what data means and what it doesn’t mean,” he said. This is true regardless of whether you are building a more general large language model or a task model.
That’s why these two approaches will continue to work in tandem for some time to come: because sometimes bigger is better, and sometimes it’s not.