
When training language models, a major challenge is not only securing adequate hardware and large volumes of source data, but also the cost of training itself. According to Alibaba, avoiding third-party search platforms can cut training costs by up to 88%.

The approach, as reported by the South China Morning Post, is called ZeroSearch. Here, the task of generating the data needed to train new models falls on existing language models. In essence, they simulate calls to third-party search services, but at a much lower cost. For example, sending 64,000 requests to Google through its API would cost developers $586.70, while an AI model with 14 billion parameters can process the same number of requests for no more than $70.80. That is a saving of more than eightfold.
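
The two figures quoted above are enough to check both the "more than eightfold" claim and the headline 88% number. The short Python sketch below does only that arithmetic; the variable and function names are illustrative and do not come from Alibaba or ZeroSearch.

```python
# Costs quoted in the article, in US dollars, for 64,000 search requests.
GOOGLE_API_COST = 586.70     # real search engine via API
SIMULATED_COST = 70.80       # 14-billion-parameter model simulating search


def savings_ratio(baseline: float, alternative: float) -> float:
    """How many times cheaper the alternative is than the baseline."""
    return baseline / alternative


def savings_percent(baseline: float, alternative: float) -> float:
    """Share of the baseline cost that is saved, in percent."""
    return (1 - alternative / baseline) * 100


fold = savings_ratio(GOOGLE_API_COST, SIMULATED_COST)
pct = savings_percent(GOOGLE_API_COST, SIMULATED_COST)

print(f"{fold:.2f}x cheaper, {pct:.1f}% saved")  # 8.29x cheaper, 87.9% saved
```

Rounded, 87.9% matches the "up to 88%" figure, so the two claims describe the same comparison.
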
This approach should help smaller companies that lack large infrastructure and budgets make faster progress in developing AI systems. Alibaba itself already combines its Qwen family of models with search engines to answer complex search queries more accurately.