OpenAI and rivals seek new path to smarter AI as current methods hit limitations

investing.com 14/11/2024 - 21:02

AI Companies Seek New Training Techniques

By Krystal Hu and Anna Tong

(Reuters) – Artificial intelligence companies like OpenAI are seeking to overcome unexpected delays and challenges in the pursuit of ever-larger language models by developing training techniques that mimic more human-like ways of thinking.

A dozen AI scientists, researchers, and investors told Reuters that the methods behind OpenAI's recently released o1 model could reshape the AI arms race and shift the resources AI companies demand in huge quantities, from energy to types of chips.

OpenAI declined to comment. Since the release of ChatGPT two years ago, tech companies have publicly maintained that scaling up existing models with more data and computing power would consistently enhance AI capabilities.

Prominent AI scientists are now expressing skepticism about the “bigger is better” philosophy. Ilya Sutskever, co-founder of Safe Superintelligence (SSI) and OpenAI, indicated that results from scaling pre-training have plateaued.

“The 2010s were the age of scaling; now we’re back in the age of wonder and discovery. Everyone is looking for the next thing,” Sutskever said, emphasizing the importance of scaling the right aspects.

Researchers at major AI labs have run into delays and setbacks in their efforts to release models that exceed OpenAI's GPT-4, now nearly two years old. Training runs for large models can cost tens of millions of dollars, with hundreds of chips running simultaneously.

Another challenge is that large language models devour enormous quantities of data, and the easily accessible data in the world has largely been exhausted. Power shortages further complicate training runs, which demand vast amounts of energy.

To tackle these issues, researchers are exploring “test-time compute,” a technique that enhances AI models during the “inference” phase, when the model is actually being used. Rather than immediately committing to a single answer, a model can generate and evaluate multiple candidate responses in real time, ultimately choosing the best path forward.
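
The article does not spell out how such systems work internally, but one simple form of test-time compute is “best-of-n” sampling. The sketch below is a minimal illustration under that assumption; generate() and score() are hypothetical placeholders standing in for a language model and a verifier, not OpenAI's actual, unpublished method.

    import random

    # Minimal sketch of test-time compute via best-of-n sampling.
    # generate() and score() are hypothetical stand-ins for a language
    # model and a verifier/reward model; o1's real mechanism is not public.

    def generate(prompt):
        # Placeholder: one sampled completion from a model.
        return "candidate %.3f for: %s" % (random.random(), prompt)

    def score(prompt, answer):
        # Placeholder: a verifier's quality estimate for an answer.
        return random.random()

    def best_of_n(prompt, n=8):
        # Spend extra inference-time compute: sample n candidate answers,
        # then return the one the verifier rates highest.
        candidates = [generate(prompt) for _ in range(n)]
        return max(candidates, key=lambda a: score(prompt, a))

    print(best_of_n("What is 17 * 24?"))

The extra compute is spent at answer time rather than training time: the more candidates sampled and scored, the better the final pick tends to be.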

OpenAI has implemented this approach in its new o1 model, which can work through problems in a multi-step manner, similar to human reasoning. The model also draws on data and feedback curated from experts, and adds a further layer of training on top of base models such as GPT-4.

Other AI labs, including Anthropic, xAI, and Google DeepMind, are also developing similar techniques. “We see low-hanging fruit to improve these models quickly,” said Kevin Weil, OpenAI's chief product officer.

This shift could alter the competitive landscape for AI hardware, which has so far been dominated by demand for Nvidia's AI chips. Prominent venture capital investors are taking note of the transition and weighing its impact on their investments.

“This shift will move us from a world of massive pre-training clusters toward inference clouds,” said Sequoia Capital's Sonya Huang. Nvidia dominates the market for training chips, but it could face more competition in the inference market.

Nvidia's CEO Jensen Huang has acknowledged the growing demand for using its chips for inference, describing it as a second scaling law and pointing to the rising importance of the techniques behind the o1 model.



