OpenAI making dangerous models by lowering quality to rush release!

https://StartupHakk.com/?v=_crkUzPUIoI

According to OpenAI's own testing on their PersonQA benchmark, which measures the accuracy of a model's knowledge about people, the new o3 model hallucinates 33% of the time - in other words, one-third of its responses contain made-up information. That hallucination rate is roughly double what we saw in their previous reasoning models, o1 and o3-mini, which came in at 16% and 14.8% respectively. The smaller o4-mini model performs even worse, hallucinating a staggering 48% of the time - nearly half of what it tells you about people is completely fabricated.

Throughout my 25 years in software development, I've never seen a major tech company proudly release a product that performs worse than its predecessor on such a critical metric. What's particularly concerning is that, historically, each new AI model generation has reduced hallucination rates, making this a troubling reversal of progress. This regression suggests OpenAI may be prioritizing other capabilities, like speed or reasoning, at the expense of basic factual accuracy - an extremely dangerous trade-off.

In their own technical report for o3 and o4-mini, OpenAI explicitly states that "more research is needed" to understand why hallucinations are getting worse as they scale up their reasoning models. The company theorizes that because these models "make more claims overall," they naturally make "more accurate claims as well as more inaccurate/hallucinated claims" - a concerning lack of understanding of their own technology. Former OpenAI researcher Neil Chowdhury suggests that "the kind of reinforcement learning used for o-series models may amplify issues that are usually mitigated by standard post-training pipelines."

As someone who's built and managed complex software systems for 25 years, I can tell you it's incredibly concerning when developers don't understand why their systems behave in unexpected ways. It suggests OpenAI may be moving too quickly through its development cycle, rushing to release models without fully understanding their behaviors or limitations. The fact that they released these models publicly while openly admitting this knowledge gap shows a potentially dangerous disregard for the consequences of deploying flawed AI systems.

#coding #codingbootcamp #softwaredeveloper #CodeYourFuture