So I threw GPT-4o and Claude 3.5 Sonnet into a chess match to see who's got the better skills. Then I made them both face Stockfish (spoiler: it doesn't end well for them).
I used some Python (the OpenAI and Anthropic SDKs behind FastAPI) to set everything up and let them play against each other. It's pretty interesting to see how these LLMs handle a chess board!
If you're curious which AI is worse at chess, or just want to watch them get demolished by Stockfish, check it out. I go through the code too, but you can skip ahead to the games if that's your thing.
--
Timeline:
00:00 LLMs and Chess
01:52 Code Example: OpenAI and Anthropic SDKs in FastAPI
05:25 GPT-3.5 Turbo vs. Claude 3.5 Sonnet
07:25 GPT-4o vs. Claude 3.5 Sonnet
08:46 Claude 3.5 Sonnet vs. Stockfish
10:03 GPT-4o vs. Stockfish
--
Blog: https://www.gettingstarted.ai/chatgpt-vs-stockfish/
--
Chess piece icons by Colin M.L. Burnett (https://en.wikipedia.org/wiki/User:Cburnett), licensed under Creative Commons Attribution-Share Alike 3.0 Unported (CC BY-SA 3.0).
Source: https://commons.wikimedia.org/wiki/Category:SVG_chess_pieces
License: https://creativecommons.org/licenses/by-sa/3.0/