Vision-Language Synergy in ARC
Think Visually, Reason Textually: Vision-Language Synergy in ARC, by Beichen Zhang et al., arXiv:2511.15703
This paper shows the performance gains we can achieve on abstract reasoning tasks by combining text-based and visual strategies. I’ve always found this general idea interesting: availing ourselves of heuristics from other domains and leveraging transfer effects. It’s also nice to see a comparison of open-source models like Qwen3 with closed-source models like GPT and Gemini.