Vision capability
The model accepts images alongside text input.
The model accepts images alongside text. Quality varies hugely — text-recognition (OCR) is largely solved across major models; nuanced visual reasoning (charts, diagrams, UI screenshots) is not.