Tech News

Microsoft brings out a small language model that can look at pictures

Illustration: The Verge

Microsoft announced a new version of its small language model, Phi-3, which can look at images and tell you what’s in them.

Phi-3-vision is a multimodal model, meaning it can read both text and images, and is best used on mobile devices. Microsoft says Phi-3-vision, now available in preview, is a 4.2 billion parameter model (the parameter count is a rough measure of a model’s size and complexity) that can handle general visual reasoning tasks, such as answering questions about charts or images.

But Phi-3-vision is far smaller than other image-focused AI models like OpenAI’s DALL-E or Stability AI’s Stable Diffusion. Unlike those models, Phi-3-vision doesn’t generate images, but it can understand what’s in an image and analyze it for a…


