Image-to-text

Models that generate text from a given image.

clip-interrogator

The CLIP Interrogator is a prompt engineering tool that combines OpenAI's CLIP and Salesforce's BLIP to optimise text prompts to match a given image. Use the resulting prompts with text-to-image models like Stable Diffusion to create cool art!

20.8k runs
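The idea behind clip-interrogator can be sketched roughly as follows: start from a caption, score candidate style phrases against the image in a shared embedding space, and append the best matches to form a prompt. This is only an illustration under stated assumptions, not the tool's actual code. The three-dimensional vectors are made-up stand-ins for CLIP embeddings, the caption string stands in for BLIP output, and the names `cosine` and `build_prompt` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def build_prompt(caption, image_vec, phrase_vecs, top_k=2):
    """Append the top_k phrases most similar to the image to the caption."""
    ranked = sorted(
        phrase_vecs,
        key=lambda phrase: cosine(image_vec, phrase_vecs[phrase]),
        reverse=True,
    )
    return ", ".join([caption] + ranked[:top_k])


# Toy data (assumed): a real pipeline would get these from CLIP encoders.
image_vec = [0.9, 0.1, 0.3]
phrase_vecs = {
    "oil painting": [0.8, 0.2, 0.3],
    "pixel art": [0.1, 0.9, 0.1],
    "dramatic lighting": [0.7, 0.0, 0.6],
}

# Placeholder caption (assumed): a captioning model such as BLIP would supply it.
prompt = build_prompt("a lighthouse on a cliff", image_vec, phrase_vecs)
print(prompt)
```

With these toy vectors, "pixel art" scores lowest and is left out of the prompt. The resulting string is the kind of text that could then be passed to a text-to-image model.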

img2prompt

Get an approximate text prompt, with style, matching an image.

58.3k runs

blip-2

Answers questions about images.

54.2k runs