
BLIP-2

BLIP-2 (Bootstrapping Language-Image Pre-training) is a model that answers questions about images. To use it, provide an image, then ask a question about that image.

Required: 5,000 $wRAI
54.2k runs
Input

image
Input image
caption
Select if you want to generate image captions instead of asking questions
question
Question to ask about this image. Leave blank for captioning
context
Optional: previous questions and answers to be used as context for answering the current question
use_nucleus_sampling
Toggles whether the model uses nucleus sampling to generate responses
temperature
Temperature for use with nucleus sampling (minimum: 0.5; maximum: 1)
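The temperature and nucleus-sampling parameters interact in a standard way: logits are divided by the temperature, converted to probabilities, and then sampling is restricted to the smallest set of tokens whose cumulative probability reaches a threshold p (the "nucleus"). A minimal sketch in plain Python; the `top_p` value and toy logits below are illustrative, not this model's actual settings:

```python
import math
import random

def nucleus_sample(logits, temperature=1.0, top_p=0.9, rng=random):
    # Scale logits by temperature: higher temperature flattens the distribution.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort tokens by probability and keep the smallest set whose
    # cumulative probability reaches top_p (the nucleus).
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, cum = [], 0.0
    for i in order:
        nucleus.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalise over the nucleus and sample from it.
    mass = sum(probs[i] for i in nucleus)
    r = rng.random() * mass
    for i in nucleus:
        r -= probs[i]
        if r <= 0:
            return i
    return nucleus[-1]
```

Lower temperatures (down to the 0.5 minimum) sharpen the distribution and make answers more deterministic; a temperature of 1 leaves the model's distribution unchanged.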
Hold at least 5,000 wRAI to use this model.
The Multi AI platform is completely free, but most models are only accessible to wRAI token holders. If you have any questions, feel free to ask in our Telegram chat.
Input
image:
question: what body of water does this bridge cross?
temperature: 1
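A request like the example above can be assembled programmatically. The field names and validation below are assumptions read off the input form on this page, not a documented API schema:

```python
def build_blip2_input(image_path, question=None, context=None,
                      caption=False, use_nucleus_sampling=False,
                      temperature=1.0):
    """Assemble an input payload matching this page's form (assumed schema)."""
    # Enforce the documented temperature range (minimum: 0.5; maximum: 1).
    if not 0.5 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.5 and 1")
    # A question is required unless captioning mode is selected.
    if not caption and not question:
        raise ValueError("provide a question, or set caption=True")
    payload = {
        "image": image_path,                        # uploaded or referenced image
        "caption": caption,                         # True to caption instead of answering
        "use_nucleus_sampling": use_nucleus_sampling,
        "temperature": temperature,
    }
    if question:
        payload["question"] = question
    if context:
        payload["context"] = context                # prior Q&A used as context
    return payload
```

Called as `build_blip2_input("bridge.jpg", question="what body of water does this bridge cross?", temperature=1.0)`, this reproduces the example input shown above.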

Readme

The CLIP Interrogator uses the OpenAI CLIP models to test a given image against a variety of artists, mediums, and styles, to study how the different models see the content of the image. It then combines the results with a BLIP caption to suggest a text prompt for creating more images similar to the one given.
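The ranking step described above, scoring an image against candidate artist, medium, and style terms, reduces to cosine similarity in CLIP's shared embedding space. A minimal sketch with placeholder vectors; in real use the embeddings would come from CLIP's image and text encoders, and the terms here are purely illustrative:

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank_terms(image_emb, term_embs):
    """Sort candidate terms by similarity to the image embedding, best first."""
    scored = [(term, cosine(image_emb, emb)) for term, emb in term_embs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Placeholder embeddings standing in for CLIP encoder outputs.
image_emb = [0.9, 0.1, 0.0]
terms = {"oil painting": [1.0, 0.0, 0.0], "watercolor": [0.0, 1.0, 0.0]}
ranking = rank_terms(image_emb, terms)
```

The highest-scoring terms are then concatenated with the BLIP caption to form the suggested prompt.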