🤖 Vision Language AI Demo

Interactive application showcasing multiple vision-language AI capabilities

Upload an image, and the AI will generate a description of it.

Upload Image

Generated Caption

📸 Click on examples to try

📚 About This Application

Models: BLIP (Captioning & VQA) + CLIP (Classification)
Framework: Gradio + Transformers
Deployment: Can be deployed to Hugging Face Spaces
Open Source: All models are open source
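The captioning path described above (BLIP behind a Gradio interface) might look roughly like the following. This is a minimal sketch, not the application's actual source: the checkpoint name `Salesforce/blip-image-captioning-base`, the helper names `load_blip_captioner`, `generate_caption`, and `build_demo`, and the 30-token generation limit are all assumptions made for illustration.

```python
from typing import Any, Callable


def load_blip_captioner(
    model_id: str = "Salesforce/blip-image-captioning-base",  # assumed checkpoint
) -> Callable[[Any], str]:
    """Load BLIP once and return a function that maps a PIL image to a caption."""
    # Imported lazily so the helpers below can be used without the heavy dependencies.
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_id)
    model = BlipForConditionalGeneration.from_pretrained(model_id)

    def caption(image: Any) -> str:
        # Preprocess the image, generate token ids, and decode them to text.
        inputs = processor(images=image, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=30)
        return processor.decode(output_ids[0], skip_special_tokens=True)

    return caption


def generate_caption(image: Any, captioner: Callable[[Any], str]) -> str:
    """Handle the empty-upload case, then delegate to the captioner."""
    if image is None:
        return "Please upload an image first."
    return captioner(image).strip()


def build_demo(captioner: Callable[[Any], str]):
    """Wire the captioner into a Gradio interface (hypothetical layout)."""
    import gradio as gr

    return gr.Interface(
        fn=lambda image: generate_caption(image, captioner),
        inputs=gr.Image(type="pil", label="Upload Image"),
        outputs=gr.Textbox(label="Generated Caption"),
        title="Vision Language AI Demo",
    )
```

To try it locally, something like `build_demo(load_blip_captioner()).launch()` should start the interface. Passing the captioner in as an argument keeps model loading separate from the UI, so a stub function can stand in for BLIP during testing.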

⚡ Performance Tip: Run the app on Hugging Face Spaces ZeroGPU hardware; GPU inference is typically much faster than CPU inference for these models
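On ZeroGPU Spaces, a GPU is attached only while a function decorated with `spaces.GPU` is running. A hedged sketch of how the inference function could opt in is shown below; the fallback decorator and the `run_inference` name are assumptions added here so the same file also runs on machines where the `spaces` package is not installed.

```python
try:
    # The `spaces` package is available on Hugging Face Spaces.
    import spaces

    gpu = spaces.GPU
except ImportError:
    # Assumed local fallback: a decorator that leaves the function unchanged.
    def gpu(fn):
        return fn


@gpu
def run_inference(model_fn, image):
    """Run one model call; on ZeroGPU this is where the GPU is requested."""
    return model_fn(image)
```

Only the function that actually calls the model needs the decorator; UI callbacks that do no GPU work can stay undecorated.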