🤖 Vision Language AI Demo

Interactive application showcasing multiple vision-language AI capabilities

Upload an image and the model will generate a description of it

📸 Click on examples to try

📚 About This Application

  • Models: BLIP (Captioning & VQA) + CLIP (Classification)
  • Framework: Gradio + Transformers
  • Deployment: Can be deployed to Hugging Face Spaces
  • Open Source: All models are open source
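
The stack listed above could be wired together roughly as follows. This is a minimal sketch, not the application's actual source: the checkpoint names (`Salesforce/blip-image-captioning-base`, `openai/clip-vit-base-patch32`), the helper names (`parse_labels`, `make_captioner`, `make_classifier`, `build_demo`), and the tab layout are all assumptions chosen for illustration, and VQA is left out to keep it short.

```python
from typing import Callable, Dict, List


def parse_labels(raw: str) -> List[str]:
    """Split a comma-separated string into clean, non-empty candidate labels."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def make_captioner(checkpoint: str = "Salesforce/blip-image-captioning-base") -> Callable:
    """Load BLIP (assumed checkpoint) and return a PIL image -> caption function."""
    # Imported lazily so parse_labels stays usable without the heavy dependencies.
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(checkpoint)
    model = BlipForConditionalGeneration.from_pretrained(checkpoint)

    def caption(image) -> str:
        inputs = processor(images=image.convert("RGB"), return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=30)
        return processor.decode(output_ids[0], skip_special_tokens=True)

    return caption


def make_classifier(checkpoint: str = "openai/clip-vit-base-patch32") -> Callable:
    """Load CLIP (assumed checkpoint) and return a zero-shot classification function."""
    import torch
    from transformers import CLIPModel, CLIPProcessor

    processor = CLIPProcessor.from_pretrained(checkpoint)
    model = CLIPModel.from_pretrained(checkpoint)

    def classify(image, raw_labels: str) -> Dict[str, float]:
        labels = parse_labels(raw_labels)
        if not labels:
            raise ValueError("Provide at least one candidate label.")
        inputs = processor(text=labels, images=image.convert("RGB"),
                           return_tensors="pt", padding=True)
        with torch.no_grad():
            # logits_per_image has shape (1, number_of_labels)
            logits = model(**inputs).logits_per_image[0]
        probs = logits.softmax(dim=-1).tolist()
        return dict(zip(labels, probs))

    return classify


def build_demo():
    """Assemble a two-tab Gradio interface around the two models."""
    import gradio as gr

    caption = make_captioner()
    classify = make_classifier()

    with gr.Blocks(title="Vision Language AI Demo") as demo:
        with gr.Tab("Captioning"):
            cap_image = gr.Image(type="pil", label="Image")
            cap_text = gr.Textbox(label="Description")
            gr.Button("Describe").click(caption, inputs=cap_image, outputs=cap_text)
        with gr.Tab("Classification"):
            cls_image = gr.Image(type="pil", label="Image")
            cls_labels = gr.Textbox(label="Candidate labels", value="cat, dog, car")
            cls_scores = gr.Label(label="Scores")
            gr.Button("Classify").click(
                classify, inputs=[cls_image, cls_labels], outputs=cls_scores
            )
    return demo


# To run locally (downloads both checkpoints on first use):
# build_demo().launch()
```

Loading each model once inside a factory function keeps the per-request handlers cheap, since every click reuses the weights already in memory.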

⚡ Performance Tip: Run on Hugging Face Spaces ZeroGPU for significantly faster processing than on CPU-only hardware
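
On ZeroGPU hardware, a GPU is typically requested by decorating the inference function with `spaces.GPU` from the `spaces` package. The sketch below assumes that package is available on the Space and falls back to a no-op decorator elsewhere; `gpu_task`, `CAPTIONER`, `placeholder_captioner`, and `caption_on_gpu` are hypothetical names, and the placeholder stands in for a real BLIP call.

```python
try:
    import spaces  # available on Hugging Face Spaces

    gpu_task = spaces.GPU
except ImportError:
    def gpu_task(fn):
        """Fallback for local runs: leave the function unchanged."""
        return fn


def placeholder_captioner(image) -> str:
    # Stand-in so the sketch runs without model weights; swap in a real model call.
    return "placeholder caption"


# Hypothetical module-level handle, assumed to be set once at startup.
CAPTIONER = placeholder_captioner


@gpu_task
def caption_on_gpu(image) -> str:
    """GPU-bound entry point; a GPU is attached only while this call runs."""
    return CAPTIONER(image)
```

Only the function that actually performs inference needs the decorator; UI setup and label parsing can stay on the CPU.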