Vision Language Models: MoE-LLaVA, Mobile-Agent, and more
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
Routers in Vision Mixture of Experts: An Empirical Study
LLaVA-1.6: Improved reasoning, OCR, and world knowledge
MouSi: Poly-Visual-Expert Vision-Language Models https://arxiv.org/pdf/2401.17221.pdf
https://github.com/vikhyat/moondream
https://huggingface.co/LanguageBind/MoE-LLaVA-Phi2-2.7B-4e-384