LLaVA
#LLaVA
Introduction
LLaVA (Large Language and Vision Assistant) is a large multimodal model jointly released by researchers from the University of Wisconsin-Madison, Microsoft Research, and Columbia University. The model demonstrates image and text understanding capabilities approaching those of multimodal GPT-4, achieving a relative score of 85.1% compared to GPT-4. When fine-tuned on Science QA, the synergy of LLaVA and GPT-4 reached a new state of the art with 92.53% accuracy.
Features
- Free image recognition capabilities
- Support for adjusting generation parameters (see the usage sketch below)
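As a minimal sketch of these features, the snippet below asks LLaVA a question about an image via the Hugging Face `transformers` integration. The `llava-hf/llava-1.5-7b-hf` checkpoint, the image URL, and the prompt template are assumptions for illustration; they are not part of the original release instructions.

```python
# Minimal sketch: image question answering with a community LLaVA checkpoint.
# Assumptions: the "llava-hf/llava-1.5-7b-hf" model id and the sample image URL.
import requests
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed checkpoint, not from the post
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

# Any RGB image works; this URL is only a placeholder example.
url = "https://llava-vl.github.io/static/images/view.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# LLaVA-1.5 style prompt: the <image> token marks where image features are inserted.
prompt = "USER: <image>\nWhat is shown in this picture? ASSISTANT:"

inputs = processor(images=image, text=prompt, return_tensors="pt")

# Generation parameters (max_new_tokens, sampling, etc.) can be adjusted
# to trade off response length and creativity.
output_ids = model.generate(**inputs, max_new_tokens=100, do_sample=False)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```

Passing different values for `max_new_tokens` or enabling sampling (`do_sample=True`, `temperature=...`) is one way the adjustable-parameter feature shows up in practice.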