November 22, 2023

The impact of multimodality in the landscape of Large Language Models

In the dynamic landscape of Large Language Models, the old expression "A picture is worth a thousand words" takes on new meaning. Since the recent introduction of GPT-4 Vision, there has been a growing interest in multimodality - i.e. the capacity to handle information of diverse formats, including text, image, video, and voice, unlocking new applications that were previously beyond the reach of text-only models.

Feel free to reach out if you're interested in learning how to embark on a journey with LMMs, unlocking new applications that were previously beyond reach.