The term ‘Multi-Modal’ refers to the ability to support more than just text, encompassing images, videos, and other files. Within a Chainlit application, users can attach any file to their messages, either by dragging and dropping it into the chat or by clicking the attach button in the chat bar.

Attach files to a message

As a developer, you can access these attached files in the function decorated with cl.on_message, where they are available on the incoming message as msg.elements.

import chainlit as cl


@cl.on_message
async def on_message(msg: cl.Message):
    if not msg.elements:
        await cl.Message(content="No file attached").send()
        return

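    # Keep only image attachments; mime may be unset, so fall back to ""
    images = [file for file in msg.elements if "image" in (file.mime or "")]

    # Files were attached, but none of them is an image
    if not images:
        await cl.Message(content="No image attached").send()
        return

    # Read the first image as raw bytes (binary mode, since images are not text)
    with open(images[0].path, "rb") as f:
        image_bytes = f.read()  # available here for further processing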

    await cl.Message(content=f"Received {len(images)} image(s)").send()
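
The MIME-based filtering in the handler above can also be pulled out into a small helper that is easy to test without a running Chainlit app. The following is a minimal sketch, not part of Chainlit: `Attachment` is a hypothetical stand-in that mirrors only the `name`, `path`, and `mime` attributes used in the example, and `group_by_media_type` is an illustrative name.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class Attachment:
    """Hypothetical stand-in for an attached element (not a Chainlit class)."""

    name: str
    path: Optional[str] = None
    mime: Optional[str] = None


def group_by_media_type(elements: Iterable[Attachment]) -> Dict[str, List[Attachment]]:
    """Group attachments by the top-level part of their MIME type."""
    groups: Dict[str, List[Attachment]] = defaultdict(list)
    for element in elements:
        # A missing MIME type is filed under "unknown" instead of raising.
        media_type = (element.mime or "unknown/").split("/", 1)[0]
        groups[media_type].append(element)
    return dict(groups)
```

Assuming the attached elements expose `mime` as in the handler above, the image list could then be obtained with `group_by_media_type(msg.elements).get("image", [])`, and other media types (for example "video" or "application") handled the same way.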

Image Processing with Transformers

Multi-modal capabilities are being added to Large Language Models (effectively making them Large Multi-Modal Models). For more information on this, refer to the LLaVa cookbook example.
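
Before an attached image can be handed to a multi-modal model, it usually has to be serialized in some way. One common convention, assumed here and not confirmed by the cookbook, is to embed the image as a base64 data URL. The sketch below uses only the Python standard library; `image_to_data_url` is an illustrative helper name, and whether a given model or API accepts data URLs should be checked against its own documentation.

```python
import base64
from pathlib import Path


def image_to_data_url(path: str, mime: str) -> str:
    """Encode the file at `path` as a base64 data URL with the given MIME type."""
    raw = Path(path).read_bytes()  # read the image as raw bytes
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

In a message handler, this could be called with the `path` and `mime` of an attached image, for example `image_to_data_url(images[0].path, images[0].mime)`.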

Disabling Multi-Modal Functionality

To disable this feature, and so prevent users from attaching files to their messages, set features.multi_modal=false in your Chainlit config file (typically .chainlit/config.toml in the project directory).