Image generation using the LLM Mesh#

Large Language Models (LLMs) are helpful for summarization, classification, chatbots, and more. Frameworks such as agents and function calling extend what they can do, and RAG and fine-tuning can improve their accuracy. LLMs are mostly used for textual interaction, although some models also accept images as input.

This tutorial explores another side of LLMs: their image generation capabilities. The Dataiku LLM Mesh lets users work with image generation models, and its Python API makes it quick to set up and test various LLMs. In this tutorial, you will use the image generation capabilities of the LLM Mesh and build a visual web application around them. You will create a movie poster from an overview of a film. This information comes from the IMDB database, but you can easily gather film information using a search tool.

Prerequisites#

  • Dataiku 13.1

  • A valid LLM connection

  • If you want to test the webapp part, a code-env with the following packages:

    dash
    dash-bootstrap-components
    

This tutorial has been tested with dash==2.17.1 and dash-bootstrap-components==1.6.0.

You will need the IMDB database, which can be downloaded here.

Getting the LLM#

Getting an LLM ID for image generation is not much different from retrieving a “classical” LLM ID. Code 1 shows how to retrieve this ID.

Code 1: List existing LLMs capable of image generation and their associated ID.#
import dataiku

client = dataiku.api_client()
project = client.get_default_project()
llm_list = project.list_llms(purpose="IMAGE_GENERATION")
for llm in llm_list:
    print(f"- {llm.description} (id: {llm.id})")

Once you have identified which LLM you want to use, note the associated ID (LLM_ID).

Retrieving movie information and image creation#

To query the dataset easily, create an SQL dataset named movies from the data you have previously downloaded. Then, create the search function in a notebook, as shown in Code 2.

Code 2: Searching for information about a movie#
import dataiku
from dataiku import SQLExecutor2


def search(title):
    """
    Search for information based on movie title
    Args:
        title: the movie title

    Returns:
        the image src, title, year, overview and genre of the movie,
        or None if no movie matches.
    """
    dataset = dataiku.Dataset("movies")
    # Name of the SQL table backing the dataset
    table_name = dataset.get_location_info().get('info', {}).get('table')
    executor = SQLExecutor2(dataset=dataset)
    query_reader = executor.query_to_iter(
        f"""SELECT "Poster_Link", "Series_Title", "Released_Year", "Overview", "Genre" FROM "{table_name}" WHERE "Series_Title" = '{title}'""")
    # Return the first matching row, if any
    for tupl in query_reader.iter_tuples():
        return tupl
    return None
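
The query in Code 2 interpolates the title directly into the SQL string, so a title containing an apostrophe would break the query. Below is a minimal sketch of one way to guard against that. It assumes your database follows the standard SQL convention of doubling single quotes inside string literals. `quote_sql_literal` is a hypothetical helper written for this example, not part of the Dataiku API, and the title is only an illustrative value.

```python
def quote_sql_literal(value: str) -> str:
    """Return value as a single-quoted SQL string literal.

    Assumption: the target database uses the standard SQL rule where a
    single quote inside a literal is written as two single quotes.
    """
    return "'" + value.replace("'", "''") + "'"


# Hypothetical usage: build the WHERE clause used by the search function.
title = "Schindler's List"
where_clause = f'WHERE "Series_Title" = {quote_sql_literal(title)}'
print(where_clause)
```

If your database driver or connection offers parameterized queries, prefer those over manual quoting.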

Then, you can create a movie poster from that information using code similar to Code 3.

Code 3: Creating a movie poster.#
from IPython.display import display, Image

LLM_ID = ""  # Replace with a valid LLM id

imagellm = dataiku.api_client().get_default_project().get_llm(LLM_ID)
# movie holds (poster link, title, year, overview, genre)
movie = search('Citizen Kane')
img = imagellm.new_images_generation().with_prompt(f"""
Generate a movie poster. The size of the poster should be 120x120px.
The title of the movie is: "{movie[1]}."
The year of the movie is: "{movie[2]}."
The summary is: "{movie[3]}."
The genre of the movie is: "{movie[4]}."
""")
resp = img.execute()

# Display the first generated image if the request succeeded
if resp.success:
    display(Image(resp.first_image()))

Using the prompt defined in Code 3, you will obtain something like the posters shown in Figure 1 and Figure 2.

Figure 1: Resulting image.#

Figure 2: Resulting image – second run.#

Creating a web app#

As in the previous section, the web application retrieves and displays information from the movies dataset. Based on this information and the features the user selects, it builds a prompt for image generation. Once the LLM generates the image, the web application displays the generated movie poster next to the original.
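
The prompt-building step can be sketched in isolation, without Dash or the LLM Mesh. This is a minimal illustration, not the tutorial's implementation: `build_prompt` is a hypothetical helper, and the flag values are assumed to match the constants defined later in the web application code.

```python
# Feature flags; assumed to mirror the constants used by the web application.
USE_TITLE, USE_YEAR, USE_OVERVIEW, USE_GENRE = 1, 2, 3, 4


def build_prompt(features, title="", year="", genre="", overview=""):
    """Assemble an image prompt from the features the user selected."""
    parts = ["Generate a movie poster."]
    if USE_TITLE in features:
        parts.append(f'The title of the movie is: "{title}".')
    if USE_YEAR in features:
        parts.append(f'The film was released in: "{year}".')
    if USE_GENRE in features:
        parts.append(f'The film genre is: "{genre}".')
    if USE_OVERVIEW in features:
        parts.append(f'The film synopsis is: "{overview}".')
    # Join the selected sentences with single spaces
    return " ".join(parts)


print(build_prompt([USE_TITLE, USE_YEAR], title="Citizen Kane", year="1941"))
```

Collecting the sentences in a list and joining them once keeps each feature independent, which makes it easy to add or reorder features.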

Figure 3: Start of the webapp.#

Figure 3 shows the web application when it starts. In Code 4, we import the necessary libraries and define the web application’s default values.

Code 4: Global definitions#
from dash import html
from dash import dcc
import dash_bootstrap_components as dbc
from dash.dependencies import Input
from dash.dependencies import Output
from dash.dependencies import State
from dash import no_update

import base64

import dataiku
from dataiku import SQLExecutor2

dbc_css = "https://cdn.jsdelivr.net/gh/AnnMarieW/dash-bootstrap-templates/dbc.min.css"
app.config.external_stylesheets = [dbc.themes.SUPERHERO, dbc_css]

IMG_LLM_ID = ""  # Replace with a valid LLM id
DATASET_NAME = "movies"

USE_TITLE = 1
USE_YEAR = 2
USE_OVERVIEW = 3
USE_GENRE = 4

imagellm = dataiku.api_client().get_default_project().get_llm(IMG_LLM_ID)

Code 5 defines the layout of the application. The row containing the title, year, genre, overview, and image components displays the result of the database query when the user clicks the Search button.

Code 5: Application layout#
# build your Dash app
v1_layout = html.Div([
    dbc.Row([html.H2("Using the LLM Mesh to generate a movie poster."), ]),
    dbc.Row(dbc.Label("Please enter a name of a movie:")),
    dbc.Row([
        dbc.Col(dbc.Input(id="movie", placeholder="Citizen Kane", debounce=True), width=10),
        dbc.Col(dbc.Button("Search", id="search", color="primary"), width=2)
    ], justify="between", class_name="mt-3 mb-3"),
    dbc.Row([
        dbc.Col(dbc.Label("Select the features you want to use for generating an image:")),
        dbc.Col(dbc.Row([
            dbc.Checklist(
                options=[
                    {"label": "Use Title", "value": USE_TITLE},
                    {"label": "Use Year", "value": USE_YEAR},
                    {"label": "Use Overview", "value": USE_OVERVIEW},
                    {"label": "Use Genre", "value": USE_GENRE}
                ],
                value=[USE_OVERVIEW],
                id="features",
                inline=True,
            ),
        ]), width=6),
        dbc.Col(dbc.Button("Generate", id="generate", color="primary"), width=2)
    ], align="center", class_name="mt-3 mb-3"),
    dbc.Row([
        dbc.Col([
            dbc.Row([html.H2("Movie information")]),
            dbc.Row([
                html.H3(children="", id="title")
            ], align="center", justify="around"),
            dbc.Row([
                html.H4(children="", id="year")
            ]),
            dbc.Row([
                html.H4(children="", id="genre")
            ]),
            dbc.Textarea(id="overview", style={"min-height": "500px"})
        ], width=4),
        dbc.Col(html.Img(id="image", src="", width="95%"), width=4),
        dbc.Col(html.Img(id="generatedImg", src="", width="95%"), width=4),
    ], align="center"),
    dbc.Toast(
        [html.P("Searching for information about the movie", className="mb-0"),
         dbc.Spinner(color="primary")],
        id="search-toast",
        header="Querying the database",
        icon="primary",
        is_open=False,
        style={"position": "fixed", "top": "50%", "left": "50%", "transform": "translate(-50%, -50%)"},
    ),
    dbc.Toast(
        [html.P("Generating an image", className="mb-0"),
         dbc.Spinner(color="primary")],
        id="generate-toast",
        header="Querying the LLM",
        icon="primary",
        is_open=False,
        style={"position": "fixed", "top": "50%", "left": "50%", "transform": "translate(-50%, -50%)"},
    ),

    dcc.Store(id="step", data=[{"current_step": 0}]),
], className="container-fluid mt-3")

app.layout = v1_layout

Code 6 shows the callbacks that connect the layout to the search and image generation logic. Figure 4 shows the result of searching “Citizen Kane,” and Figure 5 shows the result of image generation.

Code 6: Callbacks#
def search(movie_title):
    """
    Search information about a movie
    Args:
        movie_title: title of the movie
    Returns:
        Information of the movie
    """
    dataset = dataiku.Dataset(DATASET_NAME)
    table_name = dataset.get_location_info().get('info', {}).get('table')
    executor = SQLExecutor2(dataset=dataset)
    query_reader = executor.query_to_iter(
        f"""SELECT "Poster_Link", "Series_Title", "Released_Year", "Overview", "Genre" FROM "{table_name}" WHERE "Series_Title" = '{movie_title}'""")
    for tupl in query_reader.iter_tuples():
        return tupl
    return None


@app.callback([
    Output("image", "src"),
    Output("title", "children"),
    Output("year", "children"),
    Output("genre", "children"),
    Output("overview", "value")
],
    Input("search", "n_clicks"),
    State("features", "value"),
    Input("movie", "value"),
    prevent_initial_call=True,
    running=[(Output("search-toast", "is_open"), True, False),
             (Output("search", "disabled"), True, False),
             (Output("generate", "disabled"), True, False)
             ]
)
def gather_information(_, value, title):
    info = search(title)
    if info:
        return [info[0], info[1], info[2], info[4], info[3]]
    else:
        return ["", "", "", "", f"""No information concerning the movie: "{title}" """]


@app.callback([
    Output("generatedImg", "src"),
],
    Input("generate", "n_clicks"),
    State("title", "children"),
    State("year", "children"),
    State("genre", "children"),
    State("overview", "value"),
    State("features", "value"),
    prevent_initial_call=True,
    running=[(Output("generate-toast", "is_open"), True, False),
             (Output("search", "disabled"), True, False),
             (Output("generate", "disabled"), True, False)
             ],
)
def generate_image(_, title, year, genre, overview, features):
    # Build the prompt from the features selected by the user
    prompt = "Generate a movie poster."
    if USE_TITLE in features:
        prompt = f"""{prompt} The title of the movie is: "{title}." """
    if USE_YEAR in features:
        prompt = f"""{prompt} The film was released in: "{year}." """
    if USE_GENRE in features:
        prompt = f"""{prompt} The film genre is: "{genre}." """
    if USE_OVERVIEW in features:
        prompt = f"""{prompt} The film synopsis is: "{overview}." """

    img = imagellm.new_images_generation().with_prompt(prompt)
    response = img.execute()
    if response.success:
        return ['data:image/png;base64,' + base64.b64encode(response.first_image()).decode('utf-8')]
    return no_update
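
The callback in Code 6 assumes the model returns PNG data when it builds the `src` value. Below is a minimal sketch of a slightly more defensive encoding step. It assumes the image arrives as raw bytes, as `response.first_image()` does in Code 6. `to_data_uri` is a hypothetical helper, and falling back to PNG for unknown formats is a choice made here to match the original behavior.

```python
import base64

# Leading bytes ("magic numbers") of two common image formats.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def to_data_uri(image_bytes: bytes) -> str:
    """Encode raw image bytes as a data URI usable in html.Img(src=...)."""
    if image_bytes.startswith(JPEG_SIGNATURE):
        mime = "image/jpeg"
    else:
        # PNG, or an unrecognized format: keep the PNG default from Code 6.
        mime = "image/png"
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Hypothetical usage with placeholder bytes standing in for a real image.
print(to_data_uri(PNG_SIGNATURE + b"placeholder")[:22])
```

In the callback, the return statement would then become `return [to_data_uri(response.first_image())]`.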

Figure 4: Searching for “Citizen Kane”.#

Figure 5: Generating an image from the user inputs.#

Wrapping up#

You now have a working web application that generates images using the LLM Mesh. You can enhance this application by using an LLM to write the image prompts or by using a search engine to collect information directly from the Internet instead of a database. This tutorial is easily adaptable to other image generation use cases. For example, if your company owns several brands, you can adapt this tutorial to generate images of your products.

Here is the complete code of the web application:

app.py
from dash import html
from dash import dcc
import dash_bootstrap_components as dbc
from dash.dependencies import Input
from dash.dependencies import Output
from dash.dependencies import State
from dash import no_update

import base64

import dataiku
from dataiku import SQLExecutor2

dbc_css = "https://cdn.jsdelivr.net/gh/AnnMarieW/dash-bootstrap-templates/dbc.min.css"
app.config.external_stylesheets = [dbc.themes.SUPERHERO, dbc_css]

IMG_LLM_ID = ""  # Replace with a valid LLM id
DATASET_NAME = "movies"

USE_TITLE = 1
USE_YEAR = 2
USE_OVERVIEW = 3
USE_GENRE = 4

imagellm = dataiku.api_client().get_default_project().get_llm(IMG_LLM_ID)

# build your Dash app
v1_layout = html.Div([
    dbc.Row([html.H2("Using the LLM Mesh to generate a movie poster."), ]),
    dbc.Row(dbc.Label("Please enter a name of a movie:")),
    dbc.Row([
        dbc.Col(dbc.Input(id="movie", placeholder="Citizen Kane", debounce=True), width=10),
        dbc.Col(dbc.Button("Search", id="search", color="primary"), width=2)
    ], justify="between", class_name="mt-3 mb-3"),
    dbc.Row([
        dbc.Col(dbc.Label("Select the features you want to use for generating an image:")),
        dbc.Col(dbc.Row([
            dbc.Checklist(
                options=[
                    {"label": "Use Title", "value": USE_TITLE},
                    {"label": "Use Year", "value": USE_YEAR},
                    {"label": "Use Overview", "value": USE_OVERVIEW},
                    {"label": "Use Genre", "value": USE_GENRE}
                ],
                value=[USE_OVERVIEW],
                id="features",
                inline=True,
            ),
        ]), width=6),
        dbc.Col(dbc.Button("Generate", id="generate", color="primary"), width=2)
    ], align="center", class_name="mt-3 mb-3"),
    dbc.Row([
        dbc.Col([
            dbc.Row([html.H2("Movie information")]),
            dbc.Row([
                html.H3(children="", id="title")
            ], align="center", justify="around"),
            dbc.Row([
                html.H4(children="", id="year")
            ]),
            dbc.Row([
                html.H4(children="", id="genre")
            ]),
            dbc.Textarea(id="overview", style={"min-height": "500px"})
        ], width=4),
        dbc.Col(html.Img(id="image", src="", width="95%"), width=4),
        dbc.Col(html.Img(id="generatedImg", src="", width="95%"), width=4),
    ], align="center"),
    dbc.Toast(
        [html.P("Searching for information about the movie", className="mb-0"),
         dbc.Spinner(color="primary")],
        id="search-toast",
        header="Querying the database",
        icon="primary",
        is_open=False,
        style={"position": "fixed", "top": "50%", "left": "50%", "transform": "translate(-50%, -50%)"},
    ),
    dbc.Toast(
        [html.P("Generating an image", className="mb-0"),
         dbc.Spinner(color="primary")],
        id="generate-toast",
        header="Querying the LLM",
        icon="primary",
        is_open=False,
        style={"position": "fixed", "top": "50%", "left": "50%", "transform": "translate(-50%, -50%)"},
    ),

    dcc.Store(id="step", data=[{"current_step": 0}]),
], className="container-fluid mt-3")

app.layout = v1_layout


def search(movie_title):
    """
    Search information about a movie
    Args:
        movie_title: title of the movie
    Returns:
        Information of the movie
    """
    dataset = dataiku.Dataset(DATASET_NAME)
    table_name = dataset.get_location_info().get('info', {}).get('table')
    executor = SQLExecutor2(dataset=dataset)
    query_reader = executor.query_to_iter(
        f"""SELECT "Poster_Link", "Series_Title", "Released_Year", "Overview", "Genre" FROM "{table_name}" WHERE "Series_Title" = '{movie_title}'""")
    for tupl in query_reader.iter_tuples():
        return tupl
    return None


@app.callback([
    Output("image", "src"),
    Output("title", "children"),
    Output("year", "children"),
    Output("genre", "children"),
    Output("overview", "value")
],
    Input("search", "n_clicks"),
    State("features", "value"),
    Input("movie", "value"),
    prevent_initial_call=True,
    running=[(Output("search-toast", "is_open"), True, False),
             (Output("search", "disabled"), True, False),
             (Output("generate", "disabled"), True, False)
             ]
)
def gather_information(_, value, title):
    info = search(title)
    if info:
        return [info[0], info[1], info[2], info[4], info[3]]
    else:
        return ["", "", "", "", f"""No information concerning the movie: "{title}" """]


@app.callback([
    Output("generatedImg", "src"),
],
    Input("generate", "n_clicks"),
    State("title", "children"),
    State("year", "children"),
    State("genre", "children"),
    State("overview", "value"),
    State("features", "value"),
    prevent_initial_call=True,
    running=[(Output("generate-toast", "is_open"), True, False),
             (Output("search", "disabled"), True, False),
             (Output("generate", "disabled"), True, False)
             ],
)
def generate_image(_, title, year, genre, overview, features):
    # Build the prompt from the features selected by the user
    prompt = "Generate a movie poster."
    if USE_TITLE in features:
        prompt = f"""{prompt} The title of the movie is: "{title}." """
    if USE_YEAR in features:
        prompt = f"""{prompt} The film was released in: "{year}." """
    if USE_GENRE in features:
        prompt = f"""{prompt} The film genre is: "{genre}." """
    if USE_OVERVIEW in features:
        prompt = f"""{prompt} The film synopsis is: "{overview}." """

    img = imagellm.new_images_generation().with_prompt(prompt)
    response = img.execute()
    if response.success:
        return ['data:image/png;base64,' + base64.b64encode(response.first_image()).decode('utf-8')]
    return no_update