Image generation using the LLM Mesh#
Large Language Models (LLMs) are helpful for summarization, classification, chatbots, and similar tasks. Their effectiveness can be extended with frameworks like agents and function calling, and their accuracy can be improved with RAG and fine-tuning. We usually use LLMs for textual interaction, even though some models also accept images as input.
This tutorial explores another side of LLMs: their image generation capabilities. The Dataiku LLM Mesh allows users to use image generation models, and the Python Dataiku LLM Mesh API lets users quickly set up and test various LLMs. In this tutorial, you will use the image generation capabilities of the LLM Mesh and build a visual web application around it. You will create a movie poster from an overview of a film. This information comes from the IMDB database, but you can easily gather film information using a search tool.
Prerequisites#
Dataiku 13.1
A valid LLM connection
If you want to test the webapp part, a code-env with the following packages:
dash dash-bootstrap-components
This tutorial has been tested with dash==2.17.1 and dash-bootstrap-components==1.6.0.
You will need the IMDB database, which can be downloaded here.
Getting the LLM#
Getting an LLM ID for image generation is not much different from retrieving a “classical” LLM ID. Code 1 shows how to retrieve this ID.
import dataiku

client = dataiku.api_client()
project = client.get_default_project()

# List only the LLMs that can generate images
llm_list = project.list_llms(purpose="IMAGE_GENERATION")
for llm in llm_list:
    print(f"- {llm.description} (id: {llm.id})")
Once you have identified which LLM you want to use, note the associated ID (LLM_ID).
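If the list is long, picking the ID by hand can be tedious. The sketch below shows one possible way to select an ID by matching a keyword against the description. It only assumes that each entry exposes `id` and `description` attributes, as used in Code 1. The helper name `pick_llm_id` and the sample entries are hypothetical stand-ins, so the snippet runs without a Dataiku instance.

```python
from types import SimpleNamespace


def pick_llm_id(llms, keyword):
    """Return the id of the first entry whose description contains keyword.

    The match is case-insensitive. Returns None when nothing matches.
    """
    needle = keyword.lower()
    for llm in llms:
        if needle in (llm.description or "").lower():
            return llm.id
    return None


# Hypothetical stand-ins for the items returned by project.list_llms(...)
fake_llms = [
    SimpleNamespace(id="conn-a:model-1", description="Example image model A"),
    SimpleNamespace(id="conn-b:model-2", description="Example image model B"),
]

selected_id = pick_llm_id(fake_llms, "model b")
print(selected_id)
```

In a Dataiku notebook, you would pass `llm_list` from Code 1 instead of `fake_llms`.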
Retrieving movie information and image creation#
To query the dataset easily, create an SQL dataset named movies from the data you have previously downloaded. Then, create the search function in a notebook, as shown in Code 2.
import dataiku
from dataiku import SQLExecutor2


def search(title):
    """
    Search for information based on movie title
    Args:
        title: the movie title
    Returns:
        the image src, title, year, overview and genre of the movie,
        or None if no movie matches.
    """
    dataset = dataiku.Dataset("movies")
    table_name = dataset.get_location_info().get('info', {}).get('table')
    executor = SQLExecutor2(dataset=dataset)
    # Double any single quote (standard SQL escaping) so that titles
    # containing an apostrophe do not break the query
    safe_title = str(title).replace("'", "''")
    query_reader = executor.query_to_iter(
        f"""SELECT "Poster_Link", "Series_Title", "Released_Year", "Overview", "Genre" FROM "{table_name}" WHERE "Series_Title" = '{safe_title}'""")
    # Return the first matching row, if any
    for tupl in query_reader.iter_tuples():
        return tupl
    return None
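Because the query is assembled as a string, a title containing an apostrophe needs care. The sketch below illustrates the effect of doubling single quotes, using `sqlite3` from the standard library as a stand-in for the SQL connection behind the movies dataset. The helper names `quote_sql_literal` and `build_query`, the reduced column list, and the sample row are assumptions made for the illustration; they are not part of the Dataiku API.

```python
import sqlite3


def quote_sql_literal(value):
    """Escape a value for use between single quotes in a SQL statement."""
    return str(value).replace("'", "''")


def build_query(table_name, title):
    """Build the lookup query with the title safely quoted."""
    return (
        'SELECT "Series_Title", "Released_Year" '
        f'FROM "{table_name}" '
        f"WHERE \"Series_Title\" = '{quote_sql_literal(title)}'"
    )


# In-memory table standing in for the movies dataset (illustrative data)
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "movies" ("Series_Title" TEXT, "Released_Year" TEXT)')
conn.execute('INSERT INTO "movies" VALUES (?, ?)', ("Schindler's List", "1993"))

row = conn.execute(build_query("movies", "Schindler's List")).fetchone()
print(row)
```

Without the escaping step, the apostrophe would end the string literal early and the statement would fail to parse. Escaping rules can differ between databases, so check the behavior of the one backing your dataset.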
Then, from that information, you can create a movie poster using code similar to Code 3.
from IPython.display import display, Image

LLM_ID = ""  # Replace with a valid LLM id

imagellm = dataiku.api_client().get_default_project().get_llm(LLM_ID)
movie = search('Citizen Kane')

# Build the image generation query from the movie information
img = imagellm.new_images_generation().with_prompt(f"""
Generate a movie poster. The size of the poster should be 120x120px.
The title of the movie is: "{movie[1]}."
The year of the movie is: "{movie[2]}."
The summary is: "{movie[3]}."
The genre of the movie is: "{movie[4]}."
""")
resp = img.execute()

# Display the first generated image if the call succeeded
if resp.success:
    display(Image(resp.first_image()))
Using the prompt defined in Code 3 should produce something like the posters shown in Figure 1 and Figure 2.
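Outside a notebook, the generated image usually has to be embedded in HTML. The tutorial passes the result of `first_image()` to `IPython.display.Image` and later base64-encodes it, which suggests it is raw image bytes. Under that assumption, the sketch below builds a data URI and guesses the MIME type from the leading bytes instead of hard-coding it. The helper names `guess_image_mime` and `to_data_uri` are hypothetical, and the sample bytes are only a PNG signature followed by padding, not a real image.

```python
import base64


def guess_image_mime(data):
    """Guess the MIME type of an image from its leading magic bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return "application/octet-stream"


def to_data_uri(data):
    """Encode raw image bytes as a data URI usable as an <img> src."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{guess_image_mime(data)};base64,{encoded}"


# Sample bytes: a PNG signature plus padding (not a displayable image)
sample = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
uri = to_data_uri(sample)
print(uri.split(",")[0])
```

With a real response, you would call `to_data_uri(resp.first_image())` and use the result as the `src` of an image element.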
Creating a web app#
Like the previous section, the web application retrieves and displays information from the movies dataset. Based on the information and user needs, it generates a prompt for image generation. Once the LLM generates the image, the web application displays the generated movie poster next to the original.
Figure 3 shows the web application when it starts. In Code 4, we import the necessary libraries and define the web application’s default values.
from dash import html
from dash import dcc
import dash_bootstrap_components as dbc
from dash.dependencies import Input
from dash.dependencies import Output
from dash.dependencies import State
from dash import no_update
import base64
import dataiku
from dataiku import SQLExecutor2
dbc_css = "https://cdn.jsdelivr.net/gh/AnnMarieW/dash-bootstrap-templates/dbc.min.css"
app.config.external_stylesheets = [dbc.themes.SUPERHERO, dbc_css]
IMG_LLM_ID = ""
DATASET_NAME = "movies"
USE_TITLE = 1
USE_YEAR = 2
USE_OVERVIEW = 3
USE_GENRE = 4
imagellm = dataiku.api_client().get_default_project().get_llm(IMG_LLM_ID)
The highlighted lines in Code 5 display the result of the database query when the user clicks on the search button.
# build your Dash app
v1_layout = html.Div([
dbc.Row([html.H2("Using LLM Mesh to generate a movie poster."), ]),
dbc.Row(dbc.Label("Please enter a name of a movie:")),
dbc.Row([
dbc.Col(dbc.Input(id="movie", placeholder="Citizen Kane", debounce=True), width=10),
dbc.Col(dbc.Button("Search", id="search", color="primary"), width=2)
], justify="between", class_name="mt-3 mb-3"),
dbc.Row([
dbc.Col(dbc.Label("Select the features you want to use for generating an image:")),
dbc.Col(dbc.Row([
dbc.Checklist(
options=[
{"label": "Use Title", "value": USE_TITLE},
{"label": "Use Year", "value": USE_YEAR},
{"label": "Use Overview", "value": USE_OVERVIEW},
{"label": "Use Genre", "value": USE_GENRE}
],
value=[USE_OVERVIEW],
id="features",
inline=True,
),
]), width=6),
dbc.Col(dbc.Button("Generate", id="generate", color="primary"), width=2)
], align="center", class_name="mt-3 mb-3"),
dbc.Row([
dbc.Col([
dbc.Row([html.H2("Movie information")]),
dbc.Row([
html.H3(children="", id="title")
], align="center", justify="around"),
dbc.Row([
html.H4(children="", id="year")
]),
dbc.Row([
html.H4(children="", id="genre")
]),
dbc.Textarea(id="overview", style={"min-height": "500px"})
], width=4),
dbc.Col(html.Img(id="image", src="", width="95%"), width=4),
dbc.Col(html.Img(id="generatedImg", src="", width="95%"), width=4),
], align="center"),
dbc.Toast(
[html.P("Searching for information about the movie", className="mb-0"),
dbc.Spinner(color="primary")],
id="search-toast",
header="Querying the database",
icon="primary",
is_open=False,
style={"position": "fixed", "top": "50%", "left": "50%", "transform": "translate(-50%, -50%)"},
),
dbc.Toast(
[html.P("Generating an image", className="mb-0"),
dbc.Spinner(color="primary")],
id="generate-toast",
header="Querying the LLM",
icon="primary",
is_open=False,
style={"position": "fixed", "top": "50%", "left": "50%", "transform": "translate(-50%, -50%)"},
),
dcc.Store(id="step", data=[{"current_step": 0}]),
], className="container-fluid mt-3")
app.layout = v1_layout
Code 6 shows the callbacks that make the web application work. Figure 4 shows the result of searching “Citizen Kane,” and Figure 5 shows the result of image generation.
def search(movie_title):
    """
    Search information about a movie
    Args:
        movie_title: title of the movie
    Returns:
        Information of the movie, or None if no movie matches
    """
    dataset = dataiku.Dataset(DATASET_NAME)
    table_name = dataset.get_location_info().get('info', {}).get('table')
    executor = SQLExecutor2(dataset=dataset)
    # Double any single quote (standard SQL escaping) so that titles
    # containing an apostrophe do not break the query
    safe_title = str(movie_title).replace("'", "''")
    query_reader = executor.query_to_iter(
        f"""SELECT "Poster_Link", "Series_Title", "Released_Year", "Overview", "Genre" FROM "{table_name}" WHERE "Series_Title" = '{safe_title}'""")
    # Return the first matching row, if any
    for tupl in query_reader.iter_tuples():
        return tupl
    return None


@app.callback([
    Output("image", "src"),
    Output("title", "children"),
    Output("year", "children"),
    Output("genre", "children"),
    Output("overview", "value")
],
    Input("search", "n_clicks"),
    State("features", "value"),
    Input("movie", "value"),
    prevent_initial_call=True,
    running=[(Output("search-toast", "is_open"), True, False),
             (Output("search", "disabled"), True, False),
             (Output("generate", "disabled"), True, False)
             ]
)
def gather_information(_, value, title):
    info = search(title)
    if info:
        # Outputs are ordered as: poster link, title, year, genre, overview
        return [info[0], info[1], info[2], info[4], info[3]]
    else:
        return ["", "", "", "", f"""No information concerning the movie: "{title}" """]


@app.callback([
    Output("generatedImg", "src"),
],
    Input("generate", "n_clicks"),
    State("title", "children"),
    State("year", "children"),
    State("genre", "children"),
    State("overview", "value"),
    State("features", "value"),
    prevent_initial_call=True,
    running=[(Output("generate-toast", "is_open"), True, False),
             (Output("search", "disabled"), True, False),
             (Output("generate", "disabled"), True, False)
             ],
)
def generate_image(_, title, year, genre, overview, features):
    # Extend the base prompt with each feature selected by the user
    prompt = "Generate a movie poster."
    if USE_TITLE in features:
        prompt = f"""{prompt} The title of the movie is: "{title}." """
    if USE_YEAR in features:
        prompt = f"""{prompt} The film was released in: "{year}." """
    if USE_GENRE in features:
        prompt = f"""{prompt} The film genre is: "{genre}." """
    if USE_OVERVIEW in features:
        prompt = f"""{prompt} The film synopsis is: "{overview}." """

    img = imagellm.new_images_generation().with_prompt(prompt)
    response = img.execute()
    if response.success:
        # Embed the generated image as a base64 data URI
        return ['data:image/png;base64,' + base64.b64encode(response.first_image()).decode('utf-8')]
    return no_update
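The prompt assembly inside `generate_image` can also be factored into a pure function, which makes it easy to check without running the web application or calling the LLM. The sketch below is one possible variant, not the tutorial's implementation: the helper name `build_prompt` is hypothetical, it skips empty fields, and the sample values are illustrative. The feature constants mirror those defined in Code 4.

```python
# Same feature identifiers as the web application's default values
USE_TITLE, USE_YEAR, USE_OVERVIEW, USE_GENRE = 1, 2, 3, 4


def build_prompt(features, title="", year="", genre="", overview=""):
    """Assemble the image prompt from the selected, non-empty features."""
    parts = ["Generate a movie poster."]
    selected = set(features or [])
    if USE_TITLE in selected and title:
        parts.append(f'The title of the movie is: "{title}".')
    if USE_YEAR in selected and year:
        parts.append(f'The film was released in: "{year}".')
    if USE_GENRE in selected and genre:
        parts.append(f'The film genre is: "{genre}".')
    if USE_OVERVIEW in selected and overview:
        parts.append(f'The film synopsis is: "{overview}".')
    return " ".join(parts)


# Illustrative values only
example = build_prompt(
    [USE_TITLE, USE_GENRE], title="Citizen Kane", genre="Drama, Mystery"
)
print(example)
```

Keeping the string logic separate from the callback means a mistake such as inserting the wrong field into a sentence shows up in a quick check rather than in a generated image.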
Wrapping up#
You have a working web application that generates images using LLM Mesh. You can enhance this application by using an LLM to write the image prompts or by using a search engine to collect information directly from the Internet instead of a database. This tutorial is easily adaptable to other use cases for image generation. For example, if you work for a company with brands, you can adapt this tutorial to generate images of your products.
Here is the complete code of the web application:
app.py
from dash import html
from dash import dcc
import dash_bootstrap_components as dbc
from dash.dependencies import Input
from dash.dependencies import Output
from dash.dependencies import State
from dash import no_update
import base64
import dataiku
from dataiku import SQLExecutor2
dbc_css = "https://cdn.jsdelivr.net/gh/AnnMarieW/dash-bootstrap-templates/dbc.min.css"
app.config.external_stylesheets = [dbc.themes.SUPERHERO, dbc_css]
IMG_LLM_ID = ""
DATASET_NAME = "movies"
USE_TITLE = 1
USE_YEAR = 2
USE_OVERVIEW = 3
USE_GENRE = 4
imagellm = dataiku.api_client().get_default_project().get_llm(IMG_LLM_ID)
# build your Dash app
v1_layout = html.Div([
dbc.Row([html.H2("Using LLM Mesh to generate a movie poster."), ]),
dbc.Row(dbc.Label("Please enter a name of a movie:")),
dbc.Row([
dbc.Col(dbc.Input(id="movie", placeholder="Citizen Kane", debounce=True), width=10),
dbc.Col(dbc.Button("Search", id="search", color="primary"), width=2)
], justify="between", class_name="mt-3 mb-3"),
dbc.Row([
dbc.Col(dbc.Label("Select the features you want to use for generating an image:")),
dbc.Col(dbc.Row([
dbc.Checklist(
options=[
{"label": "Use Title", "value": USE_TITLE},
{"label": "Use Year", "value": USE_YEAR},
{"label": "Use Overview", "value": USE_OVERVIEW},
{"label": "Use Genre", "value": USE_GENRE}
],
value=[USE_OVERVIEW],
id="features",
inline=True,
),
]), width=6),
dbc.Col(dbc.Button("Generate", id="generate", color="primary"), width=2)
], align="center", class_name="mt-3 mb-3"),
dbc.Row([
dbc.Col([
dbc.Row([html.H2("Movie information")]),
dbc.Row([
html.H3(children="", id="title")
], align="center", justify="around"),
dbc.Row([
html.H4(children="", id="year")
]),
dbc.Row([
html.H4(children="", id="genre")
]),
dbc.Textarea(id="overview", style={"min-height": "500px"})
], width=4),
dbc.Col(html.Img(id="image", src="", width="95%"), width=4),
dbc.Col(html.Img(id="generatedImg", src="", width="95%"), width=4),
], align="center"),
dbc.Toast(
[html.P("Searching for information about the movie", className="mb-0"),
dbc.Spinner(color="primary")],
id="search-toast",
header="Querying the database",
icon="primary",
is_open=False,
style={"position": "fixed", "top": "50%", "left": "50%", "transform": "translate(-50%, -50%)"},
),
dbc.Toast(
[html.P("Generating an image", className="mb-0"),
dbc.Spinner(color="primary")],
id="generate-toast",
header="Querying the LLM",
icon="primary",
is_open=False,
style={"position": "fixed", "top": "50%", "left": "50%", "transform": "translate(-50%, -50%)"},
),
dcc.Store(id="step", data=[{"current_step": 0}]),
], className="container-fluid mt-3")
app.layout = v1_layout
def search(movie_title):
    """
    Search information about a movie
    Args:
        movie_title: title of the movie
    Returns:
        Information of the movie, or None if no movie matches
    """
    dataset = dataiku.Dataset(DATASET_NAME)
    table_name = dataset.get_location_info().get('info', {}).get('table')
    executor = SQLExecutor2(dataset=dataset)
    # Double any single quote (standard SQL escaping) so that titles
    # containing an apostrophe do not break the query
    safe_title = str(movie_title).replace("'", "''")
    query_reader = executor.query_to_iter(
        f"""SELECT "Poster_Link", "Series_Title", "Released_Year", "Overview", "Genre" FROM "{table_name}" WHERE "Series_Title" = '{safe_title}'""")
    # Return the first matching row, if any
    for tupl in query_reader.iter_tuples():
        return tupl
    return None


@app.callback([
    Output("image", "src"),
    Output("title", "children"),
    Output("year", "children"),
    Output("genre", "children"),
    Output("overview", "value")
],
    Input("search", "n_clicks"),
    State("features", "value"),
    Input("movie", "value"),
    prevent_initial_call=True,
    running=[(Output("search-toast", "is_open"), True, False),
             (Output("search", "disabled"), True, False),
             (Output("generate", "disabled"), True, False)
             ]
)
def gather_information(_, value, title):
    info = search(title)
    if info:
        # Outputs are ordered as: poster link, title, year, genre, overview
        return [info[0], info[1], info[2], info[4], info[3]]
    else:
        return ["", "", "", "", f"""No information concerning the movie: "{title}" """]


@app.callback([
    Output("generatedImg", "src"),
],
    Input("generate", "n_clicks"),
    State("title", "children"),
    State("year", "children"),
    State("genre", "children"),
    State("overview", "value"),
    State("features", "value"),
    prevent_initial_call=True,
    running=[(Output("generate-toast", "is_open"), True, False),
             (Output("search", "disabled"), True, False),
             (Output("generate", "disabled"), True, False)
             ],
)
def generate_image(_, title, year, genre, overview, features):
    # Extend the base prompt with each feature selected by the user
    prompt = "Generate a movie poster."
    if USE_TITLE in features:
        prompt = f"""{prompt} The title of the movie is: "{title}." """
    if USE_YEAR in features:
        prompt = f"""{prompt} The film was released in: "{year}." """
    if USE_GENRE in features:
        prompt = f"""{prompt} The film genre is: "{genre}." """
    if USE_OVERVIEW in features:
        prompt = f"""{prompt} The film synopsis is: "{overview}." """

    img = imagellm.new_images_generation().with_prompt(prompt)
    response = img.execute()
    if response.success:
        # Embed the generated image as a base64 data URI
        return ['data:image/png;base64,' + base64.b64encode(response.first_image()).decode('utf-8')]
    return no_update