Creating an API endpoint from webapps#
In this tutorial, you will learn how to build an API from a web application’s backend (called a headless API) and how to use it from code.
You will use an SQLExecutor to fetch information from a dataset, filtering it if needed.
You can use a headless web application to create an API endpoint for a purpose that doesn’t fit well in the API Node, for example, when the endpoint needs to use an SQLExecutor or access datasets. Use this functionality only if you can’t use the API Node: the API Node is the preferred way to expose functionalities, as it is scalable and highly available.
Prerequisites#
Dataiku >= 13.1
You must download this dataset and create an SQL dataset named pro_customers_sql.
If you use Dash as the web application framework: a code environment with dash.
Defining the routes#
The first step is to define the routes you want your API to handle. A single route is responsible for a (simple) process. Dataiku provides an easy way to describe those routes. Relying on a Flask server helps you return the desired resource types. To use this functionality, enable the API access option in the web app’s settings, as shown in Figure 1.
This tutorial relies on a single route handling some parameters to filter the data.
The get_customer_info route provides all the data stored in the pro_customers_sql dataset as raw text.
Filtering is done by adding an id parameter to this route. In that case, the answer is in JSON format.
For example, a query on get_customer_info will return the data stored in the dataset, shown in Table 1.
| id | name | job | company |
|---|---|---|---|
| tcook | Tim Cook | CEO | Apple |
| snadella | Satya Nadella | CEO | Microsoft |
| jbezos | Jeff Bezos | CEO | Amazon |
| fdouetteau | Florian Douetteau | CEO | Dataiku |
| wcoyote | Wile E. Coyote | Business Developer | ACME |
If you query get_customer_info?id=fdouetteau, the API should return only the information about the customer with the id == fdouetteau.
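The expected behavior of the route can be sketched independently of Dataiku. The snippet below is only an illustration, not the tutorial's backend: `CUSTOMERS` and `customer_info` are hypothetical names, and the in-memory list stands in for the pro_customers_sql dataset. It returns the body and content type the route is described as producing in each case. The empty JSON object for an unknown id is an assumption, since the tutorial does not specify that case.

```python
import json

# Hypothetical in-memory stand-in for the pro_customers_sql dataset (Table 1).
CUSTOMERS = [
    {"id": "tcook", "name": "Tim Cook", "job": "CEO", "company": "Apple"},
    {"id": "snadella", "name": "Satya Nadella", "job": "CEO", "company": "Microsoft"},
    {"id": "jbezos", "name": "Jeff Bezos", "job": "CEO", "company": "Amazon"},
    {"id": "fdouetteau", "name": "Florian Douetteau", "job": "CEO", "company": "Dataiku"},
    {"id": "wcoyote", "name": "Wile E. Coyote", "job": "Business Developer", "company": "ACME"},
]


def customer_info(customer_id=None):
    """Return (body, content_type) mimicking the described route behavior."""
    if customer_id:
        for row in CUSTOMERS:
            if row["id"] == customer_id:
                payload = {key: row[key] for key in ("name", "job", "company")}
                return json.dumps(payload), "application/json"
        # Assumption: an unknown id yields an empty JSON object.
        return json.dumps({}), "application/json"
    # Without an id, every customer is returned as one line of raw text.
    lines = [f"{r['name']}, {r['job']}, {r['company']}" for r in CUSTOMERS]
    return "\n".join(lines) + "\n", "text/plain"


if __name__ == "__main__":
    print(customer_info("fdouetteau"))
    print(customer_info()[0])
```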
Note
You can still use the backend to create a classical web application. Turning a web application into a headless one does not prevent developing a web application.
You must enable the Python backend and define the route when using a standard web application, as shown in Code 1.
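import json
import logging

import dataiku
from dataiku import SQLExecutor2
from flask import request, make_response

logger = logging.getLogger(__name__)

DATASET_NAME = 'pro_customers_sql'


# `app` is the Flask application provided by the webapp's Python backend
@app.route('/get_customer_info')
def get_customer_info():
    dataset = dataiku.Dataset(DATASET_NAME)
    table_name = dataset.get_location_info().get('info', {}).get('table')
    executor = SQLExecutor2(dataset=dataset)
    customer_id = request.args.get('id', None)
    if customer_id:
        # Double single quotes so the value stays inside the SQL string literal
        safe_id = customer_id.replace("'", "''")
        query_reader = executor.query_to_iter(
            f"""SELECT name, job, company FROM "{table_name}" WHERE id='{safe_id}'""")
        for (name, job, company) in query_reader.iter_tuples():
            # Return the first matching customer as JSON
            result = {"name": name, "job": job, "company": company}
            response = make_response(json.dumps(result))
            response.headers['Content-type'] = 'application/json'
            return response
        # No customer matches this id: answer with an empty JSON object and a 404 status
        logger.info("No customer found for id %s", customer_id)
        response = make_response(json.dumps({}), 404)
        response.headers['Content-type'] = 'application/json'
        return response
    else:
        query_reader = executor.query_to_iter(
            f"""SELECT name, job, company FROM "{table_name}" """)
        # Return every customer, one per line, as raw text
        result = ""
        for (name, job, company) in query_reader.iter_tuples():
            result += f"{name}, {job}, {company}\n"
        response = make_response(result)
        response.headers['Content-type'] = 'text/plain'
        return response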
When using Dash, once you have set the code env in the settings panel, you will define the route on the underlying Flask server (app.server), as shown in the Dash version of Code 1.
import json

# Dash >= 2.0; on older versions, use `import dash_html_components as html`
from dash import html
import dataiku
from dataiku import SQLExecutor2
from flask import request, make_response

DATASET_NAME = 'pro_customers_sql'


# `app` is the Dash application provided by Dataiku; `app.server` is its Flask server
@app.server.route('/get_customer_info')
def get_customer_info():
    dataset = dataiku.Dataset(DATASET_NAME)
    table_name = dataset.get_location_info().get('info', {}).get('table')
    executor = SQLExecutor2(dataset=dataset)
    customer_id = request.args.get('id', None)
    if customer_id:
        # Double single quotes so the value stays inside the SQL string literal
        safe_id = customer_id.replace("'", "''")
        query_reader = executor.query_to_iter(
            f"""SELECT name, job, company FROM "{table_name}" WHERE id='{safe_id}'""")
        for (name, job, company) in query_reader.iter_tuples():
            # Return the first matching customer as JSON
            result = {"name": name, "job": job, "company": company}
            response = make_response(json.dumps(result))
            response.headers['Content-type'] = 'application/json'
            return response
        # No customer matches this id: answer with an empty JSON object and a 404 status
        response = make_response(json.dumps({}), 404)
        response.headers['Content-type'] = 'application/json'
        return response
    else:
        query_reader = executor.query_to_iter(
            f"""SELECT name, job, company FROM "{table_name}" """)
        # Return every customer, one per line, as raw text
        result = ""
        for (name, job, company) in query_reader.iter_tuples():
            result += f"{name}, {job}, {company}\n"
        response = make_response(result)
        response.headers['Content-type'] = 'text/plain'
        return response


# We need to have a layout (even if we don't use it)
# In case we don't set a layout dash application won't start
app.layout = html.Div()
When using Dash as a web application framework, you can access the defined routes directly without enabling API Access.
Every web application is accessible via the URL https://<DSS_ADDRESS>:<DSS_PORT>/public-webapps/<PROJECT_KEY>/<WEBAPP_ID>/.
You will find more information on extracting those parameters in the cURL section.
This does not mean that the web application is public; it means that the application is also exposed on this route.
You can also use a vanity URL if you want.
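Because the route builds its SQL string from the id query parameter, that value should be treated as untrusted input. The sketch below shows one common mitigation, doubling single quotes before embedding the value in a string literal. `sql_string_literal` and `find_names` are hypothetical helpers, and the standard-library `sqlite3` module is used only as a stand-in database to demonstrate the effect. Whether this is sufficient depends on the SQL dialect behind your dataset, so treat it as an illustration rather than a complete defence.

```python
import sqlite3


def sql_string_literal(value):
    """Quote a Python string as a SQL string literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


# Stand-in database (assumption: the real data lives in the SQL dataset's table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("tcook", "Tim Cook"), ("fdouetteau", "Florian Douetteau")],
)


def find_names(customer_id):
    """Build the query by string interpolation, as the route does, but escaped."""
    query = f"SELECT name FROM customers WHERE id = {sql_string_literal(customer_id)}"
    return [name for (name,) in conn.execute(query)]


legit = find_names("fdouetteau")
# A classic injection attempt is now compared as a plain string and matches nothing.
attack = find_names("x' OR '1'='1")

if __name__ == "__main__":
    print(legit, attack)
```

Where the database driver supports bound parameters, they are generally preferable to manual escaping.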
Interacting with the newly defined API#
To access the headless API, you must be logged on to the instance or have an API key that identifies you. If you need help setting up an API key, please read this tutorial. Then, there are several different ways to interact with a headless API.
Using cURL requires an API key to access the headless API, or an equivalent way of authenticating, depending on the authentication method set on the Dataiku instance.
The WEBAPP_ID is the sequence of characters before the underscore in the webapp URL.
For example, if the webapp URL in DSS is /projects/HEADLESS/webapps/kUDF1mQ_api/view, the WEBAPP_ID is kUDF1mQ and the PROJECT_KEY is HEADLESS.
Once you have this API key, you can access the API endpoint with the following command.
curl -X GET --header 'Authorization: Bearer <USE_YOUR_API_KEY>' \
'http://<DSS_ADDRESS>:<DSS_PORT>/web-apps-backends/<PROJECT_KEY>/<WEBAPP_ID>/get_customer_info'
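The same request can be prepared from Python with the standard library only. The sketch below is a hedged illustration: `parse_webapp_url` and `build_request` are hypothetical helper names, the base URL is an example value, and the URL layout is taken from the cURL command above. It extracts PROJECT_KEY and WEBAPP_ID from the webapp URL, then builds (without sending) a request carrying the bearer token.

```python
from urllib.request import Request


def parse_webapp_url(path):
    """Extract (PROJECT_KEY, WEBAPP_ID) from /projects/<KEY>/webapps/<ID>_<name>/view."""
    parts = path.strip("/").split("/")
    project_key = parts[parts.index("projects") + 1]
    webapp_slug = parts[parts.index("webapps") + 1]
    # The WEBAPP_ID is everything before the first underscore.
    webapp_id = webapp_slug.split("_", 1)[0]
    return project_key, webapp_id


def build_request(base_url, webapp_path, api_key, route="get_customer_info"):
    """Build a GET request equivalent to the cURL command (nothing is sent)."""
    project_key, webapp_id = parse_webapp_url(webapp_path)
    url = f"{base_url.rstrip('/')}/web-apps-backends/{project_key}/{webapp_id}/{route}"
    return Request(url, headers={"Authorization": f"Bearer {api_key}"}, method="GET")


# Example values only: replace the address and the key with your own.
req = build_request(
    "http://dss.example.com",
    "/projects/HEADLESS/webapps/kUDF1mQ_api/view",
    "<USE_YOUR_API_KEY>",
)

if __name__ == "__main__":
    print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send the request to the instance.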
You can access the headless API using the Python API.
Depending on whether you are inside Dataiku or outside, you will use the dataiku or the dataikuapi package, respectively, as shown in Code 3.
import dataiku
import dataikuapi

API_KEY = "bx73rdSrUHol2qfmmefetUBCPUaJd3BY"  # example value, use your own API key
DSS_LOCATION = "http://dss.example.com/"
PROJECT_KEY = "HEADLESS"
WEBAPP_ID = "kUDF1mQ"

# If you are outside Dataiku use this function call
client = dataikuapi.DSSClient(DSS_LOCATION, API_KEY)

# If you are inside Dataiku you can use this function call instead
# client = dataiku.api_client()

project = client.get_project(PROJECT_KEY)
webapp = project.get_webapp(WEBAPP_ID)
backend = webapp.get_backend_client()

# To retrieve all customers
print(backend.session.get(backend.base_url + '/get_customer_info').text)

# To filter on one customer
print(backend.session.get(backend.base_url + '/get_customer_info?id=fdouetteau').text)
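Since the route answers with raw text when unfiltered and with JSON when filtered, client code usually needs to decode both shapes. The helper below is a minimal sketch: `parse_customer_response` is a hypothetical name, and it assumes that names and jobs never contain the ", " separator used by the raw-text output. In practice, the body and content type would come from the response's text and Content-Type header.

```python
import json


def parse_customer_response(body, content_type):
    """Decode the route's answer into a dict (JSON) or a list of dicts (raw text)."""
    if content_type.startswith("application/json"):
        return json.loads(body)
    rows = []
    for line in body.splitlines():
        if not line.strip():
            continue  # skip blank lines such as a trailing newline
        # Assumption: ", " only appears between the three fields.
        name, job, company = line.split(", ", 2)
        rows.append({"name": name, "job": job, "company": company})
    return rows


# Sample bodies shaped like the route's two kinds of answers.
all_rows = parse_customer_response(
    "Tim Cook, CEO, Apple\nWile E. Coyote, Business Developer, ACME\n", "text/plain"
)
one_row = parse_customer_response(
    '{"name": "Florian Douetteau", "job": "CEO", "company": "Dataiku"}',
    "application/json",
)

if __name__ == "__main__":
    print(all_rows)
    print(one_row)
```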
Wrapping up#
If you need to give access to unauthenticated users, you can turn your web application into a public one, as this documentation suggests. Now that you understand how to turn a web application into a headless one, you can create an agent-headless API.