IBM watsonx.ai

WatsonxLLM is a wrapper for IBM watsonx.ai foundation models.

This example shows how to communicate with watsonx.ai models using LangChain.

Overview

Integration details

Class      | Package       | Package downloads | Package latest
WatsonxLLM | langchain-ibm | PyPI - Downloads  | PyPI - Version

Setup

To access IBM watsonx.ai models you'll need to create an IBM watsonx.ai account, get an API key, and install the langchain-ibm integration package.

Credentials

The cell below defines the credentials required to work with watsonx Foundation Model inferencing.

Action: Provide the IBM Cloud user API key. For details, see Managing user API keys.

import os
from getpass import getpass

watsonx_api_key = getpass()
os.environ["WATSONX_APIKEY"] = watsonx_api_key

Additionally, you can pass other secrets as environment variables:

import os

os.environ["WATSONX_URL"] = "your service instance url"
os.environ["WATSONX_TOKEN"] = "your token for accessing the CPD cluster"
os.environ["WATSONX_PASSWORD"] = "your password for accessing the CPD cluster"
os.environ["WATSONX_USERNAME"] = "your username for accessing the CPD cluster"
os.environ["WATSONX_INSTANCE_ID"] = "your instance_id for accessing the CPD cluster"

Installation

The LangChain IBM integration lives in the langchain-ibm package:

!pip install -qU langchain-ibm

Instantiation

You might need to adjust model parameters for different models or tasks. For details, refer to the documentation.

parameters = {
    "decoding_method": "sample",
    "max_new_tokens": 100,
    "min_new_tokens": 1,
    "temperature": 0.5,
    "top_k": 50,
    "top_p": 1,
}

Initialize the WatsonxLLM class with the previously set parameters.

Note:

  • To provide context for the API call, you must add project_id or space_id. For more information, see the documentation.
  • Depending on the region of your provisioned service instance, use one of the URLs described here.

In this example, we’ll use the project_id and the Dallas URL.

You need to specify the model_id of the model that will be used for inferencing. You can find all available models in the documentation.

from langchain_ibm import WatsonxLLM

watsonx_llm = WatsonxLLM(
    model_id="ibm/granite-13b-instruct-v2",
    url="https://us-south.ml.cloud.ibm.com",
    project_id="PASTE YOUR PROJECT_ID HERE",
    params=parameters,
)

Alternatively, you can use Cloud Pak for Data credentials. For details, see the documentation.

watsonx_llm = WatsonxLLM(
    model_id="ibm/granite-13b-instruct-v2",
    url="PASTE YOUR URL HERE",
    username="PASTE YOUR USERNAME HERE",
    password="PASTE YOUR PASSWORD HERE",
    instance_id="openshift",
    version="4.8",
    project_id="PASTE YOUR PROJECT_ID HERE",
    params=parameters,
)

Instead of model_id, you can also pass the deployment_id of a previously tuned model. The entire model tuning workflow is described here.

watsonx_llm = WatsonxLLM(
    deployment_id="PASTE YOUR DEPLOYMENT_ID HERE",
    url="https://us-south.ml.cloud.ibm.com",
    project_id="PASTE YOUR PROJECT_ID HERE",
    params=parameters,
)

For certain requirements, you can pass IBM's APIClient object into the WatsonxLLM class.

from ibm_watsonx_ai import APIClient

api_client = APIClient(...)

watsonx_llm = WatsonxLLM(
    model_id="ibm/granite-13b-instruct-v2",
    watsonx_client=api_client,
)
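For illustration, here is a minimal sketch of how such a client might be constructed from IBM Cloud credentials. This is an assumption-laden example: the url, api_key, and project_id values are placeholders to replace with your own, and Credentials is the credentials class shipped with ibm_watsonx_ai.

import os

from ibm_watsonx_ai import APIClient, Credentials

# Placeholder values; substitute your own service URL, API key, and project.
credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key=os.environ["WATSONX_APIKEY"],
)
api_client = APIClient(credentials=credentials, project_id="PASTE YOUR PROJECT_ID HERE")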

You can also pass IBM's ModelInference object into the WatsonxLLM class.

from ibm_watsonx_ai.foundation_models import ModelInference

model = ModelInference(...)

watsonx_llm = WatsonxLLM(watsonx_model=model)
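For illustration, a ModelInference object might be set up as follows. This is a sketch only: model_id and project_id are placeholders, credentials is the object from the APIClient sketch above, and parameters is the dict defined earlier.

model = ModelInference(
    model_id="ibm/granite-13b-instruct-v2",
    credentials=credentials,  # constructed as in the APIClient sketch above
    project_id="PASTE YOUR PROJECT_ID HERE",
    params=parameters,
)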

Invocation

To obtain completions, you can call the model directly using a string prompt.

# Calling a single prompt

watsonx_llm.invoke("Who is man's best friend?")
"Man's best friend is his dog. Dogs are man's best friend because they are always there for you, they never judge you, and they love you unconditionally. Dogs are also great companions and can help reduce stress levels. "
# Calling multiple prompts

watsonx_llm.generate(
    [
        "The fastest dog in the world?",
        "Describe your chosen dog breed",
    ]
)
LLMResult(generations=[[Generation(text='The fastest dog in the world is the greyhound. Greyhounds can run up to 45 mph, which is about the same speed as a Usain Bolt.', generation_info={'finish_reason': 'eos_token'})], [Generation(text='The Labrador Retriever is a breed of retriever that was bred for hunting. They are a very smart breed and are very easy to train. They are also very loyal and will make great companions. ', generation_info={'finish_reason': 'eos_token'})]], llm_output={'token_usage': {'generated_token_count': 82, 'input_token_count': 13}, 'model_id': 'ibm/granite-13b-instruct-v2', 'deployment_id': None}, run=[RunInfo(run_id=UUID('750b8a0f-8846-456d-93d0-e039e95b1276')), RunInfo(run_id=UUID('aa4c2a1c-5b08-4fcf-87aa-50228de46db5'))], type='LLMResult')
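As with other LangChain LLMs, async counterparts such as ainvoke are also available. A minimal usage sketch, assuming the watsonx_llm instance from above:

import asyncio

async def main():
    # ainvoke is the async counterpart of invoke
    print(await watsonx_llm.ainvoke("Who is man's best friend?"))

asyncio.run(main())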

Streaming the model output

You can stream the model output.

for chunk in watsonx_llm.stream(
    "Describe your favorite breed of dog and why it is your favorite."
):
    print(chunk, end="")
My favorite breed of dog is a Labrador Retriever. They are my favorite breed because they are my favorite color, yellow. They are also very smart and easy to train.
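Streaming likewise has an async counterpart, astream, as on other LangChain LLMs. A minimal sketch, reusing the asyncio pattern from above:

import asyncio

async def main():
    # astream yields the output chunk by chunk, asynchronously
    async for chunk in watsonx_llm.astream(
        "Describe your favorite breed of dog and why it is your favorite."
    ):
        print(chunk, end="")

asyncio.run(main())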

Chaining

Create a PromptTemplate object that will be responsible for creating a random question.

from langchain_core.prompts import PromptTemplate

template = "Generate a random question about {topic}: Question: "

prompt = PromptTemplate.from_template(template)
API Reference: PromptTemplate

Provide a topic and run the chain.

llm_chain = prompt | watsonx_llm

topic = "dog"

llm_chain.invoke(topic)
'What is the origin of the name "Pomeranian"?'
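Because the chain is a LangChain Runnable, you can also run several topics in one call with batch, which returns one completion per input. A brief usage sketch:

llm_chain.batch(["dog", "cat"])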

API reference

For detailed documentation of all WatsonxLLM features and configurations head to the API reference.

