Python AI SDK in Practice

About 5 minutes

Target audience: Those who understand Python and API basics and want to integrate Claude into a real application

The Anthropic SDK is the official library for calling the Claude API from Python. It enables simple, clean code for the core Claude API features: sending and receiving messages, streaming responses, and tool use. This page walks through practical, runnable code examples to build a solid understanding of how to use the SDK.

pip install anthropic

Manage API keys through environment variables — never write them directly in code:

# macOS / Linux
export ANTHROPIC_API_KEY="sk-ant-..."

# Windows (PowerShell)
$env:ANTHROPIC_API_KEY = "sk-ant-..."

For projects using a .env file, python-dotenv is convenient:

pip install python-dotenv

# .env file (add this to .gitignore)
ANTHROPIC_API_KEY=sk-ant-...

from dotenv import load_dotenv
load_dotenv()  # Load the .env file

Reading the API key from an environment variable keeps it out of your source code, so it is not exposed if the code is published to GitHub. This only holds if the .env file itself stays out of version control.
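
To make the .env mechanism less of a black box, here is a minimal sketch of the idea using only the standard library. The helpers `read_env_file` and `load_into_environ` are hypothetical names invented for this illustration, not part of python-dotenv; the real library also handles quoting rules, comments, and variable expansion that this sketch ignores. Like `load_dotenv()` with its default settings, the sketch does not overwrite variables that are already set.

```python
import os
from pathlib import Path


def read_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments (hypothetical helper)."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_into_environ(path: str, environ=None) -> None:
    """Copy parsed values into the environment without overriding existing ones."""
    environ = os.environ if environ is None else environ
    for key, value in read_env_file(path).items():
        environ.setdefault(key, value)
```

Because existing variables win, a key exported in the shell takes precedence over the one in the file, which is usually what you want in production.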

The simplest way to query Claude:

import anthropic

client = anthropic.Anthropic()
# ANTHROPIC_API_KEY environment variable is loaded automatically

message = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain Python in three sentences."}
    ]
)

# Extract the response text
print(message.content[0].text)

The response object also carries metadata such as the stop reason and token usage:

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What is 1+1?"}]
)

# Fields of the response object
print(message.id)               # Unique message ID
print(message.model)            # Model name used
print(message.role)             # "assistant"
print(message.stop_reason)      # "end_turn" (normal completion)

# Token usage
print(message.usage.input_tokens)   # Number of input tokens
print(message.usage.output_tokens)  # Number of output tokens

# Extracting the text
text = message.content[0].text
print(text)  # e.g. "2" (the exact wording varies between runs)
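
Indexing `content[0]` works when the first block is text, but a response can contain several blocks of different types. The sketch below shows one way to collect only the text blocks. It runs offline: `fake_content` is built from `SimpleNamespace` stand-ins that merely mimic the `type` and `text` attributes used on this page, and `extract_text` is a hypothetical helper name.

```python
from types import SimpleNamespace


def extract_text(content) -> str:
    """Join the text of every text block, skipping other block types."""
    return "".join(block.text for block in content if block.type == "text")


# Stand-in objects shaped like response content blocks (not real SDK objects)
fake_content = [
    SimpleNamespace(type="text", text="The answer "),
    SimpleNamespace(type="tool_use", id="toolu_demo", name="get_weather", input={}),
    SimpleNamespace(type="text", text="is 2."),
]
print(extract_text(fake_content))  # The answer is 2.
```

With a real response you would pass `message.content` instead of the stand-in list.
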

Use the system parameter to set Claude's role and behavior:

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    system="You are a Python expert. Always include working, concrete code examples in your responses.",
    messages=[
        {"role": "user", "content": "How do I remove duplicates from a list?"}
    ]
)

print(message.content[0].text)

The API is stateless, so a multi-turn conversation resends the full history with every request:

import anthropic

client = anthropic.Anthropic()

# A list to accumulate conversation history
conversation = []

def chat(user_message: str) -> str:
    """Interact with Claude while maintaining conversation history"""
    conversation.append({"role": "user", "content": user_message})

    response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        system="You are a helpful Python learning assistant.",
        messages=conversation
    )

    assistant_message = response.content[0].text
    conversation.append({"role": "assistant", "content": assistant_message})

    return assistant_message

# Carry on a multi-turn conversation
print(chat("Can you explain Python list comprehensions?"))
print(chat("Could you show a more detailed example?"))
print(chat("How are they different from dictionary comprehensions?"))
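
A module-level list is fine for a demo, but it grows without bound and is shared by every caller. As a rough sketch of an alternative, the class below keeps the history per conversation and trims the oldest turns. `Conversation`, `send`, `max_messages`, and `fake_send` are hypothetical names for this illustration; `send` stands in for a function that wraps `client.messages.create`, so the example runs without network access.

```python
from typing import Callable


class Conversation:
    """Holds one conversation's history instead of a module-level list."""

    def __init__(self, send: Callable[[list], str], max_messages: int = 20):
        self.send = send              # hypothetical callable: history -> reply text
        self.max_messages = max_messages
        self.history: list[dict] = []

    def chat(self, user_message: str) -> str:
        self.history.append({"role": "user", "content": user_message})
        try:
            reply = self.send(self.history)
        except Exception:
            self.history.pop()        # keep history consistent if the call fails
            raise
        self.history.append({"role": "assistant", "content": reply})
        # Drop the oldest user/assistant pairs once the history grows too long
        while len(self.history) > self.max_messages:
            del self.history[:2]
        return reply


# Offline stand-in for the API call; a real `send` would wrap client.messages.create
def fake_send(history: list) -> str:
    return f"reply #{len(history) // 2 + 1}"


convo = Conversation(fake_send, max_messages=4)
for question in ["one", "two", "three"]:
    convo.chat(question)
print(len(convo.history))  # 4
```

Trimming in pairs keeps the history starting with a user message. Counting messages is a crude proxy for context size; a token-based budget would be more precise.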

LLM responses take time to generate. Streaming allows generated text to be displayed in real time, improving the user experience.

import anthropic

client = anthropic.Anthropic()

# Use a with block to manage the stream
with client.messages.stream(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain the basics of cloud architecture."}]
) as stream:
    # Print text as it is generated
    for text in stream.text_stream:
        print(text, end="", flush=True)

    # Retrieve the final message object while the stream is still open
    final_message = stream.get_final_message()

print()  # Final newline
print(f"\nTokens used: {final_message.usage.input_tokens} in / {final_message.usage.output_tokens} out")

To react to individual stream events, iterate over the stream itself:

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Compare three programming languages."}]
) as stream:
    for event in stream:
        if hasattr(event, "type"):
            if event.type == "content_block_start":
                print("[Generation started]")
            elif event.type == "content_block_delta":
                if hasattr(event.delta, "text"):
                    print(event.delta.text, end="", flush=True)
            elif event.type == "content_block_stop":
                print("\n[Generation ended]")
            elif event.type == "message_stop":
                print("[Message complete]")
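
The event loop above can be hard to reason about without seeing the events in order. The following offline sketch feeds a hand-built sequence through the same kind of filter. The events are `SimpleNamespace` stand-ins, not real SDK objects, and `collect_text` is a hypothetical helper; it checks the delta's `type` rather than using `hasattr`, so non-text deltas such as tool-input JSON are skipped explicitly.

```python
from types import SimpleNamespace


def collect_text(events) -> str:
    """Concatenate text deltas from a sequence of stream-like events."""
    parts = []
    for event in events:
        if event.type == "content_block_delta" and event.delta.type == "text_delta":
            parts.append(event.delta.text)
    return "".join(parts)


# Stand-in events (not real SDK objects) in a typical order
fake_events = [
    SimpleNamespace(type="content_block_start"),
    SimpleNamespace(type="content_block_delta",
                    delta=SimpleNamespace(type="text_delta", text="Hel")),
    SimpleNamespace(type="content_block_delta",
                    delta=SimpleNamespace(type="text_delta", text="lo")),
    SimpleNamespace(type="content_block_delta",
                    delta=SimpleNamespace(type="input_json_delta", partial_json="{}")),
    SimpleNamespace(type="content_block_stop"),
    SimpleNamespace(type="message_stop"),
]
print(collect_text(fake_events))  # Hello
```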

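Tool use lets Claude request calls to external functions or APIs, for example to retrieve weather data, search a database, or perform a calculation. Claude does not run the function itself: it returns a tool_use block, your code executes the call, and you send the result back in a tool_result block.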

import anthropic
import json

client = anthropic.Anthropic()

# Tool definition (tells Claude what functions are available)
tools = [
    {
        "name": "get_weather",
        "description": "Retrieves the current weather for a specified city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city to check weather for (e.g., Tokyo, London)"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "The temperature unit"
                }
            },
            "required": ["city"]
        }
    }
]

def get_weather(city: str, unit: str = "celsius") -> dict:
    """Call a real weather API (this example returns dummy data)"""
    # In a real implementation, call an API like OpenWeatherMap
    weather_data = {
        "Tokyo":  {"temp": 22, "condition": "Sunny",    "humidity": 60},
        "London": {"temp": 15, "condition": "Cloudy",   "humidity": 75},
    }
    data = weather_data.get(city, {"temp": 20, "condition": "Unknown", "humidity": 50})

    if unit == "fahrenheit":
        data["temp"] = data["temp"] * 9 / 5 + 32

    return {"city": city, "temperature": data["temp"], "unit": unit,
            "condition": data["condition"], "humidity": data["humidity"]}

def process_tool_call(tool_name: str, tool_input: dict) -> str:
    """Call the appropriate function based on tool name and arguments"""
    if tool_name == "get_weather":
        result = get_weather(**tool_input)
        return json.dumps(result)
    return json.dumps({"error": "Unknown tool"})

def chat_with_tools(user_message: str) -> str:
    """Interact with Claude, allowing it to use tools"""
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.messages.create(
            model="claude-opus-4-5",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )

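        # Any stop reason other than "tool_use" ("end_turn", "max_tokens", ...) ends
        # the loop; without this, an unexpected stop reason would loop forever
        if response.stop_reason != "tool_use":
            return "".join(b.text for b in response.content if b.type == "text")

        # Claude requested one or more tool calls
        if response.stop_reason == "tool_use":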
            # Add the assistant response (including tool calls) to the conversation
            messages.append({"role": "assistant", "content": response.content})

            # Process each tool call
            tool_results = []
            for content_block in response.content:
                if content_block.type == "tool_use":
                    tool_result = process_tool_call(
                        content_block.name,
                        content_block.input
                    )
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": content_block.id,
                        "content": tool_result
                    })

            # Add tool results and continue
            messages.append({"role": "user", "content": tool_results})

# Run it
answer = chat_with_tools("Compare the weather in Tokyo and London.")
print(answer)
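
As the number of tools grows, an if/elif dispatcher becomes awkward, and an exception inside a tool would crash the loop. One possible arrangement, sketched here under assumptions, is a registry plus a wrapper that reports failures back to Claude through the `is_error` field of the tool_result block. `TOOL_REGISTRY`, `register_tool`, `build_tool_result`, and the `add_numbers` tool are hypothetical names for this illustration only.

```python
import json
from typing import Callable

# Hypothetical registry mapping tool names to plain Python functions
TOOL_REGISTRY: dict[str, Callable[..., dict]] = {}


def register_tool(func: Callable[..., dict]) -> Callable[..., dict]:
    """Register a function under its own name so it can be dispatched by name."""
    TOOL_REGISTRY[func.__name__] = func
    return func


@register_tool
def add_numbers(a: float, b: float) -> dict:
    """Illustrative tool, unrelated to the weather example."""
    return {"sum": a + b}


def build_tool_result(tool_use_id: str, name: str, tool_input: dict) -> dict:
    """Run a registered tool and wrap the outcome as a tool_result block."""
    try:
        if name not in TOOL_REGISTRY:
            raise ValueError(f"Unknown tool: {name}")
        content, is_error = json.dumps(TOOL_REGISTRY[name](**tool_input)), False
    except Exception as exc:  # report the failure to Claude instead of crashing
        content, is_error = f"{type(exc).__name__}: {exc}", True
    return {"type": "tool_result", "tool_use_id": tool_use_id,
            "content": content, "is_error": is_error}


print(build_tool_result("toolu_demo", "add_numbers", {"a": 1, "b": 2}))
```

Returning the error text gives Claude a chance to correct its arguments on the next turn instead of ending the conversation.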

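Rate limits and network errors can occur when calling AI APIs. The SDK already retries some failures automatically (twice by default, configurable through the client's max_retries option); the example below shows how to handle the remaining errors explicitly: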

import anthropic
import time

client = anthropic.Anthropic()

def call_with_retry(messages: list, max_retries: int = 3) -> str:
    """Retry logic that handles rate limiting"""
    for attempt in range(max_retries):
        try:
            response = client.messages.create(
                model="claude-opus-4-5",
                max_tokens=1024,
                messages=messages
            )
            return response.content[0].text

        except anthropic.RateLimitError:
            if attempt < max_retries - 1:
                wait_seconds = 2 ** attempt  # Exponential backoff: 1s, 2s, ... (no wait after the last attempt)
                print(f"Rate limit reached. Retrying in {wait_seconds} second(s)...")
                time.sleep(wait_seconds)
            else:
                raise

        except anthropic.APIConnectionError:
            print("Network connection error.")
            raise

        except anthropic.AuthenticationError:
            print("Invalid API key. Check the ANTHROPIC_API_KEY environment variable.")
            raise

        except anthropic.APIStatusError as e:
            print(f"API error: {e.status_code} - {e.message}")
            raise

    return ""

result = call_with_retry([{"role": "user", "content": "Hello!"}])
print(result)
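
The retry loop above can be factored into a reusable helper. The sketch below is one way to do that, with two additions that are assumptions of this example rather than anything the SDK requires: a cap on the delay and random "full jitter" so that many clients do not retry in lockstep. `retry_with_backoff` and its parameters are hypothetical names, and the `sleep` argument is injectable so the helper can be exercised without actually waiting.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    retryable: tuple = (Exception,),
    max_attempts: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying retryable errors with capped, jittered backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))  # full jitter spreads out retries
    raise AssertionError("unreachable")


# Offline demonstration with a function that fails twice, then succeeds
attempts = {"count": 0}

def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("temporarily unavailable")
    return "ok"

print(retry_with_backoff(flaky, retryable=(TimeoutError,), sleep=lambda s: None))  # ok
```

With the SDK you would pass something like `retryable=(anthropic.RateLimitError,)` and a lambda that wraps `client.messages.create`.
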
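
To fail fast with a clear message, read the key explicitly and check it before creating the client:

import os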
import anthropic

# Always read from environment variable
api_key = os.environ.get("ANTHROPIC_API_KEY")
if not api_key:
    raise ValueError("ANTHROPIC_API_KEY environment variable is not set")

client = anthropic.Anthropic(api_key=api_key)

Set max_tokens to match the expected length of the answer:

# For short answers (classification, decisions)
response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=256,   # Keep low to control cost
    messages=[{"role": "user", "content": "Is the sentiment of this text Positive or Negative? Answer in one word."}]
)

# For long answers (document generation, detailed explanations)
response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=4096,  # Set higher
    messages=[{"role": "user", "content": "Please provide a detailed explanation of async processing in Python."}]
)
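
If max_tokens is set too low, the answer is cut off and stop_reason is "max_tokens" instead of "end_turn". A small check like the following sketch makes truncation visible. `read_reply` is a hypothetical helper, and the response objects are `SimpleNamespace` stand-ins so the example runs offline.

```python
from types import SimpleNamespace


def read_reply(response) -> tuple[str, bool]:
    """Return (text, truncated); truncated is True when max_tokens cut the reply short."""
    text = "".join(b.text for b in response.content if b.type == "text")
    return text, response.stop_reason == "max_tokens"


# Stand-in responses (not real SDK objects)
complete = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="Positive")],
)
cut_off = SimpleNamespace(
    stop_reason="max_tokens",
    content=[SimpleNamespace(type="text", text="Async processing in Python is")],
)

for response in (complete, cut_off):
    text, truncated = read_reply(response)
    print(f"{text!r} truncated={truncated}")
```

When `truncated` is True, typical reactions are to raise max_tokens and retry, or to tell the user the answer is incomplete.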

A small, complete CLI application that combines everything covered above:

"""
simple_qa.py - A simple Claude Q&A CLI tool

Usage:
  python simple_qa.py
"""

import anthropic
import os
import sys

def create_client() -> anthropic.Anthropic:
    """Create an Anthropic client"""
    api_key = os.environ.get("ANTHROPIC_API_KEY")
    if not api_key:
        print("Error: ANTHROPIC_API_KEY environment variable is not set")
        sys.exit(1)
    return anthropic.Anthropic(api_key=api_key)

def ask_claude(client: anthropic.Anthropic, question: str, history: list) -> str:
    """Stream Claude's answer to the terminal and return the full text"""
    history.append({"role": "user", "content": question})

    full_response = ""
    print("\nClaude: ", end="", flush=True)

    try:
        with client.messages.stream(
            model="claude-opus-4-5",
            max_tokens=1024,
            system="You are a helpful and knowledgeable assistant.",
            messages=history
        ) as stream:
            for text in stream.text_stream:
                print(text, end="", flush=True)
                full_response += text
    except anthropic.APIError:
        # Drop the unanswered question so the history stays consistent
        history.pop()
        print()
        raise

    print()  # Newline
    history.append({"role": "assistant", "content": full_response})
    return full_response

def main():
    """Main loop"""
    client = create_client()
    history = []

    print("=== Claude Q&A Tool ===")
    print("Enter a question. Type 'clear' to reset the history, or 'quit' / 'exit' to end.\n")

    while True:
        try:
            question = input("You: ").strip()

            if not question:
                continue

            if question.lower() in ("quit", "exit"):
                print("Goodbye.")
                break

            if question.lower() == "clear":
                history.clear()
                print("Conversation history cleared.\n")
                continue

            ask_claude(client, question, history)
            print()

        except KeyboardInterrupt:
            print("\n\nGoodbye.")
            break

        except anthropic.RateLimitError:
            print("Rate limit reached. Please wait before retrying.")

        except anthropic.APIConnectionError:
            print("Network connection error. Please check your connection.")

if __name__ == "__main__":
    main()

How to run it:

export ANTHROPIC_API_KEY="sk-ant-..."
python simple_qa.py

Key takeaways:

  • Create a client with anthropic.Anthropic() and send messages with client.messages.create()
  • Always read API keys from environment variables; never write them in code
  • Use client.messages.stream() for streaming responses
  • Tool use (function calling) lets Claude request calls to external functions, which your code then executes
  • Check stop_reason and the type of each element in content when processing responses
  • Handle exceptions like anthropic.RateLimitError appropriately

Q: How do I use models other than claude-opus-4-5?

A: Just change the model parameter. Faster, lower-cost models (e.g., claude-haiku-4-5) can reduce latency and costs. See Anthropic’s model list for options.

Q: Does the cost depend on the max_tokens setting?

A: No. Cost is based on the input tokens you send plus the output tokens actually generated. max_tokens is only a ceiling: if the response is short, you are charged only for what was generated.
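
As a worked example of that arithmetic, the sketch below estimates the cost of one request from the usage fields. The prices are placeholders chosen for easy arithmetic, not real rates, and `estimate_cost_usd` is a hypothetical helper; check the current pricing page for actual per-token prices.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Cost = input tokens x input price + output tokens x output price (per million tokens)."""
    total = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return total / 1_000_000


# Placeholder prices in USD per million tokens (NOT real rates)
cost = estimate_cost_usd(1200, 300, input_price_per_mtok=3.0, output_price_per_mtok=15.0)
print(f"${cost:.4f}")  # $0.0081
```

Note that max_tokens does not appear in the formula; only `usage.input_tokens` and `usage.output_tokens` from the response would be plugged in.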

Q: Is the cost different with and without streaming?

A: No. Streaming only affects when the output is displayed, not the number of tokens generated.

Q: How do I debug tool use errors?

A: Print response.content to inspect its contents and verify that the tool_use block contains the expected name and input values. If the input_schema definition is incorrect, Claude may not be able to generate the correct JSON arguments.
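
When inspecting a tool_use block by eye is not enough, a small check against the schema can pinpoint the mismatch. The sketch below covers only the two constraints used in this page's weather tool, required fields and enum values. `check_tool_input` is a hypothetical helper, not an SDK function; a full JSON Schema validator would cover far more cases.

```python
def check_tool_input(tool_input: dict, input_schema: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    properties = input_schema.get("properties", {})
    for name in input_schema.get("required", []):
        if name not in tool_input:
            problems.append(f"missing required field: {name}")
    for name, value in tool_input.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"unexpected field: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name}={value!r} is not one of {spec['enum']}")
    return problems


# Same shape as the get_weather schema defined earlier on this page
weather_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

print(check_tool_input({"city": "Tokyo"}, weather_schema))   # []
print(check_tool_input({"unit": "kelvin"}, weather_schema))
```

The second call reports both the missing city and the invalid unit, which is the kind of detail worth logging next to the raw tool_use block.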

See the references for the external specifications and background sources used on this page.[1][2][3][4]

  1. Anthropic Python SDK (GitHub)
  2. Claude API Documentation
  3. Tool Use Guide (Anthropic)
  4. Messages API Reference