Question: I wrote a calculator tool for an agent built in langgraph. However, the LLM keeps trying to do math itself rather than using the tool. How can I discourage it from that?
This is a common challenge with LLMs and tool calling! Here are several strategies to encourage your agent to use the calculator tool instead of doing math itself:
1. Explicit Instructions in System Prompt
Be very direct about when to use the tool:
python
system_prompt = """You are a helpful assistant with access to a calculator tool.

CRITICAL: You must ALWAYS use the calculator tool for ANY mathematical operations, including:
- Basic arithmetic (addition, subtraction, multiplication, division)
- Comparisons and numerical evaluations
- Any calculation involving numbers

Never attempt to calculate results yourself. Always call the calculator tool, even for simple operations like 2+2.

After receiving the tool result, present it to the user naturally."""

2. Add Examples in Few-Shot Format
Show the desired behavior explicitly:
python
system_prompt = """You have access to a calculator tool.

Examples of correct behavior:
User: "What is 15 + 27?"
Assistant: [calls calculator tool with expression="15 + 27"]
Tool result: 42
Assistant: "15 + 27 equals 42"

User: "What's 8 times 9?"
Assistant: [calls calculator tool with expression="8 * 9"]

ALWAYS use the calculator for any math operation."""

3. Validation Node
Add a node that checks if the LLM skipped the tool when it shouldn't have:
python
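from langchain_core.messages import HumanMessage

def validate_response(state):
    """Check whether the LLM did math without using the tool."""
    last_message = state["messages"][-1]
    # Tool calls live on the AI message itself, not on the graph state.
    made_tool_call = bool(getattr(last_message, "tool_calls", None))
    # has_calculation_in_text is a helper you supply (for example, a regex heuristic).
    if has_calculation_in_text(last_message.content) and not made_tool_call:
        # Append a corrective message so the next model call retries with the tool.
        # A HumanMessage is used here because some providers reject system
        # messages that appear mid-conversation.
        return {"messages": [HumanMessage(content="You must use the calculator tool. Try again.")]}
    # Nothing to add. This assumes "messages" uses an appending reducer such as add_messages.
    return {"messages": []}

4. Conditional Routing
Route based on whether tools were called: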
python
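def should_continue(state):
    last_message = state["messages"][-1]
    # Pending tool calls go to the tool node (the route name "tools" is an assumption;
    # use whatever name your graph maps to its tool node).
    if last_message.tool_calls:
        return "tools"
    # contains_math_answer is a helper you supply (for example, a regex heuristic).
    # If the LLM answered with numbers but didn't call the tool, route back to retry.
    if contains_math_answer(last_message):
        return "force_tool_use"
    return "end"

5. Model-Specific Behavior
Some models are better at following tool-use instructions: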
Claude (Anthropic) tends to be very compliant with tool usage instructions
GPT-4 sometimes tries to "help" by doing simple math
Smaller models may struggle more with consistent tool usage
For your teaching context, this is actually a great opportunity to show students the difference between models!
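One way to make that comparison concrete is to measure how often each model actually issues a tool call on a fixed set of math prompts. The sketch below is a minimal harness under stated assumptions: `FakeResponse`, `tool_call_rate`, `compare_models`, and the two stand-in "models" are hypothetical names introduced here for illustration. The only thing it relies on from a real setup is that each response object exposes a `tool_calls` attribute, as LangChain AI messages do.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FakeResponse:
    """Stand-in for an AI message; only the fields the harness reads."""
    content: str = ""
    tool_calls: List[dict] = field(default_factory=list)


def tool_call_rate(ask: Callable[[str], object], prompts: List[str]) -> float:
    """Fraction of prompts whose response contains at least one tool call."""
    if not prompts:
        return 0.0
    used = sum(1 for p in prompts if getattr(ask(p), "tool_calls", None))
    return used / len(prompts)


def compare_models(models: Dict[str, Callable[[str], object]],
                   prompts: List[str]) -> Dict[str, float]:
    """Run the same prompts through every model and report each tool-call rate."""
    return {name: tool_call_rate(ask, prompts) for name, ask in models.items()}


# Hypothetical stand-ins so the harness runs without any API keys.
def always_calls(prompt: str) -> FakeResponse:
    """Behaves like a fully compliant model: always calls the calculator."""
    call = {"name": "calculator", "args": {"expression": prompt}}
    return FakeResponse(tool_calls=[call])


def skips_easy(prompt: str) -> FakeResponse:
    """Behaves like a model that answers short questions directly."""
    if len(prompt) < 15:
        return FakeResponse(content="I can do that one in my head.")
    call = {"name": "calculator", "args": {"expression": prompt}}
    return FakeResponse(tool_calls=[call])


if __name__ == "__main__":
    prompts = ["2 + 2", "15 + 27", "1234 * 5678 / 91"]
    rates = compare_models({"compliant": always_calls, "lazy": skips_easy}, prompts)
    for name, rate in rates.items():
        print(f"{name}: {rate:.0%} of prompts used the tool")
```

With real models, each `ask` callable would wrap a tool-bound model's `invoke` call instead of a stand-in, and the prompt list should include the easy cases (like 2+2) where models are most tempted to answer directly.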
6. Force Tool-Only Mode
Make the tool call mandatory by modifying your agent structure:
python
from langchain_core.messages import HumanMessage

# In your graph definition
def agent_node(state):
    # Get the LLM response
    response = model.invoke(state["messages"])
    # is_math_question is a helper you supply; it inspects the latest message.
    if is_math_question(state["messages"][-1]) and not response.tool_calls:
        # Discard the direct answer and append a corrective message instead.
        # The graph needs an edge that routes back to this node for the retry to happen.
        # A HumanMessage is used because some providers reject system messages mid-conversation.
        return {"messages": [
            HumanMessage(content="ERROR: You must use the calculator tool for this. Do not provide an answer without calling the tool first.")
        ]}
    return {"messages": [response]}

For your course, this behavior difference is actually pedagogically valuable! You could:
First show the problem - let students see the LLM doing math incorrectly
Demonstrate progressive solutions - from weak prompting to strong guardrails
Compare models - show which models are most/least compliant
Add validation layers - teach defensive programming for agent systems
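The validation snippets above call helpers such as `has_calculation_in_text`, `contains_math_answer`, and `is_math_question` without defining them. Here is a minimal sketch of what they could look like, assuming a simple regex heuristic is good enough for a classroom demo. The patterns and the thin wrappers are assumptions made for illustration, not part of LangGraph or LangChain, and a heuristic like this will produce false positives (for example, it treats a date such as 2024-01-15 as subtraction).

```python
import re

# A number, an arithmetic symbol, and another number: "15 + 27", "8*9", "3.5 / 2".
_SYMBOL_MATH = re.compile(r"\d+(?:\.\d+)?\s*[-+*/×÷^]\s*\d+(?:\.\d+)?")

# The same idea with spelled-out operators: "8 times 9", "10 divided by 4".
_WORD_MATH = re.compile(
    r"\d+(?:\.\d+)?\s+(?:plus|minus|times|divided\s+by|multiplied\s+by)\s+\d+(?:\.\d+)?",
    re.IGNORECASE,
)


def has_calculation_in_text(text) -> bool:
    """Heuristic: does the text contain something that looks like arithmetic?"""
    # Message content is sometimes a list of content blocks rather than a string;
    # this sketch only inspects plain strings and treats anything else as "no math".
    if not isinstance(text, str):
        return False
    return bool(_SYMBOL_MATH.search(text) or _WORD_MATH.search(text))


def contains_math_answer(message) -> bool:
    """Apply the heuristic to a message object's content (used on the AI reply)."""
    return has_calculation_in_text(getattr(message, "content", ""))


def is_math_question(message) -> bool:
    """Apply the same heuristic to the user's message."""
    return has_calculation_in_text(getattr(message, "content", ""))
```

Because all three helpers share one heuristic, tightening the patterns (or swapping in a small classifier) in `has_calculation_in_text` changes the behavior of the validation node, the router, and the agent node together.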