Centralized Prompt Management in MCP Architecture

English Summary

Following the design principles of MCP (Model Context Protocol), this lesson demonstrates how to centralize prompts in an MCP server instead of embedding them in client-side code. This approach improves modularity, maintainability, and consistency across clients.

Key points covered:

Server-side prompt management allows for unified updating and maintenance
Client applications retrieve prompts from the server when needed
The server returns a prompt object rather than a plain string; the client must extract the prompt text from it (in this example, from messages[0].content.text)
An MCP server exposes three kinds of capabilities: tools, resources, and prompts

The practical implementation involves:

Defining a prompt function on the server with the @mcp.prompt() decorator
Writing a client-side function to retrieve prompts from the server
Extracting the text content from the returned prompt object
Using this server-retrieved prompt with user input for LLM processing

This centralized approach gives every client the same prompts, reduces redundancy, and simplifies maintenance, in line with MCP design principles.
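The last two steps (extracting the text and combining it with user input) can be sketched without a running server. The dataclasses below are hypothetical stand-ins that copy only the shape this lesson relies on, messages[0].content.text; they are not the SDK's own classes, and extract_prompt_text and build_conversation are illustrative names.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical stand-ins for the object returned by session.get_prompt().
# They model only the attributes this lesson reads: messages[i].content.text.
@dataclass
class TextContent:
    text: str


@dataclass
class PromptMessage:
    role: str
    content: TextContent


@dataclass
class PromptResult:
    messages: List[PromptMessage] = field(default_factory=list)


def extract_prompt_text(prompt_result) -> str:
    """Return the text of the first message that carries a text payload."""
    for message in getattr(prompt_result, "messages", None) or []:
        text = getattr(getattr(message, "content", None), "text", None)
        if isinstance(text, str):
            return text
    raise ValueError("prompt result contains no text content")


def build_conversation(system_prompt: str, chat_history: list) -> list:
    """Place the server-provided prompt first, followed by the chat history."""
    return [{"role": "system", "content": system_prompt}] + list(chat_history)


# Made-up sample data standing in for a real server response.
result = PromptResult(
    [PromptMessage("user", TextContent("You are a tool-calling assistant."))]
)
conversation = build_conversation(
    extract_prompt_text(result),
    [{"role": "user", "content": "What is Alice's number?"}],
)
```

Keeping the extraction in one small function means a client only has to change one place if the shape of the returned object differs from what is assumed here.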

Implementation Summary

This example demonstrates a complete MCP architecture implementation with two main components:

1. Phone Directory Server (MCP Server)

The server component implements:

Tools: Functions that can be called by clients (search_phone)
Resources: Content that clients can read by URI (greeting://{name})
Prompts: Centralized system prompts for LLM interaction

2. Chatbot UI (MCP Client)

The client component implements:

Model Integration: Connects to a local LLM through Ollama's OpenAI-compatible API
Server Communication: Retrieves prompts and calls tools from the MCP server
User Interface: Provides a chat interface using Streamlit
Response Processing: Parses LLM outputs to determine when to call server tools

The key demonstration is how the system prompt is defined on the server side and retrieved by the client, ensuring consistent prompting across all implementations that use this server.

Code Implementation

phone_directory_server.py

import pandas as pd
from mcp.server.fastmcp import FastMCP

# Initialize the FastMCP server
mcp = FastMCP("Phone Directory Server")

# Tool: search the phone directory
@mcp.tool()
def search_phone(query: str) -> str:
    """
    Search the phone directory.
    :param query: search keyword
    :return: matching entries, one per line
    """
    try:
        df = pd.read_excel("phone_directory.xlsx", dtype=str)
        df.columns = [col.strip() for col in df.columns]

        for col in ['姓名', '電話']:
            if col not in df.columns:
                return f"電話表缺少必要欄位：{col}"

        df['電話'] = df['電話'].astype(str)

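        # regex=False matches the query literally, so characters such as
        # "+" or "(" in a phone number are not interpreted as a pattern.
        mask = (
            df['姓名'].str.contains(query, case=False, na=False, regex=False) |
            df['電話'].str.contains(query, case=False, na=False, regex=False)
        )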
        results = df[mask]

        if results.empty:
            return "找不到符合查詢條件的資料。"

        response_lines = []
        for _, row in results.iterrows():
            line = f"姓名：{row['姓名']}, 電話：{row['電話']}"
            if '地址' in row and pd.notna(row['地址']):
                line += f", 地址：{row['地址']}"
            if '備註' in row and pd.notna(row['備註']):
                line += f", 備註：{row['備註']}"
            response_lines.append(line)

        return "\n".join(response_lines)

    except Exception as e:
        return f"電話表讀取失敗: {e}"

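# Resource: greeting
@mcp.resource("greeting://{name}")
def get_greeting(name: str) -> str:
    return f"Hello, {name}!"

# Prompt: system prompt shared by all clients
@mcp.prompt()
def system_prompt() -> str:
    """
    System prompt that defines how the LLM interacts with the tools.
    """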
    return (
        "你是一個能呼叫工具的助理。"
        "如果需要查電話，請回傳 JSON 格式：\n"
        '{ "action": "search_phone", "args": { "query": "xxx" } }\n'
        "如果不需要呼叫任何工具，就回傳：\n"
        '{ "action": "none", "answer": "你要回答的內容" }\n'
        "不要輸出任何多餘文字，不要有多餘註解。"
    )


# Start the MCP server
if __name__ == "__main__":
    mcp.run()

chatbot_ui.py

# 1. Import the required libraries
import streamlit as st
import openai
import asyncio
import platform
import json
from mcp import ClientSession
from mcp.client.stdio import stdio_client, StdioServerParameters

# 2. Set the asyncio event loop policy (Windows only)
if platform.system() == "Windows":
    asyncio.set_event_loop_policy(asyncio.WindowsProactorEventLoopPolicy())


# 3. Configure the OpenAI-compatible client for the local Ollama model
client = openai.OpenAI(
    api_key="ollama",  # placeholder value for the local endpoint
    base_url="http://localhost:11434/v1"  # adjust to match your Ollama setup
)


# 4. MCP server integration
async def call_mcp_tool(tool_name, args):
    server_params = StdioServerParameters(command="python", args=["phone_directory_server.py"])

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            if tool_name == "search_phone":
                return await session.call_tool("search_phone", args)
            elif tool_name == "greeting":
                content, _ = await session.read_resource(f"greeting://{args['name']}")
                return content
    return "MCP 工具呼叫失敗"


# 5. Clean up the LLM output
def clean_llm_output(text):
    # Strip a Markdown code fence (with or without a "json" tag) around the output
    text = text.strip()
    if text.startswith("```"):
        text = text.strip("`")  # remove the backticks at both ends
        lines = text.split("\n")
        # Drop the language tag left over from a "json" opening fence
        if lines[0].strip().lower() == "json":
            lines = lines[1:]
        text = "\n".join(lines)
    # The final strip also removes the blank line left by the closing fence
    return text.strip()

async def get_mcp_prompt(prompt_name):
    """
    Fetch a prompt from the MCP server.
    :param prompt_name: name of the prompt, e.g. "system_prompt"
    :return: the prompt object; its text is read from messages[0].content.text
    """
    server_params = StdioServerParameters(command="python", args=["phone_directory_server.py"])
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            prompt = await session.get_prompt(prompt_name)
            return prompt

# 6. Streamlit UI
st.title("📞 電話簿助理 Chatbot (Function Calling Demo)")

# Chat history
chat_history = st.session_state.get("chat_history", [])

# User input
user_input = st.chat_input("請輸入您的訊息...")

if user_input:
    # Append the user's message
    chat_history.append({"role": "user", "content": user_input})

    # Fetch the prompt object from the MCP server
    prompt_object = asyncio.run(get_mcp_prompt("system_prompt"))

    # Extract the prompt text
    if prompt_object and prompt_object.messages:
        system_prompt = prompt_object.messages[0].content.text
    else:
        raise ValueError("未能從 MCP 伺服器獲取有效的提示內容")

    # Put the server-provided prompt first as the system message
    conversation = [
        {"role": "system", "content": system_prompt}
    ] + chat_history

    # Send the conversation to the LLM
    response = client.chat.completions.create(
        model="phi4:latest",  # or the name of your model
        messages=conversation
    )

    llm_output = response.choices[0].message.content
    llm_output_clean = clean_llm_output(llm_output)

    st.write("🔍 LLM Output 原始內容：", llm_output)
    # Parse the LLM response
    try:
        parsed = json.loads(llm_output_clean)
        action = parsed.get("action")
        args = parsed.get("args", {})
        answer = parsed.get("answer", "")
    except json.JSONDecodeError as e:
        # Not valid JSON: fall back to treating the output as a plain answer
        st.error(f"❌ JSON 解析錯誤：{e}")
        st.write("⚠️ LLM 回傳的內容：", llm_output)
        parsed = {"action": "none", "answer": llm_output}
        action = "none"
        args = {}
        answer = llm_output


    final_reply = ""  # final reply text

    if action == "search_phone":
        # The LLM decided to search the phone directory
        query_str = args.get("query", user_input)  # fall back to the raw user input
        with st.spinner("正在查詢電話簿..."):
            mcp_result = asyncio.run(call_mcp_tool("search_phone", {"query": query_str}))

        if hasattr(mcp_result, "content") and mcp_result.content:
            texts = [item.text for item in mcp_result.content if hasattr(item, "text")]
            # Prefix the MCP result with a heading
            final_reply = "以下是查詢結果：\n" + "\n".join(texts)
        else:
            final_reply = "⚠️ 查無資料"
    elif action == "none":
        # No tool call needed: show the LLM's answer as-is
        final_reply = answer
    else:
        # Unknown action: treat the raw output as a plain-text reply
        final_reply = llm_output

    # Append the final reply to the chat history
    chat_history.append({"role": "assistant", "content": final_reply})

    # Render the chat history
    for msg in chat_history:
        st.chat_message(msg["role"]).write(msg["content"])

    # Save the history back to session_state
    st.session_state["chat_history"] = chat_history

Code Walk-through

Server-side (phone_directory_server.py)

Key Components:

Tool Function (@mcp.tool): The search_phone function provides the core functionality to search a phone directory stored in an Excel file.
Resource Function (@mcp.resource): The get_greeting function demonstrates how to provide simple resources to clients.
Prompt Definition (@mcp.prompt): The system_prompt function defines the centralized system prompt that guides the LLM behavior.

This server encapsulates both the business logic (searching phone records) and the LLM interaction strategy (prompt) in one place, allowing multiple clients to use consistent functionality and prompting strategies.
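The matching rule inside search_phone can be illustrated without pandas or the Excel file. The sketch below is a stdlib-only stand-in, not the server's code: records are plain dicts, the sample entries are made up, and search_records is a hypothetical name. It applies the same case-insensitive substring match on name or phone and formats each hit the same way.

```python
def search_records(records, query):
    """Return formatted lines for records whose name or phone contains query."""
    needle = query.casefold()
    lines = []
    for rec in records:
        name = str(rec.get("姓名", ""))
        phone = str(rec.get("電話", ""))
        if needle in name.casefold() or needle in phone.casefold():
            line = f"姓名：{name}, 電話：{phone}"
            # Optional columns are appended only when present and non-empty.
            for key in ("地址", "備註"):
                if rec.get(key):
                    line += f", {key}：{rec[key]}"
            lines.append(line)
    return lines


# Made-up sample entries, for illustration only.
records = [
    {"姓名": "Alice Chen", "電話": "0912-345-678", "地址": "Taipei"},
    {"姓名": "Bob Lin", "電話": "+886-2-5550-1234"},
]

by_name = search_records(records, "alice")
by_phone = search_records(records, "+886")
no_match = search_records(records, "zzz")
```

Plain substring matching treats characters such as + literally, which matters for phone numbers written in international format.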

Client-side (chatbot_ui.py)

Key Components:

MCP Client Functions: call_mcp_tool and get_mcp_prompt handle communication with the MCP server.
LLM Integration: Uses the OpenAI client to connect to a local Ollama instance for language model inference.
Streamlit UI: Provides a chat interface for user interaction.
Prompt Retrieval: The get_mcp_prompt function fetches the system prompt from the server rather than hardcoding it in the client.
Response Parsing: clean_llm_output strips any Markdown code fence from the LLM response, and the JSON parsing logic reads the action field to decide whether to call a server tool.

The client demonstrates the MCP architecture by keeping the model (LLM) and prompt management separate, connecting to the server for prompts and tools, and focusing on the user interface and workflow coordination.
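The response-parsing step can be isolated into one function that is easy to test away from the UI. The sketch below is one possible arrangement, not the lesson's code: parse_llm_action is a hypothetical name, and the regular expression is an assumption about how a model might wrap its JSON in a Markdown fence. Anything that does not parse as a JSON object is treated as a plain answer.

```python
import json
import re

# Optional Markdown code fence, with or without a language tag, around the payload.
_FENCE = re.compile(r"^```[a-zA-Z]*\s*\n?(.*?)\n?```$", re.DOTALL)


def parse_llm_action(raw: str):
    """Return (action, args, answer); fall back to a plain answer on bad JSON."""
    text = raw.strip()
    match = _FENCE.match(text)
    if match:
        text = match.group(1).strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return "none", {}, raw
    if not isinstance(parsed, dict):
        # Valid JSON but not an object (for example a bare number or list).
        return "none", {}, raw
    action = parsed.get("action", "none")
    args = parsed.get("args")
    if not isinstance(args, dict):
        args = {}
    answer = parsed.get("answer") or ""
    return action, args, answer


tool_call = parse_llm_action('{"action": "search_phone", "args": {"query": "Alice"}}')
plain = parse_llm_action("Sorry, I can only help with phone lookups.")
```

Returning a fixed three-element tuple keeps the calling code free of None checks: the caller can branch on action alone and always has a dict for args and a string for answer.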

Key Architecture Benefits

Centralized Prompt Management: All system prompts are defined on the server, ensuring consistency across clients.
Separation of Concerns: The server handles data access and business logic, while the client manages user interaction.
Scalability: Different clients (web, mobile, CLI) can share the same MCP server.
Maintainability: Prompt updates only need to be made in one place (the server).
Flexibility: The LLM can dynamically decide when to use tools based on the user's request.