AI Agent Least Privilege Token Implementation Guide: MCP OAuth 2.1 + RFC 8707
If you have ever been cited in a security audit for hardcoding API keys into an agent, this article is for you. A 2025 survey of 5,200 MCP servers found that only 8.5% actually used OAuth, and that more than half relied on long-lived API keys or PATs, which give an attacker indefinite and unrestricted access once they leak. In a complex workflow where agents invoke tools, wake up other agents, and attempt to write to a production database, a single long-lived token is a bomb waiting to go off.
By combining RFC 8707 Resource Indicators and the Ephemeral Scoped Token pattern with OAuth 2.1, the official authorization standard for MCP, you can gate every tool call an AI agent makes behind a least-privilege temporary access token bound to "specific server, specific task, specific time." This article covers why plain Bearer tokens are dangerous in agent environments, which attack vectors RFC 8707 blocks, and how to apply the pattern to real MCP servers, from core concepts through multi-agent delegation chains, practical code, and common mistakes.
Key Concepts
Why MCP Chose OAuth 2.1
MCP (Model Context Protocol) is a standard protocol proposed by Anthropic that connects AI agents with external tools and services. The March 2025 revision of the specification adopted OAuth 2.1 for remote server access, and the June 2025 revision formally classified MCP servers as OAuth Resource Servers.
Authorization Server vs Resource Server: The Authorization Server is the entity that issues tokens (e.g., Auth0, ZITADEL), and the Resource Server is the entity that receives those tokens and processes API requests (in this case, the MCP server). The agent first obtains a token from the Authorization Server, and then submits that token to the Resource Server (MCP server) to call the tool.
OAuth 2.1 consolidates OAuth 2.0 and its security best practices into a single specification and removes the flows that proved insecure.
| Removed Item | Reason |
|---|---|
| Implicit Flow | Access token exposed in URL fragment |
| Non-PKCE Authorization Code Flow | Vulnerable to authorization code interception attacks |
| Resource Owner Password Flow | Client handles user credentials directly |
PKCE (Proof Key for Code Exchange): An extension that prevents authorization codes from being stolen from public clients (browsers and agents). In the Authorization Request the client sends code_challenge (a hash of the verifier), and in the later, separate Token Request it sends the original code_verifier, proving that both requests come from the same party. The key point is that the two parameters are transmitted at different stages.
Example 1 shows how a token obtained through the OAuth 2.1 authorization code + PKCE flow is validated on the server side.
RFC 8707 Resource Indicators: The Answer to Token Mis-redemption Attacks
The core threat addressed by RFC 8707 is the token mis-redemption attack.
Token Mis-redemption Attack: An attack in which a malicious MCP server A takes an access token that a client presented to it and replays that token against Service B. It works because a token that does not specify an audience can be submitted to any Resource Server.
RFC 8707 adds the resource parameter to the token request so that the constraint "this token is valid only on the server at this URI" is stamped into the token itself.
Token request (including the resource parameter):
POST /token HTTP/1.1
Host: auth.example.com
Content-Type: application/x-www-form-urlencoded

grant_type=authorization_code
&code=SplxlOBeZQQYbYS6WxSbIA
&redirect_uri=https://agent.example.com/callback
&resource=https://mcp-server.example.com    ← the key RFC 8707 parameter
&code_verifier=dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk
JWT payload of the issued token:
{
  "iss": "https://auth.example.com",
  "sub": "agent-user-123",
  "aud": "https://mcp-server.example.com",
  "scope": "tool:files:read",
  "exp": 1713100800,
  "iat": 1713100500,
  "jti": "a1b2c3d4-e5f6-..."
}
The MCP server must check that the aud claim matches its own URI and reject the token immediately if it does not. Example 1 shows how this validation is implemented in actual code.
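The aud check itself is small enough to sketch on its own. The helper below is a hypothetical illustration (audience_matches is not a library function) that operates on already-decoded claims; it assumes, per the JWT specification, that aud may be either a single string or a list of strings.

```python
from typing import Any, Mapping


def audience_matches(claims: Mapping[str, Any], server_uri: str) -> bool:
    """Return True only if server_uri is named in the token's aud claim."""
    aud = claims.get("aud")
    if isinstance(aud, str):
        return aud == server_uri
    if isinstance(aud, (list, tuple)):
        return server_uri in aud
    return False  # a missing or malformed aud is treated as a mismatch


claims = {"aud": "https://mcp-server.example.com", "scope": "tool:files:read"}
accepted = audience_matches(claims, "https://mcp-server.example.com")
rejected = audience_matches(claims, "https://other-service.example.com")
```

A token minted for another server fails this check even when its signature and expiry are perfectly valid, which is the property that defeats mis-redemption.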
Ephemeral Scoped Token: Least Privilege in Practice
The ephemeral scoped token pattern implements least privilege along three axes.
| Axis | Traditional API Key | Ephemeral Scoped Token |
|---|---|---|
| Time to Live (TTL) | Indefinite or months | 2 to 30 minutes |
| Audience | No limit | One specific server URI |
| Scope | Full authority or broad scope | Single tool · Single task |
Designing separate scopes for each tool allows you to separate high-risk tasks from low-risk exploration.
scope=tool:files:read → read files, TTL 5 minutes
scope=tool:files:write → write files, TTL 3 minutes
scope=tool:code:execute → execute code, TTL 2 minutes (high risk)
scope=tool:db:write:prod → write to the production DB, TTL 1 minute + human approval required
Example 2 shows how this scope design is applied in agent code.
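One way to keep such a policy enforceable is to hold it in a single lookup table that the token-issuing code consults. The sketch below mirrors the four scopes and TTLs listed above; ScopePolicy, SCOPE_POLICIES, and policy_for are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopePolicy:
    ttl_seconds: int
    needs_human_approval: bool = False


# Assumption: one policy entry per scope, mirroring the list above.
SCOPE_POLICIES = {
    "tool:files:read": ScopePolicy(ttl_seconds=300),
    "tool:files:write": ScopePolicy(ttl_seconds=180),
    "tool:code:execute": ScopePolicy(ttl_seconds=120),
    "tool:db:write:prod": ScopePolicy(ttl_seconds=60, needs_human_approval=True),
}


def policy_for(scope: str) -> ScopePolicy:
    """Look up the policy for a scope; unknown scopes are refused outright."""
    try:
        return SCOPE_POLICIES[scope]
    except KeyError:
        raise PermissionError(f"no token policy defined for scope {scope!r}") from None
```

Refusing unknown scopes by default means a newly added tool gets no token until someone has consciously assigned it a TTL and risk level.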
RFC 8693 Token Exchange — Agent Delegation Chain
In a multi-agent architecture, tokens from a parent agent must be narrowed and exchanged for a child agent. RFC 8693 handles this role.
[User] → Orchestrator Agent
↓ RFC 8693 Token Exchange
Sub-Agent A (the act claim carries the parent agent's identity)
↓
MCP Server (aud=server URI, TTL=30 seconds)
Each parameter of the exchange request:
POST /token HTTP/1.1
Host: auth.example.com
Content-Type: application/x-www-form-urlencoded
grant_type=urn:ietf:params:oauth:grant-type:token-exchange
&subject_token=<orchestrator_access_token>
&subject_token_type=urn:ietf:params:oauth:token-type:access_token
&requested_token_type=urn:ietf:params:oauth:token-type:access_token
&resource=https://mcp-subagent.example.com
&scope=tool:search:read
&actor_token=<subagent_client_assertion>
&actor_token_type=urn:ietf:params:oauth:token-type:jwt
| Parameter | Role |
|---|---|
| grant_type | Selects the RFC 8693 Token Exchange grant type |
| subject_token | The delegator's (Orchestrator's) current access token |
| subject_token_type | States that subject_token is an access token |
| requested_token_type | States the type of token to be issued |
| resource | RFC 8707: target server URI for the token to be issued |
| scope | Minimum scope to delegate to the sub-agent |
| actor_token | JWT with which the sub-agent proves its own identity |
| actor_token_type | States that actor_token is a JWT |
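Assembling these parameters by hand is error-prone, so a small builder can help. The following is a minimal sketch using only the standard library; build_token_exchange_body is a hypothetical helper, and the token values passed to it are placeholders rather than real credentials.

```python
from urllib.parse import parse_qs, urlencode

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"
JWT_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:jwt"


def build_token_exchange_body(
    subject_token: str, actor_token: str, resource: str, scope: str
) -> str:
    """Return the form-encoded body of an RFC 8693 token exchange request."""
    return urlencode(
        {
            "grant_type": TOKEN_EXCHANGE_GRANT,
            "subject_token": subject_token,
            "subject_token_type": ACCESS_TOKEN_TYPE,
            "requested_token_type": ACCESS_TOKEN_TYPE,
            "resource": resource,  # RFC 8707 target of the narrowed token
            "scope": scope,  # must not exceed what the parent token allows
            "actor_token": actor_token,
            "actor_token_type": JWT_TOKEN_TYPE,
        }
    )


body = build_token_exchange_body(
    subject_token="orchestrator-token-placeholder",
    actor_token="subagent-assertion-placeholder",
    resource="https://mcp-subagent.example.com",
    scope="tool:search:read",
)
decoded = parse_qs(body)
```

The body would then be sent as a POST to the Authorization Server's token endpoint with the Content-Type application/x-www-form-urlencoded.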
The act claim is added to the issued sub-token, creating a delegation chain that allows every action to be traced back to the human who first authorized it. Following RFC 8693, the top-level sub remains the user, the outermost act identifies the current actor (the sub-agent), and each nested act identifies the actor that came before it.
{
  "sub": "user-456",
  "aud": "https://mcp-subagent.example.com",
  "scope": "tool:search:read",
  "exp": 1713100830,
  "act": {
    "sub": "sub-agent-a",
    "act": {
      "sub": "orchestrator-agent"
    }
  }
}
Incremental Consent: With this approach the agent requests additional scopes only when a task actually needs them, rather than requesting every scope up front. For example, file reads are authorized from the start, and separate user approval is obtained later, at the moment a production DB write is required. This prevents the agent from accumulating excessive privileges in advance and creates a natural checkpoint for human intervention before high-risk tools are called. The pattern pays off most in the delegation chain scenario (Example 3).
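The decision logic behind incremental consent can be sketched as plain set arithmetic. The names below (APPROVAL_REQUIRED_SCOPES, scopes_to_request, needs_human_approval) are hypothetical, and which scopes require approval is an assumption based on the scope list earlier in this article.

```python
from typing import Set

# Assumption: scopes that always need a fresh human approval before issuance.
APPROVAL_REQUIRED_SCOPES = {"tool:db:write:prod"}


def scopes_to_request(granted: Set[str], needed: Set[str]) -> Set[str]:
    """Return only the scopes the current task needs that are not granted yet."""
    return needed - granted


def needs_human_approval(granted: Set[str], needed: Set[str]) -> bool:
    """True when any newly requested scope is on the approval-required list."""
    return bool(scopes_to_request(granted, needed) & APPROVAL_REQUIRED_SCOPES)


granted = {"tool:files:read"}
task_needs = {"tool:files:read", "tool:db:write:prod"}
extra = scopes_to_request(granted, task_needs)
ask_user = needs_human_approval(granted, task_needs)
```

The agent asks only for the difference, so an approval prompt appears exactly when the task crosses into a higher-risk scope.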
Practical Application
Example 1: Implementing RFC 8707 Token Validation on MCP Server (Python)
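The example below implements the Resource Server validation logic directly with FastAPI and python-jose. The mcp-auth library and the MCP Python SDK can wrap similar checks, but spelling them out keeps each validation step visible.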
# requirements: python-jose, fastapi, httpx
from fastapi import FastAPI, Depends, HTTPException, status
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from jose import jwt, JWTError
import httpx

app = FastAPI()
security = HTTPBearer()

MCP_SERVER_URI = "https://mcp-server.example.com"
AUTH_SERVER_JWKS_URI = "https://auth.example.com/.well-known/jwks.json"


async def get_jwks():
    """
    Fetch the Authorization Server's public keys used to verify token signatures.

    Note: caching is omitted here for clarity. In production, caching
    (with cachetools, httpx_auth, or similar) is required. Without it, every
    request triggers an external call, which adds latency and takes this
    service down whenever the Authorization Server is unavailable.
    """
    async with httpx.AsyncClient() as client:
        response = await client.get(AUTH_SERVER_JWKS_URI)
        response.raise_for_status()  # fail loudly instead of parsing an error page
        return response.json()


async def verify_ephemeral_token(
    credentials: HTTPAuthorizationCredentials = Depends(security)
) -> dict:
    """
    RFC 8707-compliant validation:
    1. Signature validity
    2. Expiration (exp)
    3. The audience claim matches this server's URI
    The required scope is checked separately by each tool endpoint.
    """
    token = credentials.credentials
    jwks = await get_jwks()
    try:
        payload = jwt.decode(
            token,
            jwks,
            algorithms=["RS256"],
            audience=MCP_SERVER_URI,  # key RFC 8707 step: aud validation
            options={"verify_exp": True}
        )
    except JWTError as e:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail=f"Token validation failed: {e}",
            headers={"WWW-Authenticate": "Bearer"}
        )
    return payload


@app.get("/tools/files/read")
async def read_file(
    path: str,
    token_payload: dict = Depends(verify_ephemeral_token)
):
    """Only short-lived tokens carrying the tool:files:read scope are accepted."""
    scopes = token_payload.get("scope", "").split()
    if "tool:files:read" not in scopes:
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail="Missing the scope required for this tool: tool:files:read"
        )
    # The TTL is very short, so a token that reaches this point is trusted as valid
    # TODO: implement the actual file-reading logic
    return {"content": f"Read file: {path}", "ttl_remaining": "verified"}
| Code Point | Role |
|---|---|
| audience=MCP_SERVER_URI | The key RFC 8707 step: the token is rejected automatically if its aud claim differs from this server's URI |
| "verify_exp": True | Strictly enforces expiration of short-TTL tokens |
| Scope verification | Prevents lateral movement by separating scopes per tool |
| JWKS caching note | A caching layer must be added before production use |
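The caching layer called for above could look roughly like the following. This is a simplified, synchronous sketch and not the production recipe: JwksCache is a hypothetical class, the fetch function is injected so any HTTP client can be used, and serving a stale key set when a refresh fails is a design assumption made here to survive short Authorization Server outages.

```python
import time
from typing import Callable, Optional


class JwksCache:
    """Cache a JWKS document for ttl_seconds and refetch only after it expires."""

    def __init__(
        self,
        fetch: Callable[[], dict],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[dict] = None
        self._expires_at = 0.0

    def get(self) -> dict:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            try:
                self._value = self._fetch()
                self._expires_at = now + self._ttl
            except Exception:
                if self._value is None:
                    raise  # nothing cached yet, so the failure must surface
                # otherwise keep serving the stale key set until a fetch succeeds
        return self._value


# Usage with a stand-in fetch function (a real one would call the JWKS endpoint).
cache = JwksCache(fetch=lambda: {"keys": []}, ttl_seconds=300.0)
jwks = cache.get()
```

An async server would wrap the same idea around an awaitable fetch and a lock, but the expiry logic stays the same.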
Example 2: Requesting a Dynamic Token When Calling a Tool (TypeScript MCP Client)
This is a pattern where the agent requests a narrow scope token immediately before each tool call.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

interface TokenRequest {
  resource: string; // RFC 8707: target server URI
  scope: string;    // least privilege per tool
}

// Assumed to be implemented elsewhere: returns a signed JWT client assertion.
declare function generateClientAssertion(): Promise<string>;

/**
 * Requests a short-lived token with the client_credentials grant.
 *
 * Note: PKCE (code_verifier/code_challenge) is used only in the
 * authorization_code flow. It does not apply to the client_credentials
 * grant (RFC 7636). If the flow requires user authorization, use a separate
 * authorization_code flow with PKCE.
 */
async function requestEphemeralToken(req: TokenRequest): Promise<string> {
  const response = await fetch("https://auth.example.com/token", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "client_credentials",
      resource: req.resource, // ← RFC 8707
      scope: req.scope,
      client_assertion_type:
        "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
      client_assertion: await generateClientAssertion(),
    }),
  });
  if (!response.ok) {
    throw new Error(`Token request failed: ${response.status}`);
  }
  const { access_token } = await response.json();
  return access_token;
}

/**
 * When PKCE is used in the authorization_code flow, code_verifier must be a
 * 43-128 character Base64URL string per RFC 7636 §4.1.
 * crypto.randomUUID() does not provide enough entropy, so use the approach below.
 */
function generateCodeVerifier(): string {
  const array = new Uint8Array(32);
  crypto.getRandomValues(array);
  return btoa(String.fromCharCode(...array))
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=/g, "");
}

async function callToolWithEphemeralToken(
  toolName: string,
  toolArgs: Record<string, unknown>
) {
  // ttl documents the intended lifetime; the Authorization Server enforces it.
  const toolScopeMap: Record<string, { scope: string; ttl: number }> = {
    "read_file": { scope: "tool:files:read", ttl: 300 },
    "write_file": { scope: "tool:files:write", ttl: 180 },
    "execute_code": { scope: "tool:code:execute", ttl: 120 },
  };
  const toolConfig = toolScopeMap[toolName];
  if (!toolConfig) throw new Error(`Unknown tool: ${toolName}`);

  // 1. Request a short-lived token immediately before the tool call
  const token = await requestEphemeralToken({
    resource: "https://mcp-server.example.com",
    scope: toolConfig.scope,
  });

  // 2. Hand the token to the MCP server and call the tool.
  // Note: this example launches a stdio server and passes the token through
  // an environment variable; with a remote (HTTP) server the token would
  // travel in the Authorization header instead.
  // Note: this example creates a new connection for every tool call.
  // In production it is common to reuse the connection and refresh only the token.
  const transport = new StdioClientTransport({
    command: "mcp-server",
    env: { MCP_AUTH_TOKEN: token },
  });
  const client = new Client({ name: "agent", version: "1.0.0" }, {});
  await client.connect(transport);
  const result = await client.callTool({ name: toolName, arguments: toolArgs });
  await client.close();
  return result;
}
Example 3: Limiting the Blast Radius in a Multi-Agent Delegation Chain
Assume the worst case: the publishing agent is compromised. The comparison below shows how far the damage spreads with a long-lived API key versus an Ephemeral Scoped Token.
import httpx

LONG_LIVED_API_KEY = "<long-lived-api-key>"  # placeholder for illustration

# Vulnerable approach: long-lived API key
headers_unsafe = {
    "Authorization": f"Bearer {LONG_LIVED_API_KEY}",
    # Expiration: none, scope: everything, audience: every service
}


async def request_ephemeral_token(resource: str, scope: str, ttl: int) -> str:
    """
    Client-side token request function (not implemented in this article).
    It plays the same role as requestEphemeralToken() in Example 2, in Python.

    resource: RFC 8707 target server URI
    scope:    least-privilege scope
    ttl:      token lifetime in seconds
    """
    raise NotImplementedError


# Safe approach: Ephemeral Scoped Token + RFC 8707
async def publish_post_safely(post_content: str) -> dict:
    """
    Uses only a token issued with scope=create:post and TTL=60 seconds.
    Even if the token is stolen:
    - it expires automatically after 60 seconds
    - it cannot be reused outside the dev.to server (aud claim)
    - it cannot do anything other than create a new post (scope restriction)
    """
    token = await request_ephemeral_token(
        resource="https://dev.to/api",
        scope="create:post",
        ttl=60  # 60-second TTL
    )
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://dev.to/api/articles",
            headers={"Authorization": f"Bearer {token}"},
            json={"article": {"title": "...", "body_markdown": post_content}}
        )
        return response.json()
| Scenario | Long-lived API key leaked | Ephemeral token leaked |
|---|---|---|
| Validity | Indefinite | At most 60 seconds |
| Accessible services | Every service the key allows | dev.to only (aud fixed) |
| Available operations | All API operations | Creating a post only |
| Scope of damage | Takeover of the entire account | A single post created |
Pros and Cons Analysis
Advantages
| Item | Content |
|---|---|
| Prevention of token mis-redemption | The aud claim is pinned to a specific server URI, so a token cannot be replayed against another server |
| Minimized blast radius | Short TTL + narrow scope limit the damage of a compromise to a specific task and time window |
| Human Supervision Possible | Allows inserting a user approval step for agent high-risk scope requests via Incremental Consent |
| Audit Trail | act Claim-based delegation chains can trace all actions back to the initial authorizing authority |
| Standard Compatibility | Based on IETF standards (RFC 8707, 8693, 9728) and not tied to a specific vendor |
| Compliance | Implementing RFC 8707 has been a client MUST since the 2025-06-18 revision of the MCP specification |
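The audit-trail row above depends on being able to unpack the nested act claim. The sketch below is a hypothetical helper (delegation_chain is not part of any library) that assumes the RFC 8693 convention in which the outermost act is the current actor and each nested act is an earlier one; the sample claims are illustrative.

```python
from typing import Any, List, Mapping


def delegation_chain(claims: Mapping[str, Any]) -> List[str]:
    """List actors from the current one back to the earliest, following nested act claims."""
    chain: List[str] = []
    act = claims.get("act")
    while isinstance(act, dict):
        chain.append(act.get("sub", "<unknown>"))
        act = act.get("act")
    return chain


sample_claims = {
    "sub": "user-456",  # the human on whose behalf everything happens
    "act": {"sub": "sub-agent-a", "act": {"sub": "orchestrator-agent"}},
}
actors = delegation_chain(sample_claims)
audit_line = f"{sample_claims['sub']} via " + " <- ".join(actors)
```

Logging a line like this next to every tool call gives reviewers the full path from the authorizing user to the agent that actually acted.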
Disadvantages and Precautions
| Item | Content | Response Plan |
|---|---|---|
| Implementation complexity | Requires multiple endpoints such as the Authorization Server, Resource Server, Token Introspection, and Dynamic Registration | Use specialized platforms such as Scalekit, Auth0 Token Vault, or ZITADEL |
| Token refresh overhead | With TTLs measured in minutes, reissuing a token on every tool call adds latency | Apply token caching plus a proactive refresh strategy |
| Single point of failure | All agents stop working if the Auth Server goes down | HA (High Availability) configuration + caching of Token Introspection results |
| Low adoption in the field | Most MCP servers still use long-lived API keys | Lower migration costs with Dynamic Client Registration (RFC 7591) |
| Scope design difficulty | Too much granularity makes management complexity explode; too little violates the principle of least privilege | Design a 3-5 level scope hierarchy based on tool risk |
| Clock skew | Short-TTL tokens are sensitive to time synchronization errors between servers | Require NTP synchronization and allow a 60-second margin on the nbf claim |
Clock Skew: The time difference that arises when the clocks of servers in a distributed system drift slightly apart. For a token with a TTL of 2 minutes, a 30-second clock skew can cut the effective lifetime by a quarter.
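A leeway-aware time check is one common way to absorb clock skew. The function below is a minimal sketch on already-decoded claims; is_time_valid is a hypothetical name, and the leeway value is an assumption that each deployment has to choose for itself.

```python
from typing import Any, Mapping


def is_time_valid(
    claims: Mapping[str, Any], now: float, leeway_seconds: float = 0.0
) -> bool:
    """Check exp and nbf against now, tolerating leeway_seconds of clock skew."""
    exp = claims.get("exp")
    if exp is None or now >= exp + leeway_seconds:
        return False  # tokens without exp are never accepted
    nbf = claims.get("nbf")
    if nbf is not None and now < nbf - leeway_seconds:
        return False
    return True


claims = {"nbf": 900, "exp": 1000}
strict = is_time_valid(claims, now=1010)
tolerant = is_time_valid(claims, now=1010, leeway_seconds=30)
```

The leeway widens the acceptance window on both ends, so it should stay small relative to the token's TTL.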
Dynamic Client Registration (RFC 7591): A protocol that allows an agent to autonomously obtain a client ID without prior registration when it first encounters an MCP server. It resolves the M×N problem where all combinations of M agents × N servers must be registered in advance.
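The "token caching + proactive refresh" response from the table above can be sketched as a small cache keyed by resource and scope. EphemeralTokenCache is a hypothetical class; the injected request_token callable is assumed to return the token together with its expiry as epoch seconds.

```python
import time
from typing import Callable, Dict, Tuple


class EphemeralTokenCache:
    """Reuse a token per (resource, scope) and refresh it shortly before expiry."""

    def __init__(
        self,
        request_token: Callable[[str, str], Tuple[str, float]],
        refresh_margin: float = 10.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._request_token = request_token
        self._margin = refresh_margin
        self._clock = clock
        self._tokens: Dict[Tuple[str, str], Tuple[str, float]] = {}

    def get(self, resource: str, scope: str) -> str:
        key = (resource, scope)
        cached = self._tokens.get(key)
        # Refresh when nothing is cached or the remaining lifetime is within the margin.
        if cached is None or cached[1] - self._clock() <= self._margin:
            cached = self._request_token(resource, scope)
            self._tokens[key] = cached
        return cached[0]
```

Keying on both resource and scope keeps the cache from handing a broader or differently targeted token to a tool that asked for a narrow one.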
The Most Common Mistakes in Practice
- Omitting aud claim verification: If you only check the JWT signature and do not verify aud, you remain vulnerable to token mis-redemption attacks even when RFC 8707 is applied. The Resource Server must decode the token with audience set to its own URI. In Python, jwt.decode(..., audience="https://your-mcp-server.example.com") is the starting point.
- Setting a long TTL: If you set the TTL to 24 hours for implementation convenience, the "short-lived token" pattern loses its meaning. Document and follow a risk-based TTL policy for each tool type (low risk: 5 minutes, medium risk: 2 minutes, high risk: 1 minute + human approval).
- Setting the scope too broadly: Granting a catch-all scope to a single agent, such as scope=* or scope=admin, undermines the principle of least privilege. Design scopes from the list of tools the agent actually calls.
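The three mistakes above can be caught mechanically in a test suite or a token-issuance hook. The following is a minimal sketch on decoded claims; find_policy_violations is a hypothetical helper, and the 300-second ceiling and the set of forbidden scopes are assumptions taken from the examples in this article.

```python
from typing import Any, List, Mapping

BROAD_SCOPES = {"*", "admin"}  # assumption: scopes treated as too broad


def find_policy_violations(
    claims: Mapping[str, Any], max_ttl_seconds: int = 300
) -> List[str]:
    """Return a list of least-privilege violations found in decoded token claims."""
    problems: List[str] = []
    if not claims.get("aud"):
        problems.append("missing aud claim")
    exp, iat = claims.get("exp"), claims.get("iat")
    if exp is None or iat is None:
        problems.append("missing exp or iat claim")
    elif exp - iat > max_ttl_seconds:
        problems.append(f"TTL of {exp - iat}s exceeds {max_ttl_seconds}s")
    if BROAD_SCOPES & set(claims.get("scope", "").split()):
        problems.append("overly broad scope")
    return problems


good = {"aud": "https://mcp-server.example.com", "scope": "tool:files:read",
        "iat": 1713100500, "exp": 1713100800}
bad = {"scope": "admin", "iat": 1713100500, "exp": 1713186900}
```

Running such a check against sample tokens from each environment makes policy drift visible before an auditor finds it.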
In Conclusion
RFC 8707 Resource Indicators and the Ephemeral Scoped Token pattern are the most practical standards-based approaches to realizing the principle of least privilege in the era of AI agents.
The core of this pattern is simple. If you stamp four constraints ("who, where, what, and until when") into a token, then even if the agent is compromised, all the attacker obtains is a single token that is valid for a few minutes, on a single server, for a single task.
Before starting implementation, there are architecture-level questions that the team must decide first. RFC 8707 support and the implementation schedule vary significantly depending on whether to select a managed service (Auth0, ZITADEL, Scalekit, etc.) or build it in-house for the Authorization Server. It is efficient to finalize this choice first before proceeding to the following steps.
Since the 2025-06-18 revision of the MCP specification, implementing RFC 8707 has been a MUST for clients rather than a mere recommendation, and for good reason. MCP clients that do not implement this requirement may run into compatibility issues with specification-compliant servers, and in enterprise environments this can lead to security audit failures.
3 Steps to Start Now:
- Review the token validation code of your existing MCP server. Check whether the aud claim is validated against your server URI; if it is not, start by adding the audience parameter to your JWT decode logic.
- List the tools your agent calls and document, for each one, a scope in the format tool:{domain}:{action} and a TTL policy. This makes it easier to keep later implementations consistent.
- Add OAuth 2.1 to your existing MCP server as a plugin by following the mcp-auth.dev Quick Start. Rather than building an Authorization Server from scratch, delegating token issuance to a managed platform that supports RFC 8707 and focusing first on the Resource Server validation logic lowers the barrier to entry.
Next Post: How to Implement a Zero-Config Authentication Architecture Where MCP Clients Auto-Discover Server Authentication Requirements Using RFC 9728 Protected Resource Metadata
Reference Materials
- The New MCP Authorization Specification | dasroot.net
- Authorization - Model Context Protocol Official Specification | modelcontextprotocol.io
- MCP, OAuth 2.1, PKCE, and the Future of AI Authorization | Aembit
- OAuth 2.0 Resource Indicators (RFC 8707) Explained | Scalekit
- RFC 8707: Resource Indicators for OAuth 2.0 | IETF
- RFC 8693: OAuth 2.0 Token Exchange | IETF
- MCP Authentication and Authorization Implementation Guide | Stytch
- Authorization for MCP: OAuth 2.1, PRMs, and Best Practices | Oso
- MCP Spec Updates from June 2025 | Auth0
- OAuth for MCP: Emerging Enterprise Patterns for Agent Authorization | GitGuardian
- Ephemeral Authentication: Securing Autonomous AI Workflows | AI Journal
- Auth0 Token Vault: Secure Token Exchange for AI Agents | Auth0
- Credential Risks Across 5,200 MCP Servers | NHIMG
- Design MCP Authorization to Securely Expose APIs | Curity
- Secure your MCP servers: Implement OAuth 2.1 | Scalekit
- MCP Auth Official Documentation | mcp-auth.dev