Cost Katana

A comprehensive toolkit for optimizing AI model costs, tracking usage, and analyzing performance across multiple providers with centralized dashboard analytics.

Features

Multi-Provider Support: OpenAI, Anthropic, Google AI, AWS Bedrock, Cohere, and more
Cost Optimization: Intelligent prompt compression, context trimming, and request fusion
Usage Tracking: Detailed analytics and cost monitoring with dashboard integration
Project Management: Organize usage by projects with budget tracking
Real-time Analytics: Monitor costs, tokens, and performance metrics via web dashboard
Suggestion Engine: AI-powered recommendations for cost reduction
🔍 Distributed Tracing: Visualize AI workflows with hierarchical traces and timelines
🚀 AI Gateway: Intelligent proxy with caching, retries, and cost optimization
💾 Smart Caching: Reduce costs with configurable response caching
🔄 Smart Retries: Automatic retry logic with exponential backoff
📊 Workflow Tracking: Group related requests for end-to-end cost analysis
🎯 Session Analysis: Track and visualize complete agent flows with cost attribution

Documentation

API Reference - Detailed API documentation
Gateway Guide - AI Gateway setup and configuration
Prompt Optimization - Techniques for optimizing prompts
Webhooks - Real-time event notifications
Examples - Code examples and use cases

Prerequisites

⚠️ Important: Registration Required

Before using this package, you must:

Register at costkatana.com
Create a project in your Cost Katana dashboard
Get your API Key and Project ID from the dashboard settings

This package automatically syncs all usage data with your Cost Katana dashboard for comprehensive analytics and cost tracking.

Installation

npm install ai-cost-tracker

Environment Setup

Create a .env file in your project root:

# Required: Get these from your costkatana.com dashboard
API_KEY=dak_your_dashboard_api_key_here
PROJECT_ID=your_project_id_here

# Your AI provider API keys
OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
# ... other provider keys as needed

Quick Start

import AICostTracker, { AIProvider } from 'ai-cost-tracker';

// Initialize with your provider configuration
const tracker = await AICostTracker.create({
  providers: [
    {
      provider: AIProvider.OpenAI,
      apiKey: process.env.OPENAI_API_KEY
    }
  ],
  optimization: {
    enablePromptOptimization: true,
    enableModelSuggestions: true,
    enableCachingSuggestions: true
  },
  tracking: {
    enableAutoTracking: true
  }
});

// Make a tracked request (automatically syncs with costkatana.com dashboard)
const response = await tracker.makeRequest({
  model: 'gpt-3.5-turbo',
  messages: [{ role: 'user', content: 'Your prompt here' }],
  maxTokens: 150
});

console.log('Response:', response.choices[0].message.content);
// Usage data is automatically sent to your costkatana.com dashboard

🔍 Sessions & Distributed Tracing

Cost Katana provides enterprise-grade distributed tracing for all your AI operations. Track every LLM call, tool execution, and API request with automatic parent-child relationships, latency metrics, and cost attribution.

Tracing Features

🌳 Hierarchical Traces: Automatic parent-child span relationships
⚡ Zero-Code Instrumentation: Automatic tracing with our middleware
💰 Cost Attribution: Per-span cost tracking with token counts
📊 Visual Timeline: Interactive trace visualization in dashboard
🔒 PII Redaction: Automatic server-side data sanitization
⏱️ Performance Metrics: Latency, duration, and throughput analysis
🎯 Error Tracking: Trace errors through your entire AI pipeline

How It Works

// 1. Add tracing middleware (Express example)
import { traceMiddleware } from 'ai-cost-tracker/trace';
app.use(traceMiddleware);

// 2. Use traced LLM wrapper
import { TrackedOpenAI } from 'ai-cost-tracker/providers';
const ai = new TrackedOpenAI({ apiKey: process.env.OPENAI_API_KEY });

// 3. All calls are automatically traced!
const response = await ai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello!' }]
});
// Creates: HTTP span → LLM span with tokens, cost, latency

// 4. View traces in dashboard
// Navigate to costkatana.com/sessions to see:
// - Hierarchical trace tree
// - Visual timeline
// - Cost breakdown per span
// - Token usage metrics

Manual Instrumentation

For custom logic, tools, or database calls:

import { traceService } from 'ai-cost-tracker/trace';

// Start a custom span
const span = await traceService.startSpan({
  name: 'database-query',
  type: 'tool',
  metadata: { query: 'SELECT * FROM users' }
});

// Your custom logic here
const result = await db.query('...');

// End the span with metrics
await traceService.endSpan(span.traceId, {
  status: 'ok',
  metadata: { rowCount: result.rows.length }
});

🚀 AI Gateway - Intelligent Proxy

The AI Gateway provides intelligent proxy functionality with caching, retries, and cost optimization. It acts as a smart layer between your application and AI providers.

Gateway Features

🌐 Universal Proxy: Works with any AI provider (OpenAI, Anthropic, Google AI, Cohere, etc.)
💾 Smart Caching: Configurable response caching with TTL and user scoping
🔄 Smart Retries: Exponential backoff with configurable parameters
📊 Workflow Tracking: Group related requests for end-to-end cost analysis
🏷️ Cost Attribution: Custom properties for detailed cost allocation
🔒 Privacy Controls: Omit sensitive data from logs
📈 Performance Monitoring: Detailed analytics and statistics

Quick Gateway Start

import { createGatewayClientFromEnv } from 'ai-cost-tracker';

// Create gateway client (uses environment variables)
const gateway = createGatewayClientFromEnv({
  baseUrl: 'https://cost-katana-backend.store/api/gateway', // Your gateway URL
  enableCache: true,
  enableRetries: true
});

// Make OpenAI request through gateway
const response = await gateway.openai({
  model: 'gpt-4o-mini',
  messages: [
    { role: 'user', content: 'What is the capital of France?' }
  ],
  max_tokens: 50
});

console.log('Response:', response.data.choices[0].message.content);
console.log('Cache Status:', response.metadata.cacheStatus); // HIT or MISS
console.log('Retry Attempts:', response.metadata.retryAttempts);

Advanced Gateway Configuration

import { createGatewayClient } from 'ai-cost-tracker';

const gateway = createGatewayClient({
  baseUrl: 'https://cost-katana-backend.store/api/gateway',
  apiKey: process.env.API_KEY!,
  enableCache: true,
  enableRetries: true,
  // Custom retry configuration
  retryConfig: {
    count: 5,        // Max retry attempts
    factor: 2,       // Exponential backoff factor
    minTimeout: 1000, // Min wait time (ms)
    maxTimeout: 15000 // Max wait time (ms)
  },
  // Custom cache configuration
  cacheConfig: {
    ttl: 3600,       // Cache TTL in seconds
    userScope: 'user_123', // User-scoped caching
    bucketMaxSize: 3 // Multiple responses for variety
  },
  // Default properties for all requests
  defaultProperties: {
    application: 'my-ai-app',
    version: '1.0.0'
  }
});

Multi-Provider Support

// OpenAI
const openaiResponse = await gateway.openai({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello!' }]
}, {
  targetUrl: 'https://api.openai.com',
  cache: true,
  retry: { count: 3 }
});

// Anthropic
const anthropicResponse = await gateway.anthropic({
  model: 'claude-3-haiku-20240307',
  max_tokens: 100,
  messages: [{ role: 'user', content: 'Hello!' }]
}, {
  targetUrl: 'https://api.anthropic.com'
});

// Google AI
const googleResponse = await gateway.googleAI('gemini-2.5-pro', {
  contents: [{
    parts: [{ text: 'Hello!' }]
  }]
}, {
  targetUrl: 'https://generativelanguage.googleapis.com'
});

// Google AI with different models
const flashResponse = await gateway.googleAI('gemini-2.5-flash', {
  contents: [{
    parts: [{ text: 'Explain quantum computing' }]
  }]
}, {
  targetUrl: 'https://generativelanguage.googleapis.com'
});

const longContextResponse = await gateway.googleAI('gemini-1.5-pro', {
  contents: [{
    parts: [{ text: 'Analyze this long document...' }]
  }]
}, {
  targetUrl: 'https://generativelanguage.googleapis.com'
});

// Mistral AI
const mistralResponse = await gateway.mistralAI('mistral-medium-2508', {
  messages: [{
    role: 'user',
    content: 'Hello!'
  }]
}, {
  targetUrl: 'https://api.mistral.ai'
});

// Mistral AI with different models
const magistralResponse = await gateway.mistralAI('magistral-medium-2507', {
  messages: [{
    role: 'user',
    content: 'Solve this complex reasoning problem step by step...'
  }]
}, {
  targetUrl: 'https://api.mistral.ai'
});

const codestralResponse = await gateway.mistralAI('codestral-2508', {
  messages: [{
    role: 'user',
    content: 'Write a Python function to...'
  }]
}, {
  targetUrl: 'https://api.mistral.ai'
});

// Cohere
const cohereResponse = await gateway.cohere({
  model: 'command',
  prompt: 'Hello!',
  max_tokens: 100
}, {
  targetUrl: 'https://api.cohere.ai'
});

Workflow Tracking

Group related requests to understand end-to-end costs:

const workflowId = `workflow-${Date.now()}`;

// Step 1: Analysis
const analysisResponse = await gateway.openai({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Analyze this data...' }]
}, {
  workflow: {
    workflowId,
    workflowName: 'DataAnalysis',
    workflowStep: '/analyze'
  },
  properties: {
    step: 'data-analysis',
    priority: 'high'
  }
});

// Step 2: Summary
const summaryResponse = await gateway.openai({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Summarize the analysis...' }]
}, {
  workflow: {
    workflowId,
    workflowName: 'DataAnalysis',
    workflowStep: '/analyze/summarize'
  }
});

// Get workflow details
const workflowDetails = await gateway.getWorkflowDetails(workflowId);
console.log('Total Workflow Cost:', workflowDetails.totalCost);
console.log('Total Requests:', workflowDetails.requests.length);

Smart Caching

Reduce costs with intelligent caching:

// Basic caching
const response1 = await gateway.openai({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'What is 2+2?' }]
}, {
  cache: true // Uses default cache settings
});

// Custom cache settings
const response2 = await gateway.openai({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Generate a creative story' }]
}, {
  cache: {
    ttl: 7200,        // 2 hours
    userScope: 'user_456', // User-specific cache
    bucketMaxSize: 5  // Store 5 different responses
  }
});

// Cache management
const cacheStats = await gateway.getCacheStats();
console.log('Cache Hit Rate:', cacheStats.singleResponseCache.size);

// Clear cache
await gateway.clearCache({ expired: true }); // Clear expired only
await gateway.clearCache({ userScope: 'user_123' }); // Clear user cache

Smart Retries

Automatic retry with exponential backoff:

const response = await gateway.openai({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello!' }]
}, {
  retry: {
    count: 5,         // Max 5 retry attempts
    factor: 2.5,      // Aggressive backoff
    minTimeout: 2000, // Start with 2 seconds
    maxTimeout: 30000 // Cap at 30 seconds
  }
});

console.log('Retry Attempts:', response.metadata.retryAttempts);

Integrated Tracker + Gateway

Combine full tracking with gateway functionality:

import AICostTracker, { AIProvider } from 'ai-cost-tracker';

// Create tracker
const tracker = await AICostTracker.create({
  providers: [{ provider: AIProvider.OpenAI }],
  optimization: {
    enablePromptOptimization: true,
    enableModelSuggestions: true,
    enableCachingSuggestions: true,
    thresholds: {
      highCostPerRequest: 0.01,
      highTokenUsage: 1000,
      frequencyThreshold: 10
    }
  },
  tracking: {
    enableAutoTracking: true
  }
});

// Initialize gateway
const gateway = tracker.initializeGateway({
  baseUrl: 'https://cost-katana-backend.store/api/gateway',
  enableCache: true,
  enableRetries: true
});

// Make requests with automatic usage tracking
const response = await tracker.gatewayOpenAI({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Optimize my AI costs' }]
}, {
  properties: {
    feature: 'cost-optimization',
    user: 'john-doe'
  },
  workflow: {
    workflowId: 'cost-session-123',
    workflowName: 'CostOptimization'
  }
});

// Usage automatically tracked with gateway metadata
const analytics = await tracker.getAnalytics();
console.log('Total Cost:', analytics.totalCost);

Gateway Statistics

Monitor performance and usage:

// Get gateway statistics
const stats = await gateway.getStats();
console.log('Performance Metrics:', {
  totalRequests: stats.totalRequests,
  cacheHitRate: `${stats.cacheHitRate}%`,
  averageResponseTime: `${stats.averageResponseTime}ms`,
  successRate: `${stats.successRate}%`
});

// Provider-specific stats
Object.entries(stats.providerStats).forEach(([provider, stats]) => {
  console.log(`${provider} Stats:`, {
    requests: stats.requests,
    successRate: `${stats.successRate}%`,
    avgResponseTime: `${stats.averageResponseTime}ms`
  });
});

// Health check
const health = await gateway.healthCheck();
console.log('Gateway Status:', health.status);

🔐 Proxy Key Authentication

The CostKATANA Key Vault provides secure, controlled access to your AI provider keys through proxy keys. Instead of sharing your master API keys directly, you can create proxy keys with specific permissions, budgets, and restrictions.

What are Proxy Keys?

Proxy keys (ck-proxy-*) are secure access tokens that:

Resolve to Master Keys: Automatically use your stored provider API keys
Enforce Budgets: Set spending limits (total, daily, monthly)
Control Permissions: Limit access to read, write, or admin operations
Track Usage: Monitor requests and costs per proxy key
Restrict Access: Whitelist IP addresses and domains
Rate Limiting: Control request frequency

Quick Start with Proxy Keys

import { createGatewayClient } from 'ai-cost-tracker';

// Create client with proxy key
const gateway = createGatewayClient({
  baseUrl: 'https://cost-katana-backend.store/api/gateway',
  apiKey: 'ck-proxy-your-proxy-key-id', // Your proxy key
  enableCache: true,
  enableRetries: true
});

// Check if using proxy key
if (gateway.isUsingProxyKey()) {
  console.log('✅ Using secure proxy key authentication');

  // Get proxy key information
  const info = await gateway.getProxyKeyInfo();
  console.log('Proxy Key:', info?.name);
  console.log('Provider:', info?.provider);
  console.log('Budget Limit:', info?.budgetLimit);
}

// Make requests (proxy key automatically resolves to provider key)
const response = await gateway.openai({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello from proxy key!' }]
}, {
  targetUrl: 'https://api.openai.com'
});

Budget Monitoring

Monitor spending and enforce limits:

// Check budget status before making requests
const budgetStatus = await gateway.checkProxyKeyBudget();

if (budgetStatus) {
  console.log('Budget Status:', budgetStatus.budgetStatus); // 'good' | 'warning' | 'over'
  console.log('Message:', budgetStatus.message);

  if (!budgetStatus.withinBudget) {
    console.log('⚠️ Proxy key is over budget!');
    return;
  }
}

// Get current usage statistics
const usage = await gateway.getProxyKeyUsage();
if (usage) {
  console.log('Usage Stats:', {
    totalRequests: usage.totalRequests,
    totalCost: usage.totalCost,
    dailyCost: usage.dailyCost,
    monthlyCost: usage.monthlyCost
  });
}

Permission Validation

Validate permissions before operations:

// Check if proxy key has required permissions
const canRead = await gateway.validateProxyKeyPermissions('read');
const canWrite = await gateway.validateProxyKeyPermissions('write');
const canAdmin = await gateway.validateProxyKeyPermissions('admin');

console.log('Permissions:', { read: canRead, write: canWrite, admin: canAdmin });

if (!canWrite) {
  console.log('⚠️ This proxy key has read-only access');
}

Advanced Usage Tracking

Track usage with detailed attribution:

import { ProxyKeyUsageOptions } from 'ai-cost-tracker';

const usageOptions: ProxyKeyUsageOptions = {
  projectId: 'my-project-123',
  properties: {
    environment: 'production',
    feature: 'chat-bot',
    userId: 'user-456'
  },
  modelOverride: 'gpt-4o-mini',
  omitRequest: false,
  omitResponse: false
};

// Make request with detailed tracking
const response = await gateway.makeProxyKeyRequest(
  'https://api.openai.com',
  {
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: 'Tracked request' }],
    max_tokens: 100
  },
  usageOptions
);

Integrated Tracker + Proxy Key

Use proxy keys with the main tracker:

import AICostTracker from 'ai-cost-tracker';

// Initialize tracker
const tracker = new AICostTracker({
  apiKey: 'your-costkatana-api-key',
  tracking: { enableAutoTracking: true }
});

// Initialize gateway with proxy key
const gateway = tracker.initializeGateway({
  baseUrl: 'https://cost-katana-backend.store/api/gateway',
  apiKey: 'ck-proxy-your-proxy-key-id',
  enableCache: true
});

// Check if using proxy key
if (tracker.isUsingProxyKey()) {
  const proxyInfo = await tracker.getProxyKeyInfo();
  const budgetStatus = await tracker.checkProxyKeyBudget();

  console.log('Proxy Key:', proxyInfo?.name);
  console.log('Budget:', budgetStatus?.message);
}

// Make requests (automatically uses proxy key)
const response = await tracker.gatewayOpenAI({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Integrated request' }]
}, {
  targetUrl: 'https://api.openai.com',
  properties: { source: 'integrated-tracker' }
});

Environment Configuration

Configure proxy keys via environment variables:

# Set environment variables
export COSTKATANA_API_KEY="ck-proxy-your-proxy-key-id"
export COSTKATANA_GATEWAY_URL="https://cost-katana-backend.store/api/gateway"

import { createGatewayClientFromEnv } from 'ai-cost-tracker';

// Create client from environment
const gateway = createGatewayClientFromEnv({
  enableCache: true,
  enableRetries: true,
  keyVault: {
    enabled: true,
    autoDetectProxyKey: true,
    fallbackToEnv: false
  }
});

console.log('Using proxy key:', gateway.isUsingProxyKey());

Security Benefits

Proxy keys provide several security advantages:

Key Isolation: Master API keys never leave the secure vault
Granular Control: Set specific permissions and restrictions
Budget Enforcement: Automatic spending limits prevent overuse
Access Tracking: Monitor who uses what and when
Instant Revocation: Deactivate proxy keys immediately
IP/Domain Restrictions: Limit access to specific locations
Rate Limiting: Prevent abuse and control request frequency

Project Integration

Why Use Projects?

Projects help you:

Organize Usage: Group AI requests by project, team, or client
Track Budgets: Set spending limits and monitor utilization
Generate Reports: Get detailed analytics per project
Control Access: Manage team member permissions
Allocate Costs: Understand spending across different initiatives

Setting Up Projects

1. Basic Project Configuration

import { AICostOptimizer } from 'ai-cost-tracker';

const optimizer = new AICostOptimizer({
  apiKey: 'your-api-key',
  provider: 'openai',
  trackUsage: true,
  // Optional: Set default project for all requests
  defaultProjectId: 'project-123'
});

2. Per-Request Project Assignment

// Associate individual requests with projects
const result = await optimizer.optimize({
  prompt: 'Analyze this customer feedback',
  model: 'gpt-4',
  projectId: 'customer-analysis-2024',
  metadata: {
    department: 'marketing',
    team: 'customer-success',
    client: 'acme-corp'
  }
});

3. Project-Based Usage Tracking

// Track usage with detailed project information
const usageData = await optimizer.trackUsage({
  prompt: 'Generate product description',
  completion: 'AI-generated description...',
  model: 'gpt-3.5-turbo',
  cost: 0.002,
  tokens: 150,
  projectId: 'ecommerce-automation',
  costAllocation: {
    department: 'product',
    team: 'content',
    purpose: 'product-descriptions',
    client: 'internal'
  }
});

Project Configuration Options

Project Metadata Structure

interface ProjectConfig {
  projectId: string;
  name?: string;
  description?: string;
  budget?: {
    amount: number;
    currency: string;
    period: 'monthly' | 'quarterly' | 'yearly';
  };
  costAllocation?: {
    department?: string;
    team?: string;
    purpose?: string;
    client?: string;
    [key: string]: any;
  };
  tags?: string[];
}

Example: E-commerce Project Setup

const ecommerceOptimizer = new AICostOptimizer({
  apiKey: 'your-api-key',
  provider: 'openai',
  trackUsage: true,
  defaultProjectId: 'ecommerce-platform',
  projectConfig: {
    projectId: 'ecommerce-platform',
    name: 'E-commerce AI Features',
    description: 'AI-powered product recommendations and descriptions',
    budget: {
      amount: 1000,
      currency: 'USD',
      period: 'monthly'
    },
    costAllocation: {
      department: 'product',
      team: 'ai-ml',
      purpose: 'customer-experience'
    },
    tags: ['production', 'customer-facing', 'revenue-generating']
  }
});

// Product description generation
const productDesc = await ecommerceOptimizer.optimize({
  prompt: 'Generate SEO-optimized description for wireless headphones',
  model: 'gpt-4',
  costAllocation: {
    purpose: 'product-descriptions',
    client: 'internal'
  }
});

// Customer support automation
const supportResponse = await ecommerceOptimizer.optimize({
  prompt: 'Help customer with return policy question',
  model: 'gpt-3.5-turbo',
  projectId: 'customer-support', // Override default project
  costAllocation: {
    purpose: 'customer-support',
    client: 'end-customer'
  }
});

Advanced Project Features

1. Multi-Project Analytics

// Get analytics for specific project
const projectAnalytics = await optimizer.getProjectAnalytics('project-123', {
  startDate: '2024-01-01',
  endDate: '2024-01-31',
  groupBy: 'day'
});

// Compare multiple projects
const comparison = await optimizer.compareProjects(['project-123', 'project-456', 'project-789'], {
  metric: 'cost',
  period: 'last-30-days'
});

2. Budget Monitoring

// Set up budget alerts
const optimizer = new AICostOptimizer({
  apiKey: 'your-api-key',
  provider: 'openai',
  trackUsage: true,
  budgetAlerts: {
    thresholds: [50, 75, 90], // Alert at 50%, 75%, 90% of budget
    webhookUrl: 'https://your-app.com/budget-alerts',
    email: 'admin@yourcompany.com'
  }
});

// Check budget status
const budgetStatus = await optimizer.getBudgetStatus('project-123');
console.log(`Budget utilization: ${budgetStatus.utilizationPercentage}%`);

3. Team Collaboration

// Set up team-based project access
const optimizer = new AICostOptimizer({
  apiKey: 'your-api-key',
  provider: 'openai',
  trackUsage: true,
  teamConfig: {
    userId: 'user-123',
    teamId: 'ai-team',
    role: 'developer', // developer, admin, viewer
    accessibleProjects: ['project-123', 'project-456']
  }
});

Integration with Cost Katana Dashboard

The npm package automatically syncs with the Cost Katana dashboard at costkatana.com:

// All tracking happens automatically when you use the API
const tracker = await AICostTracker.create({
  providers: [
    {
      provider: AIProvider.OpenAI,
      apiKey: process.env.OPENAI_API_KEY
    }
  ],
  tracking: {
    enableAutoTracking: true // Data syncs automatically to costkatana.com
  }
});

// Every request is automatically tracked and sent to your dashboard
const response = await tracker.makeRequest({
  model: 'gpt-3.5-turbo',
  messages: [{ role: 'user', content: 'Hello world' }]
});

Environment Variables

Required environment variables from your Cost Katana dashboard:

# Get these from costkatana.com dashboard settings
API_KEY=dak_your_dashboard_api_key_here
PROJECT_ID=your_project_id_here

# Your AI Provider API Keys
OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
# ... other provider keys as needed

Best Practices for Project Integration

1. Consistent Project Naming

// Use consistent naming conventions
const PROJECT_IDS = {
  CUSTOMER_SUPPORT: 'customer-support-2024',
  PRODUCT_DESCRIPTIONS: 'product-desc-automation',
  MARKETING_CONTENT: 'marketing-content-gen',
  DATA_ANALYSIS: 'data-analysis-reports'
};

const result = await optimizer.optimize({
  prompt: 'Analyze sales data',
  model: 'gpt-4',
  projectId: PROJECT_IDS.DATA_ANALYSIS
});

2. Hierarchical Project Structure

// Use hierarchical project IDs for better organization
const projectId = `${department}.${team}.${initiative}.${year}`;
// Example: 'marketing.content.blog-automation.2024'

const result = await optimizer.optimize({
  prompt: 'Write blog post about AI trends',
  model: 'gpt-4',
  projectId: 'marketing.content.blog-automation.2024',
  costAllocation: {
    department: 'marketing',
    team: 'content',
    purpose: 'blog-content',
    client: 'internal'
  }
});

3. Environment-Based Configuration

// Different configurations for different environments
const config = {
  development: {
    projectId: 'dev-experiments',
    budget: { amount: 100, currency: 'USD', period: 'monthly' }
  },
  staging: {
    projectId: 'staging-tests',
    budget: { amount: 500, currency: 'USD', period: 'monthly' }
  },
  production: {
    projectId: 'prod-customer-features',
    budget: { amount: 5000, currency: 'USD', period: 'monthly' }
  }
};

const optimizer = new AICostOptimizer({
  apiKey: process.env.OPENAI_API_KEY,
  provider: 'openai',
  trackUsage: true,
  dashboardApiKey: process.env.API_KEY, // Cost Katana Dashboard API Key
  ...config[process.env.NODE_ENV || 'development']
});

API Reference

Core Methods

optimize(options) - Optimize and execute AI requests
trackUsage(data) - Track usage data with project information
getProjectAnalytics(projectId, filters) - Get project-specific analytics
compareProjects(projectIds, options) - Compare multiple projects
getBudgetStatus(projectId) - Check budget utilization

Configuration Options

projectId - Associate requests with specific projects
costAllocation - Detailed cost allocation metadata
budgetAlerts - Set up budget monitoring and alerts
teamConfig - Configure team access and permissions
dashboardConfig - Sync with external dashboard

🛡️ Prompt Firewall & Cost Shield

Protect your AI applications from malicious prompts and save costs with the built-in security firewall.

Quick Start

import { createGatewayClient } from '@ai-cost-optimizer/core';

const gateway = createGatewayClient({
  baseUrl: 'https://cost-katana-backend.store/api/gateway',
  apiKey: 'your-costkatana-api-key',
  firewall: {
    enabled: true,        // Enable basic firewall (Prompt Guard)
    advanced: true,       // Enable advanced firewall (Llama Guard)
    promptThreshold: 0.5, // Prompt injection threshold (0.0-1.0)
    llamaThreshold: 0.8   // Content safety threshold (0.0-1.0)
  }
});

// Make a request - firewall automatically protects against threats
try {
  const response = await gateway.openai({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: 'Your prompt here' }]
  });
  console.log('✅ Safe request processed:', response.data);
} catch (error) {
  if (error.response?.data?.error?.code === 'PROMPT_BLOCKED_BY_FIREWALL') {
    console.log('🛡️ Malicious prompt blocked - Cost saved!');
    console.log('Threat:', error.response.data.threat);
  }
}

Two-Stage Protection

Stage 1: Prompt Guard (Fast Detection)

Detects prompt injection and jailbreak attempts
Uses Meta's Prompt Guard model via AWS Bedrock
Lightning-fast response for common attack patterns
Configurable confidence thresholds

Stage 2: Llama Guard (Deep Analysis)

Analyzes content against 14 safety categories
Uses Meta's Llama Guard model via AWS Bedrock
Comprehensive threat detection including:
- Violence and Hate
- Sexual Content
- Criminal Planning
- Data Exfiltration
- Phishing Attempts
- And 9 more categories

Per-Request Configuration

// Override firewall settings for specific requests
const response = await gateway.openai({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Sensitive content here' }]
}, {
  firewall: {
    enabled: true,
    advanced: true,
    promptThreshold: 0.3,  // More sensitive
    llamaThreshold: 0.9    // Less sensitive
  }
});

Analytics & Cost Savings

// Get firewall analytics
const analytics = await gateway.getFirewallAnalytics();

console.log('📊 Firewall Performance:');
console.log(`Requests processed: ${analytics.totalRequests}`);
console.log(`Threats blocked: ${analytics.blockedRequests}`);
console.log(`Cost saved: $${analytics.costSaved.toFixed(4)}`);
console.log('Threat categories:', analytics.threatsByCategory);

Integrated Tracker + Firewall

import { AICostTracker } from '@ai-cost-optimizer/core';

const tracker = new AICostTracker({
  apiKey: 'your-api-key',
  projectId: 'secure-project'
});

// Initialize gateway with firewall
tracker.initializeGateway({
  baseUrl: 'https://cost-katana-backend.store/api/gateway',
  apiKey: 'your-costkatana-api-key',
  firewall: { enabled: true, advanced: true }
});

// Use firewall-protected methods with automatic tracking
const response = await tracker.gatewayOpenAIWithFirewall({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello world' }]
}, {
  enabled: true,
  advanced: true
}, {
  properties: { feature: 'chat', user: 'john' }
});

Security Benefits

Cost Protection: Block expensive API calls from malicious prompts
Data Security: Prevent data exfiltration and prompt injection attacks
Compliance: Meet security requirements with automated threat detection
Zero Downtime: Fail-open design ensures service availability
Multi-Provider: Works with OpenAI, Anthropic, Google AI, Cohere, and more

📊 User Feedback & Value Tracking

Move beyond cost tracking to measure Return on AI Spend by connecting every dollar spent to actual user value.

Quick Start

import AICostTracker from 'ai-cost-tracker';

const tracker = new AICostTracker({
  apiKey: 'your-costkatana-api-key'
});

// 1. Make a request and capture the request ID
const requestId = crypto.randomUUID();
const response = await tracker.gatewayOpenAI({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'How do I return an item?' }]
}, { requestId });

// 2. Submit user feedback
await tracker.submitFeedback(requestId, {
  rating: true, // true = positive, false = negative
  comment: 'Very helpful response!',
  implicitSignals: {
    copied: true,
    conversationContinued: false,
    sessionDuration: 45000
  }
});

// 3. Analyze Return on AI Spend
const analytics = await tracker.getEnhancedFeedbackAnalytics();
console.log(`Wasted Spend: ${analytics.insights.wastedSpendPercentage}%`);
console.log(`ROI Score: ${analytics.insights.returnOnAISpend * 100}%`);

Key Features

Explicit + Implicit Feedback

Explicit: User clicks thumbs up/down (rare but accurate)
Implicit: Track copy behavior, session duration, conversation flow (scale)

Cost-Value Correlation

See exactly how much you spend on positive vs negative responses
Calculate cost per positive rating
Identify wasted spending on unhelpful responses

Actionable Insights

const analytics = await tracker.getEnhancedFeedbackAnalytics();

// Get automatic recommendations
analytics.insights.recommendations.forEach(rec => {
  console.log(rec);
  // "You're spending 30% of budget on negatively-rated responses"
  // "Model X has low satisfaction - consider switching"
  // "Feature Y needs optimization - poor user feedback"
});

Business Intelligence

Answer critical questions:

"What percentage of our AI spend is wasted on unhelpful responses?"
"Which features have the highest ROI?"
"Should we switch AI models based on user satisfaction?"
"How much money could we save by improving prompts?"

Advanced Usage

// Track behavioral signals automatically
await tracker.updateImplicitSignals(requestId, {
  copied: true,              // User copied the response
  conversationContinued: false, // No follow-up questions
  immediateRephrase: false,  // Didn't rephrase query
  sessionDuration: 45000,    // 45 seconds engaged
  codeAccepted: true         // Accepted code suggestion
});

// Get feature-specific ROI analysis
const analytics = await tracker.getFeedbackAnalytics();
Object.entries(analytics.ratingsByFeature).forEach(([feature, stats]) => {
  const satisfaction = stats.positive / (stats.positive + stats.negative);
  console.log(`${feature}: ${satisfaction * 100}% satisfaction, $${stats.cost} spent`);
});

Examples

See the examples directory for complete implementation examples:

Support

For questions and support:

License

MIT License - see LICENSE file for details.

Observability & OpenTelemetry

The core library emits OpenTelemetry spans/metrics for AI requests, tools, and HTTP operations. It integrates with the backend’s OTel setup and any OTLP-compatible vendor.

Emitted Telemetry

LLM spans with token counts and per-span cost attribution
HTTP spans for gateway/proxy calls
Custom spans (tools, DB) via helper APIs
Metrics: request rate, error rate, duration percentiles; GenAI cost/tokens

Environment

# OTLP exporters (example)
OTLP_HTTP_TRACES_URL=https://tempo-prod-us-central1.grafana.net/tempo/api/push
OTLP_HTTP_METRICS_URL=https://prometheus-prod-us-central1.grafana.net/api/prom/push
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer <YOUR_TOKEN>

# Privacy (optional)
CK_CAPTURE_MODEL_TEXT=false

Frontend Dashboard Coverage

The telemetry dashboard surfaces these from your data:

KPIs (RPM, Error %, Avg & P95 latency)
Cost Analytics by model
Recent Errors and Top Errors
Top Operations
Telemetry Explorer (filters + pagination)
Trace Viewer (hierarchical)
Service Dependency Graph

See backend OBSERVABILITY.md for vendor-specific setup and local collector instructions.

包详细信息

自述文件

Cost Katana

Features

Documentation

Prerequisites

Installation

Environment Setup

Quick Start

🔍 Sessions & Distributed Tracing

Tracing Features

How It Works

Manual Instrumentation

🚀 AI Gateway - Intelligent Proxy

Gateway Features

Quick Gateway Start

Advanced Gateway Configuration

Multi-Provider Support

Workflow Tracking

Smart Caching

Smart Retries

Integrated Tracker + Gateway

Gateway Statistics

🔐 Proxy Key Authentication

What are Proxy Keys?

Quick Start with Proxy Keys

Budget Monitoring

Permission Validation

Advanced Usage Tracking

Integrated Tracker + Proxy Key

Environment Configuration

Security Benefits

Project Integration

Why Use Projects?

Setting Up Projects

1. Basic Project Configuration

2. Per-Request Project Assignment

3. Project-Based Usage Tracking

Project Configuration Options

Project Metadata Structure

Example: E-commerce Project Setup

Advanced Project Features

1. Multi-Project Analytics

2. Budget Monitoring

3. Team Collaboration

Integration with Cost Katana Dashboard

Environment Variables

Best Practices for Project Integration

1. Consistent Project Naming

2. Hierarchical Project Structure

3. Environment-Based Configuration

API Reference

Core Methods

Configuration Options

🛡️ Prompt Firewall & Cost Shield

Quick Start

Two-Stage Protection

Per-Request Configuration

Analytics & Cost Savings

Integrated Tracker + Firewall

Security Benefits

📊 User Feedback & Value Tracking

Quick Start

Key Features

Explicit + Implicit Feedback

Cost-Value Correlation

Actionable Insights

Business Intelligence

Advanced Usage

Examples

Support

License

Observability & OpenTelemetry

Emitted Telemetry

Environment

Frontend Dashboard Coverage

更新日志

Changelog

[1.0.0] - 2024-01-01

Added