Gemini Configuration | WyzQuests Documentation

Feature Owner: scorevi (Sean Patrick Caintic)
Module: System Config
Priority: P0
Sprint #12: Fully Implemented & Deployed
Date: 2026-06-29

EXECUTIVE SUMMARY

What is this feature?
Centralized Gemini AI configuration, API key management, model selection, and multi-tier token usage tracking and enforcement. Admin-only configuration with re-authentication for API key changes. Supports 5 model variants with context chain routing and 30-day SHA-256 response caching.

Why does it matter?
Every AI feature (Idea Validator, Project Wizard, Content Auto-fill, Adventure Builder, Content Safety) depends on a configured Gemini API. Without centralized key management and token enforcement, costs are uncontrollable and model selection is ad-hoc. Token tracking feeds into agency billing.

What's the MVP scope?
Fully deployed. Admin config form (GeminiConfigForm.tsx, 694 lines). API key manager with model discovery. 3-tier token enforcement (sitewide 5M, agency budget, per-member limit). Model context chains (FAST/STRUCTURED/REASONING). Gemini response cache (SHA-256, 30-day TTL). Audit logging. Server action: lib/api/admin/gemini-config.server-action.ts (1049 lines).

1. USER PAIN POINT & SOLUTION

Current State (Without Feature)

AI features are non-functional without API keys. Token usage is untracked — cost overruns are invisible until the billing statement. Agencies can't control their AI spend.

Pain Point

Emotional: "Is the AI even working? How much is this costing us?"
Functional: No API key management. No token budget tracking.
Business Impact: Unpredictable AI costs; no agency billing model for AI usage.

Future State (With Feature)

Admin sets API key once. Platform tracks every token consumed. Agencies get a monthly budget. Per-member limits prevent individual overuse. Caching reduces redundant API calls.

Marketing Hook

"One API key. Full AI power. Every token tracked to the penny."

2. 4D FRAMEWORK MAPPING

Diagnose

Token usage charts and audit logs help diagnose AI cost patterns.

Design

Model context chains (FAST/STRUCTURED/REASONING) select the right model for each AI task.

Develop

AI features use withTokenEnforcement() wrapper for budget compliance.

Deliver

Not directly applicable.

3. USER FLOWS

Entry Point

Admin navigates to /admin/settings → Gemini Configuration section.

Success Criteria

API key configured. Models discovered. Token enforcement active. Key changes require re-auth.

Main Flow (Happy Path)

Admin navigates to /admin/settings → GeminiConfigForm.tsx (694 lines) renders
Enters API key → masked display in UI
Clicks "Test Connection" → testGeminiConnection() in server action discovers available models
Selects active model from dropdown → saves via updateGeminiConfig()
Adjusts temperature slider (0.0 – 2.0, DECIMAL(3,2), DEFAULT 0.7)
Token enforcement gates all AI calls:
- Sitewide: 5M tokens/month
- Agency: monthly_token_budget
- Per-member: monthly_token_limit

Edge Cases

No data: No API key configured → api_key = NULL → AI features return config error.
API error: Gemini returns 429 on free-tier key for pro models → "Billing-enabled API key required" error.
Permission denied: Non-admin accessing config → 403.

Decision Points

IF API key changes → ReAuthModal.tsx → key confirmation → save via server action.
IF token budget exceeded → TOKEN_LIMIT_EXCEEDED → AI features blocked with guidance.
IF circuit breaker trips (3 consecutive failures) → 5-min cooldown → local fallback.

4. INFORMATION ARCHITECTURE

Primary Information (Always visible)

API key status (configured/not configured, masked display).
Active model name and version.
Temperature slider (0.0 – 2.0).

Secondary Information

Test connection status (test_connection_status, last_test_at, last_test_error).
Model pricing from gemini_pricing table.
Token usage summary.

Actions

Primary CTA: "Save Configuration", "Test Connection".
Secondary Actions: "Clear API Key", "Change API Key" (both require re-auth).

5. WIREFRAMES

Excluded — UI is fully developed (GeminiConfigForm.tsx, 694 lines).

6. WIREFLOWS

Excluded.

7. PROTOTYPE

Excluded — production deployed.

8. BACKEND SCHEMA

Database Tables

-- Core configuration
CREATE TABLE gemini_config (
  id UUID PK,
  created_by UUID NOT NULL REFERENCES app_users(id),
  updated_by UUID NOT NULL REFERENCES app_users(id),
  api_key TEXT,                                      -- Stored as TEXT (not hashed at rest), NULL = not configured
  model_name TEXT,
  temperature DECIMAL(3,2) NOT NULL DEFAULT 0.7
    CHECK (temperature >= 0 AND temperature <= 2.0),
  is_active BOOLEAN,
  test_connection_status TEXT,
  last_test_at TIMESTAMPTZ,
  last_test_error TEXT,
  created_at TIMESTAMPTZ, updated_at TIMESTAMPTZ
);
-- NOTE: NO google_cloud_project_id column exists
 
-- AI response storage and caching
CREATE TABLE gemini_responses (
  id UUID PK, user_id TEXT (Clerk), prompt TEXT, response TEXT,
  context TEXT CHECK (context IN ('fast','structured','reasoning')),
  token_metrics JSONB
);
 
CREATE TABLE gemini_cache (
  id UUID PK, hash_key TEXT UNIQUE (SHA-256), response JSONB,
  model_variant TEXT, expires_at TIMESTAMPTZ (30-day TTL)
);
 
-- Pricing and cost tracking
CREATE TABLE gemini_pricing (
  id UUID PK, model_name TEXT, input_price_per_1k DOUBLE PRECISION,
  output_price_per_1k DOUBLE PRECISION, is_active BOOLEAN
);
 
CREATE TABLE gemini_api_logs (
  id UUID PK, intern_owner TEXT, quota_usage tracking
);
 
-- Token usage events (note: column names verified)
CREATE TABLE token_usage_events (
  id UUID PK, user_id TEXT, model TEXT,
  input_tokens INTEGER,                              -- NOT tokens_in
  output_tokens INTEGER,                             -- NOT tokens_out
  total_tokens INTEGER GENERATED ALWAYS AS (input_tokens + output_tokens) STORED,
  cost DOUBLE PRECISION, created_at TIMESTAMPTZ
);

Supported Models (5 variants)

gemini-3.5-flash        (RECOMMENDED — May 2026)
gemini-3.1-pro-preview  (Pro reasoning — billing key required)
gemini-3-flash-preview  (Preview)
gemini-2.5-pro          (Pro thinking — billing key required)
gemini-2.5-flash        (Stable flash)

Model Context Chains

FAST: gemini-3.5-flash → gemini-2.5-flash → gemini-3.1-pro-preview
STRUCTURED: gemini-3.5-flash → gemini-3.1-pro-preview → gemini-2.5-pro
REASONING: gemini-3.1-pro-preview → gemini-2.5-pro → gemini-3.5-flash

9. API ENDPOINTSAdmin Routes

GET    /api/admin/gemini/api-key         (masked key status + available models)
POST   /api/admin/gemini/api-key         (action-based body: reload, clear-cache, validate — NO DELETE handler)

All routes require authenticateRole('ADMIN').

Server Actions

lib/api/admin/gemini-config.server-action.ts (1049 lines):

updateGeminiConfig() — save model, temperature, other settings
testGeminiConnection() — discover available models from Gemini API
getGeminiConfig() — fetch active configuration

NOTABLE: No /api/admin/gemini/test-connection endpoint exists. Connection testing is handled via the server action, not a standalone API route.

10. DATA REQUIREMENTS

Frontend Needs

GeminiConfigForm.tsx (694 lines): React Hook Form + Zod resolver, model dropdown via Shadcn Select, masked key display.
Re-auth via ReAuthModal.tsx for key changes.
PlatformConfigForm.tsx for platform-wide settings.

Caching Strategy

API key manager: 5-minute TTL cache for model discovery results.
Gemini response cache (gemini_cache): SHA-256 hash key, 30-day TTL.
Masked key display in UI (e.g., AIza...****).

11. PERFORMANCE CONSIDERATIONS

Database Optimization

Hash index on gemini_cache.hash_key for fast lookup.
Index on token_usage_events.user_id, created_at for billing queries.

API Response Time

Target: <5 seconds for model discovery (external Gemini API call).
Target: <200ms for config fetch (single row query).

12. SECURITY & AUTHORIZATION

Who can access this feature?

ADMIN only. Key changes require re-authentication via [0.5].

API Key Security

Stored as TEXT in gemini_config table (not hashed at rest — relies on RLS).
RLS: ADMIN only on gemini_config table.
Masked display in UI.
Drag-drop prevention on key input field.
Key clearance/replacement requires re-auth token.

Token Enforcement (3-Tier)

Sitewide: 5M tokens/month (hard limit).
Agency: agencies.monthly_token_budget — per-agency cap.
Per-member: agency_members.monthly_token_limit — individual overuse prevention.

13. ERROR HANDLING

Error	Response
No API key configured	"API key not configured. Contact your administrator."
Invalid API key	"API key validation failed. Please check your key."
Pro model with free key (429)	"This model requires a billing-enabled API key."
Token limit exceeded	"AI usage limit reached. Try again next month or contact your admin."
Circuit breaker active (3 failures)	"AI services temporarily unavailable. Please try again later." (5-min cooldown)

14. TESTING CHECKLIST

Happy Path
□ Set API key → test connection → models discovered
□ Select model → save via updateGeminiConfig() → AI features use selected model
□ Token tracking increments (input_tokens + output_tokens columns)
□ Response cache hit (SHA-256 match) → bypasses Gemini API

Edge Cases
□ Clear API key → AI features show config error
□ Free key + pro model (gemini-2.5-pro) → 429 error
□ Token budget exceeded → TOKEN_LIMIT_EXCEEDED → AI calls blocked
□ Circuit breaker (3 consecutive failures) → 5-min cooldown → local fallback
□ Cache expiry (30 days) → fresh API call on next request

15. OPEN QUESTIONS

Should we implement a scheduled cleanup job for expired gemini_cache entries and old token_usage_events?
Pro models (gemini-3.1-pro-preview, gemini-2.5-pro) require billing-enabled keys — should testGeminiConnection() detect and warn during model discovery?
API key stored as TEXT in DB — should we hash at rest in v1.1?

16. OUT OF SCOPE (v1.1+)

Multi-provider support (OpenAI, Anthropic, etc.).
Model A/B testing.
Per-creator model selection.
Automatic token table cleanup.
API key encryption at rest.

17. SUCCESS METRICS

AI features operational >99.9% uptime.
Token tracking accuracy within 1% of Gemini dashboard.
Agency token budgets prevent cost overruns.
Cache hit rate >30% (reducing redundant API calls).

18. DEPENDENCIES

This feature depends on:

[W1.4] ApiResponse + [0.5] Re-Auth for key changes.
@google/genai SDK (migrated from @google/generative-ai).
gemini_config, gemini_responses, gemini_cache, gemini_pricing, gemini_api_logs, token_usage_events tables.

These features depend on this:

[1.2] AI Idea Validator — uses Gemini for topic validation.
[1.1] Project Creation Wizard — uses Gemini for quest generation.
[5.2] Agency Admin View — token tracking feeds agency billing.
[5.1] Super Admin View — admin configures Gemini in settings.
[5.3] System Health — token usage monitoring.

19. TIMELINE & OWNERSHIP

Sprint #12: Already deployed.
Backend: scorevi
Frontend: scorevi
Estimated Completion: Complete.

Document Version

1.0 - Initial version - 2026-06-29 08:24 UTC

1.1 - Added Document Version section and update author to have full name - 2026-06-29 08:48 UTC