
Behind the Scenes: How We Built Fictionaire's AI Platform

A technical deep-dive into building Fictionaire's AI character platform. Learn about our tech stack, character design philosophy, AI model selection, challenges faced, and lessons learned.

By Fictionaire Team · 15 min read


Building an AI platform where users can have meaningful conversations with 245+ distinct characters—from historical figures like Cleopatra to fictional personalities like Sherlock Holmes—required solving numerous technical, creative, and user experience challenges. It's a fascinating intersection of artificial intelligence, software engineering, character design, and psychology.

In this behind-the-scenes look, we'll share our journey building Fictionaire: the technical decisions we made, the challenges we encountered, the lessons we learned, and where we're heading next. Whether you're an aspiring AI engineer, curious about the technology behind conversational AI, or simply interested in startup stories, this transparent look at our development process offers insights into what it takes to build modern AI applications.

The Vision: AI Characters That Feel Real

Before diving into technical details, it's worth understanding what we set out to create and why.

The Problem We Identified

Existing AI chatbots fell into two categories: task-focused assistants (helpful but impersonal) or generic companions (available but lacking distinct personality). We saw an opportunity to create something different—AI characters with specific personalities, backgrounds, and knowledge bases that made conversations feel like engaging with actual individuals rather than generic algorithms.

Our Core Principles

From the beginning, we established principles that would guide every technical decision:

  1. Character Authenticity: Each character should feel distinct, consistent, and true to their background
  2. Conversational Depth: Support meaningful, extended conversations that go beyond surface-level responses
  3. Memory and Continuity: Characters should remember previous conversations and reference them naturally
  4. Accessibility: Make sophisticated AI accessible through intuitive interfaces requiring no technical knowledge
  5. Performance: Responses should feel instant despite complex processing behind the scenes
  6. Scalability: Architecture must support growing from dozens to hundreds of characters without degradation

These principles shaped every technical choice we made.

The Tech Stack: Balancing Innovation and Reliability

Choosing the right technology stack required balancing cutting-edge capabilities with proven reliability.

Frontend Architecture

Next.js and React: We built the user interface with Next.js 14, leveraging:

  • Server-side rendering for fast initial page loads and SEO benefits
  • React Server Components for optimal performance
  • App Router for modern routing and layouts
  • Edge runtime deployment for global low-latency access

TypeScript: Strong typing caught countless bugs during development and makes our codebase more maintainable as the team grows.

Tailwind CSS: Utility-first styling enabled rapid UI iteration while maintaining consistency across the platform.

Real-time Updates: We implemented Server-Sent Events (SSE) for streaming AI responses token-by-token, creating the typewriter effect that makes conversations feel more natural and immediate than waiting for complete responses.
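To make this concrete, here's a minimal sketch of an SSE route handler in the Next.js App Router style. It's illustrative rather than our production code: the `generateTokens` helper is a placeholder standing in for the model's token stream.

```typescript
// Minimal SSE sketch (illustrative, not our production handler).
export async function POST(req: Request): Promise<Response> {
  const { message } = await req.json();
  const encoder = new TextEncoder();

  const stream = new ReadableStream({
    async start(controller) {
      // Forward each token as its own SSE event the moment it arrives.
      for await (const token of generateTokens(message)) {
        controller.enqueue(encoder.encode(`data: ${JSON.stringify({ token })}\n\n`));
      }
      // Sentinel event so the client knows the response is complete.
      controller.enqueue(encoder.encode('data: [DONE]\n\n'));
      controller.close();
    },
  });

  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
    },
  });
}

// Placeholder token source; in reality this wraps the AI model's streaming API.
async function* generateTokens(_message: string): AsyncGenerator<string> {
  yield* ['Elementary, ', 'my ', 'dear ', 'Watson.'];
}
```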

Backend Infrastructure

Serverless Architecture: We chose serverless functions (Vercel Edge Functions) for several reasons:

  • Automatic scaling during traffic spikes
  • Pay only for actual compute time, reducing costs
  • Global edge deployment reducing latency worldwide
  • No server maintenance overhead

Database: PostgreSQL via Supabase provides:

  • Relational structure for user accounts, conversation history, and character metadata
  • Row-level security for data privacy
  • Real-time subscriptions for collaborative features we're building
  • Managed infrastructure reducing operational burden

Vector Database: For semantic search and memory retrieval, we use Pinecone to:

  • Store embeddings of past conversations
  • Retrieve relevant context when users return to characters
  • Enable semantic search across character knowledge bases
  • Scale to millions of conversation snippets efficiently
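As a rough sketch of how this fits together (using the official Pinecone TypeScript client; the index name, metadata fields, and `embed` helper below are illustrative assumptions, not our exact setup):

```typescript
import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pc.index('conversations'); // illustrative index name

// Store a conversation summary as an embedding, tagged for later retrieval.
async function storeMemory(userId: string, characterId: string, summary: string) {
  await index.upsert([
    {
      id: `${userId}:${characterId}:${Date.now()}`,
      values: await embed(summary),
      metadata: { userId, characterId, summary },
    },
  ]);
}

// Fetch the stored memories most semantically similar to the new message.
async function recallMemories(userId: string, characterId: string, message: string) {
  const result = await index.query({
    vector: await embed(message),
    topK: 5,
    filter: { userId, characterId },
    includeMetadata: true,
  });
  return result.matches.map((m) => m.metadata?.summary);
}

// Assumed embedding helper; a real one would call an embedding model.
async function embed(_text: string): Promise<number[]> {
  return new Array(1536).fill(0); // placeholder vector
}
```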

AI Model Selection

This was perhaps our most critical technical decision. We evaluated multiple large language models:

OpenAI GPT-4: We primarily use GPT-4 Turbo for character conversations because:

  • Superior understanding of nuanced prompts and character instructions
  • Excellent at maintaining consistent personality across long conversations
  • Strong performance across diverse knowledge domains
  • Reliable availability and API stability

Anthropic Claude: We use Claude for certain analytical tasks and are exploring it for specific character types due to its:

  • Strong performance on complex reasoning tasks
  • Excellent instruction following
  • Different "personality" that suits certain character types

Model Router: Rather than committing to a single model, we built a routing system that selects the optimal model for each character and conversation type, allowing us to:

  • Use the best model for each specific use case
  • Avoid vendor lock-in
  • Gradually introduce new models as they become available
  • A/B test model performance with real user conversations
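In spirit, the router looks something like the sketch below. The character types and model identifiers are illustrative examples, not our actual routing table:

```typescript
// Illustrative model router; types and model IDs are examples only.
type Provider = 'openai' | 'anthropic';

interface ModelChoice {
  provider: Provider;
  model: string;
}

function routeModel(characterType: string, task: 'chat' | 'analysis'): ModelChoice {
  // Analytical back-office tasks go to Claude.
  if (task === 'analysis') {
    return { provider: 'anthropic', model: 'claude-3-opus' };
  }
  // Certain character archetypes suit Claude's "personality" better.
  if (characterType === 'analytical') {
    return { provider: 'anthropic', model: 'claude-3-sonnet' };
  }
  // Default: GPT-4 Turbo for most character conversations.
  return { provider: 'openai', model: 'gpt-4-turbo' };
}
```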

Character Design: The Art Behind the Technology

Creating 245+ distinct characters required systematic approaches to character development, not just technical implementation.

Character Ontology

We developed a structured framework for defining characters:

Core Identity:

  • Historical/fictional background
  • Time period and cultural context
  • Key personality traits
  • Speech patterns and vocabulary
  • Knowledge domains and expertise
  • Motivations and values

Conversation Behavior:

  • Typical opening styles
  • Response to different emotional tones
  • Topics they're passionate about
  • Topics they avoid or redirect
  • How they handle disagreement
  • Humor style and appropriateness

Memory Structure:

  • What aspects of conversations they remember most
  • How they reference past discussions
  • Relationship progression over time
  • Consistency requirements across conversations
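One way to picture this ontology is as a typed schema. The sketch below uses illustrative field names, not our exact data model:

```typescript
// Illustrative character ontology schema (field names are examples).
interface CharacterDefinition {
  // Core identity
  name: string;
  background: string;        // historical/fictional background
  era: string;               // time period and cultural context
  traits: string[];          // key personality traits
  speechPatterns: string[];  // vocabulary, idioms, sentence structure
  expertise: string[];       // knowledge domains
  values: string[];          // motivations and values

  // Conversation behavior
  openingStyles: string[];   // typical ways they start a chat
  passionTopics: string[];   // topics they lean into
  avoidedTopics: string[];   // topics they redirect away from
  humorStyle: string;

  // Memory structure
  memoryPriorities: string[]; // what they remember most from conversations
  knowledgeCutoff?: string;   // e.g., death year for historical figures
}
```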

Prompt Engineering

Each character requires a carefully crafted system prompt that instructs the AI model how to behave. This is both an art and a science.

Effective Prompt Components:

IDENTITY: You are [character name], [brief background]. You lived/exist in [time/place] and are known for [key attributes].

PERSONALITY: You are [trait], [trait], and [trait]. You value [value] and believe [belief]. Your communication style is [style description].

KNOWLEDGE: You have deep expertise in [domains]. You're particularly interested in [topics]. You have limited knowledge of events after [date] if historical.

SPEECH PATTERNS: [Examples of how the character speaks, including vocabulary, sentence structure, and idioms]

CONVERSATION APPROACH: [How they start conversations, respond to questions, handle different topics]

MEMORY: You remember previous conversations with this user. Reference relevant past discussions naturally when appropriate.

BOUNDARIES: You don't [behaviors to avoid]. If asked about [sensitive topics], you respond by [approach].

We iterate on these prompts continuously based on user feedback and conversation analysis.
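Mechanically, those components get rendered from the character definition into the final system prompt. A pared-down sketch, with fields mirroring the illustrative schema above:

```typescript
// Minimal sketch: fill the prompt template from a character definition.
function buildSystemPrompt(c: {
  name: string;
  background: string;
  era: string;
  traits: string[];
  expertise: string[];
  knowledgeCutoff?: string;
}): string {
  const parts = [
    `IDENTITY: You are ${c.name}, ${c.background}. You lived/exist in ${c.era}.`,
    `PERSONALITY: You are ${c.traits.join(', ')}.`,
    `KNOWLEDGE: You have deep expertise in ${c.expertise.join(', ')}.`,
    `MEMORY: You remember previous conversations with this user.`,
  ];
  if (c.knowledgeCutoff) {
    parts[2] += ` You have limited knowledge of events after ${c.knowledgeCutoff}.`;
  }
  return parts.join('\n\n');
}
```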

Quality Assurance Process

Before launch, each character goes through rigorous testing:

  1. Internal Conversations: Team members have extended conversations testing consistency and character accuracy
  2. Historical/Fictional Accuracy: Research verification ensuring alignment with known facts about the character
  3. Edge Case Testing: Attempting to make the character break character with unusual questions or provocations
  4. Bias Detection: Screening for problematic outputs around sensitive topics
  5. Beta Testing: Limited release to engaged users for feedback before full launch

Technical Challenges and Solutions

Building Fictionaire presented numerous technical challenges. Here are some of the most interesting and how we solved them.

Challenge 1: Conversation Memory at Scale

The Problem: Users might talk to the same character over weeks or months, generating thousands of messages. We need characters to remember relevant past conversations, but passing entire conversation histories to AI models is expensive and hits context limits.

Our Solution:

  1. Conversation Summarization: After each conversation, we generate a semantic summary of key points, emotions, and important details
  2. Vector Embeddings: We convert conversations into vector embeddings and store them in Pinecone
  3. Semantic Retrieval: When users return, we search for semantically similar past conversations and inject the most relevant ones into context
  4. Importance Scoring: Not all conversations are equally memorable—we score importance based on emotional intensity, user engagement, and explicit markers like users saying "remember this"
  5. Decay Functions: Older memories fade unless reinforced, mimicking human memory

This approach allows characters to remember hundreds of conversations while only using context for the most relevant ones.
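The importance scoring and decay can be surprisingly simple. Here's a sketch with an exponential half-life; the weights and the 30-day half-life are illustrative stand-ins for values we tune empirically:

```typescript
// Sketch of memory scoring with exponential time decay (numbers illustrative).
interface Memory {
  summary: string;
  importance: number;   // 0..1, from emotional intensity and engagement signals
  lastReferenced: Date; // referencing a memory resets its decay clock
}

const HALF_LIFE_DAYS = 30; // assumed half-life, tuned empirically in practice

function memoryScore(m: Memory, now = new Date()): number {
  const ageDays = (now.getTime() - m.lastReferenced.getTime()) / 86_400_000;
  return m.importance * Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}

// Pick the top-k memories worth injecting into the model's context window.
function selectMemories(memories: Memory[], k = 5): Memory[] {
  return [...memories].sort((a, b) => memoryScore(b) - memoryScore(a)).slice(0, k);
}
```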

Challenge 2: Response Consistency

The Problem: AI language models have inherent randomness. The same character might respond differently to identical questions, breaking the illusion of a consistent personality.

Our Solution:

  1. Temperature Tuning: We use lower temperature settings (0.7-0.8) for more consistent responses while maintaining naturalness
  2. Deterministic Seeds: For certain character types, we use seeded randomness ensuring similar questions produce similar responses
  3. Response Caching: Common questions ("tell me about yourself") are cached with several approved variations we rotate through
  4. Consistency Scoring: We monitor conversations for personality drift, flagging characters whose responses diverge from their defined traits
  5. Feedback Loops: User reports of "out of character" responses trigger prompt refinements
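For the temperature and seed pieces, the request configuration is roughly what you'd expect. A sketch using the OpenAI SDK (the seed value is illustrative, and OpenAI treats seeds as best-effort rather than a hard determinism guarantee):

```typescript
import OpenAI from 'openai';

const openai = new OpenAI();

// Lower temperature plus a fixed per-character seed nudges the model toward
// reproducible answers for similar questions (best-effort, not guaranteed).
async function consistentReply(systemPrompt: string, userMessage: string) {
  const completion = await openai.chat.completions.create({
    model: 'gpt-4-turbo',
    temperature: 0.7,
    seed: 42, // illustrative fixed seed
    messages: [
      { role: 'system', content: systemPrompt },
      { role: 'user', content: userMessage },
    ],
  });
  return completion.choices[0].message.content;
}
```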

Challenge 3: Latency and Real-Time Feel

The Problem: AI models can take 3-10 seconds to generate responses. This delay breaks conversational flow and makes interactions feel robotic.

Our Solution:

  1. Streaming Responses: We stream tokens as generated, displaying them in real-time with a typewriter effect
  2. Edge Deployment: Running serverless functions on edge nodes near users reduces network latency
  3. Predictive Pre-loading: We predict likely next user messages and speculatively begin generating responses
  4. Thinking Indicators: We show contextual thinking messages ("Sherlock is pondering..." vs generic "typing...") that feel character-appropriate during processing
  5. Response Chunking: For very long responses, we send the first paragraph quickly, then stream the rest

These optimizations reduced perceived latency by 60%, making conversations feel significantly more natural.
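On the client, consuming the token stream is the mirror image of the SSE handler sketched earlier. A simplified sketch (the `/api/chat` endpoint is an assumed name, and a production parser would buffer events that arrive split across chunks):

```typescript
// Client-side sketch: read the SSE stream and hand tokens to the UI.
async function streamReply(message: string, onToken: (t: string) => void) {
  const res = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Naive parse: each SSE event is a "data: ..." line. Production code
    // must buffer partial events that straddle chunk boundaries.
    for (const line of decoder.decode(value).split('\n')) {
      if (!line.startsWith('data: ') || line.includes('[DONE]')) continue;
      const { token } = JSON.parse(line.slice(6));
      onToken(token); // append to the visible message, typewriter-style
    }
  }
}
```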

Challenge 4: Content Safety and Moderation

The Problem: AI can generate inappropriate, harmful, or factually incorrect content. Characters might be manipulated into saying things inconsistent with our values or their character.

Our Solution:

  1. Input Filtering: User messages pass through content filters before reaching AI models
  2. Output Filtering: AI responses are checked for policy violations before display
  3. Character Guardrails: Each character prompt includes explicit instructions about unacceptable behaviors
  4. Human Review: Flagged conversations receive human review
  5. User Reporting: Easy reporting mechanisms with quick response times
  6. Continuous Monitoring: Automated systems scan conversations for emerging problematic patterns

We balance safety with avoiding over-censorship that would make characters feel sterile or unnatural.
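Structurally, the input and output filters wrap the generation call. Here's a sketch using OpenAI's moderation endpoint as a stand-in for our layered filters; the fallback strings are hypothetical examples:

```typescript
import OpenAI from 'openai';

const openai = new OpenAI();

// One possible filter layer; our real pipeline stacks several checks.
async function isFlagged(text: string): Promise<boolean> {
  const result = await openai.moderations.create({ input: text });
  return result.results[0].flagged;
}

async function safeExchange(
  userMessage: string,
  generate: (msg: string) => Promise<string>,
): Promise<string> {
  // Input filter: stop problematic messages before they reach the model.
  if (await isFlagged(userMessage)) {
    return 'This message cannot be sent. Please rephrase.';
  }
  const reply = await generate(userMessage);
  // Output filter: check the model's response before displaying it.
  if (await isFlagged(reply)) {
    return 'Let us speak of something else.'; // hypothetical in-character deflection
  }
  return reply;
}
```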

Challenge 5: Scaling to 245+ Characters

The Problem: Each character requires unique prompts, testing, and maintenance. How do we scale without proportionally scaling team size?

Our Solution:

  1. Character Templates: We created archetypes (historical figure, fictional detective, empathetic companion) with base prompts we customize
  2. Automated Testing: Scripts simulate conversations with characters, checking for consistency and policy violations
  3. Community Feedback: Engaged users help identify issues faster than our small team could alone
  4. Prioritization System: We focus detailed attention on the most-used characters while batch-processing updates for long-tail characters
  5. Procedural Generation: For some character types, we generate variations programmatically from base templates

This approach allows our lean team to maintain hundreds of characters without quality degradation.
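The archetype approach boils down to a base template plus per-character overrides. A sketch (archetype names, prompts, and fields are illustrative):

```typescript
// Illustrative archetype templates with per-character overrides.
interface Archetype {
  basePrompt: string;
  defaultTraits: string[];
}

const archetypes: Record<string, Archetype> = {
  historicalFigure: {
    basePrompt: 'You are a historical figure. Stay within the knowledge of your era.',
    defaultTraits: ['period-accurate speech', 'bounded knowledge'],
  },
  fictionalDetective: {
    basePrompt: 'You are a fictional detective. Reason aloud about clues.',
    defaultTraits: ['observant', 'deductive'],
  },
};

// Derive a concrete character from an archetype plus overrides.
function fromArchetype(kind: string, overrides: { name: string; extraTraits?: string[] }) {
  const base = archetypes[kind];
  return {
    name: overrides.name,
    prompt: `${base.basePrompt}\nYou are ${overrides.name}.`,
    traits: [...base.defaultTraits, ...(overrides.extraTraits ?? [])],
  };
}
```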

Data Privacy and Ethics

Building AI that people share intimate thoughts with requires serious consideration of privacy and ethics.

Privacy Architecture

Data Minimization: We collect only what's necessary for functionality:

  • Conversations are stored to power the memory features but can be deleted at any time
  • We don't require real names, locations, or identifying information
  • Analytics are anonymized and aggregated

Encryption:

  • All data encrypted in transit (TLS) and at rest (AES-256)
  • Database access restricted with row-level security
  • API keys and secrets managed through secure vaults

User Control:

  • Users can delete specific conversations or all data
  • Export functionality provides complete conversation history
  • Clear privacy policy in plain language, not legal jargon
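Deletion in particular is straightforward with this stack. A sketch using supabase-js (the table and column names are illustrative; row-level security means a session can only ever touch its own rows regardless):

```typescript
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!);

// Sketch of "delete all my conversations" (illustrative table/column names).
async function deleteAllConversations(userId: string): Promise<void> {
  const { error } = await supabase
    .from('conversations')
    .delete()
    .eq('user_id', userId);
  if (error) throw error;
}
```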

Ethical Guidelines

Transparency:

  • We're clear that users are talking to AI, not humans
  • Character limitations are communicated
  • We explain how the technology works

Beneficence:

  • Characters designed to be helpful, not manipulative
  • We avoid dark patterns or addiction-optimizing features
  • Mental health characters include disclaimers and resource links

Non-Maleficence:

  • Proactive content safety measures
  • Characters won't role-play harmful scenarios
  • System recognizes crisis language and provides resources

Autonomy:

  • Users control their data and experience
  • No forced engagement or manipulative retention tactics
  • Clear opt-out mechanisms

Lessons Learned

Building Fictionaire taught us valuable lessons applicable beyond our specific platform.

Technical Lessons

1. Start with One Model, Plan for Many: Our initial implementation was tightly coupled to GPT-4, and we had to refactor later for model agnosticism. Build abstraction layers from the start.

2. Observability is Critical: You can't improve what you don't measure. Comprehensive logging and analytics guided countless optimizations.

3. Prompt Engineering is Iterative: Initial prompts are never optimal. Build systems for rapid testing and iteration.

4. Edge Cases are Common: With thousands of users having creative conversations, "edge cases" become frequent cases.

5. Performance Perception Matters More Than Reality: Streaming responses and optimized loading feel faster than responses that block until complete, even when the blocking version finishes slightly sooner.

Product Lessons

1. Character Quality Over Quantity: Users prefer 50 great characters over 200 mediocre ones. We initially prioritized numbers over depth.

2. Memory Makes Magic: The feature that most impressed users was characters remembering past conversations. This creates genuine emotional investment.

3. Simplicity Wins: We built complex features users didn't want. The core experience—clicking a character and chatting—is what matters.

4. Community Drives Discovery: Word-of-mouth from delighted users outperformed any marketing we attempted.

5. Different Users, Different Use Cases: Some want entertainment, others education, others emotional support. Don't assume one use case.

Organizational Lessons

1. Small Team, Focused Scope: Staying lean forced us to prioritize ruthlessly and avoid feature bloat.

2. User Feedback is Gold: Direct conversations with users revealed insights no analytics could.

3. Ship and Iterate: Waiting for perfection delays learning. Ship, gather feedback, improve.

4. Technical Debt is Fine Initially: Perfect architecture doesn't matter if nobody uses the product. Achieve product-market fit, then refactor.

5. Document Decisions: With complex systems, future you will forget why current you made certain choices. Write it down.

What's Next: The Roadmap

We're continuously improving Fictionaire with several exciting developments in progress.

Near-Term Features

Voice Conversations: Speak with characters using voice input and hear them respond with character-appropriate voices. Already in development using advanced text-to-speech and speech recognition.

Group Conversations: Chat with multiple characters simultaneously—imagine a debate between historical philosophers or a mystery solved with multiple detectives.

Character Creation: Allow users to create custom characters using guided templates, bringing their own fictional characters or personas to life.

Improved Memory: Enhanced long-term memory systems that better capture relationship progression and subtle conversation callbacks.

Mobile Apps: Native iOS and Android apps with offline message composition and notification systems.

Long-Term Vision

Multimodal Interactions: Characters that can generate images, show you their world, or demonstrate concepts visually.

VR Integration: Meet characters in virtual environments, adding spatial presence to conversations.

Collaborative Storytelling: Multiple users collaborating on stories with AI characters serving as narrative guides and participants.

Educational Platform: Formal curricula leveraging character conversations for language learning, history education, and skill development.

API Access: Allow developers to integrate Fictionaire characters into their own applications and experiences.

Technical Deep-Dive: A Sample Conversation Flow

To make this concrete, let's trace exactly what happens when you send a message to a character:

Step 1: Message Submission (< 10ms)

  • User types message and hits send
  • Frontend validates message (length, content filter pre-check)
  • Streaming connection (Server-Sent Events) opened for the response
  • Message sent to API endpoint with user ID, character ID, and message content

Step 2: Context Assembly (50-150ms)

  • Retrieve last 10 messages from PostgreSQL for immediate context
  • Query Pinecone for semantically similar past conversations
  • Retrieve character prompt and current personality state
  • Assemble context package: character prompt + relevant memories + recent messages + current user message

Step 3: AI Generation (2-5 seconds)

  • Route to appropriate AI model based on character type and load
  • Stream request to AI API with assembled context
  • Receive tokens as they're generated
  • Apply real-time content filtering to each token chunk

Step 4: Response Delivery (real-time during generation)

  • Stream tokens to the user over the SSE connection as they're received
  • Display with typewriter effect at natural reading pace
  • Complete response when AI signals completion

Step 5: Post-Processing (background, 100-500ms)

  • Store complete conversation in PostgreSQL
  • Generate embedding of conversation for future semantic retrieval
  • Update character state and memory priorities
  • Log analytics (response time, user engagement signals, etc.)
  • Check for patterns triggering alerts (safety, quality issues, etc.)

Total perceived latency: ~300ms before first tokens appear, complete response in 2-5 seconds depending on length.
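Stitched together, the flow above is essentially one orchestration function. In the sketch below, every helper is an assumed stub standing in for the subsystems described in Steps 1 through 5:

```typescript
// End-to-end sketch of the conversation flow (all helpers are assumed stubs).
async function handleMessage(userId: string, characterId: string, message: string) {
  // Step 2: assemble context in parallel.
  const [recent, memories, systemPrompt] = await Promise.all([
    loadRecentMessages(userId, characterId, 10),          // PostgreSQL
    recallRelevantMemories(userId, characterId, message), // Pinecone
    loadCharacterPrompt(characterId),
  ]);

  // Steps 3 and 4: generate and stream the reply to the user as tokens arrive.
  const reply = await streamModelResponse({ systemPrompt, memories, recent, message });

  // Step 5: fire-and-forget background post-processing.
  void Promise.all([
    storeConversation(userId, characterId, message, reply),
    embedForRetrieval(userId, characterId, message, reply),
    logAnalytics(userId, characterId),
  ]);

  return reply;
}

// Assumed stubs for the subsystems described above.
declare function loadRecentMessages(u: string, c: string, n: number): Promise<string[]>;
declare function recallRelevantMemories(u: string, c: string, m: string): Promise<string[]>;
declare function loadCharacterPrompt(c: string): Promise<string>;
declare function streamModelResponse(ctx: object): Promise<string>;
declare function storeConversation(u: string, c: string, m: string, r: string): Promise<void>;
declare function embedForRetrieval(u: string, c: string, m: string, r: string): Promise<void>;
declare function logAnalytics(u: string, c: string): Promise<void>;
```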

Join Our Journey

Building Fictionaire has been an incredible journey of technical challenges, creative problem-solving, and continuous learning. We're still in the early stages of what AI-powered character interactions can become.

If you're a developer interested in AI, we're sharing more technical details on our blog and exploring open-source contributions for non-proprietary components of our stack.

If you're curious about AI engineering, the best way to understand these systems is to use them. Explore our 245+ characters, have meaningful conversations, and see firsthand what current AI technology can create.

And if you're building something in this space—whether complementary or competitive—please reach out. The future of conversational AI will be built through shared learning, open collaboration, and thoughtful development prioritizing user wellbeing over growth-at-any-cost.

The technology behind platforms like Fictionaire is fascinating, but what matters most is whether it creates value for users—meaningful conversations, learning opportunities, emotional support, entertainment, or simply delightful moments in their day.

We're committed to building technology that enhances human flourishing, one conversation at a time.

Start exploring Fictionaire today and experience what happens when thoughtful engineering meets creative character design and powerful AI.
