Most AI applications today rely on vector databases for context - storing conversations and retrieving them through similarity search.
Like a digital filing cabinet, they find relevant past conversations by matching patterns in text.
While this works for basic context retrieval, as users interact more with these apps, fundamental limitations emerge:
Growing conversation histories become expensive to process
Surface-level matching misses deeper patterns
Each interaction requires rediscovering user patterns
Vector search accuracy degrades with scale
Context becomes inconsistent across responses
We need a system that can build reliable, structured understanding of users while preserving the richness and fluidity of natural conversations.
Persona combines graph databases with vector search to create a hybrid system that captures both structured understanding and semantic relationships. Instead of just indexing conversations, it builds an evolving model of each user through their interactions.
Traditional vector-only systems are limited to finding similar past conversations - like having a perfect memory of what someone said, but no understanding of who they are. Persona’s hybrid approach builds genuine understanding by:
Maintaining verified patterns in a graph structure
Discovering relationships between different aspects of user behavior
Evolving its understanding over time
Enabling precise, programmatic access to user insights
Supporting complex queries about user psychology
Let’s see how this transforms a simple conversational AI coach and task manager into an intelligent companion that truly understands its users.
import requests

# vector_db, llm, PERSONA_API_BASE, and user_id are assumed to be configured elsewhere.

# A user seeking help with an overwhelming task
user_message = "I need to finish this quarterly report by tomorrow but feeling overwhelmed. Any suggestions?"

# ===============================
# Traditional RAG Approach
# ===============================
def get_traditional_response(user_message):
    # Simple vector similarity search
    similar_messages = vector_db.similarity_search(
        text=user_message,
        limit=3
    )

    # Build context from similar past conversations
    context = '\n'.join([
        "Previous relevant interactions:",
        *[msg.text for msg in similar_messages],
        "",
        "Current user: " + user_message
    ])

    response = llm.generate(context)
    return response

# Returns generic advice based on similar-sounding situations:
#
# "Breaking tasks into smaller pieces can help manage overwhelm.
#  Let's start with creating an outline for your report..."

# ===============================
# Persona Approach
# ===============================
def get_persona_response(user_message):
    # Persona handles the complexity of finding relevant patterns
    context = requests.post(
        f"{PERSONA_API_BASE}/api/v1/rag/{user_id}/query",
        json={"query": user_message}
    ).json()

    # Under the hood, Persona:
    # 1. Uses vector search to find similar nodes
    # 2. Retrieves relevant attribute nodes (work_sessions, energy_cycles)
    # 3. Retrieves structured patterns via graph crawl and multi-hop connections

    # Returns rich context:
    # {
    #   "work_sessions": {
    #     "effective_patterns": {
    #       "duration": "two hour blocks",
    #       "evidence": "completed presentation without distractions when using morning blocks",
    #       "success_rate": "80% task completion in morning focus time"
    #     }
    #   },
    #   "energy_cycles": {
    #     "peak_period": "before lunch",
    #     "optimal_work_window": "morning",
    #     "evidence": "consistently higher output in 8am-11am slot"
    #   }
    # }

    response = llm.generate(context)
    return response

# Returns personalized advice based on proven patterns:
#
# "I know morning focus blocks work well for you - you've had 80% task completion
#  rate during 8am-11am. Since it's 3pm now, let's organize the report tonight
#  so you can tackle it first thing tomorrow in your optimal window.
#  Remember to turn off Slack notifications like you did for the last presentation?"
When a user interacts with the AI coach, Persona builds a queryable understanding of latent cognitive patterns and stores them as facts about the user.
For our example, we’ll use a simple schema to learn about the user’s productivity patterns.
POST /learn
{
  "description": "Learn user's productivity patterns for adaptive coaching",
  "schema": {
    "attributes": [
      "work_sessions",       // Track when/how user works effectively
      "energy_cycles",       // Daily/weekly energy patterns
      "focus_blocks",        // Deep work capabilities
      "effective_patterns",  // What works for them
      "recovery_needs",      // How they recharge
      "disruption_factors"   // What breaks their flow
    ],
    "relationships": [
      "improves",
      "disrupts",
      "depends_on",
      "leads_to"
    ]
  }
}
Persona creates new nodes and relationships based on this schema. Because graph construction is generative, it continues to learn new patterns and preferences over time.
// Sample user interactions showing pattern evolution
{
  "user_id": "john",
  "interactions": [
    // Week 1 - Baseline
    {
      "user": "Having trouble focusing today. Can't stick to one task for more than 15 minutes.",
      "ai_coach": "I hear you. When do you usually feel most focused?"
    },
    {
      "user": "Mornings are usually better, but lately I've been checking emails and Slack constantly.",
      "ai_coach": "What if we tried dedicated focus blocks in the morning?"
    },
    // Week 2 - Implementation
    {
      "user": "The two-hour morning block worked great! Got through my whole presentation without distractions.",
      "ai_coach": "That's excellent! What made it particularly effective?"
    },
    {
      "user": "No notifications, and my energy is better before lunch. After lunch I get sluggish.",
      "ai_coach": "Good insight about your energy patterns."
    },
    // Week 3 - Challenge
    {
      "user": "Morning meetings all week. Can't do my focus blocks and feeling scattered again.",
      "ai_coach": "I see. This shows how important those morning blocks have become."
    }
  ]
}

// What Persona stores in the graph - directly tied to user's messages
{
  "work_sessions": {
    "effective_patterns": {
      "duration": "two hour blocks",
      "evidence": "completed presentation without distractions",
      "barriers": "morning meetings"
    },
    "disruption_factors": {
      "primary": ["email_checking", "slack_notifications"],
      "impact": "reduces focus to 15min intervals"
    }
  },
  "energy_cycles": {
    "peak_period": "before lunch",
    "low_period": "post lunch",
    "state": "sluggish",
    "optimal_work_window": "morning"
  },
  "focus_blocks": {
    "preferred_duration": "two hour blocks",
    "success_conditions": ["no notifications"],
    "impact_when_missed": "feeling scattered"
  }
}
The key difference here is that Persona maintains a structured understanding of each requested attribute while allowing for the discovery of new connections and patterns. The user is modeled as a language-based entity that evolves over time - this is where generative LLMs combined with graph structure show their power.
This structured approach enables reliable querying of specific patterns. Say our app has a non-conversational AI agent that plans a user’s tasks using energy_cycles and focus_blocks as inputs.
// Get focus block requirements for scheduling
MATCH (u:User {id: 'john'})-[r]->(n)
WHERE n.name = 'focus_blocks'
RETURN n;

// Check energy patterns for optimal scheduling
MATCH (u:User {id: 'john'})-[r]->(n)
WHERE n.name = 'energy_cycles'
RETURN n;
For example, we can pass this language data to an LLM-based scheduler.
import json

# graph and llm are assumed to be configured clients, as in the earlier examples.

def create_focus_schedule(user_id, tasks):
    # Get patterns from graph
    focus_data = graph.query('\n'.join([
        "MATCH (u:User {id: $user_id})-[r]->(n)",
        "WHERE n.name = 'focus_blocks'",
        "RETURN n"
    ]))
    energy_data = graph.query('\n'.join([
        "MATCH (u:User {id: $user_id})-[r]->(n)",
        "WHERE n.name = 'energy_cycles'",
        "RETURN n"
    ]))

    # Convert graph data to strings for LLM context
    focus_context = json.dumps(focus_data)
    energy_context = json.dumps(energy_data)

    # Use LLM to create schedule based on learned patterns
    prompt = '\n'.join([
        "Based on the user's patterns:",
        "",
        f"Focus patterns: {focus_context}",
        f"Energy patterns: {energy_context}",
        "",
        f"Create a schedule for these tasks: {tasks}",
        "",
        "Order the tasks optimally considering the user's focus blocks and energy cycles.",
        "Return a JSON schedule with time slots and rationale."
    ])

    schedule = llm.generate(prompt)
    # Returns:
    # {
    #   "schedule": [
    #     {
    #       "task": "Quarterly presentation",
    #       "time": "9:00 AM",
    #       "duration": "two hours",
    #       "rationale": "Scheduled during peak energy before lunch, with no notifications"
    #     },
    #     {
    #       "task": "Team check-in",
    #       "time": "2:00 PM",
    #       "rationale": "Short meeting during lower energy period"
    #     }
    #   ]
    # }
    return schedule
This shows how Persona’s graph structure allows us to:
Reliably retrieve learned patterns
Use natural language as building blocks
Take advantage of any additional patterns the system has discovered
Let LLMs handle the complex reasoning while working with verified user data
Persona allows you to ask questions about your users and get structured insights based on their learned patterns.
The system combines hybrid vector search and graph traversal with LLMs to discover relevant connections and respond in the requested structure. For example, if we want to understand how our user responds to deadlines, and why:
POST /ask
{
  "user_id": "john",
  "question": "How does this user handle high-pressure deadlines?",
  "context": {
    "response_style": {},
    "support_needs": {},
    "evidence": []
  }
}

// Response drawing from user's patterns
{
  "response_style": {
    "initial": "tends to procrastinate until pressure builds",
    "effective": "performs well with structured breakdown approach",
    "time_preference": "most successful with morning execution"
  },
  "support_needs": {
    "planning": "needs external help breaking down large tasks",
    "environment": "requires notification-free blocks",
    "motivation": "responds well to deadline-based milestones"
  },
  "evidence": [
    "Completed quarterly presentation after breaking it into morning focus blocks",
    "Shows consistent pattern of task avoidance until structured intervention",
    "Successfully manages personal deadlines when given clear morning schedules"
  ]
}
The Ask API combines the reliability of graph structure with the flexibility of language models and vector search.
You can explore new aspects of user behavior while getting consistent, evidence-based responses.
1. Deterministic Retrieval vs Probabilistic Generation
With conversation history, each time we ask about energy patterns, the LLM needs to re-analyze and regenerate insights
This can lead to inconsistent interpretations over time
Graph storage ensures we get exactly the same canonical facts each time we query
Critical for building reliable systems and maintaining consistent user experience (see the sketch after this list)
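As a rough illustration of this contrast, here is a minimal sketch, assuming the same hypothetical graph and llm clients used in the scheduler example above (none of this is a documented Persona API):

def energy_pattern_from_graph(graph, user_id):
    # Deterministic: the same canonical 'energy_cycles' node comes back on
    # every call, so downstream features always see identical facts
    return graph.query(
        f"MATCH (u:User {{id: '{user_id}'}})-[r]->(n) "
        "WHERE n.name = 'energy_cycles' RETURN n"
    )

def energy_pattern_from_history(llm, conversation_history):
    # Probabilistic: the LLM re-reads the raw transcript and may interpret the
    # same evidence differently each time it is asked
    prompt = "Summarize this user's energy patterns:\n" + "\n".join(conversation_history)
    return llm.generate(prompt)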
2. Computational Efficiency & Cost
Processing entire conversation histories through an LLM for each insight is expensive
Graph queries are fast and cheap
We can use LLMs just for the initial pattern discovery, then rely on efficient graph operations
Especially important when building features that need frequent access to user attributes (see the sketch below)
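A minimal sketch of this "discover once, query cheaply" split, again assuming hypothetical llm and graph clients (the extraction prompt and graph.upsert call are illustrative, not part of a documented API):

import json

def ingest_interaction(llm, graph, user_id, message):
    # One LLM pass per new interaction to extract patterns...
    patterns = llm.generate(
        f"Extract productivity patterns from this message as JSON: {message}"
    )
    # ...which are then written into the user's graph (hypothetical upsert)
    graph.upsert(user_id, json.loads(patterns))

def get_focus_blocks(graph, user_id):
    # Every later read is a cheap graph lookup - no LLM call, no re-processing
    # of the full conversation history
    return graph.query(
        f"MATCH (u:User {{id: '{user_id}'}})-[r]->(n) "
        "WHERE n.name = 'focus_blocks' RETURN n"
    )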
3. Complex Pattern Recognition Over Time
Graphs can track how attributes evolve and influence each other
We can see how changes in energy patterns affect motivation over months
This kind of temporal pattern analysis is much harder with raw conversation history
Enables features like “How has John’s morning productivity changed since starting exercise?” (example query below)
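For instance, if each attribute node links to timestamped observation nodes (an assumption about the stored schema; the OBSERVED relationship and created_at property are hypothetical), tracking how an attribute evolves becomes a plain graph query:

def attribute_history(graph, user_id, attribute, since_iso_date):
    # e.g. attribute_history(graph, "john", "energy_cycles", "2024-01-01")
    # returns how the user's energy observations changed after that date
    return graph.query(
        f"MATCH (u:User {{id: '{user_id}'}})-[r]->(n)-[:OBSERVED]->(o) "
        f"WHERE n.name = '{attribute}' AND o.created_at >= date('{since_iso_date}') "
        "RETURN o ORDER BY o.created_at"
    )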
4. Programmatic Integration
Other systems and agents can easily consume structured graph data
No need for natural language parsing or context interpretation
Enables reliable automation and workflows
Critical for features like automated scheduling or task prioritization (example below)
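As a sketch, a non-LLM automation could consume the stored attributes directly; the calendar_client and the fields it reads are hypothetical, not part of Persona:

def block_focus_time(graph, calendar_client, user_id):
    focus = graph.query(
        f"MATCH (u:User {{id: '{user_id}'}})-[r]->(n) "
        "WHERE n.name = 'focus_blocks' RETURN n"
    )
    # Structured fields are read directly - no natural language parsing needed
    if focus and focus[0].get("preferred_duration") == "two hour blocks":
        calendar_client.create_event(
            user_id=user_id,
            title="Focus block (auto-scheduled)",
            start="09:00",            # matches the learned morning window
            duration_minutes=120,
            notifications=False       # respects the 'no notifications' condition
        )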