Redis to Typesense Migration Report

Executive Summary

This report analyzes the current Redis implementation in the AgentPortal BFF application and evaluates migrating to Typesense. With projected data growth to 20-30GB and 150,000-200,000 documents requiring full-text search, pagination, and keyword counting capabilities, the current Redis setup shows significant strain. This analysis provides a comprehensive migration strategy and compares Typesense with Elasticsearch to ensure the best technology choice.

Current Redis Usage Analysis

Data Volume and Structure

The application currently stores multiple entity types in Redis using RedisJSON:

| Entity Type | Key Pattern | Typical Document Size | TTL |
|---|---|---|---|
| Consultant Profiles | `BFF:consultant:{entityId}` | 5-10 KB | None |
| Sales Requests | `BFF:request:{matchId}` | 3-5 KB | 5 minutes |
| Match Results | `BFF:r2cresult:{matchId}:{iterationId}` | 10-15 KB | 48 hours |
| Declarations of Interest | `BFF:dois:{entityId}` | 1-2 KB | None |
| Map Locations | `BFF:mapLocation:{locationId}` | 500 bytes | None |

Redis Operations and Patterns

1. Basic Key Operations

// Current implementation in RedisAdapter.cs
KeyExists(key)
KeyDelete(keys)
KeyExpire(key, expiry)
KeyDeleteAsync(keys)

2. JSON Document Operations

// Complex object storage
JSON.SET(key, "$", document)
JSON.GET(key, paths)
JSON.ARRAPPEND(key, path, values)
JSON.TYPE(key, path)

3. Search Operations (RedisSearch)

// Index creation
FT.CREATE idx-bff-consultant ON JSON PREFIX 1 BFF:consultant: SCHEMA
  $.Fullname AS Fullname TEXT
  $.Territory AS Territory TAG
  $.OfficeLocation AS OfficeLocation TAG
  $.BusinessUnit AS BusinessUnit TAG
  $.AvailableFromDate AS AvailableFromDate NUMERIC
  $.MapGeoLocation AS MapGeoLocation GEO

// Search queries
FT.SEARCH idx-bff-consultant "@MatchId:{123} @Status:{Lead | 7N Consultant}"

Current Pain Points

  1. Complex Query Building: Extensive string manipulation required for search queries
  2. Manual Pagination: Fetching all results (500+ items) then paginating in-memory
  3. No Native Full-Text Search: Using workarounds with flattened text fields
  4. Performance Issues: Large result sets causing memory pressure
  5. Limited Analytics: No native support for counting keyword occurrences
  6. Error-Prone Operations: Frequent try-catch blocks for Redis operations

Drop-in Replacement Strategy

Phase 1: Parallel Implementation (Weeks 1-2)

  1. Install Typesense alongside Redis

    // New ITypesenseAdapter interface matching IRedisAdapter
    public interface ITypesenseAdapter
    {
        Task<T> GetDocumentAsync<T>(string collection, string id);
        Task UpsertDocumentAsync<T>(string collection, string id, T document);
        Task<SearchResult<T>> SearchAsync<T>(string collection, SearchParameters parameters);
    }
    

  2. Create Adapter Layer

    // Wrapper to maintain existing interfaces
    public class TypesenseRedisAdapter : IRedisAdapter
    {
        private readonly ITypesenseAdapter _typesense;
        private readonly IRedisAdapter _redis;
        private readonly bool _useTypesenseForSearch;
    
        // Gradually migrate operations
    }
    

Phase 2: Data Migration (Week 3)

  1. Dual Write Strategy
     • Write to both Redis and Typesense
     • Read from Redis (primary) with Typesense validation
     • Monitor data consistency

  2. Collection Schema Definition

    {
      "name": "consultants",
      "fields": [
        {"name": "id", "type": "string"},
        {"name": "fullname", "type": "string", "facet": false},
        {"name": "territory", "type": "string[]", "facet": true},
        {"name": "office_location", "type": "string", "facet": true},
        {"name": "business_unit", "type": "string", "facet": true},
        {"name": "available_from", "type": "int64"},
        {"name": "location", "type": "geopoint"},
        {"name": "profile_text", "type": "string", "facet": false}
      ],
      "default_sorting_field": "available_from"
    }
    

Phase 3: Cutover (Week 4)

  1. Switch read operations to Typesense
  2. Remove Redis write operations
  3. Decommission Redis indexes
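A guarded read path makes the cutover reversible: reads try Typesense first and fall back to Redis until confidence is established. A minimal sketch, assuming two hypothetical async read adapters (not existing code):

```javascript
// Sketch: during cutover, prefer Typesense but fall back to Redis on any failure.
// `typesenseRead` and `redisRead` are hypothetical adapter functions.
async function readWithFallback(id, typesenseRead, redisRead) {
  try {
    return { source: 'typesense', doc: await typesenseRead(id) };
  } catch {
    // Timeout, 5xx, connection refused: serve from the legacy store instead.
    return { source: 'redis', doc: await redisRead(id) };
  }
}
```

Once the fallback branch stops firing in production metrics, Redis reads (and then Redis itself) can be removed.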

Current Redis Implementation Analysis

Redis Index Structure and Profile Combination Logic

The current implementation uses a sophisticated Redis index idx-bff-consultant that combines consultant profiles with match iterations:

Index Schema

FT.CREATE idx-bff-consultant ON JSON 
PREFIX 1 BFF:consultant: 
SCHEMA 
    $.ContactDto.Fullname AS Fullname TEXT NOSTEM 
    $.ProfileLevel2.MatchingType AS ProfileMatchingType NUMERIC 
    $.EntityId AS EntityId TAG 
    $.Filters.Territory AS Territory TAG 
    $.Filters.OfficeLocation AS OfficeLocation TAG 
    $.Filters.BusinessUnit AS BusinessUnit TAG 
    $.Filters.WorkedWith7N AS WorkedWith7N TAG 
    $.Filters.RelationStatus AS RelationStatus TAG 
    $.Filters.Status AS Status TAG 
    $.Filters.CalculatedAvailabilityDate AS CalculatedAvailabilityDate NUMERIC 
    $.Filters.MapLocationId AS MapLocationId TAG 
    $.Filters.MapGeoLocation AS MapGeoLocation GEO 
    $.Matches[*].MatchId AS MatchId TAG 
    $.Matches[*].IterationId AS IterationId NUMERIC 
    $.Matches[*].AverageSimilarity AS AverageSimilarity NUMERIC SORTABLE 
    $.Matches[*].MatchingType AS MatchingType NUMERIC 
    $.Flattened.AggregateRoot AS FlattenedAggregateRoot TEXT

Data Structure Analysis

ConsultantWrapper (Redis Document)

public class ConsultantWrapper
{
    public Guid EntityId { get; set; }
    public FreelancerAggregateRoot FreelancerAggregateRoot { get; set; }
    public Flattened Flattened { get; set; } = new Flattened();
    public ContactDto ContactDto { get; set; }
    public Filters Filters { get; set; } = new();
    public ProfileLevel2 ProfileLevel2 { get; set; }
    public List<Match> Matches = []; // Key: Multiple matches per consultant
}

Match Structure

public class Match
{
    public uint IterationId { get; set; }
    public Guid MatchId { get; set; }
    public double AverageSimilarity { get; set; }
    public long CreationTime { get; set; }
    public int TTL { get; set; }
    public MatchingType MatchingType { get; set; }
    public Dictionary<string, int>? SkillsOverlap { get; set; }
}

GetProfiles Method Implementation

The GetProfiles(matchId, iterationId, matchingType, options) method implements complex profile combination logic:

Step 1: Query Construction

private Query BuildQuery(RedisFilteringOptions options, Guid matchId, uint iterationId, MatchingType matchingType)
{
    var query = "@MatchId:{$mId} @IterationId:[$itId $itId] @MatchingType:[$mType $mType]";

    // Add keyword search
    if (options.Keywords.Any())
    {
        foreach (var keyword in options.Keywords)
        {
            query += $" @FlattenedAggregateRoot:\"{EscapeRedisText(keyword)}\"";
        }
    }

    // Add additional filters (status, territory, availability, etc.)
    // ... filtering logic

    return new Query(query)
        .AddParam("mId", matchId.ToString())
        .AddParam("itId", iterationId.ToString())
        .AddParam("mType", (int)matchingType);
}

Step 2: Paginated Retrieval with Deduplication

private async Task<List<MatchedCandidate>> GetProfiles(
    Guid matchId, uint iterationId, MatchingType matchingType, RedisFilteringOptions options)
{
    List<ProfileAndMatches> all = [];
    var offset = 0;
    var pagesize = 500; // Fixed batch size
    var hashes = new HashSet<Guid>(); // Deduplication

    // Initial query
    var q = BuildQuery(options, matchId, iterationId, matchingType);
    var (profiles, totalResults) = await SearchQueryAsync(q, matchId, iterationId, offset, pagesize);

    // Add to deduplication set
    profiles.ForEach(p => {
        hashes.Add(p.Profile!.Id);
        all.Add(p);
    });

    // Paginate through all results
    while (totalResults > (offset + pagesize))
    {
        offset += pagesize;
        (profiles, totalResults) = await SearchQueryAsync(q, matchId, iterationId, offset, pagesize);

        // Skip duplicates
        profiles.ForEach(p => {
            if (hashes.Contains(p.Profile!.Id)) return;
            hashes.Add(p.Profile!.Id);
            all.Add(p);
        });
    }

    // Final processing and sorting
    return all
        .Where(m => m.Match != null && m.Profile != null)
        .Where(m => m.Match!.MatchingType == matchingType)
        .OrderByDescending(m => m.Match!.AverageSimilarity)
        .Select(m => new MatchedCandidate(m.Profile!, m.Match!.SkillsOverlap))
        .ToList();
}

Step 3: Profile-Match Combination

private async Task<(List<ProfileAndMatches>, long)> SearchQueryAsync(
    Query q, Guid matchId, uint iterationId, int offset, int limit)
{
    var d = await ft.SearchAsync(IndexConsultantMatchAndIterationId, q.Limit(offset, limit));

    var profiles = d.Documents.Select(doc =>
        new ProfileAndMatches()
        {
            Profile = JsonSerializer.Deserialize<List<ProfileLevel2>>(doc["$.ProfileLevel2"])
                .FirstOrDefault(),
            Match = JsonSerializer.Deserialize<List<List<Match>>>(doc["$.Matches"])
                .SelectMany(l => l)
                .FirstOrDefault(m => m.MatchId == matchId && m.IterationId == iterationId)
        }).Where(p => p.Profile != null).DistinctBy(p => p.Profile?.Id).ToList();

    return (profiles, d.TotalResults);
}

Keyword Search Implementation

Flattened Aggregate Root Construction

The FlattenedAggregateRoot field contains searchable text extracted from:

  • CV Title and Name
  • Profile Summary
  • All position titles, customers, descriptions
  • All skills (category names and skill names)
  • All project titles, roles, descriptions
  • Project account names and project names
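As a sketch of how such a field is assembled (the property names here are illustrative assumptions, not the actual FreelancerAggregateRoot shape):

```javascript
// Sketch: concatenate the searchable parts of a profile into one text field,
// mirroring what $.Flattened.AggregateRoot holds. Property names are illustrative.
function buildFlattenedAggregateRoot(profile) {
  const parts = [
    profile.cvTitle,
    profile.name,
    profile.summary,
    ...(profile.positions ?? []).flatMap(p => [p.title, p.customer, p.description]),
    ...(profile.skills ?? []).flatMap(c => [c.category, ...(c.skills ?? [])]),
    ...(profile.projects ?? []).flatMap(p => [p.title, p.role, p.description, p.accountName, p.projectName]),
  ];
  // Drop missing sections and join into one space-separated blob.
  return parts.filter(Boolean).join(' ');
}
```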

Keyword Query Building

private string BuildKeywordsQuery(RedisFilteringOptions options, string query)
{
    if (options.Keywords.Any())
    {
        foreach (var keyword in options.Keywords)
        {
            query += $" @FlattenedAggregateRoot:\"{EscapeRedisText(keyword)}\"";
        }
        // Generates: @FlattenedAggregateRoot:"Azure" @FlattenedAggregateRoot:"AWS"
    }
    return query;
}

Current Pain Points in Profile Combination

  1. In-Memory Deduplication: HashSet used to prevent duplicate profiles across pages
  2. Full Result Set Retrieval: Must fetch all matches to apply proper sorting
  3. Complex Array Filtering: Searching within $.Matches[*] arrays with multiple conditions
  4. Manual Pagination: Client-side pagination after fetching all results
  5. Performance Degradation: 500+ item batches cause memory pressure
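The round-trip cost of pain points 2 and 4 is deterministic: with a 500-item batch, serving any page first requires ceil(totalResults / 500) sequential Redis queries. The loop's shape, reduced to a stub with no Redis involved:

```javascript
// Sketch: reproduces the fetch-all + HashSet dedup pattern against a stubbed
// search function, and counts the round trips it costs.
function fetchAllWithDedup(search, pageSize = 500) {
  const seen = new Set(); // mirrors the HashSet<Guid> deduplication
  const all = [];
  let offset = 0;
  let roundTrips = 0;
  let total;
  do {
    const page = search(offset, pageSize); // one Redis round trip
    roundTrips++;
    total = page.total;
    for (const item of page.items) {
      if (!seen.has(item.id)) {
        seen.add(item.id);
        all.push(item);
      }
    }
    offset += pageSize;
  } while (total > offset);
  return { all, roundTrips };
}
```

At 200,000 matching documents that is 400 round trips per request, all of which Typesense's server-side page/per_page eliminates.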

Redis to Typesense Operation Mapping

Key-Value Operations

| Redis Operation | Typesense Equivalent | Notes |
|---|---|---|
| `JSON.SET key $ doc` | `documents.upsert()` | Automatic indexing |
| `JSON.GET key $` | `documents.retrieve()` | Direct document fetch |
| `KeyExists` | `documents.retrieve()` with error handling | Check 404 response |
| `KeyDelete` | `documents.delete()` | Immediate removal |
| `KeyExpire` | Not needed | Use document timestamp field |
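The KeyExists mapping relies on catching the 404 that Typesense returns for a missing document. A hedged sketch against the JS client's `collections(...).documents(id).retrieve()` chain (the `httpStatus` property on the thrown error is an assumption about the client's error shape):

```javascript
// Sketch: emulate Redis KeyExists with a Typesense document retrieve.
// Treats a 404 as "does not exist" and rethrows anything else.
async function documentExists(client, collection, id) {
  try {
    await client.collections(collection).documents(id).retrieve();
    return true;
  } catch (err) {
    if (err.httpStatus === 404) return false; // assumed error shape
    throw err; // network/server errors should not masquerade as "missing"
  }
}
```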

Search Operations

| Redis Pattern | Typesense Implementation |
|---|---|
| `@Fullname:John*` | `q=John&query_by=fullname&prefix=true` |
| `@Territory:{DK\|PH}` | `filter_by=territory:=[DK,PH]` |
| `@AvailableFromDate:[0 $date]` | `filter_by=available_from:<=$date` |
| `@MapGeoLocation:[$lng $lat $dist km]` | `filter_by=location:($lat,$lng,$dist km)` |
| `LIMIT $offset $limit` | `page=$page&per_page=$limit` |
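One subtlety in the LIMIT mapping: Redis offsets are zero-based while Typesense pages are 1-based, so an offset only translates cleanly when it is a multiple of the page size. A small conversion helper (illustrative):

```javascript
// Sketch: convert a Redis-style LIMIT offset/limit pair to Typesense paging.
function toTypesensePaging(offset, limit) {
  if (offset % limit !== 0) {
    throw new Error('offset must be a multiple of limit for page-based paging');
  }
  return { page: Math.floor(offset / limit) + 1, per_page: limit };
}
```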

Complex Query Example

Redis Query:

FT.SEARCH idx-bff-consultant 
  "@MatchId:{123} @IterationId:[456 456] @Status:{Lead | 7N Consultant}" 
  LIMIT 0 100

Typesense Query:

{
  "searches": [{
    "collection": "consultants",
    "q": "*",
    "filter_by": "match_id:=123 && iteration_id:=456 && status:=[Lead,7N Consultant]",
    "per_page": 100,
    "page": 1
  }]
}

Typesense Query Implementation Guide

Documentation References

  1. Search API Reference: https://typesense.org/docs/0.25.0/api/search.html
  2. Filtering Documentation: https://typesense.org/docs/0.25.0/api/search.html#filter-parameters
  3. Geo Search Guide: https://typesense.org/docs/0.25.0/api/geosearch.html
  4. Faceted Search: https://typesense.org/docs/0.25.0/api/search.html#facet-results
  5. Pagination Guide: https://typesense.org/docs/0.25.0/api/search.html#pagination
  6. Multi-Search: https://typesense.org/docs/0.25.0/api/federated-multi-search.html

Query Implementation Examples

1. Full-Text Search with Typo Tolerance

// Reference: https://typesense.org/docs/0.25.0/api/search.html#typo-tolerance
var searchParameters = new SearchParameters
{
    Q = "John Doe",           // Search query
    QueryBy = "fullname,profile_text",  // Fields to search
    TypoTokensThreshold = 2,  // Look for more typo-tolerant matches if fewer than this many results are found
    NumTypos = 2,            // Number of typos to tolerate
    Prefix = true            // Enable prefix search
};

2. Filtering and Faceting

// Reference: https://typesense.org/docs/0.25.0/api/search.html#filter-parameters
var searchParameters = new SearchParameters
{
    Q = "*",  // Wildcard to return all documents
    FilterBy = "territory:=[DK,SE,NO] && business_unit:=Technology && available_from:<=1704067200",
    FacetBy = "territory,office_location,business_unit,status",
    MaxFacetValues = 100
};

3. Geo Search

// Reference: https://typesense.org/docs/0.25.0/api/geosearch.html
var searchParameters = new SearchParameters
{
    Q = "*",
    FilterBy = "location:(55.6761, 12.5683, 50 km)",  // Copenhagen center, 50km radius
    SortBy = "location(55.6761, 12.5683):asc"  // Sort by distance from the same point
};

4. Pagination with Sorting

// Reference: https://typesense.org/docs/0.25.0/api/search.html#ranking-and-sorting
var searchParameters = new SearchParameters
{
    Q = "*",
    Page = 2,
    PerPage = 50,
    SortBy = "available_from:asc,_text_match:desc"  // Sort by date, then relevance
};

5. Keyword Counting with Group By

// Reference: https://typesense.org/docs/0.25.0/api/search.html#grouping
var searchParameters = new SearchParameters
{
    Q = "cloud computing",
    QueryBy = "profile_text",
    GroupBy = "business_unit",
    GroupLimit = 3,  // Top 3 results per group
    FacetBy = "skills",  // Count occurrences of skills
    MaxFacetValues = 50
};

C# Client Implementation

// Using official Typesense C# client
// Reference: https://github.com/typesense/typesense-dotnet
using Typesense;

public class TypesenseSearchService
{
    private readonly ITypesenseClient _client;

    public TypesenseSearchService(string apiKey, string[] nodes)
    {
        _client = new TypesenseClient(new Config
        {
            ApiKey = apiKey,
            Nodes = nodes.Select(n => new Node(n, "443", "https")).ToList()
        });
    }

    public async Task<SearchResult<T>> SearchAsync<T>(
        string collection, 
        SearchParameters parameters)
    {
        return await _client
            .Collections[collection]
            .Documents
            .Search<T>(parameters);
    }
}

Typesense Data Structure Redesign

Problem with Current Redis Approach

The current Redis implementation stores multiple matches per consultant in $.Matches[] arrays, leading to:

  1. Complex array queries requiring $.Matches[*].MatchId filtering
  2. Deduplication overhead when consultants match multiple iterations
  3. In-memory sorting of large result sets
  4. Inefficient pagination requiring full result retrieval

Recommended Typesense Approach: Flattened Match Documents

Instead of storing matches as nested arrays, create individual documents for each consultant-match combination:

New Data Structure

Collection: consultant_matches

{
  "name": "consultant_matches",
  "fields": [
    // Match-specific fields (primary keys)
    {"name": "id", "type": "string"}, // "{consultantId}_{matchId}_{iterationId}"
    {"name": "match_id", "type": "string", "index": true},
    {"name": "iteration_id", "type": "int32", "index": true},
    {"name": "matching_type", "type": "int32", "facet": true},
    {"name": "average_similarity", "type": "float", "sort": true},
    {"name": "creation_time", "type": "int64"},
    {"name": "skills_overlap", "type": "object", "index": false},

    // Consultant profile fields (denormalized)
    {"name": "consultant_id", "type": "string", "index": true},
    {"name": "entity_id", "type": "string", "index": true},
    {"name": "fullname", "type": "string", "facet": false},
    {"name": "territory", "type": "string[]", "facet": true},
    {"name": "office_location", "type": "string", "facet": true},
    {"name": "business_unit", "type": "string", "facet": true},
    {"name": "worked_with_7n", "type": "bool", "facet": true},
    {"name": "relation_status", "type": "string", "facet": true},
    {"name": "status", "type": "string", "facet": true},
    {"name": "availability_date", "type": "int64", "sort": true},
    {"name": "location", "type": "geopoint", "index": true},
    {"name": "address_country_id", "type": "string", "facet": true},

    // Full-text search field
    {"name": "profile_text", "type": "string", "facet": false}
  ],
  "default_sorting_field": "average_similarity"
}

Example Document

{
  "id": "12345_67890_101",
  "match_id": "67890",
  "iteration_id": 101,
  "matching_type": 0,
  "average_similarity": 0.85,
  "creation_time": 1704067200,
  "skills_overlap": {"Azure": 5, "C#": 8, "React": 3},

  "consultant_id": "12345",
  "entity_id": "abcd-1234-efgh-5678",
  "fullname": "John Doe",
  "territory": ["DK", "SE"],
  "office_location": "Copenhagen",
  "business_unit": "Technology",
  "worked_with_7n": true,
  "relation_status": "Active",
  "status": "Available",
  "availability_date": 1704067200,
  "location": [55.6761, 12.5683],
  "address_country_id": "DK",

  "profile_text": "Senior Full Stack Developer with expertise in Azure, C#, React..."
}
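The composite id is what guarantees one document per consultant-match-iteration triple. A helper for building and parsing it (a sketch; it assumes the parts themselves never contain an underscore, which holds for GUIDs and numeric ids):

```javascript
// Sketch: build/parse the "{consultantId}_{matchId}_{iterationId}" document id.
function buildMatchDocId(consultantId, matchId, iterationId) {
  return `${consultantId}_${matchId}_${iterationId}`;
}

function parseMatchDocId(id) {
  const [consultantId, matchId, iterationId] = id.split('_');
  return { consultantId, matchId, iterationId: Number(iterationId) };
}
```

Upserting with the same id is idempotent, so re-running a migration batch cannot create duplicate consultant-match documents.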

Migration Strategy

Phase 1: Create Typesense Collection and Populate Data

public class TypesenseMigrationService
{
    public async Task MigrateConsultantMatches()
    {
        // 1. Create collection schema
        await CreateConsultantMatchesCollection();

        // 2. Read all Redis consultant documents
        var consultants = await GetAllRedisConsultants();

        // 3. Flatten and insert into Typesense
        foreach (var consultant in consultants)
        {
            var documents = FlattenConsultantMatches(consultant);
            await _typesenseClient.Collections["consultant_matches"]
                .Documents.Import(documents, ImportType.Upsert);
        }
    }

    private List<ConsultantMatchDocument> FlattenConsultantMatches(ConsultantWrapper consultant)
    {
        var documents = new List<ConsultantMatchDocument>();

        foreach (var match in consultant.Matches)
        {
            documents.Add(new ConsultantMatchDocument
            {
                Id = $"{consultant.EntityId}_{match.MatchId}_{match.IterationId}",
                MatchId = match.MatchId.ToString(),
                IterationId = (int)match.IterationId,
                MatchingType = (int)match.MatchingType,
                AverageSimilarity = (float)match.AverageSimilarity,
                CreationTime = match.CreationTime,
                SkillsOverlap = match.SkillsOverlap,

                // Denormalized consultant fields
                ConsultantId = consultant.EntityId.ToString(),
                EntityId = consultant.EntityId.ToString(),
                Fullname = consultant.ContactDto.Fullname,
                Territory = consultant.Filters.Territory,
                OfficeLocation = consultant.Filters.OfficeLocation,
                BusinessUnit = consultant.Filters.BusinessUnit,
                WorkedWith7N = consultant.Filters.WorkedWith7N,
                RelationStatus = consultant.Filters.RelationStatus,
                Status = consultant.Filters.Status,
                AvailabilityDate = consultant.Filters.CalculatedAvailabilityDate,
                Location = consultant.Filters.MapGeoLocation,
                AddressCountryId = consultant.Filters.AddressCountryId,
                ProfileText = consultant.Flattened.AggregateRoot
            });
        }

        return documents;
    }
}

Phase 2: Replace GetProfiles Implementation

public class TypesenseAdapter : ITypesenseAdapter
{
    public async Task<List<MatchedCandidate>> GetProfiles(
        Guid matchId, uint iterationId, MatchingType matchingType, RedisFilteringOptions options)
    {
        var searchParameters = new SearchParameters
        {
            Q = string.IsNullOrEmpty(options.SearchString) ? "*" : options.SearchString,
            QueryBy = "profile_text,fullname",
            FilterBy = BuildTypesenseFilter(matchId, iterationId, matchingType, options),
            SortBy = "average_similarity:desc",
            Page = (options.Offset / options.Limit) + 1,
            PerPage = options.Limit,
            MaxFacetValues = 1000
        };

        // Add keyword search if specified
        if (options.Keywords.Any())
        {
            searchParameters.Q = string.Join(" ", options.Keywords);
            searchParameters.QueryBy = "profile_text";
        }

        var result = await _typesenseClient
            .Collections["consultant_matches"]
            .Documents
            .Search<ConsultantMatchDocument>(searchParameters);

        return result.Hits.Select(hit => new MatchedCandidate(
            MapToProfileLevel2(hit.Document),
            hit.Document.SkillsOverlap
        )).ToList();
    }

    private string BuildTypesenseFilter(
        Guid matchId, uint iterationId, MatchingType matchingType, RedisFilteringOptions options)
    {
        var filters = new List<string>
        {
            $"match_id:={matchId}",
            $"iteration_id:={iterationId}",
            $"matching_type:={(int)matchingType}"
        };

        if (options.WorkedWith7N.HasValue)
            filters.Add($"worked_with_7n:={options.WorkedWith7N.Value}");

        if (options.Statuses.Any())
            filters.Add($"status:=[{string.Join(",", options.Statuses.Select(s => $"\"{s}\""))}]");

        if (options.RelationStatuses.Any())
            filters.Add($"relation_status:=[{string.Join(",", options.RelationStatuses.Select(s => $"\"{s}\""))}]");

        if (options.AvailabilityRange != null)
            filters.Add($"availability_date:>={options.AvailabilityRange.Start.Value} && availability_date:<={options.AvailabilityRange.End.Value}");

        if (options.GeoLocations.Any())
        {
            var geoFilters = options.GeoLocations.Select(geo => 
                $"location:({geo.Latitude}, {geo.Longitude}, {geo.RadiusKm} km)");
            filters.Add($"({string.Join(" || ", geoFilters)})");
        }

        return string.Join(" && ", filters);
    }
}

Benefits of Flattened Approach

Performance Improvements

  1. Native Pagination: Typesense handles pagination server-side
  2. Efficient Sorting: Built-in sorting by average_similarity
  3. No Deduplication: Each document represents a unique consultant-match pair
  4. Faster Filtering: Direct field filtering instead of array queries

Query Simplification

// Old Redis Query
"@MatchId:{67890} @IterationId:[101 101] @MatchingType:[0 0] @FlattenedAggregateRoot:\"Azure\""

// New Typesense Query
{
  "q": "Azure",
  "query_by": "profile_text",
  "filter_by": "match_id:=67890 && iteration_id:=101 && matching_type:=0",
  "sort_by": "average_similarity:desc",
  "per_page": 50,
  "page": 1
}

Scalability Benefits

  1. Horizontal Scaling: Typesense can shard across multiple nodes
  2. Memory Efficiency: Not all data needs to be in memory
  3. Index Optimization: Purpose-built for search operations
  4. Concurrent Queries: Better handling of multiple simultaneous searches

Data Consistency Strategy

Dual-Write Pattern During Migration

public async Task UpdateConsultantMatch(ConsultantWrapper consultant, Match match)
{
    // Update Redis (existing)
    await _redisAdapter.UpdateConsultantMatch(consultant, match);

    // Update Typesense (new)
    var document = new ConsultantMatchDocument
    {
        Id = $"{consultant.EntityId}_{match.MatchId}_{match.IterationId}",
        // ... populate fields
    };

    await _typesenseClient
        .Collections["consultant_matches"]
        .Documents
        .Upsert(document.Id, document);
}

Alternative Collection Structure (Optional)

For even better query performance, consider separate collections:

  • consultant_profiles - static profile data
  • match_results - dynamic match results with consultant references

This approach reduces data duplication but requires join operations.
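That join would live in the BFF: query match_results, collect the referenced consultant ids, fetch those profiles, and merge. The merge step, sketched over stubbed result sets (field names are assumptions):

```javascript
// Sketch: client-side join of match rows with their consultant profiles.
// Matches whose profile is missing are dropped, mirroring the Redis-era
// `Where(m => m.Profile != null)` guard.
function joinMatchesWithProfiles(matches, profiles) {
  const byId = new Map(profiles.map(p => [p.id, p]));
  return matches
    .map(m => ({ ...m, profile: byId.get(m.consultant_id) }))
    .filter(m => m.profile !== undefined);
}
```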

Typesense vs Elasticsearch Comparison

Feature Comparison for Current Use Case

| Feature | Redis (Current) | Typesense | Elasticsearch | Winner for Use Case |
|---|---|---|---|---|
| Full-Text Search | Workarounds with flattened fields | Native with typo tolerance | Advanced with analyzers | Typesense (simpler, sufficient) |
| Faceted Search | Manual aggregation | Built-in with counts | Powerful aggregations | Typesense (easier setup) |
| Geo Search | Basic with RedisSearch | Native geo filtering | Advanced geo queries | Tie (both sufficient) |
| Pagination | Client-side (inefficient) | Server-side, efficient | Deep pagination challenges | Typesense (better for UI) |
| Document Size Limit | 512MB (Redis limit) | 4MB | No hard limit | All sufficient |
| Memory Usage | 100% in-memory | Memory-mapped files | JVM heap + filesystem cache | Typesense (more efficient) |
| Query Language | String-based, complex | Simple JSON | Complex JSON/DSL | Typesense (developer friendly) |
| Setup Complexity | Medium | Low | High | Typesense |
| Operational Cost | High (all RAM) | Low | Medium-High | Typesense |

Specific Use Case Analysis

1. Keyword Counting Within Documents

  • Typesense: Use faceted search with group_by for counting
  • Elasticsearch: Aggregations with terms buckets
  • Winner: Typesense (simpler for this use case)

2. 150k-200k Documents Scale

  • Typesense: Designed for this scale, single-node sufficient
  • Elasticsearch: Would work but overkill for this size
  • Winner: Typesense (right-sized solution)

3. 20-30GB Data Size

  • Typesense: Efficient storage with compression
  • Elasticsearch: Would require 2-3x RAM for performance
  • Winner: Typesense (lower resource requirements)

Why Typesense Over Elasticsearch

  1. Simplicity: Typesense requires minimal configuration vs Elasticsearch's complex cluster setup
  2. Cost: Single-node Typesense can handle your scale; Elasticsearch would need a cluster
  3. Developer Experience: Simple API, no query DSL learning curve
  4. Performance: Built-in caching and optimization for your document size
  5. Maintenance: No JVM tuning, shard management, or cluster state issues

When to Choose Elasticsearch Instead

You should consider Elasticsearch only if you need:

  • Complex aggregations beyond faceting
  • Custom analyzers for multiple languages
  • Machine learning features
  • Log analytics capabilities
  • Multi-tenant isolation

None of these appear necessary for the current use case.

Implementation Recommendations

1. Quick Wins

  • Start with search operations (highest pain point)
  • Implement faceted search for filtering
  • Enable typo tolerance for better UX

2. Architecture Guidelines

  • Keep Redis for real-time caching (short TTL items)
  • Use Typesense for all searchable data
  • Implement circuit breakers for failover
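The circuit-breaker guideline can be made concrete with a small wrapper around the search call; the thresholds below are placeholders, not recommendations:

```javascript
// Sketch: trip open after N consecutive failures; allow retries after a cooldown.
function createCircuitBreaker(fn, { maxFailures = 3, resetMs = 30000 } = {}) {
  let failures = 0;
  let openedAt = 0;
  return async (...args) => {
    // While open and inside the cooldown window, fail fast without calling fn.
    if (failures >= maxFailures && Date.now() - openedAt < resetMs) {
      throw new Error('circuit open');
    }
    try {
      const result = await fn(...args);
      failures = 0; // any success closes the circuit
      return result;
    } catch (err) {
      failures++;
      openedAt = Date.now();
      throw err;
    }
  };
}
```

The caller can catch the fast failure and serve a degraded response (or a Redis-backed fallback) instead of queueing requests behind a struggling Typesense node.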

3. Performance Optimization

  • Use Typesense's built-in caching
  • Implement search-as-you-type for instant results
  • Leverage facet counts for UI filters

4. Migration Checklist

  • [ ] Set up Typesense development instance
  • [ ] Create collection schemas
  • [ ] Implement adapter interface
  • [ ] Add monitoring and metrics
  • [ ] Create data migration scripts
  • [ ] Test search relevance
  • [ ] Implement fallback mechanism
  • [ ] Plan rollback strategy

Azure Cost Estimation

Typesense Deployment Requirements

For 20-30GB data with 150,000-200,000 documents:

  • VM Type: Standard D4s v5 (4 vCPUs, 16 GB RAM)
  • Storage: 128 GB Premium SSD (P10)
  • Backup: Azure Backup for VMs
  • Monitoring: Azure Monitor

Monthly Cost Breakdown

| Component | Specification | Monthly Cost (USD) |
|---|---|---|
| Compute | D4s v5 (4 vCPUs, 16 GB RAM) | $140 |
| Storage | 128 GB Premium SSD | $20 |
| Backup | Daily snapshots, 30-day retention | $25 |
| Bandwidth | 100 GB outbound | $9 |
| Azure Monitor | Basic metrics + logs | $15 |
| Total | | $209/month |

Alternative Configurations

High Performance Option (D8s v5)
  • 8 vCPUs, 32 GB RAM
  • Better for complex queries
  • Total: ~$350/month

Budget Option (B4ms)
  • 4 vCPUs, 16 GB RAM (burstable)
  • Good for development/staging
  • Total: ~$120/month

Cost Comparison with Current Redis

Current Redis Setup (Estimated)

  • Azure Cache for Redis Premium P3: 26 GB RAM
  • Monthly cost: ~$1,100/month
  • No persistence, pure in-memory

Typesense Advantages

  • 81% cost reduction ($209 vs $1,100 per month)
  • Persistent storage included
  • Better scalability with disk-based indices
  • Lower operational overhead

Additional Considerations

  1. High Availability Setup (Optional)
     • Add secondary node: +$209/month
     • Azure Load Balancer: +$25/month
     • Total HA setup: ~$443/month

  2. Managed Container Option
     • Azure Container Instances: ~$180/month
     • Simpler deployment, auto-scaling

  3. Development Environment
     • Use B2ms (2 vCPUs, 8 GB): $60/month
     • Sufficient for testing with a subset of data

ROI Calculation

  • Annual Savings: ($1,100 - $209) × 12 = $10,692/year
  • Migration Cost: ~160 hours × $150/hour = $24,000
  • Payback Period: 2.2 years
  • 5-Year TCO Savings: $29,460
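These figures follow directly from the monthly estimates above; the arithmetic checks out:

```javascript
// Verify the ROI arithmetic from the cost estimates above.
const redisMonthly = 1100;      // Azure Cache for Redis Premium P3 estimate
const typesenseMonthly = 209;   // D4s v5 Typesense setup estimate
const migrationCost = 160 * 150;                              // $24,000

const annualSavings = (redisMonthly - typesenseMonthly) * 12; // $10,692
const paybackYears = migrationCost / annualSavings;           // ≈ 2.2 years
const fiveYearSavings = annualSavings * 5 - migrationCost;    // $29,460
```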

Secure Typesense Access Patterns

Current Security Architecture Analysis

The AgentPortal BFF currently implements a comprehensive security model:

  • Azure AD B2C Authentication: OAuth 2.0 with scope-based validation (App.Access, Service)
  • Role-Based Authorization: Three-tier access (Reader, Contributor, Admin)
  • BFF Shielding Pattern: All external services accessed through BFF layer with no direct client exposure
  • Service-to-Service Authentication: Client credentials flow with token caching
  • User Context Propagation: On-behalf-of headers maintain user identity across service calls

Option 1: BFF-Mediated Access (Recommended)

Architecture

Client -> Azure AD B2C -> AgentPortal BFF -> Typesense
                                        -> Other APIs (CRM, Profile, etc.)

Implementation

public class TypesenseAdapter : ITypesenseAdapter
{
    private readonly ITypesenseClient _client;
    private readonly IUserPrincipalContextAccessor _userContext;

    public async Task<SearchResult<T>> SearchAsync<T>(
        string collection, 
        SearchParameters parameters)
    {
        // Apply user-context based filtering
        var userContext = _userContext.GetUserPrincipal();
        parameters.FilterBy = ApplyUserBasedFilters(parameters.FilterBy, userContext);

        // Execute search with proper error handling
        return await _client
            .Collections[collection]
            .Documents
            .Search<T>(parameters);
    }

    private string ApplyUserBasedFilters(string existingFilter, UserPrincipal user)
    {
        // Implement row-level security based on user role/territory
        var userFilters = new List<string>();

        if (user.Role == AppRoles.Reader)
        {
            userFilters.Add($"territory:=[{string.Join(",", user.AllowedTerritories)}]");
        }

        if (user.BusinessUnit != "All")
        {
            userFilters.Add($"business_unit:={user.BusinessUnit}");
        }

        return string.IsNullOrEmpty(existingFilter) 
            ? string.Join(" && ", userFilters)
            : $"({existingFilter}) && {string.Join(" && ", userFilters)}";
    }
}

Security Benefits

  • Consistent Authentication: Leverages existing Azure AD B2C integration
  • Row-Level Security: BFF applies user-context based filters before queries
  • API Gateway Pattern: Single point for security enforcement and monitoring
  • Audit Trail: All searches logged with user context
  • Rate Limiting: BFF can implement per-user rate limiting
  • Data Transformation: Sensitive fields filtered based on user role

Configuration Security

// In Program.cs - Typesense client configuration
services.AddSingleton<ITypesenseClient>(provider =>
{
    var configuration = provider.GetService<IConfiguration>();
    return new TypesenseClient(new Config
    {
        ApiKey = configuration["Typesense:AdminApiKey"], // Admin key for BFF
        Nodes = new List<Node>
        {
            new(configuration["Typesense:Host"], "443", "https")
        },
        ConnectionTimeoutSeconds = 10,
        HealthcheckIntervalSeconds = 60,
        NumRetries = 3
    });
});

Option 2: Direct Access with Authentication Layer

Architecture

Client -> Azure AD B2C -> Typesense Proxy/Gateway -> Typesense
                      -> AgentPortal BFF (for other operations)

Implementation Considerations

Custom Typesense Proxy Service

[ApiController]
[Route("api/search")]
public class TypesenseProxyController : ControllerBase
{
    private readonly ITypesenseClient _client;

    [HttpPost("consultants")]
    [Authorize]
    [RequiredScope("App.Access")] // Microsoft.Identity.Web scope enforcement
    public async Task<IActionResult> SearchConsultants([FromBody] SearchRequest request)
    {
        // Validate user permissions
        var userContext = HttpContext.GetUserPrincipal();
        if (!IsAuthorizedForSearch(userContext, request))
        {
            return Forbid();
        }

        // Apply security filters
        request.Parameters.FilterBy = ApplySecurityFilters(
            request.Parameters.FilterBy, 
            userContext);

        // Execute search
        var result = await _client
            .Collections["consultants"]
            .Documents
            .Search<ConsultantProfile>(request.Parameters);

        return Ok(result);
    }
}

Typesense API Key Strategy

  • Search-Only Keys: Create scoped API keys per user/role
  • Collection-Level Permissions: Restrict access to specific collections
  • Filter Presets: Pre-configure filters for user roles

// Typesense scoped key creation
const searchOnlyKey = await client.keys().create({
  "description": "Search-only key for Reader role",
  "actions": ["documents:search"],
  "collections": ["consultants", "sales_requests"],
  "filter_by": "territory:=[DK,SE,NO]", // Pre-applied filter
  "expires_at": Math.floor(Date.now() / 1000) + 86400 // 24 hours
});

Security Challenges with Direct Access

  1. API Key Management: Complex key rotation and distribution
  2. Filter Bypass: Risk of client-side filter manipulation
  3. Rate Limiting: Harder to implement per-user limits
  4. Audit Complexity: Distributed logging across services
  5. Data Leakage: Potential exposure of restricted fields

Security Comparison Matrix

| Aspect                    | BFF Shielding          | Direct Access        | Recommendation |
| ------------------------- | ---------------------- | -------------------- | -------------- |
| Authentication Complexity | Simple (existing)      | Complex (new proxy)  | BFF Shielding  |
| Row-Level Security        | Server-enforced        | Client-dependent     | BFF Shielding  |
| API Key Security          | Single admin key       | Multiple scoped keys | BFF Shielding  |
| Performance               | Additional hop         | Direct connection    | Direct Access  |
| Caching Strategy          | BFF-level caching      | Client-side only     | BFF Shielding  |
| Monitoring                | Centralized            | Distributed          | BFF Shielding  |
| Development Complexity    | Low (existing pattern) | High (new service)   | BFF Shielding  |

1. Maintain BFF Pattern with Enhanced Security

public class SecureTypesenseService
{
    public async Task<SearchResult<T>> SecureSearchAsync<T>(
        string collection,
        SearchParameters parameters,
        UserPrincipal user)
    {
        // 1. Validate user permissions for collection
        if (!IsAuthorizedForCollection(user, collection))
            throw new UnauthorizedAccessException();

        // 2. Apply row-level security filters
        parameters.FilterBy = ApplyRowLevelSecurity(parameters.FilterBy, user);

        // 3. Execute search
        var result = await _typesenseClient.SearchAsync<T>(collection, parameters);

        // 4. Log search for audit trail (before returning, or it never runs)
        await _auditLogger.LogSearchAsync(user.Id, collection, parameters);

        // 5. Filter sensitive fields based on user role
        return FilterSensitiveData(result, user.Role);
    }
}

2. Network Security

  • Private Networking: Deploy Typesense in Azure VNet with private endpoints
  • Firewall Rules: Restrict access to BFF subnet only
  • TLS Encryption: Force HTTPS for all communications

3. Data Security

  • Field-Level Encryption: Encrypt sensitive profile fields at rest
  • Backup Encryption: Azure Backup with customer-managed keys
  • Key Rotation: Automated API key rotation every 90 days

4. Monitoring and Compliance

  • Search Audit Logs: All searches logged with user context
  • Performance Monitoring: Query performance and rate limiting
  • GDPR Compliance: Data retention and deletion policies

Migration Security Checklist

  • [ ] Configure Typesense with admin API key in Azure Key Vault
  • [ ] Implement row-level security filters in BFF layer
  • [ ] Set up private networking and firewall rules
  • [ ] Create audit logging for all search operations
  • [ ] Implement field-level access control
  • [ ] Configure automated backup with encryption
  • [ ] Set up monitoring and alerting
  • [ ] Document security architecture changes
  • [ ] Conduct security review with stakeholders
  • [ ] Plan security testing and penetration testing

Conclusion

Typesense is the recommended solution for replacing Redis in this use case. It provides:

  1. Native full-text search eliminating current workarounds
  2. Efficient pagination removing client-side processing
  3. Built-in faceting for keyword counting and filtering
  4. 83% lower operational costs on Azure infrastructure
  5. Simpler implementation compared to Elasticsearch
  6. Secure BFF pattern maintaining existing authentication architecture
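The first three points translate into a single search request. A sketch of the parameters, with assumed field names (`fullname`, `skills`) standing in for the real schema:

```javascript
// Hypothetical Typesense search parameters combining full-text search,
// server-side pagination and faceted keyword counts in one request.
const searchParams = {
  q: 'azure consultant',
  query_by: 'fullname,skills',          // full-text fields (assumed names)
  facet_by: 'territory,business_unit',  // returns per-value counts for filtering UIs
  page: 2,                              // server-side pagination...
  per_page: 25,                         // ...instead of fetching 500+ items in-memory
};

// With the typesense-js client this would be sent as:
//   await client.collections('consultants').documents().search(searchParams);
```

Each of these capabilities currently requires a workaround in the Redis implementation; in Typesense they are properties of one request.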

Using the parallel implementation strategy outlined above, the migration can be completed in four weeks with minimal risk, delivering significant long-term cost savings while maintaining enterprise-grade security.