Redis to Typesense Migration Report¶
Executive Summary¶
This report analyzes the current Redis implementation in the AgentPortal BFF application and evaluates migrating to Typesense. With projected data growth to 20-30GB and 150,000-200,000 documents requiring full-text search, pagination, and keyword counting capabilities, the current Redis setup shows significant strain. This analysis provides a comprehensive migration strategy and compares Typesense with Elasticsearch to ensure the best technology choice.
Current Redis Usage Analysis¶
Data Volume and Structure¶
The application currently stores multiple entity types in Redis using RedisJSON:
| Entity Type | Key Pattern | Typical Document Size | TTL |
|---|---|---|---|
| Consultant Profiles | BFF:consultant:{entityId} | 5-10 KB | None |
| Sales Requests | BFF:request:{matchId} | 3-5 KB | 5 minutes |
| Match Results | BFF:r2cresult:{matchId}:{iterationId} | 10-15 KB | 48 hours |
| Declarations of Interest | BFF:dois:{entityId} | 1-2 KB | None |
| Map Locations | BFF:mapLocation:{locationId} | 500 bytes | None |
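For context, a minimal sketch of how one of these entries might be written today, assuming the NRedisStack client that the search code later in this report implies (JSON().SetAsync plus a key TTL). The key pattern and TTL follow the Sales Requests row above; the class and method names are illustrative.
// Sketch only: write a RedisJSON document and apply the 5-minute TTL from the table above.
using System;
using System.Threading.Tasks;
using NRedisStack.RedisStackCommands;
using StackExchange.Redis;
public class RedisWriteExample
{
    private readonly IDatabase _db;
    public RedisWriteExample(IConnectionMultiplexer mux) => _db = mux.GetDatabase();
    public async Task StoreSalesRequestAsync(Guid matchId, object request)
    {
        var key = $"BFF:request:{matchId}";                      // Key pattern from the table
        await _db.JSON().SetAsync(key, "$", request);            // RedisJSON document write
        await _db.KeyExpireAsync(key, TimeSpan.FromMinutes(5));  // 5-minute TTL per the table
    }
}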
Redis Operations and Patterns¶
1. Basic Key Operations¶
// Current implementation in RedisAdapter.cs
KeyExists(key)
KeyDelete(keys)
KeyExpire(key, expiry)
KeyDeleteAsync(keys)
2. JSON Document Operations¶
// Complex object storage
JSON.SET(key, "$", document)
JSON.GET(key, paths)
JSON.ARRAPPEND(key, path, values)
JSON.TYPE(key, path)
3. Search Operations (RedisSearch)¶
// Index creation
FT.CREATE idx-bff-consultant ON JSON PREFIX 1 BFF:consultant: SCHEMA
$.Fullname AS Fullname TEXT
$.Territory AS Territory TAG
$.OfficeLocation AS OfficeLocation TAG
$.BusinessUnit AS BusinessUnit TAG
$.AvailableFromDate AS AvailableFromDate NUMERIC
$.MapGeoLocation AS MapGeoLocation GEO
// Search queries
FT.SEARCH idx-bff-consultant "@MatchId:{123} @Status:{Lead | 7N Consultant}"
Current Pain Points¶
- Complex Query Building: Extensive string manipulation required for search queries
- Manual Pagination: Fetching all results (500+ items) then paginating in-memory
- No Native Full-Text Search: Using workarounds with flattened text fields
- Performance Issues: Large result sets causing memory pressure
- Limited Analytics: No native support for counting keyword occurrences
- Error-Prone Operations: Frequent try-catch blocks for Redis operations
Drop-in Replacement Strategy¶
Phase 1: Parallel Implementation (Weeks 1-2)¶
1. Install Typesense alongside Redis
// New ITypesenseAdapter interface matching IRedisAdapter
public interface ITypesenseAdapter
{
    Task<T> GetDocumentAsync<T>(string collection, string id);
    Task UpsertDocumentAsync<T>(string collection, string id, T document);
    Task<SearchResult<T>> SearchAsync<T>(string collection, SearchParameters parameters);
}
2. Create Adapter Layer
// Wrapper to maintain existing interfaces
public class TypesenseRedisAdapter : IRedisAdapter
{
    private readonly ITypesenseAdapter _typesense;
    private readonly IRedisAdapter _redis;
    private readonly bool _useTypesenseForSearch;
    // Gradually migrate operations
}
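Because the wrapper keeps the existing IRedisAdapter surface, callers need no changes while traffic shifts. A possible DI registration for this parallel phase is sketched below; the constructor shape and the Features:UseTypesenseForSearch configuration key are assumptions for illustration, not existing code.
// Sketch: expose the wrapper as IRedisAdapter so existing consumers are untouched,
// and let a configuration flag decide which backend serves search operations.
services.AddSingleton<RedisAdapter>();
services.AddSingleton<ITypesenseAdapter, TypesenseAdapter>();
services.AddSingleton<IRedisAdapter>(sp => new TypesenseRedisAdapter(
    sp.GetRequiredService<ITypesenseAdapter>(),
    sp.GetRequiredService<RedisAdapter>(),
    useTypesenseForSearch: sp.GetRequiredService<IConfiguration>()
        .GetValue<bool>("Features:UseTypesenseForSearch")));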
Phase 2: Data Migration (Week 3)¶
1. Dual Write Strategy
- Write to both Redis and Typesense
- Read from Redis (primary) with Typesense validation
- Monitor data consistency (see the validation sketch after this list)
2. Collection Schema Definition
{
  "name": "consultants",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "fullname", "type": "string", "facet": false},
    {"name": "territory", "type": "string[]", "facet": true},
    {"name": "office_location", "type": "string", "facet": true},
    {"name": "business_unit", "type": "string", "facet": true},
    {"name": "available_from", "type": "int64"},
    {"name": "location", "type": "geopoint"},
    {"name": "profile_text", "type": "string", "facet": false}
  ],
  "default_sorting_field": "available_from"
}
Phase 3: Cutover (Week 4)¶
- Switch read operations to Typesense
- Remove Redis write operations
- Decommission Redis indexes
Current Redis Implementation Analysis¶
Redis Index Structure and Profile Combination Logic¶
The current implementation uses a sophisticated Redis index idx-bff-consultant that combines consultant profiles with match iterations:
Index Schema¶
FT.CREATE idx-bff-consultant ON JSON
PREFIX 1 BFF:consultant:
SCHEMA
$.ContactDto.Fullname AS Fullname TEXT NOSTEM
$.ProfileLevel2.MatchingType AS ProfileMatchingType NUMERIC
$.EntityId AS EntityId TAG
$.Filters.Territory AS Territory TAG
$.Filters.OfficeLocation AS OfficeLocation TAG
$.Filters.BusinessUnit AS BusinessUnit TAG
$.Filters.WorkedWith7N AS WorkedWith7N TAG
$.Filters.RelationStatus AS RelationStatus TAG
$.Filters.Status AS Status TAG
$.Filters.CalculatedAvailabilityDate AS CalculatedAvailabilityDate NUMERIC
$.Filters.MapLocationId AS MapLocationId TAG
$.Filters.MapGeoLocation AS MapGeoLocation GEO
$.Matches[*].MatchId AS MatchId TAG
$.Matches[*].IterationId AS IterationId NUMERIC
$.Matches[*].AverageSimilarity AS AverageSimilarity NUMERIC SORTABLE
$.Matches[*].MatchingType AS MatchingType NUMERIC
$.Flattened.AggregateRoot AS FlattenedAggregateRoot TEXT
Data Structure Analysis¶
ConsultantWrapper (Redis Document)
public class ConsultantWrapper
{
public Guid EntityId { get; set; }
public FreelancerAggregateRoot FreelancerAggregateRoot { get; set; }
public Flattened Flattened { get; set; } = new Flattened();
public ContactDto ContactDto { get; set; }
public Filters Filters { get; set; } = new();
public ProfileLevel2 ProfileLevel2 { get; set; }
public List<Match> Matches = []; // Key: Multiple matches per consultant
}
Match Structure
public class Match
{
public uint IterationId { get; set; }
public Guid MatchId { get; set; }
public double AverageSimilarity { get; set; }
public long CreationTime { get; set; }
public int TTL { get; set; }
public MatchingType MatchingType { get; set; }
public Dictionary<string, int>? SkillsOverlap { get; set; }
}
GetProfiles Method Implementation¶
The GetProfiles(matchId, iterationId, matchingType, options) method implements complex profile combination logic:
Step 1: Query Construction¶
private Query BuildQuery(RedisFilteringOptions options, Guid matchId, uint iterationId, MatchingType matchingType)
{
var query = "@MatchId:{$mId} @IterationId:[$itId $itId] @MatchingType:[$mType $mType]";
// Add keyword search
if (options.Keywords.Any())
{
foreach (var keyword in options.Keywords)
{
query += $"@FlattenedAggregateRoot:\"{EscapeRedisText(keyword)}\" ";
}
}
// Add additional filters (status, territory, availability, etc.)
// ... filtering logic
return new Query(query)
.AddParam("mId", matchId.ToString())
.AddParam("itId", iterationId.ToString())
.AddParam("mType", (int)matchingType);
}
Step 2: Paginated Retrieval with Deduplication¶
private async Task<List<MatchedCandidate>> GetProfiles(
Guid matchId, uint iterationId, MatchingType matchingType, RedisFilteringOptions options)
{
List<ProfileAndMatches> all = [];
var offset = 0;
var pagesize = 500; // Fixed batch size
var hashes = new HashSet<Guid>(); // Deduplication
// Initial query
var q = BuildQuery(options, matchId, iterationId, matchingType);
var (profiles, totalResults) = await SearchQueryAsync(q, matchId, iterationId, offset, pagesize);
// Add to deduplication set
profiles.ForEach(p => {
hashes.Add(p.Profile!.Id);
all.Add(p);
});
// Paginate through all results
while (totalResults > (offset + pagesize))
{
offset += pagesize;
(profiles, totalResults) = await SearchQueryAsync(q, matchId, iterationId, offset, pagesize);
// Skip duplicates
profiles.ForEach(p => {
if (hashes.Contains(p.Profile!.Id)) return;
hashes.Add(p.Profile!.Id);
all.Add(p);
});
}
// Final processing and sorting
return all
.Where(m => m.Match != null && m.Profile != null)
.Where(m => m.Match!.MatchingType == matchingType)
.OrderByDescending(m => m.Match!.AverageSimilarity)
.Select(m => new MatchedCandidate(m.Profile!, m.Match!.SkillsOverlap))
.ToList();
}
Step 3: Profile-Match Combination¶
private async Task<(List<ProfileAndMatches>, long)> SearchQueryAsync(
Query q, Guid matchId, uint iterationId, int offset, int limit)
{
var d = await ft.SearchAsync(IndexConsultantMatchAndIterationId, q.Limit(offset, limit));
var profiles = d.Documents.Select(doc =>
new ProfileAndMatches()
{
Profile = JsonSerializer.Deserialize<List<ProfileLevel2>>(doc["$.ProfileLevel2"])
.FirstOrDefault(),
Match = JsonSerializer.Deserialize<List<List<Match>>>(doc["$.Matches"])
.SelectMany(l => l)
.FirstOrDefault(m => m.MatchId == matchId && m.IterationId == iterationId)
}).Where(p => p.Profile != null).DistinctBy(p => p.Profile?.Id).ToList();
return (profiles, d.TotalResults);
}
Keyword Search Implementation¶
Flattened Aggregate Root Construction¶
The FlattenedAggregateRoot field contains searchable text extracted from:
- CV Title and Name
- Profile Summary
- All position titles, customers, descriptions
- All skills (category names and skill names)
- All project titles, roles, descriptions
- Project account names and project names
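A hedged sketch of how that flattened text could be assembled from the fields listed above. The property names on FreelancerAggregateRoot are assumptions (the report only lists the source fields), and null handling is omitted for brevity.
// Illustration only: concatenate the searchable fields into one text blob for indexing.
private static string BuildFlattenedAggregateRoot(FreelancerAggregateRoot root)
{
    var sb = new StringBuilder();
    sb.AppendLine(root.Cv?.Title).AppendLine(root.Cv?.Name);
    sb.AppendLine(root.Cv?.ProfileSummary);
    foreach (var position in root.Positions)
        sb.AppendLine($"{position.Title} {position.Customer} {position.Description}");
    foreach (var skill in root.Skills)
        sb.AppendLine($"{skill.CategoryName} {skill.Name}");
    foreach (var project in root.Projects)
        sb.AppendLine($"{project.Title} {project.Role} {project.Description} {project.AccountName} {project.ProjectName}");
    return sb.ToString();
}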
Keyword Query Building¶
private string BuildKeywordsQuery(RedisFilteringOptions options, string query)
{
if (options.Keywords.Any())
{
foreach (var keyword in options.Keywords)
{
query += $"@FlattenedAggregateRoot:\"{EscapeRedisText(keyword)}\" ";
}
// Generates: @FlattenedAggregateRoot:"Azure" @FlattenedAggregateRoot:"AWS"
}
return query;
}
Current Pain Points in Profile Combination¶
- In-Memory Deduplication: HashSet used to prevent duplicate profiles across pages
- Full Result Set Retrieval: Must fetch all matches to apply proper sorting
- Complex Array Filtering: Searching within $.Matches[*] arrays with multiple conditions
- Manual Pagination: Client-side pagination after fetching all results
- Performance Degradation: 500+ item batches cause memory pressure
Redis to Typesense Operation Mapping¶
Key-Value Operations¶
| Redis Operation | Typesense Equivalent | Notes |
|---|---|---|
| JSON.SET key $ doc | documents.upsert() | Automatic indexing |
| JSON.GET key $ | documents.retrieve() | Direct document fetch |
| KeyExists | documents.retrieve() with error handling | Check 404 response |
| KeyDelete | documents.delete() | Immediate removal |
| KeyExpire | Not needed | Use document timestamp field |
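As an example of the KeyExists row, a sketch using the ITypesenseAdapter interface from Phase 1: existence is inferred from a retrieve call, treating a not-found response (HTTP 404 from Typesense) as "does not exist". The exact not-found behavior (exception vs null) depends on the client wrapper, so the error handling here is an assumption.
public async Task<bool> DocumentExistsAsync<T>(string collection, string id)
{
    try
    {
        var document = await _typesense.GetDocumentAsync<T>(collection, id);
        return document is not null;
    }
    catch (Exception) // e.g. the client's not-found exception for a 404 response;
    {                 // a production version would distinguish 404 from other failures
        return false;
    }
}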
Search Operations¶
| Redis Pattern | Typesense Implementation |
|---|---|
| @Fullname:John* | q=John&query_by=fullname&prefix=true |
| @Territory:{DK\|PH} | filter_by=territory:=[DK,PH] |
| @AvailableFromDate:[0 $date] | filter_by=available_from:<=$date |
| @MapGeoLocation:[$lng $lat $dist km] | filter_by=location:($lat,$lng,$dist km) |
| LIMIT $offset $limit | page=$page&per_page=$limit |
Complex Query Example¶
Redis Query:
FT.SEARCH idx-bff-consultant
"@MatchId:{123} @IterationId:[456 456] @Status:{Lead | 7N Consultant}"
LIMIT 0 100
Typesense Query:
{
"searches": [{
"collection": "consultants",
"q": "*",
"filter_by": "match_id:=123 && iteration_id:=456 && status:=[Lead,7N Consultant]",
"per_page": 100,
"page": 1
}]
}
Typesense Query Implementation Guide¶
Documentation References¶
- Search API Reference: https://typesense.org/docs/0.25.0/api/search.html
- Filtering Documentation: https://typesense.org/docs/0.25.0/api/search.html#filter-parameters
- Geo Search Guide: https://typesense.org/docs/0.25.0/api/geosearch.html
- Faceted Search: https://typesense.org/docs/0.25.0/api/search.html#facet-results
- Pagination Guide: https://typesense.org/docs/0.25.0/api/search.html#pagination
- Multi-Search: https://typesense.org/docs/0.25.0/api/federated-multi-search.html
Query Implementation Examples¶
1. Full-Text Search with Typo Tolerance¶
// Reference: https://typesense.org/docs/0.25.0/api/search.html#typo-tolerance
var searchParameters = new SearchParameters
{
Q = "John Doe", // Search query
QueryBy = "fullname,profile_text", // Fields to search
TypoTokensThreshold = 2, // Look for typo-corrected variants only if fewer results than this are found
NumTypos = 2, // Number of typos to tolerate
Prefix = true // Enable prefix search
};
2. Filtering and Faceting¶
// Reference: https://typesense.org/docs/0.25.0/api/search.html#filter-parameters
var searchParameters = new SearchParameters
{
Q = "*", // Wildcard to return all documents
FilterBy = "territory:=[DK,SE,NO] && business_unit:=Technology && available_from:<=1704067200",
FacetBy = "territory,office_location,business_unit,status",
MaxFacetValues = 100
};
3. Geo-Location Search¶
// Reference: https://typesense.org/docs/0.25.0/api/geosearch.html
var searchParameters = new SearchParameters
{
Q = "*",
FilterBy = "location:(55.6761, 12.5683, 50 km)", // Copenhagen center, 50km radius
SortBy = "_geo_distance(location: 55.6761, 12.5683):asc"
};
4. Pagination with Sorting¶
// Reference: https://typesense.org/docs/0.25.0/api/search.html#ranking-and-sorting
var searchParameters = new SearchParameters
{
Q = "*",
Page = 2,
PerPage = 50,
SortBy = "available_from:asc,_text_match:desc" // Sort by date, then relevance
};
5. Keyword Counting with Group By¶
// Reference: https://typesense.org/docs/0.25.0/api/search.html#grouping
var searchParameters = new SearchParameters
{
Q = "cloud computing",
QueryBy = "profile_text",
GroupBy = "business_unit",
GroupLimit = 3, // Top 3 results per group
FacetBy = "skills", // Count occurrences of skills
MaxFacetValues = 50
};
C# Client Implementation¶
// Using official Typesense C# client
// Reference: https://github.com/typesense/typesense-dotnet
using Typesense;
public class TypesenseSearchService
{
private readonly ITypesenseClient _client;
public TypesenseSearchService(string apiKey, string[] nodes)
{
_client = new TypesenseClient(new Config
{
ApiKey = apiKey,
Nodes = nodes.Select(n => new Node(n, "443", "https")).ToList()
});
}
public async Task<SearchResult<T>> SearchAsync<T>(
string collection,
SearchParameters parameters)
{
return await _client
.Collections[collection]
.Documents
.Search<T>(parameters);
}
}
Typesense Data Structure Redesign¶
Problem with Current Redis Approach¶
The current Redis implementation stores multiple matches per consultant in $.Matches[] arrays, leading to:
- Complex Array Queries: Filtering requires $.Matches[*].MatchId expressions
- Deduplication Overhead: Consultants matching multiple iterations must be deduplicated
- In-Memory Sorting: Large result sets are sorted client-side
- Inefficient Pagination: The full result set must be retrieved before paginating
Recommended Typesense Approach: Flattened Match Documents¶
Instead of storing matches as nested arrays, create individual documents for each consultant-match combination:
New Data Structure¶
Collection: consultant_matches
{
"name": "consultant_matches",
"fields": [
// Match-specific fields (primary keys)
{"name": "id", "type": "string"}, // "{consultantId}_{matchId}_{iterationId}"
{"name": "match_id", "type": "string", "index": true},
{"name": "iteration_id", "type": "int32", "index": true},
{"name": "matching_type", "type": "int32", "facet": true},
{"name": "average_similarity", "type": "float", "sort": true},
{"name": "creation_time", "type": "int64"},
{"name": "skills_overlap", "type": "object", "index": false},
// Consultant profile fields (denormalized)
{"name": "consultant_id", "type": "string", "index": true},
{"name": "entity_id", "type": "string", "index": true},
{"name": "fullname", "type": "string", "facet": false},
{"name": "territory", "type": "string[]", "facet": true},
{"name": "office_location", "type": "string", "facet": true},
{"name": "business_unit", "type": "string", "facet": true},
{"name": "worked_with_7n", "type": "bool", "facet": true},
{"name": "relation_status", "type": "string", "facet": true},
{"name": "status", "type": "string", "facet": true},
{"name": "availability_date", "type": "int64", "sort": true},
{"name": "location", "type": "geopoint", "index": true},
{"name": "address_country_id", "type": "string", "facet": true},
// Full-text search field
{"name": "profile_text", "type": "string", "facet": false}
],
"default_sorting_field": "average_similarity"
}
Example Document¶
{
"id": "12345_67890_101",
"match_id": "67890",
"iteration_id": 101,
"matching_type": 0,
"average_similarity": 0.85,
"creation_time": 1704067200,
"skills_overlap": {"Azure": 5, "C#": 8, "React": 3},
"consultant_id": "12345",
"entity_id": "abcd-1234-efgh-5678",
"fullname": "John Doe",
"territory": ["DK", "SE"],
"office_location": "Copenhagen",
"business_unit": "Technology",
"worked_with_7n": true,
"relation_status": "Active",
"status": "Available",
"availability_date": 1704067200,
"location": [55.6761, 12.5683],
"address_country_id": "DK",
"profile_text": "Senior Full Stack Developer with expertise in Azure, C#, React..."
}
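The migration and adapter code below reference a ConsultantMatchDocument type that is not defined elsewhere in this report. The sketch below mirrors the consultant_matches schema and example document above; the System.Text.Json attribute mapping to the snake_case field names is an assumption.
// Sketch of the document POCO assumed by the migration and adapter code that follows.
using System.Text.Json.Serialization;
public class ConsultantMatchDocument
{
    [JsonPropertyName("id")] public string Id { get; set; } = default!;
    [JsonPropertyName("match_id")] public string MatchId { get; set; } = default!;
    [JsonPropertyName("iteration_id")] public int IterationId { get; set; }
    [JsonPropertyName("matching_type")] public int MatchingType { get; set; }
    [JsonPropertyName("average_similarity")] public float AverageSimilarity { get; set; }
    [JsonPropertyName("creation_time")] public long CreationTime { get; set; }
    [JsonPropertyName("skills_overlap")] public Dictionary<string, int>? SkillsOverlap { get; set; }
    [JsonPropertyName("consultant_id")] public string ConsultantId { get; set; } = default!;
    [JsonPropertyName("entity_id")] public string EntityId { get; set; } = default!;
    [JsonPropertyName("fullname")] public string Fullname { get; set; } = default!;
    [JsonPropertyName("territory")] public string[] Territory { get; set; } = [];
    [JsonPropertyName("office_location")] public string OfficeLocation { get; set; } = default!;
    [JsonPropertyName("business_unit")] public string BusinessUnit { get; set; } = default!;
    [JsonPropertyName("worked_with_7n")] public bool WorkedWith7N { get; set; }
    [JsonPropertyName("relation_status")] public string RelationStatus { get; set; } = default!;
    [JsonPropertyName("status")] public string Status { get; set; } = default!;
    [JsonPropertyName("availability_date")] public long AvailabilityDate { get; set; }
    [JsonPropertyName("location")] public double[] Location { get; set; } = []; // [lat, lng] geopoint
    [JsonPropertyName("address_country_id")] public string AddressCountryId { get; set; } = default!;
    [JsonPropertyName("profile_text")] public string ProfileText { get; set; } = default!;
}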
Migration Strategy¶
Phase 1: Create Typesense Collection and Populate Data¶
public class TypesenseMigrationService
{
public async Task MigrateConsultantMatches()
{
// 1. Create collection schema
await CreateConsultantMatchesCollection();
// 2. Read all Redis consultant documents
var consultants = await GetAllRedisConsultants();
// 3. Flatten and insert into Typesense
foreach (var consultant in consultants)
{
var documents = FlattenConsultantMatches(consultant);
await _typesenseClient.Collections["consultant_matches"]
.Documents.Import(documents, ImportType.Upsert);
}
}
private List<ConsultantMatchDocument> FlattenConsultantMatches(ConsultantWrapper consultant)
{
var documents = new List<ConsultantMatchDocument>();
foreach (var match in consultant.Matches)
{
documents.Add(new ConsultantMatchDocument
{
Id = $"{consultant.EntityId}_{match.MatchId}_{match.IterationId}",
MatchId = match.MatchId.ToString(),
IterationId = (int)match.IterationId,
MatchingType = (int)match.MatchingType,
AverageSimilarity = (float)match.AverageSimilarity,
CreationTime = match.CreationTime,
SkillsOverlap = match.SkillsOverlap,
// Denormalized consultant fields
ConsultantId = consultant.EntityId.ToString(),
EntityId = consultant.EntityId.ToString(),
Fullname = consultant.ContactDto.Fullname,
Territory = consultant.Filters.Territory,
OfficeLocation = consultant.Filters.OfficeLocation,
BusinessUnit = consultant.Filters.BusinessUnit,
WorkedWith7N = consultant.Filters.WorkedWith7N,
RelationStatus = consultant.Filters.RelationStatus,
Status = consultant.Filters.Status,
AvailabilityDate = consultant.Filters.CalculatedAvailabilityDate,
Location = consultant.Filters.MapGeoLocation,
AddressCountryId = consultant.Filters.AddressCountryId,
ProfileText = consultant.Flattened.AggregateRoot
});
}
return documents;
}
}
Phase 2: Replace GetProfiles Implementation¶
public class TypesenseAdapter : ITypesenseAdapter
{
public async Task<List<MatchedCandidate>> GetProfiles(
Guid matchId, uint iterationId, MatchingType matchingType, RedisFilteringOptions options)
{
var searchParameters = new SearchParameters
{
Q = string.IsNullOrEmpty(options.SearchString) ? "*" : options.SearchString,
QueryBy = "profile_text,fullname",
FilterBy = BuildTypesenseFilter(matchId, iterationId, matchingType, options),
SortBy = "average_similarity:desc",
Page = (options.Offset / options.Limit) + 1,
PerPage = options.Limit,
MaxFacetValues = 1000
};
// Add keyword search if specified
if (options.Keywords.Any())
{
searchParameters.Q = string.Join(" ", options.Keywords);
searchParameters.QueryBy = "profile_text";
}
var result = await _typesenseClient
.Collections["consultant_matches"]
.Documents
.Search<ConsultantMatchDocument>(searchParameters);
return result.Hits.Select(hit => new MatchedCandidate(
MapToProfileLevel2(hit.Document),
hit.Document.SkillsOverlap
)).ToList();
}
private string BuildTypesenseFilter(
Guid matchId, uint iterationId, MatchingType matchingType, RedisFilteringOptions options)
{
var filters = new List<string>
{
$"match_id:={matchId}",
$"iteration_id:={iterationId}",
$"matching_type:={(int)matchingType}"
};
if (options.WorkedWith7N.HasValue)
filters.Add($"worked_with_7n:={options.WorkedWith7N.Value}");
if (options.Statuses.Any())
filters.Add($"status:=[{string.Join(",", options.Statuses.Select(s => $"\"{s}\""))}]");
if (options.RelationStatuses.Any())
filters.Add($"relation_status:=[{string.Join(",", options.RelationStatuses.Select(s => $"\"{s}\""))}]");
if (options.AvailabilityRange != null)
filters.Add($"availability_date:>={options.AvailabilityRange.Start.Value} && availability_date:<={options.AvailabilityRange.End.Value}");
if (options.GeoLocations.Any())
{
var geoFilters = options.GeoLocations.Select(geo =>
$"location:({geo.Latitude}, {geo.Longitude}, {geo.RadiusKm} km)");
filters.Add($"({string.Join(" || ", geoFilters)})");
}
return string.Join(" && ", filters);
}
}
Benefits of Flattened Approach¶
Performance Improvements¶
- Native Pagination: Typesense handles pagination server-side
- Efficient Sorting: Built-in sorting by average_similarity
- No Deduplication: Each document represents a unique consultant-match pair
- Faster Filtering: Direct field filtering instead of array queries
Query Simplification¶
// Old Redis Query
"@MatchId:{67890} @IterationId:[101 101] @MatchingType:[0 0] @FlattenedAggregateRoot:\"Azure\""
// New Typesense Query
{
"q": "Azure",
"query_by": "profile_text",
"filter_by": "match_id:=67890 && iteration_id:=101 && matching_type:=0",
"sort_by": "average_similarity:desc",
"per_page": 50,
"page": 1
}
Scalability Benefits¶
- Horizontal Scaling: Typesense can shard across multiple nodes
- Memory Efficiency: Not all data needs to be in memory
- Index Optimization: Purpose-built for search operations
- Concurrent Queries: Better handling of multiple simultaneous searches
Data Consistency Strategy¶
Dual-Write Pattern During Migration¶
public async Task UpdateConsultantMatch(ConsultantWrapper consultant, Match match)
{
// Update Redis (existing)
await _redisAdapter.UpdateConsultantMatch(consultant, match);
// Update Typesense (new)
var document = new ConsultantMatchDocument
{
Id = $"{consultant.EntityId}_{match.MatchId}_{match.IterationId}",
// ... populate fields
};
await _typesenseClient
.Collections["consultant_matches"]
.Documents
.Upsert(document.Id, document);
}
Alternative Collection Structure (Optional)¶
For even better query performance, consider separate collections:
- consultant_profiles: Static profile data
- match_results: Dynamic match results with consultant references
This approach reduces data duplication but requires application-side joins across the two collections (see the sketch below).
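Since the Typesense version referenced in this report (0.25) has no server-side joins, the join would happen in the BFF: query match_results first, then fetch the referenced profiles. The sketch below reuses the ITypesenseAdapter interface from Phase 1 and the SearchParameters/Hits shapes used elsewhere in this report; MatchResultDocument and ConsultantProfileDocument are hypothetical types.
public async Task<List<MatchedCandidate>> GetProfilesViaJoinAsync(Guid matchId, uint iterationId)
{
    // 1. Search the lightweight match_results collection
    var matches = await _typesense.SearchAsync<MatchResultDocument>("match_results", new SearchParameters
    {
        Q = "*",
        FilterBy = $"match_id:={matchId} && iteration_id:={iterationId}",
        SortBy = "average_similarity:desc",
        PerPage = 50
    });
    // 2. Fetch the referenced static profiles and combine (application-side join)
    var candidates = new List<MatchedCandidate>();
    foreach (var hit in matches.Hits)
    {
        var profile = await _typesense.GetDocumentAsync<ConsultantProfileDocument>(
            "consultant_profiles", hit.Document.ConsultantId);
        // Mapping helper analogous to MapToProfileLevel2 used earlier in this report
        candidates.Add(new MatchedCandidate(MapToProfileLevel2(profile), hit.Document.SkillsOverlap));
    }
    return candidates;
}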
Typesense vs Elasticsearch Comparison¶
Feature Comparison for Current Use Case¶
| Feature | Redis (Current) | Typesense | Elasticsearch | Winner for Use Case |
|---|---|---|---|---|
| Full-Text Search | Workarounds with flattened fields | Native with typo tolerance | Advanced with analyzers | Typesense (simpler, sufficient) |
| Faceted Search | Manual aggregation | Built-in with counts | Powerful aggregations | Typesense (easier setup) |
| Geo Search | Basic with RedisSearch | Native geo filtering | Advanced geo queries | Tie (both sufficient) |
| Pagination | Client-side (inefficient) | Server-side, efficient | Deep pagination challenges | Typesense (better for UI) |
| Document Size Limit | 512MB (Redis limit) | 4MB | No hard limit | All sufficient |
| Memory Usage | 100% in-memory | Memory-mapped files | JVM heap + filesystem cache | Typesense (more efficient) |
| Query Language | String-based, complex | Simple JSON | Complex JSON/DSL | Typesense (developer friendly) |
| Setup Complexity | Medium | Low | High | Typesense |
| Operational Cost | High (all RAM) | Low | Medium-High | Typesense |
Specific Use Case Analysis¶
1. Keyword Counting Within Documents¶
- Typesense: Use faceted search with group_by for counting
- Elasticsearch: Aggregations with terms buckets
- Winner: Typesense (simpler for this use case)
2. 150k-200k Documents Scale¶
- Typesense: Designed for this scale, single-node sufficient
- Elasticsearch: Would work but overkill for this size
- Winner: Typesense (right-sized solution)
3. 20-30GB Data Size¶
- Typesense: Efficient storage with compression
- Elasticsearch: Would require 2-3x RAM for performance
- Winner: Typesense (lower resource requirements)
Why Typesense Over Elasticsearch¶
- Simplicity: Typesense requires minimal configuration vs Elasticsearch's complex cluster setup
- Cost: Single-node Typesense can handle your scale; Elasticsearch would need a cluster
- Developer Experience: Simple API, no query DSL learning curve
- Performance: Built-in caching and optimization for your document size
- Maintenance: No JVM tuning, shard management, or cluster state issues
When to Choose Elasticsearch Instead¶
You should consider Elasticsearch only if you need:
- Complex aggregations beyond faceting
- Custom analyzers for multiple languages
- Machine learning features
- Log analytics capabilities
- Multi-tenant isolation
None of these appear necessary for the current use case.
Implementation Recommendations¶
1. Quick Wins¶
- Start with search operations (highest pain point)
- Implement faceted search for filtering
- Enable typo tolerance for better UX
2. Architecture Guidelines¶
- Keep Redis for real-time caching (short TTL items)
- Use Typesense for all searchable data
- Implement circuit breakers for failover
3. Performance Optimization¶
- Use Typesense's built-in caching
- Implement search-as-you-type for instant results
- Leverage facet counts for UI filters
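A sketch of the search-as-you-type recommendation, following the client call style used elsewhere in this report; field names match the consultant_matches schema and the parameter values are illustrative. Facet counts for the UI filters come back on the same response.
// Illustration: prefix matching on partial input, with facets to drive the UI filters.
var searchParameters = new SearchParameters
{
    Q = partialInput,                        // e.g. "azu" while the user is typing
    QueryBy = "fullname,profile_text",
    Prefix = true,                           // Match "azu" against "Azure"
    NumTypos = 1,                            // Keep typo tolerance modest for short inputs
    FacetBy = "territory,business_unit,status",
    PerPage = 10
};
var result = await _typesenseClient
    .Collections["consultant_matches"]
    .Documents
    .Search<ConsultantMatchDocument>(searchParameters);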
4. Migration Checklist¶
- [ ] Set up Typesense development instance
- [ ] Create collection schemas
- [ ] Implement adapter interface
- [ ] Add monitoring and metrics
- [ ] Create data migration scripts
- [ ] Test search relevance
- [ ] Implement fallback mechanism
- [ ] Plan rollback strategy
Azure Cost Estimation¶
Typesense Deployment Requirements¶
For 20-30GB data with 150,000-200,000 documents:
Recommended Azure VM Configuration¶
- VM Type: Standard D4s v5 (4 vCPUs, 16 GB RAM)
- Storage: 128 GB Premium SSD (P10)
- Backup: Azure Backup for VMs
- Monitoring: Azure Monitor
Monthly Cost Breakdown¶
| Component | Specification | Monthly Cost (USD) |
|---|---|---|
| Compute | D4s v5 (4 vCPUs, 16GB RAM) | $140 |
| Storage | 128 GB Premium SSD | $20 |
| Backup | Daily snapshots, 30-day retention | $25 |
| Bandwidth | 100 GB outbound | $9 |
| Azure Monitor | Basic metrics + logs | $15 |
| Total | | $209/month |
Alternative Configurations¶
High Performance Option (D8s v5)
- 8 vCPUs, 32 GB RAM
- Better for complex queries
- Total: ~$350/month
Budget Option (B4ms)
- 4 vCPUs, 16 GB RAM (burstable)
- Good for development/staging
- Total: ~$120/month
Cost Comparison with Current Redis¶
Current Redis Setup (Estimated)¶
- Azure Cache for Redis Premium P3: 26 GB RAM
- Monthly cost: ~$1,100/month
- No persistence, pure in-memory
Typesense Advantages¶
- ~81% cost reduction ($209 vs $1,100)
- Persistent storage included
- Better scalability with disk-based indices
- Lower operational overhead
Additional Considerations¶
1. High Availability Setup (Optional)
- Add secondary node: +$209/month
- Azure Load Balancer: +$25/month
- Total HA setup: ~$443/month
2. Managed Container Option
- Azure Container Instances: ~$180/month
- Simpler deployment, auto-scaling
3. Development Environment
- Use B2ms (2 vCPUs, 8 GB): $60/month
- Sufficient for testing with a subset of data
ROI Calculation¶
- Annual Savings: ($1,100 - $209) × 12 = $10,692/year
- Migration Cost: ~160 hours × $150/hour = $24,000
- Payback Period: 2.2 years
- 5-Year TCO Savings: $29,460
Secure Typesense Access Patterns¶
Current Security Architecture Analysis¶
The AgentPortal BFF currently implements a comprehensive security model:
- Azure AD B2C Authentication: OAuth 2.0 with scope-based validation (App.Access, Service)
- Role-Based Authorization: Three-tier access (Reader, Contributor, Admin)
- BFF Shielding Pattern: All external services accessed through BFF layer with no direct client exposure
- Service-to-Service Authentication: Client credentials flow with token caching
- User Context Propagation: On-behalf-of headers maintain user identity across service calls
Option 1: Maintain BFF Shielding Pattern (Recommended)¶
Architecture¶
Client -> Azure AD B2C -> AgentPortal BFF -> Typesense
-> Other APIs (CRM, Profile, etc.)
Implementation¶
public class TypesenseAdapter : ITypesenseAdapter
{
private readonly ITypesenseClient _client;
private readonly IUserPrincipalContextAccessor _userContext;
public async Task<SearchResult<T>> SearchAsync<T>(
string collection,
SearchParameters parameters)
{
// Apply user-context based filtering
var userContext = _userContext.GetUserPrincipal();
parameters.FilterBy = ApplyUserBasedFilters(parameters.FilterBy, userContext);
// Execute search with proper error handling
return await _client
.Collections[collection]
.Documents
.Search<T>(parameters);
}
private string ApplyUserBasedFilters(string existingFilter, UserPrincipal user)
{
// Implement row-level security based on user role/territory
var userFilters = new List<string>();
if (user.Role == AppRoles.Reader)
{
userFilters.Add($"territory:=[{string.Join(",", user.AllowedTerritories)}]");
}
if (user.BusinessUnit != "All")
{
userFilters.Add($"business_unit:={user.BusinessUnit}");
}
return string.IsNullOrEmpty(existingFilter)
? string.Join(" && ", userFilters)
: $"({existingFilter}) && {string.Join(" && ", userFilters)}";
}
}
Security Benefits¶
- Consistent Authentication: Leverages existing Azure AD B2C integration
- Row-Level Security: BFF applies user-context based filters before queries
- API Gateway Pattern: Single point for security enforcement and monitoring
- Audit Trail: All searches logged with user context
- Rate Limiting: BFF can implement per-user rate limiting
- Data Transformation: Sensitive fields filtered based on user role
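The per-user rate limiting mentioned above could be implemented with ASP.NET Core's built-in rate limiter, partitioned by the authenticated user; the policy name and limits below are illustrative, not part of the current codebase.
// Sketch: fixed-window limit per authenticated user for search endpoints.
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;
builder.Services.AddRateLimiter(options =>
{
    options.AddPolicy("typesense-search", httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.User.Identity?.Name ?? "anonymous",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 30,                 // 30 searches
                Window = TimeSpan.FromSeconds(10) // per 10-second window, per user
            }));
});
app.UseRateLimiter();
// Search endpoints then opt in with .RequireRateLimiting("typesense-search")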
Configuration Security¶
// In Program.cs - Typesense client configuration
services.AddSingleton<ITypesenseClient>(provider =>
{
var configuration = provider.GetService<IConfiguration>();
return new TypesenseClient(new Config
{
ApiKey = configuration["Typesense:AdminApiKey"], // Admin key for BFF
Nodes = new List<Node>
{
new(configuration["Typesense:Host"], "443", "https")
},
ConnectionTimeoutSeconds = 10,
HealthcheckIntervalSeconds = 60,
NumRetries = 3
});
});
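Rather than keeping the admin key in appsettings, it can be sourced from Azure Key Vault (as the migration security checklist below suggests). A sketch using Azure.Extensions.AspNetCore.Configuration.Secrets and DefaultAzureCredential, with a placeholder vault URI:
// Sketch: surface the Typesense admin key from Key Vault as regular configuration.
builder.Configuration.AddAzureKeyVault(
    new Uri("https://<your-key-vault>.vault.azure.net/"),
    new DefaultAzureCredential());
// A secret named "Typesense--AdminApiKey" then appears as configuration["Typesense:AdminApiKey"].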
Option 2: Direct Access with Authentication Layer¶
Architecture¶
Client -> Azure AD B2C -> Typesense Proxy/Gateway -> Typesense
-> AgentPortal BFF (for other operations)
Implementation Considerations¶
Custom Typesense Proxy Service
[ApiController]
[Route("api/search")]
public class TypesenseProxyController : ControllerBase
{
private readonly ITypesenseClient _client;
[HttpPost("consultants")]
[Authorize(Scopes = "App.Access")]
public async Task<IActionResult> SearchConsultants([FromBody] SearchRequest request)
{
// Validate user permissions
var userContext = HttpContext.GetUserPrincipal();
if (!IsAuthorizedForSearch(userContext, request))
{
return Forbid();
}
// Apply security filters
request.Parameters.FilterBy = ApplySecurityFilters(
request.Parameters.FilterBy,
userContext);
// Execute search
var result = await _client
.Collections["consultants"]
.Documents
.Search<ConsultantProfile>(request.Parameters);
return Ok(result);
}
}
Typesense API Key Strategy
- Search-Only Keys: Create scoped API keys per user/role
- Collection-Level Permissions: Restrict access to specific collections
- Filter Presets: Pre-configure filters for user roles
// Typesense scoped key creation
// 1. Create a search-only parent key (filters are not set here)
const searchOnlyKey = await client.keys().create({
"description": "Search-only key for Reader role",
"actions": ["documents:search"],
"collections": ["consultants", "sales_requests"]
});
// 2. Derive a scoped key with an embedded, non-removable filter (generated locally, no API call)
const scopedKey = client.keys().generateScopedSearchKey(searchOnlyKey.value, {
"filter_by": "territory:=[DK,SE,NO]", // Pre-applied filter
"expires_at": Math.floor(Date.now() / 1000) + 86400 // 24 hours
});
Security Challenges with Direct Access¶
- API Key Management: Complex key rotation and distribution
- Filter Bypass: Risk of client-side filter manipulation
- Rate Limiting: Harder to implement per-user limits
- Audit Complexity: Distributed logging across services
- Data Leakage: Potential exposure of restricted fields
Security Comparison Matrix¶
| Aspect | BFF Shielding | Direct Access | Recommendation |
|---|---|---|---|
| Authentication Complexity | Simple (existing) | Complex (new proxy) | BFF Shielding |
| Row-Level Security | Server-enforced | Client-dependent | BFF Shielding |
| API Key Security | Single admin key | Multiple scoped keys | BFF Shielding |
| Performance | Additional hop | Direct connection | Direct Access |
| Caching Strategy | BFF-level caching | Client-side only | BFF Shielding |
| Monitoring | Centralized | Distributed | BFF Shielding |
| Development Complexity | Low (existing pattern) | High (new service) | BFF Shielding |
Recommended Security Implementation¶
1. Maintain BFF Pattern with Enhanced Security¶
public class SecureTypesenseService
{
public async Task<SearchResult<T>> SecureSearchAsync<T>(
string collection,
SearchParameters parameters,
UserPrincipal user)
{
// 1. Validate user permissions for collection
if (!IsAuthorizedForCollection(user, collection))
throw new UnauthorizedAccessException();
// 2. Apply row-level security filters
parameters.FilterBy = ApplyRowLevelSecurity(parameters.FilterBy, user);
// 3. Execute search
var result = await _typesenseClient.SearchAsync<T>(collection, parameters);
// 4. Log search for audit trail
await _auditLogger.LogSearchAsync(user.Id, collection, parameters);
// 5. Filter sensitive fields based on user role before returning
return FilterSensitiveData(result, user.Role);
}
}
2. Network Security¶
- Private Networking: Deploy Typesense in Azure VNet with private endpoints
- Firewall Rules: Restrict access to BFF subnet only
- TLS Encryption: Force HTTPS for all communications
3. Data Security¶
- Field-Level Encryption: Encrypt sensitive profile fields at rest
- Backup Encryption: Azure Backup with customer-managed keys
- Key Rotation: Automated API key rotation every 90 days
4. Monitoring and Compliance¶
- Search Audit Logs: All searches logged with user context
- Performance Monitoring: Query performance and rate limiting
- GDPR Compliance: Data retention and deletion policies
Migration Security Checklist¶
- [ ] Configure Typesense with admin API key in Azure Key Vault
- [ ] Implement row-level security filters in BFF layer
- [ ] Set up private networking and firewall rules
- [ ] Create audit logging for all search operations
- [ ] Implement field-level access control
- [ ] Configure automated backup with encryption
- [ ] Set up monitoring and alerting
- [ ] Document security architecture changes
- [ ] Conduct security review with stakeholders
- [ ] Plan security testing and penetration testing
Conclusion¶
Typesense is the recommended solution for replacing Redis in this use case. It provides:
- Native full-text search eliminating current workarounds
- Efficient pagination removing client-side processing
- Built-in faceting for keyword counting and filtering
- ~81% lower operational costs on Azure infrastructure
- Simpler implementation compared to Elasticsearch
- Secure BFF pattern maintaining existing authentication architecture
The migration can be completed in 4 weeks with minimal risk using the parallel implementation strategy outlined above, with significant long-term cost savings while maintaining enterprise-grade security.