Cost Optimization Strategy

Overview

This document provides comprehensive cost optimization strategies for the metrics platform, targeting a monthly budget of <$400 while maintaining scalability and performance. The focus is on leveraging Azure's consumption-based pricing and implementing intelligent resource management.

Cost Targets

  • Total Monthly Cost: <$400 (current scale: 1K events/day)
  • Scalability Cost: <$1200 at 100K events/day
  • Cost per Event: <$0.0004 at scale
  • ROI Target: 10x value from analytics insights
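
The per-event target is just the scale budget divided by monthly volume (a quick check in Python, assuming a 30-day month):

# Check the per-event target: scale budget / monthly event volume.
# Assumes a 30-day month.
events_per_month = 100_000 * 30   # 3M events/month at scale
print(1200 / events_per_month)    # 0.0004 -> matches the <$0.0004 target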

Current Cost Breakdown

Detailed Monthly Costs (1K events/day)

pie title Monthly Cost Distribution ($375 total)
    "Event Hubs" : 180
    "Data Factory" : 150
    "Storage (ADLS)" : 15
    "Synapse Serverless" : 10
    "Functions" : 5
    "Monitoring" : 10
    "Network" : 5
Component | Configuration | Monthly Cost | Cost Driver | Optimization Potential
Event Hubs | 7 namespaces, Standard tier | $180 | Fixed throughput units | High - auto-scale, Basic tier
Data Factory | 3 pipelines, hourly execution | $150 | Pipeline orchestration | Medium - reduce frequency
ADLS Gen2 | 2GB hot, 10GB cool, 100GB archive | $15 | Storage and transactions | Low - already optimized
Synapse Serverless | 50GB scanned/month | $10 | Query volume | Medium - query optimization
Azure Functions | 10K executions/month | $5 | Consumption plan | Low - already optimized
Monitoring | Log Analytics, alerts | $10 | Log retention | Low - retention tuning
Network | Data egress, VNet | $5 | Data transfer | Low - regional placement

Cost Scaling Projections

graph LR
    subgraph "Cost Scaling (Monthly)"
        C1K[1K events/day<br/>$375/month]
        C10K[10K events/day<br/>$450/month]
        C50K[50K events/day<br/>$750/month]
        C100K[100K events/day<br/>$1,200/month]
    end

    subgraph "Per-Event Cost"
        E1K[$0.012/event]
        E10K[$0.0015/event]
        E50K[$0.0005/event]
        E100K[$0.0004/event]
    end

    C1K --> C10K
    C10K --> C50K
    C50K --> C100K

    C1K --> E1K
    C10K --> E10K
    C50K --> E50K
    C100K --> E100K
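
The projection is consistent with a simple two-part model: a largely fixed platform base plus a small per-event component. A rough sketch (coefficients fitted by eye to the diagram; illustrative only, not billed rates):

# Two-part cost model implied by the projections above: a fixed platform base
# (namespaces, pipelines, monitoring) plus a small per-event component.
# Coefficients are fitted by eye to the diagram and are illustrative only.

FIXED_BASE = 365.0             # $/month
VARIABLE_PER_EVENT = 0.00028   # $/event

def projected_monthly_cost(events_per_day: int, days: int = 30) -> float:
    return FIXED_BASE + VARIABLE_PER_EVENT * events_per_day * days

for volume in (1_000, 10_000, 50_000, 100_000):
    print(f"{volume:>7}/day -> ~${projected_monthly_cost(volume):.0f}/month")

# Tracks the diagram within a few percent; the fixed base is why per-event
# cost falls so steeply as volume grows.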

Event Hubs Optimization

Auto-Scaling Configuration

The current configuration uses fixed throughput units (TUs). Enable auto-inflate so capacity grows only when traffic spikes. Note that auto-inflate scales up but never back down, so pair it with a scheduled reset (see the sketch after the savings estimate below).

resource "azurerm_eventhub_namespace" "optimized" {
  name                = "crm-events-namespace"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  sku                 = "Standard"
  capacity            = 1  # Start with 1 TU

  # Enable auto-inflate to handle traffic spikes
  auto_inflate_enabled     = true
  maximum_throughput_units = 5  # Scale up to 5 TU when needed

  tags = {
    Environment = "Production"
    CostCenter  = "Analytics"
  }
}

Potential Savings: $50-80/month (30-45% reduction)
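
Because auto-inflate never deflates, a scheduled job should reset capacity after traffic peaks. A minimal Python sketch, assuming the Azure CLI is installed and authenticated; the resource group name is a placeholder:

# Sketch: reset Event Hubs capacity to the 1 TU floor during off-hours,
# since auto-inflate scales up automatically but never scales back down.
# The resource group name is a placeholder; run from a scheduled job
# (cron, Azure Automation, or a timer-triggered Function).
import subprocess

def set_capacity(namespace: str, resource_group: str, capacity: int) -> None:
    subprocess.run(
        [
            "az", "eventhubs", "namespace", "update",
            "--name", namespace,
            "--resource-group", resource_group,
            "--capacity", str(capacity),
        ],
        check=True,
    )

# Called nightly, e.g. at 01:00, after the traffic peak has passed.
set_capacity("crm-events-namespace", "rg-analytics", capacity=1)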

Traffic Pattern Analysis

// Analyze Event Hub usage patterns
// (illustrative; assumes Event Hubs metrics are exported to Log Analytics)
EventHubMetrics
| where TimeGenerated >= ago(30d)
| where MetricName == "IncomingMessages"
| summarize 
    HourlyMessages = avg(Total),
    PeakMessages = max(Total),
    MinMessages = min(Total)
    by bin(TimeGenerated, 1h)
| extend Hour = hourofday(TimeGenerated)
| summarize 
    AvgHourly = avg(HourlyMessages),
    PeakHourly = max(HourlyMessages)
    by Hour
| order by Hour
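
The peaks from this query translate into a baseline TU setting via the Event Hubs sizing rule (1 TU supports roughly 1 MB/s or 1,000 events/s of ingress). A sketch with placeholder peaks; substitute the query results:

# Translate the observed peak from the query above into a TU requirement.
# Event Hubs sizing rule: 1 TU supports ~1 MB/s or ~1,000 events/s of ingress.
# Peak figures here are placeholders; substitute the query results.
import math

def required_tus(peak_events_per_sec: float, peak_mb_per_sec: float) -> int:
    return max(1, math.ceil(max(peak_events_per_sec / 1_000, peak_mb_per_sec / 1.0)))

# At ~1K events/day the peak is tiny, so 1 TU plus auto-inflate headroom suffices.
print(required_tus(peak_events_per_sec=0.5, peak_mb_per_sec=0.01))  # -> 1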

Basic Tier Evaluation

For non-critical event hubs, consider Basic tier:

Feature | Basic | Standard | Savings
Throughput Units | 1 TU | 1-20 TU | -
Message Retention | 1 day | 7 days | -
Consumer Groups | 1 | 20 | -
Partitions | 2 | 32 | -
Cost per TU | $11.16/month | $25.95/month | 57% less
# Cost comparison for 7 namespaces
# Standard: 7 × $25.95 = $181.65/month
# Basic: 7 × $11.16 = $78.12/month
# Savings: $103.53/month (57%)
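
Before downgrading a namespace, encode the Basic-tier limits as a preflight check. A sketch using the limits and prices from the table above:

# Check whether a hub's requirements fit Basic tier limits before downgrading.
# Limits and per-TU prices are taken from the comparison table above.
BASIC = {"retention_days": 1, "consumer_groups": 1, "partitions": 2, "price": 11.16}
STANDARD_PRICE = 25.95

def fits_basic(retention_days: int, consumer_groups: int, partitions: int) -> bool:
    return (retention_days <= BASIC["retention_days"]
            and consumer_groups <= BASIC["consumer_groups"]
            and partitions <= BASIC["partitions"])

# Example: a fire-and-forget telemetry hub with a single consumer fits Basic.
if fits_basic(retention_days=1, consumer_groups=1, partitions=2):
    print(f"Downgrade saves ${STANDARD_PRICE - BASIC['price']:.2f}/TU/month")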

Data Factory Optimization

Execution Frequency Optimization

Current hourly execution may be excessive for business needs. Analyze data freshness requirements:

graph TD
    subgraph "Current Schedule"
        H1[00:00 - Process]
        H2[01:00 - Process]
        H3[02:00 - Process]
        H24[23:00 - Process]
    end

    subgraph "Optimized Schedule"
        B1[06:00 - Business Hours Start]
        B2[12:00 - Midday Update]
        B3[18:00 - End of Day]
        B4[23:00 - Daily Rollup]
    end

    subgraph "Savings"
        S1[Executions: 24 → 4]
        S2[Cost Reduction: 83%]
        S3[Monthly Savings: ~$125]
    end

Pipeline Consolidation

{
  "name": "ConsolidatedProcessing",
  "properties": {
    "activities": [
      {
        "name": "ProcessAllDomains",
        "type": "ForEach",
        "typeProperties": {
          "items": {
            "value": "@createArray('CRM', 'Matching', 'Profile', 'Assessment')",
            "type": "Expression"
          },
          "isSequential": false,
          "batchCount": 4,
          "activities": [
            {
              "name": "ProcessDomain",
              "type": "SqlServerStoredProcedure",
              "typeProperties": {
                "storedProcedureName": "ProcessDomainEvents",
                "storedProcedureParameters": {
                  "DomainName": {
                    "value": "@item()",
                    "type": "String"
                  }
                }
              }
            }
          ]
        }
      }
    ]
  }
}

Potential Savings: $125/month (83% reduction in executions)

Storage Optimization

Lifecycle Policy Tuning

Fine-tune storage lifecycle policies for the best cost/performance trade-off. The policy below tiers raw events to cool after 7 days and archive after 30, and deletes them after 2,555 days (roughly seven years):

{
  "rules": [
    {
      "name": "aggressive-archival",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["raw-events/"]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": {
              "daysAfterModificationGreaterThan": 7
            },
            "tierToArchive": {
              "daysAfterModificationGreaterThan": 30
            },
            "delete": {
              "daysAfterModificationGreaterThan": 2555
            }
          }
        }
      }
    },
    {
      "name": "bronze-optimization",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "prefixMatch": ["bronze/"]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": {
              "daysAfterModificationGreaterThan": 14
            },
            "tierToArchive": {
              "daysAfterModificationGreaterThan": 60
            }
          }
        }
      }
    }
  ]
}
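
The capacity-only impact of the policy can be estimated from the per-GB prices used in the storage analytics query below (transactions and early-deletion fees excluded; sketch):

# Rough monthly cost of the current footprint under the policy above versus
# keeping everything hot. Per-GB prices match the storage analytics query
# below; volumes are the current footprint and are illustrative.
PRICE = {"hot": 0.0208, "cool": 0.01, "archive": 0.00099}  # $/GB/month
footprint_gb = {"hot": 2, "cool": 10, "archive": 100}      # after lifecycle tiering

tiered = sum(PRICE[t] * gb for t, gb in footprint_gb.items())
all_hot = PRICE["hot"] * sum(footprint_gb.values())
print(f"tiered: ${tiered:.2f}/month vs all-hot: ${all_hot:.2f}/month")
# Capacity charges only; the $15 storage line also includes transactions.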

Storage Analytics

// Analyze storage cost distribution
// (illustrative; assumes blob metrics/inventory are exported to Log Analytics)
StorageAccount
| where TimeGenerated >= ago(30d)
| where MetricName == "UsedCapacity"
| extend StorageTier = case(
    BlobType == "BlockBlob" and AccessTier == "Hot", "Hot",
    BlobType == "BlockBlob" and AccessTier == "Cool", "Cool",
    BlobType == "BlockBlob" and AccessTier == "Archive", "Archive",
    "Unknown"
)
| summarize 
    AvgSize_GB = avg(Average) / 1024 / 1024 / 1024,
    MaxSize_GB = max(Average) / 1024 / 1024 / 1024
    by StorageTier, bin(TimeGenerated, 1d)
| extend EstimatedCost = case(
    StorageTier == "Hot", AvgSize_GB * 0.0208,
    StorageTier == "Cool", AvgSize_GB * 0.01,
    StorageTier == "Archive", AvgSize_GB * 0.00099,
    0
)

Synapse Serverless Optimization

Query Optimization Strategies

1. Partition Pruning

-- Optimized query with partition elimination
SELECT 
    aggregate_type,
    COUNT(*) as event_count,
    SUM(CAST(JSON_VALUE(payload, '$.value') AS FLOAT)) as total_value
FROM silver.business_events
WHERE year = 2024
  AND month = 1  -- Partition pruning reduces scan by 92%
  AND day >= 15
  AND aggregate_type IN ('Contact', 'Match')  -- Predicate pushdown
GROUP BY aggregate_type;

-- Before optimization: Scans 12 months = 100GB
-- After optimization: Scans 16 days = 5GB
-- Cost savings: 95% reduction in scan costs
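
At the $5/TB on-demand rate quoted in the reserved-capacity table later in this document, the scan reduction translates directly into per-query cost (sketch):

# Serverless SQL charges by data scanned; convert the scan volumes above
# into per-query cost at the $5/TB rate quoted later in this document.
PRICE_PER_TB = 5.0

def scan_cost(gb_scanned: float) -> float:
    return gb_scanned / 1024 * PRICE_PER_TB

print(f"before: ${scan_cost(100):.3f}/query, after: ${scan_cost(5):.4f}/query")
# -> before: $0.488/query, after: $0.0244/query (95% cheaper)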

2. Column Pruning

-- Only select needed columns to reduce I/O
SELECT 
    event_id,
    timestamp,
    aggregate_type,
    JSON_VALUE(payload, '$.status') as status  -- Extract only needed JSON fields
FROM silver.business_events
WHERE year = 2024 AND month = 1
-- Don't select entire payload column (saves 60% scan volume)

3. Query Result Caching

Serverless SQL pools do not cache query results (result set caching is a dedicated SQL pool feature), so every dashboard refresh re-scans the underlying files. Persist frequently repeated results instead, either in the reporting layer (e.g., Power BI import mode) or in the gold layer via CETAS (next subsection):

-- Frequently repeated query: a good candidate for persisting rather than
-- re-scanning the lake on every dashboard refresh
SELECT 
    aggregate_type,
    COUNT(*) as daily_count
FROM silver.business_events
WHERE CAST(timestamp AS DATE) = CAST(GETDATE() AS DATE)
GROUP BY aggregate_type;

Query Cost Analysis:

graph LR
    subgraph "Query Optimization Impact"
        BEFORE[Before: $50/month<br/>500GB scanned]
        AFTER[After: $10/month<br/>100GB scanned]
        SAVINGS[Savings: $40/month<br/>80% reduction]
    end

    BEFORE --> AFTER
    AFTER --> SAVINGS

Materialized Views for Common Queries

Materialized views are a dedicated SQL pool feature (where they are maintained automatically); serverless SQL pools do not support them. The serverless equivalent is CREATE EXTERNAL TABLE AS SELECT (CETAS), which writes the aggregation to the gold layer once so dashboards query the small result instead of the full silver table. A sketch, assuming an external data source (adls_gold) and a Parquet file format (parquet_format) are already defined:

-- Persist daily aggregations to the gold layer (serverless CETAS pattern)
CREATE EXTERNAL TABLE gold.daily_summary
WITH (
    LOCATION = 'gold/daily_summary/',
    DATA_SOURCE = adls_gold,         -- assumed external data source
    FILE_FORMAT = parquet_format     -- assumed Parquet file format
)
AS
SELECT 
    CAST(timestamp AS DATE) as date,
    aggregate_type,
    COUNT(*) as event_count,
    COUNT(DISTINCT aggregate_id) as unique_entities,
    COUNT(DISTINCT user_id) as unique_users
FROM silver.business_events
GROUP BY CAST(timestamp AS DATE), aggregate_type;

-- CETAS output is not auto-refreshed: drop the table, clear the output
-- folder, and re-run on a schedule (e.g., the nightly Data Factory run).

Resource Scheduling

Off-Peak Processing

Schedule compute-intensive tasks during off-peak hours. Azure consumption pricing for these services does not vary by time of day, but off-peak runs avoid contention with business-hours queries, and interruption-tolerant batch work can use cheaper Spot capacity where compute VMs are involved:

gantt
    title Optimized Processing Schedule
    dateFormat HH:mm
    axisFormat %H:%M

    section Business Hours
    Light Monitoring  :active, monitor, 08:00, 18:00

    section Off-Peak Hours
    Heavy Processing  :process, 01:00, 04:00
    Data Archival     :archive, 04:00, 05:00
    Maintenance       :maint, 05:00, 06:00

    section Weekend
    Full Reprocessing :weekend, 02:00, 08:00

Azure Functions Optimization

// Optimize for Consumption-plan billing (GB-seconds): keep executions short
// and memory-lean, and batch work per invocation instead of per event.
// (Memory is governed by the hosting plan; Azure Functions has no
// per-function MemorySize attribute.)
[FunctionName("ProcessEvents")]
[Timeout("00:05:00")]  // Cap runaway executions at 5 minutes
public static async Task<IActionResult> ProcessEvents(
    [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req,
    ILogger log)
{
    // Use connection pooling
    using var connection = connectionPool.Get();

    // Batch processing for efficiency
    var events = await ReadEventsBatch(req, batchSize: 100);

    await ProcessEventsBatch(events);

    return new OkObjectResult($"Processed {events.Count} events");
}

Monitoring and Alerting Optimization

Cost-Aware Alerting

// Alert on unexpected cost increases
// (assumes Cost Management billing data is exported to Log Analytics)
AzureBilling
| where TimeGenerated >= ago(7d)
| where ServiceName in ("Event Hubs", "Data Factory", "Storage", "Synapse")
| summarize DailyCost = sum(Cost) by bin(TimeGenerated, 1d), ServiceName
| sort by ServiceName asc, TimeGenerated asc  // serialize so prev() compares consecutive days per service
| extend CostChange = DailyCost - prev(DailyCost, 1)
| where CostChange > 10  // Alert if a service's daily cost jumps by more than $10
| project TimeGenerated, ServiceName, DailyCost, CostChange

Optimized Log Retention

{
  "workspaceCapping": {
    "dailyQuotaGb": 1,
    "quotaNextResetTime": "2024-01-01T00:00:00Z"
  },
  "retentionInDays": 30,
  "sku": {
    "name": "PerGB2018"
  }
}

Log Analytics Optimization:

  • Reduce retention from 90 to 30 days: save $15/month
  • Set the daily quota to 1GB: prevent overage charges
  • Use sampling for non-critical logs: save $10/month

Development and Testing Cost Management

Separate Environments

graph TD
    subgraph "Production (Optimized)"
        PROD_EH[Event Hubs: Standard]
        PROD_ADF[Data Factory: Full schedule]
        PROD_SYNAPSE[Synapse: Optimized queries]
        PROD_COST[$375/month]
    end

    subgraph "Development (Minimal)"
        DEV_EH[Event Hubs: Basic]
        DEV_ADF[Data Factory: Manual triggers]
        DEV_SYNAPSE[Synapse: Sample data]
        DEV_COST[$50/month]
    end

    subgraph "Testing (Auto-shutdown)"
        TEST_EH[Event Hubs: Basic]
        TEST_ADF[Data Factory: On-demand]
        TEST_SCHEDULE[Auto-shutdown: 18:00-08:00]
        TEST_COST[$25/month]
    end

Auto-Shutdown for Dev/Test

#!/bin/bash
# Auto-shutdown script for development environments

# Scale the Event Hubs namespace down to the 1 TU floor during off-hours.
# (Throughput units cannot be set to 0; a Standard namespace always bills
# for at least 1 TU, so the larger dev savings come from pausing triggers.)
az eventhubs namespace update \
    --name dev-events-namespace \
    --resource-group rg-dev \
    --capacity 1

# Pause Data Factory triggers
az datafactory trigger stop \
    --factory-name dev-data-factory \
    --resource-group rg-dev \
    --name dev-processing-trigger

# Schedule restart at 8 AM (restore capacity if dev normally runs above the floor)
at 08:00 <<EOF
az eventhubs namespace update --name dev-events-namespace --resource-group rg-dev --capacity 1
az datafactory trigger start --factory-name dev-data-factory --resource-group rg-dev --name dev-processing-trigger
EOF

Reserved Capacity and Savings Plans

Azure Reserved Instances

For predictable workloads, consider 1-year reserved capacity:

Service | On-Demand | 1-Year Reserved | Savings
Event Hubs | $25.95/TU/month | $18.17/TU/month | 30%
Storage | $0.0208/GB/month | $0.0166/GB/month | 20%
Synapse | $5/TB scanned | $4/TB scanned | 20%

Annual Commitment Analysis:

# Current annual cost: $375 × 12 = $4,500
# With reserved instances: $375 × 0.75 × 12 = $3,375
# Annual savings: $1,125 (25% reduction)

Azure Hybrid Benefit

If you have existing SQL Server licenses, apply Azure Hybrid Benefit to Synapse:

  • Savings: up to 40% on compute costs
  • Applicable to: Synapse dedicated SQL pools (if used in the future)

Cost Monitoring Dashboard

Real-Time Cost Tracking

graph TD
    subgraph "Cost Monitoring Dashboard"
        subgraph "Daily Metrics"
            DAILY_COST[Daily Cost<br/>Target: <$12.50]
            DAILY_TREND[Cost Trend<br/>7-day average]
            BUDGET_ALERT[Budget Alert<br/>80% threshold]
        end

        subgraph "Service Breakdown"
            EH_COST[Event Hubs<br/>$6/day]
            ADF_COST[Data Factory<br/>$5/day]
            STORAGE_COST[Storage<br/>$0.50/day]
            SYNAPSE_COST[Synapse<br/>$0.33/day]
        end

        subgraph "Optimization Alerts"
            UNUSED_RESOURCES[Unused Resources]
            SCALE_DOWN[Scale Down Opportunities]
            ARCHIVE_READY[Archive Ready Data]
        end
    end

Power BI Cost Dashboard

// Cost tracking measures
DailyCost = 
CALCULATE(
    SUM(CostTable[Cost]),
    CostTable[Date] = TODAY()
)

MonthToDateCost = 
CALCULATE(
    SUM(CostTable[Cost]),
    DATESMTD(CostTable[Date])
)

MonthlyBudgetUtilization = 
VAR MonthlyBudget = 400
RETURN
    DIVIDE([MonthToDateCost], MonthlyBudget, 0)

CostEfficiencyScore = 
VAR EventsProcessed = SUM(MetricsTable[EventCount])
RETURN
    DIVIDE(EventsProcessed, [MonthToDateCost], 0)  // Events per dollar

// Alert if utilization > 80%
BudgetAlert = 
IF(
    [MonthlyBudgetUtilization] > 0.8,
    "⚠️ Budget Alert: " & FORMAT([MonthlyBudgetUtilization], "0%") & " used",
    "✅ Within Budget: " & FORMAT([MonthlyBudgetUtilization], "0%") & " used"
)

Cost Optimization Roadmap

Phase 1: Quick Wins (Week 1-2)

  • [ ] Implement Event Hub auto-scaling
  • [ ] Reduce Data Factory execution frequency
  • [ ] Optimize storage lifecycle policies
  • [ ] Set up cost monitoring alerts

Expected Savings: $100-150/month

Phase 2: Query Optimization (Week 3-4)

  • [ ] Optimize Synapse queries for partition pruning
  • [ ] Implement materialized views
  • [ ] Enable query result caching
  • [ ] Create cost monitoring dashboard

Expected Savings: $40-60/month

Phase 3: Architecture Refinement (Month 2)

  • [ ] Evaluate Basic tier Event Hubs
  • [ ] Implement reserved capacity
  • [ ] Optimize development environments
  • [ ] Review and consolidate resources

Expected Savings: $80-120/month

Phase 4: Advanced Optimization (Month 3)

  • [ ] Implement auto-pause for dev/test
  • [ ] Advanced query optimization
  • [ ] Review data retention policies
  • [ ] Capacity planning for growth

Expected Savings: $50-75/month

Success Metrics

Cost KPIs

Metric | Current | Target | Tracking
Monthly Cost | $375 | <$250 | Daily monitoring
Cost per Event | $0.012 | <$0.008 | Weekly calculation
Budget Utilization | 94% | <80% | Real-time dashboard
Waste Ratio | 15% | <10% | Monthly audit
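
The current-state figures in the table follow from the numbers above (sketch, assuming a 30-day month):

# Derive the current KPI values from the figures earlier in this document.
monthly_cost, budget = 375.0, 400.0
events_per_month = 1_000 * 30

print(f"cost per event:     ${monthly_cost / events_per_month:.4f}")  # $0.0125
print(f"budget utilization: {monthly_cost / budget:.0%}")             # 94%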

ROI Measurement

graph LR
    subgraph "Cost Investment"
        INFRA[Infrastructure: $250/month]
        OPS[Operations: $50/month]
        DEV[Development: $100/month]
        TOTAL[Total: $400/month]
    end

    subgraph "Business Value"
        INSIGHTS[Data Insights: $2000/month]
        EFFICIENCY[Operational Efficiency: $1500/month]
        DECISIONS[Better Decisions: $1000/month]
        VALUE[Total Value: $4500/month]
    end

    subgraph "ROI"
        RATIO[ROI Ratio: 11.25x]
        PAYBACK[Payback Period: 26 days]
    end

    TOTAL --> RATIO
    VALUE --> RATIO
    RATIO --> PAYBACK

Best Practices Summary

  1. Continuous Monitoring
     • Set up real-time cost alerts
     • Review costs weekly
     • Track cost per business metric
     • Monitor unused resources

  2. Right-Sizing Resources
     • Use auto-scaling where available
     • Match resource size to workload
     • Conduct regular capacity reviews
     • Eliminate over-provisioning

  3. Intelligent Scheduling
     • Run batch jobs during off-peak hours
     • Auto-shutdown dev/test environments
     • Optimize refresh frequencies
     • Load balance across time zones

  4. Data Lifecycle Management
     • Apply aggressive archival policies
     • Run regular cleanup procedures
     • Optimize retention policies
     • Delete unnecessary data

With these optimizations, the target monthly cost of <$250 is achievable while maintaining system performance and preparing for future scale.