Xortrix AI

Case Study — B2B SaaS / Analytics

Multi-Tenant Analytics Dashboard for Indian SMBs

How Xortrix AI designed and built a production-grade, multi-tenant SaaS analytics platform — complete with real-time data ingestion, row-level tenant isolation, white-label support, and usage-based billing — for a price point accessible to small businesses across India.

AWS Aurora Serverless v2Amazon KinesisMulti-TenancyTally ERP IntegrationStripe BillingAmazon Athena

01 — Client Overview

The Client

The client is a Pune-based B2B SaaS startup founded in 2023, built around a single insight: most Indian small and medium businesses run their operations across disconnected tools — Tally for accounting, Zoho for CRM, Shopify or WooCommerce for e-commerce — yet have no unified view of how the business is actually performing.

The platform aggregates data from these sources into a single, opinionated dashboard that surfaces the KPIs that matter most for Indian SMBs: GST-inclusive revenue, collection efficiency, inventory turnover, customer acquisition cost, and repeat purchase rate. Unlike Western analytics tools priced in dollars and built for scale-up SaaS companies, the platform targets the vast middle market of Indian businesses — a kirana chain, a regional distributor, a D2C brand with 5 SKUs — that needs insight without complexity.

They came to Xortrix AI with a validated idea, a seed round, and a design prototype — but no backend, no data infrastructure, and no clear answer to the hardest question in B2B SaaS: how do you serve hundreds of business tenants from a single codebase without their data leaking into each other, while keeping the infrastructure cost low enough that a ₹999/month price point is actually sustainable?

Client Snapshot

Company
Confidential
Founded
2023, Pune, Maharashtra
Segment
B2B SaaS — SMB Analytics
Target market
Indian SMBs: retail, D2C, distribution
Integrations
Tally ERP, Zoho CRM, Shopify, WooCommerce
Price point
₹999 – ₹4,999 / month per business
Reseller channel
CA firms and ERP consultants

Xortrix AI Engagement Scope

System architecture design
Full-stack product engineering
AWS infrastructure provisioning (IaC with CDK)
Data pipeline design and implementation
ERP connector development (Tally, Zoho)
White-label theming system
Usage-based billing integration (Stripe)

02 — The Challenge

Four Problems That Had to Be Solved Simultaneously

The brief looked simple on the surface — build a dashboard — but the combination of constraints created a genuinely difficult engineering problem. Four challenges were non-negotiable from day one.

01

Strict Tenant Data Isolation

The platform's customers are businesses sharing a common infrastructure. A data leak — even a partial one — would be catastrophic for trust and potentially a regulatory issue under India's DPDP Act. The isolation model had to be correct by design, not by convention. Row-level security enforced at the database layer, not the application layer, was the minimum acceptable standard.

02

Real-Time Data Aggregation Across Sources

Business owners don't want yesterday's numbers. When a sale is recorded in Tally or an order comes in on Shopify, the dashboard metric should update within seconds. This required a streaming ingestion pipeline capable of handling bursty, unpredictable write loads from hundreds of tenants — without individual tenants starving each other's data processing.

03

Sustainable Unit Economics at Indian SMB Price Points

A ₹999/month plan generates roughly $12 USD in MRR per tenant. After Stripe fees, GST, and support costs, the infrastructure budget per tenant is extremely thin. Any architecture that cost more than ₹3 per tenant per day was economically unviable at projected scale. This ruled out dedicated RDS instances, per-tenant compute clusters, and any solution that didn't scale to near-zero during inactive hours.

04

White-Label Support for Reseller Channel

The client's go-to-market included CA firms and ERP consultants reselling the platform under their own brand. The system needed to support custom domains, custom color schemes, custom logos, and custom email templates — all configurable per reseller without any redeployment. A reseller's client should see their CA's brand, not the platform's.

03 — Solution Architecture

A Serverless-First, Event-Driven Platform on AWS

We designed the platform around a layered AWS architecture that separates the hot path (real-time ingestion and live dashboard queries) from the cold path (historical analytics and batch aggregations), allowing each layer to scale and cost-optimise independently.

API Layer: API Gateway + Lambda

All client traffic enters through an Amazon API Gateway HTTP API. We chose HTTP API over REST API for its lower latency and significantly cheaper per-request pricing — important at the volumes the client expected. Each API route is backed by a dedicated Lambda function, keeping cold-start surfaces small and enabling per-route concurrency limits.

Critically, we configured API Gateway Usage Plans with per-tenant API keys. Each tenant's API key is associated with a usage plan that enforces a request rate limit (requests/second) and a monthly quota. This means a noisy tenant cannot starve other tenants of compute capacity, and plan-level throttling is enforced before any Lambda invocation — saving cost and protecting the platform from abuse. The Starter plan is capped at 10 RPS; the Business plan allows 50 RPS; Enterprise tenants get a negotiated limit.

AWS WAF sits in front of API Gateway with managed rule groups for SQL injection, XSS, and known bad IPs. A custom rule rate-limits unauthenticated requests to 100/5-minute window per IP to prevent credential-stuffing attacks on the login endpoint.

Tenant Configuration: DynamoDB

Tenant metadata — subscription plan, enabled integrations, white-label configuration, feature flags, API key mappings, and reseller association — lives in Amazon DynamoDB. We use a single-table design with a composite primary key of PK: TENANT#<tenantId> and entity-specific sort keys.

DynamoDB was the right choice here because tenant config reads happen on every authenticated API request (to resolve plan limits, feature flags, and white-label settings). The access pattern is always a point lookup by tenantId — no joins, no range scans — making DynamoDB's single-digit millisecond read latency ideal. We use DynamoDB DAX (caching layer) for hot-read tenants, bringing config resolution latency below 1ms on cache hits. TTL on DAX items is set to 60 seconds to ensure plan changes propagate quickly.

Analytical Queries: Aurora Serverless v2 (PostgreSQL)

The dashboard query layer — the system that powers live charts and KPI summaries — runs on Amazon Aurora Serverless v2 in PostgreSQL-compatible mode. We chose Aurora Serverless v2 over provisioned RDS for two reasons: it scales ACUs (Aurora Capacity Units) up and down in fine-grained increments within seconds, and it can scale to 0 ACUs after a configurable idle period — critical for overnight cost efficiency when Indian SMB users are offline.

The minimum ACU is set to 0.5 (Aurora's minimum for Serverless v2), and the maximum is capped at 32 ACUs per cluster. In practice, the cluster stays between 0.5 and 4 ACUs for 95% of operating hours, scaling to higher capacity only during the 9am–11am and 4pm–7pm IST windows when SMB owners check their dashboards.

The cluster runs in a private VPC subnet. Lambda functions access it via RDS Proxy, which pools and reuses connections — essential because Lambda's ephemeral nature would otherwise exhaust PostgreSQL's connection limit during traffic spikes. RDS Proxy supports IAM authentication, meaning no database passwords are stored in Lambda environment variables.

Real-Time Ingestion: Amazon Kinesis Data Streams

All incoming business events — a new sale recorded in Tally, an order placed on Shopify, a lead updated in Zoho — are written to a Kinesis Data Stream. We provision the stream with 4 shards, giving 4 MB/s ingest capacity and 4,000 records/second, which comfortably covers the projected peak load of ~340 active tenants each with bursty write patterns.

Shard management is a key operational concern we addressed upfront. We partition records using the tenantId as the partition key, which guarantees that all events from a single tenant land on the same shard — preserving per-tenant ordering. The risk with this approach is hot shards if one tenant generates disproportionate traffic. We mitigate this by appending a suffix to the partition key for tenants on the Enterprise plan (which allows higher event rates), effectively spreading their records across multiple shards while maintaining logical ordering via sequence numbers.

Kinesis Data Firehose is attached to the stream and delivers a continuous, compressed (GZIP) archive of all raw events to S3 in Parquet format, partitioned by year/month/day/tenantId. This serves as the source of truth for historical analytics and as a recovery mechanism if the Aurora write path ever falls behind.

Historical Analytics: S3 + Athena + Glue

Dashboard features that display historical trends — month-over-month revenue comparison, quarterly GST-inclusive revenue summaries, 12-month inventory turnover — run against Amazon Athena rather than Aurora. This is a deliberate separation: Aurora handles low-latency queries against recent, hot data (last 90 days), while Athena handles range queries over the full historical archive in S3.

Athena pricing is per-terabyte scanned, so partitioning strategy is critical to cost control. The S3 data lake uses Hive-style partitions on year=YYYY/month=MM/day=DD/tenant=<tenantId>. Every Athena query issued by the dashboard includes a mandatory partition filter on tenantId and a date range, which Athena uses to prune partitions before scanning. In testing, this reduced per-query data scanned from ~200 GB to under 50 MB for typical dashboard range queries — a 4,000x reduction in scanned data and cost.

AWS Glue maintains the schema catalog for the S3 data lake. A scheduled Glue Crawler runs nightly to detect new partitions and update the Glue Data Catalog, which Athena uses for partition pruning. We also run a weekly Glue ETL job that compacts small Parquet files (Firehose creates many small files) into larger ones, improving Athena query performance and reducing scan overhead.

Auth: Cognito User Pools + Identity Pools

Authentication follows a two-layer Cognito model. Each reseller gets a dedicated Cognito User Pool — this gives resellers the ability to configure their own MFA settings, password policies, and email domain for verification emails, all within their branded experience. Tenants onboarded directly by the client use the platform's own User Pool.

Cognito Identity Pools sit above the User Pools and are responsible for vending temporary AWS credentials to authenticated frontend clients. We use attribute-based access control (ABAC) with custom Cognito attributes — specifically custom:tenantId and custom:resellerId — which flow into IAM session tags when credentials are assumed. IAM policies on the Identity Pool role use aws:RequestedRegion and aws:PrincipalTag/tenantId conditions, so even if a user obtained valid credentials, they cannot access any S3 path or DynamoDB item outside their own tenantId namespace.

API Gateway uses a Cognito Authorizer that validates the JWT issued by the User Pool on every request. The tenantId claim is extracted from the JWT and injected into the Lambda event context, making it the authoritative source of tenant identity throughout the request lifecycle — never derived from user-supplied input.

04 — Architecture Deep Dive

Tenant Isolation: The Pool Model with Row-Level Security

Multi-tenancy in SaaS databases typically follows one of three patterns: silo (dedicated database per tenant), bridge (shared database, dedicated schema), or pool (shared database, shared schema with a tenant discriminator column). We chose the pool model for the platform, with PostgreSQL row-level security (RLS) as the enforcement mechanism.

Why Pool Model?

The silo model — a dedicated Aurora cluster per tenant — would provide the strongest isolation but is economically impossible at the client's price point. Even Aurora Serverless v2 at minimum capacity costs approximately $43/month per cluster (0.5 ACU minimum × 730 hours). With 340 tenants, that's $14,620/month on database clusters alone, before any compute or storage costs. The unit economics do not work.

The bridge model (one schema per tenant) offers better isolation than pool but introduces schema management complexity — migrations require running DDL across potentially hundreds of schemas, and connection routing becomes schema-aware. It also doesn't scale well in PostgreSQL beyond ~100 schemas per database due to catalog bloat. We rejected it for both cost and operational reasons.

Row-Level Security Implementation

Every analytical table in Aurora carries a tenant_id UUID NOT NULL column. PostgreSQL RLS policies are defined on each table:

-- Enable RLS on events table

ALTER TABLE business_events ENABLE ROW LEVEL SECURITY;


-- Policy: tenants see only their rows

CREATE POLICY tenant_isolation ON business_events

  USING (tenant_id = current_setting(

    'app.current_tenant')::uuid);


-- Lambda sets this before any query

SET app.current_tenant = '<tenantId-from-JWT>';

The tenant context variable is set at the database session level by the Lambda handler immediately after acquiring a connection from RDS Proxy, before any SQL is executed. Even if an application bug omits a WHERE clause, PostgreSQL silently filters rows to only those belonging to the authenticated tenant. The database enforces the boundary, not the application.

Superuser Bypass and Internal Tooling

The client's internal support team occasionally needs cross-tenant access for debugging. We handle this with a separate database role — platform_support — that is explicitly granted BYPASSRLS privilege. This role is used only by the internal admin console, which is a separate Next.js application protected by Cognito with an internal-only user pool, accessible only from a specific CIDR range (the Xortrix AI office IPs and a VPN endpoint).

All cross-tenant queries issued through the admin console are logged to a tamper-evident audit trail in a separate CloudWatch Log Group with a resource policy that prevents deletion. This was a deliberate design decision to satisfy the client's prospective enterprise customers who ask about data access auditing during procurement conversations.

Pre-Computed Aggregations for Dashboard Performance

Running analytical SQL against raw event data on every dashboard load would make Aurora the bottleneck and drive up ACU consumption. Instead, we use a materialized aggregation pattern: a Lambda function scheduled every 5 minutes computes common aggregations (daily revenue, weekly active customers, 30-day rolling average order value) and writes them to a metric_snapshots table per tenant. Dashboard queries read from this pre-computed table rather than scanning raw events — reducing query execution time from 200–800ms to under 20ms for the most common dashboard views.

For real-time accuracy, the dashboard UI uses a two-tier fetch strategy: it first renders the last pre-computed snapshot (available immediately), then triggers a differential query that counts only events that arrived since the last snapshot timestamp. The differential result is merged client-side using TanStack Query's select transform, giving users numbers that are accurate to within 5 minutes without hitting Aurora with expensive full scans.

05 — Implementation Details

Key Implementation Decisions

Tally ERP Integration: The Hard Part

Tally ERP 9 and TallyPrime are the de facto accounting standard for Indian SMBs, but Tally's integration story is notoriously difficult. The only programmatic interface is a local HTTP server (ODBC/XML mode) that Tally exposes on port 9000 when running on the user's Windows machine. There is no cloud API, no webhook support, and no real-time push capability.

We built a lightweight Windows desktop connector — a small Electron application that the client's customers install alongside Tally. The connector does the following:

Polls Tally's local XML API on a configurable interval (default: 2 minutes) for new vouchers, ledger entries, and stock movements
Normalises the XML response into a canonical JSON event format shared across all platform integrations
Authenticates with the platform's API Gateway using a machine-level API key provisioned at connector setup time
Pushes normalised events to the ingestion endpoint, which writes to Kinesis
Maintains a local SQLite checkpoint of the last-synced voucher sequence number to avoid duplicate ingestion after restarts

This approach means the platform's real-time ingestion from Tally is technically "near real-time" (2-minute latency) rather than true real-time. We were explicit with the client's product team about this constraint — Tally's architecture simply does not support push-based notifications. The product communicates this as "syncs every 2 minutes" in the UI, which has been well-received by users accustomed to Tally's manual, end-of-day reporting paradigm.

White-Label Theming System

The white-label system allows resellers (CA firms, ERP consultants) to present the platform's dashboard under their own brand. The requirements were: custom domain, custom logo, custom primary colour, custom email sender name, and a custom support email — all configurable without redeployment.

The implementation uses a domain-resolution middleware in the Next.js application. When a request arrives at the dashboard, the middleware reads the Host header and looks up the reseller configuration from DynamoDB (cached in Vercel's edge cache with a 60-second TTL). The reseller config object contains:

{

  "resellerId": "ca-firm-sharma-associates",

  "displayName": "Sharma & Associates Analytics",

  "logoUrl": "https://cdn.platform.in/logos/sharma.svg",

  "primaryColor": "#1a56db",

  "domain": "analytics.sharmaassociates.in",

  "cognitoUserPoolId": "ap-south-1_XXXXXXX",

  "supportEmail": "support@sharmaassociates.in"

}

The frontend injects the primaryColor as a CSS custom property on the root element, which cascades through the design system. Logos are served from a CloudFront distribution backed by S3. Custom domains are configured as CNAME records pointing to the platform's Vercel deployment, with Vercel's domain verification handled programmatically via the Vercel API when a reseller is onboarded.

Usage-Based Billing with Stripe

The client's pricing combines a flat monthly subscription with a usage component — specifically, the number of "data sources" (integrations) active on the account and the number of seats (dashboard users). We implemented this using Stripe Billing with metered subscriptions.

Each tenant has a Stripe Customer and a Stripe Subscription with two line items: a flat-rate price (the base plan) and a metered price (per additional data source beyond the plan's included count). At the end of each billing period, a Lambda function triggered by EventBridge Scheduler queries the platform database for each tenant's active integration count and reports the usage to Stripe via the subscription_items.usage_records API. Stripe generates the invoice automatically.

Stripe webhooks handle subscription lifecycle events: when a subscription is cancelled, a Lambda function sets the tenant's status to suspended in DynamoDB, which causes the API Gateway Usage Plan to reject subsequent requests with a 429. When payment resumes, the status flips back to active within seconds of the Stripe webhook delivery. This entire flow is stateless and requires no manual intervention from the client team — the system self-manages subscription state.

06 — Tech Stack

Technologies Used

API Layer

  • AWS API Gateway (HTTP API)
  • AWS Lambda (Node.js 20.x)
  • AWS WAF for DDoS protection

Data Storage

  • Amazon DynamoDB (tenant config + metadata)
  • Amazon Aurora Serverless v2 (PostgreSQL-compatible)
  • Amazon S3 (raw event archive + Parquet exports)

Real-Time Pipeline

  • Amazon Kinesis Data Streams
  • Kinesis Data Firehose (S3 delivery)
  • AWS Lambda (stream processors)

Historical Analytics

  • Amazon Athena (ad-hoc query engine)
  • AWS Glue (schema catalog + ETL jobs)
  • S3 Intelligent-Tiering (cost-optimized storage)

Auth & Identity

  • Amazon Cognito User Pools (per-tenant auth)
  • Amazon Cognito Identity Pools (federated access)
  • AWS IAM roles with attribute-based access control

Billing & Integrations

  • Stripe Billing (usage-based metering)
  • Stripe Webhooks + Lambda for event processing
  • Custom connectors: Tally ERP, Zoho CRM, Shopify

Observability

  • Amazon CloudWatch (metrics + alarms)
  • AWS X-Ray (distributed tracing)
  • CloudWatch Embedded Metric Format (structured logs)

Frontend

  • Next.js 14 (App Router)
  • Recharts + custom SVG components
  • TanStack Query for server-state management

07 — Results

Production Outcomes, 6 Months Post-Launch

The platform launched to paying customers in February 2024. These are measured outcomes from the first six months of production operation, not projections.

340+

Active tenants onboarded

in first 4 months post-launch

1.2s

Average dashboard load time

p95 across all tenant sizes

$0.03

Infrastructure cost per tenant/day

at median usage tier

99.95%

API availability

measured over 6-month window

< 800ms

Kinesis ingestion latency

event-to-dashboard p99

4

ERP / commerce integrations

Tally, Zoho, Shopify, WooCommerce

Infrastructure Cost Breakdown

At 340 active tenants, the platform's total AWS spend averages approximately $3,500/month. Broken down per tenant, that is roughly $10.30/month — or $0.34/day — against a base plan price of ₹999/month ($12 USD). The infrastructure margin on the base plan is approximately 85%, which comfortably supports the cost of support, product development, and Xortrix AI's ongoing retainer.

The largest line item is Aurora Serverless v2 at ~$1,100/month, followed by Lambda and API Gateway at ~$600/month, and S3 + Athena at ~$400/month. Kinesis is surprisingly cheap at this scale — under $120/month for 4 shards with Firehose delivery — making it an excellent choice for event streaming at Indian SMB volumes.

What the Numbers Mean

The 1.2-second average dashboard load time includes network latency from tier-2 Indian cities on 4G connections — a realistic benchmark for the platform's actual user base, not a measurement from a Mumbai data centre to a Mumbai browser. The p95 load time is 2.1 seconds, still well within acceptable thresholds for a dashboard product used on a daily planning cadence rather than in real-time operations.

The 99.95% API availability figure corresponds to approximately 2.2 hours of downtime over six months. Of this, approximately 1.5 hours was attributable to two scheduled Aurora maintenance windows (during which RDS Proxy buffered connections and queued requests), and the remaining 45 minutes to two Lambda cold-start cascades during traffic spikes that briefly exceeded the concurrency limit. Both issues have since been addressed through provisioned concurrency on critical Lambda functions.

08 — Lessons Learned

What We Would Do Differently

01

Start with Kinesis Enhanced Fan-Out

We launched with standard Kinesis consumers, which poll shards and share the 2 MB/s read throughput per shard across all consumers. As we added more Lambda consumers (one for Aurora writes, one for Firehose, one for real-time metric computation), we hit shard read throughput limits during traffic peaks. Upgrading to Enhanced Fan-Out — which gives each registered consumer its own 2 MB/s read throughput per shard via a push model — resolved this immediately. We should have designed for Enhanced Fan-Out from the start.

02

Invest in Tenant Onboarding Automation Earlier

The first 50 tenants were onboarded with a partially manual process — a support team member ran a script to provision the Cognito user, the DynamoDB tenant record, the Stripe Customer, and the API Gateway usage plan key. This took 15–20 minutes per tenant and did not scale. We built a fully automated onboarding Step Functions workflow in month two, reducing onboarding to under 90 seconds end-to-end. In retrospect, this should have been in scope for the initial build.

03

Athena Is Not Always the Right Tool for Tenant-Scoped Queries

We use Athena for historical analytics and it works well when the query touches a large, partition-pruned dataset. However, for small tenants (fewer than 1,000 events/month), Athena's minimum billing unit (10 MB per query) means the cost per query is disproportionately high relative to the data actually scanned. For small tenants, loading the full history into Aurora and querying it directly would be cheaper. A tiered approach — Aurora for small tenants, Athena for large ones — is something we are actively implementing for the platform's next architecture iteration.

Related Case Studies

Work With Us

Building a multi-tenant platform, analytics product, or B2B SaaS?

The architecture decisions that separate good SaaS from fragile SaaS — tenant isolation, cost-effective data pipelines, usage-based billing — are hard to get right the first time. We've built these systems in production and we know where the traps are.