Integration Guide
If your microservice produces data that needs to flow into the analytics universe, follow these steps.
Identify events worth persisting
Not every database write needs to go to the data lake. Focus on:
- Entities that feed compliance reports or dashboards
- High-level business events (worker assigned, test completed, license issued)
- Audit-relevant changes
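
One way to make this selection explicit is an allowlist that the service checks before scheduling any data lake write. The following is a minimal sketch: the event names and the `should_persist` helper are hypothetical, taken from the examples in the list above rather than from any existing service.

```python
# Hypothetical allowlist of event types worth sending to the data lake.
# The names mirror the examples in the list above; adjust to your domain.
PERSISTED_EVENTS = frozenset({
    "worker_assigned",
    "test_completed",
    "license_issued",
})


def should_persist(event_type: str, audit_relevant: bool = False) -> bool:
    """Return True when an event should be written to the data lake."""
    # Audit-relevant changes are always persisted, whatever their type.
    return audit_relevant or event_type in PERSISTED_EVENTS
```

Keeping the list in one place makes it easy to review which writes reach the lake when compliance reports or dashboards change.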
Write to ADLS (non-blocking)
Add a background ADLS write after the transactional operation completes. The service must not wait for this write before responding to the caller.

```python
# Pseudocode: runs after the DB commit
from datetime import datetime, timezone
from uuid import uuid4

now = datetime.now(timezone.utc)
background_tasks.add_task(
    adls_client.ingest,
    container="angelis-datalake",
    # Partitioned path: namespace / table / tenant / date / unique file name
    path=(
        f"{SERVICE_NAMESPACE}/"
        f"{table_name}/"
        f"tenant={company_id}/"
        f"year={now.year}/month={now.month:02d}/day={now.day:02d}/"
        f"{uuid4()}.parquet"
    ),
    payload={**entity_data, "ingested_at": now, "source_service": SERVICE_NAME},
)
```

See Architecture: Layer 3 for the mandatory path pattern and file format rules.
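
The path construction can be pulled into a small pure function so it is easy to unit test. This is a sketch only: `build_ingest_path` is a hypothetical helper, and the layout simply copies the pattern from the pseudocode above. The Architecture page remains the authority on the path rules.

```python
from datetime import datetime, timezone
from uuid import UUID, uuid4


def build_ingest_path(service_namespace, table_name, company_id, now, file_id=None):
    """Build a partitioned data lake path for one ingested file.

    `file_id` is optional so tests can pass a fixed UUID; by default a
    random one is generated, as in the pseudocode.
    """
    file_id = file_id or uuid4()
    return (
        f"{service_namespace}/{table_name}/"
        f"tenant={company_id}/"
        f"year={now.year}/month={now.month:02d}/day={now.day:02d}/"
        f"{file_id}.parquet"
    )


# Example with hypothetical values for namespace, table, and tenant.
example = build_ingest_path(
    "wfm", "profiles", "c42", datetime(2024, 3, 7, tzinfo=timezone.utc)
)
```

Passing `now` in, instead of reading the clock inside the function, keeps the partition values deterministic in tests.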
Define a model in Dremio
Create a model that maps your ADLS data to a clean SQL table. Apply only structural transformations (casting, null handling, renaming): no business logic and no joins.

```sql
SELECT
  id,
  company_id,
  your_field,
  CAST(created_at AS TIMESTAMP) AS created_at,
  ingested_at
FROM datalake.your_namespace.your_table
WHERE deleted_at IS NULL
```

Coordinate with the data platform team to register it under the correct schema namespace. See Architecture: Models for the full rules.
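
To illustrate what "structural only" means, the same model can be mirrored over plain Python dictionaries. This is an illustration, not how Dremio executes the model. The `to_model_rows` helper is hypothetical, and it assumes `created_at` arrives as an ISO 8601 string.

```python
from datetime import datetime


def to_model_rows(raw_rows):
    """Mirror the SQL model: drop soft-deleted rows, cast, select columns."""
    result = []
    for row in raw_rows:
        if row.get("deleted_at") is not None:  # WHERE deleted_at IS NULL
            continue
        result.append({
            "id": row["id"],
            "company_id": row["company_id"],
            "your_field": row.get("your_field"),
            # CAST(created_at AS TIMESTAMP); assumes an ISO 8601 string
            "created_at": datetime.fromisoformat(row["created_at"]),
            "ingested_at": row["ingested_at"],
        })
    return result
```

Nothing here looks up another table or derives a business value. Anything of that kind belongs in an explore, not in the model.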
Create or extend an explore
If your entity enriches an existing business view, add it as a dimension to the relevant explore.
If it represents a new domain, create a new explore with your entity as the central fact:

```text
explore_your_domain
├── fact: your_central_model
├── dim: wfm_profiles (if worker data is relevant)
└── dim: wfm_companies (always include for tenant filtering)
```

Always include `company_id`, because Superset’s RLS depends on it. See Architecture: Explores for more.
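
A lightweight check can catch an explore that would break tenant filtering before it is registered. The sketch below is hypothetical throughout: the `Explore` dataclass and `validate_explore` function are not part of Dremio or Superset, and only encode the two rules stated above.

```python
from dataclasses import dataclass, field


@dataclass
class Explore:
    name: str
    fact: str
    dims: list = field(default_factory=list)


def validate_explore(explore, columns_by_model):
    """Return a list of problems; an empty list means both rules hold."""
    problems = []
    # Rule 1: wfm_companies must always be present for tenant filtering.
    if "wfm_companies" not in explore.dims:
        problems.append("missing dim: wfm_companies")
    # Rule 2: the central fact must expose company_id for RLS.
    if "company_id" not in columns_by_model.get(explore.fact, ()):
        problems.append(f"fact {explore.fact} has no company_id column")
    return problems
```

A check like this could run in CI against whatever file describes your explores, assuming such a description exists in your repository.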
Add to Superset
Register the explore as a Superset dataset. Apply the `company_id` RLS rule:

```text
clause: company_id = '{{ current_user_attribute("company_id") }}'
```

Build charts and optionally embed the dashboard into the relevant frontend application.
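
To make the effect of the rule concrete, the sketch below shows the predicate the clause is expected to resolve to for a single user. Superset performs the real template rendering. `render_rls_clause` is a hypothetical stand-in used only for illustration, and the quote escaping is an assumption about safe handling, not a description of Superset internals.

```python
def render_rls_clause(company_id: str) -> str:
    """Show the predicate the RLS template resolves to for one user."""
    # Double any single quotes so the value stays inside the SQL literal.
    escaped = company_id.replace("'", "''")
    return f"company_id = '{escaped}'"
```

For a user whose company is `c42`, every query against the dataset is filtered by `company_id = 'c42'`, which is why each explore must expose that column.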
Constraints and Decisions
| Decision | Rationale |
|---|---|
| Analytics universe is read-only | Prevents analytics workloads from affecting transactional performance or integrity |
| ADLS as the single landing zone | Decouples services from the consumption tool; swap Dremio or Superset without re-ingesting |
| Services write to ADLS directly (current) | Simpler to implement; acceptable at current scale. Does not require additional infrastructure |
| Non-blocking ingestion | Customer-facing latency must not be affected by data lake write performance or failures |
| Explores as the Superset boundary | Keeps transformation logic in one place; visualization layer stays configuration-only |
| Tenant isolation at Superset RLS | Enforces multi-tenant security without embedding it into each dashboard |
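
The non-blocking ingestion decision implies that a failed data lake write must never surface to the caller. One way to sketch that is a wrapper that logs and swallows ingest errors. `safe_ingest` is a hypothetical helper, and the `ingest` callable stands in for whatever ADLS client method your service uses.

```python
import logging

logger = logging.getLogger("datalake_ingest")


def safe_ingest(ingest, **kwargs) -> bool:
    """Run an ingest callable and swallow failures so callers are unaffected.

    Intended to be scheduled as a background task after the DB commit.
    Returns True on success and False when the ingest raised.
    """
    try:
        ingest(**kwargs)
        return True
    except Exception:
        # Log and move on: a data lake outage must not reach the customer.
        logger.exception("ADLS ingest failed for path=%s", kwargs.get("path"))
        return False
```

Swallowing the error trades completeness for latency: a lost write is only visible in the logs, so alerting on this logger is worth considering if the lake feeds compliance reports.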