Data Strand – The Operating System of Your Company
Your product doesn’t run on features. It runs on data. The Data Strand defines how your company:
- structures information,
- moves it across systems,
- secures and governs it,
- and turns it into insights, AI, and automation.
🧪 Workshop Meta – How to Design the Data Strand
Framework version: data-strand-v1.0
Use this strand to map:
- Data Purpose
- Data Domains & Entities
- Pipelines & Flows
- Storage & Architecture
- Access & Permissions
- Governance & Compliance
- Analytics & Insights
- AI & Automation
- Quality & Reliability
- Lifecycle & Retention
- Risks & Guardrails
Teams to involve:
- Data engineering
- Backend / platform engineering
- Product & UX
- Marketing / growth
- AI / ML
Facilitation tips:
- Start by mapping real events, logs, objects, and usage telemetry, not abstractions.
- Treat this as the Data OS — the backbone that every system and team relies on.
🎯 Purpose & Role – Why This Company Collects Data
Guiding question: Why does this company collect and use data?
Core answer: Data ensures the product stays reliable, personalized, and secure, enabling:
- fast search,
- AI-powered assistance,
- performance optimization,
- customer insight,
- and compliance.
- Product – feature usage, adoption, outcomes
- UX – flows, drop-offs, friction events
- UI – interaction events, clickstreams
- Marketing – attribution, cohorts, campaigns
- AI – summarization, retrieval, recommendations
- Power real-time collaboration, search, and AI summarization.
- Maintain workspace integrity, access control, and security.
- Support product-led growth, customer insights, and adoption metrics.
- Fuel automation through telemetry and workflow triggers.
🗺 Data Domains – The Map of What Exists
Guiding question: What are the core domains of data in the system?
1. Users & Identities
- Entities
  - User profiles
  - Credentials & auth tokens
  - Permissions & roles
  - Preferences & notification settings
- Notes
  - Tightly connected with authentication, SSO, org admin, and compliance controls.
2. Workspaces / Organizations
- Entities
  - Workspace metadata
  - Billing & plan
  - Workspace settings
  - Security & compliance policies
- Notes
  - Drives governance, access, and cross-org collaboration.
3. Channels & Conversations
- Entities
  - Channel metadata
  - Membership lists
  - Messages
  - Threads
  - Reactions (emoji data events)
  - Pinned items
- Notes
  - Primary collaboration dataset powering:
    - search,
    - grooming & curation,
    - AI summarization,
    - compliance exports.
4. Artifacts
- Entities
  - Files
  - Canvases
  - Lists
  - Task items
  - Attached metadata (permissions, versions, references)
- Notes
  - Interlinked with messages; stored in object storage and indexed for search.
5. Activity & Telemetry
- Entities
  - UI interaction events
  - UX flow events
  - Feature adoption events
  - Performance logs
  - Search queries
- Notes
  - Feeds product analytics, PLG motions, UX quality metrics, and AI ranking.
6. External Integrations
- Entities
  - App tokens
  - API calls
  - Workflow steps
  - External channel partners
  - Integration logs
- Notes
  - Supports platform health, audit logs, and the extensibility ecosystem.
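One way to make the domain map above checkable during a workshop is a small registry that says which domain owns each entity. The following is a minimal sketch; the identifiers (`Domain`, `DOMAINS`, `domain_of`, `unmapped`) and the snake_case entity names are illustrative assumptions, not part of the framework.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Domain:
    name: str
    entities: Tuple[str, ...]


# Hypothetical entity names derived from the six domains listed above.
DOMAINS = (
    Domain("users_identities", ("user_profile", "credential", "role", "preference")),
    Domain("workspaces", ("workspace", "billing_plan", "workspace_setting", "security_policy")),
    Domain("channels_conversations", ("channel", "membership", "message", "thread", "reaction", "pin")),
    Domain("artifacts", ("file", "canvas", "list", "task_item")),
    Domain("activity_telemetry", ("ui_event", "ux_flow_event", "adoption_event", "performance_log", "search_query")),
    Domain("external_integrations", ("app_token", "api_call", "workflow_step", "external_partner", "integration_log")),
)


def domain_of(entity: str) -> Optional[str]:
    """Return the name of the domain that owns an entity, or None if unmapped."""
    for domain in DOMAINS:
        if entity in domain.entities:
            return domain.name
    return None


def unmapped(entities: Iterable[str]) -> List[str]:
    """List entities found in real systems that no domain claims yet."""
    return [e for e in entities if domain_of(e) is None]
```

Running `unmapped` over the entity names found in real logs and schemas gives a quick list of data that still needs a home in the map.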
🔄 Data Flows & Pipelines – How Data Moves
Guiding question: How does data move from creation to consumption?
Pipeline 1 – Real-time Event Pipeline
Stages:
- Client events generated (UI)
- Ingestion gateway
- Streaming queue (Kafka / PubSub)
- Event processors
- Storage in time-series DB or warehouse
Used for:
- Live updates
- Presence indicators
- Message posting & thread updates
- Alerting & notifications
- Analytics & dashboards
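The stages of the real-time pipeline can be sketched in a few lines. This is a toy, single-process illustration only: `queue.Queue` stands in for Kafka / PubSub, a dictionary stands in for the time-series store, and the required event fields (`type`, `user_id`, `ts`) are assumptions made for the example.

```python
import json
import queue
from collections import defaultdict

REQUIRED_FIELDS = {"type", "user_id", "ts"}  # assumed minimal event schema

stream = queue.Queue()     # stand-in for the streaming queue
store = defaultdict(list)  # stand-in for the time-series DB, keyed by event type


def ingest(raw: str) -> bool:
    """Ingestion gateway: parse and validate a client event, then enqueue it."""
    try:
        event = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(event, dict) or not REQUIRED_FIELDS.issubset(event):
        return False
    stream.put(event)
    return True


def process_all() -> int:
    """Event processor: drain the queue and persist each event by type."""
    processed = 0
    while not stream.empty():
        event = stream.get()
        store[event["type"]].append(event)
        processed += 1
    return processed
```

Rejecting malformed events at the gateway keeps downstream processors simple, at the cost of needing a separate path for inspecting what was dropped.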
Pipeline 2 – Search Indexing Pipeline
Stages:
- Message stored
- Tokenization & normalization
- Embedding generation (for AI search)
- Indexing in search clusters
- Refresh & ranking adjustments
Used for:
- Full-text search
- Semantic search
- AI conversation summaries
- Knowledge retrieval
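The keyword half of the indexing pipeline (tokenization, normalization, indexing) can be illustrated with an in-memory inverted index. This is a sketch under simplifying assumptions: the embedding and ranking stages are omitted, the tokenizer only handles ASCII letters and digits, and the function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

index: Dict[str, Set[str]] = defaultdict(set)  # token -> ids of messages containing it


def tokenize(text: str) -> List[str]:
    """Normalize to lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_message(message_id: str, text: str) -> None:
    """Add every token of a stored message to the inverted index."""
    for token in tokenize(text):
        index[token].add(message_id)


def search(query: str) -> Set[str]:
    """Return ids of messages that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

A production search cluster would add ranking and a vector index for semantic search; the point here is only that queries and stored messages must pass through the same normalization.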
Pipeline 3 – AI Summarization Pipeline
Stages:
- Conversation or artifact retrieved
- Preprocessing & cleaning
- LLM summary generation
- Metadata tagging
- Caching & revalidation
Used for:
- Channel summaries
- Thread catch-up
- Daily digests
- Decision extraction
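The caching and revalidation stage is the part of this pipeline that is easiest to get wrong, so a sketch may help. The version below assumes a content fingerprint is used to decide whether a cached summary is still valid; that choice, the function names, and the stub summarizer are all assumptions for illustration, and no real LLM is called.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# conversation_id -> (content fingerprint, cached summary)
_cache: Dict[str, Tuple[str, str]] = {}


def fingerprint(messages: List[str]) -> str:
    """Hash the conversation content so any change invalidates the cache."""
    joined = "\x1f".join(messages)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def summarize_stub(messages: List[str]) -> str:
    """Placeholder for the LLM call; returns a trivial description."""
    return f"{len(messages)} messages"


def get_summary(
    conversation_id: str,
    messages: List[str],
    summarize: Callable[[List[str]], str] = summarize_stub,
) -> str:
    """Return a cached summary if the content is unchanged, else regenerate it."""
    digest = fingerprint(messages)
    cached = _cache.get(conversation_id)
    if cached is not None and cached[0] == digest:
        return cached[1]
    summary = summarize(messages)
    _cache[conversation_id] = (digest, summary)
    return summary
```

With this shape, the summarizer runs only when the conversation actually changes, which is one way to avoid reprocessing the same content repeatedly.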
Pipeline 4 – ETL / Warehouse Sync
Stages:
- Batch or micro-batch extract
- Transform into analytics schemas
- Load into warehouse
- Expose through BI tools
Used for:
- Retention analysis
- Funnel metrics
- Enterprise reporting
- Billing & usage scoring
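The extract, transform, and load stages can be sketched with Python's built-in `sqlite3` module standing in for the warehouse. The table name `fact_events`, its columns, and the raw event fields are assumptions made for this example.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Iterable, Tuple

Row = Tuple[str, str, str, str]


def transform(event: dict) -> Row:
    """Flatten a raw event into an analytics row keyed by UTC day."""
    day = datetime.fromtimestamp(event["ts"], tz=timezone.utc).date().isoformat()
    return (day, event["workspace_id"], event["user_id"], event["type"])


def load(conn: sqlite3.Connection, rows: Iterable[Row]) -> None:
    """Load transformed rows into the (hypothetical) fact_events table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_events "
        "(day TEXT, workspace_id TEXT, user_id TEXT, event_type TEXT)"
    )
    conn.executemany("INSERT INTO fact_events VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def daily_active_users(conn: sqlite3.Connection, day: str) -> int:
    """Count distinct users with at least one event on the given day."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT user_id) FROM fact_events WHERE day = ?", (day,)
    ).fetchone()
    return row[0]
```

A batch run would then look like `load(conn, [transform(e) for e in raw_events])`, followed by queries such as `daily_active_users(conn, "2024-01-01")` from the BI layer.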
🧱 Storage & Architecture – Where Data Lives
Datastores and their jobs:
- Relational DB
  - Use: Users, orgs, channels, permissions, metadata
  - Notes: Strong consistency required for identity and access.
- Object Storage
  - Use: Files, media, canvas versions
  - Notes: Versioning, scanning, encryption at rest.
- Search Clusters
  - Use: Messages, threads, artifacts
  - Notes: Combines keyword indexing and vector embeddings.
- Time-series DB
  - Use: Metrics, telemetry, performance logs
  - Notes: Used by SRE, reliability, and product analytics.
- Data Warehouse
  - Use: Analytics, BI, dashboards, segmentation
  - Notes: Source of truth for user and workspace metrics.
- Cache / KV Store
  - Use: Presence, recent items, hot keys, ephemeral data
  - Notes: Supports real-time responsiveness.
🔐 Access & Permissions – Who Sees What
Guiding question: Who has access to what data, and how is it enforced?
Principles:
- Least privilege by default.
- Role-based permissions for org admins, owners, and users.
- Clear separation between internal staff, customers, and external partners.
- All access points audited.
Enforcement layers:
- Workspace-level permissions
- Channel membership
- Thread visibility
- Artifact-level permissions
- Admin override rules with audit documentation
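A layered check like the one described above could look roughly as follows. This is a simplified sketch covering only the workspace, channel membership, and admin override layers; the `User` and `Channel` types, the role names, and the audit record shape are assumptions, not a prescribed model.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class User:
    id: str
    workspace_ids: Set[str]
    role: str = "member"  # assumed roles: "member", "admin", "owner"


@dataclass
class Channel:
    id: str
    workspace_id: str
    members: Set[str] = field(default_factory=set)


audit_log: List[dict] = []  # every decision is recorded, allowed or not


def can_read(user: User, channel: Channel, override_reason: Optional[str] = None) -> bool:
    """Evaluate access layer by layer, denying by default (least privilege)."""
    if channel.workspace_id not in user.workspace_ids:
        allowed, rule = False, "workspace"
    elif user.id in channel.members:
        allowed, rule = True, "channel_membership"
    elif user.role in ("admin", "owner") and override_reason:
        allowed, rule = True, "admin_override"
    else:
        allowed, rule = False, "least_privilege_default"
    audit_log.append({
        "user": user.id,
        "channel": channel.id,
        "allowed": allowed,
        "rule": rule,
        "reason": override_reason,
    })
    return allowed
```

Note that an admin without a documented reason is denied: in this sketch the override only applies when a reason is supplied, and the reason lands in the audit record.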
🛡 Governance & Compliance – How Data Stays Legit
Guiding question: How do we ensure data is secure, compliant, and high-integrity?
Policies:
- Encryption in transit and at rest.
- Data residency options for enterprise customers.
- Retention settings configurable per workspace.
- Export tools for compliance and eDiscovery.
- Audit logs for all critical actions.
Compliance frameworks:
- SOC 2
- ISO 27001
- GDPR
- HIPAA (if applicable)
- FedRAMP / GovCloud (for government workspaces)
📊 Analytics & Insights – What You Learn from Data
Guiding question: What metrics and insights are generated from data?
Product Metrics
- Daily Active Users
- Weekly Active Channels
- Messages sent per user
- Search usage
- Workflow Builder usage
- AI summary usage
Experience Metrics
- Task completion time
- Flow drop-off
- Latency and error rates
- UX friction points from telemetry
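As an example of turning telemetry into an experience metric, flow drop-off can be computed from the number of users reaching each step. The function name and input shape below are assumptions made for illustration.

```python
from typing import List, Tuple


def step_dropoff(step_counts: List[Tuple[str, int]]) -> List[Tuple[str, float]]:
    """Compute the share of users lost between each pair of consecutive steps.

    step_counts is an ordered list of (step_name, users_reaching_step).
    """
    result = []
    for (name_a, reached_a), (name_b, reached_b) in zip(step_counts, step_counts[1:]):
        rate = 0.0 if reached_a == 0 else (reached_a - reached_b) / reached_a
        result.append((f"{name_a} -> {name_b}", round(rate, 3)))
    return result
```

For a flow with 100 users opening a composer, 80 typing, and 60 sending, this reports a 20% drop at the first transition and 25% at the second.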
Business Metrics
- Retention and expansion
- Activation milestones
- Seat growth
- External collaboration adoption
Marketing Metrics
- Attribution data
- Lifecycle segmentation
- Campaign performance
- Lead → conversion pipeline
🤖 AI & Automation – Turning Data into Leverage
Guiding question: How does data feed AI and automation systems?
AI Uses
- Summaries of channels, threads, and canvases
- Semantic search embeddings
- Decision extraction
- User preference prediction
- Workflow suggestions
Automation Uses
- Triggers based on message patterns
- Workflow Builder events
- Bot interactions
- Cross-platform signals
Responsible AI Policies
- AI never accesses content the user can’t already access.
- Summaries are cached and revalidated to avoid overprocessing.
- Models are tested to reduce hallucinations.
- Users give consent to, and have visibility into, AI operations.
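The first policy, that AI never accesses content the user can't already access, is usually enforced by filtering before the model sees anything. A minimal sketch, assuming channel membership is the only access rule and using hypothetical function names:

```python
from typing import Dict, List, Set


def visible_messages(
    user_id: str,
    messages: List[dict],
    channel_members: Dict[str, Set[str]],
) -> List[dict]:
    """Keep only messages from channels the requesting user belongs to."""
    return [
        m for m in messages
        if user_id in channel_members.get(m["channel_id"], set())
    ]


def build_summary_input(
    user_id: str,
    messages: List[dict],
    channel_members: Dict[str, Set[str]],
) -> str:
    """Assemble model input from permitted messages only."""
    allowed = visible_messages(user_id, messages, channel_members)
    return "\n".join(m["text"] for m in allowed)
```

Filtering at input time means a summary can never leak content through paraphrase, which is harder to guarantee if permissions are checked only on the output.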
📈 Quality & Reliability – Keeping Trust in the Data
Quality dimensions:
- Latency (message post, render, search)
- Event delivery reliability
- Data correctness
- Search accuracy
- AI summary precision
- Zero data loss at scale
Monitoring:
- Real-time dashboards for ingestion and pipeline health
- Anomaly detection on message volume
- Alerting rules for indexing delays
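Anomaly detection on message volume can start very simply. The sketch below uses a z-score against recent history, which is one common baseline rather than a method this framework prescribes; the threshold of 3 standard deviations is an assumed default.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomalous(history: Sequence[float], current: float, threshold: float = 3.0) -> bool:
    """Flag the current volume if it is far from the recent mean.

    history holds recent per-interval message counts; at least two
    points are needed to estimate spread.
    """
    if len(history) < 2:
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any deviation at all is unusual.
        return current != mu
    return abs(current - mu) / sigma > threshold
```

This ignores seasonality (weekday versus weekend traffic, for instance), so a real alerting rule would likely compare against the same interval in previous weeks.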
🕰 Lifecycle & Retention – How Data Ages
Phases:
- Creation
  - Messages, events, files, artifacts, telemetry
- Active use
  - Displayed in UI, threads, search, canvases
- Archival
  - Older content in cheaper storage tiers
- Deletion
  - Retention-based or admin-initiated removals
Retention rules:
- Users and admins control visibility and retention.
- Search respects retention windows.
- Deletion propagates to all indexes and caches.
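The requirement that deletion propagates to all indexes and caches can be sketched with three in-memory stores. The store names and the whitespace tokenizer are assumptions; real systems would propagate deletions asynchronously across separate services.

```python
from typing import Dict, List, Set

primary: Dict[str, dict] = {}            # source of truth
search_index: Dict[str, Set[str]] = {}   # token -> message ids
cache: Dict[str, dict] = {}              # hot copies of recent messages


def store_message(message_id: str, text: str, created_at: float) -> None:
    """Write a message to the primary store, the cache, and the search index."""
    record = {"text": text, "created_at": created_at}
    primary[message_id] = record
    cache[message_id] = record
    for token in text.lower().split():
        search_index.setdefault(token, set()).add(message_id)


def delete_message(message_id: str) -> None:
    """Remove a message everywhere it may live, not just the primary store."""
    primary.pop(message_id, None)
    cache.pop(message_id, None)
    for token in list(search_index):
        search_index[token].discard(message_id)
        if not search_index[token]:
            del search_index[token]


def apply_retention(now: float, retention_seconds: float) -> List[str]:
    """Delete every message older than the retention window; return their ids."""
    expired = [
        mid for mid, rec in primary.items()
        if now - rec["created_at"] > retention_seconds
    ]
    for mid in expired:
        delete_message(mid)
    return expired
```

Because `delete_message` is the single path for both retention-based and admin-initiated removals, search results cannot outlive the content they point to.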
🚧 Risks & Guardrails – How It Fails, How You Prevent It
Risks:
- Data overload causing slow search and degraded performance.
- Inaccurate or outdated search indexes creating trust issues.
- AI summarizing sensitive content incorrectly.
- Broken workflows due to missing telemetry.
Guardrails:
- Strict pipeline ownership per data domain.
- Automated reindexing for stale content.
- AI summaries labeled and easily toggled off.
- Rate limiting on ingestion systems under overload.
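Rate limiting on ingestion is often implemented as a token bucket. The sketch below is one such implementation, not the only option; the class name and parameters are illustrative, and the caller passes the current time explicitly so the behavior is deterministic and easy to test.

```python
class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start full
        self.updated = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

An ingestion gateway would keep one bucket per client or workspace and reject or defer events when `allow` returns `False`, so one noisy source cannot starve the rest of the pipeline.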
🧙‍♂️ Data Archetype – Who the System “Is”
Guiding question: If the data system were a role in the organization, who would it be?
- Primary archetype: Archivist
- Secondary archetype: Strategist
The data system remembers everything, organizes it intelligently,
and provides the insight and foresight needed to make strategic decisions at scale.
📌 How to Use This Data Strand in Practice
- Run a cross-functional workshop
  - Use this page as the agenda.
  - Fill in your company’s answers under each section.
- Map your real events and entities
  - Start from what actually exists: logs, messages, files, telemetry.
  - Place everything into domains and pipelines.
- Decide storage and access patterns
  - For each domain, decide:
    - where it lives (DB / warehouse / object store),
    - who can see it,
    - how long it lives.
- Wire analytics, AI, and automation explicitly
  - For each metric or AI use case, map:
    - source data → pipeline → model → UI surface.
- Define risks & guardrails up front
  - Decide how you detect failures,
  - and what should gracefully degrade when they happen.
Screenshotable line:
“Your Data Strand isn’t a dashboard — it’s the operating system that decides what your company can know, automate, and safely promise.”

