get_cluster
Description: Retrieve analytics for a semantic cluster including top predictors, discriminators, and exemplars
Tool Identifier: get_cluster
Input Parameters
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| cluster_id | string | Yes | - | - | Unique identifier of the cluster to retrieve (also accepts cluster_hash) |
| cluster_name | enum | Yes | - | Must be valid cluster name | Name of the semantic cluster (discover valid names via list_clusters) |
| domain | enum | Yes | - | Must be valid domain (see below) | Domain category of the cluster |
| analytics_depth | number | No | 10 | Min: 5, Max: 50 | Number of top items to return for each analytics category |
| workflow_id | string | No | - | Valid UUID | Workflow session identifier for correlation |
Parameter Details:
cluster_id:
- Unique string identifier for the cluster
- Also accepts cluster_hash values (32-character hexadecimal stable identifiers that persist across rebuilds)
- Used as a fallback identifier in the query (OR condition with cluster_name)
- Examples: "cluster_gender_male", "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6"
cluster_name:
- Must be one of 346 valid cluster names
- Use the list_clusters tool to discover available cluster names
- Examples: "gender", "education", "marital_status", "household_income_range", "golf_affinity", "is_dog_owner"
- Case-sensitive exact match required
domain:
- Must be one of the following valid domains:
  - "purchase" - Purchase behavior and transaction data
  - "demographic" - Age, gender, education, ethnicity, marital status
  - "intent" - Consumer intent signals and purchase indicators
  - "interest" - Hobbies, activities, and interests
  - "financial" - Financial status, credit, investments
  - "firmographic" - Business/company attributes
  - "affinity" - Brand and category affinities
  - "content" - Content consumption patterns
  - "employment" - Job and career information
  - "household" - Household composition and attributes
  - "lifestyle" - Lifestyle choices and behaviors
  - "political" - Political affiliations and contributions
analytics_depth:
- Controls how many top items are returned for each category
- Default: 10
- Range: 5 to 50
- Applied to: predictors, discriminators, cooccurring, segments, and exemplars
workflow_id:
- Optional UUID for tracking related tool calls in a session
- If not provided, a new workflow_id is generated
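The constraints above can also be checked on the client before the tool is called. The following is a minimal TypeScript sketch under stated assumptions, not the server's implementation: `CLUSTER_DOMAINS`, `isClusterHash`, and `buildGetClusterRequest` are hypothetical names, the domain list is copied from the parameter details above, and accepting either letter case in a cluster_hash is an assumption (this document does not state the case). The 346 cluster names are not validated here because they are only available through list_clusters.

```typescript
// Domain values copied from the parameter details above.
const CLUSTER_DOMAINS = [
  "purchase", "demographic", "intent", "interest", "financial", "firmographic",
  "affinity", "content", "employment", "household", "lifestyle", "political",
] as const;
type ClusterDomain = (typeof CLUSTER_DOMAINS)[number];

interface GetClusterRequest {
  cluster_id: string;
  cluster_name: string;
  domain: ClusterDomain;
  analytics_depth: number;
  workflow_id?: string;
}

// Hypothetical helper: true when the value looks like a cluster_hash
// (32 hexadecimal characters). Accepting either letter case is an assumption.
function isClusterHash(value: string): boolean {
  return /^[0-9a-fA-F]{32}$/.test(value);
}

// Hypothetical helper: applies the documented default (10) and range (5 to 50)
// for analytics_depth and rejects unknown domains before a request is sent.
function buildGetClusterRequest(
  clusterId: string,
  clusterName: string,
  domain: string,
  analyticsDepth: number = 10,
  workflowId?: string,
): GetClusterRequest {
  if ((CLUSTER_DOMAINS as readonly string[]).indexOf(domain) === -1) {
    throw new Error("Invalid domain: " + domain);
  }
  // Written as a negated range check so that NaN is rejected as well.
  if (!(analyticsDepth >= 5 && analyticsDepth <= 50)) {
    throw new Error("Analytics depth must be between 5 and 50");
  }
  const request: GetClusterRequest = {
    cluster_id: clusterId,
    cluster_name: clusterName,
    domain: domain as ClusterDomain,
    analytics_depth: analyticsDepth,
  };
  // Leave workflow_id out when absent so the service can generate one.
  if (workflowId !== undefined) {
    request.workflow_id = workflowId;
  }
  return request;
}
```

Calling `buildGetClusterRequest("cluster_golf_affinity", "golf_affinity", "affinity")` yields a request with `analytics_depth` set to the default of 10 and no `workflow_id` field.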
Request Schema (Zod):
{
cluster_id: z.string(),
cluster_name: z.enum(clusterNames), // 346 valid values
domain: z.enum(clusterDomains), // 12 valid values
analytics_depth: z.number().min(5).max(50).default(10),
workflow_id: z.string().uuid().optional() // generated when omitted
}
Output Format
Success Response:
{
cluster: {
cluster_id: string,
cluster_hash: string,
name: string,
value: string,
domain: string,
size: number,
prevalence: number,
description?: string,
member_count?: number,
top_predictors: Array<{
cluster_id: string,
lift: number,
rank: number
}>,
top_discriminators: Array<{
cluster_id: string,
cohens_d: number,
rank: number
}>,
top_cooccurring: Array<{
cluster_id: string,
prevalence: number,
rank: number
}>,
top_segments: Array<{
segment_id: string,
cohens_d: number,
rank: number
}>,
top_exemplars: Array<{
person_id: string,
distance: number,
rank: number
}>
},
tool_trace_id: string,
workflow_id: string
}
Response Fields:
| Field | Type | Description |
|---|---|---|
| cluster.cluster_id | string | Unique identifier for the cluster |
| cluster.cluster_hash | string | Stable identifier that persists across cluster rebuilds |
| cluster.name | string | Human-readable name of the cluster |
| cluster.value | string | The specific value or category within the cluster |
| cluster.domain | string | Domain category (purchase, demographic, etc.) |
| cluster.size | number | Number of individuals in this cluster |
| cluster.prevalence | number | Proportion of total population in this cluster (0-1, where 1.0 = 100%) |
| cluster.description | string | Optional description of the cluster |
| cluster.member_count | number | Optional count of members in this cluster |
| cluster.top_predictors | array | Clusters that predict membership in this cluster (sorted by lift) |
| cluster.top_predictors[].cluster_id | string | ID of the predictor cluster |
| cluster.top_predictors[].lift | number | Lift score indicating prediction strength |
| cluster.top_predictors[].rank | number | Rank order of this predictor |
| cluster.top_discriminators | array | Clusters that distinguish this cluster from others (sorted by Cohen's d) |
| cluster.top_discriminators[].cluster_id | string | ID of the discriminating cluster |
| cluster.top_discriminators[].cohens_d | number | Cohen's d effect size |
| cluster.top_discriminators[].rank | number | Rank order of this discriminator |
| cluster.top_cooccurring | array | Clusters that commonly appear with this cluster (sorted by prevalence) |
| cluster.top_cooccurring[].cluster_id | string | ID of the co-occurring cluster |
| cluster.top_cooccurring[].prevalence | number | Prevalence score |
| cluster.top_cooccurring[].rank | number | Rank order of this co-occurrence |
| cluster.top_segments | array | Segments strongly associated with this cluster |
| cluster.top_segments[].segment_id | string | ID of the segment |
| cluster.top_segments[].cohens_d | number | Cohen's d effect size |
| cluster.top_segments[].rank | number | Rank order of this segment |
| cluster.top_exemplars | array | Representative persons (exemplars) for this cluster |
| cluster.top_exemplars[].person_id | string | ID of the exemplar person |
| cluster.top_exemplars[].distance | number | Distance metric from cluster centroid |
| cluster.top_exemplars[].rank | number | Rank order of this exemplar |
| tool_trace_id | string | OpenTelemetry trace ID for this tool execution |
| workflow_id | string | Workflow session identifier |
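As an illustration of reading these fields, the sketch below pulls a short summary out of a response's `cluster` object. It is a hedged example rather than part of the tool: `topRanked` and `summarizeCluster` are hypothetical helper names, only a subset of the fields is typed, and the code assumes that rank 1 marks the strongest entry (as in the example response) without assuming the arrays arrive sorted.

```typescript
// Element shapes mirror a subset of the Success Response schema in this section.
interface RankedPredictor { cluster_id: string; lift: number; rank: number; }
interface RankedExemplar { person_id: string; distance: number; rank: number; }
interface ClusterAnalytics {
  name: string;
  value: string;
  prevalence: number; // proportion in the range 0-1
  top_predictors: RankedPredictor[];
  top_exemplars: RankedExemplar[];
}

// Hypothetical helper: returns the entry with the lowest rank number,
// or undefined for an empty array. Assumes rank 1 is the strongest entry.
function topRanked<T extends { rank: number }>(items: T[]): T | undefined {
  let best: T | undefined = undefined;
  for (const item of items) {
    if (best === undefined || item.rank < best.rank) {
      best = item;
    }
  }
  return best;
}

// Hypothetical helper: condenses a cluster into a few display-ready values.
function summarizeCluster(cluster: ClusterAnalytics) {
  const predictor = topRanked(cluster.top_predictors);
  const exemplar = topRanked(cluster.top_exemplars);
  return {
    label: cluster.name + "=" + cluster.value,
    // prevalence is a proportion, so multiply by 100 for a percentage string.
    prevalencePercent: (cluster.prevalence * 100).toFixed(1),
    topPredictor: predictor ? predictor.cluster_id : null,
    topExemplar: exemplar ? exemplar.person_id : null,
  };
}
```

Both helpers return `null` or `undefined` for empty arrays, so a cluster with no predictors or exemplars does not need special handling by the caller.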
Example Response:
{
"cluster": {
"cluster_id": "cluster_gender_male",
"cluster_hash": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6",
"name": "gender",
"value": "Male",
"domain": "demographic",
"size": 125000000,
"prevalence": 0.489,
"top_predictors": [
{
"cluster_id": "cluster_sports_affinity_high",
"lift": 2.3,
"rank": 1
},
{
"cluster_id": "cluster_auto_affinity_high",
"lift": 1.8,
"rank": 2
}
],
"top_discriminators": [
{
"cluster_id": "cluster_fashion_interest_low",
"cohens_d": 0.82,
"rank": 1
}
],
"top_cooccurring": [
{
"cluster_id": "cluster_technology_affinity",
"prevalence": 0.67,
"rank": 1
}
],
"top_segments": [
{
"segment_id": "segment_tech_enthusiasts",
"cohens_d": 0.75,
"rank": 1
}
],
"top_exemplars": [
{
"person_id": "person_xyz123",
"distance": 0.12,
"rank": 1
}
]
},
"tool_trace_id": "a1b2c3d4e5f6",
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Error Handling
Error Response Format:
{
"content": [
{
"type": "text",
"text": "Cluster analytics failed: <error message>"
}
],
"isError": true
}
Common Errors:
- Invalid cluster_name: Must be one of the 346 valid cluster names
- Invalid domain: Must be one of the 12 valid domains
- Analytics depth out of range: "Analytics depth must be between 5 and 50"
- Cluster not found: Verify the cluster name and domain combination are valid
- Service temporarily unavailable: "Request failed - please try again"
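A caller can detect the error format above by checking `isError` and reading the text content. The sketch below is one hedged way to do that: `extractToolError` and `isRetryable` are hypothetical helper names, and treating the "please try again" message as retryable is an assumption drawn from its wording, not documented behavior.

```typescript
// Shapes mirror the Error Response Format in this section.
interface ToolTextContent { type: string; text: string; }
interface ToolResult { content: ToolTextContent[]; isError?: boolean; }

const ERROR_PREFIX = "Cluster analytics failed: ";

// Hypothetical helper: returns the error message without the fixed prefix,
// or null when the result is not flagged as an error.
function extractToolError(result: ToolResult): string | null {
  if (result.isError !== true) {
    return null;
  }
  const text = result.content
    .filter((item) => item.type === "text")
    .map((item) => item.text)
    .join("\n");
  return text.indexOf(ERROR_PREFIX) === 0 ? text.slice(ERROR_PREFIX.length) : text;
}

// Hypothetical helper: assumes the "please try again" wording signals a
// transient failure that is worth retrying.
function isRetryable(message: string): boolean {
  return message.indexOf("please try again") !== -1;
}
```

Validation errors such as an out-of-range analytics_depth are not retryable under this sketch, since resending the same input would fail the same way.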
Analytics Metrics Explained
- Predictors (lift): Clusters that strongly predict membership in this cluster. Higher lift = stronger prediction.
- Discriminators (Cohen's d): Clusters that distinguish members from non-members. Higher Cohen's d = stronger differentiation.
- Co-occurring (prevalence): Clusters that frequently appear together with this cluster. Higher prevalence = more common co-occurrence.
- Segments (Cohen's d): Pre-defined segments strongly associated with this cluster.
- Exemplars (distance): Representative individuals who typify this cluster. Lower distance = more representative.
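To make these scores easier to read at a glance, the sketch below attaches rough labels to them. It is an interpretation aid only, not part of the tool: `describeCohensD` and `describeLift` are hypothetical helpers, the 0.2 / 0.5 / 0.8 cut-offs are Cohen's conventional rule of thumb for small, medium, and large effects, and the lift reading assumes the standard definition in which 1.0 means no association. This document does not state how the service computes either score.

```typescript
// Hypothetical helper: labels an effect size using Cohen's conventional
// thresholds (0.2 small, 0.5 medium, 0.8 large). The sign is ignored.
function describeCohensD(d: number): string {
  const magnitude = Math.abs(d);
  if (magnitude >= 0.8) return "large";
  if (magnitude >= 0.5) return "medium";
  if (magnitude >= 0.2) return "small";
  return "negligible";
}

// Hypothetical helper: assumes the standard reading of lift, where values
// above 1 indicate over-representation and values below 1 under-representation.
function describeLift(lift: number): string {
  if (lift > 1) return "over-represented";
  if (lift < 1) return "under-represented";
  return "no association";
}
```

Applied to the example response, a discriminator with `cohens_d` of 0.82 is labeled "large" and a predictor with `lift` of 2.3 is labeled "over-represented".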
Usage Examples
Example 1: Get basic demographics cluster
{
"cluster_id": "cluster_gender",
"cluster_name": "gender",
"domain": "demographic"
}
Example 2: Deep dive into golf affinity
{
"cluster_id": "cluster_golf_affinity",
"cluster_name": "golf_affinity",
"domain": "affinity",
"analytics_depth": 50
}
Example 3: Household income analysis
{
"cluster_id": "cluster_income",
"cluster_name": "household_income_range",
"domain": "financial",
"analytics_depth": 25
}
Example 4: Pet ownership cluster
{
"cluster_id": "cluster_dog_owner",
"cluster_name": "is_dog_owner",
"domain": "lifestyle",
"analytics_depth": 15
}
find_persons
Find persons using cluster criteria and/or location filters. Supports boolean expressions (AND/OR/NOT), geospatial radius search, optional identifier enrichment, and export.
submit_feedback
Submit structured feedback on data quality, tool behavior, or results. Feedback is associated with a workflow session and optionally linked to specific tool executions.