search_clusters
Description: Semantic search for clusters using natural language descriptions. Converts your query into a vector embedding and finds clusters with similar semantic meaning using cosine similarity. Use this when you know what you're looking for conceptually but don't know the exact cluster names.
Tool Identifier: search_clusters
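To make the ranking step in the description concrete, here is a minimal sketch of cosine similarity over embedding vectors. The `EmbeddedCluster` type, the `rankBySimilarity` helper, and the toy two-dimensional vectors are assumptions for illustration only; the embedding model, vector dimensions, and server-side ranking logic are not documented here.

```typescript
// Hypothetical shape: a cluster paired with its precomputed embedding.
interface EmbeddedCluster {
  cluster_id: number;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every cluster against the query embedding, sort descending, keep `limit`.
function rankBySimilarity(
  queryEmbedding: number[],
  clusters: EmbeddedCluster[],
  limit = 10
): Array<{ cluster_id: number; similarity_score: number }> {
  return clusters
    .map((c) => ({
      cluster_id: c.cluster_id,
      similarity_score: cosineSimilarity(queryEmbedding, c.embedding),
    }))
    .sort((x, y) => y.similarity_score - x.similarity_score)
    .slice(0, limit);
}
```

A production system would typically use an approximate nearest-neighbor index rather than scoring every cluster; the linear scan here only shows what the similarity score measures.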
Input Parameters
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| query | string | Yes | - | Min length: 1 | Natural language description of desired clusters |
| limit | number | No | 10 | Min: 1, Max: 100 | Maximum results to return |
| domains | array | No | - | Valid cluster domains | Filter results to specific domains (see domain list) |
| workflow_id | string | No | - | Valid UUID | Workflow session identifier for correlation |
Parameter Details
query:
- Natural language description of the clusters you're looking for
- The query is converted to a vector embedding and compared against cluster embeddings
- More descriptive queries generally produce better results
- Examples:
  - "people interested in outdoor activities and camping"
  - "high income households with investments"
  - "sports enthusiasts who play golf"
domains:
- Optional filter to limit results to specific cluster domains
- Valid values:
purchase, demographic, intent, interest, financial, firmographic, affinity, content, employment, household, lifestyle, political
- Example:
["affinity", "interest"]
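The constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not the server's actual validation: the `validateSearchClustersParams` helper is hypothetical, and the messages for unknown domains and malformed workflow IDs are assumptions (only the empty-query and limit messages appear in this document).

```typescript
// Domain values as listed in this section.
const VALID_DOMAINS: readonly string[] = [
  "purchase", "demographic", "intent", "interest", "financial", "firmographic",
  "affinity", "content", "employment", "household", "lifestyle", "political",
];

// Generic UUID shape (8-4-4-4-12 hex digits); the accepted UUID version is an assumption.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

interface SearchClustersParams {
  query: string;
  limit?: number;
  domains?: string[];
  workflow_id?: string;
}

// Returns a list of problems; an empty list means the parameters look valid.
function validateSearchClustersParams(params: SearchClustersParams): string[] {
  const errors: string[] = [];
  if (params.query.trim().length === 0) {
    errors.push("Query cannot be empty");
  }
  if (
    params.limit !== undefined &&
    (!Number.isFinite(params.limit) || params.limit < 1 || params.limit > 100)
  ) {
    errors.push(`Limit must be between 1 and 100, received ${params.limit}`);
  }
  for (const domain of params.domains ?? []) {
    if (!VALID_DOMAINS.includes(domain)) {
      errors.push(`Unknown domain: ${domain}`); // hypothetical message
    }
  }
  if (params.workflow_id !== undefined && !UUID_PATTERN.test(params.workflow_id)) {
    errors.push("workflow_id must be a valid UUID"); // hypothetical message
  }
  return errors;
}
```

Returning a list rather than throwing on the first problem lets a caller report every issue at once.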
Request Schema:
interface SearchClustersParams {
  query: string;
  limit?: number;
  domains?: string[];
  workflow_id?: string;
}
Output Format
Success Response:
{
  results: Array<{
    cluster_id: number;
    cluster_hash: string;
    domain: string;
    name: string;
    value: string;
    similarity_score: number;
    size: number;
    prevalence: number;
  }>;
  tool_trace_id: string;
  workflow_id: string;
}
Response Fields:
| Field | Type | Description |
|---|---|---|
| results | array | Array of matching clusters sorted by similarity |
| results[].cluster_id | number | Unique cluster identifier (use in find_persons expressions) |
| results[].cluster_hash | string | Stable cluster identifier (persists across rebuilds) |
| results[].domain | string | Domain category of the cluster |
| results[].name | string | Human-readable cluster name |
| results[].value | string | Specific value within the cluster |
| results[].similarity_score | number | Cosine similarity score (0-1, higher = more similar) |
| results[].size | number | Number of persons in this cluster |
| results[].prevalence | number | Proportion of total population in this cluster (0-1) |
| tool_trace_id | string | OpenTelemetry trace ID for this tool execution |
| workflow_id | string | Workflow session identifier |
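Because `cluster_id` is the value used in find_persons expressions, a common follow-up is to keep only sufficiently similar results and collect their IDs. The sketch below assumes a hypothetical `selectClusterIds` helper and an arbitrary threshold of 0.85; the sample values mirror the example response in this section, and the find_persons expression syntax itself is not shown because it is not documented here.

```typescript
interface ClusterResult {
  cluster_id: number;
  cluster_hash: string;
  domain: string;
  name: string;
  value: string;
  similarity_score: number;
  size: number;
  prevalence: number;
}

// Keep results at or above the threshold and return their cluster IDs in order.
function selectClusterIds(results: ClusterResult[], minSimilarity: number): number[] {
  return results
    .filter((r) => r.similarity_score >= minSimilarity)
    .map((r) => r.cluster_id);
}

const sampleResults: ClusterResult[] = [
  {
    cluster_id: 1000000045,
    cluster_hash: "def456a1b2c3d4e5f6a7b8c9d0e1f2a3",
    domain: "affinity",
    name: "camping_affinity",
    value: "high",
    similarity_score: 0.892,
    size: 3456789,
    prevalence: 0.0134,
  },
  {
    cluster_id: 1000000067,
    cluster_hash: "abc789b2c3d4e5f6a7b8c9d0e1f2a3b4",
    domain: "interest",
    name: "interested_hiking",
    value: "true",
    similarity_score: 0.845,
    size: 5678901,
    prevalence: 0.0221,
  },
];

console.log(selectClusterIds(sampleResults, 0.85)); // [ 1000000045 ]
```

The right threshold depends on the query and the embedding model, so treat 0.85 as a placeholder to tune rather than a recommended value.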
Example Response:
{
  "results": [
    {
      "cluster_id": 1000000045,
      "cluster_hash": "def456a1b2c3d4e5f6a7b8c9d0e1f2a3",
      "domain": "affinity",
      "name": "camping_affinity",
      "value": "high",
      "similarity_score": 0.892,
      "size": 3456789,
      "prevalence": 0.0134
    },
    {
      "cluster_id": 1000000067,
      "cluster_hash": "abc789b2c3d4e5f6a7b8c9d0e1f2a3b4",
      "domain": "interest",
      "name": "interested_hiking",
      "value": "true",
      "similarity_score": 0.845,
      "size": 5678901,
      "prevalence": 0.0221
    }
  ],
  "tool_trace_id": "a1b2c3d4e5f6",
  "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Error Handling
Error Response Format:
{
  "content": [
    {
      "type": "text",
      "text": "Cluster search failed: <error message>"
    }
  ],
  "isError": true
}
Common Errors:
- Empty or whitespace query: "Query cannot be empty"
- Limit out of range: "Limit must be between 1 and 100, received {value}"
- Embedding service failure: "Failed to search clusters: {error details}"
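A caller can tell an error response from a success response by checking `isError` and reading the text entries in `content`. This is a minimal sketch under that assumption; the `isToolError` and `extractErrorMessage` helpers are hypothetical names, not part of the tool's API.

```typescript
interface ToolTextContent {
  type: string;
  text: string;
}

interface ToolErrorResponse {
  content: ToolTextContent[];
  isError: true;
}

// Type guard: true only for objects shaped like the error response format.
function isToolError(response: unknown): response is ToolErrorResponse {
  if (typeof response !== "object" || response === null) {
    return false;
  }
  const candidate = response as { content?: unknown; isError?: unknown };
  return (
    candidate.isError === true &&
    Array.isArray(candidate.content) &&
    candidate.content.every(
      (item) =>
        typeof item === "object" &&
        item !== null &&
        typeof (item as { type?: unknown }).type === "string" &&
        typeof (item as { text?: unknown }).text === "string"
    )
  );
}

// Returns the joined error text, or null when the response is not an error.
function extractErrorMessage(response: unknown): string | null {
  if (!isToolError(response)) {
    return null;
  }
  return response.content
    .filter((item) => item.type === "text")
    .map((item) => item.text)
    .join("\n");
}
```

Returning `null` for non-error responses lets the same call site fall through to normal result handling.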
Usage Examples
Example 1: Semantic search with domain filter
{
  "query": "people interested in outdoor activities and camping",
  "domains": ["affinity", "interest"],
  "limit": 20
}
Example 2: Finding high-income segments
{
  "query": "high income households with investments",
  "domains": ["financial"],
  "limit": 10
}
Example 3: Sports and recreation clusters
{
  "query": "sports enthusiasts who play golf",
  "limit": 15
}
list_clusters
Discover available clusters by browsing domains. Use this tool to identify cluster IDs before calling find_persons.
resolve_identities
Resolve person identities by matching emails, phones, addresses, or mobile advertising IDs (MAIDs). Supports querying multiple identifier types in a single call with Noisy-OR quality score aggregation.