find_persons
Description: Find persons using cluster criteria and/or location filters. Supports boolean expressions (AND/OR/NOT), geospatial radius search, optional identifier enrichment, and export. Returns total count and sample records.
Tool Identifier: find_persons
Recommended Workflow:
- Call list_clusters with desired domains to browse available clusters
- Review results to find relevant clusters
- Extract cluster_id values from results
- Use those IDs to build your boolean expression for find_persons
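The last two workflow steps can be sketched as below. This is a minimal illustration, not the tool's own client code: the ClusterSummary shape and the buildAnyOfExpression helper are hypothetical, and only the cluster_id field name comes from this page.

```typescript
// Hypothetical shape of one list_clusters result; only cluster_id is named in this document.
interface ClusterSummary {
  cluster_id: string | number;
  name?: string;
}

// Extract the IDs and OR them together into a find_persons expression.
function buildAnyOfExpression(clusters: ClusterSummary[]): string {
  const ids = clusters.map((c) => String(c.cluster_id));
  if (ids.length === 0) {
    throw new Error("No clusters selected");
  }
  const joined = ids.join(" OR ");
  // Parentheses keep the group intact if it is later combined with AND / AND NOT.
  return ids.length > 1 ? `(${joined})` : joined;
}

const request = {
  expression: buildAnyOfExpression([
    { cluster_id: "1000000001" },
    { cluster_id: "1000000002" },
  ]),
};
// request.expression is "(1000000001 OR 1000000002)"
```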
Input Parameters
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| expression | string | No* | - | Boolean expression | Cluster boolean expression (AND/OR/NOT operators) |
| location | object | No* | - | Lat/lng/radius/unit | Geospatial filter |
| identifier_type | string | No | - | "phone", "email", or "address" | Include contact info in results |
| audience_limit | number | No | 1000000 | Min: 1, Max: 15000000 | Maximum persons to return (default 1M, max 15M) |
| offset | number | No | 0 | Min: 0 | Number of persons to skip for pagination. Requires workflow_id when > 0 |
| format | string | No | "none" | "none", "csv", "json", "jsonl" | Export format - generates presigned S3 URL valid for 1 hour |
| workflow_id | string | No | - | Valid UUID | Workflow session identifier for deterministic sampling |
*Required: At least one of expression or location must be provided
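The constraints in the table can be checked on the client before a request is sent. The sketch below is one way to do that; validateRequest is a hypothetical helper, and the server remains the authority on what it accepts.

```typescript
// Mirrors the documented input parameters of find_persons.
interface FindPersonsRequest {
  expression?: string;
  location?: { latitude: number; longitude: number; radius: number; unit: "km" | "miles" };
  identifier_type?: "phone" | "email" | "address";
  audience_limit?: number;
  offset?: number;
  format?: "none" | "csv" | "json" | "jsonl";
  workflow_id?: string;
}

// Returns a list of problems; an empty list means the request passes these checks.
function validateRequest(req: FindPersonsRequest): string[] {
  const problems: string[] = [];
  if (req.expression === undefined && req.location === undefined) {
    problems.push("at least one of expression or location must be provided");
  }
  const limit = req.audience_limit ?? 1000000; // documented default
  if (!(limit >= 1 && limit <= 15000000)) {
    problems.push("audience_limit must be between 1 and 15000000");
  }
  const offset = req.offset ?? 0; // documented default
  if (!(offset >= 0)) {
    problems.push("offset must be >= 0");
  }
  if (offset > 0 && !req.workflow_id) {
    problems.push("workflow_id is required when offset > 0");
  }
  return problems;
}
```

For example, validateRequest({ expression: "1000000001", offset: 500000 }) reports the missing workflow_id, while validateRequest({ expression: "1000000001" }) returns an empty list.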
Parameter Details:
expression:
- Boolean expression combining cluster IDs or cluster hashes with AND, OR, NOT operators
- Supports both numeric cluster IDs (e.g., "1000000001") and 32-character hexadecimal cluster hashes (e.g., "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6")
- Use parentheses for grouping
- NOT must be part of an AND expression (not standalone)
- Examples:
  - "1000000001"
  - "1000000001 AND 1000000002"
  - "(1000000001 OR 1000000002) AND NOT 1000000003"
  - "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6 AND b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7"
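The expression rules above lend themselves to a small builder that refuses to emit a NOT without a positive criterion. The following is a sketch under assumptions: buildExpression and the Criteria shape are hypothetical, and accepting uppercase hex digits in hashes is a guess rather than documented behavior.

```typescript
// Numeric cluster ID, or 32-character hexadecimal cluster hash.
// Assumption: hashes are matched case-insensitively.
const CLUSTER_REF = /^(\d+|[0-9a-f]{32})$/i;

interface Criteria {
  allOf?: string[];   // every one of these must match (AND)
  anyOf?: string[];   // at least one of these must match (OR)
  exclude?: string[]; // none of these may match (AND NOT)
}

function buildExpression({ allOf = [], anyOf = [], exclude = [] }: Criteria): string {
  for (const ref of [...allOf, ...anyOf, ...exclude]) {
    if (!CLUSTER_REF.test(ref)) {
      throw new Error(`Not a cluster ID or hash: ${ref}`);
    }
  }
  const positive = [...allOf];
  if (anyOf.length === 1) positive.push(...anyOf);
  if (anyOf.length > 1) positive.push(`(${anyOf.join(" OR ")})`);
  if (positive.length === 0) {
    // Mirrors the documented rule: NOT needs at least one positive criterion.
    throw new Error("At least one positive criterion is required");
  }
  return [...positive, ...exclude.map((ref) => `NOT ${ref}`)].join(" AND ");
}

const expression = buildExpression({
  anyOf: ["1000000001", "1000000002"],
  exclude: ["1000000003"],
});
// expression is "(1000000001 OR 1000000002) AND NOT 1000000003"
```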
location:
- Geospatial radius filter with dynamic H3 resolution selection
- Automatically selects optimal H3 resolution (3, 5, 7, or 9) based on radius
- Example:
{ "latitude": 37.7749, "longitude": -122.4194, "radius": 5, "unit": "km" }
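A location object can be sanity-checked against the ranges given in the request schema (latitude -90 to 90, longitude -180 to 180, positive radius). The helpers below are hypothetical and only illustrate those checks; radiusInKm is a convenience for comparing radii across units and is not part of the tool.

```typescript
interface LocationFilter {
  latitude: number;  // -90 to 90
  longitude: number; // -180 to 180
  radius: number;    // positive
  unit: "km" | "miles";
}

// Returns a list of problems; an empty list means the filter is within the documented ranges.
function checkLocation(loc: LocationFilter): string[] {
  const problems: string[] = [];
  if (!(loc.latitude >= -90 && loc.latitude <= 90)) {
    problems.push("latitude must be between -90 and 90");
  }
  if (!(loc.longitude >= -180 && loc.longitude <= 180)) {
    problems.push("longitude must be between -180 and 180");
  }
  if (!(loc.radius > 0)) {
    problems.push("radius must be positive");
  }
  return problems;
}

// One international mile is exactly 1.609344 km.
function radiusInKm(loc: LocationFilter): number {
  return loc.unit === "miles" ? loc.radius * 1.609344 : loc.radius;
}

const sanFrancisco: LocationFilter = { latitude: 37.7749, longitude: -122.4194, radius: 5, unit: "km" };
const locationProblems = checkLocation(sanFrancisco); // empty: all values are in range
```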
identifier_type:
- When specified, enriches results with contact information
- Returns up to 3 values per person for the requested type
- Example: "email" returns email1, email2, email3 in a nested structure
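Because the values arrive in numbered slots, callers often want them as a flat list. The sketch below shows one way to do that; identifierValues is a hypothetical helper, and the record shape follows the success response documented on this page.

```typescript
type IdentifierType = "phone" | "email" | "address";

interface SampleRecord {
  person_id: number;
  name?: string;
  identifiers?: Partial<Record<IdentifierType, Record<string, string | undefined>>>;
}

// Collects email1..email3 (or phone1..3, address1..3) into an array, skipping empty slots.
function identifierValues(record: SampleRecord, type: IdentifierType): string[] {
  const group = record.identifiers?.[type];
  if (!group) return [];
  const values: string[] = [];
  for (let i = 1; i <= 3; i++) {
    const value = group[`${type}${i}`];
    if (value !== undefined) values.push(value);
  }
  return values;
}

const person: SampleRecord = {
  person_id: 123456,
  identifiers: { email: { email1: "john@example.com", email2: "john.doe@work.com" } },
};
const emails = identifierValues(person, "email");
// emails is ["john@example.com", "john.doe@work.com"]
```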
audience_limit:
- Controls maximum number of persons returned
- Default: 1,000,000 (1M)
- Maximum: 15,000,000 (15M)
- Minimum: 1
- Deterministic sampling based on workflow_id for consistent results across calls
format:
- When set to csv, json, or jsonl, generates an S3 presigned download URL
- URL expires in 1 hour
- Returns export metadata in response
- Recommended for audiences >100k
offset:
- Number of persons to skip before returning results
- Enables pagination through large audiences
- Important: workflow_id is required when offset > 0 to ensure deterministic ordering
- Example: With audience_limit=500000 and offset=500000, returns persons 500,001 through 1,000,000
workflow_id:
- Optional UUID for deterministic sampling and session tracking
- Same workflow_id with same query parameters returns identical sample
- If not provided, a new workflow_id is generated
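The offset and workflow_id rules combine into a simple paging step: take has_more and next_offset from the previous response and carry the returned workflow_id forward. The sketch below assumes the response fields documented on this page; nextPageRequest and the trimmed PageRequest and PageResponse types are hypothetical.

```typescript
// Trimmed to the request fields relevant to paging.
interface PageRequest {
  expression?: string;
  audience_limit?: number;
  offset?: number;
  workflow_id?: string;
}

// Trimmed to the response fields relevant to paging.
interface PageResponse {
  has_more: boolean;
  next_offset?: number;
  workflow_id: string;
}

// Builds the request for the following page, or returns null when nothing is left to fetch.
function nextPageRequest(previous: PageRequest, response: PageResponse): PageRequest | null {
  if (!response.has_more || response.next_offset === undefined) return null;
  return {
    ...previous,
    offset: response.next_offset,
    // Reuse the workflow_id from the response so ordering stays deterministic across pages.
    workflow_id: response.workflow_id,
  };
}

const firstRequest: PageRequest = { expression: "1000000001 OR 1000000002", audience_limit: 500000 };
const firstResponse: PageResponse = {
  has_more: true,
  next_offset: 500000,
  workflow_id: "550e8400-e29b-41d4-a716-446655440000",
};
const secondRequest = nextPageRequest(firstRequest, firstResponse);
// secondRequest carries offset 500000 and the same workflow_id
```

Reusing the workflow_id from the first response also covers the case where the first request omitted it and the server generated one.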
Request Schema:
{
expression?: string;
location?: {
latitude: number; // -90 to 90
longitude: number; // -180 to 180
radius: number; // positive
unit: "km" | "miles";
};
identifier_type?: "phone" | "email" | "address";
audience_limit?: number; // 1 to 15M, default 1M
offset?: number; // 0 to total, default 0
format?: "none" | "csv" | "json" | "jsonl";
workflow_id?: string;
}
Output Format
Success Response:
{
total: number,
returned_count: number,
sample: Array<{
person_id: number;
name?: string;
identifiers?: {
email?: { email1?: string; email2?: string; email3?: string; };
phone?: { phone1?: string; phone2?: string; phone3?: string; };
address?: { address1?: string; address2?: string; address3?: string; };
};
}>,
export?: {
url: string;
format: "csv" | "json" | "jsonl";
rows: number;
size_bytes: number;
expires_at: string;
},
has_more: boolean,
next_offset?: number,
tool_trace_id: string,
workflow_id: string
}
Response Fields:
| Field | Type | Description |
|---|---|---|
| total | number | Total matching persons (full cardinality of query) |
| returned_count | number | Actual count returned (min of total and audience_limit) |
| sample | array | Up to 10 sample records for preview |
| sample[].person_id | number | Person ID |
| sample[].name | string | Person name (if available) |
| sample[].identifiers | object | Contact info (only if identifier_type specified) |
| export | object | Export metadata (only when format is csv/json/jsonl) |
| export.url | string | Presigned S3 download URL (expires in 1 hour) |
| export.format | string | Export format |
| export.rows | number | Number of rows in export |
| export.size_bytes | number | File size in bytes |
| export.expires_at | string | ISO 8601 expiration timestamp |
| has_more | boolean | Whether more results exist beyond current page |
| next_offset | number | Offset value for next page (only present when has_more=true) |
| tool_trace_id | string | OpenTelemetry trace ID for this tool execution |
| workflow_id | string | Workflow session identifier |
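Since export.url is only valid until export.expires_at, it can help to check the remaining time before starting a large download. This is a sketch with hypothetical helper names; the one-minute safety margin is an arbitrary choice, and the URL in the example is a placeholder.

```typescript
interface ExportInfo {
  url: string;
  format: "csv" | "json" | "jsonl";
  rows: number;
  size_bytes: number;
  expires_at: string; // ISO 8601 timestamp
}

// Milliseconds until the presigned URL expires; zero or negative means it has expired.
function msUntilExpiry(info: ExportInfo, now: Date = new Date()): number {
  return Date.parse(info.expires_at) - now.getTime();
}

// True when enough time remains to start a download.
// An unparseable timestamp yields NaN, which compares false here.
function isDownloadable(info: ExportInfo, now: Date = new Date(), safetyMarginMs = 60000): boolean {
  return msUntilExpiry(info, now) > safetyMarginMs;
}

const exportInfo: ExportInfo = {
  url: "https://example.com/placeholder-export.csv", // placeholder, not a real presigned URL
  format: "csv",
  rows: 45678,
  size_bytes: 1048576,
  expires_at: "2024-01-01T01:00:00Z",
};
const remainingMs = msUntilExpiry(exportInfo, new Date("2024-01-01T00:00:00Z"));
// remainingMs is 3600000 (one hour)
```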
Example Response (Basic):
{
"total": 45678,
"returned_count": 45678,
"sample": [
{ "person_id": 123456 },
{ "person_id": 234567 },
{ "person_id": 345678 }
],
"has_more": false,
"tool_trace_id": "a1b2c3d4e5f6",
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Example Response (With Identifiers):
{
"total": 1500,
"returned_count": 1500,
"sample": [
{
"person_id": 123456,
"identifiers": {
"email": {
"email1": "john@example.com",
"email2": "john.doe@work.com"
}
}
}
],
"has_more": false,
"tool_trace_id": "a1b2c3d4e5f6",
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Error Handling
Error Response Format:
{
"content": [
{
"type": "text",
"text": "Person search failed: <error message>"
}
],
"isError": true
}
Common Errors:
- Invalid cluster ID provided in expression
- Standalone NOT expression: "Standalone NOT expressions are not supported - use "X AND NOT Y" instead"
- Pure NOT expression: "Pure NOT expressions are not supported - must have at least one positive criterion"
- Unexpected token in boolean expression
- Missing closing parenthesis in expression
- Service temporarily unavailable: "Request failed - please try again"
- Request timeout: "Request took too long - try simplifying your expression"
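A caller can unwrap the error format above and decide whether a retry makes sense. The sketch below is illustrative only: errorMessage and isRetryable are hypothetical helpers, and treating only the "please try again" message as retryable is an assumption based on the wording of the listed errors.

```typescript
interface ToolResult {
  content: Array<{ type: string; text: string }>;
  isError?: boolean;
}

const ERROR_PREFIX = "Person search failed: ";

// Returns the bare error message, or null if the result is not an error.
function errorMessage(result: ToolResult): string | null {
  if (!result.isError) return null;
  const text = result.content.map((part) => part.text).join("\n");
  return text.indexOf(ERROR_PREFIX) === 0 ? text.slice(ERROR_PREFIX.length) : text;
}

// Assumption: the service-unavailable message is transient, while the timeout message
// asks for a simpler expression, so resending the same request is unlikely to help.
function isRetryable(message: string): boolean {
  return message.indexOf("please try again") !== -1;
}

const failed: ToolResult = {
  content: [{ type: "text", text: "Person search failed: Request failed - please try again" }],
  isError: true,
};
const message = errorMessage(failed);
// message is "Request failed - please try again"
```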
Performance Notes
- Queries execute in milliseconds for most cluster combinations
- Supports complex boolean expressions with nested parentheses
- Efficient handling of NOT operations
- Returns total audience size plus a sample of up to 10 records
Usage Examples
Example 1: Single cluster
{
"expression": "1000000001"
}
Example 2: Boolean expression
{
"expression": "(1000000001 OR 1000000002) AND NOT 1000000003"
}
Example 3: Boolean expression with cluster hashes
{
"expression": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6 AND b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7"
}
Example 4: Location-only search
{
"location": {
"latitude": 37.7749,
"longitude": -122.4194,
"radius": 5,
"unit": "km"
}
}
Example 5: Combined cluster + location
{
"expression": "1000000001 AND 1000000002",
"location": {
"latitude": 40.7128,
"longitude": -74.0060,
"radius": 10,
"unit": "miles"
}
}
Example 6: With identifier enrichment
{
"expression": "1000000001",
"identifier_type": "email"
}
Example 7: Large audience with limit (total exceeds limit)
{
"expression": "1000000001 OR 1000000002",
"audience_limit": 500000
}
Response shows total (full cardinality) and returned_count (limited):
{
"total": 2500000,
"returned_count": 500000,
"sample": [...],
"tool_trace_id": "...",
"workflow_id": "..."
}
Example 8: Paginate through large audience
{
"expression": "1000000001 OR 1000000002",
"audience_limit": 500000,
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Response for first page:
{
"total": 2500000,
"returned_count": 500000,
"sample": [
{ "person_id": 123456 },
{ "person_id": 234567 }
],
"has_more": true,
"next_offset": 500000,
"tool_trace_id": "a1b2c3d4e5f6",
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Request for second page:
{
"expression": "1000000001 OR 1000000002",
"audience_limit": 500000,
"offset": 500000,
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Example 9: Export large audience to CSV
{
"expression": "1000000001 OR 1000000002",
"identifier_type": "phone",
"audience_limit": 10000000,
"format": "csv"
}