You’re viewing the V1 docs. V2 is now recommended — read the V2 docs.
Watt Data

find_persons

Description: Find persons using cluster criteria and/or location filters. Supports boolean expressions (AND/OR/NOT), geospatial radius search, optional identifier enrichment, and export. Returns total count and sample records.

Tool Identifier: find_persons

Recommended Workflow:

  1. Call list_clusters with desired domains to browse available clusters
  2. Review results to find relevant clusters
  3. Extract cluster_id values from results
  4. Use those IDs to build your boolean expression for find_persons
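
Step 4 can be sketched in TypeScript as follows. The helpers anyOf, allOf, and excluding are hypothetical names, not part of the API; they only assemble the expression string that find_persons accepts, and they assume the cluster IDs were already extracted from a list_clusters result.

```typescript
// Hypothetical helpers (not part of the API) that assemble an expression
// string from cluster IDs or cluster hashes.

/** Join IDs with OR, wrapping in parentheses when there is more than one. */
function anyOf(ids: string[]): string {
  if (ids.length === 0) throw new Error("anyOf needs at least one cluster ID");
  const joined = ids.join(" OR ");
  return ids.length > 1 ? `(${joined})` : joined;
}

/** Join terms with AND. */
function allOf(terms: string[]): string {
  if (terms.length === 0) throw new Error("allOf needs at least one term");
  return terms.join(" AND ");
}

/** Append "AND NOT id" for each excluded ID, so NOT never stands alone. */
function excluding(base: string, excludedIds: string[]): string {
  return excludedIds.reduce((expr, id) => `${expr} AND NOT ${id}`, base);
}

const expression = excluding(anyOf(["1000000001", "1000000002"]), ["1000000003"]);
// expression === "(1000000001 OR 1000000002) AND NOT 1000000003"
```

The resulting string is passed as the expression parameter.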

Input Parameters

| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| expression | string | No* | - | Boolean expression | Cluster boolean expression (AND/OR/NOT operators) |
| location | object | No* | - | Lat/lng/radius/unit | Geospatial filter |
| identifier_type | string | No | - | "phone", "email", or "address" | Include contact info in results |
| audience_limit | number | No | 1000000 | Min: 1, Max: 15000000 | Maximum persons to return (default 1M, max 15M) |
| offset | number | No | 0 | Min: 0 | Number of persons to skip for pagination. Requires workflow_id when > 0 |
| format | string | No | "none" | "none", "csv", "json", "jsonl" | Export format - generates presigned S3 URL valid for 1 hour |
| workflow_id | string | No | - | Valid UUID | Workflow session identifier for deterministic sampling |

*Required: At least one of expression or location must be provided

Parameter Details:

expression:

  • Boolean expression combining cluster IDs or cluster hashes with AND, OR, NOT operators
  • Supports both numeric cluster IDs (e.g., "1000000001") and 32-character hexadecimal cluster hashes (e.g., "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6")
  • Use parentheses for grouping
  • NOT must be combined with a positive term via AND (for example, "X AND NOT Y"); standalone and pure NOT expressions are rejected
  • Examples: "1000000001", "1000000001 AND 1000000002", "(1000000001 OR 1000000002) AND NOT 1000000003", "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6 AND b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7"
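
A client-side pre-check can catch some of these mistakes before a request is sent. The sketch below is an assumption-laden approximation, not the server's parser: precheckExpression is a hypothetical name, numeric IDs are assumed to be digits only, hashes are matched case-insensitively, and the NOT rule is applied conservatively (any NOT that does not directly follow AND is flagged).

```typescript
type Precheck = { ok: true } | { ok: false; reason: string };

const OPERATOR = /^(AND|OR|NOT)$/;
const NUMERIC_ID = /^\d+$/;               // assumption: numeric IDs are digits only
const CLUSTER_HASH = /^[0-9a-f]{32}$/i;   // 32-character hexadecimal hash

/** Rough syntax check; the server remains the source of truth. */
function precheckExpression(expression: string): Precheck {
  // Split into parentheses and whitespace-separated words.
  const tokens: string[] = expression.match(/[()]|[^\s()]+/g) ?? [];
  if (tokens.length === 0) return { ok: false, reason: "empty expression" };

  let depth = 0;
  let previous = "";
  for (const token of tokens) {
    if (token === "(") {
      depth += 1;
    } else if (token === ")") {
      depth -= 1;
      if (depth < 0) return { ok: false, reason: "unexpected closing parenthesis" };
    } else if (token === "NOT" && previous !== "AND") {
      return { ok: false, reason: "NOT must follow AND, as in X AND NOT Y" };
    } else if (!OPERATOR.test(token) && !NUMERIC_ID.test(token) && !CLUSTER_HASH.test(token)) {
      return { ok: false, reason: `unexpected token: ${token}` };
    }
    previous = token;
  }
  return depth === 0 ? { ok: true } : { ok: false, reason: "missing closing parenthesis" };
}
```

A passing pre-check does not guarantee that the cluster IDs exist; an unknown ID is still reported by the tool itself.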

location:

  • Geospatial radius filter with dynamic H3 resolution selection
  • Automatically selects optimal H3 resolution (3, 5, 7, or 9) based on radius
  • Example:
    {
      "latitude": 37.7749,
      "longitude": -122.4194,
      "radius": 5,
      "unit": "km"
    }

identifier_type:

  • When specified, enriches results with contact information
  • Returns up to 3 values per person for the requested type
  • Example: "email" returns email1, email2, email3 nested under identifiers.email

audience_limit:

  • Controls maximum number of persons returned
  • Default: 1,000,000 (1M)
  • Maximum: 15,000,000 (15M)
  • Minimum: 1
  • Deterministic sampling based on workflow_id for consistent results across calls

format:

  • When set to csv, json, or jsonl, generates S3 presigned download URL
  • URL expires in 1 hour
  • Returns export metadata in response
  • Recommended for audiences >100k

offset:

  • Number of persons to skip before returning results
  • Enables pagination through large audiences
  • Important: workflow_id is required when offset > 0 to ensure deterministic ordering
  • Example: With audience_limit=500000 and offset=500000, returns persons 500,001 through 1,000,000
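
The offset arithmetic above can be worked through in a few lines. pageOffsets and pageRange are hypothetical helper names used only for illustration; positions are 1-based, matching the "persons 500,001 through 1,000,000" wording.

```typescript
/** Offsets needed to walk an audience of `total` persons in pages of `limit`. */
function pageOffsets(total: number, limit: number): number[] {
  if (limit < 1) throw new Error("limit must be at least 1");
  const offsets: number[] = [];
  for (let offset = 0; offset < total; offset += limit) {
    offsets.push(offset);
  }
  return offsets;
}

/** 1-based positions of the first and last person on one page. */
function pageRange(offset: number, limit: number, total: number): { first: number; last: number } {
  return { first: offset + 1, last: Math.min(offset + limit, total) };
}

const offsets = pageOffsets(2500000, 500000);
// [0, 500000, 1000000, 1500000, 2000000]

const secondPage = pageRange(500000, 500000, 2500000);
// { first: 500001, last: 1000000 }
```

Every page after the first has offset > 0 and therefore needs the same workflow_id.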

workflow_id:

  • Optional UUID for deterministic sampling and session tracking
  • Same workflow_id with same query parameters returns identical sample
  • If not provided, a new workflow_id is generated and returned in the response

Request Schema:

{
  expression?: string;
  location?: {
    latitude: number;  // -90 to 90
    longitude: number; // -180 to 180
    radius: number;    // positive
    unit: "km" | "miles";
  };
  identifier_type?: "phone" | "email" | "address";
  audience_limit?: number;  // 1 to 15M, default 1M
  offset?: number;          // 0 to total, default 0
  format?: "none" | "csv" | "json" | "jsonl";
  workflow_id?: string;
}
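
The constraints in this schema can be checked on the client before calling the tool. The following is a minimal sketch, assuming a hypothetical validateRequest helper; it covers only the rules stated on this page and does not validate expression syntax or UUID format.

```typescript
interface FindPersonsRequest {
  expression?: string;
  location?: { latitude: number; longitude: number; radius: number; unit: "km" | "miles" };
  identifier_type?: "phone" | "email" | "address";
  audience_limit?: number;
  offset?: number;
  format?: "none" | "csv" | "json" | "jsonl";
  workflow_id?: string;
}

/** Return a list of constraint violations; an empty list means the request looks valid. */
function validateRequest(request: FindPersonsRequest): string[] {
  const problems: string[] = [];

  if (request.expression === undefined && request.location === undefined) {
    problems.push("at least one of expression or location is required");
  }

  const location = request.location;
  if (location !== undefined) {
    if (location.latitude < -90 || location.latitude > 90) problems.push("latitude must be between -90 and 90");
    if (location.longitude < -180 || location.longitude > 180) problems.push("longitude must be between -180 and 180");
    if (!(location.radius > 0)) problems.push("radius must be positive");
  }

  const limit = request.audience_limit;
  if (limit !== undefined && (limit < 1 || limit > 15000000)) {
    problems.push("audience_limit must be between 1 and 15000000");
  }

  const offset = request.offset ?? 0;
  if (offset < 0) problems.push("offset must be 0 or greater");
  if (offset > 0 && request.workflow_id === undefined) {
    problems.push("workflow_id is required when offset > 0");
  }

  return problems;
}
```

For example, validateRequest({ expression: "1000000001", offset: 500000 }) reports the missing workflow_id, while validateRequest({ expression: "1000000001" }) returns an empty list.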

Output Format

Success Response:

{
  total: number,
  returned_count: number,
  sample: Array<{
    person_id: number;
    name?: string;
    identifiers?: {
      email?: { email1?: string; email2?: string; email3?: string; };
      phone?: { phone1?: string; phone2?: string; phone3?: string; };
      address?: { address1?: string; address2?: string; address3?: string; };
    };
  }>,
  export?: {
    url: string;
    format: "csv" | "json" | "jsonl";
    rows: number;
    size_bytes: number;
    expires_at: string;
  },
  has_more: boolean,
  next_offset?: number,
  tool_trace_id: string,
  workflow_id: string
}

Response Fields:

| Field | Type | Description |
|---|---|---|
| total | number | Total matching persons (full cardinality of query) |
| returned_count | number | Actual count returned (min of total and audience_limit) |
| sample | array | Up to 10 sample records for preview |
| sample[].person_id | number | Person ID |
| sample[].name | string | Person name (if available) |
| sample[].identifiers | object | Contact info (only if identifier_type specified) |
| export | object | Export metadata (only when format is csv/json/jsonl) |
| export.url | string | Presigned S3 download URL (expires in 1 hour) |
| export.format | string | Export format |
| export.rows | number | Number of rows in export |
| export.size_bytes | number | File size in bytes |
| export.expires_at | string | ISO 8601 expiration timestamp |
| has_more | boolean | Whether more results exist beyond current page |
| next_offset | number | Offset value for next page (only present when has_more=true) |
| tool_trace_id | string | OpenTelemetry trace ID for this tool execution |
| workflow_id | string | Workflow session identifier |

Example Response (Basic):

{
  "total": 45678,
  "returned_count": 45678,
  "sample": [
    { "person_id": 123456 },
    { "person_id": 234567 },
    { "person_id": 345678 }
  ],
  "has_more": false,
  "tool_trace_id": "a1b2c3d4e5f6",
  "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}

Example Response (With Identifiers):

{
  "total": 1500,
  "returned_count": 1500,
  "sample": [
    {
      "person_id": 123456,
      "identifiers": {
        "email": {
          "email1": "john@example.com",
          "email2": "john.doe@work.com"
        }
      }
    }
  ],
  "has_more": false,
  "tool_trace_id": "a1b2c3d4e5f6",
  "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
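
Because the identifier slots are optional keys (email1 through email3, and likewise for phone and address), reading them usually means collecting whichever slots are populated. A minimal sketch, assuming a hypothetical identifierValues helper and the nested structure shown in the response schema:

```typescript
type IdentifierType = "phone" | "email" | "address";

interface SampleRecord {
  person_id: number;
  name?: string;
  identifiers?: Partial<Record<IdentifierType, Record<string, string | undefined>>>;
}

/** Return the populated values (at most 3) for one identifier type, in slot order. */
function identifierValues(record: SampleRecord, type: IdentifierType): string[] {
  const group: Record<string, string | undefined> = record.identifiers?.[type] ?? {};
  return [1, 2, 3]
    .map((slot) => group[`${type}${slot}`]) // email1, email2, email3, ...
    .filter((value): value is string => typeof value === "string" && value.length > 0);
}

const record: SampleRecord = {
  person_id: 123456,
  identifiers: {
    email: { email1: "john@example.com", email2: "john.doe@work.com" },
  },
};

const emails = identifierValues(record, "email");
// ["john@example.com", "john.doe@work.com"]
```

Requesting a type that was not enriched, such as identifierValues(record, "phone") here, yields an empty array.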

Error Handling

Error Response Format:

{
  "content": [
    {
      "type": "text",
      "text": "Person search failed: <error message>"
    }
  ],
  "isError": true
}
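
A caller can detect this shape and strip the "Person search failed: " prefix to get at the underlying message. The sketch below uses hypothetical helper names (errorMessage, isRetryable); treating only the "Request failed - please try again" message as retryable is an assumption, not documented behavior.

```typescript
interface ToolResult {
  content: Array<{ type: string; text?: string }>;
  isError?: boolean;
}

const ERROR_PREFIX = "Person search failed: ";

/** Return the error message without its prefix, or null when the result is not an error. */
function errorMessage(result: ToolResult): string | null {
  if (!result.isError) return null;
  const text = result.content
    .filter((item) => item.type === "text")
    .map((item) => item.text ?? "")
    .join("\n");
  return text.indexOf(ERROR_PREFIX) === 0 ? text.slice(ERROR_PREFIX.length) : text;
}

/** Assumption: only the "service temporarily unavailable" message is worth retrying unchanged. */
function isRetryable(message: string): boolean {
  return message.indexOf("Request failed - please try again") !== -1;
}

const failed: ToolResult = {
  content: [{ type: "text", text: "Person search failed: Request failed - please try again" }],
  isError: true,
};

const message = errorMessage(failed);
// "Request failed - please try again"
```

A timeout error suggests simplifying the expression, so retrying the identical request is less likely to help.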

Common Errors:

  • Invalid cluster ID provided in expression
  • Standalone NOT expression: "Standalone NOT expressions are not supported - use "X AND NOT Y" instead"
  • Pure NOT expression: "Pure NOT expressions are not supported - must have at least one positive criterion"
  • Unexpected token in boolean expression
  • Missing closing parenthesis in expression
  • Service temporarily unavailable: "Request failed - please try again"
  • Request timeout: "Request took too long - try simplifying your expression"

Performance Notes

  • Queries execute in milliseconds for most cluster combinations
  • Supports complex boolean expressions with nested parentheses
  • Efficient handling of NOT operations
  • Returns the total audience size plus a sample of up to 10 records

Usage Examples

Example 1: Single cluster

{
  "expression": "1000000001"
}

Example 2: Boolean expression

{
  "expression": "(1000000001 OR 1000000002) AND NOT 1000000003"
}

Example 3: Boolean expression with cluster hashes

{
  "expression": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6 AND b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7"
}

Example 4: Location-only search

{
  "location": {
    "latitude": 37.7749,
    "longitude": -122.4194,
    "radius": 5,
    "unit": "km"
  }
}

Example 5: Combined cluster + location

{
  "expression": "1000000001 AND 1000000002",
  "location": {
    "latitude": 40.7128,
    "longitude": -74.0060,
    "radius": 10,
    "unit": "miles"
  }
}

Example 6: With identifier enrichment

{
  "expression": "1000000001",
  "identifier_type": "email"
}

Example 7: Large audience with limit (total exceeds limit)

{
  "expression": "1000000001 OR 1000000002",
  "audience_limit": 500000
}

Response shows total (full cardinality) and returned_count (limited):

{
  "total": 2500000,
  "returned_count": 500000,
  "sample": [...],
  "has_more": true,
  "next_offset": 500000,
  "tool_trace_id": "...",
  "workflow_id": "..."
}

Example 8: Paginate through large audience

{
  "expression": "1000000001 OR 1000000002",
  "audience_limit": 500000,
  "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}

Response for first page:

{
  "total": 2500000,
  "returned_count": 500000,
  "sample": [
    { "person_id": 123456 },
    { "person_id": 234567 }
  ],
  "has_more": true,
  "next_offset": 500000,
  "tool_trace_id": "a1b2c3d4e5f6",
  "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}

Request for second page:

{
  "expression": "1000000001 OR 1000000002",
  "audience_limit": 500000,
  "offset": 500000,
  "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
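
The step from one page to the next can be derived mechanically from the previous request and its response. A minimal sketch, assuming a hypothetical nextPageRequest helper; it reuses the workflow_id from the response so that it also works when the first request omitted one.

```typescript
interface FindPersonsRequest {
  expression?: string;
  audience_limit?: number;
  offset?: number;
  workflow_id?: string;
}

interface PageInfo {
  has_more: boolean;
  next_offset?: number;
  workflow_id: string;
}

/** Build the request for the next page, or return null when there are no more pages. */
function nextPageRequest(previous: FindPersonsRequest, response: PageInfo): FindPersonsRequest | null {
  if (!response.has_more || response.next_offset === undefined) return null;
  return {
    ...previous,
    offset: response.next_offset,
    workflow_id: response.workflow_id, // required because offset > 0
  };
}

const firstRequest: FindPersonsRequest = {
  expression: "1000000001 OR 1000000002",
  audience_limit: 500000,
  workflow_id: "550e8400-e29b-41d4-a716-446655440000",
};
const firstResponse: PageInfo = {
  has_more: true,
  next_offset: 500000,
  workflow_id: "550e8400-e29b-41d4-a716-446655440000",
};

const secondRequest = nextPageRequest(firstRequest, firstResponse);
// Same expression and audience_limit, with offset 500000 and the same workflow_id.
```

Calling the helper repeatedly until it returns null walks the whole audience.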

Example 9: Export large audience to CSV

{
  "expression": "1000000001 OR 1000000002",
  "identifier_type": "phone",
  "audience_limit": 10000000,
  "format": "csv"
}
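
Since the presigned URL in the export metadata is valid for 1 hour, it can be worth checking expires_at before starting a long download. The following sketch uses hypothetical helper names and a hypothetical 60-second safety margin; the URL and timestamps are placeholder values, not real output.

```typescript
interface ExportMetadata {
  url: string;
  format: "csv" | "json" | "jsonl";
  rows: number;
  size_bytes: number;
  expires_at: string; // ISO 8601 timestamp
}

/** Milliseconds until the presigned URL expires (negative once it has expired). */
function msUntilExpiry(exportInfo: ExportMetadata, now: Date): number {
  return Date.parse(exportInfo.expires_at) - now.getTime();
}

/** True when there is more than `safetyMarginMs` left before expiry. */
function canStartDownload(exportInfo: ExportMetadata, now: Date, safetyMarginMs: number = 60000): boolean {
  return msUntilExpiry(exportInfo, now) > safetyMarginMs;
}

// Placeholder metadata for illustration only.
const exportInfo: ExportMetadata = {
  url: "https://example.com/export.csv",
  format: "csv",
  rows: 10000000,
  size_bytes: 0,
  expires_at: "2025-01-01T13:00:00Z",
};

const remainingMs = msUntilExpiry(exportInfo, new Date("2025-01-01T12:30:00Z"));
// 1800000 (30 minutes)
```

If the URL has expired or is about to, repeating the request with the same workflow_id should select the same audience, although whether a fresh URL is issued each time is not stated here.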
