You’re viewing the V1 docs. V2 is now recommended — read the V2 docs.
Watt Data

resolve_identities

Description: Resolve person identities by matching emails, phones, addresses, or mobile advertising IDs (MAIDs). Supports querying multiple identifier types in a single call with Noisy-OR quality score aggregation. Returns person IDs grouped by individual with quality scores. Email addresses are automatically normalized (Gmail dots and plus-addressing removed). Phone numbers are normalized to E.164 format.
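To illustrate which email spellings the normalization described above is likely to treat as equivalent, here is a minimal sketch. It is not the service's implementation: normalizeEmail is a hypothetical helper, and it assumes that lowercasing is applied and that dot removal and plus-suffix stripping apply only to gmail.com and googlemail.com addresses. The actual rules may differ.

```typescript
// Hypothetical helper approximating the documented email normalization.
// Assumptions (not confirmed by this page): input is lowercased, and the
// Gmail-specific rules apply only to gmail.com / googlemail.com domains.
function normalizeEmail(raw: string): string {
  const email = raw.trim().toLowerCase();
  const at = email.lastIndexOf("@");
  if (at <= 0) return email; // no usable local part; return unchanged

  let local = email.slice(0, at);
  const domain = email.slice(at + 1);

  if (domain === "gmail.com" || domain === "googlemail.com") {
    // Drop everything from the first "+" onward (plus-addressing).
    const plus = local.indexOf("+");
    if (plus >= 0) local = local.slice(0, plus);
    // Gmail ignores dots in the local part.
    local = local.replace(/\./g, "");
  }
  return local + "@" + domain;
}
```

Under these assumptions, "John.Doe+news@Gmail.com" and "johndoe@gmail.com" would resolve as the same address, while dots in non-Gmail addresses are preserved.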

Tool Identifier: resolve_identities

Input Parameters

| Parameter | Type | Required | Default | Constraints | Description |
| --- | --- | --- | --- | --- | --- |
| multi_identifiers | array | Yes | - | Max 50 groups per request; each group's values array is capped at 3,000 entries (use csv_resource_uri for larger inputs) | Query multiple identifier types simultaneously |
| format | string | No | "none" | "none", "csv", "json", "jsonl" | Export format - generates presigned S3 URL valid for 1 hour |
| identifier_types | array | No | ["email"] | "name", "email", "phone", "address", "maid" | Contact types to return in the identifiers field from resolved person profiles |
| workflow_id | string | No | - | Valid UUID | Workflow session identifier for correlation |

Parameter Details:

multi_identifiers:

  • Array of objects, each specifying id_type, hash_type, and values[]
  • Allows querying across different identifier types in one call
  • Email/phone/maid can be mixed in a single call
  • Address must be in a separate call (uses geospatial H3 matching)
  • Returns Noisy-OR aggregated overall_quality_score per person
  • Capped at 50 identifier groups per request — split larger inputs into multiple calls
  • Each identifier group's values array is capped at 3,000 entries — for larger inputs use csv_resource_uri (governed by a separate 200,000-row cap)
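The two caps above (3,000 values per group, 50 groups per request) suggest a simple client-side batching step for inline inputs. The sketch below is one way to do it, not an official client: buildRequests and the type names are hypothetical, and it assumes a request may carry several groups with the same id_type. The error list on this page also mentions a "Maximum 3000 identifiers allowed" message, so verify the per-request total against the live service. Address inputs are excluded because they have their own, smaller batch limit.

```typescript
type InlineIdType = "email" | "phone" | "maid";
type HashType = "plaintext" | "md5" | "sha1" | "sha256";

interface IdentifierGroup {
  id_type: InlineIdType;
  hash_type: HashType;
  values: string[];
}

interface ResolveRequest {
  multi_identifiers: IdentifierGroup[];
}

const MAX_VALUES_PER_GROUP = 3000; // documented per-group cap
const MAX_GROUPS_PER_REQUEST = 50; // documented per-request cap

// Hypothetical helper: split a flat list of values into request bodies
// that respect both caps.
function buildRequests(
  idType: InlineIdType,
  hashType: HashType,
  values: string[]
): ResolveRequest[] {
  // Step 1: cut the values into groups of at most 3,000 entries.
  const groups: IdentifierGroup[] = [];
  for (let i = 0; i < values.length; i += MAX_VALUES_PER_GROUP) {
    groups.push({
      id_type: idType,
      hash_type: hashType,
      values: values.slice(i, i + MAX_VALUES_PER_GROUP),
    });
  }
  // Step 2: pack the groups into requests of at most 50 groups.
  const requests: ResolveRequest[] = [];
  for (let i = 0; i < groups.length; i += MAX_GROUPS_PER_REQUEST) {
    requests.push({ multi_identifiers: groups.slice(i, i + MAX_GROUPS_PER_REQUEST) });
  }
  return requests;
}
```

For inputs that would need many requests, csv_resource_uri is probably the better path, as the constraint above indicates.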

Supported id_types:

  • "email" - Email addresses with automatic normalization
  • "phone" - Phone numbers (E.164 format recommended)
  • "address" - Physical addresses (geospatial matching via H3 resolution 11, ~28m precision)
  • "maid" - Mobile advertising IDs (IDFA for iOS, GAID for Android)

Supported hash_types:

  • "plaintext" - Unhashed values
  • "md5" - MD5 hash
  • "sha1" - SHA-1 hash
  • "sha256" - SHA-256 hash

Example multi_identifiers:

{
  "multi_identifiers": [
    {
      "id_type": "email",
      "hash_type": "plaintext",
      "values": ["alice@example.com", "bob@example.com"]
    },
    {
      "id_type": "phone",
      "hash_type": "plaintext",
      "values": ["+15551234567"]
    },
    {
      "id_type": "maid",
      "hash_type": "sha256",
      "values": ["abc123..."]
    }
  ]
}

format:

  • When set to csv, json, or jsonl, generates S3 presigned download URL
  • URL expires in 1 hour
  • Returns export metadata in response
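Because the presigned URL expires after one hour, a client that downloads later may want to check the export metadata first. The following is a minimal sketch: isExportUsable and its safetyMarginMs parameter are hypothetical, and the ExportInfo shape is copied from the export object described under Output Format.

```typescript
// Shape of the `export` object as documented under Output Format.
interface ExportInfo {
  url: string;
  format: "csv" | "json" | "jsonl";
  rows: number;
  size_bytes: number;
  expires_at: string; // ISO 8601 timestamp
}

// Hypothetical helper: true if the URL should still be valid at `now`,
// leaving `safetyMarginMs` of headroom for the download to start.
function isExportUsable(
  info: ExportInfo,
  now: Date = new Date(),
  safetyMarginMs: number = 60000
): boolean {
  const expiresAt = Date.parse(info.expires_at);
  if (Number.isNaN(expiresAt)) return false; // unparseable timestamp
  return now.getTime() + safetyMarginMs < expiresAt;
}
```

If the check fails, the straightforward recovery is to repeat the request with the same format to obtain a fresh URL.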

identifier_types:

  • Array of contact types to return in the identifiers field
  • Valid types: "name", "email", "phone", "address", "maid"
  • Default: ["email"]
  • Returns actual stored contact data from the resolved person profiles
  • Eliminates need for follow-up get_person call to retrieve contact info
  • The identifiers field may be an empty object {} if the person has no stored contacts matching the requested types

workflow_id:

  • Optional UUID for tracking related tool calls in a session
  • If not provided, a new workflow_id is generated
  • Used for deterministic sampling and feedback correlation

Request Schema:

interface ResolveIdentitiesParams {
  multi_identifiers: Array<{
    id_type: "email" | "phone" | "address" | "maid";
    hash_type: "plaintext" | "md5" | "sha1" | "sha256";
    values: string[];
  }>;
  format?: "none" | "csv" | "json" | "jsonl";
  identifier_types?: Array<"name" | "email" | "phone" | "address" | "maid">;
  workflow_id?: string;
}
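The documented constraints on multi_identifiers can be checked client-side before a call is made. The sketch below is illustrative and not part of the tool: validateGroups and MultiIdentifierGroup are hypothetical names. The first three messages reuse the strings listed under Common Errors; the last two are our own wording for the address rules stated on this page.

```typescript
interface MultiIdentifierGroup {
  id_type: "email" | "phone" | "address" | "maid";
  hash_type: "plaintext" | "md5" | "sha1" | "sha256";
  values: string[];
}

// Hypothetical pre-flight check. Returns a list of problems; empty means
// the groups satisfy the limits documented on this page.
function validateGroups(groups: MultiIdentifierGroup[]): string[] {
  const problems: string[] = [];

  if (groups.length === 0) problems.push("At least one identifier required");
  if (groups.length > 50) problems.push("Maximum 50 identifier groups allowed");
  if (groups.some((g) => g.values.length > 3000)) {
    problems.push("Maximum 3000 values per identifier group");
  }

  // Address lookups must not be mixed with email/phone/maid groups.
  const addressGroups = groups.filter((g) => g.id_type === "address");
  if (addressGroups.length > 0 && addressGroups.length < groups.length) {
    problems.push("Address groups must be sent in a separate call");
  }

  // Address batches are limited to 1,000 addresses per request.
  const addressCount = addressGroups.reduce((n, g) => n + g.values.length, 0);
  if (addressCount > 1000) problems.push("Maximum 1,000 addresses per request");

  return problems;
}
```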

Output Format

Success Response:

{
  identities: Array<{
    person_id: number;
    overall_quality_score: number;
    matches: Array<{
      criterion_type: string;
      criterion_value: string;
      quality_score: number;
    }>;
    identifiers: {
      [type: string]: string[];  // e.g., { email: ["a@example.com"], phone: ["+15551234567"] }
    };
    address?: {
      normalized_address: string;
      latitude: number;
      longitude: number;
      distance_meters: number;
    };
  }>;
  stats: {
    requested: number;
    resolved: number;
    rate: number;
    resolved_by_type?: { [type: string]: number };  // e.g., { email: 171, address: 226 }
  };
  export?: {
    url: string;
    format: "csv" | "json" | "jsonl";
    rows: number;
    size_bytes: number;
    expires_at: string;
  };
  tool_trace_id: string;
  workflow_id: string;
}

Response Fields:

| Field | Type | Description |
| --- | --- | --- |
| identities | array | Array of resolved identities grouped by person_id |
| identities[].person_id | number | Person ID |
| identities[].overall_quality_score | number | Noisy-OR aggregated confidence (0-1) across all matches |
| identities[].matches | array | Individual criterion matches with per-criterion scores |
| identities[].matches[].criterion_type | string | Type (e.g., "email_plaintext", "phone_md5", "maid_sha256") |
| identities[].matches[].criterion_value | string | The matched value |
| identities[].matches[].quality_score | number | Quality score for this specific match (0-1) |
| identities[].identifiers | object | Stored contact data from person profile, keyed by type (e.g., email, phone). Returns types specified in identifier_types parameter. |
| identities[].address | object | Geocoding data with distance (only for address queries) |
| identities[].address.normalized_address | string | Normalized address string |
| identities[].address.latitude | number | Latitude coordinate |
| identities[].address.longitude | number | Longitude coordinate |
| identities[].address.distance_meters | number | Distance from query address in meters |
| stats.requested | number | Total identifier values provided across all groups |
| stats.resolved | number | Distinct identities matched |
| stats.rate | number | Distinct identities resolved per identifier requested (resolved / requested), bounded to [0, 1] |
| stats.resolved_by_type | object | Distinct identities matched per identifier type (e.g. {"email": 171, "address": 226}). Each identity contributes at most 1 per type bucket regardless of how many criteria of that type matched it. |
| export | object | Export metadata (only when format is csv/json/jsonl) |
| export.url | string | Presigned S3 download URL (expires in 1 hour) |
| export.format | string | Export format |
| export.rows | number | Number of rows in export |
| export.size_bytes | number | File size in bytes |
| export.expires_at | string | ISO 8601 expiration timestamp |
| tool_trace_id | string | OpenTelemetry trace ID for this tool execution |
| workflow_id | string | Workflow session identifier |
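The overall_quality_score is described as a Noisy-OR aggregate of the per-match quality_score values. The textbook Noisy-OR formula is 1 minus the product of (1 - q) over all matches, which treats each match as independent evidence. The sketch below implements that textbook form; noisyOr is a hypothetical name, and whether the service applies additional weighting is not stated on this page.

```typescript
// Textbook Noisy-OR: probability that at least one match is correct,
// assuming the per-match scores are independent probabilities in [0, 1].
// Sketch only; the service's exact aggregation may differ.
function noisyOr(scores: number[]): number {
  let probAllWrong = 1;
  for (const q of scores) {
    const clamped = Math.min(1, Math.max(0, q)); // guard against out-of-range input
    probAllWrong *= 1 - clamped;
  }
  return 1 - probAllWrong;
}
```

With a single match the aggregate equals that match's score, which is consistent with the single-criterion example responses on this page. Additional matches can only raise the aggregate: two matches scoring 0.9 and 0.8 combine to roughly 0.98.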

Example Response (Email Resolution):

{
  "identities": [
    {
      "person_id": 123456,
      "overall_quality_score": 0.95,
      "matches": [
        {
          "criterion_type": "email_plaintext",
          "criterion_value": "john.doe@example.com",
          "quality_score": 0.95
        }
      ],
      "identifiers": {
        "email": ["john.doe@example.com", "jdoe@work.com"]
      }
    }
  ],
  "stats": {
    "requested": 2,
    "resolved": 1,
    "rate": 0.5
  },
  "tool_trace_id": "a1b2c3d4e5f6",
  "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}

Example Response (Multi-Criterion with Export):

{
  "identities": [],
  "stats": {
    "requested": 3,
    "resolved": 2,
    "rate": 0.67
  },
  "export": {
    "url": "https://s3.amazonaws.com/bucket/file.csv?...",
    "format": "csv",
    "rows": 2,
    "size_bytes": 1024,
    "expires_at": "2025-01-16T12:00:00Z"
  },
  "tool_trace_id": "a1b2c3d4e5f6",
  "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}

Example Response (Address Resolution with Distance):

{
  "identities": [
    {
      "person_id": 789012,
      "overall_quality_score": 0.88,
      "matches": [
        {
          "criterion_type": "address_h3_11",
          "criterion_value": "123 Main St, San Francisco, CA 94105",
          "quality_score": 0.88
        }
      ],
      "identifiers": {
        "email": ["resident@example.com"]
      },
      "address": {
        "normalized_address": "123 Main St, San Francisco, CA 94105, USA",
        "latitude": 37.7749,
        "longitude": -122.4194,
        "distance_meters": 15.3
      }
    }
  ],
  "stats": {
    "requested": 1,
    "resolved": 1,
    "rate": 1.0
  },
  "tool_trace_id": "a1b2c3d4e5f6",
  "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}

Error Handling

Error Response Format:

{
  "content": [
    {
      "type": "text",
      "text": "Identity resolution failed: <error message>"
    }
  ],
  "isError": true
}

Common Errors:

  • Empty identifiers array: "At least one identifier required"
  • Invalid identifier format: "Invalid email/phone/address format"
  • More than 3,000 identifiers in a single request: "Maximum 3000 identifiers allowed"
  • More than 50 identifier groups in multi_identifiers: "Maximum 50 identifier groups allowed"
  • More than 3,000 values in any identifier group: "Maximum 3000 values per identifier group"
  • Service temporarily unavailable: "Request failed - please try again"
  • Request timeout: "Request took too long - try reducing batch size"
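A client can strip the documented "Identity resolution failed: " prefix and decide how to react to the message. The sketch below is an assumption-laden illustration: ToolResult mirrors the error format shown above, while extractError, suggestAction, and the mapping from message text to action are hypothetical and based only on the strings listed here.

```typescript
// Mirrors the error response format shown above.
interface ToolResult {
  content: Array<{ type: string; text: string }>;
  isError?: boolean;
}

const ERROR_PREFIX = "Identity resolution failed: ";

// Returns the bare error message, or null if the result is not an error.
function extractError(result: ToolResult): string | null {
  if (!result.isError) return null;
  const first = result.content.find((c) => c.type === "text");
  const text = first ? first.text : "";
  return text.startsWith(ERROR_PREFIX) ? text.slice(ERROR_PREFIX.length) : text;
}

type Action = "retry" | "reduce_batch" | "fix_input";

// Hypothetical mapping from the documented messages to a client reaction.
function suggestAction(message: string): Action {
  if (message.includes("please try again")) return "retry";
  if (message.includes("took too long") || message.startsWith("Maximum")) {
    return "reduce_batch";
  }
  return "fix_input";
}
```

Matching on message text is brittle; if the service exposes structured error codes, those would be a better basis for this decision.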

When csv_resource_uri points to a file with more than 200,000 rows, paginate using the next_offset cursor returned in the response: pass it back as offset on the next call until the field is omitted (last page). offset, limit, and next_offset only apply on the CSV path; they are ignored when inline identifiers / multi_identifiers is used.
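The cursor loop described above reduces to one rule: if the response carries next_offset, send it back as offset; if the field is absent, stop. A minimal sketch of that rule follows. The field names come from this page, but their types are assumed to be numeric, and nextPageParams, CsvPageParams, and PageResponse are hypothetical names, as is the placeholder URI in the usage note.

```typescript
// Parameters on the CSV path. Numeric offset/limit are an assumption.
interface CsvPageParams {
  csv_resource_uri: string;
  offset?: number;
  limit?: number;
}

// Only the pagination-relevant part of the response.
interface PageResponse {
  next_offset?: number;
}

// Returns the parameters for the next call, or null on the last page
// (that is, when next_offset is omitted from the response).
function nextPageParams(
  prev: CsvPageParams,
  response: PageResponse
): CsvPageParams | null {
  if (response.next_offset === undefined) return null;
  return { ...prev, offset: response.next_offset };
}
```

A caller would start with parameters such as { csv_resource_uri: "<your resource URI>" } and keep calling the tool with the result of nextPageParams until it returns null.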

Address Matching Behavior

  • Returns only the best-scoring person(s) per input address (not all people in the H3 cell)
  • Minimum quality threshold of 0.75 (~31m distance) excludes weak matches
  • Household members tied at max score are all returned
  • Maximum batch size: 1,000 addresses per request
  • Quality scores use distance-based decay: full score within 5m, linear decay to 10% at 100m, floor at 10% beyond
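The distance-based decay in the last bullet can be written as a piecewise-linear function. The sketch below returns the decay factor only; distanceDecay is a hypothetical name, and how the service combines this factor with other signals to produce the final quality_score is not documented here.

```typescript
// Decay factor as described above: 1.0 within 5 m, linear decay to 0.1
// at 100 m, and a floor of 0.1 beyond that. Sketch only.
function distanceDecay(distanceMeters: number): number {
  const FULL_SCORE_RADIUS = 5; // metres with no penalty
  const FLOOR_DISTANCE = 100;  // metres at which the floor is reached
  const FLOOR = 0.1;

  if (distanceMeters <= FULL_SCORE_RADIUS) return 1;
  if (distanceMeters >= FLOOR_DISTANCE) return FLOOR;

  // Fraction of the way from 5 m to 100 m.
  const t = (distanceMeters - FULL_SCORE_RADIUS) / (FLOOR_DISTANCE - FULL_SCORE_RADIUS);
  return 1 - t * (1 - FLOOR);
}
```

Solving 1 - 0.9 * (d - 5) / 95 = 0.75 gives d of about 31.4 m, which agrees with the ~31m distance quoted for the 0.75 threshold.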

Performance Notes

  • Supports batch processing of multiple identifiers in a single request
  • Returns quality scores for each resolved identity
  • Results ordered by quality score (highest quality first)
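  • Inline requests are limited to 50 identifier groups of up to 3,000 values each (1,000 addresses for address queries); batches within these limits may still time out if they are very large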

Usage Examples

Example 1: Simple email resolution

{
  "multi_identifiers": [
    {
      "id_type": "email",
      "hash_type": "plaintext",
      "values": ["alice@example.com", "bob@example.com"]
    }
  ]
}

Example 2: Phone number resolution

{
  "multi_identifiers": [
    {
      "id_type": "phone",
      "hash_type": "plaintext",
      "values": ["+15551234567", "+442071234567"]
    }
  ]
}

Example 3: MAID resolution with hashing

{
  "multi_identifiers": [
    {
      "id_type": "maid",
      "hash_type": "sha256",
      "values": [
        "a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3"
      ]
    }
  ]
}

Example 4: Multi-criterion (email + phone + maid)

{
  "multi_identifiers": [
    {
      "id_type": "email",
      "hash_type": "plaintext",
      "values": ["alice@example.com"]
    },
    {
      "id_type": "phone",
      "hash_type": "plaintext",
      "values": ["+15551234567"]
    },
    {
      "id_type": "maid",
      "hash_type": "md5",
      "values": ["098f6bcd4621d373cade4e832627b4f6"]
    }
  ]
}

Example 5: Address resolution with distance

{
  "multi_identifiers": [
    {
      "id_type": "address",
      "hash_type": "plaintext",
      "values": ["123 Main St, San Francisco, CA 94105"]
    }
  ]
}

Example 6: Request specific identifier types

{
  "multi_identifiers": [
    {
      "id_type": "email",
      "hash_type": "plaintext",
      "values": ["alice@example.com"]
    }
  ],
  "identifier_types": ["email", "phone"]
}

Example 7: Export to CSV

{
  "multi_identifiers": [
    {
      "id_type": "email",
      "hash_type": "md5",
      "values": [
        "5d41402abc4b2a76b9719d911017c592",
        "098f6bcd4621d373cade4e832627b4f6"
      ]
    }
  ],
  "format": "csv"
}
