resolve_identities
Description: Resolve person identities by matching emails, phones, addresses, or mobile advertising IDs (MAIDs). Supports querying multiple identifier types in a single call with Noisy-OR quality score aggregation. Returns person IDs grouped by individual with quality scores. Email addresses are automatically normalized (Gmail dots and plus-addressing removed). Phone numbers are normalized to E.164 format.
Tool Identifier: resolve_identities
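The description mentions that email addresses are normalized automatically (Gmail dots and plus-addressing removed). The exact server-side rules are not spelled out here, so the following is only a rough sketch of what such normalization might look like, for example when a client wants to normalize values before hashing them itself. The function name `normalizeEmail`, the list of Gmail domains, and the choice to strip plus tags for every domain are assumptions, not documented behavior.

```typescript
// Illustrative only: the service's exact normalization rules are not
// documented in this section. Assumptions made here: lowercase everything,
// strip a "+tag" suffix from the local part for any domain, and remove
// dots from the local part only for Gmail domains.
const GMAIL_DOMAINS: string[] = ["gmail.com", "googlemail.com"];

function normalizeEmail(raw: string): string {
  const trimmed = raw.trim().toLowerCase();
  const at = trimmed.lastIndexOf("@");
  if (at <= 0 || at === trimmed.length - 1) {
    throw new Error(`not an email address: ${raw}`);
  }
  let local = trimmed.slice(0, at);
  const domain = trimmed.slice(at + 1);

  // Drop the plus-addressing tag, e.g. "alice+news" -> "alice".
  const plus = local.indexOf("+");
  if (plus > 0) {
    local = local.slice(0, plus);
  }

  // Gmail ignores dots in the local part, e.g. "john.doe" -> "johndoe".
  if (GMAIL_DOMAINS.indexOf(domain) !== -1) {
    local = local.split(".").join("");
  }
  return `${local}@${domain}`;
}
```

Under these assumptions, `John.Doe+news@Gmail.com` and `johndoe@gmail.com` normalize to the same string, while dots in non-Gmail addresses are left alone.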
Input Parameters
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| multi_identifiers | array | Yes | - | Max 50 groups per request; each group's values array is capped at 3,000 entries — use csv_resource_uri for larger inputs | Query multiple identifier types simultaneously |
| format | string | No | "none" | "none", "csv", "json", "jsonl" | Export format - generates presigned S3 URL valid for 1 hour |
| identifier_types | array | No | ["email"] | "name", "email", "phone", "address", "maid" | Contact types to return in the identifiers field from resolved person profiles |
| workflow_id | string | No | - | Valid UUID | Workflow session identifier for correlation |
Parameter Details:
multi_identifiers:
- Array of objects, each specifying id_type, hash_type, and values[]
- Allows querying across different identifier types in one call
- Email/phone/maid can be mixed in a single call
- Address must be in a separate call (uses geospatial H3 matching)
- Returns Noisy-OR aggregated overall_quality_score per person
- Capped at 50 identifier groups per request; split larger inputs into multiple calls
- Each identifier group's values array is capped at 3,000 entries; for larger inputs use csv_resource_uri (governed by a separate 200,000-row cap)
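The constraints in this list can be checked on the client before a request is sent. The following is a minimal sketch of such a pre-flight check. The function name `validateMultiIdentifiers` and its messages are hypothetical; only the limits themselves (50 groups, 3,000 values per group, address groups in a separate call) come from the list above.

```typescript
// Hypothetical client-side pre-flight check; names are illustrative only.
type IdType = "email" | "phone" | "address" | "maid";
type HashType = "plaintext" | "md5" | "sha1" | "sha256";

interface IdentifierGroup {
  id_type: IdType;
  hash_type: HashType;
  values: string[];
}

const MAX_GROUPS = 50;             // documented cap on groups per request
const MAX_VALUES_PER_GROUP = 3000; // documented cap on values per group

// Returns a list of human-readable problems; an empty list means the
// request satisfies the documented constraints.
function validateMultiIdentifiers(groups: IdentifierGroup[]): string[] {
  const problems: string[] = [];
  if (groups.length === 0) {
    problems.push("no identifier groups supplied");
  }
  if (groups.length > MAX_GROUPS) {
    problems.push(`too many groups: ${groups.length} > ${MAX_GROUPS}`);
  }
  groups.forEach((group, i) => {
    if (group.values.length > MAX_VALUES_PER_GROUP) {
      problems.push(
        `group ${i} has ${group.values.length} values (max ${MAX_VALUES_PER_GROUP})`
      );
    }
  });
  // Address groups use H3 matching and must not share a call with other types.
  const hasAddress = groups.some((g) => g.id_type === "address");
  const hasOther = groups.some((g) => g.id_type !== "address");
  if (hasAddress && hasOther) {
    problems.push("address groups must be sent in a separate call");
  }
  return problems;
}
```

Returning a list of problems rather than throwing on the first one lets a caller report everything that needs fixing in a single pass.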
Supported id_types:
- "email" - Email addresses with automatic normalization
- "phone" - Phone numbers (E.164 format recommended)
- "address" - Physical addresses (geospatial matching via H3 resolution 11, ~28m precision)
- "maid" - Mobile advertising IDs (IDFA for iOS, GAID for Android)
Supported hash_types:
- "plaintext" - Unhashed values
- "md5" - MD5 hash
- "sha1" - SHA-1 hash
- "sha256" - SHA-256 hash
Example multi_identifiers:
{
"multi_identifiers": [
{
"id_type": "email",
"hash_type": "plaintext",
"values": ["alice@example.com", "bob@example.com"]
},
{
"id_type": "phone",
"hash_type": "plaintext",
"values": ["+15551234567"]
},
{
"id_type": "maid",
"hash_type": "sha256",
"values": ["abc123..."]
}
]
}
format:
- When set to csv, json, or jsonl, generates an S3 presigned download URL
- URL expires in 1 hour
- Returns export metadata in response
identifier_types:
- Array of contact types to return in the identifiers field
- Valid types: "name", "email", "phone", "address", "maid"
- Default: ["email"]
- Returns actual stored contact data from the resolved person profiles
- Eliminates the need for a follow-up get_person call to retrieve contact info
- The identifiers field may be an empty object {} if the person has no stored contacts matching the requested types
workflow_id:
- Optional UUID for tracking related tool calls in a session
- If not provided, a new workflow_id is generated
- Used for deterministic sampling and feedback correlation
Request Schema:
interface ResolveIdentitiesParams {
multi_identifiers: Array<{
id_type: "email" | "phone" | "address" | "maid";
hash_type: "plaintext" | "md5" | "sha1" | "sha256";
values: string[];
}>;
format?: "none" | "csv" | "json" | "jsonl";
identifier_types?: Array<"name" | "email" | "phone" | "address" | "maid">;
workflow_id?: string;
}
Output Format
Success Response:
{
identities: Array<{
person_id: number;
overall_quality_score: number;
matches: Array<{
criterion_type: string;
criterion_value: string;
quality_score: number;
}>;
identifiers: {
[type: string]: string[]; // e.g., { email: ["a@example.com"], phone: ["+15551234567"] }
};
address?: {
normalized_address: string;
latitude: number;
longitude: number;
distance_meters: number;
};
}>,
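stats: {
requested: number,
resolved: number,
rate: number,
resolved_by_type?: { [id_type: string]: number } // described under Response Fields; not shown in the example responses
},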
export?: {
url: string;
format: "csv" | "json" | "jsonl";
rows: number;
size_bytes: number;
expires_at: string;
},
tool_trace_id: string,
workflow_id: string
}
Response Fields:
| Field | Type | Description |
|---|---|---|
| identities | array | Array of resolved identities grouped by person_id |
| identities[].person_id | number | Person ID |
| identities[].overall_quality_score | number | Noisy-OR aggregated confidence (0-1) across all matches |
| identities[].matches | array | Individual criterion matches with per-criterion scores |
| identities[].matches[].criterion_type | string | Type (e.g., "email_plaintext", "phone_md5", "maid_sha256") |
| identities[].matches[].criterion_value | string | The matched value |
| identities[].matches[].quality_score | number | Quality score for this specific match (0-1) |
| identities[].identifiers | object | Stored contact data from person profile, keyed by type (e.g., email, phone). Returns types specified in identifier_types parameter. |
| identities[].address | object | Geocoding data with distance (only for address queries) |
| identities[].address.normalized_address | string | Normalized address string |
| identities[].address.latitude | number | Latitude coordinate |
| identities[].address.longitude | number | Longitude coordinate |
| identities[].address.distance_meters | number | Distance from query address in meters |
| stats.requested | number | Total identifier values provided across all groups |
| stats.resolved | number | Distinct identities matched |
| stats.rate | number | Distinct identities resolved per identifier requested (resolved / requested), bounded to [0, 1] |
| stats.resolved_by_type | object | Distinct identities matched per identifier type (e.g. {"email": 171, "address": 226}). Each identity contributes at most 1 per type bucket regardless of how many criteria of that type matched it. |
| export | object | Export metadata (only when format is csv/json/jsonl) |
| export.url | string | Presigned S3 download URL (expires in 1 hour) |
| export.format | string | Export format |
| export.rows | number | Number of rows in export |
| export.size_bytes | number | File size in bytes |
| export.expires_at | string | ISO 8601 expiration timestamp |
| tool_trace_id | string | OpenTelemetry trace ID for this tool execution |
| workflow_id | string | Workflow session identifier |
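The overall_quality_score is described as a Noisy-OR aggregate of the per-match quality scores. In its standard form, Noisy-OR combines scores q_1 ... q_n as 1 - (1 - q_1)(1 - q_2)...(1 - q_n), which treats each match as independent evidence. The sketch below implements that textbook formula. Whether the service applies any weighting or capping on top of it is not stated here, and the function name `noisyOr` is illustrative.

```typescript
// Standard Noisy-OR: the combined score is the probability that at least
// one of several independent matches is correct.
function noisyOr(scores: number[]): number {
  let probabilityAllWrong = 1;
  for (const q of scores) {
    if (q < 0 || q > 1) {
      throw new RangeError(`score out of [0, 1]: ${q}`);
    }
    probabilityAllWrong *= 1 - q;
  }
  return 1 - probabilityAllWrong;
}
```

With a single match the aggregate equals that match's score, which agrees with the email example response (one match at 0.95, overall 0.95). Two matches at 0.9 and 0.8 combine to about 0.98, so additional corroborating identifiers can only raise the overall score.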
Example Response (Email Resolution):
{
"identities": [
{
"person_id": 123456,
"overall_quality_score": 0.95,
"matches": [
{
"criterion_type": "email_plaintext",
"criterion_value": "john.doe@example.com",
"quality_score": 0.95
}
],
"identifiers": {
"email": ["john.doe@example.com", "jdoe@work.com"]
}
}
],
"stats": {
"requested": 2,
"resolved": 1,
"rate": 0.5
},
"tool_trace_id": "a1b2c3d4e5f6",
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Example Response (Multi-Criterion with Export):
{
"identities": [],
"stats": {
"requested": 3,
"resolved": 2,
"rate": 0.67
},
"export": {
"url": "https://s3.amazonaws.com/bucket/file.csv?...",
"format": "csv",
"rows": 2,
"size_bytes": 1024,
"expires_at": "2025-01-16T12:00:00Z"
},
"tool_trace_id": "a1b2c3d4e5f6",
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Example Response (Address Resolution with Distance):
{
"identities": [
{
"person_id": 789012,
"overall_quality_score": 0.88,
"matches": [
{
"criterion_type": "address_h3_11",
"criterion_value": "123 Main St, San Francisco, CA 94105",
"quality_score": 0.88
}
],
"identifiers": {
"email": ["resident@example.com"]
},
"address": {
"normalized_address": "123 Main St, San Francisco, CA 94105, USA",
"latitude": 37.7749,
"longitude": -122.4194,
"distance_meters": 15.3
}
}
],
"stats": {
"requested": 1,
"resolved": 1,
"rate": 1.0
},
"tool_trace_id": "a1b2c3d4e5f6",
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Error Handling
Error Response Format:
{
"content": [
{
"type": "text",
"text": "Identity resolution failed: <error message>"
}
],
"isError": true
}
Common Errors:
- Empty identifiers array: "At least one identifier required"
- Invalid identifier format: "Invalid email/phone/address format"
- More than 3,000 identifiers in a single request: "Maximum 3000 identifiers allowed"
- More than 50 identifier groups in multi_identifiers: "Maximum 50 identifier groups allowed"
- More than 3,000 values in any identifier group: "Maximum 3000 values per identifier group"
- Service temporarily unavailable: "Request failed - please try again"
- Request timeout: "Request took too long - try reducing batch size"
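A client can pull the message out of the error response format shown above and decide how to react. This is a minimal sketch under stated assumptions: the interface mirrors only the fields shown in the error example, and the helper names `extractErrorMessage` and `isRetryable`, along with the choice of which messages count as retryable, are hypothetical.

```typescript
// Shape taken from the "Error Response Format" example above.
interface ToolResult {
  content: Array<{ type: string; text: string }>;
  isError?: boolean;
}

const ERROR_PREFIX = "Identity resolution failed: ";

// Returns the bare error message, or null when the result is not an error.
function extractErrorMessage(result: ToolResult): string | null {
  if (!result.isError) {
    return null;
  }
  const text = result.content
    .filter((part) => part.type === "text")
    .map((part) => part.text)
    .join("\n");
  return text.indexOf(ERROR_PREFIX) === 0 ? text.slice(ERROR_PREFIX.length) : text;
}

// Assumption: only the two transient messages listed above are worth
// retrying (the timeout case with a smaller batch); the rest are input errors.
function isRetryable(message: string): boolean {
  return (
    message.indexOf("please try again") !== -1 ||
    message.indexOf("try reducing batch size") !== -1
  );
}
```

Matching on message substrings is brittle; if the service exposes structured error codes, those would be a better basis for retry decisions.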
When csv_resource_uri points to a file with more than 200,000 rows, paginate using the next_offset cursor returned in the response: pass it back as offset on the next call until the field is omitted (last page). offset, limit, and next_offset only apply on the CSV path; they are ignored when inline identifiers / multi_identifiers is used.
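The pagination loop described above can be sketched as follows. Only the names csv_resource_uri, offset, limit, and next_offset come from the text; their types, the response shape, and the `callTool` parameter (a stand-in for whatever client function actually issues the request) are assumptions. A real client would most likely be asynchronous; the sketch is synchronous only to stay short.

```typescript
// Hypothetical shapes: field types are assumed, not documented.
interface CsvPageRequest {
  csv_resource_uri: string;
  offset?: number;
  limit?: number;
}

interface CsvPageResponse<T> {
  identities: T[];
  next_offset?: number; // omitted on the last page
}

// Walks every page by feeding next_offset back in as offset until the
// response no longer carries a next_offset.
function resolveAllPages<T>(
  first: CsvPageRequest,
  callTool: (req: CsvPageRequest) => CsvPageResponse<T>
): T[] {
  const all: T[] = [];
  let request: CsvPageRequest | null = first;
  while (request !== null) {
    const page: CsvPageResponse<T> = callTool(request);
    for (const identity of page.identities) {
      all.push(identity);
    }
    request =
      page.next_offset === undefined
        ? null
        : { ...request, offset: page.next_offset };
  }
  return all;
}
```

Treating a missing next_offset as the only stop condition follows the description above; the loop does not try to infer the last page from the number of rows returned.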
Address Matching Behavior
- Returns only the best-scoring person(s) per input address (not all people in the H3 cell)
- Minimum quality threshold of 0.75 (~31m distance) excludes weak matches
- Household members tied at max score are all returned
- Maximum batch size: 1,000 addresses per request
- Quality scores use distance-based decay: full score within 5m, linear decay to 10% at 100m, floor at 10% beyond
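The distance-based decay in the last bullet can be written as a small piecewise function. This is a sketch of the stated curve only (full score within 5m, linear decay to 10% at 100m, 10% floor beyond). How the service combines this factor with other signals to produce the reported quality_score is not specified here, and the function names are illustrative.

```typescript
// Piecewise decay as described above: 1.0 up to 5 m, falling linearly to
// 0.1 at 100 m, and held at 0.1 beyond that.
const FULL_SCORE_RADIUS_M = 5;
const DECAY_END_M = 100;
const FLOOR = 0.1;

function distanceDecay(distanceMeters: number): number {
  if (distanceMeters <= FULL_SCORE_RADIUS_M) return 1;
  if (distanceMeters >= DECAY_END_M) return FLOOR;
  const fraction =
    (distanceMeters - FULL_SCORE_RADIUS_M) / (DECAY_END_M - FULL_SCORE_RADIUS_M);
  return 1 - (1 - FLOOR) * fraction;
}

// Inverse of the linear segment: the distance at which the decay factor
// equals the given score (valid for scores strictly between FLOOR and 1).
function distanceAtScore(score: number): number {
  return (
    FULL_SCORE_RADIUS_M +
    ((1 - score) / (1 - FLOOR)) * (DECAY_END_M - FULL_SCORE_RADIUS_M)
  );
}
```

Solving the linear segment for a factor of 0.75 gives roughly 31.4 m, which is consistent with the ~31m figure quoted for the minimum quality threshold.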
Performance Notes
- Supports batch processing of multiple identifiers in a single request
- Returns quality scores for each resolved identity
- Results ordered by quality score (highest quality first)
- Inline input is capped (see the multi_identifiers constraints above); even within those caps, very large batches may time out
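Given the caps documented for multi_identifiers (50 groups per request, 3,000 values per group) and the timeout risk for large batches, a client holding a long inline list might split it into several smaller requests. The sketch below shows one way to do that. The function name `buildBatches` is hypothetical, and it assumes that several groups with the same id_type may appear in one request, which the text does not confirm. For addresses, the 1,000-per-request limit suggests passing smaller values such as 1,000 and 1.

```typescript
interface ValueGroup {
  id_type: string;
  hash_type: string;
  values: string[];
}

// Splits one long list of values into request bodies that respect the
// given caps (defaults: 3,000 values per group, 50 groups per request).
function buildBatches(
  idType: string,
  hashType: string,
  values: string[],
  maxValuesPerGroup = 3000,
  maxGroupsPerRequest = 50
): Array<{ multi_identifiers: ValueGroup[] }> {
  // First cut the values into groups of at most maxValuesPerGroup.
  const groups: ValueGroup[] = [];
  for (let i = 0; i < values.length; i += maxValuesPerGroup) {
    groups.push({
      id_type: idType,
      hash_type: hashType,
      values: values.slice(i, i + maxValuesPerGroup),
    });
  }
  // Then pack the groups into requests of at most maxGroupsPerRequest.
  const requests: Array<{ multi_identifiers: ValueGroup[] }> = [];
  for (let i = 0; i < groups.length; i += maxGroupsPerRequest) {
    requests.push({
      multi_identifiers: groups.slice(i, i + maxGroupsPerRequest),
    });
  }
  return requests;
}
```

Lowering both limits below the documented caps is a simple way to trade more round trips for a lower chance of hitting the timeout error.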
Usage Examples
Example 1: Simple email resolution
{
"multi_identifiers": [
{
"id_type": "email",
"hash_type": "plaintext",
"values": ["alice@example.com", "bob@example.com"]
}
]
}
Example 2: Phone number resolution
{
"multi_identifiers": [
{
"id_type": "phone",
"hash_type": "plaintext",
"values": ["+15551234567", "+442071234567"]
}
]
}
Example 3: MAID resolution with hashing
{
"multi_identifiers": [
{
"id_type": "maid",
"hash_type": "sha256",
"values": [
"a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3"
]
}
]
}
Example 4: Multi-criterion (email + phone + maid)
{
"multi_identifiers": [
{
"id_type": "email",
"hash_type": "plaintext",
"values": ["alice@example.com"]
},
{
"id_type": "phone",
"hash_type": "plaintext",
"values": ["+15551234567"]
},
{
"id_type": "maid",
"hash_type": "md5",
"values": ["098f6bcd4621d373cade4e832627b4f6"]
}
]
}
Example 5: Address resolution with distance
{
"multi_identifiers": [
{
"id_type": "address",
"hash_type": "plaintext",
"values": ["123 Main St, San Francisco, CA 94105"]
}
]
}
Example 6: Request specific identifier types
{
"multi_identifiers": [
{
"id_type": "email",
"hash_type": "plaintext",
"values": ["alice@example.com"]
}
],
"identifier_types": ["email", "phone"]
}
Example 7: Export to CSV
{
"multi_identifiers": [
{
"id_type": "email",
"hash_type": "md5",
"values": [
"5d41402abc4b2a76b9719d911017c592",
"098f6bcd4621d373cade4e832627b4f6"
]
}
],
"format": "csv"
}
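Example 7 requests a CSV export, and the resulting presigned URL is documented as valid for one hour. A client that stores the export metadata for later use may want to check expires_at before attempting a download. The following is a small sketch of such a check. The field names follow the export object in the response schema, while the function name `isExportUsable` and the 60-second safety margin are assumptions.

```typescript
// Field names follow the export object in the response schema.
interface ExportInfo {
  url: string;
  format: "csv" | "json" | "jsonl";
  rows: number;
  size_bytes: number;
  expires_at: string; // ISO 8601 timestamp
}

// True when the presigned URL should still be valid, keeping a safety
// margin (hypothetical default: 60 s) so that a download is not started
// moments before the URL expires.
function isExportUsable(
  info: ExportInfo,
  now: Date = new Date(),
  marginMs = 60000
): boolean {
  const expiresAt = Date.parse(info.expires_at);
  if (isNaN(expiresAt)) {
    return false; // unparseable timestamp: treat as expired
  }
  return now.getTime() + marginMs < expiresAt;
}
```

If the check fails, the straightforward recovery is to repeat the original request with the same format value to obtain a fresh URL.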