Resolve each row of a CSV file to a platform entity ID and optionally append enrichment data, preserving the original row structure.
Quick Example
{
"entity_type": "person",
"csv_resource_uri": "workflow://550e8400-e29b-41d4-a716-446655440000/uploads/customers.csv",
"email_columns": { "names": ["email"] },
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Input Parameters
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| entity_type | string | Yes | - | "person" or "business" | Type of entity to resolve |
| csv_key | string | Conditional | - | Filename | CSV filename from generate_upload_url. Mutually exclusive with csv_resource_uri |
| csv_resource_uri | string | Conditional | - | workflow:// URI | CSV resource URI. Mutually exclusive with csv_key |
| email_columns | object | No | - | { names: string[], hash_type?: string } | Columns containing email addresses |
| phone_columns | object | No | - | { names: string[], hash_type?: string } | Columns containing phone numbers |
| address_columns | object | No | - | { names: string[], hash_type?: string } | Columns containing physical addresses |
| name_columns | object | No | - | { names: string[], hash_type?: string } | Columns containing entity names |
| tiebreaker_hierarchy | array | No | ["email", "phone", "address", "name"] | Ordered array | Priority for divergent identifier resolution |
| min_score_threshold | number | No | 0.0 | 0.0-1.0 | Minimum match score threshold |
| domains | array | No | - | Enrichment domains | Optional enrichment domains to include |
| contact_types | array | No | - | "name", "email", "phone", "address" | Contact types from resolved profiles |
| include_unmatched | boolean | No | true | - | Include rows with no matches in output |
| include_score_breakdown | boolean | No | false | - | Include detailed score breakdown per identifier |
| workflow_id | string | Yes | - | Valid UUID | Workflow session ID from generate_upload_url |
Parameter Details:
csv_key vs csv_resource_uri:
- Provide exactly one. They are mutually exclusive.
- At least one identifier column must be specified (email_columns, phone_columns, address_columns, or name_columns)
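The two rules above can be checked on the client before a request is sent. The following is a minimal sketch, not the platform's own validation: `validateSource` and the trimmed-down `SourceParams` type are hypothetical names introduced for illustration, and treating an empty `names` array as "not specified" is an assumption.

```typescript
// Hypothetical subset of the request parameters, just enough for the check.
type ColumnSpec = { names: string[]; hash_type?: string };

interface SourceParams {
  csv_key?: string;
  csv_resource_uri?: string;
  email_columns?: ColumnSpec;
  phone_columns?: ColumnSpec;
  address_columns?: ColumnSpec;
  name_columns?: ColumnSpec;
}

// Returns a list of problems; an empty list means both rules are satisfied.
function validateSource(params: SourceParams): string[] {
  const errors: string[] = [];

  // Rule 1: exactly one of csv_key / csv_resource_uri must be present.
  const hasKey = params.csv_key !== undefined;
  const hasUri = params.csv_resource_uri !== undefined;
  if (hasKey === hasUri) {
    errors.push("Provide exactly one of csv_key or csv_resource_uri");
  }

  // Rule 2: at least one identifier column group must be specified.
  // Assumption: a group with an empty names array counts as unspecified.
  const groups = [
    params.email_columns,
    params.phone_columns,
    params.address_columns,
    params.name_columns,
  ];
  const hasIdentifier = groups.some((g) => g !== undefined && g.names.length > 0);
  if (!hasIdentifier) {
    errors.push("Specify at least one identifier column");
  }

  return errors;
}
```

Returning a list of messages rather than throwing on the first problem lets a caller report both violations at once.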
email_columns / phone_columns / address_columns / name_columns:
- names - Array of CSV column names containing the identifier values
- hash_type - Hash format of values in those columns: "plaintext" (default), "md5", "sha1", or "sha256"
- Use hash_type when your CSV contains pre-hashed identifiers
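For context, a pre-hashed column of the kind hash_type describes might be produced as sketched below, using Node's built-in `crypto` module. `hashIdentifier` is a hypothetical helper, and the trim-and-lowercase normalization is an assumption: the normalization the platform expects before hashing is not stated here, so confirm it before relying on this.

```typescript
import { createHash } from "node:crypto";

type HashedType = "md5" | "sha1" | "sha256";

// Assumption: values are trimmed and lowercased before hashing.
// Adjust this step to whatever normalization the platform actually expects.
function hashIdentifier(value: string, hashType: HashedType): string {
  const normalized = value.trim().toLowerCase();
  // Hex-encoded digest: 32 characters for md5, 40 for sha1, 64 for sha256.
  return createHash(hashType).update(normalized, "utf8").digest("hex");
}
```

A column built with `hashIdentifier(email, "md5")` would then be declared as `{ "names": ["email_md5"], "hash_type": "md5" }`.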
tiebreaker_hierarchy:
- When multiple identifiers resolve to different entities, the hierarchy determines which entity wins
- Default: ["email", "phone", "address", "name"] (email has highest priority)
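One way to read the hierarchy rule is as a first-match scan over the ordered list. The sketch below illustrates that reading only; the platform's actual resolution logic is not documented here, and `pickWinner` and its `candidates` shape are hypothetical.

```typescript
type IdentifierType = "email" | "phone" | "address" | "name";

const DEFAULT_HIERARCHY: IdentifierType[] = ["email", "phone", "address", "name"];

// candidates maps each identifier type that matched to the entity ID it resolved to.
// The first type in the hierarchy that has a candidate wins.
function pickWinner(
  candidates: Partial<Record<IdentifierType, string>>,
  hierarchy: IdentifierType[] = DEFAULT_HIERARCHY,
): { entityId: string; winner: IdentifierType } | undefined {
  for (const type of hierarchy) {
    const entityId = candidates[type];
    if (entityId !== undefined) {
      return { entityId, winner: type };
    }
  }
  return undefined; // no identifier produced a candidate
}
```

With the default hierarchy, a row whose email resolves to one entity and whose phone resolves to another would take the email's entity; passing `["phone", "email"]` would reverse that.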
domains (enrichment):
- Available: demographic, affinity, content, employment, financial, household, id, intent, interest, lifestyle, political, purchase
- When specified, adds enrichment columns (prefixed with _) to the output CSV
Output columns added to CSV:
- _entity_id - Resolved entity ID
- _match_score - Overall match confidence (0-1)
- _match_method - How the match was made (composite, single, tiebreaker)
- _matched_identifiers - Which identifiers matched
- _tiebreaker_winner - Which identifier type won the tiebreak
- _enriched_{contact_type} - Enriched contact data (when contact_types specified)
- _{domain} - Domain enrichment data (when domains specified)
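The naming pattern above can be used to predict which columns a given request adds, for example to check the header of a downloaded file. This is a sketch under assumptions: `expectedOutputColumns` is a hypothetical helper, the column order shown is not guaranteed by the documentation, and any columns produced by include_score_breakdown are not covered.

```typescript
type ContactType = "name" | "email" | "phone" | "address";

// The five fixed columns listed above.
// Assumption: all five are present on every run.
const BASE_COLUMNS: string[] = [
  "_entity_id",
  "_match_score",
  "_match_method",
  "_matched_identifiers",
  "_tiebreaker_winner",
];

// Builds the list of added column names from the request's contact_types and domains.
function expectedOutputColumns(contactTypes: ContactType[] = [], domains: string[] = []): string[] {
  return [
    ...BASE_COLUMNS,
    ...contactTypes.map((t) => `_enriched_${t}`),
    ...domains.map((d) => `_${d}`),
  ];
}
```

For a request with contact_types `["phone", "email"]` and domains `["demographic"]`, the result includes `_enriched_phone`, `_enriched_email`, and `_demographic` in addition to the five fixed columns.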
Request Schema:
interface ResolveAndEnrichRowsParams {
entity_type: "person" | "business";
csv_key?: string;
csv_resource_uri?: string;
email_columns?: { names: string[]; hash_type?: "plaintext" | "md5" | "sha1" | "sha256" };
phone_columns?: { names: string[]; hash_type?: "plaintext" | "md5" | "sha1" | "sha256" };
address_columns?: { names: string[]; hash_type?: "plaintext" | "md5" | "sha1" | "sha256" };
name_columns?: { names: string[]; hash_type?: "plaintext" | "md5" | "sha1" | "sha256" };
tiebreaker_hierarchy?: Array<"email" | "phone" | "address" | "name">;
min_score_threshold?: number;
domains?: Array<"demographic" | "affinity" | "content" | "employment" | "financial" | "household" | "id" | "intent" | "interest" | "lifestyle" | "political" | "purchase">;
contact_types?: Array<"name" | "email" | "phone" | "address">;
include_unmatched?: boolean;
include_score_breakdown?: boolean;
workflow_id: string;
}
Output Format
{
export: {
download_url: string;
format: "csv";
expires_at: string;
};
stats: {
total_rows: number;
matched_rows: number;
unmatched_rows: number;
match_rate: number;
by_method: {
composite: number;
single: number;
tiebreaker: number;
};
avg_score: number;
};
tool_trace_id: string;
workflow_id: string;
}
Response Fields:
| Field | Type | Description |
|---|---|---|
| export.download_url | string | Presigned URL for enriched CSV |
| export.format | string | Always "csv" |
| export.expires_at | string | ISO 8601 expiration timestamp |
| stats.total_rows | number | Total input rows processed |
| stats.matched_rows | number | Rows with successful matches |
| stats.unmatched_rows | number | Rows with no matches |
| stats.match_rate | number | Match rate (0-1) |
| stats.by_method.composite | number | Rows matched via multiple identifiers |
| stats.by_method.single | number | Rows matched via single identifier |
| stats.by_method.tiebreaker | number | Rows resolved via tiebreaker |
| stats.avg_score | number | Average match score |
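The stats fields lend themselves to a quick sanity check on the client. The relationships tested below are assumptions inferred from the field descriptions and from the example response in this section, not documented guarantees; `checkStats` and the rounding tolerance are hypothetical.

```typescript
interface ResolveStats {
  total_rows: number;
  matched_rows: number;
  unmatched_rows: number;
  match_rate: number;
  by_method: { composite: number; single: number; tiebreaker: number };
  avg_score: number;
}

// Returns a list of inconsistencies; an empty list means the stats agree with each other.
// rateTolerance allows for match_rate being rounded by the server (an assumption).
function checkStats(stats: ResolveStats, rateTolerance = 0.005): string[] {
  const problems: string[] = [];

  if (stats.matched_rows + stats.unmatched_rows !== stats.total_rows) {
    problems.push("matched_rows + unmatched_rows does not equal total_rows");
  }

  const { composite, single, tiebreaker } = stats.by_method;
  if (composite + single + tiebreaker !== stats.matched_rows) {
    problems.push("by_method counts do not sum to matched_rows");
  }

  const expectedRate = stats.total_rows === 0 ? 0 : stats.matched_rows / stats.total_rows;
  if (Math.abs(stats.match_rate - expectedRate) > rateTolerance) {
    problems.push("match_rate does not equal matched_rows / total_rows");
  }

  return problems;
}

// Figures taken from the example response in this section.
const exampleStats: ResolveStats = {
  total_rows: 5000,
  matched_rows: 4250,
  unmatched_rows: 750,
  match_rate: 0.85,
  by_method: { composite: 2100, single: 1800, tiebreaker: 350 },
  avg_score: 0.78,
};
```

Calling `checkStats(exampleStats)` returns an empty list, since 4250 + 750 = 5000, 2100 + 1800 + 350 = 4250, and 4250 / 5000 = 0.85.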
Example Response:
{
"export": {
"download_url": "https://s3.amazonaws.com/bucket/artifacts/550e.../resolved_rows_1705320000.csv?...",
"format": "csv",
"expires_at": "2025-01-16T13:00:00Z"
},
"stats": {
"total_rows": 5000,
"matched_rows": 4250,
"unmatched_rows": 750,
"match_rate": 0.85,
"by_method": {
"composite": 2100,
"single": 1800,
"tiebreaker": 350
},
"avg_score": 0.78
},
"tool_trace_id": "a1b2c3d4e5f6",
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Performance Notes
- Processes in streaming batches of 10,000 rows for memory efficiency
- Suitable for files with millions of rows
- Output maintains exact 1:1 row correspondence with input
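To make the batching and 1:1 guarantees concrete, the sketch below shows one way a streaming batch loop can keep output rows aligned with input rows. It is an illustration only, not the service's implementation: `inBatches`, `resolveBatch`, and `processAll` are hypothetical, and `resolveBatch` is a stand-in that appends a placeholder value.

```typescript
// Yields consecutive slices of at most batchSize rows, preserving input order.
function* inBatches<T>(rows: Iterable<T>, batchSize = 10_000): Generator<T[]> {
  let batch: T[] = [];
  for (const row of rows) {
    batch.push(row);
    if (batch.length === batchSize) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) {
    yield batch; // final partial batch
  }
}

// Stand-in for the real resolution step: exactly one output row per input row.
function resolveBatch(batch: string[]): string[] {
  return batch.map((row) => `${row},placeholder_entity_id`);
}

// Processes rows batch by batch; only one batch is held in memory at a time
// on the input side, and output order matches input order.
function processAll(rows: Iterable<string>, batchSize = 10_000): string[] {
  const output: string[] = [];
  for (const batch of inBatches(rows, batchSize)) {
    for (const resolved of resolveBatch(batch)) {
      output.push(resolved);
    }
  }
  return output;
}
```

Because each batch maps every input row to exactly one output row and batches are emitted in order, row N of the output always corresponds to row N of the input, including rows that did not match.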
Usage Examples
Example 1: Basic email resolution
{
"entity_type": "person",
"csv_resource_uri": "workflow://550e8400-e29b-41d4-a716-446655440000/uploads/customers.csv",
"email_columns": { "names": ["email"] },
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Example 2: Multi-identifier with enrichment
{
"entity_type": "person",
"csv_resource_uri": "workflow://550e8400-e29b-41d4-a716-446655440000/uploads/leads.csv",
"email_columns": { "names": ["work_email", "personal_email"] },
"phone_columns": { "names": ["phone"] },
"address_columns": { "names": ["address"] },
"domains": ["demographic", "employment", "financial"],
"contact_types": ["phone", "email"],
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Example 3: High-quality matches only
{
"entity_type": "person",
"csv_key": "prospects.csv",
"email_columns": { "names": ["email"] },
"min_score_threshold": 0.8,
"include_unmatched": false,
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Example 4: Pre-hashed identifiers
{
"entity_type": "person",
"csv_resource_uri": "workflow://550e8400-e29b-41d4-a716-446655440000/uploads/hashed_list.csv",
"email_columns": { "names": ["email_md5"], "hash_type": "md5" },
"phone_columns": { "names": ["phone_sha256"], "hash_type": "sha256" },
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}