Check trait membership for a set of entities against one or more labeled trait expressions. Returns a per-entity boolean matrix where 1 means the entity matches the expression and 0 means it does not. Uses the same boolean expression syntax as entity_find.
Quick Example
{
"entity_type": "person",
"entity_ids": ["123", "456", "789"],
"expressions": [
{ "label": "golf_fans", "expression": "6003037" },
{ "label": "high_income_golf_fans", "expression": "6003037 AND 2000007" }
]
}
Input Parameters
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| entity_type | string | Yes | - | "person" or "business" | Type of entity being checked |
| entity_ids | array | Conditional | - | Array of strings or integers, max 1000 | Entity IDs (inline mode). Mutually exclusive with csv_resource_uri |
| csv_resource_uri | string | Conditional | - | workflow:// URI | CSV or Parquet file containing entity IDs. Mutually exclusive with entity_ids |
| entity_id_column | string | No | "entity_id" | Column name | Column containing entity IDs (only with csv_resource_uri) |
| expressions | array | Yes | - | 1-100 entries of { label, expression } | Labeled boolean trait expressions to evaluate per entity |
| include_unmatched | boolean | No | true | - | If false, drop entities that match no expression |
| include_identifiers | array | No | - | Subset of name, email, phone, address, maid | Identifier columns to include alongside entity_id |
| format | string | No | "none" | "none", "csv", "json", "jsonl" | Export format. Non-none values produce a presigned download URL valid for 1 hour |
| workflow_id | string | Conditional | - | Valid UUID | Required when format is not "none" |
Parameter Details:
entity_ids vs csv_resource_uri:
- Provide exactly one. They are mutually exclusive.
- entity_ids for small inline batches (capped at 1000 IDs)
- csv_resource_uri for chaining from entity_resolve or entity_find output (recommended for larger sets); bounded at 200,000 entity IDs
- csv_resource_uri supports both .csv and .parquet files
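The input-mode rules above can be checked before a request is sent. The following is a minimal client-side sketch, not the tool's own validation: `checkEntityInput` and `MAX_INLINE_IDS` are hypothetical names, the cap message wording is invented for illustration, and the case-sensitive extension check is an assumption. The 200,000-ID bound on files is not checked here because it would require reading the file.

```typescript
// Hypothetical client-side pre-check; not part of the tool itself.
type EntityInput = {
  entity_ids?: Array<string | number>;
  csv_resource_uri?: string;
};

const MAX_INLINE_IDS = 1000; // inline cap stated in the parameter table

// Returns an error message, or null when the input mode looks valid.
function checkEntityInput(input: EntityInput): string | null {
  const ids = input.entity_ids;
  const uri = input.csv_resource_uri;

  if (ids !== undefined && uri !== undefined) {
    return "entity_ids and csv_resource_uri are mutually exclusive";
  }
  if (ids === undefined && uri === undefined) {
    return "Either entity_ids or csv_resource_uri must be provided";
  }
  if (ids !== undefined && ids.length > MAX_INLINE_IDS) {
    // Message wording is an assumption; only the cap itself is documented.
    return "entity_ids exceeds the inline cap of " + MAX_INLINE_IDS;
  }
  if (uri !== undefined && !/\.(csv|parquet)$/.test(uri)) {
    return "csv_resource_uri must point to a .csv or .parquet file, got: " + uri;
  }
  return null;
}
```

A caller would typically run this before building the full request and switch to csv_resource_uri when the ID list grows past the inline cap.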
expressions:
- Each entry is { "label": "<column-name>", "expression": "<boolean-expression>" }
- label becomes a column name in the output matrix; labels must be unique
- expression uses the same boolean syntax as entity_find
- 1-100 expressions per call
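The constraints on expressions (1-100 entries, unique labels) can be sketched as a small pre-flight check. `LabeledExpression` and `checkExpressions` are hypothetical names, and the message for an out-of-range count is an assumption; only the duplicate-label message is taken from this document.

```typescript
// Hypothetical helper types and names, for illustration only.
interface LabeledExpression {
  label: string;
  expression: string;
}

// Returns an error message, or null when the list satisfies the documented limits.
function checkExpressions(expressions: LabeledExpression[]): string | null {
  if (expressions.length < 1 || expressions.length > 100) {
    // Wording is an assumption; the 1-100 range itself is documented.
    return "expressions must contain 1-100 entries";
  }
  const seen: string[] = [];
  for (const { label } of expressions) {
    if (seen.indexOf(label) !== -1) {
      return 'Duplicate expression label: "' + label + '"';
    }
    seen.push(label);
  }
  return null;
}
```

Because each label becomes an output column, catching duplicates early avoids a failed call after the entity list has already been prepared.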
Boolean expression syntax (same as entity_find):
- Trait IDs (numeric) or trait hashes (32-character hex strings; the short hashes in the examples below are placeholders)
- Supports AND, OR, NOT, and parentheses for grouping
- Mixing trait IDs and trait hashes is allowed
- Discover valid trait IDs via trait_search or browse trait:// resources
"6003037" // Single trait
"6003037 AND 2000007" // Both traits
"(6003034 OR 6003037) AND NOT 2000012" // Grouped boolean
"abc123 AND def456" // Trait hashes
include_identifiers:
- Adds the entity's primary identifier of each requested type to the output row (e.g., email1, phone1)
- Useful for re-joining the matrix with the original input file
format and workflow_id:
- format: "none" (default) returns an inline summary plus a 10-row sample
- format: "csv" | "json" | "jsonl" streams the full matrix to S3 and returns a presigned export.url valid for 1 hour, plus a workflow:// resource URI for downstream tools
- A workflow_id is required whenever format is not "none"
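The dependency between format and workflow_id can be expressed as a short check. This is a sketch under assumptions: `checkExportOptions` and `UUID_PATTERN` are hypothetical names, the UUID pattern is a generic 8-4-4-4-12 hex check that may be stricter or looser than what the tool accepts, and the invalid-UUID message is invented.

```typescript
type ExportFormat = "none" | "csv" | "json" | "jsonl";

// Generic UUID shape (8-4-4-4-12 hex digits); the tool's exact rule is not documented here.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Returns an error message, or null when the export options are consistent.
function checkExportOptions(
  format: ExportFormat | undefined,
  workflowId: string | undefined
): string | null {
  const effectiveFormat = format === undefined ? "none" : format; // "none" is the documented default
  if (effectiveFormat !== "none" && workflowId === undefined) {
    return "workflow_id is required when format is not 'none'. Provide a workflow_id to enable export.";
  }
  if (workflowId !== undefined && !UUID_PATTERN.test(workflowId)) {
    return "workflow_id must be a valid UUID"; // wording is an assumption
  }
  return null;
}
```

Since the presigned URL expires after 1 hour, a caller that needs the data later would keep the workflow:// resource URI rather than the download URL.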
Request Schema:
interface EntityTraitsParams {
entity_type: "person" | "business";
entity_ids?: Array<string | number>;
csv_resource_uri?: string;
entity_id_column?: string;
expressions: Array<{
label: string;
expression: string;
}>;
include_unmatched?: boolean;
include_identifiers?: Array<"name" | "email" | "phone" | "address" | "maid">;
format?: "none" | "csv" | "json" | "jsonl";
workflow_id?: string; // Valid UUID; required when format is not "none"
}
Output Format
{
total_entities: number;
matched_entities: number;
expression_counts: Record<string, number>;
sample: Array<{
entity_id: string;
// Extra columns: identifier columns (e.g., email1, phone1) when include_identifiers is set,
// plus one 0 | 1 column per expression label. TypeScript allows only one string index signature.
[column: string]: string | number;
}>;
export?: {
url: string;
format: "csv" | "json" | "jsonl";
rows: number;
expires_at: string;
resource_uri: string;
};
tool_trace_id: string;
workflow_id: string;
}
Response Fields:
| Field | Type | Description |
|---|---|---|
| total_entities | number | Number of input entities scored (after include_unmatched filtering) |
| matched_entities | number | Entities matching at least one expression |
| expression_counts | object | Per-label match count, keyed by expression label |
| sample | array | Up to 10 matrix rows: entity_id, optional identifier columns, one 0/1 column per expression label |
| export | object | Present only when format is csv, json, or jsonl |
| export.url | string | Presigned download URL, valid for 1 hour |
| export.resource_uri | string | workflow:// URI for chaining into downstream tools |
| export.rows | number | Total rows written to the export |
| export.expires_at | string | ISO-8601 expiry of the presigned URL |
| tool_trace_id | string | OpenTelemetry trace ID |
| workflow_id | string | Workflow session identifier |
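As an illustration of how the summary fields relate, per-label match rates can be derived from expression_counts and total_entities. `TraitSummary` and `matchRates` are hypothetical names; the sketch only reads fields documented in the table above.

```typescript
// Subset of the documented response that the calculation needs.
interface TraitSummary {
  total_entities: number;
  matched_entities: number;
  expression_counts: Record<string, number>;
}

// Fraction of scored entities matching each label (0 when nothing was scored).
function matchRates(summary: TraitSummary): Record<string, number> {
  const rates: Record<string, number> = {};
  for (const label of Object.keys(summary.expression_counts)) {
    rates[label] =
      summary.total_entities === 0
        ? 0
        : summary.expression_counts[label] / summary.total_entities;
  }
  return rates;
}
```

Because total_entities is counted after include_unmatched filtering, rates computed this way are relative to the rows actually returned; with include_unmatched set to false they are relative to matched entities only.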
Example Response:
{
"total_entities": 500,
"matched_entities": 245,
"expression_counts": {
"golf_fans": 245,
"high_income_golf_fans": 132
},
"sample": [
{
"entity_id": "123456",
"email1": "alice@example.com",
"golf_fans": 1,
"high_income_golf_fans": 1
},
{
"entity_id": "789012",
"email1": "bob@example.com",
"golf_fans": 1,
"high_income_golf_fans": 0
}
],
"tool_trace_id": "a1b2c3d4e5f6",
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Error Handling
Common Errors:
- Both entity_ids and csv_resource_uri provided: "entity_ids and csv_resource_uri are mutually exclusive"
- Neither provided: "Either entity_ids or csv_resource_uri must be provided"
- csv_resource_uri does not end in .csv or .parquet: "csv_resource_uri must point to a .csv or .parquet file, got: <uri>"
- Duplicate label in expressions: "Duplicate expression label: \"<label>\""
- format set without workflow_id: "workflow_id is required when format is not 'none'. Provide a workflow_id to enable export."
- Unknown cluster hash(es) in an expression: "Unknown cluster hash(es): <hash-list>. Use trait_search to discover valid trait hashes before building expressions."
- Invalid cluster identifier token (not a numeric ID or 32-character hex hash): "Invalid cluster identifier \"<token>\". Must be a numeric cluster ID or a 32-character hex cluster hash"
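The invalid-identifier error can often be caught before a call by checking each token locally. The sketch below is an approximation, not the tool's parser: `findInvalidIdentifiers` is a hypothetical name, and it assumes operators are written in upper case and that tokens are separated by whitespace or parentheses. It cannot detect unknown hashes, which only the service can resolve.

```typescript
const KEYWORDS = ["AND", "OR", "NOT"]; // assumed to be upper case only
const NUMERIC_ID = /^[0-9]+$/;
const HEX_HASH = /^[0-9a-fA-F]{32}$/;

// Returns the tokens that are neither an operator, a numeric ID, nor a 32-character hex hash.
function findInvalidIdentifiers(expression: string): string[] {
  const tokens = expression
    .replace(/[()]/g, " ") // treat parentheses as separators
    .split(/\s+/)
    .filter((token) => token.length > 0);
  return tokens.filter(
    (token) =>
      KEYWORDS.indexOf(token) === -1 &&
      !NUMERIC_ID.test(token) &&
      !HEX_HASH.test(token)
  );
}
```

An empty result means every identifier has a valid shape; it does not guarantee that the IDs or hashes exist, so trait_search remains the way to confirm them.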
Usage Examples
Example 1: Inline entity IDs with two expressions
{
"entity_type": "person",
"entity_ids": ["123", "456", "789"],
"expressions": [
{ "label": "golf_fans", "expression": "6003037" },
{ "label": "high_income_golf_fans", "expression": "6003037 AND 2000007" }
]
}
Example 2: Chained from entity_find via csv_resource_uri (CSV export)
{
"entity_type": "person",
"csv_resource_uri": "workflow://550e8400-e29b-41d4-a716-446655440000/artifacts/audience.csv",
"entity_id_column": "entity_id",
"expressions": [
{ "label": "golf_fans", "expression": "6003037" },
{ "label": "luxury_travel", "expression": "(6003034 OR 6003037) AND NOT 2000012" }
],
"include_identifiers": ["email", "phone"],
"format": "csv",
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
Example 3: Chained from entity_resolve, only matched entities
{
"entity_type": "person",
"csv_resource_uri": "workflow://550e8400-e29b-41d4-a716-446655440000/artifacts/resolved_identities.parquet",
"entity_id_column": "entity_id",
"expressions": [
{ "label": "high_value", "expression": "6003037 AND 2000007" }
],
"include_unmatched": false,
"format": "jsonl",
"workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
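The chained examples above can also be assembled programmatically. This is a sketch under a stated assumption: that the upstream tool's result exposes `workflow_id` and `export.resource_uri` in the same shape documented for this tool's output. `UpstreamResult`, `LabeledExpression`, and `buildChainedRequest` are hypothetical names.

```typescript
// Assumed shape of an upstream result; mirrors this tool's documented output fields.
interface UpstreamResult {
  workflow_id: string;
  export?: { resource_uri: string };
}

interface LabeledExpression {
  label: string;
  expression: string;
}

// Builds a request that reads entity IDs from the upstream export and stays in the same workflow.
function buildChainedRequest(
  upstream: UpstreamResult,
  entityType: "person" | "business",
  expressions: LabeledExpression[]
) {
  if (upstream.export === undefined) {
    throw new Error("Upstream result has no export; rerun it with a format other than \"none\"");
  }
  return {
    entity_type: entityType,
    csv_resource_uri: upstream.export.resource_uri,
    expressions: expressions,
    format: "jsonl" as const,
    workflow_id: upstream.workflow_id,
  };
}
```

Reusing the upstream workflow_id keeps the exported artifacts under one workflow session, which matches how the examples above pair each workflow:// URI with the same UUID.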