
Enrich a set of entity profiles and compute how often each trait occurs across the audience, for use in ideal customer profile (ICP) analysis.

Quick Example

{
  "entity_type": "person",
  "entity_ids": ["123", "456", "789"],
  "domains": ["demographic", "affinity"]
}

Input Parameters

| Parameter | Type | Required | Default | Constraints | Description |
| --- | --- | --- | --- | --- | --- |
| entity_type | string | Yes | - | "person" or "business" | Type of entity to enrich |
| entity_ids | array | Conditional | - | String array | Entity IDs (inline mode). Mutually exclusive with entity_ids_uri |
| entity_ids_uri | string | Conditional | - | workflow:// URI | CSV or Parquet with entity IDs. Mutually exclusive with entity_ids |
| entity_id_column | string | No | "entity_id" | Column name | Column containing entity IDs (only with entity_ids_uri) |
| domains | array | Yes | - | Min 1 enrichment domain | Enrichment domains to include |
| trait_limit | number | No | - | Positive integer | Maximum traits to return in trait_frequencies |
| workflow_id | string | No | - | Valid UUID | Workflow ID for tracking and persistence |

Parameter Details:

entity_ids vs entity_ids_uri:

  • Provide exactly one of the two; they are mutually exclusive.
  • Use entity_ids for small datasets (inline array).
  • Use entity_ids_uri to chain from entity_resolve or entity_find output (recommended).
  • entity_ids_uri accepts both .csv and .parquet files.

domains:

  • Enrichment domains: demographic, affinity, content, employment, financial, household, id, intent, interest, lifestyle, political, purchase
  • At least one domain is required
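
The rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `validateParams`, `KNOWN_DOMAINS`, and the error messages are hypothetical names chosen for illustration, and the server may apply additional checks.

```typescript
// Hypothetical client-side pre-flight check; names here are illustrative only.
const KNOWN_DOMAINS: readonly string[] = [
  "demographic", "affinity", "content", "employment", "financial", "household",
  "id", "intent", "interest", "lifestyle", "political", "purchase",
];

interface LooseParams {
  entity_type: "person" | "business";
  entity_ids?: string[];
  entity_ids_uri?: string;
  entity_id_column?: string;
  domains: string[];
  trait_limit?: number;
  workflow_id?: string;
}

// Returns a list of problems; an empty list means the documented rules are met.
function validateParams(p: LooseParams): string[] {
  const errors: string[] = [];

  // Exactly one of entity_ids / entity_ids_uri must be present.
  const hasIds = p.entity_ids !== undefined;
  const hasUri = p.entity_ids_uri !== undefined;
  if (hasIds === hasUri) {
    errors.push("Provide exactly one of entity_ids or entity_ids_uri");
  }

  // At least one domain, and every domain must be a documented one.
  if (p.domains.length === 0) {
    errors.push("At least one domain is required");
  }
  const unknown = p.domains.filter((d) => KNOWN_DOMAINS.indexOf(d) === -1);
  if (unknown.length > 0) {
    errors.push("Unknown domains: " + unknown.join(", "));
  }

  // trait_limit, when given, must be a positive integer.
  if (p.trait_limit !== undefined) {
    if (p.trait_limit <= 0 || Math.floor(p.trait_limit) !== p.trait_limit) {
      errors.push("trait_limit must be a positive integer");
    }
  }
  return errors;
}
```

Returning a list of messages rather than throwing on the first problem lets a caller report every issue in one pass.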

Request Schema:

interface GroupEntitiesByTraitParams {
  entity_type: "person" | "business";
  entity_ids?: string[];
  entity_ids_uri?: string;
  entity_id_column?: string;
  domains: Array<"demographic" | "affinity" | "content" | "employment" | "financial" | "household" | "id" | "intent" | "interest" | "lifestyle" | "political" | "purchase">;
  trait_limit?: number;
  workflow_id?: string;
}
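
The interface above marks both entity_ids and entity_ids_uri as optional, so the "exactly one" rule is not enforced by the type itself. As a sketch only, a caller could wrap the schema in a stricter union type. `CommonParams`, `InlineParams`, `UriParams`, `StrictGroupEntitiesByTraitParams`, and `isUriMode` are hypothetical names, and forbidding entity_id_column in inline mode is this sketch's reading of "only with entity_ids_uri".

```typescript
// Illustrative types only; they are not published by the API.
type Domain =
  | "demographic" | "affinity" | "content" | "employment" | "financial" | "household"
  | "id" | "intent" | "interest" | "lifestyle" | "political" | "purchase";

interface CommonParams {
  entity_type: "person" | "business";
  domains: [Domain, ...Domain[]]; // non-empty tuple: at least one domain
  trait_limit?: number;
  workflow_id?: string;
}

// Inline mode: entity_ids is required and the URI fields are disallowed.
type InlineParams = CommonParams & {
  entity_ids: string[];
  entity_ids_uri?: never;
  entity_id_column?: never;
};

// URI mode: entity_ids_uri is required and entity_ids is disallowed.
type UriParams = CommonParams & {
  entity_ids_uri: string;
  entity_id_column?: string;
  entity_ids?: never;
};

type StrictGroupEntitiesByTraitParams = InlineParams | UriParams;

// Type guard that tells the two modes apart at runtime.
function isUriMode(p: StrictGroupEntitiesByTraitParams): p is UriParams {
  return typeof p.entity_ids_uri === "string";
}

const inlineRequest: StrictGroupEntitiesByTraitParams = {
  entity_type: "person",
  entity_ids: ["123", "456", "789"],
  domains: ["demographic", "affinity"],
};

const uriRequest: StrictGroupEntitiesByTraitParams = {
  entity_type: "person",
  entity_ids_uri: "workflow://550e8400-e29b-41d4-a716-446655440000/artifacts/resolved_identities.parquet",
  entity_id_column: "entity_id",
  domains: ["demographic"],
};
```

With these types, an object literal that sets both entity_ids and entity_ids_uri, or passes an empty domains array, fails to compile.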

Output Format

{
  enrichment: {
    total_entities: number;
    enriched_entities: number;
    enrichment_rate: number;
    by_domain: Record<string, number>;
  };
  trait_frequencies: Array<{
    trait_hash: string;
    trait_name: string;
    trait_value: string;
    domain: string;
    audience_count: number;
    audience_prevalence: number;
  }>;
  tool_trace_id: string;
  workflow_id: string;
}

Response Fields:

| Field | Type | Description |
| --- | --- | --- |
| enrichment.total_entities | number | Total input entities |
| enrichment.enriched_entities | number | Entities successfully enriched |
| enrichment.enrichment_rate | number | Enrichment success rate (0-1) |
| enrichment.by_domain | object | Enriched count per domain |
| trait_frequencies | array | Trait frequency distribution for the audience |
| trait_frequencies[].trait_hash | string | Stable trait hash |
| trait_frequencies[].trait_name | string | Trait name |
| trait_frequencies[].trait_value | string | Trait value |
| trait_frequencies[].domain | string | Domain category |
| trait_frequencies[].audience_count | number | Entities with this trait |
| trait_frequencies[].audience_prevalence | number | Audience proportion (0-1) |
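
For ICP analysis it is often convenient to look at the most prevalent traits within each domain. The helper below is a sketch of one way to do that on the client using only the documented response fields; `TraitFrequency` and `topTraitsByDomain` are hypothetical names, not part of the API.

```typescript
// Mirrors the documented shape of one trait_frequencies entry.
interface TraitFrequency {
  trait_hash: string;
  trait_name: string;
  trait_value: string;
  domain: string;
  audience_count: number;
  audience_prevalence: number;
}

// Groups traits by domain and keeps the `perDomain` most prevalent in each group.
function topTraitsByDomain(
  traits: TraitFrequency[],
  perDomain: number
): Record<string, TraitFrequency[]> {
  const grouped: Record<string, TraitFrequency[]> = {};
  for (const t of traits) {
    if (grouped[t.domain] === undefined) {
      grouped[t.domain] = [];
    }
    grouped[t.domain].push(t);
  }
  for (const domain of Object.keys(grouped)) {
    grouped[domain] = grouped[domain]
      .slice() // copy before sorting
      .sort((a, b) => b.audience_prevalence - a.audience_prevalence)
      .slice(0, perDomain);
  }
  return grouped;
}
```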

Example Response:

{
  "enrichment": {
    "total_entities": 500,
    "enriched_entities": 425,
    "enrichment_rate": 0.85,
    "by_domain": {
      "demographic": 400,
      "affinity": 380,
      "intent": 350
    }
  },
  "trait_frequencies": [
    {
      "trait_hash": "a1b2c3d4e5f67890",
      "trait_name": "tech_affinity",
      "trait_value": "high",
      "domain": "affinity",
      "audience_count": 225,
      "audience_prevalence": 0.45
    },
    {
      "trait_hash": "b2c3d4e5f6789012",
      "trait_name": "income_level",
      "trait_value": "high",
      "domain": "demographic",
      "audience_count": 190,
      "audience_prevalence": 0.38
    }
  ],
  "tool_trace_id": "a1b2c3d4e5f6",
  "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}
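
In this example the ratios line up as enrichment_rate = enriched_entities / total_entities (425 / 500 = 0.85) and audience_prevalence = audience_count / total_entities (225 / 500 = 0.45, 190 / 500 = 0.38). The field descriptions do not state the denominator explicitly, so treat this as an inference from the example rather than a guarantee. A small sanity check under that assumption might look like the sketch below; `ResponseLike` and `ratiosAreConsistent` are hypothetical names.

```typescript
// Only the fields needed for the check; the real response has more.
interface ResponseLike {
  enrichment: {
    total_entities: number;
    enriched_entities: number;
    enrichment_rate: number;
  };
  trait_frequencies: Array<{ audience_count: number; audience_prevalence: number }>;
}

// Assumption: both ratios use total_entities as the denominator.
function ratiosAreConsistent(resp: ResponseLike, tolerance: number = 1e-9): boolean {
  const total = resp.enrichment.total_entities;
  if (total <= 0) {
    return false;
  }
  const close = (a: number, b: number): boolean => Math.abs(a - b) <= tolerance;
  if (!close(resp.enrichment.enriched_entities / total, resp.enrichment.enrichment_rate)) {
    return false;
  }
  return resp.trait_frequencies.every((t) =>
    close(t.audience_count / total, t.audience_prevalence)
  );
}

// Values copied from the example response above.
const exampleResponse: ResponseLike = {
  enrichment: { total_entities: 500, enriched_entities: 425, enrichment_rate: 0.85 },
  trait_frequencies: [
    { audience_count: 225, audience_prevalence: 0.45 },
    { audience_count: 190, audience_prevalence: 0.38 },
  ],
};
```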

Chaining to calculate_trait_lift

When a workflow_id is provided, group_entities_by_trait persists a trait_frequencies.parquet artifact. Pass the resource URI to calculate_trait_lift:

{
  "entity_type": "person",
  "trait_frequencies_uri": "workflow://550e8400-e29b-41d4-a716-446655440000/artifacts/trait_frequencies.parquet"
}
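
The URI in this example follows the pattern workflow://<workflow_id>/artifacts/<file name>. Assuming that pattern holds in general (it is inferred from the examples on this page, not from a stated rule), the calculate_trait_lift input could be assembled from a workflow ID as sketched below. `artifactUri` and `buildTraitLiftParams` are hypothetical helpers, and calculate_trait_lift may accept parameters not shown here.

```typescript
// Assumed URI layout, inferred from the examples: workflow://<id>/artifacts/<name>
function artifactUri(workflowId: string, artifactName: string): string {
  return "workflow://" + workflowId + "/artifacts/" + artifactName;
}

interface TraitLiftParams {
  entity_type: "person" | "business";
  trait_frequencies_uri: string;
}

// Builds the two fields shown in the chaining example above.
function buildTraitLiftParams(
  entityType: "person" | "business",
  workflowId: string
): TraitLiftParams {
  return {
    entity_type: entityType,
    trait_frequencies_uri: artifactUri(workflowId, "trait_frequencies.parquet"),
  };
}
```

Where the response exposes the artifact's resource URI directly, prefer that value over a constructed string.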

Usage Examples

Example 1: Inline entity IDs

{
  "entity_type": "person",
  "entity_ids": ["123", "456", "789"],
  "domains": ["demographic", "affinity", "intent"]
}

Example 2: From entity_resolve output

{
  "entity_type": "person",
  "entity_ids_uri": "workflow://550e8400-e29b-41d4-a716-446655440000/artifacts/resolved_identities.parquet",
  "entity_id_column": "entity_id",
  "domains": ["demographic", "affinity", "interest"],
  "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
}

Example 3: Limited trait output

{
  "entity_type": "person",
  "entity_ids": ["123", "456"],
  "domains": ["demographic"],
  "trait_limit": 20
}
