The Ideal Customer Profile (ICP) Analysis workflow demonstrates how to analyze your existing customer base to identify defining characteristics, then use those insights to find lookalike audiences.

Recommended Approach: Use the analyze_customers and generate_audience workflow tools for automated ICP analysis with AI-driven insights. These tools combine identity resolution, profile enrichment, cluster analysis, and audience discovery into streamlined operations.

Manual Approach: This guide documents the manual multi-step process for integrators who need fine-grained control over each step.

Use Case

Goal: Given a list of customer identifiers (emails, phone numbers, or addresses), identify the common characteristics of your best customers and find similar people in the broader population.

Business Value:

Understand what makes your customers similar
Identify market segments and personas
Build lookalike audiences for marketing campaigns
Optimize customer acquisition targeting

Workflow Overview

Workflow Steps

Step 1: Resolve Identifiers to Person IDs

Convert your customer identifiers (emails, phones, addresses) into standardized person IDs that can be used across the platform.

Tool: resolve_identities

Request:

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "resolve_identities",
    "arguments": {
      "multi_identifiers": [
        {
          "id_type": "email",
          "hash_type": "plaintext",
          "values": [
            "customer1@example.com",
            "customer2@example.com",
            "customer3@example.com"
          ]
        },
        {
          "id_type": "phone",
          "hash_type": "plaintext",
          "values": [
            "5551234567",
            "5559876543"
          ]
        }
      ],
      "format": "json"
    }
  },
  "id": 1
}

Response:

{
  "jsonrpc": "2.0",
  "result": {
    "identities": [],
    "stats": {
      "requested": 5,
      "resolved": 4,
      "rate": 0.8
    },
    "export": {
      "url": "https://s3.amazonaws.com/presigned-url...",
      "format": "json",
      "rows": 4,
      "size_bytes": 2048,
      "expires_at": "2025-11-20T13:00:00.000Z"
    },
    "tool_trace_id": "abc123...",
    "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
  },
  "id": 1
}

Download the JSON export from export.url to get the full identity resolution data:

[
  {
    "person_id": 12345,
    "overall_quality_score": 0.95,
    "matches": [
      {
        "criterion_type": "email_plaintext",
        "criterion_value": "customer1@example.com",
        "quality_score": 0.95
      }
    ],
    "identifiers": {
      "email": ["customer1@example.com", "alt1@example.com"],
      "phone": ["5551234567"]
    }
  },
  {
    "person_id": 67890,
    "overall_quality_score": 0.88,
    "matches": [
      {
        "criterion_type": "email_plaintext",
        "criterion_value": "customer2@example.com",
        "quality_score": 0.88
      }
    ],
    "identifiers": {
      "email": ["customer2@example.com"]
    }
  }
]

Key Points:

Use multi_identifiers to search across multiple identifier types simultaneously
format: "json" returns a download URL for the full dataset (use format: "none" for inline results with small datasets)
Filter results by overall_quality_score (e.g., >= 0.5) to ensure match quality
The resolved person_id values are used in subsequent enrichment steps
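
As a rough illustration of the quality filter described above, the following sketch loads rows shaped like the export and keeps only the person IDs at or above a threshold. The third row (person_id 11111 with a low score) and the helper name select_person_ids are hypothetical, added only to show a record being filtered out; the 0.5 default mirrors the example threshold in the key points.

```python
import json

# Rows shaped like the export shown above, trimmed to the fields used here.
# The third row is hypothetical and exists only to show a low-quality match.
export_rows = json.loads("""
[
  {"person_id": 12345, "overall_quality_score": 0.95},
  {"person_id": 67890, "overall_quality_score": 0.88},
  {"person_id": 11111, "overall_quality_score": 0.42}
]
""")


def select_person_ids(rows, min_score=0.5):
    """Return person IDs whose overall match quality meets the threshold."""
    return [
        row["person_id"]
        for row in rows
        if row.get("overall_quality_score", 0.0) >= min_score
    ]


person_ids = select_person_ids(export_rows)
print(person_ids)
```

The resulting list can be passed as person_ids to get_person in the next step.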

Step 2: Enrich Person Profiles

Retrieve detailed demographic, behavioral, and interest data for the resolved person IDs.

Tool: get_person

Request:

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "get_person",
    "arguments": {
      "person_ids": [12345, 67890, 11111, 22222],
      "domains": [
        "address", "affinity", "content", "demographic", "email",
        "employment", "financial", "household", "id", "intent_category",
        "intent_topic", "interest", "lifestyle", "maid", "name",
        "phone", "political", "purchase"
      ],
      "format": "none"
    }
  },
  "id": 2
}

Response:

{
  "jsonrpc": "2.0",
  "result": {
    "profiles": [
      {
        "person_id": "12345",
        "metadata": {
          "quality_score": 0.92,
          "last_modified": "2025-01-15T10:30:00Z"
        },
        "domains": {
          "gender": "Female",
          "generation": "Millennial",
          "interested_fitness": "Yes",
          "interested_healthy_living": "Yes",
          "fitness_affinity": "High",
          "household_income_range": "$75K-$100K",
          "email1": "customer1@example.com",
          "email2": "alt1@example.com",
          "phone1": "5551234567",
          "first_name": "Jane",
          "last_name": "Smith"
        }
      },
      {
        "person_id": "67890",
        "metadata": {
          "quality_score": 0.88
        },
        "domains": {
          "gender": "Female",
          "generation": "Gen X",
          "interested_healthy_living": "Yes",
          "interested_outdoors": "Yes",
          "email1": "customer2@example.com",
          "first_name": "Sarah"
        }
      }
    ],
    "tool_trace_id": "def456...",
    "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
  },
  "id": 2
}

Key Points:

Request all available domains to get a comprehensive view
The domains field contains attribute key-value pairs (e.g., "gender": "Female")
Attributes are returned as flat key-value pairs, not nested by domain category
Multi-value fields like email and phone are numbered (email1, email2, email3, etc.)
To use attributes in find_persons, you need to map them to cluster IDs using list_clusters
The profiles array is always populated, even when using export format
Batch requests support up to 1000 person IDs per call
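
Because attributes arrive as one flat dictionary, contact fields and analysis attributes have to be separated before Step 3. The sketch below shows one possible way to do that. The list of key patterns treated as contact information is an assumption based on the sample response, not a documented schema, and split_profile is a hypothetical helper name.

```python
import re

# Assumed contact/identity key patterns (numbered fields such as email1, phone2,
# plus name fields). Adjust this to the keys that appear in your own data.
CONTACT_KEY = re.compile(r"^(email|phone|address|maid)\d*$|^(first|last)_name$")


def split_profile(domains):
    """Split a flat attribute dict into (analysis attributes, contact fields)."""
    attributes, contact = {}, {}
    for key, value in domains.items():
        if CONTACT_KEY.match(key):
            contact[key] = value
        else:
            attributes[key] = value
    return attributes, contact


# The "domains" object of the first profile in the sample response above.
domains = {
    "gender": "Female",
    "generation": "Millennial",
    "interested_fitness": "Yes",
    "interested_healthy_living": "Yes",
    "fitness_affinity": "High",
    "household_income_range": "$75K-$100K",
    "email1": "customer1@example.com",
    "email2": "alt1@example.com",
    "phone1": "5551234567",
    "first_name": "Jane",
    "last_name": "Smith",
}

attributes, contact = split_profile(domains)
print(sorted(attributes))
print(sorted(contact))
```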

Step 3: Analyze Attribute Patterns and Intersections

Aggregate the enriched profiles to identify common characteristics and combinations of attributes that frequently co-occur. This reveals the true ICP by finding the intersection of traits.

Analysis Approach:

Extract attribute sets - Pull out relevant attributes from each profile, excluding contact info
Count frequencies - Tally how often each attribute appears across all profiles
Find top attributes - Identify the most common individual characteristics
Discover intersections - Find pairs of attributes that co-occur frequently
Build multi-attribute clusters - Identify 3+ attributes that define cohesive segments
Calculate lift scores - Measure how much more likely attributes co-occur than by chance

Key Metrics:

Frequency: Percentage of customers with each attribute
Lift Score: Ratio of actual co-occurrence to expected random co-occurrence (>1.0 indicates positive correlation)
Intersection Rate: Percentage of customers with multiple attributes

Analysis Output Structure:

Top Attributes: Individual characteristics sorted by frequency (baseline view)
Top Intersections: Pairs of attributes with high co-occurrence and lift scores
Multi-Attribute Clusters: Groups of 3+ attributes representing cohesive segments (true ICP)

Key Insights:

Top Attributes: Individual characteristics sorted by frequency
Top Intersections: Pairs of attributes that co-occur frequently
- lift > 1.0 means they appear together more than random chance
- Higher lift = stronger association between attributes
Multi-Attribute Clusters: 3+ attributes that define cohesive segments
- These represent your true ICP - the combination of traits that define your best customers
- Use these for high-precision targeting (AND logic)
Strategy:
- Use multi-attribute clusters with AND for precision (smaller, high-quality audience)
- Use top individual attributes with OR for reach (larger, broader audience)

Step 4: Map Attributes to Cluster IDs

To use attributes in find_persons, look up the matching clusters with list_clusters. Each returned cluster carries both a cluster_id and a cluster_hash.

Tool: list_clusters

Request:

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "list_clusters",
    "arguments": {
      "cluster_names": [
        "gender",
        "interested_healthy_living",
        "fitness_affinity",
        "household_income_range",
        "interested_fitness"
      ]
    }
  },
  "id": 4
}

Response:

{
  "jsonrpc": "2.0",
  "result": {
    "clusters": [
      {
        "cluster_id": "1000000145",
        "cluster_hash": "a1b2c3d4e5f67890a1b2c3d4e5f67890",
        "name": "gender",
        "value": "Female",
        "size": 85000000,
        "domain": "demographic"
      },
      {
        "cluster_id": "1000001556",
        "cluster_hash": "b2c3d4e5f67890a1b2c3d4e5f67890a1",
        "name": "interested_healthy_living",
        "value": "Yes",
        "size": 45000000,
        "domain": "interest"
      },
      {
        "cluster_id": "1000000045",
        "cluster_hash": "c3d4e5f67890a1b2c3d4e5f67890a1b2",
        "name": "fitness_affinity",
        "value": "High",
        "size": 12000000,
        "domain": "affinity"
      },
      {
        "cluster_id": "1000001234",
        "cluster_hash": "d4e5f67890a1b2c3d4e5f67890a1b2c3",
        "name": "household_income_range",
        "value": "$75K-$100K",
        "size": 28000000,
        "domain": "household"
      },
      {
        "cluster_id": "1000001523",
        "cluster_hash": "e5f67890a1b2c3d4e5f67890a1b2c3d4",
        "name": "interested_fitness",
        "value": "Yes",
        "size": 35000000,
        "domain": "interest"
      }
    ],
    "total": 5,
    "returned": 5,
    "tool_trace_id": "xyz789...",
    "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
  },
  "id": 4
}

Key Points:

Each cluster represents a specific attribute value (e.g., gender=Female)
cluster_hash is a 32-character hex string - use this in boolean expressions for find_persons
cluster_hash is stable across data rebuilds (recommended over cluster_id)
size shows the population size for each cluster (useful for estimating audience size)
You must match both name AND value to your attributes
Use cluster hashes in the next step to build your lookalike audience
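
Matching on both name and value can be done with a small lookup table. The sketch below assumes the clusters list has the shape of the response above; hashes_for is a hypothetical helper, and attributes with no matching cluster are simply skipped.

```python
def hashes_for(attributes, clusters):
    """Map each (name, value) attribute to its cluster_hash, skipping unmatched ones."""
    index = {(c["name"], c["value"]): c["cluster_hash"] for c in clusters}
    return {attr: index[attr] for attr in attributes if attr in index}


# Two entries taken from the sample list_clusters response above.
clusters = [
    {
        "cluster_hash": "a1b2c3d4e5f67890a1b2c3d4e5f67890",
        "name": "gender",
        "value": "Female",
    },
    {
        "cluster_hash": "c3d4e5f67890a1b2c3d4e5f67890a1b2",
        "name": "fitness_affinity",
        "value": "High",
    },
]

# Attributes identified in Step 3; the "Male" entry has no cluster in this list.
wanted = [("gender", "Female"), ("fitness_affinity", "High"), ("gender", "Male")]

matched = hashes_for(wanted, clusters)
print(matched)
```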

Step 5: Find Lookalike Audience

Use the cluster hashes from Step 4 to find similar people. There are two basic strategies, plus a hybrid (shown below) that combines them:

Precision Targeting (AND): Find people matching all key attributes
Reach Targeting (OR): Find people matching any key attribute

Tool: find_persons

Precision Approach (Recommended): Use the multi-attribute cluster to find high-quality matches. This targets the core ICP.

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "find_persons",
    "arguments": {
      "expression": "a1b2c3d4e5f67890a1b2c3d4e5f67890 AND b2c3d4e5f67890a1b2c3d4e5f67890a1 AND c3d4e5f67890a1b2c3d4e5f67890a1b2",
      "identifier_type": "email"
    }
  },
  "id": 5
}

Reach Approach: Use individual top clusters to cast a wider net for discovery.

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "find_persons",
    "arguments": {
      "expression": "a1b2c3d4e5f67890a1b2c3d4e5f67890 OR b2c3d4e5f67890a1b2c3d4e5f67890a1 OR c3d4e5f67890a1b2c3d4e5f67890a1b2",
      "identifier_type": "email"
    }
  },
  "id": 5
}

Hybrid Approach: Combine required attributes (AND) with optional attributes (OR) for balanced targeting.

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "find_persons",
    "arguments": {
      "expression": "(a1b2c3d4e5f67890a1b2c3d4e5f67890 AND b2c3d4e5f67890a1b2c3d4e5f67890a1) OR (a1b2c3d4e5f67890a1b2c3d4e5f67890 AND c3d4e5f67890a1b2c3d4e5f67890a1b2)",
      "identifier_type": "email"
    }
  },
  "id": 5
}

Response (Precision Approach):

{
  "jsonrpc": "2.0",
  "result": {
    "total": 450000,
    "sample": [
      {
        "person_id": 99999,
        "identifiers": {
          "email": {
            "email1": "prospect1@example.com",
            "email2": "prospect1-alt@example.com"
          }
        }
      },
      {
        "person_id": 88888,
        "identifiers": {
          "email": {
            "email1": "prospect2@example.com"
          }
        }
      }
    ],
    "tool_trace_id": "ghi789...",
    "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
  },
  "id": 5
}

Response (Reach Approach):

{
  "jsonrpc": "2.0",
  "result": {
    "total": 8500000,
    "sample": [...]
  }
}

Key Points:

expression uses boolean logic over cluster hashes or cluster IDs (AND, OR, NOT operators); hashes are preferred because they are stable across data rebuilds
Precision (AND): Smaller audience, higher match quality
- Example: 450K people matching ALL 3 attributes
- Best for: High-value campaigns, limited budgets, quality over quantity
Reach (OR): Larger audience, broader targeting
- Example: 8.5M people matching ANY of 3 attributes
- Best for: Brand awareness, discovery, testing new segments
Hybrid: Balanced approach using combinations
- Example: Multiple AND groups connected by OR
- Best for: Testing multiple ICP variants simultaneously
total shows the full audience size (use this to estimate campaign reach)
sample provides up to 10 preview records
Use identifier_type to specify which contact info to return (email, phone, or address)
For full export, add "format": "csv" or "format": "json"
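
Expression strings for the three approaches can be assembled programmatically rather than by hand. The helpers below (join_terms, group, exclude) are hypothetical names for a minimal sketch; they only concatenate strings in the same shape as the expressions shown in this step and do not validate them against the service.

```python
def join_terms(operator, terms):
    """Join cluster hashes or sub-expressions with AND or OR."""
    terms = list(terms)
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    if not terms:
        raise ValueError("at least one term is required")
    return f" {operator} ".join(terms)


def group(expression):
    """Wrap a sub-expression in parentheses so it can be nested."""
    return f"({expression})"


def exclude(expression, term):
    """Remove members of one cluster from an existing expression."""
    return f"({expression}) AND NOT {term}"


# Hashes from the sample list_clusters response in Step 4.
GENDER = "a1b2c3d4e5f67890a1b2c3d4e5f67890"
HEALTHY = "b2c3d4e5f67890a1b2c3d4e5f67890a1"
FITNESS = "c3d4e5f67890a1b2c3d4e5f67890a1b2"

precision = join_terms("AND", [GENDER, HEALTHY, FITNESS])
reach = join_terms("OR", [GENDER, HEALTHY, FITNESS])
hybrid = join_terms("OR", [
    group(join_terms("AND", [GENDER, HEALTHY])),
    group(join_terms("AND", [GENDER, FITNESS])),
])

print(precision)
print(reach)
print(hybrid)
```

Each resulting string can be passed as the expression argument of find_persons.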

Strategy Recommendations:

Approach	Audience Size	Match Quality	Use Case
AND (3+ clusters)	100K - 1M	Very High	Core ICP, high-value offers
AND (2 clusters)	500K - 5M	High	Standard campaigns
Hybrid	1M - 10M	Medium-High	A/B testing segments
OR	5M+	Variable	Discovery, cold outreach

Expression Examples:

// Precision: Core ICP with all defining traits
"1000000145 AND 1000001556 AND 1000000045"  // 450K people

// With exclusions: Remove known non-converters
"(1000000145 AND 1000001556) AND NOT 1000002000"  // 380K people

// Multiple precise segments: Test different ICP hypotheses
"(1000000145 AND 1000001556 AND 1000000045) OR (1000000145 AND 1000001234 AND 1000001523)"  // 720K people

// Reach: Any matching characteristic
"1000000145 OR 1000001556 OR 1000000045"  // 8.5M people

Step 6: Enrich Lookalike Profiles (Optional)

Retrieve full profiles for the lookalike audience to validate the match quality or for further analysis.

Tool: get_person

Request:

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "get_person",
    "arguments": {
      "person_ids": [99999, 88888, 77777],
      "domains": [
        "demographic", "interest", "affinity", "household"
      ],
      "format": "none"
    }
  },
  "id": 6
}

Response: (Same structure as Step 2)

Key Points:

You can request only specific domains instead of all domains
Compare lookalike profiles to original customer profiles to validate similarity
Use enriched data for personalized outreach campaigns

Performance Considerations

Batching

resolve_identities: No per-request limit, but use format: "json" for large datasets
get_person: Max 1000 person_ids per request; split larger datasets into chunks of 1000
find_persons: Returns up to 10 sample records inline; use export format for full results
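
Splitting a large ID list into chunks that respect the get_person limit might look like the following sketch. The chunked helper is a hypothetical name; the default size of 1000 comes from the limit stated above.

```python
def chunked(ids, size=1000):
    """Yield consecutive slices of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# 2,500 placeholder IDs produce three get_person requests.
all_ids = list(range(1, 2501))
batches = list(chunked(all_ids))
print([len(batch) for batch in batches])
```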

Export Formats

"format": "none" - Inline response (best for < 100 records)
"format": "json" - S3 export as JSON (best for analysis)
"format": "csv" - S3 export as CSV (best for imports to other tools)
"format": "jsonl" - S3 export as JSON Lines (best for streaming/large datasets)

Workflow ID Tracking

Pass the workflow_id from the first response through all subsequent requests to track the entire workflow:

{
  "name": "get_person",
  "arguments": {
    "person_ids": [...],
    "domains": [...],
    "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
  }
}

This enables:

End-to-end request tracing
Performance analysis across tools
Feedback submission for data quality issues
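
One way to keep the workflow_id attached to every call is to build all request envelopes through a single helper. The sketch below only constructs the JSON-RPC payload in the shape used throughout this guide; build_tool_call is a hypothetical name, and sending the request is left to whatever client you use.

```python
import json


def build_tool_call(name, arguments, request_id, workflow_id=None):
    """Build a tools/call envelope, adding workflow_id to the arguments when known."""
    args = dict(arguments)  # copy so the caller's dict is not mutated
    if workflow_id is not None:
        args["workflow_id"] = workflow_id
    return {
        "jsonrpc": "2.0",
        "method": "tools/call",
        "params": {"name": name, "arguments": args},
        "id": request_id,
    }


# workflow_id as returned by the first response in Step 1.
workflow_id = "550e8400-e29b-41d4-a716-446655440000"

request = build_tool_call(
    "get_person",
    {"person_ids": [12345, 67890], "domains": ["demographic"], "format": "none"},
    request_id=2,
    workflow_id=workflow_id,
)
print(json.dumps(request, indent=2))
```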

Common Variations

Multi-Segment ICP

Run ICP analysis for each segment (e.g., high-value customers, frequent purchasers)
Identify top clusters for each segment
Build separate expressions for each segment
Combine with OR logic: (segment1_clusters) OR (segment2_clusters)

Error Handling

Identity Resolution Failures

If stats.rate < 0.5, consider:

Identifier quality (are emails valid?)
Hash type mismatch (plaintext vs. hashed)
Formatting (phones should be digits only, no country code)
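
A pre-flight cleanup of phone numbers and a check on the resolution rate could be sketched as follows. The country-code handling assumes US-style numbers (an 11-digit value with a leading 1), which is an assumption rather than something this guide specifies; normalize_phone and needs_review are hypothetical helper names.

```python
import re


def normalize_phone(raw):
    """Reduce a phone number to digits only, without a leading country code.

    Assumption: US-style numbers, where an 11-digit value starting with "1"
    carries the country code. Other regions need their own rule.
    """
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def needs_review(stats, min_rate=0.5):
    """Flag a resolve_identities response whose match rate is below the threshold."""
    return stats["rate"] < min_rate


print(normalize_phone("+1 (555) 123-4567"))
print(needs_review({"requested": 5, "resolved": 4, "rate": 0.8}))
```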

Empty Lookalike Audience

If total: 0, consider:

Expression too restrictive (try OR instead of AND)
Rare cluster combinations
Unknown cluster sizes (use the get_cluster tool to check each cluster's size before building expressions)
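
When an AND expression returns nothing, the smallest cluster is a natural first suspect, since the intersection can never be larger than its smallest member. The sketch below finds that cluster using the size field shown in the Step 4 list_clusters response; it assumes you already have those cluster records, and limiting_cluster is a hypothetical helper name.

```python
def limiting_cluster(clusters):
    """Return the smallest cluster; an AND of all clusters cannot exceed its size."""
    if not clusters:
        raise ValueError("at least one cluster is required")
    return min(clusters, key=lambda c: c["size"])


# Names and sizes taken from the sample list_clusters response in Step 4.
clusters = [
    {"name": "gender", "value": "Female", "size": 85000000},
    {"name": "interested_healthy_living", "value": "Yes", "size": 45000000},
    {"name": "fitness_affinity", "value": "High", "size": 12000000},
]

smallest = limiting_cluster(clusters)
print(smallest["name"], smallest["size"])
```

If the smallest cluster is itself very small, dropping it from the AND expression or moving it into an OR group is usually the first thing to try.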

Export URL Expiration

Export URLs expire after 1 hour. If expired:

Re-run the original request
Download and cache results immediately
Use workflow_id to track retries
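
Checking expires_at before attempting a download avoids a failed request against a stale URL. The sketch below parses the timestamp format shown in the Step 1 response; is_expired is a hypothetical helper, and the optional now parameter exists only to make the check testable.

```python
from datetime import datetime, timezone


def is_expired(export, now=None):
    """Return True if the export's presigned URL has passed its expires_at time."""
    # Replace the trailing "Z" so fromisoformat accepts it on older Python versions.
    expires_at = datetime.fromisoformat(export["expires_at"].replace("Z", "+00:00"))
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expires_at


# The "export" object from the sample resolve_identities response, trimmed.
export = {"format": "json", "expires_at": "2025-11-20T13:00:00.000Z"}

before = datetime(2025, 11, 20, 12, 30, tzinfo=timezone.utc)
after = datetime(2025, 11, 20, 13, 30, tzinfo=timezone.utc)
print(is_expired(export, now=before))
print(is_expired(export, now=after))
```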

Next Steps

After building your lookalike audience:

Activate in marketing platforms - Export as CSV and upload to ad platforms
Validate campaign performance - Track conversion rates and refine clusters
Iterate on cluster selection - Test different cluster combinations
Feedback loop - Use submit_feedback tool to report data quality issues

Related Workflows:

Identity Enrichment - Enrich customer identifiers with demographic and behavioral data
Criteria-Based Audiences - Build targeted audiences using cluster criteria and location filters
