You’re viewing the V1 docs. V2 is now recommended — read the V2 docs.
Watt Data

Criteria-Based Audience Building

Overview

The Criteria-Based Audience Building workflow enables you to construct highly targeted audiences by combining demographic, behavioral, and geographic filters. This approach is ideal for marketers who want to reach specific customer segments based on known characteristics.

Use Case

Goal: Build a targeted audience by selecting specific demographic, interest, and behavioral characteristics, optionally filtered by geographic location.

Business Value:

  • Precision targeting for marketing campaigns
  • Geographic market penetration
  • Segment-specific product launches
  • Efficient ad spend through refined targeting
  • A/B testing different audience segments

Automated Approach

The build_cluster_expression tool automates the entire criteria-based audience workflow in a single call. It parses a natural language audience description, searches for matching clusters, builds a boolean expression, and geocodes any location references.

Use this when you want to go from a description like "Tech executives in the Bay Area making over $200K who golf" directly to a targeting expression without manually searching and selecting clusters. The output includes a criteria_mapping that shows which clusters were selected, so you can review and refine before passing the expression to find_persons.

See the build_cluster_expression reference for full parameter and output details.

Manual Approach

For more control over cluster selection and expression construction, follow the step-by-step workflow below.

Manual Workflow Steps

Step 1: Discover Relevant Segments

Use search_clusters to find audience segments using natural language descriptions. This semantic search approach is more intuitive than browsing all clusters.

Tool: search_clusters

Request:

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "search_clusters",
    "arguments": {
      "query": "high income professionals interested in golf and fitness",
      "limit": 10
    }
  },
  "id": 1
}

Response:

{
  "jsonrpc": "2.0",
  "result": {
    "clusters": [
      {
        "cluster_hash": "a1b2c3d4e5f67890a1b2c3d4e5f67890",
        "domain": "demographic",
        "name": "household_income_range",
        "value": "$150K+",
        "size": 2500000,
        "similarity_score": 0.92
      },
      {
        "cluster_hash": "b2c3d4e5f67890a1b2c3d4e5f67890a1",
        "domain": "interest",
        "name": "interested_golf",
        "value": "Yes",
        "size": 1800000,
        "similarity_score": 0.89
      },
      {
        "cluster_hash": "c3d4e5f67890a1b2c3d4e5f67890a1b2",
        "domain": "affinity",
        "name": "health_affinity",
        "value": "High",
        "size": 3200000,
        "similarity_score": 0.85
      },
      {
        "cluster_hash": "d4e5f67890a1b2c3d4e5f67890a1b2c3",
        "domain": "employment",
        "name": "occupation_group",
        "value": "Professional",
        "size": 15000000,
        "similarity_score": 0.82
      }
    ],
    "workflow_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "tool_trace_id": "trace_abc123"
  },
  "id": 1
}

Key Points:

  • Use natural language to describe your target audience
  • Results are ranked by similarity_score (0 to 1; higher means a better match)
  • cluster_hash is the stable identifier to use in expressions (persists across data rebuilds)
  • size shows population size for each cluster
  • Filter by domains to narrow results to specific categories
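
As a rough illustration of how the response fields above might be consumed, the sketch below filters hits by similarity_score and collects their cluster_hash values. The ClusterHit type, the pickClusters helper, and the 0.85 cutoff are assumptions made for this example and are not part of the API.

```typescript
// One entry of the `clusters` array in a search_clusters response.
interface ClusterHit {
  cluster_hash: string;
  domain: string;
  name: string;
  value: string;
  size: number;
  similarity_score: number;
}

// Hypothetical helper: keep hits at or above a similarity threshold, best match first.
// The 0.85 default is an illustrative cutoff, not a documented recommendation.
function pickClusters(hits: ClusterHit[], minScore: number = 0.85): ClusterHit[] {
  return hits
    .filter((hit) => hit.similarity_score >= minScore)
    .sort((a, b) => b.similarity_score - a.similarity_score);
}

// Sample data copied from the response shown earlier.
const sampleHits: ClusterHit[] = [
  { cluster_hash: "a1b2c3d4e5f67890a1b2c3d4e5f67890", domain: "demographic", name: "household_income_range", value: "$150K+", size: 2500000, similarity_score: 0.92 },
  { cluster_hash: "b2c3d4e5f67890a1b2c3d4e5f67890a1", domain: "interest", name: "interested_golf", value: "Yes", size: 1800000, similarity_score: 0.89 },
  { cluster_hash: "c3d4e5f67890a1b2c3d4e5f67890a1b2", domain: "affinity", name: "health_affinity", value: "High", size: 3200000, similarity_score: 0.85 },
  { cluster_hash: "d4e5f67890a1b2c3d4e5f67890a1b2c3", domain: "employment", name: "occupation_group", value: "Professional", size: 15000000, similarity_score: 0.82 },
];

// Hashes to carry into a find_persons expression.
const selectedHashes: string[] = pickClusters(sampleHits).map((hit) => hit.cluster_hash);
console.log(selectedHashes);
```

With the sample data, the occupation_group hit (0.82) falls below the cutoff and is left out; lowering minScore would include it.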

Example Queries:

| Query | Finds |
| --- | --- |
| "outdoor enthusiasts who like camping" | Interest clusters for outdoor activities |
| "young parents with children" | Household composition and age clusters |
| "luxury car buyers" | Purchase behavior and affinity clusters |
| "health-conscious vegetarians" | Lifestyle and dietary preference clusters |
| "small business owners" | Employment and firmographic clusters |

Alternative: Browse by Domain

If you prefer to browse available segments, use list_clusters:

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "list_clusters",
    "arguments": {
      "domains": ["demographic", "interest", "affinity"],
      "limit": 100
    }
  },
  "id": 1
}

Available Domains:

  • demographic - Age, gender, education, income, marital status, ethnicity
  • interest - Hobbies, activities (golf, fitness, cooking, travel, etc.)
  • affinity - Brand/category affinities (auto, fashion, pets, sports, health, etc.)
  • purchase - Purchase behavior, spending patterns, transaction history
  • financial - Credit, investments, net worth, financial products
  • lifestyle - Health-conscious, pet owner, homeowner, etc.
  • household - Children, household size, generations, composition
  • political - Political affiliations, contributions, party affiliation
  • content - Content consumption patterns, reading habits
  • employment - Occupation, industry, employment status

Step 2: Build Your Audience

Combine cluster criteria using boolean expressions with cluster_hash values, with optional geographic filtering.

Tool: find_persons

Request (Cluster Criteria Only):

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "find_persons",
    "arguments": {
      "expression": "a1b2c3d4e5f67890a1b2c3d4e5f67890 AND (b2c3d4e5f67890a1b2c3d4e5f67890a1 OR c3d4e5f67890a1b2c3d4e5f67890a1b2)",
      "identifier_type": "email",
      "format": "csv"
    }
  },
  "id": 2
}

In this example, the cluster hashes from the search_clusters response are used directly in the boolean expression.

Request (With Geographic Filter):

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "find_persons",
    "arguments": {
      "expression": "a1b2c3d4e5f67890a1b2c3d4e5f67890 AND b2c3d4e5f67890a1b2c3d4e5f67890a1",
      "location": {
        "latitude": 37.7749,
        "longitude": -122.4194,
        "radius": 25,
        "unit": "miles"
      },
      "identifier_type": "email",
      "format": "csv"
    }
  },
  "id": 2
}

Request (Location Only, No Cluster Criteria):

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "find_persons",
    "arguments": {
      "location": {
        "latitude": 40.7128,
        "longitude": -74.0060,
        "radius": 50,
        "unit": "miles"
      },
      "identifier_type": "email",
      "format": "none"
    }
  },
  "id": 2
}

Response:

{
  "jsonrpc": "2.0",
  "result": {
    "total": 125000,
    "sample": [
      {
        "person_id": "12345",
        "identifiers": {
          "email": {
            "email1": "alice@example.com",
            "email2": "alice.smith@work.com"
          }
        }
      },
      {
        "person_id": "12346",
        "identifiers": {
          "email": {
            "email1": "bob@example.com"
          }
        }
      }
    ],
    "export": {
      "url": "https://watt-mcp-exports.s3.amazonaws.com/exports/audience_20250118_120000.csv?X-Amz-Signature=...",
      "format": "csv",
      "rows": 125000,
      "size_bytes": 15728640,
      "expires_at": "2025-01-18T13:00:00Z"
    },
    "workflow_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "tool_trace_id": "trace_def456"
  },
  "id": 2
}

Boolean Expression Syntax:

| Operator | Description | Example | Result |
| --- | --- | --- | --- |
| AND | Intersection - people matching ALL criteria | "hash1 AND hash2" | High income AND interested in golf |
| OR | Union - people matching ANY criteria | "hash1 OR hash2" | Interested in golf OR health affinity |
| NOT | Exclusion - must be part of AND | "hash1 AND NOT hash2" | High income but NOT interested in golf |
| () | Grouping - control precedence | "(hash1 OR hash2) AND hash3" | (Income $150K+ OR $100K-$150K) AND golf interest |

Expression Examples:

Using cluster hashes from search_clusters results:

// Simple AND: Narrow targeting
"a1b2c3d4e5f67890a1b2c3d4e5f67890 AND b2c3d4e5f67890a1b2c3d4e5f67890a1"
// → High income AND interested in golf

// Simple OR: Broad targeting
"b2c3d4e5f67890a1b2c3d4e5f67890a1 OR c3d4e5f67890a1b2c3d4e5f67890a1b2"
// → Interested in golf OR health affinity

// Complex with grouping: Alternative qualifiers, single interest
"(a1b2c3d4e5f67890a1b2c3d4e5f67890 OR d4e5f67890a1b2c3d4e5f67890a1b2c3) AND b2c3d4e5f67890a1b2c3d4e5f67890a1"
// → (Income $150K+ OR professional occupation) AND golf interest

// With exclusions: Remove specific segments
"a1b2c3d4e5f67890a1b2c3d4e5f67890 AND b2c3d4e5f67890a1b2c3d4e5f67890a1 AND NOT e5f67890a1b2c3d4e5f67890a1b2c3d4"
// → High income AND golf interest AND exclude specific segment
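
Expression strings like these can also be assembled programmatically rather than by hand. The following is a minimal sketch; allOf, anyOf, and excluding are hypothetical helper names introduced here, and the constants reuse the sample hashes from the search_clusters response.

```typescript
// Hypothetical helpers that assemble expression strings from cluster hashes.
// allOf joins terms with AND; anyOf joins with OR and wraps the group in
// parentheses so it can be nested safely; excluding appends AND NOT terms.
const allOf = (...terms: string[]): string => terms.join(" AND ");
const anyOf = (...terms: string[]): string => `(${terms.join(" OR ")})`;
const excluding = (base: string, ...excluded: string[]): string =>
  [base, ...excluded.map((hash) => `NOT ${hash}`)].join(" AND ");

// Sample hashes from the search_clusters response.
const INCOME_150K = "a1b2c3d4e5f67890a1b2c3d4e5f67890";
const GOLF = "b2c3d4e5f67890a1b2c3d4e5f67890a1";
const HEALTH = "c3d4e5f67890a1b2c3d4e5f67890a1b2";

// High income AND interested in golf
const narrowExpression = allOf(INCOME_150K, GOLF);
// High income AND (golf interest OR health affinity)
const groupedExpression = allOf(INCOME_150K, anyOf(GOLF, HEALTH));
// High income AND golf interest AND NOT health affinity
const exclusionExpression = excluding(narrowExpression, HEALTH);

console.log(groupedExpression);
```

Because excluding always attaches NOT terms with AND, it cannot produce a standalone NOT or an OR NOT combination.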

Location Filtering:

  • Uses H3 hexagonal grid (resolution 9, roughly 0.2km average hexagon edge length)
  • Provide latitude, longitude, radius, and unit (km or miles)
  • Can be combined with cluster expressions or used alone
  • Optional - omit for nationwide audiences

Identifier Types:

  • email - Returns up to 3 email addresses per person
  • phone - Returns up to 3 phone numbers per person
  • address - Returns up to 3 physical addresses per person
  • If omitted, email is returned by default

Export Formats:

  • "none" - Returns 10 sample records inline (useful for previews and testing)
  • "csv" - Full audience export as CSV file (best for ad platform uploads)
  • "json" - Full audience export as JSON array (best for programmatic processing)
  • "jsonl" - Full audience export as JSON Lines (best for streaming/large datasets)

Key Points:

  • total shows full audience size (use for campaign reach estimation)
  • sample provides up to 10 preview records regardless of format setting
  • Export URL expires after 1 hour - download immediately
  • Both expression and location are optional, but at least one must be provided
  • Use cluster_hash values in expressions (stable across data rebuilds)
  • NOT cannot be standalone - must be part of an AND expression
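
The rules above lend themselves to a client-side check before a request is sent. The sketch below is one possible shape for it: the type names and the checkFindPersonsArgs helper are assumptions for illustration, and the positive-radius rule is an assumption rather than a documented constraint. The server remains the authority on what is valid.

```typescript
type IdentifierType = "email" | "phone" | "address";
type ExportFormat = "none" | "csv" | "json" | "jsonl";

interface GeoFilter {
  latitude: number;
  longitude: number;
  radius: number;
  unit: "km" | "miles";
}

interface FindPersonsArgs {
  expression?: string;
  location?: GeoFilter;
  identifier_type?: IdentifierType;
  format?: ExportFormat;
}

// Hypothetical pre-flight check. Returns a list of problems; empty means
// the arguments pass these basic rules.
function checkFindPersonsArgs(args: FindPersonsArgs): string[] {
  const problems: string[] = [];
  if (!args.expression && !args.location) {
    problems.push("provide expression, location, or both");
  }
  if (args.expression && /^\s*NOT\b/.test(args.expression)) {
    problems.push("NOT cannot be standalone");
  }
  const geo = args.location;
  if (geo) {
    if (Math.abs(geo.latitude) > 90) problems.push("latitude must be between -90 and 90");
    if (Math.abs(geo.longitude) > 180) problems.push("longitude must be between -180 and 180");
    // Assumption: a radius of zero or less is not meaningful.
    if (geo.radius <= 0) problems.push("radius must be positive");
  }
  return problems;
}

// Location-only preview request, as in the example above.
const previewProblems = checkFindPersonsArgs({
  location: { latitude: 40.7128, longitude: -74.006, radius: 50, unit: "miles" },
  identifier_type: "email",
  format: "none",
});
console.log(previewProblems);
```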

Performance Considerations

Audience Size Estimation

Use size from search results to estimate audience size before running find_persons:

| Expression Type | Estimated Size | Use Case |
| --- | --- | --- |
| Single cluster | size of the cluster | Broad targeting, brand awareness |
| AND (2 clusters) | ~10-30% of smallest cluster | Standard targeting |
| AND (3+ clusters) | ~5-15% of smallest cluster | Precision targeting |
| OR (multiple clusters) | Up to the sum of the size values (lower when clusters overlap) | Reach campaigns |

Example Calculation:

  • Cluster A: 2.5M people
  • Cluster B: 1.8M people
  • Expression: "A AND B" → Estimated 180K-540K people (10-30% of 1.8M)
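
The same arithmetic can be wrapped in a small helper. This is a sketch of the rule-of-thumb percentages from this section only; estimateAndSize and estimateOrSize are hypothetical names, and the real total returned by find_persons may fall outside these ranges.

```typescript
interface SizeRange {
  low: number;
  high: number;
}

// AND estimate: 10-30% of the smallest cluster for two clusters,
// 5-15% for three or more (the rule-of-thumb figures in this section).
function estimateAndSize(sizes: number[]): SizeRange {
  if (sizes.length === 0) throw new Error("at least one cluster size is required");
  const smallest = Math.min(...sizes);
  if (sizes.length === 1) return { low: smallest, high: smallest };
  const lowShare = sizes.length === 2 ? 0.1 : 0.05;
  const highShare = sizes.length === 2 ? 0.3 : 0.15;
  return { low: Math.round(smallest * lowShare), high: Math.round(smallest * highShare) };
}

// OR estimate: a union is at least as large as its largest member
// (full overlap) and at most the sum of all members (no overlap).
function estimateOrSize(sizes: number[]): SizeRange {
  if (sizes.length === 0) throw new Error("at least one cluster size is required");
  return { low: Math.max(...sizes), high: sizes.reduce((sum, n) => sum + n, 0) };
}

// Cluster A (2.5M) AND Cluster B (1.8M)
const twoClusterEstimate = estimateAndSize([2500000, 1800000]);
console.log(twoClusterEstimate); // { low: 180000, high: 540000 }
```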

Export Strategy

  • Small audiences (less than 10K): Use format: "none" to preview up to 10 sample records inline; request an export format when you need every record
  • Medium audiences (10K-100K): Use format: "csv" or format: "json"
  • Large audiences (greater than 100K): Always use an export format and download immediately

Geographic Filtering Performance

  • H3 resolution 9 provides roughly 0.2km precision (average hexagon edge length)
  • Larger radius = more hexagons to search = slightly slower queries
  • Typical performance:
    • 5km radius: less than 2 seconds
    • 25km radius: less than 3 seconds
    • 100km radius: less than 5 seconds

Workflow ID Tracking

Pass the same workflow_id through related requests for end-to-end tracing:

{
  "name": "search_clusters",
  "arguments": {
    "query": "high income golf enthusiasts",
    "workflow_id": "550e8400-e29b-41d4-a716-446655440000"
  }
}

Then use the same workflow_id in subsequent find_persons calls.
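
One way to avoid forgetting the ID on later calls is to bind it once and build every payload through the same function. The sketch below assumes nothing beyond the JSON-RPC envelope shown in this guide; createWorkflow and buildCall are hypothetical names.

```typescript
// Hypothetical factory: returns a function that builds tools/call payloads
// sharing one workflow_id and an incrementing JSON-RPC id.
function createWorkflow(workflowId: string) {
  let nextId = 1;
  return (toolName: string, toolArgs: Record<string, unknown>) => ({
    jsonrpc: "2.0",
    method: "tools/call",
    params: {
      name: toolName,
      arguments: { ...toolArgs, workflow_id: workflowId },
    },
    id: nextId++,
  });
}

const buildCall = createWorkflow("550e8400-e29b-41d4-a716-446655440000");

// Both payloads carry the same workflow_id.
const searchCall = buildCall("search_clusters", { query: "high income golf enthusiasts" });
const findCall = buildCall("find_persons", {
  expression: "a1b2c3d4e5f67890a1b2c3d4e5f67890 AND b2c3d4e5f67890a1b2c3d4e5f67890a1",
  format: "none",
});

console.log(JSON.stringify(findCall, null, 2));
```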


Common Variations

Multi-Region Campaigns

Build separate audiences for different geographic markets using the same cluster expression:

// findPersons here stands for your own client wrapper around the find_persons tool call
// West Coast - use cluster hashes from search_clusters
const westCoast = await findPersons({
  expression: "a1b2c3d4e5f67890a1b2c3d4e5f67890 AND b2c3d4e5f67890a1b2c3d4e5f67890a1",
  location: { latitude: 37.7749, longitude: -122.4194, radius: 50, unit: "miles" }
});

// East Coast - same expression, different location
const eastCoast = await findPersons({
  expression: "a1b2c3d4e5f67890a1b2c3d4e5f67890 AND b2c3d4e5f67890a1b2c3d4e5f67890a1",
  location: { latitude: 40.7128, longitude: -74.0060, radius: 50, unit: "miles" }
});

A/B Testing Segments

Create multiple audience variants for testing different cluster combinations:

// Variant A: High income + golf interest
{
  "expression": "a1b2c3d4e5f67890a1b2c3d4e5f67890 AND b2c3d4e5f67890a1b2c3d4e5f67890a1",
  "identifier_type": "email",
  "format": "csv"
}

// Variant B: High income + health affinity
{
  "expression": "a1b2c3d4e5f67890a1b2c3d4e5f67890 AND c3d4e5f67890a1b2c3d4e5f67890a1b2",
  "identifier_type": "email",
  "format": "csv"
}

Exclusion Lists

Remove specific segments (e.g., existing customers, competitors):

{
  "expression": "(a1b2c3d4e5f67890a1b2c3d4e5f67890 AND b2c3d4e5f67890a1b2c3d4e5f67890a1) AND NOT f67890a1b2c3d4e5f67890a1b2c3d4e5"
}

Precision vs. Reach Strategies

Precision Strategy (Smaller, High-Quality Audience):

{
  "expression": "a1b2c3d4e5f67890a1b2c3d4e5f67890 AND b2c3d4e5f67890a1b2c3d4e5f67890a1 AND 0a1b2c3d4e5f67890a1b2c3d4e5f6789 AND c3d4e5f67890a1b2c3d4e5f67890a1b2"
}

Use: High-value products, limited budgets, quality over quantity

Reach Strategy (Larger, Broader Audience):

{
  "expression": "a1b2c3d4e5f67890a1b2c3d4e5f67890 OR b2c3d4e5f67890a1b2c3d4e5f67890a1 OR c3d4e5f67890a1b2c3d4e5f67890a1b2"
}

Use: Brand awareness, new product discovery, testing new markets

Hybrid Strategy (Balanced):

{
  "expression": "(a1b2c3d4e5f67890a1b2c3d4e5f67890 AND b2c3d4e5f67890a1b2c3d4e5f67890a1) OR (a1b2c3d4e5f67890a1b2c3d4e5f67890 AND c3d4e5f67890a1b2c3d4e5f67890a1b2)"
}

Use: Multiple segment hypotheses, A/B testing, diversified targeting


Error Handling

Empty Results (total: 0)

Causes:

  • Expression too restrictive (too many AND conditions)
  • Rare cluster combination
  • Location filter too narrow
  • Invalid cluster hashes

Solutions:

  • Relax criteria (replace some AND with OR)
  • Check size in search results before building expression
  • Expand geographic radius
  • Verify cluster hashes are correct

Large Export Failures

Causes:

  • Network timeout during download
  • Export URL expired (more than 1 hour old)

Solutions:

  • Re-run the request to generate a new export
  • Download immediately after receiving URL
  • Use format: "jsonl" for streaming large files
  • Implement retry logic with exponential backoff
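
The expiry check and retry logic mentioned above could look roughly like the following. This is a generic sketch, not part of the API: isExportExpired, backoffDelays, and withRetries are hypothetical helpers, the 500ms base and 8s cap are arbitrary illustrative values, and the download itself is left to whatever task function you pass in.

```typescript
// True when the presigned URL's expires_at timestamp has passed.
function isExportExpired(expiresAt: string, now: Date = new Date()): boolean {
  return now.getTime() >= Date.parse(expiresAt);
}

// Delay before each retry: baseMs doubled per attempt, capped at maxMs.
function backoffDelays(retries: number, baseMs: number = 500, maxMs: number = 8000): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(Math.min(baseMs * Math.pow(2, attempt), maxMs));
  }
  return delays;
}

// Runs `task` once, then once more after each delay until it succeeds.
// Rethrows the last error when every attempt has failed.
async function withRetries<T>(task: () => Promise<T>, delaysMs: number[]): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= delaysMs.length; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      const delay = delaysMs[attempt];
      if (delay === undefined) break;
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

console.log(backoffDelays(5)); // [500, 1000, 2000, 4000, 8000]
```

If isExportExpired returns true, retrying the download will not help; re-run find_persons to obtain a fresh URL first.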

Invalid Boolean Expression

Causes:

  • Syntax errors (missing parentheses, invalid operators)
  • Standalone NOT expression
  • Mismatched parentheses

Solutions:

  • Validate expression syntax before submission
  • Ensure NOT is always part of an AND expression
  • Use proper grouping with balanced ()
  • Test with simple expressions first

Valid Expressions:

"hash1 AND NOT hash2"  // ✓ NOT as part of AND
"(hash1 OR hash2) AND hash3"  // ✓ Proper grouping

Invalid Expressions:

"NOT hash1"  // ✗ Standalone NOT
"hash1 OR NOT hash2"  // ✗ NOT with OR
"(hash1 AND hash2"  // ✗ Missing closing parenthesis

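The valid and invalid cases above can be caught before submission with a lightweight check. The sketch below covers only parenthesis balance and NOT placement, and it reads "part of an AND expression" as "NOT directly follows AND", which is an assumption about the parser. findExpressionProblems is a hypothetical helper, not an API function.

```typescript
// Hypothetical pre-flight check. Returns a list of problems; empty means the
// expression passes these two checks (the server may still reject it).
function findExpressionProblems(expression: string): string[] {
  const problems: string[] = [];
  // Split into parentheses and whitespace-separated words.
  const tokens: string[] = expression.match(/\(|\)|[^\s()]+/g) || [];
  let depth = 0;
  tokens.forEach((token, index) => {
    if (token === "(") depth++;
    if (token === ")") {
      depth--;
      if (depth < 0) {
        problems.push("unmatched closing parenthesis");
        depth = 0;
      }
    }
    // Assumption: NOT is only accepted immediately after AND.
    if (token === "NOT" && tokens[index - 1] !== "AND") {
      problems.push("NOT must directly follow AND");
    }
  });
  if (depth > 0) problems.push("missing closing parenthesis");
  return problems;
}

console.log(findExpressionProblems("hash1 AND NOT hash2")); // []
console.log(findExpressionProblems("hash1 OR NOT hash2")); // ["NOT must directly follow AND"]
```
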
Next Steps

After building your audience:

  1. Download export - Save CSV/JSON from the presigned S3 URL
  2. Upload to ad platforms - Import to Facebook Ads, Google Ads, etc.
  3. Track performance - Monitor conversion rates, ROAS, engagement
  4. Iterate - Refine cluster selection based on campaign results
  5. Provide feedback - Use submit_feedback tool to report data quality issues

Campaign Optimization Tips:

  • Start with broader audiences (OR logic) for discovery
  • Narrow to precision audiences (AND logic) after identifying high-performing segments
  • Use geographic filters for local businesses or regional campaigns
  • Monitor size changes over time to track segment growth
  • Test multiple audience variants simultaneously for faster optimization
