After "Where does your data come from?", the question we get most is what a signal actually is.
A signal is a single fact about a person or company at a specific moment: something they did, something they own, something that happened to them, or something they were exposed to. A few examples from the Signal Graph:
- Searched for a Peloton bike to buy in the last 90 days
- Renewed a commercial driver's license in 2025
- Household with at least one child under the age of 5
- Lives within 25 miles of a Tesla service center
- Company hired its first VP of Sales in the last 60 days
- Recently relocated from a single-family home to a multi-unit building
- Researched bariatric surgery in the last 30 days
- Read The Information consistently in the last 30 days
Each of those is a signal. None of them is a profile, a segment, a score, or an audience. Those are downstream artifacts built out of signals. A signal is the atomic unit, the smallest piece of truth about a person or company that's useful on its own and composable with others.
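To make the atomic unit concrete, here is a minimal sketch of a signal as a record. The class name, field names, subject IDs, and dates are all made up for illustration; this is not the Signal Graph's actual schema, just one way to hold "one fact, one subject, one moment."

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Signal:
    """One fact about one subject at one moment (illustrative shape only)."""
    subject_id: str    # hypothetical identifier for a person or company
    subject_type: str  # "person" or "company"
    statement: str     # the observed fact, kept as stated
    observed_on: date  # the moment the fact was observed (made-up dates below)


signals = [
    Signal("p-001", "person", "Renewed a commercial driver's license", date(2025, 3, 14)),
    Signal("p-001", "person", "Lives within 25 miles of a Tesla service center", date(2025, 6, 2)),
    Signal("c-042", "company", "Hired its first VP of Sales", date(2025, 5, 20)),
]

# Each record stands on its own, and records can be grouped by subject
# later, at the moment someone wants to compose them.
by_subject = {}
for s in signals:
    by_subject.setdefault(s.subject_id, []).append(s.statement)
```

Nothing in the record says which profile, segment, or score it belongs to. That grouping happens later, when someone composes.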
That's the definition. The rest of this post is about why that definition matters, what makes a signal good, and why one signal is almost never the answer.
What a signal isn't
The legacy data world conflates four things: signals, fields, traits, and segments. They look similar from the outside. They're structurally different, and conflating them is why most data infrastructure breaks the moment you put an agent in front of it.
A field is what a database calls a column. Company size. Industry. Job title. Fields are containers, decided in advance by a vendor with a fixed taxonomy. A field has the values the vendor decided to expose, in the shape the vendor decided to expose them. If your buyer's actual behavior doesn't map to a value the vendor pre-chose, you can't query for it.
A trait/tag is a field value. Engaged user. High-LTV cohort. At-risk customer. Every tool has its own word for it. A CDP calls it a computed trait, an analytics tool calls it a tag, a data vendor calls it a field. Either way, it's a decision someone already made about which behaviors counted as which category. Useful for the dashboard the author was building. Useless to an agent composing across signals that author never saw coming.
A segment is a finished audience. In-market for Q4 holiday gifting. Lapsed subscribers. Decision-makers at mid-market SaaS companies. Segments are compositions someone else already built. You can buy them. You can target against them. You can't pull them apart and reuse the underlying signals. They're sold to you compressed.
A signal is none of those things. A signal is the raw observation underneath. It's what's left when you stop letting someone else decide what mattered.
The difference is structural, not semantic. Fields, traits, and segments are pre-aggregated. They're the output of a one-way compression. Somebody decided which behaviors counted, which signals collapsed into which categories, which dimensions made it into the schema and which got thrown away. Whatever didn't fit the chosen shape is gone. The downstream consumer never sees it.
A signal is raw. It hasn't been compressed. It hasn't been pre-interpreted into someone else's schema. It's the thing the schema would have been built from if anyone had known in advance which questions you'd ask.
What makes a signal useful
Four properties:
Granularity. How specific is the signal? Owns a car is a coarse-grain signal: hundreds of millions of people qualify. Owns a 2019 Brinkley Model G is a fine-grain signal: a few thousand people qualify, and the audience is half-defined the moment the signal fires. Fine-grain signals do more work per signal. The Signal Graph holds 165,000+ signals on US adults specifically because granularity matters more than broad membership. Covering the full range of what people actually do takes a lot of them.
Freshness. When did the signal fire, and how recently was it refreshed? Renewed a commercial driver's license sometime in the last five years is a different signal than renewed in the last 30 days. Behavioral data goes stale fast: life events, purchase patterns, and intent signals all decay. A signal you can't refresh is a signal that gets less true every day.
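A freshness window can be sketched as a simple date comparison. The function name and the dates are hypothetical, and a real system would also track when a signal was last refreshed, not only when it fired.

```python
from datetime import date, timedelta


def fired_within(observed_on: date, window_days: int, today: date) -> bool:
    """True if the signal fired no more than window_days before today."""
    age = today - observed_on
    return timedelta(0) <= age <= timedelta(days=window_days)


today = date(2025, 7, 1)     # fixed so the example is reproducible
renewal = date(2025, 6, 10)  # made-up observation date, 21 days before today

in_last_30_days = fired_within(renewal, 30, today)
in_last_5_years = fired_within(renewal, 5 * 365, today)
old_renewal_in_last_30_days = fired_within(date(2021, 1, 15), 30, today)
```

The same renewal passes both windows, while the 2021 renewal passes only the five-year one. The window is part of what the signal means, which is why the two phrasings are different signals.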
Lineage. Where did this signal come from? Who observed it? What was the chain of custody between the original event and the version in your graph? Most data vendors hide this. We don't. Every signal in our graph has a lineage we can name, and most have passed through fewer hands than the equivalent signals from legacy aggregators. Fewer hands means less compression, less staleness, fewer derivative claims layered on top of the original observation.
Composability. Can this signal be combined with other signals at reasoning time, without being pre-joined into a fixed schema first? This is the substrate question, and most data vendors fail it. A signal that lives in its own silo, accessible only through that vendor's API, in their pre-chosen schema, isn't composable. A signal that lives in the same graph as hundreds of thousands of others, in a structure built for an agent to traverse, is.
Why one signal is never enough
One signal at 30% accuracy is garbage.
Five signals at 30% accuracy each, stacked with the right boolean composition, make a precise audience. The failure modes compress as you compose. The signals you stack against each other narrow the cone of possible interpretations until what comes out the other side is actually predictive of the thing you care about.
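A rough back-of-the-envelope version of that claim, under assumptions that are ours for illustration and not measurements: read "30% accuracy" as precision (30% of the people a signal flags are really in the target audience), assume 5% of the population is in the audience to begin with, and assume the signals are conditionally independent. The function name is hypothetical. Real signals are correlated, so actual gains will be smaller than this sketch suggests.

```python
def stacked_precision(base_rate: float, single_precision: float, k: int) -> float:
    """Precision when k independent signals all fire (naive Bayes odds update)."""
    prior_odds = base_rate / (1 - base_rate)
    single_odds = single_precision / (1 - single_precision)
    # How much one signal multiplies the odds of being in the audience.
    likelihood_ratio = single_odds / prior_odds
    posterior_odds = prior_odds * likelihood_ratio ** k
    return posterior_odds / (1 + posterior_odds)


# Assumed inputs: 5% base rate, each signal alone is right 30% of the time.
for k in range(1, 6):
    print(k, round(stacked_precision(0.05, 0.30, k), 3))
```

Under these assumptions, two stacked signals already reach roughly 78% precision and five exceed 99%. The exact numbers depend entirely on the assumed base rate and on independence, but the direction is the point: each added signal multiplies the odds instead of adding to them.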
This is why the question each one asks separates Signal Engineers from Plumbers: "What can I find?" versus "What data can I get?" That distinction matters more than it looks. A Plumber stares at one signal's coverage and complains it's incomplete. A Signal Engineer stacks that signal with seven others and finds the audience in the intersection.
The artifact that comes out is a composition: a boolean expression of signals, often with weights and exclusions, that represents the actual audience or scoring rubric you're trying to build. The Signal Graph holds the raw signals. The Signal Engineer composes them. The output is something no individual signal could have produced and no vendor's segment catalog would have sold.
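A composition of that kind can be sketched with plain sets. Everything here is hypothetical: the subject IDs, the signal names, the weights, and the in-memory dictionary standing in for a graph. It only shows the shape of a boolean expression with required signals, alternatives, exclusions, and weights.

```python
# Hypothetical data: subject id -> set of signal names observed for that subject.
graph = {
    "p-001": {"searched_peloton_90d", "child_under_5", "relocated_to_multi_unit"},
    "p-002": {"searched_peloton_90d", "renewed_cdl_2025"},
    "p-003": {"child_under_5", "relocated_to_multi_unit"},
}

required = {"searched_peloton_90d"}                    # AND: all must be present
any_of = {"child_under_5", "relocated_to_multi_unit"}  # OR: at least one present
excluded = {"renewed_cdl_2025"}                        # NOT: none may be present
weights = {"child_under_5": 2.0, "relocated_to_multi_unit": 1.0}  # assumed weights


def matches(observed: set) -> bool:
    """Evaluate the boolean part of the composition for one subject."""
    return required <= observed and bool(any_of & observed) and not (excluded & observed)


def score(observed: set) -> float:
    """Sum the weights of the weighted signals this subject has."""
    return sum(w for name, w in weights.items() if name in observed)


# The audience is every subject that satisfies the expression, with a score.
audience = {sid: score(obs) for sid, obs in graph.items() if matches(obs)}
```

Here only p-001 qualifies, with a score of 3.0. Because the raw signals stay separate, changing the expression is a one-line edit, where a pre-built segment would need to be rebuilt by whoever sold it.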
What this looks like at Watt
The Signal Graph today holds 165,000+ behavioral signals on 250M+ US adults and 55,000+ signals on 60M+ US companies. The categories span purchases, life events, demographic and household composition, geographic patterns, intent and in-market behaviors, ownership and asset history, professional and firmographic signals, hiring and growth signals, technographics, and a long tail of harder-to-classify behavioral observations.
Every one of those signals is sourced from the open market. We've written separately about where our data comes from. The thing we built that nobody else has is structural: a substrate that holds raw signals at the resolution they were observed, refreshes them on their own cadence, and exposes them in a shape an agent can compose across in real time without being handed a schema in advance.
A Signal Library is coming. Every signal will have its own page: what it is, where it came from, how often it refreshes, how it relates to other signals, and how to compose with it. Until that ships, the easiest way to see what's actually in the graph is to ask one of our customers what they composed for their last campaign.
The takeaway
A signal is a single fact about a person or company at a specific moment, raw, uncompressed, and composable with others.
Everything else in the data industry (fields, traits, segments, scores, dashboards) is downstream of signals. Built from them. Compressed out of them. Useful for the questions the builder anticipated, useless for the questions an agent will actually ask.
If you've spent the last decade thinking in fields, the cognitive flip is to start thinking in signals. The artifact you ship gets bigger. The questions you can answer get more interesting. The audience your work produces stops looking like everyone else's, because nobody else is composing from the same raw material you are.
That's what we built the Signal Graph for. That's what Signal Engineering is.