Probabilistic Attribution: When It Still Wins

Lakshith Dinesh

Head of Growth, Linkrunner

Probabilistic attribution does not have a clean reputation. Apple's App Tracking Transparency framework limits it on iOS. Deterministic identifiers feel more accurate. Most teams have been told to lean on SKAN instead. None of that means probabilistic attribution is dead.

In practice, the model still has channels and cohorts where it is the only one delivering a number a marketer can act on. On Android, the advertising ID remains accessible by default. iOS has large non-ATT cohorts where deterministic identifiers are unavailable but a probabilistic match is still defensible. Offline, OOH, podcast, and programmatic open exchange traffic was never deterministic in the first place.

The mistake teams make is not that they use probabilistic attribution. It is treating a probabilistic match and a deterministic match as the same thing in reporting. Once you flag confidence tiers and report them honestly, probabilistic earns its place back in the stack.

This post walks through where probabilistic attribution still works in 2026, what accuracy to expect, and how to set up the layer correctly without misleading your finance team.

What Is Probabilistic Attribution?

Probabilistic attribution is a method that matches a click to an install using statistical signals such as device characteristics, IP ranges, and timing, returning a confidence score rather than a guaranteed link.

How it works in practice:

  • Click logs capture device profile signals at the time of the click: device model, OS version, screen size, language, region, carrier, IP range, click timestamp.

  • Install events capture the same signal set on first launch.

  • A matching algorithm calculates the probability that a given install came from a given recent click.

  • The output is a confidence score, not a binary credit.
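The steps above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's actual matcher: the `Profile` fields mirror the signals listed above, while the `WEIGHTS` values, `match_score`, and `best_click` are hypothetical names and numbers chosen for the example. The score here is a weighted share of agreeing signals, which is a simplification; a production model would fit its weights against labelled data before calling the result a probability.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Profile:
    """Device profile captured at click time or on first launch."""
    device_model: str
    os_version: str
    screen_size: str
    language: str
    region: str
    carrier: str
    ip_range: str
    timestamp: float  # seconds since epoch


# Hypothetical weights in percentage points (they sum to 100).
# Assumption for illustration only; real weights would be fitted to data.
WEIGHTS = {
    "device_model": 20,
    "os_version": 15,
    "screen_size": 10,
    "language": 5,
    "region": 10,
    "carrier": 10,
    "ip_range": 30,
}


def match_score(click: Profile, install: Profile) -> float:
    """Return a score in [0, 1]: the weighted share of signals that agree."""
    if install.timestamp < click.timestamp:
        return 0.0  # an install cannot precede the click that caused it
    points = sum(
        weight
        for name, weight in WEIGHTS.items()
        if getattr(click, name) == getattr(install, name)
    )
    return points / 100


def best_click(
    clicks: Sequence[Profile], install: Profile
) -> Optional[Tuple[Profile, float]]:
    """Return the highest-scoring (click, score) pair, or None if no clicks."""
    scored = [(click, match_score(click, install)) for click in clicks]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[1])
```

The point of returning the score alongside the click is that downstream reporting can keep it, rather than collapsing the match into a yes or no.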

The contrast with deterministic attribution is clean. Deterministic uses a unique identifier (GAID, IDFA, the Google Play install referrer, a signed SAN postback). The match is one-to-one and verifiable. No probability involved.

The right mental model: deterministic is "we know who clicked"; probabilistic is "the timing, device, and IP all line up, so this install is likely from this click." Both models sit alongside each other in any modern attribution stack, as the post on attribution in the current privacy landscape lays out.

Where Probabilistic Attribution Still Works in 2026

  • Android, where advertising IDs remain accessible by default and the install referrer is deterministic. Probabilistic mostly fills gaps for delayed referrer reads, sideloads, and OEM Play Store forks.

  • iOS non-ATT cohorts: users who declined the prompt, never saw it, or have "Allow Apps to Request to Track" switched off (the successor to the older Limit Ad Tracking setting). This is still a meaningful share of iOS users.

  • Server-side matching when device identifiers are intentionally not collected (privacy-first apps, hashed-email flows).

  • Channel categories outside the major self-attributing networks: offline, OOH, podcast, programmatic open exchange, influencer, QR. These have always relied on probabilistic stitching because there is no deterministic identifier in the path.

The post-IDFA tracking playbook leans on this combination: deterministic where available, probabilistic where it is not.

Where Probabilistic Attribution Does Not Work

  • Channels where Apple or Meta SAN policy forbids fingerprinting on iOS. Apple Search Ads, Meta SAN postbacks, and the Google SAN flow on iOS rely on the platform's own attribution, not third-party probabilistic matching. SKAN handles iOS measurement for these channels, and the SKAN 4.0 framework is the right read for setup detail.

  • Cohorts where the user-level signal cannot be defensibly aggregated. Tiny app categories, niche device populations, and exotic OS forks all break probabilistic matching because the fingerprint stops being unique enough.

  • Privacy-regulated regions with strict consent regimes. GDPR-strict and DPDPA-aware setups should require explicit consent before probabilistic matching is enabled.

If your channel is in this list, do not retrofit a probabilistic layer. Treat SKAN, server-side signals, or aggregated lift testing as the right primary measurement.

Probabilistic Accuracy: What to Actually Expect

The honest answer is that accuracy depends on the cohort. Rough working bands, rounded:

  • Android, signal-rich device populations (recent Pixel, Samsung flagship, common OEMs in India): around 80 to 90 per cent match accuracy.

  • iOS non-ATT cohort on current iOS versions: around 70 to 85 per cent match accuracy.

  • Niche device populations or sparse traffic: drops fast, sometimes below 50 per cent.

  • Cross-day matches: accuracy decays as the lookback widens. A match made on day 7 of a 30-day window is not the same quality as one made within the first hour.

A few aggregated observations from running these matches across audits:

  • Always report a confidence score, not a binary credit. A 91 per cent match and a 54 per cent match should not appear in the same column.

  • Calibrate against a deterministic cohort on the same channel where possible. Run them side by side and look at the gap. If the probabilistic layer consistently credits around 20 per cent more installs than the deterministic one, you have a calibration problem.

  • Time decay matters more than people realise. The P90 click-to-install lag on iOS sits in the days, not the seconds, so the model's window choices have direct downstream effects on quality.
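The calibration check described above is simple arithmetic, sketched here under stated assumptions. The function names and the `tolerance` parameter are hypothetical; the post gives a 20 per cent over-count as an example of a problem but does not prescribe a threshold, so the caller has to choose one.

```python
def calibration_gap(probabilistic_installs: int, deterministic_installs: int) -> float:
    """Relative over-count (positive) or under-count (negative) of the
    probabilistic layer against a deterministic cohort on the same channel."""
    if deterministic_installs <= 0:
        raise ValueError("need a non-empty deterministic cohort to calibrate against")
    return (probabilistic_installs - deterministic_installs) / deterministic_installs


def needs_recalibration(gap: float, tolerance: float) -> bool:
    """Flag the model when the gap exceeds a tolerance the team has agreed on.
    The tolerance is an assumption supplied by the caller, not a standard."""
    return abs(gap) > tolerance


# Example: probabilistic credits 1,200 installs where deterministic sees 1,000.
gap = calibration_gap(1200, 1000)
print(f"gap: {gap:.0%}")
```

A single reading says little; the signal is a gap that stays on one side of zero across several periods.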

Probabilistic vs Deterministic: A Decision Framework

Use deterministic by default. It is verifiable, defensible, and easier to explain to finance.

Layer probabilistic on top only when:

  1. The channel has no deterministic option (offline, OOH, podcast, influencer, programmatic open exchange).

  2. The cohort has no deterministic identifier (iOS non-ATT, hashed-email server-side flows, sideloads).

  3. The deterministic signal arrived but is incomplete (install referrer missing, postback delayed beyond the window).
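The three conditions above reduce to a small rule, sketched here as one possible encoding. The `MatchType` enum, the function name, and the boolean parameters are hypothetical; how each flag is derived depends on your own data.

```python
from enum import Enum


class MatchType(Enum):
    DETERMINISTIC = "deterministic"
    PROBABILISTIC = "probabilistic"


def choose_match_type(
    channel_has_deterministic_option: bool,
    cohort_has_identifier: bool,
    signal_complete: bool,
) -> MatchType:
    """Deterministic by default; fall back to probabilistic only when one of
    the three conditions in the list above holds."""
    if (
        channel_has_deterministic_option
        and cohort_has_identifier
        and signal_complete
    ):
        return MatchType.DETERMINISTIC
    return MatchType.PROBABILISTIC
```

For example, a podcast campaign would pass `channel_has_deterministic_option=False` and land in the probabilistic layer regardless of the other two flags.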

Three things to do every time probabilistic is in the mix:

  • Tag the match type on every row of raw data. Reporting can roll it up; you cannot reverse-engineer it later.

  • Show deterministic and probabilistic credit in separate columns when stakeholders need precision. Roll them up only for top-line views.

  • Run a quarterly calibration against deterministic cohorts and adjust the model's weights if the gap grows.
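A rough sketch of the first two practices, assuming raw rows already carry a `match_type` tag. The row shape (`channel`, `match_type`, `installs`) and the function name are illustrative assumptions, not a real export schema.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping

MATCH_TYPES = ("deterministic", "probabilistic")


def rollup_by_channel(rows: Iterable[Mapping]) -> Dict[str, Dict[str, int]]:
    """Aggregate installs per channel, keeping deterministic and probabilistic
    credit in separate columns and adding a total for top-line views."""
    columns = defaultdict(lambda: {match_type: 0 for match_type in MATCH_TYPES})
    for row in rows:
        if row["match_type"] not in MATCH_TYPES:
            raise ValueError(f"untagged or unknown match type: {row['match_type']!r}")
        columns[row["channel"]][row["match_type"]] += row["installs"]
    return {
        channel: {**counts, "total": sum(counts.values())}
        for channel, counts in columns.items()
    }


rows = [
    {"channel": "podcast", "match_type": "probabilistic", "installs": 40},
    {"channel": "search", "match_type": "deterministic", "installs": 120},
    {"channel": "search", "match_type": "probabilistic", "installs": 15},
]
report = rollup_by_channel(rows)
```

Rejecting untagged rows outright, rather than guessing a default, reflects the point that match type cannot be reverse-engineered after the fact.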

The contrarian case for last-click attribution makes the point that simpler models often beat complex ones for growing apps. The same logic applies here. A transparent two-layer model (deterministic plus a clearly-labelled probabilistic fill) beats a complex multi-touch black box that no one can debug.

How to Set Up a Probabilistic Layer Correctly

Signals to include:

  • Click timestamp and source IP range

  • Device model, OS version, screen size, language, region

  • Carrier and connection type where available

  • Install timestamp and the same device profile

Signals to exclude post-ATT on iOS:

  • Anything Apple's policy forbids on a non-consented device (IDFA, persistent identifiers)

  • User-level data that crosses the on-device boundary without consent

Window and decay logic:

  • Match within a configurable lookback window (typical defaults: 7 days on iOS, 24 to 72 hours on Android)

  • Weight matches inside the first hour higher than matches at day 6

  • Set a confidence floor and do not publish matches below it (a 30 per cent match is noise, not signal)
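The window, decay, and floor rules can be combined as in the sketch below. Exponential half-life decay is an assumption made for the example; the post only says earlier matches should weigh more, without specifying a curve. The function name and all parameters are hypothetical and must be supplied by the caller.

```python
from typing import Optional


def publishable_confidence(
    base_score: float,
    lag_hours: float,
    *,
    lookback_hours: float,
    half_life_hours: float,
    floor: float,
) -> Optional[float]:
    """Apply lookback, time decay, and a confidence floor to a raw match score.

    Returns the decayed confidence, or None when the match falls outside the
    lookback window or below the floor and should not be published.
    """
    if lag_hours < 0 or lag_hours > lookback_hours:
        return None  # outside the configured window
    # Assumed decay curve: confidence halves every `half_life_hours`.
    decayed = base_score * 0.5 ** (lag_hours / half_life_hours)
    if decayed < floor:
        return None  # treated as noise, not signal
    return decayed


# A 0.8 match at 24 hours, with a 7-day window and a 24-hour half-life.
confidence = publishable_confidence(
    0.8, 24, lookback_hours=7 * 24, half_life_hours=24, floor=0.3
)
```

With these illustrative settings the same 0.8 match at 48 hours would decay to 0.2 and be dropped, which shows how strongly the half-life choice shapes what gets reported.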

Different MMPs sit at different points on the probabilistic spectrum. Linkrunner has chosen the deterministic-first path: no third-party device fingerprinting on iOS, SKAN as the canonical iOS measurement for SAN channels, deterministic identifiers and install referrer reads for everything else. The Linkrunner docs overview covers the architecture at a high level. If you want a probabilistic fill on top of that for offline or open-exchange channels, you build it on the export data or use a parallel measurement layer for those channels only.

A separate point worth flagging: probabilistic attribution is not the same as predictive attribution. The terms get confused all the time. Probabilistic answers "did this click cause this install"; predictive answers "is this install going to convert to revenue." Different models, different inputs, different outputs.

FAQ

Q: Is probabilistic attribution still allowed on iOS in 2026?

Yes, with limits. Apple's ATT framework and SAN policies restrict fingerprinting on iOS for paid channels covered by SKAN. Probabilistic matching is still defensible for non-ATT cohorts and channels outside the SAN ecosystem, provided you respect the consent state.

Q: How accurate is probabilistic attribution compared to deterministic?

Match accuracy typically sits in the 70 to 90 per cent range for signal-rich cohorts, and drops fast for niche device populations or long lookback windows. Deterministic matches are verifiable by construction, so the right pattern is to use probabilistic alongside deterministic, not instead of it.

Q: What signals does probabilistic attribution use?

Device model, OS version, screen size, language, region, carrier, connection type, IP range, and click and install timing. Identifiers protected by consent regimes are excluded.

Q: When should I use probabilistic attribution instead of relying on SKAN?

For non-SKAN channels (offline, OOH, podcast, programmatic open exchange, influencer) and for non-ATT iOS cohorts where SKAN does not deliver per-campaign granularity. SKAN handles iOS SAN channels; probabilistic handles the rest.

Q: Can I trust probabilistic attribution for revenue reporting?

Yes, if you report match type alongside the revenue figure. Mixing a high-confidence deterministic match and a 55 per cent probabilistic match in the same revenue column hides quality differences finance needs to see. Report them separately, then roll up.

Closing

Probabilistic attribution is not dead. The model has a defensible place for channels with no deterministic identifier, for iOS non-ATT cohorts where SKAN does not deliver per-campaign granularity, and as a fill layer for delayed or missing deterministic signals. The mistake is rarely the model. It is reporting probabilistic and deterministic credit as if they were the same quality of signal.

If you want to see how a deterministic-first MMP handles iOS measurement without third-party fingerprinting, request a demo from Linkrunner. Or pull your last 30 days of attributed installs from your current setup and check whether the match type is even exposed at the row level. If it is not, that is the first thing to fix.
