The dataset, the engine-testing protocol, and the limits of what's been measured.
The empirical claims in the essay draw from a single internal dataset: roughly 200 personal injury law firm diagnostics run by LawShift during calendar year 2025 and the first half of 2026. Each diagnostic queries four AI engines against a curated prompt set parameterized by firm name and metro, then scores the firm's appearance in each engine's response. Results are stored, dated, and tagged by metro, practice area, and firm size band.
The dataset is not a random sample of US personal injury firms; it is a convenience sample of firms that ran the diagnostic, plus the firms LawShift benchmarked against during client engagements. It overrepresents mid-sized firms in top-50 US metros and underrepresents micro-firms in tertiary markets. Where the essay generalizes, it does so cautiously and with that bias in mind.
Raw firm-level data is confidential. The essay cites only aggregate metrics computed from the dataset: the correlation figures, the percentage of firms appearing in AI answers, and the median time-to-first-citation.
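As a rough illustration of how an aggregate such as "percentage of firms appearing in AI answers" can be derived from tagged firm-level records, here is a minimal Python sketch. The record layout, the field names, the `share_appearing` helper, and the sample values are all hypothetical; they are not LawShift's actual schema or data. The threshold of 9 follows the 0–16 scale described later on this page.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass(frozen=True)
class DiagnosticRecord:
    """One stored diagnostic result (hypothetical layout, for illustration)."""
    firm_id: str           # assumed anonymized identifier
    run_date: date         # results are dated
    metro: str             # tag: metro
    practice_area: str     # tag: practice area
    size_band: str         # tag: firm size band (labels here are made up)
    visibility_score: int  # 0-16 aggregate across the four engines


def share_appearing(records: Iterable[DiagnosticRecord], threshold: int = 9) -> float:
    """Fraction of records whose visibility score meets the threshold."""
    records = list(records)
    if not records:
        raise ValueError("no records to aggregate")
    appearing = sum(1 for r in records if r.visibility_score >= threshold)
    return appearing / len(records)


# Made-up sample records, purely to exercise the function.
sample = [
    DiagnosticRecord("firm-001", date(2025, 3, 1), "Metro A", "personal injury", "mid", 2),
    DiagnosticRecord("firm-002", date(2025, 3, 1), "Metro A", "personal injury", "mid", 10),
    DiagnosticRecord("firm-003", date(2025, 6, 1), "Metro B", "personal injury", "small", 3),
    DiagnosticRecord("firm-004", date(2025, 6, 1), "Metro B", "personal injury", "small", 0),
]

overall = share_appearing(sample)
metro_a = share_appearing(r for r in sample if r.metro == "Metro A")
```

Filtering by a tag before aggregating, as in the `metro_a` line, is one way to produce an anonymized subset figure without exposing any single firm's record.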
Four engines, in priority order:
Other engines (Anthropic Claude, xAI Grok, You.com, Brave) are monitored internally for category trends but are not part of the published scoring methodology. They will be added as their consumer market share for legal queries grows.
Each engine is queried with a parameterized prompt set covering the high-intent consumer queries a personal injury client would actually use. The set is curated, not exhaustive. Representative prompts:
The full audit (run by LawShift directly with client firms) expands this to 40+ prompts across practice areas, demographic segments, and longer-tail intent variations. The published essay summarizes patterns observed across the larger prompt set.
Each engine returns one of three states:
Aggregating across the four engines yields a visibility score from 0 to 16. Scores of 0–3 are classified as "AI invisible," 4–8 as "AI thin," 9–12 as "AI present," and 13–16 as "AI visible." The "less than 5% of firms appear" figure counts a firm as appearing only if it scores 9 or higher on this scale.
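The banding above can be sketched in a few lines of Python. The band boundaries and the 9-or-higher cutoff come from the text. The assumption that each engine contributes an integer from 0 to 4 is inferred from the 0–16 total across four engines and is not confirmed here, and the function names are illustrative.

```python
from typing import Sequence

# Band boundaries as stated in the text (inclusive on both ends).
BANDS = [
    (0, 3, "AI invisible"),
    (4, 8, "AI thin"),
    (9, 12, "AI present"),
    (13, 16, "AI visible"),
]


def visibility_score(engine_points: Sequence[int]) -> int:
    """Sum per-engine points into a 0-16 score.

    Assumes four engines, each contributing an integer from 0 to 4.
    """
    if len(engine_points) != 4:
        raise ValueError("expected points for exactly four engines")
    if any(not 0 <= p <= 4 for p in engine_points):
        raise ValueError("each engine's points must be between 0 and 4")
    return sum(engine_points)


def classify(score: int) -> str:
    """Map a 0-16 visibility score to its band label."""
    for low, high, label in BANDS:
        if low <= score <= high:
            return label
    raise ValueError("score must be between 0 and 16")


def counts_as_appearing(score: int) -> bool:
    """True if the firm would count toward the 'firms appear' figure."""
    return score >= 9


example_score = visibility_score([4, 2, 2, 1])  # hypothetical per-engine points
example_band = classify(example_score)
```

Keeping the bands in a single table makes the boundaries easy to audit against the published thresholds if they are revised in a later quarter.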
Several claims in the essay rely on published external research:
The diagnostic is a snapshot. AI engines update their retrieval indexes on different cycles, and individual queries are non-deterministic: running the same prompt twice can return different results. The diagnostic captures the prevailing pattern across a prompt set, not a definitive judgment about any single query.
The dataset also cannot answer counterfactual questions. We do not know what would have happened if a specific firm had done specific work; we know only what happened at the firms that did that work. The case study referenced in the essay (zero-to-43% AI visibility in six weeks) is a real engagement, but it is one engagement, and outcomes vary.
The dataset is re-run quarterly. The figures cited in this essay will be revisited in August 2026 and again in November 2026. If anything material changes, which it will, the essay will be updated and the original archived. Update notes will be added to the bottom of the home page when revisions occur.
If you want to discuss specific figures, request anonymized data on a particular subset, or push back on something the essay claims, email nick@lawshift.ai or press@lawshift.ai. Substantive corrections will be published.