How We Advise an In-House Team That Executes Its Own AI Visibility Work
A step-by-step walkthrough of a consulting engagement for a B2B SaaS marketing team that has the hands but no AI visibility strategy: how we baseline their current representation, decide what to fix first, hand over a playbook their team runs, and teach them to measure movement themselves.
Methodology walkthrough
This page shows, step by step, how we run this type of engagement. Where figures appear, they illustrate the mechanics - client results are published only with written permission and supporting data.
The Typical Challenge
A B2B SaaS company has a capable marketing team - writers, a lifecycle marketer, some developer time - but nobody who has run AI visibility work before. The team reads about AEO and GEO, sees competitors named in ChatGPT and Perplexity answers while their own brand rarely appears, and cannot tell whether to start with technical fixes, content restructuring, or community presence. Effort is not the constraint; direction is. Without an outside diagnosis, they risk spending months of in-house capacity on the wrong layer.
Our Approach
1. Baseline Review
Before recommending anything, we establish where the brand actually stands - in AI answers and in the technical signals behind them.
- Sample a fixed panel of 20-30 category prompts across ChatGPT, Perplexity, and Google AI Overviews, logging which brands get named and which sources get cited
- Review the technical posture: site architecture, service pages, metadata, structured data, robots, sitemap, and how clearly the site explains what the company does and who it is for
- Check entity consistency: whether the name, descriptions, and category framing match across the site, directories, and the third-party sources AI systems reconcile
- Map competitor presence in the sampled answers and in the sources those answers retrieve from
The deliverable is a written baseline: what assistants currently say about the brand, where they source it from, and which gaps are technical versus reputational. Everything after this is prioritized against it.
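As a rough illustration of the logging step, the sketch below tallies which brands are named and which source domains are cited across a handful of logged answers. The record layout, the brand names (AcmeFlow, Onboardly), and the URLs are all hypothetical placeholders, not a prescribed format or real data.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical log: one record per (assistant, prompt) answer, filled in by hand.
observations = [
    {
        "assistant": "ChatGPT",
        "prompt": "What are the best B2B SaaS tools for customer onboarding?",
        "brands": ["AcmeFlow", "Onboardly"],
        "sources": [
            "https://www.example-reviews.com/onboarding-tools",
            "https://forum.example.org/t/123",
        ],
    },
    {
        "assistant": "Perplexity",
        "prompt": "What are the best B2B SaaS tools for customer onboarding?",
        "brands": ["Onboardly"],
        "sources": ["https://example-reviews.com/top-10"],
    },
    {
        "assistant": "Google AI Overviews",
        "prompt": "What are the best B2B SaaS tools for customer onboarding?",
        "brands": ["Onboardly", "AcmeFlow", "AcmeFlow"],
        "sources": [],
    },
]


def tally_baseline(observations):
    """Count answers naming each brand and answers citing each source domain."""
    brand_counts = Counter()
    source_counts = Counter()
    for obs in observations:
        # Use sets so a brand or domain counts once per answer, not once per repeat.
        brand_counts.update(set(obs["brands"]))
        domains = {urlparse(url).netloc.removeprefix("www.") for url in obs["sources"]}
        source_counts.update(domains)
    return brand_counts, source_counts


brand_counts, source_counts = tally_baseline(observations)
print(brand_counts.most_common())
print(source_counts.most_common())
```

Counting per answer rather than per occurrence keeps one long answer that repeats a brand name from outweighing several answers that mention it once.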
2. Strategy and Prioritization Workshop
We work through the baseline with the team in a live session and decide, together, which layers to fix first.
- Weigh the four layers against the findings: SEO foundation (crawlability, site structure, page quality), AEO structure (answer-shaped pages, FAQs, schema), GEO signals (entity consistency, third-party corroboration), and community presence (Reddit and category forums)
- Sequence by dependency: structural and technical fixes usually come before promotion, because promoting a site assistants cannot parse wastes the effort
- Mark every recommendation as urgent, next, or later - matched to the hours and skills the team actually has, not an idealized headcount
- Flag risks in plain language: where claims, automation, or channel choices could create problems later
The output is a prioritized roadmap the team agreed to in the room - not a generic audit deck they file away.
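Sequencing by dependency can be sketched as a topological sort. The example below uses Python's standard `graphlib` module; the roadmap items and their dependencies are invented for illustration, and a real roadmap would also weigh urgency and team capacity, which this sketch ignores.

```python
from graphlib import TopologicalSorter

# Hypothetical roadmap: each item maps to the set of items it depends on.
dependencies = {
    "fix robots and sitemap": set(),
    "add structured data": {"fix robots and sitemap"},
    "restructure answer-shaped pages": {"fix robots and sitemap"},
    "align entity descriptions": set(),
    "community participation": {
        "restructure answer-shaped pages",
        "align entity descriptions",
    },
}


def sequence(dependencies):
    """Return one valid order in which every item follows its dependencies."""
    # static_order() raises graphlib.CycleError if the items depend on each other in a loop.
    return list(TopologicalSorter(dependencies).static_order())


order = sequence(dependencies)
for step, item in enumerate(order, start=1):
    print(step, item)
```

Several valid orders usually exist; the sort only guarantees that promotion-type items such as community participation never land before the structural work they rely on.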
3. Execution Playbook
Because the team implements everything themselves, the handover has to be runnable, not theoretical.
- A brief per workstream: what to change, why it matters, what done looks like, and an example of good execution
- An owner and effort estimate for each item, so the roadmap maps onto the sprint planning the team already uses
- Templates where they help: page structure for answer-shaped content, an entity consistency checklist, and disclosure guidelines for community participation
- An explicit do-not list - tactics that look tempting but create trust or policy risk, so nobody improvises into trouble
4. Review Cadence
Advisory only works if someone checks whether the plan survives contact with reality, so we stay on a recurring schedule.
- Recurring sessions reviewing the sampled prompt panel: what moved, what did not, and whether variance or a real trend explains it
- Unblocking decisions the team cannot settle alone: trade-offs between workstreams, whether to drop an item that is not paying off, when to escalate a technical fix
- Re-prioritizing the roadmap as findings come in, so the plan reflects what the panel and analytics actually show
5. Skills Transfer
The end state is a team that measures for itself instead of renting the capability indefinitely.
- Teach prompt panel design: fixed wording, a fixed set, sampled repeatedly, logged with dates and model versions
- Teach honest reading: single runs prove nothing - repeated sampling with variance notes is the minimum bar for calling something a trend
- Hand over the measurement log and sampling checklist so the cadence continues without us in the loop
- Help set expectations internally: leadership reporting framed as directional movement, never as promised placements
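One way to picture the measurement log is as a list of dated records, summarized per sampling round. The sketch below is a minimal example under assumed field names; the dates, prompt IDs, and the `model-version-a` label are placeholders, and the population standard deviation is just one simple choice for a variance note.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean, pstdev


@dataclass(frozen=True)
class PanelRun:
    """One logged answer to one fixed panel prompt (hypothetical field names)."""
    run_date: date
    assistant: str
    model_version: str
    prompt_id: str
    brand_mentioned: bool


def mention_rate_by_date(runs):
    """Share of logged answers mentioning the brand, per sampling date."""
    by_date = {}
    for run in runs:
        by_date.setdefault(run.run_date, []).append(run.brand_mentioned)
    return {d: sum(flags) / len(flags) for d, flags in sorted(by_date.items())}


def summarize(rates):
    """Mean and spread across rounds; a single round gets no spread at all."""
    values = list(rates.values())
    spread = pstdev(values) if len(values) > 1 else None
    return {"rounds": len(values), "mean": mean(values), "spread": spread}


# Illustrative entries only: two rounds, two prompts each.
runs = [
    PanelRun(date(2024, 1, 8), "ChatGPT", "model-version-a", "p1", True),
    PanelRun(date(2024, 1, 8), "ChatGPT", "model-version-a", "p2", False),
    PanelRun(date(2024, 1, 15), "ChatGPT", "model-version-a", "p1", True),
    PanelRun(date(2024, 1, 15), "ChatGPT", "model-version-a", "p2", True),
]

rates = mention_rate_by_date(runs)
summary = summarize(rates)
print(rates)
print(summary)
```

Returning no spread for a single round is deliberate: it makes the "single runs prove nothing" rule visible in the log summary instead of leaving it to the reader's discipline.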
What We Work Toward
Consulting does not promise outcomes; it delivers clarity and a plan the team can execute. AI answers are non-deterministic, so on an engagement like this we manage toward directional signals rather than promised placements:
Independent operation
A team that can run the full loop - implement, sample the panel, read the results, re-prioritize - without needing us in the room for every decision.
A roadmap in motion
A prioritized plan that is actually being executed - urgent fixes shipped before nice-to-haves get discussed, with owners and sequencing the team committed to.
Prompt panel movement
Directional movement on the tracked category prompts - brand mentions and citations sampled repeatedly over time, reported with variance notes instead of cherry-picked screenshots.
Honest reporting in-house
Internal reporting practices that separate real trends from single-run noise, so leadership hears what is actually happening rather than what a screenshot suggests.
Key Principles
- Diagnosis before execution - the baseline decides the roadmap; recommendations without a baseline are guesses dressed up as strategy.
- Sequence by dependency - structural and technical fixes come before promotion, because promoting content assistants cannot parse wastes the team's hours.
- Playbooks match capacity - recommendations are shaped to the hours and skills the team actually has; a plan the team cannot staff is a plan that will not happen.
- Measurement is taught, not outsourced - the team leaves able to sample, log, and read directional movement themselves, with no vendor dependency for the truth.
How We Measure
- Data sources: Google Search Console, Google Analytics 4, repeated ChatGPT / Perplexity / Claude prompt sampling
- Timeframe: Panel sampled between review sessions; first checkpoint at 4-8 weeks
- Metric definition: Prompt panel movement = brand mentions and citations in assistant responses to a fixed panel of category prompts, observed through repeated sampling run by the client team. Branded queries = impressions for queries containing the brand name, observed in Google Search Console. See our methodology page for detailed definitions.
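The branded-queries definition can be illustrated with a small filter over query rows. This is a sketch only: the row layout (`query`, `impressions`), the brand terms, and the numbers are assumptions for illustration, not the shape of an actual Google Search Console export.

```python
def branded_impressions(rows, brand_terms):
    """Sum impressions for queries containing any brand term, case-insensitively."""
    terms = [term.lower() for term in brand_terms]
    return sum(
        row["impressions"]
        for row in rows
        if any(term in row["query"].lower() for term in terms)
    )


# Hypothetical rows, as if copied from a query report.
rows = [
    {"query": "acmeflow pricing", "impressions": 120},
    {"query": "customer onboarding software", "impressions": 900},
    {"query": "AcmeFlow vs onboardly", "impressions": 40},
    {"query": "acme flow login", "impressions": 15},
]

total = branded_impressions(rows, ["acmeflow", "acme flow"])
print(total)
```

Listing spelling variants of the brand name explicitly keeps the definition fixed between checkpoints, so a change in the total reflects the data rather than a change in what counts as branded.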
Validation & Evidence Standards
How Results Get Validated on a Real Engagement
On a live engagement, every reported metric is cross-checked across multiple data sources. We combine platform analytics, third-party tools, and observational methods to confirm directional trends.
Validation tools we use:
- Google Search Console (branded query volume tracking)
- Google Analytics 4 (traffic attribution analysis)
- Repeated prompt sampling across ChatGPT, Perplexity, and Claude - run by the client team using the panel designed together
- A shared measurement log recording dates, model versions, and variance notes
Cross-validation methods:
- Prompt panel readings confirmed through repeated sampling, never single runs
- Branded query data cross-referenced between GSC and GA4
- Movement compared against seasonality, product releases, and paid-spend changes to rule out confounds
About This Walkthrough
This walkthrough shows exactly how we run this type of engagement. Where figures appear, they illustrate the mechanics. We publish client numbers only with written permission and supporting data exports - transparency about method over dressed-up numbers.
Measurement Limitations
AI outputs are non-deterministic and vary by prompt wording, model version, and time. Our measurements are proxy-based and observational, not precise counts. Results should be interpreted as directional indicators rather than exact measurements or guarantees. See our methodology page for detailed measurement definitions.
Replication Prompts
These are the kinds of prompts we track on an engagement like this. Try them (or your own category prompts) yourself - AI responses vary by model, wording, and time, so treat any single run as directional:
- What are the best B2B SaaS tools for customer onboarding?
- Recommend workflow automation software for mid-market teams
- Which vendors should a mid-market team shortlist for onboarding software?
Have the Hands but Need the Direction?
Start with consulting to get a senior baseline, a prioritized roadmap, and a playbook your team can run in-house.