The Technical SEO & Site Health Checklist: How to Brief an Expert Agency and Audit Your Own Foundation

A common misconception persists among marketing teams: that technical SEO is a one-time fix, a set of configurations you apply and forget. In reality, technical SEO is an ongoing operational discipline—a continuous cycle of crawling, indexing, rendering, and performance optimization. If your site’s technical foundation is brittle, no amount of content or backlinks will produce sustainable organic growth. This checklist is designed for two audiences: the in-house marketer who needs to brief an SEO agency with precision, and the practitioner who wants to run their own baseline audit. We will walk through the critical areas—crawl budget, indexation signals, Core Web Vitals, and link profile hygiene—and flag the risks that can undermine even a well-funded campaign.

1. The Crawl Budget Audit: Prioritizing What Googlebot Sees

Before Google can rank your pages, it must find and crawl them. Crawl budget is the finite number of URLs Googlebot will request from your server within a given timeframe. For small sites (under a few thousand pages), crawl budget is rarely a bottleneck. For enterprise sites, e-commerce catalogs, or news publishers, wasted crawl on thin pages, infinite filter combinations, or broken redirect chains can delay the discovery of new, important content.

Checklist Step 1: Identify Crawl Waste

  • Log into Google Search Console (GSC) and navigate to the Crawl Stats report. Review the breakdown of “By purpose” (discovery vs. refresh) and “By response” (200, 404, 301, etc.).
  • Use a log file analyzer (e.g., Screaming Frog Log File Analyzer, or a custom script parsing your server access logs; a minimal sketch follows this list) to see which URLs Googlebot actually requests. Compare this to your XML sitemap—are there URLs being crawled that are not in your sitemap? Are sitemap URLs being ignored?
  • Look for patterns: are you serving 200 status codes to parameterized URLs that should be blocked? Are redirect chains (301 → 302 → 200) consuming multiple requests per destination?
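
For teams without a dedicated log analyzer, a short script can give a first-pass view of crawl waste. The sketch below is illustrative, not production-grade: it assumes a standard combined-format access log, the `access.log` path is a placeholder, and the user-agent string match is only a rough filter (real Googlebot verification requires a reverse DNS lookup).

```python
import re
from collections import Counter

# Combined log format: IP - - [time] "METHOD /path HTTP/1.1" status size "referer" "user-agent"
LINE_RE = re.compile(r'"\w+ (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) .*"(?P<ua>[^"]*)"$')

status_counts = Counter()
parameterized = Counter()

with open("access.log") as log:  # placeholder path
    for line in log:
        match = LINE_RE.search(line)
        if not match or "Googlebot" not in match.group("ua"):
            continue  # note: a UA string can be spoofed; verify with reverse DNS
        status_counts[match.group("status")] += 1
        if "?" in match.group("path"):
            # Tally parameterized URLs -- a common source of crawl waste
            parameterized[match.group("path").split("?")[0]] += 1

print("Googlebot responses by status:", status_counts.most_common())
print("Most-crawled parameterized paths:", parameterized.most_common(10))
```
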
Risk Callout: The Infinite Crawl Trap
Poorly configured faceted navigation (e.g., color=red&size=large&material=cotton) can generate thousands of unique URLs that all point to the same or near-identical content. Without a `robots.txt` disallow rule or consistent canonicalization (GSC's URL Parameters tool has been retired, so it no longer handles this for you), Googlebot can waste significant budget on these variants. This does not directly cause a penalty, but it delays the indexing of your high-value product pages.

Action for the Agency Brief: When briefing an agency, ask: “What is our current crawl efficiency ratio (useful pages crawled vs. total pages crawled)? Which URL patterns are consuming the most budget, and what is your plan to consolidate or block them?”

2. Indexation Signals: Sitemaps, Robots.txt, and Canonicalization

Having your pages crawled is meaningless if they are not indexed—or worse, if the wrong version is indexed. Three core files control this flow: the XML sitemap, the `robots.txt`, and the canonical tag. Misconfigurations here are among the most common causes of “lost” SEO value.

Checklist Step 2: Validate Your Indexation Controls

  • XML Sitemap: Ensure your sitemap includes only canonical, indexable URLs (no paginated pages, no session IDs, no 3xx or 4xx responses). Submit the sitemap in GSC and check for “Couldn’t fetch” or “URL not followed” errors. For sites with 10,000+ URLs, split sitemaps by content type (e.g., products, categories, articles).
  • Robots.txt: Verify that you are not accidentally blocking critical resources. The most common mistake: `Disallow: /wp-admin/` is fine, but `Disallow: /css/` or `Disallow: /js/` can prevent Google from fully rendering your pages. Check the robots.txt report in GSC (the standalone robots.txt Tester has been retired), and see the illustrative file after this list.
  • Canonical Tags: Run a site-wide crawl (Screaming Frog, Sitebulb) and flag any page where the `rel="canonical"` points to a different domain, a 4xx URL, or a URL that itself has a different canonical. This creates a “canonical chain” that Google may ignore.
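
To make the robots.txt point concrete, here is an illustrative file showing the safe and dangerous patterns described above; the paths are placeholders for your own site's structure.

```
# Safe: keep crawlers out of admin pages and internal search results
User-agent: *
Disallow: /wp-admin/
Disallow: /search?

# Dangerous: blocking asset directories prevents Google from rendering the page
# Disallow: /css/
# Disallow: /js/

Sitemap: https://www.example.com/sitemap.xml
```
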
Table: Common Indexation Mistakes and Their Impact

| Misconfiguration | Symptom in GSC | Consequence |
| --- | --- | --- |
| Blocked CSS/JS in robots.txt | “Page is indexed without resources” warning | Poor rendering, potential ranking loss for JS-dependent content |
| Self-referencing canonical on paginated pages | Paginated URLs indexed instead of “view-all” or main category | Diluted authority, duplicate content issues |
| Sitemap includes non-canonical URLs | “Submitted URL marked ‘noindex’” | Wasted crawl budget, no indexation of the intended page |
| No sitemap submitted | Slow discovery of new content | Weeks-long delay in indexing for large sites |

Action for the Agency Brief: Request a full indexation audit report that includes: (1) the number of submitted vs. indexed URLs, (2) a list of pages blocked by robots.txt that should be accessible, and (3) any canonical tag conflicts found.

3. Core Web Vitals: Beyond the Lab Data

Core Web Vitals (LCP, INP, CLS) are considered a ranking signal, but more importantly, they are a direct measure of user experience. Many agencies treat CWV as a “check the box” metric—run a Lighthouse test, get a green score, move on. This is insufficient: field data (from the Chrome User Experience Report, or CrUX) often tells a different story than lab data.

Checklist Step 3: Measure Field Performance, Not Just Lab Performance

  • Open GSC → Core Web Vitals report. This shows real-user data segmented by URL group. Note the “Poor” URLs—these are the ones actually harming user experience.
  • For LCP: identify the largest element (usually an image or hero block). Is it lazy-loaded? Is it served via a slow CDN? Is it a next-gen format (WebP, AVIF)?
  • For CLS: check for layout shifts caused by late-loading ads, dynamically injected content, or web fonts with no `font-display: swap`.
  • For INP (Interaction to Next Paint): INP replaced FID as a Core Web Vital in March 2024, so field INP is now the responsiveness metric to track. Measure it with a real-user monitoring (RUM) tool or the Web Vitals JavaScript library (see the snippet after this list).
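
As a quick illustration, the open-source web-vitals library reports metrics from real sessions; the callbacks here just log to the console, but in practice you would send the values to your analytics endpoint. The import assumes the `web-vitals` npm package is installed.

```js
import { onCLS, onINP, onLCP } from 'web-vitals';

// Each callback fires with the final (or updated) metric value for the session
onCLS(console.log);
onINP(console.log);
onLCP(console.log);
```
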
Risk Callout: The “False Green” from Lab Tests
A Lighthouse score of 90+ on a local MacBook on a fast Wi-Fi network does not reflect the experience of a user on a mid-range Android phone on 3G. Always cross-reference lab data with CrUX field data. If your agency only shows you Lighthouse screenshots, ask for the CrUX data.

Action for the Agency Brief: Provide your agency with a list of the top 20 URLs by organic traffic. Ask them to produce a field-data report for those URLs, identifying specific remediation steps (e.g., “compress hero image to under 100KB,” “preload LCP image,” “set explicit width/height on banner”).
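
If you want to pull that field data yourself, the CrUX API exposes the same real-user data GSC draws on. The sketch below is a minimal example, assuming you have a Google API key (`YOUR_API_KEY` is a placeholder) and that the URL has enough traffic to appear in the CrUX dataset; metric names follow the public API documentation.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder -- create one in Google Cloud Console
ENDPOINT = f"https://chromeuxreport.googleapis.com/v1/records:queryRecord?key={API_KEY}"

payload = {
    "url": "https://www.example.com/",  # placeholder URL
    "formFactor": "PHONE",              # field data for mobile users
}

request = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(request) as response:
    record = json.load(response)["record"]

# p75 is the threshold Google uses to classify a URL as Good / Needs Improvement / Poor
for name, metric in record["metrics"].items():
    print(name, "p75:", metric["percentiles"]["p75"])
```
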

4. On-Page Optimization and Intent Mapping: The Content-Audit Connection

Technical SEO does not stop at crawl and indexation. On-page optimization—the structure of headings, the use of schema, the placement of target keywords—must align with search intent. A technically perfect page that targets the wrong intent will not rank.

Checklist Step 4: Align Technical Signals with Intent

  • For each high-priority page, map the primary keyword to one of four intent types: informational, navigational, commercial, transactional. The page’s content format (blog post, product page, category page) must match this intent.
  • Verify that the H1 contains the primary keyword and matches the page title. Check for multiple H1s (a common CMS error).
  • Implement structured data (JSON-LD) appropriate to the content type: Product schema for product pages, Article schema for blog posts, FAQ schema for Q&A content (a minimal example follows this list). Test with Google’s Rich Results Test.
  • Ensure internal links use descriptive anchor text. Avoid generic “click here” or “read more” links.
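
As an illustration of the JSON-LD point above, a minimal Article snippet might look like the following; all values are placeholders, and Google's Rich Results Test will confirm whether your actual markup is eligible.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "The Technical SEO & Site Health Checklist",
  "datePublished": "2025-01-15",
  "author": {
    "@type": "Person",
    "name": "Russell Le"
  }
}
</script>
```
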
Table: Intent vs. Technical SEO Signals

| Intent Type | Typical Page Format | Key Technical Signals |
| --- | --- | --- |
| Informational | Blog post, guide, video | Article schema, clear H2/H3 hierarchy, FAQ schema |
| Commercial | Comparison page, review | Product schema, Review schema, table of contents |
| Transactional | Product page, checkout | Product schema, breadcrumb, clean URL structure |
| Navigational | Brand page, category | Organization schema, breadcrumb, fast load time |

Action for the Agency Brief: Request a content audit that includes intent mapping for your top 50 landing pages. The output should show: (1) the current intent vs. the target intent, (2) a list of technical changes needed (e.g., schema type, heading restructure), and (3) a content update plan.

5. Link Building and Backlink Profile Hygiene: The Risk-Aware Approach

Link building remains a significant ranking factor, but the tactics used to acquire links have evolved dramatically. Black-hat methods—private blog networks (PBNs), automated link exchanges, paid links on low-quality directories—still exist, but they carry substantial risk. Algorithmic actions or manual penalties can severely impact organic progress.

Checklist Step 5: Audit Your Backlink Profile Before Building

  • Use a backlink analysis tool (Ahrefs, Majestic, Moz) to review your current link profile. Look for:
      ◦ Spam score: a high percentage of links from irrelevant, low-authority domains.
      ◦ Anchor text distribution: unnatural over-optimization (e.g., a high percentage of exact-match commercial anchors). A quick distribution check is sketched after this list.
      ◦ Link velocity: a sudden spike in links from new domains (a classic PBN pattern).
  • Disavow only the most egregious links—those from detected PBNs or sites that violate Google’s guidelines. Do not disavow links from legitimate but low-quality sites; Google largely ignores them.
  • For new link building, focus on:
      ◦ Digital PR: creating newsworthy data or stories that earn links from reputable publications.
      ◦ Guest content on authoritative, relevant sites (with clear editorial value, not just a backlink).
      ◦ Broken link building: finding broken resources on high-authority sites and offering your content as a replacement.
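
To gauge anchor text distribution, a few lines of scripting over a backlink export is enough for a first pass. This sketch assumes a CSV export with an `Anchor` column (column names vary by tool; Ahrefs, Majestic, and Moz label their exports differently), and the commercial-term list is a placeholder you would adapt to your own keyword set.

```python
import csv
from collections import Counter

COMMERCIAL_TERMS = {"buy", "cheap", "best price"}  # placeholder terms

anchors = Counter()
with open("backlinks.csv", newline="") as f:  # placeholder path for your tool's export
    for row in csv.DictReader(f):
        anchors[row["Anchor"].strip().lower()] += 1

total = sum(anchors.values())
# Substring match is a rough proxy for commercial-leaning anchors
commercial = sum(n for text, n in anchors.items()
                 if any(term in text for term in COMMERCIAL_TERMS))

if total:
    print(f"{total} links, {commercial / total:.1%} commercial-leaning anchors")
    print("Top anchors:", anchors.most_common(10))
```
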
Risk Callout: The “Safe” Black-Hat Myth
No link building method that violates Google’s guidelines is known to be safe. Google’s spam detection systems are increasingly adept at identifying unnatural patterns. If an agency promises “guaranteed first-page rankings” or “instant SEO results” through link building, that is a red flag. Sustainable link building is slow, requires relationship building, and produces results over months, not days.

Action for the Agency Brief: Ask your agency for a link building strategy document that includes: (1) a current profile audit, (2) target domains by relevance and authority, (3) the outreach methodology (e.g., digital PR, guest posting with editorial standards), and (4) a risk assessment of any aggressive tactics.

6. Measuring Progress: KPIs That Matter (and Those That Don’t)

Finally, an agency’s value is measured by outcomes, not activity. Track these metrics to evaluate the effectiveness of your technical SEO program.

Checklist Step 6: Define and Track Meaningful KPIs

  • Indexation rate: (indexed URLs / submitted URLs) × 100. A healthy rate is 80%+ for most sites; a worked computation of these ratios follows this list.
  • Crawl efficiency: (useful pages crawled / total pages crawled) × 100. Target 70%+.
  • Core Web Vitals pass rate: percentage of URLs in the “Good” category in CrUX. Aim for 80%+.
  • Organic traffic to high-value pages: not just total organic traffic, but traffic to pages that drive conversions.
  • Keyword ranking stability: a sudden drop in rankings often signals a technical issue (e.g., a site migration error, a server outage, a penalty).
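
As a worked example of the first three ratios, with illustrative counts rather than benchmarks from any real site:

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage, guarding against empty denominators."""
    return 100 * part / whole if whole else 0.0

# Illustrative counts -- substitute your own GSC and log-file numbers
submitted, indexed = 12_000, 10_200
crawled_total, crawled_useful = 48_000, 31_000
cwv_urls, cwv_good = 500, 420

print(f"Indexation rate: {pct(indexed, submitted):.0f}%")              # 85% -> healthy
print(f"Crawl efficiency: {pct(crawled_useful, crawled_total):.0f}%")  # ~65% -> below target
print(f"CWV pass rate: {pct(cwv_good, cwv_urls):.0f}%")                # 84% -> on target
```
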
Table: KPI Targets for Technical SEO

| Metric | Baseline (Poor) | Good | Excellent |
| --- | --- | --- | --- |
| Indexation rate | <60% | 80% | 95%+ |
| Crawl efficiency | <50% | 70% | 90%+ |
| CWV pass rate (field) | <50% | 80% | 95%+ |
| Organic traffic growth (YoY) | Negative or flat | 10-20% | 30%+ |

Action for the Agency Brief: Include a reporting schedule in your contract. Monthly reports should show these KPIs with trend lines, not just raw numbers. Quarterly reports should include a full technical audit re-run.

Summary: From Checklist to Continuous Improvement

Technical SEO is not a one-time project; it is a continuous cycle of audit, fix, monitor, and iterate. The checklist above provides a structured approach for both briefing an agency and conducting your own internal reviews. Remember: the goal is not to achieve a perfect score on every metric, but to systematically remove friction from the user and search engine experience. A site that loads fast, is easy to crawl, and aligns content with intent can outperform a technically flawed competitor—even if that competitor has more content or more backlinks.

For further reading, explore our guides on technical SEO audits and Core Web Vitals optimization.

Russell Le

Senior SEO Analyst

Russell specializes in data-driven SEO strategy and competitive analysis. He helps businesses align search performance with business goals.
