The Technical SEO & Site Health Checklist: How to Brief an Agency and Audit Your Own Performance
Search engines reward sites that load fast, render without layout shifts, and expose a clear content hierarchy. Yet many organizations treat technical SEO as a one-time fix—a crawl error cleanup, a sitemap submission—and then move on. In reality, technical SEO is a continuous process of monitoring, diagnosing, and optimizing how search bots interact with your infrastructure. This checklist is designed for two audiences: marketing leads who need to brief an SEO agency with precision, and in-house practitioners who want to run their own site health audit without relying on third-party promises. We will walk through the essential checks, from crawl budget management to Core Web Vitals, and highlight what can go wrong when shortcuts are taken.
1. Crawl Budget & Robots.txt: Controlling the Bot’s Path
Every site has a finite crawl budget—the number of URLs a search engine will crawl within a given time window. For large sites (10,000+ pages), inefficient crawling wastes resources on low-value pages (tag archives, session IDs, paginated filters) while leaving important content unindexed. The first step in any technical audit is understanding how Googlebot allocates its time across your domain.
What to check:
- robots.txt file: Ensure it does not block critical resources (CSS, JavaScript, images) that Google needs to render the page. A common mistake is `Disallow: /wp-admin/` (fine) but accidentally blocking `/wp-content/themes/` (problematic). Check the robots.txt report in Google Search Console; the standalone robots.txt Tester tool has been retired.
- Crawl rate settings: Verify that crawling is not artificially throttled unless you have server bandwidth constraints; a cap set too low delays indexation of new content. Note that Google has deprecated the legacy crawl rate limiter in Search Console, so throttling now happens via server response codes (e.g., 503/429) or hosting-level rate limits.
- Log file analysis: If you have access to server logs, check which URLs Googlebot actually requests. If 60% of hits land on parameter-based filter pages, you have a crawl waste problem. An agency should provide a crawl budget report as part of the initial technical audit.
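The log-file check above can be sketched as a short script. This is a minimal sketch that assumes combined-format access logs and a simple user-agent match; a production analysis should also verify Googlebot via reverse DNS, since the user-agent string can be spoofed.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches the request path and user agent in a combined-format log line.
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" \d+ \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def crawl_waste_report(log_lines):
    """Count Googlebot requests, split into parameter URLs vs. clean URLs."""
    clean, params = Counter(), Counter()
    for line in log_lines:
        m = LOG_RE.search(line)
        if not m or "Googlebot" not in m.group("ua"):
            continue  # skip unparsable lines and non-Googlebot traffic
        path = m.group("path")
        bucket = params if urlparse(path).query else clean
        bucket[path] += 1
    total = sum(clean.values()) + sum(params.values())
    waste_pct = 100 * sum(params.values()) / total if total else 0.0
    return {
        "total": total,
        "param_hits": sum(params.values()),
        "waste_pct": waste_pct,
        "top_param_urls": params.most_common(5),  # worst crawl-waste offenders
    }
```

If `waste_pct` lands anywhere near the 60% figure mentioned above, parameter handling should move to the top of the audit's priority list.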
2. XML Sitemaps & Index Coverage: The Blueprint for Discovery
An XML sitemap is not a magic wand for ranking, but it is the most reliable mechanism for telling search engines which URLs you consider canonical and important. A well-structured sitemap should contain only indexable URLs (200 HTTP status, no `noindex`, no canonical pointing elsewhere) and be updated whenever new content is published.
Sitemap checklist:
| Check | Action | Tool/Method |
|---|---|---|
| Valid format | XML with UTF-8 encoding; no more than 50,000 URLs or 50 MB uncompressed per file | Online validator or browser parse |
| Lastmod accuracy | `lastmod` should reflect actual content changes, not CMS timestamps | Compare with page publish dates |
| No broken URLs | Every listed URL returns 200 status | Screaming Frog or custom script |
| Priority & changefreq | These are hints, not commands; use sparingly | Remove if uncertain |
| Sitemap index | If you have multiple sitemaps, submit a parent sitemap index | Google Search Console |
Common mistake: Including paginated pages (e.g., `/category/page/2/`) in the sitemap. These should be `noindex` or excluded, as they dilute the crawl budget and can confuse canonical signals. A competent agency will flag this during the audit.
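The format and pagination checks above can be automated with a small linter. This is an illustrative sketch using only the standard library: it enforces the 50,000-URL limit and flags WordPress-style `/page/N/` URLs (an assumed pattern; adjust the regex to your CMS). Verifying that every URL actually returns 200 still requires a crawler such as Screaming Frog.

```python
import re
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
PAGINATED = re.compile(r"/page/\d+/?$")  # assumption: WordPress-style pagination

def lint_sitemap(xml_text):
    """Return a list of human-readable problems found in a sitemap string."""
    root = ET.fromstring(xml_text)
    locs = [el.text.strip() for el in root.findall(".//sm:loc", NS)]
    problems = []
    if len(locs) > 50000:
        problems.append(f"{len(locs)} URLs exceeds the 50,000-per-file limit")
    for url in locs:
        if PAGINATED.search(url):
            problems.append(f"paginated URL in sitemap: {url}")
        if not url.startswith("https://"):
            problems.append(f"non-HTTPS URL: {url}")
    return problems
```

Running this against each child sitemap in a sitemap index before submission catches the most common structural mistakes early.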
3. Canonical Tags & Duplicate Content: Signal the Preferred Version
Duplicate content is not a penalty—it is a confusion signal. When Google encounters identical or near-identical content on multiple URLs, it must choose which to show in search results. If it picks the wrong one, you lose traffic. Canonical tags (`rel="canonical"`) are your primary tool for telling the search engine which URL is the authoritative version.

How to audit canonical implementation:
- Self-referencing canonicals: Every page should have a canonical tag pointing to itself unless it is a syndicated copy or a printer-friendly version. A missing self-referencing canonical is a common oversight on e-commerce product pages.
- Cross-domain canonicals: If you republish content on Medium or LinkedIn, use a canonical tag pointing back to your original. This passes link equity to your domain rather than splitting it.
- Parameter handling: For URLs with tracking parameters (`?utm_source=...`), ensure the canonical tag strips them. Google Search Console’s legacy URL Parameters tool has been retired, so canonical tags and consistent internal linking are now your main levers for controlling how parameters are handled.
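The self-referencing and parameter-stripping checks can be combined into one stdlib-only audit function. This is a minimal sketch; the tracking-parameter list is an assumption you should adapt to your analytics setup, and it ignores edge cases like relative canonical hrefs.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

# Assumption: the tracking parameters your stack appends to URLs.
TRACKING_PREFIXES = ("utm_", "gclid", "fbclid")

class CanonicalFinder(HTMLParser):
    """Extract the href of the first <link rel="canonical"> tag."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "canonical" and self.canonical is None:
            self.canonical = a.get("href")

def strip_tracking(url):
    """Remove known tracking parameters from a URL."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query)
            if not k.startswith(TRACKING_PREFIXES)]
    return urlunparse(parts._replace(query=urlencode(kept)))

def check_canonical(html, requested_url):
    """Return (canonical_href, is_self_referencing_after_param_strip)."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return None, False
    return finder.canonical, finder.canonical == strip_tracking(requested_url)
```

A page fetched with UTM parameters should still report `True` here; a `None` canonical on a product page is exactly the oversight described above.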
4. Core Web Vitals & Site Performance: The User Experience Gate
Core Web Vitals (LCP, INP, CLS) became ranking factors in 2021, and INP replaced FID as the responsiveness metric in March 2024. Their real value, however, is in user retention: a site that loads slowly or shifts layout while the user tries to click a button will see higher bounce rates regardless of rankings. Technical SEO now includes performance optimization as a core discipline.
Key metrics to monitor:
| Metric | Target | What It Measures |
|---|---|---|
| Largest Contentful Paint (LCP) | ≤ 2.5 seconds | Loading speed of the main content element |
| Interaction to Next Paint (INP) | ≤ 200 ms | Responsiveness to user clicks/taps |
| Cumulative Layout Shift (CLS) | ≤ 0.1 | Visual stability during load |
Practical steps for improvement:
- Image optimization: Serve next-gen formats (WebP, AVIF), lazy-load below-the-fold images, and set explicit width/height attributes to prevent CLS.
- Server response time: Use a CDN, enable caching, and keep Time to First Byte (TTFB) under 800 ms. For dynamic sites, consider server-side rendering or static generation.
- JavaScript minification: Remove unused code, defer non-critical scripts, and avoid long tasks that block the main thread.
5. On-Page Optimization & Intent Mapping: Beyond Meta Tags
On-page optimization has evolved from stuffing keywords into title tags to aligning content with search intent. A page targeting "best running shoes for flat feet" must answer the user’s need to compare products, not just list features. The technical layer supports this by ensuring the page structure is crawlable and semantically clear.
On-page audit checklist:
- Title tag & meta description: Unique for each page, within character limits, and including the primary keyword naturally. Avoid duplicate titles across product variants.
- Heading hierarchy: One `H1` per page, followed by `H2` and `H3` for sub-sections. The `H1` should match the page’s primary topic and ideally include the target keyword.
- Internal linking: Link to relevant pillar pages and related articles. Use descriptive anchor text (not "click here") and avoid linking to the same target multiple times on one page.
- Schema markup: Implement structured data appropriate to the content type (Article, Product, FAQ, BreadcrumbList). Test with Google’s Rich Results Test.
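Two of the checks above, a single `H1` and a non-empty title within display length, can be scripted with the standard library. This is a sketch: the 60-character title limit is a common rule of thumb rather than an official cutoff, since Google truncates by pixel width, not character count.

```python
from html.parser import HTMLParser

TITLE_MAX = 60  # assumption: rule-of-thumb display length, not an official limit

class OnPageAudit(HTMLParser):
    """Collect the <title> text and count <h1> tags on a page."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1_count = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def audit_page(html):
    """Return a list of on-page issues for one HTML document."""
    p = OnPageAudit()
    p.feed(html)
    title = p.title.strip()
    issues = []
    if not title:
        issues.append("missing <title>")
    elif len(title) > TITLE_MAX:
        issues.append(f"title is {len(title)} chars (> {TITLE_MAX})")
    if p.h1_count != 1:
        issues.append(f"expected exactly one <h1>, found {p.h1_count}")
    return issues
```

Run this across a crawl export and an empty list per page means the structural basics are in place; duplicate-title detection across pages then becomes a simple grouping step.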

6. Link Building & Backlink Profile: Quality Over Quantity
Link building remains a significant ranking factor, but the landscape has shifted toward editorial relevance and away from mass directory submissions. A healthy backlink profile shows a natural distribution of link types (editorial, guest posts, resource pages, mentions) and a low ratio of toxic links.
How to brief a link building campaign:
- Define your target audience: Ask the agency to map the types of sites your ideal customers read. For a B2B SaaS product, that might be industry blogs, comparison sites, and trade publications.
- Set quality thresholds: Prioritize relevance over arbitrary metrics. A link from a high-authority site that is unrelated to your niche may be less valuable than a link from a smaller, relevant site. Avoid links from sites with spammy characteristics.
- Avoid PBNs and paid links: Private Blog Networks (PBNs) are sites created solely to pass link equity. Google’s algorithms detect patterns like identical IP ranges, similar themes, and unnatural anchor text distribution. If an agency promises an unusually high volume of links in a short time, they are likely using PBNs or link farms—both can lead to a manual penalty.
- Toxic link detection: Use Majestic to compare Trust Flow (TF) against Citation Flow (CF); a large discrepancy (e.g., CF 50, TF 5) indicates spammy links. Ahrefs does not report TF/CF, but its Domain Rating combined with a manual review of referring domains serves a similar purpose.
- Disavow file: Only submit a disavow file if you have confirmed spam links via a manual action notice in Search Console. Proactive disavow is rarely necessary and can harm good links if done incorrectly.
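The TF/CF discrepancy check can be sketched over a domain list exported from a backlink tool. The ratio and minimum-CF thresholds below are illustrative assumptions, not industry standards; treat flagged domains as candidates for manual review, not automatic disavowal.

```python
def flag_suspect_domains(domains, ratio_threshold=4.0, min_cf=20):
    """Flag referring domains whose Citation Flow far outstrips Trust Flow.

    `domains` is a list of dicts with "domain", "tf", and "cf" keys, as
    exported from a backlink tool. Thresholds are illustrative defaults.
    """
    flagged = []
    for d in domains:
        tf = max(d["tf"], 1)  # avoid division by zero on TF = 0
        if d["cf"] >= min_cf and d["cf"] / tf >= ratio_threshold:
            flagged.append(d["domain"])
    return flagged
```

The `min_cf` floor keeps tiny, harmless domains out of the report so the manual review stays focused on links that actually carry weight.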
7. Content Strategy & Duplicate Content Prevention
Content strategy sits at the intersection of keyword research, intent mapping, and technical execution. Without a plan, you risk producing pages that compete against each other (keyword cannibalization) or that add no unique value.
Content audit checklist:
- Identify duplicate content: Use a tool like Screaming Frog to find pages with identical or near-identical meta descriptions, titles, or body text. For thin content (fewer than 300 words), consider merging into a longer, more authoritative page.
- Consolidate weak pages: If you have 10 blog posts on "SEO tips for beginners," combine them into one comprehensive guide and 301-redirect the old URLs. This consolidates link equity and improves user experience.
- Keyword cannibalization: Check whether multiple pages target the same primary keyword. Use a spreadsheet to map each page to its primary and secondary keywords. If two pages target the same term, decide which one should rank, then either 301-redirect the weaker page into it or re-target the weaker page to a distinct keyword.
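The spreadsheet-mapping step above reduces to grouping pages by primary keyword. A sketch, assuming a simple URL-to-keyword mapping exported from that spreadsheet:

```python
from collections import defaultdict

def find_cannibalization(page_keywords):
    """Group pages by primary keyword; return keywords targeted by 2+ pages.

    `page_keywords` maps URL -> primary keyword, as in the audit spreadsheet.
    Keywords are normalized (trimmed, lowercased) before grouping.
    """
    by_keyword = defaultdict(list)
    for url, keyword in page_keywords.items():
        by_keyword[keyword.strip().lower()].append(url)
    return {kw: urls for kw, urls in by_keyword.items() if len(urls) > 1}
```

Each returned keyword represents one consolidation decision: pick the page that should rank and redirect or re-target the rest.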
8. Risk Awareness: What Can Go Wrong
Technical SEO is not without risks. Poorly executed changes can cause traffic drops, deindexation, or manual penalties. Here are the most common pitfalls and how to avoid them.
| Risk | Cause | Mitigation |
|---|---|---|
| Traffic loss after site migration | Missing 301 redirects, changed URL structure | Create a redirect map before launch, test in staging |
| Manual penalty for unnatural links | Paid links, PBNs, over-optimized anchor text | Regular backlink audits, avoid link schemes |
| Deindexation after canonical error | Canonical pointing to non-existent or wrong URL | Test all canonicals with URL Inspection tool |
| Slow Core Web Vitals after redesign | Heavy scripts, uncompressed images, no lazy loading | Perform performance budget check before go-live |
| Duplicate content from pagination | Thin, overlapping paginated pages with no canonical strategy | Use a `view-all` page where feasible, or `noindex` deep paginated pages; note that Google no longer uses `rel="next"/"prev"` as an indexing signal |
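The redirect-map mitigation in the first row of the table can be pre-validated before launch. A sketch that detects chains and loops in an old-to-new URL map, assuming one-to-one path mappings:

```python
def validate_redirect_map(redirects):
    """Detect chains and loops in an old-URL -> new-URL redirect map.

    Chains (A -> B -> C) waste crawl budget and dilute signals;
    loops (A -> B -> A) break pages entirely.
    """
    problems = []
    for src, dst in redirects.items():
        seen = {src}
        current = dst
        hops = 0
        while current in redirects:
            hops += 1
            if current in seen:
                problems.append(f"loop starting at {src}")
                break
            seen.add(current)
            current = redirects[current]
        else:
            # Loop exited normally: `current` is the final destination.
            if hops >= 1:
                problems.append(f"chain: {src} takes {hops + 1} hops to reach {current}")
    return problems
```

Flattening every chain so each old URL 301s directly to its final destination is a one-line fix per entry once this report identifies them.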
Final word of caution: No agency can guarantee first-page rankings. Any agency that does is either lying or using black-hat techniques that will eventually catch up with you. The role of technical SEO is to remove barriers between your content and the search engine, not to manipulate the algorithm. Focus on site health, user experience, and quality content—the rankings will follow.
