The Technical SEO & Site Health Checklist: How to Audit, Diagnose, and Fix for Superior Search Performance
You are not in the content business; you are in the indexation business. Every page you publish, every link you build, and every keyword you target is worthless if Google’s crawlers cannot reach, parse, and understand your site. Technical SEO is the infrastructure layer of organic search—the part that determines whether your content has a chance to rank. This guide walks you through a systematic, risk-aware technical audit process, from crawl budget to Core Web Vitals, and explains how to brief an agency without falling for black-hat promises.
1. Crawl Budget & Indexation: The First Gate
Before any ranking signal matters, Google must decide how many pages to crawl and which to index. Crawl budget is the allocation of Googlebot’s time and resources to your site. For small sites (under a few thousand pages), crawl budget is rarely a constraint. For large e-commerce or publishing sites, it is the critical bottleneck.
What you need to check:
- Crawl rate in Google Search Console: Look at the “Crawl stats” report. If Google is crawling hundreds of error pages or thin content, it is wasting budget.
- Log file analysis: The most precise method. Use tools like Screaming Frog Log File Analyzer or custom ELK stacks to see which URLs Googlebot actually requests. Compare that to your sitemap.
- Blocked resources: Ensure that CSS, JavaScript, and images are not blocked by `robots.txt`. A blocked stylesheet can cause Google to render pages incorrectly, leading to poor Core Web Vitals measurement.
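To make log-file analysis concrete, here is a minimal sketch of the "which URLs does Googlebot actually request?" step. It assumes Apache/Nginx combined log format (field positions vary by server config), and a production audit should also verify crawler IPs via reverse DNS, since the user-agent string alone is trivially spoofed.

```python
import re

# Matches the request, status, and final quoted field (the user-agent)
# in a combined-log-format line. Field layout is an assumption: adjust
# the pattern to your server's log format.
LOG_LINE = re.compile(
    r'"(?:GET|POST) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) .* "(?P<ua>[^"]*)"$'
)

def googlebot_hits(log_lines):
    """Return {path: request_count} for lines whose UA claims to be Googlebot."""
    hits = {}
    for line in log_lines:
        m = LOG_LINE.search(line)
        if m and "Googlebot" in m.group("ua"):
            hits[m.group("path")] = hits.get(m.group("path"), 0) + 1
    return hits
```

Diffing the keys of this dictionary against your sitemap URLs shows both wasted budget (crawled URLs you never wanted indexed) and neglect (sitemap URLs Googlebot never visits).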
2. XML Sitemaps & robots.txt: Your Crawl Instructions
Your sitemap.xml is a suggestion, not a command. Your robots.txt is a directive, not a firewall. Together, they form the instruction set for Googlebot.
Sitemap checklist:
- Include only canonical URLs (no pagination parameters, no session IDs).
- Keep under 50,000 URLs or 50 MB uncompressed per sitemap file.
- Use `<lastmod>` accurately—Google uses it to prioritize re-crawls, but incorrect timestamps erode trust.
- Submit via Google Search Console and Bing Webmaster Tools.
robots.txt checklist:
- Do not block CSS or JS files unless you have a specific performance reason.
- Use `Disallow: /` only on staging environments—and remove it before launch.
- Test your robots.txt with the robots.txt report in Search Console (the standalone Tester tool has been retired).
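As an illustration, a production robots.txt following the checklist above might look like this (the hostname and parameter names are placeholders—substitute your own crawl traps):

```
# Illustrative production robots.txt: block faceted-navigation crawl traps,
# leave CSS/JS crawlable, and declare the sitemap location.
User-agent: *
Disallow: /search?
Disallow: /*?sort=

Sitemap: https://www.example.com/sitemap.xml
```

Note that `*` wildcards in `Disallow` rules are supported by Google and Bing, but not by every crawler.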
3. Core Web Vitals: The Performance Tax
Google has confirmed Core Web Vitals as a ranking signal. In March 2024, INP (Interaction to Next Paint) replaced FID as the responsiveness metric. The thresholds are:
| Metric | Good | Needs Improvement | Poor |
|---|---|---|---|
| LCP (Largest Contentful Paint) | ≤2.5s | 2.5s–4.0s | >4.0s |
| INP (Interaction to Next Paint) | ≤200ms | 200ms–500ms | >500ms |
| CLS (Cumulative Layout Shift) | ≤0.1 | 0.1–0.25 | >0.25 |
How to improve each:
- LCP: Optimize the largest image or text block. Use next-gen formats (WebP, AVIF), lazy-load below-the-fold images, and preload hero images.
- INP: Reduce JavaScript execution time. Break up long tasks, defer non-critical scripts, and use `requestAnimationFrame` for animations.
- CLS: Set explicit width/height attributes on images and embeds. Avoid injecting content above existing content (e.g., dynamically loaded ads).
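The published thresholds (LCP ≤2.5 s good, >4.0 s poor; INP ≤200 ms good, >500 ms poor; CLS ≤0.1 good, >0.25 poor) can be encoded as a small triage helper for bucketing field data pulled from CrUX or your RUM tool—a sketch, not a full reporting pipeline:

```python
# (good, poor) boundaries per metric. Units: LCP in seconds,
# INP in milliseconds, CLS unitless.
THRESHOLDS = {
    "lcp": (2.5, 4.0),
    "inp": (200, 500),
    "cls": (0.1, 0.25),
}

def rate(metric, value):
    """Classify a field-data value as 'good', 'needs improvement', or 'poor'."""
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs improvement"
    return "poor"
```

Run it per URL group at the 75th percentile, which is the percentile Google uses for Core Web Vitals assessment.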
4. Duplicate Content & Canonicalization
Duplicate content is not a penalty—it is a dilution problem. When Google finds identical or very similar content on multiple URLs, it must choose one to rank. If it picks the wrong one, your traffic drops.
Sources of duplicate content:
- WWW vs. non-WWW
- HTTP vs. HTTPS
- Trailing slash vs. non-trailing slash
- URL parameters (sort, filter, tracking)
- Printer-friendly versions
- Paginated pages with thin content
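Most of the duplicate sources above can be collapsed by a single normalization policy. The sketch below assumes one particular policy (https, non-www, no trailing slash, a hand-picked set of tracking parameters); your canonical rules may differ, and the real enforcement belongs in 301 redirects and `rel="canonical"` tags, not application code:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters to strip; extend for your analytics stack.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "gclid", "fbclid"}

def canonicalize(url):
    """Force https + non-www, drop tracking params, strip the trailing slash."""
    _scheme, host, path, query, _fragment = urlsplit(url)
    host = host.removeprefix("www.")
    params = [(k, v) for k, v in parse_qsl(query) if k not in TRACKING_PARAMS]
    path = path.rstrip("/") or "/"          # keep "/" for the homepage
    return urlunsplit(("https", host, path, urlencode(params), ""))
```

Meaningful parameters (sort, filter) survive normalization, so faceted URLs still need a canonical tag pointing at the unfiltered page.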
Risk warning: Avoid `noindex` on pages with valuable backlinks. Google eventually stops following links on long-noindexed pages, so the equity those backlinks pass can be lost. Redirect (301) or canonicalize to a consolidated URL instead.

5. On-Page Optimization: Beyond Keywords
On-page optimization has moved beyond stuffing keywords into H1 tags. Modern on-page SEO is about semantic relevance, entity recognition, and user intent.
Checklist for each page:
- Title tag: 50–60 characters, includes primary keyword, unique per page.
- Meta description: 150–160 characters, includes primary keyword and a call-to-action.
- H1: One per page, matches the page’s core topic.
- H2–H4: Support the H1 with subtopics. Include secondary keywords naturally.
- Image alt text: Descriptive, includes keyword where relevant, but not stuffed.
- Internal links: Link to at least 2–3 relevant pages within the site. Use descriptive anchor text.
- Schema markup: Add appropriate structured data (Article, Product, FAQ, BreadcrumbList).
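Two of the checklist items—title length and H1 count—are mechanical enough to script. This sketch uses only the stdlib `html.parser`; the 50–60 character window mirrors the checklist above, and at scale you would run these checks inside a crawler rather than page by page:

```python
from html.parser import HTMLParser

class PageAudit(HTMLParser):
    """Collect the <title> text and count <h1> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1_count = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

    def issues(self):
        found = []
        if not 50 <= len(self.title) <= 60:
            found.append(f"title length {len(self.title)} outside 50-60")
        if self.h1_count != 1:
            found.append(f"expected exactly one h1, found {self.h1_count}")
        return found
```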
6. Link Building: Quality Over Quantity
Link building remains a strong ranking signal, but the rules have changed. High-quality, relevant links are generally more valuable than many low-quality directory links.
What to avoid:
- Black-hat links: Paid links, link farms, private blog networks (PBNs), automated comments. Google’s Penguin algorithm is designed to detect these patterns. Recovering from a manual penalty can take significant time.
- Exact-match anchor text: Using the same anchor text for every link looks unnatural. Vary it with branded, generic, and partial-match anchors.
- Low-authority directories: Many “SEO directories” are unlikely to provide significant ranking benefit. Only use niche-specific, vetted directories.
What to do instead:
- Content-based outreach: Create linkable assets (original research, infographics, comprehensive guides) and pitch them to relevant sites.
- Broken link building: Find broken pages in your niche, create a replacement, and contact the linking site.
- Digital PR: News stories, expert quotes, and data-driven press releases can earn natural links from high-authority news sites.
7. Technical SEO Audit: The Step-by-Step Workflow
Run a full technical audit at least quarterly, or after any major site change (redesign, migration, new CMS).
Step 1: Crawl the site
Use a tool like Screaming Frog, Sitebulb, or DeepCrawl. Crawl the entire site, including XML sitemaps and internal links.
Step 2: Check indexation
In Google Search Console, run the “Pages” report. Look for:
- Excluded by ‘noindex’ tag – verify intentional.
- Excluded by ‘robots.txt’ – verify no important pages are blocked.
- Crawled but not indexed – could be thin content, duplicate, or low-quality.
Step 3: Map redirects
Identify redirect chains (three or more hops) and loops. Collapse each chain so every old URL points to its final destination with a single 301 redirect.
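Given a crawler's export of redirects as a `{source: target}` map, chains and loops can be flagged with a short traversal—a sketch of the bookkeeping, not a substitute for a crawl tool:

```python
def resolve(redirects, url):
    """Follow redirects from url; return (final_url, hop_count, is_loop)."""
    seen, hops = {url}, 0
    while url in redirects:
        url = redirects[url]
        hops += 1
        if url in seen:          # revisited a URL: redirect loop
            return url, hops, True
        seen.add(url)
    return url, hops, False
```

Any source with `hops >= 2` is a chain worth collapsing: repoint it directly at `final_url`.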

Step 4: Review Core Web Vitals
Run the “Core Web Vitals” report in Search Console. For failing URLs, use Lighthouse or PageSpeed Insights to identify specific issues.
Step 5: Check mobile usability
The “Mobile Usability” report in Search Console has been retired; run Lighthouse or PageSpeed Insights with mobile emulation instead. Common issues: text too small, touch elements too close together, viewport not set.
Step 6: Audit internal linking
Use a tool to visualize your internal link graph. Ensure every important page has at least one internal link from another page.
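With a crawl exported as `{page: set_of_pages_it_links_to}`, orphan detection is a set difference—a sketch over a toy graph, where anything with zero inbound internal links surfaces (including the homepage in this example, which in practice is reached externally and can be excluded):

```python
def orphans(link_graph):
    """Return pages that no other crawled page links to."""
    linked_to = set()
    for targets in link_graph.values():
        linked_to |= targets
    return set(link_graph) - linked_to
```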
Step 7: Review structured data
Use the Rich Results Test tool to validate all schema markup. Fix any errors or warnings.
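For reference, a minimal Article markup block that the Rich Results Test should validate looks like this (all values are placeholders; `headline`, `datePublished`, and `author` are standard schema.org Article properties):

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example headline for the article",
  "datePublished": "2024-01-15",
  "author": {
    "@type": "Person",
    "name": "Jane Doe"
  }
}
```

Embed it in a `<script type="application/ld+json">` tag; JSON-LD is the format Google recommends over microdata.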
8. Summary: The Checklist
| Area | Action | Frequency |
|---|---|---|
| Crawl budget | Review crawl stats in Search Console | Monthly |
| Sitemap | Submit updated sitemap | After content changes |
| robots.txt | Test for blocked resources | After site changes |
| Core Web Vitals | Monitor LCP, INP, CLS | Weekly |
| Duplicate content | Check canonical tags | Quarterly |
| On-page | Audit title tags, meta descriptions | Per page |
| Link profile | Review backlinks for toxicity | Monthly |
| Technical audit | Full crawl | Quarterly |
Final note: Technical SEO is not a one-time fix. It is a continuous process of monitoring, testing, and adjusting. The sites that win are the ones that treat their technical foundation as seriously as their content and links. Start with this checklist, run your first audit, and build from there.
