How to Find and Fix Broken Links on Any Website
Broken links are digital dead ends. They frustrate users, waste crawl budget, and dilute link equity. Here’s a no-nonsense guide to find and fix them.
In this article
- Why You Should Even Bother Finding Broken Links
- The Best Tools to Find Broken Links (Yes, We’re Biased)
- A Step-by-Step Guide to Find Broken Links with a Crawler
- How to Prioritize Your Broken Link Fixes (Because Not All 404s Are Equal)
- Fixing Broken Links: Your Options (and Our Opinions)
- Automating Broken Link Detection (Because Manual Checks are for Masochists)
Why You Should Even Bother Finding Broken Links
Let’s be direct. Broken links are a sign of neglect. They tell users and search engines that you’re not paying attention to the details, and in the world of technical SEO, details are everything.
A broken link (typically a 404 Not Found error) creates a jarring user experience. A user clicks a link expecting information and hits a wall. Most will simply leave, increasing your bounce rate and signaling low quality to search engines.
For search engine crawlers, it’s a similar story. Every 404 is a wasted resource. Googlebot has a finite amount of time to spend on your site—your crawl budget. Sending it down dead-end corridors is inefficient and can lead to important pages being crawled less frequently.
Finally, there’s link equity. Internal and external links pass authority (PageRank) between pages. A link pointing to a 404 page is an open faucet, pouring that valuable equity down the drain. Finding broken links isn’t just housekeeping; it’s about preserving your site’s authority and crawl efficiency.
The Best Tools to Find Broken Links (Yes, We’re Biased)
You have options when you need to find broken links, but they are not created equal. Let’s break down the hierarchy of effectiveness.
Google Search Console: GSC will report some ‘Not Found (404)’ errors under the ‘Pages’ report. It’s free, which is nice. However, it’s often slow to update, provides incomplete data, and gives you little control over the discovery process. It’s a lagging indicator, not a proactive tool.
Browser Extensions: There are dozens of Chrome extensions that check for broken links. These are fine for spot-checking a single page. If you want to audit an entire 50,000-page e-commerce site, good luck. Your browser will crash long before you get any useful data.
Desktop SEO Crawlers: This is the only serious option for a comprehensive audit. A dedicated crawler like ScreamingCAT (or others, if you must) will systematically request every URL on your site, following links just like Googlebot. It gives you complete, real-time data on every internal and external link, their status codes, and where they’re located.
Because ScreamingCAT is built in Rust, it’s ridiculously fast and memory-efficient. You can crawl millions of pages without your machine breaking a sweat. For a task as fundamental as finding broken links, you need a tool built for the job.
A Step-by-Step Guide to Find Broken Links with a Crawler
Alright, enough theory. Let’s get practical. Here is the exact process to find every broken link on a website using a crawler. The interface is simple, so this won’t take long.
First, you need to configure the crawl. By default, most crawlers focus on internal links. To find broken external links (links from your site to other sites), you need to enable that setting. In ScreamingCAT, this is under `Configuration > Spider > Crawl External Links`.
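If you drive the crawler from a saved configuration file instead of the UI, the same toggle lives there. The snippet below is a hypothetical sketch of what such a file might contain; the key names are assumptions for illustration, not ScreamingCAT’s actual schema:

```toml
# Hypothetical crawler configuration -- key names are illustrative only
[spider]
crawl_external_links = true   # also check links pointing off-site
max_depth = 10                # stop following links beyond this depth

[output]
folder = "/path/to/reports"
```

A saved config like this is also what makes scheduled, headless crawls reproducible later on.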
Once configured, enter your root domain in the crawl bar and hit ‘Start’. The crawler will begin fetching URLs and following links. You can watch the data populate in real-time, but for a large site, this is a good time to grab a coffee.
When the crawl is complete, the real work begins. Navigate to the ‘Response Codes’ tab. This is your ground zero. Here, you’ll find a summary of every status code encountered during the crawl. We’re interested in the ‘Client Error (4xx)’ bucket. Click on it.
This view shows you every URL that returned a 4xx error. Now, select one of these broken URLs. To find out where this broken link exists on your site, click the ‘Inlinks’ tab at the bottom of the window. This panel shows you every single page (`From`) that links to the broken URL (`To`), including the anchor text used. This is the critical information you need for the fix.
Finally, export the data. You don’t want to live in the UI. Export the ‘Client Error (4xx)’ report, making sure to include the ‘Inlinks’ data. Most crawlers let you do this in a few clicks.
- Step 1: Configure Crawl: Ensure ‘Crawl External Links’ is enabled.
- Step 2: Run Crawl: Enter your domain and start the crawl.
- Step 3: Filter Results: Navigate to the ‘Response Codes’ tab and select ‘Client Error (4xx)’.
- Step 4: Identify Source: Select a broken URL and view the ‘Inlinks’ tab to find all source pages.
- Step 5: Export: Export the list of 4xx URLs with their corresponding inlink sources for remediation.
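If you’re curious what a crawler is doing under the hood, the steps above can be sketched in a few dozen lines of Python. This is a toy illustration, not a substitute for a real crawler: it takes a `fetch(url) -> (status, html)` function, walks internal links breadth-first, and records every 4xx URL together with the pages that link to it (the ‘Inlinks’ view from Step 4):

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def find_broken_links(start_url, fetch):
    """Breadth-first crawl from start_url using fetch(url) -> (status, html).

    Returns {broken_url: [source_pages...]} for every 4xx response found."""
    domain = urlparse(start_url).netloc
    seen, queue = {start_url}, deque([start_url])
    inlinks = {}  # broken URL -> pages linking to it
    while queue:
        page = queue.popleft()
        status, html = fetch(page)
        if status >= 400:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            url = urljoin(page, href)
            target_status, _ = fetch(url)
            if 400 <= target_status < 500:
                inlinks.setdefault(url, []).append(page)
            elif urlparse(url).netloc == domain and url not in seen:
                # Only follow internal links; external ones are checked, not crawled.
                seen.add(url)
                queue.append(url)
    return inlinks
```

Injecting `fetch` keeps the sketch testable against a fake site; a real implementation would use a proper HTTP client, HEAD requests, and response caching to avoid fetching the same URL twice.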
How to Prioritize Your Broken Link Fixes (Because Not All 404s Are Equal)
Your export might contain hundreds or even thousands of broken links. Don’t panic, and don’t start fixing them from top to bottom. You need to prioritize.
Internal vs. External: Broken internal links should almost always be your first priority. These are links entirely within your control and have a more direct impact on user navigation and internal link equity distribution. A broken external link is a bad user experience, but a broken internal link is a structural flaw.
Inlink Count: This is your most powerful prioritization metric. A broken URL with 1,000 internal links pointing to it (e.g., a broken link in your main navigation) is a critical issue. A broken link on a single, low-traffic blog post from 2012 is a low-priority task. Your crawler export should show the inlink count for each broken URL; sort by this column in descending order.
Page Importance: Consider the importance of the page containing the broken link. A broken link on your homepage or a primary service page is more damaging than one on an obscure privacy policy page. Cross-reference your broken link report with analytics data to see which source pages get the most traffic.
Pro Tip: Combine metrics for ruthless prioritization. Create a score based on Inlink Count × Source Page Traffic. This will immediately surface the most impactful fixes.
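As a sketch of that scoring approach, here is how you might combine a crawler’s 4xx export with analytics data in Python. The row and column names are assumptions; adjust them to whatever your crawler and analytics tools actually export:

```python
def prioritize(broken_links, traffic):
    """Rank broken-link fixes by inlink count x source-page traffic.

    broken_links: rows like {"url": ..., "source": ..., "inlinks": ...},
                  as exported from a crawler's 4xx report (field names assumed).
    traffic:      dict mapping source page URL -> e.g. monthly visits.
    Returns the rows sorted by descending priority score."""
    for row in broken_links:
        # Pages with no traffic data score zero and sink to the bottom.
        row["score"] = row["inlinks"] * traffic.get(row["source"], 0)
    return sorted(broken_links, key=lambda r: r["score"], reverse=True)
```

Sorting the combined score in descending order puts the broken link in your main navigation ahead of the dead link on a 2012 blog post, exactly as the prioritization above intends.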
Fixing Broken Links: Your Options (and Our Opinions)
Once you’ve found and prioritized your list, it’s time to fix them. You have three primary weapons in your arsenal. Choose wisely.
1. Update the Link (The Obvious Fix): This is the best-case scenario, usually for typos in internal links. If a link points to `/servceis/` instead of `/services/`, the fix is simply to correct the typo in the source HTML. This is clean, efficient, and requires no server-side changes.
2. Implement a 301 Redirect (The Workhorse): If the content the link pointed to still exists but at a new URL, a permanent (301) redirect is the correct solution. This tells both users and search engines that the page has moved permanently, and it passes the vast majority of link equity to the new destination. This is a core part of any sound internal linking strategy.
Avoid using 302 (temporary) redirects unless the move is genuinely temporary. Using 302s for permanent moves can confuse search engines and delay the transfer of authority.
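For example, on an nginx server a permanent redirect for a moved page is a one-liner. The paths here are illustrative; Apache users would use a `Redirect 301` directive in `.htaccess` instead:

```nginx
# Permanently redirect the old URL to its new home,
# passing link equity to the destination.
location = /old-services/ {
    return 301 /services/;
}
```

Using `return 301` with an exact-match `location` keeps the redirect cheap to evaluate and unambiguous about which URL it applies to.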
3. Remove the Link (The Last Resort): Sometimes, the resource a link pointed to is gone for good, and there’s no relevant replacement. In this case, the best course of action is to simply remove the `<a>` tag from the source page. It’s better to have no link than a broken one. This is a common task during a comprehensive technical SEO audit.
What you should not do is redirect all your 404s to the homepage. This is a lazy, unhelpful practice that solves nothing and creates a soft 404 problem, masking the underlying issue from both users and yourself.
Automating Broken Link Detection (Because Manual Checks are for Masochists)
Finding broken links once is good. Finding them systematically before they become a problem is better. Manually running a crawl every month is a waste of your time. This is a perfect task for automation.
Most professional SEO crawlers, including ScreamingCAT, have a command-line interface (CLI). This allows you to run fully configured crawls from your server’s terminal, without ever opening the graphical user interface. You can script these crawls to run on a schedule.
For example, you can set up a weekly cron job on a Linux server to crawl your production site and automatically export a CSV of any new 4xx errors it finds. That report can then be emailed directly to your team.
Here’s what a basic command might look like for ScreamingCAT:
```shell
screamingcat --crawl https://www.example.com --config /path/to/config.toml --headless --output-folder /path/to/reports --export-csv "response_codes_client_error_4xx"
```
- `--crawl` specifies the starting URL.
- `--config` points to a saved configuration file (so you don’t have to specify settings like ‘crawl external links’ every time).
- `--headless` runs the crawl without the GUI.
- `--output-folder` defines where to save the reports.
- `--export-csv` tells the crawler which specific report to generate.
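Tying it together, the weekly schedule could be a small wrapper script invoked from cron. The paths and the mail step below are illustrative assumptions; only the `screamingcat` invocation mirrors the example above:

```shell
#!/bin/sh
# weekly_4xx_check.sh -- run by cron, e.g. with the crontab entry:
#   0 3 * * 1 /usr/local/bin/weekly_4xx_check.sh
# All paths and the mail command are illustrative.

REPORT_DIR=/path/to/reports

screamingcat --crawl https://www.example.com \
  --config /path/to/config.toml \
  --headless \
  --output-folder "$REPORT_DIR" \
  --export-csv "response_codes_client_error_4xx"

# Email the exported CSV to the team.
mailx -s "Weekly 4xx report" seo-team@example.com \
  < "$REPORT_DIR/response_codes_client_error_4xx.csv"
```

Wrapping the command in a script keeps the crontab entry to a single line and gives you one place to change paths or add steps later.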
As every smart SEO will tell you: the goal of a good technical SEO is to automate themselves out of a job, so they can focus on more strategic work. Broken link checking is low-hanging fruit for automation.
Key Takeaways
- Broken links harm user experience, waste crawl budget, and dilute link equity. Fixing them is non-negotiable.
- Desktop SEO crawlers are the most effective tools to find broken links comprehensively across an entire site.
- Prioritize fixes based on a combination of internal vs. external links, inlink count, and the importance of the source page.
- Fix broken links by updating the source, implementing a 301 redirect, or removing the link entirely. Never mass-redirect 404s to the homepage.
- Automate broken link detection using a crawler’s command-line interface and scheduled tasks (cron jobs) to proactively manage site health.
Ready to audit your site?
Download ScreamingCAT for free. No limits, no registration, no cloud dependency.