Crawl Budget for E-commerce: Stop Wasting Googlebot on Low-Value URLs
E-commerce sites can create thousands of low-value URLs. This guide shows how to protect crawl budget and keep important pages discoverable.
Run a fresh DomainLens audit and use the report as your priority list.
Why e-commerce crawl budget gets messy
E-commerce sites generate URL combinations quickly: filters, sorting, pagination, variants, tracking parameters, search results, and out-of-stock pages. Many of those URLs are useful to shoppers but have little value as search landing pages.
The goal is not to block everything. The goal is to help crawlers spend more time on indexable categories, product pages, buying guides, and pages that can actually earn traffic.
Where waste usually comes from
- Faceted navigation creates many near-duplicate filter URLs.
- Sort parameters and tracking parameters create crawlable copies.
- Out-of-stock or discontinued products stay indexable without a plan.
- Internal links point to redirected, canonicalized, or noindex URLs.
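One way to see how many crawlable copies sort and tracking parameters create is to normalize a crawl export and group the URLs that collapse to the same address. The sketch below is a minimal illustration, not a definitive implementation: the parameter names in `LOW_VALUE_PARAMS` and the `shop.example` URLs are assumptions, so replace them with what your platform actually emits.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; adjust to the ones your store uses.
LOW_VALUE_PARAMS = {
    "sort", "order", "utm_source", "utm_medium", "utm_campaign", "gclid", "fbclid",
}


def normalize(url: str) -> str:
    """Drop sort/tracking parameters and sort the rest so copies compare equal."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in LOW_VALUE_PARAMS
    ]
    kept.sort()
    # The fragment is dropped because crawlers do not treat it as a separate URL.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_duplicates(urls):
    """Return {normalized_url: [variants]} for every URL with more than one variant."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {base: variants for base, variants in groups.items() if len(variants) > 1}


if __name__ == "__main__":
    crawl = [
        "https://shop.example/shoes",
        "https://shop.example/shoes?sort=price",
        "https://shop.example/shoes?utm_source=mail&sort=new",
        "https://shop.example/shoes?color=red",
    ]
    for base, variants in group_duplicates(crawl).items():
        print(base, len(variants))
```

Large groups point at the templates where parameter handling or internal linking needs attention first.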
How to clean it up
Decide which filtered URLs deserve indexation before changing directives. Valuable landing pages should get clean internal links, self-canonical tags, and unique content. Low-value combinations should usually be canonicalized, noindexed, or blocked from crawl depending on the case. Pick one treatment per URL pattern: a crawler that is blocked from fetching a page never sees a canonical or noindex tag on it.
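Writing the decision down as a small rule set keeps it consistent across templates. The sketch below is one possible policy, not a recommendation for every store: the allowlist in `INDEXABLE_FILTERS`, the parameter names, and the `MAX_CRAWLABLE_FILTERS` threshold are all hypothetical and should come from your own search-demand and inventory data.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical allowlist of single-filter pages with real search demand.
INDEXABLE_FILTERS = {("color", "black"), ("brand", "acme")}
# Hypothetical parameters that never change what the page is about.
IGNORABLE_PARAMS = {"sort", "utm_source", "utm_medium", "utm_campaign"}
# Assumed threshold: deeper filter stacks are kept out of the crawl entirely.
MAX_CRAWLABLE_FILTERS = 2


def directive_for(url: str) -> str:
    """Return 'index', 'canonicalize', 'noindex', or 'disallow' for a category URL."""
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    filters = [
        (key.lower(), value.lower())
        for key, value in params
        if key.lower() not in IGNORABLE_PARAMS
    ]
    if not params:
        return "index"  # clean category or product URL
    if len(filters) > MAX_CRAWLABLE_FILTERS:
        return "disallow"  # deep combination: block from crawl
    if len(filters) < len(params):
        return "canonicalize"  # sort/tracking copy: point at the clean URL
    if len(filters) == 1 and filters[0] in INDEXABLE_FILTERS:
        return "index"  # approved landing page, self-canonical
    return "noindex"  # useful to shoppers, not a search landing page


if __name__ == "__main__":
    for example in (
        "https://shop.example/shoes",
        "https://shop.example/shoes?color=black",
        "https://shop.example/shoes?color=black&sort=price",
        "https://shop.example/shoes?color=red",
        "https://shop.example/shoes?color=red&size=9&brand=acme",
    ):
        print(directive_for(example), example)
```

Each URL gets exactly one outcome, which avoids stacking a crawl block on top of a tag the crawler would need to fetch the page to read.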
Keep XML sitemaps focused on canonical URLs that return a 200 status. Link directly to final product and category URLs, not parameterized versions. For discontinued products, redirect only when there is a strong replacement; otherwise return a 404 page that points shoppers to alternatives, or keep a helpful archive page live but noindexed.
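The sitemap rule can be applied mechanically to a crawl export. The sketch below assumes each page is a dictionary with hypothetical `url`, `status`, `canonical`, and `noindex` fields (your crawler's export will name them differently) and keeps only self-canonical, indexable pages that return 200.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_urls(pages):
    """Keep pages that return 200, are not noindexed, and are self-canonical."""
    return [
        page["url"]
        for page in pages
        if page["status"] == 200
        and not page.get("noindex", False)
        and page.get("canonical") in (None, page["url"])
    ]


def build_sitemap(urls):
    """Serialize the URLs as a minimal <urlset> document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical crawl records for illustration only.
    pages = [
        {"url": "https://shop.example/shoes", "status": 200,
         "canonical": "https://shop.example/shoes"},
        {"url": "https://shop.example/shoes?sort=price", "status": 200,
         "canonical": "https://shop.example/shoes"},
        {"url": "https://shop.example/old-boot", "status": 301, "canonical": None},
        {"url": "https://shop.example/archive/boot", "status": 200,
         "canonical": None, "noindex": True},
    ]
    print(build_sitemap(sitemap_urls(pages)))
```

Running the same filter over the URLs used in internal links is a quick way to find links that point at redirected, canonicalized, or noindexed pages.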
How to monitor progress
Use DomainLens to catch crawlability, canonical, sitemap, redirect, and internal-link problems on representative templates. Then use Search Console and server logs to see whether Googlebot is spending less time on junk and more time on commercial pages.
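For the log side, a rough first measure is the share of Googlebot requests that land on parameterized URLs. The sketch below assumes access logs in the common "combined" format and treats any path containing `?` as parameterized; both are simplifications. It also trusts the user-agent string, which can be spoofed, so verify crawler IPs separately before relying on the numbers.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer, and user-agent fields of a
# combined-format access log line (an assumption about your log layout).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_crawl_share(lines):
    """Return the fraction of Googlebot hits on clean vs parameterized paths."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue
        bucket = "parameterized" if "?" in match.group("path") else "clean"
        counts[bucket] += 1
    total = sum(counts.values())
    return {bucket: hits / total for bucket, hits in counts.items()} if total else {}


if __name__ == "__main__":
    bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    # Fabricated sample lines for illustration only.
    sample = [
        f'66.249.66.1 - - [10/Oct/2024:13:55:36 +0000] "GET /shoes?sort=price HTTP/1.1" 200 5120 "-" "{bot}"',
        f'66.249.66.1 - - [10/Oct/2024:13:55:40 +0000] "GET /shoes HTTP/1.1" 200 5120 "-" "{bot}"',
        '203.0.113.9 - - [10/Oct/2024:13:56:01 +0000] "GET /cart?id=1 HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
    ]
    print(googlebot_crawl_share(sample))
```

Tracking this ratio week over week, split by template if the paths allow it, shows whether the cleanup is shifting crawl activity toward category and product pages.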
Revisit the rules after merchandising changes. A filter that was low value last season may become a valid landing page when search demand or inventory changes.