Yaar, imagine this – you’ve spent hours, days, or even weeks building and tweaking your website, and Google STILL refuses to index some of your pages. It’s like sending an invite to a party, but some guests just don’t show up. Annoying, right?

If your pages are stuck in the “NON INDEXED” limbo and you’ve no idea why, don’t worry! You’re not alone. We’ve all been there. But here’s the good news: I’ve got a game plan for fixing NON INDEXED pages once and for all! 🚀

Let’s dive in, shall we?

Step 1: Crawl Your Site with Screaming Frog SEO Spider (Don’t Worry, It’s Not as Scary as it Sounds) 🐸

First things first, we need to figure out what’s going on with your site. And for that, you’re going to need the Screaming Frog SEO Spider. Sounds fancy, no? But chill, it’s a super useful tool that will help you crawl through your site like a detective finding hidden clues.

Here’s what you need to do:

  1. Open Screaming Frog and go to Configuration > Crawl Config.
  2. Adjust the settings:
    • Crawl: Make sure to follow internal nofollow links and crawl linked XML sitemaps.
    • Rendering: Set it to JavaScript (trust me, this helps with all those dynamic, fancy pages).
    • Robots.txt: Tell it to ignore robots.txt but report status (this way, no page gets blocked).
    • API Access: Connect your Google Search Console (GSC) and pick data from the past 16 months.

Step 2: Run the Crawl – Let’s Get This Show On the Road! 🚀

Once you’re all set up, click on Start Crawl. It’s like sending the tool on a little adventure to dig through your site and find any troublemakers (aka non-indexed pages). Grab a cup of chai, because this might take a few minutes.

Step 3: Export to Google Sheets – Time to Get Organized 📊

Once the crawl is done, you’ll get a nice, neat report. Export all this data into a spreadsheet. Trust me, Google Sheets will be your best friend during this process.

Step 4: Export NON INDEXED Data From GSC – The Plot Thickens 🔍

Now, hop over to your Google Search Console and export the following data for pages that Google isn’t indexing:

  • Crawled – Currently Not Indexed
  • Discovered – Currently Not Indexed
  • Duplicate – Google Chose a Different Canonical

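This is where the mystery deepens. These are pages Google knows about (and, for the “Crawled” and “Duplicate” ones, has already visited) but for some reason decided not to index. The “Discovered” ones haven’t even been crawled yet. Bhai, why Google, why?!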

Step 5: Organize Everything in Google Sheets – Keep It Neat, Keep It Tidy 📑

Now that you have the crawled data and the non-indexed data from GSC, you need to keep things clean. Create a spreadsheet with tabs for each of these categories:

  • Full Site Crawl (for the crawl data)
  • Separate tabs for each non-indexed issue (like “Crawled Not Indexed,” “Discovered Not Indexed,” etc.)

This step is crucial because when everything is in one place, it’s easier to find patterns. No one likes chaos, right?

Step 6: Run the HTTP Status Code Script – Time to Check the Status

You’re almost there, bhai! Now, let’s run a little script to check the HTTP status code for each of those non-indexed URLs. Don’t sweat it, it’s a simple process. You’ll find scripts like this in the comments section of SEO forums, or you can search for one on Google.

Once you run the script, you’ll know which pages are active (HTTP 200) and which are dead (404). These little status codes will give you a clue about what’s going wrong.
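
If you’d rather not go hunting for a script, here’s a minimal sketch of what one could look like in Python. It is not the script mentioned above, just an illustration using only the standard library. The function names (fetch_status, check_urls) and the User-Agent string are made up for this example. Two things to keep in mind: it sends HEAD requests, which some servers reject, and it follows redirects, so a redirecting URL reports the status of its final destination.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response came back."""
    try:
        # Hypothetical User-Agent string; swap in whatever identifies your checker.
        request = urllib.request.Request(
            url, method="HEAD", headers={"User-Agent": "status-check-sketch"}
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions, but still carry a code.
        return error.code
    except (OSError, ValueError):
        # DNS failure, refused connection, timeout, or a malformed URL.
        return None


def check_urls(urls, fetch=fetch_status):
    """Map each URL to its status code using the given fetch function."""
    return {url: fetch(url) for url in urls}
```

Call check_urls with the list of URLs from one of your non-indexed tabs and paste the results back into the sheet next to each URL. The fetch parameter is only there so you can plug in a different checker (say, one that uses GET instead of HEAD).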

Step 7: Use VLOOKUP to Pair Data – Time to Play Detective 🔎

This is where the magic happens. Now, pair the crawled data with the HTTP status codes using VLOOKUP. Don’t worry if that sounds like a techie word – it’s basically a way to match your data and figure out what’s going on with each URL.

Here’s a simple formula for you:

=IFERROR(VLOOKUP(A2, 'Full Site Crawl'!$A$1:$YM$145124, 2, FALSE), "-")

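What this does is take the URL in cell A2 of the tab you’re working in, look for it in the first column of your Full Site Crawl sheet, and pull the value from the second column of the matching row (the status code, if that’s what sits in column B of your export). If the URL isn’t found, you get a "-" instead of an error. Adjust the range to match the size of your own crawl, and type the quotes as plain straight quotes, otherwise Sheets will complain. It’s like connecting the dots, but in spreadsheet form!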
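
If spreadsheets aren’t your thing, the same exact-match lookup can be sketched in Python. This is only an illustration of the idea, not a replacement for the formula: build_lookup and vlookup are names invented here, and the rows are assumed to be lists with the URL in the first position, just like the sheet.

```python
def build_lookup(rows, value_column):
    """Index rows by their first cell, keeping the value from a 1-based column.

    Like VLOOKUP with FALSE (exact match), the first matching row wins.
    """
    index = {}
    for row in rows:
        if len(row) >= value_column:
            index.setdefault(row[0], row[value_column - 1])
    return index


def vlookup(key, index, default="-"):
    """Return the indexed value for key, or the default (the IFERROR part)."""
    return index.get(key, default)


# Made-up rows standing in for the Full Site Crawl tab.
crawl_rows = [
    ["https://example.com/a", 200],
    ["https://example.com/b", 404],
]
index = build_lookup(crawl_rows, 2)
print(vlookup("https://example.com/a", index))
print(vlookup("https://example.com/missing", index))
```

Building the index once and reusing it keeps things fast even when the crawl has a lot of rows.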

Step 8: Find Orphan Pages – The Ghost Pages 👻

If you notice that a page has a 200 status (active), but you can’t find it in the crawl (the VLOOKUP gives you a dash), it means Screaming Frog never reached it by following links on your site. These are called orphan pages: basically, pages that don’t have internal links pointing to them.

Bhai, no one likes an orphan. Show those pages some love!
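
The orphan check boils down to a simple comparison: live URLs that are missing from the crawl. Here’s a small sketch of that logic, assuming you have the status codes in a dictionary and the crawled URLs in a list. The name find_orphans is made up for this example, and URLs are compared as exact strings, so a trailing slash or http vs https difference will count as a different page.

```python
def find_orphans(status_by_url, crawled_urls):
    """Return live URLs (status 200) that the crawl never reached."""
    crawled = set(crawled_urls)
    return sorted(
        url
        for url, status in status_by_url.items()
        if status == 200 and url not in crawled
    )


# Made-up data: /b is live but was not found by the crawl.
statuses = {
    "https://example.com/a": 200,
    "https://example.com/b": 200,
    "https://example.com/c": 404,
}
crawled = ["https://example.com/a"]
print(find_orphans(statuses, crawled))
```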

Step 9: Create a Filter and Sort – Time to Clean Up 🧹

Now, let’s clean up the data. Use data filters in Google Sheets to sort and analyze your pages better. You can filter by:

  1. HTTP Status Codes: Look for pages that are active but still not indexed (this is key!).
  2. Word Count and Internal Links: Maybe the page needs more content or internal links to get noticed by Google.
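
Here’s a rough sketch of the same filter-and-sort idea in Python, in case your sheet gets too big to scroll. Everything specific in it is an assumption: the keys (status, word_count, inlinks) are hypothetical column names, and the thresholds of 300 words and 1 internal link are placeholders, not numbers Google has published.

```python
def needs_attention(rows, min_words=300, min_inlinks=1):
    """Return live pages that are thin on content or short on internal links.

    The default thresholds are illustrative placeholders only.
    """
    flagged = [
        row
        for row in rows
        if row["status"] == 200
        and (row["word_count"] < min_words or row["inlinks"] < min_inlinks)
    ]
    # Thinnest pages first, then those with the fewest internal links.
    return sorted(flagged, key=lambda row: (row["word_count"], row["inlinks"]))


# Made-up rows standing in for one non-indexed tab.
pages = [
    {"url": "/a", "status": 200, "word_count": 120, "inlinks": 4},
    {"url": "/b", "status": 404, "word_count": 50, "inlinks": 0},
    {"url": "/c", "status": 200, "word_count": 900, "inlinks": 0},
    {"url": "/d", "status": 200, "word_count": 800, "inlinks": 6},
]
for page in needs_attention(pages):
    print(page["url"], page["word_count"], page["inlinks"])
```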

Step 10: Check Historical Data in GSC – The Final Clue 🕵️‍♂️

Before you call it a day, export your Google Search Console data again and use VLOOKUP to see if these non-indexed pages have ever been indexed in the last 16 months. Sometimes, it’s just a matter of time before Google decides to index them – or maybe you need to tweak something.

BONUS Tip: Fix Duplicate Content Issues – Avoid Google’s “Oops” 🙈

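If Google is choosing a different canonical URL, it means Google sees your page as a duplicate (or near-duplicate) of another one. You’ll need to fix that by setting the right canonical tags so that Google knows which page you want prioritized. Keep in mind the tag is a hint, not an order, so make sure your internal links and sitemap point to the same URL too. Don’t let your pages fight for attention!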
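
To see what canonical a page is actually declaring, you can pull the tag out of its HTML. Below is a minimal sketch using Python’s built-in HTML parser. The names CanonicalFinder, find_canonical, and canonical_matches are invented for this example, and it only reads the link tag in the HTML you give it (not canonicals sent through HTTP headers).

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def canonical_matches(page_url, html):
    """True when the page declares itself as the canonical (exact string match)."""
    return find_canonical(html) == page_url


# Made-up page source for illustration.
page = '<html><head><link rel="canonical" href="https://example.com/a"></head></html>'
print(find_canonical(page))
print(canonical_matches("https://example.com/a?ref=1", page))
```

A page whose canonical points somewhere else is telling Google to index that other URL, so check that this is really what you want before blaming Google.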

Wrapping Up – You Got This! 🙌

Fixing NON INDEXED pages can be a pain, but now you’ve got a solid process to follow. With the right tools and a little patience, you can tackle this issue and boost your SEO game. Give Google some time to recrawl, and those pages stand a much better chance of getting indexed, putting your site on the path to greater visibility.

So, go ahead, implement these steps, and let me know how it goes! You’ll be an SEO superhero in no time. ✌️

Ashutosh Sharma

A digital strategist with over 2 years of experience in SEO, Website Development, and AR filters, Ashutosh Sharma specializes in crafting innovative digital solutions that drive results. He enjoys exploring emerging trends and translating them into impactful strategies for clients. When not working on digital projects, Ashutosh enjoys diving into the latest tech advancements and has a keen interest in discovering new digital tools.