If you’ve been working in SEO for a while, you’ve probably felt the pain of watching your rankings plummet despite pouring hours into keyword research, link-building strategies, and optimizing on-page elements. It’s frustrating, right? Like trying to build a sandcastle only for the tide to wash it away.
But what if I told you the game has changed? The very rules of SEO that you’ve been following for years are now shifting. And if you don’t adjust, you’ll miss out on the real opportunities to rank.
That’s where AI and machine learning come in. And no, we’re not talking about some futuristic concept. It’s happening now. Large Language Models (LLMs) like ChatGPT, along with Google’s own language models BERT and MUM, have transformed the SEO landscape. But how can you tap into this shift and start making it work for you?
SEO Nightmares: The Mistakes We’re Still Making
We’ve all made the same mistakes in SEO. It’s like we were stuck in a loop of “tried and tested” techniques that just weren’t working anymore. Here’s the scary part:
- Keyword Stuffing and Over-Optimization:
The classic SEO nightmare. For years, we believed that loading up our content with as many keywords as possible would magically push us to the top of search rankings. Google, however, was much smarter than we gave it credit for. And now, we’re paying the price.
- Mistake: Over-optimization doesn’t just ruin the user experience, it’s punished by search engines.
- Chasing the Latest Fads:
Google drops a new algorithm update, and suddenly, everyone’s scrambling to chase the next SEO “hack”. But what happens when you keep jumping from one tactic to another?
- Mistake: Your strategy becomes a patchwork quilt that’s bound to fall apart.
- Ignoring Searcher Intent:
The most horrifying mistake of all is treating SEO like a set of rules to follow, instead of focusing on what users are really looking for. For years, we optimized for keywords instead of intent. Well, guess what? Google already figured that out.
The Evolution of SEO: AI, LLMs, and ChatGPT
Let’s talk about the elephant in the room: Artificial Intelligence.
SEO has always been about algorithms, but in the past couple of years, the algorithms got a serious upgrade, thanks to the rise of Large Language Models (LLMs). These models, powered by deep learning and natural language processing (NLP), have allowed search engines to understand content in ways that go beyond keywords.
One of the biggest developments in SEO came with LLMs like ChatGPT and Gemini, and with the LLM-powered search features built on them, such as Google’s AI Overviews and AI Mode. These systems don’t just care about whether the word “best” is next to “buy”. They care about the meaning behind the words, the context, and the user’s true intent.
It’s not about matching keywords anymore; it’s about matching ideas.
Think about it. When you ask Google “What is the capital of France?”, it doesn’t just look for pages that have those exact words. It understands that you want to know the city, not a random collection of content about the history of France. Google understands intent.
How the AI-Powered SEO Audit Works
If you’ve ever felt like your SEO efforts were all over the place, this Python-based content audit tool will bring clarity. Here’s how it works:
- Fetches Content and Analyzes It
The script scrapes your webpage content, every paragraph of it, and breaks it down into individual sentences. It then evaluates how well each sentence matches specific search queries.
- Semantic Matching (Goodbye Keyword Stuffing)
The tool uses AI models to compare the semantic meaning of your content with the queries you’ve provided. Unlike traditional SEO tools that focus only on keywords, this audit looks at context and the meaning behind the text. For example, if someone searches “best camera for vlogging”, it doesn’t just look for those keywords. It checks if your page actually provides detailed, helpful information about the best cameras for vlogging.
- Query vs Overall Content Score
The script doesn’t just find the best-matching sentence. It calculates how well the entire page matches the query. This means you get a holistic view of how well your content aligns with the user’s search intent. So if your page mentions “camera” but doesn’t fully answer the question about vlogging cameras, your score will reflect that gap.
- Intent Match
Here’s the fun part: The script assesses whether your content matches the user intent behind the search query. For example, if your page is about camera reviews but the query is about camera buying guides, the audit will flag that mismatch. You’ll know exactly where your content needs to improve to align with user intent.
- Ranking Your Pages
After all the analysis, the script ranks your pages based on how well they meet the queries. It calculates an average match score and intent match percentage for each page, helping you prioritize the pages that need work.
P.S. Shout-out to @Tanay Mahesh for helping build this tool for content analysis and for auditing content for SEO.
How the Script Works:
Step 1: Prepare Your Data
Input: URLs and Queries
- URLs: List the URLs of the pages you want to analyze.
- Queries: Specify the search queries (or keywords) you want to evaluate against these pages.
You’ll be prompted to input these two elements when you run the script.

Step 2: Set Up the Python Environment
To run the script, you need a Python environment with a few necessary libraries installed. These libraries allow the script to fetch web pages, process text, compute similarity scores, and export results.
Installation of Libraries:
Run this command to install the required libraries:
pip install requests beautifulsoup4 nltk sentence-transformers pandas openpyxl
- requests: Used for fetching the webpage content.
- beautifulsoup4: For parsing and extracting text from HTML.
- nltk: Natural Language Toolkit, used for sentence tokenization.
- sentence-transformers: For generating text embeddings and calculating similarity.
- pandas: For handling and processing data.
- openpyxl: For exporting results to Excel.
Step 3: Fetch and Parse the Webpage Content
The script begins by fetching the content of the provided URLs. It uses the requests library to retrieve the HTML of the webpage.
Then, it uses BeautifulSoup to extract text from the page. The script specifically looks for content inside <p> tags, assuming that the main body of the content is inside these tags.
- Why this step is important: Webpages often contain various HTML elements. By focusing on <p> tags, we get the most relevant text that the user sees on the page.
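To make the idea concrete, here is a minimal sketch of the paragraph-extraction step. It is not the script’s own code: the script uses BeautifulSoup, while this version swaps in Python’s built-in html.parser so it runs with no extra installs. The names ParagraphExtractor and extract_paragraphs are made up for illustration.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collects the visible text found inside <p> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self._in_p = False
        self._buffer = []
        self.paragraphs = []

    def _flush(self):
        # Collapse whitespace and keep the paragraph only if it has text.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # an unclosed <p> ends when the next one starts
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self._in_p = False

    def handle_data(self, data):
        # Text inside inline tags such as <b> still counts as paragraph text.
        if self._in_p:
            self._buffer.append(data)


def extract_paragraphs(html):
    """Return a list with the text of each <p> element in the HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()  # pick up a trailing <p> that was never closed
    return parser.paragraphs
```

The HTML string passed in could come from something like requests.get(url, timeout=10).text. Anything outside <p> tags (menus, footers, widgets) is simply ignored, which is the assumption the step above relies on.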

Step 4: Preprocess and Tokenize the Text
After fetching the text from the webpage, the script uses NLTK (Natural Language Toolkit) to break the content into individual sentences. This helps in comparing each sentence with the queries.
- Why tokenization matters: Tokenizing the content into sentences allows the script to evaluate how each sentence matches a given query, instead of analyzing the entire page at once. This provides more detailed results.
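As a rough illustration of sentence splitting, the sketch below uses a naive regular expression as a stand-in. The script itself relies on NLTK (its sent_tokenize function handles abbreviations and other edge cases far better), so treat split_sentences as a hypothetical helper that only shows the shape of the output.

```python
import re

# Split wherever sentence-ending punctuation is followed by whitespace.
# Naive on purpose: "e.g. this" would be split incorrectly.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Return the sentences in text as a list of stripped strings."""
    parts = _SENTENCE_END.split(text.strip())
    return [part.strip() for part in parts if part.strip()]
```

Each item in the returned list is what later gets compared against a query, one sentence at a time.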
Step 5: Generate Sentence Embeddings
To understand the meaning behind both the queries and webpage content, the script uses Sentence Transformers, a library of pre-trained models, to create embeddings for both the queries and the sentences from the page.
How it works:
- Sentence Embeddings: These are numerical representations of sentences in a high-dimensional space. Sentences with similar meanings will have embeddings that are closer to each other in this space, even if they don’t share the same words.
- The model used in the script is multi-qa-MiniLM-L6-cos-v1, which is tuned for semantic search, that is, for matching questions and queries to the passages that answer them.
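Here is a small sketch of how the embedding step could look with the sentence-transformers library. The function names load_model and embed_texts are assumptions for illustration, not names from the script. The import sits inside load_model so the helper can be read and tried without downloading the model first.

```python
def load_model(name="multi-qa-MiniLM-L6-cos-v1"):
    """Load a pre-trained Sentence Transformers model by name."""
    # Lazy import: only needed when a real model is actually loaded.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(name)


def embed_texts(model, texts):
    """Return one embedding (a list of floats) per input text.

    `model` is anything with an encode(list_of_strings) method,
    such as the object returned by load_model().
    """
    vectors = model.encode(list(texts))
    return [[float(value) for value in vector] for vector in vectors]
```

In practice you would call embed_texts(load_model(), sentences) once for the page’s sentences and once for the queries, then compare the resulting vectors.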
Step 6: Calculate Similarity Scores

The script compares the query and the page’s content using cosine similarity, which measures how similar two vectors (embeddings) are to each other.
- Best Match Score: The script checks each sentence on the page and calculates how similar it is to the query. The best match score is the similarity score of the sentence that most closely matches the query.
- Query vs Overall Content Score: The script also compares the entire page’s content to the query by calculating similarity between the query’s embedding and the embedding of the full page content.
- Why this is useful: The best match score tells you how well the page answers the query based on individual sentences. The query vs overall content score tells you how well the entire page aligns with the query.
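The scoring described above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the script’s code: the vectors here are toy lists of floats, and cosine_similarity and best_match are hypothetical helper names. With real embeddings the logic is the same.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query_vec, sentence_vecs):
    """Return (index, score) of the sentence vector closest to the query."""
    if not sentence_vecs:
        return None, 0.0
    scores = [cosine_similarity(query_vec, vec) for vec in sentence_vecs]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores[best_index]
```

The query vs overall content score is the same cosine_similarity call, just with the embedding of the full page text in place of a single sentence.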
Step 7: Evaluate Intent Match
The script uses a threshold of 0.7 to determine whether a page’s content meets the user’s intent for a query:
- Intent Match: If the best match score is greater than or equal to 0.7, the script considers the content to be relevant and marks it as a “Yes” for intent match.
- If the score is below 0.7, it’s marked as “No”, indicating that the content may not fully address the search intent.
This evaluation is crucial because SEO is not just about matching keywords but also about understanding the intent behind the search query.
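The threshold rule is simple enough to express as a one-line check. The sketch below assumes the 0.7 cutoff stated above. The names INTENT_THRESHOLD and intent_match are illustrative, not taken from the script.

```python
INTENT_THRESHOLD = 0.7  # cutoff described in this step


def intent_match(best_score, threshold=INTENT_THRESHOLD):
    """Return "Yes" when the best match score reaches the threshold, else "No"."""
    return "Yes" if best_score >= threshold else "No"
```

Keeping the threshold as a parameter makes it easy to experiment, since the right cutoff may differ between sites and embedding models.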
Step 8: Rank URLs Based on Relevance

Once all queries have been analyzed, the script calculates two key metrics for each URL:
- Average Match Score: The average of the best match scores for all queries. This tells you how well the content aligns with the queries overall.
- Intent Match Percentage: The percentage of queries for which the content meets the user’s intent.
These metrics are used to rank URLs. The URLs with the highest scores are the ones that most effectively address the user’s queries.
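Below is a minimal sketch of how the two metrics could be computed and used for ranking. The row format (dictionaries with "url" and "best_score" keys) and the tie-breaking order (average match score first, then intent match percentage) are assumptions made for this example. The article does not spell out exactly how the script combines them.

```python
from collections import defaultdict


def rank_urls(results, threshold=0.7):
    """Rank URLs by average best-match score, then by intent match percentage.

    `results` is a list of dicts, one per (url, query) pair, each with
    a "url" and a "best_score" key (hypothetical format).
    """
    scores_by_url = defaultdict(list)
    for row in results:
        scores_by_url[row["url"]].append(row["best_score"])

    ranking = []
    for url, scores in scores_by_url.items():
        matched = sum(1 for score in scores if score >= threshold)
        ranking.append({
            "url": url,
            "avg_match_score": sum(scores) / len(scores),
            "intent_match_pct": 100.0 * matched / len(scores),
        })

    # Highest scores first.
    ranking.sort(
        key=lambda r: (r["avg_match_score"], r["intent_match_pct"]),
        reverse=True,
    )
    return ranking
```

Pages that land at the bottom of this list are the ones to look at first when deciding what to rewrite.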
Step 9: Export Results to Excel
Finally, the script generates an Excel file with the analysis results:

- Detailed Results: This sheet contains the results for each query, showing the best matching sentence, match score, overall content score, and whether the intent was met.
- Ranked URLs: A summary sheet that ranks the URLs based on their average match scores and intent match percentages.
- Best Intent-Matched URL: This sheet summarizes the URL that best met the intent of all queries.
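One way to produce a workbook with these three sheets is sketched below using pandas with the openpyxl engine. The helper names build_sheets and write_workbook are hypothetical, and picking the “best” URL as the one with the highest intent match percentage is an assumption for this example. The script may choose it differently.

```python
def build_sheets(detailed_rows, ranked_rows):
    """Group result rows (lists of dicts) under the three sheet names."""
    ranked_rows = list(ranked_rows)
    # Assumption: "best" means the highest intent match percentage.
    best = max(ranked_rows, key=lambda r: r["intent_match_pct"], default=None)
    return {
        "Detailed Results": list(detailed_rows),
        "Ranked URLs": ranked_rows,
        "Best Intent-Matched URL": [best] if best else [],
    }


def write_workbook(sheets, path):
    """Write each sheet to one Excel file (needs pandas and openpyxl installed)."""
    import pandas as pd  # lazy import so build_sheets works without pandas

    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        for name, rows in sheets.items():
            pd.DataFrame(rows).to_excel(writer, sheet_name=name, index=False)
```

A call such as write_workbook(build_sheets(detailed, ranked), "seo_audit.xlsx") would then create the file, where detailed and ranked are the row lists from the earlier steps and the file name is just a placeholder.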
Step 10: Analyze and Optimize Content
With the Excel results, you can:
- See which pages are performing well based on their relevance to the queries.
- Identify content gaps where certain queries have low match scores, and make improvements.
- Refine content for better intent matching, ensuring your pages fully satisfy the user’s needs.
This script offers a fast and automated way to evaluate webpage content against search queries. By analyzing how well your content matches user intent, it gives you actionable insights into what needs improvement. The use of AI and automation allows you to conduct content audits much faster and more efficiently, ultimately helping to improve your SEO strategy.
Get the complete script from Tanay’s Portfolio – https://github.com/tanaymahesh12/ai-powered-seo-audit