Need relevancy in keyword topics? Understand what the users search for.
Keyword research in SEO and SEA has made great strides with entity recognition. It helps identify and categorize entities, in other words, key information in a text. Google’s “refine keywords” option in Keyword Planner builds on this idea. It works well for some keywords but finds no entities for others. This happens particularly often with rare words that are mainly used in your business domain. However, there are approaches that let you build your own working solution.
Before creating your perfect holistic SEO landing page, try to understand what the users are searching for.
When you’re new to a business and have no idea how users conduct their searches, it’s challenging to start with keyword tagging. First, you’ll need a plan about the most important categories to define logical lookup lists. For that reason, you should start with keyword clustering techniques or N-Gram analysis to get a basic understanding of keyword topics.
If you already know some of your keyword topics, you can start right away, defining your lookup lists. We highly recommend using N-Gram Analysis and starting with the words that have the highest word counts. It’ll give you the best keyword tagging coverage in a short time. Let’s try our free N-Gram analyzer tool!
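As a rough illustration of what an N-Gram analysis does, here is a minimal sketch using only the Python standard library. It is not the analyzer tool mentioned above; the function name `ngram_counts` and the sample keywords are made up for this example.

```python
from collections import Counter


def ngram_counts(keywords, n=1):
    """Count how often each word n-gram occurs across a list of keywords."""
    counts = Counter()
    for keyword in keywords:
        tokens = keyword.lower().split()
        # Slide a window of n tokens over the keyword.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical sample keywords, for illustration only.
keywords = [
    "red running shoes",
    "running shoes for women",
    "blue running shoes",
    "trail shoes",
]

unigrams = ngram_counts(keywords, n=1)
bigrams = ngram_counts(keywords, n=2)

# The most frequent n-grams are good first candidates for lookup lists.
print(unigrams.most_common(3))
print(bigrams.most_common(2))
```

Starting with the most frequent n-grams means every lookup key you add tags as many keywords as possible.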
Build a centralized entity database for your business. Don’t just use it for tagging your keywords. Think of applying it on SERP snippets or your competitors’ content.
Entity Recognition applied to a keyword list. Besides the keyword count, you can also use other metrics like search volume.
If you have a product feed available, you can start copying and pasting right away. If not, you probably already have a logic for your site and a URL structure where you can find information about tag categories.
Do you already know which keywords are the most popular in your business? Let’s use this knowledge and start adding the tag categories that come to mind, along with some signal words. There’s no need for a perfect list at this stage.
Use the free N-Gram tool and add the words with the highest search volumes to your existing tag groups. By doing this, you’ll get the best coverage of tagged search queries as soon as possible. We’re sure you’ll find many new categories with this process. Add them to your lookup list.
At this stage, you can use the existing lookup list and look for similar words among the untagged words. Regex functions and string distance metrics will add a lot of new lookup keys to your list. Have a look at the Python module FuzzyWuzzy, which uses Levenshtein distance for string similarity.
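To make the string distance idea concrete, the following sketch implements Levenshtein distance in plain Python and uses it to propose new lookup keys. It stands in for a library such as FuzzyWuzzy and does not use its API. The helper `suggest_lookup_keys`, the threshold `max_distance`, and the sample data are assumptions made for illustration.

```python
def levenshtein(a, b):
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def suggest_lookup_keys(lookup, untagged_words, max_distance=1):
    """Propose untagged words that are within max_distance of a known key."""
    suggestions = {}
    for word in untagged_words:
        for key, category in lookup.items():
            if word != key and levenshtein(word, key) <= max_distance:
                suggestions[word] = category
                break  # first matching key wins in this sketch
    return suggestions


# Hypothetical lookup list and untagged words.
lookup = {"green": "color", "adidas": "brand"}
untagged = ["gren", "addidas", "shoes"]

suggestions = suggest_lookup_keys(lookup, untagged)
print(suggestions)
```

Treat the output as suggestions to review rather than as automatic additions, because a small edit distance also matches unrelated short words.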
It’s the “wow” moment for most people. If you create word embeddings with word2vec on your complete keyword set, you can search for semantically similar words. If you have an entity “color” with one value “green,” the word2vec model will show you “red” and “blue” as semantically similar words. This quickly grows your lookup list.
Regular expressions are costly in computation time when you process large keyword files. The same problem occurs with similarity algorithms like Levenshtein. We used both approaches only once, when setting up our tag category database and defining our lookup lists. As a result, we can use plain Python dictionaries for all the tagging. It’s blazing fast.
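A dictionary-based tagger can be sketched in a few lines. This is a simplified illustration, not our production code: the function `tag_keyword` and the sample lookup list are made up, and the sketch only matches single-word lookup keys.

```python
def tag_keyword(keyword, lookup):
    """Tag one keyword using constant-time dictionary lookups per token."""
    tags = {}
    for token in keyword.lower().split():
        category = lookup.get(token)
        if category is not None:
            tags.setdefault(category, []).append(token)
    return tags


# Hypothetical lookup list: lookup key -> tag category.
lookup = {"green": "color", "red": "color", "nike": "brand"}

tags = tag_keyword("Nike running shoes green", lookup)
print(tags)
```

Each token costs one hash lookup, so tagging time grows with the number of tokens rather than with the size of the lookup list. Multi-word lookup keys need a different matching strategy than this token-by-token approach.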
There’s a great Python module out there called FlashText. You can use it for the keyword tagging process. Load your lookup database into the KeywordProcessor and start extracting entities. It’s all about speed. Other approaches may be fine when you make lookups for 100,000 keywords. However, think of 100,000 website pages you want to tag with your entity database. That is when you’ll find out whether your solution scales. If you’re interested in the FlashText algorithm, have a look here.
If speed is crucial for you or the datasets are huge, think of splitting the tagging process across multiple worker scripts. When you publish your Python solution to AWS Lambda or Google Cloud Functions, you can easily use those scaling approaches.
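The fan-out pattern might look like the sketch below. It runs locally and uses a thread pool as a stand-in for calls to remote workers. The function `tag_chunk` is a hypothetical placeholder for whatever you deploy as a cloud function, and `LOOKUP`, `chunked`, and `tag_all` are likewise names invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical lookup list shared by all workers.
LOOKUP = {"green": "color", "red": "color", "blue": "color", "nike": "brand"}


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def tag_chunk(chunk):
    """Worker: tag one chunk of keywords (placeholder for a remote function)."""
    return [
        (keyword, sorted({LOOKUP[t] for t in keyword.lower().split() if t in LOOKUP}))
        for keyword in chunk
    ]


def tag_all(keywords, chunk_size=2, max_workers=4):
    """Dispatch chunks to workers and merge the results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the order the chunks were submitted.
        results = pool.map(tag_chunk, chunked(keywords, chunk_size))
        return [row for chunk_result in results for row in chunk_result]


# Hypothetical sample keywords.
keywords = ["green nike shoes", "red dress", "running socks", "nike cap", "blue jeans"]

tagged = tag_all(keywords)
for keyword, categories in tagged:
    print(keyword, categories)
```

Threads suit a dispatcher that mostly waits on remote calls. For CPU-bound tagging on a single machine, separate processes would be the more fitting choice.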