N-gram analysis in PPC

N-gram analysis of your PPC data will give you insights to take actions that will save you tons of money!

Become a PPC superhero by applying data science approaches to PPC problems. N-gram analysis surfaces insights and action plans that were hidden before.


    What are n-grams and how to create them on search terms?

    N-grams are used to analyze large texts. They’re also great for analyzing thousands of very small texts, such as the query and keyword reports from your PPC accounts.

    Imagine a window of N consecutive words within the query, and move that window forward word by word: this is how n-grams are created. “N” can be 1, 2, 3 and so on. Choosing N=1 gives you 1-grams, and N=2 gives you 2-grams. Needless to say, it’s not rocket science.

    EXAMPLE-QUERY: “book hotel in berlin”

    1-grams: book, hotel, in, berlin
    2-grams: book hotel, hotel in, in berlin
    3-grams: book hotel in, hotel in berlin
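
    The sliding window can be sketched in a few lines of Python. This is a minimal illustration rather than a production tokenizer: the function name `ngrams` is our own choice, and it assumes that words are separated by whitespace.

```python
def ngrams(query, n):
    """Return all n-grams of `query` as space-joined strings."""
    words = query.lower().split()
    # Slide a window of n words over the query, one word at a time.
    # A query with fewer than n words yields an empty list.
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


query = "book hotel in berlin"
for n in (1, 2, 3):
    print(f"{n}-grams:", ", ".join(ngrams(query, n)))
```

    Run on the example query, this prints the same three lists as shown above.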

    Why and where should you use n-grams in PPC?

    As mentioned at the beginning, you’ll save a lot of money by adding poorly performing n-grams as negative keywords in your PPC account. These poor search patterns stay hidden when you only look at complete search terms.

    At query level, it is hard to judge performance because most search terms don’t have a sufficient sample size. N-grams pool the observations of many queries, which increases the sample size drastically and makes it possible to base decisions on numbers.

    You can also work in the opposite direction: detect well-performing search patterns and use them to create new keywords. Once you’re familiar with the concept, you will easily find more use cases.

    Step-by-Step Guide: From Keyword/Query Reports to n-gram Reports

    Step 1: Choose the PPC Report you want to analyze

    The most important use cases will be the Google Ads Keyword Performance and the Search Term Performance Report (of course you can apply it to the equivalent Amazon PPC or Bing reports). We consider the Query or Keyword as text (yes, it’s very short) we want to create n-grams on. You should also consider including some metrics in your report since counting on word occurrences alone will give you limited insights.

    Step 2: Select useful KPIs

    The most important metrics should be:

    • Impressions
    • Clicks
    • Costs
    • Conversions
    • ConversionValue

    You might be confused at this point because we haven’t selected calculated measures like cost per conversion or conversion rate. The reason is simple: if you run the n-gram transformation and map KPIs (key performance indicators) that are already ratios, you end up with wrong numbers, because ratios cannot simply be summed or averaged across queries. All those metrics have to be calculated after the n-gram mapping. So use only absolute numbers as input values: no ratios or averages!
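
    A small worked example shows how large the error can get. The two queries and their numbers are invented for illustration; both contain the 1-gram "hotel".

```python
# Hypothetical numbers for two queries that both contain the 1-gram "hotel".
rows = [
    {"query": "cheap hotel", "clicks": 100, "conversions": 1},
    {"query": "hotel berlin", "clicks": 10, "conversions": 5},
]

# Misleading: averaging the conversion rates of the individual queries.
mean_of_rates = sum(r["conversions"] / r["clicks"] for r in rows) / len(rows)

# Correct: sum the absolute numbers first, then divide once.
total_clicks = sum(r["clicks"] for r in rows)
total_conversions = sum(r["conversions"] for r in rows)
rate_of_sums = total_conversions / total_clicks

print(f"Mean of per-query CRs: {mean_of_rates:.1%}")
print(f"CR from summed KPIs:   {rate_of_sums:.1%}")
```

    The average of the two conversion rates is about 25.5%, while the real conversion rate of "hotel" is 6 conversions on 110 clicks, or about 5.5%. The query with few clicks distorts the average, which is why only sums should go into the mapping.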

    Step 3: Map all available KPIs to n-grams

    First, you have to decide how deeply you want to analyze your queries. By depth, we mean the value of “N”. A good way to start is with 1-grams and 2-grams. If you go higher, you end up with low click numbers per n-gram, and it again becomes hard to derive actions from them.

    Please also remember: The higher N is, the higher the required memory for processing is. Besides, the storage of the results takes more space. This can be a problem when you analyze Gigabytes of Search Query data.
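
    The mapping itself can be sketched as follows. This is an illustration under our own assumptions, not the code behind any particular tool: the names `map_kpis_to_ngrams`, `sizes` and `text_key` are hypothetical, the three queries are invented, and each query is counted only once per n-gram even if the n-gram occurs twice in it.

```python
from collections import defaultdict


def ngrams(query, n):
    """Return all n-grams of `query` as space-joined strings."""
    words = query.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def map_kpis_to_ngrams(rows, sizes=(1, 2), text_key="query"):
    """Sum every metric of a row onto each n-gram contained in its query."""
    totals = defaultdict(lambda: defaultdict(float))
    for row in rows:
        metrics = {k: v for k, v in row.items() if k != text_key}
        for n in sizes:
            # set(): count a query once per n-gram, even if the n-gram repeats.
            for gram in set(ngrams(row[text_key], n)):
                for name, value in metrics.items():
                    totals[gram][name] += value
                totals[gram]["queries"] += 1
    return {gram: dict(kpis) for gram, kpis in totals.items()}


# Hypothetical search term report with absolute numbers only.
rows = [
    {"query": "book hotel in berlin", "clicks": 10, "cost": 5.0, "conversions": 1},
    {"query": "cheap hotel berlin", "clicks": 20, "cost": 8.0, "conversions": 0},
    {"query": "hotel jobs", "clicks": 5, "cost": 2.5, "conversions": 0},
]

totals = map_kpis_to_ngrams(rows)
print(totals["hotel"])
print(totals["hotel berlin"])
```

    Every n-gram inherits the full KPIs of each query it appears in, so the sums over all n-grams are larger than the account totals. That is expected: the n-gram report is meant for comparing patterns, not for reporting totals.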

    Step 4: Add calculated Measures to your n-grams

    After the mapping process, all KPI sums are available at n-gram level. Now you can add the calculated KPIs you already know from reviewing your PPC account data: CPO (cost per order), CR (conversion rate), ROI (return on investment) and more. Sorting by these KPIs brings, for instance, the poorly performing search patterns to the top of your list.
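
    As a rough sketch, the calculated measures can be added like this. The n-gram totals below are invented, the helper name `add_measures` is our own, and ROI is assumed to be (conversion value - cost) / cost; adjust the formulas to the definitions you use in your account.

```python
def add_measures(kpis):
    """Return a copy of the KPI sums with CR, CPO and ROI added.

    A measure is None when its divisor is zero.
    """
    clicks, cost = kpis["clicks"], kpis["cost"]
    conversions, value = kpis["conversions"], kpis["conversion_value"]
    return {
        **kpis,
        "cr": conversions / clicks if clicks else None,
        "cpo": cost / conversions if conversions else None,
        # Assumption: ROI = (conversion value - cost) / cost
        "roi": (value - cost) / cost if cost else None,
    }


# Hypothetical KPI sums per n-gram, as produced by the mapping step.
ngram_totals = {
    "hotel": {"clicks": 35, "cost": 15.5, "conversions": 1, "conversion_value": 40.0},
    "jobs": {"clicks": 5, "cost": 2.5, "conversions": 0, "conversion_value": 0.0},
    "berlin": {"clicks": 30, "cost": 13.0, "conversions": 1, "conversion_value": 40.0},
}

report = {gram: add_measures(kpis) for gram, kpis in ngram_totals.items()}

# Negative keyword candidates: n-grams that cost money but never converted,
# most expensive first.
candidates = sorted(
    (g for g, k in report.items() if k["conversions"] == 0 and k["cost"] > 0),
    key=lambda g: report[g]["cost"],
    reverse=True,
)
print(candidates)
```

    In practice you would also require a minimum number of clicks before treating an n-gram as a negative keyword candidate, so that a handful of unlucky clicks does not trigger an exclusion.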

    When we put everything together, the n-gram Analysis Process looks like this:

    Simple example of an n-gram analysis: From Search Terms to n-grams with mapped KPIs and calculated measures

    In our example, we only had 3 queries in total, so it was manageable by hand. But what if we have to analyze 1 GB of search term data? That is the topic of the next section. We can already tell you that there are multiple ways to do it, and even if you aren’t a programmer, you’ll be able to create a PPC n-gram analysis.

    Hands On: Create your own n-gram analysis on search term data

    Free Online N-gram Analyzer

    Free PEMAVOR N-gram Analyzer

    If you aren’t a programmer, or if Google Ads Scripts and Python aren’t your first choice, the quickest way to get an n-gram report is our free N-gram Analyzer. It works not only for the Search Term Performance Report, but also for every other report that contains queries or keywords and some KPIs you want to analyze. Just upload your CSV data and all n-gram transformations are done in seconds. At that point, you only have the sums of the provided metrics. To add calculated measures, download the CSV or copy the result to the clipboard and paste it into Google Sheets or Excel, then add the missing calculations there.

    Note: Due to processing limitations, we only calculate 1-grams in this free tool.

    Create n-grams with Big Query ML

    If your search term data is already stored in Google BigQuery, we have good news for you: you’re just one SQL SELECT statement away from your n-gram analysis. BigQuery ML handles n-grams with a built-in function, ML.NGRAMS. In our SELECT statement we use it to create 1-grams and 2-grams; you can easily change this in the SQL statement to suit your needs.

    SELECT
      Ngram,
      SUM(Impressions) AS Impressions,
      SUM(Clicks) AS Clicks,
      SUM(Costs) AS Costs,
      SUM(Conversions) AS Conversions,
      COUNT(1) AS Count,
      -- SAFE_DIVIDE returns NULL instead of failing when the divisor is 0
      ROUND(SAFE_DIVIDE(SUM(Conversions), SUM(Clicks)), 2) AS CR,
      ROUND(SAFE_DIVIDE(SUM(Costs), SUM(Conversions)), 2) AS CPO
    FROM
      `sealyzer-data-science.test.T_Queryreport`,
      -- Lowercase the query, pad punctuation with spaces, split it into words,
      -- then build the n-grams. [1, 2] means 1-grams and 2-grams are created;
      -- edit this range to set your own values for "N".
      UNNEST(ML.NGRAMS(SPLIT(REGEXP_REPLACE(LOWER(Query), r'(\pP)', r' \1 '), ' '),
          [1, 2],
          ' ')) AS Ngram
    GROUP BY
      Ngram

    After running the query, you’ll get this output:

    1-grams and 2-grams generated with SQL within Big Query

    Note: This approach scales well even on big data sets. The challenge for some PPC managers will be getting the data into Google BigQuery. Because the Google Data Transfer Service for importing Google Ads data can be very costly, we recommend using Google Ads Scripts in combination with the BigQuery API. Feel free to contact us if you are interested.

    N-gram Script in Python

    Processing steps using Python Pandas to create n-gram analysis of Google Ads search terms.

    There are modules available that extract n-grams from texts. However, they don’t cover the mapping of the KPIs, which is crucial for getting actionable insights. For that reason, we created our own solution based on Python Pandas, which is very flexible when inputs change. Moreover, the processing will still work when the column names in your file change.

    Here is the Python Script that creates 1-grams for your PPC Reports.

    Note: With some small adjustments, you can create 2-grams and 3-grams as well. Contact us when you’re interested.

    import pandas as pd


    def explode_str(df, col, sep):
        """Split the strings in `col` on `sep` and return one row per token.

        All other columns (the KPIs) are repeated for every token.
        """
        return df.assign(**{col: df[col].str.split(sep)}).explode(col)
    
    # queries.csv contains your search term report with KPIs
    # Adjust the delimiter by setting the sep= parameter
    print("----------------------------------------------------------------------------------------------------")
    print("\tThe raw report")
    print("----------------------------------------------------------------------------------------------------")
    df = pd.read_csv("queries.csv",sep="\t")
    print(df)
    
    # Create 1-Grams by splitting 1 row to multiple rows
    # "Query" is the columnname where you want to apply the n-Gram analysis
    df = explode_str(df, 'Query', ' ')
    print("")
    print("----------------------------------------------------------------------------------------------------")
    print("\tSplitting into 1-Grams")
    print("----------------------------------------------------------------------------------------------------")
    print(df)
    
    
    # Group by N-Grams
    # numeric_only=True sums the KPI columns and skips other text columns
    df = df.groupby('Query').sum(numeric_only=True)
    df = df.reset_index()
    print("")
    print("----------------------------------------------------------------------------------------------------")
    print("\tAggregated Metrics on 1-Grams")
    print("----------------------------------------------------------------------------------------------------")
    print(df)
    
    
    print("")
    print("----------------------------------------------------------------------------------------------------")
    print("\tAdded Calculated Measures")
    print("----------------------------------------------------------------------------------------------------")
    
    # Define your calculated measures here:
    df['CR'] = df['Conversions']/df['Clicks']
    df['CPO'] = df['Costs'] / df['Conversions']
    print(df)
    
    #Save to CSV
    df.to_csv("ngrams.csv",index=False)
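
    One possible adjustment for 2-grams and 3-grams is sketched below. It is not our production script: the helper names `to_ngrams` and `ngram_report`, the output columns `N` and `Ngram`, and the sample data are assumptions made for this example. It needs a pandas version that provides DataFrame.explode.

```python
import pandas as pd


def to_ngrams(text, n):
    """Return the n-grams (space-joined) of a whitespace-separated text."""
    words = str(text).lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_report(df, col="Query", sizes=(1, 2)):
    """Explode `col` into n-grams and sum all numeric columns per n-gram."""
    parts = []
    for n in sizes:
        part = df.assign(Ngram=df[col].map(lambda q: to_ngrams(q, n)), N=n)
        # Queries shorter than n words produce an empty list, which
        # explode() turns into a missing value; drop those rows.
        parts.append(part.explode("Ngram").dropna(subset=["Ngram"]))
    exploded = pd.concat(parts, ignore_index=True).drop(columns=[col])
    return exploded.groupby(["N", "Ngram"], as_index=False).sum(numeric_only=True)


# Hypothetical report; in practice, load it with pd.read_csv as shown above.
df = pd.DataFrame({
    "Query": ["book hotel in berlin", "cheap hotel berlin"],
    "Clicks": [10, 20],
    "Costs": [5.0, 8.0],
    "Conversions": [1, 0],
})

report = ngram_report(df, sizes=(1, 2))
print(report)
```

    The `N` column keeps 1-grams and 2-grams apart in the same report, so you can filter by depth before adding the calculated measures.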

    N-gram Script in Google Ads

    There are some examples available that do the n-gram processing directly in Google Ads by using Google Ads Scripts. However, processing large data sets can be a problem, especially when you go deeper than 1-grams.

    Depending on your use case, you have to adjust the JavaScript code to fit your desired output. We will soon share a working solution that is built on Google Ads API query statements, because the older n-gram scripts rely on the soon-to-be-retired AdWords API and will stop working.

    Frequently Asked Questions (FAQs)

    What is N-gram classification?

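    N-gram classification uses n-grams as features to categorize text. For language recognition, the n-grams are usually restricted to adjacent characters within a word, including the spaces before and after the word, although in general an n-gram can be built from any adjacent characters or words.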

    What is unigram, bigram and trigram?

    A 1-gram (or unigram) is a one-word sequence, a 2-gram (or bigram) is a two-word sequence, and a 3-gram (or trigram) is a three-word sequence. For example, “coffee” is a 1-gram, “coffee shops” is a 2-gram, and “coffee shops near” is a 3-gram.

    What is N-gram used for?

    N-grams are useful in many applications of text analysis. They are used in computational linguistics, probability theory, communication theory, and data compression, among others.

    Why N-gram analysis in PPC?

    N-gram analysis is a great way for PPC marketers to find the right keywords: it reveals which search patterns to exclude as negative keywords and which ones to add as new keywords. This allows them to optimize their PPC campaigns and content strategies more effectively.
