15 Advanced Regex: Mastering Google Search Console
In the labyrinth of data that is Google Search Console, we’ve been on a treasure hunt. Piece by piece, we’ve unearthed the secrets to optimizing our website’s visibility and decoding the stories behind clicks and queries.
With each post, we’ve delved deeper, moving from the realm of basic search terms to the more enigmatic world of regular expressions (regex) — our powerful ally in slicing through data clutter.
But our journey is far from over. It’s time to arm ourselves with even more complex regex amulets that can reveal the hidden chambers of our website’s performance and illuminate the path to unmatched SEO success.
All URLs under a specific directory
^/directory-name/
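This regex matches every URL whose path starts with /directory-name/. Search Console’s Page filter tests the full URL, protocol and host included, so there you need to prepend the host, for example ^https://www\.example\.com/directory-name/ (the same applies to the other patterns in this post that begin with ^/). This is particularly useful if you’ve organized your content into folders and want to see how all content in a particular folder is performing.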
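Before pasting a directory pattern into Search Console, it can help to try it on a handful of URLs first. The snippet below is a minimal sketch of that idea, not Search Console itself: the URLs, the example.com host, and the filter_pages helper are all made up for illustration, and Python’s re.search stands in for the filter’s partial matching.

```python
import re

# Hypothetical sample data; example.com and the paths are placeholders.
PAGES = [
    "https://www.example.com/blog/regex-tips/",
    "https://www.example.com/blog/",
    "https://www.example.com/shop/blog/sale/",
    "https://www.example.com/about/",
]

# Host-anchored form of ^/directory-name/ for full URLs.
ANCHORED = re.compile(r"^https://www\.example\.com/blog/")
# Unanchored form: matches /blog/ anywhere in the URL.
UNANCHORED = re.compile(r"/blog/")


def filter_pages(pattern, pages):
    """Keep pages where the pattern matches anywhere (partial match)."""
    return [page for page in pages if pattern.search(page)]


blog_pages = filter_pages(ANCHORED, PAGES)
loose_pages = filter_pages(UNANCHORED, PAGES)

print(blog_pages)
print(loose_pages)
```

The anchored pattern keeps only the two pages that sit directly under /blog/, while the unanchored one also picks up /shop/blog/sale/, which is usually not what a directory filter is meant to do.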
URLs that end with a specific file extension
\.extension$
Replace extension with the desired file type (e.g., pdf, jpg). This regex is helpful if you want to see how specific file types are performing, like how often PDFs on your site are accessed.
URLs that contain a number
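\d
This pattern matches any URL that contains a digit. Search Console matches partially by default, so no surrounding .* is needed, and the slash delimiters that some programming languages put around a regex would be read here as literal slashes. It can be useful if you have a pattern of URLs that contain numerical identifiers or dates.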
URLs excluding certain parameters
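exclude-parameter
Enter this pattern and choose “Doesn’t match regex” in the filter dropdown. Search Console uses RE2 syntax, which has no lookaheads, so the common ^((?!exclude-parameter).)*$ trick is not accepted there. Replace exclude-parameter with the parameter you wish to exclude. This is helpful if you have certain URL parameters that you do not want to see in your results, such as tracking parameters.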
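Excluding by inverting a plain match and excluding with a negative lookahead should give the same rows. Here is a small sketch that checks this on exported data, assuming a hypothetical list of URLs and utm_ tracking parameters as the thing to exclude. Python’s re module supports lookaheads, so both forms can be compared side by side.

```python
import re

# Hypothetical exported URLs; the host and parameters are placeholders.
URLS = [
    "https://www.example.com/pricing/",
    "https://www.example.com/pricing/?utm_source=newsletter",
    "https://www.example.com/blog/?page=2",
    "https://www.example.com/blog/?page=2&utm_medium=email",
]

# Plain pattern, used with an inverted match (like "Doesn't match regex").
TRACKING = re.compile(r"[?&]utm_")
# Lookahead form of the same exclusion; fine in Python, not valid in RE2.
NO_TRACKING = re.compile(r"^((?![?&]utm_).)*$")


def exclude_matching(pattern, urls):
    """Drop every URL in which the pattern is found."""
    return [url for url in urls if not pattern.search(url)]


inverted = exclude_matching(TRACKING, URLS)
lookahead = [url for url in URLS if NO_TRACKING.search(url)]

print(inverted)
print(inverted == lookahead)
```

The inverted match is the easier of the two to read, which is a good reason to prefer it even in engines where the lookahead is available.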
URLs that contain certain keywords
keyword1|keyword2|keyword3
This pattern matches any URL that contains any of the keywords you specify. Replace keyword1, keyword2, and keyword3 with your desired keywords. This can help you quickly see how pages about certain topics are performing.
URLs that have either www or non-www but not both
^(https?://)?(www\.)?example\.com/
This will capture both http://example.com/ and http://www.example.com/ (and their https equivalents), but not URLs that might accidentally have patterns like www.www.example.com.
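Optional groups like (www\.)? are easy to get subtly wrong, so a quick table of cases is a cheap sanity check. The sketch below runs the pattern against a few made-up URLs; the CASES list and the subdomain example are assumptions added for illustration.

```python
import re

HOST = re.compile(r"^(https?://)?(www\.)?example\.com/")

# Hypothetical inputs covering the intended and unintended variants.
CASES = [
    "http://example.com/",
    "https://www.example.com/page",
    "example.com/pricing",
    "http://www.www.example.com/",
    "https://blog.example.com/",
]

# Map each URL to whether the pattern matches it.
results = {url: bool(HOST.search(url)) for url in CASES}

for url, matched in results.items():
    print(f"{matched!s:5} {url}")
```

Because the pattern is anchored with ^ and allows at most one www. prefix, the doubled www and the blog subdomain are both rejected.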
URLs containing any of a list of specific parameters
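[?&](parameter1|parameter2|parameter3)=
This is useful if you want to capture URLs that contain specific URL parameters. The [?&] prefix matches a parameter whether it comes first in the query string (after ?) or later (after &); a bare \? would only catch the first position.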
URLs that start with a language or country code
^/(en|fr|es|de)/
This will match URLs that are organized by language or country code, such as /en/page1 or /fr/page1.
Exclude URLs with certain file extensions
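\.(pdf|jpg|png)$
Enter this pattern with the “Doesn’t match regex” option to exclude all URLs ending with .pdf, .jpg, or .png. The lookahead form ^((?!\.pdf|\.jpg|\.png).)*$ is not accepted by Search Console, whose RE2 syntax has no lookaheads, and it would also drop URLs that contain one of these extensions anywhere, not just at the end.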
Capture specific patterns in subdirectories
^/directory/(subdir1|subdir2|subdir3)/
This is useful if you’re interested in performance from specific subdirectories under a main directory.
URLs that contain a date pattern (e.g., YYYY/MM/DD)
/(\d{4}/\d{2}/\d{2}/)
This can be useful for blogs or news sites that structure their URLs with date patterns.
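The same date pattern can also pull the date out of a URL once the data is exported, for example to group pages by publication month. The following is a rough sketch under that assumption; publication_date is a hypothetical helper, and the capture groups around each part are added here so the year, month, and day can be read separately.

```python
import re
from datetime import date

# Same shape as /(\d{4}/\d{2}/\d{2}/), with one group per date part.
DATE_IN_PATH = re.compile(r"/(\d{4})/(\d{2})/(\d{2})/")


def publication_date(url):
    """Return the date embedded in the URL path, or None if there is none."""
    match = DATE_IN_PATH.search(url)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # The digits fit the pattern but are not a real calendar date.
        return None


print(publication_date("https://www.example.com/2023/09/14/regex-guide/"))
print(publication_date("https://www.example.com/2023/13/40/broken/"))
```

The regex only checks the shape of the digits, so a path like /2023/13/40/ still matches; the date() call is what rejects it.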
Match URLs but exclude certain paths
^/path/to/match/(?!exclude-this-path).*$
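This will match URLs that start with /path/to/match/ but exclude URLs under the subpath /path/to/match/exclude-this-path. It relies on a negative lookahead, which Search Console’s RE2 syntax does not support, so use it on exported data in an engine that has lookaheads. Inside Search Console, filter on ^/path/to/match/ and remove the excluded subpath afterwards.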
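Since this pattern uses a lookahead, one place it does work is on exported data in a language whose regex engine supports lookaheads. Here is a minimal sketch in Python, assuming a hypothetical list of exported full URLs; urlsplit strips the host so the path-anchored pattern can be applied unchanged.

```python
import re
from urllib.parse import urlsplit

# Hypothetical exported page URLs; the host and paths are placeholders.
EXPORTED = [
    "https://www.example.com/path/to/match/page-a",
    "https://www.example.com/path/to/match/exclude-this-path/page-b",
    "https://www.example.com/path/to/match/exclude-this-path-too",
    "https://www.example.com/path/to/other/page-c",
    "https://www.example.com/path/to/match/",
]

SECTION_EXCEPT = re.compile(r"^/path/to/match/(?!exclude-this-path).*$")


def keep(url):
    """True if the URL's path is in the section but not the excluded subpath."""
    return SECTION_EXCEPT.search(urlsplit(url).path) is not None


kept = [url for url in EXPORTED if keep(url)]
print(kept)
```

Note that the lookahead rejects anything that merely starts with exclude-this-path, so /path/to/match/exclude-this-path-too is dropped as well. Add a trailing slash inside the lookahead if only the exact subdirectory should go.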
URLs that contain certain patterns, but only at the end
/keyword1/?$|/keyword2/?$
This regex matches URLs ending with either keyword1 or keyword2, potentially followed by a single trailing slash.
Match specific file types under specific directories
^/specific-directory/.*\.(jpg|png|gif)$
This captures specific file types like images under a specified directory.
Complex parameter matching
\?(param1=value1&param2=value2|param3=value3)
Matches URLs whose query string begins with param1=value1&param2=value2, in that order, or with param3=value3.
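Real parameter values often contain characters such as dots or plus signs that mean something in a regex, so it can be safer to build this kind of pattern than to type it by hand. The helper below is a hypothetical sketch (query_pattern is not part of any library) that escapes each name and value using Python’s re.escape.

```python
import re


def query_pattern(*alternatives):
    """Build \\?(a=1&b=2|c=3) from lists of (name, value) pairs."""
    branches = []
    for pairs in alternatives:
        # Escape each name and value, then join the pairs with a literal &.
        branch = "&".join(
            f"{re.escape(name)}={re.escape(value)}" for name, value in pairs
        )
        branches.append(branch)
    return r"\?(" + "|".join(branches) + ")"


PATTERN = query_pattern(
    [("param1", "value1"), ("param2", "value2")],
    [("param3", "value3")],
)
matcher = re.compile(PATTERN)

print(PATTERN)
print(bool(matcher.search("https://www.example.com/list?param1=value1&param2=value2")))
print(bool(matcher.search("https://www.example.com/list?param2=value2&param1=value1")))
```

The last check shows the main limitation: the pattern is order-sensitive, so the same two parameters in reverse order do not match. It also has no end anchor, so param3=value3 would match param3=value30 too.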
Note
Google Search Console’s regex filter uses RE2 syntax, which differs from PCRE-style engines: it has no lookaheads, lookbehinds, or backreferences, and a pattern matches anywhere in the string unless you anchor it with ^ or $. It’s always a good idea to test your regex patterns to ensure they match the desired URLs. Additionally, as you become more familiar with the specific structure of your website’s URLs and the queries you’re interested in, you can further customize these regex patterns to better suit your needs.
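Search Console’s filter follows RE2 syntax, which leaves out lookarounds and backreferences, so a pattern copied from another tool may be rejected. The check below is a rough heuristic sketch for spotting those constructs before pasting a pattern in. The re2_warnings helper and its probe list are assumptions made for illustration, it is not a real RE2 parser, and it can be fooled by escaped parentheses or escaped backslashes.

```python
import re

# Probes for constructs that RE2 does not support. Heuristic only.
UNSUPPORTED = {
    "lookahead": re.compile(r"\(\?[=!]"),
    "lookbehind": re.compile(r"\(\?<[=!]"),
    "backreference": re.compile(r"\\[1-9]"),
}


def re2_warnings(pattern):
    """List the names of unsupported constructs that appear in the pattern."""
    return [name for name, probe in UNSUPPORTED.items() if probe.search(pattern)]


for candidate in (r"^((?!\.pdf).)*$", r"\.(pdf|jpg|png)$", r"(\d{4})/\1/"):
    print(candidate, re2_warnings(candidate))
```

An empty list does not guarantee that Search Console will accept the pattern; it only means none of these three constructs were spotted.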