Extract Publish Dates Using Screaming Frog Custom Extraction

If you are here, you are hungry for more data to drive your SEO strategy. For SEO audits, Screaming Frog is one of the most powerful tools available.

In this tutorial, you will learn how to get more out of Screaming Frog by using its Custom Extraction feature. I will walk through a practical use case with an example that you can easily follow.

Step-by-Step Guide to Extracting Publish Dates from Blog Posts

By default, Screaming Frog does not extract blog publish dates, but the Custom Extraction feature lets you pull them in during a crawl.

Here is how to do it.

Step 1: Open Inspect Element on the page & select the date

[Screenshot: selecting the blog publish date in Inspect Element]

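Open Inspect Element by right-clicking the visible date on the page and choosing Inspect (in Chrome you can also press Ctrl+Shift+C and then click the date). Note that Ctrl+U opens the page source, not Inspect Element. Once the date is selected, the matching line of HTML is highlighted in the Elements panel.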

Step 2: Copy Selector from the date element

[Screenshot: the Copy > Copy selector option in Inspect Element]

Right-click the highlighted date element, hover over Copy to open its submenu, and click [Copy selector].

Next, paste the selector into a text editor such as Notepad so that you have it ready for Screaming Frog Custom Extraction.

Here is what the copied selector looks like:

#post-1229 > div > header > div > span.posted-on > time.entry-date.published

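From the selector above, remove the leading "#post-1229 > div >" and use only what is left in the next step. The #post-1229 part is the ID of this one post, so a selector that includes it would not match any other post on the site.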

P.S. This is what remains – header > div > span.posted-on > time.entry-date.published
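If you end up collecting selectors from several sites, the trimming step can be sketched in a few lines of Python. This is only an illustration and not part of Screaming Frog. The function name strip_leading_steps is hypothetical, and it assumes the post-specific part is always the first two steps of the selector, as in the example above.

```python
def strip_leading_steps(selector: str, steps_to_drop: int = 2) -> str:
    """Drop the first `steps_to_drop` steps of a selector joined by '>'."""
    # Split on the child combinator and tidy the whitespace around each step.
    parts = [part.strip() for part in selector.split(">")]
    # Keep everything after the post-specific prefix and rejoin it.
    return " > ".join(parts[steps_to_drop:])


copied = "#post-1229 > div > header > div > span.posted-on > time.entry-date.published"
trimmed = strip_leading_steps(copied)
print(trimmed)
```

For the selector copied above, this prints the same trimmed path that goes into Screaming Frog in the next step.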

Step 3: Paste the CSS Path in Screaming Frog Custom Extraction and Run the crawl

[Screenshot: the Screaming Frog Custom Extraction dialog]

Open the Custom Extraction dialog in Screaming Frog.

In the first field, type a name such as [date]. This is only a label for your own reference, so you can call it whatever you want. In the second dropdown, select CSSPath. In the third field, paste the trimmed selector from Step 2, that is, everything from "header" onwards (header > div > span.posted-on > time.entry-date.published).

In the fourth dropdown, select [Extract Text], since the visible date text is what you want to extract. Then click OK and run the crawl.
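To get a feel for what the text extraction returns, here is a rough Python sketch that uses only the standard library. It is not how Screaming Frog works internally. It is a simplified matcher that checks only the last step of the selector (a time element with the entry-date and published classes), and both the SAMPLE_HTML markup and the PublishedDateParser class are hypothetical.

```python
from html.parser import HTMLParser


class PublishedDateParser(HTMLParser):
    """Collect the text inside <time class="entry-date published"> elements."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.dates = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        # Match only <time> elements that carry both classes.
        if tag == "time" and {"entry-date", "published"} <= set(classes):
            self._inside = True
            self.dates.append("")

    def handle_endtag(self, tag):
        if tag == "time":
            self._inside = False

    def handle_data(self, data):
        # Accumulate text only while inside a matching <time> element.
        if self._inside:
            self.dates[-1] += data


# Hypothetical markup that mirrors the structure of the copied selector.
SAMPLE_HTML = """
<article id="post-1229"><div><header><div>
  <span class="posted-on">
    <time class="entry-date published" datetime="2023-05-14T09:00:00+00:00">May 14, 2023</time>
    <time class="updated" datetime="2023-06-01T09:00:00+00:00">June 1, 2023</time>
  </span>
</div></header></div></article>
"""

parser = PublishedDateParser()
parser.feed(SAMPLE_HTML)
extracted = [text.strip() for text in parser.dates]
print(extracted)
```

Only the published date is picked up here; the second time element is skipped because it lacks the entry-date and published classes.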

Here is the outcome:

[Screenshot: publish dates extracted by Screaming Frog]

Note: Dates appear only for pages that have a visible date label.

P.S. The selector that you copy and paste can differ from platform to platform, and even from theme to theme.

For example, Shopify’s code would be different.

Use case:

Knowing the publish dates of all the blog posts lets you measure a site's content velocity during an audit. You can see whether the site has been publishing content consistently or whether there are abrupt breaks in between.
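Once the dates are exported from the crawl, content velocity can be estimated with a short script. The sketch below is a minimal example under a few assumptions: the dates are written like "May 14, 2023" (adjust date_format to match your site), the function name content_velocity is hypothetical, and the sample list is made-up data for illustration.

```python
from collections import Counter
from datetime import datetime


def content_velocity(date_strings, date_format="%B %d, %Y"):
    """Return (posts per month, longest gap in days between consecutive posts)."""
    dates = sorted(
        datetime.strptime(text.strip(), date_format).date() for text in date_strings
    )
    # Count posts per calendar month, keyed as YYYY-MM.
    posts_per_month = Counter(d.strftime("%Y-%m") for d in dates)
    # Measure the gap in days between each pair of consecutive posts.
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    return dict(posts_per_month), max(gaps, default=0)


# Made-up dates, as they might appear in the extraction column.
sample = ["January 5, 2023", "January 19, 2023", "February 2, 2023", "June 10, 2023"]
per_month, longest_gap_days = content_velocity(sample)
print(per_month)
print(longest_gap_days)
```

A long gap next to otherwise regular monthly counts is the kind of abrupt break worth flagging in an audit.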

Furthermore, you can do trend analysis. For example, you can connect the Google Search Console and Google Analytics APIs in Screaming Frog to pull traffic data for each blog post, and then check whether a certain format or topic of content published on certain dates tends to go viral.
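As one possible way to start that analysis, the sketch below groups posts by the weekday they were published and averages their clicks. It assumes you have already combined the extracted dates with a clicks column in your export. The rows here are made-up numbers, and the function name average_clicks_by_weekday is hypothetical.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def average_clicks_by_weekday(rows):
    """rows: iterable of (publish_date, clicks). Returns {weekday name: mean clicks}."""
    totals = defaultdict(lambda: [0, 0])  # weekday -> [total clicks, post count]
    for published, clicks in rows:
        bucket = totals[WEEKDAYS[published.weekday()]]
        bucket[0] += clicks
        bucket[1] += 1
    return {day: total / count for day, (total, count) in totals.items()}


# Made-up example rows: (publish date, clicks).
rows = [
    (date(2023, 1, 2), 120),  # Monday
    (date(2023, 1, 9), 80),   # Monday
    (date(2023, 1, 5), 300),  # Thursday
]
averages = average_clicks_by_weekday(rows)
print(averages)
```

The same grouping works for months or topics; swap the key used for the bucket.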

Bonus:

Here is the CSS Path you will need for extracting publishing dates for a Shopify blog

header > div > div.grid__item.grid__item--tablet-up-two-thirds > ul:nth-child(2) > li:nth-child(1) > time

P.S. I took this from Shopify.in, so I am not sure whether it will work on all Shopify themes.
