What if I told you it is possible to capture a full-screen screenshot for a list of URLs & have it stored in a folder on your local machine?
I recently came across a situation where I had to visually scan hundreds of webpage to check whether they have a specific visual element that I am looking for.
Manually opening hundreds of pages to check that would have been tedious and time-consuming and at the same time Screaming Frog XPATH Custom Extraction wouldn’t have worked because this site was also hiding the same visual element via CSS.
This means that Screaming Frog would have detected it on the code level but in actuality, it would have been a different case. That is when I tinkered around with the idea of building a Python Script that can store visual screenshots in a folder then I can just enlarge the image thumbnails & quickly scan whether the said pages have the visual element that I am looking for.
The quantity was relatively less that’s why I didn’t build an additional Python Script that utilizes OCR Python Library to do the job of detection but that’s also possible 😉.
Anyways without further ado here is the script that you were looking for.
Step 1 - Install the necessary Libraries
!pip install selenium webdriver-manager pillow
!apt-get update
!apt-get install -y wget unzip
!wget -q -O /tmp/google-chrome-stable_current_amd64.deb https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
!dpkg -i /tmp/google-chrome-stable_current_amd64.deb || true
!apt-get -f install -y
!apt-get install -y google-chrome-stable
!pip install selenium pillow
!pip install chromedriver-autoinstaller
Step 2 - Specify your List of URLs in the next code block
urls = [
"https://www.decodedigitalmarket.com/should-you-dns-prefetch-all-3rd-party-domains/",
"https://www.decodedigitalmarket.com/how-to-display-last-updated-date-on-wordpress-posts-with-php/",
"https://www.decodedigitalmarket.com/how-to-turn-python-code-into-streamlit-app/",
]
Step 3 - Import Dependecies & Specify Functions along with specifying the Environments
import os
import io
import chromedriver_autoinstaller
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from PIL import Image
# Function to take full page screenshot
def take_fullpage_screenshot(driver, url, output_path):
driver.get(url)
# Get the total height of the page
total_height = driver.execute_script("return document.body.scrollHeight")
# Set the window size to the total height
driver.set_window_size(1920, total_height)
# Take screenshot
screenshot = driver.get_screenshot_as_png()
# Save the screenshot using PIL
image = Image.open(io.BytesIO(screenshot))
image.save(output_path)
def main(urls, output_folder):
# Set up the Chrome options
chrome_options = Options()
chrome_options.add_argument("--headless") # Run headless Chrome
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage") # Overcome limited resource problems
# Create the output folder if it doesn't exist
if not os.path.exists(output_folder):
os.makedirs(output_folder)
# Automatically install and setup ChromeDriver
chromedriver_autoinstaller.install()
# Initialize the Chrome driver
driver = webdriver.Chrome(options=chrome_options)
# Loop through the URLs and take screenshots
for idx, url in enumerate(urls):
output_path = os.path.join(output_folder, f"screenshot_{idx+1}.png")
take_fullpage_screenshot(driver, url, output_path)
print(f"Saved screenshot of {url} to {output_path}")
# Quit the driver
driver.quit()
Step 4 - Specify the Output Folder where you want to store the screenshots
# Output folder to save screenshots
output_folder = "screenshots"
# Run the main function
main(urls, output_folder)
Step 5 - Specify the function to store all the extracted screenshots in a Zip folder
import shutil
shutil.make_archive("screenshots", 'zip', "screenshots")
That’s all it takes to store the full-screen page screenshots in a zip folder which you can download & store in your system.
Here below is an example of the stored screenshot.
Hope this helps!
Kunjal Chawhan founder of Decode Digital Market, a Digital Marketer by profession, and a Digital Marketing Niche Blogger by passion, here to share my knowledge