Capture Full-Screen Screenshots of URLs with Python

What if I told you it is possible to capture a full-screen screenshot for a list of URLs & have it stored in a folder on your local machine?

I recently came across a situation where I had to visually scan hundreds of webpage to check whether they have a specific visual element that I am looking for.

Manually opening hundreds of pages to check that would have been tedious and time-consuming and at the same time Screaming Frog XPATH Custom Extraction wouldn’t have worked because this site was also hiding the same visual element via CSS.

This means that Screaming Frog would have detected it on the code level but in actuality, it would have been a different case. That is when I tinkered around with the idea of building a Python Script that can store visual screenshots in a folder then I can just enlarge the image thumbnails & quickly scan whether the said pages have the visual element that I am looking for.

The quantity was relatively less that’s why I didn’t build an additional Python Script that utilizes OCR Python Library to do the job of detection but that’s also possible 😉.

Anyways without further ado here is the script that you were looking for.

Step 1 - Install the necessary Libraries

				
					!pip install selenium webdriver-manager pillow
!apt-get update
!apt-get install -y wget unzip
!wget -q -O /tmp/google-chrome-stable_current_amd64.deb https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
!dpkg -i /tmp/google-chrome-stable_current_amd64.deb || true
!apt-get -f install -y
!apt-get install -y google-chrome-stable
!pip install selenium pillow
!pip install chromedriver-autoinstaller
				
			

Step 2 - Specify your List of URLs in the next code block

				
					urls = [
"https://www.decodedigitalmarket.com/should-you-dns-prefetch-all-3rd-party-domains/",
"https://www.decodedigitalmarket.com/how-to-display-last-updated-date-on-wordpress-posts-with-php/",
"https://www.decodedigitalmarket.com/how-to-turn-python-code-into-streamlit-app/",
]
				
			

Step 3 - Import Dependecies & Specify Functions along with specifying the Environments

				
					import os
import io
import chromedriver_autoinstaller
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from PIL import Image

# Function to take full page screenshot
def take_fullpage_screenshot(driver, url, output_path):
    driver.get(url)
    # Get the total height of the page
    total_height = driver.execute_script("return document.body.scrollHeight")
    # Set the window size to the total height
    driver.set_window_size(1920, total_height)
    # Take screenshot
    screenshot = driver.get_screenshot_as_png()
    # Save the screenshot using PIL
    image = Image.open(io.BytesIO(screenshot))
    image.save(output_path)

def main(urls, output_folder):
    # Set up the Chrome options
    chrome_options = Options()
    chrome_options.add_argument("--headless")  # Run headless Chrome
    chrome_options.add_argument("--disable-gpu")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")  # Overcome limited resource problems

    # Create the output folder if it doesn't exist
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)

    # Automatically install and setup ChromeDriver
    chromedriver_autoinstaller.install()

    # Initialize the Chrome driver
    driver = webdriver.Chrome(options=chrome_options)

    # Loop through the URLs and take screenshots
    for idx, url in enumerate(urls):
        output_path = os.path.join(output_folder, f"screenshot_{idx+1}.png")
        take_fullpage_screenshot(driver, url, output_path)
        print(f"Saved screenshot of {url} to {output_path}")

    # Quit the driver
    driver.quit()
				
			

Step 4 - Specify the Output Folder where you want to store the screenshots

				
					# Output folder to save screenshots
output_folder = "screenshots"

# Run the main function
main(urls, output_folder)
				
			

Step 5 - Specify the function to store all the extracted screenshots in a Zip folder

				
					import shutil

shutil.make_archive("screenshots", 'zip', "screenshots")
				
			

That’s all it takes to store the full-screen page screenshots in a zip folder which you can download & store in your system.

Here below is an example of the stored screenshot.

Hope this helps!

Leave a Comment