
Automating YouTube Transcriptions with Python and Selenium


This blog post walks through a Python script that retrieves YouTube transcriptions automatically using Selenium. Given a YouTube video URL, the script extracts the video title and the full transcription, then saves both to a text file. This can be useful for archiving, analyzing content, or creating summaries.

Prerequisites

You’ll need:

  • Python: Version 3.6 or later
  • Selenium: Python package for automating web browsers
  • ChromeDriver: Required to run Chrome browser instances with Selenium

Install Selenium using:


        pip install selenium
    

Downloading ChromeDriver

To run this script, you need ChromeDriver to control the Chrome browser via Selenium. Here’s how to download it:

  1. Go to the official Chrome for Testing page, which hosts current ChromeDriver builds.
  2. Find the version that matches your installed version of Chrome.
  3. Download the appropriate version for your operating system.
  4. Extract the chromedriver executable and place it in a known directory, like a selenium folder within your project directory.
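Before launching the browser, it can help to confirm that the extracted executable is where the script expects it. The following is a minimal sketch, not part of the original script: resolve_chromedriver is a hypothetical helper name, and it only checks that the file exists, not that its version matches your Chrome.

```python
from pathlib import Path


def resolve_chromedriver(path):
    """Return the absolute path to chromedriver, or raise if it is missing.

    Hypothetical helper for illustration; it is not part of Selenium.
    """
    candidate = Path(path).expanduser().resolve()
    if not candidate.is_file():
        raise FileNotFoundError(f"ChromeDriver not found at {candidate}")
    return str(candidate)
```

The returned string can be passed wherever the script expects a ChromeDriver path, so a misplaced executable fails early with a clear message.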

Code Walkthrough

Step 1: Initialize the WebDriver

The initialize_driver function sets up a Chrome WebDriver in headless mode, keeping the browser invisible during the script’s execution.


        from selenium import webdriver
        from selenium.webdriver.chrome.service import Service 
        from selenium.webdriver.chrome.options import Options
        from selenium.webdriver.common.by import By

        def initialize_driver(CHROMEDRIVER_PATH, headless=True):
            chrome_options = Options()
            if headless: 
                chrome_options.add_argument('--headless')  
            service = Service(CHROMEDRIVER_PATH)
            driver = webdriver.Chrome(service=service, options=chrome_options)
            return driver
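
If anything fails between creating the driver and quitting it, a Chrome process can be left running. One way to guard against that is a small context manager that always calls quit(). This is a sketch under assumptions: managed_driver and factory are hypothetical names, and factory is any zero-argument callable that returns an object with a quit() method.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Yield a driver built by `factory` and always quit it afterwards.

    Hypothetical helper; `factory` is any zero-argument callable that
    returns an object exposing quit().
    """
    driver = factory()
    try:
        yield driver
    finally:
        # Runs on success and on error, so the browser is always closed.
        driver.quit()
```

It could be used as `with managed_driver(lambda: initialize_driver(CHROMEDRIVER_PATH)) as driver:`. If you adopt this pattern, any other driver.quit() call in your code becomes redundant and is probably best removed.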
    

Step 2: Extract Title and Transcript

The get_transcriptions function opens a YouTube video, clicks the buttons needed to display the transcript, and extracts each line.


        import time

        from selenium.common.exceptions import NoSuchElementException

        def get_transcriptions(driver, video_url, button1_xpath, button2_xpath, title_xpath):
            # Load the page and give it time to render
            driver.get(video_url)
            time.sleep(5)
            video_title = driver.find_element(By.XPATH, title_xpath).text
            time.sleep(2)
            # Click the two buttons that reveal the transcript panel
            driver.find_element(By.XPATH, button1_xpath).click()
            time.sleep(2)
            driver.find_element(By.XPATH, button2_xpath).click()
            time.sleep(2)

            # Read segments one by one until the next index does not exist
            transcript, index = '', 1
            while True:
                transcript_xpath = f'//*[@id="segments-container"]/ytd-transcript-segment-renderer[{index}]/div/yt-formatted-string'
                try:
                    sentence = driver.find_element(By.XPATH, transcript_xpath).text
                except NoSuchElementException:
                    break
                transcript = transcript + sentence + '\n'
                index += 1

            driver.quit()
            return transcript, video_title
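
The loop above issues one lookup per transcript line. As an alternative sketch, a single find_elements call can fetch every segment at once. The XPath is the one from the function with the index removed. collect_transcript and SEGMENT_XPATH are hypothetical names, and the string "xpath" is the value of Selenium's By.XPATH, written out here so the snippet needs no imports.

```python
SEGMENT_XPATH = (
    '//*[@id="segments-container"]'
    '/ytd-transcript-segment-renderer/div/yt-formatted-string'
)


def collect_transcript(driver):
    """Join the text of every transcript segment currently on the page."""
    # "xpath" is the string value of selenium's By.XPATH.
    segments = driver.find_elements("xpath", SEGMENT_XPATH)
    # find_elements returns an empty list when nothing matches,
    # so no exception handling is needed for a missing transcript.
    return "\n".join(segment.text for segment in segments)
```

Unlike the loop, this version does not append a trailing newline after the last line, and it returns an empty string when the transcript panel has no segments.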
    

Step 3: Define XPaths and Run the Script

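Define the XPaths for the title element and for the two buttons that reveal the transcript (the description's "more" expander and the "Show transcript" button). These XPaths depend on YouTube's current page markup, so they may need adjusting after a YouTube update.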


        title_xpath = '//*[@id="title"]/h1/yt-formatted-string'
        more_xpath = '//*[@id="expand"]'
        show_transcript_xpath = '//*[@id="primary-button"]/ytd-button-renderer/yt-button-shape/button/yt-touch-feedback-shape/div/div[2]'
        CHROMEDRIVER_PATH = r'selenium\chromedriver.exe'

        # Initialize driver
        driver = initialize_driver(CHROMEDRIVER_PATH, headless=True)

        # URL of the YouTube video
        video_url = "https://www.youtube.com/watch?v=x6TsR3y5Qfg"

        # Fetch transcription
        transcription, video_title = get_transcriptions(driver, video_url, more_xpath, show_transcript_xpath, title_xpath)

        # Save transcription to file
        with open("transcription.txt", "w", encoding="utf-8") as file:
            file.write(f"Video Title: {video_title}\n\n")
            file.write("Transcription:\n")
            file.write(transcription)
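
Writing every result to transcription.txt overwrites the previous run. A possible variation, sketched below, names the file after the video title. safe_filename and save_transcript are hypothetical helpers, and the set of stripped characters is an assumption based on what Windows filenames reject.

```python
import re
from pathlib import Path


def safe_filename(title, fallback="transcription"):
    """Turn a video title into a filesystem-friendly file stem."""
    # Replace characters that Windows filenames do not allow.
    cleaned = re.sub(r'[\\/:*?"<>|]+', " ", title)
    # Collapse whitespace and trim surrounding spaces and dots.
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    return cleaned or fallback


def save_transcript(title, transcript, out_dir="."):
    """Write the title and transcript to '<out_dir>/<safe title>.txt'."""
    path = Path(out_dir) / f"{safe_filename(title)}.txt"
    # UTF-8 keeps non-ASCII captions intact regardless of the OS default.
    path.write_text(
        f"Video Title: {title}\n\nTranscription:\n{transcript}",
        encoding="utf-8",
    )
    return path
```

Calling save_transcript(video_title, transcription) would then produce one file per video, in the same layout as the original output.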
    

Running the Script

Ensure that chromedriver.exe is available in the specified directory or update CHROMEDRIVER_PATH to the full path.

Save the code above as youtube_transcription.py, then run it:


        python youtube_transcription.py
    

The script will create a transcription.txt file containing the video title and transcription.

Code Enhancements

  • Error Handling: Adding try-except blocks for specific elements helps in case YouTube’s structure changes.
  • Adjustable Wait Times: Instead of time.sleep(), use Selenium’s WebDriverWait for more reliable element loading.
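
To illustrate the second point, the sketch below shows the polling idea behind an explicit wait: check a condition repeatedly and stop as soon as it holds, instead of always sleeping for a fixed time. wait_until is a hypothetical, dependency-free helper written for illustration; it is not Selenium's implementation.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Hypothetical helper illustrating explicit waits; raises TimeoutError
    if `timeout` seconds pass without a truthy result.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)
```

In the actual script, Selenium's own WebDriverWait plays this role. For example, WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, more_xpath))), with EC imported from selenium.webdriver.support.expected_conditions, returns the element once it is clickable and raises TimeoutException otherwise.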

Conclusion

This script automates YouTube transcript extraction, making it a valuable tool for content creation, educational purposes, and accessibility. Feel free to modify and enhance it to support other websites or to batch-process multiple videos.
