Installation

Prerequisites

Before installing, ensure you have:

Python 3.10 or higher
OpenAI API key (create one at platform.openai.com/api-keys)
FFmpeg with development headers
ImageMagick for subtitle rendering
NVIDIA GPU with CUDA support (optional but recommended for 5-10x faster transcription)

GPU acceleration requires an NVIDIA GPU with CUDA support. For systems without a compatible GPU, see the CPU-Only Installation section.
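The prerequisites above can be checked from a script before going further. The following is a minimal sketch, not part of the project: the executable names (`magick` or `convert` for ImageMagick, `nvidia-smi` for the NVIDIA driver) are assumptions about a typical install, and finding them on PATH does not prove that FFmpeg development headers are present.

```python
import shutil
import sys

# Minimum Python version stated in the prerequisites list.
MIN_PYTHON = (3, 10)

# Executable names are assumptions: ImageMagick 7 ships `magick`,
# while ImageMagick 6 (common on Ubuntu/Debian) ships `convert`.
TOOLS = {
    "ffmpeg": ["ffmpeg"],
    "imagemagick": ["magick", "convert"],
    "nvidia-smi (optional)": ["nvidia-smi"],
}


def check_prerequisites() -> dict:
    """Return a mapping of prerequisite name -> whether it was found."""
    results = {"python>=3.10": sys.version_info[:2] >= MIN_PYTHON}
    for name, candidates in TOOLS.items():
        # shutil.which returns None when the executable is not on PATH.
        results[name] = any(shutil.which(exe) for exe in candidates)
    return results


for name, ok in check_prerequisites().items():
    print(f"{'OK     ' if ok else 'MISSING'}  {name}")
```

A missing `nvidia-smi` only means the GPU path is unavailable; the CPU-only instructions still apply.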

Ubuntu/Debian Installation

Clone the Repository

git clone https://github.com/SamurAIGPT/AI-Youtube-Shorts-Generator.git
cd AI-Youtube-Shorts-Generator

Install System Dependencies

sudo apt install -y ffmpeg libavdevice-dev libavfilter-dev \
  libopus-dev libvpx-dev pkg-config libsrtp2-dev imagemagick

These packages provide:

ffmpeg: Video/audio processing
libavdevice-dev, libavfilter-dev: FFmpeg development libraries
libopus-dev, libvpx-dev: Audio/video codec support
pkg-config: Build configuration tool
libsrtp2-dev: Secure RTP protocol support
imagemagick: Subtitle rendering

Fix ImageMagick Security Policy

ImageMagick has a restrictive security policy by default that prevents subtitle rendering:

sudo sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml

This step is required on Linux for subtitles to work. Without it, subtitle generation will fail silently.
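Because the failure is silent, it can help to confirm that the edit took effect. The sketch below is an illustrative helper (the function name is hypothetical) that reads the rights granted to the `@*` path pattern from policy text; it uses simple regular expressions rather than a full XML parser.

```python
import re


def path_policy_rights(policy_xml: str, pattern: str = "@*"):
    """Return the rights granted to `pattern` in the path domain, or None."""
    # Drop XML comments so commented-out example policies are ignored.
    text = re.sub(r"<!--.*?-->", "", policy_xml, flags=re.DOTALL)
    for tag in re.findall(r"<policy\b[^>]*>", text):
        attrs = dict(re.findall(r'(\w+)="([^"]*)"', tag))
        if attrs.get("domain") == "path" and attrs.get("pattern") == pattern:
            return attrs.get("rights")
    return None


# Sample policy text in the state it has before the sed command is run.
sample = '<policymap><policy domain="path" rights="none" pattern="@*"/></policymap>'
print(path_policy_rights(sample))
```

To check the real file, pass the contents of /etc/ImageMagick-6/policy.xml to the function; a result of `read|write` means the policy has been relaxed.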

Create Virtual Environment

python3.10 -m venv venv
source venv/bin/activate

Install Python Dependencies

pip install -r requirements.txt

This installs key dependencies:

faster-whisper (1.0.1): GPU-accelerated speech transcription
torch (2.7.1): PyTorch with CUDA support
langchain-openai (0.3.0): GPT-4o-mini integration
moviepy (1.0.3): Video editing and manipulation
opencv-python (4.8.1.78): Face detection and cropping
pytubefix (9.1.1): YouTube video downloading

Configure Environment Variables

Create a .env file in the project root:

OPENAI_API=your_openai_api_key_here

Replace your_openai_api_key_here with your actual OpenAI API key from platform.openai.com/api-keys.
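A quick way to catch a forgotten placeholder is to parse the file and inspect the value. This is a simplified sketch with hypothetical helper names; it handles only plain `KEY=value` lines and is not a replacement for whatever loader the project itself uses.

```python
PLACEHOLDER = "your_openai_api_key_here"


def read_env_value(env_text: str, name: str):
    """Return the value assigned to `name` in .env-style text, or None."""
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Tolerate optional surrounding quotes.
            return value.strip().strip("'\"")
    return None


def key_looks_configured(env_text: str) -> bool:
    """True when OPENAI_API is set to something other than the placeholder."""
    value = read_env_value(env_text, "OPENAI_API")
    return bool(value) and value != PLACEHOLDER


print(key_looks_configured("OPENAI_API=your_openai_api_key_here"))
```

Reading the real file is a matter of passing `open(".env").read()` to `key_looks_configured`.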

Verify Installation

Test that GPU acceleration is working:

python -c "import torch; print('CUDA available:', torch.cuda.is_available())"

Expected output: CUDA available: True

If it shows False, you may need to install CUDA drivers or use CPU-only mode.
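The same check can drive a device choice in code. The sketch below is an assumption about how such a fallback might look, not the project's actual logic: `float16` and `int8` are compute types that faster-whisper accepts, but whether the tool uses these exact values is not confirmed here.

```python
def cuda_available() -> bool:
    """Report whether PyTorch can see a CUDA device; False if torch cannot load."""
    try:
        import torch  # installed via requirements.txt
    except (ImportError, OSError):
        return False
    return bool(torch.cuda.is_available())


def pick_device(has_cuda: bool) -> tuple:
    # Assumed pairing: half precision on GPU, 8-bit quantization on CPU.
    return ("cuda", "float16") if has_cuda else ("cpu", "int8")


device, compute_type = pick_device(cuda_available())
print(f"device={device} compute_type={compute_type}")
```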

macOS Installation

Clone the Repository

git clone https://github.com/SamurAIGPT/AI-Youtube-Shorts-Generator.git
cd AI-Youtube-Shorts-Generator

Install Homebrew (if not already installed)

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Install System Dependencies

brew install ffmpeg imagemagick

Create Virtual Environment

python3.10 -m venv venv
source venv/bin/activate

Install Python Dependencies

pip install -r requirements.txt

macOS does not support CUDA, so transcription will run on CPU. For faster processing, consider using a cloud GPU instance or the AI Clipping API.

Configure Environment Variables

Create a .env file:

echo "OPENAI_API=your_openai_api_key_here" > .env

Windows Installation

Clone the Repository

git clone https://github.com/SamurAIGPT/AI-Youtube-Shorts-Generator.git
cd AI-Youtube-Shorts-Generator

Install FFmpeg

# Run PowerShell as Administrator
choco install ffmpeg -y

Install ImageMagick

choco install imagemagick -y

After installation, configure the security policy:

Open C:\Program Files\ImageMagick-7.x.x-Q16-HDRI\config\policy.xml
Find: <policy domain="path" rights="none" pattern="@*"/>
Change to: <policy domain="path" rights="read|write" pattern="@*"/>
Save the file

Create Virtual Environment

python -m venv venv
.\venv\Scripts\Activate.ps1

If you get an execution policy error, run:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

Install Python Dependencies

pip install -r requirements.txt

Configure Environment Variables

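Create a .env file in the project root. In Windows PowerShell 5.1 the > redirect writes UTF-16, which .env parsers may fail to read, so set the encoding explicitly:

Set-Content -Path .env -Value "OPENAI_API=your_openai_api_key_here" -Encoding ascii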

Run the Tool

On Windows, run Python directly (instead of using run.sh):

python main.py "https://youtu.be/VIDEO_ID"

Or for interactive mode:

python main.py

CPU-Only Installation

If you don’t have an NVIDIA GPU, you can run the tool in CPU-only mode. Transcription will be significantly slower (5-10x), but all features remain functional.

Ubuntu/Debian (CPU)

Install System Dependencies

sudo apt install -y ffmpeg libavdevice-dev libavfilter-dev \
  libopus-dev libvpx-dev pkg-config libsrtp2-dev imagemagick

sudo sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml

Create Virtual Environment

python3.10 -m venv venv
source venv/bin/activate

Install CPU PyTorch First

pip install torch --index-url https://download.pytorch.org/whl/cpu

Important: Install CPU PyTorch before installing other dependencies to avoid downloading CUDA packages.

Install Other Dependencies

pip install -r requirements-cpu.txt

If requirements-cpu.txt doesn’t exist, use requirements.txt but skip CUDA-related packages.

Verify CPU Mode

python -c "import torch; print('CUDA available:', torch.cuda.is_available())"

Should output: CUDA available: False

Windows (CPU)

Install System Dependencies

choco install ffmpeg imagemagick -y

Configure ImageMagick policy as described in the Windows Installation section.

Create Virtual Environment

python -m venv venv
.\venv\Scripts\Activate.ps1

Install CPU PyTorch

pip install torch --index-url https://download.pytorch.org/whl/cpu

Install Other Dependencies

pip install -r requirements-cpu.txt

macOS (CPU)

macOS installation is CPU-only by default. Follow the standard macOS installation instructions.

Performance Note: CPU transcription of a 5-minute video may take 2-5 minutes compared to ~30 seconds with GPU acceleration.
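For a rough planning figure, the timings in the note can be scaled to other video lengths. This sketch assumes processing time grows linearly with duration, which is a simplification; real throughput depends on hardware, model size, and audio content.

```python
# Processing seconds per 300 s (5 min) of video, taken from the note above:
# GPU is about 30 s, CPU is 120 to 300 s (2 to 5 min).
REFERENCE_VIDEO_SECONDS = 300
PROCESSING_SECONDS = {"gpu": (30, 30), "cpu": (120, 300)}


def estimate_transcription_seconds(video_seconds: float, device: str) -> tuple:
    """Linearly scale the reference timings; returns (low, high) in seconds."""
    low, high = PROCESSING_SECONDS[device]
    return (
        video_seconds * low / REFERENCE_VIDEO_SECONDS,
        video_seconds * high / REFERENCE_VIDEO_SECONDS,
    )


# Example: a 20-minute video on each device.
for device in ("gpu", "cpu"):
    low, high = estimate_transcription_seconds(20 * 60, device)
    print(f"{device}: {low / 60:.1f} to {high / 60:.1f} minutes")
```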

Docker Installation

Docker provides a containerized environment with all dependencies pre-configured, including GPU support.

Prerequisites

Docker 20.10 or later
Docker Compose 1.29 or later
For GPU support: NVIDIA Docker runtime

Using Docker Compose (Recommended)

Clone the Repository

git clone https://github.com/SamurAIGPT/AI-Youtube-Shorts-Generator.git
cd AI-Youtube-Shorts-Generator

Configure Environment Variables

Create a .env file:

OPENAI_API=your_openai_api_key_here

Build and Run Container

docker-compose up --build

The container configuration:

Base image: nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04
GPU support: Enabled via NVIDIA runtime
Mounts: .env file, ./videos (input), ./output (output)
Interactive mode: Enabled for URL input

Process Videos

Interactive mode:

docker-compose run youtube-shorts-generator ./run.sh

With YouTube URL:

docker-compose run youtube-shorts-generator ./run.sh "https://youtu.be/VIDEO_ID"

With local file:

# Place video in ./videos directory
docker-compose run youtube-shorts-generator ./run.sh "/app/videos/video.mp4"

Manual Docker Build

Build Image

docker build -t ai-shorts-generator .

Run Container

docker run --gpus all \
  -v $(pwd)/.env:/app/.env \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it ai-shorts-generator ./run.sh "https://youtu.be/VIDEO_ID"

Flags explained:

--gpus all: Enable GPU acceleration
-v $(pwd)/.env:/app/.env: Mount environment variables
-v $(pwd)/videos:/app/videos: Mount input videos
-v $(pwd)/output:/app/output: Mount output directory
-it: Interactive terminal for prompts
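When the container is launched from a script, the same flags can be assembled as an argument list. This is a sketch under stated assumptions: the helper name is hypothetical, and the image name and mount targets simply mirror the command above. Resolving the project directory to an absolute path avoids relying on shell expansion of `$(pwd)`.

```python
import shlex
from pathlib import Path


def docker_run_args(project_dir, url=None, gpu=True, image="ai-shorts-generator"):
    """Build the `docker run` argument list shown above."""
    root = Path(project_dir).resolve()
    args = ["docker", "run"]
    if gpu:
        args += ["--gpus", "all"]
    # Mount the env file plus the input and output directories.
    for name in (".env", "videos", "output"):
        args += ["-v", f"{root / name}:/app/{name}"]
    args += ["-it", image, "./run.sh"]
    if url:
        args.append(url)
    return args


# Print the command instead of running it.
print(shlex.join(docker_run_args(".", "https://youtu.be/VIDEO_ID")))
```

Passing `gpu=False` drops `--gpus all` for hosts without the NVIDIA runtime. The list can be handed to `subprocess.run(args, check=True)` to start the container.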

CPU-Only Docker

Remove GPU-specific configuration:

docker run \
  -v $(pwd)/.env:/app/.env \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it ai-shorts-generator ./run.sh

Environment Variables

The tool requires the following environment variable:

| Variable | Required | Description | Example |
| --- | --- | --- | --- |
| `OPENAI_API` | Yes | OpenAI API key for GPT-4o-mini | `sk-proj-...` |

Configuration file: .env in project root

OPENAI_API=sk-proj-1234567890abcdefghijklmnopqrstuvwxyz

Security: Never commit your .env file to version control. Add it to .gitignore.

Verifying Your Installation

Test your installation with a short video:

./run.sh "https://youtu.be/dQw4w9WgXcQ"

Successful installation will:

Download the video
Extract and transcribe audio
Analyze transcript with GPT-4o-mini
Present highlight selection for approval
Process and output vertical short

Troubleshooting

CUDA/GPU Issues

Problem: torch.cuda.is_available() returns False

Solutions:

Verify NVIDIA drivers are installed:
```
nvidia-smi
```

Check CUDA library paths:

export LD_LIBRARY_PATH=$(find $(pwd)/venv/lib/python3.10/site-packages/nvidia -name "lib" -type d | paste -sd ":" -)

The run.sh script handles this automatically.
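To see which directories the export command would pick up, the search can be reproduced in Python. This sketch mirrors the `find ... -name "lib" -type d` call; the function names are hypothetical, and the site-packages path in the example is the one used in the command above.

```python
from pathlib import Path


def nvidia_lib_dirs(site_packages) -> list:
    """List every directory named 'lib' under <site_packages>/nvidia."""
    root = Path(site_packages) / "nvidia"
    if not root.is_dir():
        return []
    return sorted(str(p) for p in root.rglob("lib") if p.is_dir())


def ld_library_path(site_packages) -> str:
    """Join the directories with ':' as LD_LIBRARY_PATH expects."""
    return ":".join(nvidia_lib_dirs(site_packages))


# Prints an empty line if the virtual environment has no nvidia packages.
print(ld_library_path("venv/lib/python3.10/site-packages"))
```

An empty result suggests the CUDA wheels were not installed into the virtual environment. The variable has to be set before Python starts, since changing it inside a running process does not affect libraries the dynamic loader has already searched for.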

Reinstall PyTorch with CUDA:

pip uninstall torch
pip install torch --index-url https://download.pytorch.org/whl/cu121

ImageMagick Subtitle Issues

Problem: No subtitles appear in output video

Solution: Check the ImageMagick policy:

grep 'pattern="@\*"' /etc/ImageMagick-6/policy.xml

The matching line should show rights="read|write". If it does not, apply the fix:

sudo sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml

Face Detection Issues

Problem: Cropping doesn’t center on faces

Causes:

Video needs visible faces in first 30 frames
Low-resolution videos have less reliable detection
For screen recordings, motion tracking applies automatically

Solution: Adjust the face detection sensitivity in the detectMultiScale call in Components/FaceCrop.py:

minNeighbors=8  # Higher = fewer false positives
minSize=(30, 30)  # Minimum face size in pixels
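To illustrate how these parameters interact with cropping, the sketch below filters detections by a minimum size and centers a vertical window on the largest remaining face. It is a hypothetical model, not the code in Components/FaceCrop.py: the `(x, y, w, h)` rectangle format matches what OpenCV's `detectMultiScale` returns, while the largest-face rule and the 9:16 aspect ratio are assumptions.

```python
def pick_face_center(detections, min_size=(30, 30)):
    """Return the x-centre of the largest detection at least min_size, or None."""
    big_enough = [
        (x, y, w, h)
        for (x, y, w, h) in detections
        if w >= min_size[0] and h >= min_size[1]
    ]
    if not big_enough:
        return None
    x, y, w, h = max(big_enough, key=lambda r: r[2] * r[3])
    return x + w / 2


def vertical_crop_bounds(frame_w: int, frame_h: int, face_cx: float) -> tuple:
    """Return (x0, x1) of a full-height 9:16 window centred on face_cx."""
    crop_w = min(frame_w, frame_h * 9 // 16)
    x0 = int(face_cx - crop_w / 2)
    # Clamp so the window stays inside the frame.
    x0 = max(0, min(x0, frame_w - crop_w))
    return x0, x0 + crop_w


# One detection below the size threshold and one plausible face.
center = pick_face_center([(10, 10, 20, 20), (900, 200, 120, 120)])
print(center, vertical_crop_bounds(1920, 1080, center))
```

In this model, raising `min_size` discards small false positives, but it also returns `None` when the only real face is small, which is one way low-resolution footage can end up off-center.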

OpenAI API Issues

Problem: ERROR: Failed to get highlight from LLM

Causes:

Invalid or missing API key
Rate limiting
Network connectivity issues
Insufficient API credits

Solutions:

Verify API key in .env file
Check API usage at platform.openai.com/usage

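Test the API key. The command reads OPENAI_API from the shell environment, not from .env, so export the variable in your shell first: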

curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API"
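The same request can be made from Python using only the standard library, which is convenient on systems without curl. This is a minimal sketch with hypothetical function names; it treats any HTTP error status, such as 401 for a rejected key, as failure and lets network errors propagate.

```python
import urllib.error
import urllib.request

MODELS_URL = "https://api.openai.com/v1/models"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build the GET request that the curl command sends."""
    return urllib.request.Request(
        MODELS_URL, headers={"Authorization": f"Bearer {api_key}"}
    )


def key_is_accepted(api_key: str, timeout: float = 10.0) -> bool:
    """True on HTTP 200; False on an HTTP error status such as 401."""
    try:
        with urllib.request.urlopen(build_models_request(api_key), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False
```

Calling `key_is_accepted(os.environ["OPENAI_API"])` after importing `os` performs the live check; a `False` result points to an invalid key or another HTTP-level rejection rather than a connectivity problem.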

FFmpeg Not Found

Problem: ffmpeg: command not found

Solutions:

Ubuntu/Debian: sudo apt install ffmpeg
macOS: brew install ffmpeg
Windows: Add FFmpeg to system PATH

Python Version Issues

Problem: Module compatibility errors

Solution: Ensure Python 3.10+ is installed:

python --version  # Should show 3.10.x or higher

Install Python 3.10:

Ubuntu: sudo apt install python3.10 python3.10-venv
macOS: brew install python@3.10
Windows: Download from python.org

Next Steps

Quickstart Guide

Generate your first short in under 5 minutes

Usage Examples

Learn CLI commands and automation techniques

Configuration

Customize subtitle styling, AI prompts, and video settings

API Reference

Explore the codebase and component architecture
