Docker Installation
The project includes Docker and Docker Compose configurations for containerized execution with NVIDIA GPU support.
Docker setup requires NVIDIA Container Toolkit for GPU-accelerated Whisper transcription. CPU-only Docker support is available but significantly slower.
Prerequisites
Install Docker Compose
Docker Compose v2 is included with Docker Desktop. On Linux, install it from the Docker Compose docs if needed.
Install NVIDIA Container Toolkit (GPU only)
For CUDA-accelerated transcription:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
Verify GPU access:
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
Create .env file
Create a .env file in the project root with your OpenAI API key:
OPENAI_API=your_openai_api_key_here
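A missing or misnamed key is the most common startup failure, so it can be worth checking the file before launching a container. A minimal sketch (the function name is illustrative; the variable name OPENAI_API matches the convention above):

```shell
# Fail fast if the key file is missing or the OPENAI_API variable is absent,
# instead of discovering it mid-run inside the container.
check_env() {
  [ -f "$1" ] || { echo "missing $1"; return 1; }
  grep -q '^OPENAI_API=' "$1" || { echo "OPENAI_API not set in $1"; return 1; }
  echo "ok"
}
```

Run it as `check_env .env` before `docker compose up`.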
Dockerfile Configuration
The project uses an NVIDIA CUDA base image for GPU support.
Base Image
FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04
Provides:
CUDA 12.1.0 runtime
cuDNN 8 for deep learning
Ubuntu 22.04 base system
System Dependencies
RUN apt-get update && apt-get install -y \
python3.10 \
python3.10-venv \
python3-pip \
ffmpeg \
libavdevice-dev \
libavfilter-dev \
libopus-dev \
libvpx-dev \
pkg-config \
libsrtp2-dev \
imagemagick \
git \
wget \
&& rm -rf /var/lib/apt/lists/*
FFmpeg
Video processing, audio extraction, and format conversion.
ImageMagick
Subtitle rendering and text overlay generation.
Audio Libraries
libopus-dev, libvpx-dev, libsrtp2-dev for audio codec support.
Python 3.10
Required Python version with virtual environment support.
ImageMagick Policy Fix
Critical for subtitle rendering:
RUN sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml
Without this fix, ImageMagick will refuse to write temporary files, causing subtitle generation to fail.
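The substitution can be previewed on a sample line before rebuilding the image. A sketch (the sample input mirrors the stock Ubuntu ImageMagick 6 policy entry; the function name is illustrative):

```shell
# Apply the same sed substitution as the Dockerfile to a single line,
# showing the before/after without touching the image.
fix_policy_line() {
  echo "$1" | sed 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/'
}
```

Lines that don't match the restrictive policy pattern pass through unchanged.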
CUDA Library Path
ENV LD_LIBRARY_PATH=/usr/local/lib/python3.10/dist-packages/nvidia/cudnn/lib:/usr/local/lib/python3.10/dist-packages/nvidia/cublas/lib:$LD_LIBRARY_PATH
Ensures Whisper can find NVIDIA CUDA libraries for GPU acceleration.
Docker Compose Configuration
The docker-compose.yml file defines the service with GPU support and volume mounts.
GPU Configuration
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
GPU driver
driver: nvidia specifies NVIDIA GPU access
GPU count
count: 1 allocates one GPU (change for multi-GPU setups)
Capabilities
capabilities: [gpu] enables GPU compute access
Environment Variables
environment:
  - NVIDIA_VISIBLE_DEVICES=all
  - NVIDIA_DRIVER_CAPABILITIES=compute,utility
NVIDIA_VISIBLE_DEVICES
Controls which GPUs are visible to the container:
all: All GPUs available
0: Only GPU 0
0,1: GPUs 0 and 1
none: No GPU access (CPU-only)
NVIDIA_DRIVER_CAPABILITIES
Defines GPU capabilities:
compute: CUDA compute operations
utility: nvidia-smi and monitoring tools
graphics: Graphics rendering (not needed here)
video: Video encode/decode (not needed here)
Volume Mounts
volumes:
  - ./videos:/app/videos # Input videos
  - ./output:/app/output # Output directory
  - ./.env:/app/.env:ro  # OpenAI API key (read-only)
./videos:/app/videos
Purpose: YouTube downloads and local video input
Host path: ./videos (create it if it doesn't exist)
Container path: /app/videos
Usage:
# Place local videos here
cp ~/Downloads/my-video.mp4 ./videos/
# Container downloads go here
docker compose run youtube-shorts-generator ./run.sh "https://youtu.be/ID"
./output:/app/output
Purpose: Final short video outputs
Host path: ./output (created automatically)
Container path: /app/output
Output naming: {title}_{session-id}_short.mp4
./.env:/app/.env:ro
Purpose: OpenAI API key configuration
Host path: ./.env
Container path: /app/.env
Read-only: the :ro flag prevents the container from modifying the file
Alternative: use env_file (already configured in docker-compose.yml)
Volumes persist data between container runs. Downloaded videos remain in ./videos/ for reuse.
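Given the {title}_{session-id}_short.mp4 naming, the session id can be recovered from an output filename when correlating files with run logs. A sketch (the function name is illustrative; it assumes the session id is the last underscore-separated field before _short.mp4):

```shell
# Strip the _short.mp4 suffix, then take the last underscore-separated
# field, which is the session id under the naming scheme above.
session_from_name() {
  base=${1%_short.mp4}
  echo "${base##*_}"
}
```

This works even when the title itself contains underscores, since only the final field is taken.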
Interactive Mode
stdin_open: true
tty: true
Enables interactive input for:
YouTube URL prompts
Resolution selection
Approval workflow
Equivalent to docker run -it.
Building the Image
Build the Docker image before first use:
docker compose build
Build process:
Downloads NVIDIA CUDA base image (~2GB)
Installs system dependencies
Fixes ImageMagick policy
Installs Python packages from requirements.txt
Copies application code
Sets up CUDA library paths
Build time: 5-10 minutes (depending on network speed)
The image is ~6-8GB due to CUDA runtime and dependencies. Ensure sufficient disk space.
Running with Docker Compose
Interactive Mode
Run with prompts for URL and approval:
docker compose up
You’ll see:
Session ID: 3f8a9b12
Enter YouTube video URL or local video file path:
Use docker compose up for interactive mode, not docker compose up -d (detached mode won’t show prompts).
Command-Line Mode
Process a specific video without interaction:
docker compose run youtube-shorts-generator ./run.sh "https://youtu.be/VIDEO_ID"
Auto-Approve Mode
Fully automated processing:
docker compose run youtube-shorts-generator ./run.sh --auto-approve "https://youtu.be/VIDEO_ID"
Local File Processing
Process videos from the mounted ./videos directory:
# Copy video to mounted directory
cp ~/my-video.mp4 ./videos/
# Process inside container
docker compose run youtube-shorts-generator ./run.sh "/app/videos/my-video.mp4"
Running with Docker CLI
Alternative to Docker Compose for more control:
Basic Run
docker run --rm \
  --gpus all \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it \
  ai-youtube-shorts-generator
--rm
Remove container after exit
--gpus all
Enable all GPUs
-v $(pwd)/.env:/app/.env:ro
Mount API key (read-only)
-v $(pwd)/videos:/app/videos
Mount videos directory
-v $(pwd)/output:/app/output
Mount output directory
-it
Interactive mode with TTY
ai-youtube-shorts-generator
Image name
With Command-Line Arguments
docker run --rm \
  --gpus all \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  ai-youtube-shorts-generator \
  ./run.sh --auto-approve "https://youtu.be/VIDEO_ID"
CPU-Only Mode
Run without GPU (significantly slower transcription):
docker run --rm \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it \
  ai-youtube-shorts-generator
CPU-only transcription can take 10-20x longer than GPU-accelerated processing.
Batch Processing with Docker
Sequential Processing
while IFS= read -r url; do
  docker compose run youtube-shorts-generator ./run.sh --auto-approve "$url"
done < urls.txt
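When feeding a urls.txt through a batch loop, a malformed line wastes a full container run. A small pre-check sketch (the accepted patterns and the function name are assumptions, covering the two input types run.sh accepts: YouTube URLs and mounted local paths):

```shell
# Return 0 for lines that look like YouTube URLs or mounted local paths,
# 1 otherwise, so bad lines can be filtered before batch processing.
is_valid_input() {
  case "$1" in
    https://youtu.be/*|https://www.youtube.com/watch*) return 0 ;;
    /app/videos/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, to produce a cleaned list: `while IFS= read -r u; do is_valid_input "$u" && echo "$u"; done < urls.txt > clean.txt`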
Parallel Processing
cat urls.txt | xargs -P 3 -I {} \
docker compose run youtube-shorts-generator ./run.sh --auto-approve "{}"
Running multiple Docker containers in parallel may cause GPU memory issues. Limit parallelism based on available VRAM:
8GB GPU: 2-3 containers max
16GB GPU: 4-5 containers max
24GB+ GPU: 6+ containers
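The rule of thumb above can be wired directly into the xargs invocation. A sketch (the thresholds mirror the list above; querying VRAM via nvidia-smi is standard, but the mapping function itself is an assumption):

```shell
# Map total VRAM in MiB to a conservative xargs -P value, following the
# 8GB / 16GB / 24GB+ guidance above.
pick_jobs() {
  if [ "$1" -ge 24000 ]; then echo 6
  elif [ "$1" -ge 16000 ]; then echo 4
  else echo 2
  fi
}

# Example wiring (requires nvidia-smi on the host):
# vram=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n1)
# cat urls.txt | xargs -P "$(pick_jobs "$vram")" -I {} \
#   docker compose run youtube-shorts-generator ./run.sh --auto-approve "{}"
```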
Troubleshooting
GPU Not Detected
Symptom:
Could not load dynamic library 'libcudnn.so.8'
Solutions:
Verify NVIDIA drivers
Ensure drivers are installed on host.
Check Docker GPU access
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
Verify NVIDIA Container Toolkit works.
Restart Docker daemon
sudo systemctl restart docker
Check docker-compose.yml GPU config
Ensure GPU reservation is correctly configured:
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
ImageMagick Policy Error
Symptom:
ImageMagick security policy blocks '@' pattern
Solution:
Rebuild image to apply the policy fix:
docker compose build --no-cache
The Dockerfile includes the fix on line 27:
RUN sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml
Volume Permission Issues
Symptom:
Permission denied: '/app/output/video_short.mp4'
Solution:
Ensure host directories have correct permissions:
mkdir -p videos output
chmod 777 videos output
Or run the container with user mapping, for example by adding user: "${UID}:${GID}" to the service in docker-compose.yml. Then:
UID=$(id -u) GID=$(id -g) docker compose run youtube-shorts-generator
Out of Memory (OOM)
Symptom:
Solutions:
Reduce parallelism
Run fewer concurrent containers:
# Instead of -P 5
cat urls.txt | xargs -P 2 -I {} docker compose run youtube-shorts-generator ./run.sh --auto-approve "{}"
Lower video resolution
Select a lower resolution during YouTube download (480p instead of 1080p).
Increase GPU memory
Use a GPU with more VRAM or reduce other GPU workloads.
CPU-only mode
Remove GPU access for CPU-based transcription:
docker run --rm \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it \
  ai-youtube-shorts-generator
Container Exits Immediately
Symptom:
docker compose up
Exited with code 1
Check logs:
docker compose logs
Common issues:
Missing .env file → Create .env with OPENAI_API=your_key
Invalid API key → Verify key at https://platform.openai.com/api-keys
Missing volumes → Ensure videos/ and output/ directories exist
Docker Build Cache
Speed up rebuilds by leveraging layer caching:
# requirements.txt copied separately for caching
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt
# Application code copied last (changes frequently)
COPY . .
Changing Python code won’t invalidate the pip install layer.
Shared Volume for Downloads
Reuse downloaded videos across runs:
# Download once
docker compose run youtube-shorts-generator ./run.sh "https://youtu.be/VIDEO_ID"
# Videos persist in ./videos/
ls -lh ./videos/
# Process again without re-downloading
docker compose run youtube-shorts-generator ./run.sh "/app/videos/video_file.mp4"
Pre-built Image
Build once, run many times:
# Build and tag
docker build -t ai-shorts:v1.0 .
# Run without rebuilding
docker run --rm --gpus all \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  ai-shorts:v1.0 \
  ./run.sh --auto-approve "https://youtu.be/VIDEO_ID"
For production deployments, consider pushing the image to Docker Hub or a private registry for faster distribution.