
How to Avoid Adding Large Files to Git: A Complete Developer's Guide

You want to push your latest changes to GitHub but hit a large-file error. Here is how to avoid it, and how to fix it when it happens.


We’ve all been there: you’re working on a project, make some commits, and then hit a wall when trying to push to GitHub. The dreaded “file exceeds GitHub’s file size limit” error appears, and you realize you’ve accidentally committed a large file that should have been ignored.

This scenario is frustrating because it breaks your workflow and requires cleanup that could have been easily prevented. Large files in Git repositories cause several problems:

  • Push failures when exceeding platform limits (GitHub: 100MB, GitLab: 10GB)

  • Slow clone times for anyone downloading your repository

  • Repository bloat that persists even after removing files

  • Bandwidth waste when syncing changes

Let’s explore multiple strategies to prevent this problem before it happens.

Solution 1: Pre-commit Hooks

Pre-commit hooks are automated checks that run before each commit, catching issues early in your workflow.

Installation and Setup

Step 1: Install pre-commit

# macOS (using Homebrew)
brew install pre-commit

# Ubuntu/Debian
sudo apt install pre-commit

# Windows, or any other system with Python (using pip)
pip install pre-commit

Step 2: Create configuration file

Create a .pre-commit-config.yaml file in your project root:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0  # Use the latest version from GitHub releases
    hooks:
      - id: check-added-large-files
        args: ["--maxkb=50000"]  # 50MB limit (adjust as needed)
      - id: check-case-conflict
      - id: check-merge-conflict
      - id: trailing-whitespace
      - id: end-of-file-fixer

Step 3: Install the hook

# Run this in your repository root
pre-commit install

Step 4: Test the setup

# Test on all files (optional)
pre-commit run --all-files

# Or just test the large file check
pre-commit run check-added-large-files --all-files
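To confirm the hook actually fires, you can stage a throwaway file that exceeds the configured limit; this is just a sanity check, and `big-test.bin` is a placeholder name:

```shell
# Create a roughly 60MB throwaway file, above the 50MB limit configured above
dd if=/dev/zero of=big-test.bin bs=1M count=60

# Stage it and attempt a commit; check-added-large-files should block it
git add big-test.bin
git commit -m "test the large-file guard"

# Clean up: unstage and delete the test file
git reset HEAD big-test.bin
rm big-test.bin
```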

Customizing File Size Limits

Adjust the --maxkb parameter based on your needs:

  • Small projects: --maxkb=10000 (10MB)

  • Medium projects: --maxkb=50000 (50MB)

  • Projects with assets: --maxkb=90000 (90MB)

Solution 2: Enhanced .gitignore Strategy

A proactive .gitignore file is your first line of defense. Here’s a comprehensive approach:

Create a Global .gitignore

Set up patterns that apply to all your projects:

# Create global gitignore file
touch ~/.gitignore_global

# Configure Git to use it
git config --global core.excludesfile ~/.gitignore_global

Add common large file patterns to ~/.gitignore_global:

# Large file types
*.zip
*.tar.gz
*.rar
*.7z
*.dmg
*.iso

# Media files
*.mp4
*.avi
*.mkv
*.mov
*.wmv
*.mp3
*.wav
*.flac

# Database files
*.db
*.sqlite
*.mdf
*.ldf

# Virtual machine files
*.vmdk
*.vdi
*.vhd

# Archive and backup files
*.bak
*.backup
*.old

Project-Specific .gitignore

For each project, add patterns specific to your technology stack:

# Node.js
node_modules/
npm-debug.log*
yarn-debug.log*
yarn-error.log*
package-lock.json  # if using yarn

# Python
__pycache__/
*.pyc
*.pyo
*.pyd
.Python
env/
venv/
*.egg-info/

# Build outputs
dist/
build/
target/
bin/
obj/

# IDE files
.vscode/
.idea/
*.swp
*.swo

# OS files
.DS_Store
Thumbs.db
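Once these patterns are in place, `git check-ignore` will tell you exactly which rule covers a given path; the path below is only an illustrative example:

```shell
# Show which .gitignore file, line number, and pattern match this path;
# exits with status 1 if nothing ignores it
git check-ignore -v node_modules/some-package/index.js
```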

Solution 3: Git Hooks (Advanced)

For teams wanting more control, you can create custom Git hooks:

Client-side Pre-commit Hook

Create .git/hooks/pre-commit (make it executable with chmod +x):

#!/bin/bash

# Check for large files before commit
max_size=52428800  # 50MB in bytes

large_files=$(git diff --cached --name-only --diff-filter=ACM -z | xargs -0 -I {} find {} -size +${max_size}c 2>/dev/null)

if [ -n "$large_files" ]; then
    echo "❌ Error: Large files detected:"
    echo "$large_files"
    echo ""
    echo "Files larger than 50MB are not allowed."
    echo "Consider adding them to .gitignore or using Git LFS."
    exit 1
fi

echo "✅ No large files detected"
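After saving the script, a quick sanity check is to run the hook directly rather than waiting for the next commit:

```shell
# Make the hook executable, then invoke it by hand from the repository root
chmod +x .git/hooks/pre-commit
.git/hooks/pre-commit
```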

Solution 4: Git LFS for Legitimate Large Files

Sometimes you need to version large files. Git LFS (Large File Storage) is the solution:

Setup Git LFS

# Install Git LFS
git lfs install

# Track specific file types
git lfs track "*.psd"
git lfs track "*.zip"
git lfs track "*.mp4"

# Track specific files
git lfs track "large-dataset.csv"

# Add the .gitattributes file
git add .gitattributes
git commit -m "Add Git LFS configuration"
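Each `git lfs track` call records its pattern in `.gitattributes`; after the commands above, that file should contain entries along these lines:

```
*.psd filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.mp4 filter=lfs diff=lfs merge=lfs -text
large-dataset.csv filter=lfs diff=lfs merge=lfs -text
```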

Migrate Existing Large Files

If you already have large files in your history:

# Preview which files would be converted before rewriting anything
git lfs migrate info --above=100MB

# Migrate specific file types
git lfs migrate import --include="*.zip,*.mp4,*.psd"

# Or migrate everything above a certain size
git lfs migrate import --above=100MB

Troubleshooting: When You’ve Already Committed Large Files

Quick Fix: Remove from Staging

If you haven’t pushed yet:

# Remove file from staging area but keep in working directory
git reset HEAD large-file.zip

# Add to .gitignore
echo "large-file.zip" >> .gitignore
git add .gitignore
git commit -m "Add large file to gitignore"
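If the large file is already in a local commit (but not yet pushed), one sketch of the fix, assuming the file is called `large-file.zip`, is to untrack it and amend the commit:

```shell
# Stop tracking the file while keeping it on disk
git rm --cached large-file.zip

# Fold the removal into the most recent commit
git commit --amend --no-edit
```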

Nuclear Option: Remove from History

⚠️ Warning: This rewrites history and should only be used if you haven’t pushed or if you coordinate with your team.

# Remove file from entire history (filter-branch works but is slow and officially discouraged)
git filter-branch --tree-filter 'rm -f path/to/large-file.zip' HEAD

# Or use the newer git-filter-repo (recommended)
pip install git-filter-repo
git filter-repo --path path/to/large-file.zip --invert-paths

Force Push (Use with Caution)

# After cleaning history, force push
git push --force-with-lease origin main

Best Practices and Tips

1. Regular Repository Maintenance

# Check repository size
git count-objects -vH

# Find the 10 largest blobs in repository history
git rev-list --objects --all \
  | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
  | sed -n 's/^blob //p' \
  | sort --numeric-sort --key=2 \
  | tail -10

2. Team Workflow Integration

  • Add pre-commit hooks to your project setup documentation

  • Include .gitignore templates in project scaffolding

  • Set up CI/CD checks to catch large files

  • Regularly review repository size in team meetings
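For the CI/CD check, one minimal, portable option is a shell step that fails the build whenever a tracked file is oversized; the 50MB threshold here is just an example:

```shell
#!/bin/sh
# Fail the build if any tracked file exceeds 50MB.
# git ls-files restricts the check to files Git actually tracks.
big=$(git ls-files -z | xargs -0 -I{} find {} -maxdepth 0 -size +50M 2>/dev/null)
if [ -n "$big" ]; then
    echo "Large files detected:"
    echo "$big"
    exit 1
fi
echo "OK: no tracked file exceeds 50MB"
```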

3. Platform-Specific Limits

| Platform     | File Size Limit | Repository Size Limit |
|--------------|-----------------|-----------------------|
| GitHub       | 100MB           | 1GB (recommended)     |
| GitLab       | 10GB            | No hard limit         |
| Bitbucket    | 1GB             | 2GB                   |
| Azure DevOps | No hard limit   | 250GB                 |

4. Automation Scripts

Create a simple script to check for large files:

#!/bin/bash
# save as check-large-files.sh

echo "🔍 Checking for files larger than 50MB..."
large_files=$(find . -type f -size +50M -not -path "./.git/*" -exec ls -lh {} \; | awk '{print $9 ": " $5}')

if [ -z "$large_files" ]; then
    echo "✅ No large files found"
else
    echo "$large_files"
    echo "❌ Large files detected - consider adding to .gitignore or Git LFS"
fi

Conclusion

Preventing large files from entering your Git repository saves time, bandwidth, and frustration. The pre-commit hook approach is the most reliable method because it catches issues automatically before they become problems.

Quick setup recap:

  1. Install pre-commit: brew install pre-commit

  2. Create .pre-commit-config.yaml with file size checks

  3. Run pre-commit install in your repository

  4. Maintain a comprehensive .gitignore file

For teams handling legitimate large files, Git LFS provides a robust solution that keeps your repository fast while still versioning important assets.

Remember: prevention is always easier than cleanup. Set up these safeguards once, and you’ll never have to deal with the “file too large” error again.


Want more posts like this?

Follow along for more writing about language, AI, developer tools, and side projects.