Removing Sensitive Information from Git with git-filter-repo

Git has become the de facto standard in source control. Over 90% of respondents to the 2021 Stack Overflow Developer Survey indicated that they use Git. However, mistakes happen, and sensitive or secret information can accidentally end up in the commit history.

Thankfully, tools such as git-filter-repo exist to help rewrite history (the history of a git repository anyway)! In this post, I am going to go through using git-filter-repo to scrub a password or secret from a git repository.

Before following the steps below, ensure that you do not have any branches, apart from your default branch.
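One way to check this without cloning anything is to ask the remote which branches it advertises. The helper below is only a sketch: the function name count_remote_branches is made up for this post, and it assumes you can reach the repository URL (a local path also works).

```shell
# Hypothetical helper: counts the branches a remote repository advertises.
# $1 is the repository URL (or a local path to a repository).
count_remote_branches() {
  # ls-remote --heads prints one line per branch; tr strips padding some wc versions add.
  git ls-remote --heads "$1" | wc -l | tr -d ' '
}
```

Running count_remote_branches <url> should print 1 if only the default branch exists.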

Setting up Windows Subsystem for Linux 2

I am usually a Windows user; however, I found git-filter-repo was much easier to install on a Linux distribution. As a result, if you’re using Windows, I’d recommend setting up WSL2 by following the Microsoft guide to install WSL2 on Windows 10.

If you’re using Linux, you can just carry on to the next step!

Installing git-filter-repo

Now it is time to install git-filter-repo itself. The easiest way is to install a version using a package manager bundled with your Linux distribution. You can find the details of the package availability and versions on the install page.

Generally, you’ll run a command similar to the following (I was using Ubuntu, so I used the apt package manager).

sudo apt install git-filter-repo

When it has finished installing, it is time to get a copy of the git repository.

Cloning the repository

If you don’t already have a copy of the git repository on your local computer you will need to clone it. Since we won’t be working on any files directly, we can get away with cloning a mirror repository. To do this run the following command.

git clone <url> --mirror

After it has finished cloning, it is time to tell git-filter-repo what information needs to be removed from the git repository’s history.

Using git-filter-repo

First, we need to create a file to tell git-filter-repo what expressions we want to search for and what we want them to be replaced with.

The following expressions.txt file is a quick example of replacing the text “P@ssw0rd” with an empty string, which removes it entirely.

literal:P@ssw0rd==>

You can find other ways to match and replace text in the git-filter-repo user manual; it supports literals, regular expressions and more.
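As a rough sketch of a slightly larger file, the snippet below writes an expressions.txt with three rules. The second secret (hunter2) and the key pattern are made-up examples, not values from a real repository. As far as I understand the manual, a rule without ==> falls back to a default replacement marker rather than an empty string, so check the behaviour against the version you have installed.

```shell
# Write a sample expressions.txt. Quoting 'EOF' stops the shell from
# expanding anything inside the here-document.
cat > expressions.txt <<'EOF'
literal:P@ssw0rd==>
literal:hunter2
regex:AKIA[0-9A-Z]{16}==>REDACTED_KEY
EOF
```

The first rule removes the text entirely, the second leaves the replacement up to git-filter-repo, and the third swaps anything matching the regular expression for a fixed placeholder.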

Now that this file exists, we can run git-filter-repo from inside the cloned repository with the --replace-text option by running the following command.

git-filter-repo --replace-text expressions.txt

When it has finished, you should find that the files in your git repository no longer contain the sensitive or secret information.
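If you want to double-check, git's pickaxe search can list any commits whose changes still add or remove a given string. This is a minimal sketch: the function name find_secret_commits is made up for this post, and it needs to be run from inside the repository.

```shell
# Hypothetical helper: lists commits on any ref whose diff adds or removes
# the string passed as $1. No output suggests the string is gone from history.
find_secret_commits() {
  git log --all --oneline -S"$1"
}
```

Running find_secret_commits 'P@ssw0rd' after the rewrite should print nothing. If it lists commits, the expression in expressions.txt probably did not match the text exactly.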

Pushing to a remote repository

If you need to update a remote repository (such as one hosted on GitHub or GitLab) you can do this by pushing your repository to the server. You will need to use the --force option and have adequate permissions for it to work.

git remote add origin <url>
git push -u origin master --force

Summary

We set up WSL2 and installed the git-filter-repo package. After this was done, we cloned the target repository and used git-filter-repo to replace the sensitive information with blank text. The final step was to push the repository branch back to the remote repository (such as one hosted on GitHub or GitLab) to update it with the changes.