Deleting remote history without affecting the latest repository content is something I have wanted to do for a long time. It has two main uses:
- Some past commits inadvertently contain sensitive information that went unnoticed at commit time and was only discovered later. By then, so many new commits have piled on top that simply rolling back is not an option.
- Sometimes a Git repository is used to store content other than code, such as art assets or dependency libraries. In that case, apart from a few commits, most of the history is meaningless and takes up a lot of space in the repository.
However, deleting history in Git is not as simple as it sounds, and it requires the rebase feature. Rebasing is a big change to the repository, so as a precaution, back up first and do the work on a new branch:
git checkout -b cleanup-history
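As a minimal sketch of the backup step, one option is to tag the untouched tip before switching to the working branch. The throwaway repository, the file names, and the tag name backup-before-cleanup are all hypothetical; only cleanup-history comes from the command above.

```shell
# Sketch only: runs in a throwaway repository under a temp directory.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name "Example"
git config user.email "example@example.com"

echo one > a.txt && git add a.txt && git commit -q -m "first"
echo two > b.txt && git add b.txt && git commit -q -m "second"

# Tag the original tip (hypothetical tag name), then rewrite on a separate branch.
git tag backup-before-cleanup
git checkout -q -b cleanup-history

backup_sha=$(git rev-parse backup-before-cleanup)
main_sha=$(git rev-parse main)
current_branch=$(git rev-parse --abbrev-ref HEAD)
echo "backup tag points at: $backup_sha"
echo "now working on:       $current_branch"
```

If the rewrite goes wrong, the tag still points at the original history, so the branch can be reset back to it.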
Use the rebase command to rewrite the commit history, as shown below.
git rebase -i HEAD~n
The -i flag starts an interactive rebase, which opens an editor listing the commits for you to edit. The n here means going back n commits. To choose n, first count all the commits in the history:
git rev-list --count HEAD
If you get a value of 500, set n to 499 to cover everything after the first commit (HEAD~500 does not exist, because the first commit has no parent). Sometimes this value is not right, perhaps because it includes merge commits; in that case, try:
git rev-list --first-parent --count HEAD
Or:
git rev-list --count --no-merges HEAD
to get a rough estimate of n. Of course, if the commits you want to delete are not far back, just pick an n that roughly covers them.
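To see how the three counts differ, here is a small sketch on a throwaway repository with one merged feature branch. The repository layout and file names are made up for illustration.

```shell
# Sketch only: builds a tiny history with one merge and counts it three ways.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

# Helper: create a file and commit it (hypothetical names throughout).
commit_file() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit_file c1
commit_file c2
git checkout -q -b feature
commit_file f1
commit_file f2
git checkout -q main
commit_file c3
git merge -q --no-ff -m "merge feature" feature

total=$(git rev-list --count HEAD)                      # every reachable commit
first_parent=$(git rev-list --first-parent --count HEAD) # main line only
no_merges=$(git rev-list --count --no-merges HEAD)       # everything except merge commits
echo "all: $total, first-parent: $first_parent, no-merges: $no_merges"
```

Here the history holds three commits on main, two on the feature branch, and one merge commit, which is why the three numbers disagree.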
After running git rebase -i HEAD~n, in the editor that opens, change the action from pick to drop for each commit you want to delete. Save and exit the editor, and Git will start rewriting the history to delete the specified commits. Sometimes there are too many commits to delete, and changing them to drop one by one is a pain. You can use a text tool such as NotePad3 and its column selection feature to change them in bulk.
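If you would rather not edit the list by hand, one possible alternative (not the method described above) is to let a command edit the todo list through the GIT_SEQUENCE_EDITOR environment variable. The sketch below uses sed on a throwaway repository; the file names and the choice of which lines to drop are assumptions for illustration.

```shell
# Sketch only: drops commits 2, 3 and 4 out of 5 without opening an editor.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

for i in 1 2 3 4 5; do
  echo "$i" > "file$i.txt"
  git add "file$i.txt"
  git commit -q -m "commit $i"
done

# The todo list for HEAD~4 has commits 2..5, oldest first.
# sed rewrites "pick" to "drop" on the first three lines; -i.bak works with GNU and BSD sed.
GIT_SEQUENCE_EDITOR="sed -i.bak '1,3s/^pick/drop/'" git rebase -q -i HEAD~4

remaining=$(git rev-list --count HEAD)
echo "commits left: $remaining"
git log --format=%s
```

Each commit here touches its own file, so nothing conflicts. In a real repository the dropped commits are often ones that later commits build on, and the rebase will stop for you to resolve conflicts.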
If you delete enough history reaching far enough back, what you see next can be rather unsettling: the repository goes back to the oldest state in the range, and Git then replays the remaining commits one by one. This process is likely to run into problems. For example, if a commit turns out to be empty, Git may prompt you and stop the rebase; you can skip that commit with:
git rebase --skip
There may also be conflicts for you to resolve. If the conflicting file is a text file, edit it and then run git add xxx; if it is a binary file, either delete it with git rm xxx or stage it as it is with git add xxx. Then continue the rebase:
git rebase --continue
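The conflict step can be reproduced on a small scale. In this sketch, three commits edit the same line of a hypothetical notes.txt, and dropping the middle one forces a conflict when the last commit is replayed. The todo list is edited through GIT_SEQUENCE_EDITOR purely so the sketch can run unattended.

```shell
# Sketch only: provoke a rebase conflict, resolve it, and continue.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

echo a > notes.txt && git add notes.txt && git commit -q -m "A"
echo b > notes.txt && git commit -q -am "B"
echo c > notes.txt && git commit -q -am "C"

# Drop B. C was written on top of B, so replaying it on A stops with a conflict.
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/drop/'" git rebase -i HEAD~2 >/dev/null 2>&1 || true

# List the paths that are still unmerged.
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted: $conflicted"

# Resolve by hand (here: keep the content of C), stage it, and continue.
echo c > notes.txt
git add notes.txt
GIT_EDITOR=true git rebase --continue >/dev/null   # accept the original commit message

final=$(cat notes.txt)
count=$(git rev-list --count HEAD)
echo "final content: $final, commits: $count"
```

Setting GIT_EDITOR=true only keeps Git from opening an editor for the commit message; in normal use you would simply save and close the editor.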
Next, if everything went well, force-push the changes to the remote branch:
git push origin cleanup-history --force
Finally, check the commit history on the branch and, if there are no problems, replace the main branch with it:
git checkout main
git reset --hard cleanup-history
git push origin main --force
If you want to completely remove these commits from your local repository and shrink its size, you can use the following commands:
git reflog expire --expire=now --all
git gc --prune=now --aggressive
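To check that the two commands above really discard unreachable commits, here is a sketch on a throwaway repository. A commit is made on a hypothetical scratch branch, the branch is deleted, and the commit is looked up before and after the cleanup.

```shell
# Sketch only: shows an orphaned commit surviving in the reflog, then being pruned.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

echo keep > keep.txt && git add keep.txt && git commit -q -m "keep"

git checkout -q -b scratch
echo "secret" > secret.txt && git add secret.txt && git commit -q -m "leak"
leaked=$(git rev-parse HEAD)

git checkout -q main
git branch -q -D scratch

# The commit is no longer on any branch, but the reflog still keeps it alive.
git cat-file -e "$leaked" && before=present

git reflog expire --expire=now --all
git gc -q --prune=now --aggressive

if git cat-file -e "$leaked" 2>/dev/null; then after=present; else after=gone; fi
echo "before: $before, after: $after"
```

This only cleans the local repository. Other clones keep their own copies of the old objects until they are cleaned up as well.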
Other users can then update their local copies with:
git pull --rebase origin main
In my own use of this process, I ran into a very large number of conflicts and often had to stop to resolve them. I do not fully understand why deleting history while keeping the current snapshot should produce conflicts; my guess is that my commit history contains many merge commits. This method may therefore not suit every reader. Squashing the commit history, or keeping only the latest commit, may be more reasonable.
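The last option mentioned above, keeping only the latest commit, might look like the sketch below. It uses an orphan branch, which is one common way to do this and not something tested in this article; the branch name fresh-start and the file names are hypothetical.

```shell
# Sketch only: collapse the whole history into one commit with identical content.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

for i in 1 2 3; do
  echo "$i" > "file$i.txt"
  git add "file$i.txt"
  git commit -q -m "commit $i"
done
tree_before=$(git rev-parse 'HEAD^{tree}')

# An orphan branch starts with no parent commit but keeps the current files staged.
git checkout -q --orphan fresh-start
git add -A
git commit -q -m "Snapshot of current content"

tree_after=$(git rev-parse 'HEAD^{tree}')
count=$(git rev-list --count HEAD)
echo "commits on fresh-start: $count"
```

Because no commits are replayed, there is nothing to conflict. The new branch would still have to replace main and be force-pushed, as in the steps earlier.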