Git Plumbing
Snatched from Dissecting Git's Guts, Emily Xie - Git Merge 2016
Basic object types
How git stores files
Store a file in git manually
$ cd /tmp/test
$ git init
$ echo 'Hello World!' > hello_world.txt
/tmp/test
$ git hash-object -w hello_world.txt
980a0d5f19a64b4b30a87d4206aade58726b60e3
Content addressable filesystem. Read content addressable by hash.
$ ls -alh .git/objects/98/0a0d5f19a64b4b30a87d4206aade58726b60e3
-r--r--r-- 1 florian.begusch wheel 29 Dec 14 08:02 .git/objects/98/0a0d5f19a64b4b30a87d4206aade58726b60e3
$ git cat-file -p 980a0d5f19a64b4b30a87d4206aade58726b60e3
Hello World!
How to generate the hash yourself
$ printf 'blob 13\000Hello World\041\n' | openssl sha1
980a0d5f19a64b4b30a87d4206aade58726b60e3
How git stores folder structure
Mimic git add
Add to stash/index
git update-index --add hello_world.txt
git update-index --add foo_bar.txt
Inspect index
# binary content
less .git/index
$ git ls-files --stage
100644 9af24d2496973bb2603bb4ebd6ea3bba6179b577 0 foo_bar.txt
100644 980a0d5f19a64b4b30a87d4206aade58726b60e3 0 hello_world.txt
Write tree to git
$ git write-tree
e4ff61e51b77e4e42e3b687ab8a086b24db2e16d
Trees are hash addressable as well
$ git cat-file -p e4ff61e51b77e4e42e3b687ab8a086b24db2e16d
100644 blob 9af24d2496973bb2603bb4ebd6ea3bba6179b577 foo_bar.txt
100644 blob 980a0d5f19a64b4b30a87d4206aade58726b60e3 hello_world.txt
$ find .git/objects -type f
.git/objects/e4/ff61e51b77e4e42e3b687ab8a086b24db2e16d # tree
.git/objects/9a/f24d2496973bb2603bb4ebd6ea3bba6179b577 # blob
.git/objects/98/0a0d5f19a64b4b30a87d4206aade58726b60e3 # blob
Mimic git commit
/Write commit object
A commit knows when trees were written and contains information about committer and author.
$ echo 'first commit' | git commit-tree e4ff61e51b77e4e42e3b687ab8a086b24db2e16d
040ee26513c799dfda3316f06df78bade3dddf99
$ git cat-file -p 040ee26513c799dfda3316f06df78bade3dddf99
tree e4ff61e51b77e4e42e3b687ab8a086b24db2e16d
author Florian Begusch <florian.begusch@gmail.com> 1673844479 +0000
committer Florian Begusch <florian.begusch@gmail.com> 1673844479 +0000
first commit
$ find .git/objects -type f
.git/objects/04/0ee26513c799dfda3316f06df78bade3dddf99
.git/objects/e4/ff61e51b77e4e42e3b687ab8a086b24db2e16d
.git/objects/9a/f24d2496973bb2603bb4ebd6ea3bba6179b577
.git/objects/98/0a0d5f19a64b4b30a87d4206aade58726b60e3
How git commits create parent commit relations
$ echo asdf > foo_bar.txt
git update-index --add foo_bar.txt
git write-tree
eb606ccdc530241cc740e97ea8d87c15a6cc3e69
$ echo 'second commit' | git commit-tree eb606ccdc530241cc740e97ea8d87c15a6cc3e69
7edad5bf0e1967282ce4452464bca10ffc56bd66
$ git cat-file -p 7edad5bf0e1967282ce4452464bca10ffc56bd66
tree eb606ccdc530241cc740e97ea8d87c15a6cc3e69
author Florian Begusch <florian.begusch@gmail.com> 1673844947 +0000
committer Florian Begusch <florian.begusch@gmail.com> 1673844947 +0000
second commit
Git references
Create a branch
Git branches are aliases or pointers to commit objects
$ find .git/refs/heads -type f
empty, no branches yet
Add a master branch, point it at a commit
git update-ref refs/heads/master 7edad5bf0e1967282ce4452464bca10ffc56bd66
$ find .git/refs/heads -type f
.git/refs/heads/master
$ cat .git/refs/heads/master
7edad5bf0e1967282ce4452464bca10ffc56bd66
Branch off of a given branch
It just duplicates the pointer
git checkout -b feature
$ find .git/refs/heads -type f
.git/refs/heads/master
.git/refs/heads/feature
$ cat .git/refs/heads/feature
7edad5bf0e1967282ce4452464bca10ffc56bd66
What is the current branch? HEAD/detached HEAD or headless
$ cat .git/HEAD
ref: refs/heads/feature
Pack objects
Example
Loose objects
$ find .git/objects -type f
.git/objects/04/0ee26513c799dfda3316f06df78bade3dddf99
.git/objects/eb/606ccdc530241cc740e97ea8d87c15a6cc3e69
.git/objects/e4/ff61e51b77e4e42e3b687ab8a086b24db2e16d
.git/objects/7e/dad5bf0e1967282ce4452464bca10ffc56bd66
.git/objects/9a/f24d2496973bb2603bb4ebd6ea3bba6179b577
.git/objects/98/0a0d5f19a64b4b30a87d4206aade58726b60e3
.git/objects/8b/d6648ed130ac9ece0f89cd9a8fbbfd2608427a
$ git count-objects -H
7 objects, 28.00 KiB
Garbage collect
$ git gc
Enumerating objects: 4, done.
Counting objects: 100% (4/4), done.
Delta compression using up to 12 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (4/4), done.
Total 4 (delta 0), reused 0 (delta 0), pack-reused 0
Packed repo
$ git count-objects -H
3 objects, 12.00 KiB
$ find .git/objects -type f
.git/objects/04/0ee26513c799dfda3316f06df78bade3dddf99
.git/objects/e4/ff61e51b77e4e42e3b687ab8a086b24db2e16d
.git/objects/pack/pack-2bf9b004e25e7885f2a76d2a592375642b224422.idx # index into packfile (offsets)
.git/objects/pack/pack-2bf9b004e25e7885f2a76d2a592375642b224422.pack # pack file
.git/objects/9a/f24d2496973bb2603bb4ebd6ea3bba6179b577
.git/objects/info/commit-graph
.git/objects/info/packs # which packs exit?
Show contents of pack
$ git verify-pack -v .git/objects/pack/pack-2bf9b004e25e7885f2a76d2a592375642b224422.pack
7edad5bf0e1967282ce4452464bca10ffc56bd66 commit 200 126 12
8bd6648ed130ac9ece0f89cd9a8fbbfd2608427a blob 5 14 138
980a0d5f19a64b4b30a87d4206aade58726b60e3 blob 13 22 152
eb606ccdc530241cc740e97ea8d87c15a6cc3e69 tree 82 87 174
non delta: 4 objects
.git/objects/pack/pack-2bf9b004e25e7885f2a76d2a592375642b224422.pack: ok