From bf85101cf3829008beb7941983b5ca675d6ad6fa Mon Sep 17 00:00:00 2001
From: Suzanne Soy

-Example working directory
+Example working tree
Our imaginary user will create a proj directory, and start filling in some files.
+ A working tree designates the directory (and the subdirectories and files within) in which
+ the user will normally view and edit the files. GIT has commands to save the state of the working tree
+ (git commit), in order to be able to go back in time later on, and view older versions of the files.
+ The command git worktree
allows the user to create multiple working trees using the same
+ local repository. This effectively allows the user to easily have two or more versions of the project
+ side-by-side. GIT commands can be invoked in either copy. It is worth noting that the .git/
+ directory exists only in the original working tree; while it is safe to remove other worktrees (followed by
+ an invocation of git worktree prune
from one of the remaining working trees to let GIT
+ detect the deletion), the removal of the original working tree will discard the .git/
+ directory, and all versions of the project that have not been published elsewhere (usually via
+ git push
) will be lost.
+
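The behaviour described above can be tried in a scratch directory. The following is a minimal sketch, not part of the patch: the directory names (proj, proj-side), the branch name (side) and the commit identity are placeholders chosen for illustration. It shows that the original working tree holds .git as a directory, that an additional working tree only holds a small .git file pointing back to it, and that git worktree prune cleans up after a manual removal.

```shell
set -eu
# Throwaway location; nothing here touches an existing repository.
scratch=$(mktemp -d)
cd "$scratch"

# Placeholder identity, used only for this example commit.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

git init --quiet proj
cd proj
echo "hello" > README
git add README
git commit --quiet -m "Initial commit"

# Create a second working tree next to the original one, on a new branch.
git worktree add -b side ../proj-side >/dev/null 2>&1

# The original working tree contains the .git/ directory;
# the additional one only contains a .git file referring to it.
original_kind=$( [ -d .git ] && echo directory || echo file )
linked_kind=$( [ -d ../proj-side/.git ] && echo directory || echo file )
echo "original: .git is a $original_kind, additional: .git is a $linked_kind"

# Remove the additional working tree by hand, then let git detect the deletion.
rm -rf ../proj-side
git worktree prune
remaining=$(git worktree list | wc -l)
echo "working trees left: $remaining"
```

Removing proj itself instead of proj-side would delete the repository along with every unpublished commit, which is the asymmetry the paragraph warns about.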
cat .git/HEAD
and using git cat-file -p some-hash
- to pretty-print an object given its hash.
+ to pretty-print an object given its hash. This will help the points explained in this tutorial sink in, and give a better
+ understanding of the internals of GIT. This knowledge is helpful for day-to-day tasks, as the GIT commands usually perform
+ simple changes to this internal representation. Understanding the representation better can demystify the semantics of
+ the daily GIT commands. Furthermore, equipped with a better understanding of GIT's implementation, the dreamy reader will
+ be tempted to compare this lack of intrinsic complexity with the apparent complexity, and be entitled to expect a better,
+ less arcane user interface for a tool with such a simple implementation.
cat .git/HEAD
and using the zlib
decompression tool
- from the zlib
compression section.
+ Inspect a small existing repository, starting with cat .git/HEAD
and using the zlib
decompression
+ tool from the zlib
compression section. Larger repositories will make use
+ of GIT packs, which are compressed archives containing a number of objects. GIT packs only matter as an optimization of the
+ disk space used by large repositories, but other tools would be necessary to inspect those. This should help understand
+ the internal representation of GIT commits and branches, and should help build an instinctive idea of how the data store is
+ modified by the various commands. This in turn could come in handy in case of apparent data loss (a lost stash or a checkout
+ leaving an unreferenced commit on a detached HEAD), as this would help understand the work done by the various
+ disaster-recovery one-liners that a quick panicked online search provides.
git init new-directory
in a terminal, and create an initial single-file commit from scratch, using only
- git hash-object
, printf
and overwriting .git/HEAD
. This will involve retracing the
- steps in this tutorial to create a blob object for the file, a tree object to be the directory containing just that file,
- and a commit object.
+ git hash-object
, printf
and overwriting .git/HEAD
and/or
+ .git/refs/heads/name-of-a-branch
. This will involve retracing the steps in this tutorial to create a blob
+ object for the file, a tree object to be the directory containing just that file, and a commit object. This exercise should
+ help sink in the feeling that the internal representation of GIT commits is not very complex, and that many commands with
+ convoluted options have very simple semantics. For example, git reset --soft other-commit
is little more than
+ writing that other commit's hash in .git/refs/heads/name-of-the-current-branch
or .git/HEAD
.
+ Furthermore, equipped with an even better understanding of GIT's implementation, the dreamy reader will
+ be tempted to compare this lack of intrinsic complexity with the sheer complexity of the systems they are working with on
+ a day-to-day basis, and be entitled to expect better features in a versioning tool. After all, writing those
+ few lines of code to reimplement the core of a versioning tool shouldn't take more than a
+ couple of afternoons; surely our community can do better?
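One possible way to work through the exercise is sketched below; it is an illustration under stated assumptions, not the tutorial's reference solution. The file name (file.txt), its contents, the author identity and the timestamp are placeholders, and the sketch assumes the default on-disk layout where branches are plain files under .git/refs/heads/. The only git command used to write objects is git hash-object; the ref is updated by writing a file.

```shell
set -eu
scratch=$(mktemp -d)
git init --quiet "$scratch/new-directory"
cd "$scratch/new-directory"

# 1. Blob object: the contents of the file.
blob=$(printf 'hello\n' | git hash-object -w --stdin)

# 2. Tree object: "<mode> <name>", a NUL byte, then the blob hash as raw bytes.
#    Each pair of hex digits becomes a \0NNN octal escape for printf '%b'.
raw=""
rest=$blob
while [ -n "$rest" ]; do
  pair=$(printf '%.2s' "$rest")
  rest=${rest#??}
  raw="$raw\\0$(printf '%03o' "0x$pair")"
done
tree=$( { printf '100644 file.txt\0'; printf '%b' "$raw"; } \
        | git hash-object -t tree -w --stdin )

# 3. Commit object: plain text naming the tree, the author and the committer.
commit=$(printf 'tree %s\nauthor Example <example@example.invalid> 1700000000 +0000\ncommitter Example <example@example.invalid> 1700000000 +0000\n\nInitial commit\n' "$tree" \
         | git hash-object -t commit -w --stdin)

# 4. Make the current branch point to the commit by writing its hash to the
#    file named in .git/HEAD (for example refs/heads/name-of-a-branch).
branch_ref=$(sed 's/^ref: //' .git/HEAD)
mkdir -p ".git/$(dirname "$branch_ref")"
printf '%s\n' "$commit" > ".git/$branch_ref"

# Read the result back with ordinary commands.
git cat-file -p HEAD
git cat-file -p "$tree"
```

Because only the object store and the branch file were written, the index and the working tree are still empty at this point: git status reports file.txt as deleted until they are populated, for instance with git reset --hard.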
commit
, diff
, checkout
,
@@ -1383,12 +1417,26 @@ commands.
explicitly give the name (origin) or URL of the remote, the hash of the commit to push, and the path that should be
updated on the remote (git push
while the main
branch is checked out locally is equivalent
to git push origin HEAD:refs/heads/main
, where HEAD
can be replaced by the actual hash of
- the commit).
+ the commit). This should help sink in the feeling that the internals of GIT are very simple (most of these commands
+ are implemented in this tutorial, and the other ones are merely wrappers around enhanced versions of the *NIX commands
+ diff
, patch
and scp
), and that the rest of the GIT toolkit consists mostly of
+ convenience wrappers to help seasoned users perform common tasks more efficiently.
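The explicit form of git push can be tried against a local bare repository standing in for the remote. This is a small sketch under assumptions: the paths (remote.git, local), the file name and the commit identity are placeholders, and the local branch is named main only to match the example in the text.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"

# Placeholder identity, used only for these example commits.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

# A bare repository plays the role of the remote.
git init --quiet --bare remote.git
git init --quiet local
cd local
git symbolic-ref HEAD refs/heads/main   # name the (still empty) branch "main"
git remote add origin "$scratch/remote.git"

echo one > file.txt
git add file.txt
git commit --quiet -m "First commit"
first=$(git rev-parse HEAD)

# Explicit remote, commit to push (HEAD), and path to update on the remote.
git push --quiet origin HEAD:refs/heads/main
first_remote=$(git --git-dir="$scratch/remote.git" rev-parse refs/heads/main)

echo two >> file.txt
git commit --quiet -am "Second commit"
hash=$(git rev-parse HEAD)

# Same push, with HEAD replaced by the actual hash of the commit.
git push --quiet origin "$hash:refs/heads/main"
second_remote=$(git --git-dir="$scratch/remote.git" rev-parse refs/heads/main)

echo "remote main: $first_remote, then $second_remote"
```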
git cherry-pick
or git diff
a few times, instead make two copies of the git
directory, check out the two different commits in each copy, and use the traditional *NIX commands diff
and
- patch
.
+ patch
. This should help sink in the feeling that commits are not diffs, but are actual (deduplicated)
+ copies of the entire project directory. GIT commits are quite similar to the age-old manual versioning technique of
+ copying the entire directory under a new name at each version, except that the metadata keeps track of which version
+ was the previous one (or which versions were merged together to obtain the new one), and the deduplication avoids
+ excessive space usage, as would be the case with cp --reflink
on a filesystem supporting Copy-On-Write (COW).
+ git fetch && git checkout origin/remote-branch
, and use the reflog and a text file
+ outside of the repository to keep track of the latest commit in a current "branch" instead of relying on GIT. This
+ should help sink in the feeling that branches are not containers in which commits pile up, but are merely pointers to
+ the latest commit that are automatically updated.
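The pointer nature of branches can also be observed locally, without a remote. The sketch below is a simplified variant of the exercise, under assumptions: the repository name (repo), the text file (my-branch.txt) and the commit identity are placeholders, and branches are assumed to be stored as plain files under .git/refs/heads/. It reads the branch file after each commit, then commits on a detached HEAD while tracking the latest commit by hand.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"

# Placeholder identity, used only for these example commits.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

git init --quiet repo
cd repo
# File that stores the current branch, as named in .git/HEAD.
branch_file=".git/$(sed 's/^ref: //' .git/HEAD)"

echo one > file.txt
git add file.txt
git commit --quiet -m "First commit"
after_first=$(cat "$branch_file")

echo two >> file.txt
git commit --quiet -am "Second commit"
after_second=$(cat "$branch_file")   # the pointer moved to the new commit

# Detach HEAD: new commits no longer update any branch.
git checkout --quiet --detach
echo three >> file.txt
git commit --quiet -am "Third commit"
after_third=$(cat "$branch_file")    # unchanged, still the second commit

# Keep track of the latest commit by hand, outside of the repository.
git rev-parse HEAD > "$scratch/my-branch.txt"
echo "branch file: $after_third, hand-kept pointer: $(cat "$scratch/my-branch.txt")"
```

The text file now plays the role the branch played before: checking out the hash it contains brings back the third commit, and git reflog lists the same hash should the file be lost.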