diff --git a/collects/meta/web/stubs/git.rkt b/collects/meta/web/stubs/git.rkt
index 0b9b646956..e78e2310b8 100644
--- a/collects/meta/web/stubs/git.rkt
+++ b/collects/meta/web/stubs/git.rkt
@@ -80,292 +80,2756 @@
;; ----------------------------------------------------------------------------
;; git "guide"
-;; TODO: link man pages; make pre blocks with frames and shaded bg; toc
-(define intro
- @page[#:title "git intro"]{
- @(begin (define cmd tt)
- (define path tt)
- (define man tt)
- (define git-host "git.racket-lang.org")
- (define at-racket "@racket-lang.org")
- (define at-git-racket "@git.racket-lang.org")
- (define at-lists-racket "@lists.racket-lang.org")
- (define -- mdash))
- @h1{Getting git}
- @p{I @strong{highly} recommend getting a new git installation. Git itself
- is pretty stable (that is, you probably will not run into bugs with
- whatever version you have installed), but there are many usability
- related improvements. Specifically, I am using 1.7.x and it is likely
- that some things in this document are specific to that version.}
- @p{You can
- @a[href: "http://git-scm.com/download"]{download a recent version},
- available in binary form for some popular platforms (RPMs for Fedora and
- RedHat, Windows, OSX). In addition to these, you can get a build for
- @ul{@li{Ubuntu:
- @pre{sudo add-apt-repository ppa:git-core/ppa
- sudo apt-get install git-core}}
- @li{OSX using macports:
- @pre{sudo port selfupdate
- sudo port install git-core +svn}}}
- (For OSX, you can also get @a[href: "http://gitx.frim.nl/"]{@cmd{GitX}}
- @-- it's a good gui front-end for git, similar to @cmd{gitk} and
- @cmd{git gui}.)}
- @p{You can also build git from source is @-- here are the steps that I'm
- using to install a new version:
- @pre{GVER=1.7.1
- BASE=http://www.kernel.org/pub/software/scm/git
- TARGET=/usr/local
- cd /tmp; curl $BASE/git-$GVER.tar.gz | gunzip | tar xf -; @;
- cd git-$GVER
- make prefix=$TARGET all && sudo make prefix=$TARGET install}
- If you do this and you want the man pages too, then getting the
- pre-built man pages is the easiest route (building them requires some
- "exotic" tools):
- @pre{cd $TARGET/share/man
- curl $BASE/git-manpages-$GVER.tar.gz | gunzip | sudo tar xf -}}
- @h1{General git setup}
- @p{Commits to a git repository are done locally, so git needs to know who
- you are. (Unlike subversion, where you need to identify yourself to be
- able to talk to the server, so the commit object is created there based
- on who you authenticated as.) To get git to know you, run the following
- two commands:
- @pre{git config --global user.name "My Name"
- git config --global user.email "foo@at-racket"}
- This sets your @em{default} name and email for @em{all} repositories @--
- it stores this information in @path{~/.gitconfig} which is the global
- configuration file, used for such defaults. You can edit this file
- directly too @-- it is in a fairly obvious textual format. There is a
- lot that can be configured, see below for some of these (and see the
- @man{git-config} man page for many more details).}
- @p{In addition to this file, each repository has its own configuration file
- (located at @path{.git/config}). Whenever git needs to check some
- option, it will use both the repository-specific config file (if you're
- in a repository) and the global one. The @cmd{--global} flag above
- tells git to set the option in the global file. Note that a
- configuration file cannot be part of the repository itself @-- so when
- you get a repository, you still need to do any local configuration you
- want. (This is intentional, since the configuration file can specify
- various commands to run, so it avoids a major security hazard.)}
- @p{Important: this sets your default identity name and email for @em{all}
- repositories. This may be a problem if you want to commit to different
- git repositories under different identities. See the section on
- customizing git below for more details on this.}
- @h1{SSH setup}
- @p{Being a distributed system, you can do everything locally on your own
- repository, but eventually you will want to communicate with other
- people and you'll need to push these changes elsewhere. The most
- popular way to communicate with remote repositories @-- including
- repositories on the Racket server, is via ssh. (Access is controlled
- via a tool called "gitolite" @-- more on this below.) The username and
- hostname of the server is "git@at-git-racket" @-- and you should be able
- to connect to this account using the ssh identity key that corresponds
- to the public key that you gave me. To try it, run
- @pre{ssh git@at-git-racket}
- and the server (gitolite, actually) should reply with information about
- your current permissions. The exact details of this is not important
- for now, just the fact that you were able to connect and get some
- reply.}
- @p{Using an ssh configuration file (usually ~/.ssh/config), you can set up
- a short name for the server. For example, you can have this:
- @pre{Host pltgit
- HostName @git-host
- User git}
- and now you can simply use @cmd{ssh pltgit info} instead of the last
- example: @cmd{ssh} will know that @cmd{pltgit} is actually defined as
- @cmd{git@at-git-racket}.}
- @p{This is the @strong{preferred} way to set things up: besides being more
- convenient in that you need to type less -- it is also a useful extra
- level of indirection, so if the server settings ever change (for
- example, we might switch to a non-standard port number), you can simply
- edit your ssh config file, and continue working as usual. In addition,
- such a configuration is needed if you created a specific ssh identity
- file to be used with git -- specifying an alternative identity file on
- the `ssh' command line is possible (an "-i" flag, in the case of
- openssh), but remember that most of your interactions with the remote
- server are done implicitly through git. (It is possible to configure
- how git invokes ssh, but it is much easier to just configure ssh). In
- this case, you will have:
- @pre{Host pltgit
- HostName @git-host
- User git
- IdentityFile ~/.ssh/my-plt-git-identity-file}}
- @h1{Gitolite: the server's gateway}
- @p{All access to the PLT server is done via @cmd{ssh}, and this is where
- gitolite comes in as the "who can do what" manager. What actually
- happens on the server is that no matter what command you're trying to
- run (as you usually would, for example: @cmd{ssh somewhere ls}), the
- server has settings that make it always run its own command @-- and that
- is a gitolite script. The script knows the command that you were
- actually trying to run, and it will reply based on that. In the above
- ssh example, you're not specifying any command (so if it wasn't for the
- pre-set gitolite script, you'd be asking for a remote shell to start),
- and gitolite responds by telling you about your permissions.}
- @p{This is actually the @cmd{info} command, so you get the same reply with
- @cmd{ssh pltgit info}. Again, this connects to ssh and tries to run
- @cmd{info}; gitolite sees that you're trying to run @cmd{info}, and
- instead of running it, it responds with that information. There are a
- few additional commands that you can use this way @-- these are all
- "meta commands" in the sense that you're not interacting with a git
- process on the other end, but rather get gitolite to perform various
- tasks on your behalf. You can run the @cmd{help} command
- (@cmd{ssh pltgit help}) to see a list of available commands. They are
- mostly useful in dealing with your private repositories on the server,
- which will be discussed further below.}
- @h1{A (very) quick introduction to git}
- @p{This is a quick description; see the last section for more resources
- (specifically,
- @a[href: "http://eagain.net/articles/git-for-computer-scientists/"]{
- Git for Computer Scientists}
- covers these basics well). Understanding how git models and stores data
- will make it significantly easier to work with it.}
- @p{A git repository is actually a database of a few kinds of objects, which
- form a DAG. There are only a few of these kinds of objects, and they
- are all addressed by the SHA1 checksum of their contents. You will
- generally see a lot of these SHA1 strings (40 hexadecimal characters),
- since they form a kind of a universal address for such objects. (For
- convenience, any unique prefix of a SHA1 can be used with git commands
- when you need to refer to it.) Whenever the following descriptions
- mention a pointer @-- this is actually such a SHA1 hash.}
- @ul{@li{A @em{blob} object is a generic container for any information,
- which (usually) represents a file. This object has no pointers to
- any other objects. It does not have anything except for the actual
- contents: no name, permission bits, etc.}
- @li{A @em{tree} object represents a directory hierarchy: it contains a
- list of names, and for each name a pointer to the object that is
- its contents. Some of these will point at blobs (when the tree
- contains a file), and some of these will point at other trees (when
- it contains a sub-tree). (These objects are similar to directories
- in a file system in that they contain all "meta" information on
- files: their names and permission bits are kept here.)}
- @li{A @em{commit} object represents a versioned snapshot of a tree,
- forming a line of work. It has the following bits of information:
- @ul{@li{tree: a pointer to the tree object that was committed}
- @li{parent: a pointer to the previous commit, which this one revised}
- @li{author: the identity of the commit author (name, email, date)}
- @li{committer: the identity of the committer}
- @li{the text of the commit message (which can be arbitrarily long)}}
- The parent field is actually any number of parents: there will be no
- parents if this is the first commit in the line of work, or more
- than one parent if this is a "merge" commit that merges two lines of
- work. Furthermore, there is nothing that prevents a git repository
- from having completely separate lines of work @-- in fact, you can
- have several independent projects contained in your repository.
- @br
- @small{(Note that git distinguishes the author of a commit from the
- person who actually performed the commit, for example @-- a patch
- could be created by X, and sent to Y to be committed.)}}
- @li{Finally, there is a `tag' object, which is very roughly a pointer
- to another object (almost always a commit), and is not important
- for now.}}
- @p{The fact that all of these objects are addressed by the SHA1 hash of
- their contents has some immediate important implications.}
- @ul{@li{Since SHA1 are cryptographic checksums, they can be considered
- @em{unique} for all practical purposes.}
- @li{The git repository is inherently hash-consed: you can never have
- "two identical files" in git -- because a file is stored at its
- SHA1 hash, two identical files will always be stored once. (Note
- that the name of a file is stored in the tree that contains it, so
- the SHA1 of the contents does not depend on it.) The same holds
- for two trees: if you have two identical directories (same contents
- of files, same names, etc), then there will actually be only one
- tree stored in the repository.}
- @li{Furthermore, these addresses are actually global: any two
- repositories that hold a file with the same contents will have it
- at the exact same SHA1 hash. (For example, if I have a repository
- that contains several projects, and each project contains several
- copies of the same LGPL text, then I'll have only a single blob
- object with that contents.) This is not only making the store
- efficient, it also makes it possible to refer to an object by its
- hash @-- for example, you can refer to the SHA1 of a specific file
- at a specific version in an email, and this will have the exact
- same meaning for anyone that reads the file (eg, anyone can run
- @cmd{git show @i{SHA1}} to see that file). (This does require that
- the readers have the actual object in their repository, of course
- @-- but no mistakes can happen, statistically speaking.)}
- @li{This holds for commits too: since a commit has the SHA1 of the tree
- it points to, then the commit SHA1 depends on the tree it points
- to. More importantly, since a commit object has the SHA1 of its
- parent(s), then the commit depends on them. This means that
- "replaying" a number of commits on a different parent commit (eg,
- when doing a "rebase") will always result in a separate line of
- commit objects. These SHA1s are also global, meaning that talking
- about a specific revision by its SHA1 will always refer to it
- unambiguously (as long as others have that object in their
- repositories).}
- @li{By itself, this kind of storage @em{cannot} have any reference
- cycle. (At least there is no practical way to get one.) The
- storage is therefore inherently a DAG. In addition to this object
- store, git does have a number of external references (eg, a branch
- is actually a pointer to a SHA1) @-- and those could be arbitrary,
- but the object storage itself cannot have cycles.}
- @li{The fact that a commit has a pointer to a tree is what makes git
- keep revisions of the whole tree @-- a commit cannot mark a change
- to a subtree. (At least not with the usual higher-level commands
- that git implements.)}}
- @p{On top of this object store, there is a layer of meta-information about
- it. The most important component here are branches (and tags). A
- branch is basically a file that has the SHA1 of a specific commit (for
- example, your @cmd{master} branch is a SHA1 that is stored in
- @path{.git/refs/heads/master}). This is what makes branch creation
- extremely cheap: all you need to do is create a new file with the SHA1.}
- @p{In addition, the @cmd{HEAD} (where your working directory is currently),
- will usually have a symbolic reference rather than a SHA1 (you can see
- this symbolic reference in the @path{.git/HEAD} file, which should
- usually look like @cmd{ref: refs/heads/@i{branch-name}}). When you
- commit a new version, a new commit object is created, and the branch
- that the HEAD points to is updated. It is also possible to checkout a
- specific SHA1 of a commit directly @-- the result of this is called
- "detached HEAD", since the HEAD is not a symbolic reference. The
- possible danger in doing this is that @cmd{git commit} will create new
- commits that are derived from the one you're on, but no branch is
- updated; if you later checkout something else, no reference is left to
- your new commit which means that it could be lost now. For this reason,
- if you checkout a SHA1 directly, git will spit out a detailed warning,
- including instructions on how you could name your current position
- (create a branch that points there).}
- @p{Tags come in two flavors: lightweight tags are SHA1 pointers like
- branches. The problem with this is that such a tag could easily move to
- a different commit, which is considered bad practice. For this reason,
- there are also "annotated tags", which are tag objects that are created
- in the object store. These tags contain information that is similar to
- a commit (there's the tagger's identity, the commit that it points to,
- and a log message) @-- and they are reliable since you can refer to
- their SHA1. In this case, the symbolic reference for such a tag (its
- name) will point to the tag object in the store (it is also possible to
- move it, but that would also be bad practice). Furthermore, tags (of
- both kinds) can point to any object in the store @-- they can point to a
- tree or even to a specific blob. This is sometimes used to store
- meta-information (eg, web pages) inside the repository. (The repository
- for git itself has a tag that points to a blob holding the maintainer's
- GPG key.)}
- @p{Note that all of this is under a more high level of managing information
- between branches and repositories, with push/pull being the main
- operations at that level. A high-level overview (more below):
- @ul{@li{a branch is a line of development, represented as a pointer to
- the commit at its tip@";"}
- @li{branches can be organized into hierarchies using "/" as a
- separator@";"}
- @li{some branches are local, and some are remote -- remote ones are
- named @path{remotes/origin/@i{branch}}@";"}
- @li{local branches are represented as files in
- @path{.git/refs/heads/@i{branch}} and remote ones are in
- @path{.git/refs/remotes/origin/@i{branch}}@";"}
- @li{@cmd{origin} is just the conventional name for the original
- repository you cloned @-- later on you can add new remote
- repositories so you can push and pull to/from them
- conveniently@";"}
- @li{some local branches are set to track remote ones, usually (but
- not necessarily) the two will have the same name@";"}
- @li{you can also have local branches to track other local branches
- (with pushing and pulling happening inside your repository)@";"}
- @li{@cmd{git fetch} is used to update your remote branches @-- ie,
- connect to the remote repository, get new commits (and the
- required parents and trees), and update your remote branch with
- the new tips@";"}
- @li{@cmd{git merge} and @cmd{git rebase} are used to update one
- branch with commits on another@";"}
- @li{@cmd{git pull} is, roughly speaking, a convenient way to do a
- fetch followed by a merge (or a rebase, when used with
- @cmd{--rebase}).}}}
- @p{@-- Incomplete @--}
- })
+;; TODO: link man pages
+;; @man and other occurrences of "man page"
+;; @v for meta-vars
+(define intro (let ()
+
+(define (cmd . text) (span class: "code" text))
+(define (path . text) (span class: "path" text))
+(define (man name . text)
+ (a href: (list "http://www.kernel.org/pub/software/scm/git/docs/"
+ (and name (list name ".html")))
+ (if (null? text) (span class: "man" name) text)))
+(define (selflink . url) (a href: url url))
+(define git-host "git.racket-lang.org")
+(define at-racket "@racket-lang.org")
+(define at-git-racket "@git.racket-lang.org")
+(define at-lists-racket "@lists.racket-lang.org")
+(define (npre . text) (apply pre style: "margin-left: 0;" text))
+(define style
+ @style/inline[type: 'text/css]{
+ a:link, a:visited {
+ text-decoration: underline;
+ }
+ .p {
+ display: block;
+ margin: 1em 0;
+ @; text-indent: 1em;
+ }
+ .code, .path, .man, pre {
+ font-family: monospace;
+ font-size: large;
+ font-weight: bold;
+ background-color: #eee;
+ }
+ .code, .path, .man {
+ white-space: nowrap;
+ }
+ .the_text pre {
+ margin-left: 2em;
+ padding-left: 0.6em 0 0.6em 0.6em;
+ }
+ .the_text ul, .the_text ol, .the_text dl,
+ .the_text li, .the_text dt, .the_text dd {
+ margin-top: 1em;
+ margin-bottom: 1em;
+ }})
+
+;; xhtml strict doesn't allow lists inside
, so fake our own paragraphs
+;; using divs:
+(define p* (make-separated-tag values (lambda (text) (div class: 'p text))))
+
+@page[#:title "git intro" #:extra-headers style]{
+
+@sections[#:newpages? #t]
+
+@div[class: 'the_text]{
+
+@section{Getting git}
+@p*{
+ I @strong{highly} recommend getting a new git installation. Git itself is
+ pretty stable (that is, you probably will not run into bugs with whatever
+ version you have installed), but there are many usability related
+ improvements. Specifically, I am using 1.7.x and it is likely that some
+ things in this document are specific to that version.
+@~
+ You can @a[href: "http://git-scm.com/download"]{download a recent version},
+ available in binary form for some popular platforms (RPMs for Fedora and
+ RedHat, Windows, OSX). In addition to these, you can get a build for
+ @ul*{
+ @~ Ubuntu:
+ @pre{sudo add-apt-repository ppa:git-core/ppa
+ sudo apt-get install git-core}
+ @~ OSX using macports:
+ @pre{sudo port selfupdate
+ sudo port install git-core +svn}}
+ (For OSX, you can also get @a[href: "http://gitx.frim.nl/"]{@cmd{GitX}} —
+ it's a good gui front-end for git, similar to @cmd{gitk} and @cmd{git gui}.)
+@~
+ You can also build git from source is — here are the steps that I'm using to
+ install a new version:
+ @pre{GVER=1.7.1
+ BASE=http://www.kernel.org/pub/software/scm/git
+ TARGET=/usr/local
+ cd /tmp; curl $BASE/git-$GVER.tar.gz | gunzip | tar xf -; cd git-$GVER
+ make prefix=$TARGET all && sudo make prefix=$TARGET install}
+ If you do this and you want the @man[#f]{man pages} too, then getting the
+ pre-built man pages is the easiest route (building them requires some
+ “exotic” tools):
+ @pre{cd $TARGET/share/man
+ curl $BASE/git-manpages-$GVER.tar.gz | gunzip | sudo tar xf -}}
+
+@section{General git setup}
+@p*{
+ Commits to a git repository are done locally, so git needs to know who you
+ are. (Unlike subversion, where you need to identify yourself to be able to
+ talk to the server, so the commit object is created there based on who you
+ authenticated as.) To get git to know you, run the following two commands:
+ @pre{git config --global user.name "My Name"
+ git config --global user.email "foo@at-racket"}
+ This sets your @em{default} name and email for @em{all} repositories — it
+ stores this information in @path{~/.gitconfig} which is the global
+ configuration file, used for such defaults. You can edit this file directly
+ too — it is in a fairly obvious textual format. There is a lot that can be
+ configured, see below for some of these (and see the @man{git-config} man
+ page for many more details).
+@~
+ In addition to this file, each repository has its own configuration file
+ (located at @path{.git/config}). Whenever git needs to check some option, it
+ will use both the repository-specific config file (if you're in a repository)
+ and the global one. The @cmd{--global} flag above tells git to set the
+ option in the global file. Note that a configuration file cannot be part of
+ the repository itself — so when you get a repository, you still need to do
+ any local configuration you want. (This is intentional, since the
+ configuration file can specify various commands to run, so it avoids a major
+ security hazard.)
+@~
+ Important: this sets your default identity name and email for @em{all}
+ repositories. This may be a problem if you want to commit to different git
+ repositories under different identities. See the section on customizing git
+ below for more details on this.}
+
+@section{SSH setup}
+@p*{
+ Being a distributed system, you can do everything locally on your own
+ repository, but eventually you will want to communicate with other people and
+ you'll need to push these changes elsewhere. The most popular way to
+ communicate with remote repositories — including repositories on the PLT
+ server, is via ssh. (Access is controlled via a tool called “gitolite” —
+ more on this below.) The username and hostname of the server is
+ "git@at-git-racket" — and you should be able to connect to this account using
+ the ssh identity key that corresponds to the public key that you gave me. To
+ try it, run
+ @pre{ssh git@at-git-racket}
+ and the server (gitolite, actually) should reply with information about your
+ current permissions. The exact details of this is not important for now,
+ just the fact that you were able to connect and get some reply.
+@~
+ Using an ssh configuration file (usually ~/.ssh/config), you can set up a
+ short name for the server. For example, you can have this:
+ @pre{Host pltgit
+ HostName @git-host
+ User git}
+ and now you can simply use @cmd{ssh pltgit info} instead of the last example:
+ @cmd{ssh} will know that @cmd{pltgit} is actually defined as
+ @cmd{git@at-git-racket}.
+@~
+ This is the @strong{preferred} way to set things up: besides being more
+ convenient in that you need to type less — it is also a useful extra level of
+ indirection, so if the server settings ever change (for example, we might
+ switch to a non-standard port number), you can simply edit your ssh config
+ file, and continue working as usual. In addition, such a configuration is
+ needed if you created a specific ssh identity file to be used with git —
+ specifying an alternative identity file on the @cmd{ssh} command line is
+ possible (an @cmd{-i} flag, in the case of openssh), but remember that most
+ of your interactions with the remote server are done implicitly through git.
+ (It is possible to configure how git invokes ssh, but it is much easier to
+ just configure ssh). In this case, you will have:
+ @pre{Host pltgit
+ HostName @git-host
+ User git
+ IdentityFile ~/.ssh/my-plt-git-identity-file}}
+
+@section{Gitolite: the server's gateway}
+@p*{
+ All access to the PLT server is done via @cmd{ssh}, and this is where
+ gitolite comes in as the “who can do what” manager. What actually happens on
+ the server is that no matter what command you're trying to run (as you
+ usually would, for example: @cmd{ssh somewhere ls}), the server has settings
+ that make it always run its own command — and that is a gitolite script. The
+ script knows the command that you were actually trying to run, and it will
+ reply based on that. In the above ssh example, you're not specifying any
+ command (so if it wasn't for the pre-set gitolite script, you'd be asking for
+ a remote shell to start), and gitolite responds by telling you about your
+ permissions.
+@~
+ This is actually the @cmd{info} command, so you get the same reply with
+ @cmd{ssh pltgit info}. Again, this connects to ssh and tries to run
+ @cmd{info}; gitolite sees that you're trying to run @cmd{info}, and instead
+ of running it, it responds with that information. There are a few additional
+ commands that you can use this way — these are all “meta commands” in the
+ sense that you're not interacting with a git process on the other end, but
+ rather get gitolite to perform various tasks on your behalf. You can run the
+ @cmd{help} command (@cmd{ssh pltgit help}) to see a list of available
+ commands. They are mostly useful in dealing with your private repositories
+ on the server, which will be discussed further below.}
+
+@section{A (very) quick introduction to git}
+@p*{
+ This is a quick description; see the last section for more resources
+ (specifically,
+ @a[href: "http://eagain.net/articles/git-for-computer-scientists/"]{
+ Git for Computer Scientists} covers these basics well). Understanding how
+ git models and stores data will make it significantly easier to work with it.
+@~
+ A git repository is actually a database of a few kinds of objects, which form
+ a DAG. There are only a few of these kinds of objects, and they are all
+ addressed by the SHA1 checksum of their contents. You will generally see a
+ lot of these SHA1 strings (40 hexadecimal characters), since they form a kind
+ of a universal address for such objects. (For convenience, any unique prefix
+ of a SHA1 can be used with git commands when you need to refer to it.)
+ Whenever the following descriptions mention a pointer — this is actually such
+ a SHA1 hash.}
+@ul*{
+@~ A @em{blob} object is a generic container for any information, which
+ (usually) represents a file. This object has no pointers to any other
+ objects. It does not have anything except for the actual contents: no name,
+ permission bits, etc.
+@~ A @em{tree} object represents a directory hierarchy: it contains a list of
+ names, and for each name a pointer to the object that is its contents. Some
+ of these will point at blobs (when the tree contains a file), and some of
+ these will point at other trees (when it contains a sub-tree). (These
+ objects are similar to directories in a file system in that they contain all
+ “meta” information on files: their names and permission bits are kept here.)
+@~ A @em{commit} object represents a versioned snapshot of a tree, forming a
+ line of work. It has the following bits of information:
+ @ul*{@~ tree: a pointer to the tree object that was committed
+ @~ parent: a pointer to the previous commit, which this one revised
+ @~ author: the identity of the commit author (name, email, date)
+ @~ committer: the identity of the committer
+ @~ the text of the commit message (which can be arbitrarily long)}
+ The parent field is actually any number of parents: there will be no parents
+ if this is the first commit in the line of work, or more than one parent if
+ this is a “merge” commit that merges two lines of work. Furthermore, there
+ is nothing that prevents a git repository from having completely separate
+ lines of work — in fact, you can have several independent projects contained
+ in your repository.
+ @br
+ @small{(Note that git distinguishes the author of a commit from the person
+ who actually performed the commit, for example — a patch could be created
+ by X, and sent to Y to be committed.)}
+@~ Finally, there is a @em{tag} object, which is very roughly a pointer to
+ another object (almost always a commit), and is not important for now.}
+@p*{
+ The fact that all of these objects are addressed by the SHA1 hash of their
+ contents has some immediate important implications.}
+@ul*{
+@~ Since SHA1 are cryptographic checksums, they can be considered @em{unique}
+ for all practical purposes.
+@~ The git repository is inherently hash-consed: you can never have “two
+ identical files” in git — because a file is stored at its SHA1 hash, two
+ identical files will always be stored once. (Note that the name of a file is
+ stored in the tree that contains it, so the SHA1 of the contents does not
+ depend on it.) The same holds for two trees: if you have two identical
+ directories (same contents of files, same names, etc), then there will
+ actually be only one tree stored in the repository.
+@~ Furthermore, these addresses are actually global: any two repositories that
+ hold a file with the same contents will have it at the exact same SHA1 hash.
+ (For example, if I have a repository that contains several projects, and
+ each project contains several copies of the same LGPL text, then I'll have
+ only a single blob object with that contents.) This is not only making the
+ store efficient, it also makes it possible to refer to an object by its hash
+ — for example, you can refer to the SHA1 of a specific file at a specific
+ version in an email, and this will have the exact same meaning for anyone
+ that reads the file (eg, anyone can run @cmd{git show @i{SHA1}} to see that
+ file). (This does require that the readers have the actual object in their
+ repository, of course — but no mistakes can happen, statistically speaking.)
+@~ This holds for commits too: since a commit has the SHA1 of the tree it
+ points to, then the commit SHA1 depends on the tree it points to. More
+ importantly, since a commit object has the SHA1 of its parent(s), then the
+ commit depends on them. This means that “replaying” a number of commits on
+ a different parent commit (eg, when doing a “rebase”) will always result in
+ a separate line of commit objects. These SHA1s are also global, meaning
+ that talking about a specific revision by its SHA1 will always refer to it
+ unambiguously (as long as others have that object in their repositories).
+@~ By itself, this kind of storage @em{cannot} have any reference cycle. (At
+ least there is no practical way to get one.) The storage is therefore
+ inherently a DAG. In addition to this object store, git does have a number
+ of external references (eg, a branch is actually a pointer to a SHA1) — and
+ those could be arbitrary, but the object storage itself cannot have cycles.
+@~ The fact that a commit has a pointer to a tree is what makes git keep
+ revisions of the whole tree — a commit cannot mark a change to a subtree.
+ (At least not with the usual higher-level commands that git implements.)}
+@p*{
+ On top of this object store, there is a layer of meta-information about it.
+ The most important component here are branches (and tags). A branch is
+ basically a file that has the SHA1 of a specific commit (for example, your
+ @cmd{master} branch is a SHA1 that is stored in
+ @path{.git/refs/heads/master}). This is what makes branch creation extremely
+ cheap: all you need to do is create a new file with the SHA1.
+@~
+ In addition, the @cmd{HEAD} (where your working directory is currently), will
+ usually have a symbolic reference rather than a SHA1 (you can see this
+ symbolic reference in the @path{.git/HEAD} file, which should usually look
+ like @cmd{ref: refs/heads/@i{branch-name}}). When you commit a new version,
+ a new commit object is created, and the branch that the @cmd{HEAD} points to
+ is updated. It is also possible to checkout a specific SHA1 of a commit
+ directly — the result of this is called “detached HEAD”, since the HEAD is
+ not a symbolic reference. The possible danger in doing this is that @cmd{git
+ commit} will create new commits that are derived from the one you're on, but
+ no branch is updated; if you later checkout something else, no reference is
+ left to your new commit which means that it could be lost now. For this
+ reason, if you checkout a SHA1 directly, git will spit out a detailed
+ warning, including instructions on how you could name your current position
+ (create a branch that points there).
+@~
+ Tags come in two flavors: lightweight tags are SHA1 pointers like branches.
+ The problem with this is that such a tag could easily move to a different
+ commit, which is considered bad practice. For this reason, there are also
+ “annotated tags”, which are tag objects that are created in the object store.
+ These tags contain information that is similar to a commit (there's the
+ tagger's identity, the commit that it points to, and a log message) — and
+ they are reliable since you can refer to their SHA1. In this case, the
+ symbolic reference for such a tag (its name) will point to the tag object in
+ the store (it is also possible to move it, but that would also be bad
+ practice). Furthermore, tags (of both kinds) can point to any object in the
+ store — they can point to a tree or even to a specific blob. This is
+ sometimes used to store meta-information (eg, web pages) inside the
+ repository. (The repository for git itself has a tag that points to a blob
+ holding the maintainer's GPG key.)
+@~
+ Note that all of this is under a more high level of managing information
+ between branches and repositories, with push/pull being the main operations
+ at that level. A high-level overview (more below):
+ @ul*{
+ @~ a branch is a line of development, represented as a pointer to the commit
+ at its tip;
+ @~ branches can be organized into hierarchies using @path{/} as a separator;
+ @~ some branches are local, and some are remote — remote ones are named
+ @path{remotes/origin/@i{branch}};
+ @~ local branches are represented as files in
+ @path{.git/refs/heads/@i{branch}} and remote ones are in
+ @path{.git/refs/remotes/origin/@i{branch}};
+ @~ @cmd{origin} is just the conventional name for the original repository you
+ cloned — later on you can add new remote repositories so you can push and
+ pull to/from them conveniently;
+ @~ some local branches are set to track remote ones, usually (but not
+ necessarily) the two will have the same name;
+ @~ you can also have local branches to track other local branches (with
+ pushing and pulling happening inside your repository);
+ @~ @cmd{git fetch} is used to update your remote branches — ie, connect to
+ the remote repository, get new commits (and the required parents and
+ trees), and update your remote branch with the new tips;
+ @~ @cmd{git merge} and @cmd{git rebase} are used to update one branch with
+ commits on another;
+ @~ @cmd{git pull} is, roughly speaking, a convenient way to do a fetch
+ followed by a merge (or a rebase, when used with @cmd{--rebase}).}
+@~
+ There are several git tools that are relevant here. These are @em{not}
+ commands that you need to know for everyday use — so you can ignore this
+ part. It's only relevant if you want to see more of the low level structure
+ (or maybe if you want to write code that interfaces with a repository at this
+ level).}
+@dl*{
+@~ @cmd{git show @i{SHA1}}
+@~ Show the object, in some appropriate way based on the type of the object.
+ (For blobs it shows the contents, for trees you get a listing of its
+ contents, and for commits it shows the log and the patch.)
+@~ @cmd{git cat-file {-t | -s | @i{type} | -p} @i{SHA1}}
+@~ A more low-level command that tells you the type/size of an object (@cmd{-t}
+ and @cmd{-s}), or shows the contents of an object as-is when given a type.
+ @cmd{-p} will “pretty-print” the object, eg, showing the contents of a tree
+ object instead of dumping its binary encoding.
+@~ @cmd{git gc}
+@~ Starts from a rootset holding all known references (branches, tags, etc),
+ and collects dangling objects. Such objects are generated due to various
+ reasons — for example, rebasing means that new commits are generated, and
+ the old ones are kept around. Actually, this will not remove recently
+ referenced objects — there is a protection mechanism that keeps them around
+ for a while, so if you somehow mess things up there is still a way to
+ recover.
+@~ @cmd{git fsck}
+@~ Does a “file system check” on the repository.
+@~ @cmd{git rev-parse @i{symbolic-name}}
+@~ Prints out the full SHA1 of a symbolic name (eg, a branch name or a tag
+ name). Will also print out the SHA1 given a possibly short prefix of one.
+ (Actually, this command can also show other information about a repository,
+ which makes it an important entry point for programs that deal with a
+ repository.)}
+
+@section{Clone the PLT repository}
+@p*{
+ As you probably know by now, in git you don't checkout a repository — you
+ clone it, getting a copy of the complete repository you cloned. This
+ includes the object store and the various references (branches and tags).
+ There are several ways to get the PLT repository, but the one that is
+ relevant to work on it is to do so through ssh — since this allows pushing
+ changes back to the server. (It is also possible to clone from one place and
+ push to another, but if you start with cloning through ssh your clone will be
+ already set up to push changes back.) The information that gitolite gives
+ you (with @cmd{ssh pltgit info}, assuming the above ssh setup) includes two
+ repositories that you have write access to: @cmd{plt} is the main repository,
+ and @cmd{play} is setup similarly (intended to try things out, see the
+ “Fooling around” section below). To get the main repository, run
+ @pre{git clone pltgit:plt}
+ which will create a @path{plt} directory with your new clone. You can now
+ start working in this directory.
+@~
+ The repository is also available from other sources, some can be used for
+ read-only cloning:
+ @ul*{
+ @~ @cmd{git clone git://@|git-host|/plt.git}@br
+ cloning the repository using git's own network protocol
+ @~ @cmd{git clone http://@|git-host|/plt.git}@br
+ clone the repository over http
+ @~ @cmd{git clone http://github.com/plt/racket.git}@br
+ this uses the repository mirror on github (which is automatically kept in
+ sync)}
+ and some present a web interface for additional information:
+ @ul*{
+ @~ @cmd{@selflink{http://@|git-host|/plt}}@br
+ a web interface to inspect the repository
+ @~ @cmd{@selflink{http://github.com/plt/racket}}@br
+ github's fancier web interface}}
+
+@section{Start working: git commits vs subversion commits}
+@p*{
+ As seen in the previous section, you start with
+ @pre{git clone pltgit:plt@";" cd plt}
+ And now you get to actually do some work.
+@~
+ For the normal cycle of operations, working with git is not all that
+ different from working with subversion — you would change some files, and
+ then:
+ @pre{git commit some/paths}
+ or
+ @pre{git commit some/paths -m "add some feature" -m "requires another"}
+ only now the commit lives only in @strong{your clone only}, not in the server
+ (which is why committing is blindingly fast, not requiring a network
+ connection). To push your commits to the server, run @cmd{git push}, and to
+ pull updates from the server run @cmd{git pull}. This is obviously very much
+ oversimplifying the process: mainly neglecting to talk about updates on the
+ server when you already have local changes. (See below for a more detailed
+ explanation.) Note that in these examples I'm explicitly specifying the
+ paths to commit, either the files that you want to commit or a directory
+ where you want to commit all changes. See the section below on the “staging
+ area” for more details.
+@~
+ One major difference to keep in mind is that git commits are @strong{not}
+ like subversion commits. (This is confusing since many places that discuss
+ the difference between the two and/or try to teach git to subversion users
+ almost always work under the assumption that commits in the two systems are
+ the same.) The thing is that git commits are done at a finer level than
+ subversion commits — since a commit is done locally and not on the server.
+ To really imitate how subversion works, you would push all commits right
+ after you create them — essentially equating commits with pushes, which is
+ how you work with a subversion repository. But by just @em{not} doing this,
+ you will immediately get some of the benefits that git gives you. So a
+ better way to think about it is: in git you commit at points that make sense
+ for the respective changes, usually at a finer level than subversion commits.
+ Then, you push back a bunch of commits to the server — whether one or a
+ hundred. The point where you push your changes to the server is effectively
+ the point where you decide that you're in a good enough state to make your
+ work public.
+@~
+ Incidentally, following this intuition, drdr is running a build for every
+ push to the server — not for every commit. When you push to the server, it
+ will tell you which push number this is — these numbers are going to be used
+ by drdr, and they (very!) roughly correspond to subversion commits.
+ (Currently, every push gets a number, but in the future this might be used
+ only for pushes to the master branch.) There's no plan at the moment to use
+ these numbers for anything else.}
+
+@section{Fooling around with git}
+@p*{
+ Experimenting with git is easy to do, and the server is set up to make it
+ even easier. You can use one of the following ways to experiment safely with
+ the main repository:
+ @ul*{
+ @~ There is a @cmd{play} repository on the server. This repository is very
+ similar to the @cmd{plt} repository, and it is set up in the same way that
+ @cmd{plt} is. Feel free to destroy it in any way you want, even if it
+ becomes unusable, it's easy to just recreate it.
+ @~ You can create (and later delete) your own repositories — including making
+ your own copy of the main repository, an operation that is known as
+ “fork”. Your fork will be created efficiently (ie, creating a fork of the
+ plt repository is cheap), but any changes made to it will not affect the
+ main repository. A fork is created with a gitolite command, and once it's
+ there you can clone it and eventually delete it. Here are the relevant
+ commands — use your actual username in place of @cmd{$user} (or have
+ @cmd{$user} set to your username):
+ @ol*{@~ @cmd{ssh pltgit fork plt $user/myplt}
+ @~ @cmd{git clone pltgit:$user/myplt}
+ @~ ...play with this clone, push, pull, etc...
+ @~ @cmd{ssh pltgit delete $user/myplt}}
+ More on user repositories below.}}
+
+@section{The staging area}
+@p*{
+ Something that tends to confuse people is git's “staging area”. This is a
+ concept that is unique to git — roughly speaking, you can have three versions
+ of a tree:
+ @ul*{
+ @~ the files that you actually see (and edit) — the working directory,
+ @~ another is the staging area which you add stuff to from the working tree
+ using @cmd{git add},
+ @~ and then there is the tree that is in the HEAD with all prior versions.}
+@~
+ The thing that can confuse here is that when you @cmd{git add some/file} for
+ a file that you edited (or created) and then edit it further, then the
+ version will get committed by a plain @cmd{git commit} will be the one that
+ was added. Note that @cmd{git status} will tell you which modifications are
+ in the staging area waiting to be committed, and which modifications are in
+ your working directory — in the given example, it will tell you that
+ @path{some/file} is in both.
+@~
+ The staging area can be useful at times, but most likely at the beginning
+ stages you will want to just avoid it. The good news is that it is easy to
+ do so.}
+@dl*{
+@~ @strong{Avoid the @cmd{-a} flag}.
+@~ @p*{
+ Before we see how to ignore it, note that there are many web pages that
+ will tell you to use @cmd{-a} with the commit command. This will make git
+ commit all changes to tracked files — @strong{including tracked files that
+ are outside of your current directory}, and this can make you commit
+ changes that you didn't intend to commit.}
+@~ @strong{Always specify a path to @cmd{git commit}}.
+@~ @p*{
+ The easiest way to avoid the staging area is to specify the path(s) to
+ what you want to commit, possibly @path{.} for all changes in the current
+ directory (and below). Specifying a path this way will make
+ @cmd{git commit} behave very similarly to subversion: tracked files that
+ were modified will get committed, and added files (with @cmd{git add})
+ that are listed in the paths-to-be-committed are also committed. Tracked
+ and added files that are not listed (and not in a specified subdirectory)
+ are left as-is. So, if you had a habit of doing this with subversion
+ (@cmd{svn commit .}), then git will essentially do the same. You will
+ still need to use @cmd{git add} for newly created files, but this is
+ essentially the same as with subversion.
+ @~
+ It is also possible to “make up new git commands” for yourself. See the
+ following section on the subject: it adds a new @cmd{git ci} command that
+ passes @path{.} to @cmd{git commit}, similarly to what @cmd{svn ci} does
+ by default.}
+@~ It is a good idea to @strong{avoid using the @cmd{-m} flag}, until you're
+ more comfortable with git.
+@~ @p*{
+ Let git pop up an editor to write the commit log: the file that you will
+ edit will list the changes that you are about to commit as well as changes
+ that you are not going to commit. Glancing through it, you will see
+ changes that you missed, furthermore, the paths are relative which makes
+ it easy to quickly distinguish paths in the current directory and outside
+ of it (the latter will begin with @path{../}). If you see any problems,
+ just make sure that you have no commit message in the editor and when you
+ quit it git will abort the commit (same as subversion).}
+@~ @strong{Don't push out all commits to the server immediately}.
+@~ @p*{
+ Even if you did commit something by mistake, it is possible to undo the
+ commit — run @cmd{git reset HEAD^}, which will undo the last commit (it
+ moves the branch to the parent commit), and the changes that are no longer
+ committed will be left in your working directory, so you get to try the
+ commit again. Note that this is possible @strong{only if you didn't push
+ out the commit} that you're undoing — if you did, then the server will
+ later not allow you to push changes that are not strict extensions of what
+ it has (since this is likely to confuse other people who already got your
+ commit).
+ @~
+ So in general, remember that you can commit often, and commit when it
+ makes sense to do so, and push commits out only when you're done with
+ whatever you were working on. Consider your local git history as
+ something that you have full control over: you can undo commits and redo
+ them (in fact, @cmd{git commit --amend} does just that: undo the last
+ commit, and combine it with new changes — it's a solution for “oops
+ commits”), you can rebase them, and you can just throw away everything and
+ start from scratch. But when you do push your history out, the party is
+ over, and any mistakes will need to be rectified in further commits (eg,
+ you can no longer use that @cmd{--amend} flag, you have to do an “oops
+ commit”).
+ @~
+ (BTW, strictly speaking, it is only the policy on our server that forbids
+ such rewritten history — since this is likely to be a mistake for now, and
+ if it happens most people will be confused about what needs to be done.)
+ @~
+ Also, as said above, pushing all commits immediately means that you're
+ essentially restricting yourself to the same mode of operation as
+ subversion. Same mode, but more complicated — and you won't enjoy any of
+ the benefits, which will guarantee that you will suffer.}}
+
+@section{Configuring and extending git}
+@p*{
+ As mentioned above, git uses several configuration files that customize
+ various aspects. The two important ones are your global file
+ (@path{~/.gitconfig}) and a per-repository file in @path{.git/config} at the
+ repository root — whenever a value is needed, git will first consult the
+ repository configuration, and if the option is not set there it will try your
+ global version. (It will also look at a system-wide configuration, but this
+ is irrelevant here.) Configuration options names are separated by a
+ @path{.}, and configuration files have a simple syntax, with @cmd{foo.bar}
+ option listed as a @cmd{bar = some value} line in a @cmd{[foo]} section.
+ Note that you can set @em{any} configuration you want to, no restrictions.
+ This can be useful for customizing various extensions, including scripts that
+ you may want to write (for example, the git server has a script that checks
+ the @cmd{hooks.counter} option to know if it should keep track of pushes).
+ This is facilitated by several options to @cmd{git config} which makes it
+ easy to query configurations from scripts etc.
+@~
+ To edit your configuration options you can either use the @cmd{git config}
+ command, or you can edit the file directly. When @cmd{git config} changes
+ the file it rewrites only part of the file and leaves the rest untouched,
+ which conveniently leaves your own format and any comments you might want to
+ include. To set a value through the command and then get it:
+ @pre{git config foo.bar some-value
+ git config foo.bar}
+ and you can add a @cmd{--global} flag to either form to use only your global
+ configuration file. There are many other options — for dealing with keys
+ that must be booleans or integers, for keys with multiple values, etc. The
+ @man{git-config} man page will tell you much more on this.
+@~
+ The man page also lists the configuration options that customize various
+ functionalities. Here are some important ones that you should consider
+ setting (each listed as a command that sets it globally):
+ @ul*{
+ @~ @npre{git config --global user.name "My Name"
+ git config --global user.email "foo@at-git-racket"}
+ @p*{
+ As said in the beginning of this text, you will likely want to set a
+ default username and email for yourself. But note that if you do set
+ this globally, it will be your default identity for all repositories.
+ This makes sense only if you commit to plt-related repositories, but it
+ can be confusing if you're also committing to some other non-plt-related
+ repositories and want to commit under a different email (or name) — for
+ example, you may want to commit to a public project with a gmail
+ address, and to a departmental repository with your
+ @cmd|{foo@cs.bar.edu}| email.
+ @~
+ You could set the racket-lang.org identity locally in your PLT clone or
+ you could set your other identity in the other repository, but in any
+ case you should be aware of this and avoid letting git guess your name
+ and email. (Some confusion is likely to happen anyway, and git has a
+ way to “map” some name/email to another when mistakes happen.)}
+ @~ @npre{git config --global push.default tracking}
+ @p*{
+ By default, when you run @cmd{git push}, git will push all branches that
+ correspond to branches in the remote repository. This can be surprising
+ if you're working on several branches since it will push them all out.
+ Setting this option to @cmd{tracking} will make git push the current
+ branch to the branch it is tracking.
+ @~
+ Another option for this is @cmd{current}, which makes @cmd{git push}
+ always push the current branch to the remote it was cloned from. This
+ is convenient in that you never need to set up how local branches track
+ remote ones — it's as if all local branches @em{always} track all remote
+ branches under the same name. For example, after you clone an empty
+ repository (see the user repositories section below), a @cmd{git push}
+ will push a master branch remotely — whereas with @cmd{tracking} you
+ need to have the first push explicitly specify the branch to push,
+ usually @cmd{git push origin master} (this sets things up so later you
+ can just run @cmd{git push}). However, using @cmd{current} you can no
+ longer push from one local branch to another local branch it is set to
+ track.
+ @~
+ So a possible conclusion here is that you should use @cmd{tracking},
+ unless you plan on branches to always track remote branches by the same
+ name. @cmd{tracking} is often preferred over @cmd{current}.}
+ @~ @npre{git config --global core.excludesfile "~/.gitignore"}
+ @p*{
+ You'll probably want to always ignore a number of common patterns, like
+ backup files or OSX @path{.DS_Store} files. To do this, you first set a
+ default file as shown here (note that @path{~} is quoted, and git will
+ expand it to whatever your home directory is). If you have this
+ setting, you can then create a file at this path with patterns for files
+ that you always want to ignore. This file has shell-patterns (and
+ possibly @cmd{#}-comments) — for example:
+ @pre{# backups
+ *~
+ # autosaves (note the #-quoting)
+ \#*
+ # OSX junk
+ .DS_Store}
+ (See the @man{gitignore} man page for a few more details.)
+ @~
+ In addition to this, git repositories can have their own
+ @path{.gitignore} files (unlike @path{.git/config} files), which are
+ combined hierarchically together with this global option. In fact, you
+ don't really need to set the above ignores for the PLT repository since
+ they're already included in its toplevel @path{.gitignore} file — but
+ doing so is still a good idea since you're likely to work on other
+ repositories too.}
+ @~ @npre{git config --global core.editor emacs
+ git config --global core.pager less}
+ @p*{
+ These two settings are used to tell git which command to use for editing
+ log messages, and which command is used to paginate output. (The former
+ might already be set in your environment as the value of the
+ @cmd{EDITOR} variable.) If you set the latter to @cmd{cat}, git will
+ just spill all output directly. In addition to these, you can also
+ control which individual commands use a pager, for example, to disable
+ the pager for @cmd{git log}, you can do this:
+ @pre{git config --global pager.log false}}
+ @~ @npre{git config --global color.ui auto
+ @i{...}
+ git config --global color.branch.current yellow red bold
+ git config --global color.branch.local yellow
+ git config --global color.branch.remote green
+ @i{...}}
+ @p*{
+ These settings control how git uses colors: whether it shows them, and
+ which colors it will use for various outputs. There are many of these
+ settings, which you can find in the @man{git-config} man page.}
+ @~ @p*{
+ @cmd{remote.origin.*}, @cmd{branch.master.*}
+ @~
+ Git keeps track of what the @cmd{origin} repository and how branches
+ track other branches in the configuration file too. (You will have such
+ entries for all known remote repositories and branches.) Usually you
+ set these values (often implicitly) via various git commands — but you
+ might want to look in your configuration file if you want to tweak
+ things yourself. Note that for configuration names with more than two
+ parts, the section name will something like @cmd{[remote "origin"]}.}
+ @~ @p*{
+ @cmd{sendemail.identity}, @cmd{sendemail.from},
+ @cmd{sendemail.bcc}, @cmd{sendemail.suppresscc}, ...
+ @~
+ These settings configure @cmd{git send-email}, which is used to send
+ patches from your repository elsewhere. You will probably want to
+ customize them if/when you get to use this facility often. (See
+ below.)}}
+@~
+ In addition to these settings, you can extend git with your own aliases and
+ commands. Aliases are stored in your git configuration — so you can use
+ @cmd{git config} to create an alias, for example, @pre{git config --global
+ alias.up "pull --stat --all"} creates a global @cmd{git up} command which is
+ actually a short alias for running @cmd{git pull --stat --all}.
+@~
+ @strong{Notes about aliases:}
+ @ul*{
+ @~ To edit aliases, it is more convenient to edit your configuration file
+ directly.
+ @~ Since aliases are stored in git configuration files, they can be made
+ local to each repository.
+ @~ When command-line arguments are given to the alias, they will be appended
+ to the alias text.
+ @~ Aliases @em{cannot} override git commands; this is intentional, to avoid
+ scripts breaking due to modified commands.
+ @~ An alias that starts with a @cmd{!} character will be run as a shell
+ command. For example, you can use
+ @pre{k = "!gitk -d"}
+ to make @cmd{git k} run the gitk program with the @cmd{-d} flag.
+ @~ Some aliases that I find useful are:
+ @pre{# satisfy the "up instinct"
+ up = pull --ff-only --stat --all
+ # quick status, similar to what subversion shows
+ st = status -s
+ # we will be dealing more with branches
+ br = branch}}
+@~
+ In addition to aliases, you can create new git commands using a script that
+ is called @cmd{git-@i{something}} somewhere in your path. (Note that these
+ cannot override known git commands either.) Such commands will be available
+ as @cmd{git @i{something}}. One use for this is using our facility for
+ managing file properties — the @path{collects/meta/props} program. To do
+ this put this in a file called @cmd{git-prop} somewhere in your PATH, for
+ example, @path{~/bin/git-prop}:
+ @pre|{#!/bin/sh
+ top="$(git rev-parse --show-toplevel)" || exit 1
+ exec "$top/collects/meta/props" "$@"}|
+ then run @cmd{chmod +x ~/bin/git-prop}, and you can now use it as a git
+ command (try @cmd{git prop -h}). Note the use of the @cmd{rev-parse}
+ command: it will display the repository root, which means that you will get
+ to run the props script of the repository you're @em{currently} using.
+ (There are many git commands that are useful for such scripts.)
+@~
+ Another useful script is @cmd{git-ci} which mimics the behavior of
+ @cmd{svn commit} (avoiding confusion with the staging area). As said above,
+ a good way to achieve this is to specify the current directory (@path{.}) if
+ you don't specify any other path. If you save this as @path{git-ci}, you
+ will get a @cmd{git ci} that does just that:
+ @pre|{
+ #!/bin/sh
+ add_dot=yes; for p; do if [ -e "$p" ]; then add_dot=no; fi; done
+ if [ -e "$(git rev-parse --git-dir)/MERGE_HEAD" ]; then add_dot=no; fi
+ if [ -d "$(git rev-parse --git-dir)/rebase-apply" ]; then add_dot=no; fi
+ if [ $add_dot = yes ]; then git commit . "$@"; else git commit "$@"; fi
+ }|
+ This small script will basically check all arguments and see if one is an
+ existing path (or if you're resolving a merge). If none are, it adds
+ @path{.} as a first argument (this avoids confusing it as a value for some
+ flag). Note that this is not completely foolproof: for example, if you'll
+ use the @cmd{-m .} hack, it will assume that you did specify a path. (But
+ you should really avoid such log messages.)}
+
+@section{User repositories}
+@p*{
+ As mentioned above, the plt server allows you to create your own
+ repositories. Repositories on the server can be organized in a nested
+ directory structure, and you “own” all repositories that are in a directory
+ with your username. The gitolite @cmd{info} command that was mentioned above
+ shows you this with a @cmd{C CREATER/.*} line: this means that you can create
+ any repository if it is in a subdirectory with your username. (In this
+ discussion, “the server” is actually the gitolite script that runs on the
+ server.)
+@~
+ Any git operation that you do on a repository that you own which does not
+ exist will make the server create it for you — for example, if you clone such
+ a repository. To run these examples use your git username instead of
+ @cmd{$user}, or simply set the @cmd{user} variable in your shell (as this
+ example shows) — this is only to make copy-pasting easy, of course.
+ @pre{user=eli # your own username here
+ git clone pltgit:$user/foo}
+ (You are encouraged to run these commands — at the end of this section you'll
+ see how you can clean things up.)
+@~
+ What will happen now is (a) git will initialize a @path{foo} repository for
+ you, (b) it will connect to the server to clone its contents, (c) the server
+ will notice that it doesn't exist so it will create it, (d) your git process
+ will clone the empty result. Because the remote repository is empty, git
+ will complain that “You appear to have cloned an empty repository” — this is
+ expected, so you shouldn't worry about it. Once you have your (empty) clone,
+ you can populate it as usual, then push the new content back to the server's
+ copy:
+ @pre{cd foo
+ @i{...create some files...}
+ git add .
+ git commit -m "initial content"
+ git push origin master}
+ Note that the last command explicitly names the branch to push over — once
+ this push is done, git will remember this relation and further pushes can be
+ done with just @cmd{git push}. If you happen to forget this and use
+ @cmd{git push}, then git will not push anything, and it will tell you about
+ it and suggest specifying a branch. On the other hand, if you set the
+ @cmd{push.default} configuration option to @cmd{current} (as described in the
+ customization section above), then even in this first push you can just run
+ @cmd{git push} since git assumes that you always want local branches to
+ correspond to remote branches by the same name.
+@~
+ Instead of cloning a repository to create a new one, you could also start
+ with an existing repository and simply push it to a yet-to-exist repository
+ on the server. Again, the server will see that it doesn't exist and will
+ create it for you (provided that it is in your directory). To continue the
+ above example, I could now create a new repository from @cmd{foo}:
+ @pre{# (still in the foo directory)
+ git push pltgit:$user/bar}
+@~
+ There is, however, an issue of efficiency here: with this last command I just
+ created a second copy of it all. This could be problematic if you have a
+ large repository (eg, a copy of the plt repository). (Note that with
+ subversion this is the only way to do things, but there you would create
+ copies inside the tree, which subversion optimizes.) One nice feature of git
+ is that creating a clone of a repository on the same filesystem will use
+ hard-links for the clone, which makes the clone use very little additional
+ space. But the problem is that you have no access to the plt server. The
+ solution here is in the form of a gitolite @cmd{fork} command (this is
+ actually our own extension) — this command will create a clone on the server,
+ starting from a specified repository. I could therefore create my @path{bar}
+ repository as a copy of @path{foo} with the following:
+ @pre{ssh pltgit fork $user/foo $user/bar}
+ (Note that if you follow these examples and you already have @path{bar}, the
+ server will tell you about it.) The result is a @path{$user/bar} repository
+ that was cloned from @path{$user/foo}, and the two share their store using
+ hard links. If the two repositories are updated with identical content, the
+ new content will not be shared, but for a large repository like the plt
+ repository you still get the benefit of having the bulk of the data shared
+ (the complete store, at the time of forking). To get a feeling for how fast
+ this is, you can now clone the plt repository to your own private copy:
+ @pre{ssh pltgit fork plt $user/myplt}
+ This would seem suspiciously fast for such a large repository — but this
+ repository has most of the data packed (objects in the store are put in large
+ “pack files”), so there are not too many files, and the server-side cloning
+ basically created hard links to these files. The result is fast, efficient
+ (even in speed: when you interact with your clone, files are likely to be
+ paged in memory), and cheap.
+@~
+ As we've seen above, the gitolite @cmd{info} command lists the permissions
+ that you have, but it doesn't show you the actual repositories. For this,
+ there is an @cmd{expand} command. (Yes, this is not a great name; it's
+ related to how gitolite was intended to be used. Remember that there is also
+ a @cmd{help} command that describes the available gitolite commands.) When
+ you run the expand command — @cmd{ssh pltgit expand} — you get a listing of
+ all of the repositories that you can access, each with an indication of read
+ permissions (@cmd{R}) or write permission (@cmd{W}). A @cmd|{@}| indicator
+ means that you have the respective permissions because it is allowed for all
+ users. Each repository is also listed with its owner, or @cmd{} in
+ case it is a globally configured repository.
+@~
+ Some of the gitolite commands are used to configure your repositories — you
+ can only use these with repositories that you own.
+ @ul*{
+ @~ @cmd{getperms} and @cmd{setperms} — these are used to get or set
+ permissions for your repositories. The first will print the current
+ permissions (which will be initially empty), and the second will read the
+ permissions on its input and set them. The format of the permissions is
+ simple: each line begins with an @cmd{R} or @cmd{RW}, and then the
+ usernames that this permission applies to. You can use the magic username
+ @cmd|{@all}| to grant access to everyone in the system. For example, to
+ grant read permissions to everyone, and write permissions to user1, create
+ a file with:
+ @pre|{R @all
+ RW user1}|
+ and then run the setperms commands with this as its input:
+ @pre{ssh pltgit setperms $user/foo < the-file}
+ You can also just run the @cmd{setperms} command and type in the
+ permissions directly. Note that these permissions are not cumulative:
+ every use of @cmd{setperms} specifies all permissions. (We might have a
+ more convenient interface for all of this in the future.)
+ @~ @cmd{config} — this command can be used to set known configuration options
+ in your repositories. It works with sub-verbs:
+ @dl*{@~ @cmd{ssh pltgit config list}
+ @~ Displays the known configuration options
+ @~ @cmd{ssh pltgit config get @i{repo} @i{config}}
+ @~ Displays the configuration value of a specific repository
+ @~ @cmd{ssh pltgit config set @i{repo} @i{config} @i{value}}
+ @~ Sets the configuration value of a specific repository}
+ These configuration options can customize aspects of the scripts that run
+ after every push — currently, you can use this to set an email address to
+ send notification emails to. Other configurations may be added in the
+ future. (Note that this does not let you set any configuration, since
+ some of these can execute arbitrary commands.)
+ @~ @cmd{delete} — finally, this command can be used to delete repositories.
+ For example, to clean up the above, you can now run:
+ @pre{ssh pltgit delete $user/foo
+ ssh pltgit delete $user/bar}
+ The repositories are moved to a temporary holding directory, and will
+ eventually be removed. The bottom line here is that if you lost anything
+ by mistake recently, chances are there's a backup of your repository.}}
+
+@section{Working with git}
+
+@subsection[#:newpage? #f]{basics}
+@p*{
+ The above description is much simplified in that it doesn't deal with
+ development that happens outside of your own work — and such development
+ obviously changes the story. Overall, this is not too different from working
+ with subversion: if there were any changes on the server you need to update
+ your working copy first, and this implies dealing with conflicts if there are
+ any. But the way to deal with such things in git is significantly different
+ than dealing with them in subversion, and this difference is at the technical
+ level (different commands) and at the workflow level (you will likely branch
+ much more, and you're likely to push less frequently than you commit with
+ subversion).
+@~
+ To start working, you first need to get a repository clone. Usually you will
+ clone the PLT repository (or a private copy that you do your work in), but
+ remember that to experiment with git you have the @cmd{play} repository or
+ you could make a fork of the PLT repository to play with and remove it when
+ you're done. Either way, be sure to try these things out — it will make your
+ life much easier in the future.
+@~
+ In the following examples I will use an empty repository to demonstrate
+ things and I will list the exact commands that I'm using (this means that I
+ will use unix commands to create and edit files, and use @cmd{-m} when
+ committing). Lines that I enter are displayed with a @cmd{$} prompt, most
+ output lines are omitted, comments start with @cmd{#}, and @cmd{$user} is set
+ to my username. Note that if you try this yourself, the SHA1s of commits
+ will be different (the reason for that is that a commit object includes the
+ author name and email and the date). Note also that in some places I will
+ “jump to an earlier continuation”: start from an earlier state and do
+ something different — so if you want to try these things out it will be
+ convenient to put the commands in a shell script so you can re-run it to get
+ to the earlier state.
+@~
+ First, I create a private empty repository, populate it, and update the
+ remote repository:
+ @pre{$ user=eli # your own username here
+ $ mkdir /tmp/sandbox; cd /tmp/sandbox
+ $ ssh pltgit delete $user/foo # delete previous repository, if any
+ $ git clone pltgit:$user/foo
+ $ cd foo
+ $ echo "foo" > foo; echo "bar" > bar
+ $ git add .
+ $ git st # uses the `st' alias as shown above
+ A bar
+ A foo
+ $ git commit -m "initial content" .
+ [master (root-commit) 87f1f02] initial content
+ @i{...}
+ # git tells us the branch we committed to, the new commit SHA1 and
+ # that this is the first commit, and the log message; we can verify
+ # this now with `git log'
+ $ git log
+ commit 87f1f02c23b32e7f9b... # this is the commit object I created
+ @i{...}
+ $ git push # (or `git push origin master' if needed)
+ To pltgit:eli/foo # where we pushed to, and the branch
+ * [new branch] master -> master}
+@~
+ @small{[A quick note on commit messages: several git command consider the
+ first paragraph of your commit message as a short description for it. This
+ is all of the text up to the first blank line if you write a commit message
+ in an editor, or the first @cmd{-m} message if you use it with
+ @cmd{git commit} (it accepts multiple @cmd{-m} arguments, for multiple
+ paragraphs). Keep this in mind when composing such messages.]}
+@~
+ To see what happens when multiple people commit to the repository, we create
+ a second clone of our repository now in a @path{foo2} directory:
+ @pre{$ cd ..
+ $ git clone pltgit:$user/foo foo2
+ $ cd foo # go back to foo now}
+@~
+ Lets make two new commits now:
+ @pre{$ echo "more foo" >> foo; echo "more bar" >> bar
+ $ git ci -m "more stuff" # uses the `git-ci' script from above
+ [master b7d3c41] more stuff
+ $ echo "even more foo" >> foo
+ $ git ci -m "even more stuff"
+ [master 18bc0e6] even more stuff}
+@~
+ At this point, instead of blindly pushing these commits, lets around first.
+ One useful tool for inspecting the history is @cmd{gitk} — if you run it now,
+ you will see the simple 3-commit graph, and two of them are marked as
+ branches — clearly showing that your local master branch is 2 commits ahead
+ of the remote one. This could be different at this point: someone else might
+ have pushed more commits to the remote — remember that your remote master
+ branch (@cmd{remotes/origin/master}) is not really what's on the remote, but
+ rather what you know about it last time you pulled from it.
+@~
+ Another useful command to examine the history is @cmd{git log}, which can
+ show commit history in many ways. As things stand in the current repository,
+ if you just run @cmd{git log}, you will see a listing of the same three
+ commits that gitk shows. To get a more condensed format with
+ one-line-per-commit, use @cmd{--oneline}. Another thing that you can do is
+ show a specific range of commits — you can do this by specifying two
+ revisions separated with @cmd{..}, where the revisions can be written
+ explicitly using the (short prefix) SHA1 form, or more conveniently using a
+ symbolic name (eg, branch, tag, HEAD):
+ @pre{$ git log origin/master..master}
+@~
+ Since we @em{are} currently on the master branch, we could use @cmd{HEAD} for
+ the second one (@cmd{origin/master..HEAD}), but this is also the default, so
+ an even shorter form is @cmd{origin/master..}. In addition to @cmd{git log},
+ you can also use @cmd{git diff} in a similar way, but instead of a commit
+ listing, you get the diff between the two specified points, so
+ @pre{$ git diff origin/master..}
+ will show you the changes that you did not yet push. Note that there are a
+ number of places where git will guess the full name of branches, for example,
+ @cmd{origin/master} is actually a short name for @cmd{remotes/origin/master}.
+ In a similar way, just @cmd{origin} will make git guess that you're talking
+ about @cmd{origin/master}.
+@~
+ In these cases, the revision specification for @cmd{log} and @cmd{diff} are
+ the same, but this is a little misleading: @cmd{git diff} usually works by
+ comparing two specific end points in your history, but @cmd{git log} actually
+ works on a @em{set of commits} rather than on a range. The @cmd{R1..R2}
+ notation is actually shorthand for @cmd{^R1 R2} — specifying a commit means
+ “the commit and all of its parents”, and a @cmd{^} prefix negates a set, so
+ @cmd{^R1 R2} means “include the set of commits leading to R2 (inclusive), but
+ exclude the ones leading to R1 (inclusive)”.
+@~
+ In addition to this range/set specification, there is a lot to specifying a
+ revision set. As mentioned, you can use a SHA1 (or a shorter unique prefix),
+ or you can use a symbolic name. You can also use @cmd{R^} for the parent of
+ @cmd{R} (the first parent in the case of merge commits, which have more than
+ one parent), @cmd{R^^} would be the grand parent, @cmd{R~3} is the
+ 3rd-generation parent commit. There is also @cmd|{R@{yesterday}}|,
+ @cmd|{R@{1 month 2 weeks go}}| etc for a symbolic name R — which refers to
+ the branch/HEAD at that point in time (this refers to @em{your own} version
+ of it at that time; there are @cmd{--since} and @cmd{--until} flags to filter
+ commits by the time they were made at). You can also use a -N flag (where N
+ is an integer) to show only N commits. Finally, you can use a branch name R
+ with @cmd|{R@{upstream}}| (short: @cmd|{R@{u}}|) to refer to the “upstream”
+ version of the branch — the branch that R is set to follow. This is
+ particularly convenient for things like @pre|{$ git log --oneline @{u}..}|
+ which will always show the commits that you have over the branch your current
+ one follows. (For example, you could set up an alias to use this.)
+@~
+ We now continue by pushing our two commits to the remote server. Since we
+ already did a push, a plain @cmd{git push} works fine.
+ @pre{$ git push # no need to specify a branch now
+ To pltgit:eli/foo
+ 87f1f02..18bc0e6 master -> master}
+@~
+ As you can see in the last line, git tells us that we pushed from our local
+ @cmd{master} branch to the remote @cmd{master} branch, and that this made it
+ advance from the first commit we pushed (87f1f02) to the last one we created
+ now (18bc0e6).
+@~
+ Since we've made some progress in one place, we can go to the @path{foo2}
+ clone to see what happens when we update a repository that did not have these
+ changes:
+ @pre{$ cd ../foo2
+ $ git pull
+ @i{...}
+ From pltgit:eli/foo
+ 87f1f02..18bc0e6 master -> origin/master
+ Updating 87f1f02..18bc0e6
+ Fast-forward
+ bar | 1 +
+ foo | 2 ++
+ 2 files changed, 3 insertions(+), 0 deletions(-)}
+@~
+ This looks expected — git shows the new commits that we received, and they're
+ the same as what we pushed earlier. Using @cmd{git pull} is actually doing
+ two things: it's running @cmd{git fetch} first to update your remote branch
+ from the server, and then it uses @cmd{git merge} to merge it into your
+ master branch. The point where @cmd{get merge} starts is the “Updating” line
+ — and there's an important thing to note here: the next line says
+ “Fast-forward”, which is a special kind of a merge. When you merge some
+ branch into your branch, and this branch is a proper superset of your branch
+ (it has commits that your branch doesn't, and all commits in your branch are
+ included in it), git will simply “move your branch forward” to the other: it
+ will update your branch to the tip of the merged one, and then your working
+ directory will be update accordingly.
+@~
+ It is often better to do the fetch first, so you can see the changes that
+ happened remotely before you merge them. To do this we're going to use
+ @cmd{git fetch}, avoiding the merge step that @cmd{git pull} does. In fact,
+ since creating a merge commit is something that you might want to always do,
+ @cmd{git pull --ff-only} will only do the merge if it will be a fast-forward
+ merge.
+@~
+ Assuming we start again from the @path{foo2} repository before the above
+ pull, we get the same output up to the point where the merge started:
+ @pre|{$ git fetch
+ @i{...}
+ From pltgit:eli/foo
+ 87f1f02..18bc0e6 master -> origin/master
+ $ git log --oneline @{u}..
+ # nothing
+ $ git log --oneline ..@{u}
+ # the same two commits}|
+ The first log doesn't show anything, since we have no commits over the ones
+ in the remote (the “upstream” of our current branch). To see this, consider
+ that after expanding empty names to @cmd{HEAD}, and the @cmd|{@{u}}| to the
+ remote branch name we get @cmd{remotes/origin/master..master}, and this is
+ short for @cmd{^remotes/origin/master master} — the set of commits made of
+ our master branch and all parents, minus the set of commits from the remote
+ branch and up — since it's ahead of that, we get an empty set. The second
+ log command reverses the two, giving us the set of commits that the remote
+ has and the local branch doesn't. In addition to @cmd{git log}, you can use
+ @cmd{gitk} to inspect the repository: use the @cmd{--all} flag to make it
+ show all branches. Either way you'll be able to see that a fast-forward
+ merge is possible.}
+
+@subsection{concurrent development}
+@p*{
+ Again, we'll assume starting with the @path{foo2} repository before the pull.
+ We will now create a new commit before we get changes. This makes it similar
+ to commits pushed to the server while you do your work — so let's see how
+ this common story goes and try to push this change:
+ @pre{
+ $ echo "blah blah" > blah
+ $ git add blah
+ $ git ci -m "blah"
+ $ git push
+ To pltgit:eli/foo
+ ! [rejected] master -> master (non-fast-forward)
+ error: failed to push some refs to 'pltgit:eli/foo'
+ To prevent you from losing history, non-fast-forward updates were rejected
+ Merge the remote changes before pushing again. See the 'Note about
+ fast-forwards' section of 'git push --help' for details.}
+@~
+ As expected, git refuses to push our change. The terminology in the error
+ message is a little confusing — what it basically says is that the commit(s)
+ that we are trying to push are not an extension of the tip of the master
+ branch on the server. A “non-fast-forward update” in this case would mean
+ that we'd set the master branch on the remote to be the same as our branch —
+ but this means that whatever commit that were pushed to the server (the two
+ commits we pushed out from the @path{foo} clone in this case) will be lost.
+@~
+ To merge in the remote changes, we need to pull them in. We'll now look at
+ three different ways to do this.}
+@h3{1. Playing it safe}
+@p*{
+ First, as we've seen above, doing a separate @cmd{git fetch} step would allow
+ you to see where things stand before you do anything. Alternatively, we can
+ use the @cmd{--ff-only} variant of @cmd{git pull}, which will do a merge only
+ if it's a fast-forward one, covering the trivial cases. If a fast-forward
+ merge cannot be done, it will tell you about it and then stop:
+ @pre{$ git pull --ff-only
+ @i{...}
+ From pltgit:eli/foo
+ 87f1f02..18bc0e6 master -> origin/master
+ fatal: Not possible to fast-forward, aborting.}
+@~
+ We can now use the usual tools to see where things stand. The following are
+ all useful here:
+ @ul*{
+ @~ @cmd{gitk --all}@br
+ This visualizes the commit graph. If you do this, you will see the four
+ commits that we have so far: the initial commit at the root, the commit
+ that we did in this clone, and the two commit that we retrieved and are
+ waiting on @cmd{remotes/origin/master}.
+ @~ (a) @cmd|{git log @{u}..}|@br
+ (b) @cmd|{git log ..@{u}}|@br
+ (c) @cmd|{git diff @{u}..}|@br
+ (c) @cmd|{git diff ..@{u}}|@br
+ Inspect the commit that the local branch has over the remote (a), and the
+ two that the remote has over the local one (b); look at the difference
+ between the local branch and the remote either way (c).
+ @~ (a) @cmd|{git log --left-right --oneline @{u}...}|@br
+ (b) @cmd|{git log --left-right --oneline ...@{u}}|@br
+ (c) @cmd|{git log --graph --oneline --all}|@br
+ An alternative notation for specifying commit sets for @cmd{git log} is
+ @cmd{R1...R2} (with @em{three} dots) — this stands for all commits from
+ both R1 and R2 and their parents, but excluding commits from their “merge
+ base” — the parent commit that both descend from. As demonstrated in (a),
+ this is especially useful with the @cmd{--left-right} flag: you'll see the
+ commits that are new on the remote branch and the ones that are new on the
+ local one, with @cmd{<} or @cmd{>} indicating which side each commit is
+ coming from. Yet another way that @cmd{git log} can be used is (c) with a
+ @cmd{--graph} flag, which makes it render the commit graph in ASCII-art.
+ @~ @cmd{git show-branch -a}@br
+ This is another potentially useful commands that shows how commits are
+ distributed over branches in your repository. In this case you will see a
+ matrix with the four commits in separate rows, and each will have a
+ @cmd{+} or @cmd{*} indicating whether it is included in a branch.
+ @~ (a) @cmd|{git diff @{u}...}|@br
+ (b) @cmd|{git diff ...@{u}}|@br
+ The three-dots notation is also used by @cmd{git diff}, with a slightly
+ different semantics than in @cmd{git log} (remember that @cmd{git log}
+ talks about commit sets, and @cmd{git diff} compares two specific points).
+ In the @cmd{diff} case, these compare a specific branch tip to the merge
+ base of this branch and another, which means that you see a diff with the
+ work done on one branch that is not included in the other (this is unlike
+ the two-dot syntax where you get the diff between the two branch tips).
+ In the first (a) example you will see all changes that you did locally,
+ and in (b) the changes that were done remotely.}}
+@h3{2. Merging (not the fast-forward variant)}
+@p*{
+ Now that we did a @cmd{git fetch} or a @cmd{git pull --ff-only} to update the
+ remote branch, we can proceed with merging it into our branch — which we can
+ do in one of two ways:
+ @pre{$ git merge origin # merge origin/master into master
+ Merge made by recursive.
+ @i{...}
+ # -or-
+ $ git pull # will do a merge as usual}
+@~
+ Using @cmd{merge} is better for the usual reason: @cmd{git pull} can bring in
+ more updates that were pushed by others since you fetched, and include them
+ in the merge. Note that if instead of running a separate fetch or pulling
+ with @cmd{--ff-only} you were using @cmd{git pull}, then you'd essentially
+ get to the same point we are now at.
+@~
+ Either way, the merge tells us “Merge made by recursive” — which is important
+ here: if we see “Fast-forward” it means that no new commits were made, but
+ “Merge made ...” means that a merge commit was created (“by recursive” refers
+ to the merge strategy, git has several of them). If we look at the commit
+ graph now with gitk or with @cmd{git log}, we'll see the new commit that was
+ created:
+ @pre{$ git log --oneline --graph --date-order
+ * 12bf7ee Merge remote branch 'origin'
+ |\
+ * | a40a45f blah
+ | * 18bc0e6 even more stuff
+ | * b7d3c41 more stuff
+ |/
+ * 87f1f02 initial content}
+ The merge was successful without manual intervention (you didn't need to
+ resolve any conflicts), so it proceeded immediately to create the commit
+ that connects the two lines of development, and used some standard
+ template for the commit log. This is the safe thing to do as the
+ default for git, but in this case it makes the history complicated with
+ no good reason — it just happened that while we were working on @cmd{blah}
+ someone else pushed an unrelated change. (It could be related: perhaps
+ one of the remote commits was referring to the file that I've added, but
+ especially in the PLT case this would be very rare.) If you use gitk to
+ look at the recent history of the repository, you'll see many such
+ commits, and they can make it harder to figure out how the development
+ went on.}
+@h3{3. Rebasing}
+@p*{
+ To get a simple/readable history, the goal is to have a more linear history:
+ just have the two remote commits and move our commit to follow them (which
+ would be the history that you get with subversion under similar conditions).
+ But our commit object already points to its parent, so it @em{cannot} move:
+ in the above graph, @cmd{a40a45f} is a hash that was computed based on
+ @cmd{87f1f02} being its parent, and changing a parent means getting a new
+ hash and therefore a new commit object.
+@~
+ This is where @cmd{git rebase} gets into the picture. Assuming that we
+ didn't merge as described above, we would just use @cmd{rebase} instead:
+ @pre{$ git log --graph --all --oneline
+ * a40a45f blah # \
+ | * 18bc0e6 even more stuff # \
+ | * b7d3c41 more stuff # > same tree as before the merge
+ |/ # /
+ * 87f1f02 initial content # /
+ $ git rebase origin
+ First, rewinding head to replay your work on top of it...
+ Applying: blah # git tells us that it re-applies this
+ $ git log --graph --all --oneline
+ * 6ebd1fb blah # \
+ * 18bc0e6 even more stuff # \ the resulting history
+ * b7d3c41 more stuff # / is linear
+ * 87f1f02 initial content # /}
+@~
+ Here's what git did in this rebase: it (1) moved the HEAD to the merge base
+ between your local branch and the remote one — the @cmd{87f1f02} commit; (2)
+ did a fast-forward merge, which just moves your local branch to the tip of
+ the remote one; (3) it now @em{replays the same changes} that you had in your
+ commits (only @cmd{a40a45f} in this case) on top of the new tip, leading to
+ @em{new} commit objects (@cmd{6ebd1fb} in here). So we end up with a fresh
+ commit object, and the old one (@cmd{a40a45f}) is gone. (It's not really
+ gone — it's kept in your repository store for a while, to protect you from
+ losing work.) If you look at the complete details of the new commit using
+ gitk or @cmd{git log}, you will see that this commit has different dates for
+ the author and the committer date:
+ @pre{$ git log --format=fuller -1 6ebd1fb # all details, show only one commit
+ @i{...}
+ AuthorDate: 2010-05-02 10:26:00 -0500
+ @i{...}
+ CommitDate: 2010-05-02 10:30:00 -0500}
+ This is because rebasing just created a new commit — but the author time is
+ still considered the same. In any case, you now have linear history that is
+ a proper descendant of the remote branch, and you can now push your changes
+ out.
+@~
+ Practically every place where you read about rebasing in git will warn you
+ about not doing it for public history. The problem is that if someone had a
+ copy of your previous branch (@cmd{a40a45f}), then next time they will
+ update, if their copy of the branch is updated, then things will change in a
+ nasty way: even more so if they've committed more changes on that branch.
+ (This “someone” can also be yourself, of course.) This also explains why
+ @cmd{git pull} does not rebase by default.
+@~
+ As far as the PLT repository goes, server will never allow pushes that are
+ not strict extensions of what's on it (in other words, it only allows
+ fast-forward pushes) — so you won't be able to mess things up for others.
+ But as long as your change is something that you work on privately, there is
+ absolutely no problem in doing this. Note that this is the same thing as
+ with re-doing commits because of mistakes: as long as a commit did not go
+ out, you can fix it in multiple ways; when it does get pushed to the server,
+ the only practical way to fix it is by pushing another commit. (In some
+ extreme cases we may do such a thing: for example, if you commit a passwords
+ file, then there is no other way to remove it completely from the repository
+ — but these are very rare, and such fixes affect everyone.)
+@~
+ It is therefore best to get one of two habits when you do a @cmd{git pull}:
+ either use @cmd{--ff-only} or @cmd{--rebase}. The latter is a little more
+ convenient but you might not feel comfortable about doing a rebase
+ automatically — it @em{might} just be that someone worked on the same set of
+ files, and you really prefer a plain merge. For this, you might prefer using
+ @cmd{--ff-only} which will automatically work in the trivial cases, and
+ otherwise leave you in a state where you can look at things and decide how to
+ proceed yourself.}
+
+@subsection{additional forms of history tweaking}
+@p*{
+ As described in the previous section, rebasing is not some kind of a magical
+ operation: it is really just an expected by-product of the way git works — of
+ the fact that commits can be created as descendants of any commit (not just
+ tip commits). You could perform a rebase manually by starting from some
+ commit, then inspect each of the changesets that the rebased history
+ contains, and play them back on the new commit. (Lumping this tediousness
+ lead to a script, which lead to the rebase command.) This means that you
+ don't really have to limit yourself to replaying these commits exactly as
+ they were — for example, you could write new commit log messages, combine two
+ commits into one, drop some commits, or reorder their order.
+@~
+ The rebase command has a flag that makes doing all of these things easy:
+ @cmd{--interactive}. Continuing the above example, we now have four commits
+ in our history — and say that we want to tweak the last two. If we now run
+ @pre{$ git rebase HEAD~2}
+ we ask git to rebase our current head off of its grandparent commit (remember
+ that @cmd{HEAD^} is the parent, @cmd{HEAD^^} is the grandparent, and
+ @cmd{HEAD~2} is an alternative syntax for @cmd{HEAD^^}). Since it @em{is}
+ already based there, @cmd{git rebase} does nothing, and tells us that the
+ branch is up to date. But if we add @cmd{--interactive}, we get something
+ different: git pops up an editor with this text:
+ @pre{pick 18bc0e6 even more stuff
+ pick 6ebd1fb blah
+ # Rebase b7d3c41..6ebd1fb onto b7d3c41
+ # @i{...}}
+ This is a listing of the last two commits with their one-line log messages.
+ As the text that is below these lines say, you can replace the @cmd{pick}
+ before a commit with a different command: you can use @cmd{reword} to get to
+ write a different log message (you will get another editor window to do the
+ editing), @cmd{squash} to combine a commit with the previous one (it will let
+ you edit the log message for the combination), @cmd{fixup} which does the
+ same but discards the log message, and finally @cmd{edit} will make the
+ rebasing process stop at the relevant commit and let you tweak it before it
+ continues. In addition, removing a commit line means that the commit will be
+ skipped, and reordering lines will replay the commits in a different order.
+ If any of these lead to conflicts, the rebasing will stop for a manual
+ resolution, and you'll need to @cmd{git rebase --continue} when resolved, or
+ @cmd{git rebase --abort} to get back to the original state.
+@~
+ Note that since our @path{foo2} clone tracks a public @path{foo} repository,
+ this particular rebase is bad: we intend to edit the last two commits, but
+ only one is local to @path{foo2} — the other is a commit that we got from
+ @path{foo}, and changing it means that we will get a rebased commit based on
+ its parent, and the server will forbid pushing it later. If you see that you
+ went too far when you see the rebase editor, all you need to do is keep those
+ lines untouched: in the trivial case of leaving an initial set of commits in
+ unmodified, they will be “rebased” by leaving them in as is. (If you inspect
+ the history later, you will see that they have the same SHA1s.)
+@~
+ A useful case of using @cmd{squash} (or @cmd{fixup}) with interactive
+ rebasing is doing @cmd{checkpoint} commits frequently, and eventually
+ combining them to a single commit. This demonstrates one a popular git
+ principle: keep commits as logical units that correspond with the changes
+ done, since there is no central server that dictates a
+ public-commit-or-nothing. Doing these will make it easier in the future to
+ deal with the history: inspect the changeset as a whole, undo it, and it also
+ works well with finding bugs in the history — you can have checkpoints for
+ intermediate states of the code even if it doesn't work, since this state
+ will eventually be hidden.
+@~
+ A particularly common case for editing history is “oops commits”: you just
+ made some change, committed it, and then realized that something is wrong —
+ you forgot to change some related reference, to remove some debugging
+ printout, or to describe some new aspect of the commit. You could use
+ @cmd{git rebase HEAD^} in these cases to rebase just the last commit while
+ editing it, but there is a much more convenient way to do this: @cmd{git
+ commit --amend}. Usually, @cmd{git commit} creates a new commit based on the
+ current branch tip and a given commit log message, but with @cmd{--amend} it
+ does something different: it takes a snapshot of the tree as usual, but it
+ makes the commit be a descendant of the tip's parent commit. For example,
+ assuming we didn't really change anything with the rebase above, our history
+ and recent change is now:
+ @pre|{$ git log --oneline
+ 6ebd1fb blah
+ 18bc0e6 even more stuff
+ b7d3c41 more stuff
+ 87f1f02 initial content
+ $ git log --oneline -p -1 # -p => show patch, -1 => only one
+ 6ebd1fb blah
+ diff --git a/blah b/blah
+ @i{...}
+ --- /dev/null
+ +++ b/blah
+ @@ -0,0 +1 @@
+ +blah blah # this is the recent change}|
+ This is obviously wrong — we need to have three “blah”s there. With
+ subversion we would now need to perform an “oops, forgot a blah” commit, and
+ in fact, we would need to do the same with git if we push these change out
+ now. But as long as we didn't, we can fix it without an additional commit
+ using @cmd{--amend}:
+ @pre{$ echo "blah blah blah" > blah
+ $ git add blah
+ $ git ci --amend -m "blah^3"
+ [master 5cf863d] blah^3 # the re-made commit
+ $ git log --oneline
+ 5cf863d blah^3
+ 18bc0e6 even more stuff
+ b7d3c41 more stuff
+ 87f1f02 initial content}
+ As you can see, the last commit is gone (remember that it is still backed up,
+ in case of problems), and there is a completely new commit instead. Usually,
+ @cmd{--amend} is used without @cmd{-m} — the log message editor will be
+ initiated with the previous log message, so you can edit it instead of
+ rewriting it from scratch. If there were no modifications to be committed, a
+ @cmd{git commit --amend} is a convenient way to edit the last commit message
+ only. If you leave the message untouched, a new commit will still be made —
+ one with a new commit time; and if you delete the text completely, the
+ re-commit will be aborted, and you will be left with the old one intact.}
+
+@subsection{resetting the tree}
+@p*{
+ Both the @cmd{commit --amend} feature and rebasing build on the ability to
+ “move” the current branch tip to some earlier commit in its history. To do
+ this directly, git provides a @cmd{git reset} command, which can move the
+ current branch tip to a specified commit, and adjust the working directory
+ and/or the staging area accordingly. For example, for the @cmd{--amend}
+ functionality, you will use @cmd{HEAD^} to move the branch tip to its parent
+ commit. You can of course specify any other commit to move to, and since git
+ branches are effectively short bookmarks, you can create branches to be able
+ to move to them later on (or as targets for rebasing, merging, etc). In
+ addition, you can use @cmd{HEAD} (or just omit the target, since @cmd{HEAD}
+ is the default) to only change the working directory (or staging area).
+@~
+ The reset command has three major modes for its work, specified with a flag.
+ (See the @man{git-reset} man page for a more thorough explanation, with lots
+ of usage examples, some have evolved into their own functionality — like the
+ @cmd{--amend} feature.) Using @cmd{HEAD^} as the target commit, here are
+ some summaries of how it can be used:
+ @ul*{
+ @~ @cmd{git reset --hard HEAD^}@br
+ This will move the branch tip to the previous commit, and will change the
+ working tree and the staging area to match. Translation: completely
+ forget the last commit and any work in the working directory.
+ @~ @cmd{git reset --soft HEAD^}@br
+ Moves the branch tip, but does not change the working directory or the
+ staging area. Translation: undo the last commit, and leave your working
+ directory in a state where @cmd{git commit} will get the same change in.
+ (Note: not the @cmd{git ci} script — this will add changes in the working
+ directory, if any.)
+ @~ @cmd{git reset --mixed HEAD^}@br
+ Moves the branch tip and the staging area to the parent commit, and any
+ modifications done by the commit that are going to be lost are put in your
+ working directory. Translation: similar to the @cmd{--soft} version,
+ except that the staging area is cleared, so to recommit the changes you
+ will need to add files again (or use @cmd{git ci} as usual).
+ @br
+ (Note: This is the default mode.)}
+@~
+ When using @cmd{HEAD} (which is the also default when nothing is mentioned),
+ the branch tip is not moved, and we get:
+ @ul*{
+ @~ @cmd{git reset --hard}@br
+ Get rid of all changes to the working directory and the staging area.
+ Translation: lose all work that was not committed, getting back to the
+ content on the branch (a convenient way to do something similar to an
+ @cmd{svn revert -R .} in the root of a subversion working directory).
+ @~ @cmd{git reset --mixed}@br
+ Get rid of all changes to the staging area, leaving your working directory
+ intact. Translation: lose everything that was added to the staging area.
+ If you're avoiding it (for example, if you only use the @cmd{git ci}
+ script), then this would be a no-op other than new files that were
+ @cmd{git add}ed.
+ @~ (@cmd{git reset --soft} is a no-op.)}
+@~
+ When @cmd{git reset} changes the HEAD, it creates another toplevel reference
+ name called @cmd{ORIG_HEAD} that points to the previous commit that
+ @cmd{HEAD} pointed at, so if you happen to @cmd{git reset --hard HEAD^} by
+ mistake, you can immediately get back to it with
+ @cmd{git reset --hard ORIG_HEAD} (but changes in the working tree would still
+ be lost). (@cmd{git merge} is another command that changes the @cmd{HEAD},
+ and it also saves the previous value in @cmd{ORIG_HEAD}.) Finally, note that
+ @cmd{git reset} can be restricted to make it work only on a specific set of
+ paths, not on the whole repository.}
+
+@subsection{other forms of reverting}
+@p*{
+ While we're on the topic of reverting files, there are three more things
+ worth mentioning:
+ @ul*{
+ @~ @cmd{git checkout -- @i{path ...}}@br
+ @p*{
+ When @cmd{git checkout} is given some paths, it will only check out the
+ relevant files from their state in the staging area. This is a more
+ popular way to revert changes to a specific file. If you avoid using
+ the staging area, then this is roughly the same as using reset with the
+ @cmd{--hard} flag on the paths, since your staging area will usually be
+ the same as your @cmd{HEAD}. (Note that the @cmd{--} is optional, and
+ needed only when a path name can be confused with a branch name.)
+ @~
+ As with @cmd{reset}, you can also specify a branch to check the path(s)
+ from — which is useful to try some files from a different branch
+ selectively. However, note that unlike subversion, git does not
+ remember the association of the branch and the paths that were checked
+ out of it (the branch is not “sticky”) — the files will simply be
+ considered as modified (and they will not be updated when the branch is,
+ unless you do the same checkout).}
+ @~ @cmd{git show HEAD:@i{path}}@br
+ @p*{
+ This shows the file as it exists in the @cmd{HEAD}, making it useful to
+ inspect the file before you made some additional modifications (similar
+ to @cmd{svn cat @i{path}}). You can also omit the @cmd{HEAD} — using
+ @cmd{:@i{path}} will show the file in the staging area, which will
+ usually be the same as the @cmd{HEAD}. One caveat to note here is that
+ the path should be the full path relative to the repository root.
+ (Note: I have a wrapper @cmd{git cat} script that emulates
+ @cmd{svn cat}, I'll add it if anyone wants.)}
+ @~ @cmd{git revert @i{commit}}@br
+ @p*{
+ The @cmd{git revert} command is used to revert the changes introduced by
+ the given commit. It will basically apply the change in that commit
+ in reverse, then ask you for a log message for a new commit where the
+ message is initially populated with text indicating the commit that
+ was applied in reverse.
+ @~
+ Note that this is very different from @cmd{svn revert} — it is more like
+ @pre{svn merge -c-123@";" svn commit "Revert revision 123"}
+ Since this is a frequent source of confusion, the @man{git-revert} man
+ page mentions it at the top, and it refers readers to @cmd{git reset}
+ and @cmd{git checkout} as the way to do the equivalent of @cmd{svn
+ revert} (which are described above.)}}}
+
+@subsection{dealing with conflicts}
+@p*{
+ We'll now see how to deal with merge conflicts. First, we'll set up the
+ repository for a conflict. Continuing with the @path{foo2} clone, we'll
+ first create a file (which I'll do here using shell commands, to make it easy
+ to play with), commit, and push the new history (which includes the blah
+ work) back to the server. Note the use of @cmd{git branch -v} which shows
+ the local @cmd{master} branch and the fact that there's two commits that we
+ haven't pushed out yet.
+ @pre{$ echo "#lang racket" > foo
+ $ echo "(define (foo x)" >> foo
+ $ echo " (* x x))" >> foo
+ $ git ci -m "turn foo into a library"
+ [master fd856ef] turn foo into a library
+ $ git branch -v
+ * master fd856ef [ahead 2] turn foo into a library
+ $ git push
+ To pltgit:eli/foo
+ 18bc0e6..fd856ef master -> master}
+@~
+ Now hop over to the @path{foo} clone, get the changes (the relevant bits of
+ the output are shown), edit the file (using sed, to make it a command line),
+ inspect the change, commit it, and push.
+ @pre|{$ cd ../foo
+ $ git pull
+ From pltgit:eli/foo
+ 18bc0e6..fd856ef master -> origin/master
+ Updating 18bc0e6..fd856ef
+ Fast-forward
+ blah | 1 +
+ foo | 6 +++---
+ 2 files changed, 4 insertions(+), 3 deletions(-)
+ create mode 100644 blah
+ $ sed -i '2s/x/[x 0]/' foo
+ $ git diff
+ diff --git a/foo b/foo
+ index 78d9889..b81de80 100644
+ --- a/foo
+ +++ b/foo
+ @@ -1,3 +1,3 @@
+ #lang racket
+ -(define (foo x)
+ +(define (foo [x 0])
+ (* x x))
+ $ git ci -m 'add a default value'
+ [master 5035c9a] add a default value
+ $ git push
+ To pltgit:eli/foo
+ fd856ef..5035c9a master -> master}|
+@~
+ And now get back to @path{foo2}, and before we pull, modify the same line by
+ adding a comment and commit, then do a @cmd{--ff-only} pull and watch it
+ refuse to merge as expected, then look at the history so far.
+ @pre{$ cd ../foo2
+ $ sed -i '2s/$/ ; int->int/' foo
+ $ git ci -m 'document the type of foo'
+ [master 21a78df] document the type of foo
+ $ git pull --ff-only
+ From pltgit:eli/foo
+ fd856ef..5035c9a master -> origin/master
+ fatal: Not possible to fast-forward, aborting.
+ $ git log --graph --all --oneline -4
+ * 21a78df document the type of foo # ← our change
+ | * 5035c9a add a default value # ← the conflicting change we pulled
+ |/
+ * fd856ef turn foo into a library
+ * 5cf863d blah^3}
+@~
+ Rebasing is the common thing to do, but let's see what happens with a plain
+ @cmd{merge} first:
+ @pre{$ git merge origin
+ Auto-merging foo
+ CONFLICT (content): Merge conflict in foo
+ Automatic merge failed; fix conflicts and then commit the result.}
+@~
+ We now have a conflict that needs to be resolved before we can finish the
+ merge. Using @cmd{git st} (the alias listed above for the svn-like status
+ that @cmd{git status -s} produces) shows a new @cmd{UU} status for @path{foo}
+ — this indicates an “unmerged” (conflicted) file. To investigate further, we
+ use a plain @cmd{git status}, which tells us that our history diverged from
+ the remote (we already know that since @cmd{pull --ff-only} failed) and count
+ the diverging commits, and it also tells us that @path{foo} is unmerged and
+ hints at using @cmd{git add} to resolve it:
+ @pre{$ git st
+ UU foo
+ $ git status
+ # On branch master
+ # Your branch and 'origin/master' have diverged,
+ # and have 1 and 1 different commit(s) each, respectively.
+ # Unmerged paths:
+ # (use "git add/rm ..." as appropriate to mark resolution)
+ # both modified: foo}
+ You can also see that git knows about the conflict and refuses to do a
+ commit:
+ @pre{
+ $ git commit
+ fatal: 'commit' is not possible because you have unmerged files.
+ Please, fix them up in the work tree, and then use 'git add/rm ' as
+ appropriate to mark resolution and make a commit, or use 'git commit -a'.}
+@~
+ In most cases the way to continue is simple: open the conflicted file in your
+ editor, look for the conflict markers and fix the code. Then, as suggested
+ above, use @cmd{git add @i{file}} which tells git that the file is resolved,
+ and finally use @cmd{git commit} to commit the result. (Note that using
+ @cmd{git commit @i{file}} will not work, which is why the @cmd{git-ci} script
+ avoids adding a @path{.} if the tree requires resolving a merge.) I'll
+ simulate the editing part with echos, and then mark it resolved:
+ @pre{$ echo "#lang racket" > foo
+ $ echo "(define (foo [x 0]) ; int->int" >> foo
+ $ echo " (* x x))" >> foo
+ $ git add foo # ← tell git that it's resolved}
+ And now the last step is to run @cmd{git commit}, which will start your
+ editor to edit the log message — it will be populated by text that indicates
+ the merge and the file that had conflicts, which you can commit as is, or add
+ some text regarding the way it was resolved.
+@~
+ At this point (or before we started working on resolving the conflict), we
+ can get back to the original state using @cmd{reset}:
+ @pre{$ git reset --hard
+ HEAD is now at 21a78df document the type of foo}
+@~
+ This kind of reset is generally useful if you had some problematic conflict
+ to resolve and you want to back up completely and re-try. But now that we've
+ at the start, we will see what happens when we try to rebase with the
+ conflict instead:
+ @pre{
+ $ git rebase origin
+ First, rewinding head to replay your work on top of it...
+ Applying: document the type of foo
+ @i{...}
+ CONFLICT (content): Merge conflict in foo
+ Failed to merge in the changes.
+ @i{...}
+ When you have resolved this problem run "git rebase --continue".
+ If you would prefer to skip this patch, instead run "git rebase --skip".
+ To restore the original branch and stop rebasing run "git rebase --abort".}
+@~
+ Obviously, we get a different message (note that @cmd{git status} will now
+ tell you that you're not currently on any branch — a result of being in the
+ middle of a rebase). The process that follows is very similar to the merge
+ case: edit the conflict away, then @cmd{git add} the file. There are two
+ differences: (1) after you @cmd{git add} the resolved files, you should use
+ @cmd{git rebase --continue} instead of committing[*]; (2) if you want to
+ abort the merge, use @cmd{git rebase --abort} instead of using reset.
+ @small{([*] If you did commit, then it means that you wrote a new log message
+ for the replayed commit, and you can just as well use the @cmd{--skip} flag
+ so rebasing continues with the rest, or you can use @cmd{reset} to undo
+ your commit and let rebase do it for you.)}
+@~
+ When you're in a conflicted state, there are a few git tools that help you in
+ the resolution work. The first useful utility is @cmd{git diff}: when
+ there's a conflict, all files that were automatically merged are already
+ going to be in your staging area, and parts of conflicted files that could be
+ merged merged will be there too. This leaves only the conflict regions in
+ your working directory, which means that @cmd{git diff} will show you only
+ the conflicts (since by default it shows differences between the working
+ directory and the staging area). Also, the diff output itself is not a
+ standard one. At the current point of conflict during the rebase that we
+ started, this is what we'll see:
+ @pre|{$ git diff
+ diff --cc foo
+ index b81de80,86a4c54..0000000
+ --- a/foo
+ +++ b/foo
+ @@@ -1,3 -1,3 +1,7 @@@
+ #lang racket
+ ++<<<<<<< HEAD
+ +(define (foo [x 0])
+ ++=======
+ + (define (foo x) ; int->int
+ ++>>>>>>> document the type of foo
+ (* x x))}|
+@~
+ The diff header uses @cmd{--cc} which indicates git's “combined diff format”,
+ used to represent merge commits (any commit with more than one parent). The
+ next line has the two SHA1s of the two files that are merged. The diff
+ itself starts with three @cmd|{@}|s, and instead of a single indicator
+ character (@cmd{+}, @cmd{-}, or @cmd{ }), there are two — indicating a
+ three-way diff between the two versions and their common ancestor version.
+ In the above you can see that the line with the optional argument is coming
+ from @cmd{HEAD}, and the type-annotated one is coming from its commit. You
+ might notice that this look backwards, since we're in the repository where we
+ committed the type annotation to the HEAD — but we're now rebasing, which
+ means that we start from the remote branch and merge our local changes into
+ it, essentially making the rebase perform merges in the other way than plain
+ merges. The conflict markers themselves are marked as new in both versions,
+ and the labels that follow them depend on available information (in a
+ @cmd{merge}, we would see @cmd{HEAD} and @cmd{origin}).
+@~
+ During a conflict resolution, the staging area actually holds three versions
+ of each file: the common ancestor, our version, and the merged version.
+ These things are called “file stages”, and they can be accessed using a
+ special syntax:
+ @pre{$ git show :1:foo # the common ancestor of both versions
+ $ git show :2:foo # our version (optional argument)
+ $ git show :3:foo # merged version (type-annotated)}
+ (Again, remember that this is a rebase, so the last two are swapped.) You
+ can also checkout one of these versions using @cmd{git checkout foo}, giving
+ it an @cmd{--ours} or @cmd{--theirs} flag to specify which version you want
+ to use; and you can use @cmd{git diff} to compare against them. For example,
+ we resolve the file (as above) and then try the different diffs (before we
+ mark it as resolved) — these examples only show the changed lines from each
+ of the diffs:
+ @pre{$ echo "#lang racket" > foo
+ $ echo "(define (foo [x 0]) ; int->int" >> foo
+ $ echo " (* x x))" >> foo
+ $ git diff -1 foo # can also use --base
+ -(define (foo x) # original version
+ +(define (foo [x 0]) ; int->int # new version
+ $ git diff -2 foo # can also use --ours
+ -(define (foo [x 0])
+ +(define (foo [x 0]) ; int->int
+ $ git diff -3 foo # can also use --theirs
+ -(define (foo x) ; int->int
+ +(define (foo [x 0]) ; int->int}
+@~
+ Finally, @cmd{git log} and @cmd{gitk} accept a @cmd{--merge} flag which shows
+ commits relevant to a merge. With @cmd{git log} the @cmd{--left-right} flag
+ is useful here, since you'll see which side the relevant commits are on.
+ (But this works only in @cmd{git merge}, not in rebasing.)
+@~
+ Again, when you're happy with the resolution, you @cmd{git add} the file, and
+ because we're doing a @cmd{rebase} rather than a @cmd{merge}, use use it to
+ continue:
+ @pre{$ git add foo
+ $ git rebase --continue
+ Applying: document the type of foo
+ $ git log --graph --all --oneline -4
+ @i{...linear history...}}
+ Note that @cmd{git rebase --continue} did the commit of the resolved content
+ for you, and it used the previous commit message you've written. This is a
+ good rule-of-thumb for deciding whether you should rebase or merge: if the
+ commit message are still fine as a description of the modifications, then a
+ rebase is fine; otherwise you might want to @cmd{merge} instead.}
+
+@subsection{copying/renaming files}
+@p*{
+ Git is, by design, tracking snapshots of the complete repository tree.
+ Specifically, it does @em{not} keep explicit track of file/directory copies
+ and renames. Instead, it provides ways to infer such changes in the
+ repository based on the content. As a result of this, there are almost no
+ git commands that deal with file movements:
+ @ul*{
+ @~ There is no @cmd{git copy} command: you just copy the file and add the new
+ one as usual.
+ @~ There @em{is} a @cmd{git rm} command, but its purpose is mostly to remove
+ a file from the staging area. You could also just remove the file outside
+ of git, and then use either @cmd{git commit @i{removed-file}} or
+ @cmd{git commit @i{containing-directory}} to remove it (or using the above
+ script — @cmd{git ci} in the same directory). @cmd{git rm} will delete
+ the file from the staging area so you can do a plain @cmd{git commit}
+ without naming any paths.
+ @~ For the same reason, there is a @cmd{git mv} command — it uses
+ @cmd{git rm} as above to update the staging area, and if you're fine with
+ ignoring it, then you can just rename the file outside of git, and
+ @cmd{git add} the new version — but as we will soon see, it's really best
+ to use @cmd{git mv} to avoid the possible confusion if you want the file's
+ history to be visible.}
+@~
+ To try things out, let's properly name the @path{foo} library:
+ @pre{$ mv foo foo.rkt
+ $ git st
+ D foo
+ ?? foo.rkt}
+ As you can see, we forgot to @cmd{git add} the new file, so if we commit now
+ we'll only be committing the deletion. An important thing to note here is
+ that when git infers file copying and renaming, it does so only when the
+ operations appear in a @em{single} commit. So if we commit this change and
+ later commit a new version with the new file will make it lose connection to
+ its history. As long as you didn't push the new commits out, you can still
+ fix it: simply use @cmd{git rebase --interactive}, and squash the file
+ addition together with the deletion. But let's start over and do the rename
+ the easy way:
+ @pre{$ rm foo.rkt
+ $ git reset --hard
+ $ git mv foo foo.rkt
+ $ git ci -m "properly name the foo library"}
+ to see this commit, we can use @cmd{git show} (which can show arbitrary
+ objects, but with no arguments it shows the @cmd{HEAD}). @cmd{git diff} can
+ also be used to show only the diff part — using the @cmd{HEAD^!} syntax that
+ roughly means the range from the previous HEAD to the current one:
+ @pre{$ git show
+ @i{...log message...}
+ @i{...addition+deletion...}
+ $ git diff HEAD^!
+ @i{...addition+deletion...}
+ $ git diff --stat HEAD^! # shows an overview of the changes
+ foo | 3 ---
+ foo.rkt | 3 +++
+ 2 files changed, 3 insertions(+), 3 deletions(-)
+ $ git log --oneline foo.rkt
+ 599b3b6 properly name the foo library}
+ All of these show the two operations as disconnected, and the log doesn't
+ show any of the prior history. The thing is that you need to ask git to look
+ for file operations, and the @cmd{-M} and @cmd{-C} flags do that. In
+ addition, @cmd{git log} needs a @cmd{--follow} flag to make it follow history
+ beyond renames (but note that it can do that only when given a single file
+ path). For example:
+ @pre{$ git diff -M --stat HEAD^!
+ foo => foo.rkt | 0
+ 1 files changed, 0 insertions(+), 0 deletions(-)
+ $ git log --oneline --follow foo.rkt
+ 599b3b6 properly name the foo library
+ 0fb8291 document the type of foo
+ 5035c9a add a default value
+ @i{...}}
+ In this case the rename was a trivial one as were no other changes. This
+ makes it especially easy to find renames since the SHA1 of the file would be
+ the same. But git considers such operations as renames as long as they're
+ “similar enough” — for example, if you just rename some files and change some
+ @cmd{require}s as a result, it will be detected as renames. (The usual claim
+ is that when the content is not similar enough, you can just as well claim
+ that the file is new.) If you think that you might be doing too many changes
+ to some files, and you want to preserve the connection, you can do only the
+ rename in one commit, and then the modifications in the next.
+@~
+ An added benefit of this mode of work is that @cmd{git blame} can find lines
+ in files that were copied from other files, and deal naturally with a file
+ that is split into two files etc. Like @cmd{log} and @cmd{diff}, it needs
+ some flags to do the extra work (see @cmd{-M} and @cmd{-C}).}
+
+@subsection{managing branches}
+@p*{
+ As seen in various places above, a branch in git is basically just a SHA1
+ pointer to a commit (and therefore to the whole line of commits in its
+ development line), with a naming hierarchy that follows some conventions
+ (@path{/}-separated, @cmd{master} as the main one, @cmd{remotes} prefix for
+ remote branches, @cmd{origin} for the default remote server name, etc). You
+ can see all of this in the toplevel @path{.git} meta directory — there is a
+ @path{HEAD} file which represents the head, its content will be a line that
+ looks like @cmd{ref: refs/heads/master}, and there will be a
+ @cmd{refs/heads/master} file with a content that is the actual SHA1. There
+ are, of course, various other bits of meta-data, so it's not a good idea to
+ change such files directly (for example, when there are many names git will
+ create a “packed” reference file with many references for efficiency) — but
+ overall this is the basic idea.
+@~
+ Branches come in two main kinds: local branches and remote ones, with remote
+ branches having a name that begins with @cmd{remotes/origin/}. (Later we'll
+ see how to add new remote repositories — remote branches from these will have
+ names that start with @cmd{remotes/@i{remote-name}/} instead.) The
+ difference between the two is that a remote branch is a way to mirror a
+ branch on a remote repository — it is not intended for local work. For
+ example, if you try to check out a remote branch, git will check out a
+ “detached head” (details on this below). If you do that, you'll see that the
+ @path{HEAD} file will have an explicit SHA1 rather than the usual
+ @cmd{ref: @i{branch-name}}.
+@~
+ The @cmd{git branch} command is the main way to manage branches. With no
+ flags, it will just print out the list of local branches, marking the current
+ branch with a @cmd{*}. You can add flags to show remote branches instead
+ (@cmd{-r}), both kinds (@cmd{-a}), and also to list more information on the
+ branches (@cmd{-v}):
+ @pre{$ git branch
+ * master
+ $ git branch -r
+ origin/HEAD -> origin/master # (this one is symbolic too)
+ origin/master
+ $ git branch -av
+ * master 599b3b6 [ahead 2] properly name the foo library
+ remotes/origin/HEAD -> origin/master
+ remotes/origin/master 5035c9a add a default value}
+@~
+ When given a single name argument, a branch by that name will be created, and
+ it will point to where the @cmd{HEAD} currently points to; a second argument
+ can be a name of an existing branch (or any commit) that the new branch will
+ start at. In addition to creating branches starting from the current head,
+ this can be useful in creating branches that start from elsewhere, even from
+ a “detached head”. For example, say that in our current repository we want
+ to try out some work based on the state of things before the last commit
+ (which renamed the @path{foo} file). We can check out @cmd{HEAD^} (which
+ will lead to a detached HEAD), and then create a branch for it:
+ @pre{$ git checkout HEAD^
+ @i{...}
+ You are in 'detached HEAD' state.
+ @i{...}
+ HEAD is now at 0fb8291... document the type of foo
+ $ cat .git/HEAD
+ 0fb8291... # doesn't point to a branch
+ $ git branch
+ * (no branch) # you can see it here too
+ master
+ $ git status
+ # Not currently on any branch. # and here
+ $ git branch pre-rename # create a branch here
+ $ git branch
+ * (no branch) # we're still detached
+ master
+ pre-rename
+ $ git checkout pre-rename
+ Switched to branch 'pre-rename'}
+ As you can see, creating a branch doesn't check it out — even when the new
+ branch is exactly where we already are. The difference is related to the
+ nature of @cmd{HEAD}: it is usually an indirect reference to a branch name,
+ and when a commit is made, the branch that @cmd{HEAD} points to is updated.
+ But when we are using a detached HEAD, it points directly at a SHA1 —
+ committing in this state will work, and the HEAD will point at the newly made
+ commit — but there will be no branch that will be updated, so if you checkout
+ a different branch (or a different commit) now, the commits you made are
+ “lost”.
+@~
+ The main reason that such commits will be lost is that git branches don't
+ live inside the repository store — and dealing with branches is not something
+ that gets recorded as part of the history. To make things safer, git
+ maintains something that is known as the “reflog”, which keeps track of where
+ your branches have been — those are kept for a while (usually around a
+ month), which means that you can easily go back to a previous commit if it
+ seems that you lost one (eg, as a result of committing on a detached HEAD).
+ (You can see these files in the @path{.git/logs} directory.)
+@~
+ Since creating a new branch and checking it out is a common combination, the
+ @cmd{checkout} command can create a branch before checking it out. Use the
+ @cmd{-b} flag for this:
+ @pre{$ git checkout -b also-pre-rename
+ Switched to a new branch 'also-pre-rename'
+ $ git checkout -b post-rename master
+ Switched to a new branch 'post-rename'
+ $ ls
+ bar blah foo.rkt
+ $ echo "one more line" >> foo.rkt
+ $ git ci -m "one more line"}
+@~
+ Finally, you use the @cmd{-d} flag to delete branches.
+ @pre{
+ $ git branch -d post-rename # won't allow it
+ error: Cannot delete the branch 'post-rename' which you are currently on.
+ $ git checkout master
+ Switched to branch 'master'
+ Your branch is ahead of 'origin/master' by 2 commits.
+ $ git branch -d post-rename
+ error: The branch 'post-rename' is not fully merged.
+ If you are sure you want to delete it, run 'git branch -D post-rename'.}
+ As you can see, git refuses to delete a branch that has unmerged work, since
+ this can lead to losing that unmerged work — so you need to use @cmd{-D} for
+ that. In addition, you usually don't delete remote branches, when you do,
+ you need to use the @cmd{-r} flag too.}
+
+@subsection{using branches}
+@p*{
+ Since git branches are so light weight, they fit any kind of parallel work
+ you need to do on several different topics. A result of that is that it is
+ possible to start a new branches for any work you'd want to do — and this is
+ common enough that there's a name for such branches, they're called “topic
+ branches”. Such branches are created from the master branch (usually) and
+ worked on in parallel. At any point where you want to work on something new,
+ you would create a new branch for it and switch to it (committing any work
+ you might have on your current branch before you do so):
+ @pre{$ git checkout -b improve-bar master # switch to a fresh topic branch
+ Switched to a new branch 'improve-bar'
+ $ echo "even more bar" >> bar # work there
+ $ git ci -m "improved bar" # save that work
+ $ git checkout post-rename # go back to where we were}
+@~
+ If you need to commit changes before you create the new branch, you shouldn't
+ have any problems doing so — because you can change where a branch points to,
+ you can just commit whatever you have and then get back to it:
+ @pre{
+ $ echo "another line" >> foo.rkt
+ # at this point you remember that you need to do something else in the
+ # `improve-bar' line of work.
+ $ git ci -m "checkpoint"
+ $ git checkout improve-bar
+ # ...work...
+ $ git checkout post-rename
+ $ git log --oneline -2
+ e9a4fcd checkpoint # this is our temporary checkpoint commit
+ d92fb0a one more line
+ $ git reset HEAD^ # undo it
+ Unstaged changes after reset:
+ M foo.rkt # git tells us that this is now uncommitted
+ $ git st
+ M foo.rkt # ... as does `status'
+ $ git log --oneline -2
+ d92fb0a one more line # the temporary commit is gone
+ 599b3b6 properly name the foo library}
+@~
+ You can even decide on some convention to use in some cases, then create new
+ git commands as scripts that will do the work for you. In this case, you
+ could write a command that will do a “checkpoint” commit if needed, switch to
+ another branch, and if the first commit there has only @cmd{checkpoint} as
+ its log message, undo it as above. There are several git convenience
+ commands that started out this way — in this case, checkout the @cmd{git
+ stash} command which allows you to save the current work by pushing it on a
+ “work in progress” stack, and later pop it back out (possibly on a different
+ branch).
+@~
+ Earlier we've seen how to merge or rebase your master branch from the remote
+ master branch, but the full story is that you can merge and rebase @em{any}
+ two branches. This makes branches very flexible: you can create a branch A
+ from an existing branch B, eventually merging/rebasing it back into A, or
+ directly into master and dump A. At any point you can run @cmd{gitk --all}
+ to see where things stand — in our current repository, this shows us that
+ there are redundant @cmd{pre-rename} and @cmd{also-pre-rename} branches, that
+ out @cmd{master} branch is two commits ahead of the remote one, and that we
+ have @cmd{improve-bar} and @cmd{post-rename} branches with 1 and 2 commits
+ over our @cmd{master} branch. If we're done with these two branches, we can
+ now merge/rebase them to our @cmd{master}, or merge/rebase one to the other
+ and the result to @cmd{master}, and then push everything out.
+@~
+ To make working with branches even easier, git has a notion of an “upstream
+ branch” — this is a per-branch setting that tells git which branch the
+ current one is based on. By default, any branch that is created with a
+ remote branch as its starting point will have that remote branch set as its
+ upstream. We've seen how git treats that information in various places so
+ far: @cmd{git status} and @cmd{git branch -v} both use it, and using a second
+ @cmd{-v} with the latter shows also the upstream branch:
+ @pre{
+ $ git reset --hard # dump the above uncommitted change
+ $ git checkout master
+ Switched to branch 'master'
+ Your branch is ahead of 'origin/master' by 2 commits.
+ $ git status
+ # On branch master
+ # Your branch is ahead of 'origin/master' by 2 commits.
+ @i{...}
+ $ git branch -v
+ @i{...}
+ * master 599b3b6 [ahead 2] properly name the foo library
+ @i{...}
+ $ git branch -vv
+ * master 599b3b6 [origin/master: ahead 2] properly name the foo library}
+@~
+ In addition to that, we've seen the @cmd|{@{upstream}}| and @cmd|{@{u}}|
+ notation that refers to the upstream branch, making it convenient to further
+ examine pending changes that weren't incorporated upstream:
+ @pre|{$ git log --oneline @{upstream}..
+ 599b3b6 properly name the foo library
+ 0fb8291 document the type of foo}|
+@~
+ And finally, @cmd{git pull} and @cmd{git push} know where to pull from and
+ push to based on this setting. Overall, this is a very useful feature to
+ have when you have many branches, therefore it is possible to use it between
+ local branches too. There are two ways to do this: when a branch is created
+ with either @cmd{git branch B} or @cmd{git checkout -b B}, you can use the
+ @cmd{--track} flag to set up tracking to the initial branch it's based on.
+ @pre{$ git branch -t b1 master
+ Branch b1 set up to track local branch master.
+ $ git checkout -tb b2 master
+ Branch b2 set up to track local branch master.
+ Switched to a new branch 'b2'}
+ (Note: if you're using @cmd{checkout}, then the @cmd{--track} flag should
+ precede the @cmd{-b} flag, as done above.) If a branch already exists, you
+ can use @cmd{git branch --set-upstream} to set the upstream information.
+ @pre{$ git branch --set-upstream post-rename
+ Branch post-rename set up to track local branch b2.
+ $ git branch --set-upstream improve-bar master
+ Branch improve-bar set up to track local branch master.}
+ As seen here, if it is given just a branch name, the current branch is set as
+ its upstream. @cmd{git branch} can also change the upstream branch, for
+ example, if the above tracking of @cmd{b2} was a mistake:
+ @pre{$ git branch --set-upstream post-rename master
+ Branch post-rename set up to track local branch master.}
+ Either way, we can now see this information in the git commands that do so,
+ as well as use @cmd|{@{upstream}}|:
+ @pre|{$ git branch -vv
+ b1 599b3b6 [master] properly name the foo library
+ * b2 599b3b6 [master] properly name the foo library
+ improve-bar e60c168 [master: ahead 1] improved bar
+ master 599b3b6 [origin/master: ahead 2] properly name the foo @;
+ library
+ post-rename d92fb0a [master: ahead 1] one more line
+ $ git checkout improve-bar
+ Switched to branch 'improve-bar'
+ Your branch is ahead of 'master' by 1 commit.
+ $ git log --oneline @{upstream}..
+ e60c168 improved bar}|
+@~
+ In addition, we can use @cmd{git pull} to get changes on the upstream branch
+ merged or rebased on the current one:
+ @pre{$ git pull
+ From .
+ * branch master -> FETCH_HEAD
+ Already up-to-date.}
+ Nothing actually happened here, because the current branch
+ (@cmd{improve-bar}) already contains all of the commits on the master branch.
+ You can see that this is a local pull since git says @cmd{From .}, which
+ stands for “our own repository”. You can also do a @cmd{push} now, which
+ will make the current additional commit (listed with @cmd|{@{upstream}..}|)
+ appear on the @cmd{master} branch:
+ @pre{$ git push
+ To .
+ 599b3b6..e60c168 improve-bar -> master}
+@~
+ Since the @cmd{improve-bar} line of development is unrelated to the one in
+ @cmd{post-rename}, it is now one commit behind the @cmd{master} branch, and
+ cannot be pushed as is:
+ @pre{
+ $ git checkout post-rename
+ Switched to branch 'post-rename'
+ Your branch and 'master' have diverged,
+ and have 1 and 1 different commit(s) each, respectively.
+ $ git branch -vv
+ improve-bar e60c168 [master] improved bar
+ master e60c168 [origin/master: ahead 3] improved bar
+ * post-rename d92fb0a [master: ahead 1, behind 1] one more line
+ $ git push
+ To .
+ ! [rejected] post-rename -> master (non-fast-forward)
+ error: failed to push some refs to '.'
+ To prevent you from losing history, non-fast-forward updates were rejected
+ @i{...}}
+ Dealing with this is similar to dealing with updates on the remote server —
+ for example, we can rebase the branch before pushing it:
+ @pre{$ git pull --rebase
+ From .
+ * branch master -> FETCH_HEAD
+ First, rewinding head to replay your work on top of it...
+ Applying: one more line
+ $ git push
+ To .
+ e60c168..7bdec0c post-rename -> master}
+@~
+ When you use @cmd{git push} to push changes when you have no upstream branch
+ set, or when you push to a different branch than the one set, you can use
+ @cmd{--set-upstream} to make git remember the push target as the upstream.
+ Therefore, an easy way to create a new branch that tracks a possibly new
+ remote branch by the same name is:
+ @pre{$ git checkout -b my-branch
+ Switched to a new branch 'my-branch'
+ $ git push origin my-branch --set-upstream
+ To pltgit:eli/foo
+ * [new branch] my-branch -> my-branch
+ Branch my-branch set up to track remote branch my-branch from origin.}
+ And when you deal with remote branches this way, you might want to have a
+ local branch that tracks a remote one with a different name. To do this, you
+ use a syntax for the branch to push that specifies the local branch to push
+ and the remote one to push to:
+ @pre{$ git push origin my-branch:different-branch --set-upstream
+ To pltgit:eli/foo
+ * [new branch] my-branch -> different-branch
+ Branch my-branch set up to track remote branch different-branch from @;
+ origin.}
+@~
+ Finally, note that git stores the upstream information in the
+ repository-local configuration file. If we look at it now, we will see the
+ various upstreams that we have set:
+ @pre{$ cat .git/config
+ @i{...}
+ [remote "origin"]
+ fetch = +refs/heads/*:refs/remotes/origin/*
+ url = pltgit:eli/foo
+ [branch "master"]
+ remote = origin
+ merge = refs/heads/master
+ @i{...}}
+ this is the upstream that was made by default when we first checked out our
+ clone, together with the information of where the @cmd{origin} repository is.
+ Following that are the ones we've setup later:
+ @pre{@i{...}
+ [branch "b1"]
+ remote = .
+ merge = refs/heads/master
+ [branch "b2"]
+ remote = .
+ merge = refs/heads/master
+ [branch "post-rename"]
+ remote = .
+ merge = refs/heads/master
+ [branch "improve-bar"]
+ remote = .
+ merge = refs/heads/master
+ [branch "my-branch"]
+ remote = origin
+ merge = refs/heads/different-branch}
+@~
+ Note that there are branches that track local branches (ones with a
+ @cmd{remote = .} setting), and ones that track remote ones; and also note
+ that the @cmd{my-branch} branch tracks a remote branch with a different name.
+ Since the settings are stored as configurations, it is possible to inspect
+ and change them using @cmd{git config}, or even edit the config file
+ directly.
+ @pre{$ git config branch.my-branch.remote
+ origin
+ $ git config branch.my-branch.merge
+ refs/heads/different-branch}}
+
+@subsection{managing remotes}
+@p*{
+ The distributed nature of git means that you can interact with multiple
+ remote repositories. You could have work done with other people done
+ locally, where people push/pull from each other's clones (possibly by sending
+ around patches, as described below), and eventually when the changes are
+ ready push them back to the main repository. You can even have your
+ repository track multiple unrelated remote repositories, essentially giving
+ you branches that have @em{unrelated} histories.
+@~
+ By default, when you clone a remote repository git names it @cmd{origin} —
+ and that name appears in many places, most notably in remote branch names.
+ As seen in the above config, git remember where the origin repository is via
+ a configuration:
+ @pre{$ git config remote.origin.url
+ pltgit:eli/foo
+ $ git config remote.origin.fetch
+ +refs/heads/*:refs/remotes/origin/*}
+ The first one is the url of the remote repository, and the second one is
+ which branches we want to get from it. As with branches, you can use
+ @cmd{git config} to change this information, or you can edit the file
+ directly, but there is a command that does this more conveniently, keeping
+ things consistent:
+ @pre{$ git remote # lists all known remotes
+ origin
+ $ git remote -v # remotes and push/pull specs
+ origin pltgit:eli/foo (fetch)
+ origin pltgit:eli/foo (push)
+ $ git remote show origin # see a detailed description
+ * remote origin
+ Fetch URL: pltgit:eli/foo
+ Push URL: pltgit:eli/foo
+ HEAD branch: master
+ Remote branches:
+ different-branch tracked
+ master tracked
+ my-branch tracked
+ Local branches configured for 'git pull':
+ master merges with remote master
+ my-branch merges with remote different-branch
+ Local refs configured for 'git push':
+ master pushes to master (fast-forwardable)
+ my-branch pushes to my-branch (up to date)}
+@~
+ The @cmd{git remote show} variant will actually query the remote repository
+ for its state (using @cmd{git ls-remote}) by default, and tell you when a
+ local branch that tracks a remote one is out-of-date.
+@~
+ There are a few more sub-verbs for the @cmd{git remote} command which you can
+ see on the @man{git-remote} man page, the most important one is for adding a
+ remote: @cmd{git remote add @i{short-name} @i{url}}. This is especially
+ convenient if you want to have a fork of the plt repository, with most
+ interaction happening against it, but occasionally pull/push updates from/to
+ the main repository.
+@~
+ Of course, remember that you don't need to add remotes to push and pull from
+ them. You could do the same by explicitly specifying a url for the
+ repository you want to interact with. For example, you could have
+ repositories in different accounts on different machines, and synchronize
+ your work between them by pushing and pulling directly from one repository to
+ another. (Reminder: if you do this, then you're likely to have “checkpoint
+ commits” — when you're done with the work, you can do an interactive rebase,
+ and squash these checkpoint changes back into logical commit.) But if you do
+ this often enough, you will likely find it more convenient to add a named
+ remote.}
+
+@subsection{using private repositories}
+@p*{
+ A particularly useful use-case for adding a new remote is when you want to
+ have private work done in your own fork of the plt repository. Such a mode
+ of work is not strictly necessary — you could just do your work in your
+ repository in a long-lived branch, but there are certain cases where working
+ with a private repository on the server might be more convenient. For
+ example, you might want to collaborate with someone else (that has access)
+ via the server, or you might use a private fork of the plt repository as a
+ central point for synchronizing work from clones on different filesystems as
+ described at the end of the last section (the difference from that is that
+ you basically use the plt server as your synchronization point). Other than
+ having the main repository reside on the plt server, working with a private
+ repository is not different than working with any other repository.
+@~
+ There are two facts that are worth reminding when you deal with a private
+ repository. First, remember that creating a private fork is cheap: creating
+ a new git clone of a repository will use hard links to the repository store
+ object, most of which will be contained in large packed files. The cost in
+ terms of space and time for creating a new clone is therefore minimal when
+ done on the same filesystem — and using the gitolite @cmd{fork} command is
+ doing just that. Please use the @cmd{fork} command to create a private clone
+ — gitolite has a feature where it creates any repository that you refer to
+ (as long as it has a name that you're allowed to create — starts with your
+ username); this means that you could clone the main plt repository and push
+ from it into a private repository that doesn't exist: it will be created, but
+ such a copy will not share storage with the main repository — it will require
+ a new copy, and it will be slow to create.
+@~
+ The second thing to remember is that due to the nature of the git store, any
+ object, including commits, is stored exactly once. Since commits contain
+ their parents, having a specific commit means that you have its complete
+ history — therefore, pulling in any branch from any repository will always
+ require getting only commits that you don't already have. As a result,
+ pulling and pushing to/from any repository will be efficient and move around
+ only those commits that are missing on the other side.
+@~
+ You can choose one of two basic approaches to working with a private fork:
+ you can have the public repository cloned but have branches pushed to your
+ private one, or you can have your private fork cloned and occasionally push
+ updates to the public one. A way to use these two approaches are described
+ and explained now. These examples use the @cmd{play} repository as an
+ example (which you are encouraged to experiment with). Note that you can use
+ a hybrid approach: you can think about a repository as a container for commit
+ histories, pushing and pulling from any other repository, including the copy
+ you're working with, the main plt repository, or a private fork (your own or
+ another). Note also that since forks are cheap, you can keep several of them
+ around, for example, you can have a fork for each long-lived branch — it's up
+ to you to settle on a layout that is convenient for your work.}
+@h3{Using a clone of the public repository, pushing branches to your private
+ one:}
+@ol*{
+@~ Setting up:
+ @ul*{@~ Create a fork:
+ @pre{ssh pltgit fork play $user/play}
+ @~ Get a local copy of the main repository:
+ @pre{git clone pltgit:play}
+ (or continue working in an existing one)
+ @~ Set up a convenient name for your private repository:
+ @pre{cd play
+ git remote add my-fork pltgit:$user/play}}
+@~ To start working on a private branch, create one, and push it to your
+ private repository:
+ @pre{git checkout -b my-branch}
+ Then use @cmd{push} to create this branch in your private repository, with
+ @cmd{--set-upstream} so git will remember this setting:
+ @pre{git push --set-upstream my-fork my-branch}
+ You can also have your branch named differently in your fork, for example:
+ @pre{git push --set-upstream my-fork my-branch:master}
+ will save your branch as the @cmd{master} branch in your fork. This might
+ be convenient if you want to clone your private repository elsewhere and
+ work only on this branch.
+@~ You can now work as usual in your repository, pushing/pulling changes
+ to/from the master branch will go to the public repository, and doing so
+ from @cmd{my-branch} will go to your private fork. You can merge changes on
+ the @cmd{master} branch to your private one, or rebase your branch onto it.
+ However, note that the server will not allow pushing a rebased history to
+ your clone. (More details at the end of this section.) You can bypass that
+ by pushing to a new branch while keeping your local branch name:
+ @pre{git push --set-upstream my-fork my-branch:my-branch-2}
+@~ When you're done merge your branch (possibly rebasing it first) to the
+ master branch, and push as usual.
+@~ If you want to delete branches on your fork (either because you pushed a
+ rebased version under a new name, or because you're done with that line of
+ work), use
+ @pre{git push my-fork :my-branch}
+ Using an empty branch name for the local branch that you push is the way to
+ delete remote branches. (As with local branches, this might lead to losing
+ commits, so be careful. If you make a mistake, let me know, since it is
+ likely easy to fix.)}
+@h3{Using a clone of your private repository, pushing changes to the public
+ one:}
+@ol*{
+@~ Setting up:
+ @ul*{@~ Create a fork:
+ @pre{ssh pltgit fork play $user/play}
+ @~ Get a local copy:
+ @pre{git clone pltgit:$user/play}
+ @~ Set up a convenient name for the main repository:
+ @pre{cd play
+ git remote add -t master main pltgit:play}
+ (@cmd{-t master} tells git to have only the @cmd{master} branch
+ retrieved.)}
+@~ Now you can work in this repository as usual — edit, commit, push, etc.
+@~ To push changes to the main repository, first make sure that you're on the
+ branch with the changes that you want to push, and then:
+ @ul*{@~ Get the recent tree from the main repository
+ @pre{git fetch main}
+ @~ Rebase or merge your changes with this:
+ @pre{git rebase main/master}
+ or
+ @pre{git merge main/master}
+ @~ Push the changes back:
+ @pre{git push main}}
+ Note that rebasing your branch on top of main/master means that it will be
+ rewritten, which means that you will not be able to push your branch back to
+ your clone. This is because rewritten histories are currently forbidden by
+ the configuration, but this will probably change in the future. Still, even
+ if the server would allow pushing a rebased history it (you will need to use
+ @cmd{-f} to force such a push), you would need to deal with the rebased
+ branch in other clones you might have. Because of this, a rebase is fine if
+ you're done with the work that you're pushing, otherwise, a merge is more
+ convenient.}
+
+@section{Collaborating with others}
+@p*{
+ Git makes it very easy to collaborate with anyone, anywhere. You should
+ think about repositories as being parts a network which can be synced in any
+ topology that is convenient for you. In the case of the PLT repository, the
+ main repository on the git server is the central point where the official
+ repository lives, and people who can push are directly syncing content into
+ it. People who cannot push directly do so through someone who can, by
+ sending out patches or “pull requests”. The same applies for any repository,
+ of course, including private repositories, even ones that you maintain
+ yourself independently of the plt server.
+@~
+ In the case of a patch-based workflow, the two sides that are involved are
+ the patch author, and the receiver that integrates it into his/her own clone
+ (and from there it goes to the main repository as usual). The work that each
+ side does is described in the next two subsections.
+@~
+ Following that there is a description for making your repository public,
+ which you will need if you're working on a private repository, but it is also
+ useful for your collaborator to do so you can use a pull-request workflow.
+ In this mode there is no need to email patches; instead, both people make
+ their repositories readable to each other, and when some work is ready on one
+ person's repository, the other pulls the commits. This is described in the
+ last subsection.}
+
+@subsection{Patch-based workflow@br
+ — instructions for the patch sender side}
+@h4{Executive summary:}
+@ol*{@~ Work in a plt repository clone (possibly in your own branch)
+ @~ @npre{$ git send-email origin/master}
+ @~ You're done — thanks!
+ @~ When the patch is applied, you will get the changes through
+ @cmd{origin/master}, so if you worked on your master branch, make sure
+ to use @cmd{git pull --rebase} which will notice that your changes were
+ applied; if you worked on a branch, then you can now delete it (the
+ commit objects will be different from the ones you've made).}
+@h4{Longer version:}
+@ol*{
+@~ Work & commit as usual. In general, it is a good idea to use
+ @cmd{Signed-off-by: @i{Your Name} <@i|{your@email}|>} in commit messages,
+ which is a conventional way to declare that you agree for your work to be
+ released as part of the PLT project, under the terms of the LGPL.
+ @cmd{git commit} will add that for you if you use the @cmd{-s} flag. You
+ can also make git do this later, when you send the patches out.
+@~ Make sure that you're working with a relatively recent clone, and that
+ you're on the branch where you did your work. In most cases, this would be
+ the master branch, but you can do your work in your own branch too, of
+ course.
+@~ @p*{
+ Verify that your commits are all in your history. You can see the commits
+ that you have over the plt history with
+ @pre{git log --oneline origin/master..}
+ these are the commits that you're going to send over now. (You can use
+ the usual git toolset to tweak them further, or specify only some commits,
+ etc.)
+ @~
+ A relevant point to consider here is that git takes the first paragraph of
+ each commit message as a subject line. When sending out a patch, this is
+ made concrete by actually using it as the emails's subject — so it is a
+ good idea to make sure that this log looks fine, since the @cmd{--oneline}
+ option will make it show those subjects.
+ @~
+ Obviously, you should also make sure that the commits have clear
+ descriptions of your work. People who in the core group often have some
+ general context that they are aware of, so some commit messages can be
+ cryptic or even worse (eg, you might find @cmd{.} as a commit message) —
+ don't mimic this... As a more occasional contributor, you should explain
+ your work in more details. (There's no policy on commit messages, but you
+ do need to go through some person on the team.)}
+@~ @p*{
+ At this point you should decide how to send your patches. Emailing them
+ is be the most convenient way to do this — to do this, you would use the
+ @cmd{send-email} command:
+ @pre{git send-email origin/master..}
+ or if you send only some commits, use a different specification. To make
+ things even easier, a single commit specification is considered as the
+ starting point and all of the following commits (up to your branch's tip)
+ will be included in the emails (in contrast to other git commands like
+ @cmd{log}, where a single commit name is considered as the set of commits
+ leading up to it) — so you can do this:
+ @pre{git send-email origin/master}
+ This will ask you a bunch of questions — it's easy to answer but you can
+ also specify them as command-line options. if you intend to do this
+ frequently it might be a good idea to make it easier with some settings in
+ your global .gitconfig file. For example, I have these settings:
+ @pre|{[sendemail]
+ from = Eli Barzilay
+ bcc = eli@eli.barzilay.org
+ suppresscc = self}|
+ and you can see more in the @man{git-config} and @man{git-send-email} man
+ pages. The address to send the patches to is also configurable — you can
+ use something like
+ @pre{to = plt-dev@at-lists-racket}
+ or
+ @pre{to = someone@at-racket}
+ depending on who you send your patches to — but this is better done as a
+ repository-local configuration option (or just use the @cmd{--to} flag).
+ @~
+ You can add a @cmd{-s} flag to the command, to make git add
+ @cmd{Signed-off-by} lines to commit messages. (See above for what this
+ means.)
+ @~
+ If you want to send the files in some other way (eg, send them all
+ packaged in an archive as attachment[*]), then just use @cmd{format-patch}
+ instead of @cmd{send-email} — git will create a number of patch files in
+ your current directory, which will be named @path{NNNN-text.patch} where
+ the text is made out of the subject lines of the commit messages (the
+ first line). You can even run
+ @pre{git format-patch origin/master --stdout > my-patch}
+ to concatenate them all and send the resulting file over.
+ @~
+ @small{[*] Note that doing this means that it is not as easy to read your
+ patch, so avoid doing this if you want to make it easier to read and
+ accept it. On the other hand, if you're working with someone specific,
+ they might prefer attachments (for example, it's easier to save the
+ attached file from gmail).}}
+@~ Once the commits have been pushed the the main repository, you would get
+ them when you pull to update. The commits will now be different objects
+ than the ones you have — since the information changed (at least the
+ committer information will be different, the log message might have been
+ edited, etc). If you made your commits on your master branch which is set
+ to track the plt master branch (the usual setup), then make sure that you
+ run @cmd{git pull --rebase} to update — this will identify the commits as
+ already included and will not include them in the rebased master. But if
+ you made your commits in a private branch, and assuming that you didn't do
+ any additional work there, then you can now just delete that branch. (If
+ you did do more work there, then you should rebase it, to avoid resending
+ the same patches again.)}
+
+@subsection{Patch-based workflow@br
+ — instructions for the patch receiver side}
+@p*{
+ Accepting patches that were sent via email (on any other way), is also
+ simple. The command to do this is @cmd{git am}, which expects an argument
+ that is a mailbox file holding the patch emails, or you could run it and pipe
+ a patch email into it.}
+@h4{Executive summary:}
+@ol*{@~ @npre{$ git checkout master}
+ @~ Save (unmodified) patch emails into a mail folder file.
+ @~ @npre{$ git am -3 @i{the-mail-folder}}
+ @~ Push the changes up to the server}
+@h4{Longer version:}
+@ol*{
+@~ While you will not be author of the commits, you will be their committed, so
+ you should of course be aware of the changes, and be willing to maintain the
+ new code and other work that is implied. So the first step that you should
+ do is review the patch and make sure that you are willing to accept
+ responsibility for it.
+@~ Save the patch emails to a mail folder (usually a file). You must take care
+ to save the emails @em{as is}, including the date, author, and subject
+ headers, and avoiding text that could have been butchered by your email
+ client. For example, if you're using gmail, then use the “show original”
+ option to view the raw email text, and save that to a file (even in this
+ format gmail will have a first line with a bunch of spaces — it's best to
+ remove that). Otherwise, gmail does things like wrap lines, replaces spaces
+ by non-breaking spaces, or remove spaces. Alternatively, extract patch
+ files from an archive if that's what you received, or save a single
+ attachment file etc.
+@~ In your repository clone, make sure that you're on the branch that you want
+ to integrate the changes into. You could do this in your master branch, or
+ in a new topic branch (especially if there is more than one patch).
+@~ Run @cmd{git am -3 mail-folder} (@cmd{am} stands for “apply-mail”) with the
+ mail folder that you've created above. It will apply the patches and commit
+ them one by one. Like @cmd{git rebase}, if there are conflicts the process
+ will stop so you can resolve it — and then run the @cmd{am} command with
+ @cmd{--continue}, or @cmd{--skip} this commit and continue with the rest, or
+ @cmd{--abort} to go back to the start. The @cmd{-3} flags tells git that if
+ a conflicting patch comes from the above @cmd{format-patch}, and it
+ specifies files that we have, then try a 3-way merge — this will make things
+ generally better (and it can identify more patches that were applied,
+ instead of showing them as conflicts).
+ @br
+ You can also use an @cmd{-i} flag to the command to get an interactive
+ version — for each commit it will ask you what to do with it, and let you
+ edit the log message.
+@~ Finally push the commits as usual.}
+
+@subsection{Making a private repository publicly available}
+@p*{
+ If you're working with “outside people” (people with no accounts on the plt
+ server, and no direct file-system access etc) on a private repository, you
+ will need to find some way to make your repository available for cloning. An
+ easy way to do so is to put it on a filesystem that those people can access —
+ eg, if you're all in the same department. Another easy way to make a
+ repository available is to find a hosting service like github and others —
+ there are many options here, some are free but limited, and some cost money;
+ if you prefer this easy solution, keep in mind that you can pay for the
+ duration of the collaboration and at the end you can simply keep your
+ repository clone to yourself (eg, if you're working on a paper then there's
+ no need to pay once all work is done).
+@~
+ But if you want to do it yourself, the quickest and most convenient way to
+ make a repository public is to put it in a directory that is available on the
+ web. Such repositories can be cloned directly from the URL the repository is
+ available at — there's no need to setup a server in a special way, and no
+ need to run cgi scripts.}
+@h4{Executive summary:}
+@ol*{@~ @npre{$ git clone --bare @i{your-repo} ~/public_html/@i{repo}.git}
+ @~ @npre{$ cd ~/public_html/@i{repo}.git/hooks;
+ mv post-update.sample post-update;
+ chmod +x post-update}
+ @~ @npre{$ git remote add public ~/public_html/@i{repo}.git}
+ @~ Tell people to clone from @cmd{http://some.where/~you/@i{repo}.git}
+ @~ Work, apply email patches, and: @cmd{git push public}}
+@h4{Longer version:}
+@ol*{
+@~ Make a “bare” repository — this is a repository that has no working
+ directory:
+ @pre{git clone --bare @i{your-repo} @i{repo}.git}
+ This will create a @path{@i{repo}.git} directory holding the bare
+ repository. You should use some path in a directory where you have web
+ pages published.
+@~ The URL where the directory is found at is what other people should use when
+ cloning.
+@~ You can now push to this repository, and other people will see it too. To
+ make things easier, you can set a remote name for this repository, so it's
+ easy for you to push changes to it.
+ @pre{git remote add public ~/public_html/@i{repo}.git}
+ And now you can use @cmd{git push public}. (You can also pull from it, but
+ since you're going to be the only one who pushes into it, that will not be
+ necessary.)
+@~ One thing to be aware of is that while a repository can be published through
+ HTTP this way, git considers that a “dumb protocol” (because there is no
+ proper interaction between the two sides). To still make cloning possible,
+ you will need to maintain some meta-files that hold entry points to the
+ objects in your repository — to get this, run:
+ @pre{git update-server-info}
+ You need to run this after each update to the repository — and to automate
+ this you can have a hook do it for you. In the bare repository you will
+ find a @path{hooks} directory with a file called @path{post-update.sample} —
+ simply rename this file to @path{post-update}, and make it executable with
+ @cmd{chmod +x post-update}. From now on every push to this repository will
+ run the hook and keep the meta files updated.}
+
+@subsection{Pull-request workflow}
+@p*{
+ A possibly easier way for people to contribute work is to make their
+ repositories available somehow. In the case of a private repository, the two
+ sides can be in a shared file system, with read permissions for each other;
+ or achieved as described in the previous subsection. In the case of
+ contributing to the plt repository, the contributor can maintain a public
+ fork of the plt repository (eg, by forking the plt github mirror at
+ @selflink{http://github.com/plt/racket} directly on github).
+@~
+ In this workflow there is no need to mail patches — instead, the receiver
+ simply pulls them directly from the sender's repository. For example,
+ someone tells you that they have some new commits in a @cmd{foo} branch of
+ their repository. Since this is a repository that you can access, and since
+ it shares history with yours, you can just pull that branch in, for example:
+ @pre{git checkout -b someones-work
+ git pull @i{someones-repository-url}}
+ or, if you expect to do this often (eg, you're going to suggest fixes for the
+ work and get new work in), then you can add a @cmd{someone} remote to be used
+ more conveniently:
+ @pre{git remote add someone @i{someones-repository-url}
+ git fetch someone
+ git checkout -b some-branch someone/some-branch}
+ possibly using -t to make the branch track the remote one:
+ @pre{git checkout -tb some-branch someone/some-branch}
+@~
+ Once you pulled in the branch, you can inspect the changes, merge them,
+ rebase them, etc. The important point here is that you have a copy of the
+ contributed line of work, which you can use with the usual git toolset.
+@~
+ When/if you're happy with the changes, you can simply integrate them to your
+ master branch, and if this is in a clone of the plt repository, then at this
+ point you can simply push these commits to the main server. Once that
+ happens, the contributor can update their own clone, and continue working as
+ usual.
+@~
+ Git has a tool that makes this mode of work a little more organized and
+ robust: @cmd{git request-pull}. This simple command (surprisingly, it has no
+ flags) is intended to be used by the contributor. It expects a commit that
+ marks the start of the new work (actually, the last one before it, eg,
+ @cmd{origin/master}), and the url of the repository. For example:
+ @pre{git request-pull origin git://github.com/someone/somefork.git}
+@~
+ Of course, the contributor doesn't have to work directly in the available
+ repository — in the case of github or with an over-the-web setup like the one
+ described in the previous subsection the public repository is a bare one, and
+ no work can be done directly on it. So what actually happens is: the
+ contributor works on his/her own repository, pushes changes to the public
+ one, and then requests a pull.
+@~
+ The @cmd{request-pull} command will therefore check that the new commits are
+ indeed available at that location, and find out the branch that they're on
+ (in case it's different than the branch that someone is working on). It then
+ prints out a “pull request” text with a description of the changes, the url
+ that was specified, the branch name with the new work, and a summary of the
+ files that were changed. In short, all the relevant information is there,
+ and it even verified that the commits are indeed available — merging them in
+ is now easy.
+@~
+ (As a sidenote, you can use @cmd{.} as the url:
+ @cmd{git request-pull origin .}, and get a condensed summary of your
+ changes.)}
+
+@section{Additional Resources}
+@dl*{
+@~ @strong{Quick and short:}
+@~ @dl*{
+ @~ @selflink{http://eagain.net/articles/git-for-computer-scientists/}
+ @~ Basic description of what makes a git repository
+ @~ Cheat sheets:
+ @~ @dl*{
+ @~ @selflink{http://gitref.org/}
+ @~ Quick reference thing, with links to the git man pages
+ and the progit book
+ @~ @selflink{http://jonas.nitro.dk/git/quick-reference.html}
+ @~ Really short
+ @~ @selflink{http://cheat.errtheblog.com/s/git}
+ @~ Explains some more
+ @~ @selflink{http://ktown.kde.org/~zrusin/git/git-cheat-sheet.svg}
+ @~ Short, intended for printing
+ @~ @selflink{http://github.com/guides/git-cheat-sheet}
+ @~ Similar}
+ @~ @selflink{http://git.or.cz/course/svn.html}
+ @~ subversion->git crash course
+ @~ @selflink{http://www.kernel.org/pub/software/scm/git/docs/everyday.html}
+ @~ Nice summary of a few things, but too verbose or too advanced
+ in some places, and also a little outdated.}
+@~ @strong{Books:}
+@~ @dl*{
+ @~ @selflink{http://book.git-scm.com/}
+ @~ The git community book. Also, there are a bunch of videos
+ linked, and some tutorial links in the “Welcome” part.
+ @~ @selflink{http://progit.org/}
+ @~ A frequently recommended book.
+ @~ @selflink{http://www-cs-students.stanford.edu/~blynn/gitmagic/}
+ @~ Another good book (a bit more verbose than the previous one)}
+@~ @strong{Misc:}
+@~ @dl*{
+ @~ @selflink{http://www.kernel.org/pub/software/scm/git/docs/@;
+ gittutorial.html}
+ @~ The git tutorial, also available as the @man{gittutorial} man
+ page.
+ @~ @selflink{http://github.com/guides/home}
+ @~ Some github guides, including some screencasts, etc.
+ @~ @selflink{http://learn.github.com/}
+ @~ github learning materials — work in progress, but useful.
+ @~ @selflink{http://www.gitready.com/}
+ @~ A kind of a collection of small tips; looks like it didn't
+ change in a while though.
+ @~ @selflink{http://marklodato.github.com/visual-git-guide/}
+ @~ This a short visual document about git. But it goes a little
+ fast, so it would be useful after you're comfortable with the
+ basics.}}
+
+}}))