CSE, Software Development: October 2018

CLI Vs GUI
Advantage:

quick, straightforward access to git

Disadvantage:

assumes basic familiarity with the unix command line;
requires good working knowledge of the command line;
proficiency takes longer

Getting Started

Git Overview

As you develop software (making changes, adding features, fixing bugs, and so on), it is useful to have a mechanism that keeps track of changes and ensures that your code base and artifacts are well protected by being stored on a reliable server (or multiple servers).
This allows you to:

access historic versions of your application’s code in case something breaks, or
“roll back” to a previous version if a critical bug is found.

The solution is to use a revision control system that allows you to “check-in” changes to a code base.

It keeps track of all changes and allows you to “branch” a code base into a separate copy so that you can develop features or enhancements in isolation of the main code base (often called the “trunk” in keeping with the tree metaphor).
Once a branch is completed (and well-tested and reviewed), it can then be merged back into the main trunk and it becomes part of the project.

You may already be familiar with similar online (or “cloud”) storage systems such as Google Drive or Dropbox that allow you to share and even collaborate on documents and other files.
However, a version control system is a lot more. It essentially keeps track of all changes made to a project and allows users to work in large teams on very complex projects while minimizing the conflicts between changes.

These systems are not only used for organizational and backup purposes, but are absolutely essential when developing software as part of a team.

Each team member can have their own working copy of the project code without interfering with other developers’ copies or the main trunk.

Only when separate branches have to be merged into the trunk do conflicting changes have to be addressed. Otherwise, such a system allows multiple developers to work on a very complex project in an organized manner.

There are several widely used revision control systems including CVS (Concurrent Versions System), SVN (Apache Subversion), and Git.

CVS is mostly legacy and not as widely used anymore.

SVN is a centralized system: there is a single server that acts as the main code repository. Individual developers can check out copies and branch copies (which are also stored in the main repository). They also check all changes into the main repository.

Git, however, is a decentralized system; multiple servers can act as repositories, but each copy on each developer’s own machine is also a complete revision copy.

Commits are recorded in the local repository.
Sharing those commits with another repository requires an explicit push or pull.
Decentralizing the system means that anyone’s machine can act as a code repository and can lead to wider collaboration and independence since different parties are no longer dependent on one master repository.

Git is an open-source version control system known for its

speed,
stability, and
distributed collaboration model.

Originally created in 2005 to manage the entire Linux kernel, Git now boasts:

a comprehensive feature set,
an active development team, and
several free hosting communities.

Git was designed from the ground up, paying little attention to the existing standards of centralized versioning systems.
So, if you’re coming from an SVN or CVS background, try to forget everything you know about version control before reading this guide.

Distributed version control is fundamentally different from centralized version control.

Instead of storing file information in a single central repository, Git gives every developer a full copy of the repository.

To facilitate collaboration, Git lets each of these repositories share changes with any other repository.

Having a complete repository on your local machine has a far-reaching impact on the development cycle.

Faster Commands
First, a local copy of the repository means that almost all version control actions are much faster. Instead of communicating with the central server over a network connection, Git actions are performed on the local machine. This also means you can work offline without changing your workflow.

Stability
Since each collaborator essentially has a backup of the whole project, the risk of a server crash, a corrupted repository, or any other type of data loss is much lower than that of centralized systems that rely on a single point-of-access.

Isolated Environments
Every copy of a Git repository, whether local or remote, retains the full history of a project. Having a complete, isolated development environment gives each user the freedom to experiment with new additions before polishing them up into clean, publishable commits.

Efficient Merging
A complete history for each developer also means a divergent history for each developer. As soon as you make a single local commit, you’re out of sync with everyone else on the project. To cope with this massive amount of branching, Git became very good at merging divergent lines of development.

Each Git repository contains 4 components:

The working directory
The staging area
Committed history
Development branches

Everything from recording commits to distributed collaboration revolves around these core objects.

The Working Directory
The working directory is where you actually edit files, compile code, and otherwise develop your project. For all intents and purposes, you can treat the working directory as a normal folder. Except, you now have access to all sorts of commands that can record, alter, and transfer the contents of that folder.

The Staging Area
The staging area is an intermediary between the working directory and the project history.
Instead of forcing you to commit all of your changes at once, Git lets you group them into related change-sets (in other words, it introduces a level of abstraction for better organizing the files to be committed).
Staged changes are not yet part of the project history.

Committed History
Once you’ve configured your changes in the staging area, you can commit them to the project history, where they will remain as a “safe” revision.
Commits are “safe” in the sense that Git will never change them on its own, although it is possible for you to manually rewrite the project’s history if you want to.

Development Branches
So far, we’re still only able to create a project history in a linear fashion, adding one commit on top of another.

Branches make it possible to develop multiple unrelated features in parallel by forking the project history.

Installation

If you want to use Git on your own personal machine, then you may need to install a Git client.
Git is available on all major platforms. Go to the Downloads section of the official Git website and choose the package for your platform (for Windows there are 32-bit and 64-bit editions as well as a portable “thumbdrive” edition).
Besides the default command-line interface, Git ships with its own native GUI, and there are also third-party GUIs (free/open source or closed source, downloaded separately) for all tastes; some of them are multiplatform (i.e. available for all three mainstream platforms: Linux, Mac, and Windows).

There are many options out there and you are encouraged to explore them, however the following suggestions are all free and open source.

Git provides its own graphical user interface clients, which are available for free on all platforms.
If you will be using the Eclipse IDE for development, the most recent versions already come with a Git client (EGit). Eclipse will work on any system.
If you use Windows and prefer an interface integrated into Windows Explorer, you can download and install TortoiseGit, a Windows shell interface to Git.
If you use Mac and want the command line version of Git, you can download and install it from the Downloads section mentioned above. Alternatively, you can install Git using a tool like MacPorts.

Configuration

Git comes with a long list of configuration options covering everything from your name to your favorite editor and merge tool. You can set options with the git config command, or by manually editing a file called .gitconfig in your home directory. (The leading dot makes it a hidden file by Unix convention, so you may have to make it visible first, e.g. with CTRL-H in GNOME’s Nautilus file manager.)

Some of the most common options are presented below.

User Info
The first thing you’ll want to do with any new Git installation is introduce yourself by means of your username and email; that's very important in a collaborative environment.
Git records this information with your commits, and third-party services like GitHub use it to identify you.

git config --global user.name "John Smith"
git config --global user.email john@example.com

The --global flag records options in ~/.gitconfig (at the level of your home directory), making them the default for all new repositories you create.
Omitting it lets you specify options on a per-repository basis (meaning you can specify a different user name and/or email for different repositories you create or use).
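
As a minimal sketch of the per-repository case, the following sets an identity without --global inside a throwaway repository and reads it back. The name, the address, and the temporary directory are hypothetical and only serve as illustration.

```shell
# Work in a throwaway repository so that nothing real is touched.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Without --global, the values are written to this repository's .git/config only.
git config user.name "Jane Doe"            # hypothetical name
git config user.email jane@work.example    # hypothetical address

# --get prints the value in effect here; a per-repository value
# takes precedence over the one in ~/.gitconfig.
effective_email=$(git config --get user.email)
echo "$effective_email"
```

Other repositories on the same machine keep using whatever was set with --global.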

Editor
Git’s command-line implementation relies on a text editor for most of its input.
You can force Git to use your editor-of-choice with the core.editor option:

git config --global core.editor gvim

Make sure you pass any parameters the chosen editor requires (take a glance at the editor’s web page). For example, the lightweight GUI text editor geany must be started with the -imnst parameters; otherwise, when it is called from git commit, it does not stay open but appears and disappears instantly. Quote the editor command so that the parameters are stored as part of the value:

git config --global core.editor "geany -imnst"

Aliases

By default, Git doesn’t come with any shortcuts, but you can add your own by aliasing commands.
E.g. if you’re coming from an SVN background, you’ll appreciate the following bindings:

git config --global alias.st status
git config --global alias.ci commit
git config --global alias.co checkout
git config --global alias.br branch

Learn more by running git help config in your Git Bash prompt.

As you can tell, Git simply replaces the new command with whatever you alias it for. However, maybe you want to run an external command, rather than a Git sub-command. In that case, you start the command with a ! character. This is useful if you write your own tools that work with a Git repository. We can demonstrate by aliasing git visual to run gitk:

$ git config --global alias.visual "!gitk"

You can get a Git project using two main approaches.

The first takes an existing project (or directory) and imports it into Git.
The second clones an existing Git repository from another server.

Initializing Repositories

To start, you can verify that git has been properly installed on your machine by executing the following:

git --version

which may output something like the following, depending on your platform:

git version 1.7.1

or

git version 1.9.5 (Apple Git-50.3)

Your specific version may differ. However, if this command does not work, you will need to troubleshoot your installation before continuing.

Git is designed to be as unobtrusive as possible. The only difference between a Git repository and an ordinary project folder is an extra .git directory (recall that the leading dot means hidden) in the project root (not in every subfolder like SVN).
To turn an ordinary project folder into a full-fledged Git repository, enter the folder (with cd /path/to/folder) and run the following command:

git init

which should have output similar to:

Initialized empty Git repository in /your/directory/foo/.git/

This creates a new subdirectory named .git that contains all your necessary repository files—a Git repository skeleton.

At this point, nothing in your project is tracked yet.

The argument of git init should be a path to the repository (leaving it blank will use the current working directory).
After git init, you can use all of Git’s powerful version control features.
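
As a small sketch of the path argument, the following initializes a repository in a directory other than the current one and checks what was created. The directory name foo and the temporary parent directory are assumptions made for the example.

```shell
# A scratch location; "foo" is a hypothetical project folder.
parent=$(mktemp -d)

# Passing a path makes git init create the repository there
# (the folder itself is created if it does not exist yet).
git init -q "$parent/foo"

# The only thing git init added is the hidden .git directory.
ls -a "$parent/foo"

# Ask Git whether that folder is now a working tree.
inside=$(git -C "$parent/foo" rev-parse --is-inside-work-tree)
echo "$inside"
```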

Cloning Repositories

As an alternative to git init, you can clone an existing Git repository (local or remote). For a remote repository, use the following command:

git clone ssh://user@server/path/to/repo.git

or more simply

git clone user@server:path/to/repo.git

This command performs the following actions:

logs into the machine using SSH,
creates a local directory named repo,
initializes a .git directory inside it,
downloads all the data of the repo.git project, and
checks out a working copy of the repo’s latest version.

This is a complete copy, not just a link to the server’s repository.

If you want to name your local copy differently from the default (repo), pass the name, e.g. MyRepo, as an additional argument:

git clone user@server:path/to/repo.git MyRepo

After cloning, you have your own:

history,
branch structure,
staging area, and
working directory.

Moreover, no one will see any changes you make until you push them back to a public repository.
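
The independence of a clone can be sketched without any server by cloning a local repository. Everything here (the directory names origin-repo and MyRepo, the identity, the file contents) is hypothetical; the local repository merely stands in for "the server".

```shell
work=$(mktemp -d)
cd "$work"

# A small stand-in for the repository you would normally clone from a server.
git init -q origin-repo
git -C origin-repo config user.name "Example User"
git -C origin-repo config user.email user@example.com
echo "hello" > origin-repo/README
git -C origin-repo add README
git -C origin-repo commit -q -m "Add README"

# Clone it under a different directory name.
git clone -q origin-repo MyRepo
git -C MyRepo config user.name "Example User"
git -C MyRepo config user.email user@example.com

# Commit in the clone only; nothing is pushed.
echo "local change" >> MyRepo/README
git -C MyRepo commit -q -am "Local change"

# The clone has moved on, the original has not.
origin_commits=$(git -C origin-repo rev-list --count HEAD)
clone_commits=$(git -C MyRepo rev-list --count HEAD)
echo "origin: $origin_commits, clone: $clone_commits"
```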

Git has a number of different transfer protocols you can use. The previous example uses SSH, but you may also see the https:// and git:// protocols, e.g.

git clone https://github.com/cbourke/CSCE155-C-Lab01

If successful, you should see a message like the following:

Cloning into 'CSCE155-C-Lab01'...
remote: Counting objects: 9, done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 9 (delta 2), reused 9 (delta 2), pack-reused 0
Unpacking objects: 100% (9/9), done.

A new directory/file structure should now exist in your directory and you can start working with/editing the files.
If the owner of the repository that you just cloned ever makes changes, you can “pull” those changes from the repository by running git pull.

Recording Changes

Maintaining a series of “safe” revisions of a project is the core function of any version control system. Git accomplishes this by recording snapshots of a project.
After recording a snapshot, you can:

go back and view old versions,
restore them, and
experiment without the fear of destroying existing functionality.

SVN and CVS users should note that this is fundamentally different from their system’s implementation. Both of these applications record diffs for each file: an incremental record of the changes in a project.
In contrast, Git’s snapshots are just that: snapshots. Each commit contains the complete version of each file it contains.
This makes Git incredibly fast, since the state of a file doesn’t need to be generated on the fly each time it’s requested; it is already recorded in the snapshot, ready to be consulted when needed.

Given the core components of Git-based revision control:

working directory,
staging area, and
committed history.

Let's examine the basic workflow for creating snapshots using them:

You have a bona-fide Git repository and a checkout (or working copy) of the files for that project.
You need to make some changes and commit snapshots of those changes into your repository each time the project reaches a state you want to record.

Remember that each file in your working directory can be in one of two states: tracked or untracked.

Tracked files are files that were in the last snapshot (that is, they are already under version control); they can be unmodified, modified, or staged.
Untracked files are everything else: any files in your working directory that were not in your last snapshot (that is, not under version control) and are not in your staging area.

When you first clone a repository, all your files will be tracked and unmodified because you just checked them out and haven’t edited anything. From there the cycle is:

As you edit files, Git sees them as modified, because you’ve changed them since your last commit.
You stage these modified files.
You commit all your staged changes.
You go back to editing, and the cycle repeats.
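
The cycle above can be sketched in a throwaway repository. The short status codes printed along the way are ?? for untracked, A for newly staged, and M for modified; the file name notes.txt and the identity are hypothetical.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"        # hypothetical identity
git config user.email user@example.com

echo "first draft" > notes.txt
s_untracked=$(git status --short)   # the file exists but Git does not track it

git add notes.txt
s_staged=$(git status --short)      # the file is staged for the next commit

git commit -q -m "Add notes"
s_clean=$(git status --short)       # tracked and unmodified: nothing to report

echo "second draft" >> notes.txt
s_modified=$(git status --short)    # tracked, modified, not yet staged

printf '%s\n' "$s_untracked" "$s_staged" "[$s_clean]" "$s_modified"
```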

The Staging Area

Git’s staging area gives you a place (think of it as a limbo) to organize a commit before adding it to the project’s committed history.

Staging is the process of moving changes from the working directory to the staged snapshot.

It gives you the opportunity to pick and choose related changes from the working directory (even down to individual changed lines of text), instead of committing everything all at once.

This means you can create logical snapshots over chronological ones. This is a boon to developers because it lets them separate coding activities from version control activities.

When you’re writing features, you can forget about stopping to commit them in isolated chunks.
Then, when you’re done with your coding session, you can separate changes into as many commits as you like via the staging area.

To add new (untracked) or modified (already tracked) files from the working directory to the staging area, use the following command:

git add <file>

You can also use shortcuts such as the current-directory path or a shell wildcard:

git add .

git add *

Either command stages many files at once: new (untracked) files as well as tracked files that have been modified. Note that * is expanded by your shell, which normally skips hidden (dot) files, whereas . lets Git walk the whole current directory.

To add a directory (recursively adding all the files it contains), use:

git add <directory>
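
As a rough sketch of using the staging area to turn one coding session into two logical commits, the following stages one directory at a time. The directory names src and docs, the file contents, and the identity are made up for the example.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"        # hypothetical identity
git config user.email user@example.com

# One "coding session" that touches two unrelated areas.
mkdir src docs
echo 'int main(void) { return 0; }' > src/main.c
echo 'Usage notes' > docs/usage.txt

# First commit: only what lives under src/.
git add src
git commit -q -m "Add program skeleton"

# Second commit: only what lives under docs/.
git add docs
git commit -q -m "Add usage notes"

commit_count=$(git rev-list --count HEAD)
first_snapshot=$(git ls-tree -r --name-only HEAD~1)   # files in the first commit
echo "$commit_count commits; first one contains: $first_snapshot"
```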

Removing Files
To delete (or rename) a file in a versioned project, you need to delete (or rename) it in your staging area and then commit. The git rm and git mv commands do that, and also remove (or rename) the file in your working directory so you don’t see it as an untracked file the next time around.
If you simply remove (or rename) the file in your working directory, the change shows up under the “Changes not staged for commit” (that is, unstaged) area of your git status output. Add it to the staging area just like a new or modified file.
You have two options:

use your file system’s commands to delete or rename the file and then add the change to the staging area (with Git 2.0 or later, git add also stages the removal of a path that no longer exists), or
use Git’s own commands, which do both steps for you.

rm <file> && git add <file>
mv <file_from> <file_to> && git add <file_from> <file_to>

or respectively:

git rm <file>
git mv <file_from> <file_to>

The next time you commit, the file will be gone and no longer tracked.
If you modified the file and added it to the index already, you must force the removal with the -f option. This is a safety feature to prevent accidental removal of data that hasn’t yet been recorded in a snapshot and that can’t be recovered from Git.

git rm -f <file>

Another useful thing you may want to do is to keep the file in your working tree but remove it from your staging area.

In other words, you may want to keep the file on your hard drive but not have Git track it anymore.

This is particularly useful if you forgot to add something to your .gitignore file and accidentally added it, like a large log file or a bunch of .a compiled files.

The next command operates only on the staging area. It

stages the deletion and
stops tracking the file,
but doesn’t delete the file from the working directory:

git rm --cached <file>
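
A minimal sketch of that situation, assuming a log file that was committed by accident (the name app.log and the identity are hypothetical). git ls-files is used here only to list what Git currently tracks.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"        # hypothetical identity
git config user.email user@example.com

# The log file gets committed by mistake.
echo "debug output" > app.log
git add app.log
git commit -q -m "Accidentally add log file"

# Stop tracking it, but leave the file on disk.
git rm -q --cached app.log
git commit -q -m "Stop tracking app.log"

tracked=$(git ls-files)              # tracked files after the commit
status_now=$(git status --short)     # the log shows up as untracked again
echo "tracked: [$tracked]  status: $status_now"
```

Adding the file name to .gitignore afterwards keeps it from showing up as untracked.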

You can pass files, directories, and file-glob patterns to the git rm command. For example:

git rm log/\*.log
git rm \*~

The first command removes all files that have the .log extension in the log/ directory.
The second command removes all files that end with ~.
The backslash before * is necessary because Git does its own filename expansion in addition to your shell’s filename expansion.

Moving Files
Unlike many other VCSs, Git doesn’t explicitly track file movement. If you rename a file in Git, no metadata is stored in Git that tells it you renamed the file; however, Git is pretty smart about figuring that out after the fact.
Thus it’s a bit confusing that Git has a mv command. If you want to rename a file in Git, you can run something like:

git mv <file_from> <file_to>

and it works fine. In fact, if you run something like this and look at the status, you’ll see that Git considers it a renamed file:

git mv README.md README
git status

outputs

On branch master
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)

renamed: README.md -> README

However, this is equivalent to running something like this:

mv README.md README
git rm README.md
git add README

Git figures out that it’s a rename implicitly, so it doesn’t matter if you rename a file that way or with the git mv command. The only real difference is that git mv is one command instead of three; it’s a convenience function. More importantly, you can use any tool you like to rename a file, and address the add/rm later, before you commit.
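
The three-command route can be checked in a scratch repository: after a plain mv followed by git rm and git add, the short status reports a rename (R). The identity and file contents are hypothetical.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"        # hypothetical identity
git config user.email user@example.com

echo "Project documentation" > README.md
git add README.md
git commit -q -m "Add README"

# Rename with the ordinary file system command, then tell Git about both sides.
mv README.md README
git rm -q README.md      # stage the disappearance of the old name
git add README           # stage the new name

# Git pairs the removal and the addition up as a rename.
rename_status=$(git status --short)
echo "$rename_status"
```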

Inspecting the Stage

The main tool you use to determine which files are in which state is the git status command. Viewing the status of your repository is one of the most common actions in Git.

The following command outputs the state of the working directory and staging area:

$ git status

If it outputs:

# On branch master
# nothing to commit, working directory clean

This means you have a clean working directory: all tracked files are unmodified, and there are no untracked files (otherwise they would be listed here).
Finally, the command tells you which branch you’re on (here, the master branch). For now, that branch is always “master”, which is the default.
In a cloned repository, it also informs you whether your branch has diverged from the same branch on the server.

Otherwise, the output resembles the following (certain sections may be omitted depending on the state of your repository):

    # On branch master
    # Changes to be committed:
    #
    # new file: foobar.txt
    #
    # Changes not staged for commit:
    #
    # modified: foo.txt
    #
    # Untracked files:
    #
    # bar.txt

Let's examine it section by section:

The first section, “On branch master”, indicates where we currently are: on the master branch.
“Changes to be committed” is your staged snapshot. If you were to run git commit right now, only these files would be added to the project’s history (Git’s database).
The next section, “Changes not staged for commit”, lists tracked files that have been modified but will not be included in the next commit.
Finally, “Untracked files” contains files in your working directory that haven’t been added to the repository (from the repository’s perspective they are new, unversioned files). Git won’t start including them in your commit snapshots until you explicitly tell it to do so. It does this so you don’t accidentally begin including generated binary files or other files that you did not mean to include.
If you modify a file after staging it, that file is in two states at once: staged and modified. If you commit at that point, only the staged version (not the most recent version in your working directory) enters the committed history.

If you run git status -s or git status --short you get a far more simplified output:

$ git status -s

outputs:

 M README
MM Rakefile
A  lib/git.rb
M  lib/simplegit.rb
?? LICENSE.txt

New files that aren’t tracked have a ?? next to them,
new files that have been added to the staging area have an A,
modified files have an M, and so on.

There are two columns to the output:

the left-hand column indicates the status of the staging area and
the right-hand column indicates the status of the working directory.

So for example in that output, the README file is modified in the working directory but not yet staged, while the lib/simplegit.rb file is modified and staged. The Rakefile was modified, staged and then modified again, so there are changes to it that are both staged and unstaged.
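
The MM case can be reproduced with a short sketch: a file is staged and then edited again, so it has both staged and unstaged changes, and a commit at that point records only the staged version. The file contents and the identity are hypothetical.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"        # hypothetical identity
git config user.email user@example.com

echo "task :default" > Rakefile
git add Rakefile
git commit -q -m "Add Rakefile"

echo "task :test" >> Rakefile
git add Rakefile                 # this version is now staged
echo "task :docs" >> Rakefile    # edited again after staging

both_states=$(git status -s)     # staged and unstaged changes at once
echo "$both_states"

# Committing records the staged version only.
git commit -q -m "Add test task"
after_commit=$(git status -s)    # the later edit is still unstaged
echo "$after_commit"
```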

Generating Diffs

If the git status command is too vague for you (you want to know exactly what you changed, not just which files were changed), you can use the git diff command to inspect the changes in your working directory or staging area.

git diff

This outputs a diff of every tracked, modified, but not yet staged change in your working directory.
In other words, the command compares what is in your working directory with what is in your staging area. The result tells you the changes you’ve made that you haven’t yet staged.

It’s important to note that git diff by itself doesn’t show all changes made since your last commit; it shows only changes that are still unstaged.
This can be confusing, because if you’ve staged all your changes, git diff will give you no output.

If you want to see what you’ve staged that will go into your next commit, you can use git diff --staged (--cached and --staged are synonyms):

git diff --cached

This command compares your staged changes to your last commit in history.
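
The difference between the two comparisons can be sketched as follows. The --name-only option is used only to keep the output short (it lists the affected files instead of printing the patch); the file name and identity are hypothetical.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"        # hypothetical identity
git config user.email user@example.com

echo "line one" > file.txt
git add file.txt
git commit -q -m "Add file"

# Edit the file: the change is unstaged, so plain git diff sees it.
echo "line two" >> file.txt
unstaged_before=$(git diff --name-only)

# Stage it: plain git diff goes quiet, git diff --cached now reports it.
git add file.txt
unstaged_after=$(git diff --name-only)
staged=$(git diff --cached --name-only)

echo "before add: [$unstaged_before]  after add: [$unstaged_after]  staged: [$staged]"
```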

Note that the project(commit) history is outside the scope of git status. For displaying committed snapshots, you’ll need git log.

Although git status answers questions like

What have you changed but not yet staged?
What have you staged that you are about to commit?

very generally by listing the file names, git diff shows you the exact lines added and removed—the patch, as it were.

Commits

Commits represent every saved version of a project, which makes them the atomic unit of Git-based version control.

Each commit contains:

a snapshot of the whole project,
your user information,
the date,
a commit message, and
an SHA-1 checksum of its entire contents.

commit b650e3bd831aba05fa62d6f6d064e7ca02b5ee1b
Author: john
Date: Wed Jan 11 00:45:10 2012 -0600
Some commit message

The checksum serves two purposes:

it acts as the commit’s unique ID, and
it guarantees that a commit can never be corrupted or unintentionally altered without Git knowing about it.

Since the staging area already contains the desired change-set, committing doesn’t require any involvement from the working directory.

To commit the staged snapshot and add it to the history of the current branch, execute the following:

git commit

You’ll be presented with the text editor you configured earlier and prompted for a “commit message.”
Commit messages should take the following email-like form:

commit summary in 50 characters or less
blank line
detailed description of changes in this commit

Git uses the first line for formatting log output, e-mailing patches, etc., so it should be brief, while still describing the entire change-set.

If you can’t come up with the summary line, it probably means your commit contains too many unrelated changes. You should go back and split them up into distinct commits.

The summary should be followed by a blank line and a detailed description of the changes (e.g., why you made the changes, what ticket(issue) number it corresponds to).
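
When no editor is wanted, the same summary / blank line / description layout can be sketched with two -m options, which Git joins as separate paragraphs. The message text, the issue number, and the identity below are invented for the example.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"        # hypothetical identity
git config user.email user@example.com

echo "placeholder" > a.txt
git add a.txt

# First -m: the summary line. Second -m: the detailed description.
git commit -q -m "Add placeholder file" \
              -m "Needed so the build script finds the directory; see issue 42 (hypothetical)."

subject=$(git log -1 --format=%s)   # the summary line only
body=$(git log -1 --format=%b)      # everything after the blank line
echo "subject: $subject"
echo "body:    $body"
```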

As you may notice, this format resembles an email, and that is no coincidence: contributions (patches) to the Linux kernel were originally exchanged through email lists.

Skipping the Staging Area

Although it can be amazingly useful for crafting commits exactly how you want them, the staging area is sometimes a bit more complex than you need in your workflow. If you want to skip the staging area, Git provides a simple shortcut.

Adding the -a option to the git commit command makes Git automatically stage every file that is already tracked before doing the commit, letting you skip the git add command:

git commit -am 'commitMessage'

or, equivalently,

git commit -a -m 'commitMessage'
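
One consequence worth seeing in a sketch: the -a option only picks up files that are already tracked, so a brand-new file is left out of the commit. File names and identity are hypothetical.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"        # hypothetical identity
git config user.email user@example.com

echo "v1" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked file"

echo "v2" >> tracked.txt       # change to a tracked file
echo "new" > untracked.txt     # brand-new file, never added

# Stage-and-commit in one step; only tracked.txt is included.
git commit -q -am "Update tracked file"

left_over=$(git status --short)
echo "$left_over"
```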

Inspecting Commits

Like a repository’s status, viewing a repository’s history is one of the most common tasks in Git version control.
You can display the current branch’s commits with:

git log

We now have the two tools (status and log) we need to inspect every component of a Git repository.

This also gives us a natural grouping of commands:

Stage/Working Directory: git add, git rm, git mv, git status
Committed History: git commit, git log

Useful Configurations
Git provides a plethora of formatting options for git log, a few of which are included here.

To display each commit on a single line, use:

git log --oneline

or, to target the history of an individual file instead of the whole repository, use:

git log --oneline <file>

Filtering the log output is also very useful once your history grows beyond one screenful of commits.

Another really useful option is --pretty. This option changes the log output to formats other than the default. A few prebuilt options are available for you to use.

The oneline option prints each commit on a single line, which is useful if you’re looking at a lot of commits.
In addition, the short, full, and fuller options show the output in roughly the same format but with less or more information, respectively.
The most interesting option is format, which allows you to specify your own log output format. This is especially useful when you’re generating output for machine parsing: because you specify the format explicitly, you know it won’t change with updates to Git.

git log --pretty=oneline

ca82a6dff817ec66f44342007202690a93763949 changed the verison number
085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7 removed unnecessary test
...

git log --pretty=format:"%h - %an, %ar : %s"

ca82a6d - Scott Chacon, 6 years ago : changed the version number
085bb3b - Scott Chacon, 6 years ago : removed unnecessary test
...

The oneline and format options are particularly useful with another log option called --graph. This option adds a nice little ASCII graph showing your branch and merge history. This type of output will become more interesting as we go through branching and merging.

git log --pretty=format:"%h %s" --graph

* 2d3acf9 ignore errors from SIGCHLD on trap
* 5e3ee11 Merge branch 'master' of git://github.com/dustin/grit
|\
| * 420eac9 Added a method for getting the current branch.
* | 30e367c timeout code and tests
...

Finally, you can display a diffstat of the changes in each commit.

The --stat option prints below each commit entry a list of modified files, how many files were changed, and how many lines in those files were added and removed. It also puts a summary of the information at the end.

git log --stat

outputs:

commit a6175fc3cf1e3afc06b3cd27b63a9138b24f2032
Author: harrykar
Date:   Fri Oct 19 08:35:07 2018 +0300

    issues:add in list #3009,#3010, correction blank in #3011

    added:
    Object-oriented Programming in JavaTM Textbook by Rick Halterman #3009

...

free-programming-books.md | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

One of the more helpful options is -p, which shows the difference introduced in each commit. You can also use -2, which limits the output to only the last two entries.
This is very helpful for code review or to quickly browse what happened during a series of commits that a collaborator has added.

git log -p -2

commit ca82a6dff817ec66f44342007202690a93763949
Author: Scott Chacon
Date:   Mon Mar 17 21:52:11 2008 -0700

    changed the verison number

diff --git a/Rakefile b/Rakefile
index a874b73..8f94139 100644
...

Limiting Log Output

You can use the following to display the commits contained in <until> but not in <since>. Both arguments can be:

a commit ID,
a branch name, or
a tag:

git log <since>..<until>
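
A small sketch of the range syntax, using a commit ID on the left and a branch name on the right. The branch name feature, the commit messages, and the identity are hypothetical.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"        # hypothetical identity
git config user.email user@example.com

echo 1 > f
git add f
git commit -q -m "Base commit"
base=$(git rev-parse HEAD)       # remember the commit ID of the starting point

# Two further commits on a separate branch.
git checkout -q -b feature
echo 2 >> f
git commit -q -am "Feature step 1"
echo 3 >> f
git commit -q -am "Feature step 2"

# Commits reachable from feature but not from the base commit, newest first.
range_subjects=$(git log --format=%s "$base"..feature)
echo "$range_subjects"
```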

You can also limit the output by time. For example, this command gets the list of commits made in the last two weeks:

git log --since=2.weeks

This command works with lots of formats—you can specify a specific date like "2008-01-15", or a relative date such as "2 years 1 day 3 minutes ago".
You can also filter the list to commits that match some search criteria. The --author option allows you to filter on a specific author, and the --grep option lets you search for keywords in the commit messages. (Note that if you specify several --grep patterns, the command matches commits that contain any of them; add --all-match to require all of them.)

Another really helpful filter is the -S option that takes a string and only shows the commits that introduced a change to the code that added or removed that string. For instance, if you wanted to find the last commit that added or removed a reference to a specific function, you could call:

git log -Sfunction_name
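
As a sketch of this filter, the repository below has two commits touching the same file, but only the first one changes the number of occurrences of the string area, so only that commit is listed. The file, the function name, and the identity are hypothetical.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example User"        # hypothetical identity
git config user.email user@example.com

# This commit introduces the string "area".
printf 'int area(int w, int h) { return w * h; }\n' > geom.c
git add geom.c
git commit -q -m "Add area function"

# This commit touches the same file but neither adds nor removes "area".
printf '/* helpers */\n' >> geom.c
git commit -q -am "Add comment"

# Note the single dash: the option is -S, followed directly by the string.
found=$(git log -Sarea --format=%s)
echo "$found"
```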

The last really useful option to pass to git log as a filter is a path.
If you specify a directory or filename, you can limit the log output to commits that introduced a change to those files. This is always the last option and is generally preceded by double dashes (--) to separate the paths from the options.

For example, if you want to see which commits modifying test files in the Git source code history were committed by Junio Hamano and were not merges in the month of October 2008, you can run something like this:

git log --pretty="%h - %s" --author=gitster --since="2008-10-01" \
--before="2008-11-01" --no-merges -- t/

Of the nearly 40,000 commits in the Git source code history, this command shows the 6 that match those criteria.
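The same kind of path filter can be reproduced on a much smaller scale. This is a sketch with invented directory names (t and src) and invented commit messages, not the Git source tree itself.

```shell
# Throwaway repository; directory names and messages are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email "demo@example.com"

mkdir t src
echo "test" > t/t0001.sh  && git add t   && git commit -q -m "add first test"
echo "code" > src/main.c  && git add src && git commit -q -m "add main program"
echo "more" >> t/t0001.sh && git commit -q -am "extend first test"

# Everything after the double dash is treated as a path, not as an option.
test_commits=$(git log --pretty=format:%s --no-merges -- t/)
echo "$test_commits"
```

The commit that only touched src/ is left out of the listing.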

For visualizing history, you might also want to look at the gitk command, a separate GUI program normally distributed with the git package and dedicated to graphing branches. Run git help gitk for details.
There are also some third-party tools of this kind, e.g. gitg (for the GNOME desktop environment) and giggle.

Tagging Commits

Like most VCSs, Git has the ability to tag (bookmark) specific points in history as being important. Typically people use this functionality to mark release points (v1.0, and so on). Tags are simple pointers to commits.
The git tag command can be used to create a new tag:

git tag -a v1.0 -m "Stable release"

The -a option tells Git to create an annotated tag, which lets you record a message along with it (specified with -m).

Running the same command without arguments will list your existing tags:

git tag

You can also search for tags with a particular pattern.

$ git tag -l 'v1.8.5*'

Creating Tags
Git uses two main types of tags:

lightweight and
annotated.

A lightweight tag is very much like a branch that doesn’t change—it’s just a pointer to a specific commit.

Annotated tags, however, are stored as full objects in the Git database. They:

are checksummed;
contain the tagger name, e-mail, and date;
have a tagging message; and
can be signed and verified with GNU Privacy Guard (GPG).

It’s generally recommended that you create annotated tags so you can have all this information; but if you want a temporary tag or for some reason don’t want to keep the other information, lightweight tags are available too.
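One way to see the difference between the two tag types is to ask Git what kind of object each tag name refers to. A minimal sketch, assuming a scratch repository with a single empty commit:

```shell
# Throwaway repository with one commit to tag.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email "demo@example.com"
git commit -q --allow-empty -m "first commit"

git tag v1.4-lw                         # lightweight: only a name for the commit
git tag -a v1.4 -m "my version 1.4"     # annotated: a separate tag object

lw_type=$(git cat-file -t v1.4-lw)      # object type the name refers to
an_type=$(git cat-file -t v1.4)
echo "v1.4-lw -> $lw_type, v1.4 -> $an_type"
```

The lightweight tag refers directly to a commit, while the annotated tag refers to a tag object that in turn points to the commit.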

Annotated Tags
Creating an annotated tag in Git is simple.

$ git tag -a v1.4 -m 'my version 1.4'

If you don’t specify a message for an annotated tag, Git launches your editor so you can type it in.
You can see the tag data along with the commit that was tagged by using the git show command:

$ git show v1.4

tag v1.4
Tagger: Ben Straub
Date:   Sat May 3 20:19:12 2014 -0700

my version 1.4

commit ca82a6dff817ec66f44342007202690a93763949
Author: Scott Chacon
Date:   Mon Mar 17 21:52:11 2008 -0700

    changed the version number

Lightweight Tags
Another way to tag commits is with a lightweight tag. This is basically the commit checksum stored in a file—no other information is kept. To create a lightweight tag, don’t supply the -a, -s, or -m option:

$ git tag v1.4-lw

This time, if you run git show on the tag, you don’t see the extra tag information. The command just shows the commit:

$ git show v1.4-lw

commit ca82a6dff817ec66f44342007202690a93763949
Author: Scott Chacon
Date:   Mon Mar 17 21:52:11 2008 -0700

    changed the version number

Tagging Later
You can also tag commits after you’ve moved past them. Suppose your commit history looks like this:

$ git log --pretty=oneline

15027957951b64cf874c3557a0f3547bd83b3ff6 Merge branch 'experiment'
a6b4c97498bd301d84096da251c98a07c7723e65 beginning write support
0d52aaab4479697da7686c15f77a3d64d9165190 one more thing
...
9fceb02d0ae598e95dc970b74767f19372d61af8 updated rakefile
...

Now, suppose you forgot to tag the project at v1.2, which was at the “updated rakefile” commit. You can add it after the fact. To tag that commit, you specify the commit checksum (or part of it) at the end of the command:

$ git tag -a v1.2 9fceb02

You can see that you’ve tagged the commit:

$ git tag

v0.1
v1.2
...

$ git show v1.2

tag v1.2
Tagger: Scott Chacon
Date:   Mon Feb 9 15:32:16 2009 -0800

version 1.2

commit 9fceb02d0ae598e95dc970b74767f19372d61af8
Author: Magnus Chacon
Date:   Sun Apr 27 20:43:35 2008 -0700

    updated rakefile
...

Sharing Tags
By default, the git push command doesn’t transfer tags to remote servers. You will have to explicitly push tags to a shared server after you have created them. This process is just like sharing remote branches—you can run

$ git push origin v1.5

Counting objects: 14, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (12/12), done.
Writing objects: 100% (14/14), 2.05 KiB | 0 bytes/s, done.
Total 14 (delta 3), reused 0 (delta 0)
To git@github.com:schacon/simplegit.git
* [new tag]         v1.5 -> v1.5

If you have a lot of tags that you want to push up at once, you can also use

$ git push origin --tags

Counting objects: 1, done.
Writing objects: 100% (1/1), 160 bytes | 0 bytes/s, done.
Total 1 (delta 0), reused 0 (delta 0)
To git@github.com:schacon/simplegit.git
* [new tag]         v1.4 -> v1.4
* [new tag]         v1.4-lw -> v1.4-lw

This transfers all your tags that are not already there to the remote server.
Now, when someone else clones or pulls from your repository, they will get all your tags as well.

Undoing Changes

The whole point of maintaining “safe” copies of a software project is peace of mind:

should your project suddenly break, you’ll know that you have easy access to a functional version, and you’ll be able to pinpoint precisely where the problem was introduced.

To this end, recording commits is useless without the ability to undo changes.
However, since Git has three components (the working directory, the staging area, and the committed history), “undoing” can take on many different meanings.
For example, you can:

Undo changes in the working directory
Undo changes in the staging area
Undo an entire commit

To complicate things even further, there are two ways to undo a commit. You can either:

Simply delete the commit from the project history (that's generally bad because you lose history, which is considered an important asset).
Leave the commit as is (for the sake of maintaining history), using a new commit to undo the changes introduced by the first commit.

Git has a dedicated tool for each of these situations.
Let’s start with the working directory.

Undoing in the Working Directory

The period of time immediately after saving a safe copy of a project is one of great experimentation and innovation.
Empowered by the knowledge that you’re free to do anything you want without damaging the code base, you can experiment to your heart’s content.

However, this carefree experimentation often takes a wrong turn and leads to a working directory with a heap of off-topic code. When you reach this point, you’ll probably want to run the following commands:

git reset --hard HEAD
git clean -f

This configuration of git reset makes the working directory and the stage match the files in the most recent commit (also called HEAD), effectively obliterating all uncommitted changes in tracked files.

Ok but what about untracked files?

To get rid of untracked files, you have to use the git clean command. Git is very careful about removing code, so you must also supply the -f option to force the deletion of these files.
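The effect of the two commands can be checked in a scratch repository. This is a sketch only; the file names tracked.txt and untracked.txt are invented for the example.

```shell
# Throwaway repository; file names are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email "demo@example.com"

echo "stable" > tracked.txt
git add tracked.txt && git commit -q -m "safe copy"

echo "experiment" >> tracked.txt    # uncommitted change to a tracked file
echo "scratch" > untracked.txt      # brand-new file Git does not track

git reset -q --hard HEAD    # tracked files match the last commit again
git clean -q -f             # untracked files are deleted

cat tracked.txt             # only the committed content is left
ls                          # untracked.txt is gone
```

Note that neither change can be brought back afterwards, since neither was ever committed.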

Individual Files
It’s also possible to target individual files. The following command will make a single file in the working directory match the version in the most recent commit (the one HEAD points to).

git checkout HEAD <file>

What if you realize that you don’t want to keep your changes to the benchmarks.rb file? How can you easily unmodify it, that is, revert it to what it looked like when you last committed (or initially cloned, or however you got it into your working directory)?
Luckily, git status tells you how to do that, too. In the last example output, the unstaged area looks like this:

Changes not staged for commit:
   (use "git add <file>..." to update what will be committed)
   (use "git checkout -- <file>..." to discard changes in working directory)

    modified: benchmarks.rb

It tells you pretty explicitly how to discard the changes you’ve made. Let’s do what it says:

git checkout -- benchmarks.rb
git status

On branch master
Changes to be committed:
   (use "git reset HEAD <file>..." to unstage)

    renamed: README.md -> README

You can see that the changes have been reverted.

It’s important to understand that git checkout -- <file> is a dangerous command. Any changes you made to that file are gone—you just copied another file over it. Don’t ever use this command unless you absolutely know that you don’t want the file.

If you would like to keep the changes you’ve made to that file but still need to get it out of the way for now, stashing and branching are generally better ways to go.

Remember, anything that is committed in Git can almost always be recovered. Even commits that were on branches that were deleted or commits that were overwritten with an --amend commit can be recovered. However, anything you lose that was never committed is likely never to be seen again.

The git checkout HEAD <file> command doesn’t change the project history at all, so you can safely replace HEAD with a

commit ID,
branch,
or tag

to make the file match the version in that commit.

But do not try this with git reset <commit>, as it moves the current branch and so changes your history.

Undoing in the Staging Area
In the process of configuring your next commit, you’ll occasionally add an extra file to the stage. The following invocation of git reset will unstage it:

git reset HEAD <file>

Omitting the --hard flag tells Git to leave the working directory untouched (as opposed to git reset --hard HEAD, which resets every file in both the working directory and the staging area).

The staged version of the file matches HEAD, and the working directory retains the modified version. As you might expect, this results in an unstaged modification in your git status output.
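A quick way to convince yourself of this is the following sketch, run in a scratch repository (the file contents are made up):

```shell
# Throwaway repository; file contents are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email "demo@example.com"

echo "v1" > benchmarks.rb
git add benchmarks.rb && git commit -q -m "add benchmarks"

echo "v2" > benchmarks.rb
git add benchmarks.rb               # staged by accident
git reset -q HEAD benchmarks.rb     # unstage it, keep the edit on disk

staged=$(git diff --cached --name-only)    # files staged for the next commit
unstaged=$(git diff --name-only)           # files modified but not staged
echo "staged: [$staged] unstaged: [$unstaged]"
cat benchmarks.rb                          # still contains the modified version
```

The staging area is empty again, while the modified file is untouched in the working directory.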

The nice part is that the command you use to determine the state of your staging area and working directory also reminds you how to undo changes to them.
E.g. let’s say you’ve changed two files and want to commit them as two separate changes, but you accidentally type git add . and stage them both. How can you unstage one of the two? The git status command reminds you:

git add .
git status

On branch master
Changes to be committed:
    (use "git reset HEAD <file>..." to unstage)

       renamed:  README.md -> README
       modified: benchmarks.rb

Right below the “Changes to be committed” text, it says to use git reset HEAD <file>... to unstage. So, let’s use that advice to unstage the benchmarks.rb file:

git reset HEAD benchmarks.rb

Unstaged changes after reset:
M benchmarks.rb

git status

On branch master
Changes to be committed:
   (use "git reset HEAD <file>..." to unstage)

   renamed: README.md -> README

Changes not staged for commit:
   (use "git add <file>..." to update what will be committed)
   (use "git checkout -- <file>..." to discard changes in working directory)

   modified: benchmarks.rb

The benchmarks.rb file is modified but once again unstaged.

While git reset --hard can be a dangerous command because it also touches your working directory (you can lose content if you have uncommitted changes), git reset without an option (the default is --mixed) is not dangerous: it only touches your staging area, not your working directory.

Undoing Commits

There are two ways to undo a commit using Git:

You can either reset it by simply removing it from the project history,
or you can revert it by generating a new commit that gets rid of the changes introduced in the original.

Undoing by introducing another commit may seem excessive, but rewriting history by completely removing commits can have dire consequences in multi-user workflows.

Resetting
The ever-versatile git reset can also be used to move the HEAD reference.

git reset HEAD~1

The HEAD~1 syntax specifies the commit that occurs immediately before HEAD (likewise, HEAD~n refers to the nth commit before HEAD).

By moving the HEAD reference backward, you’re effectively removing the most recent commit from the project’s history.

This is an easy way to remove a couple of commits that veered off-topic, but it presents a serious collaboration problem.

If another developer had started building on top of the commit we removed, how would he or she synchronize with our repository? The developer would have to ask us for the ID of the replacement commit, manually track it down in our repository, move all of the changes to that commit, resolve merge conflicts, and then share the “new” changes with everybody again.

Just imagine what would happen in an open-source project with hundreds or thousands of contributors.

The point is, don’t reset public commits, but feel free to delete private ones that you haven’t shared with anyone.
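For a private commit, resetting can be sketched as follows in a scratch repository (file name and messages are invented). Without --hard, the content of the removed commit stays in the working directory as an uncommitted change.

```shell
# Throwaway repository; nothing here has been shared with anyone.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email "demo@example.com"

echo "one" > notes.txt && git add notes.txt && git commit -q -m "first"
echo "two" >> notes.txt && git commit -q -am "second (off-topic)"

git reset -q HEAD~1      # move the branch back by one commit

subjects=$(git log --pretty=format:%s)
echo "$subjects"         # the second commit is no longer in the history
git status --short       # notes.txt shows up as modified but not staged
```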

Reverting
To remedy the problems introduced by resetting public commits, Git developers devised another way to undo commits: git revert

Instead of altering existing commits, reverting adds a new commit that undoes the problem commit:

git revert <commit-id>

This takes the changes in the specified commit, figures out how to undo them, and creates a new commit with the resulting change-set. To Git and to other users, the revert commit looks and acts like any other commit—it just happens to undo the changes introduced by an earlier commit.

Revert is the ideal way of undoing changes that have already been committed to a public repository.
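The following sketch shows a revert in a scratch repository. The file app.txt and the commit messages are made up, and --no-edit is used only so that no editor opens for the revert message.

```shell
# Throwaway repository; file name and messages are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email "demo@example.com"

echo "good" > app.txt && git add app.txt && git commit -q -m "good change"
echo "bad" >> app.txt && git commit -q -am "bad change"

git revert --no-edit HEAD >/dev/null    # new commit that undoes "bad change"

count=$(git rev-list --count HEAD)      # history grew, nothing was removed
echo "commits: $count"
cat app.txt                             # the bad line is gone again
```

The problem commit is still part of the history; a third commit simply cancels it out.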

Amending
In addition to completely undoing commits, you can also amend the most recent commit, for example when you commit too early and forget to add some files, or you mess up your commit message. Stage any additional changes as usual, then run commit with the --amend option:

git commit --amend

This replaces the previous commit with a new one (with a new commit ID) instead of adding a separate commit on top of it.

This command takes your staging area and uses it for the commit. If you’ve made no changes since your last commit (for instance, you run this command immediately after your previous commit), then your snapshot will look exactly the same, and all you’ll change is your commit message.

For your convenience, the commit editor is seeded with the old commit’s message. You can edit the message the same as always, but it overwrites your previous commit.

E.g., if you commit and then realize you forgot to stage the changes in a file you wanted to add to this commit, you can do something like this:

git commit -m 'initial commit'
git add <forgotten_file>
git commit --amend

Again, you must be careful when using the --amend flag, since it rewrites history much like git reset.
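That amending really replaces the commit can be seen by comparing commit IDs before and after. A sketch in a scratch repository, with invented file names and with --no-edit to keep the old message without opening an editor:

```shell
# Throwaway repository; file names are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email "demo@example.com"

echo "a" > a.txt && git add a.txt && git commit -q -m "initial commit"
before=$(git rev-parse HEAD)

echo "b" > forgotten.txt
git add forgotten.txt
git commit -q --amend --no-edit     # fold the forgotten file into that commit
after=$(git rev-parse HEAD)

echo "before: $before"
echo "after:  $after"
git rev-list --count HEAD           # still a single commit in the history
```

The history still holds one commit, but its ID has changed, which is exactly why amending a commit that others already have causes the same trouble as resetting it.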

Branches

Branches multiply the basic functionality offered by commits by allowing users to fork their history.

Creating a new branch is akin to requesting a new development environment, complete with an isolated working directory, staging area, and project history.

This gives you the same peace of mind as committing a “safe” copy of your project, but you now have the additional capacity to work on multiple versions at the same time.

Branches enable a non-linear workflow—the ability to develop unrelated features in parallel.

A non-linear workflow is an important precursor to the distributed nature of Git’s collaboration model.

Unlike SVN or CVS, Git’s branch implementation is incredibly efficient.

SVN enables branches by copying the entire project into a new folder, much like you would do without any revision control software. This makes merges clumsy, error-prone, and slow.

In contrast, Git branches are simply a pointer to a commit. Since they work on the commit level instead of directly on the file level, Git branches make it much easier to merge diverging histories. This has a dramatic impact on branching workflows.

Manipulating Branches

Git separates branch functionality into a few different commands.
The git branch command is used for:

listing,
creating,
or deleting branches.

Listing Branches
First and foremost, you’ll need to be able to view your existing branches with:

git branch

This will output all of your current branches, along with an asterisk next to the one that’s currently “checked out” (more on that later):

* master
  quick-bug-fix
  some-feature

The master branch is Git’s default branch, which is created with the first commit in any repository. Many developers use this branch as the “main” history of the project—a permanent branch that contains every major change it goes through.

Creating Branches

Here's a shortcut that creates a branch and switches to it in one step:

git checkout -b <name>

or you can create a new branch by passing the branch name to the git branch command:

git branch <name>

This creates a pointer to the current HEAD, but does not switch to the new branch.
For that you’ll need:

git checkout <branch-name>

Immediately after requesting a new branch, your current branch (master) and the new branch (some-feature) both reference the same commit, but any new commits you record will be exclusive to the current branch (pointed to by HEAD).

Again, this lets you work on unrelated features in parallel, while still maintaining sensible histories.

For example, if your current branch (pointed to by HEAD) was some-feature, then after committing a snapshot the new HEAD exists only in the some-feature branch. It won’t show up in the log output of master, nor will its changes appear in the working directory after you check out master.
You can actually see the new branch in Git's internal database by opening the file:

.git/refs/heads/<name>

The file contains the ID of the referenced commit, and it is the sole definition of a Git branch. This is the reason branches are so lightweight and easy to manage.
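You can check this for yourself with the sketch below. It assumes a freshly created scratch repository using Git's default on-disk ref storage, where a new branch is written as a plain file (in an older, busier repository, Git may have packed the refs into a single file instead).

```shell
# Throwaway repository with one commit and one extra branch.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email "demo@example.com"
git commit -q --allow-empty -m "first commit"

git branch some-feature

cat .git/refs/heads/some-feature    # the branch file holds one commit ID
git rev-parse HEAD                  # the same ID, as reported by Git
```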

Deleting Branches
Finally, you can delete branches via the -d flag:

git branch -d <name>

But, Git’s dedication to never losing your work prevents it from removing branches with unmerged commits.

$ git branch -d testing
error: The branch 'testing' is not fully merged.
If you are sure you want to delete it, run 'git branch -D testing'.

If you really do want to delete the branch and lose that work, you can force it with -D, as the helpful message points out.

git branch -D <name>

Unmerged changes will be lost, so be very careful with this command.

The git branch command does more than just create and delete branches. If you run it with no arguments, you get a simple listing of your current branches. To see the last commit on each branch, you can run

git branch -v

The useful --merged and --no-merged options can filter this list to branches that you have or have not yet merged respectively into the branch you’re currently on.
To see which branches are already merged into the branch you’re on, you can run

$ git branch --merged

Branches on this list without the * in front of them are generally fine to delete with git branch -d; you’ve already incorporated their work into another branch, so you’re not going to lose anything.

To see all the branches that contain work you haven’t yet merged in, you can run

$ git branch --no-merged

Because such a branch contains work that isn’t merged in yet, trying to delete it with git branch -d will fail.

Checking Out Branches

Of course, creating branches is useless without the ability to switch between them. Git calls this “checking out” a branch:

git checkout <branch>

After checking out the specified branch, your working directory is updated to match the specified branch’s commit.
In addition, the HEAD is updated to point to the new branch, and all new commits will be stored on the new branch.

You can think of checking out a branch as switching to a new project folder—except it will be much easier to pull changes back into the project.

It’s usually a good idea to have a clean working directory before checking out a branch. A clean directory exists when there are no uncommitted changes.

If this isn’t the case, git checkout carries your uncommitted changes over to the branch you switch to or, if they would be overwritten, refuses to switch.

It’s important to note that when you switch branches in Git, files in your working directory will change. If you switch to an older branch, your working directory will be reverted to look like it did the last time you committed on that branch. If Git cannot do it cleanly, it will not let you switch at all.

As with committing a “safe” revision, you’re free to experiment on a new branch without fear of destroying existing functionality.

But, you now have a dedicated history to work with, so you can record the progress of an experiment using the exact same git add and git commit commands.

Now your project history has diverged. You created and switched to a branch, did some work on it, and then switched back to your main branch and did other work.
Both of those changes are isolated in separate branches: you can switch back and forth between the branches and merge them when you’re ready.
And you did all that with simple branch, checkout, and commit commands.
This functionality will become even more powerful once we learn how to merge divergent histories back into the “main” branch (e.g., master). We’ll get to that in a moment, but first, there is an important use case of git checkout that must be considered.

Detached HEADs
Git also lets you use git checkout with tags and commit IDs, but doing so puts you in a detached HEAD state. This means that you’re not on a branch anymore—you’re directly viewing a commit.

You can look around and add new commits as usual, but since there is no branch pointing to the additions, they become unreachable as soon as you switch back to a real branch; unless you noted their commit IDs, you will have a hard time finding them again.

Fortunately, creating a new branch in a detached HEAD state is easy enough:

git checkout -b <new-branch-name>

This is a shortcut for

git branch <new-branch-name>

followed by

git checkout <new-branch-name>

After which, you’ll have a shiny new branch reference to the formerly detached HEAD.

This is a very handy procedure for forking experiments off of old revisions.
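The whole procedure might look like the following sketch. The tag v1.0 and the branch name old-experiment are invented for the example.

```shell
# Throwaway repository; tag and branch names are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email "demo@example.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
git tag v1.0 HEAD~1                     # tag the older of the two commits

git checkout -q v1.0                    # checking out a tag detaches HEAD
git symbolic-ref -q HEAD || echo "HEAD is detached"

git commit -q --allow-empty -m "experiment"
git checkout -q -b old-experiment       # give the new commit a branch

current=$(git rev-parse --abbrev-ref HEAD)
echo "now on: $current"
git log --pretty=format:%s              # experiment, then first
```

The experiment now lives on its own branch, forked off the old revision rather than off the tip of master.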

Merging Branches

Merging is the process of pulling commits from one branch into another. There are many ways to combine branches, but the goal is always to share information between branches. This makes merging one of the most important features of Git.
The two most common merge methodologies are:

The “fast-forward” merge
The “3-way” merge

They both use the same command, git merge, but the method is automatically determined based on the structure of your history.

In each case, the branch you want to merge into must first be checked out, and the branch being merged in will remain unchanged.

The next two sections present two possible merge scenarios for the following commands:

git checkout master
git merge some-feature

Again, this merges the some-feature branch into the master branch, leaving the former (some-feature) untouched. You’d typically run these commands once you’ve completed a feature and want to integrate it into the stable project.

Fast-forward Merges
This is the simplest type of merge. In the first scenario, we created a branch to develop some new feature, added two commits, and now it’s ready to be integrated into the main code base. Instead of rewriting the two commits missing from master, Git can “fast-forward” the master branch’s pointer to match the location of some-feature.

After the merge, the master branch contains all of the desired history, and the feature branch can be deleted (unless you want to keep developing on it).
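A fast-forward merge can be reproduced with a few empty commits in a scratch repository. This is only a sketch; the commit messages are made up.

```shell
# Throwaway repository; commit messages are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email "demo@example.com"
git commit -q --allow-empty -m "base"

git checkout -q -b some-feature
git commit -q --allow-empty -m "feature part 1"
git commit -q --allow-empty -m "feature part 2"

git checkout -q master
git merge -q some-feature       # master has nothing new, so Git fast-forwards

git rev-parse master some-feature        # both names show the same commit ID
merges=$(git rev-list --count --merges master)
echo "merge commits: $merges"            # no merge commit was created
```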

Of course, we could have made the two commits directly on the master branch; however, using a dedicated feature branch gave us a safe environment to experiment with new code.

If it didn’t turn out quite right, we could have simply deleted the branch (as opposed to resetting/reverting).
Or, if we added a bunch of intermediate commits containing broken code, we could clean it up before merging it into master (see also Rebasing).

As projects get more complicated and acquire more collaborators, this kind of branched development makes Git a fantastic organizational tool.

3-way Merges
But, not all situations are simple enough for a fast-forward merge.

Remember, the main advantage of branches is the ability to explore many independent lines of development simultaneously.

As a result, you’ll often encounter the following scenario: we started out as in a fast-forward merge, but we added a commit to the master branch while we were still developing some-feature.
For example, we could have stopped working on the feature to fix a time-sensitive bug. Of course, the bug-fix should be added to the main repository as soon as possible, so we wind up with two branches that have each moved on from their common starting point.

Merging the feature branch into master in this context results in a “3-way” merge. This is accomplished using the exact same commands as the fast-forward merge from the previous section.

Git can’t fast-forward the master pointer to some-feature without backtracking. Instead, it generates a new merge commit that represents the combined snapshot of both branches. Note that this new commit has two parent commits, giving it access to both histories (indeed, running git log after the 3-way merge shows commits from both branches).

The name of this merge algorithm originates from the internal method used to create the merge commit. Git looks at three commits (the latest snapshot of each of the two branches and their most recent common ancestor) and creates a new snapshot (and commit) that represents the final state of the merge.
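The three commits involved can be made visible in a scratch repository. In the sketch below, file names and messages are invented, and -m is passed to git merge only so that no editor opens for the merge message.

```shell
# Throwaway repository; file names and messages are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email "demo@example.com"
echo "base" > base.txt && git add base.txt && git commit -q -m "base"

git checkout -q -b some-feature
echo "feature" > feature.txt && git add feature.txt && git commit -q -m "add feature"
git checkout -q master
echo "fix" > fix.txt && git add fix.txt && git commit -q -m "urgent bug fix"

git merge -q -m "merge some-feature" some-feature    # histories have diverged

git log -1 --pretty=format:"parents: %p%n"    # the merge commit has two parents
base=$(git merge-base HEAD^1 HEAD^2)          # their most recent common ancestor
git log -1 --pretty=format:"ancestor: %s%n" "$base"
```

HEAD^1 is the previous tip of master, HEAD^2 is the tip of some-feature, and the common ancestor is the commit both branches started from.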

Merge Conflicts
If you try to combine two branches that make different changes to the same portion of code, Git won’t know which version to use. This is called a merge conflict.
Obviously, this can never happen during a fast-forward merge (Why?!).
When Git encounters a merge conflict, you’ll see the following message:

Auto-merging <file>
CONFLICT (content): Merge conflict in <file>
Automatic merge failed; fix conflicts and then commit the result.

Instead of automatically adding the merge commit, Git stops and asks you what to do.
Running git status in this situation will return something like the following:

# On branch master
# Unmerged paths:
#   (use "git reset HEAD ..." to unstage)
#   (use "git add/rm ..." as appropriate to mark resolution)
#
#   both modified: <file>

no changes added to commit (use "git add" and/or "git commit -a")

Every file with a conflict is stored under the “Unmerged paths” section. Git annotates these files to show you the content from both versions:

<<<<<<< HEAD
This content is from the current branch.
=======
This is a conflicting change from another branch.
>>>>>>> some-feature

The part before the ======= is from the master branch, and the rest is from the branch you’re trying to integrate.

To resolve the conflict, get rid of the <<<<<<<, =======, and >>>>>>> notation,
and change the code to whatever you want to keep (the master branch's version, the some-feature branch's version, or a combination of both).
Then, tell Git you’re done resolving the conflict with the git add command:

git add <file>

That’s right; all you have to do is stage the conflicted file to mark it as resolved.
Finally, complete the 3-way merge by generating the merge commit:

git commit

The log message in editor is seeded with a merge notice, along with a “conflicts” list, which can be particularly useful when trying to figure out where something went wrong in a project. And that’s all there is to merging in Git.
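The complete cycle of provoking and resolving a conflict can be sketched in a scratch repository. The file contents below are invented, and --no-edit is used only so that the prepared merge message is accepted without opening an editor.

```shell
# Throwaway repository; both branches change the same line of index.html.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email "demo@example.com"
echo "original line" > index.html
git add index.html && git commit -q -m "base"

git checkout -q -b some-feature
echo "change from some-feature" > index.html && git commit -q -am "feature edit"
git checkout -q master
echo "change from master" > index.html && git commit -q -am "master edit"

git merge some-feature >/dev/null 2>&1 || echo "merge stopped with a conflict"
markers=$(grep -c '^<<<<<<< ' index.html)   # conflict markers are now in the file
echo "conflict blocks: $markers"

echo "change from master and some-feature" > index.html   # resolve by hand
git add index.html          # stage the file to mark the conflict as resolved
git commit -q --no-edit     # conclude the merge
```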

Now that we have an understanding of the mechanics behind Git branches, we can take an in-depth look at how veteran Git users leverage branches in their everyday workflow.

Branching Workflows

The workflows presented in this section are the typical ones and represent the hallmark of Git-based revision control.

The lightweight, easy-to-merge nature of Git’s branch implementation makes branches one of the most productive tools in your software development arsenal.
All branching workflows revolve around the following commands discussed earlier

git branch,
git checkout, and
git merge

Types of Branches
It’s often useful to assign special meaning to different branches for the sake of organizing a project. I introduce you to the most common types of branches, but keep in mind these distinctions are purely superficial: to Git, a branch is a branch regardless of the meaning we associate with it.

All branches can be categorized as either

permanent branches or
topic branches.

The former contain the main history of a project (e.g., master), while the latter are temporary branches used to implement a specific topic, then discarded (e.g., some-feature).

Permanent Branches
Permanent branches are the lifeblood of any repository. They contain every major waypoint of a software project. Most developers use master exclusively for stable code.

In these workflows, you never commit directly on master—master is only an integration branch for completed features that were built in dedicated topic branches.

In addition, many users add a second layer of abstraction in another integration branch (conventionally called develop, though any name will suffice). This frees up the master branch for really stable code (e.g., public commits), and uses develop as an internal integration branch to prepare for a public release.

Topic Branches
Topic branches generally fall into two categories:

feature branches
and hotfix branches.

Feature branches are temporary branches that encapsulate a new feature or refactor, protecting the main project from untested code.
They typically stem from another feature branch or an integration branch, but not the “super stable” (master) branch.

Hotfix branches are similar in nature, but they stem from the public release branch (e.g., master).

Instead of developing new features, they are for quickly patching the main line of development.
Typically, this means bug fixes and other important updates that can’t wait until the next major release.

Again, the meanings assigned to each of these branches are purely conventional —Git sees no difference between master, develop, features, and hotfixes. With that in mind, don’t be afraid to adapt them to your own ends.

The beauty of Git is its flexibility. When you understand the mechanics behind Git branches, it’s easy to design novel workflows that fit your project and personality.

Rebasing

Rebasing is the process of moving a branch to a new base. Git’s rebasing capabilities make branches even more flexible by allowing users to manually organize their branches.

Like merging, git rebase requires the branch to be checked out and takes the new base as an argument:

git checkout some-feature
git rebase master

This moves the entire some-feature branch onto the tip of master.

After the rebase, the feature branch looks like a linear extension of master. If you examine the log of a rebased branch, it appears that all the work happened in series, even when it originally happened in parallel, which is a much cleaner way to integrate changes from one branch to another.
Compare this linear history with a merge of master into some-feature, which results in the exact same code base in the final snapshot.

There is no difference in the end product of the integration, but rebasing makes for a cleaner history.

Note that the snapshot pointed to by the final commit you end up with, whether it’s the last of the rebased commits for a rebase or the final merge commit after a merge, is the same snapshot—it’s only the history that is different.

Rebasing replays changes from one line of work onto another in the order they were introduced, whereas merging takes the endpoints and merges them.

Since the history has diverged, Git has to use an extra merge commit to combine the branches. Doing this many times over the course of developing a long-running feature can result in a very messy history.
These extra merge commits are superfluous—they exist only to pull changes from master into some-feature.

Typically, you’ll want your merge commits to mean something, like the completion of a new feature. This is why many developers choose to pull in changes with git rebase, since it results in a completely linear history in the feature branch.
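The claim that merging and rebasing end in the same snapshot but a different history can be checked in a scratch repository. The following is a minimal sketch, not part of the original material: the branch names merged-copy and rebased-copy, the file names, and the commit helper are hypothetical and exist only for this demo.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"                        # throwaway directory for the demo
git init -q .
git symbolic-ref HEAD refs/heads/master  # make sure the first branch is named master

# Hypothetical helper: commit one new file named after the commit message.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b some-feature
commit feature-work
git checkout -q master
commit master-fix                        # the two branches have now diverged

# Integrate master into two copies of the feature branch, once each way.
git checkout -q -b merged-copy some-feature
git merge -q --no-edit master            # adds a merge commit
git checkout -q -b rebased-copy some-feature
git rebase -q master                     # replays feature-work on top of master

merged_tree=$(git rev-parse 'merged-copy^{tree}')
rebased_tree=$(git rev-parse 'rebased-copy^{tree}')
merges_after_merge=$(git rev-list --merges --count master..merged-copy)
merges_after_rebase=$(git rev-list --merges --count master..rebased-copy)

echo "snapshot after merge:  $merged_tree"
echo "snapshot after rebase: $rebased_tree"
echo "merge commits on merged-copy:  $merges_after_merge"
echo "merge commits on rebased-copy: $merges_after_rebase"
```

The two tree IDs should be identical, while only merged-copy carries a merge commit. Running git log --oneline --graph on each branch shows the difference in shape.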

Interactive Rebasing
Interactive rebasing goes one step further and allows you to change commits as you’re moving them to the new base. You can specify an interactive rebase with the -i flag:

git rebase -i master

This populates a text editor with a summary of each commit in the feature branch, along with a command that determines how it should be transferred to the new base.

For example, if you have two commits on a feature branch, you might specify an interactive rebase like the following:

pick 58dec2a First commit for new feature
squash 6ac8a9f Second commit for new feature

The default pick command moves the first commit to the new base just like the normal git rebase, but then the squash command tells Git to combine the second commit with the previous one, so you wind up with one commit containing all of your changes:

Git provides several interactive rebasing commands, each of which is summarized in the comment section of the configuration listing.

The point is interactive rebasing lets you completely rewrite a branch’s history to your exact specifications. This means you can add as many intermediate commits to a feature branch as you need, then go back and fix them up into meaningful progression after the fact.

Other developers will think you are a brilliant programmer, and knew precisely how to implement the entire feature in one fell swoop.

This kind of organization is very important for ensuring large projects have a navigable history.
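The pick/squash example above can be reproduced without opening an editor by letting a small script edit the todo list. This is only a sketch for experimentation: in normal use you would edit the list by hand, and the todo-editor.sh script, the file names, and the scratch directory are assumptions made for the demo. GIT_SEQUENCE_EDITOR and GIT_EDITOR are the environment variables Git consults for the todo list and for the commit message.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)                        # throwaway directory for the demo
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b some-feature
echo one > feature.txt
git add feature.txt
git commit -q -m "First commit for new feature"
echo two >> feature.txt
git commit -q -am "Second commit for new feature"

# Stand-in for the text editor (hypothetical script): keep the first "pick"
# and turn every later "pick" in the todo list into "squash".
cat > "$work/todo-editor.sh" <<'EOF'
awk '/^pick /{ n++; if (n > 1) sub(/^pick/, "squash") } { print }' "$1" > "$1.new"
mv "$1.new" "$1"
EOF

# GIT_SEQUENCE_EDITOR edits the todo list; GIT_EDITOR=true accepts the
# combined commit message without changes.
GIT_SEQUENCE_EDITOR="sh $work/todo-editor.sh" GIT_EDITOR=true git rebase -q -i master

feature_commits=$(git rev-list --count master..some-feature)
echo "commits on some-feature after the squash: $feature_commits"
git log --format='%h %s' master..some-feature
```

After the rebase, some-feature should hold a single commit whose content includes both changes to feature.txt.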

Rewriting History
Rebasing is a powerful tool, but you must be judicious in your rewriting of history.
Neither kind of rebasing actually moves existing commits; both create brand new ones. If you inspect commits that were subjected to a rebase, you’ll notice that they have different IDs, even though they represent the same content.

This means rebasing destroys existing commits in the process of “moving” them. As you might imagine, this has dramatic consequences for collaborative workflows. Destroying a public commit (e.g., anything on the master branch) is like ripping out the basis of everyone else’s work. Git won’t have any idea how to combine everyone’s changes, and you’ll have a whole lot of apologizing to do.

We’ll take a more in-depth look at this scenario after we learn how to communicate with remote repositories.
For now, just abide by the golden rule of rebasing:

Never rebase a branch that has been pushed to a public repository.

More Interesting Rebases (medium level)
You can also have your rebase replay on something other than the rebase target branch. Take, for example, a history in which one topic branch was started from another topic branch:

You branched a topic branch (server) to add some server-side functionality to your project, and made a commit.
Then, you branched off that to make the client-side changes (client) and committed a few times.
Finally, you went back to your server branch and did a few more commits.

Suppose you decide that you want to merge your client-side changes into your mainline for a release, but you want to hold off on the server-side changes until it’s tested further.
You can take the changes on client that aren’t on server (C8 and C9) and replay them on your master branch by using the --onto option of git rebase:

$ git rebase --onto master server client

This basically says,

“Check out the client branch, figure out the patches from the common ancestor of the client and server branches, and then replay them onto master.”

It’s a bit complex, but the result is pretty cool.

Now you can fast-forward your master branch:

$ git checkout master
$ git merge client

Let’s say you decide to pull in your server branch as well.
You can rebase the server branch onto the master branch without having to check it out first by running git rebase [basebranch] [topicbranch]—which checks out the topic branch (in this case, server) for you and replays it onto the base branch (master):

$ git rebase master server

This replays your server work on top of your master work.

Then, you can fast-forward the base branch (master):

$ git checkout master
$ git merge server

You can remove the client and server branches because all the work is integrated and you don’t need them anymore.

$ git branch -d client
$ git branch -d server
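The whole sequence above can be replayed in a scratch repository. The sketch below makes some assumptions: only C8 and C9 are named in the text, so the other commit labels (C1 to C6, C10) and the one-file-per-commit helper are hypothetical, chosen so that no commit conflicts with another.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"                        # throwaway directory for the demo
git init -q .
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: commit one new file named after the commit label.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit C1; commit C2
git checkout -q -b server; commit C3     # server-side topic branch
git checkout -q -b client; commit C8; commit C9   # branched off server
git checkout -q server;    commit C4; commit C10
git checkout -q master;    commit C5; commit C6

# Replay only the commits that are on client but not on server onto master.
git rebase -q --onto master server client
client_only=$(git rev-list --count master..client)
server_work_on_client=$(git log --format=%s client | grep -c '^C3$' || true)

git checkout -q master
git merge -q --ff-only client            # --ff-only fails unless it is a fast-forward

# Later, bring in the server work as well.
git rebase -q master server
git checkout -q master
git merge -q --ff-only server
git branch -q -d client server

echo "client-only commits replayed: $client_only"
git log --oneline master
```

Using --ff-only here is a design choice for the demo: it makes the script stop if either merge would need a real merge commit, which confirms that the rebases did their job.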

The Perils of Rebasing
The bliss of rebasing isn’t without its drawbacks, which can be summed up in a single line:

Do not rebase commits that exist outside your repository.

If you follow that guideline, you’ll be fine. If you don’t, people will hate you.

When you rebase stuff, you’re abandoning existing commits and creating new ones that are similar but different. If you push commits somewhere and others pull them down and base work on them, and then you rewrite those commits with git rebase and push them up again, your collaborators will have to re-merge their work and things will get messy when you try to pull their work back into yours.

Let’s look at an example of how rebasing work that you’ve made public can cause problems.
Suppose you clone from a central server and then do some work off that.

Now, someone else does more work that includes a merge, and pushes that work to the central server. You fetch it and merge the new remote branch into your work.

Next, the person who pushed the merged work decides to go back and rebase their work instead; they do a

git push --force

to overwrite the history on the server. You then fetch from that server, bringing down the new commits.

Now you’re both in a pickle. If you do a git pull, you’ll create a merge commit which includes both lines of history.

If you run a git log when your history looks like this, you’ll see two commits that have the same author, date, and message, which will be confusing.
Furthermore, if you push this history back up to the server, you’ll reintroduce all those rebased commits to the central server, which can further confuse people. It’s pretty safe to assume that the other developer doesn’t want C4 and C6 to be in the history; that’s why she rebased in the first place.

Rebase When You Rebase
If you do find yourself in a situation like this, Git has some further magic that might help you out.
If someone on your team force pushes changes that overwrite work that you’ve based work on, your challenge is to figure out what is yours and what they’ve rewritten.

It turns out that in addition to the commit SHA checksum, Git also calculates a checksum that is based just on the patch introduced with the commit. This is called a “patch-id.”
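You can see both checksums for yourself with git patch-id, which reads a patch on standard input. A minimal sketch, assuming a scratch repository with hypothetical branch and file names: the commit ID changes when the commit is rebased, while the patch-id stays the same.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"                        # throwaway directory for the demo
git init -q .
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: commit one new file named after the commit message.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b topic
commit topic-change
git checkout -q master
commit master-change

git checkout -q topic
old_commit=$(git rev-parse HEAD)
old_patch=$(git show HEAD | git patch-id --stable | cut -d' ' -f1)

git rebase -q master                     # creates a new commit with the same patch

new_commit=$(git rev-parse HEAD)
new_patch=$(git show HEAD | git patch-id --stable | cut -d' ' -f1)

echo "commit ID before: $old_commit"
echo "commit ID after:  $new_commit"
echo "patch-id before:  $old_patch"
echo "patch-id after:   $new_patch"
```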

If you pull down work that was rewritten and rebase it on top of the new commits from your partner, Git can often successfully figure out what is uniquely yours and apply it back on top of the new branch.

For instance, in the previous scenario, if, instead of doing a merge once the commits you’ve based your work on have been abandoned, we run

git rebase teamone/master

Git will:

Determine what work is unique to our branch (C2, C3, C4, C6, C7)
Determine which are not merge commits (C2, C3, C4)
Determine which have not been rewritten into the target branch (just C2 and C3, since C4 is the same patch as C4')
Apply those commits to the top of teamone/master

So instead of the result seen earlier, we would end up with something more like:

This only works if C4 and C4' that your partner made are almost exactly the same patch. Otherwise the rebase won’t be able to tell that it’s a duplicate and will add another C4-like patch (which will probably fail to apply cleanly, since the changes would already be at least somewhat there).

You can also simplify this by running a git pull --rebase instead of a normal git pull. Or you could do it manually with a git fetch followed by a git rebase teamone/master in this case.

If you are using git pull and want to make --rebase the default, you can set the pull.rebase config value with something like

git config --global pull.rebase true

If you treat rebasing as a way to clean up and work with commits before you push them, and if you only rebase commits that have never been available publicly, then you’ll be fine. If you rebase commits that have already been pushed publicly, and people may have based work on those commits, then you may be in for some frustrating trouble, and the scorn of your teammates.

If you or a partner does find it necessary at some point, make sure everyone knows to run git pull --rebase to try to make the pain after it happens a little bit simpler.
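The recovery described in this section can be rehearsed locally with a bare repository standing in for the server. This is a simplified sketch under stated assumptions: the repository names (central.git, partner, you), the file names, and the helper are hypothetical, and the partner only rewords a commit, so the rewritten commit carries exactly the same patch as the original.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)                        # throwaway directory for the demo
cd "$work"

# Hypothetical helper: commit one new file in the given repository.
commit() { ( cd "$1" && echo "$2" > "$2.txt" && git add "$2.txt" && git commit -q -m "$2" ); }

git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master

# Your partner publishes some work.
git init -q partner
git -C partner symbolic-ref HEAD refs/heads/master
git -C partner remote add origin "$work/central.git"
commit partner base
commit partner shared-work
git -C partner push -q origin master

# You clone it and commit on top of the published history.
git clone -q central.git you
commit you your-work

# The partner rewrites the published commit (same patch, new ID) and force pushes.
git -C partner commit -q --amend -m "shared-work, reworded"
git -C partner push -q --force origin master

# Pull with a rebase instead of a merge.
git -C you pull -q --rebase origin master

total=$(git -C you rev-list --count HEAD)
echo "commits in your history: $total"
git -C you log --format='%h %s'
```

Your history should end up with three commits: base, the reworded commit, and your-work on top. The superseded version of shared-work is left out rather than merged back in.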

Rebase vs. Merge
Now that you’ve seen rebasing and merging in action, you may be wondering which one is better. Before we can answer this, let’s step back a bit and talk about what history means.

One point of view on this is that your repository’s commit history is a record of what actually happened. It’s a historical document, valuable in its own right, and shouldn’t be tampered with. From this angle, changing the commit history is almost blasphemous; you’re lying about what actually transpired. So what if there was a messy series of merge commits? That’s how it happened, and the repository should preserve that for posterity.
The opposing point of view is that the commit history is the story of how your project was made. You wouldn’t publish the first draft of a book, and the manual for how to maintain your software deserves careful editing. This is the camp that uses tools like rebase and filter-branch to tell the story in the way that’s best for future readers.

Now, to the question of whether merging or rebasing is better: hopefully you’ll see that it’s not that simple. Git is a powerful tool, and allows you to do many things to and with your history, but every team and every project is different.
Now that you know how both of these things work, it’s up to you to decide which one is best for your particular situation.

In general the way to get the best of both worlds is to rebase local changes you’ve made but haven’t shared yet before you push them in order to clean up your story, but never rebase anything you’ve pushed somewhere.

Remote Repositories

Simply put, a remote repository is one that is not your own.

It could be on:

a central server,
another developer’s personal computer, or
even your own file system.

As long as you can access it from some kind of network protocol, Git makes it incredibly easy to share contributions with other repositories.

The primary role of remote repositories is to represent other developers within your own repository. Branches, on the other hand, should only deal with project development. That is to say, don’t try to give individual developers their own branch to work on—give them a complete repository and reserve branches for developing features.

You can have several remote repositories, each of which generally is either read-only or read/write for you. Collaborating with others involves managing these remote repositories and pushing and pulling data to and from them when you need to share work.

We begin by covering the mechanics of remotes, and then present the two most common workflows of Git-based collaboration:

the centralized workflow
and the integrator workflow.

Manipulating Remotes

Managing remote repositories includes:

knowing how to add remote repositories,
remove remotes that are no longer valid,
manage various remote branches and
define them as being tracked or not, and more.

Similar to git branch, the git remote command is used to manage connections to other repositories.

Remotes are nothing more than bookmarks to other repositories—instead of typing the full path, they let you reference it with a user-friendly name.

Listing Remotes
You can view your existing remotes by calling the git remote command with no arguments:

git remote

If you have no remotes, this command won’t output any information.

If you used git clone to get your repository, you’ll see an origin remote. That is the default name Git gives to the server you cloned from. By default, Git adds this connection to every cloned repository, under the assumption that you’ll probably want to interact with it down the road.

e.g.

$ git clone git@github.com:EbookFoundation/free-programming-books.git

Initialized empty Git repository in /home/harrykar/Desktop/test/free-programming-books/.git/
remote: Enumerating objects: 4, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 13037 (delta 0), reused 2 (delta 0), pack-reused 13033
Receiving objects: 100% (13037/13037), 4.31 MiB | 93 KiB/s, done.
Resolving deltas: 100% (7968/7968), done.

$ cd free-programming-books/

$ git remote
origin

You can request a little more information about your remotes with the -v flag, which displays the complete path to each repository:

$ git remote -v

origin git@github.com:EbookFoundation/free-programming-books.git (fetch)
origin git@github.com:EbookFoundation/free-programming-books.git (push)

If you have more than one remote, the command lists them all.
For example, a repository with multiple remotes for working with several collaborators might look something like this.

$ git remote -v

bakkdoor https://github.com/bakkdoor/grit (fetch)
bakkdoor https://github.com/bakkdoor/grit (push)
cho45    https://github.com/cho45/grit (fetch)
cho45    https://github.com/cho45/grit (push)
defunkt https://github.com/defunkt/grit (fetch)
defunkt https://github.com/defunkt/grit (push)
koke     git://github.com/koke/grit.git (fetch)
koke     git://github.com/koke/grit.git (push)
origin   git@github.com:mojombo/grit.git (fetch)
origin   git@github.com:mojombo/grit.git (push)

This means we can pull contributions from any of these users pretty easily. We may additionally have permission to push to one or more of these.

Creating (Adding) Remotes
The git remote add command creates a new bookmark (alias) for a remote repository URL.

git remote add <shortname> <path-to-repo>

After running this command, you can reach the Git repository at path-to-repo using shortname. Again, this is simply a convenient bookmark for a long path name—it does not create a direct link into someone else’s repository.

Git accepts many network protocols for specifying the location of a remote repository, including

file://, ssh://, https://, and its custom git:// protocol.

e.g.

git remote add some-user ssh://git@github.com/some-user/some-repo.git

After running this command, you can access the repository at

github.com/some-user/some-repo.git

using only

some-user

Since we used ssh:// as the protocol, you’ll probably be prompted for SSH credentials (a password or a key passphrase) before you’re allowed to do anything with the account. This makes SSH a common choice for granting write access to developers, whereas HTTP paths are often used to give read-only access, although an HTTPS remote can also accept authenticated pushes. Restricting who can write in this way is a security feature for distributed environments.

Deleting and Renaming Remotes
Finally, you can delete a remote connection with the following command:

git remote rm <remote-name>

If you want to rename a remote, use git remote rename. For example, to rename pb to paul:

git remote rename pb paul

What used to be referenced at pb/master is now at paul/master.
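The add, rename, and remove commands can be tried safely in an empty repository, because a remote is only a stored name and URL until you fetch or push. A minimal sketch: the URL is the illustrative one used earlier in this section, and nothing is contacted over the network.

```shell
set -e
cd "$(mktemp -d)"                        # throwaway directory for the demo
git init -q .

# Register a bookmark; the URL is only stored, not contacted.
git remote add pb ssh://git@github.com/some-user/some-repo.git
after_add=$(git remote)
git remote -v

# Rename the bookmark; the stored URL is unchanged.
git remote rename pb paul
after_rename=$(git remote)
stored_url=$(git config --get remote.paul.url)

# Remove the bookmark again.
git remote rm paul
after_rm=$(git remote)

echo "after add:    $after_add"
echo "after rename: $after_rename ($stored_url)"
echo "after rm:     ${after_rm:-<none>}"
```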

Remote Branches

Commits may be the atomic unit of Git-based version control, but branches are the medium in which remote repositories communicate.

Remote branches are references (pointers) to the state of branches in your remote repositories.
In your repository they’re local branches that you can’t move; they’re moved automatically for you whenever you do any network communication.

Remote branches act as bookmarks to remind you where the branches on your remote repositories were the last time you connected to them. They take the form <remote>/<branch>

For instance, if you wanted to see what the master branch on your origin remote looked like as of the last time you communicated with it, you would check the origin/master branch.

If you were working on an issue with a partner and they pushed up an iss53 branch, you might have your own local iss53 branch; but the branch on the server would point to the commit at origin/iss53.

Remote branches act just like the local branches we’ve covered thus far, except they represent a branch in someone else’s repository.

Once you’ve downloaded a remote branch, it is accessible locally under its <remote>/<branch> name. For example, a feature branch fetched from a remote named mary becomes mary/feature. You can merge it into one of your branches, or you can check out a local branch at that point if you want to inspect it or extend it like any other branch. This makes for a very short learning curve if you understand how to use branches locally.

Fetching Remote Branches

The act of downloading branches from another repository is called fetching.

To fetch a remote branch, you can specify the repository and the branch you’re looking for:

git fetch <remote> <branch>

Or, if you want to download every branch in remote, simply omit the branch name.

git fetch <remote>

The command goes out to that remote project and pulls down all the data from that remote project that you don’t have yet.
After you do this, you should have references to all the branches from that remote, which you can merge in or inspect at any time.

If you clone a repository, the command automatically adds that remote repository under the default (short)name “origin.”
So, running

git fetch origin

(or simply git fetch) fetches any new work that has been pushed to that server since you cloned (or last fetched from) it.

It’s important to note that the git fetch command pulls the data to your local repository—it doesn’t automatically merge it with any of your work or modify what you’re currently working on. You have to merge it manually into your work when you’re ready.
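To confirm that fetching leaves your own branch and files alone, you can use a second local repository as the "server". The sketch below is an illustration only: the repository names upstream and mine, the file names, and the helper are hypothetical.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)                        # throwaway directory for the demo
cd "$work"

# Hypothetical helper: commit one new file in the given repository.
commit() { ( cd "$1" && echo "$2" > "$2.txt" && git add "$2.txt" && git commit -q -m "$2" ); }

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
commit upstream first

git clone -q upstream mine
commit upstream second                   # new work appears after the clone

before=$(git -C mine rev-parse master)
git -C mine fetch -q origin
after=$(git -C mine rev-parse master)
remote_tip=$(git -C mine rev-parse origin/master)
upstream_tip=$(git -C upstream rev-parse master)

echo "local master before fetch: $before"
echo "local master after fetch:  $after"
echo "origin/master after fetch: $remote_tip"
```

Only origin/master moves. The local master branch and the working directory stay where they were until you merge or rebase.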

If you have a local branch set up to track a remote branch, you can use the git pull command to automatically fetch and then merge that remote branch into your current branch (the one HEAD points to).

This may be an easier or more comfortable workflow for you; and by default, the git clone command automatically sets up your local master branch to track the remote master branch (or whatever the default branch is called) on the server you cloned from. Running git pull generally fetches data from the server you originally cloned from and automatically tries to merge it into the code you’re currently working on.

After fetching, you can see the downloaded branches by passing the -r option to git branch:

git branch -r

This gives you a branch listing that looks something like:

origin/master
origin/some-feature
origin/another-feature

If you want more verbose output (i.e. the SHA-1 hash and commit message), pass the -v option as well:

git branch -rv

Remote branches are always prefixed with the remote name (e.g. origin/) to distinguish them from local branches.

Remember, Git uses remote repositories as bookmarks—not real-time connections with other repositories.

Remote branches are copies of the local branches of another repository. Outside of the actual fetch, repositories are completely isolated development environments.

This also means Git will never automatically fetch branches to access updated information—you must do this manually. But, this is a good thing, since it means you don’t have to constantly worry about what everyone else is contributing while doing your work. This is only possible due to the non-linear workflow enabled by Git branches.

Tracking Branches
Checking out a local branch from a remote branch automatically creates what is called a “tracking branch” (and the branch it tracks is called an “upstream branch”).
Tracking branches are local branches that have a direct relationship to a remote branch.

If you’re on a tracking branch and type git push, Git automatically knows which server and branch to push to. Also, running git pull while on one of these branches fetches all the remote references and then automatically merges in the corresponding remote branch.

When you clone a repository, it generally automatically creates a master branch that tracks origin/master.
That’s why git push and git pull work out of the box with no other arguments.

However, you can set up other tracking branches if you wish—ones that track branches on other remotes, or don’t track the master branch. The simple case is running:

git checkout -b <branch> <remotename>/<branch>

This is a common enough operation that git provides the --track shorthand:

$ git checkout --track origin/serverfix

Branch serverfix set up to track remote branch serverfix from origin.
Switched to a new branch 'serverfix'

To set up a local branch with a different name than the remote branch, you can easily use the first version with a different local branch name:

$ git checkout -b sf origin/serverfix

Branch sf set up to track remote branch serverfix from origin.
Switched to a new branch 'sf'

Now, your local branch sf will automatically push to and pull from origin/serverfix.
If you already have a local branch and want to set it to track a remote branch you just pulled down, or want to change the upstream branch you’re tracking, you can use the -u or --set-upstream-to option to git branch to explicitly set it at any time.

$ git branch -u origin/serverfix

Branch serverfix set up to track remote branch serverfix from origin.

UPSTREAM SHORTHAND
When you have a tracking branch set up, you can reference its upstream with the @{upstream} or @{u} shorthand. So if you’re on the master branch and it’s tracking origin/master, you can run something like

git merge @{u}

instead of

git merge origin/master
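The three ways of setting up a tracking branch shown above, and the @{u} shorthand, can be checked together. A minimal sketch, assuming a local repository named upstream that plays the role of the server; the hotfix branch and the file name are hypothetical additions for the demo.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)                        # throwaway directory for the demo
cd "$work"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
( cd upstream && echo base > base.txt && git add base.txt && git commit -q -m "base" )
git -C upstream branch serverfix

git clone -q upstream mine
cd mine

git checkout -q --track origin/serverfix            # same name as the remote branch
git checkout -q -b sf origin/serverfix              # different local name
git branch -q --no-track hotfix origin/serverfix    # starts with no upstream...
git branch -q -u origin/serverfix hotfix            # ...which is set afterwards

# <branch>@{u} resolves to whatever the branch is tracking.
up_serverfix=$(git rev-parse --abbrev-ref 'serverfix@{upstream}')
up_sf=$(git rev-parse --abbrev-ref 'sf@{u}')
up_hotfix=$(git rev-parse --abbrev-ref 'hotfix@{u}')
up_master=$(git rev-parse --abbrev-ref 'master@{u}')

echo "serverfix -> $up_serverfix"
echo "sf        -> $up_sf"
echo "hotfix    -> $up_hotfix"
echo "master    -> $up_master"
```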

If you want to see what tracking branches you have set up, you can use

git branch -vv

This lists your local branches with more information, including what each branch is tracking and whether your local branch is ahead, behind or both.

$ git branch -vv

  iss53     7e424c3 [origin/iss53: ahead 2] forgot the brackets
  master    1ae2a45 [origin/master] deploying index fix
* serverfix f8674d9 [teamone/server-fix-good: ahead 3, behind 1] this should do it
  testing   5ea463a trying something new

So here we can see that

our iss53 branch is tracking origin/iss53 and is “ahead” by two, meaning that we have two commits locally that are not pushed to the server.
We can also see that our master branch is tracking origin/master and is up to date.
Next we can see that our serverfix branch is tracking the server-fix-good branch on our teamone server and is ahead by three and behind by one, meaning that there is one commit on the server we haven’t merged in yet and three commits locally that we haven’t pushed.
Finally we can see that our testing branch is not tracking any remote branch.

It’s important to note that these numbers are only accurate as of the last time you fetched from each server. This command does not reach out to the servers; it tells you what it has cached from them locally.
If you want totally up to date ahead and behind numbers, you’ll need to fetch from all your remotes right before running this. You could do that like this:

$ git fetch --all; git branch -vv
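The effect of a stale cache on the ahead and behind numbers can be demonstrated with two local repositories. In this sketch the counts are computed with git rev-list --count, which is one way to obtain the same numbers that git branch -vv displays; the repository names, file names, and helper are hypothetical.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)                        # throwaway directory for the demo
cd "$work"

# Hypothetical helper: commit one new file in the given repository.
commit() { ( cd "$1" && echo "$2" > "$2.txt" && git add "$2.txt" && git commit -q -m "$2" ); }

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
commit upstream base
git clone -q upstream mine

commit mine local-1                      # two commits that were never pushed
commit mine local-2
commit upstream remote-1                 # one commit we have not fetched yet

# Before fetching, the cached origin/master knows nothing about remote-1.
behind_cached=$(git -C mine rev-list --count master..origin/master)

git -C mine fetch -q --all
ahead=$(git -C mine rev-list --count origin/master..master)
behind=$(git -C mine rev-list --count master..origin/master)

echo "behind (stale cache): $behind_cached"
echo "ahead:  $ahead"
echo "behind: $behind"
git -C mine branch -vv
```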

Inspecting Remote Branches

For all intents and purposes, remote branches behave like read-only branches. You can safely inspect their history and view their commits via git checkout, but you cannot continue developing them before integrating them into your local repository. This makes sense when you consider the fact that remote branches are copies of other users’ commits.

If you want to see more information about a particular remote, you can use the git remote show <remote-name> command.

$ git remote show origin

* remote origin
Fetch URL: git@github.com:EbookFoundation/free-programming-books.git
Push URL: git@github.com:EbookFoundation/free-programming-books.git
HEAD branch: master
Remote branches:
    close-2491                 tracked
    deprecate-javascript-pages tracked
    eshellman-patch-1          tracked
    how-to                     tracked
    master                     tracked
    ...
    revert-3021-master         tracked
    standardize-wikibooks      tracked
Local branch configured for 'git pull':
    master merges with remote master
Local ref configured for 'git push':
    master pushes to master (up to date)

It lists the URL for the remote repository as well as the tracking branch information. The command helpfully tells you that if you’re on the master branch and you run git pull, it will automatically merge in the master branch on the remote after it fetches all the remote references. It also lists all the remote references it has pulled down.

Let’s consider another, more detailed output of the same command. Here it shows which branch is automatically pushed to when you run git push while on certain branches. It also shows you which remote branches on the server you don’t yet have, which remote branches you have that have been removed from the server, and the local branches that are automatically merged with their remote counterparts when you run git pull.

....

Remote branches:
    master                           tracked
    dev-branch                       tracked
    markdown-strip                   tracked
    issue-43                         new (next fetch will store in remotes/origin)
    issue-45                         new (next fetch will store in remotes/origin)
    refs/remotes/origin/issue-11     stale (use 'git remote prune' to remove)
Local branches configured for 'git pull':
    dev-branch merges with remote dev-branch
    master     merges with remote master
Local refs configured for 'git push':
    dev-branch     pushes to dev-branch     (up to date)
    markdown-strip pushes to markdown-strip (up to date)
    master         pushes to master         (up to date)

The .. syntax is very useful for filtering log history.

For example, the following command displays any new updates from origin/master that are not in your local master branch. It’s generally a good idea to run this before merging changes so you know exactly what you’re integrating:

git log master..origin/master

If this outputs any commits, it means you are behind the official project and you should probably update your repository.

This is described in the next section.

It is possible to check out remote branches, but doing so will put you in a detached HEAD state. This is safe for viewing other users’ changes before integrating them, but any changes you add will be lost unless you create a new local branch tip to reference them.

Merging/Rebasing

Of course, the whole point of fetching is to integrate the resulting remote branches into your local project.

In Git, there are two main ways to integrate changes from one branch into another:

the merge and
the rebase

Let’s say you’re a contributor to an open-source project, and you’ve been working on a feature called some-feature.
As the “official” project (typically pointed to by origin) moves forward alongside your local some-feature branch, you may want to incorporate its new commits into your repository.
This would ensure that your feature still works with the bleeding-edge developments.
Fortunately, you can use the exact same git merge command to incorporate changes from origin/master into your feature branch:

git checkout some-feature
git fetch origin
git merge origin/master

Since your history has diverged, this results in a 3-way merge, after which your some-feature branch has access to the most up-to-date version of the official project.

However, frequently merging with origin/master just to pull in updates eventually results in a history littered with meaningless merge commits.
Depending on how closely your feature needs to track the rest of the code base, rebasing might be a better way to integrate changes:

git checkout some-feature
git fetch origin
git rebase origin/master

As with local rebasing, this creates a perfectly linear history free of superfluous merge commits:

Rebasing/merging remote branches has the exact same trade-offs as discussed earlier in the section on local branches.

Often, you’ll do this to make sure your commits apply cleanly on a remote branch – perhaps in a project to which you’re trying to contribute but that you don’t maintain.
In this case, you’d do your work in a branch and then rebase your work onto origin/master when you were ready to submit your patches to the main project. That way, the maintainer doesn’t have to do any integration work—just a fast-forward or a clean apply.

Pulling
While the git fetch command will fetch down all the changes on the server that you don’t have yet, it will not modify your working directory at all. It will simply get the data for you and let you merge it yourself.
However, there is a command called git pull which is essentially a git fetch immediately followed by a git merge in most cases.

Since the fetch/merge sequence is such a common occurrence in distributed development, Git provides a pull command as a convenient shortcut:

git pull origin master

This fetches the origin’s master branch, and then merges it into the current branch in one step.
In other words if you have a tracking branch set up, either by explicitly setting it or by having it created for you by the clone or checkout commands, git pull will look up what server and branch your current branch is tracking, fetch from that server and then try to merge in that remote branch.

Generally it’s better to simply use the fetch and merge commands explicitly as the magic of git pull can often be confusing.

You can also pass the --rebase option to use git rebase instead of git merge.

git pull --rebase origin master

Pushing

To complement the git fetch command, Git also provides a push command.
When you have your project at a point that you want to share, you have to push it upstream.
Pushing is almost the opposite of fetching, in that fetching imports branches from another repository, while pushing exports branches to another repository.

git push <remote> <branch>

The above command sends the local branch to the specified remote repository. However, whereas fetching creates a remote branch in your repository, git push creates a local branch in the destination repository.

For example, executing git push mary my-feature in your local repository creates a my-feature branch in Mary’s repository, attached to her existing history (in this example, at the commit her master points to). Your own repository is unaffected by the push.

Notice that, after the push, my-feature is a local branch in Mary’s repository, whereas it would be a remote branch had she fetched it herself.

This makes pushing a dangerous operation. Imagine you’re developing in your own local repository, when, all of a sudden, a new local branch shows up out of nowhere.
But, repositories are supposed to serve as completely isolated development environments, so why should git push even exist? As we’ll discover shortly, pushing is a necessary tool for maintaining public Git repositories.

If you want to push your master branch to your origin server (again, cloning generally sets up both of those names for you automatically), then you can run this to push any commits you’ve done back up to the server:

$ git push origin master

This command works only if you cloned from a server to which you have write access and if nobody has pushed in the meantime.

If you and someone else clone at the same time and they push upstream and then you push upstream, your push will rightly be rejected. You’ll have to pull down their work first and incorporate it into yours before you’ll be allowed to push.
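The rejected push described above is easy to reproduce with a bare repository and two clones. This is a sketch under assumptions: the names central.git, alice, and bob, the file names, and the helper are hypothetical, and the explicit fetch followed by merge is used here in place of git pull.

```shell
set -e
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)                        # throwaway directory for the demo
cd "$work"

# Hypothetical helper: commit one new file in the given repository.
commit() { ( cd "$1" && echo "$2" > "$2.txt" && git add "$2.txt" && git commit -q -m "$2" ); }

git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master

git init -q alice
git -C alice symbolic-ref HEAD refs/heads/master
git -C alice remote add origin "$work/central.git"
commit alice base
git -C alice push -q origin master
git clone -q central.git bob

# Both developers commit on top of the same starting point.
commit alice alice-work
commit bob bob-work
git -C alice push -q origin master       # the first push wins

if git -C bob push -q origin master 2>/dev/null; then
  first_try=accepted
else
  first_try=rejected                     # Bob's history lacks Alice's commit
fi

# Bob incorporates Alice's work, then pushes again.
git -C bob fetch -q origin
git -C bob merge -q --no-edit origin/master
git -C bob push -q origin master

echo "first push attempt: $first_try"
git -C central.git log --oneline --graph master
```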

Deleting Remote Branches
Suppose you’re done with a remote branch—say you and your collaborators are finished with a feature and have merged it into your remote’s master branch (or whatever branch your stable codeline is in).
You can delete a remote branch using the --delete option to git push. If you want to delete your serverfix branch from the server, you run the following:

$ git push origin --delete serverfix

To https://github.com/schacon/simplegit
- [deleted] serverfix

Basically all this does is remove the pointer from the server. The Git server will generally keep the data there for a while until a garbage collection runs, so if it was accidentally deleted, it’s often easy to recover.
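
As a rough, self-contained check of the above, the following sketch publishes a serverfix branch to a local bare repository standing in for the server, deletes it, and counts the matching server-side branches before and after. The names server.git and dev are assumptions for illustration.

```shell
set -e
work=$(mktemp -d)   # scratch directory; all paths here are hypothetical
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A bare "server" repository seeded with one commit, and a working clone.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed server.git
git clone -q server.git dev
cd dev

# Publish a serverfix branch.
git checkout -q -b serverfix
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix the server"
git push -q origin serverfix

before=$(git ls-remote --heads origin serverfix | wc -l | tr -d ' ')
git push -q origin --delete serverfix
after=$(git ls-remote --heads origin serverfix | wc -l | tr -d ' ')

# Only the server-side pointer is removed; the local branch is still there.
local_copy=$(git for-each-ref --format='%(refname:short)' refs/heads/serverfix)
echo "on server before: $before, after: $after, local: $local_copy"
```

Because the local branch survives, an accidental deletion can usually be undone simply by pushing the branch again.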

Remote Workflows

Now that we have a basic idea of how Git interacts with other repositories, we can discuss the real-world workflows that are supported by these commands.
The two most common collaboration models are:

the centralized workflow and
the integrator workflow

SVN and CVS users should be quite comfortable with Git’s flavor of centralized development, but using Git means you’ll also get to leverage its highly-efficient merge capabilities.

The integrator workflow is a typical distributed collaboration model and is not possible in purely centralized systems.

As you read through these workflows, keep in mind that Git treats all repositories as equals.

There is no “master” repository according to Git as there is with SVN or CVS. The “official” code base is merely a project convention —the only reason it’s the official repository is because that’s where everyone’s origin remote points.

Public (Bare) Repositories
Every collaboration model involves at least one public repository that serves as an entry point for multiple developers.

Public repositories have the unique constraint of being bare—they must not have a working directory.
This prevents developers from accidentally overwriting each others’ work with git push.

You can create a bare repository by passing the --bare option to git init:

git init --bare <path>

Public repositories should only function as storage facilities—not development environments. This is conveyed by adding a .git extension to the repository’s file path, since the internal repository database resides in the project root instead of the .git subdirectory.

So, a complete example might look like:

git init --bare some-repo.git

Aside from a lack of a working directory, there is nothing special about a bare repository.
You can add remote connections, push to it, and pull from it in the usual fashion.
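
The following sketch puts these points together on a scratch directory: it creates a bare repository, confirms that Git reports it as bare and that its database sits in the project root, and then clones from it and pushes to it. The clone name dev and the demo identity are assumptions for illustration.

```shell
set -e
work=$(mktemp -d)   # scratch directory; all paths here are hypothetical
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare some-repo.git
is_bare=$(git --git-dir=some-repo.git rev-parse --is-bare-repository)

# The repository database lives in the project root, not in a .git subdirectory.
test -f some-repo.git/HEAD
test ! -e some-repo.git/.git

# A bare repository can still be cloned from and pushed to.
git clone -q some-repo.git dev 2>/dev/null   # hides the "empty repository" warning
cd dev
git symbolic-ref HEAD refs/heads/master
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"
git push -q origin master

published=$(git --git-dir="$work/some-repo.git" log --format=%s master)
echo "bare: $is_bare, published: $published"
```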

The Centralized Workflow

The centralized workflow is best suited to small teams where each developer has write access to the repository.

It allows collaboration by using a single central repository, much like the SVN or CVS workflow.

In this model, all changes must be shared through the central repository, which is usually stored on a server to enable Internet-based collaboration.

Individual developers work in their own local repository, which is completely isolated from everyone else’s.
Once they’ve completed a feature and are ready to share their code, they

clean it up,
integrate it into their local master, and
push it to the central repository (e.g., origin).

This also means all developers need SSH access to the central repository.
Then, everyone else can:

fetch the new commits and
incorporate them (through a merge or a rebase) into their local projects.

Again, this can be done with either a merge or a rebase, depending on your team’s conventions.

This is the core process behind centralized workflows, but it hits a bump when multiple users try to simultaneously update the central repository.
Imagine a scenario where two developers finished a feature, merged it into their local master, and tried to publish it at the same time (or close to it).
Whoever gets to the server first can push his or her commits as normal, but then the second developer gets stuck with a divergent history, and Git cannot perform a fast-forward merge.

For example, if a developer named John were to push his changes right before Mary, Mary’s local master would diverge from the origin’s master.

The only way to make the origin’s master (updated by John) match Mary’s master is to overwrite John’s commit. Obviously, this would be very bad, so Git aborts the push and outputs an error message:

! [rejected] master -> master (non-fast-forward)
error: failed to push some refs to 'some-repo.git'

To remedy this situation, Mary needs to synchronize with the central repository. Then, she’ll be able to push her changes in the usual fashion.

git fetch origin master
git rebase origin/master
git push origin master
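
The whole John-and-Mary scenario can be reproduced locally. The sketch below is only an illustration under stated assumptions: the clone directories john and mary, the commit messages and the demo identity are made up, and a local bare some-repo.git plays the central repository.

```shell
set -e
work=$(mktemp -d)   # scratch directory; all paths here are hypothetical
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Central bare repository with one commit, cloned by both developers.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed some-repo.git
git clone -q some-repo.git john
git clone -q some-repo.git mary

# John publishes first.
echo "john" > john/john.txt
git -C john add john.txt
git -C john commit -q -m "Feature by John"
git -C john push -q origin master

# Mary commits on her now-stale master; her push is refused.
echo "mary" > mary/mary.txt
git -C mary add mary.txt
git -C mary commit -q -m "Feature by Mary"
if git -C mary push -q origin master 2>/dev/null; then
    first_try=accepted
else
    first_try=rejected
fi

# Mary synchronizes with the central repository, then pushes again.
cd mary
git fetch -q origin master
git rebase -q origin/master
git push -q origin master

central_log=$(git --git-dir="$work/some-repo.git" log --format=%s master)
echo "first try: $first_try"
echo "$central_log"
```

After the rebase, Mary's commit sits on top of John's, so her second push is a fast-forward and the central history stays linear.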

Other than that, the centralized workflow is relatively straightforward. Individual developers stay in their own local repository, periodically pulling/pushing to the central repository to keep everything up-to-date.

It’s a convenient workflow to set up, as only one server is required, and it leverages existing SSH functionality.

The Integrator Workflow

The integrator workflow is a distributed development model where individual users maintain a public repository, in addition to their private one. It exists as a solution to the security and scalability problems inherent in the centralized workflow.

The main drawback of the centralized workflow is that every developer needs push access to the entire project. This is fine if you’re working with a small team of trusted developers, but imagine a scenario where you’re working on an open-source software project and a stranger found a bug, fixed it, and wants to incorporate the update into the main project.
You probably don’t want to give him push access to the central repository, since he could start pushing all sorts of random commits, and you would effectively lose control of the project.

But, what you can do is tell the contributor to push the changes to his own public repository. Then, you can pull his bug fix into your private repository to ensure it doesn’t contain any undesired code.
If you approve his contributions, all you have to do is merge them into a local branch and push it to the main repository as usual. You’ve become an integrator, in addition to an ordinary developer.

In this workflow, individual developers only need push access to their own public repositories. Contributors use SSH to push to their public repositories, but the integrator can fetch the changes over HTTP (which can be served read-only).

This makes for a more secure environment for everyone, even when you add more collaborators.

Note that the team must still agree on a single “official” repository to pull from—otherwise changes would be applied out-of-order and everyone would wind up out-of-sync very quickly.

As an integrator, you have to keep track of more remotes than you would in the centralized workflow, but this gives you the freedom and security to incorporate changes from any developer without threatening the stability of the project.
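
A minimal sketch of the integrator's side of this workflow follows. It uses local scratch repositories in place of real servers, and the names official.git, contributor-public.git, contributor, integrator and bugfix are all assumptions for illustration.

```shell
set -e
work=$(mktemp -d)   # scratch directory; all paths here are hypothetical
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The "official" bare repository and the integrator's private clone.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed official.git
git clone -q official.git integrator

# The contributor's own public (bare) repository and private clone.
git clone -q --bare official.git contributor-public.git
git clone -q contributor-public.git contributor
cd contributor
git checkout -q -b bugfix
echo "fixed" > fix.txt
git add fix.txt
git commit -q -m "Fix bug"
git push -q origin bugfix   # "origin" is the contributor's public repo, not official.git
cd "$work"

# The integrator fetches the fix, reviews it, merges it, and publishes it.
cd integrator
git remote add contributor "$work/contributor-public.git"
git fetch -q contributor
incoming=$(git log --format=%s master..contributor/bugfix)   # commits under review
git merge -q --no-edit contributor/bugfix
git push -q origin master
cd "$work"

official_head=$(git --git-dir=official.git log -1 --format=%s master)
echo "reviewed: $incoming, official master now at: $official_head"
```

The contributor never pushes to official.git. Only the integrator does, and only after inspecting the incoming commits.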

In addition, the integrator workflow has no single point-of-access to serve as a choke point for collaboration. In centralized workflows, everyone must be completely up-to-date before publishing changes, but that is not the case in distributed workflows. Again, this is a direct result of the nonlinear development style enabled by Git’s branch implementation.

These are huge advantages for large open-source projects.

Organizing hundreds/thousands of developers to work on a single project would not be possible without the security and scalability of distributed collaboration.

Conclusion

Supporting these centralized and distributed collaboration models was all Git was ever meant to do.
The working directory, the staging area, commits, branches, and remotes were all specifically designed to enable these workflows, and virtually everything in Git revolves around these components.

True to the UNIX philosophy, Git was designed as a suite of interoperable tools, not a single monolithic program. As you continue to explore Git’s numerous capabilities, you’ll find that it’s very easy to adapt individual commands to entirely novel workflows.
Now it is up to you to apply these concepts to real-world projects, but as you begin to incorporate Git into your daily workflow, remember that

Git is not a silver bullet for project management. It is merely a tool for tracking your files, and no amount of intimate Git knowledge can make up for a haphazard set of conventions within a development team.

Resources

Git's Home (Pro Git by Chacon-Straub)
Git Succinctly by Ryan Hodson
Git Tutorial by Dr. Chris Bourke


Saturday, October 20, 2018

Basic versioning: A guide to Git DVCS via the Command Line Interface (CLI)