June 10, 2020

Version Control with Git

by
Jon Loeliger

1. Introduction

2. Installing Git

3. Getting Started

The Git Command Line

Initially, Git was provided as a suite of many simple commands, such as git-commit.
Now there is a single git executable, to which you append a subcommand.
The two forms, git commit and git-commit, refer to the same command, although recent Git installations no longer put the dashed form on the default PATH.

List most common subcommands


$ git help

List all subcommands


$ git help --all

Quick Introduction to Using Git

Creating an Initial Repository


$ mkdir public_html
$ cd public_html
$ echo 'My website is alive!' > index.html
$ git init
Initialized empty Git repository in /home/jerry/test/public_html/.git/

The git init command creates a hidden directory, called .git, at the top level of your project. Git places all its revision information in this one top-level .git directory.
Initially, each Git repository is empty.

Adding a File to Your Repository


$ git add index.html
$ git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   index.html

Git has merely staged the file; it is not yet a permanent part of the repository. Staging lets you batch several changes into one commit: the next commit will include everything staged so far.


$ git commit -m "Initial contents of public_html"

You can also write the commit message in a text editor: omit -m and Git opens one. To choose which editor is used, set the GIT_EDITOR environment variable, with export in bash or setenv in tcsh.

Configuring the Commit Author


$ git config user.name "Jon Loeliger"
$ git config user.email "jdl@example.com"

Viewing Your Commits

The command git log yields a sequential history of the individual commits within the repository:


$ git log
commit 51cdbfac97d1037ef8926693f1b09a6b85191273 (HEAD -> master)
Author: XXX<xxx@yyy>
Date:   Sat Jun 27 09:59:29 2020 +0800

    Initial contents of public_html

To see more detail about a particular commit, use git show with a commit ID:


$ git show 51cdbfac97d1037ef8926693f1b09a6b85191273
commit 51cdbfac97d1037ef8926693f1b09a6b85191273 (HEAD -> master)
Author: XXX<xxx@yyy>
Date:   Sat Jun 27 09:59:29 2020 +0800

    Initial contents of public_html

diff --git a/index.html b/index.html
new file mode 100644
index 0000000..3e23ae4
--- /dev/null
+++ b/index.html
@@ -0,0 +1,2 @@
+
+hello

If you run git show without an explicit commit ID, it simply shows the details of the most recent commit. The command git show-branch --more=10 provides concise one-line summaries for the current development branch.

Viewing Commit Differences


$ git log
commit 1cfc8de547a1d4fb5eb411ec8c43dac372df183c (HEAD -> master)
...
commit 51cdbfac97d1037ef8926693f1b09a6b85191273

$ git diff 1cfc8de547a1d4fb5eb411ec8c43dac372df183c 51cdbfac97d1037ef8926693f1b09a6b85191273
diff --git a/index.html b/index.html
index f61eb65..3e23ae4 100644
--- a/index.html
+++ b/index.html
@@ -1,2 +1,2 @@
 
-hello world!
+hello

Removing and Renaming Files in Your Repository

As with an addition, a deletion requires two steps: git rm expresses your intent to remove the file and stages the change, and then git commit realizes the change in the repository. Renaming a file works the same way: git mv, then git commit.

Making a Copy of Your Repository

You can create a complete copy, or clone, of a repository using the git clone command.

Configuration Files

Git supports a hierarchy of configuration files:

.git/config

Repository-specific settings, manipulated with the --file option or by default.


[core]
	repositoryformatversion = 0
	filemode = true
	bare = false
	logallrefupdates = true

~/.gitconfig

User-specific settings, manipulated with the --global option.


[user]
	email = xxx@yyy.com
	name = Bruce Lee
[core]
	editor = vi
[color]
	ui = auto
[http]
	postBuffer = 2428000

git config -l can be used to list the settings of all the variables in configuration files:


$ git config -l
user.email=xxx@yyy.com
user.name=Bruce Lee
core.editor=vi
color.ui=auto
http.postbuffer=2428000
core.repositoryformatversion=0
core.filemode=true
core.bare=false
core.logallrefupdates=true

4. Basic Git Concepts

Basic Concepts

Repositories

A Git repository is simply a database containing all the information needed to retain and manage the revisions and history of a project. Configuration settings are not propagated from one repository to another during a clone, or duplicating, operation. Git maintains two primary data structures, the object store and the index. All of this repository data is stored at the root of your working directory in a hidden subdirectory named .git.

object store

This contains your original data files and all the log messages, author information, dates, and other information required to rebuild any version or branch of the project. There are only four types of objects in the object store:

Blobs
Trees
Commits
Tags

Index

The index captures a version of the project’s overall structure at some moment in time.

Content-Addressable Names

Each object in the object store has a unique name produced by an SHA1 hash value of the content. Git users speak of SHA1, hash code, and sometimes object ID interchangeably.

Git Tracks Content

Git’s object store is based on the hashed computation of the contents of its objects. If two separate files located in two different directories have exactly the same content, Git stores a single copy of that content as a blob within the object store.
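A minimal sketch of this behaviour, assuming a throwaway repository created with mktemp; the directory and file names (a/one.txt, b/two.txt) are made up for illustration. It stages two files with identical content and checks how many distinct blob IDs the index records:

```shell
# Throwaway repository; the paths a/ and b/ are hypothetical.
repo=$(mktemp -d) && cd "$repo" && git init -q
mkdir a b
echo "hello world" > a/one.txt
echo "hello world" > b/two.txt
git add a/one.txt b/two.txt

# Each index line is: mode, blob ID, stage number, path.
git ls-files -s

# Count the distinct blob IDs recorded in the index (second column).
distinct_blobs=$(git ls-files -s | awk '{print $2}' | sort -u | wc -l)
echo "distinct blobs: $distinct_blobs"
```

Both paths should list the same blob ID, so the content is stored only once even though it appears under two names.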

Pathname Versus Content

Git does not identify content by filename. Instead, it makes sure it can accurately reproduce the content of files and directories, which is indexed by hash value.

Object Store Pictures

The blob object is at the “bottom” of the data structure; it is referenced only by tree objects.
Tree objects point to blobs, and possibly to other trees as well. Any given tree object might be pointed at by many different commit objects.
A commit points to one particular tree.
Each tag can point to at most one commit.

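Consider a repository drawn with each tree as a triangle and each commit as a circle. The original figures, not reproduced here, show the object store after a single initial commit that added two files, together with a branch and a tag, and then after a second commit that adds a new subdirectory with one file in it.

Each new commit adds only the objects that changed; unchanged blobs and trees are shared with the previous commit.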

Git Concepts at Work

Inside the .git directory

Initialize an empty repository


$ mkdir hello
$ cd hello
$ git init
Initialized empty Git repository in /home/jerry/test/git/hello/.git/

$ find .git/objects
.git/objects
.git/objects/pack
.git/objects/info

Create a simple object and stage it:


$ echo "hello world" > hello.txt
$ git add hello.txt

Your objects directory should now contain two additional entries, a directory and the object file inside it:


.git/objects/3b
.git/objects/3b/18e512dba79e4c8300dd08aeb37f8e728b8dad

Objects, Hashes, and Blobs

At the core of Git is a simple key-value data store.
You can insert any kind of content into it, and it will give you back a key that you can use to retrieve the content again at any time.
The hash in the above case is 3b18e512dba79e4c8300dd08aeb37f8e728b8dad.
Git inserts a / after the first two hex digits to improve filesystem efficiency: this is an easy way to create a fixed, 256-way partitioning of the namespace for all possible objects with an even distribution. The git cat-file command can provide the content, or the type and size, of repository objects.


$ git cat-file -p 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
hello world
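The sketch below ties these pieces together in a throwaway repository (created with mktemp, an assumption of this example). It asks git hash-object for the ID of hello.txt, locates the object file under .git/objects, and queries its type and size with git cat-file:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
echo "hello world" > hello.txt
git add hello.txt

# hash-object computes the ID Git assigns to this content.
id=$(git hash-object hello.txt)
echo "$id"

# The object lives under objects/<first 2 hex digits>/<remaining 38>.
dir=$(echo "$id" | cut -c1-2)
file=$(echo "$id" | cut -c3-)
ls ".git/objects/$dir/$file"

# cat-file reports the type, the size in bytes, and the content.
obj_type=$(git cat-file -t "$id")
obj_size=$(git cat-file -s "$id")
echo "$obj_type $obj_size"
git cat-file -p "$id"
```

The size counts the trailing newline that echo writes, so "hello world" is 12 bytes rather than 11.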

Files and Trees

Git stores content in a manner similar to a UNIX filesystem.
All the content is stored as tree and blob objects, with trees corresponding to UNIX directory entries and blobs corresponding more or less to inodes or file contents.
A single tree object contains one or more tree entries, each of which contains a SHA-1 pointer to a blob or subtree with its associated mode, type, and filename.
For example, the most recent tree in a project may look something like this:


$ git cat-file -p master^{tree}
100644 blob a906cb2a4a904a152e80877d4088654daad0c859      README
100644 blob 8f94139338f9404f26296befa88755fc2598c289      Rakefile
040000 tree 99f1a6d12cb4b6f19c8655fca46c3ecf317074e0      lib

$ git cat-file -p 99f1a6d12cb4b6f19c8655fca46c3ecf317074e0
100644 blob 47c6340d6459e05787f644c2447d2595f5d3a54b      simplegit.rb

The Git index is where you place files that you want committed to the repository.
Before you commit (check in) files, you first need to place them in the index.
git add adds files to the index. The index is a binary file, usually stored at .git/index, that contains a sorted list of path names, each with its permissions and the SHA-1 of a blob object. The git ls-files command displays the contents of the index.


$ git ls-files -s
100644 3b18e512dba79e4c8300dd08aeb37f8e728b8dad  0  hello.txt

-s, --stage: show the staged contents' mode bits (the octal representation of the file permissions), object name, stage number, and file name.
Each time you run commands such as git add, git rm, or git mv, Git updates the index with the new pathname and blob information.
Whenever you want, you can use the git write-tree command to write the staging area out to a tree object.
In real life, you can (and should!) skip the low-level git write-tree and git commit-tree steps and just use the git commit command.
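For illustration only, here is a sketch of what those two low-level steps look like when run by hand in a throwaway repository. The identity values and the commit message are placeholders, supplied through Git's standard environment variables:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
# Hypothetical identity, supplied through environment variables.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

echo "hello world" > hello.txt
git add hello.txt

# Write the current index out as a tree object and inspect it.
tree=$(git write-tree)
git cat-file -p "$tree"

# Wrap that tree in a commit object (no parent given: a root commit).
commit=$(echo "Commit a file that says hello" | git commit-tree "$tree")
git cat-file -t "$commit"
```

Note that git commit-tree only creates the commit object; it does not move any branch, which is one more reason to use git commit in day-to-day work.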

Commits

The format for a commit object is simple:

the top-level tree for the snapshot of the project at that point
the parent commits, if any
the author/committer information (which uses your user.name and user.email configuration settings and a timestamp)
a blank line, and then the commit message
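To see these fields in a real object, the following sketch makes two commits in a throwaway repository and prints the second one raw. The identity and file names are placeholders; because the second commit has a predecessor, its object also carries a parent line:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "hello world" > hello.txt
git add hello.txt
git commit -q -m "Initial commit"
echo "hello again" >> hello.txt
git commit -q -a -m "Second commit"

# Print the raw commit object: tree, parent, author, committer,
# a blank line, then the message.
git cat-file -p HEAD
```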

5. File Management and the Index

A commit is a two-step process: stage your changes, then commit them.
The index is a layer between the working directory and the repository in which changes are staged, or collected.
When you run git commit, Git checks the index rather than your working directory to discover what to commit.
You can query the state of the index at any time with git status.
git diff displays the changes that remain in your working directory and are not staged.
git diff --cached shows changes that are staged and will therefore contribute to your next commit.
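A small sketch of the difference between the two commands, using a throwaway repository and a hypothetical notes.txt. One change is staged and a later change is left unstaged:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "one" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "two" >> notes.txt      # change 1
git add notes.txt            # stage change 1
echo "three" >> notes.txt    # change 2 stays unstaged

# Staged for the next commit: only the "two" line is added.
staged=$(git diff --cached)
# Still only in the working directory: only the "three" line is added.
unstaged=$(git diff)
echo "$staged"
echo "$unstaged"
```

Running git commit at this point would record the "two" line only; the "three" line would remain as an unstaged modification.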

File Classifications in Git

Git classifies your files into three groups:

Tracked: any file already in the repository or staged in the index
Ignored
Untracked

Using git add

The command git add stages a file.
In terms of Git’s file classifications, if a file is untracked, git add converts that file’s status to tracked. When git add is used on a directory name, all of the files and subdirectories beneath it are staged recursively.
The entirety of each file, at the moment you issue git add, is copied into the object store and indexed by its resulting SHA1 name.
Staging a file is also called “caching a file” or “putting a file in the index.”

Some Notes on Using git commit

Using git commit --all

The -a or --all option to git commit causes it to automatically stage all unstaged, tracked file changes before it performs the commit.
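The sketch below, in a throwaway repository with hypothetical file names, shows that -a picks up modifications to tracked files but leaves untracked files alone:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "v1" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked.txt"

echo "v2" > tracked.txt      # modify a tracked file, do not stage it
echo "new" > untracked.txt   # create a file Git has never seen

git commit -q -a -m "Update tracked.txt"

# tracked.txt was committed; untracked.txt is still untracked ("??").
git status --short
```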

Using git rm

Any versions of the file that are part of history already committed in the repository remain in the object store and retain that history.
Git will remove a file only from the index or from the index and working directory simultaneously.
Git will not remove a file from just the working directory; the regular rm command may be used for that purpose.
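As a sketch of the index-only case, the commands below use git rm --cached in a throwaway repository (keep.txt is a hypothetical file name). The file leaves the index but stays in the working directory:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "data" > keep.txt
git add keep.txt
git commit -q -m "Add keep.txt"

# Remove the file from the index only; the working copy stays.
git rm -q --cached keep.txt
ls keep.txt

# The status shows a staged deletion (D) and the same path as untracked (??).
git status --short

# The committed version is still reachable through history.
git show HEAD:keep.txt
```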

Using git mv

Suppose you need to move or rename a file.

The .gitignore File

6. Commits

When a commit occurs, Git records a snapshot of the index and places that snapshot in the object store. This snapshot does not contain a fresh copy of every file and directory in the index. Instead, Git creates new blobs for any file that has changed and new trees for any directory that has changed, and it reuses any blob or tree object that has not changed.
A commit is the only method of introducing changes to a repository, and any change in the repository must be introduced by a commit.
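The reuse of unchanged objects can be observed directly. In this sketch (throwaway repository, hypothetical files a.txt and b.txt), only b.txt changes between two commits, and the blob IDs at each path are compared using the rev:path syntax:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "stable" > a.txt
echo "v1" > b.txt
git add a.txt b.txt
git commit -q -m "First"
echo "v2" > b.txt
git commit -q -a -m "Second"

# <rev>:<path> names the blob stored at that path in that commit.
a_old=$(git rev-parse HEAD~1:a.txt); a_new=$(git rev-parse HEAD:a.txt)
b_old=$(git rev-parse HEAD~1:b.txt); b_new=$(git rev-parse HEAD:b.txt)
echo "a.txt: $a_old -> $a_new"   # unchanged file: same blob reused
echo "b.txt: $b_old -> $b_new"   # changed file: new blob
```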

Atomic Changesets

Every Git commit represents a single, atomic changeset with respect to the previous state.

Identifying Commits

The unique, 40-hex-digit SHA1 commit ID is an explicit reference, while HEAD, which always points to the most recent commit on the current branch, is an implied reference.
Git provides many different mechanisms for naming a commit.

Absolute Commit Names

The hash ID is an absolute name. Each commit ID is globally unique.
Git allows you to shorten this hash ID to a unique prefix within a repository’s object database.


$ git log --oneline
1cfc8de (HEAD -> master) change it twice
51cdbfa Initial contents of public_html

With --oneline, Git omits the author and date information and abbreviates each hash ID to a short prefix (seven characters here).
To get the log for a commit:


$ git log 51cdbfa
commit 51cdbfac97d1037ef8926693f1b09a6b85191273
Author: XXX <yyy@gmail.com>
Date:   Sat Jun 27 09:59:29 2020 +0800

    Initial contents of public_html
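A quick sketch of moving between the full and abbreviated forms with git rev-parse, in a throwaway repository with a placeholder identity:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "Initial commit"

full=$(git rev-parse HEAD)            # full 40-hex-digit ID
short=$(git rev-parse --short HEAD)   # unique abbreviated prefix
# Expanding the abbreviation gives back the full ID.
expanded=$(git rev-parse "$short")
echo "$short -> $expanded"
```

The length of the abbreviation is not fixed: Git lengthens it as needed to keep the prefix unique within the repository.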

refs and symrefs

A ref is a SHA1 hash ID that refers to an object within the Git object store.
Local topic branch names, remote tracking branch names, and tag names are all refs.

A symbolic reference, or symref, is a name that indirectly points to a Git object. It is still just a ref.
Each symbolic ref has an explicit, full name that begins with refs/ and each is stored hierarchically within the repository in the .git/refs/ directory. There are basically three different namespaces represented in .git/refs/ :


.git/refs
├── heads
│   └── zeus
├── remotes
│   └── origin
│       └── HEAD
└── tags

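refs/heads/ holds local branches, for example dev
refs/remotes/ holds remote tracking branches, for example origin/master
refs/tags/ holds tags, for example v2.6.23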

You can use either a full ref name or its abbreviation.
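As a sketch, the commands below create a branch and a tag in a throwaway repository (the names dev and v1.0 are made up) and resolve each by its full and abbreviated name. They also print what the HEAD symref currently stores:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "Initial commit"
git branch dev            # hypothetical branch name
git tag v1.0              # hypothetical tag name

# Full and abbreviated names resolve to the same commit ID.
git rev-parse refs/heads/dev dev refs/tags/v1.0 v1.0

# HEAD is a symref: it stores the full name of the current branch.
head_target=$(git symbolic-ref HEAD)
echo "$head_target"
```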

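Git maintains several special symrefs automatically:

HEAD

the most recent commit on the current branch

ORIG_HEAD

the previous value of HEAD, recorded by operations such as merge and reset

FETCH_HEAD

the heads of the branches most recently fetched; only valid immediately after a fetch operation

MERGE_HEAD

the tip of the branch being merged, while a merge is in progress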

Relative Commit Names

Except for the first root commit, each commit is derived from at least one earlier commit.
The direct ancestor commits are called parent commits.
For a commit to have multiple parent commits, it must be the result of a merge operation. As a result, there is a parent commit for each branch contributing to a merge commit.
The ~(tilde) and ^(caret) symbols are used to point to a position relative to a specific commit:

The tilde symbol (~) is used to go back through generations: C~n is the n-th ancestor of C, reached by following first parents.

The caret symbol (^) is used to select a different parent: C^n is the n-th parent of C.


 HEAD~3 ---> HEAD~2 ---> HEAD~1 ---> HEAD
             HEAD^1~1    HEAD^1        |
                           |           |
                           |           |
             HEAD~1^2 -----+           |
                                       |
                       ---HEAD^2-------+
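The relative names can be tried out on a small merge. This sketch builds a throwaway repository with one merge commit; the branch name topic and the commit messages are invented for the example, and empty commits are used to keep it short:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "root"
main=$(git symbolic-ref --short HEAD)     # whatever the default branch is called
git checkout -q -b topic                  # hypothetical branch name
git commit -q --allow-empty -m "topic work"
git checkout -q "$main"
git commit -q --allow-empty -m "main work"
git merge -q --no-ff -m "merge topic" topic   # merge commit with two parents

git log -1 --format=%s HEAD^1     # first parent: "main work"
git log -1 --format=%s HEAD^2     # second parent: "topic work"
git log -1 --format=%s HEAD~1     # same commit as HEAD^1
git log -1 --format=%s HEAD^2~1   # parent of "topic work": "root"
```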

Using the command git show-branch, you can inspect the graph history and examine a complex branch merge structure:

  
$ git show-branch --more=35 | tail -10
-- [master~15] Merge branch 'maint'
-- [master~3^2^] Merge branch 'maint-1.5.4' into maint
+* [master~3^2^2^] wt-status.h: declare global variables as extern
-- [master~3^2~2] Merge branch 'maint-1.5.4' into maint
-- [master~16] Merge branch 'lt/core-optim'
+* [master~16^2] Optimize symlink/directory detection
+* [master~17] rev-parse --verify: do not output anything on error
+* [master~18] rev-parse: fix using "--default" with "--verify"
+* [master~19] rev-parse: add test script for "--verify"
+* [master~20] Add svn-compatible "blame" output format to git-svn

Here the output is limited to the final 10 lines.
In this example, a merge took place between master~15 and master~16 that introduced a couple of other merges as well as a simple commit named master~3^2^2^.
A common use of git rev-parse is to print the commit ID that a revision specifier resolves to:

$ git rev-parse master~3^2^2^
32efcd91c6505ae28f87c0e9a3e2b3c0115017d8

Commit History

Viewing Old Commits

git log acts like git log HEAD , printing the log message associated with every commit in your history that is reachable from HEAD .
If you supply a commit for git log, the log starts at the named commit and works backward.
Typically, a limited history is more informative. One technique to constrain history is to specify a commit range:

$ git log master~12..master~10

Here, git log shows the commits reachable from master~10 but not from master~12, that is, the 10th and 11th prior commits on the master branch.
To print the patch, or changes, introduced by the commit:

$ git log -1 -p 4fe86488

Notice the option -1 as well: it restricts the output to a single commit.

Commit Graphs

Commit Ranges

A range is denoted with a double period (..), as in start..end, where start and end may be any form of commit name.
For example, the range master~12..master~10 specifies the 11th and 10th prior commits on the master branch.
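A smaller version of the same idea, sketched in a throwaway repository with five empty commits (the messages are placeholders). The range HEAD~3..HEAD~1 selects the commits reachable from HEAD~1 but not from HEAD~3:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
for i in 1 2 3 4 5; do
  git commit -q --allow-empty -m "commit $i"
done

# HEAD is "commit 5", so HEAD~1 is "commit 4" and HEAD~3 is "commit 2".
# The start of the range is excluded; the end is included.
in_range=$(git log --format=%s HEAD~3..HEAD~1)
echo "$in_range"
range_count=$(git rev-list --count HEAD~3..HEAD~1)
```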

Finding Commits

Using git bisect

git-bisect uses binary search to find the commit that introduced a bug.
To start, you first need to identify a good commit and a bad commit.
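The following sketch automates the search with git bisect run in a throwaway repository. Everything about the scenario is invented for illustration: eight commits, a "bug" represented by state.txt switching from status=ok to status=broken in commit 5, and grep standing in for a real test command:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
# Eight commits; the hypothetical bug appears in commit 5.
for i in 1 2 3 4 5 6 7 8; do
  if [ "$i" -ge 5 ]; then echo "status=broken" > state.txt
  else echo "status=ok" > state.txt; fi
  echo "$i" > counter.txt
  git add state.txt counter.txt
  git commit -q -m "commit $i"
done

# Arguments are the bad commit first, then the good one.
git bisect start HEAD HEAD~7 >/dev/null
# The test command exits 0 on good commits and non-zero on bad ones.
git bisect run grep -q "status=ok" state.txt >/dev/null
# When bisect finishes, refs/bisect/bad names the first bad commit.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $first_bad"
git bisect reset >/dev/null 2>&1
```

Without git bisect run, you would instead mark each candidate by hand with git bisect good or git bisect bad until Git reports the first bad commit.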

