4 Git concepts and architecture
4.1 The Three Trees
Two-tree architecture (other VCS):
Repository and working copy are the two trees.
Directories and files can be thought of as trees
“checkout” copies from repository to working directory
make changes, and “commit” those changes back to the repository
Git: Three-tree architecture:
Repository, staging index, working directory
Working directory: contains changes that may not be tracked
Staging index: changes that we are about to commit
Repository: changes that have been committed and are actually being tracked
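The three trees can be observed directly with git status. The following is a minimal sketch, not a definitive recipe: the temporary directory, the file name notes.txt, and the user identity are placeholders chosen for illustration. It moves one change from the working directory, through the staging index, into the repository.

```shell
#!/bin/sh
set -eu

# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .
# Placeholder identity, set locally so commit works without global config.
git config user.name "Example User"
git config user.email "user@example.com"

# 1. Working directory only: the file exists but Git is not tracking it.
echo "first draft" > notes.txt
untracked=$(git status --porcelain)

# 2. Staging index: the change is queued for the next commit.
git add notes.txt
staged=$(git status --porcelain)

# 3. Repository: the change is recorded in a commit.
git commit -q -m "Add notes.txt"
committed=$(git status --porcelain)

echo "working directory only: $untracked"
echo "staged:                 $staged"
echo "after commit:           '$committed'"
```

In the short status format, ?? marks an untracked file and a leading A marks a newly staged one. Empty output after the commit means the working directory, staging index, and repository all agree.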
4.2 Git Workflow
Workflow for the three-tree architecture:
Suppose we create file.txt in our working directory (call these changes A).
git add file.txt stages A in the staging index.
git commit records A in the repository.
Suppose we then make further changes to file.txt; call these changes B.
Again use add and commit to stage and commit B, respectively.
Now the repository has two sets of changes in it, A and B. This is the typical workflow for making changes to a repository.
Use git log to view these change sets, each of which Git references by a unique identifier.
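The workflow above can be sketched as a short script. This is an illustration under assumptions rather than a prescribed procedure: the temporary directory, the file contents, the commit messages, and the user identity are all placeholders.

```shell
#!/bin/sh
set -eu

# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q .
# Placeholder identity, set locally so commit works without global config.
git config user.name "Example User"
git config user.email "user@example.com"

# Change set A: create file.txt, stage it, commit it.
echo "line one" > file.txt
git add file.txt
git commit -q -m "A: create file.txt"

# Change set B: edit file.txt, then stage and commit again.
echo "line two" >> file.txt
git add file.txt
git commit -q -m "B: extend file.txt"

# List both change sets, newest first, each with its abbreviated identifier.
git log --oneline

commit_count=$(git rev-list --count HEAD)
last_subject=$(git log -1 --format=%s)
echo "commits recorded: $commit_count"
```

git log --oneline prints one line per commit. The identifiers differ on every run because they depend on the commit timestamps as well as the content.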
4.3 Hash Values (SHA-1)
Previously, we referred to the change sets as A, B, C. A change set can touch a single file, several files in a directory, or files across directories.
Git generates a checksum for each change set.
Checksum algorithms convert data into a simple number.
The same data always produces the same checksum.
Data integrity is fundamental.
Changing the data changes the checksum.
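Git uses the SHA-1 hash algorithm to create checksums, hence the question “What’s the SHA value of that commit?”
The checksum is a 40-character hexadecimal string = f(all the data, all the changes). For practical purposes f is one-to-one: distinct inputs can collide in principle, but this is extremely unlikely for ordinary content.
f is also a function of the parent (the SHA value of the previous snapshot), the author, and the commit message.
Thus data integrity is built in not just for each change set, but for the whole history of change sets.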
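These properties can be checked from the command line. The sketch below assumes a throwaway repository with placeholder file contents, commit messages, and user identity. It uses git hash-object to hash content directly and git cat-file -p to print the raw commit object, whose fields feed into the commit's SHA.

```shell
#!/bin/sh
set -eu

# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q .
# Placeholder identity, set locally so commit works without global config.
git config user.name "Example User"
git config user.email "user@example.com"

# Same data, same checksum: hash identical content twice.
sha_a=$(printf 'hello\n' | git hash-object --stdin)
sha_b=$(printf 'hello\n' | git hash-object --stdin)
# Changing the data (one extra character) changes the checksum.
sha_c=$(printf 'hello!\n' | git hash-object --stdin)
echo "$sha_a"
echo "$sha_b"
echo "$sha_c"

# Make two commits so that the second one has a parent.
echo "line one" > file.txt
git add file.txt
git commit -q -m "A: create file.txt"
first=$(git rev-parse HEAD)

echo "line two" >> file.txt
git add file.txt
git commit -q -m "B: extend file.txt"

# The raw commit object lists the tree, the parent SHA, the author,
# the committer, and the commit message.
git cat-file -p HEAD
parent_line=$(git cat-file -p HEAD | grep '^parent ')
echo "first commit:  $first"
echo "second commit: $parent_line"
```

Because the parent's SHA is part of the content that gets hashed, altering an earlier commit would change its SHA and therefore the SHA of every commit after it.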
4.4 HEAD pointer
Pointer to tip of current branch in repository
Last state of repository, what was last checked out
Points to the commit that will become the parent of the next commit; this is where new commits are written
By default, the branch we’re working on is called master (some setups are configured to use a different name, such as main). We start with our first commit. At the start, the HEAD pointer points to that commit.
When a new commit is made, a new SHA is created, and Git moves the HEAD pointer to this new SHA value.
When another commit is made, Git does the same and moves the HEAD pointer again.
If a new branch is created and checked out, HEAD follows the commits on that branch.
cat .git/HEAD shows which reference HEAD currently points to (for a branch, a line such as ref: refs/heads/master).
cat .git/refs/heads/master shows the SHA of the commit that HEAD ultimately points to.
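The movement of HEAD can be traced with a short script. This is a sketch under assumptions: it uses a throwaway repository with a placeholder identity, a hypothetical branch named feature, and Git's default file-based reference storage, where each branch tip is a plain file under .git/refs/heads. The initial branch name is read from Git instead of being hard-coded, since it depends on configuration.

```shell
#!/bin/sh
set -eu

# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q .
# Placeholder identity, set locally so commit works without global config.
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "first"

# HEAD stores a symbolic reference to the current branch.
head_ref=$(cat .git/HEAD)
branch=$(git symbolic-ref --short HEAD)

# The branch file holds the SHA of the tip commit.
tip_before=$(cat ".git/refs/heads/$branch")

# A new commit gets a new SHA, and the branch tip (hence HEAD) moves to it.
echo "two" >> file.txt
git commit -q -am "second"
tip_after=$(cat ".git/refs/heads/$branch")
head_after=$(git rev-parse HEAD)

# Creating and checking out a branch repoints HEAD at that branch.
git checkout -q -b feature
head_ref_feature=$(cat .git/HEAD)

echo "HEAD file:        $head_ref"
echo "tip before:       $tip_before"
echo "tip after:        $tip_after"
echo "HEAD on feature:  $head_ref_feature"
```

Note that .git/HEAD itself did not change when the second commit was made: only the SHA stored in the branch file moved. HEAD's own contents changed only when the other branch was checked out.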