03. Git Data Model

The Git data model consists of the following:

  • In Git, the contents of files are stored as blobs.

  • Directories in Git correspond to trees. A tree is a simple list of the trees and blobs it contains, along with the names and modes of those trees and blobs.

  • A commit is very simple, much like a tree. It points to a single tree and records an author, a committer, a message and any parent commits that directly preceded it.

  • A tag is an object that provides a permanent shorthand name for a particular commit. It contains the SHA-1 of the tagged object (object), that object's type (type), the tag name (tag), the tagger and a message.

  • References are simple pointers to a particular commit, something like a tag, but easily movable. Examples of references are branches and remotes. A branch in Git is nothing more than a file in the .git/refs/heads/ directory that contains the SHA-1 of the most recent commit on that branch.
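To make the idea of content-addressed objects concrete, here is a minimal sketch of how an object's SHA-1 can be computed. It assumes the standard object encoding, in which Git hashes a header of the form "<type> <size>" followed by a NUL byte and then the raw contents. The function names git_object_id and blob_id are illustrative only and are not part of Git.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 id for an object of the given type and contents."""
    # The hashed data is the header "<type> <size>\0" followed by the body.
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    """Id of a blob: it depends only on the file contents, not on its name."""
    return git_object_id("blob", content)


if __name__ == "__main__":
    # Two files with identical contents map to the same blob id.
    print(blob_id(b"test content\n"))
    print(blob_id(b""))
```

Because the file name is not part of the hashed data, the name and mode have to live in the tree that lists the blob. The ids printed here should match what `git hash-object` reports for files with the same contents.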

The object types above (blobs, trees, commits and tags) are stored in the Git Object Database, which is kept in the Git Directory (.git); references are kept alongside it under .git/refs. For a particular example, the Git database looks like this:

Git gets the initial SHA-1 of the starting commit object by looking in the .git/refs directory for the branch, tag or remote you specify. Then it traverses the objects by walking the trees one by one, checking out the blobs under the names listed.
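The lookup and traversal described above can be sketched roughly as follows. This is a simplified illustration, not Git's actual algorithm: resolve_ref only reads loose ref files under .git/refs (real Git also handles packed refs, HEAD and symbolic refs, and uses its own search order), and walk_tree operates on a hypothetical in-memory model in which a tree is a dict and a blob is a bytes value.

```python
from pathlib import Path
from typing import Dict, Iterator, Optional, Tuple, Union

# Hypothetical in-memory model: a tree maps names to subtrees or blob contents.
Tree = Dict[str, Union["Tree", bytes]]


def resolve_ref(git_dir: Path, name: str) -> Optional[str]:
    """Return the SHA-1 stored in a loose ref file for `name`, or None."""
    # Simplified search: try the name as a tag, a branch, then a remote ref.
    for namespace in ("tags", "heads", "remotes"):
        ref_file = git_dir / "refs" / namespace / name
        if ref_file.is_file():
            return ref_file.read_text().strip()
    return None


def walk_tree(tree: Tree, prefix: str = "") -> Iterator[Tuple[str, bytes]]:
    """Yield (path, contents) for every blob reachable from `tree`."""
    for name, entry in tree.items():
        path = prefix + name
        if isinstance(entry, dict):
            # A subtree corresponds to a directory: recurse into it.
            yield from walk_tree(entry, path + "/")
        else:
            # A blob corresponds to a file to be written out under `path`.
            yield path, entry
```

For example, `resolve_ref(Path(".git"), "master")` would return the commit id stored in .git/refs/heads/master if that file exists, and `walk_tree` applied to that commit's tree would list every file path a checkout needs to create.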

In computer science speak, the Git object data is a directed acyclic graph. That is, starting at any commit you can traverse its parents in one direction and there is no chain that begins and ends with the same object.
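As a small illustration of the graph property, the sketch below models a history as a mapping from each commit to its parents and walks it. The commit names ("a" to "d") and the PARENTS table are made up for the example; real commits are identified by SHA-1 and store their parent ids inside the commit object.

```python
from typing import Dict, List

# Hypothetical toy history: commit id -> ids of its direct parents.
PARENTS: Dict[str, List[str]] = {
    "a": [],          # root commit, no parents
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],  # merge commit with two parents
}


def ancestors(commit: str, parents: Dict[str, List[str]]) -> List[str]:
    """Return `commit` plus every commit reachable through parent links."""
    seen: List[str] = []
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        stack.extend(parents[current])
    return seen


def is_acyclic(parents: Dict[str, List[str]]) -> bool:
    """True if no commit can reach itself by following parent links."""
    for commit, direct_parents in parents.items():
        for parent in direct_parents:
            if commit in ancestors(parent, parents):
                return False
    return True
```

Here `ancestors("d", PARENTS)` reaches all four commits, while `ancestors("a", PARENTS)` contains only the root, which reflects that parent links point in one direction only.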

The basic data model looks something like this:

For several commits, the model can be expanded:

The history in Git is the complete list of all commits made since the repository was created. History is one of the most significant artefacts produced by Git.
