Git Blame: 2014

Monday, December 22, 2014

On CVE-2014-9390 and Git 2.2.1

Now the security-fix releases are behind us, let's briefly talk about the ramifications.

The recent Git/Hg vulnerability on case-insensitive or normalizing filesystems are serious for people who fetch and integrate (either pull or pull --rebase) from untrusted sources.

When you grab a tree that records a malicious path, say, ".Git/hooks/post-checkout" using an older version of Git on such a filesystem (e.g. Windows NTFS or Mac OS X HFS+), Git will tell the filesystem to check it out at ".Git/hooks/post-checkout", but the filesystem overwrites a file different from what Git asked it to write, namely ".git/hooks/post-checkout", which is a path reserved for you to store an executable hook that is run after running "git checkout".

For an attacker to victimize you through this vector, the attacker has to have a write access to a repository you pull from. As long as you do not interact with untrustworthy strangers (e.g. only pull from the projects' official history), you will not be affected. That is often true in corporate setting, where the access to the central repository everybody in the product group uses is tightly controlled, and if an untrustworthy stranger has a write access there, you already have a bigger problem.

But the open-source is all about collaboration, and we need to meet and interact with new people every day while doing so. The prudent thing to do is to (1) update to the version of Git recently released to work around this issue, and then (2) respond to a pull request from a stranger, in this order. Don't do it the other way around!

Thanks.

Thursday, December 18, 2014

Git 1.8.5.6, 1.9.5, 2.0.5, 2.1.4 and 2.2.1 and thanking friends in Mercurial land

We have a set of urgent maintenance releases. Please update your Git if you are on Windows or Mac OS X.

Git maintains various meta-information for its repository in files in .git/ directory located at the root of the working tree. The system does not allow a file in that directory (e.g. .git/config) to be committed in the history of the project, or checked out to the working tree from the project. Otherwise, an unsuspecting user can run git pull from an innocuous-looking-but-malicious repository and have the meta-information in her repository overwritten, or executable hooks installed by the owner of that repository she pulled from (i.e. an attacker).

Unfortunately, this protection has been found to be inadequate on certain file systems:

You can commit and checkout to .Git/<anything> (or any permutations of cases .[gG][iI][tT], except .git all in lowercase). But this will overwrite the corresponding .git/<anything> on case-insensitive file systems (e.g. Windows and Mac OS X).
In addition, because HFS+ file system (Mac OS X) considers certain Unicode codepoints as ignorable; committing e.g. .g\u200cit/config, where U+200C is such an ignorable codepoint, and checking it out on HFS+ would overwrite .git/config because of this.

The issue is shared with other version control systems and has serious impact on affected systems (CVE-2014-9390).

Credit for discovering this issue goes to our friends in the Mercurial land (most notably, the inventor of Hg, Matt Mackall himself). The fixes to this issue for various implementations of Git (including mine, libgit2, JGit), ports using these implementations (including Git for Windows, Visual Studio) and also Mercurial have been coordinated for simultaneous releases. GitHub is running an updated version of their software that rejects trees with these confusing and problematic paths, in order to protect its users who use existing versions of Git (also see their blog post).

A huge thanks to all those who were involved.

New releases of Git for Windows, Git OSx Installer, JGit and libgit2 have been prepared to fix this issue. Microsoft (which uses libgit2 in their Visual Studio products) and Apple (which distributes a port of Git in their Xcode) both have fixes, as well.

For people building from the source, fixed versions of Git have been released as versions v1.8.5.6, v1.9.5, v2.0.5, v2.1.4, and v2.2.1 for various maintenance tracks.

Thanks.

Tuesday, September 30, 2014

Fun (?) with GnuPG

We use GnuPG as part of the infrastructure to certify authenticity of development history in Git in various places:

Signed tags created by git tag -s is to say "This tag was created by me, the holder of the private GnuPG key that signed this object". Because the object name of any Git object is computed as a cryptographic hash over what the object records, and because a signed tag object records the object name of a tagged object (typically a commit) and the human readable name (typically a release number or name) the tagger wants to give the tagged object, an attacker cannot forge a phony tag that points at a different commit signed with the private key the attacker does not have. You are saying "You can verify that it is true that I wanted to make that commit release X" safely because of this. Also, because the commit object records all the objects and their location in a project tree, and the parent commit objects, such a signed tag also ensures that all the development history behind such a tagged commit cannot be tampered with.
When you merge a signed tag (either done by git merge or git pull), the content of the tag with its GnuPG signature is copied to the resulting commit object. This lets you ensure that the history behind the side branch that was merged to the history cannot be tampered with and the signature certifies that it came from the signer (typically a subsystem lieutenant).
Signed commits created by git commit -S is a way to say "This commit was created by me", and ensures that the history behind the commit cannot be tampered with and certifies that the change it introduces came from the signer.
Still under development is git push --signed, a way to certify that you wanted to put a particular commit at the tip of a particular branch.

GnuPG is also used as a mechanism to ensure the integrity and authenticity of tarballs that are sent to the kernel.org servers, which is a common distribution point for open source projects like the Linux kernel and Git itself. A maintainer prepares a tarball and a detached signature, uploads them, and the receiving end will verify that the signature is good.

It is a common practice to specify the expiration date when creating a signing key. For example, the key I have been using to sign Git release tags was originally set up to expire in 3 years since the key was created. But the thing is, a project may outlive that expiry date. An interesting question is what happens to the existing tags when the key expires.

Unluckily, the right thing happens. If the holder of the key does not do anything, the key becomes expired, and the signatures in the signed tags stops validating. Luckily, the validity of a key can be extended by the holder of the key, and once it is done, the signatures made before the key's original expiration date will continue to validate fine.

At least, that is the theory ;-)

As my key was originally set to expire early next month, I've extended the lifespan of the key 96AFE6CB I have been using a few days ago and uploaded the updated key to pgp keyservers, so existing signed tags (e.g. v2.0.0) should continue to be valid.

A few tips:

Although this page is a specific instruction to Debian contributors, it was very helpful when I had to figure out how to futz with GnuPG subkeys. It does not talk about how to update the expiration date for a subkey, though (you use "gpg --edit-key" and then choose the subkey with the "key N" command and finally use the "expire" command).
In order to force a specific subkey to be used when signing for Git, you would need to use the ! suffix to the GnuPG key-id, e.g. in my ~/.gitconfig file:
[user] signingkey = 96AFE6CB!
Without the ! suffix, GnuPG tries to use the newest subkey you have associated with the same primary key, which may not be the subkey you would want to use.

I signed a new v2.1.2 maintenance release with the same key today. Hopefully it will validate OK for you (otherwise, you may have to fetch the public key from the keyserver).

Wednesday, May 28, 2014

Git 2.0

The real "Git 2.0" is finally out.

From the point of view of end users who are totally new to Git, this release will give them the defaults that are vastly improved compared to the older versions, but at the same time, for existing users, this release is designed to be as low-impact as possible, as long as they have been following recent releases along (instead of sticking to age-old releases like 1.7.x series). Some may even say, without remembering why it was a big deal to bring these new default behaviours to help new users, that the new release does not offer anything exciting—and that is exactly what we want to hear from existing users. In recent releases for the past year or so, we have added knobs to allow users to enable these new defaults before 2.0 happens, and added warnings to let users know when they perform an operation whose outcome will be different between 1.x series and 2.0 release. The existing users are hopefully very well prepared by now, and "Git 2.0" is designed to be the final "flipping the default" step.

We had to delay the final release by a week or so because we found a few problems in earlier release candidates (request-pull had a regression that stopped it from showing the "tags/" prefix in "Please pull tags/frotz" when the user asked to compose a request for 'frotz' to be pulled; a code path in git-gui to support ancient versions of Git incorrectly triggered for Git 2.0), which we had to fix in an extra unplanned release candidate.

Hopefully the next cycle will become shorter, as topics that have been cooking on the 'next' branch had extra time to mature, so it all evens out in the end ;-).

Have fun.

Friday, April 25, 2014

Git 2.0 release candidate 1

This is the first release candidate for the upcoming Git 2.0. There are usual sort of updates and fixes one would expect to see between any two feature releases, but the primary reason why its name begins with "2" (as opposed to the last feature release whose name was "Git 1.9") is because it has a few backward incompatible changes that are all meant to improve the end-user experiences.

People almost always push to a single place, and many people would push a single branch they are currently on. The default behaviour of "git push" (that does not say which branches to push out to where on the command line) has been updated to better support this mode of working (as opposed to working on making all branches they are going to publish ready and then push all of them in one go). The old default of pushing out all the matching branches is available by setting the push.default configuration variable to matching.
Even though "git commit -a" can be run from any subdirectory to commit changes to all the tracked paths in the working tree, "git add -u" and "git add -A" (without specifying any path on the command line) used to operate only inside the current directory. This inconsistency bothered many people, and these commands have been updated to operate on all modified (for "-u") or all (for "-A") paths. Use "git add -u ." and "git add -A ." to restrict the command to the current directory.
"git add path" is now the same as "git add -A path" now, so that "git add directory/" will notice paths you removed from the directory and record the removal. In older versions of Git, it used to ignore removals. You can say "git add --ignore-removal path" to add only added or modified paths, if you really want to.

Some of the readers may remember that we didn't give users a very good transition experience when we introduced a backward incompatible change in Git 1.6.0. We used to install all the "git-cmd"s in the same directory as "git" itself and people were used to that "git commit" and "git-commit" can be used interchangeably before that release. Then we stopped installing what does not have to be on user's $PATH at that release, which is a change that breaks people's finger-memory and existing scripts. All we did to prepare users for that change was to warn about it in release notes since Git 1.5.4 and it was apparently not enough. Many people were unhappy.

In retrospect, perhaps we could have done better by adding code to somehow detect when "git-cmd" is invoked as the top-level command and warn that such usage would break in future versions to train users to use "git cmd" form way before releasing the version that actually delivered the change.

This time around, we have been trying to be a lot more careful. For the past handful of releases, we have added extra code to detect cases where exiting versions of Git and the upcoming Git 2.0 will behave differently and to warn about the upcoming change. As the result, the actual difference between Git 1.9 and Git 2.0 is mostly "flipping the default" for these changes.

Have fun.

Tuesday, March 18, 2014

Git 1.9.1

Traditionally, releases numbered with three Dewey-decimal digits were major releases that add new features, while ones with four were maintenance releases with only fixes. This was meant to give us some flexibility to say that the difference between 1.7.12 and 1.8.0 are larger than the difference between 1.8.1 and 1.8.2 (1.7.12 was the last major release in the 1.7.x series), while reserving the difference in the first digit for really big changes (i.e. 2.0 may finally toggle a switch that makes Git incompatible with older 1.x releases out of the box).

But we found out that in practice, we do not need to have three levels of changes (an incremental that changes the third digit between 1.8.1 to 1.8.2, a larger update that changes the second digit between 1.7.12 to 1.8.0, and a huge update that changes the first digit between 1.9 and 2.0). Hence the last major release was officially called "Git 1.9" when it was released on February 14, 2014.

It logically follows that, because we are dropping the third digit (or the second, depending on how you look at it) from the numbering of major releases, the first maintenance release for Git 1.9 is named with three digits, not four.

Git 1.9.1 is such a release. Among many changes we have been cooking on the development front towards the next major release, which will be called Git 2.0, this maintenance release contains only the fixes, and everybody is encouraged to upgrade to it.

Fixes since Git 1.9 are as follows:

"git clean -d pathspec" did not use the given pathspec correctly and ended up cleaning too much.
"git difftool" misbehaved when the repository is bound to the working tree with the ".git file" mechanism, where a textual file ".git" tells us where it is.
"git push" did not pay attention to branch.*.pushremote if it is defined earlier than remote.pushdefault; the order of these two variables in the configuration file should not matter, but it did by mistake.
Codepaths that parse timestamps in commit objects have been tightened.
"git diff --external-diff" incorrectly fed the submodule directory in the working tree to the external diff driver when it knew it is the same as one of the versions being compared.
"git reset" needs to refresh the index when working in a working tree (it can also be used to match the index to the HEAD in an otherwise bare repository), but it failed to set up the working tree properly, causing GIT_WORK_TREE to be ignored.
"git check-attr" when working on a repository with a working tree did not work well when the working tree was specified via the --work-tree (and obviously with --git-dir) option.
"merge-recursive" was broken in 1.7.7 era and stopped working in an empty (temporary) working tree, when there are renames involved. This has been corrected.
"git rev-parse" was loose in rejecting command line arguments that do not make sense, e.g. "--default" without the required value for that option.
include.path variable (or any variable that expects a path that can use ~username expansion) in the configuration file is not a boolean, but the code failed to check it.
"git diff --quiet -- pathspec1 pathspec2" sometimes did not return correct status value.
Attempting to deepen a shallow repository by fetching over smart HTTP transport failed in the protocol exchange, when no-done extension was used. The fetching side waited for the list of shallow boundary commits after the sending end stopped talking to it.
Allow "git cmd path/", when the 'path' is where a submodule is bound to the top-level working tree, to match 'path', despite the extra and unnecessary trailing slash (such a slash is often given by command line completion).

Have fun.