Tracked-but-uncommitted files with Git
Something I find awkward about Git is that it doesn’t seem to deal
with the concept of a tracked but uncommitted file — that is, the situation
you’d get into with CVS after running
cvs add on a new file, but
before committing that file to the central repository.
The difference between
cvs add and
git add is that, where CVS adds the
file to the set of tracked files whose changes can be committed, Git adds
the file’s current contents to the index of changes that are pending being committed.
I’m pretty sure that Git doesn’t have the ability to do the CVS-like thing; my understanding of the internals suggests strongly that the ability to do it can’t be directly supported, and even if I’m wrong, I can’t find anything that would suggest how you’d go about doing it.
I stipulate that my desire to do this may well be influenced by my experience with CVS, but it’s nonetheless something I believe is useful. With off-the-shelf Git tools, your options are:
Don’t do the git add until you’re almost ready to commit. The problem with that is that when you run git diff, you don’t see the addition of the file you haven’t yet added.
Run git add while you’re still working on your change. Now git diff shows you the rest of your changes but nothing for your new file, because the index already holds its current contents. Then you hack a bit further, and now git diff shows you some changes to your new file, but not the whole thing.
Neither of those is what I want.
I have a workaround for this.
First, the user interface: when I run a command
git track new-file.c, that
should tell Git to track
new-file.c for changes, but without adding its
data to the index.
Second, the sneaky trick. It can’t quite be done, but you can get very
close. Given that my problem with using
git add is what it does to your
diffs, the technique is merely to add the empty file to the index under
the new name. The new file is now in the index, so
git diff will report
on it. But the version in the index has no data, so
git diff will always
report every line in the file as an addition. That’s good enough for me.
This trick doesn’t quite do the right thing; specifically, we’ve now put data into the index that we know is wrong, and that has a reasonable chance of coming back to bite us in the future. In practice, it’s working for me, in the ways I’ve been using Git, but it seems worth pointing out the issue.
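One way to keep an eye on that risk is to list, before committing, the index entries that still hold the empty placeholder even though the working file has content. The following is only a sketch: the function name list_placeholder_entries is made up for illustration, and it assumes paths without leading whitespace or characters that Git would quote in its output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every index entry whose blob is the empty
# blob while the file on disk is non-empty. Those are the placeholders
# that still need a real "git add" before the next commit.
list_placeholder_entries() {
    local empty mode sha stage path
    # Compute the hash of an empty file for this repository
    # (without -w, nothing is written to the object database).
    empty=$(git hash-object --stdin < /dev/null) || return 1
    # "git ls-files -s" prints: mode SP hash SP stage TAB path
    git ls-files -s |
    while read -r mode sha stage path; do
        if [ "$sha" = "$empty" ] && [ -s "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}
```

Running list_placeholder_entries just before git commit gives a quick reminder of which files still need a real git add.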
Third, the implementation. This version is in Bash; it’s very simple, but adequate for most purposes. We need the Git hash of an empty file. In principle, it would be trivial to just embed the relevant 40-character hex string into the program as a constant. But if your history contains no empty files, then you’d be liable to get spurious errors later on, saying
unable to find e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
So instead, call
git hash-object and ask it to write an empty file to the object database:
sha1=$(git hash-object -w --stdin < /dev/null)
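With that in hand, simply loop over all the filenames passed as arguments,
massage them appropriately, and send them to
git update-index:
for file; do
    # Executable files get mode 100755, everything else 100644.
    mode=$(if [ -x "$file" ]; then echo 755; else echo 644; fi)
    # printf, unlike echo -e, leaves backslashes in file names alone.
    printf '100%s %s\t%s\n' "$mode" "$sha1" "$file"
done | git update-index --index-info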
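Put together, the two pieces above might look like the following sketch. Wrapping them in a function called git_track is my own choice for illustration, and the sketch assumes it is run from the top of the working tree, because the paths given to --index-info are taken relative to the repository root.

```shell
#!/usr/bin/env bash
# Sketch: register each named file in the index with empty contents,
# so that "git diff" reports the whole file as an addition.
git_track() {
    local sha1 file mode
    # Store the empty blob in the object database and remember its hash.
    sha1=$(git hash-object -w --stdin < /dev/null) || return 1
    for file; do
        # Executable files get mode 100755, everything else 100644.
        if [ -x "$file" ]; then mode=100755; else mode=100644; fi
        # One record per file: mode SP hash TAB path
        printf '%s %s\t%s\n' "$mode" "$sha1" "$file"
    done | git update-index --index-info
}
```

After git_track new-file.c, running git diff should show every line of new-file.c as added, and a later git add new-file.c replaces the placeholder with the real contents.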
Fourth, integration with the rest of Git. Dropping that script into a directory on your
$PATH as
git-track allows you to run
git-track new-file.c. That’s
almost perfect, but note that the command name contains a hyphen; we want
the two-word version
git track instead. Fortunately, that’s very easy to
do: just add this to your Git configuration file, such as ~/.gitconfig:
[alias]
    track = !git-track
Assuming you have a
git-track in your
$PATH, you can now do
git track new-file.c
just as if this new program shipped with Git.
Finally, an improved implementation. It would be nice for
git track to
accept directory names in the same way as
git add. That’s not too hard, since you can get
git ls-files to do the heavy lifting:
git ls-files -o --exclude-per-directory=.gitignore \
    --no-empty-directory "$@"
The only problem you run into is that, if you want to handle file names
containing spaces and/or newlines, you have to jump through all the usual
shell hoops to avoid accidental word splitting. My normal approach in
such situations is to use a real programming language. So here’s a
simple Perl implementation of
git track which takes both
files and directories as command-line arguments.
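For comparison, the shell hoops are manageable in Bash if every path stays NUL-terminated from end to end. The sketch below is not the Perl version. It is a rough Bash equivalent under a few assumptions: the function name git_track_paths is made up, it relies on Bash's read -d '', and it must be run from the top of the working tree.

```shell
#!/usr/bin/env bash
# Sketch: like the simple script, but accepts directories too and keeps
# every path NUL-terminated so spaces and newlines survive intact.
git_track_paths() {
    local sha1 file mode
    # With no arguments, ls-files would list every untracked file.
    if [ "$#" -eq 0 ]; then
        echo "usage: git_track_paths <file-or-directory>..." >&2
        return 1
    fi
    # Store the empty blob in the object database and remember its hash.
    sha1=$(git hash-object -w --stdin < /dev/null) || return 1
    # -z makes ls-files emit NUL-terminated paths.
    git ls-files -z -o --exclude-per-directory=.gitignore "$@" |
    while IFS= read -r -d '' file; do
        if [ -x "$file" ]; then mode=100755; else mode=100644; fi
        # Emit NUL-terminated records to match update-index -z.
        printf '%s %s\t%s\0' "$mode" "$sha1" "$file"
    done |
    git update-index -z --index-info
}
```

Calling git_track_paths src "notes file.txt" would register every untracked, non-ignored file under src plus the named file. The directory and file names here are only examples.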
Share and enjoy!