We do defer to Git for all write operations, but for reading, we do it ourselves partly for efficiency, and partly to get the right data.

In terms of getting the right data, one example is that we need to know the full set of non-ignored sub-directories in the working directory, so we can watch them for changes. It's easy enough to generate this ourselves as we calculate the status output, but I don't believe that git will emit it.

In terms of performance, we rely on being able to read objects efficiently. For example, to show a commit, we can't just use the output of "git diff", as we need the full file contents to be able to calculate syntax highlighting correctly. You could go a long way with "git cat-file --batch", but there are plenty of contexts where you can't practically batch requests, and process creation costs plus the lack of caching across requests (which can matter a great deal because of the delta encoding of objects) would be quite significant.
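For anyone who hasn't used it, here is a rough Python sketch of what talking to one long-lived "git cat-file --batch" process looks like. It is only an illustration of the plumbing mentioned above, not how Sublime Merge reads objects; the names read_batch_response and CatFile are made up for this example. The demo at the bottom feeds the parser canned bytes, so it runs without a repository.

```python
import io
import subprocess
from typing import BinaryIO, Optional, Tuple


def read_batch_response(stream: BinaryIO) -> Optional[Tuple[str, str, bytes]]:
    """Parse one reply from `git cat-file --batch`.

    A reply is "<id> <type> <size>\n<contents>\n", or "<name> missing\n"
    when the object does not exist (returned here as None).
    """
    header = stream.readline().rstrip(b"\n").decode()
    if not header:
        raise EOFError("git closed the stream")
    if header.endswith(" missing"):
        return None
    object_id, object_type, size = header.split(" ")
    contents = stream.read(int(size))
    stream.read(1)  # consume the LF that follows the contents
    return object_id, object_type, contents


class CatFile:
    """Hypothetical wrapper: one git process reused for many lookups."""

    def __init__(self, repo_path: str):
        self._proc = subprocess.Popen(
            ["git", "-C", repo_path, "cat-file", "--batch"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
        )

    def get(self, name: str) -> Optional[Tuple[str, str, bytes]]:
        # One request per line; flush so git sees it immediately.
        self._proc.stdin.write(name.encode() + b"\n")
        self._proc.stdin.flush()
        return read_batch_response(self._proc.stdout)

    def close(self) -> None:
        self._proc.stdin.close()
        self._proc.wait()


if __name__ == "__main__":
    # Canned reply (made-up object id) showing the wire format.
    canned = io.BytesIO(b"abc123 blob 6\nhello\n\ndeadbeef missing\n")
    print(read_batch_response(canned))
    print(read_batch_response(canned))
```

This avoids the per-request process creation cost, but each get() is still a round trip through a pipe, and nothing is cached on the caller's side between requests.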



Thanks. That makes perfect sense; I'd be very surprised if all the current plumbing was able to serve your current use-cases.

There are going to be cases where it sucks, e.g. what you point out about wanting both raw blobs and their diffs: you'd need two plumbing commands for that now.

But just on that example: Having poked at some of the diff code recently I can tell you there's no big technical hurdle to just exposing that sort of thing. I.e. spewing out machine-readable raw blobs and their diffs, it just happens not to be exposed now.

I think what a program like Sublime Merge would want/need short of C API access (which is unlikely to happen) is a git version of an open-ended "plumbing" IPC protocol of the sort that Common Lisp VMs tend to expose. I.e. being able to have one (or few) "git command-server" processes spawned, and ask them questions like "look up this blob" or "diff these two blobs" (where the previous blob lookup would be cached).
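To make that idea concrete, here is a toy Python sketch of the kind of interface such a server might offer. Everything in it is hypothetical: git has no "command-server" today, ObjectServer and load_blob are names invented for this example, and the diff comes from Python's difflib rather than git's diff code. The point is only that a second request for the same blob is answered from a cache.

```python
import difflib
from functools import lru_cache
from typing import Callable, List


class ObjectServer:
    """Hypothetical long-lived server: cached blob lookups plus diffs."""

    def __init__(self, load_blob: Callable[[str], bytes], cache_size: int = 1024):
        self.loads = 0  # how many times the backing store was actually hit

        def counted(blob_id: str) -> bytes:
            self.loads += 1
            return load_blob(blob_id)

        # Each server instance gets its own LRU cache of blob contents.
        self._lookup = lru_cache(maxsize=cache_size)(counted)

    def lookup(self, blob_id: str) -> bytes:
        """Answer "look up this blob", from the cache when possible."""
        return self._lookup(blob_id)

    def diff(self, old_id: str, new_id: str) -> List[str]:
        """Answer "diff these two blobs", reusing earlier lookups."""
        old = self.lookup(old_id).decode(errors="replace").splitlines()
        new = self.lookup(new_id).decode(errors="replace").splitlines()
        return list(
            difflib.unified_diff(old, new, fromfile=old_id, tofile=new_id, lineterm="")
        )


if __name__ == "__main__":
    # Stand-in for the object database, with made-up ids.
    store = {"a": b"one\ntwo\n", "b": b"one\n2\n"}
    server = ObjectServer(store.__getitem__)
    server.lookup("a")                 # first request loads the blob
    for line in server.diff("a", "b"):  # "a" is now served from the cache
        print(line)
    print("backing-store loads:", server.loads)
```

In a real version load_blob would be whatever reads the object database, and the requests would arrive over a pipe or socket instead of as method calls.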

Obviously patching/coordinating/upstreaming those sorts of changes is going to take work, but so is duplicating and keeping up-to-date with the diff, pack, status etc. code.

I'm not trying to tell you what to do, just saying that the git project is definitely friendly to "we're a commercial product and need this missing plumbing for our editor" (unlike, say, GCC).

The plumbing that's there now is mostly in the state it's in because it's what git itself needed in the past, when it was more of a collection of shell scripts. It's also biased towards what git server operators like GitHub needed (because they sent more patches), which is why the plumbing for, say, batch blob operations tends to be better than the plumbing for "status".

In any case it would be very interesting to have some post about the sort of read-only operations Sublime Merge is doing with its own custom git code.



