taw comments on Tool ideology - Less Wrong
Comments (66)
The happy ending is that nobody uses subversion any more, git won and has none of these problems.
It's up to you how seriously you read my comment.
Hee. We still use subversion every day.
Version control systems nowadays suffer from the problem that all new version control systems are created by groups of hackers working on projects so big and complex that the existing systems aren't powerful enough for them. So you keep getting more and more powerful and complex systems. git is so complex that no one who isn't a software developer can use it correctly.
I was tasked with moving a complex natural-language processing program for the NIH from, I think, SVCS, to git. After three days studying git man pages and trying to explain them to a group of linguists, I gave up and put everything under QVCS, and it was smooth sailing after that.
Try mercurial. It's got basically the same features, but is more comprehensible to human beings. There's an excellent tutorial called hg init.
(And if you should happen to need to use other people's stuff that's in git, you can just use the git extension for mercurial.)
blinks
I was taught to use git within a few days of starting to become a professional programmer. I'm a dyed-in-the-wool fanboy. I probably have no perspective at all here. But whenever I've used Mercurial everything seems backwards. People start recommending that I do wacky-sounding things like making two clones of a repository just to do what I'd normally do with git branch/git checkout... Is there any way to track multiple heads without just making multiple checkouts all over your disk?
Also, I strongly suspect that people who have trouble with git are just having trouble visualizing the DAG in their heads. If you run gitk --all whenever you get confused, you can actually see the thing, and then there's nothing to be confused about.
...Though I suppose the above might just translate to "I'm a visual thinker, and everyone should be more like me."
Taboo "track" and "checkouts". I don't know what you mean by "track", and Mercurial doesn't have checkouts, as I understand the term. A clone isn't "checked out" of anything. (This was actually the hardest part for me to wrap my head around, coming from Subversion and the central-repository model, but I'm wondering whether you're talking about the same thing or not.)
If you simply mean you want more than one head or branch, you don't need multiple clones. You can switch your working copy between named branches or heads with "hg up", and list them with "hg heads".
It's true that people often suggest just using clones instead of named branches, but IMO this only makes sense for short-lived branches that are going to be folded in to something else. Mercurial works just fine with named branches and multiple heads. You can also use bookmarks to give local names to specific heads -- a kind of locally-named branch whose name isn't propagated to other repositories.
No, we just read the man pages and run screaming. It's not the model of a change-based system that's the problem, it's the UI design (or lack thereof). ;-)
From an outsider's perspective, git's UI is to mercurial's UI as Perl's is to Python. And since I've programmed almost exclusively in Python for about 13 years now, guess which one looks more attractive to me?
(Note: this doesn't have anything to do with Mercurial's innards being written in Python; other DVCS's have been written in Python and didn't have the same orthogonality of design.)
I'm told git massively improved its interface in the last few years. I started using it mainly in 2010 after switching from bzr, and had little trouble understanding the system (in fact I found hg's interface to be kind of weird). But there you go.
(Also, with regard to "checkouts": in git-land "checkout" means a working directory; by "multiple checkouts all over your disk" I assume MBlume means multiple clones of the repository.)
Harsh!
Well, to me, git's DAG model is 100% obvious, and gitk --all is helpful in exactly the way you state, but at the beginning it was still confusing which command, used how, would produce the effect I wanted on the DAG (and working tree and index...). Similarly, the commands to configure and manipulate branches and remotes are not entirely obvious, especially if you've gotten off the beaten path and want to fix your config to be normal.
Git is new. It's already gotten easier to use (I'm already too much of a newb to have ever used the Git of Yore, which supposedly you needed a CS PhD to use effectively), and the folks at GitHub in particular seem to be working hard at sanding down its rough edges.
My experience with git was in 2006 or 2007.
This is quite ancient. git started as a solution to the technical problem of high-performance distributed version control. The user interface only became reasonable later.
It's still not that great. The internal DAG model is quite clean and clear. The actual commands do not always map clearly to this model. One common failure is often hiding or doing implicit magic to the staging area. Another is that many commands are a mish-mash of "manipulate the DAG" and "common user operations", where doing only one or the other would be much clearer. I really doubt that the user interface will get much better, because to do so they really need to throw out backward compatibility.
There are some problems with the DAG, too, because you are supposed to store the information with very little metadata.
There are precedents of tools wrapping the Git command-line interface, so that part could possibly be fixed. I frankly do not know why nobody does it.
Of course, Subversion is still the "majority" VCS even for open-source projects. Maybe people need something other than Git to change that, or maybe SVK should become a more widespread way to use SVN.
And for the sake of speed and stability, Git doesn't store some data that every other open-source DVCS does store. I have heard some Git users say this is an acceptable tradeoff (which is true for them) and others say that nobody should care about this kind of data.
Of course, a better tool is never a solution to tool ideology. Evaluating multiple other tools isn't either: after doing it with DVCSes, I now hate Git and implicitly assume that every tradeoff it makes is unfit for the medium-sized projects I care about.
I would guess that git is already more popular than svn for new projects (see github), and in at least some circles like among Ruby programmers still using svn for new stuff would raise some eyebrows. It's definitely way past just early adopters, but I have no idea how to get reasonable vcs usage statistics.
I don't know what you mean by these tradeoffs; git tends to store more data, not less.
Well, Git stores the code itself; for everything else it stores less data than SVN or Bazaar (or Mercurial, Monotone, Veracity).
It doesn't track explicit directory renaming. It doesn't keep branch history; if it did, reflog (which is local and only kept for 30 days) wouldn't be needed. It only allows unique tags, so if you want to mark every revision where buildbot succeeded, to make both updating and rolling back easy, you are out of luck (there may be a way, but it is not obvious at all).
It knows each directory by its content, so it knows when a directory was renamed, without needing to be explicitly told.
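To make the "known by its content" point concrete, here is a minimal sketch in Python, assuming a toy model in which a directory's id is a hash over its entries. It is not git's actual object format, and real git infers renames at diff time using similarity heuristics; the names `blob_id`, `tree_id`, and `renamed_dirs` are made up for illustration.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """Id of a file: a hash of its content only (the name is not included)."""
    return hashlib.sha1(b"blob " + data).hexdigest()


def tree_id(tree: dict) -> str:
    """Id of a directory: a hash over (entry name, entry id) pairs.

    `tree` maps a name to bytes (a file) or to a nested dict (a subdirectory).
    The directory's own name is not part of its id.
    """
    parts = []
    for name in sorted(tree):
        entry = tree[name]
        child = tree_id(entry) if isinstance(entry, dict) else blob_id(entry)
        parts.append(f"{name}:{child}")
    return hashlib.sha1(("tree " + "\n".join(parts)).encode()).hexdigest()


def renamed_dirs(old: dict, new: dict) -> list:
    """Top-level directories that vanished under one name and reappeared,
    with identical content, under another."""
    old_ids = {tree_id(v): k for k, v in old.items() if isinstance(v, dict)}
    result = []
    for name, entry in new.items():
        if isinstance(entry, dict):
            old_name = old_ids.get(tree_id(entry))
            if old_name is not None and old_name != name and old_name not in new:
                result.append((old_name, name))
    return result
```

In this toy model, moving `src` to `lib` without touching the files leaves the directory id unchanged, so the rename can be inferred from two snapshots alone. The same model also gives every empty directory the same id, so content alone cannot tell empty directories apart.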
Reflog is an essentially local thing, it shows where a branch used to point in a particular repository instance. It has little to do with history of the project, and often includes temporary commits that shouldn't be distributed.
You need some way to specify what you'd want to update or roll back to - what kind of use case are you thinking about? You could support a successful-build branch, for example, so that its tip points to the last successful build, and you could create merge commits with previous successful builds as first parents, for the purpose of linking together all successful builds in that branch.
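A rough sketch of that successful-build branch as a plain Python model of the commit graph, not as git commands. The `Commit` class and the helpers `record_success` and `successful_builds` are hypothetical; the first-parent walk plays the role that `git log --first-parent` would play on such a branch.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    name: str
    parents: list = field(default_factory=list)
    marker: bool = False  # True for commits on the hypothetical successful-build branch


def record_success(build_tip, tested):
    """Add a marker commit for `tested` on top of the successful-build branch.

    The first parent is the previous marker (if any) and the last parent is
    the tested commit, so first parents link the successful builds together.
    """
    parents = [tested] if build_tip is None else [build_tip, tested]
    return Commit("ok:" + tested.name, parents, marker=True)


def successful_builds(tip):
    """Walk first parents from the branch tip; return tested commits, newest first."""
    found = []
    node = tip
    while node is not None and node.marker:
        found.append(node.parents[-1].name)  # last parent is the tested commit
        node = node.parents[0]               # first parent: previous marker, if any
    return found
```

With trunk commits a, b, c, d and successful builds at a, c, and d, walking from the tip of the marker branch yields d, c, a. Rolling back means picking an earlier entry from that list.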
Tracking paths by their content is not always good... It couples content changes with intent changes. If I need to make a copy of a directory and then make the copies slowly diverge until they have nothing in common, I may want to mark which one keeps the original intent and which is the spin-off.
Branch history is not an inherently local thing.
When I have feature branches for recurring tasks, I will probably always give them the same name. I will sometimes merge them with trunk and sometimes spawn them from the new trunk state. Later, I may wish to check whether some change was made in trunk or in the feature branch, since that is quite likely to say something about the intent of the commit. I can get this easily in every DVCS I know except Git; in Git I need to walk the DAG to get this information.
About the successful-build branch: for some projects I try to update to trunk head, and if it gives me too much trouble I look for the closest previous revision which I can expect to work. In Monotone I simply mark some development commits as tested enough, and there is a simple command to get all the branch.tested commits from the last month. This information says something about a commit, and to lose it I have to do something with the certificate that states it. In Git, rev-list behaviour depends on many things that happen later.
Linux kernel history is too big for any of the things I say to make sense for it. But in a medium project, I want to have access to strange parts of history to find out what happened, how, and what we meant.
Doesn't work so well if the content is 'nothing'.
Git doesn't notice these at all.
Which is my point exactly. It is one aspect of Vi's criticism of git not storing some important data that is clearly valid. It is a tradeoff that probably doesn't matter if you are Linus and you are storing code for the Linux kernel, but in other cases it is a blatant flaw that needs to be worked around via compromises or kludges.
Git is the absolute worst version control system out there (except for all the others).
In what situations would you want to store an empty directory and pay attention to whether it is renamed?
Empty directories are sometimes necessary and it's a pain in the ass that git cannot store them at all. I had to put almost empty README.txt files in directories like log/ in many projects. It's a minor annoyance more than anything else.
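For what it's worth, the README.txt workaround can be automated. This is a minimal sketch, assuming the placeholder name `README.txt` from above; the helper name `add_placeholders` is made up, and nothing here is a git feature.

```python
import os


def add_placeholders(root, placeholder="README.txt"):
    """Create a small placeholder file in every empty directory under `root`.

    Returns the sorted list of files that were created.
    """
    created = []
    for dirpath, dirnames, filenames in os.walk(root):
        is_empty = not dirnames and not filenames
        if ".git" in dirnames:
            dirnames.remove(".git")  # do not descend into repository metadata
        if is_empty:
            path = os.path.join(dirpath, placeholder)
            with open(path, "w") as fh:
                fh.write("Placeholder so this directory is kept in version control.\n")
            created.append(path)
    return sorted(created)
```

Running it a second time creates nothing, since the directories are no longer empty.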
I have a complex enough deployment helper living in a Monotone repository for which it is simpler and more natural to keep a few empty directories in the repository than to check-and-create from each of ten shell scripts. It is checkout-and-use, no other setup makes sense, so "just creating them in the Makefile" would be suboptimal.
Is that something I need a justification for? My version control system throws away stuff that I am trying to store. I'd also prefer it not to throw away files starting with 'b'.
I've learned to make my programs pessimistic and recreate the file system if necessary. It surprised me a few times before I learned the quirks.
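The pessimistic approach looks roughly like this in Python. It is only a sketch: the directory names in `REQUIRED_DIRS` and the helper `ensure_dirs` are hypothetical, and a real program would list whatever layout it actually depends on.

```python
import os

# Hypothetical layout: directories the program expects but version control may have dropped.
REQUIRED_DIRS = ("log", "tmp")


def ensure_dirs(base, names=REQUIRED_DIRS):
    """Create each required directory under `base` if it is missing.

    Safe to call on every startup: existing directories are left untouched.
    Returns the list of directory paths, in the order given.
    """
    paths = []
    for name in names:
        path = os.path.join(base, name)
        os.makedirs(path, exist_ok=True)
        paths.append(path)
    return paths
```

Calling `ensure_dirs` at the top of each entry point means a fresh clone works without any separate setup step.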