Viliam_Bur comments on Call for volunteers: Publishing the Sequences - Less Wrong Discussion

13 Post author: wedrifid 28 June 2012 03:08PM

Comments (42)

Comment author: Viliam_Bur 28 June 2012 09:02:43PM 5 points [-]

Bugs like this could be found in LaTeX source using regular expressions.

If a thing like this happens, it probably happens more than once in a long text. So once humans find the first example, a computer could detect all remaining examples of the same pattern. (I don't recommend automatically fixing these bugs, only reporting them.)
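
A minimal sketch of the "report, don't fix" idea, written in Python purely for illustration. The function name `report_matches` and the sample text are hypothetical, and the pattern is whatever a human reader spotted first; nothing here modifies the source.

```python
import re


def report_matches(source: str, pattern: str) -> list[tuple[int, str]]:
    """Return (line number, line text) for every line matching the pattern.

    The source is only read, never changed: a human decides what to do
    with each reported line.
    """
    regex = re.compile(pattern)
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        if regex.search(line):
            hits.append((line_no, line.strip()))
    return hits


# Hypothetical sample text; a spaced hyphen stands in for "the bug".
SAMPLE = (
    "A word - another word.\n"
    "No issue here.\n"
    "Well-formed hyphen.\n"
    "Trailing - dash again."
)

for line_no, line in report_matches(SAMPLE, r"\s+-\s+"):
    print(f"line {line_no}: {line}")
```

Working line by line keeps the report easy to act on, at the cost of missing patterns that span a line break.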

Comment author: [deleted] 29 June 2012 01:15:23AM 3 points [-]

Oh dear. Attempting to parse LaTeX with regexes is only slightly more insane than attempting to parse HTML with regexes.

Comment author: gwern 29 June 2012 01:19:18AM 2 points [-]

On the other hand, an interactive regexp search-and-replace is quite reasonable. Any good text editor should support such functionality...
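
Outside a text editor, an interactive search-and-replace amounts to roughly the following. This is a sketch in Python under stated assumptions: `interactive_replace`, `ask_user`, and the `confirm` callback are hypothetical names, and the prompt is only one way a human could be asked about each match.

```python
import re
from typing import Callable


def interactive_replace(
    text: str,
    pattern: str,
    replacement: str,
    confirm: Callable[[re.Match], bool],
) -> str:
    """Replace each match only if confirm(match) returns True."""

    def decide(match: re.Match) -> str:
        # Keep the original text whenever the human says no.
        return match.expand(replacement) if confirm(match) else match.group(0)

    return re.sub(pattern, decide, text)


def ask_user(match: re.Match) -> bool:
    """Prompt on the terminal, showing some context around the match."""
    start = max(match.start() - 20, 0)
    context = match.string[start:match.end() + 20]
    return input(f"Replace in '{context}'? [y/N] ").strip().lower() == "y"
```

Calling `interactive_replace(source, r"\s+-\s+", "---", ask_user)` would then walk through the matches one at a time, left to right.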

Comment author: [deleted] 29 June 2012 01:30:48AM 0 points [-]

Sure, and any good (human) editor should have a macro package for working in such a way. I'm saying that LaTeX just makes it much harder to work like that, effectively.

Comment author: wedrifid 29 June 2012 02:24:30PM 0 points [-]

Sure, and any good (human) editor should have a macro package for working in such a way. I'm saying that LaTeX just makes it much harder to work like that, effectively.

It sure does. LaTeX is kind of a ridiculous hack. The semantics aren't even consistent. I kind of wish there were a mature publishing system based on DRYML.

Comment author: [deleted] 29 June 2012 02:26:28PM 0 points [-]

You and I both. Unfortunately any competing system has a nearly insurmountable barrier to entry. Kind of "Worse is Better" taken to insane extremes.

Comment author: wedrifid 29 June 2012 02:29:01PM 0 points [-]

Unfortunately any competing system has a nearly insurmountable barrier to entry.

The best approach would be to build something completely backwards compatible. That is, it allows easy embedding of LaTeX code and optionally compiles out to .tex.

Comment author: [deleted] 29 June 2012 08:28:58PM 0 points [-]

I think I disagree, but it would depend on implementation details.

One possibility I can see is that you keep around a copy of the LaTeX distribution to parse these easily embedded LaTeX fragments, something like LuaTeX might someday turn out to be. In that case, you're still stuck supporting LaTeX's monstrosity of a toolchain. In that case, there's still e.g. no LaTeX on the iPad.

Another possibility is that you rewrite the LaTeX engine ... ah, nevermind, this isn't a possibility.

Comment author: Viliam_Bur 29 June 2012 07:19:52AM *  0 points [-]

Well, I volunteer to try.

if ($text =~ m/\s+-\s+/) { print "Hyphen in place of a dash.\n"; }

If this line could find a dozen bugs, it's worth using, even if it won't find all instances.

Comment author: [deleted] 29 June 2012 11:34:27AM 0 points [-]

Congratulations. You've just triggered a false positive on almost every minus sign in existence. (e.g., $1 - 1 = 0$.)

I would love it if what you suggest were possible, but it just isn't. Not when packages feel free to roll their own DSLs for anything.
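
To make the objection concrete, here is a sketch in Python that masks simple inline math before running the check. It assumes math only ever appears as `$...$` on a single line; display math, environments, and package-specific DSLs are not handled, which is exactly the limitation described above. The names `mask_inline_math` and `has_suspect_hyphen` are hypothetical.

```python
import re

# Inline math delimited by unescaped dollar signs on one line (an assumption).
INLINE_MATH = re.compile(r"(?<!\\)\$[^$]*\$")
SPACED_HYPHEN = re.compile(r"\s+-\s+")


def mask_inline_math(line: str) -> str:
    """Overwrite each $...$ span with filler so the line length is unchanged."""
    return INLINE_MATH.sub(lambda m: "X" * len(m.group()), line)


def has_suspect_hyphen(line: str) -> bool:
    """True if a spaced hyphen appears outside simple inline math."""
    return SPACED_HYPHEN.search(mask_inline_math(line)) is not None
```

With this, a line such as `We know $1 - 1 = 0$.` is no longer flagged, while a spaced hyphen in running prose still is.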

Comment author: wedrifid 29 June 2012 02:19:59PM 1 point [-]

Congratulations. You've just triggered a false positive on almost every minus sign in existence. (e.g., $1 - 1 = 0$.)

Yes, but each false positive only prints a message. Since minus signs are rare compared to intended em dashes, this doesn't seem like much of a problem, and ignoring the irrelevant messages adds only a trivial amount of work. All equations need to be converted to the math environment anyway (probably manually), and the time it takes a human to do that conversion (even when it just means adding $ around them) is orders of magnitude greater than the time taken to do nothing while reading that particular message. So we can merrily ignore the false positive issue as not worthy of optimisation.

I would love it if what you suggest were possible, but it just isn't.

It's almost exactly what I will do. It would be difficult to make a utility that got everything perfectly right every time without human intervention---that requires implementing comprehension skills and common sense. However, it is trivial to get something that does it well enough for our purposes with only minimal human intervention required.

Comment author: [deleted] 29 June 2012 02:22:43PM *  0 points [-]
Comment author: Kindly 29 June 2012 12:27:37PM 0 points [-]

Congratulations. You've just triggered a false positive on almost every minus sign in existence.

Every minus sign in the Sequences? What, all three of them?