I suspect a regular expression gone wild. (ETA: This commit looks like a likely culprit, but I'm not sure what's going on that might cause that particular behavior.)
ETA2: Heh. Thought so.
bash$ python
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> control_chars = re.compile('[\x00-\x08\x0b\0xc\x0e-\x1f]')
>>> text="This is some HTML with c's in it."
>>> control_chars.sub('',text)
"This is some HTML with 's in it."
ETA3: um, unit tests. Use them. Do yourself a favor. Pushing a bug to production that takes under a minute to locate from static inspection of the code? Embarrassing.
ETA4: ...though, in fairness, I can see how someone test-driving this code could easily have written a test that didn't catch this particular mistake.
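To make that concrete, here is a rough sketch (Python 3 assumed) of how a narrow test can pass against the buggy pattern while a broader one catches it. The names `BUGGY`, `FIXED`, `strip_control_chars`, `removes_control_chars`, and `preserves_printable` are made up for illustration and are not taken from the actual code base.

```python
import re
import string

# The pattern as committed (buggy) and as presumably intended (fixed).
BUGGY = re.compile('[\x00-\x08\x0b\0xc\x0e-\x1f]')
FIXED = re.compile('[\x00-\x08\x0b\x0c\x0e-\x1f]')

# Ordinary text that the filter should leave alone: letters, digits,
# punctuation, and the whitespace the pattern deliberately skips.
SAFE_TEXT = string.ascii_letters + string.digits + string.punctuation + ' \t\n\r'


def strip_control_chars(pattern, text):
    """Hypothetical helper: remove everything the pattern matches."""
    return pattern.sub('', text)


def removes_control_chars(pattern):
    # The "obvious" test: control characters go away.
    return strip_control_chars(pattern, 'a\x01b\x1f') == 'ab'


def preserves_printable(pattern):
    # The test that actually catches the bug: normal text survives intact.
    return strip_control_chars(pattern, SAFE_TEXT) == SAFE_TEXT


print(removes_control_chars(BUGGY), preserves_printable(BUGGY))
print(removes_control_chars(FIXED), preserves_printable(FIXED))
```

The first check only asks whether bad characters are removed, so it is satisfied by both patterns; only the second check, which asserts that nothing else is removed, tells them apart.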
ETA3: um, unit tests. Use them. Do yourself a favor. …
We agree completely. We inherited this code from Reddit, and we've spent multiple days trying to strap a workable unit testing framework onto it. As is often the case when you write the code first, strapping unit testing on later is hard.
We've basically given up on unit tests in this code base, but we'd love to be shown to be idiots on this one. Please take this opportunity to show us up by writing some example unit tests around any of our recent commits.
We have strapped Selenium tests on.
Code contributions are very very welcome.
As long as you're fixing regressions, how 'bout the whole "comments no longer showing up in IE7" that Silasbarta made a post on a few weeks ago?
edit: Seems to be fixed, thanks!
The regular expression is wrong: it contains the term "\0xc" where it should contain "\x0c".
So, instead of stripping the control character with ASCII code 0x0c (form feed), it strips a NUL character ("\0") followed by the literal letters "x" and "c".
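A quick way to check this is to ask each character class which ASCII characters it matches. This is a minimal sketch assuming Python 3; `matched_chars` is a throwaway helper written for this comment, and the "fixed" pattern is simply the original with the two characters swapped back.

```python
import re

buggy_source = '[\x00-\x08\x0b\0xc\x0e-\x1f]'   # "\0xc": NUL, then 'x', then 'c'
fixed_source = '[\x00-\x08\x0b\x0c\x0e-\x1f]'   # "\x0c": form feed


def matched_chars(pattern_source):
    """Return the set of ASCII characters the character class matches."""
    pattern = re.compile(pattern_source)
    return {chr(i) for i in range(128) if pattern.fullmatch(chr(i))}


# Characters the buggy class strips that the fixed one leaves alone.
extra = matched_chars(buggy_source) - matched_chars(fixed_source)
# Characters the fixed class strips that the buggy one lets through.
missing = matched_chars(fixed_source) - matched_chars(buggy_source)

print(sorted(extra))    # ['c', 'x']
print(sorted(missing))  # ['\x0c']
```

The NUL from "\0" does not show up in the difference because the range \x00-\x08 already covers it.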
Smoketoomuch: Yes, I saw your advert in the bolour supplement.
Bounder: The what?
Smoketoomuch: The bolour supplement.
Bounder: The colour supplement.
Smoketoomuch: Yes, I'm sorry, I can't say the letter 'B'.
Bounder: C?
Smoketoomuch: Yes, that's right. It's all due to a trauma I suffered when I was a sboolboy. I was attacked by a bat.
Bounder: A cat?
Smoketoomuch: No, a bat.
Bounder: Oh...can you say the letter 'K'?
Smoketoomuch: Oh, yes. Khaki, kind, kettle, Kipling, kipper, Kuwait, Keble Bollege Oxford.
Bounder: Yes, yes but why don't you use the letter 'K' instead of the letter 'C'?
Smoketoomuch: What, spell bolour with a 'K'?
Bounder: Yes!
Smoketoomuch: Kolour!
Oh, thank you! I never thought of that. What a silly bunt.
Luckily, the letter 'c' still shows up when I EDIT my posts, just not when I VIEW them. So the 'c's are still there, they're just not showing.
Yes, same symptoms. With the letters and the blockquotes.
EDIT: Also, it's not consistent for me even on this page. I can see the 'c' (letter after 'b') in "blockquotes" in your post that I replied to, and in a few other comments, including mine, but not in the original post.
Weird. My "c" (letter after b) appears for me in my comments, but no one else's does.
Also, "x" (letter before z) seems to be missing too. Xylophones?
It seems like the letters "c" and "x" are working, but were purged from the site at some point. Any new posts that contain them seem okay, but pre-purge posts are very much not so.
I don't know, they don't show up in the newest posts for me, but show up in any comments. I'm really curious why this is happening.
Information: I'm running the latest version of Firefox with NoScript enabled (lesswrong.com and viglink.com allowed) and things are rendering just fine for me.
All of a sudden the letter 'c' (the one after 'b', in case it doesn't render) is not showing up in articles on Less Wrong for me, in any browser, except in images. I see a 'c' in the word 'discussion' above, but not in the body text of posts like this one or this one. Is anybody else getting the same issue?