
This is a special post for quick takes by gwillen. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

Why [not] ask why?

When someone asks for help, e.g. in a place like Stack Overflow, they are often met with the response "why do you want to do that?"

People like to talk about the "XY Problem": when someone's real problem is X, but their question is about how to do Y, which is a bad way to solve X. In response, some other, snarkier people sometimes talk about the "XY Problem Problem": when someone's problem is Y, and they ask about Y, but people refuse to help them with it because they're too busy trying to figure out the (nonexistent) value of X.

The other day, I thought about a taxonomy of good responses to the kind of informally-specified request for help that one sees online. I came up with the following:

  1. Straightforward answer to the question asked.
  2. "Frame challenge" (This is Stack Overflow terminology; the others below are not.)
    • "I understood your question, and I also understood your underlying problem, and I would like to offer an explanation of why I think the straightforward answer to your question does not solve your underlying problem."
    • Although this kind of response doesn't directly answer the question, I think it's good because (1) it has to explain directly why not, which gives the asker something to disagree with if appropriate; and (2) it provides something that the answerer thinks is more useful than an actual answer.
  3. "Safety challenge"
    • "I think your question provides evidence that you are doing something dangerous to yourself or others, and you are not aware of that danger.  A direct answer would endorse or contribute to that danger, so instead I want to warn you about the danger."
    • This can be condescending if the danger is minor, or not real. But again, I think it's good because (1) it directly states why it doesn't answer the question, and (2) it provides some information the answerer thinks is more useful instead.
  4. "Assumption challenge"
    • "I think it is not possible to answer your question, because it embeds an assumption that is not true, as follows: [...]"
    • Good, because it clarifies what the assumption is, which gives the asker the opportunity to argue or clarify.
  5. "Ambiguity challenge"
    • "I did not understand your question, and I am unable to answer it because I can't figure out what it's asking."
    • This one is interesting and I will discuss it below.

When someone responds to "how do I do X?" with "why do you want to do X?", I think this creates conflict for a few reasons. Primarily, it tends to insult the asker. As certain people on this site would say, it's a status grab: "I know better than you what question you actually need answered, and it is not the one you asked; try again." It sets the answerer above the asker in status.

One of the reasons that "frame challenge" works so well on Stack Exchange is that you have to declare that you're making one. Saying "I would like to challenge the assumptions of your question" comes across as more respectful, and less status-grabbing, than "why would you want to do that? Don't do that." I think "safety challenge" could work similarly. Saying "you shouldn't do that, it's dangerous" will always come with some amount of status assertion, but saying "I think that's dangerous, and here's why" is less of a status grab than "Why would you want to do that?", because it provides an explanation, rather than assuming it's obvious and putting all the burden of communication on the asker.

Another reason is that it provides more information to the asker. For a "frame challenge" to be a valid answer on Stack Exchange, it has to include an explanation of why the answerer thinks the asker's frame is bad. The ball is in the answerer's court to provide the asker with useful information. A bare "why would you want that?" does not.

Stack Exchange's question/answer structure really helps here, compared to other types of discussion fora. In unstructured discussion fora, it's easy for a "Why?" to turn into an angry argument between the original asker and the responder. On Stack Exchange, each answer is a separate conversation thread, meaning that (1) conversation in response to one answer can't prevent someone from giving a better answer, and (2) multiple independent answers can be voted up, so there's room for BOTH "here's the answer to your question" and "here's why I think that won't help you" to be upvoted and discussed separately, on the same question (and I've seen this happen.)

Above I mentioned "the burden of communication", which I think is a big part of what's going on here. It is roughly always the case that a question is ambiguous in some way, or requires some context to understand. This means there will be some burden of negotiating the necessary context between asker and answerer. "Why do you want to do that?" tosses 100% of the burden back on the asker; it expresses "I don't like your question", but makes no effort to bridge the gap. "Why? Well, it all started about 14 billion years ago..." There are always infinite layers of "why", and this is no help in figuring out what the responder feels is missing from the question.

"Ambiguity challenge", which I suggested above, is the least helpful of my suggested responses -- to be helpful, I think it needs to come with some effort to explain what about the question is ambiguous. What form that will take depends on the question. It still beats "why?" because "the problem is ambiguity" is still some information. It means the problem is not safety, or clearly false assumptions. And it's a direct statement that the responder does not understand the question, which implies they aren't being totally condescending ("I understand you, but I refuse to help because I think your request is stupid.")

Importance vs Seriousness of projects

(Note: I'm not sure "serious" is the right word for what I mean here. As I was writing this, I overheard a random passerby say to someone, "that's unprofessional!" Perhaps "professional" is a better word for it.)

While working on some code for my MIT Mystery Hunt team, I started thinking about sorting projects by importance (i.e., how bad the consequences would be if they broke).

The code I'm working on is kind of important, since if it breaks it will impair the team's ability to work on puzzles. But it won't totally prevent progress (just make it harder to track it), and it's not like puzzles are a life-or-death issue. So it's a bit important, but not that important.

It's also possible to write code at Google that isn't that important -- for example, a side project with no external users and few internal ones. But even unimportant code at Google is typically "serious". By that I mean... I guess that it does things "properly" and incurs all the overhead involved?

Code in source control is more "serious" than code not in source control. (I don't even write hobby projects outside source control anymore, and haven't really for years -- that level of seriousness is table stakes now.) Code running in the cloud is more "serious" than code running on my desktop machine -- it's more effort to deploy, but it's less likely to suffer from a power outage or an ISP failure.

And it's also more possible to collaborate on "serious" projects -- writing your own web framework can genuinely get you advantages over using an existing one, but collaborating with others will be a lot harder.

Of course, if your project is important but not "serious", you have a big problem. You need it to keep working, but it's running on your laptop, using an undocumented framework only you know how to work with, using your personal Amazon credentials, and you do most of your testing in production. Sooner or later, your important project will break, and it will suck. (And if you do manage to keep it going, this will sometimes require a heroic effort.)

On the other hand, the costs of being too "serious" are all about overhead. You have a perfectly good computer on your desk, but you pay for two servers in the cloud anyway, one to deploy and one to test on. You have separate project accounts on various services, separate from your personal ones, and manage lots of extra credentials. (And you don't store them on your laptop, oh no; you store them in a credential storage service. Er, hm...)

"Don't test in production" is good advice for important projects, but it's defining advice for serious projects -- if you don't follow it, that's inherently less serious than if you do. Your overhead is lower, but with a higher chance of catastrophe.

This musing brought to you by the entire day I have wasted today, trying to get a local development environment to match the behavior of a cloud environment, to reproduce a problem. I haven't succeeded yet. The seriousness of this overhead really feels out of proportion to the project's importance -- it's not even mystery hunt season, nobody's even using it! Yet I would feel unserious leaving a trail of pushes to prod to diagnose the issue, without at least visibly struggling to do it "the right way" first.

(This also belatedly explains a number of annoyed disagreements I have had with other devs over project infrastructure. In each case, I was annoyed that their choices either seemed too serious -- having excessive overhead -- or else not serious enough.)

Nested layers of "options"

Here I use "option" in the sense of C++ std::optional<> / Rust Option / Haskell Maybe.

It feels to me like "real-world data" often ends up nested in thick layers of "optionality", with the number of layers limited mostly by how precisely we want to represent the state of our "un-knowledge" about it. When we get data from some source, which potentially got the data in turn from another source, and so on, there is some kind of fuzziness or uncertainty added at each step, which we may or may not need to represent.

I'm thinking about this because I'm feeding data from Home Assistant into Prometheus/Grafana to log and graph it, and ran across some weirdness in how Home Assistant handles exporting datapoints from currently-unavailable devices.

Layers of optionality that seem to exist here:

  • The sensor itself can tell us that the thing it's sensing is in an unknown state. (For example, the Home Assistant Android app has a "sensor" tracking the current exercise activity of the person using the phone, which is often "unknown".)
  • The sensor itself could in theory tell us that it has had some kind of internal malfunction and is not sensing anything (but is still connected to the network.) I don't have examples of this here.
  • The system only samples the sensor at certain times; this may not be one of those times, so the current state may be inferred from the previous state. (This is sort of a weird one, because it involves time-dependence, which is a whole other kettle of fish.)
  • The system's most recent attempt to sample the sensor could have failed, so that the latest value is unknown. (This is the case that gives the weirdness I ran into above -- the Home Assistant UI, and apparently also the Prometheus exporter, will repeat the last-known value for some time when this happens, which I think is ugly and undesirable.)
  • The Prometheus scraper could receive some kind of error state from Home Assistant. (In practice this would be merged with the following item.)
  • The Prometheus scraper could fail to reach the Home Assistant instance, and so not know anything about its state for the current instant.
  • (From this point on, I'm talking about things you wouldn't normally think of in this framework at all, but I think it fits:) Prometheus could display an error in the UI, because it can't reach its own database to get the values of the datapoint.
  • My browser could get an HTTP error from Prometheus indicating a failure to even produce a webpage.
  • My browser could give an error indicating that it couldn't reach Prometheus at all.
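To make the layering concrete, here is a minimal sketch in Python. Python's built-in `Optional` can't nest (there is only one `None`), so this uses two small wrapper classes instead. The names `Known`, `Missing`, `collapse`, `to_optional`, and the layer labels are all made up for illustration; none of this is how Home Assistant or Prometheus actually represents anything.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Known:
    """One layer that succeeded; `value` is whatever the layer below produced."""
    value: Any


@dataclass(frozen=True)
class Missing:
    """One layer that failed; `layer` records which step lost the data."""
    layer: str


def collapse(reading):
    """Peel off successful layers until we hit a failure or the raw value."""
    while isinstance(reading, Known):
        reading = reading.value
    return reading


def to_optional(reading):
    """Merge every kind of failure into a single None (loses the 'which layer')."""
    result = collapse(reading)
    return None if isinstance(result, Missing) else result


# Three hypothetical layers, outermost first: scrape -> sample -> sensor.
healthy = Known(Known(Known(21.5)))
sensor_unknown = Known(Known(Missing("sensor reports unknown")))
sample_failed = Known(Missing("sample attempt failed"))
scrape_failed = Missing("scrape failed")

for reading in (healthy, sensor_unknown, sample_failed, scrape_failed):
    print(collapse(reading), "|", to_optional(reading))
```

Keeping the nesting lets a consumer tell a failed scrape apart from a sensor that genuinely reports "unknown"; `to_optional` is what you get when you decide those distinctions aren't worth representing.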

I have obviously added every conceivable layer I could think of here, including some that don't usually get thought about in a uniform way, and some that we would in practice never bother to distinguish. But I'm a data packrat, and also an amateur type theorist, and so I think a lot about data representations.

This whole thing shades into another space I think a lot about, which is error handling in programming languages and systems.

Some parts of the stack I described above really seem to fall under "error handling" -- what do you do if you can't reach component A from component B? Others seem to fall under "data representation" -- if you poll someone about who they're voting for, and they say "I'm not voting", or "I don't know", or "fuck you", or "je ne parle pas anglais", what do you write down on the form (and which of those cases do you want to distinguish versus merge?) But the two are closely related.
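The polling example can be sketched the same way. This is only an illustration of the "distinguish versus merge" choice: the `NonAnswer` cases mirror the four responses above, while the `COARSE` grouping and the `record` function are assumptions of mine about what one particular form might want.

```python
from enum import Enum


class NonAnswer(Enum):
    """Fine-grained reasons a respondent did not name a candidate."""
    NOT_VOTING = "not voting"
    UNDECIDED = "doesn't know"
    REFUSED = "refused to answer"
    LANGUAGE_BARRIER = "didn't understand the question"


# One possible (hypothetical) merge for a form with fewer boxes.
COARSE = {
    NonAnswer.NOT_VOTING: "no preference",
    NonAnswer.UNDECIDED: "no preference",
    NonAnswer.REFUSED: "no response",
    NonAnswer.LANGUAGE_BARRIER: "no response",
}


def record(answer):
    """Return what goes on the form as a (kind, detail) pair."""
    if isinstance(answer, NonAnswer):
        return ("non-answer", COARSE[answer])
    return ("candidate", answer)


print(record("Alice"))
print(record(NonAnswer.UNDECIDED))
print(record(NonAnswer.LANGUAGE_BARRIER))
```

Whether "refused" and "language barrier" belong in the same box is exactly the kind of decision that has to be made per use case; the enum keeps the distinction available until the point where you choose to throw it away.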

3D Printer foibles

I got a 3d printer last year, and I've been using it on and off. I want to document some of the stuff I've learned in the process. I'll start with just an outline for now, and see if people are interested (or I feel inspired) for more specifics.

The specific printer is a Monoprice Voxel, which is a rebadged / whitelabel Flashforge Adventurer 3.

  • Had I known it was a whitelabel I would have instead bought the original version. I don't know if that one has the same firmware bugs, but there's at least one missing feature in the Monoprice firmware (nozzle temperature calibration when replacing the nozzle.) It's possible to reflash it with the OEM firmware but I haven't attempted this.

  • I've had lots of issues related to bed leveling.

    • The UI has a feature called Auto Level. (sp?) I'm not sure if it's aspirational or fraudulent, but it does nothing even remotely useful, it just seems to pretend to.
    • The only operation the firmware actually supports here is calibrating the bed height. The bed is supposed to be factory level. If it's out of level there is no supported way to level it. (This is marketed as "it doesn't need leveling", of course.)
    • I went through a lot of stuff in the process of figuring this out. In the end, I leveled the bed by disassembling it and inserting shimming strips of blue tape in key places where it mounts. This worked perfectly.
  • The filament feeding mechanism can detect if the spool runs out, by seeing the end of the filament pass through the mechanism. It cannot detect if the filament stops feeding because it gets stuck or jams. If the "tail" of the filament spool is attached to the spool itself and fails to come free when the spool runs out, this behaves like a jam, and the printer will keep trying to feed the spool forever. I left a spool jammed for hours overnight this way -- surprisingly there was no permanent damage to the printer, but it took some effort to figure that out and recover.

  • The original "collet" connecting the filament feeding tube ("Bowden tube") to the print head is not very good. It might be fine if you never mess with it, but in recovering from the incident above, I disassembled it, and after that it was loose until I replaced it. A loose Bowden tube can cause certain mysterious and hard-to-diagnose intermittent printing problems.

  • Getting prints to stick to the print bed the right amount seems to be a problem on every 3d printer, not specific to this one. I still have a lot of superstitions about it, BUT most of them are from before I got the bed properly level. Now that I've managed that, the acceptable values of various other parameters (bed material, bed covering material, nozzle height over the bed) seem much looser.

  • Sometimes the nozzle seems to get microscopic debris or something clogging it. The symptom is that plastic will intermittently stop flowing properly and then start again.

    • Differential diagnosis: This can also be caused by filament spool tangling, or (on the first layer) by the nozzle being squished too hard against the bed.
  • To convert a 3d model (STL file) to something the printer can print (gcode file), you use "slicer" software. The slicer software that comes with this printer is not very good. Most people use something else (I use Ultimaker Cura.) To teach Cura about my printer, I had to give it a printer profile. I got one from a random stranger's forum post (like you do.) It turned out to be subtly wrong, and to have a bunch of probably-superstitious crap in it that doesn't do anything.

    • The foibles of the slicing process would be an entire book on their own. But here's one that's maybe specific to this printer: If I turn on the "z hop" setting in Cura, which should raise the print head when moving around so it doesn't scrape the top of the model, the printer suddenly forgets to ever operate the Z axis at all, and tries to print every layer directly on the bed in the same space occupied by the previous layers. I don't know if this is Cura's fault, the printer's fault, or neither (gcode being a very loosely specified language in many respects.)

There's a lot more I could say, but since I said I was just outlining, I'll stop there...

What's up with all the foo-vectors?

This is an attempt to succinctly (hah) answer a question I keep having to refresh my memory about: What's up with (vectors, bivectors, axial vectors / pseudovectors, multivectors, the cross product, etc.?) How do they relate to each other?

Multivectors

Multivectors or k-vectors are a generalization of vectors. Vectors have a length and a direction, and can be thought of as one-dimensional; k-vectors generalize vectors to arbitrary dimension k. In this framework, a scalar -- a quantity without direction -- can be thought of as a 0-vector. A 1-vector is just a regular vector. A 2-vector (or "bivector") is a quantity associated with a two-dimensional "direction", which is an oriented plane. And so on.

What does it mean for a plane to be "oriented"? It means we pick one side to be the "right side" and the other to be the "wrong side". (In the same way, a vector is an "oriented line", which has a "right end" where we draw the arrowhead.)

The exterior ("wedge") product

We get multivectors from vectors using the exterior product, or "wedge product". In the 3-dimensional setting, the wedge product smells almost exactly like the cross product -- it takes two vectors and gives back a bivector, whose magnitude is the area of a parallelogram formed by those two vectors, and whose orientation depends on the relative directions of the two vectors. (I'm being deliberately vague here to avoid saying anything false; I could say "according to the right-hand rule" to get the general point across, but a later point will be that the left-right choice here is arbitrary, and could have been chosen the other way.)
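As a quick sanity check on the "smells like the cross product" claim, here is a small Python sketch of the wedge product of two 3-dimensional vectors. The function names and the choice to label the basis planes `yz`, `zx`, `xy` are my own conventions for this example; with that ordering the three components come out numerically identical to the cross product's components.

```python
import math


def wedge(a, b):
    """Bivector a ^ b, as components on the basis planes yz, zx, xy."""
    ax, ay, az = a
    bx, by, bz = b
    return {
        "yz": ay * bz - az * by,
        "zx": az * bx - ax * bz,
        "xy": ax * by - ay * bx,
    }


def magnitude(bivector):
    """Area of the parallelogram the bivector represents."""
    return math.sqrt(sum(c * c for c in bivector.values()))


a = (2.0, 0.0, 0.0)
b = (1.0, 3.0, 0.0)

ab = wedge(a, b)
ba = wedge(b, a)

print(ab, magnitude(ab))  # lies entirely in the xy plane, area 6
print(ba)                 # same plane, opposite orientation
```

Swapping the arguments flips the sign of every component, which is the "orientation depends on the relative directions of the two vectors" part; wedging a vector with itself gives zero, since it spans no area.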

Pseudovectors

In an n-dimensional space, a pseudovector (or axial vector) is an (n-1)-vector -- that is, an n-minus-one-dimensional multivector. (A pseudoscalar is an n-vector.) Consider a 3-dimensional space: A bivector picks out two dimensions of it (an oriented plane), but picking two out of three dimensions leaves just one dimension remaining un-picked. So every bivector (a plane with magnitude and orientation) can be matched up with some vector (with the same magnitude, and pointing normal to the plane in the direction of its orientation.)

So in 3-dimensional space, a bivector is a pseudovector, because it is very nearly equivalent to a regular vector. (And a trivector is a pseudoscalar -- there is only one possible basis-trivector, since there are only three dimensions and it has to span all of them. So a pseudoscalar only has a magnitude, and no meaningful direction, just like a regular scalar.)
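One way to see the matching-up is to count basis elements: a basis k-vector picks k of the n axes, so there are "n choose k" of them. The sketch below (the function name `basis_counts` is just something I made up) prints those counts.

```python
from math import comb


def basis_counts(n):
    """Number of independent basis k-vectors in n dimensions, for k = 0..n."""
    return [comb(n, k) for k in range(n + 1)]


print(basis_counts(3))  # scalars, vectors, bivectors, trivectors
print(basis_counts(4))
```

In 3 dimensions the counts are 1, 3, 3, 1: bivectors have exactly as many components as vectors, and there is a single basis trivector, just as there is a single scalar. In 4 dimensions the counts are 1, 4, 6, 4, 1, so a bivector there has 6 components and can no longer be paired with a vector; the pseudovectors are the 3-vectors instead.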

Orientation

Why did I say "very nearly equivalent" -- what's the "pseudo" part about? This is trickier to explain, and while it will work fine as a refresher for myself, I don't know if I will get it across fully to anybody else, but I'll try.

Consider unit vectors pointing along the X, Y, and Z axes. Consider also a bivector X ^ Y, which has unit magnitude, and is oriented with the "right side" pointing in the same direction as our Z vector.

Now, flip the whole space around as though you are looking at it in a mirror. You can do this by e.g. negating any of our three vectors. In the resulting space, the X ^ Y bivector is now oriented in the opposite direction from the Z vector. (Thinking about the right-hand rule, consider that a right hand viewed in the mirror looks like a left hand. So if we apply the right-hand rule to the mirrored space, it will point in the opposite direction from how it pointed in the non-mirrored space.) If you take the "pseudovector" view of it -- treating X ^ Y as something like a vector pointing along the Z axis, instead of a plane oriented towards the +Z axis -- you will see where the "pseudo" comes from. Reflecting the space in a mirror causes the vector and the pseudovector, which pointed in the same direction before, to now point in opposite directions.

If you haven't encountered this before, it's probably going to seem like sophistry or handwaving, sorry. All I can say to that is, I promise this actually makes a difference, although I cannot adequately explain why at this time.

The cross product

This all comes around to why people say things like "the cross product doesn't really give a vector!" Because if you look at the universe in a mirror, the result of the cross product does not behave like a vector. It will not appear mirrored like regular vectors, because its direction depends on handedness, and mirrors reverse handedness.
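Here is a small numeric check of that claim, as a sketch in Python. The mirror used is "negate the X coordinate" (any single-axis flip would do), and the helper names are mine. It compares two things: mirroring Z as an ordinary vector, versus mirroring X and Y first and then taking their cross product.

```python
def cross(a, b):
    """Ordinary right-handed cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def mirror_x(v):
    """Reflect a vector in the mirror plane x = 0."""
    return (-v[0], v[1], v[2])


x, y, z = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# Before mirroring, X cross Y and Z agree.
print(cross(x, y) == z)

# Mirror Z as a plain vector: it lies in the mirror plane, so it is unchanged.
mirrored_vector = mirror_x(z)

# Mirror the inputs, then take the cross product: the result flips.
cross_of_mirrored = cross(mirror_x(x), mirror_x(y))

print(mirrored_vector, cross_of_mirrored)
```

The two results point in opposite directions, even though they started out equal. The same thing happens for any pair of input vectors: the cross product of the mirrored inputs is the negative of the mirrored cross product.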

This also explains why sometimes people say "the cross product gives a bivector" and other people say "the cross product gives a pseudovector". In 3-dimensional space, which is essentially the only place the familiar cross product is defined, the two are equivalent.

I got a very good intuition about foo-vectors from the post "Let's remove Quaternions from every 3D Engine".

Actually, I think that post is probably what triggered me to write this originally, and I forgot that by the time I wrote it (or I would have added a link.) Thanks for the reminder!

"A 1-vector is just a regular vector. A 2-vector (or "bivector") is a quantity associated with a two-dimensional "direction", which is an oriented plane. And so on."

Ok, but how do you actually "and so on" the orientability here? I have not actually tried to picture how you orient a 3-vector in a higher space. And I'm suspicious about my analogy between 1-vector and 2-vector orientation until I can picture that. (You can orient a plane by picking one of the two halves it divides a 3-d volume into, but you normally orient a line by thinking about the ends, not the sides where it divides the plane. Does that matter?)

I think the way this all works is a lot more subtle than I've been imagining, and probably some of the stuff in the original shortform about orientation is wrong.