This is an adaptation of an internal doc I wrote for Wave.

I used to think that behavioral interviews were basically useless, because it was too easy for candidates to bullshit them and too hard for me to tell what was a good answer. I’d end up grading every candidate as a “weak yes” or “weak no” because I was never sure what bar I should hold them to.

I still think most behavioral interviews are like that, but after doing way too many behavioral interviews, I now think it’s possible to escape that trap. Here are my tips and tricks for doing so!

Confidence level: doing this stuff worked better than not doing it, but I still feel like I could be a lot better at behavioral interviews, so please suggest improvements and/or do your own thing :)

Before the interview

Budget 2+ hours to build

That’s how long I usually take to design and prepare a new type of interview. If I spend a couple hours thinking about what questions and follow-ups to ask, I’m much more likely to get a strong signal about which candidates performed well.

It might sound ridiculous to spend 2 hours building a 1-hour interview that you’ll only give 4 times. But it’s worth it! Your most limited resource is time with candidates, so spending more of your own time to use candidates’ time better is a good trade.

Think ahead about follow-ups and rubric

I spend most of those 2 hours trying to answer the following question: “what answers to these questions would distinguish a great candidate from a mediocre one, and how can I dig for that?” I find that if I wait until after the interview to evaluate candidates, I rarely have conviction about them, and fall back to grading them a “weak hire” or “weak no-hire.”

To avoid this, write yourself a rubric of all the things you care about assessing, and what follow-up questions you’ll ask to assess those things. This will help you deliver the interview consistently, but most importantly, you’ll ask much better follow-up questions if you’ve thought about them beforehand. See the appendix for an example rubric.

Focus on a small number of skills

I usually focus on 1-3 related skills or traits.

To get a strong signal from a behavioral interview question I usually need around 15 minutes, which only leaves time to discuss a small number of scenarios. For example, for a head of technical recruiting, I decided to focus my interview on the cluster of related traits of being great at communication, representing our culture to candidates, and holding a high bar for job candidate experience.

You should coordinate with the rest of the folks on your interview loop to make sure that, collectively, you cover all the most important traits for the role.

During the interview

Kicking off

My formula for kicking off a behavioral question is “Tell me about a recent time when [X situation happened]. Just give me some brief high-level context on the situation, what the problem was,[1] and how you addressed it. You can keep it high-level and I’ll ask follow-up questions afterward.”

I usually ask for a recent time to avoid having them pick the one time that paints them in the best possible light.

The second sentence (context/problem/solution) is important for helping the candidate keep their initial answer focused—otherwise, they are more likely to ramble for a long time and leave less time for you to…

Dig into details

Almost everyone will answer the initial behavioral interview prompt with something that sounds vaguely plausible, even if they don’t usually behave in the ways you’re looking for. The best way to figure out whether they’re being real or BSing you is to get them to tell you a lot of details about the situation—the more you get them to tell you, the harder it is to keep all the details consistent while BSing.

General follow-ups you can use to get more detail:

  • Ask for a timeline—how quickly people operate can be very informative. (Example: I asked someone how they dealt with an underperforming direct report and they gave a compelling story, but when I asked for the timeline, it seemed that weeks had elapsed between noticing the problem and doing anything about it.)

  • “And then what happened?” / “What was the outcome?” (Example: I asked this to a tech recruiter for the “underperforming report” question and they admitted they had to fire the person, which they hadn’t previously mentioned—that’s a yellow flag on honesty.)

  • Ask how big of an effect something had and how they know. (Example: I had a head of technical recruiting tell me “I did X and our outbound response rate improved;” when I asked how much, he said from 11% to 15%, but the sample size was small enough that the improvement could have been random chance! See the sketch after this list for how to check a claim like this.)

  • “Is there anything you wish you’d done differently?” (Sometimes people respond to this with non-actionable takeaways like “I wish I’d thought of that idea earlier” while having no plan or mechanism that could plausibly cause them to think of the idea earlier next time.)
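
To make the “random chance” point concrete, here’s the kind of quick check I have in mind. This is a minimal sketch in Python with made-up numbers (the candidate never gave me denominators, so the 100 messages per period below are purely an assumption), using a standard two-proportion z-test:

```python
# Sketch: is a claimed 11% -> 15% response-rate improvement
# distinguishable from random chance? Sample sizes are hypothetical.
from math import erf, sqrt

def two_proportion_p_value(x1: int, n1: int, x2: int, n2: int) -> float:
    """Two-sided p-value for a two-proportion z-test."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)            # pooled response rate
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))   # standard normal CDF
    return 2 * (1 - phi)

# 11 responses out of 100 messages before, 15 out of 100 after
print(two_proportion_p_value(11, 100, 15, 100))  # ~0.40
```

At these (assumed) sample sizes the p-value is about 0.4: a jump from 11% to 15% is entirely consistent with no real change, which is why “how much, and how do you know?” is such a useful follow-up.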

Evaluating candidates

Make yourself a rubric

One of the worst mistakes you can make in a behavioral interview is to wing it: to ask whatever follow-up questions pop into your head, and then at the end try to answer the question, “did I like this person?” If you do that, you’re much more likely to be a “weak yes” or “weak no” on every candidate, and to miss asking the follow-up questions that could have given you stronger signal.

Instead, you should know what you’re looking for, and what directions to probe in, before you start the interview. The best way to do this is to build a scoring rubric, where you decide what you’re going to look for and what a good vs. bad answer looks like. See the appendix for an example.

General things to watch out for

Of course, most of your rubric should be based on the details of what traits you’re trying to evaluate! But here are some failure modes that are common to most behavioral interviews:

  • Vague platitudes: some people have a tendency to fall back on vague generalities in behavioral interviews. “In recruiting, it’s all about communication!” “No org structure is perfect!” If they don’t follow this up with a more specific, precise, or nuanced claim, they may not be a strong first-principles thinker.

  • Communication bandwidth: if you find that you’re struggling to understand what the person is saying or get on the same page as them, this is a bad sign about your ability to discuss nuanced topics in the future if you work together.

  • Self-improvement mindset: if the person responds to “what would you do differently” with “nothing,” or with non-actionable vague platitudes, it’s a sign they may not be great at figuring out how to get better at things over time.

  • Being embarrassingly honest: if probing for more details reveals that the thing went less well than the original impression you got, the candidate is probably trying to “spin” the story at least a little bit.

  • High standards: if they say there’s nothing they wish they’d done differently, this may also be a lack of embarrassing honesty, or a sign they don’t hold themselves to a high standard. (Personally, even for projects that went exceptionally well, I can think of lots of individual things I could have done better!)

  • Scapegoating: if you ask about solving a problem, do they take responsibility for contributing to the problem? It’s common for people to imply or say that problems were all caused by other people and solved by them (e.g. “this hiring manager wanted to do it their way, and I knew they were wrong, but I couldn’t convince them…”). Sometimes this is true, but usually problems aren’t a single person’s fault!

Appendix: example rubric and follow-ups

Here’s an example rubric and set of follow-up questions for a Head of Technical Recruiting.

Question: “Tell me about a time when your report wasn’t doing a good job.”

  • moving quickly to detect and address the issue
    • ask for a timeline of events 
    • bad answer = lots of slop in time between “when problem started” / “when you discovered” / “when you addressed”
  • setting clear expectations with their report and being embarrassingly honest
    • ask what the conversation with their report was like
  • making their reports feel psychologically safe
    • ask how they thought their report felt after the tough convo
    • bad answer = not sure, or saying things in a non-supportive / non-generous way
  • being effective at discovering the root problem
    • walk them through a mini postmortem / “five whys”
    • bad answer = not having deep understanding of root dynamics, only symptoms
  • understanding whether what they did worked
    • ask for concrete metrics on how things were going before and after they intervened
    • bad answer = not having metrics, having metrics that moved only a small amount (and not realizing this is a failure), etc.
  • learning and updating over time
    • ask them what they could have done differently next time
    • bad answer = vague platitudes or “nothing”

  [1] This phrasing fits behaviors that involve addressing a problem; reword “problem” as appropriate for other kinds of questions.
Comments

I notice that you have a lot of specific examples of bad answers but no specific examples of good answers - are good answers just obviously good, or are ~all answers not specifically called out as bad answers generally good, or something else? Would be curious to see some examples of good answers.

On the whole, are they worth doing for the average person? It sounds like you’re pretty good at this kind of thing and got a ton of practice, and it’s still meh.

I haven't looked into this recently, but last time I looked at the literature behavioral interviews were far more predictive of job performance than other interviewing methods.

It's possible that they've become less predictive as people started preparing for them more.

The two paired procedures with the highest mean validity for predicting job performance are general mental ability (GMA) plus an integrity test, and GMA plus a structured interview (Schmidt et al.’s 2016 meta-analysis of “100 years of research in personnel selection,” reviewing 31 procedures, via 80,000 Hours – check out Table 2 on page 71). GMA alone beats all other single procedures; integrity tests not only beat all other non-GMA procedures but also correlate nearly zero with GMA, hence the combination’s efficacy.

A bit more on integrity tests, if you (like me) weren't clear on them:

These tests are used in business and industry to hire employees with reduced probability of counterproductive work behaviors on the job, such as fighting, drinking or taking drugs, stealing from the employer, equipment sabotage, or excessive absenteeism. Integrity tests do predict these behaviors, but surprisingly they also predict overall job performance (Ones, Viswesvaran, & Schmidt, 1993).

Behavioral interviews – which Schmidt et al call situational judgment tests – are either in the middle of the rankings (for knowledge-based tests) or near the bottom (for behavioral tendencies). Given this, I’d be curious what value Ben gets out of investing nontrivial effort into running them, cf. Luke’s comment.

Communication bandwidth: if you find that you’re struggling to understand what the person is saying or get on the same page as them, this is a bad sign about your ability to discuss nuanced topics in the future if you work together.

Just pulling this quote out to highlight the most critical bit. Everything else is about distinguishing between BS and the ability to remember, understand, and communicate details of an event (note: this is a skill not often found at the 100 IQ level). That second thing isn’t necessarily a job requirement for all positions (e.g. sales, entry-level positions), but being comfortable talking with your direct reports is always critical.

I really appreciate this post!