nwinter

nickwinter.net, codecombat.com, skritter.com, quantified-mind.com

nwinter

I think that both these clauses are very standard in such agreements. Both severance letter templates I was given for my startup, one from a top-tier SV investor's HR function and another from a top-tier SV law firm, had both clauses. When I asked Claude, it estimated that 70-80% of startups would have a similar non-disparagement clause and 80-90% would have a similar confidentiality-of-this-agreement's-terms clause. The top three Google hits for "severance agreement template" all included those clauses.

These generally aren't malicious. Terminations get messy, and departing employees often have a warped or incomplete picture of why they were terminated–it's not a good idea to tell them all those details, because that adds liability, and some of those details are themselves confidential about other employees. Companies view the limitation of liability from the release of various wrongful-termination claims as part of the value they're "purchasing" by offering severance–not because those claims would succeed, but because it's expensive to explain in court why they're justified.

But the expenses disgruntled ex-employees can cause are not just legal; they're also reputational. You usually don't know which ex-employee will get salty and start telling their side of the story publicly, where you can't easily respond with your side without opening up liability. Non-disparagement helps cover that side of it. And if you want to disparage the company, in a standard severance letter that doesn't claw back vested equity, hey, you're free to just not sign it–what's on the line is likely only a few bonus weeks' or months' salary that you hadn't yet earned, not the value of all the equity you had already vested. We shouldn't conflate the OpenAI situation with Anthropic's, given the huge difference in stakes.

Confidentiality clauses are standard because they prevent other employees from learning the severance terms and potentially demanding similar treatment in potentially dissimilar situations, thus helping the company control costs and negotiations in future separations. They typically cover the entire agreement and are mostly about the financial severance terms. I imagine that departing employees who cared could've asked the company for a carve-out from the confidentiality clause for the non-disparagement clause, as a very minor point of negotiation.

It's great that Anthropic is taking steps to make these docs more departing-employee-friendly. I wouldn't read too much into the fact that the docs were like this in the first place (this wasn't on cultural radars until very recently) or that they weren't immediately changed (legal work takes time, and this was much smaller in scope than the OpenAI case).

Example clauses in default severance letter from my law firm:

7. Non-Disparagement.  You agree that you will not make any false, disparaging or derogatory statements to any media outlet, industry group, financial institution or current or former employees, consultants, clients or customers of the Company, regarding the Company, including with respect to the Company, its directors, officers, employees, agents or representatives or about the Company's business affairs and financial condition.

11. Confidentiality.  To the extent permitted by law, you understand and agree that as a condition for payment to you of the severance benefits herein described, the terms and contents of this letter agreement, and the contents of the negotiations and discussions resulting in this letter agreement, shall be maintained as confidential by you and your agents and representatives and shall not be disclosed except to the extent required by federal or state law or as otherwise agreed to in writing by the Company.

nwinter

Right, Quantified Mind tests are not normed, so you couldn't say "participants added 10 IQ points" or even "this participant went from 130 to 140".

However, they do have a lot of data from other test-takers, so you can say, "participants increased 0.7 SDs [amidst the population of other QM subjects]" or "this participant went from +2.0 to +2.7 SDs", broken down very specifically by subskill.  You are not going to get any real statistical power using full IQ tests.
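To make that concrete, here's a minimal sketch of what standardizing against other test-takers looks like. All the numbers and names here are made up for illustration; this is not QM's actual scoring code.

```python
import statistics

# Hypothetical numbers throughout -- just an illustration of scoring a
# participant against the pool of other test-takers on the same subtest,
# rather than against IQ norms.
population_scores = [412, 455, 390, 501, 470, 430, 488, 415, 460, 445]
pop_mean = statistics.mean(population_scores)  # mean of other QM subjects
pop_sd = statistics.stdev(population_scores)   # SD of other QM subjects

def z(score):
    """SDs above/below the mean of other test-takers on this subtest."""
    return (score - pop_mean) / pop_sd

before, after = 495, 520  # one participant's pre/post averages (made up)
print(f"participant went from {z(before):+.1f} to {z(after):+.1f} SDs")
```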

In terms of saturating the learning effect, that's a better approach, but getting people to put their time into doing that makes it even harder.

It sounds like the protocols involve hours of daily participant effort over multiple weeks. Compared to that, it seems doable to have participants do 5-10 minutes of daily baseline psychometrics (which double as practice) for 2-4 weeks before the experimental protocols begin. That amount of practice washout might not be enough, but if your effects are strong, it might be.

In reality, that's table stakes for measuring cognitive effects from anything short of the strongest interventions (like giving vs. withholding caffeine from someone accustomed to having it). I recall the founder of Soylent approached us at the beginning, wanting to test whether it had cognitive benefits. When we told him how much testing he would need to have subjects do, he shelved the idea. A QM-like approach reduces the burden of cognitive testing as much as possible, but you can't reduce it further than this without losing the power to detect effects.

On a more positive note, if you have a small number of participants who are willing to cycle your protocols for a long time, you can get a lot of power by comparing the on- and off-protocol time periods. So if this level of testing and implementation of protocols would be too daunting to consider for dozens of participants, but you have four hardcore people who can do it all for half a year, then you can likely get some very solid results.
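As a toy illustration of that on/off comparison (all numbers invented, and a real analysis would need to handle autocorrelation and practice trends):

```python
from scipy import stats

# Toy sketch: one participant's daily composite z-scores during on- and
# off-protocol periods (all numbers made up). Comparing a participant to
# their own off-protocol baseline removes the huge interpersonal variance.
on_days = [0.31, 0.45, 0.22, 0.50, 0.38, 0.41, 0.35, 0.48]
off_days = [0.12, 0.05, 0.20, -0.02, 0.10, 0.15, 0.08, 0.11]

t, p = stats.ttest_ind(on_days, off_days)
effect = sum(on_days) / len(on_days) - sum(off_days) / len(off_days)
print(f"on-minus-off effect: {effect:.2f} SDs, p = {p:.4f}")
# Caveat: daily scores are autocorrelated and trend with practice, so a
# real analysis should model those rather than treating days as independent.
```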

If I sound skeptical about measurable effects on cognitive tests from various interventions, it's because, as I recall, virtually none of the experiments we ran (on ourselves, with academic collaborators from Stanford, with QS volunteers, etc.) ever led to any significant increases. The exceptions were all around removing negative interventions (being tired, not having your normal stimulants, alcohol, etc.); the supposed positives (meditation, nootropics, music, exercise, specific nutrients, etc.) consistently either did roughly nothing or had a surprising negative effect (butter). What this all reinforced:

  • it's easy to fool yourself with self-reports of cognitive performance (unreliable)
  • it's easy to fool yourself with underpowered experiments (especially due to practice effects in longer and more complicated tests)
  • virtually no one does well-powered experiments (because, as above, it's hard)

This gives me a strong prior against most of the "intervention X boosts cognition!" claims. ("How would you know?")

Still, I'm fascinated by this area and would love to see someone do it right and find the right interventions. If you offset different interventions in your protocols, you can even start to measure which pieces of your overall cocktail work, in general and for specific participants, and which can be skipped or are even hurting performance. I have a very old and poorly recorded talk on a lazy way to do this.
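Here's a sketch of what offsetting can look like, with cycle lengths I'm making up for illustration: each component of the cocktail gets its own on/off period, so the components don't always co-occur and can later be separated in a regression.

```python
# Hypothetical offset schedule: each intervention cycles on/off with a
# different period (cycle lengths in days, made up), so no two components
# are perfectly correlated across the experiment.
interventions = {"meditation": 4, "nootropic": 6, "exercise": 10}

def schedule(day):
    """Which components are 'on' for a given day of the experiment."""
    return {name: (day // period) % 2 == 0
            for name, period in interventions.items()}

for day in range(12):
    active = [name for name, on in schedule(day).items() if on]
    print(day, active)
```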

One last point: all of this kind of psychometric testing, like IQ tests, only measures subjects' alert, "aroused" performance, which is close to peak performance and is very hard to affect. Even if you're tired and not at your best, just plodding along, when someone puts a cognitive test in front of you–boom, let's go, wake up, it's time–energy levels go up, the test goes well, and then it's back to your slump. Most interventions that might make you generally more alert and significantly increase average, passive performance will end up having a negligible impact on the peak, active performance that the tests are measuring.

If I were building more cognitive testing tools these days, I would try to build things that infer mental performance passively, without triggering this testing arousal. Perhaps that is where the real impacts from interventions are plentiful, strong, and useful.

nwinter

Awesome! It's on an old version of Google App Engine, so it's not very vulnerable to that form of data loss, but it is very vulnerable to code rot and needs to be migrated. (It was originally running on quantified-mind.com, but he hasn't thought about it in a long time and let the domain expire.)

Is that upgrade process something you could help with? The underlying platform is pretty good, and he put a lot of time into adapting gold-standard psychometric tests in a way that allows for easy, powerful Quantified-Self-style experimentation, but the project doesn't have a maintainer.

nwinter

> People to help me get better psychometrics, the variance in my dataset is huge and my tests stop working at 3 STDs of IQ, for the most part. I'd love to have one or two more comprehensive tests that are sensitive to analyses up to 5 STDs

A friend of mine made https://quantified-mind.appspot.com/ for measuring experiments like this (I helped with the website). It sounds like a good fit for what you're doing. You can create an experiment, invite subjects to it, and have them test daily, at the same time of day, for perhaps 5-15 minutes a day, for at least a few weeks. Ideally you cycle the subjects in and out of the experimental condition multiple times, so that each subject's off-protocol periods serve as the controls, rather than using other people as controls, because the interpersonal variance is so high.
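For a rough sense of how many testing days that implies, here's a back-of-the-envelope power calculation using the standard two-sample formula, under strong simplifying assumptions (independent daily scores; sigma and delta are whatever your daily composite's SD and hoped-for effect turn out to be):

```python
from math import ceil

# Back-of-the-envelope only: assumes daily scores are independent with SD
# sigma, and the intervention shifts the mean by delta. Days needed per
# condition for ~80% power at alpha = 0.05 with a two-sample test:
#   n ~= 2 * ((1.96 + 0.84) * sigma / delta)^2
def days_per_condition(sigma, delta):
    return ceil(2 * ((1.96 + 0.84) * sigma / delta) ** 2)

print(days_per_condition(sigma=0.5, delta=0.3))  # -> 44 days per condition
print(days_per_condition(sigma=0.5, delta=0.5))  # -> 16 days per condition
```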

> ... not taking into account memorization happening on the IQ tests

Practice effects on cognitive testing are high, to the point where gains from practice usually dominate gains from interventions until tens of hours of practice for most tests.  This effect is higher the more complicated the test: practice effects attenuate faster with simple reaction time than with choice reaction time than with Stroop than with matrices than with SATs. This means you typically want to test on low-level basic psychometric tests, have subjects practice all the tests quite a bit before you start measuring costly interventions, and include time or test number as one of the variables you're analyzing.

Apart from practice, the biggest typical confounders are things like caffeine/alcohol, time of day, amount of sleep, and timing of meals, so you'd either want to hold those variables constant or make sure they're measured as part of your experiment.
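Putting the last two points together, here's a sketch (with simulated data, not a prescribed analysis) of regressing scores on a practice term plus measured confounders, so the protocol effect isn't contaminated by either:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical sketch: model each session's score with a practice term
# (session number) and measured confounders, so the on_protocol coefficient
# isn't confounded with practice, caffeine, or sleep. All data simulated.
rng = np.random.default_rng(0)
n = 80
df = pd.DataFrame({
    "session": np.arange(n),
    "on_protocol": (np.arange(n) // 10) % 2,  # ABAB... cycling
    "caffeine": rng.integers(0, 2, n),        # had usual caffeine that day?
    "sleep_hours": rng.normal(7.5, 1.0, n),
})
df["score"] = (0.4 * np.log1p(df["session"])  # log-shaped practice effect
               + 0.15 * df["on_protocol"]     # true intervention effect
               + 0.20 * df["caffeine"]
               + 0.10 * (df["sleep_hours"] - 7.5)
               + rng.normal(0, 0.3, n))

fit = smf.ols("score ~ np.log1p(session) + on_protocol + caffeine + sleep_hours",
              data=df).fit()
print(fit.params["on_protocol"])  # practice- and confounder-adjusted effect
```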

These are my recollections from what we learned–my friend did most of the actual experiments and knows much more. If you want to go deep on experimental design, I can ask him.

nwinter

I talked to someone building the browser plugin version of this a year ago. Sorry, I don't remember what it was called.

nwinter

I'll be there, too. Having a Less Wrong sign would be good.