User Comment Replies

This seems to assume that 100% of claims get approved. How can the equation be modified to account for the probability of claims being denied?

I would guess lower-cost insurance policies tend to come from companies with lower claim approval rates, so it seems appropriate to price this into the calculator. I believe there are also softer elements of insurance cost that should be considered, such as customer service quality, but that's probably out of scope for this calculator.

2 kqr 4mo

Fundamentally we are taking the probability-weighted expectation of log-wealth under all possible outcomes from a single set of actions, and comparing this to all other sets of actions. The way to work in uncompensated claims is to add another term for that outcome, with the probability that the claim is unpaid and the log of wealth corresponding to both paying that cost out of pocket and fighting the insurance company about it.
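To make the extra term concrete, here is a minimal sketch of the idea, not the calculator's actual code. All function names, parameter names, and the numbers in the usage note are assumptions chosen for illustration. It treats the insured case as three outcomes: no loss, a loss with the claim paid, and a loss with the claim denied, where the denied branch pays the full loss out of pocket plus a hypothetical cost of disputing the claim.

```python
import math


def expected_log_wealth(outcomes):
    """Probability-weighted expectation of log-wealth.

    `outcomes` is a list of (probability, final_wealth) pairs whose
    probabilities must sum to 1.
    """
    total_probability = sum(p for p, _ in outcomes)
    if not math.isclose(total_probability, 1.0):
        raise ValueError("outcome probabilities must sum to 1")
    # Skip zero-probability outcomes so they contribute nothing.
    return sum(p * math.log(w) for p, w in outcomes if p > 0)


def insured_outcomes(wealth, premium, deductible, loss,
                     p_loss, p_denied, dispute_cost=0.0):
    """Outcomes when buying insurance, with a chance of claim denial.

    `p_denied` and `dispute_cost` are illustrative assumptions: the
    probability that a claim goes unpaid, and the cost of fighting
    the insurer about it.
    """
    return [
        # No loss occurs: only the premium is paid.
        (1 - p_loss, wealth - premium),
        # Loss occurs and the claim is paid: premium plus deductible.
        (p_loss * (1 - p_denied), wealth - premium - deductible),
        # Loss occurs and the claim is denied: premium, the full loss
        # out of pocket, and the cost of the dispute.
        (p_loss * p_denied, wealth - premium - loss - dispute_cost),
    ]


def uninsured_outcomes(wealth, loss, p_loss):
    """Outcomes when going without insurance."""
    return [(1 - p_loss, wealth), (p_loss, wealth - loss)]
```

As a usage example, one could compare `expected_log_wealth(insured_outcomes(50_000, 600, 500, 20_000, 0.05, 0.10, 1_000))` with `expected_log_wealth(uninsured_outcomes(50_000, 20_000, 0.05))` and buy the policy only if the first value is larger. Setting `p_denied` to 0 recovers the version that assumes every claim is approved.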

Bounty: Diverse hard tasks for LLM agents

Sunishchal Dev1yΩ150

Thanks, this is helpful!

I noticed a link in the template.py file that I don't have access to. I imagine this repo is internal only, so could you provide the list of permissions as a file in the starter pack?

# search for Permissions in https://github.com/alignmentrc/mp4/blob/v0/shared/src/types.ts

6 Beth Barnes 1y

Ah, sorry about that. I'll update the file in a bit. The options are "full_internet" and "http_get". If you don't pass any permissions, the agent will get no internet access. The prompt automatically adjusts to tell the agent what kind of access it has.
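As a rough illustration of that behaviour, here is a hypothetical sketch; it is not the code from the linked repo or from template.py. Only the two permission strings come from the reply above. The helper name, the message wording, and the reading of "http_get" as GET-only access are assumptions.

```python
# The two permission strings named in the reply above.
ALLOWED_PERMISSIONS = {"full_internet", "http_get"}


def describe_access(permissions=None):
    """Return a prompt sentence describing the agent's internet access.

    Hypothetical helper: the wording of each message is an assumption,
    as is treating "full_internet" as taking precedence over "http_get".
    """
    permissions = list(permissions or [])
    unknown = set(permissions) - ALLOWED_PERMISSIONS
    if unknown:
        raise ValueError(f"unknown permissions: {sorted(unknown)}")
    if not permissions:
        # No permissions passed: the agent gets no internet access.
        return "You have no internet access."
    if "full_internet" in permissions:
        return "You have full internet access."
    return "You can make HTTP GET requests only."
```

Calling `describe_access()` with no arguments gives the no-access message, which mirrors the default described in the reply.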

Bounty: Diverse hard tasks for LLM agents

Sunishchal Dev1yΩ5110

Thanks for the detailed instructions for the program! Just a few clarifications before I dive in:

1. The README file's Airtable links for task idea & specification submission seem to be the same. Did you mean to paste a different link for task ideas?
2. Are the example task definitions in the PDF all good candidates for implementation? Is there any risk of doing duplicate work if someone else chooses to do the same implementation as me?
3. If I want to do an implementation that isn't in the examples list, is it a good idea to first submit it as an idea and w...

9 Beth Barnes 1y

Great questions, thank you!

1. Yep, good catch. Should be fixed now.
2. I wouldn't be too worried about it, but it's very reasonable to email us with the idea you plan to start working on.
3. I think it's fine to do the specification without waiting for approval, and reasonable to do the implementation as well if you feel confident it's a good idea, but feel free to email us to confirm first.
4. That's a good point! I think using an API-based model is fine for now: the scoring shouldn't be too sensitive to the exact model used, so it should be fine to swap it out for another model later. Remember that it's fine to have human scoring as well.


All of Sunishchal Dev's Comments + Replies