Designing Artificial Wisdom: GitWise and AlphaWise

Jordan Arel

Introduction

In this post I will describe two possible designs for Artificial Wisdom (AW). This post can easily be read as a stand-alone piece; however, it is also part of a series on artificial wisdom. In essence:

Artificial Wisdom refers to artificial intelligence systems which substantially increase wisdom in the world. Wisdom may be defined as “thinking/planning which is good at avoiding large-scale errors,” or as “having good terminal goals and sub-goals.” By “strapping” wisdom to AI via AW as AI takes off, we may be able to generate enormous quantities of wisdom which could help us navigate Transformative AI and The Most Important Century wisely.

TL;DR

Artificially wise coaches that improve human wisdom seem like another promising path to AW. Such coaches could have negligible costs, be scalable & personalized, and soon perform at a superhuman level. Certain critical humans receiving wise coaching could be decisive in humans navigating transformative AI wisely.

One path to AW coaches is to create a decentralized system like a wiki or GitHub for wisdom-enhancing use-cases. Users could build up a database of instructions for LLMs to act as AW coaches to help users make difficult decisions, navigate difficult life and epistemic dilemmas, work through values conflicts, achieve career goals, improve relational/mental/physical/emotional well-being, and increase fulfillment/happiness.

One especially wise use-case could be a premortem/postmortem bot that helps people, organizations, and governments to avoid large-scale errors.

Another path to creating an AW coach is to build a new system trained on biographical data, which analyses and learns to predict which decision-making processes and strategies of humans with various traits in various environments are most effective for achieving certain goals.

Artificial Wisdom Coaches

There are several possible paths for developing AW coaches. After introducing the basic idea, I will briefly outline two of them.

The essential idea is that AW can learn from the best human coaches, therapists, teachers, or other wise data and training mechanisms, but can study orders of magnitude more data, which it never forgets, plus be fine-tuned on a large amount of human outcome data directly tied to the results of its coaching.

Artificially wise coaches will eventually perform much better than human coaches and be much more scalable, with negligible costs. Hence they will be able to help humans perform at a higher level and not fall behind AI as much: as AI gets better, humans will also be getting better AW coaches (again, this is the idea of “strapping” human wisdom to AI as AI blasts off, as discussed in the last post).

GitWise

The first way of creating an AW coach is via a decentralized system in which numerous people interested in this concept contribute to gradually build up a database of ways in which to use LLMs to improve human functioning and increase human wisdom.

This could be something like a wiki, a forum, or a GitHub for wise AI use; maybe we could call it “GitWise.”

This database could include many aspects, such as:

What background information to share and how to share it
Specific prompts
Prompt workflows
Highly effective use cases
Tips for effectively interacting with AW coaches
Etc.

As just one example, in the first category mentioned above, there might be various processes for listing out all of the personal background information an AW coach needs in order to help you specifically, perhaps in a specific use case or across all use-cases. For example, there could be instructions with examples for listing out your most important life goals, your values, your daily routine, important parts of your life history, what traits you want in a coach, your preferred learning styles, what is most motivating and demotivating to you, and various details about your occupation, hobbies, social life, etc. There could also be instructions on how exactly to present all of this to an AW coach, whether as a PDF in the context window, as part of its memory, etc.

Each set of instructions/prompts/workflows/etc. would be rated by users, perhaps across the most relevant traits such as effectiveness, user-friendliness, etc. These ratings would also contribute to the contributing user’s own rating, so that users can easily find the best instructions and contributors can build reputations. A more advanced version could track more detailed metrics of how effectively

AW coach instructions could be organized across many categories; for example, instructions could help users make difficult decisions, navigate difficult life and epistemic dilemmas, work through values conflicts, achieve career goals, improve relational/mental/physical/emotional well-being, increase fulfillment/happiness, etc.
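To make the rating idea above a little more concrete, here is a minimal sketch of how entries, per-trait ratings, and contributor reputations might fit together. Everything in it is an assumption for illustration: the `Entry` schema, the 1-5 star scale, and the choice to define reputation as the mean score of an author's rated entries are not part of any existing system.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Entry:
    """One contributed set of coach instructions (hypothetical schema)."""
    title: str
    author: str
    category: str
    # Trait name (e.g. "effectiveness") -> list of 1-5 star ratings from users.
    ratings: Dict[str, List[int]] = field(default_factory=dict)

    def rate(self, trait: str, stars: int) -> None:
        """Record one user's rating of this entry on one trait."""
        if not 1 <= stars <= 5:
            raise ValueError("stars must be between 1 and 5")
        self.ratings.setdefault(trait, []).append(stars)

    def score(self) -> Optional[float]:
        """Mean of the per-trait means, or None if nobody has rated yet."""
        if not self.ratings:
            return None
        return mean([mean(stars) for stars in self.ratings.values()])


def reputation(entries: List[Entry], author: str) -> Optional[float]:
    """Assumed reputation rule: mean score of the author's rated entries."""
    scores = [e.score() for e in entries if e.author == author]
    scores = [s for s in scores if s is not None]
    return mean(scores) if scores else None


def best_in_category(entries: List[Entry], category: str) -> List[Entry]:
    """Rated entries in one category, highest score first."""
    rated = [e for e in entries
             if e.category == category and e.score() is not None]
    return sorted(rated, key=lambda e: e.score(), reverse=True)
```

A real system would likely want to weight ratings by rater reliability and guard against gaming, which this sketch ignores.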

As the base model, and hence the AW coach, gets increasingly intelligent and eventually superhumanly intelligent, it will be able to help humans perform at increasingly high levels. As discussed in the first piece, high-performing humans who make fewer large-scale mistakes could be incredibly important in certain existential security/longtermist domains, and effective, happy humans seem good to have in general.

In addition to individual humans, this AW design could also be adapted to apply to teams, for-profit and non-profit organizations, and governments.

Premortem/Postmortem Bot

I was originally going to include this as a separate AW design, but realized I don't have enough technical knowledge to fully flesh it out as its own project, so I'm including it as a sub-idea within the GitWise coach.

The idea is that an LLM could help perform premortems on important projects. It could help think through all of the ways that a project could go wrong at each step, and pre-plan how to avoid or deal with the most likely obstacles and most serious risks.

It could also help perform postmortems on projects that have failed, in which projects are analyzed to see what went wrong and what could have been done differently, the various ways failure modes could have been systematically avoided or resolved, and how to perform better in the future, including redesigning processes to broadly avoid similar classes of problems as effectively and efficiently as possible.

While such a bot could easily be created by giving an LLM instructions on how to help perform a premortem/postmortem, and A/B testing until optimal instructions are found, it would likely be more effective to custom pre-train or fine-tune an LLM to be especially effective at helping perform premortems and postmortems.
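As a rough illustration of the A/B testing step mentioned above, the sketch below randomly assigns users to instruction variants, records whether each session was helpful, and reports a leader only once every variant has enough samples. The class name `ABTest`, the binary "helpful" outcome, and the `min_samples` threshold are all assumptions made for this example; a serious test would also check statistical significance.

```python
import random
from collections import defaultdict


class ABTest:
    """Compare instruction variants by the share of sessions rated helpful."""

    def __init__(self, variants, seed=0):
        self.variants = list(variants)
        self.rng = random.Random(seed)      # seeded for reproducibility
        self.outcomes = defaultdict(list)   # variant -> list of bools

    def assign(self):
        """Pick a variant uniformly at random for the next user."""
        return self.rng.choice(self.variants)

    def record(self, variant, helpful):
        """Store whether a session using this variant was rated helpful."""
        self.outcomes[variant].append(bool(helpful))

    def success_rate(self, variant):
        """Fraction of helpful sessions, or None if there is no data yet."""
        results = self.outcomes.get(variant, [])
        return sum(results) / len(results) if results else None

    def leader(self, min_samples=30):
        """Best variant so far, or None until all variants have enough data."""
        if any(len(self.outcomes.get(v, [])) < min_samples
               for v in self.variants):
            return None
        return max(self.variants, key=self.success_rate)
```

In use, each new user would get `variant = test.assign()`, and their feedback would be passed to `test.record(variant, helpful)`.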

Perhaps one way to train such a model is to give it historical accounts of goal-directed events (perhaps curated by LLMs), such as individuals’ biographies or stories/public data of companies/non-profits/governments. Starting from the entity’s goal, the model would attempt a premortem, predicting what might go wrong and how the entity could try to prevent or address the issue. It would then be updated via gradient descent on what actually went wrong and how it was resolved, if it was successfully resolved (unless the LLM's own solutions are rated as more effective/efficient).
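The comparison between a predicted premortem and the historical record could be sketched roughly as follows. This is not a training loop, only a toy scoring function that such a loop might use as feedback. The `Case` structure and the exact-match comparison of failure labels are assumptions; real accounts would need some form of semantic matching rather than identical strings.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class Case:
    """A historical goal and what actually went wrong (hypothetical schema)."""
    goal: str
    actual_failures: FrozenSet[str]


def premortem_score(predicted: Set[str], case: Case) -> dict:
    """Score predicted failure modes against what actually happened.

    recall:    share of actual failures that the premortem anticipated
    precision: share of predicted failures that actually occurred
    missed:    actual failures the premortem did not anticipate
    """
    hits = predicted & case.actual_failures
    recall = len(hits) / len(case.actual_failures) if case.actual_failures else 1.0
    precision = len(hits) / len(predicted) if predicted else 0.0
    return {
        "recall": recall,
        "precision": precision,
        "missed": sorted(case.actual_failures - predicted),
    }
```

Low precision is not necessarily bad here: a predicted risk that never materialized may have been a real risk that was successfully avoided.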

Premortems and postmortems seem especially useful in avoiding large-scale errors. They allow an entity to do a detailed analysis of everything that could go wrong with their plans, and to effectively learn from what went wrong in previous plans and how they can systematically avoid similar errors in the future. An LLM that has been custom trained to do premortems and postmortems extremely effectively might be able to see many ways that plans could go wrong that most people would miss, using its extensive database of examples and highly developed reasoning regarding premortems/postmortems.

AlphaWise

The second way of training an AW coach is one I admit to not having enough technical knowledge to know whether it is actually feasible with current technology. Even if this idea is ahead of its time, I think it may still be good to have an abundance of such ideas shovel-ready, since the time available is shrinking quickly as AI progresses, so even some advanced ideas may be possible quite soon.

The idea is that you could use biographical knowledge of numerous individuals and create a game board on which these individuals, who possess various traits (including epistemic & decision-making processes), make various decisions within various environments, which in turn increase and decrease various traits, and lead to various outcomes.

LLMs could be used to analyze biographies and create a set of scores at various checkpoints in historical people’s lives by estimating how much/what type of each trait an individual possesses across time. Examples include social support, intellectual ideas, character traits, values, habits, education, social skills, finances and access to various types of resources, cultural access and cultural knowledge, cultural competence of various types, aspects of artistic/scientific/political/etc. pursuits, and so on (good candidate traits could be drawn from psychological and other social science research).

All of these, along with the individual's environment, would be quantified and mapped, as though the person were playing their life on a many-dimensional game board. For example, each of these traits could be scored on a scale of 1-100 (perhaps some traits could be estimated or ignored for some individuals if there was low information), and an updated score for each trait could be given each year (or as often as possible and useful). The various decisions and decision processes people use that increase or decrease these scores would also be modeled and mapped.

The outcomes of the person's life could also be mapped and scored across important values, for example how happy or successful they were, how much they contributed to/detracted from the happiness/success of others around them, how much they benefited or harmed society in various ways, etc.
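The "game board" described above could be represented with a very simple data structure, sketched below. The names `Checkpoint`, `Life`, and `decision_effects` are hypothetical, as are the example traits; the only details taken from the text are the 1-100 trait scale, periodic checkpoints, decisions that move scores up or down, and separately scored life outcomes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Checkpoint:
    """Estimated trait scores for one person at one point in time."""
    year: int
    traits: Dict[str, int]   # trait name -> score on a 1-100 scale
    decision: str = ""       # main decision made since the last checkpoint

    def __post_init__(self):
        for name, score in self.traits.items():
            if not 1 <= score <= 100:
                raise ValueError(f"{name} must be scored 1-100, got {score}")


@dataclass
class Life:
    """A person's trajectory plus scored outcomes (e.g. happiness, impact)."""
    name: str
    checkpoints: List[Checkpoint]
    outcomes: Dict[str, int] = field(default_factory=dict)


def decision_effects(life: Life) -> List[Tuple[str, Dict[str, int]]]:
    """Pair each decision with the trait changes observed after it.

    Traits missing from either checkpoint (low information) are skipped.
    """
    effects = []
    for prev, curr in zip(life.checkpoints, life.checkpoints[1:]):
        shared = prev.traits.keys() & curr.traits.keys()
        delta = {t: curr.traits[t] - prev.traits[t] for t in sorted(shared)}
        effects.append((curr.decision, delta))
    return effects
```

Note that this only records what changed after a decision; attributing the change to the decision is a separate and much harder causal question.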

If enough data could be gathered, and the data could be cleaned up and made legible and reliable enough, the same gradient descent deep learning techniques used to train Chess and Go AI systems, including the policy network, value network, and tree search (and perhaps more relevantly, techniques used in RPGs and strategy games, though I haven't personally studied these much), could be used to model what types of policies are good to use in various types of environments for individuals with various traits pursuing specific life goals.
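To give a feel for what "learning which policies are good in which states" means, here is a deliberately tiny stand-in: a lookup table that averages observed outcomes for each (state, decision) pair and recommends the decision with the best average. This is far simpler than the policy networks, value networks, and tree search mentioned above, and it is not how Chess or Go systems work; the names `bucket` and `OutcomeTable` and the coarse 25-point trait buckets are assumptions for illustration only.

```python
from typing import Dict, Hashable, Iterable, Optional, Tuple


def bucket(score: int, width: int = 25) -> int:
    """Coarsen a 1-100 trait score into a bucket (0-3 when width is 25)."""
    return (score - 1) // width


class OutcomeTable:
    """Average observed outcome for each (state, decision) pair."""

    def __init__(self) -> None:
        self._sum: Dict[Tuple[Hashable, str], float] = {}
        self._n: Dict[Tuple[Hashable, str], int] = {}

    def update(self, state: Hashable, decision: str, outcome: float) -> None:
        """Add one observed outcome for a decision taken in a given state."""
        key = (state, decision)
        self._sum[key] = self._sum.get(key, 0.0) + outcome
        self._n[key] = self._n.get(key, 0) + 1

    def value(self, state: Hashable, decision: str) -> Optional[float]:
        """Mean outcome seen for this pair, or None if never observed."""
        key = (state, decision)
        if key not in self._n:
            return None
        return self._sum[key] / self._n[key]

    def best_decision(self, state: Hashable,
                      decisions: Iterable[str]) -> Optional[str]:
        """Greedy recommendation: the decision with the highest mean outcome."""
        scored = [(self.value(state, d), d) for d in decisions]
        scored = [(v, d) for v, d in scored if v is not None]
        return max(scored)[1] if scored else None
```

A state here might be a tuple such as `(bucket(education), bucket(finances))`. With many traits the table becomes hopelessly sparse, which is one reason function approximation such as a neural network would be needed in practice.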

It is even possible that an AI could learn from self-play. A simulated game-world could be created where many individuals are generated with randomized characteristics and then play with/against each other, trying to maximize various goals. The AI could try various strategies to maximize outcomes for certain individuals; for example, it could “coach” individuals in the environment to see if it can help them achieve their goals, and learn what type of coaching is most effective for improving specified outcomes for certain individuals.
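A toy version of "trying out coaching strategies in a simulated world" might look like the sketch below. The world is entirely invented for illustration: agents have only `skill` and `energy`, training costs 20 energy and adds 5 skill, and resting restores 30 energy. Agents do not interact with each other, so this is not true self-play, and nothing is learned; it only compares two fixed coaching strategies on the same randomly generated population.

```python
import random
from statistics import mean


def always_train(agent):
    """Naive coach: always advises training, regardless of the agent's state."""
    return "train"


def adaptive(agent):
    """Coach that advises rest whenever the agent is too tired to train."""
    return "train" if agent["energy"] >= 20 else "rest"


def step(agent, advice):
    """Apply one piece of advice using the toy dynamics described above."""
    if advice == "train":
        if agent["energy"] >= 20:   # training while exhausted achieves nothing
            agent["skill"] += 5
            agent["energy"] -= 20
    elif advice == "rest":
        agent["energy"] += 30


def evaluate(coach, n_agents=100, n_steps=20, seed=0):
    """Mean skill gained by a randomly generated population under one coach."""
    rng = random.Random(seed)   # same seed -> same population for every coach
    gains = []
    for _ in range(n_agents):
        agent = {"skill": rng.randint(1, 50), "energy": rng.randint(1, 100)}
        start = agent["skill"]
        for _ in range(n_steps):
            step(agent, coach(agent))
        gains.append(agent["skill"] - start)
    return mean(gains)
```

Calling `evaluate(adaptive)` and `evaluate(always_train)` compares the two coaches on identical agents. A learning system would search over coaching strategies rather than compare two hand-written ones.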

It seems to me the biggest difficulty here is probably gathering enough usable data, though overall the project seems highly ambitious.

The same concept could be applied to companies, nonprofits, or governments, using historical accounts and records, but also using present-day public data (such as economic data or public government records). Thus, the AW coach could also learn to give advice to these entities in order to achieve certain collective outcomes, creating greater wisdom across all levels and sectors of society.

If there were some other way to bootstrap such an AW coaching system, which didn't require so much upfront data and intensive pre-training, then once in use it could continuously collect data from users and over a few years or decades build up enough data to give increasingly helpful & wise advice.

Such a system would allow individual humans, companies, nonprofits, and governments to deliberately choose the outcomes they achieve and wisely select between various paths (subgoals) to achieve those outcomes. It would give people greater control to make wise decisions that don't accidentally sacrifice some type of value that is important to them. Each of these entities would be able to get immediate feedback when they are falling into certain predictable traps, and would have access to wise, contextually sensitive advice on the best ways of thinking about various parameters to keep them happy, healthy, functioning, and moving toward positive outcomes across important values with a high degree of certainty.

While I suspect AlphaWise may be a bit of a moon-shot at present, perhaps it could nonetheless inspire a simpler version that is more tractable, and perhaps one day soon something like this will be feasible.

The next post in this series will explore a design for artificial wisdom à la decision forecasting and Futarchy.

LESSWRONG