Series on Artificial Wisdom

Jordan Arel

This series introduces Artificial Wisdom (AW) and illustrates several designs for AW systems. Artificial Wisdom refers to artificial intelligence systems that substantially increase wisdom in the world. Wisdom may be defined as “thinking/planning which is good at avoiding large-scale errors,” or as “having good terminal goals and sub-goals.” By “strapping” wisdom to AI via AW as AI takes off, we may be able to generate enormous quantities of wisdom, which could help us navigate Transformative AI and The Most Important Century wisely.

On Artificial Wisdom

In the first post in the series, I introduce the term “Artificial Wisdom (AW),” which refers to artificial intelligence systems that substantially increase wisdom in the world. Wisdom may be defined as “thinking/planning which is good at avoiding large-scale errors,” including both errors of commission and errors of omission, or as “having good goals,” including terminal goals and sub-goals.

Due to orthogonality (a system's capabilities do not determine the quality of the goals it is used for), it is possible that we could keep AI under control and yet use it very unwisely. The post discusses four scenarios for how AI alignment interacts with artificial wisdom. Artificial wisdom is an improvement in any of these worlds, unless the pursuit of AW detracts from alignment so significantly that alignment fails.

By “strapping” wisdom to AI via AW as AI takes off, we may be able to generate enormous quantities of wisdom in both humans and autonomous AI systems. This wisdom could help us navigate Transformative AI and “The Most Important Century” wisely, so that we achieve existential security and steer toward a positive long-term future.

Designing Artificial Wisdom: The Wise Workflow Research Organization

Even simple workflows can greatly enhance the performance of LLMs, so artificially wise workflows seem like a promising candidate for greatly increasing AW.

This piece outlines the idea of introducing workflows into a research organization that works on various topics related to AI Safety, existential risk & existential security, longtermism, and artificial wisdom. Such an organization could make advancing the field of artificial wisdom one of its primary goals, and as workflows become more powerful, they could automate an increasing fraction of the work within the organization.

Essentially, the research organization, whose goal is to increase human wisdom around existential risk, acts as scaffolding on which to bootstrap artificial wisdom.

Such a system would be unusually interpretable, since all reasoning apart from that inside the base model is done in natural language. When the organization develops improved ideas about existential security factors and projects to achieve these factors, it could incubate these projects itself or pass them on to incubators, so that the wisdom does not go to waste.
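As a rough illustration of what a “simple workflow” might look like, here is a minimal sketch of a draft, critique, revise loop. It is not the design from the post: the `Model` type, the prompts, and the `stub_model` stand-in are all assumptions made for illustration, and a real system would call an actual LLM instead of the stub. The point it shows is that every intermediate step is stored as natural-language text, which is what makes such a workflow inspectable.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical model interface: any function that maps a prompt to a reply.
Model = Callable[[str], str]


def draft_critique_revise(question: str, model: Model, rounds: int = 1) -> Dict:
    """Run a draft -> critique -> revise loop, keeping every step as text."""
    trace: List[Tuple[str, str]] = []

    answer = model(f"Draft an answer to: {question}")
    trace.append(("draft", answer))

    for _ in range(rounds):
        # Ask the model to look for large-scale errors in the current answer.
        critique = model(f"List possible large-scale errors in: {answer}")
        trace.append(("critique", critique))

        # Ask the model to revise the answer in light of the critique.
        answer = model(
            f"Revise the answer to address this critique: {critique}\n"
            f"Answer: {answer}"
        )
        trace.append(("revise", answer))

    return {"answer": answer, "trace": trace}


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call; labels the reply with the prompt's first word."""
    return f"[{prompt.split()[0].lower()}] response"


result = draft_critique_revise("How should we prioritize projects?", stub_model, rounds=2)
for step, text in result["trace"]:
    print(step, "->", text)
```

Because the whole trace is plain text, a human reviewer (or another workflow step) can read exactly how the final answer was reached, apart from the reasoning inside the base model itself.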

Designing Artificial Wisdom: GitWise and AlphaWise

Artificially wise coaches that improve human wisdom seem like another promising path to AW. Such coaches could have negligible costs, be scalable & personalized, and soon perform at a superhuman level. Wise coaching for certain key decision-makers could be decisive in whether humanity navigates transformative AI wisely.

One path to AW coaches is to create a decentralized system, like a wiki or GitHub, for wisdom-enhancing use-cases. Users could build up a database of instructions for LLMs to act as AW coaches that help users make difficult decisions, navigate difficult life and epistemic dilemmas, work through values conflicts, achieve career goals, improve relational/mental/physical/emotional well-being, and increase fulfillment/happiness.

One especially wise use-case could be a premortem/postmortem bot that helps people, organizations, and governments to avoid large-scale errors.

Another path to creating an AW coach is to build a new system trained on biographical data, which analyses and learns to predict which decision-making processes and strategies of humans with various traits in various environments are most effective for achieving certain goals.

Designing Artificial Wisdom: Decision Forecasting AI & Futarchy

A final AW design involves using advanced forecasting AI to help humans make better decisions. Such a decision forecasting system could help individuals, organizations, and governments achieve their values while maintaining important side constraints and minimizing negative side effects.

An important feature to include in such AW systems is the ability to accurately forecast even minuscule probabilities that an action increases the likelihood of catastrophic risks. For such actions, the system could refuse to answer or attempt to persuade the user against them. Analyses of such queries could also be used to better understand the risks humanity is facing, and to formulate counter-strategies and defensive capabilities.
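The idea of screening actions for catastrophic risk before recommending them can be sketched as follows. This is only an illustrative toy under stated assumptions: the `Forecast` record, the `recommend` function, the example numbers, and the `risk_threshold` value are all hypothetical, and the forecasts themselves would have to come from a forecasting system that is not modeled here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Forecast:
    """Hypothetical output of a forecasting system for one candidate action."""
    action: str
    expected_value: float    # forecast value of the action for the user's goals
    catastrophe_prob: float  # forecast probability of contributing to catastrophe


def recommend(
    forecasts: List[Forecast], risk_threshold: float = 1e-6
) -> Tuple[Optional[Forecast], List[Forecast]]:
    """Return the best permitted action plus the actions flagged as too risky.

    Actions whose forecast catastrophe probability exceeds the threshold are
    never recommended; they are returned separately so they can be analyzed.
    """
    flagged = [f for f in forecasts if f.catastrophe_prob > risk_threshold]
    allowed = [f for f in forecasts if f.catastrophe_prob <= risk_threshold]
    best = max(allowed, key=lambda f: f.expected_value, default=None)
    return best, flagged


# Made-up example forecasts, purely for illustration.
candidates = [
    Forecast("option A", expected_value=10.0, catastrophe_prob=1e-4),
    Forecast("option B", expected_value=7.0, catastrophe_prob=1e-9),
    Forecast("option C", expected_value=3.0, catastrophe_prob=0.0),
]

best, flagged = recommend(candidates)
print("recommended:", best.action if best else "none (refuse to answer)")
print("flagged for analysis:", [f.action for f in flagged])
```

Treating catastrophic risk as a hard side constraint rather than one more term in the expected value is a design choice made for this sketch; it keeps a high-value but risky option from ever winning on expected value alone.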

In addition to helping users select good strategies to achieve values or terminal goals, it is possible such systems could also learn to predict and help users understand what values and terminal goals will be satisfying once achieved.

While such technologies seem likely to be developed, it is questionable whether this is a good thing, due to potential dual-use applications such as use by misaligned AI agents. Therefore, while it is good to use such capabilities wisely if they arise, it is important to do more research on whether differential technological development of such systems is desirable.
