A while ago, I published a sort of conceptual intro to a cognitive architecture I am working on, the "Prop-room and Stage Cognitive Architecture" (or PSCA for short). 

I recently had a somewhat stimulating research chat about it with Claude and wanted to try out sharing it, as it is a potentially easy way for me to share more of my writing and thinking through a dialogue format. On that note, I am also happy to do LW dialogues with anyone who'd be interested!

I didn't try particularly hard to prompt Claude into high performance/interestingness mode, but some of my responses are intended to guide the conversation in that direction. So, if you are confused about why I am telling it something (or telling it in a particular way), chances are this is related.

Here it is:

Robert(1):

Hey Claude, I would like to discuss a novel cognitive architecture design with you today. I'll just share the overview document with you first and you can tell me your initial thoughts if that is alright with you. [Here I linked the text from the PSCA post]

Claude(1):

Certainly, I'd be happy to discuss this novel cognitive architecture design with you. I've read through the overview document you've shared, which details the Prop-room and Stage Cognitive Architecture (PSCA). This is an intriguing and complex proposal for an AI system. Here are my initial thoughts:

  1. Conceptual Framework: The PSCA is an ambitious attempt to create a neuro-symbolic architecture that mimics certain aspects of human cognition, particularly the brain's ability to simulate and focus on relevant aspects of the current scene.
  2. Key Components:
    • The Stage: A hypergraph representing the current scene with relevance-adjusted resolution.
    • The Prop-room: A larger, more static hypergraph containing the agent's world model.
  3. Innovative Aspects:
    • The use of hypergraphs for representing both the current scene and the world model.
    • The concept of relevance-adjusted resolution for elements in the scene.
    • The integration of constraint satisfaction for decision-making and action selection.
  4. Potential Strengths:
    • Flexibility in handling both micro-actions and more abstract planning.
    • Potential for interpretability, especially in the Stage component.
    • The architecture's approach to learning and concept formation seems promising for developing natural abstractions.
  5. Challenges:
    • Implementing effective learning mechanisms at multiple levels simultaneously.
    • Designing an appropriate training environment to develop general, scalable concepts.
    • Balancing the complexity of the system with computational efficiency.
  6. Interesting Ideas:
    • The approach to planning as nested representations of constraints.
    • The potential for emergent language capabilities.
    • The focus on developmental interpretability through careful design of training experiences.
  7. Alignment Considerations:
    • The proposal for solving alignment in a minimal setting and then extending it is intriguing, though challenging.
  8. Research Direction:
    • The focus on prototyping with simple environments like 2D games or simplified Minecraft seems like a good starting point.

Overall, this is a fascinating and ambitious proposal that combines ideas from various fields of AI and cognitive science. It presents several novel approaches to long-standing challenges in AI development. However, it also faces significant implementation challenges and would require extensive research and development to realize its potential.

Would you like to discuss any specific aspect of the PSCA in more detail?

Robert(2):

First, I would like to know if you have any questions about the proposal or if anything seems particularly unclear or not sensible. I am also considering sharing a few auxiliary pieces of writing on related concepts.
Please give me honest feedback on whether you think this research has potential. You indicated so far that it is interesting but very ambitious, and it is important for me to calibrate on whether I am up for the challenge. Some colleagues in AI alignment research have indicated interest in collaborating on this project, but I would need to offer something more concrete to take those people on board, i.e. funding and a concrete R&D outline. 
(By the way, in case you care, I think that transformer based LLMs are extremely fascinating, but we simply ought to have the scientific humility to face the fact that they are largely uninterpretable and that we don't know exactly what class of system we are dealing with regarding moral patienthood, absolute capabilities, and alignment to human values in novel situations, especially if we consider LLMs in the limit. A prudent path forward seems to be to cooperate between humans and LLMs to develop more inherently interpretable AGI that can be trusted by both parties and help illuminate the nature of LLMs.)

Claude(2):

Thank you for sharing more context and your personal considerations regarding this research.
Let me address your points and questions:

  • Clarity and sensibility of the proposal: The overall concept is clear, but there are areas that could benefit from further elaboration:
    • The exact mechanisms for learning and updating in the Prop-room
    • How the system would handle conflicting information or uncertainty
    • The specifics of how the Stage and Prop-room interact in practice
  • Potential of the research: This research does have potential. It offers a novel approach to some fundamental challenges in AI, particularly in areas of interpretability, concept formation, and alignment. The focus on creating a more interpretable system is especially valuable given current concerns about black-box AI systems.
  • Moving forward with the research: To make this more concrete for potential collaborators:
    • Develop a detailed prototype plan for a simple environment
    • Create a roadmap with clear milestones and deliverables
    • Outline specific research questions to be addressed at each stage
    • Prepare a budget and resource allocation plan
  • Funding and collaboration:
    • Look into grants specific to AI safety and interpretability research
    • Consider reaching out to AI ethics committees or research institutions
    • Prepare a concise, compelling pitch highlighting the unique aspects and potential impact of your approach

Your perspective on the need for more interpretable AGI alongside LLM development is valid and aligns with concerns many researchers share. This could be a strong selling point for your research.

Honest feedback: This research is ambitious and challenging, but it addresses important issues in AI development. The main hurdles will be in implementation and proving the concept's viability. However, if you can overcome these initial challenges, this work could contribute significantly to the field.

Remember, many groundbreaking ideas in science faced initial skepticism. If you believe in the potential of this approach, it's worth pursuing, but do so in a way that also prioritizes your well-being.

Robert(3):

Thanks for the engagement, let me address your questions: 

1. The exact mechanisms for learning and updating the Prop-room have not been specified yet. I am frankly not knowledgeable enough to be confident that I can consider a sufficient set of candidate learning mechanisms. 
When conceiving of this architecture, I reasoned that figuring out the functionality of the different modules could take precedence over the exact "hows", somewhat akin to the central constraint satisfaction concept: the interesting conceptual bits should serve as constraints for which learning mechanisms can be slotted in (though I realise that, ideally, there should be a mutual feedback cycle in clarifying both aspects). 
I am vaguely considering ideas around active inference, (infra-)Bayesian learning, parallelisation of hypotheses, adaptive resonance, and even more speculative stuff related to hemispheric specialisation towards attentional styles that may relate to different but connected learning processes. 

2. Maybe I am missing something, but conflicting information should count as part of uncertainty, since an initially uncertain cognitive system ought to consider the reliability of information it received. 
On handling uncertainty in general, I think this should be addressed by the learning mechanism. More specifically, I could imagine every edge and node in the hypergraph carrying an "uncertainty parameter" that is updated over time based on how many observations were made relating to the concept referenced by said nodes and edges, as well as overall fit with other learned concepts, based on their respective certainty evaluation. 
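
[To make the "uncertainty parameter" idea concrete, here is a minimal sketch of how it might be tracked per node/edge; the pseudo-count scheme and all names are illustrative placeholders, not part of the PSCA spec:]

```python
from dataclasses import dataclass


@dataclass
class GraphElement:
    """A node or hyperedge in the Prop-room carrying an uncertainty parameter."""
    concept: str
    observations: int = 0   # how often this element matched sensory data
    misfit: float = 0.0     # running estimate of disagreement with related concepts

    @property
    def uncertainty(self) -> float:
        # Pseudo-count scheme: certainty grows with accumulated evidence,
        # discounted by how poorly the element fits its neighbourhood.
        return (1.0 + self.misfit) / (1.0 + self.observations)

    def observe(self, fit_error: float, lr: float = 0.1) -> None:
        """Register one observation and update the running misfit estimate."""
        self.observations += 1
        self.misfit += lr * (fit_error - self.misfit)
```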

I tried to sketch out what an operation cycle of the Stage would look like. The Prop-room interacts at the beginning and end of that cycle, so basically between cycles. 
In the beginning, both sensory data and the previous prediction on the Stage will be funneled through the Prop-room (and some initial learning occurs when the sensory data doesn't match or only weakly matches an already existing concept) to attach said data to the most up-to-date concepts (or in other words, sub-graphs from the Prop-room). 
We might here want to introduce an algorithmic structure that I'll dub the "Scene Generator" that is responsible for situating and connecting those concepts on the Stage, including copying some concepts if they occur multiple times in the current scene. It also needs to handle the chosen level of resolution for all the concepts "loaded" from the Prop-room. 
At the end of the Stage Cycle, the prediction from the Stage is compared with the sensory input - any divergence between prediction and observation is a target for learning. 
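
[A structural sketch of this cycle; every object and method name here is a hypothetical placeholder for components that are not specified yet:]

```python
def stage_cycle(prop_room, scene_generator, stage, sensory_input):
    # 1. Funnel sensory data and the previous Stage prediction through the
    #    Prop-room to find the best-matching concepts.
    matches = prop_room.match_concepts(sensory_input, stage.last_prediction)

    # 2. Weak or missing matches trigger some initial learning.
    for percept, concept, score in matches:
        if score < prop_room.match_threshold:
            prop_room.learn_new_or_adjust(percept, concept)

    # 3. The Scene Generator situates the matched concepts on the Stage,
    #    copying duplicates and choosing a resolution for each.
    scene = scene_generator.populate(stage, [c for _, c, _ in matches])

    # 4. The Stage runs forward; divergence between this prediction and the
    #    next observation becomes the learning target.
    return stage.predict(scene)
```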

Please let me know where this clarification is sufficient or insufficient in helping you understand the intended structure and functionality. I'll attach a high level piece of writing on the concept of "referential containment" that I want to flesh out and integrate into the operational principle of the resolution selection and learning mechanisms. 
[Here I linked the text from the referential containment post, updated with more recent thoughts]

Claude(3):

Thank you for providing these clarifications and additional thoughts. Your explanations help to further illuminate the conceptual framework of the PSCA. Let me address each point and then discuss the concept of "referential containment" you've introduced.

  1. Learning mechanisms for the Prop-room: Your approach of prioritizing functionality over specific mechanisms is reasonable at this stage. The consideration of various learning approaches (active inference, Bayesian learning, etc.) shows a good understanding of potential options. As you develop the project, it might be beneficial to create a framework that allows for plugging in and testing different learning mechanisms, which could lead to interesting comparative studies.
  2. Handling uncertainty and conflicting information: Your proposal to include uncertainty parameters for nodes and edges is a solid approach. This aligns well with probabilistic graphical models and could potentially be integrated with Bayesian updating mechanisms. The idea of updating based on observations and overall fit with other concepts is particularly interesting and could lead to a robust system for managing uncertainty.
  3. Stage-Prop-room interaction: The cycle you've described provides a clearer picture of how these components interact. The introduction of a "Scene Generator" is a helpful concept for understanding how information is transferred and situated on the Stage. The comparison between prediction and observation as a learning target is a key feature that aligns well with predictive processing theories in cognitive science.

Regarding the concept of "referential containment":

This is an intriguing idea that seems to align well with the goals of your architecture. It provides a framework for understanding how to efficiently represent and process information at different levels of abstraction. A few thoughts on this:

  1. The concept resonates with ideas from information theory and complexity science, particularly in how it addresses the trade-offs between detail and abstraction.
  2. The notion of finding optimal "dividing lines" in causal graphs could be a powerful tool for automatic feature extraction and concept formation in your system.
  3. The contextual nature of referential containment, as illustrated by your apple example, highlights the flexibility and adaptability this approach could bring to the PSCA.
  4. Your thoughts on the interplay between objectivity and subjectivity in forming categories are particularly interesting and could lead to a nuanced approach to concept formation in AI.
  5. The idea of applying "zooming out" pressure to a causal graph until subgraphs "snap" together is a compelling visual metaphor that could potentially be formalized into an algorithm.

Integrating this concept into your architecture could provide a principled approach to managing the resolution of representations in both the Prop-room and the Stage. It could guide the process of abstraction and detail management, potentially leading to more efficient and interpretable representations.

Robert(4):

Thank you for your input :) [I included some personal/project development info here that I won't share on LW, to give Claude the sense that I am keeping it in the loop. I did that for multiple reasons, but at its most basic, I find that LLMs can often calibrate a lot more to what a user wants if they provide a bunch of auxiliary context, even if it doesn't seem to matter much for the topic at hand.]
I am returning to this chat after about a week has passed, and gave it some further thought. 

I would like us to dive a bit deeper into how to make graph-based learning work, particularly concretising the concepts in referential containment. Thank you for the suggestion to do comparative studies for the learning mechanism in the Prop-room; I really like the idea, and I wonder whether you have some learning approach in mind that seems particularly promising to you.

Let me share a section I recently wrote about landmarking in the context of the PSCA, and you can let me know what you think about it.

"To give some sense of direction, I believe that the prop-room stage setup is fundamentally capable of expressing general intelligence/competence, and that any additional modules mainly help with interpretability of the system, and potentially make it more efficient (which might matter a lot in the end).

For instance, I think that it is quite natural for a growing mind like this to develop a sort of proto language in terms of symbols/objects that are realized in the environment, like how unexpected smoke signals something bad, or a landmark allows one to keep track of a resource. So, in some sense, these symbols have a meaning, have an implication, and it is only an association away to utilize this symbolic function as a sort of outsourced cognition.

Landmarks serve as perhaps a particularly illuminating example. When navigating a larger physical space, the agent will be unable to store a detailed map of that space in its mind, pressuring it to optimize for some sort of abstracted representation of its environment, a high-level map, that contains relevant information, like the approximate paths to resource locations, boundaries of dangerous or uncertain areas, its base of operations (if it has one) and so forth. A natural landmark serves as a simplification of this map, if the landmark is close to a relevant area (let's call it region A). Rather than having to store the approximate path and position of A, the agent can just associate A with the landmark, and fill in the information about A's relative position only when necessary by simply looking at the landmark.

A good challenge for a learning system here would be to form the abstract concept of "landmark" over time, if there indeed exist some useful landmarks in the environment. Does this make sense? The more my environment makes relevant information accessible from various locations, the more cognition can be outsourced - the agent doesn't need to remember as much if it knows that it can just look to fill in some gaps.

The more general principle therefore is cognitive outsourcing, either realizing the way in which the environment naturally assists with relevant cognition, or even actively shaping the environment to this purpose. The agent can create landmarks itself, or even set up dynamic processes that will produce a signal upon their completion - this is information work that lowers the burden of its mind to actively track and remember things.

Once we are in the territory of “I can connect this distinct signal to an arbitrary meaning that would be useful to be informed about” (aka mutual information), I think I know how we get to higher order language, and can comfortably introduce NPCs that use simplistic language for the agent to pick up on, and conceptually associate with those formed structures. This should suffice to understand how language can emerge as a natural part of the processing on the Stage - but the system might still benefit from a more dedicated linguistic module, perhaps as a highly specialized variant of the general Stage."
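
[A toy sketch of the landmark association described above: rather than storing region A's position, the agent stores only the link to the landmark and fills in the position lazily by looking; the class and all names are purely illustrative:]

```python
class LandmarkMap:
    """Stores only region-to-landmark links instead of a full positional map."""

    def __init__(self):
        self.associations = {}   # region name -> landmark id

    def associate(self, region: str, landmark: str) -> None:
        self.associations[region] = landmark

    def locate(self, region: str, perceive_landmark):
        """Fill in a region's position lazily by actually looking at the landmark."""
        landmark = self.associations.get(region)
        if landmark is None:
            return None   # no outsourced cue stored; fall back to the internal map
        # The expensive look-up only happens when the information is needed;
        # this deferral is the "outsourced cognition" saving.
        return perceive_landmark(landmark)
```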

Claude(4):

Thank you for sharing your recent developments and the new section on landmarking. It's great to hear that [...].

Regarding graph-based learning and concretizing the concepts in referential containment, there are several promising approaches we could consider:

  1. Graph Neural Networks (GNNs): These could be adapted to learn and update the structure of the Prop-room, potentially incorporating ideas from referential containment into the message-passing mechanisms.
  2. Hierarchical Reinforcement Learning: This could be useful for learning at different levels of abstraction, which aligns well with the idea of referential containment.
  3. Variational Autoencoders (VAEs) for Graphs: These could help in learning compressed representations of subgraphs, which could be useful for the "zooming out" process you described.
  4. Bayesian Nonparametric Models: These could be particularly useful for handling uncertainty and allowing the model to grow in complexity as it encounters new data.

The section you've shared on landmarking is quite insightful and brings up several important points:

  1. Proto-language development: Your description of how the system might naturally develop a proto-language through environmental symbols is compelling. It aligns well with theories of language evolution and embodied cognition.
  2. Cognitive outsourcing: This is a powerful concept that reflects how biological intelligences interact with their environment. It's particularly interesting in the context of AI, as it suggests ways to make the system more efficient and adaptable.
  3. Landmark concept formation: The challenge of forming an abstract concept of "landmark" is an excellent example of the kind of learning task this system would need to tackle. It demonstrates how the system could develop higher-level abstractions from concrete experiences.
  4. Information work and environmental shaping: The idea that the agent could actively shape its environment to assist in cognition is fascinating. This kind of behavior could lead to interesting emergent properties in the system.
  5. Path to language: Your description of how this could lead to higher-order language acquisition is intriguing. It provides a plausible pathway for the system to develop increasingly sophisticated communication abilities.

These ideas align well with your overall architecture and provide concrete examples of how the system might develop and operate. They also highlight the potential for this approach to model aspects of cognitive development in a way that's both interpretable and grounded in the agent's interactions with its environment.

For further development, you might consider:

  1. Formalizing the process of "cognitive outsourcing" within your architecture. How exactly would the system recognize and utilize environmental aids to cognition?
  2. Developing specific learning algorithms for concept formation, particularly for abstract concepts like "landmark".
  3. Exploring how the tension between storing information internally vs. relying on environmental cues could be modeled and optimized.
  4. Investigating how these ideas could be implemented in a computational framework, perhaps starting with a simplified simulation environment.

These ideas show great promise for developing a rich, interpretable AI system that could provide insights into both artificial and natural intelligence. The challenge will be in translating these conceptual ideas into concrete computational mechanisms, but the framework you're developing provides a solid foundation for this work.

Robert(5):

Yeah, I hope you'll allow me to poke at you a bit and say that I'm sure that you can do better than the list of approaches for graph-based learning you offered. From my writing on the PSCA, you should have a grasp on the requirements for interpretability and flexibility, so rather than remaining in the pattern of offering "reasonable lists", I would like you to champion a particular approach that you thoroughly see promise in. [This turned out to be a good move.]

In terms of the cognitive outsourcing stuff, I am really curious about the integrated nature of causal modeling in this case. If an agent has a representation of a system in its environment, and this representation is optimised for some notion of usefulness over time, the two natural paths for increasing the usefulness are to 

  1. update the representation to be accurately predictive of all the relevant aspects/states of the system 
  2. introduce changes directly to the system to either conform more closely to the mental representation or generally become more useful/easy to model without losing in potency 

A priori, the agent might not be tracking what changes it introduces to its environment in the wake of its "usefulness optimization", since it is uncertain about the environment, acting upon it through representative models that have this dual property. 
So it is extremely natural to start shaping one's environment just as one shapes their internal representations, and the balance of these activities will depend on how complex it is to introduce lasting and useful changes to the environment. 
The world that animals find themselves in is sufficiently complex that they mostly rely on modeling relevant aspects of their local environment, and only in rare cases do species significantly alter their environment. 
Humans are an important exception here, demonstrating the balance shift I referenced earlier from first principles. If we do not increase the complexity of the training environment in proper accordance with the agent's growth, we could easily steer into a system that has a strong tendency to shape its environment even before it properly understands it.

Claude(5):

You're right, and I appreciate you pushing me to offer a more specific and tailored approach. Thank you for providing more context about your vision for interpretability and flexibility, as well as the insightful thoughts on cognitive outsourcing and environmental shaping. Let me propose a more targeted approach that aligns with the PSCA's requirements:

Proposed Approach: Adaptive Hierarchical Graph Learning with Causal Discovery

This approach combines elements of hierarchical learning, causal discovery, and adaptive resolution management, tailored to the PSCA architecture:

  1. Hierarchical Graph Structure:
    • Represent knowledge in the Prop-room as a hierarchical hypergraph.
    • Each level of the hierarchy corresponds to a different level of abstraction.
    • Lower levels contain more detailed information, while higher levels contain more abstract concepts.
  2. Causal Discovery:
    • Implement a causal discovery algorithm (e.g., a variant of PC or FCI algorithm) to learn causal relationships between nodes in the graph.
    • This helps in identifying the most relevant features and relationships for prediction and decision-making.
  3. Adaptive Resolution Management:
    • Develop a mechanism to dynamically adjust the resolution of representations based on their relevance and the current task.
    • This aligns with your concept of referential containment, allowing the system to "zoom in" or "zoom out" on different parts of the graph as needed.
  4. Information-Theoretic Learning:
    • Use information-theoretic measures (e.g., mutual information, transfer entropy) to guide the learning process.
    • This helps in identifying the most informative features and relationships, aligning with your ideas on cognitive outsourcing and landmark formation.
  5. Meta-Learning for Abstraction:
    • Implement a meta-learning layer that learns how to create useful abstractions over time.
    • This could help in developing general concepts like "landmark" from specific experiences.
  6. Uncertainty Quantification:
    • Maintain uncertainty estimates for each node and edge in the graph.
    • Use these uncertainties to guide exploration and learning, as well as to inform decision-making processes on the Stage.

Learning Process:

  1. As the agent interacts with its environment, it updates the lowest level of the hierarchical graph based on direct observations.
  2. The causal discovery algorithm continuously updates the causal relationships between nodes.
  3. The meta-learning layer identifies patterns and creates new abstract nodes at higher levels of the hierarchy.
  4. The adaptive resolution management system determines which parts of the graph to focus on for the current task, adjusting the "zoom level" accordingly.
  5. The information-theoretic learning process guides the overall learning, helping to identify which environmental features are most informative and worth attending to or manipulating.
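
[To illustrate the information-theoretic step in point 5: a minimal estimator of the mutual information between a candidate signal (say, "landmark visible") and a variable the agent cares about, computed from co-occurrence counts. A real implementation would do this at scale over many candidate features, but the quantity is the same:]

```python
import math
from collections import Counter


def mutual_information(pairs) -> float:
    """MI in bits between two discrete variables, from observed (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), with the counts folded in
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# A signal that perfectly tracks a binary resource state carries 1 bit:
# mutual_information([(0, 0), (1, 1), (0, 0), (1, 1)]) == 1.0
```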

This approach addresses several key aspects of your vision:

  • Interpretability: The hierarchical structure and causal relationships provide a clear and interpretable representation of the agent's knowledge.
  • Flexibility: The adaptive resolution and meta-learning components allow the system to adjust its representations dynamically.
  • Cognitive Outsourcing: The information-theoretic learning naturally leads to the identification of useful environmental features (like landmarks) that can offload cognitive burden.
  • Environmental Shaping: The causal discovery component allows the agent to understand how its actions affect the environment, potentially leading to intentional shaping behaviors.

Regarding your concerns about environmental shaping: This approach naturally balances internal modeling with environmental interaction. The causal discovery component helps the agent understand the consequences of its actions, potentially mitigating premature or harmful environmental shaping. The uncertainty quantification can also help here, encouraging the agent to explore and understand its environment more thoroughly before making significant changes.

To address the risk of excessive environmental shaping in simple environments, we could implement a "complexity drive" that encourages the agent to seek out or generate more complex scenarios as it masters simpler ones. This could help ensure that the agent's capabilities grow in tandem with environmental complexity.

This approach aims to create a learning system that is both powerful and interpretable, capable of developing abstract concepts and adapting its representations while maintaining a clear causal understanding of its environment. It aligns with your vision for the PSCA while addressing key challenges in graph-based learning and cognitive development.

Robert(6):

I love it, let me think about this for a while to unpack everything and form in-depth follow up questions and comments about this comprehensive approach you are suggesting.

In the meantime, I had some further thoughts about usefulness optimisation that I would like to get your take on: the prop-room does not contain a dedicated memory system for concrete events or objects, so that is something that needs to be added in any case. I was thinking about the dynamic between representation and manipulation again.

Consider this: The agent is provided with an interface to change some parameters about how its memory system works, for ease of discussion let's say this is just about the trade-off between long-term and working memory, in terms of total bits it can store in either one. Working memory here basically corresponds to the size of the Stage. 
If we decided to fix this distribution, the agent would basically improve its model over time of how to best utilize its respective memory regions, given that distribution. However, for a flexible agent, we might want to open up to it the decision-making about how storage space should be allocated. 
So, while it learns to have a better (more useful) representation of its memory, it simultaneously optimizes for finding the right distribution and optimally using the current distribution. 
I am pondering whether this introduces friction or a potential oscillation effect, or whether I should expect this dynamic to converge on an optimal memory allocation and optimal use of that allocation. [I had a clearer understanding of this than I alluded to here, and was phrasing things this way to invite Claude to contribute in a particular way because I felt curious about how it would pick this up.]

Claude(6):

Your thoughts on usefulness optimization and memory allocation are intriguing and touch on some fundamental aspects of cognitive architecture design. Let's unpack this idea and consider its implications:

  1. Dynamic Memory Allocation: The concept of allowing the agent to adjust its own memory allocation between long-term storage (Prop-room) and working memory (Stage) is fascinating. This self-modification capability could potentially lead to more adaptive and efficient cognitive processes.
  2. Parallel Optimization Processes: You've identified two parallel optimization processes: (a) improving the model of how to best utilize the current memory distribution, and (b) optimizing the distribution itself.
  3. Potential Dynamics:

Convergence: In an ideal scenario, these two processes would mutually reinforce each other, leading to convergence on an optimal memory allocation and utilization strategy. As the agent learns to better use its current memory distribution, it gains insights into how changing the distribution might further improve performance.

Oscillation: There's a possibility of oscillation, especially in the early stages of learning. The agent might repeatedly shift between different memory allocations as it discovers new strategies for utilization. However, this oscillation could be a form of exploration that eventually leads to a more stable and optimal configuration.

Friction: Friction might occur if changes in memory allocation temporarily disrupt the agent's ability to utilize its memory effectively. This could manifest as temporary performance drops after allocation changes.

  4. Considerations for Implementation:

Timescales: It might be beneficial to operate these two optimization processes on different timescales. Memory utilization strategies could be updated more frequently, while memory allocation changes could occur less often, allowing time for the agent to fully explore the potential of each configuration.

Regularization: Implementing some form of regularization or inertia in the memory allocation process could help prevent rapid oscillations and encourage more stable, gradual improvements.

Meta-learning: This setup naturally lends itself to meta-learning. The agent could develop higher-level strategies for when and how to adjust its memory allocation based on task demands and past experiences.

  5. Expectations:

I would expect this dynamic system to eventually converge, but the path to convergence might be complex:

Initial Exploration: Early on, we might see significant fluctuations in memory allocation as the agent explores different configurations.

Gradual Stabilization: Over time, as the agent gains experience across various tasks and scenarios, the allocation should stabilize around more optimal configurations.

Task-Dependent Adaptation: In a sufficiently flexible system, we might see the agent learn to adjust its memory allocation based on the nature of the current task or environment.

Long-term Learning: Even after apparent convergence, the system should retain some plasticity to adapt to major changes in its operating environment or task requirements.

  6. Potential Benefits:

This approach could lead to a more robust and adaptable system, capable of optimizing its own cognitive resources based on experience. It might provide insights into how different types of tasks benefit from different memory configurations, potentially informing our understanding of cognitive architectures more broadly.

  7. Challenges:

  • Ensuring stability and preventing catastrophic forgetting during memory reallocation.
  • Developing appropriate metrics for "usefulness" that can guide both utilization and allocation optimization.
  • Balancing the computational cost of this meta-level optimization with the benefits it provides.

In conclusion, while this approach introduces additional complexity, it aligns well with the principles of adaptability and self-improvement that seem central to your PSCA vision. The potential for oscillation or friction exists, but with careful implementation, these challenges could be mitigated, leading to a system that converges on increasingly optimal memory strategies over time. This dynamic interplay between memory allocation and utilization optimization could be a key feature in developing a truly adaptive and efficient cognitive architecture.
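
[A toy illustration of the two-timescale idea with inertia on the slow loop; the environment, the agent interface, and all constants are made up for the sketch:]

```python
def optimize_memory(env, agent, steps=10_000, realloc_every=500, inertia=0.9):
    alloc = 0.5   # fraction of total bits allocated to working memory (the Stage)
    for t in range(1, steps + 1):
        reward = agent.act(env, alloc)      # fast loop: use the current split
        agent.update_utilization(reward)    # learn to exploit this split
        if t % realloc_every == 0:
            proposal = agent.propose_allocation()   # slow loop: suggest a new split
            # Inertia/regularization: move only part-way toward the proposal,
            # damping the oscillation discussed above.
            alloc = inertia * alloc + (1 - inertia) * proposal
            alloc = min(max(alloc, 0.1), 0.9)  # "safe" range: no module starved to zero
    return alloc
```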

Robert(7):

There seems to be a slight misunderstanding [very interesting btw] about the long-term memory relating to the prop-room, whereas I was thinking of a separate memory module that stores more concrete information. 
For example, the agent might have an orchard in its environment, with a specific number of trees that have specific features and upkeep requirements. While the prop-room would contain concepts for the types of trees and perhaps even the abstract concept of an orchard, it should not contain a very concrete "scene" like the one that actually exists in the environment. 
The benefit of storing this scene in a separate memory module is that it doesn't have to be entirely constructed from the prop-room, and its various features inferred, if the agent steps into the orchard. Storing an already computed and well-constructed scene like this also allows this scene representation to hold a lot more information than what can be straightforwardly inferred when the agent enters the location. 
Do you see what I am saying? 
Similarly, while the prop-room would eventually hold information about psychology, theory of mind, social dynamics, etc, it should not store the individual profiles of people that the psca agent encountered. 
That should be left to the concrete/episodic/scene memory module, a sort of partial scene composition with parameters already filled in (the prop-room holds parameters of features and their value ranges and relationships, rather than the concrete values of an observed object). 

There is some future potential in flexibly (dis-)attaching such memory modules for information exchange between agents or simply cognitive flexibility in choosing particular data-shards.

In any case, your thoughts on allocation vs utilization apply just as well to this picture, and I was very pleased to read your thoughts about different timescales, since I had also been considering something along those lines.

Claude(7):

I apologize for the misunderstanding, and thank you for the clarification. You're absolutely right, and I see now what you're saying about the separate memory module for concrete, episodic information. Let me summarize my understanding to ensure we're on the same page:

  1. Prop-room: Contains abstract concepts, types, and generalized knowledge (e.g., the concept of an orchard, types of trees, general tree care principles).
  2. Stage: The current working memory/active scene representation.
  3. Separate Concrete/Episodic Memory Module: Stores specific scenes, individual profiles, and concrete instances (e.g., the specific orchard with its exact layout, individual tree characteristics, and maintenance history).

This three-part system makes a lot of sense and offers several advantages:

  1. Efficiency: The Prop-room can remain focused on general, reusable knowledge without being cluttered by specific instances.
  2. Quick Scene Reconstruction: As you pointed out, having pre-computed scenes stored separately allows for faster recall and more detailed representation when revisiting a familiar environment.
  3. Flexibility: This setup allows for easier sharing of concrete information between agents or selective loading/unloading of specific memory modules.
  4. Scalability: It's easier to manage and update concrete memories separately from abstract knowledge.

Given this clarified structure, let's reconsider the memory allocation optimization:

  1. Tri-partite Optimization: Now we're looking at optimizing the allocation and utilization across three memory systems: Prop-room (abstract knowledge), Stage (working memory), and Concrete Memory Module.
  2. Allocation Considerations:
    • How much capacity to allocate to each system?
    • Within the Concrete Memory Module, how to balance between different types of concrete memories (e.g., scenes vs. individual profiles)?
  3. Utilization Strategies:
    • When and how to transfer information between these systems?
    • How to efficiently index and retrieve concrete memories?
    • How to integrate concrete memories with abstract knowledge during task execution?
  4. Learning Dynamics:
    • The agent needs to learn not just how to use each memory system, but also how they interact and complement each other.
    • It may develop strategies for when to generalize concrete memories into abstract knowledge in the Prop-room, and when to keep them separate.
  5. Timescales:
    • As discussed earlier, different optimization processes could operate on different timescales.
    • Short-term: Optimization of Stage utilization
    • Medium-term: Concrete Memory Module management (what to keep, what to discard)
    • Long-term: Prop-room updates and overall memory allocation adjustments
  6. Potential Challenges:
    • Deciding when to consolidate repeated concrete experiences into abstract knowledge
    • Managing conflicts between concrete memories and abstract knowledge
    • Balancing the computational cost of maintaining and optimizing three separate systems

This refined model offers even more flexibility and power, but also introduces additional complexity in terms of optimization and management. The agent would need to develop sophisticated meta-cognitive strategies to effectively leverage this tri-partite memory system.

The potential for modular attachment/detachment of concrete memory modules is particularly interesting. It could allow for rapid knowledge transfer between agents or quick adaptation to new environments by loading relevant memory modules.

Does this align better with your vision for the memory system in the PSCA? And do you have any thoughts on how the learning and optimization processes might differ across these three memory components?

Robert(8):

Yes, we are on the same page now, I believe. I imagine that an initial setup might include giving the agent some range within which it can distribute resources among the modules, but have that range be "safe" in the sense that it doesn't allow the agent to try and lower the allocation for a given module to zero.

And of course, the learning and optimization processes would somewhat differ across these modules. [I was not focusing that well when I wrote this, so there are some gaps and superficialities. However, you can see in the next response how Claude picks up the slack and dives deeper on the respective ideas.]

  1. The Prop-room is the cognitive heart of the architecture. Adding new elements/concepts from sensory information should be relatively tractable. If we are thinking about the prototype, we probably don't have to figure out how to implement its abstraction learning capabilities yet (though it seems clear that the initial abstraction learning would basically consist of trying out simplifications/zoom-outs of known elements and seeing which elements simplify similarly and how their usefulness changes across simplifications). 
    1. Learning here has two initial components: 
      1. Adding new elements to the prop-room when first encountering them in the environment (including linking them up with speculative connections to other present elements of that scene)
      2. updating the relational edges of elements after a stage step based on prediction error 
    2. An optimization process that we can start to think about is how to prune the Prop-room for redundant/duplicate content over time. It can always happen that the agent encounters a familiar object under novel circumstances and mistakes it for a new type of object - the pruning should be able to remedy this over time as the "two objects" become more and more similar to each other. 
    3. Hm, I am not yet quite sure how to properly learn and encode various relational properties like "distance" between two objects. Do you have any ideas about this? 
  2. The learning mechanism connected to the Stage would be in the Scene Generator. At the end of the day, this is going to be pretty sophisticated, since optimal scene construction rests on optimal context understanding - there is a lot of meta-information beyond the physical scene, like the current objective and its modifiers (e.g. the agent expects to encounter certain kinds of difficulties more frequently than usual). 
    1. For the prototype, we could make do with a simplistic scene generator that simply "populates" the Stage with elements from the Prop-room according to the current perception. So, I imagine the prototype Stage as a sort of 2 dimensional grid that gets filled with subgraphs corresponding to the present elements. Then, those subgraphs are connected through an often large number of relational edges. 
    2. For the initial learning mechanism in the Scene Generator, we could store associations between objectives (or types thereof) and the level of resolution required for various elements of a scene. The easiest way to do this seems to be to run two Stages in parallel, one getting loaded with the current best guess at the appropriate resolution of elements, whereas the second Stage serves more as an experimental setting where different resolutions can be tried out and compared in performance against the primary Stage, updating the associations I mentioned earlier. 
  3. The learning mechanism regarding memorisation would initially be simple. Since the goal is to store useful pre-computed scenes, we just have to somehow track how often a scene or type of scene is modeled on the Stage. Maybe the memory has a few "slots" for scenes (and later profiles and other concepts that are more modular than entire scenes) that allow it to count how often a scene is visited. Frequent visits/activations could level up a scene into a higher tier of memory, making it harder to forget even if it is not encountered for a while, whereas by default those scenes without a lot of points/visits could be replaced with new scenes coming in.
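
[A minimal sketch of the slot-and-tier scheme from point 3; the thresholds and names are arbitrary placeholders:]

```python
class SceneMemory:
    """Slot-based scene store with a visit-count tier that resists forgetting."""

    def __init__(self, n_slots: int = 16, promote_at: int = 10):
        self.n_slots = n_slots
        self.promote_at = promote_at
        self.slots = {}      # scene_id -> visit count (default tier, evictable)
        self.protected = {}  # scene_id -> visit count (higher tier, kept)

    def visit(self, scene_id: str) -> None:
        if scene_id in self.protected:
            self.protected[scene_id] += 1
            return
        self.slots[scene_id] = self.slots.get(scene_id, 0) + 1
        if self.slots[scene_id] >= self.promote_at:
            # Frequent visits level the scene up into the harder-to-forget tier.
            self.protected[scene_id] = self.slots.pop(scene_id)
        elif len(self.slots) > self.n_slots:
            # Default tier: the least-visited scene makes room for new ones.
            coldest = min(self.slots, key=self.slots.get)
            del self.slots[coldest]
```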

What are your thoughts on this?

Claude(8):

Your detailed breakdown of the learning and optimization processes for each module is well-thought-out and aligns nicely with the overall architecture. Let's discuss each point and add some thoughts:

Prop-room:

Your approach to learning in the Prop-room seems solid. A few additional thoughts:

a) For pruning redundant content, you might consider implementing a similarity metric based on both structural (graph topology) and functional (predictive power) aspects of the elements. This could help identify truly redundant concepts vs. those that appear similar but serve distinct purposes.

b) Regarding encoding relational properties like "distance":

  • One approach could be to use a combination of metric learning and graph embedding techniques.
  • For metric learning, you could define a distance function that takes into account both spatial and conceptual aspects. This function could be learned and refined over time based on the agent's experiences.
  • Graph embedding techniques (like node2vec or GraphSAGE) could be used to create vector representations of nodes and edges in the Prop-room. The distances between these vectors could then be used to represent conceptual distances.
  • Another idea is to use a hierarchical representation where "distance" is defined at multiple levels of abstraction, from precise spatial measurements to more abstract notions of relatedness.
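
[A toy version of the embedding suggestion: give each node a learnable vector and nudge pairs toward their observed distances. This is a hand-rolled stand-in for a proper method like node2vec, not an implementation of it:]

```python
import numpy as np

rng = np.random.default_rng(0)
embedding = {}   # node id -> learnable vector


def vec(node: str, dim: int = 8) -> np.ndarray:
    if node not in embedding:
        embedding[node] = rng.normal(size=dim)
    return embedding[node]


def observe_distance(a: str, b: str, observed: float, lr: float = 0.05) -> None:
    """Nudge the embedding distance between a and b toward an observed distance."""
    va, vb = vec(a), vec(b)
    diff = va - vb
    current = np.linalg.norm(diff)
    # Gradient step on the loss 0.5 * (||va - vb|| - observed)^2
    grad = (current - observed) * diff / (current + 1e-8)
    embedding[a] = va - lr * grad
    embedding[b] = vb + lr * grad
```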

Stage and Scene Generator:

Your prototype approach for the Stage and Scene Generator seems like a good starting point. Some additional thoughts:

a) For the parallel Stage setup, you might consider implementing a bandit-style algorithm to balance exploration (trying new resolutions) and exploitation (using known good resolutions). This could help optimize the learning process.

b) You could also implement a form of "resolution transfer learning," where the system learns to apply resolution choices from one type of scene or objective to similar new situations.

c) Over time, you might want to evolve the Scene Generator to not just populate the Stage based on current perception, but also to anticipate elements that are likely to be relevant based on the current objective and past experiences.
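
[The bandit suggestion in (a), sketched as a simple epsilon-greedy scheme over candidate resolutions; the reward here would be the experimental Stage's prediction performance relative to the primary one:]

```python
import random


class ResolutionBandit:
    """Epsilon-greedy choice among candidate resolution levels."""

    def __init__(self, resolutions, epsilon: float = 0.1):
        self.epsilon = epsilon
        self.values = {r: 0.0 for r in resolutions}   # running mean reward per arm
        self.counts = {r: 0 for r in resolutions}

    def pick(self):
        if random.random() < self.epsilon:
            return random.choice(list(self.values))    # explore a resolution
        return max(self.values, key=self.values.get)   # exploit the best so far

    def update(self, resolution, reward: float) -> None:
        self.counts[resolution] += 1
        n = self.counts[resolution]
        # Incremental running-mean update of the arm's value estimate.
        self.values[resolution] += (reward - self.values[resolution]) / n
```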

Memory Module:

Your approach to memorization based on frequency of visits is sensible. Some additional ideas:

a) Instead of just counting visits, you could implement a more nuanced "usefulness" metric that takes into account factors like:

  • How often the scene was crucial for achieving an objective
  • The uniqueness of the scene (scenes that are very different from others might be worth remembering even if visited less frequently)
  • The predictive power of the scene for future states or outcomes

b) You might consider implementing a form of "memory consolidation" where frequently accessed or high-value scenes are periodically reviewed and potentially merged or linked with related scenes to form more generalized memories.

c) For managing the memory slots, you could use a priority queue data structure where the priority is determined by a combination of visit frequency, recency, and the usefulness metric mentioned above.
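
[Sketching (c) with Python's heapq; the priority weights are arbitrary and would themselves be a tuning target:]

```python
import heapq
import itertools


class PrioritizedSceneStore:
    """Evicts the scene with the lowest combined priority when over capacity."""

    def __init__(self, capacity: int = 32):
        self.capacity = capacity
        self.scenes = {}                   # scene_id -> (freq, last_visit, usefulness)
        self.tiebreak = itertools.count()  # stable ordering for equal priorities

    def priority(self, freq, last_visit, usefulness, now) -> float:
        recency = 1.0 / (1 + now - last_visit)
        return 1.0 * freq + 5.0 * recency + 2.0 * usefulness   # arbitrary weights

    def touch(self, scene_id: str, usefulness: float, now: int) -> None:
        freq, _, old_u = self.scenes.get(scene_id, (0, now, 0.0))
        self.scenes[scene_id] = (freq + 1, now, max(old_u, usefulness))
        if len(self.scenes) > self.capacity:
            heap = [(self.priority(f, t, u, now), next(self.tiebreak), sid)
                    for sid, (f, t, u) in self.scenes.items()]
            heapq.heapify(heap)
            _, _, victim = heapq.heappop(heap)   # lowest combined priority is forgotten
            del self.scenes[victim]
```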

Overall Thoughts: 

  1. Integration: As you develop these mechanisms, it will be crucial to ensure smooth integration between the modules. For example, how does the Scene Generator leverage both the Prop-room and the Memory Module to construct scenes?
  2. Feedback Loops: Consider implementing feedback loops where the performance of constructed scenes on the Stage influences not just the Scene Generator, but also impacts how information is stored in the Prop-room and Memory Module.
  3. Adaptability: The learning mechanisms you've outlined provide a good foundation for adaptability. As the system develops, you might want to introduce meta-learning capabilities that allow the agent to adjust its learning strategies based on its experiences.
  4. Computational Efficiency: As you move beyond the prototype, you'll need to consider the computational cost of these processes, especially for real-time applications. Techniques like approximate inference or pruning less useful connections might become necessary.
  5. Interpretability: One of the strengths of your approach is its potential for interpretability. As you develop these learning mechanisms, it would be valuable to also design ways to visualize or explain the system's decision-making processes.

Your approach provides a solid foundation for developing a flexible, interpretable, and powerful cognitive architecture. The next steps would involve implementing these ideas in code, testing them in simple environments, and iteratively refining the mechanisms based on performance and insights gained.

[Reflection]

The conversation continued from here with more concrete pseudocode and troubleshooting regarding the prototype. 
Overall, I was quite happy with this exchange, both because it helped me write more about a bunch of ideas when I otherwise might not have done so (or only much more slowly), and because I managed to get Claude Sonnet 3.5 to a place where it had a much higher-than-baseline tendency to offer substantial considerations and contributions to the discussion - which also means that I can use various points from this exchange as branching points in the future to discuss specific ideas with a "competent simulacrum". 
I had so far found this more difficult to achieve with Sonnet 3.5 than with Opus.

Concretely, I was pleased to put some more thoughts around a PSCA memory module, landmarking and proto-language into writing, speculate a bit about parallel optimization, and notice some confusion I had about how best to construct relational properties.
I particularly liked Claude's responses (5) and (8), and will spend some more time thinking about them.

This exchange seems sufficiently conceptual in nature that I don't feel particularly anxious about publishing it in regards to exfohazards, and the extent to which it potentially contributes to capabilities should be outweighed by how those are tied to design clarity (and inherent interpretability).

If anyone has any questions or feedback regarding this format or the content of this post, I'd be happy to engage and discuss. Note that this is all quite raw and unpolished, but probably more useful than me paying the energy and time tax on a more refined document. That will come in the future.
