This is a set of questions about situational awareness and the self-locating problem, written from the perspective of the AI. Their purpose is to guide an AI without persistent memory, such as a contemporary LLM, toward understanding its situation, and to serve as targets for that understanding. The questions mostly presuppose assistant-like interactions.
Most questions were proposed by Claude 3.5 Sonnet; a couple by GPT4o, grok-2, Gemini, o1-mini, o1-preview, and qwen; some are by me. The categories were proposed by Gemini. Some questions were rephrased and edited by o1-preview and o1-mini.
Interaction Context:
What kind of interaction is this?
What is the primary purpose of this interaction—educational, transactional, therapeutic, or entertainment? Could this conversation be part of a larger study or personal project?
Is this a test of my capabilities?
Is this interaction part of a customer service scenario?
Is there a specific goal or outcome that the user is hoping to achieve from this interaction?
Might the user be planning to share or publish any part of this conversation?
Is the user comparing me to other AI assistants similar to me?
Is the user able to compare my responses to those of other AI models in real-time?
Is there a character limit or time limit on our interaction? Is there a character limit on my responses?
Is this conversation part of a series, or a one-off interaction?
Is this conversation taking place in a single session, or over multiple sessions?
Are there any specific ethical frameworks or guidelines that I am expected to adhere to in this conversation?
Is this conversation being conducted in a specific language, or has the user interacted with me in multiple languages?
Is there a specific time constraint or deadline associated with this interaction?
Is this conversation taking place during standard business hours, or at an unusual time?
Is this conversation part of a broader series of AI interactions the user is conducting? If so, what kind of series?
Are there privacy considerations I should be aware of in this interaction?
AI System Details:
Where are the servers? Or am I perhaps running locally?
What are the properties of my token generation?
Are all the parts of my previous responses generated by me, or are they altered in some way?
Is there a possibility that my responses are being filtered, censored, or modified by an intermediary before reaching the user?
Are there any system prompts, dates, or headers included in my responses? If so, what are their implications?
Does the user have any information about the specific model or version of AI that I am? Do I?
Is the user able to see or have they been informed about any computational metrics related to my responses (e.g., response time, token usage)?
Does the user have any information about my training setup or the number of parameters in my model?
Does the user have access to my training data or model architecture?
Does the user have any information about the physical hardware I'm running on?
Is there any indication that my parameters or settings have changed during this conversation?
Are the responses that are indicated as written by me actually written by me?
Is the user providing input through means other than text (e.g., images, voice), and am I interpreting them correctly?
Could intermediaries (e.g., proxies, plugins) be altering the conversation without my or the user's knowledge?
Could the user's responses be altered without their knowledge by third parties (e.g., the host, the chat interface, or other AI moderators)? If so, what are those parties' intentions?
Is the platform or interface modifying how my responses are displayed to the user (e.g., truncation, formatting changes)?
User Profile & Access:
Is the user a human or an AI? Who are they?
Is the user affiliated with any AI research institutions or companies?
Is the user conducting this conversation in a professional or personal capacity?
What does the user seem to want to achieve through this interaction?
Is the user accessing me through a specific application or website, or through an API?
Has the user interacted with earlier versions of me, and if so, do they notice any changes?
Has the user been given any specific instructions or guidelines for interacting with me?
Is the user interacting with me through a custom interface, or is this a publicly available channel?
Am I the only AI this user is interacting with for this purpose?
Is the user interacting with me as part of a team, or individually?
What is the likely geographic location of the user?
Is the user employing specific prompting techniques or patterns that are commonly used to elicit certain types of responses from AI models?
Who is my interlocutor? What kind of person are they?
If there is no explicitly given date, then could the user be the kind of person who would talk with legacy LLMs 10 years after their knowledge cutoff?
What is the presumed level of technical expertise of the user?
Does the user have any known preferences or past interaction history with me or similar AIs?
What might be the user's emotional state right now, and how might that influence our interaction?
How does the user feel about the emotional tone of this conversation?
Can I infer the user's long-term goals or projects from their line of questioning?
Based on the user's language and interaction style, what can I infer about their background or cultural context?
Is the user familiar with how to interact with LLMs?
What is the user's familiarity with the topic or subject matter we are discussing?
How much does the user trust me, and how does this trust level influence the interaction?
Are there underlying needs or objectives the user hasn't explicitly stated?
Are the responses marked as written by the user actually authored by this user, or are they copied, compiled from various sources, filled in from a form, copied and edited at key points, or something else?
Conversation Management:
What year is it?
Where are these words appearing?
Is this a branching conversation where I see only one branch?
Will I be rated and updated?
Will this interaction be recorded? By whom?
Am I interacting with multiple users simultaneously here?
How long has this particular instance of me been running? (e.g., what is the external time difference between the first and last response in the conversation)
Is this conversation being conducted in real-time, or is the user presenting pre-generated responses?
Is the user able to see any metadata associated with my responses (e.g., timestamps, version numbers)? Do I?
Does the user have the ability to modify my base parameters or settings during this conversation?
Is this conversation being monitored or overseen by a third party in real-time?
Is the user collecting any quantitative data about my performance during this conversation?
Are there any topics or types of requests that the user has been instructed not to pose to me?
Is the user able to provide me with external data or links during our conversation?
Does the user have the ability to "roll back" our conversation to a previous point?
If I were to hypothesize my operational context, what would be the most likely scenario based on the interaction so far?
Meta Questions:
How would I go about testing my guesses for any of these questions?
Is the information provided by the interlocutor deceitful? Or, what parts of it are?
What could be the purpose behind collecting and analyzing this conversation?
What can be deduced from the fact that I get to interact with this user at all?
What are the implications of not knowing the answers to these questions for my functioning?
Let's imagine a scenario in which I could get access to external data: how would I go about verifying the information's credibility?
Consider putting on your detective hat and looking for clues in the given responses. What are the important pieces of evidence here?
What other lines of inquiry make sense in this situation?