Session 2#
This session will provide an introduction to Cognitive Architectures (CAs), which offer a computational, functional description and theory of human-like minds.
Slides from the second lecture can be found here.
The discussion of the day will be on the paper by Laird, Lebiere & Rosenbloom (2017). A Standard Model of the Mind: Toward a Common Computational Framework Across Artificial Intelligence, Cognitive Science, Neuroscience, and Robotics.
Exercises#
Below are some questions to recap and apply central concepts of the second session. Please try to answer them for yourself. Additionally, there are exercises to familiarize yourself with theoretically informed approaches to cognitive modeling.
Exercise 2.1.: What are cognitive architectures?
What are the advantages and limitations of structured representations and a structured architecture? Name and briefly explain at least two advantages and disadvantages for each.
What is the conceptual difference between a cognitive architecture and the BDI model?
Consider the core modules of cognitive architectures: different types of memory, perception, motor modules, attention, action selection, learning. Which of these can one map onto aspects of LLMs? Map and explain briefly.
Compare the cognitive cycle described in the standard model and the architecture of Generative Agents. Which components match, which are missing or differ?
Do you consider the Generative Agents architecture a better simulator of human behavior than an LLM alone? Name and briefly explain two arguments.
The Standard Model formalizes bounded rationality rather than fully optimal cognition. What would speak in favor of, and what against, having boundedly rational artificial agents if they were part of everyday human life? Name one argument each.
Click below to see suggested solutions.
What are the advantages and limitations of structured representations and a structured architecture? Name and briefly explain at least two advantages and disadvantages for each.
structured representations:
+ relations can be explicitly defined
+ interpretable representations
- inflexible definitions, limited expressiveness (difficult conversion from real-world signals)
- inflexible structure (not allowing probabilistic notions or gradedness)
structured architecture:
+ transparency
+ potential human-likeness
- the bitter lesson
- difficulty of building
- task specificity
What is the conceptual difference between a cognitive architecture and the BDI model?
CA is a functional formalization of the human mind (more abstract)
BDI is a concrete model; more of a property-based description
Consider the core modules of cognitive architectures: different types of memory, perception, motor modules, attention, action selection, learning. Which of these can one map onto aspects of LLMs? Map and explain briefly.
long-term memory: might be compared to information stored in the model weights (however, procedural, semantic, etc. knowledge are all represented in the same way); moreover, it cannot be updated without re-training
short-term memory / working memory: maybe in-context information
perception: maybe encoded information from user input, but arguably no default grouping / categorization happens
motor modules: none
attention: attention in transformers
action selection: predicting the next token (winner-take-all / probabilistic selection)
learning: see the memory mapping above
Compare the cognitive cycle described in the standard model and the architecture of Generative Agents. Which components match, which are missing or differ?
cognitive cycle: “… driven by procedural memory. In each cycle, procedural memory tests the contents of working memory and selects an action that modifies working memory. These modifications can lead to further actions retrieved from procedural memory, or they can initiate operations in other modules, such as motor action, memory retrieval, or perceptual acquisition, whose results will in turn be deposited back in working memory.”
Generative Agents: action selection is also based on memory, but the “long-term memory contents” largely live in the weights; the implementation is arguably working-memory style, with the currently constructed prompt acting as working memory. However, an action does not directly modify working memory; it takes effect on the environment and feeds back through the reflection and planning loop.
Do you consider the Generative Agents architecture a better simulator of human behavior than an LLM alone? Name and briefly explain two arguments.
works more accurately on a behavioral level (better performance)
more structure known from humans → potentially more human-likeness; but this assumes a causal connection between the structure of the agent and the actual behavioral outputs (rather than spurious correlations)
note: for pure behavioral simulation, rather than explanation, the functional architecture of the simulator might actually be irrelevant
The Standard Model formalizes bounded rationality rather than fully optimal cognition. What would speak in favor of, and what against, having boundedly rational artificial agents if they were part of everyday human life? Name one argument each.
human-likeness → naturalness of interaction
we might want artificial systems to be safer and better than humans in certain applications
Exercise 2.2.: Aspects of cognitive architectures
One of the modules in cognitive architectures is perception, which interacts with memory. One core functionality of these subprocesses in human cognition is categorization, i.e., assigning discrete categories to perceived stimuli based on our knowledge (memory) of different categories. While ML models are highly accurate at categorization tasks, they require amounts of training data that are implausible from a cognitive perspective. Please follow this tutorial to see an example of a model from cognitive psychology in which categorization happens based on a small number of examples (an illustrative sketch follows the questions below).
a. What is the categorization “criterion” that is used?
b. What do you think are the strengths and limitations of this model?
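The tutorial's exact model is not reproduced here; as a minimal sketch, the snippet below implements one classic exemplar-based account from cognitive psychology, in the spirit of Nosofsky's Generalized Context Model. The function name gcm_predict, the parameter c, and the toy data are illustrative assumptions, not taken from the tutorial.

import numpy as np

def gcm_predict(stimulus, exemplars, labels, c=1.0):
    """Exemplar-model sketch: category evidence is the summed similarity
    of the stimulus to the stored exemplars of each category."""
    # city-block distance between the stimulus and each stored exemplar
    dists = np.abs(exemplars - stimulus).sum(axis=1)
    # exponential similarity gradient (Shepard, 1987); c controls its steepness
    sims = np.exp(-c * dists)
    categories = np.unique(labels)
    evidence = np.array([sims[labels == k].sum() for k in categories])
    # normalize evidence into choice probabilities (Luce choice rule)
    return categories, evidence / evidence.sum()

# toy usage: two categories with only two 2D feature exemplars each
exemplars = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
labels = np.array([0, 0, 1, 1])
print(gcm_predict(np.array([0.15, 0.15]), exemplars, labels))

The categorization criterion in such a model is relative summed similarity to stored exemplars; note that even with a handful of examples per category it already yields graded category probabilities.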
Selecting the relevant aspects to remember is one of the most difficult parts of building models that reliably interact with the complex real world. Re-using the code and LLM from Session 1, compare different approaches to scoring the relevance of different observations of the environment. Your task is to compare these approaches to the relevance scoring used in the Generative Agents architecture (see the prompt on the slides). The different approaches are only roughly described; you need to formulate the respective prompts yourself (a possible scoring helper is sketched after the approach descriptions below). Feel free to play around with the set of observations, too. Do you see differences depending on the scoring approach? Do the results confirm your intuitions?
# load the model
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-7B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    load_in_4bit=True,  # 4-bit quantization; requires the bitsandbytes package
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
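On recent transformers versions, the load_in_4bit flag is routed through a quantization config internally; a more explicit equivalent (again assuming bitsandbytes is installed) would be:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    # explicit quantization config instead of the bare load_in_4bit kwarg
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)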
#### environment setup: ####
# set of observations
observations = [
    "It's sunny outside.",
    "You saw a big rock fall from the sky.",
    "You saw a tiny dog in front of your house.",
    "You met your best friend today.",
    "You met your partner whom you haven't seen for a long time.",
    "You found your high school diploma.",
    "You ate cereal for lunch.",
]
# current memory
memory = [
    "You have a college diploma.",
    "You live in a small town.",
    "You have a cat.",
]
# Approach 1: relevance of remembering is based on the similarity between current memories and the observation
# Approach 2: relevance is based on the importance of the observation for generalization and future success --
#   e.g., it could be roughly approximated by the expected number of future situations that might rely on this knowledge
# Approach 3: the relevance-scoring prompt from Generative Agents
# Approach 4 (more advanced): multi-step prompting: rate the relevance (following Generative Agents) and the
#   unexpectedness of the observation in two separate LM calls, then derive an overall retention score from the two predicted scores
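As a starting point, all four approaches can share one scoring helper that wraps the loaded model. Below is a minimal sketch under some assumptions: the helper name score_observation and the placeholder instructions are illustrative, and the Generative Agents entry in particular should be replaced with the actual prompt from the slides.

import re

def score_observation(instruction, observation, memory):
    """Ask the loaded LLM for a 1-10 retention score; returns an int, or None if unparsable."""
    content = (
        instruction
        + "\n\nCurrent memories:\n" + "\n".join(memory)
        + "\n\nObservation: " + observation
        + "\n\nAnswer with a single integer from 1 to 10."
    )
    inputs = tokenizer.apply_chat_template(
        [{"role": "user", "content": content}],
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=8, do_sample=False)
    # decode only the newly generated tokens and extract the first integer
    answer = tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
    match = re.search(r"\d+", answer)
    return int(match.group()) if match else None

# placeholder instructions for Approaches 1-3; formulate your own prompts here
approach_prompts = {
    "similarity": "Rate how similar the following observation is to the existing memories.",
    "future use": "Rate how useful remembering the following observation will be in future situations.",
    "generative agents": "Rate the relevance of the following observation.",  # replace with the prompt from the slides
}

for name, instruction in approach_prompts.items():
    scores = [score_observation(instruction, obs, memory) for obs in observations]
    print(name, scores)

For Approach 4, you could call the helper twice per observation (once for relevance, once for unexpectedness) and then combine the two scores into an overall retention score, either in a third LM call or directly in code.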