Memory models
Andrea Stocco
University of Washington
Seattle, WA
PSYCH448D, Week 5 Models of Memory /2 Multiple Trace Models
Where we left off: Bayesian Considerations
The probability of retrieving memory m in context Q, made of cues q1, q2 … qi … qN, is expressed as:
$$\frac{p(m \mid Q)}{p(\neg m \mid Q)} = \frac{p(m)}{p(\neg m)} \times \frac{p(Q \mid m)}{p(Q \mid \neg m)} = \frac{p(m)}{p(\neg m)} \times \prod_i \frac{p(q_i \mid m)}{p(q_i)}$$
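As a minimal sketch of the odds form, the following Python multiplies prior odds by one ratio p(qi | m) / p(qi) per cue. The function name `posterior_odds` and all numeric values are illustrative assumptions, not part of the lecture.

```python
import math

def posterior_odds(p_m, cue_ratios):
    """Posterior odds p(m|Q) / p(not m|Q), treating cues as independent.

    p_m        -- prior probability of retrieving memory m (0 < p_m < 1)
    cue_ratios -- one ratio p(q_i | m) / p(q_i) per cue in context Q
    """
    prior_odds = p_m / (1.0 - p_m)
    return prior_odds * math.prod(cue_ratios)

# Hypothetical numbers: a weak prior and two supportive cues
odds = posterior_odds(0.2, [2.0, 3.0])
prob = odds / (1.0 + odds)  # convert odds back to a probability
print(f"odds = {odds:.2f}, probability = {prob:.2f}")
```

Ratios above 1 mark cues that make the memory more likely; ratios below 1 mark cues that count against it.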
The priors: The history of a memory
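Focus on priors
The probability of retrieving memory m in context Q, made of cues q1, q2 … qi … qN, is expressed as:
$$\frac{p(m \mid Q)}{p(\neg m \mid Q)} = \frac{p(m)}{p(\neg m)} \times \frac{p(Q \mid m)}{p(Q \mid \neg m)} = \frac{p(m)}{p(\neg m)} \times \prod_i \frac{p(q_i \mid m)}{p(q_i)}$$
The term in focus is the prior odds, p(m) / p(¬m).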
A mechanistic model of memory: ACT-R
● ACT-R = Adaptive Control of Thought - Rational
● John Anderson’s follow-up to rational analysis
● Other models exist:
○ REM model by Shiffrin
○ MINERVA model by Hintzman
○ …
● They are all very similar
● All of these models are rooted in the Multiple Trace Theory
The Multiple Trace Theory
● Perhaps the dominant framework on the
neurobiology of memory
● Every time you encode information, you make a
new trace
● New traces can be made for the same memory
The Multiple Trace Theory: Formal definition
● Formally, a memory m is a collection of n traces,
encoded at times t1, t2… tn.
[Figure: memory m, “In Italian, ‘Fish’ is ‘Pesce’”, drawn as three traces along a time axis: t1 (1st time, in classroom), t2 (2nd time, in restaurant), t3 (3rd time, at the market)]
Retention curve for a trace
The odds of retrieving the i-th trace decrease as a power function of the time since it was created. At time t, they are:
$$\frac{P(i)}{P(\neg i)} = (t - t_i)^{-d}$$
● ti = time when the i-th trace was created
● (t - ti) = time since creation of i-th trace
● d = decay rate
Retention curve for a trace (visualized)
$$\frac{P(i)}{P(\neg i)} = (t - t_i)^{-d}$$
● ti = time the i-th trace was created (ti = 0)
● d = decay rate (d = 0.5)
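The retention curve can be tabulated with a few lines of Python. This is a sketch that uses the slide's example values (ti = 0, d = 0.5); the function name `trace_odds` and the sampled time points are assumptions made for illustration.

```python
def trace_odds(t, t_i=0.0, d=0.5):
    """Odds of retrieving a trace created at t_i, evaluated at time t > t_i."""
    age = t - t_i
    if age <= 0:
        raise ValueError("t must be later than the trace creation time t_i")
    return age ** (-d)

# Sample the curve at a few arbitrary times (same units as t_i)
for t in (1, 4, 16, 100):
    odds = trace_odds(t)
    prob = odds / (1 + odds)  # odds converted to a probability
    print(f"t = {t:>3}: odds = {odds:.3f}, probability = {prob:.3f}")
```

With d = 0.5 the odds halve every time the age of the trace quadruples, which is why the curve drops quickly at first and then flattens.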
Retention curve for a memory
● The odds of remembering m are the odds of
remembering any of its constituent traces
● Thus, they are the sum of the odds of remembering each trace
$$\frac{P(m)}{P(\neg m)} = \sum_i (t - t_i)^{-d}$$
Memory activation
● The activation A(m) of a memory m is a scalar
value that reflects the current availability of a
memory.
● Mathematically, it is defined as the log of the
odds
$$A(m) = \log \frac{P(m)}{P(\neg m)} = \log \sum_i (t - t_i)^{-d}$$
● Notice:
○ Activation goes from -Inf to +Inf
○ When P(m) = 0.5, log[P(m) / P(¬m)] = 0: Natural forgetting threshold
Traces, memory, and activation
● Memory m is made
of three traces
● Traces created at
○ t1 = 0s
○ t2 = 4s
○ t3 = 12s
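A minimal sketch of how the activation of this three-trace memory could be computed over time. The trace times come from the slide; the decay rate d = 0.5 is the example value used for the retention curve, and the function name `activation` and the sampled times are assumptions.

```python
import math

def activation(t, trace_times, d=0.5):
    """Base-level activation A(m): log of the summed trace odds at time t."""
    # Only traces that already exist at time t can contribute
    ages = [t - t_i for t_i in trace_times if t > t_i]
    if not ages:
        return float("-inf")  # no trace yet, so the memory cannot be retrieved
    return math.log(sum(age ** (-d) for age in ages))

traces = [0.0, 4.0, 12.0]  # creation times in seconds
for t in (2, 6, 14, 60):
    print(f"t = {t:>2}s: A(m) = {activation(t, traces):.3f}")
```

Each new trace produces a jump in activation (frequency), and each jump fades as the trace ages (recency).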
Three Laws of Memory
Every memory model needs to account for three
established phenomena:
● Frequency
● Recency
● Spacing
Recency and Frequency
From base activations to retrieval times
● Activation encodes log odds of retrieval
● But log odds are related to effort, and effort is related to retrieval times (RT)
● The RT for retrieving a memory m is calculated from its activation A(m):
$$RT = T_{er} + F e^{-A(m)}$$
● Ter = non-retrieval time (e.g., perception and motor times)
● F is a scaling parameter
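The mapping from activation to retrieval time can be sketched as follows, assuming the standard ACT-R form in which the exponent is negative, so that higher activation gives faster retrieval. The default values for `t_er` and `f` are placeholders chosen for illustration, not fitted parameters.

```python
import math

def retrieval_time(a_m, t_er=0.3, f=1.0):
    """RT = T_er + F * exp(-A(m)), in seconds (parameter values are assumed)."""
    return t_er + f * math.exp(-a_m)

# Retrieval time shrinks toward t_er as activation grows
for a_m in (-1.0, 0.0, 1.0, 2.0):
    print(f"A(m) = {a_m:+.1f}: RT = {retrieval_time(a_m):.3f} s")
```

At A(m) = 0, the natural forgetting threshold, the retrieval component equals F exactly.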
The context: Environmental cues
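Focus on contextual cues
The probability of retrieving memory m in context Q, made of cues q1, q2 … qi … qN, is expressed as:
$$\frac{p(m \mid Q)}{p(\neg m \mid Q)} = \frac{p(m)}{p(\neg m)} \times \frac{p(Q \mid m)}{p(Q \mid \neg m)} = \frac{p(m)}{p(\neg m)} \times \prod_i \frac{p(q_i \mid m)}{p(q_i)}$$
The term in focus is the product over contextual cues, Πi p(qi | m) / p(qi).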
… But activation is in logs
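$$A(m) = \log\left( \frac{p(m)}{p(\neg m)} \times \prod_i \frac{p(q_i \mid m)}{p(q_i)} \right)$$
$$A(m) = \log\left( \frac{p(m)}{p(\neg m)} \right) + \log\left( \prod_i \frac{p(q_i \mid m)}{p(q_i)} \right)$$
$$A(m) = \log \sum_i (t - t_i)^{-d} + \sum_i \log\left( \frac{p(q_i \mid m)}{p(q_i)} \right)$$
The first sum runs over the traces of m. The second sum runs over the cues in Q: it is the sum of the support given by each individual cue qi.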
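The log form can be checked numerically with a short sketch: a base-level term from the trace history plus one log ratio per cue. The function name `total_activation`, the decay value, and the cue ratios are assumptions made for illustration.

```python
import math

def total_activation(t, trace_times, cue_ratios, d=0.5):
    """A(m) = log(sum of trace odds) + sum of log cue ratios.

    Assumes t is later than every trace creation time.
    cue_ratios holds one value p(q_i | m) / p(q_i) per cue.
    """
    base = math.log(sum((t - t_i) ** (-d) for t_i in trace_times))
    context = sum(math.log(r) for r in cue_ratios)
    return base + context

# Hypothetical memory: two traces and two supportive cues
a_m = total_activation(16.0, [0.0, 12.0], [2.0, 3.0])
print(f"A(m) = {a_m:.3f}")
```

Because the logarithm turns the product into a sum, the result equals the log of the full posterior odds: here log(0.75 × 2 × 3).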
Context = spreading activation
Old idea in cognitive science:
● Memories form a network
● Activation flows through connected nodes in the
network
● Activation flows from cues q1, q2 … to memory m
[Figure: cues q1, q2, q3 connected to memory m, each contributing to A(m)]
How activation spreads /1
● To understand spreading activation, we need to understand how memories are represented in ACT-R
Memory representation in ACT-R
● Filing cabinet metaphor
● Assumes that declarative memory takes the form of records
● A memory is a collection of structured information, with labeled entries
Example of memory representation in ACT-R
Memories:
PSYCH448D
    Is a: undergrad class
    Quarter: Fall 2022
    Instructor: Andrea Stocco
Andrea Stocco
    Is a: Professor
    Born in: Italy
    Works at: UW
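One way to picture these records in code is as dictionaries of labeled slots. This is only a sketch of the filing cabinet idea, not ACT-R's own chunk syntax; the variable and function names are assumptions.

```python
# Each memory is a record: a name plus labeled slots holding values
memories = {
    "PSYCH448D": {
        "is a": "undergrad class",
        "quarter": "Fall 2022",
        "instructor": "Andrea Stocco",
    },
    "Andrea Stocco": {
        "is a": "Professor",
        "born in": "Italy",
        "works at": "UW",
    },
}

def memories_containing(value, store):
    """Names of the records that hold `value` in any of their slots."""
    return [name for name, slots in store.items() if value in slots.values()]

print(memories_containing("Andrea Stocco", memories))
```

The slot value "Andrea Stocco" in the first record is also the name of the second record, which is how separate records come to share contents.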
How activation spreads /2
● To understand spreading activation, we need to understand how memories are represented in ACT-R
● Memories are records of structured
information
● They form networks of shared contents
From records to semantic network
[Figure: the record Canary-sings (isa: Property; object: Canary; attribute: Sings) and the record Canary-is-yellow (isa: Property; object: Canary; attribute: Yellow) redrawn as a network of nodes joined by isa, object, and attribute links. The two records share the nodes Property and Canary. The network also includes Adele-Sings, Sun-is-yellow, Star, …]
Links have associative strengths
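[Figure: network in which the nodes Property, Canary, Sings, and Yellow link to the memories Canary-sings and Canary-yellow. Each link is labeled with its strength: s(property, canary-sings), s(canary, canary-sings), s(sings, canary-sings), s(property, canary-yellow), s(canary, canary-yellow), s(yellow, canary-yellow)]
● sq,m = Strength of association between cue q and memory m
● Reflects the odds of co-occurrence of q and m
● Each strength is one term of the context sum:
$$\sum_i \log\left( \frac{p(q_i \mid m)}{p(q_i)} \right)$$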
Estimating associative strengths
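[Figure: the same network of Property, Canary, Sings, Yellow, Canary-sings, and Canary-yellow, with the links S(canary, canary-yellow) and S(canary, canary-sings) labeled]
● Sq,i = Strength of association between chunks q and i
● Reflects the odds of co-occurrence of q and i
$$\log\left( \frac{p(\text{canary} \mid \text{canary-yellow})}{p(\text{canary})} \right) = \log p(\text{canary} \mid \text{canary-yellow}) - \log p(\text{canary})$$
● log p(canary | canary-yellow): unknown constant
● log p(canary): number of memories about “canary”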
Estimating associative strengths /2
● Associative strengths can be estimated based on the occurrences of cue q across memories
● Specifically, as Sq,i = k – log(Nq)
● Nq is the number of memories that contain q in a slot
○ i.e., memories that q links to
● Bayesian idea: the more memories contain q, the less predictive q is of any one memory m!
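The counting estimate can be sketched in Python by representing memories as records and counting how many contain the cue. The record contents follow the canary example; the constant `k = 2.0`, the use of the natural log, and the function name `associative_strength` are assumptions.

```python
import math

# Two toy records, following the canary example
records = {
    "canary-sings": {"isa": "property", "object": "canary", "attribute": "sings"},
    "canary-is-yellow": {"isa": "property", "object": "canary", "attribute": "yellow"},
}

def associative_strength(cue, store, k=2.0):
    """S = k - ln(N), where N counts the records that hold `cue` in a slot."""
    n = sum(1 for slots in store.values() if cue in slots.values())
    if n == 0:
        raise ValueError(f"cue {cue!r} does not appear in any record")
    return k - math.log(n)

for cue in ("canary", "sings", "yellow"):
    print(f"S({cue}) = {associative_strength(cue, records):.3f}")
```

"canary" appears in both records, so it is a weaker cue for either of them than "sings" or "yellow", which each appear in only one.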
Important consequence: The “fan”
Fan = 1, Sq,m = k – ln(1) = k
Fan = 2, Sq,m = k – ln(2) ≈ k – 0.69
Fan = 3, Sq,m = k – ln(3) ≈ k – 1.10
[Figure: cue q (shape: circle) shown at three fan levels. With fan = 1 it links to Sun; with fan = 2, to Sun and Frisbee; with fan = 3, to Sun, Frisbee, and Coffee Coaster. Each memory has the slot shape: circle]
The fan effect
● The very first prediction made by an ACT-R model
○ And an honest prediction as well
● Counterintuitive: Retrieval takes more time as you learn more facts related to the same subject
○ Decrease in activation as you learn more facts
○ Goes against the principle of frequency
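The prediction can be reproduced with a small sketch that chains two assumed relations: associative strength S = k – ln(fan), and retrieval time RT = Ter + F·exp(–A(m)). All parameter values (`base`, `k`, `t_er`, `f`) are placeholders chosen for illustration, and a single cue is assumed.

```python
import math

def fan_retrieval_time(fan, base=0.0, k=2.0, t_er=0.3, f=1.0):
    """Predicted RT when one cue with the given fan supports the memory."""
    a_m = base + (k - math.log(fan))  # activation drops as the fan grows
    return t_er + f * math.exp(-a_m)  # lower activation means slower retrieval

for fan in (1, 2, 3):
    print(f"fan = {fan}: RT = {fan_retrieval_time(fan):.3f} s")
```

Under these assumptions the retrieval component grows in proportion to the fan, since exp(–(k – ln fan)) = fan · exp(–k).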
The original experiment: Persons and Locations
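Persons: Hippie, Captain, Debutante, Fireman, Giant, Earl, Lawyer
Locations: Park, Church, Bank, Cave, Beach, Castle, Store
[Figure: persons linked to locations, with example pairs labeled Fan = 1, 1 and Fan = 3, 3]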
The fan effect: Experimental results
Anderson, 1974, Cogn. Psych.