Memory models


Andrea Stocco

University of Washington

Seattle, WA

PSYCH448D, Week 5 Models of Memory /2 Multiple Trace Models

Where we left off: Bayesian Considerations

The probability of retrieving memory m in context Q, made of cues q1, q2 … qi … qN, is expressed as:

$$ \frac{p(m \mid Q)}{p(\neg m \mid Q)} = \frac{p(m)}{p(\neg m)} \times \frac{p(Q \mid m)}{p(Q \mid \neg m)} = \frac{p(m)}{p(\neg m)} \times \prod_i \frac{p(q_i \mid m)}{p(q_i)} $$
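As a rough illustration of the odds form of Bayes' rule used here, the sketch below multiplies the prior odds by one ratio p(qi | m) / p(qi) per cue. The function names and all numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def posterior_odds(prior_odds, cue_ratios):
    """Prior odds p(m)/p(not m) times the product of the ratios p(q_i | m)/p(q_i)."""
    return prior_odds * math.prod(cue_ratios)


def odds_to_probability(odds):
    """Convert odds p/(1 - p) back into a probability p."""
    return odds / (1.0 + odds)


# Hypothetical values: p(m) = 0.2 gives prior odds of 0.2 / 0.8 = 0.25.
prior = 0.25
# Hypothetical ratios for two cues; values above 1 mean the cue supports m.
ratios = [2.0, 4.0]

odds = posterior_odds(prior, ratios)
print(odds, odds_to_probability(odds))
```

With these made-up values the two cues raise the odds from 0.25 to 2.0, so the memory goes from unlikely to more likely than not.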

The priors: The history of a memory

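Focus on priors

The probability of retrieving memory m in context Q, made of cues q1, q2 … qi … qN, is expressed as:

$$ \frac{p(m \mid Q)}{p(\neg m \mid Q)} = \frac{p(m)}{p(\neg m)} \times \frac{p(Q \mid m)}{p(Q \mid \neg m)} = \frac{p(m)}{p(\neg m)} \times \prod_i \frac{p(q_i \mid m)}{p(q_i)} $$

The focus here is on the prior odds, p(m) / p(¬m).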

A mechanistic model of memory: ACT-R

● ACT-R = Adaptive Control of Thought - Rational

● John Anderson’s follow-up to rational analysis

● Other models exist:

○ REM model by Shiffrin

○ MINERVA model by Hintzman

○ … They are all very similar

● All of these models are rooted in the Multiple Trace Theory

The Multiple Trace Theory

● Perhaps the dominant framework on the neurobiology of memory

● Every time you encode information, you make a new trace

● New traces can be made for the same memory

The Multiple Trace Theory: Formal definition

● Formally, a memory m is a collection of n traces, encoded at times t1, t2 … tn.

[Figure: timeline of the memory m “In Italian, ‘Fish’ is ‘Pesce’” as three traces ordered in time: t1 (1st time, in classroom), t2 (2nd time, in restaurant), t3 (3rd time, at the market).]

Retention curve for a trace

The odds of retrieving the i-th trace decrease as a power function of time. At time t, they are:

$$ \frac{P(i)}{P(\neg i)} = (t - t_i)^{-d} $$

● ti = time when the i-th trace was created

● (t - ti) = time since creation of the i-th trace

● d = decay rate

Retention curve for a trace (visualized)

[Figure: retention curve P(i) / P(¬i) = (t - ti)^(-d) for a trace created at ti = 0 with decay rate d = 0.5.]
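The power-law retention curve can be tabulated in a few lines. This is a minimal sketch that assumes the slide's example values (ti = 0, d = 0.5); the function name `trace_odds` and the sampled time points are illustrative only.

```python
def trace_odds(t, t_i, d=0.5):
    """Odds P(i)/P(not i) of retrieving a trace created at t_i, evaluated at time t."""
    elapsed = t - t_i
    if elapsed <= 0:
        # The power function is only defined after the trace exists.
        raise ValueError("t must be later than t_i")
    return elapsed ** (-d)


# Sample the curve at a few arbitrary time points after creation.
for t in (1.0, 4.0, 16.0, 100.0):
    print(f"t = {t:>5}  odds = {trace_odds(t, t_i=0.0):.3f}")
```

The odds fall quickly at first and then ever more slowly, which is the characteristic shape of a power function.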

Retention curve for a memory

● The odds of remembering m are the odds of remembering any of its constituent traces

● Thus, they are the sum of the odds of remembering each trace

$$ \frac{P(m)}{P(\neg m)} = \sum_i (t - t_i)^{-d} $$

Memory activation

● The activation A(m) of a memory m is a scalar value that reflects the current availability of a memory.

● Mathematically, it is defined as the log of the odds

$$ A(m) = \log \frac{P(m)}{P(\neg m)} = \log \sum_i (t - t_i)^{-d} $$

● Notice:

○ Activation goes from -Inf to +Inf

○ When P(m) = 0.5, log[P(m) / P(¬m)] = 0: natural forgetting threshold
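Because P(¬m) = 1 - P(m), the definition of activation can be inverted to recover the retrieval probability. The short derivation below uses only the symbols already defined and is offered as a worked step, not as part of the original slides.

```latex
\begin{aligned}
A(m) &= \log \frac{P(m)}{1 - P(m)} \\
e^{A(m)} &= \frac{P(m)}{1 - P(m)} \\
P(m) &= \frac{e^{A(m)}}{1 + e^{A(m)}} = \frac{1}{1 + e^{-A(m)}}
\end{aligned}
```

Setting A(m) = 0 gives P(m) = 0.5, which matches the threshold noted above; positive activations correspond to P(m) > 0.5 and negative ones to P(m) < 0.5.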

Traces, memory, and activation

● Memory m is made of three traces

● Traces created at

○ t1 = 0s

○ t2 = 4s

○ t3 = 12s
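The three-trace example can be turned into numbers with a small sketch. It assumes the decay rate d = 0.5 used in the earlier retention-curve example; the function name `activation` and the sampled times are illustrative choices.

```python
import math


def activation(t, trace_times, d=0.5):
    """Log of the summed odds over all traces created before time t."""
    odds = sum((t - t_i) ** (-d) for t_i in trace_times if t_i < t)
    # With no trace encoded yet, the odds are zero and the log is -infinity.
    return math.log(odds) if odds > 0 else float("-inf")


TRACE_TIMES = [0.0, 4.0, 12.0]  # t1, t2, t3 from the slide, in seconds

# Arbitrary sample times: activation jumps after each new trace, then decays.
for t in (1, 5, 13, 20, 60):
    print(f"t = {t:>2}s  A(m) = {activation(t, TRACE_TIMES):+.3f}")
```

Comparing the values just after 12 s with those at later times shows recency (activation decays as traces age), and comparing the three-trace memory with a single-trace one at the same time shows frequency (more traces, more activation).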

Three Laws of Memory

Every memory model needs to account for three established phenomena:

● Frequency

● Recency

● Spacing

Recency and Frequency

From base activations to retrieval times

● Activation encodes the log odds of retrieval

● But log odds are related to effort, and effort is related to retrieval times (RT)

● The RT for retrieving a memory m is calculated from its activation A(m):

$$ RT = T_{er} + F e^{-A(m)} $$

● Ter = non-retrieval time (e.g., perception and motor times)

● F = scaling parameter
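A minimal sketch of the retrieval-time mapping follows. It assumes the exponent is negative, as in the standard ACT-R latency equation, so that higher activation gives faster retrieval; the default values for Ter and F are placeholders, not estimates from data.

```python
import math


def retrieval_time(activation, t_er=0.3, f=1.0):
    """RT = Ter + F * exp(-A(m)); t_er and f are hypothetical values in seconds."""
    return t_er + f * math.exp(-activation)


# Arbitrary activations from weak to strong memories.
for a in (-1.0, 0.0, 1.0, 2.0):
    print(f"A(m) = {a:+.1f}  RT = {retrieval_time(a):.3f}s")
```

As activation grows, the retrieval component shrinks toward zero and RT approaches the non-retrieval time Ter.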

The context: Environmental cues

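Focus on contextual cues

The probability of retrieving memory m in context Q, made of cues q1, q2 … qi … qN, is expressed as:

$$ \frac{p(m \mid Q)}{p(\neg m \mid Q)} = \frac{p(m)}{p(\neg m)} \times \frac{p(Q \mid m)}{p(Q \mid \neg m)} = \frac{p(m)}{p(\neg m)} \times \prod_i \frac{p(q_i \mid m)}{p(q_i)} $$

The focus here is on the contextual term, the product over cues of p(qi | m) / p(qi).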

… But activation is in logs

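$$ A(m) = \log\left( \frac{p(m)}{p(\neg m)} \times \prod_i \frac{p(q_i \mid m)}{p(q_i)} \right) $$

$$ A(m) = \log\left( \frac{p(m)}{p(\neg m)} \right) + \log\left( \prod_i \frac{p(q_i \mid m)}{p(q_i)} \right) $$

$$ A(m) = \log \sum_i (t - t_i)^{-d} + \sum_i \log\left( \frac{p(q_i \mid m)}{p(q_i)} \right) $$

The first sum runs over the traces of m; the second runs over the cues in Q and is the sum of the support given by each individual cue qi.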
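The additive form can be checked numerically: the log of the product equals the base-level term plus one log term per cue. This is a sketch under assumed values; the function names and the cue ratios are hypothetical.

```python
import math


def base_level(t, trace_times, d=0.5):
    """log sum_i (t - t_i)^(-d); assumes at least one trace was created before t."""
    return math.log(sum((t - t_i) ** (-d) for t_i in trace_times if t_i < t))


def context_support(cue_ratios):
    """Sum over cues of log(p(q_i | m) / p(q_i))."""
    return sum(math.log(r) for r in cue_ratios)


def total_activation(t, trace_times, cue_ratios, d=0.5):
    """Base-level term plus the support contributed by the context."""
    return base_level(t, trace_times, d) + context_support(cue_ratios)


# Hypothetical memory with three traces and two supporting cues.
traces = [0.0, 4.0, 12.0]
ratios = [2.0, 4.0]
print(total_activation(16.0, traces, ratios))
```

A cue with a ratio of exactly 1 contributes log(1) = 0, so an uninformative cue leaves the activation unchanged.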

Context = spreading activation

Old idea in cognitive science:

● Memories form a network

● Activation flows through connected nodes in the network

● Activation flows from cues q1, q2 … to memory m

[Figure: cues q1, q2, and q3 each linked to memory m, contributing to A(m).]

How activation spreads /1

● To understand spreading activation, we need to understand how memories are represented in ACT-R

Memory representation in ACT-R

● Filing cabinet metaphor

● Assumes that declarative memory takes the form of records

● A memory is a collection of structured information, with labeled entries

Example of memory representation in ACT-R

Memories:

PSYCH448D
Is a........ undergrad class
Quarter..... Fall 2022
Instructor.. Andrea Stocco

Andrea Stocco
Is a........ Professor
Born in..... Italy
Works at.... UW
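The two records above can be written down as plain dictionaries, one per memory, with slot labels as keys. This is only an illustrative data structure and not ACT-R's own chunk syntax; the slot names and the helper `memories_containing` are hypothetical.

```python
# Each memory is a record: a name plus labeled slots, as in the filing cabinet metaphor.
MEMORIES = {
    "PSYCH448D": {
        "is_a": "undergrad class",
        "quarter": "Fall 2022",
        "instructor": "Andrea Stocco",
    },
    "Andrea Stocco": {
        "is_a": "Professor",
        "born_in": "Italy",
        "works_at": "UW",
    },
}


def memories_containing(value, memories):
    """Names of all records that hold `value` in one of their slots."""
    return [name for name, slots in memories.items() if value in slots.values()]


# The instructor slot of one record holds the name of another record.
print(memories_containing("Andrea Stocco", MEMORIES))
```

Because a slot value can itself be the name of another record, the records already imply links between memories.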

How activation spreads /2

● To understand spreading activation, we need to understand how memories are represented in ACT-R

● Memories are records of structured information

● They form networks of shared contents

From records to semantic network

[Figure: the records Canary-sings (isa: Property; object: Canary; attribute: Sings) and Canary-is-yellow (isa: Property; object: Canary; attribute: Yellow) redrawn as a semantic network in which shared slot values such as Property and Canary become shared nodes. The network also includes the nodes Adele-Sings, Sun-is-yellow, and Star, connected through attribute and isa links.]

Links have associative strengths

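[Figure: network in which the cues Property, Canary, Sings, and Yellow are linked to the memories Canary-sings and Canary-yellow. Each link is labeled with its associative strength Sq,m, for example Scanary,canary-sings for the link between Canary and Canary-sings.]

● Sq,m = strength of association between cue q and memory m

● Reflects the odds of co-occurrence of q and m

● Each strength corresponds to one term of the context sum:

$$ \sum_i \log\left( \frac{p(q_i \mid m)}{p(q_i)} \right) $$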

Estimating associative strengths

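[Figure: the network of cues (Property, Canary, Sings, Yellow) and memories (Canary-sings, Canary-yellow), highlighting the links Scanary,canary-yellow and Scanary,canary-sings.]

● Sq,m = strength of association between chunks q and m

● Reflects the odds of co-occurrence of q and m

For the cue “canary” and the memory “canary-yellow”:

$$ S_{\text{canary},\,\text{canary-yellow}} = \log\left( \frac{p(\text{canary} \mid \text{canary-yellow})}{p(\text{canary})} \right) = \log p(\text{canary} \mid \text{canary-yellow}) - \log p(\text{canary}) $$

● log p(canary | canary-yellow): unknown constant

● log p(canary): depends on the number of memories about “canary”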

Estimating associative strengths /2

● Associative strengths can be estimated based on the occurrences of cue q across memories

● Specifically, as Sq,m = k - log(Nq)

● Nq is the number of memories that contain q as a slot value

○ i.e., memories that q links to

● Bayesian idea: the more memories contain q, the less predictive q is of m!
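The estimate can be sketched by counting, for a given cue, how many memories contain it and subtracting the log of that count from k. The records, the value k = 2.0, and the choice to return 0 for unlinked pairs are assumptions made for illustration; the natural log is used for log.

```python
import math

# Hypothetical records modeled on the canary example.
MEMORIES = {
    "canary-sings": {"isa": "property", "object": "canary", "attribute": "sings"},
    "canary-yellow": {"isa": "property", "object": "canary", "attribute": "yellow"},
}


def fan(cue, memories):
    """Number of memories that contain `cue` as a slot value."""
    return sum(1 for slots in memories.values() if cue in slots.values())


def associative_strength(cue, memory_name, memories, k=2.0):
    """S = k - ln(N) for linked pairs; 0.0 for unlinked pairs (an assumption)."""
    if cue not in memories[memory_name].values():
        return 0.0
    return k - math.log(fan(cue, memories))


for cue in ("canary", "sings", "yellow"):
    print(cue, fan(cue, MEMORIES), associative_strength(cue, "canary-sings", MEMORIES))
```

Here "canary" appears in both records and therefore supports each of them less than "sings", which appears in only one.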

Important consequence: The “fan”

Fan = 1: Sq,m = k - ln(1) = k

Fan = 2: Sq,m = k - ln(2) ≈ k - 0.69

Fan = 3: Sq,m = k - ln(3) ≈ k - 1.10

[Figure: cue q (shape: circle) linked to one memory (Sun) for fan = 1, to two memories (Sun, Frisbee) for fan = 2, and to three memories (Sun, Frisbee, Coffee Coaster) for fan = 3.]

The fan effect

● The very first prediction made by an ACT-R model

○ And an honest prediction as well

● Counterintuitive: Retrieval takes more time as you learn more facts related to the same subject

○ Decrease in activation as you learn more facts

○ Goes against the principle of frequency
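To see why a larger fan slows retrieval, the sketch below chains the two relations from earlier slides: the associative strength k - ln(fan) feeds the activation, and the activation feeds the retrieval time. It assumes a single cue, a negative exponent in the RT equation, and placeholder values for every parameter, so the numbers are not fitted to any data.

```python
import math


def predicted_rt(fan, base_activation=0.0, k=1.0, t_er=0.3, f=1.0):
    """RT for a memory supported by one cue whose fan is `fan` (must be >= 1)."""
    strength = k - math.log(fan)             # S = k - ln(fan)
    activation = base_activation + strength  # base level plus spreading activation
    return t_er + f * math.exp(-activation)  # RT = Ter + F * exp(-A)


for n in (1, 2, 3):
    print(f"fan = {n}  predicted RT = {predicted_rt(n):.3f}s")
```

Under these assumptions exp(-(k - ln(fan))) equals fan * exp(-k), so the predicted retrieval time grows linearly with the fan.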

The original experiment: Persons and Locations

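Persons: Hippie, Captain, Debutante, Fireman, Giant, Earl, Lawyer

Locations: Park, Church, Bank, Cave, Beach, Castle, Store

[Figure: persons linked to locations, with example pairs labeled “Fan = 1, 1” and “Fan = 3, 3”.]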

The fan effect: Experimental results

Anderson, 1974, Cogn. Psych.
