Ooo, very good questions! :) I think there are a few different reasons why.... one small clarification, though: I don't think ACT-R shrank to a small group -- I'd say more that it gradually grew from a small group (starting out of John Anderson's lab at CMU) up to about 100 active researchers around the world, and then sort of stabilized at that level for the last decade or two.
But, as for why it didn't take over everything or at least get more widely known, I'd say one big reason is that the tasks it historically focused on were very specific -- usually things involving looking at letters and numbers on a screen and pressing keys on a keyboard or moving a mouse. So lots of the early examples were that sort of specific experimental psychology task. It's been expanded a lot since then (car driving, for example), but that's where its history was, and so for people who are interested in different sorts of tasks, I can see them maybe initially feeling like it's not relevant. And even now, lots of the tasks in the paper you provided are so far away from even modern ACT-R that I can see people believing that they can just ignore ACT-R and try to develop completely new theories instead.
Another more practical reason, however, is that there's a pretty high barrier to entry for getting into ACT-R, partly due to the fact that the reference implementation is in Lisp. Lisp made tons of sense as the language to use when ACT-R was first developed, but it's very hard to find students with experience in Lisp now. There's been a big movement in the last decade to make alternate implementations of ACT-R (Python ACT-R, ACT-Up, jACT-R), and the latest version of ACT-R has interfaces to other languages, which I think will help to make it more accessible. But, even with a more common programming language, there's still a lot of teaching/training/learning required to get new people used to the core ideas, and even to get people used to the idea of sticking with the constraints of ACT-R. For example, I can remember a student building a model that needed to do mental arithmetic, and it took a while to explain why they couldn't just say "c = a + b" and have the computer do the math (after all, you're implementing these models on a computer, and computers are good at math, so why not just do the math that way?). Forcing yourself to break that addition down into steps (e.g. trying to recall the result from memory, or recalling some similar addition fact and then counting to adjust to this particular question, or just counting right from the beginning, or doing the manual column-wise addition method in your head) gets pretty complicated, and it can be hard to adjust to that sort of mind-set.
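To give a flavour of what I mean, here's a tiny plain-Python sketch (my own illustration, not actual ACT-R syntax) of breaking "c = a + b" into a retrieve-else-count strategy, where the model first tries to retrieve the fact from declarative memory and only falls back on counting one step at a time:

```python
# Hypothetical illustration of decomposing "c = a + b" into cognitive steps,
# in the spirit of an ACT-R model. Not actual ACT-R syntax; the facts in
# declarative memory here are just made-up examples.

# Declarative memory: addition facts the model happens to have learned.
KNOWN_FACTS = {(2, 2): 4, (3, 4): 7, (5, 5): 10}

def try_recall(a, b):
    """Attempt to retrieve the answer from declarative memory."""
    return KNOWN_FACTS.get((a, b))

def count_up(a, b):
    """Fallback strategy: start at a and count up b times, one increment
    per cognitive step (each step would take time in a real model)."""
    result = a
    for _ in range(b):
        result += 1
    return result

def add(a, b):
    recalled = try_recall(a, b)
    if recalled is not None:
        return recalled       # fast path: memory retrieval
    return count_up(a, b)     # slow path: counting

print(add(3, 4))  # 7, retrieved from memory
print(add(6, 3))  # 9, computed by counting
```

In a real ACT-R model, each of those steps (the retrieval attempt, each counting increment) is a separate production firing with its own timing, which is exactly what lets the model predict reaction times rather than just answers.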
I will note that this high-barrier-to-entry problem is probably true of all other cognitive architectures (e.g. Sigma, Clarion, Soar, Dynamic Field Theory, Semantic Pointer Architecture, etc: https://en.wikipedia.org/wiki/Comparison_of_cognitive_architectures ). But one thing ACT-R did really well is to address this by regularly running a 2-week summer-school (since 1994 http://act-r.psy.cmu.edu/workshops/ ). That seems to me to be a big reason why ACT-R got much more widely used (and thus much more widely tested and evaluated) than the other architectures that are out there. There was an active effort to teach the system and to spread it into new domains, and to combat the common approach in computational cognitive modelling of people sticking with the one model that they (or their supervisor) invented. It's much more fun to build my own model from scratch and to evaluate it on the one particular task that I had in mind when I was inventing the model. But that just leads to a giant proliferation of under-tested models. :( To really test these theories, we need a community, and ACT-R is the biggest and most stable cognitive architecture community so far. It'd be great to have more such communities, but they're hard to grow.
That's a very good point, CounterBlunder, and I should have highlighted that as well. It is definitely fairly common for cognitive science researchers to never work with or make use of ACT-R. It's a sub-community within the cognitive science community. The research program has continued past the 90's, and there's probably around 100 or so researchers actively using it on a regular basis, but the cognitive science community is much larger than that, so your experience is pretty common.
As for whether ACT-R is "actually amazing and people have been silly to drop it", well, I definitely don't think that everyone should be making use of it, but I do think more people should be aware of its advantages, and the biggest reason for me is exactly what you point out about "fitting human reaction times" not being impressive. You're completely right that that's a basic feature of many, many models. But the key difference here is that ACT-R uses the same components and same math to fit human reaction times (and error patterns) across many different tasks. That is, instead of making a new model for a new task, ACT-R tries to use the same components, with the same parameter settings, but with perhaps a different set of background knowledge. The big advantage here is that it starts getting away from the over-fitting problem: when dealing with comparisons to human data, we normally have relatively few data points to compare to. And a cognitive model is going to, almost by definition, be fairly complex. So if we only fit to the data available for one task, the worry is that we're going to have so many free parameters in our model that we can fit anything we like. And there's also a worry that if I'm developing a cognitive model for one particular task, I might invent some custom component as part of my model that's really highly specialized and would only ever get used in that one task, which is a bit worrying if I'm aiming for a general cognitive theory. One way around these problems is to find components and parameter settings that work across many different tasks. And right now, the ACT-R community is the biggest cognitive modelling community where there are many different researchers using the same components to do many different tasks.
(Note: by "same components in different tasks" I'm meaning something a lot more specific than something like "use a neural network". In neural network terms, I'm more meaning something like "train up a neural network on this particular data X and then use that same trained neural network as a component in many different tasks". After all, people very quickly change tasks and can re-purpose their existing neural networks to do new tasks extremely quickly. This hasn't been common in neural networks until the recent advent of things like GPT-3. And, personally, I think GPT-3 would make an excellent module to be added to ACT-R, but that's a whole other discussion.)
As for the paper you linked to, I really like that paper (and I'm even cited in it -- yay!), but I don't think it gives an overview of overarching theories of human cognition. Instead, I think it gives a wonderful list of tasks and situations where we're going to need some pretty complicated components to perform these different tasks, and it gives a great set of suggestions as to what some of those components might be. But there's no overarching theory of how we might combine those components together and make them work together and flexibly use them for doing different tasks. And that, to me, is what ACT-R provides an example of. I definitely don't think ACT-R is the perfect, final solution, but it at least shows an example of what it would be like to coordinate components like that, and applies that to a wider variety of tasks than any particular system discussed in that paper. But lots of the tasks in that paper are also things that are incredibly far away from anything that ACT-R has been applied to, so I'm quite sure that ACT-R will need to change a lot to be expanded to include these sorts of new components needed for these new tasks. Still, it makes a good baseline for what it would take to have a flexible system that can be applied to different tasks, rather than building a new model for each task.
Hi Vanessa, hmm, very good question. One possibility is to point you at the ACT-R reference manual http://act-r.psy.cmu.edu/actr7/reference-manual.pdf but that's a ginormous document that also spends lots of time just talking about implementation details, because the reference ACT-R implementation is in Lisp (yes, ACT-R has been around that long!)
So, another option would be this older paper of mine, where I attempted to rewrite ACT-R in Python, and so the paper goes through the math that had to be reimplemented. http://act-r.psy.cmu.edu/wordpress/wp-content/uploads/2012/12/641stewartPaper.pdf
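As a taste of the sort of math that paper goes through, the core equation for ACT-R's declarative memory is the base-level learning equation, B = ln(Σ t_j^-d), where each t_j is the time since a past presentation of the item and d is the decay rate (conventionally 0.5). A minimal Python sketch:

```python
import math

def base_level_activation(presentation_times, now, d=0.5):
    """ACT-R base-level learning: B = ln(sum of (now - t)^-d over all past
    presentations at times t). Higher activation means faster, more
    reliable retrieval; d is the decay rate (conventionally 0.5)."""
    return math.log(sum((now - t) ** (-d) for t in presentation_times))

# An item seen at t=0, 10, and 50 seconds, evaluated at t=100:
print(base_level_activation([0, 10, 50], 100))

# Recency matters: one presentation 10s ago beats one 100s ago.
print(base_level_activation([90], 100) > base_level_activation([0], 100))
```

Activation then feeds into further equations for retrieval probability and retrieval latency, which is how the same memory component ends up predicting both error rates and reaction times.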
Yes, that Tenison paper is a great example of arithmetic modelling in ACT-R, and especially connecting it to the modern fMRI approach for validation! For an example of the other sorts of math modelling that's more psychology-experiment-based, this paper gives some of the low-level detail about how such a model would work, and maps it onto human errors:
- "Toward a Dynamic Model of Early Algebra Acquisition" https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.53.5754&rep=rep1&type=pdf
(that work was expanded on a few times, and led to things like "Instructional experiments with ACT-R “SimStudents”" http://act-r.psy.cmu.edu/?post_type=publications&p=13890 where they made a bunch of simulated students and ran them through different teaching regimes)
As for other cool tasks, the stuff about playing some simple video games is pretty compelling to me, especially in as much as it talks about what sort of learning is necessary for the precise timing that develops. http://act-r.psy.cmu.edu/wordpress/wp-content/uploads/2019/03/paper46a.pdf Of course, this is not as good in terms of getting a high score as modern deep learning game-playing approaches, but it is very good in terms of matching human performance and learning trajectories. Another model I find rather cool is a model of driving a car, which then got combined with a model of sleep deprivation to generate a model of sleep-deprived driving: http://act-r.psy.cmu.edu/wordpress/wp-content/uploads/2012/12/9822011-gunzelmann_moore_salvucci_gluck.pdf
One other very cool application, I think, is the "SlimStampen" flashcard learning tool developed out of Hedderik van Rijn's lab at the University of Groningen, in the Netherlands: http://rugsofteng.github.io/Team-5/ The basic idea is to optimize learning facts from flashcards by presenting a flashcard fact just before the ACT-R declarative memory model predicts that the person is going to forget it. This seems to improve learning considerably http://act-r.psy.cmu.edu/wordpress/wp-content/uploads/2012/12/867paper200.pdf and seems to be pretty reliable https://onlinelibrary.wiley.com/doi/epdf/10.1111/tops.12183
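The core scheduling idea can be sketched in a few lines of Python -- this is my own loose illustration of the approach, not SlimStampen's actual code, and the parameter values are made up. It uses the standard ACT-R base-level activation equation and always presents the card closest to being forgotten:

```python
import math

# Loose sketch of SlimStampen-style scheduling: present the flashcard whose
# predicted ACT-R activation is lowest (i.e. closest to being forgotten).
# Parameter values here are illustrative, not the tool's actual settings.

DECAY = 0.5  # standard ACT-R decay rate

def activation(presentations, now, d=DECAY):
    """Base-level activation B = ln(sum of (now - t)^-d over past
    presentation times t)."""
    return math.log(sum((now - t) ** (-d) for t in presentations))

def next_card(cards, now):
    """Pick the card with the lowest predicted activation; cards maps a
    card name to its list of past presentation times (in seconds)."""
    return min(cards, key=lambda name: activation(cards[name], now))

cards = {
    "chat=cat": [0, 20],
    "chien=dog": [5],
    "oiseau=bird": [0, 10, 30],
}
print(next_card(cards, 60))  # "chien=dog": seen once, longest ago
```

A real system would also update the presentation history (and adjust per-item decay) after each response, but even this bare version captures the key move: using the memory model's forgetting prediction to decide what to show next.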
I think that sort of task might be modellable with ACT-R -- the hardest part might be getting or gathering the animal data to compare to! Most of the time ACT-R models are validated by comparing to human data gathered by taking a room full of undergraduates and making them do some task 100 times each. It's a bit trickier to do that with animals. But that does seem like something that would be interesting research for someone to do!
That sounds right to me. It gives what types of information are processed in each area, and it gives a very explicit statement about exactly what processing each module performs.
So I look at ACT-R as sort of a minimal set of modules, where if I could figure out how to get neurons to implement the calculations ACT-R specifies in those modules (or something close to them), then I'd have a neural system that could do a very wide variety of psychology-experiment-type-tasks. As far as current progress goes, I'd say we have a pretty decent way to get neurons to implement the core Production system, and the Buffers surrounding it, but much less of a clear story for the other modules.
No particularly strong reason -- the main thing is that, when building these models, you also have to build a model of the environment that the system is interacting with. And the codebase for helping people build generic environments is mostly focused on handling key-presses and mouse-movements and visually looking at screens, while there's a separate codebase for handling auditory stimuli and responses, since that's a pretty different sort of behaviour.
As for mapping ACT-R onto OpenWorm, unfortunately ACT-R's at a much much higher level than that. It's really meant for modelling humans -- I seem to remember a few attempts to model tasks being performed by other primates by doing things like not including the Goal Buffer, but I don't think that work went very far, and didn't map well to simpler animals. :(
As someone who can maybe call themselves an ACT-R expert, I think the main thing I'd say about the intentional module being "not identified" is that we don't have any fMRI data showing activity in any particular part of the brain being correlated to the use of the intentional module in various models. For all of the other parts that have brain areas identified, there's pretty decent data showing that correlation with activity in particular brain areas. And also, for each of those other areas there's pretty good arguments that those brain areas have something to do with tasks that involve those modules (brain damage studies, usually).
It's worth noting that there's no particular logical reason why there would have to be a direct correlation between modules in ACT-R and brain areas. ACT-R was developed based on looking at human behaviour and separating things out into behaviourally distinct components. There's no particular reason that separating things out this way must map directly onto physically distinct components. (After all, the web browser and the word processor on a computer are behaviourally distinct, but not physically distinct). But it's been really neat that in the last 20 years a surprising number of these modules that have been around in various forms since the 70's have turned out to map onto physically distinct brain areas.
I agree that there isn't an overarching theory at the level of specificity of ACT-R that covers all the different aspects of the mind that cognitive science researchers wish it would cover. And so yes, I can see cognitive scientists saying that there is no such theory, or (more accurately) saying that even though ACT-R is the best-validated one, it's not validated on the particular types of tasks that they're interested in, so therefore they can ignore it.
However, I do think that there's enough of a consensus about some aspects of ACT-R (and other theories) that there are some broader generalizations that all cognitive scientists should be aware of. That's the point of the two papers listed in the original post on the "Common Model of Cognition". They dig through a whole bunch of different cognitive architectures and ideas over the decades and point out that there are some pretty striking commonalities and similarities across these models. (ACT-R is just one of the theories that they look at, and they point out that there are a set of commonalities across all the theories, and that's what they call the Common Model of Cognition). The Common Model of Cognition is much more loosely specified and is much more about structural organization rather than being about the particular equations used, though, so I'd still say that ACT-R is the best-validated model. But CMC is surprisingly consistent with a lot of models, and that's why the community is getting together to write papers like that. The whole point is to try to show that there are some things that we can say right now about an overarching theory of the mind, even if people don't want to buy into the particular details of ACT-R. And if people are trying to build overarching theories, they should at least be aware of what there is already.
(Full disclosure: I was at the 2017 meeting where this community came together on this topic and started the whole CMC thing. The papers from that meeting are at https://www.aaai.org/Library/Symposia/Fall/fs17-05.php and that's a great collection of short papers of people talking about the various challenges of expanding the CMC. The general consensus from that meeting is that it was useful to at least have an explicit CMC to help frame that conversation, and it's been great to see that conversation grow over the last few years. Note: at the time we were calling it the Standard Model of the Mind, but that got changed to Common Model of Cognition).