A brief history of VSAs/Hyperdimensional Computing

The materials for these brief notes come from works [17] and [18]. First of all, it is worth noting that as such there is no single coherent history of Vector Symbolic Architecrures (VSAs) as different techniques used in VSAs have been developed simultaneously and independently.

VSA is an umbrella term for a family of computational frameworks using high-dimensional distributed representations and relying on the algebraic properties provided by a set of operators (bundling, binding, and permutation) on the high-dimensional representation space. VSAs are heavily insrired by D. Gabor’s holographic associative memory [19] and other models of associative memories inspired by holography [12], [13], [14]. The theory of holography was developed by Gabor in the late 1940s. Psychologists working in the 1960s first suggested holography as a theory of the operation of the brain (see D. Willshaw [20] for a brief review). K. Pribram [21], for instance, suggested that the firing patterns of different neurons may interfere with each other in a manner analogous to the way beams of light interfere, producing hologram-like patterns of interference in the brain. These hologram-like patterns could form the basis of human memory.

Gabor [19] noted that a system that uses the mathematical operations of convolution and cross-correlation can mimic a hologram and could be used as an associative memory. Holography, thus, can be used to define a computational memory system. A. Borsellino and T. Poggio [13] note a formal identity between Gabor’s holographic associative memory [19] and a functional characterization of the neurophysiology of the optomotor response of insects, suggesting that convolution and correlation may indeed be fundamental to neural processes. Cognitive models based on holographic associative memory, such as TODAM by B. Murdock [10] and CHARM by J. Metcalfe-Eich [11], explain and predict a variety of human memory phenomena.

In a way, VSAs are a refinement of Gabor’s holographic associative memory. Research into VSAs has been motivated by limitations in the ability of traditional connectionist models (i.e., nonrecurrent models with one or two layers of connections) to represent knowledge with complicated structure [2]. In traditional connectionist models, an item is represented by a pattern of activation across a group of neurons. Mathematically, the pattern of activation is represented by a vector of numbers that stand for the activations of the neurons. Relationships between pairs of items are defined by the connection weights between groups of neurons. The connection weights between two groups of neurons can be represented as a matrix of numbers.

Relationships between more than two items can be defined using more connections, represented as tensors of numbers (i.e., multidimensional arrays; for psychological theory see M. Humphreys, J. Bain, and R. Pike [22]; for computational theory see P. Smolensky [1]). Smolensky’s tensor product binding [1] provides a powerful approach to representing and manipulating relationships between items that differs substantively from traditional connectionist models. The key points demonstrated were that it is possible to represent complex composite data structures (usually thought of as symbolic data structures) in a vector space, and that it is possible to define transformations on that vector space, which manipulate the represented composite structures without needing to decompose them.

Smolensky's method uses the tensor product as the binding operator, which results in the dimensionality of the resultant vector being the product of the dimensionalities of the operand vectors. In particular, tensor product binding (and, in fact, VSAs) do not need to be trained. However, as the number of items bound together into an association grows, the size of the tensor needed to represent the relationship grows exponentially.

VSAs are intimately related to tensor product binding [1] and to similar works by E. Mizraji [15], [16]. They provide a means by which relationships between items can be represented using vectors of dimensionality equal to that of the vectors that represent the individual items. In other words, VSA binding operations can be interpreted as forming the tensor product of the operands and then projecting that result back into a vector space with the same dimensionality as each of the operand vectors. Consequently, the dimensionality of the representational space remains constant.

VSAs are best understood as a tool for creating and manipulating compact representations of associations that can be situated within a larger framework of structures and processes. As a family, the VSAs have been committed to using a fixed dimensionality vector space for the representation and manipulation of composite data structures by exploiting the algebraic properties of a small number of operators on that vector space. Specific VSAs frameworks were introduced by T. Plate [2], P. Kanerva [3], R. Gayler [4], D. Rachkovskij [5], and others [6], [7], [8] in the time span of the 1990s to 2000s. The interested readers are kindly referred to [9] where eight different VSAs frameworks are compared and taxonomized.

Starting off with a neurophysiological inspiration that cognitive functions in biological brains are achieved through highly parallel activations of a large number of neurons, which through the learning process form statistically persistent patterns, VSAs offer a computational model where everything (items/entities/concepts/symbols/scalars etc.) is represented by high-dimensional vectors, which act as a distributed representation. In other words, VSAs present a plausible means by which the brain could recursively combine patterns of neural activations representing simple concepts to generate new patterns of activation that represent novel, complicated concepts.

VSAs provide an extreme form of distributed representation in that it is holographic and stochastic. The holographic property means that the information content of the representation is spread over all the dimensions of the high-dimensional space. Any subset of the dimensions can be used as a noisy approximation to the full set of dimensions. The representations have to be interpreted stochastically, in that the relative proximities of vectors (as measured by the angles between them) are the important properties and the value of any specific dimension makes only a small contribution to the direction of a vector and is consequently nearly irrelevant (in isolation). This is the major departure from the standpoints of the conventional computing, where each component of a representation has a different specific meaning and the exact value of each component is generally important.

Literature

[1] P. Smolensky, “Tensor Product Variable Binding and the Representation of Symbolic Structures in Connectionist Systems,” Artificial Intelligence, vol. 46, pp. 159–216, 1990.

[2] T. A. Plate, “Holographic Reduced Representations,” IEEE Transactions on Neural Networks, vol. 6, no. 3, pp. 623–641, 1995.

[3] P. Kanerva, “Fully Distributed Representation,” in Real World Computing Symposium (RWC), pp. 358–365, 1997.

[4] R. W. Gayler, “Multiplicative Binding, Representation Operators & Analogy,” in D. Gentner, K. J. Holyoak, B. N. Kokinov (Eds.), Advances in Analogy Research: Integration of Theory and Data from the Cognitive, Computational, and Neural Sciences, pp. 1–4, 1998.

[5] D. A. Rachkovskij, “Representation and Processing of Structures with Binary Sparse Distributed Codes,” IEEE Transactions on Knowledge and Data Engineering, vol. 3, no. 2, pp. 261–276, 2001.

[6] S. I. Gallant and T. W. Okaywe, “Representing Objects, Relations, and Sequences,” Neural Computation, vol. 25, no. 8, pp. 2038–2078, 2013.

[7] D. Aerts, M. Czachor, and B. D. Moor, “Geometric Analogue of Holographic Reduced Representation,” Journal of Mathematical Psychology, vol. 53, pp. 389–398, 2009.

[8] M. Laiho, J. H. Poikonen, P. Kanerva, and E. Lehtonen, “High-Dimensional Computing with Sparse Vectors,” in IEEE Biomedical Circuits and Systems Conference (BioCAS), pp. 1–4, 2015.

[9] K. Schlegel, P. Neubert, and P. Protzel, “A Comparison of Vector Symbolic Architectures,” arXiv:2001.11797, pp. 1–9, 2020.

[10] B. B. Murdock, “A Theory for the Storage and Retrieval of Item and Associative Information,” Psychological Review, vol. 89, no. 6, pp. 609–626, 1982.

[11] J. Metcalfe-Eich, “A Composite Holographic Associative Recall Model,” Psychological Review, vol. 89, no. 6, pp. 627–661, 1982.

[12] D. J. Willshaw, O. P. Buneman, and H. C. Longuet-Higgins, “Non-Holographic Associative Memory,” Nature, vol. 222, no. 5197, pp. 960–962, 1969.

[13] A. Borsellino and T. Poggio, “Convolution and Correlation Algebras,” Kybernetik, vol. 13, no. 2, pp. 113–122, 1973.

[14] P. H. Schönemann, “Some Algebraic Relations between Involutions, Convolutions, and Correlations, with Applications to Holographic Memories,” Biological Cybernetics, vol. 56, no. 5-6, pp. 367–374, 1987.

[15] E. Mizraji, “Context-Dependent Associations in Linear Distributed Memories,” Bulletin of Mathematical Biology, vol. 51, pp. 195-205, 1989.

[16] E. Mizraji, “Vector Logics: The Matrix-Vector Representation of Logical Calculus,” Fuzzy Sets and Systems, vol. 50, no. 2, pp. 179-185, 1992.

[17] M. A. Kelly, D. Blostein, and D. J. K. Mewhort, “Encoding Structure in Holographic Reduced Representations,” Canadian Journal of Experimental Psychology, vol. 67, no. 2, pp. 79-93, 2013.

[18] D. Kleyko, R. W. Gayler, and E. Osipov, “Commentaries on "Learning Sensorimotor Control with Neuromorphic Sensors: Toward Hyperdimensional Active Perception" [Science Robotics Vol. 4 Issue 30 (2019) 1-10],” arXiv:2003.11458, pp. 1-10, 2020.

[19] D. Gabor, “Associative Holographic Memories,” IBM Journal of Research and Development, vol. 13, no. 2, pp. 156–159, 1969.

[20] D. J. Willshaw, “Holography, Associative memory, and Inductive Generalization,” in Parallel Models of Associative Memory, pp.83–104, 1981.

[21] K. H. Pribram, “The Neurophysiology of Remembering,” Scientific American, vol. 220, no. 1, pp. 73–87, 1969.

[22] M. S. Humphreys, J. D. Bain, and R. Pike, “Different Ways to Cue a Coherent Memory System: A theory for Episodic, Semantic, and Procedural Tasks,” Psychological Review, vol. 96, no. 2, pp. 208–233, 1989.