Hyperdimensional Computing aka Vector Symbolic Architectures

This website is devoted to a computing framework known as Hyperdimensional Computing, also known as Vector Symbolic Architectures.

Mission of the website

Here we aim to collect all possible information about Hyperdimensional Computing/Vector Symbolic Architectures (HD/VSA for short), such as publications, video recordings, webinars, software, and information about the community developing HD/VSA.

Hyperdimensional Computing/Vector Symbolic Architectures

The original version of the text below is courtesy of Prof. Simon D. Levy.

Motivation

Vector Symbolic Architecture(s) (VSA) is a term coined by psychologist R. W. Gayler [1] to refer to a family of connectionist network models developed since the late 1980s. An alternative term, Hyperdimensional Computing (HD), was proposed by neuroscientist P. Kanerva [7]. Nowadays, it is common to refer to the family as HD/VSA. HD/VSA models were developed in an attempt to address the challenges to connectionism posed by J. Fodor, Z. Pylyshyn, R. Jackendoff, and other researchers. These researchers have argued that connectionist networks are – either in principle or in practice – incapable of modeling crucial features of cognition (language and thought), such as compositionality (the ability to put together complex thoughts or sentences from simpler thoughts or words) and systematicity (the fact that someone who understands the sentence John kissed Mary can also understand the sentence Mary kissed John) [2]. HD/VSA addresses these challenges by providing a binding operator that associates individuals (John, Mary) with roles (AGENT, PATIENT) and a superposition operator that allows multiple associations to be composed into a coherent whole. The name HD/VSA reflects the fact that the vectors are high-dimensional and are the sole means of representing all entities (roles, fillers, compositions).

Origins

The History page attempts to provide a detailed account of the early ideas that led to HD/VSA. Perhaps the earliest crystallization of these ideas into concrete HD/VSA models is P. Smolensky's Tensor Product Representations [3], along with the less-known related works by E. Mizraji [16], [17]. In Tensor Product Representations, binding was implemented via the outer product. Although this approach worked in principle, it was subject to a combinatorial explosion problem: binding two vectors of N elements resulted in a square matrix of N^2 elements; binding that matrix with another vector of N elements resulted in a cube of N^3 elements, and so on.
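
A toy NumPy illustration of this blow-up (ours, not Smolensky's original formulation): every additional outer-product binding multiplies the number of stored elements by N.

import numpy as np

N = 100
rng = np.random.default_rng(0)
role, filler, other = (rng.standard_normal(N) for _ in range(3))

pair = np.outer(role, filler)               # N x N matrix: N^2 = 10,000 elements
triple = np.tensordot(pair, other, axes=0)  # N x N x N tensor: N^3 = 1,000,000 elements
print(pair.size, triple.size)               # 10000 1000000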

Models

All HD/VSA models implement composition as vector addition and solve the combinatorial explosion problem by reducing the outer product back to a single vector of N dimensions. Individual models differ in the kind of reduction they perform and in the contents of the vectors they use. T. Plate's Holographic Reduced Representations (HRR) [4] use real-valued vectors and reduce the matrix through circular convolution. P. Kanerva's Binary Spatter Codes (BSC) [5] use binary (0/1) vectors and bind them with the elementwise exclusive-or (XOR). R. Gayler's Multiply/Add/Permute (MAP) model uses +1/-1 values instead of 0/1 and ignores the off-diagonal matrix elements, so that binding becomes the elementwise (Hadamard) product ⊗ and each vector is its own multiplicative inverse. This self-inverse property is convenient for recovering the elements of individual associations but necessitates some additional mechanism to 'protect' or quote associations embedded in others (Bill knows that [John kissed Mary]). Hence, MAP uses a permutation operator for quoting/protecting. Permutation is also useful for encoding order or precedence: rather than simply associating some other item A with an item B, we can represent the fact that A comes before B by binding the vector for A with the permutation of the vector for B, i.e., A⊗P(B).
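
A minimal Python/NumPy sketch of the MAP operations just described. The dimensionality and the helper names (random_vec, bind, bundle, permute, sim) are our own illustrative choices, not part of any particular HD/VSA library.

import numpy as np

dim = 10_000
rng = np.random.default_rng(0)

def random_vec():
    """A random bipolar (+1/-1) hypervector."""
    return rng.choice([-1, 1], size=dim)

def bind(a, b):
    """Binding: elementwise (Hadamard) product; in MAP each vector is its own inverse."""
    return a * b

def bundle(*vectors):
    """Superposition: elementwise addition."""
    return np.sum(vectors, axis=0)

def permute(a, k=1):
    """Permutation (here a cyclic shift), used for quoting/protecting and for order."""
    return np.roll(a, k)

def sim(a, b):
    """Cosine similarity between two hypervectors."""
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

A, B = random_vec(), random_vec()
precedes = bind(A, permute(B))             # "A comes before B": A ⊗ P(B)
print(sim(bind(precedes, A), permute(B)))  # 1.0: unbinding with A recovers P(B) exactly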

Advantages

Having solved the problem of combinatorial explosion, HD/VSA models are free to use vectors of arbitrarily large dimensionality (size); N=10,000 is typical. Such large vectors provide a number of desirable features: for example, the space of such vectors contains on the order of 2^N nearly-orthogonal vectors [6], and each such vector can be degraded by up to 30% and still be closer to its original form than to any of the other vectors in the space [7]. As with ordinary arithmetic, vector multiplication and addition are associative and commutative, and multiplication distributes over addition.
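
A small numerical illustration of these claims (ours, not taken from [6] or [7]): distinct random +1/-1 vectors of dimensionality 10,000 are nearly orthogonal, and a vector with 30% of its elements flipped is still far closer to its original than to any other stored vector.

import numpy as np

dim, n_items = 10_000, 100
rng = np.random.default_rng(1)
items = rng.choice([-1, 1], size=(n_items, dim))

# Cosine similarities between distinct random bipolar vectors cluster tightly around 0.
sims = (items @ items.T) / dim
off_diag = sims[~np.eye(n_items, dtype=bool)]
print("largest |similarity| between distinct vectors:", np.abs(off_diag).max())

# Flip 30% of the elements of one vector; it remains closest to its original by far.
noisy = items[0].copy()
flipped = rng.choice(dim, size=int(0.3 * dim), replace=False)
noisy[flipped] *= -1
print("similarity to the original:", (noisy @ items[0]) / dim)               # 0.4
print("index of the closest stored vector:", int(np.argmax(items @ noisy)))  # 0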

The use of large vectors also makes HD/VSA a variety of distributed representation, in which the identity of a symbol is encoded across all elements of the representation (vector), and each element of the representation takes part in representing the symbol. Chris Eliasmith and his colleagues have argued that, compared to 'localist' representations (one element per symbol), such representations are more capable of scaling up to nontrivial problems [8]. Stewart, Bekolay, and Eliasmith have also provided an implementation of HRR in spiking neurons, embedded in an architecture that uses them to do processing [9], adding to the plausibility of HD/VSA as a model of cognition in humans and other animals.

Unlike many traditional neural networks, HD/VSA models do not rely on backpropagation or other compute-intensive learning algorithms and, as shown in the example below, can often induce a pattern from a single example. Further, elementwise multiplication and addition are embarrassingly parallel operations that can be performed efficiently (in principle, in constant time).

Noise

Because they use finite dimensionality and finite precision, all HD/VSA models can be considered a variety of lossy (noisy) compression. To recover the original vectors in the presence of noise, HD/VSA employs a cleanup memory that stores the vector representations of the items of interest (Mary, AGENT, etc.). Cleanup memory can be implemented as a simple list of items, as a Hopfield network, or in a spiking-neuron model [15].
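
A list-based cleanup memory can be sketched in a few lines of Python/NumPy; the class and method names below are illustrative rather than taken from any existing implementation.

import numpy as np

class CleanupMemory:
    """Stores named item vectors and maps a noisy vector to the most similar stored item."""

    def __init__(self):
        self.names = []
        self.vectors = []

    def add(self, name, vector):
        self.names.append(name)
        self.vectors.append(np.asarray(vector, dtype=float))

    def cleanup(self, noisy):
        """Return the (name, vector) of the stored item closest to the noisy query."""
        stored = np.stack(self.vectors)
        sims = (stored @ noisy) / (np.linalg.norm(stored, axis=1) * np.linalg.norm(noisy))
        best = int(np.argmax(sims))
        return self.names[best], self.vectors[best]

# Usage: store item vectors (Mary, AGENT, ...) and clean up a heavily degraded copy.
dim = 10_000
rng = np.random.default_rng(0)
memory = CleanupMemory()
for name in ["Mary", "John", "AGENT", "PATIENT"]:
    memory.add(name, rng.choice([-1, 1], size=dim))

noisy = memory.vectors[0].copy()
noisy[rng.choice(dim, size=3_000, replace=False)] *= -1   # flip 30% of the elements
print(memory.cleanup(noisy)[0])                           # Mary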

Example

This example, due to Pentti Kanerva [10], shows how the MAP variety of HD/VSA can be used to perform analogical reasoning.

Consider the knowledge we have about a particular country: for example, the name of our country is the United States of America, and our capital is Washington DC; we might say that Washington DC fills the CAPITAL role for the US. Our monetary unit is the dollar. The name of our neighbor to the south is Mexico; its capital is Mexico City, and its currency is the peso. Using the notation <X> to mean the vector representation of X, we can represent this knowledge as follows:

V1 = [<NAME>⊗<USA> + <CAP>⊗<WDC> + <MON>⊗<DOL> ]

V2 = [<NAME>⊗<MEX> + <CAP>⊗<MXC> + <MON>⊗<PES> ]

To compare the United States to Mexico, we can take the product V1⊗V2 of these representations, which is itself just another vector:

[<NAME>⊗<USA> + <CAP>⊗<WDC> + <MON>⊗<DOL> ] ⊗

[<NAME>⊗<MEX> + <CAP>⊗<MXC> + <MON>⊗<PES> ]

Multiplying out, we see that this vector is equal to the vector

<NAME>⊗<USA>⊗<NAME>⊗<MEX> +

<NAME>⊗<USA>⊗<CAP>⊗<MXC> +

<NAME>⊗<USA>⊗<MON>⊗<PES> +

<CAP>⊗<WDC>⊗<NAME>⊗<MEX> +

<CAP>⊗<WDC>⊗<CAP>⊗<MXC> +

<CAP>⊗<WDC>⊗<MON>⊗<PES> +

<MON>⊗<DOL>⊗<NAME>⊗<MEX> +

<MON>⊗<DOL>⊗<CAP>⊗<MXC> +

<MON>⊗<DOL>⊗<MON>⊗<PES>

Thanks to associativity and commutativity, we can rewrite this expression as

<NAME>⊗<NAME>⊗<USA>⊗<MEX> +

<CAP>⊗<CAP>⊗<WDC>⊗<MXC> +

<MON>⊗<MON>⊗<DOL>⊗<PES> +

<NAME>⊗<USA>⊗<CAP>⊗<MXC> +

<NAME>⊗<USA>⊗<MON>⊗<PES> +

<CAP>⊗<WDC>⊗<NAME>⊗<MEX> +

<CAP>⊗<WDC>⊗<MON>⊗<PES> +

<MON>⊗<DOL>⊗<NAME>⊗<MEX> +

<MON>⊗<DOL>⊗<CAP>⊗<MXC>

Because of the self-canceling property of MAP, this expression can be rewritten as

<USA>⊗<MEX> +

<WDC>⊗<MXC> +

<DOL>⊗<PES> +

<NAME>⊗<USA>⊗<CAP>⊗<MXC> +

<NAME>⊗<USA>⊗<MON>⊗<PES> +

<CAP>⊗<WDC>⊗<NAME>⊗<MEX> +

<CAP>⊗<WDC>⊗<MON>⊗<PES> +

<MON>⊗<DOL>⊗<NAME>⊗<MEX> +

<MON>⊗<DOL>⊗<CAP>⊗<MXC>

To ask 'What is the dollar of Mexico?', we multiply this vector by <DOL>, the HD/VSA representation of the dollar:

<DOL> ⊗

[<USA>⊗<MEX> +

<WDC>⊗<MXC> +

<DOL>⊗<PES> +

<NAME>⊗<USA>⊗<CAP>⊗<MXC> +

<NAME>⊗<USA>⊗<MON>⊗<PES> +

<CAP>⊗<WDC>⊗<NAME>⊗<MEX> +

<CAP>⊗<WDC>⊗<MON>⊗<PES> +

<MON>⊗<DOL>⊗<NAME>⊗<MEX> +

<MON>⊗<DOL>⊗<CAP>⊗<MXC>]

=

[<DOL>⊗<USA>⊗<MEX> +

<DOL>⊗<WDC>⊗<MXC> +

<DOL>⊗<DOL>⊗<PES> +

<DOL>⊗<NAME>⊗<USA>⊗<CAP>⊗<MXC> +

<DOL>⊗<NAME>⊗<USA>⊗<MON>⊗<PES> +

<DOL>⊗<CAP>⊗<WDC>⊗<NAME>⊗<MEX> +

<DOL>⊗<CAP>⊗<WDC>⊗<MON>⊗<PES> +

<DOL>⊗<MON>⊗<DOL>⊗<NAME>⊗<MEX> +

<DOL>⊗<MON>⊗<DOL>⊗<CAP>⊗<MXC>]

=

[<PES> +

<DOL>⊗<USA>⊗<MEX> +

<DOL>⊗<WDC>⊗<MXC> +

<DOL>⊗<NAME>⊗<USA>⊗<CAP>⊗<MXC> +

<DOL>⊗<NAME>⊗<USA>⊗<MON>⊗<PES> +

<DOL>⊗<CAP>⊗<WDC>⊗<NAME>⊗<MEX> +

<DOL>⊗<CAP>⊗<WDC>⊗<MON>⊗<PES> +

<DOL>⊗<MON>⊗<DOL>⊗<NAME>⊗<MEX> +

<DOL>⊗<MON>⊗<DOL>⊗<CAP>⊗<MXC>]

This vector will be highly correlated with the vector <PES> representing the peso, and will have a low correlation with any of the other vectors (<DOL>, <USA>, <NAME>, etc.). Hence we can rewrite the expression as

[<PES> + noise]

where noise is a vector uncorrelated with any other vector in our model. If we like, we can get a perfect reproduction of the representation <PES> by passing this noisy vector through a cleanup memory. More important, though, is that we have correctly answered the question 'What is the dollar of Mexico?' with the answer Peso, without the need for decomposing or extracting any of the components of the encoded representation, as we would with a traditional computational model based on data structures and pointers.
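
For readers who prefer code, here is a self-contained Python/NumPy sketch of the whole example, using the MAP operations (elementwise product for ⊗, addition for +). The dimensionality, the variable names, and the use of cosine similarity for the final comparison are our own illustrative choices.

import numpy as np

dim = 10_000
rng = np.random.default_rng(42)

def hv():
    """A fresh random bipolar (+1/-1) hypervector."""
    return rng.choice([-1, 1], size=dim)

sym = {name: hv() for name in
       ["NAME", "CAP", "MON", "USA", "WDC", "DOL", "MEX", "MXC", "PES"]}

# V1 and V2 as defined above: role-filler bindings superposed by addition.
V1 = sym["NAME"] * sym["USA"] + sym["CAP"] * sym["WDC"] + sym["MON"] * sym["DOL"]
V2 = sym["NAME"] * sym["MEX"] + sym["CAP"] * sym["MXC"] + sym["MON"] * sym["PES"]

# "What is the dollar of Mexico?": bind <DOL> with the mapping V1 ⊗ V2, then compare
# the noisy result against every stored vector (a list-based cleanup memory).
answer = sym["DOL"] * V1 * V2

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

for name, vec in sym.items():
    print(f"{name}: {cosine(answer, vec):+.3f}")   # PES stands out (about 0.33); the rest are near 0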

Applications

Because of the generality of the role/filler mechanism, HD/VSA can be applied to a variety of tasks. Recent applications have included solutions to intelligence tests [11] and graph isomorphism [12], as well as behavior-based robotics [13] and the representation of word meaning and order in natural language [14].

Literature

[1] R. W. Gayler, “Vector Symbolic Architectures Answer Jackendoff's Challenges for Cognitive Neuroscience,” in International Conference on Cognitive Science (ICCS/ASCS), pp. 133–138, 2003.

[2] J. A. Fodor and Z. W. Pylyshyn, “Connectionism and Cognitive Architecture: A Critical Analysis,” Cognition, vol. 28, no. 1-2, pp. 3–71, 1988.

[3] P. Smolensky, “Tensor Product Variable Binding and the Representation of Symbolic Structures in Connectionist Systems,” Artificial Intelligence, vol. 46, pp. 159–216, 1990.

[4] T. A. Plate, Holographic Reduced Representations: Distributed Representation for Cognitive Structures. Stanford: Center for the Study of Language and Information (CSLI), 2003.

[5] P. Kanerva, “Fully Distributed Representation,” in Real World Computing Symposium (RWC), pp. 358–365, 1997.

[6] R. Hecht-Nielsen, “Context Vectors: General Purpose Approximate Meaning Representations Self-organized from Raw Data,” in Computational Intelligence: Imitating Life, pp. 43–56, 1994.

[7] P. Kanerva, “Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors,” Cognitive Computation, vol. 1, no. 2, pp. 139-159, 2009.

[8] T. Stewart and C. Eliasmith, “Compositionality and Biologically Plausible Models,” in The Oxford handbook of compositionality, pp. 596–615, 2011.

[9] T. Stewart, T. Bekolay, and C. Eliasmith, “Neural Representations of Compositional Structures: Representing and Manipulating Vector Spaces with Spiking Neurons,” Connection Science, vol. 22, no. 3, pp. 145-153, 2011.

[10] P. Kanerva, “What We Mean When We Say "What's the Dollar of Mexico?": Prototypes and Mapping in Concept Space,” in Quantum Informatics for Cognitive, Social, and Semantic Processes (AAAI), pp. 2–6, 2010.

[11] D. Rasmussen and C. Eliasmith, “A Neural Model of Rule Generation in Inductive Reasoning,” Topics in Cognitive Science, vol. 3, pp. 140–153, 2011.

[12] R. W. Gayler and S. D. Levy, “A Distributed Basis for Analogical Mapping,” in New Frontiers in Analogy Research: International Conference on Analogy (ANALOGY), pp. 165–174, 2009.

[13] S. Levy, S. Bajracharya, and R. W. Gayler, “Learning Behavior Hierarchies via High-Dimensional Sensor Projection,” in The Twenty-Seventh AAAI Conference on Artificial Intelligence (AAAI), pp. 1-4, 2013.

[14] M. N. Jones and D. J. Mewhort, “Representing Word Meaning and Order Information in a Composite Holographic Lexicon,” Psychological Review, vol. 114, no. 1, pp. 1-37, 2007.

[15] T. Stewart, Y. Tang, and C. Eliasmith, “A Biologically Realistic Cleanup Memory: Autoassociation in Spiking Neurons,” Cognitive Systems Research, vol. 12, pp. 84-92, 2010.

[16] E. Mizraji, “Context-Dependent Associations in Linear Distributed Memories,” Bulletin of Mathematical Biology, vol. 51, pp. 195-205, 1989.

[17] E. Mizraji, “Vector Logics: The Matrix-Vector Representation of Logical Calculus,” Fuzzy Sets and Systems, vol. 50, no. 2, pp. 179-185, 1992.

 

Questions?

Contact us at hdvecsym@gmail.com to get more information on HD/VSA.