
Nataraja, a cluster of a thousand suns.


Clustering is a simple idea that contains notions of coalescence and merging. It is a concept whose significance as a physical process I have come to appreciate more as the years pass. Through experience and through studying the writings of physicists, particularly Satosi Watanabe and David Bohm, I've grown aware of the dynamic movement in physical objects as a kind of essential expression of the physical laws.

I think that Satosi Watanabe's illustrations [1] of clustering leading to the formula for entropy are beautiful in their simplicity. They capture the essence of David Bohm's idea of movement in physical processes, something which I think is deeply fundamental about the universe.

The clustering process can also be seen as a splitting of a whole into parts [2]. This pattern in Nature can be seen in binary tilings.

Piet Mondrian; image:

The same binary symmetry can be seen in the Yin-Yang pattern of King Wen-Wang's 64 hexagrams [3].

64 Hexagrams; image:

A technique in computer programming called hashing is an efficient way of indexing data using a kind of pseudo-random number generating algorithm. The index is computed from the data itself, by an algorithm that tries to spread the indices evenly across the table. With respect to tile coding [4], hashing offers a direct computational way to get from a branch down to a leaf quickly.
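As a minimal sketch of the idea, the following shows an index being computed from the data itself and used to jump straight to one bucket. The function names (bucket_index, build_buckets, contains) are hypothetical, and CRC-32 is only a stand-in assumption; the hash function Prism actually uses is not described here.

```python
import zlib


def bucket_index(key: str, n_buckets: int) -> int:
    """Compute a bucket index from the key's own bytes (CRC-32 is an assumed stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % n_buckets


def build_buckets(keys, n_buckets):
    """Distribute keys into n_buckets lists according to their hash index."""
    buckets = [[] for _ in range(n_buckets)]
    for key in keys:
        buckets[bucket_index(key, n_buckets)].append(key)
    return buckets


def contains(buckets, key):
    """Look up a key by searching only the one bucket its hash points to."""
    return key in buckets[bucket_index(key, len(buckets))]


words = ["cluster", "wave", "neuron", "prism", "entropy", "tile"]
buckets = build_buckets(words, 4)
print(contains(buckets, "wave"))     # True
print(contains(buckets, "lattice"))  # False
```

The lookup never scans the whole table: the hash acts as the pointer from the top of the structure down to a single small bucket.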

A cluster, with respect to neurons and software, is composed of a list of the automaton's attributes in the form of computer data structures. These representations of synthetic neurons can be transformed or rewritten into primitive input sequences of long strings containing words and phrases.

Phi above describes the neural wave in a cluster layer.

Neuronal Cluster

We can now make a conceptual jump to the Prism software by letting the Phi function be the "hashbuckets". The hashbuckets in Prism are used solely to speed up the computation and reduce complexity. They have no analog in any real neural process discovered so far. This scheme makes it possible to use a randomizing algorithm as a directional pointer to a location in a neural circuit.

I now think of the Prism software as a neural network consisting of template-matching lookup tables, aided by fast hashing algorithms that index the Fourier lexicons in these tables. I've evolved the neuron model into one that formally resembles a wave model, because of the way real brain waves propagate through specific neural channels or pathways. At the same time, I want to model a semantic network that can recognize language.

We want to do as much preprocessing or "training" of the input data as possible. In terms of wave analysis, we try to get the best possible estimate or description of the input signal by describing it in terms of a simple function. This simple function is a Gabor wave packet. In the grammatical representation using strings, the pattern cluster becomes the characteristic coefficient. The Prism program uses these "pattern descriptor coefficients" to classify an input signal. A group of these coefficients makes up the pattern descriptor dictionary.
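The wave-analysis side of this can be illustrated with a small sketch. It assumes a real-valued Gabor wave packet (a Gaussian envelope multiplied by a cosine), takes the coefficient to be the inner product of the signal with a unit-norm packet, and classifies by the largest absolute coefficient. All names and parameter values below are hypothetical and chosen for illustration; this is not the Prism program's actual code.

```python
import math


def gabor_packet(n, center, width, freq):
    """Sample a real Gabor wave packet (Gaussian envelope times cosine), scaled to unit norm."""
    packet = [
        math.exp(-((t - center) ** 2) / (2.0 * width ** 2))
        * math.cos(2.0 * math.pi * freq * (t - center))
        for t in range(n)
    ]
    norm = math.sqrt(sum(x * x for x in packet))
    return [x / norm for x in packet]


def coefficient(signal, packet):
    """Inner product of the signal with one packet: an assumed 'descriptor coefficient'."""
    return sum(s * p for s, p in zip(signal, packet))


def classify(signal, dictionary):
    """Return the name of the dictionary packet with the largest absolute coefficient."""
    return max(dictionary, key=lambda name: abs(coefficient(signal, dictionary[name])))


n = 64
dictionary = {
    "low": gabor_packet(n, center=32, width=8.0, freq=0.05),
    "high": gabor_packet(n, center=32, width=8.0, freq=0.20),
}

# A test signal that is a scaled copy of the "high" packet.
signal = [3.0 * x for x in dictionary["high"]]
print(classify(signal, dictionary))  # high
```

Because each packet is normalized, the coefficient measures how much of the signal is described by that one simple function, and a small set of such packets plays the role of a descriptor dictionary.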


1. Satosi Watanabe, Pattern Recognition: Human and Mechanical,
Dynamic Coalescence Model - Clustering As Merger, pp. 160-166.

2. ibid., Clustering as Cleavage, pp. 166-178 (this is a continuation on the theme of entropy minimization)

3. The following is a quote from:
In his article Explication de l'Arithmétique Binaire (1703) Gottfried Leibniz writes that he has found in the hexagrams a base for claiming the universality of the binary numeral system. He takes the layout of the combinatorial exercise found in the hexagrams to represent binary sequences, ...

4. Richard S. Sutton and Andrew G. Barto web page on:
Reinforcement Learning: An Introduction, section 8.3.2
Tile Coding
