Prism is a pattern recognition software. It associates character strings
with pattern formations. For example, image processing features extracted
from a photograph are coded with string sequences. Identification
of pictures is done by constructing a pattern descriptor from string
sequences which are unique to each picture.
Once a sequence pattern is extracted it can be further classified
using a logical filter. Unlike the neural network which contain no
explicit logical or Boolean operatives, PRISM classifies sequential
data according to a rule-based profile dictionary. The image processing
software uses generic edge enchancement techniques from syntactic pattern
recognition, and contour detection.
The Prism software uses the idea of grouping or clustering sets of words
to form an abstract description of the data. The cluster of words is
called a pattern descriptor, PD. The PD is a
representation of the characteristic properties of the data. They lay
the foundation or bases upon which the data is structured or classified.
The PDs may also be thought as the components parts of a large set of
characteristic descriptors of the data.
Representing Images as Strings
PRISM can recognize images which have been rotated or scaled in size,
and can recognize different image elements randomly scattered throughout
the picture. This is because each image is an abstract representation of
the 2-dimensional picture in the form of a sequence of characters like
the mapping of genetic code.
For example, multi-dimensional image recognition is accomplished by
converting the graphical image into an abstract representation. This
representation maybe be implemented by transforming the image picture
elements into sequences of character symbols or strings. One of the most
efficient ways to describe a picture is by drawing the contour outline of
the objects in the picture verses describing the filled objects. For human
beings, recognizing contours comes very naturally, however, for machines
which have not evolved the mechanisms to do this, contours are sets of
pixels which are similar in intensity or color. We call the constant
intensity or color contour an isoterm. Pictures may contain large sets of
pixels with similar intensity and colors. They may be hard to to identify
if you had to disinguish many picture elements in the image. On the other
hand, an image with distinct, sharply contrasting intensity or color patterns
would be easier to recognize. The algorithm to trace the contour of constant
intensity is to propagate around the image in a clockwise direction always
keeping the area of constant intensity on the right-hand side. The trace
is a string representing the propagation direction in each cell or grid.
If the center cell is represented by 0, then there are eight neighborhood
cells surrounding cell 0.
Eight Neighborhood Cell Octal Primitive
For example, a simple rectangle is represented by the string, 234345678781.
The starting point of the trace is P0. It is found by a left to right,
and up to down, scan of the image. The contour of an object is traced
by finding the cell on the leftmost side as you trace the contour keeping
the interior of it on the rightmost side.
Image pattern recognition on a series of handwritten characters can
be greatly enhanced using a neural network. Isotermal sequence of
handwritten characters are used to train a simple neural network. For
a small set of characters, the simple neural network performs perfectly.
The pattern compiler, PC, condenses sets of pattern descriptors,
PDs, into associative lists which are used to perform the pattern
recognition. A PD is composed of a logical sequence of character
strings describing the pattern data. These string sets are transformed
into complex representation of patterns with boolean operators. The PC
condenses many pattern descriptors together in an efficient manner.
The attribute filter, AF, uses the dictionary produced by the PC
module to evaluate the text data and assign attribute codes to the data.
Single words or sequences of words are the elements which PRISM uses
to classify data. These language elements in the pattern descriptors are
the nodes in a semantic network.
logic: A + B.
A: item1, item2 item3, item4;
Data descriptors in a semantic network are linked to other elements in
the network thereby forming relationships. These sets or sequences of words
which are used in logic processing are called the operands. The operands
build the logical relationships in the semantic network.
Pattern descriptors contain attributes of the text data you want to match.
A PD's logical disposition is defined in relationship to all the other
PD's which are nodes in the semantic network. The pattern descriptor is
defined by the following items:
(1) Node name which is an element defined
globally across the entire network.
(2) Logic operands which are nodal elememts
defined system wide.
(3) Description of each operand consisting
of keywords and phrases.
(4) Association consists of words describing
the semantic connections.
A PD or nodal element can represent any abstract description of
an object such as an image, idea or concept. The PD is composed by
using the logic operands to create functions or relationships, complex
attribute descriptions, and associative interconnections.
A specific example is of the form:
logic: A but not B.
definition: A: lion, zebra, hyena;
B: panda, kangarro.
There are numerous ways to interweave the PD elements into computer data
structures like binary trees, hash buckets, graphs and so forth.
Binary lists or trees are good for ordering or sorting a set. However, if
the number of items in the set grows large, the search time increases. We
can very effectively decrease the search time if we used many small sets
instead of one large one. This can be implemented by using hash buckets.
The elements used to identify a word in the text is to first use a hash
map, HMAP, and then funnel the words through a hashbucket index to the
binary keyword trees.
| | Hashmap (6 blocks = 24576 bits)
| | Index (10 blocks)
| | Tree (Variable extension. Nodes for binary
| | trees, keywords, strings, node data
| | lists, etc.)
The index is composed of keyword nodes, KNODEs, pointers. Each entry is
4 bytes. A total of 1270 entries occupy the 10 block index. Index entry
1271 is the pointer to data descriptor node or profile, PNODE, root.
The keyword, continuation and PD nodes have the same primary node
|___| c, contents
|___| l, left node link
|___| r, right node link
|___| w, weight or subtree size
|___| d, data pointer
KNODE CLIST (if phase)
___ ____ ____
|___| ---> |____| clist ---> |____| CNODE
Element A in the logic statement above is the logic operand or ONUM.
The ONUMs are assigned in consecutive ascending order during the
PD parsing phase, and is used as bit numbers in a bitmap, OMAP,
which covers all of the logic operands in the resultant compiled
dictionary. An operand list, OLIST, which contains ONUMs is built
for each keyword or phrase. The keyword list, KLIST, or continuation
phrase, CLIST, contains the pointer to the OLIST. Logic processing using
these structures determines how the bit number in the OMAP will be filled.
The OMAP covers all of the operands in the dictionary. The current maximum
number of operands in the PRISM logic statements can total 8192 because
the operand bitmap, OMAP, is 2 (512 bytes) blocks in size.
|____| operand number, ONUM
|____| next OLIST pointer
The logic routine parses the logic statement in the pattern descriptor,
PD. The logic operand, ONUM, is used as bit numbers in the OMAP. The
OLIST entry consists of ONUMs and the zone number. There are five types
of logic operators:
4. and within n words,
5. not within n words.
The statement "A" and within 5 words "B" is parsed into short
form as "A+5B". Any statement which contains the "within" clause is
called a relational statement. The logic control record, LCR, contains
logical relationships between the operands which are marked in the OMAP.
The OMAP can only validate non-relational statements. If the LCR
contains any relational statement logic, then the RNODES which key
off the ONUM must be examined to determine whether the "within" window
|_c_| ---> ONUM
|_d_| ---> RLIST |___| RTWC 1
|___| RTWC 2
The relational statement is processed by reading sequentially
through the RLIST, comparing each RTWCs in the list for ONUM n to
the RTWCs in the list for ONUM n+1.
Relational statements, RS, are the internal LCR representation of one
relational clause of a PD logic statement. There are two types of
relational statements: ANDs and
NOTs. AND Relational Statement This type of RS has the form (...((A + mB)
+ nC)...) where operands are related by the AND WITHIN N WORDS clause.
The proximity of words are indicated by the m, n, etc. NOT Relational
Statement This type of RS has the form (...((A - mB) - nC)...) where
each operand is related by the NOT WITHIN N WORDS clause.
The Logic Control Record
The Logic Control Record, LCR, containing the logic statement with
the ONUM values for the operand. For example, the statement "A and B and
within 5 words C" is shorten by the parser to the form: "A+(B+5C)
with A, B and C" represented by ONUM values. The algorithm to effect the
"within" clause uses information in the Logic Control Record, LCR.
The relative text word count, RTWC, is used to keep track of the keyword
position relative to other words in the text. Evaluation of the LCRs are
separated into processing non-relative, NR, and relative, R, statements.
The R statements contains a windowing factor between operands. The compound
statement "A+(B+5C)" is made up of an NR and R statement. The
expression B+5C is an R statement. R statements should be processed first.
The operand "B" is looked up in the OMAP. If it is not found, then the
statement (B+5C) is void. If it is found, then the operand "C"
is checked to see if it is marked. If it is the RNODEs for both the B and
C operands are construclists. The RTWC in both the RNODEs are compared to
see if they differ by more than 5. If they do, the statement is void,
otherwise it is satisfied.
An extension of this algorithm would be to associate
simple groups of words in a sentence together. Particularly, a very often
seen pattern in sentences is the grouping of words around verbs. For example,
the verb "likes" will immediately trigger a process operator
which will look for a noun preceding the verb and an object following the
verb. Usually, a noun precedes the verb by only a few words. Similarly,
the object of the action verb follows within a couple of words. Consider,
for example, the sentence "Tom likes airplanes." In the pattern descriptor
example of the node Africa above, suppose we made the statement "Zebra
exists in Africa." This statement would be transposed into "exists
(zebra; Africa)" which parses into a syntactic tree. These "deep
mechanisms" can be built into the pattern descriptor. The outcome could be
a fast parser which relates simple patterns for simple finite machines.
The test file is opened. Each word is hashed to see if it might be
contained in one of the 1270 hashbuckets. If the bit corresponding
to the hash value is set in the HMAP, the KTREE is scanned to check
for the keyword. If the word is there, and the zone(s) match, a bit
is marked in the OMAP using the OLIST pointed to in the KLIST.
The OLIST contains both the non-relative and relative operands. Both type
of operands are marked in the OMAP. Nodes for relative operands are
inserted into the KNODE list.
| | Text |___| RNODE
|_____| _|_ list
/ \ |___|
__/_ _\__ _|_
mark | | | | fill in |___|
OMAP |____| |____| RNODE
The ONUM is coded with a N or R ascii character which indicates that
the operand is either non-relative or relative. The following ascii
string is the operands integral values.
ONUM | N/R Ascii digits |
The RLIST has the following structure.
RNODE c -> onum
l left node pointer
r right node pointer
d -> RLIST
RLIST chain structure
unsigned short number (RTWC)
struct chain next
Each LCR is processed in ascending order of the PNUM. The OMAP is
used to evaluate the LCRs. If the operand in the LCR was marked in
the OMAP in STEP1, then check the OLIST to see if the operand is a
non-relative or relative number. If the operand is non-relative, no
further checks are needed. If it is a relative operand scan through
the RLIST to validate the within word factor. If the LCR of the PD
is validated, then the PD bit number is set in the PMAP.
KLIST ---> OLIST
PDATA ---> LCR
During STEP3, the text assignments are obtained by looking at the bits
set in the PMAP.
The attribution codes are extracted from the PDATA of each PD. These
codes are inserted into the text descriptor record.
The logic statement is of the form: "LOGIC: A and B and within 5 words C".
The parser shortens the statement to the form: "A+(B+5C)".
The parser identifies relative statements within parenthesis. The logic
assignment is evaluated using the operand bit map, OMAP. The PD bit
map, PMAP, is produced as a result of the logic evaluation of the OMAP.
The OMAP is obtained by reading the OLIST in each keyword record. The PD
number bit map, PMAP, is obtained by reading the PLIST in each keyword
Evaluation of the Logic Control Record, LCR
(i) Evaluation of the LCR are separated into processing non-relative
statements, and relative statements. The non-relative, NR, statements
do not contain a windowing factor between operands whereas the relative,
R, statements do.
(ii) The compound statement "A+(B+5C)" is made up of an NR and R statement.
The statement B+5C is a R-statement. The relative statement should be
(iii) Operand "B" is looked up in the OMAP. If it is not found, then
the statement (B+5C) is void. If it is found, then the operand "C" is
checked to see if it is marked. If it is the RNODEs for both the B and C
operands are constructed. The RTWCs are inserted in each of the RNODE
lists. The RTWC in both the RNODEs are compared to see if they differ
by more than 5. If they do, the statement is void. If not, the
statement is satisfied. After each LCR is processed the PD bit
number, PNUM, in the PMAP should be either cleared or left on as an