Difference between revisions of "Directory:Jon Awbrey/Essays/Prospects For Inquiry Driven Systems"
Jon Awbrey (talk | contribs) (→References: begin markup) |
Jon Awbrey (talk | contribs) (→A: markup) |
===A===
* Abelson, H., and Sussman, G.J., with Sussman, J., ''Structure and Interpretation of Computer Programs'', Foreword by A.J. Perlis, MIT Press, Cambridge, MA, 1985.
* Aczel, P., ''Non-Well-Founded Sets'', CSLI Lecture Notes 14, Center for the Study of Language and Information, Stanford, CA, 1988.
* Aho, A.V., Hopcroft, J.E., and Ullman, J.D., ''The Design and Analysis of Computer Algorithms'', Addison-Wesley, Reading, MA, 1974.
* Ait-Kaci, H., "Type Subsumption as a Model of Computation", in (Kerschberg, 1986).
* Albus, J.S., ''Brains, Behavior, and Robotics'', BYTE Books, Peterborough, NH, 1981.
* Alkon, D.L., ''Memory Traces in the Brain'', Cambridge University Press, Cambridge, UK, 1987.
* Allen, J., ''Anatomy of LISP'', McGraw-Hill, New York, NY, 1978.
* Amit, D.J., ''Modeling Brain Function : The World of Attractor Neural Networks'', Cambridge University Press, Cambridge, UK, 1989.
* Andersen, H.C., ''The Complete Hans Christian Andersen Fairy Tales'', L. Owens (ed.), H.B. Paull, J. Hersholt, and H.O. Sommer (trans.), originally published 1883, 1885, 1895. Avenel Books, New York, NY, 1981.
* Anderson, D.R., ''Creativity and the Philosophy of C.S. Peirce'', Martinus Nijhoff Publishers, Dordrecht, Netherlands, 1987.
* Anderson, J.R. (ed.), ''Cognitive Skills and Their Acquisition'', Lawrence Erlbaum Associates, Hillsdale, NJ, 1981.
* Anderson, J.R., ''The Architecture of Cognition'', Harvard University Press, Cambridge, MA, 1983.
* Anderson, J.R., ''Cognitive Psychology and Its Implications'', 3rd edition, W.H. Freeman, New York, NY, 1990.
* Anderson, J.R., and Bower, G.H., ''Human Associative Memory'', V.H. Winston and Sons, Washington, D.C., 1973.
* Anosov, D.V., and Arnold, V.I. (eds.), ''Dynamical Systems 1 : Ordinary Differential Equations and Smooth Dynamical Systems'', Springer-Verlag, Berlin, 1988.
* Arbib, M.A., ''The Metaphorical Brain : An Introduction to Cybernetics as Artificial Intelligence and Brain Theory'', John Wiley and Sons, New York, NY, 1972.
* Arbib, M.A., ''Brains, Machines, and Mathematics'', 1st ed. 1964. 2nd ed., Springer-Verlag, New York, NY, 1987.
* Arbib, M.A., ''The Metaphorical Brain 2 : Neural Networks and Beyond'', John Wiley and Sons, New York, NY, 1989.
* Arbib, M.A., and Manes, E.G., ''Arrows, Structures, and Functors : The Categorical Imperative'', Academic Press, New York, NY, 1975.
* Arnold, V.I., ''Ordinary Differential Equations'', R.A. Silverman (ed. and trans.), MIT Press, Cambridge, MA, 1973.
* Arnold, V.I., ''Catastrophe Theory'', 2nd edition, G.S. Wasserman and R.K. Thomas (trans.), Springer-Verlag, Berlin, 1986.
* Arnold, V.I., ''Mathematical Methods of Classical Mechanics'', 2nd edition, K. Vogtmann and A. Weinstein (trans.), Springer-Verlag, New York, NY, 1989.
* Arnold, V.I., ''The Theory of Singularities and Its Applications'', Accademia Nazionale dei Lincei and Scuola Normale Superiore, Pisa, 1991.
* Arrowsmith, D.K., and Place, C.M., ''Ordinary Differential Equations : A Qualitative Approach with Applications'', Chapman and Hall, London, UK, 1982.
* Ascher, M., and Ascher, R., ''Code of the Quipu : A Study in Media, Mathematics, and Culture'', University of Michigan Press, Ann Arbor, MI, 1981.
* Ash, R.B., ''Information Theory'', 1st published, John Wiley and Sons, New York, NY, 1965. Reprinted, Dover Publications, Mineola, NY, 1990.
* Ashby, W.R., ''An Introduction to Cybernetics'', Chapman and Hall, London, UK, 1956. Methuen and Company, London, UK, 1964.
* Awbrey, J., and Awbrey, S., "Exploring Research Data Interactively. Theme One : A Program of Inquiry", pp. 9–15 in ''Proceedings of the Sixth Annual Conference on Applications of Artificial Intelligence and CD-ROM in Education and Training'', Society for Applied Learning Technology, Washington, DC, August 22–24, 1990.
* Awbrey, S., and Awbrey, J., "An Architecture for Inquiry : Building Computer Platforms for Discovery", pp. 874–875 in ''Proceedings of the Eighth International Conference on Technology and Education'', G. McKye and D. Trueman (eds.), Toronto, Ontario, May 8–12, 1991.
* Awbrey, S., and Awbrey, J., "Interpretation as Action : The Risk of Inquiry", presented at ''The Eleventh International Human Science Research Conference'', Oakland University, Rochester, MI, June 9–13, 1992. Abstract in the ''Proceedings'', pp. 58–59.
===B===
Revision as of 16:26, 28 January 2008
Author's Note. The initial portion of this essay is the "Interest Statement" that I submitted as a part of my application to graduate school in the Systems Engineering doctoral program at Oakland University, Rochester, Michigan in September 1992.
Systems Engineering : Interest Statement
Jon Awbrey, September 1, 1992
It seemed useful, as a way of sharpening my focus on goals ahead, to write up an extended statement of current research interests and directions. I realize that many features of this sketch are likely to change as details are clarified and as new experience is gained. As an alternative to the longer essay, an abstract is provided as a minimal statement.
Abstract
In briefest terms my project is to develop mutual applications of systems theory and artificial intelligence to each other. In the current phase of investigation I am taking a standpoint in systems theory, working to extend its concepts and methods to cover increasingly interesting classes of intelligent systems. A natural side-project is directed toward improving the economy of effort by unifying a selection of tools being used in these two fields. My instrumental focus is placed on integrating the methods of differential geometry with the techniques of logic programming. I will attempt to embody this project in the form of computer-implemented connections between geometric dynamics and logical AI, and I will measure its success by the extent and usefulness of this realization.
Description of Current and Proposed Work
I intend to focus primarily on the research area of artificial intelligence. In my work of the past few years I have sought to apply the framework of systems theory to the problems of AI. I believe that viewing intelligent systems as dynamic systems can provide a unifying perspective for the array of problems and methods that currently constitutes the field of AI.
The return benefits to systems theory would be equally valuable, enabling the implementation of more intelligent software for the study of complex systems. The engineering of this software could extend work already begun in simulation modeling (Widman, Loparo, & Nielsen, 1989), (Yip, 1991), nonlinear dynamics and chaos (Rietman, 1989), (Tufillaro, Abbott, & Reilly, 1992), and expert systems (Bratko, Mozetic, & Lavrac, 1989), with increasing capabilities for qualitative inference about complex systems and for intelligent navigation of dynamic manifolds (Weld & de Kleer, 1990).
1. Background
In my aim to connect the enterprises of systems theory and artificial intelligence I recognize the following facts. Although the control systems approach was a prevailing one in the early years of cybernetics and important tributaries of AI have sprung from its sources, e.g. (Ashby, 1956), (Arbib, 1964, '72, '87, '89), (Albus, 1981), the two disciplines have been pursuing their separate evolutions for many years now. The intended scope of AI, overly ambitious or otherwise, forced it to break free of early bonds, shifting for itself beyond the orbit of its initial paradigms and the cases that conditioned its origin.
A sample of materials from transition phases of AI's developmental crises may be found in (Shannon & McCarthy, 1956), (Wiener, 1961, 1964), (Sayre & Crosson, 1963), (Young, 1964, 1978), (McCulloch, 1965), (Cherry, 1966), (MacKay, 1969). Any project to resolder the spun-off domains of AI and systems theory will probably resort to a similar flux. In the course of this investigation it was surprising at first to see these old issues rise again, but the shock has turned to recognition. A motion to reinstate thought with action, to amalgamate intelligence with dynamics in the medium of a computational reconstruction, will naturally revert to the neighborhoods of former insights and ride the transits of formative ideas. It is only to be expected that this essay will run across many of the most intersected rights-of-way, if not traveling down and tripping over the very same ruts, then very likely switching onto any number of parallel tracks.
Informed observers may see good reasons for maintaining the separation of perspectives between AI and systems theory. However, the proposition that these two realms share a common fund of theory and practice, not only historically but one that demands and deserves a future development, is an assertion that motivates my efforts here. Consequently, I thought that a justification of my objectives might be warranted. In light of these facts I have written up this extended rationale and informal review of literature, in hopes of making a plausible case for attempting this work.
Rudiments and Horizons
There are harvests of complexity which sprout from the earliest elements and the simplest levels of the discussion that follows. I will try to clarify a few of these issues in the process of fixing terminology. This may create an impression of making much ado about nothing, but it is a good idea in computational modeling to forge connections between the complex, the subtle, and the simple -- even to the point of forcing things a bit. Further, I will use this space to profile the character and the consistency of the grounds being tended by systems theory and AI. Finally, I will let myself be free to mention features of this work that connect with the broader horizons of human cultivation. Although these concerns are properly outside the range of my next few steps, I believe that it is important to be aware of our bearings: to know what our practice depends upon, to think what our activity impacts upon.
1.1. Topos : Rudiments and Immediate Resources
This inquiry is guided by two questions that express themselves in many different guises. In their most laconic and provocative style, self-referent but not purely so, they typically bring a person to ask:
- Why am I asking this question?
- How will I answer this question?
Cast in with a pool of other questions these two often act as efficient catalysts of the inquiry process, precipitating and organizing what results. Expanded into general terms these queries become tantamount to asking:
- What accumulated funds and immediate series of experiences lead up to the moment of surprise that causes the asking of a question?
- What operational resources and planned sequences of actions lead on to the moment of solution that allows the ending of a problem?
Phrased in systematic terms, they ask yet again:
- What capacity enables a system to exist in states of question?
- What competence enables a system to exit from its problem states?
1.1.1. Systematic Inquiry
In their underlying form and tone these questions sound a familiar tune. Their basic tenor was brought to a pitch of perfection by Immanuel Kant, in a canon of inquiry that exceeds my present range. Luckily, my immediate aim is much more limited and concrete. For the present it is only required to ask: How are systematic inquiry and knowledge possible? That is, how are inquiry and knowledge to be understood and implemented as functions of systems and how ought they be investigated by systems theory? In short: How can systems have knowledge as a goal? This effort is constrained to the subject of systems and the frame of systems theory. It will attempt to give system-theoretic analyses of concepts and capacities that can be recognized as primitive archetypes, at least, of those that AI research pursues with avid interest and aspires one day to more fully capture. By limiting questions about the possibility of inquiry and knowledge to the subject and scope of systems theory there may be reason to hope for a measure of practical success.
Kant's challenge is this: To say precisely how it is possible, in procedural terms, for contingent beings and empirical creatures, physically embodied and even engineered systems, to move toward or synthetically acquire forms of knowledge with an a priori character, that is, declarative statements with a global application to all of the situations that these agents might pass through. It is not feasible within the scope of systems theory and engineered systems to deal with the larger question: Whether these forms of knowledge are somehow necessary laws, applying to all conceivable systems and universes. But it does seem reasonable to ask how a system's trajectory might intersect with states whose associated knowledge components have a wider application to the system's manifold as a whole.
1.1.2. Intelligence, Knowledge, Execution
Intelligence, for my purposes, is characterized as a technical ability of choice in a situation as represented. It is the ability to pick out a line on a map, to find a series of middle terms making connections between represented positions. In the situation that commonly calls it out, intelligence is faced with two representations of position. This pair of pointers to points on a map is typically interpreted as a pair of indices to the current and desired positions. The two images are symbols or analogues of the actual site and the intended goal of a system. They themselves exist in a space that shadows the dynamic reality of the agent involved. But the dynamic reality of the intelligent agent forms a manifold of states that subsists beneath its experience and becomes manifest only gradually and partially in the observations of that agent. It is among the states of this basic manifold that all the real sites and goals of the agent are located.
The concept of intelligence laid out here has been abstracted from two capacities that it both requires and supports: knowledge and execution. Knowledge is a fund of available representations, a glove-box full of maps. Execution is an array of possible actions and the power of performing them, an executive ability that directs motor responses in accord with the line that is picked out on the map. To continue the metaphor, execution is associated with the driving-gloves, which must be sorted out from the jumble of maps and used to get a grip on the mechanisms of performance and control that are capable of serving in order to actualize choices.
1.1.2.1. Vector Field and Control System
Dynamically, as in a control system, intelligence is a decision process that selects an indicator of a tangent vector to follow at a point or a descriptor of a corresponding operator to apply at a point. The pointwise indicators or descriptors can be any relevant signs or symbolic expressions: names, code numbers, address pointers, or quoted phrases. A "vector field" attaches to each point of phase space a single tangent vector or differential operator. The "control system" is viewed as a ready generalization of a vector field, in which whole sets of tangent vectors or differential operators are attached to each point of phase space. The "strategy" or "policy problem" of a controller is to pick out one of these vectors to actualize at each point in accord with reaching a given target or satisfying a given property. An individual control system is specified by information attached to each dynamic point that defines a subset of the tangent space at that point. This pointwise defined subset is called "the indicatrix of permissible velocities" by (Arnold, 1986, chapt. 11).
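The contrast between a vector field and a control system can be made concrete in a few lines. The following is a minimal sketch only, assuming a two-dimensional phase space; the names `vector_field`, `control_system`, `greedy_policy`, and `follow`, the four-element set of permissible velocities, and the greedy selection rule are all illustrative assumptions, not constructions taken from (Arnold, 1986).

```python
import math

def vector_field(point):
    """A vector field: exactly one tangent vector at each point (here, a rotation)."""
    x, y = point
    return (-y, x)

def control_system(point):
    """A control system: a whole set of permissible tangent vectors at each point.
    This hypothetical indicatrix is four unit velocities, the same at every point."""
    return [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]

def greedy_policy(point, target):
    """One possible policy: pick the permissible velocity best aligned with the target."""
    dx, dy = target[0] - point[0], target[1] - point[1]
    return max(control_system(point), key=lambda v: v[0] * dx + v[1] * dy)

def follow(point, target, step=0.5, max_steps=100):
    """Take Euler steps along the velocities chosen by the policy until near the target."""
    path = [point]
    for _ in range(max_steps):
        if math.dist(point, target) < step:
            break
        vx, vy = greedy_policy(point, target)
        point = (point[0] + step * vx, point[1] + step * vy)
        path.append(point)
    return path
```

Calling `follow((0.0, 0.0), (2.0, 1.0))` traces a staircase path toward the target. The greedy rule stands in for the "policy problem" only in the simplest case; a realistic controller would have to weigh constraints and properties that this sketch ignores.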
In the usage needed for combining AI and control systems to obtain autonomous intelligent systems, it is important to recognize that the pointwise indicators and descriptors must eventually have the character of symbolic expressions existing in a language of non-trivial complexity. Relating to this purpose, it does not really matter if their information is viewed as represented in the states of discrete machines or in the states of physical systems to which real and complex valued measurements are attributed. What makes the system of indications and descriptions into a language is that its elements obey specific sets of axioms that come to be recognized as characterizing interesting classes of symbol systems. Later on I will indicate one very broad definition of signs and symbol systems that I favor. I find that this conception of signs and languages equips the discussion of intelligent systems with an indispensable handle on the levels of complexity that arise in their description, analysis, and clarification.
1.1.2.2. Fields of Information and Knowledge
Successive extensions of the vector field concept can be achieved by generalizing the form of pointwise information defined on a phase space. A subset of a tangent space at a point can be viewed as a boolean-valued function there, and as such can be generalized to a probability distribution that is defined on the tangent space at that point. This type of probabilistic vector field or "information field" founds the subject of stochastic differential geometry and its associated dynamic systems. An alternate development in this spirit might embody pointwise information about tangent vectors in the form of linguistic expressions and ultimately in knowledge bases with the character of empirical summaries or logical theories attached to each point of a phase space.
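The step from a subset to a probability distribution can be written out as follows. This is a sketch in standard notation, where the symbols $M$ (the phase space), $T_x M$ (its tangent space at $x$), $S_x$, $\chi_{S_x}$, and $p_x$ are assumed here for illustration and do not appear in the essay.

```latex
% A control system assigns to each point x a subset S_x of the tangent space,
% which is equivalent to a boolean-valued (indicator) function:
\chi_{S_x} : T_x M \to \{0, 1\}, \qquad
\chi_{S_x}(v) = 1 \iff v \in S_x .

% An information field replaces the indicator by a probability density:
p_x : T_x M \to [0, \infty), \qquad
\int_{T_x M} p_x(v)\, dv = 1 .

% When S_x has finite positive volume, the boolean case reappears
% as the uniform density on S_x:
p_x(v) = \frac{\chi_{S_x}(v)}{\operatorname{vol}(S_x)} .
```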
It is convenient to bring together under the heading of a "knowledge field" any form of pointwise information, symbolic or numerical, concrete or theoretical, that constrains the set of pointwise tangent vectors defined on a phase space. In computational settings this information can be procedural and declarative program code augmented by statistical and qualitative data. In computing applications a knowledge field acquires an aptly suggestive visual image: bits and pieces of code and data elements sprinkled on a dynamic surface, like bread crumbs to be followed through a forest. The rewards and dangers of so literally a "distributed" manner of information storage are extremely well-documented (Hansel & Gretel, n.d.), but there are times when it provides the only means available.
1.1.2.3. The Trees, The Forest
A sticking point of the whole discussion has just been reached. In the idyllic setting of a knowledge field the question of systematic inquiry takes on the following form:
- What piece of code should be followed in order to discover that code?
It is a classic catch, whose pattern was traced out long ago in the paradox of Plato's Meno. Discussion of this dialogue and of the task it sets for AI, cognitive science, education, including the design of intelligent tutoring systems, can be found in (H. Gardner, 1985), (Chomsky, 1965, '72, '75, '80, '86), (Fodor, 1975, 1983), (Piattelli-Palmarini, 1980), and in (Collins & Stevens, 1991). Though it appears to mask a legion of diversions, this text will present itself at least twice more in the current engagement, both on the horizon and at the gates of the project to fathom and to build intelligent systems. Therefore, it is worth recalling how this inquiry begins. The interlocutor Meno asks:
Can you tell me, Socrates, whether virtue can be taught, or is acquired by practice, not teaching? Or if neither by practice nor by learning, whether it comes to mankind by nature or in some other way? (Plato, Meno, p. 265).
Whether the word "virtue" (arete) is interpreted to mean virtuosity in some special skill or a more general excellence of conduct, it is evidently easy, in the understandable rush to "knowledge", to forget or to ignore what the primary subject of this dialogue is. Only when the difficulties of the original question, whether virtue is teachable, have been moderated by a tentative analysis does knowledge itself become a topic of the conversation. This hypothetical mediation of the problem takes the following tack: If virtue were a kind of knowledge, and if every kind of knowledge could be taught, would it not follow that virtue could be taught?
For the present purpose, it should be recognized that this "trial factorization" of a problem space or phenomenal field is an important intellectual act in itself, one that deserves attention in the effort to understand the competencies that support intelligent functioning. It is a good question to ask just what sort of reasoning processes might be involved in the ability to find such a middle term, as is served by "knowledge" in the example at hand. Generally speaking, interest will reside in a whole system of middle terms, which might be called a "medium" of the problem domain or the field of phenomena. This usage makes plain the circumstance that the very recognition and expression of a problem or phenomenon is already contingent upon and complicit with a particular set of hypotheses that will inform the direction of its resolution or explanation.
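The hypothetical argument from the Meno can be put schematically to show where the middle term sits. The predicate letters $K$ and $T$ and the constant $v$ are labels introduced here for illustration only.

```latex
% K(x): x is a kind of knowledge    T(x): x can be taught    v: virtue
\forall x \, \bigl( K(x) \to T(x) \bigr), \quad K(v) \;\vdash\; T(v)
% The middle term K occurs in both premises and drops out of the conclusion.
```

Both premises are tentative in the dialogue, so the schema records only the form of the trial factorization and makes no claim about its soundness.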
One of the chief theoretical difficulties that obstructs the unification of logic and dynamics in the study of intelligent systems can be seen in relation to this question of how an intelligent agent might generate tentative but plausible analyses of problems that confront it. As described here, this requires a capacity for identifying middle grounds that ameliorate or mollify a problem. This facile ability does not render any kind of demonstrative argument to be trusted in the end and for all time, but is a temporizing measure, a way of locating test media and of trying cases in the media selected. It is easy to criticize such practices, to say that every argument should be finally cast into a deductively canonized form, harder to figure out how to live in the meantime without using such half-measures of reasoning. There is a line of thinking, extending from this reference point in Plato through a glancing remark by Aristotle to the notice of C.S. Peirce, which holds that the form of reasoning required to accomplish this feat is neither inductive nor deductive and reduces to no combination of the two, but is an independent type.
Aristotle called this form of reasoning "apagogy" (Prior Analytics, 2.25) and it was variously translated throughout the Middle Ages as "reduction" or "abduction". The sense of "reduction" here is just that by which one question or problem is said to reduce to another, as in the AI strategy of goal reduction. Abductive reasoning is also involved in the initial creation or apt generation of hypotheses, as in diagnostic reasoning. Thus, it is natural that abductive reasoning has periodically become a topic of interest in AI and cognitive modeling, especially in the effort to build expert systems that simulate and assist diagnosis, whether in human medicine, auto mechanics, or electronic trouble-shooting. Recent explorations in this vein are exemplified by (Peng & Reggia, 1990) and (O'Rorke, 1990).
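A toy illustration of abductive hypothesis generation in a diagnostic setting is sketched below. The knowledge base `CAUSES` and the function `abduce` are hypothetical, and the covering rule used here (keep every single cause whose known effects include all the observations) is a deliberately simple stand-in that should not be read as the method of (Peng & Reggia, 1990) or (O'Rorke, 1990).

```python
# Hypothetical toy knowledge base: each candidate cause maps to the effects it would explain.
CAUSES = {
    "dead battery": {"engine will not crank", "dim headlights"},
    "empty fuel tank": {"engine cranks but will not start"},
    "faulty starter": {"engine will not crank"},
}

def abduce(observations, causes=CAUSES):
    """Return, in sorted order, the single causes whose effects cover every observation."""
    observations = set(observations)
    return sorted(name for name, effects in causes.items() if observations <= effects)
```

Each returned name is a hypothesis worth testing, not a conclusion: `abduce(["engine will not crank"])` leaves two candidates standing, and only further observation can choose between them.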
But there is another reason why the factorization problem presents an especially acute obstacle to progress in the system-theoretic approach to AI. When the states of a system are viewed as a manifold it is usual to imagine that everything factors nicely into a base manifold and a remainder. Smooth surfaces come to mind, a single clear picture of a system that is immanently good for all time. But this is how an outside observer might see it, not how it appears to the inquiring system that is located in a single point and has to discover, starting from there, the most fitting description of its own space. The proper division of a state vector into basic and derivative factors is itself an item of knowledge to be discovered. It constitutes a piece of interpretive knowledge that has a large part in determining exactly how an agent behaves. The tentative hypotheses that an agent spins out with respect to this issue will themselves need to be accommodated in a component of free space that is well under control. Without a stable theater of action for entertaining hypotheses, an agent finds it difficult to sustain interest in the kinds of speculative bets that are required to fund a complex inquiry.
States of information with respect to the placement of this fret or fulcrum can vary with time. Indeed, it is a goal of the knowledge directed system to leverage this chordal node toward optimal possibilities, and this normally requires a continuing interplay of experimental variations with attunement to the results. Therefore it seems necessary to develop a view of manifolds in which the location or depth of the primary division that is effective in explaining behavior can vary from moment to moment. The total phenomenal state of a system is its most fundamental reality, but the way in which these states are connected to make a space, with information that metes out distances, portrays curvatures, and binds fibers into bundles — all this is an illusion projected onto the mist of individual states from items of code in the knowledge component of the current state.
The mathematical and computational tools needed to implement such a perspective go beyond the understanding of systems and their spaces that I currently have at my command. It is considered bad form for a workman to blame his tools, but in practical terms there continues to be room for better design. The languages and media that are made available do, indeed, make some things easier to see, to say, and to do than others, whether it is English, Pascal (Wirth, 1976), or Hopi (Whorf, 1956) that is being spoken. A persistent attention to this pragmatic factor in epistemology will be necessary to implement the brands of knowledge-directed systems whose intelligence can function in real time. To provide a computational language that can help to clarify these problems is one of the chief theoretical tasks that I see for myself in the work ahead.
A system moving through a knowledge field would ideally be equipped with a strategy for discovering the structure of that field to the greatest extent possible. That ideal strategy is a piece of knowledge, a segment of code existing in the knowledge space of every point that has this option within its potential. Does discovery mark only a different awareness of something that already exists, a changed attitude toward a piece of knowledge already possessed? Or can it be something more substantial? Are genuine invention and proper extensions of the shared code possible? Can intelligent systems acquire pieces of knowledge that are not already in their possession, or in their potential to know?
If a piece of code is near at hand, within a small neighborhood of a system's place in a knowledge field, then it is easy to see a relationship between adherence and discovery. It is possible to picture how crumbs of code could be traced back, accumulated, and gradually reassembled into whole slices of the desired program. But what if the required code is more distant? If a system is observed in fact to drift toward increasing states of knowledge, does its disposition toward knowledge as a goal need to be explained by some inherent attraction of knowledge? Do potential fields and propagating influences have to be imagined in order to explain the apparent action at a distance? Do massive bodies of knowledge then naturally form, and eventually come to dominate whole knowledge fields? Are some bodies of knowledge intrinsically more attractive than others? Can inquiries get so serious that they start to radiate gravity?
Questions like these are only ways of probing the range of possible systems that are implied by the definition of a knowledge field. What abstract possibility best describes a given concrete system is a separate, empirical question. With luck, the human situation will be found among the reasonably learnable universes, but before that hope can be evaluated a lot remains to be discovered about what, in fact, may be learnable and reasonable.
1.1.3. Reality and Representation
A sidelight that arose in the characterization of intelligence is recapitulated here. Beginning with experience described in phenomenal terms, the possibility of objective knowledge appears to depend on a certain factorization or decomposition of the total manifold of experience into a pair of factors: a fundamental, original, objective, or base factor and a representational, derivative, subjective, or free factor. To anticipate language that will be settled on later, the total manifold of phenomenal experience is said to factor into a bundle of fibers. The bundle structure corresponds to the base factor and the fiber structure corresponds to the free factor of the decomposition. Fundamental definitions and theorems with respect to fiber bundles are given in (Auslander & MacKenzie, ch. 9). Discussions of fiber bundles in physical settings are found in (Burke, pp. 84-108) and (Schutz, 1980). Concepts of differential geometry directed toward applications in control engineering are treated in (Doolin & Martin, ch. 8). An ongoing project in AI that uses simple aspects of fiber methods to build cognitive models of physics comprehension is described in (Bundy & Byrd, 1983).
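For orientation, the factorization can be written in the standard notation for fiber bundles. The symbols $E$, $B$, $F$, $\pi$, and $\varphi$ are the conventional textbook ones, assumed here and not drawn from the essay.

```latex
% Total space E (the manifold of phenomenal experience), base space B, typical fiber F.
\pi : E \to B, \qquad E_b = \pi^{-1}(b) \cong F \quad \text{for each } b \in B .

% Local triviality: over a small enough open set U in B, the total space
% looks like a product of the base factor and the free factor.
\varphi : \pi^{-1}(U) \xrightarrow{\ \cong\ } U \times F, \qquad
\mathrm{pr}_1 \circ \varphi = \pi .
```

The product form holds only locally in general, which is one way to read the later caution that the best factorization of the total manifold is not necessarily given in advance.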
An exorbitant number of words has just been wrapped around the seemingly obvious and innocuous distinction between a reality and a representation. Of course, whole books have been written on the subjects of reality and representation, though not necessarily in that order (Putnam, 1988). The topic is especially debated in the philosophy of science, e.g. (Duhem, 1914), (Russell, 1956), (Van Fraassen, 1980), (Hacking, 1983), (Salmon, 1990), and various individual essays in (Quine, 1960, '69, '74, '76, '80, '81). Much of what is said there about the relation of theories to realities has a bearing on the relation of simulation models and AI representations to their underlying realities (Halpern, 1986), (Ginsberg, 1987). A useful historical perspective on the problem of scientific knowledge in relation to the world is supplied by (Losee, 1980). The history of an alternative tradition is treated in (Prasad, 1958).
These questions go back to the beginnings of philosophy. Plato's dialogue The Sophist is one early inquiry that has a special relevance, in its substance and method, for the current context. There is a certain type of recursive and paradigmatic character to the strategy of its analysis. In its quest after the nature of the true philosopher it proceeds in a manner that strikingly foreshadows modern debates about the Turing test. What spirit can winnow the grain from the chaff, what screen can sift the fine from the coarse, what threshold can keep the spirit in the letter? These may indeed have been our kind's earliest decision problems. Modern commentary on this dialogue and the context of its times may be found in (Plato/Benardete, 1986), (Kerferd, 1981), (Rosen, 1983), and (Lanigan, 1986).
There is a reason for the seeming excess of labels and packaging invested around this distinction between reality and representation. The razor that would function as advertised and earn its patent to separate sharp realities from fuzzy impressions is not a toy to be wielded lightly. Until it is certain just where to cut, other means may be required to manage, organize, store, and control the fringes of a systematic imagination. It is my hope to turn this measure of redundancy to an informative purpose later on when the distinction begins to seem both more elusive and more vital. An uncertainty in this dimension can become positively noisy in its interference with the observation and communication of static situations and potentially noxious in its undermining of a system's capacity for dynamic control. The difficulty to be faced is this: There can be genuine questions about what actually forms the best factorization of the total manifold into a base space and a remainder.
The most fitting factorization is not necessarily given in advance, though any number of possibilities may be tried out initially. The most suitable distinction between phenomenal reality and epiphenomenal representation can be a matter determined by empirical or pragmatic factors. Of course, with any empirical investigation there can be logical and mathematical features that place strong constraints on what is conceivably possible, but the risk remains that the proper articulation may have to be discovered through empirical inquiry carried on by a systematic agent delving into its own world of states without absolutely dependable lines as guides. The appropriate factorization, ideally the first item of description, may indeed be the chief thing to find out about a system and the principal thing to know about the total space of phenomena it manifests, and yet persist in being the last fact to be fully settled.
1.1.3.1. Levels of Analysis
The primary factorization is typically only the first in a series of analytic decompositions that are needed to fully describe a complex domain of phenomena. The question about proper factorization that this discussion has been at pains to point out becomes compounded into a question about the reality of all the various distinctions of analytic order. Do the postulated levels really exist in nature, or do they arise only as the artifacts of our attempts to mine the ore of nature? An early appreciation of the hypothetical character of these distinctions and the post hoc manner of their validation is recorded in (Chomsky, 1975, p. 100).
In linguistic theory, we face the problem of constructing this system of levels in an abstract manner, in such a way that a simple grammar will result when this complex of abstract structures is given an interpretation in actual linguistic material.
Since higher levels are not literally constructed out of lower ones, in this view, we are quite free to construct levels of a high degree of interdependence, i.e., with heavy conditions of compatibility between them, without the fear of circularity that has been so widely stressed in recent theoretical work in linguistics.
To summarize the main points: A system of analytic levels is a theoretical unity, to be judged as a whole for the insight it provides into a whole body of empirical data mediately gathered. A level within such a system is really a perspective taken up by the beholder, not a cross-section slicing through the phenomenon itself. Although there remains an ideal of locating natural articulations, the theory is an artificial device in relation to the nature it explains. Facts are made, not born, and already a bit factitious in being grasped as facts.
The language of category theory preserves a certain idiom to express this aspect of facticity in phenomena (MacLane, 1971), which incidentally has impacted the applied world in the notions of a database view (Kerschberg, 1986) and a simulation viewpoint (Widman, Loparo, & Nielsen, 1989). In this usage a level constitutes a functor, that is, a particular way of viewing a whole category of objects under study. For direct applications of category theory to abstract data structures, computable functions, and machine dynamics see (Arbib & Manes, 1975), (Barr & Wells, 1985, 1990), (Ehrig, et al., 1985), (Lambek & Scott, 1986), and (Manes & Arbib, 1986). A proposal to extend the machinery of category theory from functional to relational calculi is developed in (Freyd & Scedrov, 1990).
1.1.3.2. Base Space and Free Space
The base space is intended to capture the fundamental dynamic properties of a system, those aspects to which the other dynamic properties may be related as derivative quantities, free parameters, or secondary perturbances. The remainder consists of tangential features. For simple physical systems this second component contains derivative properties, like velocity and momentum, that are represented as elements of pointwise tangent spaces. In an empirical sense these features do not properly belong to a single point but are attributed to a point on account of measurements made over several points. Of course, from the dual perspective it is momentum that is real and position that is illusion, but that does not affect the point in question, which concerns the uncertainty of their discernment, not the fact of their complementarity.
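The remark that tangential features are attributed to a point on account of measurements made over several points can be illustrated with a minimal sketch. The sample trajectory, the time step, and the function name below are assumptions chosen for illustration and do not come from the essay.

```python
def finite_difference_velocity(positions, dt):
    """Estimate velocity at each interior sample from its two neighbouring positions."""
    return [
        (positions[i + 1] - positions[i - 1]) / (2 * dt)
        for i in range(1, len(positions) - 1)
    ]

# Hypothetical samples of position x(t) = t**2 taken every 0.5 time units.
dt = 0.5
positions = [(k * dt) ** 2 for k in range(5)]  # t = 0.0, 0.5, 1.0, 1.5, 2.0

# Each velocity is assigned to one point but computed from two other points.
velocities = finite_difference_velocity(positions, dt)
```

No velocity can be assigned to the first or last sample by this scheme, which is one concrete sense in which the derivative quantity does not belong to a single point.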
1.1.3.3. Unabridgements
Part of my task in the projected work is to make a bridge, in theory and practice, from simple physical systems to the more complex systems, also physical but in which new orders of features have become salient, that begin to exhibit what is recognized as intelligence. At the moment it seems that a good way to do this is to anchor the knowledge component of intelligent systems in the tangent and co-tangent spaces that are founded on the base space of a dynamic manifold. This means finding a place for knowledge representations in the residual part of the initial factorization. This leads to a consideration of the questions: What makes the difference between these supposedly different factors of the total manifold? What properties mark the distinction as commonly intended?
From a naturalistic perspective everything falls equally under the prospective heading of physis, signifying nothing more than the first inklings of natural process, though not everything is necessarily best explained in detail by those fragments of natural law which are currently known to us. So it falls to any science that pretends to draw a distinction between the more and the less basic physics to describe it within nature and without trying to get around nature. In this context the question may now be rephrased: What natural terms distinguish every system's basic processes from the kinds of coping processes that support and crown the intelligent system's personal copy of the world? What protocols attach to the sorting and binding of these two different books of nature? What colophon can impress the reader with a need to read them? What instinct can motivate a basis for needing to know?
1.1.4. Components of Intelligence
In a complex intelligent system a number of relatively independent modules will emerge as utilities to subserve the purpose of knowledge acquisition. Chief among these are the faculties of memory and imagination, which operate in closely coordinated representation spaces of the manifold, and may be no more than complementary ways of managing the same turf. These capacities amplify the sensitivity and selectivity of intelligence in the system. They support the transcription of momentary experience into records of its passing. Finally, they collate the fragmentary notes and diverse notations of dynamic experience and catalyze their conversion into unified forms and organizations of rational knowledge.
1.1.4.1. Imagination
The intellectual factor or knowledge component of a system is usually expected to have a certain quality of mercy, that is, to involve actions which are Reversible, Assuredly, Immediately, Nearly. Even though every action obeys physical and thermodynamic constraints, processes that suit themselves to being used for knowledge representation must exhibit a certain forgiveness. It must be possible to move pointers around on a map without irretrievably committing forces on the plain of battle. Actions carried out in the image space should not incur too great a pain or price in terms of the time and energy they dissipate. In sum, a virtue of symbolic operations is that they be as nearly and assuredly reversible as possible. This "virtual" construction, as usual, declares a positively oriented proportion: operations are useful as symbolic transformations in proportion to their exact and certain reversibility.
Imagination's development of elaborate and seemingly superabundant resources of imagery is actually governed by strict obedience to the cybernetic law of requisite variety, which determines that only variety in the responses of a regulator can counter the flow of variety from disturbances to essential variables, the qualities the system must act to keep nearly constant in order to survive in its current and preferred form of being (Ashby, ch. 10 & 11). Aristotle, thinking that the human brain was too flimsy and spongy a material to embody the human intellect, thought it might be useful as a kind of radiator to cool the blood. This is actually a pretty good theory, I think, if it is recognized that the specialty of the brain is to regulate essential variables of human existence on a global scale through the discovery of natural laws. To view the brain as a theorem-o-stat is then fairly close to the mark.
1.1.4.2. Remembrance
The purpose of memory, on the other hand, requires states that can be duly constituted in fashions that are diligently preserved by the normal functioning of the system. The expectation must be faithfully met that such states will be maintained until deliberately reformed by due processes. Intelligent systems cannot afford to indiscriminately confound the imperatives to forgive and forget. Reversibility applies to exploratory operations taking place interior to the dynamic image. An irreversible recording of events is generally the best routine strategy to keep in play between outer and inner dynamics. But reversibility and its opposite interact in subtle ways even to maintain the stability of stored representations. After all, when physical records are disturbed by extraneous noise without the mediation of attention's due process, the ideal system would work to immediately reverse these unintentional distortions and ungraceful degradations of its memories. To abide their time, memories should lie in state spaces with stable equilibria, resting at the bottoms of deep enough potential wells to avoid being tossed out by minor quakes.
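The image of memories resting at the bottoms of potential wells can be sketched with a one-dimensional toy system. The double-well potential V(x) = (x^2 - 1)^2, the step size, and the disturbance magnitudes are all assumptions made for illustration; the essay does not specify any particular potential.

```python
def settle(x, steps=2000, rate=0.01):
    """Follow overdamped dynamics dx/dt = -V'(x) for the assumed V(x) = (x**2 - 1)**2."""
    for _ in range(steps):
        x -= rate * 4 * x * (x * x - 1)  # V'(x) = 4x(x^2 - 1)
    return x

stored = 1.0  # a memory resting at the bottom of the right-hand well

# A minor quake is reversed by the dynamics: the state relaxes back toward +1.
after_minor_quake = settle(stored + 0.3)

# A quake large enough to cross the barrier at x = 0 is not reversed:
# the state settles in the other well, near -1.
after_major_quake = settle(stored - 1.3)
```

The depth and width of the well set the size of disturbance that the system reverses without any further intervention.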
A collection of classic and recent papers on the significance of reversibility questions for information acquisition and computational intelligence is gathered together in (Leff & Rex, 1990). The bearing of irreversible processes on the complex dynamics of physical systems is treated in (Prigogine, 1980). Monographs on the physics of time asymmetry and the time direction of information are found in (Davies, 1977) and (Reichenbach, 1991). Relationships between periodicity properties of formal languages and ultimately periodic behavior of finite automata are discussed in (Denning, Dennis, & Qualitz, sec. 6.4) and (Lallement, sec. 7.1). Existential and cultural reflections on the themes of return, repetition, and reconstruction are presented in (Kierkegaard, 1843) and (Eliade, 1954). The topographic, potential-surface, or landscape metaphor for system memories, e.g. as elaborated in the self-organizing "memory surface" model of (de Bono, 1969), was influential in the early history of AI and continues to have periodic reincarnations, e.g. (Amit, sec. 2.3). Distributed models of information storage emphasizing sequential memory and reconstructive retrieval are investigated in (Albus, 1981) and (Kanerva, 1988).
Work in cognitive science and AI, in the history of its ongoing revolutions and partial resolutions, has shown a recurring tendency to lose sight of the breadth and power that originally drew it to examine such faculties as memory and imagination. The fact that this form of forgetfulness happens so often is already an argument that there may be some reason for it, in the sociology and psychology of science if not in the nature of the subject matter. No matter what the cause, the pattern is seen again and again. The spirit of the original quest that imparts a certain verve to the revolutionary stages of a field's development repeatedly devolves into a kind of volleyball game, an exercise engaged in by opposing parties to settle, by rhetorical hook or strategic crook, on which side of a conceptual net the whole globe in contention shall be judged to rest. But most of the purportedly world-shattering distinctions are rendered ineffective by the lack of any operational, much less consensual, definitions. The most heated border disputes arise over concepts for which no clear agreement exists even as to the proper direction of inquiry, whether the form of argument demanded ought to be working from a definition or groping toward a definition of the concept at issue.
It may be inevitable as a natural part of the annealing process of any specialized instrument of science to periodically enter phases of chafing over indeterminate trifles. But it remains a good idea to preserve a few landmarks sighting on the initial aims and the original goals of the inquiry. With respect to imagination, memory, and their interaction within the media of description and expression, a wide field of illumination on the expanses rolling out from under their natural scope is cast by the following sources: (Sartre, 1948), (Yates, 1966), and (Krell, 1990). The critique of pragmatism for "differences that don't make a difference" is legend, e.g. (James, 1907). The form of reasoning that argues toward a definition is bound up with the question of abductive reasoning as described by C.S. Peirce (CE, CP, NE). An interesting, recent discussion of the problem of definition appears in (Eco, 1983).
1.2. Hodos : Methods, Media, and Middle Courses
To every thing there is a season. To every concept there are minimal contexts of sensible application, the most reduced situations in which the concept can be used to make sense. Every concept is an instrument of thought, and like every method has its bounds of useful application. In directions of simplicity, a concept is bounded by the minimum levels of complexity out of which it is, initially, recurrently, transiently, ultimately, able to arise. There is one form of rhetorical question that people often use to address this issue, if somewhat indirectly. It presents itself initially as a genuine question but precipitates the answer in enthymeme, dashing headlong to break off inquiry in the form of a blank. Ostensibly, the space extends the question, but it is only a pretext. The pretense of an open sentence is already filled in by the unexpressed beliefs of the questioner.
"What could be simpler than ___ ?" this question asks, and the automatic completions that different customs have usually borne in mind are these: sets, functions, relations. My present purpose is to address the concept of information, and specifically the kind that results from observation. In answer to the question of its foundation, I have not found that the concept of information can make much sense in anything short of the following framework.
Three-place relations among systems are a minimum requirement. Information is a property of a sign system by virtue of which it can reduce the uncertainty of an interpreting system about the state of an object system. Thus information is a feature that a state in a system has in relation to two other systems. The fundamental reality underlying the concept of information is the persistence of individual systems of relation, each of which exhibits a certain kind of relation among three domains and satisfies a definable set of definitive properties. Each domain in the relation is the state space of an entire system: sign system, interpreting system, object system, respectively. When a set of properties is identified that captures what all such sign systems have in common, a definition of the concept of a sign system will have been discovered. But what form of argument will serve to bring us to a definition, in this case or in its more general setting? Certainly, it cannot be that form of thought, unaided, that requires a definition to start.
More carefully said, information is a property that can be attributed to signs in a system by virtue of their relation to two other systems. This attribution projects a relation among three domains into a simpler order of relation. There are various ways of carrying out this reduction. Not all of them can be expected to preserve the information of the original sign relation. An attribution may create a logical property of elements in the sign domain or it may construct functions from the sign domain to ranges of qualitative meaning or quantitative measure.
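The claim that not every reduction of a three-place relation preserves its information can be checked on small finite examples. The sketch below treats a sign relation as a set of (object, sign, interpreter) triples over a two-element domain. Both example relations and all function names are hypothetical illustrations, not taken from the essay.

```python
from itertools import product

def dyadic_projections(triples):
    """Project a set of (object, sign, interpreter) triples onto its three pairs of places."""
    return (
        {(o, s) for o, s, i in triples},
        {(o, i) for o, s, i in triples},
        {(s, i) for o, s, i in triples},
    )

def reconstruct(projections, domain):
    """Largest triadic relation over `domain` consistent with the three dyadic projections."""
    os_pairs, oi_pairs, si_pairs = projections
    return {
        (o, s, i)
        for o, s, i in product(domain, repeat=3)
        if (o, s) in os_pairs and (o, i) in oi_pairs and (s, i) in si_pairs
    }

def is_reducible(triples, domain):
    """True when the dyadic projections suffice to recover the original relation."""
    return reconstruct(dyadic_projections(triples), domain) == set(triples)

domain = (0, 1)

# Hypothetical relation 1: all three places always agree.
agreement = {(0, 0, 0), (1, 1, 1)}

# Hypothetical relation 2: the three places always have even parity.
parity = {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)}

agreement_survives = is_reducible(agreement, domain)  # projections keep everything
parity_survives = is_reducible(parity, domain)        # projections lose the constraint
```

For the parity relation every dyadic projection contains all four pairs, so the projections taken together are compatible with all eight triples and say nothing about which four actually occur.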
1.2.1. Functions of Observation
An observation preserved in a permanent record marks the transience of a certain compound event, the one that an observational account creates in conjunction with the events leading up to it. If an observation succeeds in making an indelible record of an event, then a certain transient of the total system has been passed. To the extent that the record is a lasting memory there is a property of the system that has become permanent. The system has crossed a line in state space that it will not cross again. The state space becomes strictly divided into regions the system may possibly visit again and regions it never will. Of course, an equivalent type of event may happen again, but it will be indexed with a different count. The same juxtaposition of events in the observed system and accounts by the observing system will never again be repeated, if memory faithfully serves.
But perfect observation and permanent recordings are seldom encountered in real life. Therefore, informational content must be found in the distribution of a system's behavior across the whole state space. A system must be recognized as informed by events whenever this distribution comes to be anything other than uniform, or in relative terms deviating from a known baseline. As to what events caused the information there is no indication yet. That kind of decoding requires another cycle of hypotheses about reliable connections with object systems and experiments that lay odds on the systematic validation of these bets. The impressions that must be worked with have the shape of probability distributions. The impression that an event makes on a system lies in the difference between the total distribution of its behavior and the distribution generated on the hypothesis that the event did not happen.
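One way to put a number on the deviation of a behavior distribution from its baseline is the relative entropy (Kullback-Leibler divergence) measured in bits. The essay does not name a particular measure, so this choice, along with the four-state example distributions, is an assumption made for illustration.

```python
import math

def kl_divergence_bits(observed, baseline):
    """Relative entropy of `observed` with respect to `baseline`, in bits.

    Assumes both are probability distributions over the same states and that
    the baseline is nonzero wherever the observed distribution is nonzero.
    """
    return sum(
        p * math.log2(p / q)
        for p, q in zip(observed, baseline)
        if p > 0
    )

# Hypothetical four-state system with a uniform baseline.
baseline = [0.25, 0.25, 0.25, 0.25]

# An uninformed system behaves like the baseline and registers zero bits.
uninformed = kl_divergence_bits(baseline, baseline)

# A system confined to half its states registers one bit relative to the baseline.
informed = kl_divergence_bits([0.5, 0.5, 0.0, 0.0], baseline)
```

Taking the baseline to be the distribution generated on the hypothesis that an event did not happen gives one reading of the impression that the event makes on the system.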
A system of observation constitutes a projection of the object system's total behavior onto a space of observations, which may be called the space of observing or observant states. The object system's total state space is not necessarily a well-defined entity. It can only be imagined to lie in some unknown extension of the observing space. How much information a system may have is defined only relative to a particular system of observation. It is often convenient to personify all the various specifications of observational systems and spaces under a single name, the observer. Every bit of information that a system maintains with respect to an observer constrains the system's behavior to half the observed state space it would otherwise have. When designing systems it is preferred that this bit of information reside in a well-defined register, a localized component of anatomical structure in a designed-to-be-known decomposition of the intelligible object system.
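The halving rule stated above amounts to a logarithmic count of states, sketched below for a finite observed state space with equally weighted states. The state counts and function names are assumed for illustration.

```python
import math

def bits_of_constraint(total_states, allowed_states):
    """Bits of information implied by confinement to `allowed_states` out of `total_states`."""
    return math.log2(total_states / allowed_states)

def states_remaining(total_states, bits):
    """Number of observed states left open after `bits` successive halvings."""
    return total_states / 2 ** bits

# Hypothetical observer distinguishing 16 states of the system.
one_bit = bits_of_constraint(16, 8)   # confinement to half the space
left_open = states_remaining(16, 3)   # three bits leave 16 / 8 states
```

The count is relative to the observer's 16 distinguishable states; a different system of observation would assign the same system a different figure.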
However, the kind of direct product decomposition that would make this feasible is not always forthcoming. When investigating a system of unknown design, it cannot be certain that all its information is embodied in localized components. It is not even certain that a given observation system is detecting the level, mode, or site in which the majority of its information is stored. Even when it is found that a system occupies a small selection or a narrow distribution of its possible states and increases its level of informedness with time, this may yield a quantitative measure of its determination and progress but it does not offer a motive, neither a reference to the objects nor a sense of the objectives that may be driving the process.
In order to assess the purpose and object of an information process, it is important to examine and distinguish several applications that the common measure of information might receive. A first employment scales the information that an object system possesses by virtue of being in a certain state, as among the possibilities envisioned by an observer. A second grades the information that a state in a sign system provides to reduce the uncertainty of an interpreter about the state of an object system. A third weighs the information that a self-informed intelligent system can exercise with respect to the control of its own state transformations.
These distinctions can be traced back through the ideas of pragmatism to a couple of distinctions made by Aristotle in the first textbook of psychology. In Aristotle's Peri Psyche or On the Soul he discerns in the essential nature of things the factors of form and matter. In regard to animate creatures Aristotle divines that the actuality of their intelligence is found in their form while it is the potentiality of mind that is embodied in matter. The form and actuality of mentality is like the edge of an implement that makes it effective in its intended purpose. The formal aspect is an organic shape impressed upon and infused within the material substrate of life. The matter of the mind merely supplies a medium for the potentiality of mental functioning. Subsequently Aristotle divides the form of actuality into two senses, exemplified in turns by the possession and the exercise of knowledge. Can such distinctions, devices of ancient pedigree on fields of patent verdigris, bring a significant bearing to the conduct of modern inquiries in applied AI? This question is considered among the topics of (Awbrey & Awbrey, 1990).
At this point the notion of observation put forward above would seem identical to the notion of representation that is usual in AI and cognitive science. But mathematicians and physicists reserve the status of representation to maps that are homomorphisms, in which some measure of structure preservation is present. And if these two notions are confounded, what sort of observation would enable the detection of whether maps preserve structure or not? Therefore it seems necessary to preserve a more general notion of observation which permits arbitrary transformations, not just the linear mappings or morphisms that properly constitute representations.
It has been appreciated in mathematics and physics for at least a century that an isomorphism is almost totally useless for the purposes that motivate representation and that a single representation is hardly ever enough. Representations are exactly analogous to coordinate projections or spectral coefficients in a Fourier analysis. It is a necessary part of their function to severely reduce the data, and this engenders the complementary necessity of having a complete set of projections in order to reconstitute the data to the extent possible.
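The analogy with coordinate projections can be made concrete with a minimal sketch. It uses a small orthonormal basis of plus and minus 0.5 patterns (a Walsh-Hadamard-style basis) in place of a Fourier basis, purely to keep the arithmetic exact. The data vector and all names are assumptions for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def coefficients(data, basis):
    """Each coefficient is one severely reduced view of the data."""
    return [dot(data, b) for b in basis]

def reconstitute(coeffs, basis):
    """Rebuild a data vector from whatever coefficients and basis vectors are supplied."""
    n = len(basis[0])
    return [sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(n)]

# Assumed orthonormal basis for four-component data.
basis = [
    (0.5, 0.5, 0.5, 0.5),
    (0.5, 0.5, -0.5, -0.5),
    (0.5, -0.5, -0.5, 0.5),
    (0.5, -0.5, 0.5, -0.5),
]

data = [4.0, 2.0, 0.0, -2.0]       # hypothetical data vector
coeffs = coefficients(data, basis)

# A single projection reduces four numbers to one and cannot restore them.
from_one = reconstitute(coeffs[:1], basis[:1])

# The complete set of projections reconstitutes the data.
from_all = reconstitute(coeffs, basis)
```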
The extent to which a representation found embodied in a system is an isomorphic representation of its object system is the extent to which that information has not really been processed yet. Only a piecemeal reductive, jointly analytic form of representation can supply grist for the mill that applies rational knowledge to making incisive judgments about action. To object that the reality itself does not exist in the analyzed form created by a system of representation is like objecting to changing the form of bread in the process of digesting it. It is only necessary to remember that representations are supposed to be different from the realities they address, and that the nature of one need not existentially conflict with the nature of the other.
In exception to the general rule, some work in AI and cognitive science has reached the verge of applying the homomorphic idea of representation, although in some cases the arrows may be reversed. Notable in this connection is the concept of structure-mapping exploited in (Gentner & Gentner, 1983) and (Prieditis, 1988) and the notion of quasi-morphism introduced in (Holland, et al., 1986). One of the software engineering challenges implicit in this work is to provide the kind of standardized category-theoretic computational support that would be needed to routinely set up and test whole parametric families of such models. An off-the-shelf facility for categorical computing would of course have many other uses in theoretical and applied mathematics.
1.2.1.1. Observation and Action
It seems clear that observations are a special type of action, and that actions are a special type of observable event. At least, actions are events that may come to be observed, if only in the way that outcomes of eventual effects are recognized to confirm the hypotheses of specific causes. Is every action in some sense an observation? Is every observable event in some sense an observation, a commemoration, an event whose occasion serves to observe something else? If this were so, then the concepts of observation and action would be special cases of each other. Computer scientists will have no trouble accepting the mutual recursion of complex notions, so long as the conceptual instrument as a whole does its job, and so long as the recursion bottoms out somewhere. The mutual definition can find its limit in two ways. It can ground out centrally, with a single category of primitive element that has all the relevant aspects being analyzed, here both perception and action. It can scatter peripherally, resolving into simple elements that distinctively belong to one category or another.
1.2.1.2. Observation and Observables
Independently of their distinctness as categories, what is the relation of the observing and the observable as roles played out in the theater of observation? Observation may be the noting of internal or external events, but more than contemplation it requires the possibility of leaving a record. Nothing serves as observation unless notches can be made in a medium that retains the indenture through time. By this analysis, observation is found to be involved in the very same relation that signs have to their objects. The observation is a sign of its observed object, event, or action. In spite of the active character of concrete observation, it still seems convenient in theoretical models (like Turing machines) to divide observation across two abstract components: an active, empirical part that arranges apparatus for a complex test and goes looking for what's happening (on unforeseen segments of tape), and a passive, logical part that represents the elementary reception and pure contingency of simply noting without altering what's under one's nose (or read head).
1.2.1.3. Observation and Interpretation
The foregoing discussion of observation and observables seems like such a useless exercise in hair-splitting that a forward declaration of its eventual purpose is probably called for at this point. Section 2 will introduce a notation for propositional calculus, and Section 3 will describe a proposal for its differential extension. To anticipate that development a bit schematically, suppose that a symbol "x" stands for a proposition (true-false sentence) or a property (qualitative feature). Then a symbol "dx" will be introduced to stand for a primitive property of "change in x". Differential features like "dx", depending on the circumstances of interpretation, may be interpreted in several ways. Some of these interpretations are fairly simple and intuitive; other ways of assigning them meaning in the subject matter of systems observations are more subtle. In all of these senses the proposition "dx" has properties analogous to assignment statements like "x := x+1" and "x := not x". In spite of the fact that its operational interpretation entails difficulties similar to those of assignment statements, I think this notation may provide an alternate way of relating the declarative and procedural semantics of computational state change.
In one of its fuller senses the differential feature "dx" can mean something like this: The system under consideration will next be observed to have a different value for the property "x" than the value it has just been observed to have. As such, "dx" involves a three-place relationship among an observed object, a signified property, and a specified observer. Note that the truth of "dx" depends on the relative behavior of the system and the observer, in a way that cannot be analyzed into absolute properties of either without introducing another observer. If "dx" is interpreted as the expectation of a certain observer, then its realization can be imagined to depend on both the orbit of the system and the sampling scheme or threshold level of the observer. In general, differential features can involve the dynamic behavior of an observed system, decisions about a designated property, and the attention of a specified observer in ways that are irreducibly triadic in their level of complexity.
For example, the system may "actually" have crossed the line between "x" and "not x" several times while the observer was not looking, but without additional oversight this is only an imaginary or virtual possibility. And it is well understood that oversight committees, though they may serve the purpose of a larger objectivity by converging in time on broadly warranted results, in the meantime only compound the complexity of the question at issue. Therefore, it should be clear that the relational concept indicated by "dx" is a primitive notion, in the general case irreducible to concepts of lower order. The relational fact asserted by "dx" is a more primary reality than the manifold ways of parceling out responsibility for it to the interaction of separate agents that are subsystems of the whole. The question of irreducibility in this three-place relation is formally equivalent to that prevailing in the so-called sign relation that exists among objects, signs, and interpreting signs or systems.
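The dependence of "dx" on both the orbit of the system and the sampling scheme of the observer can be sketched with a Boolean property sampled at different strides. The orbit, the strides, and the function names are hypothetical, and this reading of "dx" is only the one described in the preceding paragraphs, not the formal definition promised for Section 3.

```python
def observe(orbit, stride):
    """Sample every `stride`-th state of the orbit (an assumed sampling scheme)."""
    return orbit[::stride]

def dx(samples):
    """dx holds at a sample when the next observed value of x differs from it."""
    return [a != b for a, b in zip(samples, samples[1:])]

# Hypothetical orbit in which the system executes x := not x at every step.
orbit = [False, True, False, True, False, True, False]

# An observer who looks at every step always finds dx true.
attentive = dx(observe(orbit, 1))

# An observer who looks at every second step never finds dx true,
# although the system crossed the line between looks.
inattentive = dx(observe(orbit, 2))
```

The same orbit yields opposite verdicts on "dx" for the two observers, so the truth of "dx" cannot be read off the system alone.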
If a particular observer is taken as a standard, then discussion reduces to a universe of discourse about various two-place relations, that is, the relations of a system's state to several designated properties. Relative to this frame, a system can be said to have a variety of objective properties. An observer may be taken as a standard for no good reason, but usually a system of observation becomes standardized by exhibiting properties that make it suitable for use as such, like the fabled daily walks of Kant through the streets of Konigsberg by which the people of that city were able to set their watches (Osborne, p. 101). This reduction is similar to the way that a pragmatic discussion of signs may reduce to semantic and even syntactic accounts if the context of usage is sufficiently constant or if a constant interpreter is assumed. Close analogies between observation and interpretation will no doubt continue to arise in the synthesis of physical and intelligent dynamics.
1.2.2. Symbolic Media
A critical transition in the development of a system is reached when components of state are set aside internally or assimilated from the environment to make relatively irreversible changes, indelible marks to record experiences and note intentions. Where in the dynamics of a system do these signs reside? In what nutations of equilibrium does the system insinuate its libraries of notation, the tokens of passed, pressing, and prospective moments of experience? What parameters are concretely set as memorials to the results of observations performed, the outcomes of actions observed, and the plans of action contemplated to provide the experience of desired effects? What bank accumulates all the words coined and spent on sights and deeds? What mint guarantees the content and determines the form of their first impressions?
1.2.2.1. Papyrus, Parchment, Palimpsest
Starting from the standpoint of systems theory a sizable handicap must be overcome in the quest to figure out: "What's in the brain that ink may character?" and "Where is fancy bred?" (McCulloch, 1965). If localized deposits of historical records and promissory notes are all that can be found, a considerable amount of reconstruction may be necessary to grasp the living reality of experience and purpose that underlies them still. A distinction must be made between the analytic or functional structure of the phase space of a system and the anatomical structure of a hypothetical agent to whom these states are attributed. The separation of a system into environment and organism and the further detection of anatomical structure within the organism depend on a direct product decomposition of the space into relatively independent components whose interactions can be treated secondarily. But the direct product is a comparatively advanced stage of decomposition and not to be expected in every case.
This point draws the chase back through the briar patch of that earlier complexity theory, the prime decomposition or group complexity theory of finite automata and their associated formal languages or transformation semigroups (Lallement, ch. 4). This more general study requires the use of semi-direct products (Rotman, 1984) and their ultimate extension into wreath products or cascade products, along with the corresponding notions of divisibility, factorization, or decomposition (Barr & Wells, 1990, ch. 11). This theory seems to have reached a baroque stage of development, either too difficult to pursue with vigor, too lacking in applications, or falling short of some essential insight. It looks like another one of those problem areas that will need to be revisited on the way to integrating AI and systems theory.
1.2.2.2. Statements, Questions, Commands
When signs are created that can be placed in reliable association with the results of observations and the onsets of actions, these signs are said to denote or evoke the corresponding events and actions. This is the beginning of declarative, imperative, and interrogative uses of symbolic expressions.
The interrogative mode is associated with residual properties of the state occupied by a system. The question marks a difference between states denoted by declarative expressions, a divergence between expectation and actuality. The inquisitive use of a sign notes a surprise to be explained, usually by adducing the signs of less obvious facts to the account. A surprise incites the system to an effort whose end is to bring the system's habits of expectation in line with what actually happens on a recurring basis.
The imperative mode is associated with convergent possibilities of the states in which a system may come to reside. The command calls attention to a discrepancy between actuality and intention, a difference between the states independently declared to be observed and desired. The injunctive use of a sign sets a problem to be resolved, usually by executing the actions enjoined by a series of signs. A problem incites the system to an effort whose end is to bring what actually happens on a recurring basis in line with the system's hopeful anticipations. If this problem turns out to be intractable, then the expectation that these intentions can be fulfilled may have to be changed. In this way the different modes of inquiry are often embroiled in intricate patterns of interaction.
In proceeding from surprise and problem states to arrive at explanations and plans of action that are suited to resolving these states, the system's aim is expedited by certain resources, all of which involve massive and complex systems of signs and symbolic expressions. It helps to have a library, an access to the records of individual and collective past efforts and experiences. To be used for clear and present indications this library must have a ready index of its contents, a form of afterthought that is not too thoughtless in design. It helps to have a laboratory, a workshop or a study, any facility where imagination reigns for composing and testing improvised programs and theories, for prototyping on-the-spot inventions. To be used for free and unbiased evaluation this factory of imagination must be a mechanism of forethought without malice, where symbolic expressions extempore are not confused with actions and do not exact the same price in energy spent and pain risked.
But how can all this information and flexibility, constraint vying with freedom of interpretation, be accorded a place in the present state of a system? Can Epimetheus and Prometheus find a way to "get along" in the current state of things? Is the phase space of a system really big enough for both of them?
If signs and symbols are to receive a place in systems theory it must be possible to construct them from materials available on that site. But the only thing a system has to work with is its own present state. How do states of a system come to serve the role of signs? How can it make sense to say that a system regards one of its own states as a sign of something else? How do certain states of a system come to be taken by that system, as evidenced by its interpretive behavior, as signs of something else, some object or objective? A good start toward answering these questions would be made by defining the words used in asking them. In looking at the concepts that remain to be given system-theoretic definitions it appears that all of these questions boil down to one: What character in the dynamics of a system would cause it to be called a sign-using system, one that acts as an interpreter in a non-trivial sense?
1.2.2.3. Pragmatic Theory of Signs
The theory of signs that I find most useful was developed over several decades in the last century by C.S. Peirce, the founder of modern American pragmatism. Signs are defined pragmatically, not by any essential substance, but by the role they play within a three-part relationship of signs, interpreting signs, and referent objects. It is a tenet of pragmatism that all thought takes place in signs. Thought is not placed under any preconceived limitation or prior restriction to symbolic domains. It is merely noted that a certain analysis of the processes of perception and reasoning finds them to resolve into formal elements which possess the characters and participate in the relations that a definition will identify as distinctive of signs.
One version of Peirce's sign definition is especially useful for the present purpose. It establishes for signs a fundamental role in logic and is stated in terms of abstract relational properties that are flexible enough to be interpreted in the materials of dynamic systems. Peirce gave this definition of signs in his 1902 "Application to the Carnegie Institution":
Logic is formal semiotic. A sign is something, A, which brings something, B, its interpretant sign, determined or created by it, into the same sort of correspondence (or a lower implied sort) with something, C, its object, as that in which itself stands to C. This definition no more involves any reference to human thought than does the definition of a line as the place within which a particle lies during a lapse of time. (Peirce, NEM 4, 54).
It is from this definition, together with a definition of "formal", that I deduce mathematically the principles of logic. I also make a historical review of all the definitions and conceptions of logic, and show, not merely that my definition is no novelty, but that my non-psychological conception of logic has virtually been quite generally held, though not generally recognized. (Peirce, NEM 4, 21).
A placement and appreciation of this theory in a philosophical context that extends from Aristotle's early treatise On Interpretation through John Dewey's later elaborations and applications (Dewey, 1910, 1929, 1938) is the topic of (Awbrey & Awbrey, 1992). Here, only a few features of this definition will be noted that are especially relevant to the goal of implementing intelligent interpreters.
One characteristic of Peirce's definition is crucial in supplying a flexible infrastructure that makes the formal and mathematical treatment of sign relations possible. Namely, this definition allows objects to be characterized in two alternative ways that are substantially different in the domains they involve but roughly equivalent in their information content. Namely, objects of signs, that may exist in a reality exterior to the sign domain, insofar as they fall under this definition, allow themselves to be reconstituted nominally or reconstructed rationally as equivalence classes of signs. This transforms the actual relation of signs to objects, the relation or correspondence that is preserved in passing from initial signs to interpreting signs, into the membership relation that signs bear to their semantic equivalence classes. This transformation of a relation between signs and the world into a relation interior to the world of signs may be regarded as a kind of representational reduction in dimensions, like the foreshortening and planar projections that are used in perspective drawing.
This definition takes as its subject a certain three-place relation, the sign relation proper, envisioned to consist of a certain set of three-tuples. The pattern of the data in this set of three-tuples, the extension of the sign relation, is expressed here in the form: ‹Object, Sign, Interpretant›. As a schematic notation for various sign relations, the letters "s", "o", "i" serve as typical variables ranging over the relational domains of signs, objects, interpretants, respectively. There are two customary ways of understanding this abstract sign relation as its structure affects concrete systems.
In the first version the agency of a particular interpreter is taken into account as an implicit parameter of the relation. As used here, the concept of interpreter includes everything about the context of a sign's interpretation that affects its determination. In this view a specification of the two elements of sign and interpreter is considered to be equivalent information to knowing the interpreting or the interpretant sign, that is, the affect that is produced in or the effect that is produced on the interpreting system. Reference to an object or to an objective, whether it is successful or not, involves an orientation of the interpreting system and is therefore mediated by affects in and effects on the interpreter. Schematically, a lower case "j" can be used to represent the role of a particular interpreter. Thus, in this first view of the sign relation the fundamental pattern of data that determines the relation can be represented in the form ‹o, s, j› or ‹s, o, j›, as one chooses.
In the second version of the sign relation the interpreter is considered to be a hypostatic abstraction from the actual process of sign transformation. In other words, the interpreter is regarded as a convenient construct that helps to personify the action but adds nothing informative to what is more simply observed as a process involving successive signs. An interpretant sign is merely the sign that succeeds another in a continuing sequence. What keeps this view from falling into sheer nominalism is the relation with objects that is preserved throughout the process of transformation. Thus, in this view of the sign relation the fundamental pattern of data that constitutes the relationship can be indicated by the optional forms ‹o, s, i› or ‹s, i, o›.
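As a concrete illustration of the second view, a small sign relation can be written out as a finite set of three-tuples. The following is a minimal sketch only: the two objects, the four sign tokens, and the function names are hypothetical choices made for the example and are not taken from the text. It shows the extension ‹o, s, i› as plain data, two of its two-place projections, and the regrouping of signs into semantic equivalence classes, one class per object.

```python
from collections import defaultdict

# Hypothetical toy sign relation, given as triples (object, sign, interpretant).
# Objects "Ann" and "Bob" and the sign tokens "A", "i", "B", "u" are
# illustrative assumptions, not data from the essay.
SIGN_RELATION = {
    ("Ann", "A", "A"), ("Ann", "A", "i"),
    ("Ann", "i", "A"), ("Ann", "i", "i"),
    ("Bob", "B", "B"), ("Bob", "B", "u"),
    ("Bob", "u", "B"), ("Bob", "u", "u"),
}


def project_object_sign(relation):
    """Two-place projection onto (object, sign) pairs."""
    return {(o, s) for (o, s, i) in relation}


def project_sign_interpretant(relation):
    """Two-place projection onto (sign, interpretant) pairs."""
    return {(s, i) for (o, s, i) in relation}


def semantic_classes(relation):
    """Group signs and interpretants by the object they stand for.

    Each object is thereby reconstructed as an equivalence class of signs.
    """
    classes = defaultdict(set)
    for o, s, i in relation:
        classes[o].update((s, i))
    return {o: frozenset(signs) for o, signs in classes.items()}


if __name__ == "__main__":
    for obj, signs in sorted(semantic_classes(SIGN_RELATION).items()):
        print(obj, sorted(signs))
```

In this toy case the interpretant of a sign is always another sign in the same class, so the relation of signs to objects can be read off from class membership alone, which is the kind of reduction described earlier.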
Viewed as a totality, a complete sign relation would have to consist of all of those conceivable moments — past, present, prospective, or in whatever variety of parallel universes that one may care to admit — when something means something to somebody, in the pattern ‹s, o, j›, or when something means something about something, in the pattern ‹s, i, o›. But this ultimate sign relation is not often explicitly needed, and it could easily turn out to be logically and set-theoretically ill-defined. In physics, it is important for theoretical completeness to regard the whole universe as a single physical system, but more common to work with "isolated" subsystems. Likewise in the theory of signs, only particular and well-bounded subsystems of the ultimate sign relation are likely to be the subjects of sensible discussion.
It is helpful to view the definition of individual sign relations on analogy with another important class of three-place relations of broad significance in mathematics and far-reaching application in physics: namely, the binary operations or ternary relations that fall under the definition of abstract groups. Viewed as a definition of individual groups, the axioms defining a group are what logicians would call highly non-categorical, that is, not every two models are isomorphic (Wilder, p. 36). But viewing the category of groups as a whole, if indeed it can be said to form a whole (MacLane, 1971), the definition allows a vast number of non-isomorphic objects, namely, the individual groups.
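The analogy can be made tangible with a short sketch. Assuming nothing beyond the standard group axioms, the code below encodes a binary operation as a ternary relation, a set of triples (x, y, z) with x * y = z, and checks the axioms against that relation. The two examples, the cyclic group of order four and the Klein four-group, both satisfy the definition yet are not isomorphic, which is what non-categoricity amounts to here. The function names are illustrative choices.

```python
from itertools import product


def op_as_relation(elements, op):
    """Encode a binary operation as a set of triples (x, y, op(x, y))."""
    return {(x, y, op(x, y)) for x, y in product(elements, repeat=2)}


def is_group(elements, rel):
    """Check the group axioms for a ternary relation over `elements`."""
    table = {(x, y): z for (x, y, z) in rel}
    # The relation must be a total, single-valued, closed operation.
    if len(table) != len(rel):
        return False
    if set(table) != set(product(elements, repeat=2)):
        return False
    if any(z not in elements for z in table.values()):
        return False
    # Associativity: (x * y) * z == x * (y * z).
    for x, y, z in product(elements, repeat=3):
        if table[table[x, y], z] != table[x, table[y, z]]:
            return False
    # A two-sided identity element must exist.
    ids = [e for e in elements
           if all(table[e, x] == x and table[x, e] == x for x in elements)]
    if len(ids) != 1:
        return False
    e = ids[0]
    # Every element needs a two-sided inverse.
    return all(any(table[x, y] == e and table[y, x] == e for y in elements)
               for x in elements)


def count_self_inverse(elements, rel):
    """Count elements x with x * x equal to the identity (an isomorphism invariant)."""
    identity = next(e for e in elements
                    if all((e, x, x) in rel for x in elements))
    return sum((x, x, identity) in rel for x in elements)


# Cyclic group of order 4: addition modulo 4.
Z4 = list(range(4))
Z4_REL = op_as_relation(Z4, lambda x, y: (x + y) % 4)

# Klein four-group: pairs of bits under componentwise exclusive-or.
V4 = list(product((0, 1), repeat=2))
V4_REL = op_as_relation(V4, lambda x, y: (x[0] ^ y[0], x[1] ^ y[1]))

if __name__ == "__main__":
    print(is_group(Z4, Z4_REL), is_group(V4, V4_REL))
    print(count_self_inverse(Z4, Z4_REL), count_self_inverse(V4, V4_REL))
```

Because an isomorphism would have to preserve the number of self-inverse elements, the differing counts show that the two models of the same axioms cannot be isomorphic.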
In mathematical inquiry the closure property of abstract groups mitigates most of the difficulties that might otherwise attach to the precision of their individual definition. But in physics the application of mathematical structures to the unknown nature of the enveloping world is always tentative. Starting from the most elemental levels of instrumental trial and error, this kind of application is fraught with intellectual difficulty and even the risk of physical pain. The act of abstracting a particular structure from a concrete situation is no longer merely abstract. It becomes, in effect, a hypothesis, a guess, a bet on what is thought to be the most relevant aspect of a current, potentially dangerous, and always ever-insistently pressing reality. And this hypothesis is not a paper belief but determines action in accord with its character. Consequently, due to the abyss of ignorance that always remains to our kind and the chaos that can result from acting on what little is actually known, risk and pain accompany the extraction of particular structures, attempts to isolate particular forms, or guesses at viable factorizations of phenomena.
Likewise in semiotics, it is hard to find any examples of autonomous sign relations and to isolate them from their ulterior entanglements. This kind of extraction is often more painful because the full analysis of each element in a particular sign relation may involve references to other object-, sign-, or interpretant-systems outside of its ostensible, initially secure bounds. As a result, it is even more difficult with sign systems than with the simpler physical systems to find coherent subassemblies that can be studied in isolation from the rest of the enveloping universe.
These remarks should be enough to convey the plan of this work. Progress can be made toward new resettlements of ancient regions where only turmoil has reigned to date. Existing structures can be rehabilitated by continuing to unify the terms licensing AI representations with the terms leasing free space over dynamic manifolds. A large section of habitable space for dynamically intelligent systems could be extended in the following fashion: The images of state and the agents of change that are customary in symbolic AI could be related to the elements and the operators which form familiar planks in the tangent spaces of dynamic systems. The higher order concepts that fill out AI could be connected with the more complex constructions that are accessible from the moving platforms of these tangent spaces.
1.2.3. Architecture of Inquiry
The outlines of one important landmark can already be seen from this station. It is the architecture of inquiry, in the style traced out by C.S. Peirce and John Dewey on the foundations poured by Aristotle. I envision being able to characterize the simplest drifts of its dynamics in terms of certain differential operators.
It is important to remember that knowledge is a different sort of goal from the run-of-the-mill setpoints that a system might have. The typical goal is a state that a system has actually experienced many times before, like normal body temperature for a human being. But a particular state of knowledge that an intelligent system moves toward may be a state it has never been through before. The fundamental equivocation on this point expressed in Plato's Meno, whether learning is functionally equivalent to remembering, was discussed above. In spite of this quibble, it still seems necessary to regard states of knowledge as a distinctive class. The reasons for this may lie in the fact that a useful definition of inquiry for human beings necessarily involves a whole community of inquiry.
On account of this social character of inquiry, even those states of knowledge which might be arrived at through accidental, gratuitous, idiosyncratic, transcendental, or otherwise inexplicable means are useless for most human purposes unless they can be communicated, that is, reliably reproduced in the social system as a whole. In order to do this it seems necessary as a practical matter, whatever may have been the original process of construction, that such states of knowledge be obtainable through the option of a rational reconstruction. Hence the familiar requirement of proof for mathematical results, no matter how inspired their first glimmerings. Hence the discipline of programming that challenges workers in AI to represent intelligent processes in terms of computable functions, however differently intelligence may have evolved in the frame of biological time.
Aristotle long ago pointed out that there can be no genuine science of the purely idiosyncratic subject, no systematic knowledge of the totally isolated event. Science does not have as its domain all experience but only that subset which is indefinitely repeatable. Likewise on the negative branch, concerning the lack of knowledge that occasions a problem, a state that never recurs does not present a problem for a system. This limitation of scientific problems and knowledge to recurrent phenomena yields an important clue. The placement of intelligence and knowledge in analogy with system attributes like momentum and frequency may turn out to be based on deeply common principles.
1.2.3.1. Inquiry Driven Systems
One goal of this work is to develop a formalism adequate to the description of knowledge-oriented inquiry-driven systems in logical and differential terms, to be able to write down and run as simulations qualitative differential equations that describe individual cases of systems with knowledge-directed behavior, systems which exhibit a progress toward a goal of knowledge. A knowledge-oriented system is one which maintains a knowledge base which figures into its behavior in a dual role, both as a guide to action and as the object of a system goal to increase the measure of its usefulness. An inquiry-driven system is one that develops its knowledge base in response to the differences existing between three aspects of state that may be projected or generated from its total state, components which might be called: expectations, observations, and intentions.
It is not clear at this point if there can be interesting classes of inquiry-driven systems which are purely deterministic, but a recognition of what such a system would be like might help to clarify the limits of the notion. In some sense a deterministic inquiry-driven system would fulfill a behaviorist dream. It would correspond to a scientific agent whose every action is predictable, even to the phenomena it will encounter, hypotheses it will entertain, and experiments it will perform as a consequence. If it is accepted that behaviorist proposals are tantamount to a restriction of methodology to the level of finite state description, then less elaborate characterizations of such systems are always available. Proper hypotheses, which are not just summaries of finite past experience but can refer to an infinite set of projected examples, are commonly associated with complexities in behavior that proceed from the essentially context-free level on up.
One important use of a system's current knowledge base is to project expectations of what is likely to be actualized in its experience, an image of what state it envisions probable. Another use of a system's personal knowledge base is to preserve intentions during the execution of series of actions, to keep a record of a current goal, a picture of what it would like to find actualized in its experience, an image of what state it envisions desirable. From these uses of images two kinds of differences crop up in the process of inquiry.
1.2.3.2. Surprises to Explain
One of the uses of a knowledge base is to support the generation of expectations. In return, one of the inputs to the operators which edit and update a knowledge base is the set of differences between expected and observed states. An inquiry-driven system requires a function to compare expected states, as represented in the images reflexively generated from present knowledge, and observed states, as represented in the images currently delivered as unquestioned records of actual happenings. In human terms this kind of discrepancy between expectation and observation is experienced as a surprise, and is usually felt as constituting an impulse toward an explanation that can reduce the sense of disparity. The specification of a particular inquiry-driven system would have to detail this relation between states of uncertainty and resultant actions.
1.2.3.3. Problems to Resolve
Since a system's determination of its own goals is a special case of knowledge in general, it is convenient to allocate a place for this kind of information in the knowledge component of an intelligent system. Thus, the intellectual component of a knowledge-oriented system may be allowed to preserve its intentions, the images of currently active goals. Often there is a difference between an actual state, as represented by the image developed in free space by a trusted process of observation, and an active goal, as represented by an image in the same space but cherished within the frame of intention or otherwise distinguished by an attentional affect. This situation represents a problem to be solved by the system through actions that effect changes on the level of its primary dynamics. The system chooses its trajectory in accord with reducing the difference between its active intentions and the observations that record actual conditions.
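The two kinds of difference just described, surprises and problems, can be sketched in a few lines. This is only a schematic under strong simplifying assumptions: each image of state is taken to be a flat dictionary of named features, and the feature names and values are invented for the example. Nothing here is meant as the formalism the work aims to develop.

```python
def differences(image_a, image_b):
    """Return the features on which two images of state disagree.

    Each image is assumed to be a dict from feature name to value.
    The result maps a feature to the pair (value in a, value in b).
    """
    keys = image_a.keys() | image_b.keys()
    return {k: (image_a.get(k), image_b.get(k))
            for k in keys
            if image_a.get(k) != image_b.get(k)}


def surprise(expected, observed):
    """Discrepancy between expectation and observation: something to explain."""
    return differences(expected, observed)


def problem(intended, observed):
    """Discrepancy between intention and observation: something to resolve."""
    return differences(intended, observed)


if __name__ == "__main__":
    # Hypothetical images of state for a toy system.
    expected = {"door": "open", "light": "on"}
    observed = {"door": "closed", "light": "on"}
    intended = {"door": "closed", "light": "off"}

    print("surprise:", surprise(expected, observed))
    print("problem:", problem(intended, observed))
```

In this reading a surprise is input to the operators that revise the knowledge base, while a problem is input to the choice of actions on the level of primary dynamics.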
1.2.4. Simple Minded Systems
Of course, not every total manifold need have a nice factorization. It might be thought best to dispense with such spaces immediately, to put them aside as unreasonable. But it may not be possible to dismiss them quite so easily and summarily. Intelligent systems of this sort may end up being refractory to routine analysis and will have to be regarded as simple minded. That is, they may turn out to be simple in the way that algebraic objects are usually called simple, having no interesting proper factors of the same sort. Suppose there are such simple minded systems, otherwise deserving to be called intelligent but which have no proper factorization into the kind of gross dynamics and subtle dynamics that might correspond to the distinction ordinarily made between somatic and mental behavior. That is, they do not have their activity sorted into separate scenes of action: one for ordinary physical and thermal dynamics, another for information processing dynamics, symbolic operations, knowledge transformations, and so on up the scale. In the event that this idea of simplicity can be found to make sense, it is likely that simple minded systems would be deeply involved in or place extreme bounds on the structures of all intelligent systems.
A realm of understanding subject to a certain rule of analysis may have a boundary marked by simple but adamant exceptions to its further reign. Or it may not have a boundary, but that seems to verge on an order of understanding beyond the competence of computational systems. Whether the human form of finitude abides or infringes this limitation is something much discussed but not likely to be settled any time soon. In order to pursue the question of simplicity the form of analysis must be examined more carefully. The type of factorization that system-theoretic analogies suggested was gotten by locating a convenient stage at which to truncate or abridge the typical datum. This amounts to a projection of the data space onto a stage set by this process of critical evaluation. The fibers of this projection are the data sets that form the inverse images of points in its range.
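The notion of a projection and its fibers can be shown on finite data. In this minimal sketch each datum is assumed to be a tuple, the projection truncates it to its first two components, and the fiber over a point is the list of data that project onto it. The data values and the names `fibers` and `truncate` are hypothetical.

```python
from collections import defaultdict


def fibers(data, projection):
    """Group data by their image under `projection`.

    Returns a dict mapping each point in the range of the projection
    to its fiber, the list of data that are sent to that point.
    """
    result = defaultdict(list)
    for datum in data:
        result[projection(datum)].append(datum)
    return dict(result)


def truncate(datum):
    """Abridge a datum to its first two components."""
    return datum[:2]


if __name__ == "__main__":
    # Hypothetical data: two "gross" coordinates followed by a "subtle" one.
    data = [(0, 0, "a"), (0, 0, "b"), (0, 1, "a"), (1, 1, "c")]
    for point, fiber in fibers(data, truncate).items():
        print(point, fiber)
```

A space factors nicely, in the sense at issue, when every fiber looks the same; here the fibers differ in size, so the sample data are not a direct product of the truncated part and a fixed remainder.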
In reflecting on the form of analysis that has naturally arisen at this point it appears to display the following character. An object is presented to contemplation in the light of a finite collection of features. If the object is found to possess every one of the listed features, this incurs the existence of another object, simpler in some sense, to which analytic attention is then shifted. It may be figuratively expressed that the analysis descends to a situation closer to the initial conditions or bounds to a site nearer the boundary conditions.
The cases of simple minded systems appear to contain at least the following two possibilities. First, a simple minded system may come into being already knowing itself perfectly, in which case all the irony of a Socrates would be lost on it, in terms of bringing it a whit closer to knowledge. The system already knows its whole manifold of possible states, that is, its knowledge component is in some sense complete, containing an answer to every possible dynamic puzzle that might be posed to it. Rather than an overwhelming richness of theory, this is more likely to arise from a structural poverty of the total space and a lack of capacity for the reception of questions that can be posed to it, as opposed to those posed about it. Second, a simple minded system might be born into an initial condition of ignorance, with the potential of reaching states of knowledge within its space, but these states may be discretely distributed in a continuous manifold. This means that states of knowledge could be achieved only by jumping directly to them, without the benefit of an error-controlled feedback process that allows a system to converge gradually upon the goals of knowledge.
1.3. Telos : Horizons and Further Applications
In its etymology, intelligence suggests a capacity that contains its goal (telos) within itself. [No, insert correction here.] Of course, it does not initially grasp that for which it reaches, does not always possess its goal, otherwise it would be finished from the start. So it must be that it contains only a knowledge of its goal. This need not be a perfect knowledge even of what defines the goal, leaving room for clarification in that dimension, also. Some thinkers on the question suspect that the capacity for setting goals may answer to another name: wisdom (sophia), prudence (phronesis), and even elegance (arete) are among the candidates often heard. If so, intelligence would have a relationship to this wisdom and sagacity that is analogous to the relationship of logic to ethics and esthetics. At least, this is how it appears from the standpoint of one philosophical tradition that recommends itself to me.
1.3.1. Logic, Ethics, Esthetics
The philosophy I find myself converging to more often lately is the pragmatism of C.S. Peirce and John Dewey. According to this account, logic, ethics, and esthetics form a concentric series of normative sciences, each a subdiscipline of the next. Logic tells how one ought to conduct one's reasoning in order to achieve the stated goals of reasoning in general. Thus logic is a special application of ethics. Ethics tells how one ought to conduct one's activities in general in order to achieve the good appropriate to each enterprise. What makes the difference between a normative science and a prescriptive dogma is whether this "telling" is based on actual inquiry into the relationship of conduct to result, or not.
In this view, logic and ethics do not set goals, they merely serve them. Of course, logic may examine the consistency of an arbitrary selection of goals in the light of what science tells about the likely repercussions in nature of trying to actualize them all. Logic and ethics may serve the criticism of certain goals by pointing out the deductive implications and probable effects of striving toward them, but it has to be some other science which finds and tells whether these effects are preferred and encouraged or detested and discouraged relative to a particular form of being.
The science which examines individual goods, species goods, and generic goods from an outside perspective must be an esthetic science. The capacity for inquiry into a subject must depend on the capacity for uncertainty about that subject. Esthetics is capable of inquiry into the nature of the good precisely because it is able to be in question about what is good. Whether conceived as empirical science or as experimental art, it is the job of esthetics to determine what might be good for us. Through the exploration of artistic media we find out what satisfies our own form of being. Through the expeditions of science we discover and further the goals of our own species' evolution.
Outriggers to these excursions are given by the comparative study of biological species and the computational study of abstractly specified systems. These provide extra ways to find out what is the sensible goal of an individual system and what is the perceived good for a particular species of creature. It is especially interesting to learn about the relationships that can be represented internally to a system's development between the good of a system and the system's perception, knowledge, intuition, feeling, or whatever sense it may have of its goal. This amounts to asking the questions: What good can a system be able to sense for itself? How can a system discover its own best interests? How can a system achieve, from the evidence of experience, a cognizance, evidenced in behavior, of its own best interests?
1.3.2. Inquiry and Education
My joint work with Susan Awbrey speculates on the yield of AI technology for new seasons of inquiry-based approaches to education and research (Awbrey & Awbrey, 1990, '91, '92). A fruitful beginning can be made, we find, by returning to grounds that were carefully prepared by C.S. Peirce and John Dewey, and by asking how best to rework these plots with the implements that the intervening years have provided. There is currently being pursued a far-ranging diversity of work on the applications of AI to education, through research on problem solving performance (Smith, 1991), learner models and the novice-expert shift (Gentner & Stevens, 1983), the impact of cognitive strategies on instructional design (West, Farmer, & Wolff, 1991), the use of expert systems as teaching tools (Buchanan & Shortliffe, 1984), (Clancey & Shortliffe, 1984), and the development of intelligent tutoring systems (Sleeman & Brown, 1982), (Mandl & Lesgold, 1988). Other perspectives on AI's place in science, society, and the global scene may be sampled in (Wiener, 1950, 1964), (Ryan, 1974), (Simon, 1982), (Gill, 1986), (Winograd & Flores, 1986), and (Graubard, 1988).
1.3.3. Cognitive Science
Remarkably, seeds of a hybrid character, similar to what is sought in the intersection of AI and systems theory, were planted many years ago by one who explored the farther and nether regions of the human mind. This model blended recognizably cybernetic and cognitive ideas in a scheme that included associative networks for adaptive learning and recursion mechanisms for problem solving. But these ideas lay dormant and untended by their originator for over half a century. Sigmund Freud rightly estimated that this model would always be too simple-minded to help with the complex and subtle exigencies of his chosen practice. But the "Project" he wrote out in 1895 (Freud, 1954) is still more sophisticated, in its underlying computational structure, than many receiving serious study today in AI and cognitive modeling. Again, here is another stage, another window, where old ideas and directions may be worth a new look with the new 'scopes available, if only to provide a basis for informing the speculations that get a theory started.
The ideas of information processing, AI, cybernetics, and systems theory had more direct interactions, of course, with the development of cognitive science. A share of these mutual influences and crosscurrents may be traced through the texts and references in (Young, 1964, 1978), (de Bono, 1969), (Eccles, 1970), (Anderson & Bower, 1973), (Krantz, et al., 1974), (Johnson-Laird & Wason, 1977), (Lachman, Lachman, & Butterfield, 1979), (Wyer & Carlston, 1979), (Boden, 1980), (Anderson, 1981, '83, '90), (Schank, 1982), (Gentner & Stevens, 1983), (H. Gardner, 1983, 1985), (O'Shea & Eisenstadt, 1984), (Pylyshyn, 1984), (Bakeman & Gottman, 1986), (Collins & Smith, 1988), (Minsky & Papert, 1988), (Posner, 1989), (Vosniadou & Ortony, 1989), (Gottman & Roy, 1990), and (Newell, 1990).
1.3.4. Philosophy of Science
Continuing the angle of assault previously taken toward the abandoned mines of intellectual history, there are many other veins and lodes, subsided and shelved, that experts assay too low a grade for current standards of professional work. Yet many of these superseded courses and discredited vaults of theory are worth retooling and remining in the shape of computer models. Computational reenactments of these precept chapters in human thought, not just repetitions but analytic representations, could serve the purpose of school figures, training exercises and stock examples, to be used as instructional paradigm cases.
But there is a further possibility. Many foregone projects were so complex that not everything was understood about their implications at the time they were rejected for some critical flaw or another. It is conceivable that new things might be learned about the global character of these precursory models from computer simulations of their axioms, leading principles, and general lines of reasoning. Even though their flaws were eventually detected by unaided analysis, their positive features and possible directions of amendment may not have been so easily appreciated. An extended reflection on the need for various kinds of reconstruction in and of philosophy, and the conditions for their meaningful application to unclear but present situations, may be found in (Dewey, 1948).
A prime example of a project awaiting this kind of salvage operation is the submerged edifice of Carnap's "world building" (1928, 1961), the remains of a mission dedicated to "the rational reconstruction of the concepts of all fields of knowledge on the basis of concepts that refer to the immediately given … the searching out of new definitions for old concepts" (1969, v). The illusory stability of the "immediately given" has never been more notorious than today. But the relevant character to be appreciated in this classical architecture is the degree of harmony and balance, the soundness in support of lofty design that subsists and makes itself evident in the relationship of one level to another. Much that is toxic in our intellectual environment today could be alleviated by a suitably analytic and perceptive movement to recycle, reclaim, and restore the artifacts and habitations of former times.
2. Conceptual Framework
2.1. Systems Theory and Artificial Intelligence
If the principles of systems theory are taken seriously in their application to AI, and if the tools that have been developed for dynamic systems are cast in with the array of techniques that are used in AI, a host of difficulties almost instantly arises. One obstacle to integrating systems theory and artificial intelligence is the bifurcation of approaches that are severally specialized for qualitative and quantitative realms, the unavoidable differences between boolean-discrete and real-continuous domains. My way of circumventing this obstruction will be to extend the compass of differential geometry and the rule of logic programming to what I see as a locus of natural contact. Continuing the inquiry to naturalize intelligent systems as serious subjects of dynamic systems theory, a whole series of further questions comes up:
- What is the proper notion of state?
- How is the knowledge component or the "intellectual property" of this state to be characterized?
In accord with customary definitions, the knowledge component would need to be represented as a projection onto a knowledge subspace. In those intelligences for whom not everything is knowledge, or at least for whom not everything is known at once, that is, the great majority of those we are likely to know, there must be an alternate projection onto another subspace. Some real difficulties begin here which threaten to entangle our own resources of intelligence irretrievably.
The project before me is simply to view intelligent systems as systems, to take the ostended substantive seriously. To succeed at this it will be necessary to answer several questions:
- What is the proper notion of a state vector?
We need to analyze the state of the system into a knowledge component and a remaining or a sustaining component. This "everything else" component may be called the physical component so long as this does not prejudice the issue of a naturalistic aim, which seeks to understand all components as physis, that is, as coming under the original Greek idea of a natural process. Even the ordinary notion of a state vector, though continuing to be useful as a basis of analogy, may have to be challenged:
- Are the state elements, the moments of the system's experience, really vectors?
Consider the common frame of a Venn diagram, overlapping pools of elements arrayed on a nondescript plain, an arena of conventional measure but not routinely examined significance.
A certain figure of speech, a chiasmus, may be used to get this point across. The universe of discourse, as a system of objective realities, is something that is not yet perfectly described. And yet it can be currently described in the signs and the symbols of a discursive universe. By this is meant a formal language that is built up on terms that are taken to be simple. Yet the simplicity of the chosen terms is not an absolute property but a momentary expedient, a side-effect of their current interpretation.
2.2. Differential Geometry and Logic Programming
In this section I make a quick reconnaissance of the border areas between logic and geometry, charting a beeline for selected trouble spots. In the following sections I return to more carefully survey the grounds needed to address these problems and to begin settling this frontier.
2.2.1. Differences and Difficulties
Why have I chosen differential geometry and logic programming to try jamming together? A clue may be picked up in the quotation below. When the foundations of that ingenious duplex, AI and cybernetics, were being poured, one who was present placed these words in a cornerstone of the structure (Ashby, 1956, p. 9).
The most fundamental concept in cybernetics is that of "difference", either that two things are recognisably different or that one thing has changed with time.
A deliberate continuity of method extends from this use of difference in goal-seeking behavior to the baby steps of AI per se, namely, the use of difference-reduction methods in the form of what is variously described as means-ends analysis, goal regression, or general problem solving.
2.2.1.1. Distance and Direction
Legend tells us that the primal twins of AI, the strife-born siblings of Goal-Seeking and Hill-Climbing, began to stumble and soon came to grief on certain notorious obstacles. The typical scenario runs as follows.
At any moment in time the following question is posed:
In this problem space how ought one choose to operate
in order to forge of one's current state a new update
that has hopes of being nearer to one's engoaled fate?
But before Jack and Jill can start up the hill they will need a whole bucket of prior notions to prime the pump. There must be an idea of distance, in short, a metric function defined on pairs of states in the problem space. There must be an idea of direction, a longing toward a goal that informs the moment, that fixes a relation of oriented distances to transition operators on states. Stated in linguistic terms the directive is a factor that commands and instructs. It arranges a form of interpretation that endows disparities with a particular sense of operational meaning.
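The two prerequisites named above, a metric on pairs of states and a directive that ties distances to operators, can be made concrete in a small sketch. Everything specific here is an assumption chosen for illustration: states are bit tuples, the metric is Hamming distance, the operators are single-bit flips, and the names `hamming`, `flip_moves`, and `reduce_difference` are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

State = Tuple[int, ...]


def hamming(a: State, b: State) -> int:
    """One possible metric: the number of coordinates where a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def flip_moves(state: State) -> Iterable[State]:
    """Hypothetical operators: each move flips exactly one bit of the state."""
    for i in range(len(state)):
        yield state[:i] + (1 - state[i],) + state[i + 1:]


def reduce_difference(
    start: State,
    goal: State,
    metric: Callable[[State, State], int] = hamming,
    moves: Callable[[State], Iterable[State]] = flip_moves,
    max_steps: int = 100,
) -> List[State]:
    """Greedy difference reduction: always take the move nearest the goal."""
    path = [start]
    current = start
    for _ in range(max_steps):
        if current == goal:
            break
        best = min(moves(current), key=lambda s: metric(s, goal))
        if metric(best, goal) >= metric(current, goal):
            break  # stuck: no available operator reduces the distance
        current = best
        path.append(current)
    return path


path = reduce_difference((0, 0, 0), (1, 0, 1))
```

With this particular metric and operator set every step succeeds, which is exactly the easy case. Swapping in a different `metric` or a restricted `moves` function is enough to make the loop halt short of the goal.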
Intelligent systems do not get to prescribe the problem spaces that will be thrown their way by nature, society, and the outside world in general. These nominal problems would hardly constitute problems if this were the case. Thus it pays to consider how intelligent systems might evolve to cast ever wider nets of competence in the spaces of problems that they can handle. Striving to adapt the differential strategies of classical cybernetics and of early AI to "soaring" new heights (Newell, 1990), to widening gyres of ever more general problem spaces, there comes a moment when the predicament thickens but the atmosphere of theory and the wings of artifice do not.
2.2.1.2. Topology and Metric
Topology is the most unconstrained study of spaces, beginning as it does with spaces that have barely enough hope of geometric structure to deserve the name of spaces (Kelley, 1961). An attention to this discipline inspires caution against taking too lightly the issue of a metric. There is no longer any reason to consider the question of a metric to be a trivial one, something whose presence and character can be taken for granted. For each space that can be contemplated there arises a typical suite of questions about the existence and the uniqueness of a possible metric. Some spaces are not metrizable at all (Munkres, sec. 2-9). Those that are may have a multitude of different metrics defined on them. My own sampling of differential methods in AI, both smooth and chunky style, suggests to me that this multiplicity of possible metrics is the ingredient that conditions one of their chief sticking points, a computational viscosity that consistently sticks in the craw of computers. Unpalatable if not intractable, it will continue to gum up the works, at least until some way is found to dissolve the treacle of complexity that downs our best theories.
2.2.1.3. Relevant Measures
Differences between problem states are not always defined. And even when they are, relevant differences are not always defined in the manner that would form the most obvious choice. Relevant differences are differences that make a difference, in the well-known pragmatist phrase, bearing on the problem and the purpose at hand. The qualification of relevance adds information to the abstractly considered problem space. This extra information has import for the selection of a relevant metric, but nothing says it will ever determine a unique metric suited to a given situation. Relevant metrics are generally defined on semantic features of the problem domain, involving pragmatic equivalence classes of objects. Measures of distinction defined on syntactic features, in effect, on the language that is used to discuss the problem domain, are subject to all of the immaterial differences and the accidental collision of expression that acts to compound the computational confusion and distraction.
When the problem of finding a fitting metric develops the intensity to cross a critical threshold, a strange situation is constellated. The new level of problemhood is noticed as an afterthought but may have a primeval reality about it in its own right, its true nature. The new circle of problem states may circumscribe and underlie the initial focus of attention. Can the problem of finding a suitable metric for the original problem space be tackled by the same means of problem solving that worked on the assumption of a given metric? A reduction of that sort is possible but is hardly ever guaranteed. The problem of picking the best metric for the initial problem space may be as difficult as the problem first encountered. And ultimately there is always the risk of reaching a level of circumspection where the problem space of last resort has no metric definable.
2.2.2. Logic with a Difference
In view of the importance of differential ideas in systems theory and against the background of difficulties just surveyed, I have thought it worthwhile to carefully pursue this quest: to extend the concepts of difference and due measure to spaces that lack the obvious amenities and expedients. The limits of rational descriptive capacity for any conceivable sets of states have their ultimate horizon in logic. This is what must be resorted to when only qualitative characterizations of a problem space are initially available. Therefore I am led to ask what will be a guiding question throughout this work: What is the proper form of a differential calculus for logic?
2.3. Differential Calculus of Propositions
There are two different analogies to keep straight in the following discussion. First is the comparison of boolean vs. real types with regard to functions and vectors. These types provide mathematical representation for the qualitative vs. quantitative constituencies, respectively. Second is the three-part analogy within the qualitative realm. It relates logical propositions with mathematical functions and sets of vectors, both functions and vectors being of boolean type.
2.3.1. Propositions and Differences
As a first step, I have taken the problem of propositional calculus modeling and viewed it from the standpoint of differential geometry. In this I exploit an analogy between propositional calculus and the calculus on differential manifolds. In the qualitative arena propositions may be viewed as boolean functions. They are associated with areas or arbitrary regions of a Venn diagram, or subsets of an n-dimensional cube. Logical interpretations, in the technical sense of boolean-valued substitutions in propositional expressions, may be viewed as boolean vectors. They correspond to single cells of a Venn diagram, or points of an n-cube. Put all together, these linkages form a three-part analogy between conceptual objects in logic and the two mathematical domains of functions and sets. In its pivotal location, critical function, and isosceles construction this analogy suggests itself as the pons asinorum of the subject I can see developing. But I can't tell till I've crossed it.
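The three corners of this analogy can be laid side by side in a few lines. The sketch below is only illustrative: the proposition `p` is an arbitrary example, and the names `cube` and `region` are hypothetical, not part of any system described in this essay.

```python
from itertools import product


def cube(n):
    """All 2**n boolean vectors: the points of the n-cube, or Venn diagram cells."""
    return list(product((0, 1), repeat=n))


def p(x):
    """An example proposition viewed as a boolean function: (a and b) or (not c)."""
    a, b, c = x
    return int((a and b) or (not c))


# Each interpretation is a single boolean vector, and the proposition
# corresponds to the region of the cube on which it evaluates to 1.
region = {x for x in cube(3) if p(x) == 1}
```

Here `p` plays the role of the proposition-as-function, each element of `cube(3)` is an interpretation, and `region` is the subset of the 3-cube that the proposition picks out.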
2.3.2. Three Part Analogy
For future use it is convenient to label the various elements of the three-part analogy under discussion.
2.3.2.1. Functional Representation
Functional representation is the link that converts logical propositions into boolean functions. Its terminus is an important way station for mediating the kinds of computational realizations I hope eventually to reach. This larger endeavor is the project of declarative functional programming. It has the goal of giving logical objects a fully operational meaning in software, implementing logical concepts in a functional programming style without sacrificing any of their properly declarative nature. I have reason to hope this can be a fruitful quest, in part from the reports of more seasoned travelers along these lines, e.g. (Henderson, 1980), (Peyton Jones, 1987), (Field & Harrison, 1988), (Huet, 1990), (Turner, 1990).
The next stage in the coevolution of functional and logical programming appears to involve an orbital commute between the spheres of category theory and combinatory logic. The aim of functional programming is to implement programs as functions on typed or universal domains. This aim quite naturally casts a glance that falls within the purview of category theory (Arbib & Manes, 1975), (Barr & Wells, 1990), (Freyd & Scedrov, 1990). The intercept aim of logic programming is to specify concrete programs as typical inhabitants of niches on which the intents or indents of programmers may be brought to bear. This aim fulfills its cardinal goal only when these creatures live in the harmony of a genetic and generic environment, a domain ruled over by a nature that speaks to, and is wise to, their abstract properties. In sum, it is desirable that programs be developed under the jurisdiction of a theorem prover that "knows" about categories of types and programs.
This knowledge would consist of axioms, inferential procedures, and a running accumulation of theorems. A developmental programming system of this sort would permit designers to anticipate many features of contemplated programs before running the risk of risking to run them. One vital requirement of the ideal system must be provisioned in the most primitive elements of its construction. The ideal system plus knowledge-base plus intelligence needs to be developmental in the added sense of a developing mentality. Undistracted by all the positive features that an ideal system must embody, a great absence must also be arranged by its designers. To the extent foreseeable there must be no foreclosure of interpretive freedom. The intended programming language, the sans critical koine of the utopian realm, must place as little prior value as possible on the primitive tokens that fund its form of expression. An early implementation of a knowledge-based system for program development, using a refinement tree to search a space of correct programs, is described in (Barstow, 1979).
2.3.2.2. Characteristic Relation
Characteristic relation denotes the two-way link that relates boolean functions with subsets of their boolean universes; whether these subsets are pictured as Venn diagram regions or as n-cube subsets does not matter. Indicative conversion describes the traffic or exchange on this link between the two termini. Given a set \(A\), the function \(f_A\) which has the value 1 on \(A\) and 0 off \(A\) is commonly called the characteristic function or the indicator function of \(A\). Since every boolean function \(f\) determines a unique set \(S = S_f\) of which it is the indicator function \(f = f_S\), this forms a convertible relationship between boolean functions and sets of boolean vectors. This fact is also described as an isomorphism between the function space \((U \to B)\) and the power set \(P(U) = 2^U\) of the universe \(U\). The associated set \(S_f\) is often called the support of the function \(f\). Alternatively, it may serve as a helpful mnemonic and a useful handle on this edge of the analogy to call \(S_f\) the characteristic region, indicated set, or simply the indication of the function \(f\), and to say that the function characterizes or indicates the set where its value is positive (that is, greater than 0, and therefore equal to 1 in \(B\)).
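The convertibility between functions and sets can be checked directly on a small universe. This is a minimal sketch in which the universe is assumed to be the 2-cube and the helper names `indicator` and `support` are hypothetical labels for the two directions of the conversion.

```python
from itertools import product


def indicator(subset):
    """Set -> function: returns 1 on the subset and 0 off it."""
    members = frozenset(subset)
    return lambda x: 1 if x in members else 0


def support(f, universe):
    """Function -> set: the points of the universe where f takes the value 1."""
    return frozenset(x for x in universe if f(x) == 1)


U = list(product((0, 1), repeat=2))  # a small boolean universe

# Round trip starting from a set.
A = frozenset({(0, 1), (1, 0)})
f_A = indicator(A)
set_round_trip = support(f_A, U) == A

# Round trip starting from a function (here, conjunction).
g = lambda x: x[0] & x[1]
g_again = indicator(support(g, U))
function_round_trip = all(g(x) == g_again(x) for x in U)
```

Both round trips return what they started with, which is all the isomorphism between \((U \to B)\) and \(P(U)\) asserts at this scale.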
2.3.2.3. Indicative Conversion
The term indicative conversion and the associated usages are especially apt in light of the ordinary linguistic relationship between declarative sentences and verb forms in the indicative mood, which "represent the denoted act or state as an objective fact" (Webster's). It is not at all accidental that a fundamental capacity needed to support declarative programming is the pragmatic facilitation of this semantic relation, the ready conversion between propositions as indicator functions and properties in extension over indicated sets. The computational organism that would function declaratively must embody an interior environment with plenty of catalysts for the quick conversion of symbolically expressed functional specifications into images of their solution sets or sets of models.
2.3.3. Pragmatic Roles
The part of the analogy that carries propositions into functions combines with the characteristic relation between functions and sets to generate a multitude of different ways to describe essentially the same conceptual objects. From an information-theoretic point of view "essentially the same" means that the objects in comparison are equivalent pieces of information, parameterized or coded by the same number of bits and falling under isomorphic types. When assigning characters to individual examples of these entities, I think it helps to avoid drawing too fine a distinction between the logical, functional, and set-theoretic roles that have just been put in correspondence. Thus, I avoid usages that rigidify the pragmatic dimensions of variation within the columns below:
Proposition : Interpretation → Boolean { False , True }
Function : Vector → Binary { 0 , 1 }
Region : Cell → Content { Out , In }
Subset : Point → Content { Out , In }
Though it may be advisable not to reify the practical distinctions among these roles, this is not the same thing as failing to see them or denying their use. Obviously, these differences may vary in relative importance with the purpose at hand or context of use. However, the mere fact that a distinction can generally be made is not a sufficient argument that it has any useful bearing on a particular purpose.
2.3.3.1. Flexible Roles and Suitable Models
When giving names and habitations to things by the use of letters and types, a certain flexibility may be allowed in the roles assigned by interpretation. For example, in the form "p : U → B", the name "p" may be taken to denote a proposition or a function, indifferently, and the type U may be associated with a set of interpretations or a set of boolean vectors, correspondingly, whichever makes sense in a given context of use. One dimension that does matter is drawn through these three beads: propositions, interpretations, and values. On the alternate line it is produced by the distinctions among collections, individuals, and values.
One relation that is of telling importance is the relation of interpretations to the value they give a proposition. In its full sense and general case this should be recognized as a three-place relation, involving all three types of entities (propositions, interpretations, and values) inextricably. However, for many applications the substance of the information in the three-place relation is conveyed well enough by the data of its bounding or derivative two-place relations.
The interpretations that render a proposition true, that is, the substitutions for which the proposition evaluates to true, are said to satisfy the proposition and to be its models. With a doubly modulated sense that is too apt to be purely accidental, the model set is the "content" of the proposition's formal expression (Eulenberg, 1986). In functional terms the models of a proposition \(p\) are the pre-images of truth under the function \(p\). Collectively, they form the set of vectors in \(p^{-1}(1)\). In another usage the set of models is called the fiber of truth, in other words, the equivalence class \([1]_p\) of the value 1 under the mapping \(p\).
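One way to picture the three-place relation and the model sets derived from it is to tabulate it outright. The sketch below is an illustration under assumptions: the two propositions, their names, and the helper `models` are hypothetical examples.

```python
from itertools import product

U = list(product((0, 1), repeat=2))  # interpretations of two variables

# Hypothetical example propositions, keyed by name.
props = {
    "a_and_b": lambda x: x[0] & x[1],
    "a_implies_b": lambda x: (1 - x[0]) | x[1],
}

# The full three-place relation: (proposition, interpretation, value).
triples = {(name, x, p(x)) for name, p in props.items() for x in U}


def models(name):
    """A derived two-place relation: the fiber of the value 1 under the proposition."""
    return {x for (n, x, v) in triples if n == name and v == 1}
```

The function `models` discards the value column by fixing it at 1, which is the sense in which a two-place relation can carry most of the working information of the three-place one.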
2.3.3.2. Functional Pragmatism
The project of functional programming itself fits within a broader philosophical mission, the pragmatism of C.S. Peirce and John Dewey, which seeks to clarify abstract concepts and occult properties by translating them into operational terms; see (Peirce, Collected Papers) and (Dewey, 1986). These thinkers had clear understandings of the relation between information and control, giving early accounts of inquiry processes and problem-solving, intelligence and goal-seeking that would sound quite familiar to cyberneticians and systems theorists. Similar ideas are reflected in current AI work, especially by proponents of means-ends analysis and difference reduction methods (Newell, 1990), (Winston, ch. 5).
Themes and variations from the pragmatists' full scale treatment of inquiry are echoed by investigators of inductive reasoning (Holland, et al., 1986), abductive or diagnostic reasoning (Charniak & McDermott, ch. 8), (Peng & Reggia, 1990), analogical reasoning and instrumental learning (Vosniadou & Ortony, chs. 1, 4, 8, 17), narrative explanation and language comprehension (Charniak & McDermott, ch. 8), and by computational modelers of scientific discovery and innovation (Shrager & Langley, 1990), (Thagard, 1992).
Dewey aphorized intelligent thinking as "response to the doubtful as such", the so-minded creature being marked by a faculty that "reacts to things as problematic" (Dewey, 1984, p. 179). He was fully aware that uncertainty is the inverse side of information and knew that his portrayal embroiled intelligent agents in both the felicities and the liabilities of responding to information-theoretic properties as all too solid realities. Dewey desired to naturalize the concept of intelligence. To the covert activity and shiftless agency of intelligence he sought to supply regulative principles and a natural basis, engendering behavior according to laws that might yet be discovered.
The realization that the observation necessary to knowledge enters into the natural object known cancels this separation of knowing and doing. It makes possible and it demands a theory in which knowing and doing are intimately connected with each other. Hence, as we have said, it domesticates the exercise of intelligence within nature. (Dewey, 1984, p. 171).
This kind of reconnection between theoretical knowledge and interactive experience is one of the features that must be embodied in software for exploring complex systems. In being confronted with such intricate dynamics there is simply not available to finite creatures the kind of absolute viewpoint that could place them totally outside the action.
The intelligent activity of man is not something brought to bear upon nature from without; it is nature realizing its own potentialities in behalf of a fuller and richer issue of events. (Dewey, 1984, p. 171).
Of all the complex systems that attract human interest, the human mind's own doings, knowing or not, must eventually form a trajectory that ensnares itself in questions and wonderings: Where will it be off to next? What is it apt to do next? How often will it recur to the various things it does? The mind's orbit traced in these questions has a compelling power in its own right to generate wonder.
2.3.4. Abstraction, Behavior, Consequence
There are many good reasons to preserve the logical features and constraints attaching to computational objects, i.e. programs and data structures. Chief among these reasons are: axiomatic abstraction, behavioral coordination, and consequential definition.
2.3.4.1. Axiomatic Abstraction
The capacity for abstraction would permit an expert system for dynamic simulation to rise above the immediate flux of the process simulated. Eventually, this could enable the software intelligence to adduce, reason about, and test hypotheses about generic properties of the system under study. Even short of this autonomy, the resources of abstract representation could at least provide a medium for transmuting embedded simulations into axioms and theories. For the systems prospector such an interface, even slightly reflective, can heighten the chances of panning some nugget of theory and lifting some glimmer of insight from the running stream of simulations.
2.3.4.2. Behavioral Coordination
The guidelines of pragmatism are remarkably suited as regulative principles for synthesizing AI and systems theory, where it is required to clarify the occult property of intelligence in terms of dynamic activity and behavior. This involves realizing abstract faculties, like momentum and intelligence, as hypotheses about the organization of trajectories through manifolds of observable features. In these post-revolutionary times, cognitively and chaotically speaking, it is probably not necessary to be reminded that this effort contains no prior claim of reductionism. The pragmatic maxim can no more predetermine the mind to be explained by simple reflexes than it can constrain nature to operate by linear dynamics. If these reductions are approximately true of particular situations, then they have to be discovered on site and proven to fit, not imposed with eyes closed.
2.3.4.3. Consequential Definition
The ability to deduce consequences of specified/acquired features and generic/imposed constraints would support the ultimate prospects toward unification of several stylistic trends in programming. Among these are the employment of class hierarchies and inheritance schemes in frame-system and semantic network knowledge bases (Winston, ch. 8), object-oriented programming methodologies (Shriver & Wegner, 1987), and constraint based programming (Van Hentenryck, 1989). The capacity for deduction includes as a special case the ability to check logical consistency of declarations. This has applications to compilation type-checking (Peyton Jones, 1987) and deductive data-base consistency (Minker, 1988).
2.3.5. Refrain
The analogy between propositional calculus and differential geometry is extended as far as possible by continuing to cast propositions and interpretations in roles similar to those exercised by real-valued functions and real-coordinate vectors in the quantitative world. In a number of reaches tentative trials of the analogy will render fit correspondences. Beyond these points it is critically important to examine those stretches where the analogy breaks, and there to consider the actual temperament and proper treatment of the qualitative situation in its own right.
A text that has been useful to me in relating classical and modern treatments of differential geometry is (Spivak, 1979). The standard for logic programming via general resolution theorem proving was set by (Chang & Lee, 1973). A more recent reference is (Lloyd, 1987), which concentrates on Prolog type programming in the Horn clause subset of logic. My own incursions through predicate calculus theorem proving and my attempts to size up the computational complexity invested there have led me to the following opinions.
2.4. Logic Programming
Militating against the charge of declarative programmers to achieve their goals through logic, a surprising amount of computational resistance seems to reside at the level of purely sentential or propositional operations. In investigating this situation I have come to believe that progress in logic programming will be severely impeded unless these factors of computational complexity at the level of propositional calculus are addressed and either resolved or alleviated.
At my current state of understanding I can propose nothing more complicated than to work toward a position of increased knowledge about the practical logistics of this problem domain. A reasonable approach is to explore the terrain at this simplest level, using the advantages afforded by a propositional calculus interpreter and relevant utilities in software. A similar strategy of starting from propositional logic and working up in stages to predicate logic is exploited by (Maier & Warren, 1988), in this case building a Prolog interpreter by successive refinement.
2.4.1. Differential Aspects
The fact that a difference calculus can be developed for boolean functions is well-known (Kohavi, sec. 8-4), (Fujiwara, 1985) and was probably familiar to Boole, who was a master of difference equations before he turned to logic. And of course there is the strange but true story of how the Turin machines of the 1840's prefigured the Turing machines of the 1940's (Menabrea, pp. 225-297). At the very outset of general-purpose, mechanized computing we find that the motive power driving the Analytical Engine of Babbage, the kernel of an idea behind all his wheels, was exactly his notion that difference operations, suitably trained, can serve as universal joints for any conceivable computation (Morrison & Morrison, 1961), (Melzak, ch. 4).
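For readers who have not met it, the boolean difference of switching theory can be sketched in a few lines. The definition used here, the exclusive disjunction of the two values a function takes when one variable is set to 0 and to 1, is the standard one from that literature; the function name `boolean_difference` and the examples are assumptions made for illustration.

```python
from itertools import product


def boolean_difference(f, i):
    """Difference of f with respect to variable i: f(x_i = 0) xor f(x_i = 1)."""
    def df(x):
        x0 = x[:i] + (0,) + x[i + 1:]
        x1 = x[:i] + (1,) + x[i + 1:]
        return f(x0) ^ f(x1)
    return df


U = list(product((0, 1), repeat=2))

conj = lambda x: x[0] & x[1]   # conjunction
xor = lambda x: x[0] ^ x[1]    # exclusive disjunction

d_conj = boolean_difference(conj, 0)
d_xor = boolean_difference(xor, 0)

# Changing the first variable changes a conjunction exactly when the second is 1.
conj_depends_on_second = all(d_conj(x) == x[1] for x in U)
# Changing the first variable always changes an exclusive disjunction.
xor_always_changes = all(d_xor(x) == 1 for x in U)
```

The result of the operator is again a boolean function on the same universe, marking the points at which a change in the chosen variable makes a difference to the value.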
2.4.2. Algebraic Aspects
Finally, there is a body of mathematical work that investigates algebraic and differential geometry over finite fields. This usually takes place at such high levels of abstraction that the field of two elements is just another special case. In this work the principal focus is on the field operations of sum (\(+\)) and product ( \(\cdot\) ), which correspond to the logical operations of exclusive disjunction (xor, neq) and conjunction (and), respectively. The stress laid on these special operations creates a covert bias in the algebraic field. Unfortunately for the purposes of logic, the totality of boolean operations is given short shrift on the scaffold affecting this algebraic slant. For example, there are sixteen operations just at the level of binary connectives, not to mention the exploding population of k-ary operations, all of which deserve in some sense to be treated as equal citizens of the logical realm.
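The count of sixteen, and the place of the two field operations among them, can be verified by brute enumeration. This is a small sketch; the helper name `connectives` and the dictionary representation of truth tables are assumptions made for illustration.

```python
from itertools import product

INPUTS = list(product((0, 1), repeat=2))


def connectives(k=2):
    """Every k-ary boolean operation, each given as a truth table (dict)."""
    inputs = list(product((0, 1), repeat=k))
    return [
        dict(zip(inputs, outputs))
        for outputs in product((0, 1), repeat=len(inputs))
    ]


binary = connectives(2)

# The two field operations of the two-element field, as truth tables.
gf2_sum = {(a, b): (a + b) % 2 for a, b in INPUTS}      # exclusive disjunction
gf2_product = {(a, b): (a * b) % 2 for a, b in INPUTS}  # conjunction

counts = {k: len(connectives(k)) for k in (1, 2, 3)}
```

The counts grow as 2 to the power 2^k, so the field operations are two entries in a table of sixteen at k = 2 and the table already has 256 entries at k = 3.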
Moreover, from an algebraic perspective the dyadic or boolean case exhibits several features peculiar to itself. Binary addition (\(+\)) and subtraction (\(-\)) amount to the same operation, making each element its own additive inverse. This circumstance in turn exacts a constant vigilance to avert the verbal confusion between algebraic negatives and logical negations. The property of being invertible under products ( \(\cdot\) ) is neither a majority nor a typical possession, since only the element 1 has a multiplicative inverse, namely itself. On account of these facts the strange case of the two element field is often set aside, or set down as a "degenerate" situation in algebraic studies. Obviously, in turning to take it up from a differential standpoint, any domain that confounds "plus" and "minus" and "not equal to" is going to play havoc with our automatic intuitions about difference operators, linear approximations, inequalities and thresholds, and many other critical topics.
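The peculiarities of the two-element field noted above can be checked by brute force. The following is a minimal sketch, assuming the field is modeled by the integers 0 and 1 with arithmetic mod 2; the function names `add`, `sub`, and `mul` are illustrative and not taken from any implementation described in this essay.

```python
from itertools import product

B = (0, 1)  # the two-element field, modeled as integers mod 2

def add(x, y):
    """Field sum in B."""
    return (x + y) % 2

def sub(x, y):
    """Field difference in B."""
    return (x - y) % 2

def mul(x, y):
    """Field product in B."""
    return (x * y) % 2

# Sum, difference, exclusive disjunction, and "not equal to" all coincide,
# while the product coincides with conjunction.
for x, y in product(B, repeat=2):
    assert add(x, y) == sub(x, y) == (x ^ y) == int(x != y)
    assert mul(x, y) == (x & y)

# Each element is its own additive inverse.
own_inverse = all(add(x, x) == 0 for x in B)

# Only the element 1 has a multiplicative inverse.
units = [x for x in B if any(mul(x, y) == 1 for y in B)]
```

The coincidence of `add`, `sub`, and `!=` is exactly the confounding of "plus", "minus", and "not equal to" that complicates the usual intuitions about difference operators.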
2.5. Differential Geometry
One of the difficulties I've had finding guidance toward the proper form of a differential calculus for logic has been the variety of ways that the classical subjects of real analysis and differential geometry have been generalized. As a first cut, two broad philosophies may be discerned, epitomized by their treatment of the differential \(df\) of a function \(f : X \to R\). Everyone begins with the idea that \(df\) ought to be a locally linear approximation \(df_u(v)\) or \(df(u, v)\) to the difference function \(Df_u(v) = Df(u, v) = f(u + v) - f(u)\). In this conception it is understood that "local" means in the vicinity of the point \(u\) and that "linear" is meant with respect to the variable \(v\).
2.5.1. Local Stress and Linear Trend
But one school of thought stresses the local aspect, to the extent of seeking constructions that can be meaningful on global scales in spite of coordinate systems that make sense solely on local scales, being allowed to vary from point to point, for example, (Arnold, 1989). The other trend of thinking accents the linear feature, looking at linear maps in the light of their character as representations or homomorphisms (Loomis & Sternberg, 1968). Extenuations of this line of thinking go to the point of casting linear functions under the headings of the vastly more general morphisms and abstract arrows of category theory (Manes & Arbib, 1986), (MacLane, 1971).
2.5.1.1. Analytic View
The first group, more analytic, strives to get intrinsic definitions of everything, defining tangent vectors primarily as equivalence classes of curves through points of phase space. This posture is conditioned to the spare frame of physical theory and is constrained by the ready equation of physics with ante-metaphysics. In short they regard physics as a practical study that is prior to any a priori. Physics should exert itself to save the phenomena and forget the rest. The dynamic manifold is the realm of phenomena, the locus of all knowable reality and the focus of all actual knowledge. Beyond this, even attributes like velocity and momentum are epiphenomenal, derivative scores attached to a system's dynamic point from measurements made at other points.
This incurs an empire of further systems of ranking and outranking, teams and leagues and legions of commissioners, all to compare and umpire these ratings. When these circumspect systems are not sufficiently circumscribed to converge on a fixed point or a limiting universal system, it seems as though chaos has broken out. The faith of this sect that the world is a fair game for observation and intelligence seems dissipated by divergences of this sort. It wrecks their hope of order in phenomena, dooms what they deem a fit domain, a single rule of order that commands the manifold to appear as it does. To share the universe with several realities, to countenance a real diversity? It ruins the very idea they most favor of a cosmos, one that favors them.
2.5.1.2. Algebraic View
The second group, more algebraic, accepts the comforts of an embedding vector space with a less severe attitude, one that belays and belies the species of anxiety that worries the other group. They do not show the same phenomenal anguish about the uncertain multiplicity or empty void of outer spaces. Given this trust in something outside of phenomena, they permit themselves on principle the luxury of relating differential concepts to operators with linear and derivation properties. This tendency, ranging from pious optimism to animistic hedonism in its mathematical persuasions, demands less agnosticism about the reality of exterior constructs. Its pragmatic hope allows room for the imagination of supervening prospects, without demanding that these promontory contexts be uniquely placed or set in concrete.
2.5.1.3. Compromise
In attempting to negotiate between these two philosophies, I have arrived at the following compromise. On the one hand, the circumstance that provides a natural context for a manifold of observable action does not automatically exclude all possibility of other contexts being equally natural. On the other hand, it may happen that a surface is so bent in cusps and knots, or otherwise so intrinsically formed, that it places mathematical constraints on the class of spaces it can possibly inhabit.
Thus a manifold can embody information that bears on the notion of a larger reality. By dint of this interpretation the form of the manifold becomes the symbol of its implicated unity. But what I think I can fathom seems patent enough, that the chances of these two alternatives, plurality and singularity, together make a bet that is a toss up and open to test with each new shape of manifold encountered. It is likely that the outcome, if at all decidable, falls in accord with no general law but is subject to proof on a case by case basis.
2.5.2. Prospects for a Differential Logic
Pragmatically speaking, the proper form of a differential logic is likely to be regulated by the purposes to which it is intended to be put, or determined by the uses to which it is actually, eventually, and suitably put. With my current level of uncertainty about what will eventually work out, I have to be guided by my general intention of using this logic to describe the dynamics of inquiry and intelligence in systematic terms. For this purpose it seems that many different types of fiber bundles or systems of spaces at points will have to be contemplated.
Although the limited framework of propositional calculus seems to rule out this higher level of generality, the exigencies of computation on symbolic expressions have the effect of bringing in this level of arbitration by another route. Even though we use the same alphabet for the joint basis of coordinates and differentials at each point of the manifold, one of our intended applications is to the states of interpreting systems, and there is nothing a priori to determine such a program to interpret these symbols in the same way at every moment. Thus, the arbitrariness of local reference frames that concerns us in physical dynamics, that makes the arbitrage or negotiation of transition maps between charts (qua markets) such a profitable enterprise, raises its head again in computational dynamics as a relativity of interpretation to the actual state of a running interpretive program.
2.6. Reprise
In summing up this sample of literature bearing on my present aims, there is much to suggest a deep relationship between the topics of systems, differentials, logic, and computing, especially when considered in the accidental but undeniable stream of historical events. I have not come across any strand of inquiry that plainly, explicitly, and completely weaves differential geometry and propositional logic in a computational context. But I hope to see one day a scintilla of a program that can weld them together in a logically declarative, functionally dynamic platform for intelligent computing.
3. Instrumental Focus
3.1. Propositional Calculus
A symbolic calculus is needed to assist our reasoning and computation in the realm of propositions. With an eye toward efficiency of computing and ease of human use, while preserving both functional and declarative properties of propositions, I have implemented an interpreter and assorted utilities for one such calculus. The original form of this particular calculus goes back to the logician C.S. Peirce, who is my personal favorite candidate for the grand-uncle of AI. Among other things, Peirce discovered the logical importance of NAND/NNOR operators (CP 4.12 ff, 4.264 f), (NE 4, ch. 5), inspired early ideas about logic machines (Peirce, 1883), is credited with "the first known effort to apply Boolean algebra to the design of switching circuits" (M. Gardner, p. 116 n), and even speculated on the nature of abstract interpreters and other "Quasi-Minds" (Peirce, CP 4.536, 4.550 ff).
Thought is not necessarily connected with a brain. It appears in the work of bees, of crystals, and throughout the purely physical world; and one can no more deny that it is really there, than that the colors, the shapes, etc., of objects are really there. (CP 4.551).
One could hardly invent a better anthem for the work being done today in the AI/systems hybrid areas of cellular automata (Burks, 1970), (Ulam, ch. 12), (Nicolis & Prigogine, 1989), emergent computation (Forrest, 1991), and "society of mind" theories (Minsky, 1986). I hope it will emerge that these workers achieve the same grade of well-honed insight regarding the mind's apical functions that Peirce was able to inspire, having once acquired a taste for it in the higher combines of logic's hive.
More than any logician, before or since, Peirce appreciated the importance of the fact that the physical properties of signs, from elementary signals to symbolic representations of the most general kind, involve practical constraints on their processing transformations. These pragmatic factors have a real bearing on the actualities of logic and interpretation, as executed in the performance of physically implemented minds, mental agencies, or quasi-interpreters.
Logical representation and interpretation, as physical and recursive processes, have boundary conditions that are especially significant. Consequently, Peirce could think it worth the trouble to ask: What would have to be the logical meaning of the blank sheet of paper on which logical expressions are intended to be written? His speculations on such questions show his sensitivity to the issue of how the medium constrains and thus informs the message.
It can be imagined how mindless such inquiries must have seemed to Peirce's contemporaries, and it is possible to read the remarks of later commentators who should have known better. But these are exactly the kinds of practical questions that have to be addressed in implementing formal languages with recursive syntax and in defining semantic valuations on such domains in the form of computational interpreters.
The boundary is the region in computational space where initial, adaptive, and interactive parameters are determined. It can extend from initial conditions and fixed code to the current interface and forward in time. The shape of this boundary and the values attached to it are critical questions for the definition of a semantic function. A reasonably useful semantic function has to be almost wholly determined by the values it takes within a finite neighborhood of its boundary.
But natural objects, and there are beginning to be hints that natural languages must be counted among them, often take the form of fractals (Cherry, 1966), (Mandelbrot, 1977, 1983), (Rietman, 1989), shapes, regions, and topographies that are almost all boundary. This area of inquiry is still in flux. In the realm of natural language processing, where AI makes contact with the concerns of linguistics, one school of thought responds to these questions under the rubric of "principles and parameters" and carries on a vigorous dialogue about the distribution of labor between the core and the periphery of natural languages and their associated learning or development processes (Chomsky, 1965, 1981, 1986). In his emphasis on the physicality of signs and the fact that their processes would have to be subsumed under natural laws, Peirce anticipated another cornerstone of AI, the "physical symbol system hypothesis" of Newell and Simon.
All of these issues that occupied Peirce would be encountered again later in the 20th century when computer scientists, linguists, communication engineers, media theorists, and others would be forced to deal with them in their daily practice and would perforce discover many workable answers. These are the topics that have come to be recognized as the reality of information and uncertainty, the physicality of symbol systems, the independent dimension of syntax, the complexity of semantics and evaluation, the pragmatic metes and bounds of interactive communication and interpretive control. All in all, as acutely discovered in AI systems engineering, these factors sum up to the general resistance of matter to being impressed with our minds.
3.1.1. Peirce's Existential Graphs
Peirce devised a graphical notation for predicate calculus, or first order logic, that he called the system of "Existential Graphs" (EG). In its emphasis on relations and its graphic depiction of their logic, EG anticipated many features of present-day semantic networks and conceptual graphs. Not only does it remain logically more exact than most of these later formulations, but EG had transformation rules that rendered it a literal calculus, with a manifest power for inferring latent facts. An explicit use of Peirce's EG for knowledge base representation appears in (Sowa, 1984). A software package that uses EG to teach basic logic is documented in (Ketner, 1990). The calculus presented below is related in its form and interpretation to the propositional part of Peirce's EG. A similar calculus, but favoring an alternate interpretation, was developed in (Spencer-Brown, 1969).
3.1.1.1. Blank and Bound Connectives
Given an alphabet A = {a1, …, an} and a universe U = ⟨A⟩, we write expressions for the propositions p : U → B upon the following basis. The ai : U → B are interpreted as coordinate functions. For each natural number k we have two k-ary operations, called the blank or unmarked connective and the bound or marked connective.
The blank connectives are written as concatenations of k expressions and interpreted as k-ary conjunctions. Thus,
e1 e2 e3 means e1 and e2 and e3 .
The bound connectives are written as lists of k expressions (e1, …, ek), where the parentheses and commas are considered to be parts of the connective notation. In text presentations the parentheses will be superscripted, as (e1, …, ek), to avoid confusion with other uses. The bound connective is interpreted to mean that just one of the k listed expressions is false. That is, (e1, …, ek) is true if and only if exactly one of the expressions e1, …, ek is false. In particular, for k = 1 and 2, we have:
- (e1) means not e1 .
- (e1 , e2) means e1 xor e2 , e1 + e2 , or e1 neq e2 , e1 ≠ e2 .
A remaining sample of typical expressions will finish illustrating the use of this calculus.
- (e1 (e2)) means e1 ⇒ e2 , i.e. if e1 then e2 , i.e. not e1 without e2 .
- ((e1)(e2)(e3)) means e1 or e2 or e3 .
- (e1 , (e2 , e3)) means e1 + e2 + e3 .
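The readings given for the blank and bound connectives can be verified by exhaustive evaluation. The following is a minimal sketch in Python, not the interpreter described in this essay; the function names `blank` and `bound` are assumptions chosen for illustration, with truth values modeled as Python booleans.

```python
from itertools import product

def blank(*es):
    """Blank (unmarked) connective: concatenation, read as k-ary conjunction."""
    return all(es)

def bound(*es):
    """Bound (marked) connective: true iff exactly one argument is false."""
    return sum(1 for e in es if not e) == 1

def check_samples():
    """Check each sample expression against its stated reading."""
    for e1, e2, e3 in product((False, True), repeat=3):
        # (e1) means not e1
        assert bound(e1) == (not e1)
        # (e1 , e2) means e1 xor e2
        assert bound(e1, e2) == (e1 != e2)
        # (e1 (e2)) means e1 => e2
        assert bound(blank(e1, bound(e2))) == ((not e1) or e2)
        # ((e1)(e2)(e3)) means e1 or e2 or e3
        assert bound(blank(bound(e1), bound(e2), bound(e3))) == (e1 or e2 or e3)
        # (e1 , (e2 , e3)) means e1 + e2 + e3, the sum taken mod 2
        assert bound(e1, bound(e2, e3)) == ((e1 + e2 + e3) % 2 == 1)
    return True
```

Calling `check_samples()` walks through all eight assignments to e1, e2, e3 and raises an assertion error if any reading fails.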
3.1.1.2. Partitions : Genus and Species
Especially useful is the facility this notation provides for expressing partition constraints, or relations of mutual exclusion and exhaustion among logical features. For example,
- ((p1),(p2),(p3))
says that the universe is partitioned among the three properties p1, p2, p3. Finally,
- (g , (s1),(s2),(s3))
says that the genus g is partitioned into the three species s1, s2, s3. Its venn diagram looks like a pie chart. This style of expression is also useful in representing the behavior of devices, for example: finite state machines, which must occupy exactly one state at a time; and Turing machines, whose tape head must engage just one tape cell at a time.
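To see what the two partition expressions assert, one can list their satisfying interpretations. This sketch repeats an assumed `bound` helper (true when exactly one argument is false) so that it stands on its own; the names `universe_partition` and `genus_partition` are hypothetical labels for the two expressions above.

```python
from itertools import product

def bound(*es):
    """Bound connective: true iff exactly one argument is false."""
    return sum(1 for e in es if not e) == 1

def universe_partition(p1, p2, p3):
    """((p1),(p2),(p3)): exactly one of p1, p2, p3 holds."""
    return bound(bound(p1), bound(p2), bound(p3))

def genus_partition(g, s1, s2, s3):
    """(g , (s1),(s2),(s3)): outside g no species holds, inside g exactly one does."""
    return bound(g, bound(s1), bound(s2), bound(s3))

values = (False, True)

# Satisfying interpretations of each expression.
models_u = [ps for ps in product(values, repeat=3) if universe_partition(*ps)]
models_g = [gs for gs in product(values, repeat=4) if genus_partition(*gs)]
```

Here `models_u` contains the three assignments in which a single property holds, and `models_g` contains four: one cell outside the genus with no species, plus one cell for each species inside it.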
3.1.1.3. Vacuous Connectives and Constant Values
As a consistent downward extension, the nullary (or 0-ary) connectives can be identified with logical constants. That is, blank expressions " " are taken for the value true ("silence assents"), and empty bounds "( )" are taken for the value false. By composing operations, negation and binary conjunction are enough in themselves to obtain all the other boolean functions, but the use of these k-ary connectives lends itself to a flexible and powerful representation as graph-theoretical data-structures in the computer.
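The nullary conventions fall out of the general definitions without special cases, as the following sketch suggests. It again assumes illustrative `blank` and `bound` helpers, and it builds disjunction and exclusive disjunction using only negation (the 1-ary bound) and conjunction (the blank), as one instance of the sufficiency claim above.

```python
from itertools import product

def blank(*es):
    """Blank connective: k-ary conjunction."""
    return all(es)

def bound(*es):
    """Bound connective: true iff exactly one argument is false."""
    return sum(1 for e in es if not e) == 1

TRUE = blank()    # the empty conjunction holds vacuously
FALSE = bound()   # with no arguments, none can be the single false one

def disjunction(x, y):
    """((x)(y)), using only negation and conjunction."""
    return bound(blank(bound(x), bound(y)))

def exclusive_or(x, y):
    """((x (y)) ((x) y)), using only negation and conjunction."""
    return bound(blank(bound(blank(x, bound(y))), bound(blank(bound(x), y))))

pairs = list(product((False, True), repeat=2))
```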
3.1.2. Implementation Details
The interpreter that has been implemented for EG employs advanced data-structures for the representation of both lexical terms and logical expressions.
On the syntactic side of its operation this interpreter literally incorporates a sequential learning algorithm that unifies individual terms and term sequences on an interactive basis. In effect the program's inductive module creates a statistical model of a two-level formal language, that is, the sets of words and phrases that have occurred in its interaction with the user.
On the logical side the propositional modeler uses data-structures that are related to two classes of treelike graphs, known as "cacti" (Harary & Palmer, p. 71) and "cone graphs" (Hoffmann, p. 72). The underlying graphs chosen for these data-structures were selected partly for the rich character of their automorphism groups. Suitable exploitation of permutation group properties can serve to reduce the combinatorial complexity of many routine operations, such as sorting and searching.
Next, the correspondence between propositional expressions and graphical data structures needs to be described, but a few remarks on nomenclature are required first.
Conforming to several dialects of graph theory, the description will list a variety of terminologies. However, the only usage difference of any real importance is this: "Michigan" graph theorists use "labels" once and only once on each point of a labeled graph, whereas others use labels more freely. If tokens of the same feature can attach to many points or none at all, then MI graph theorists (Harary, 1969) and certain game theorists (Conway, 1976) call these attributes "colors".
The game theorists make a further distinction in the way that "spots" (places or points) can be colored. They see a difference between spots that are "painted", excluding other paints, and spots that are "tinted", permitting other tints, on the same spot (Conway, p. 91).
Mathematically, all this verbiage is just a way of talking about two topics: (1) functions and relations from structured objects to sets of features, and (2) equivalence relations (for example, orbits under symmetry group actions) on these structured objects. But the visual metaphors seem to assist thought, most of the time, and are in any case a part of the popular iconography.
3.1.2.1. Painted Cacti
Viewing a propositional expression in EG as a "cactus", the bound connectives ( , , , ) constitute its "lobes" (edges or lines) and the positive literals ai are tantamount to "colors" (paints or tints) on its "points" (vertices or nodes). One of the chief tasks of processing logical expressions is their systematic clarification. This involves transforming arbitrary expressions into logically equivalent expressions whose latent meaning is manifest, their "canonical" or "normal" forms. The normalization process implemented for EG, in the graphical language just given, takes an arbitrary tinted cactus and turns it into a special sort of painted cactus.
3.1.2.2. Concept and Purpose
What good is this? What conceivable purpose is there for these inductive and deductive capacities, that enable the personal computer to learn formal languages and to turn propositional calculi into painted cacti? By developing these abilities for inductive learning and accurate inference, aided by a facility for integrating their alternate "takes" on the world, I hope that AI software will gain a new savvy, one that helps it be both friendly to people and faithful to truth, both politic and correct. To do this demands a form of artificial intelligence that can do both, without the kinds of trade-off that make it a travesty to both.
3.1.3. Applications
The current implementation of this calculus is efficient enough to have played a meaningful part in realistically complex investigations, both practical and theoretical. For example, it has been used in qualitative research to represent observational protocols of event sequences as propositional data bases. It has also been used to analyze the behavior of finite state machines and space-time limited Turing machines, exploiting a coding that is similar to but more succinct than the one used in Cook's theorem (on the NP-completeness of propositional calculus satisfiability). See (Garey & Johnson, 1979) and (Wilf, 1986).
3.2. Differential Extensions of Propositional Calculi
In order to define a differential extension of a propositional universe of discourse U, the alphabet A of U’s defining features must be extended to include a set of symbols for differential features, or elementary "changes" in the universe of discourse. Intuitively, these symbols may be construed as denoting primitive features of change, or propositions about how things or points in U change with respect to the features noted in the original alphabet A. Hence, let dA = {da1, …, dan} and dU = ⟨dA⟩ = ⟨da1, …, dan⟩. As before, we may express dU concretely as a product of distinct factors:
- dU = ×i dAi = dA1 × … × dAn.
Here, dAi is an alphabet of two symbols, dAi = {(dai), dai}, where (dai) is a symbol with the logical value of "not dai". Each dAi has the type B, under the ordered correspondence {(dai), dai} = {0, 1}. However, clarity is often served by acknowledging this differential usage with a distinct type D, as follows:
- D = {(dx), dx} = {same, different} = {stay, change}.
Finally, let U′ = U × dU = ⟨A′⟩ = ⟨A + dA⟩ = ⟨a1, …, an, da1, …, dan⟩, giving U′ the type Bn × Dn.
All propositions of U have natural (and usually tacit) extensions to U′, with p : U = Bn → B becoming p : U′ = Bn × Dn → B. It is convenient to approach the study of the differential extension U′ from a globally democratic perspective, viewing all the differential propositions p : U′ → B as equal citizens. Devolving from this standpoint, the various grades of differential forms are then defined by their placement in U′ with regard to the basis A′. Extending previous usage, we say that p is singular in U′ if it has just one satisfying interpretation in U′. A proposition p : U′ → B is called singular in U if its projection to U is singular in U, that is, if all its interpretations in U′ share the same cell in U.
Using the isomorphism between function spaces:
- (Bn × Dn → B) \(\cong\) (Bn → (Dn → B)),
each p : U′ → B has a unique decomposition into a p′ : Bn → (Dn → B) and a set of p″ : Dn → B such that:
- p : Bn × Dn → B \(\cong\) p′ : Bn → p″ : (Dn → B).
For the sake of the visual intuition we may imagine that each cell x in the diagram of U has springing from it the diagram of the proposition p′(x) = p″ in dU.
From a theoretical perspective the issue of this difference (between the extended function p and its decomposition p′\(\cdot\)p″) may seem trifling and in view of the isomorphism largely in the eye of the beholder. But we are treading the ground between formal parameters and actual variables, and experience in computation has taught us that this distinction is not so trivial to treat properly in the "i" of a concrete interpreter. With this level of concern and area of application in mind the account so far is still insufficiently clear.
To attempt a clarification let us now make one more pass. Let x and y be variables ranging over U and dU, respectively. Then each p : U′ = U × dU → B has a unique decomposition into a p′ : U → (dU → B) and a set of p(x)′ : dU → B such that
- p(x, y) = p′(x)(y) = p(x)′(y).
The "x" in p(x)′(y) would ordinarily be subscripted as a parameter in the form px′ , but this does not explain the difference between a parameter and a variable. Here the difference is marked by the position of the prime (′), which serves as a kind of "run-time marker". The prime locates the point of inflexion in a piece of notation that is the boundary between local and global responsibilities of interpretation. It tells the intended division between individual identity of functions (a name and a local habitation) and "socially" defined roles (signs falling to the duty of a global interpreter). In the phrase p′(x) the p′ names the function while the parenthetical (x) is part of the function notation, to be understood by a global interpreter. In p(x)′ the parenthetical (x) figures into the name of an individual function, having a local significance but only when x is specified.
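The decomposition p(x, y) = p′(x)(y) is an instance of what programmers call currying, and a small sketch may make the parameter/variable distinction more tangible. The example proposition `p` below is purely hypothetical (it reads "a1 changes while a2 holds"), and the names `curry` and `uncurry` are illustrative, not part of the implementation discussed in this essay.

```python
from itertools import product

n = 2
U = list(product((0, 1), repeat=n))    # cells x = (a1, a2)
dU = list(product((0, 1), repeat=n))   # differential cells y = (da1, da2)

def p(x, y):
    """Hypothetical proposition on U x dU: da1 and a2."""
    return y[0] & x[1]

def curry(p):
    """Turn p : U x dU -> B into p' : U -> (dU -> B)."""
    return lambda x: (lambda y: p(x, y))

def uncurry(p_prime):
    """Turn p' : U -> (dU -> B) back into a function on U x dU."""
    return lambda x, y: p_prime(x)(y)

p_prime = curry(p)
roundtrip = uncurry(p_prime)

# p(x, y) = p'(x)(y) at every point of the extended universe.
agree = all(p(x, y) == p_prime(x)(y) == roundtrip(x, y) for x in U for y in dU)

# Fixing x yields an individual function on dU, the p(x)' of the text.
p_at_01 = p_prime((0, 1))
```

In this rendering `p_prime` is a single function known globally, while `p_at_01` exists only once a particular x has been supplied, which is one way to read the role of the prime as a "run-time marker".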
I am not yet happy with my understanding of these issues. The most general form of the question at hand appears to be bound up with the need to achieve mechanisms of functional abstraction and application for propositions. It seems further that implementing moderate and practical forms of this functionality would have to be a major goal of the research projected here. On the one hand Curry's paradox warns that this is a non-trivial problem, that only approximate and temporizing forms of success can reasonably be expected. See (Lambek & Scott, 1986), but (Smullyan, chapt. 14) is probably more suitable for summer reading. On the other hand the work of (Spencer-Brown, 1969), (Aczel, 1988), and (Barwise & Etchemendy, 1989) seems to suggest that logic can be extended to include "fixed points of negation" without disastrous results. I can only hope that time and work will untie the mystery.
Work Area
Author's Note. This section consists of very rough working notes that resided at the ends of my old disk files for this essay. Most of this material is superseded by later work, in particular, by the paper, "Differential Logic And Dynamic Systems".
Logical Tangent Vectors
Discuss variation in portrayal of v in df(u, v):
1. as ordinary vector in second component of product space Bn × Bn,
2. as tangent vector map : (Bn → B) → B, dual to Bn ?
3. as tangent vector map : (Dn → B) → B, dual to Dn ?
Discuss differential as map : T(U) = UT → B.
Analogies between Real and Boolean Spaces
It helps to introduce some notation:
- Let R = {real values}.
- Let B = {boolean values} = {0, 1} = {false, true}.
- Let X = Rn, f : Rn → R.
- Let U = Bn, p : Bn → B.
In these terms, analogies of the following form are being explored:
- Rn <=> Bn
- f : Rn -> R <=> p : Bn -> B
There are several circumstances that prevent the qualitative study from reducing to a special application of the quantitative theory and method. These aspects of the logical problem domain make it something more than "differential geometry over the field of two elements", though it is always an advantage to recognize any facet of the problem region that does so reduce.
First, in PC (propositional calculus) we are interested in all 2^(2^n) propositions or functions p: U -> B to a greater extent and more equally than in R, where linear functions (and those that can be analyzed in terms of them) have pride of place.
Second, an important part of using propositional calculus as a logical system, one in which we can reason from asserted propositions to definite conclusions, and in which we can find models (solution sets) of constraint systems expressed in propositions, is that we maintain a dual interpretation of the propositions or functions. That is, we interpret a proposition letter "p" in two ways: (1) it denotes a function p: U -> B, the characteristic function of a region or subset of the universe U; and (2) it denotes more literally that same region or subset, the characteristic region S = p^(-1)(1) of the function p, an element S in the power set of U.
By the isomorphism between the function domain (Bn -> B) and the power domain P(Bn), both with 2^(2^n) elements, this dual interpretation of proposition letters "p" is always legitimate. In implementing for practical use a symbolic calculus that exploits the advantages of both functional and declarative properties, we need the constantly available flexibility of shifting back and forth between these two different modes of interpretation.
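The shift between the functional and the declarative reading can be sketched concretely for a small universe. The names `region` and `indicator` are hypothetical, and the universe is assumed to be B2, modeled as pairs of the integers 0 and 1.

```python
from itertools import product

n = 2
U = list(product((0, 1), repeat=n))   # the 2^n cells of the universe

def region(p):
    """Declarative reading: the subset of U on which p takes the value 1."""
    return frozenset(x for x in U if p(x) == 1)

def indicator(S):
    """Functional reading: the characteristic function of the subset S."""
    return lambda x: 1 if x in S else 0

# Every function U -> B, represented by its table of values over U.
tables = [dict(zip(U, values)) for values in product((0, 1), repeat=len(U))]

# Every function corresponds to a distinct region.
regions = {region(t.get) for t in tables}

# Passing to the region and back recovers the original function.
round_trips = all(
    indicator(region(t.get))(x) == t[x] for t in tables for x in U
)
```

For n = 2 there are 16 tables and 16 regions, matching the count of 2^(2^n) on both sides of the isomorphism.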
Third, maintaining this type of dual interpretation with the constructions we need for a differential extension requires some rather tricky mental gymnastics just to figure out what the proper interpretations are.
For example, a tangent vector at a point should be a certain kind of map of type
- v : (Bn -> B) -> B.
Consequently, a vector field should be a certain kind of map of type
- w : Bn -> ((Bn -> B) -> B),
or
- w : (Bn x (Bn -> B)) -> B,
and this is isomorphic to a derivation of type
- z : (Bn -> B) -> (Bn -> B).
Derivations, alias vector fields, also known as infinitesimal transformations, are the elements of Lie algebras, whose theory provides a systematic framework for the study of differential dynamics.
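The passage between the types of w and z is just a transposition of arguments, z(f)(x) = w(x)(f), which can be sketched as follows. The example field `w` is a hypothetical choice made only to exercise the types (at each point it compares a proposition with its value after a1 is flipped); nothing here is meant to settle which maps of these types deserve to be called tangent vectors or derivations.

```python
from itertools import product

n = 2
U = list(product((0, 1), repeat=n))

def flip_first(x):
    """Move to the neighboring cell that differs in the first coordinate."""
    return (1 - x[0],) + x[1:]

def w(x):
    """Hypothetical map of type Bn -> ((Bn -> B) -> B)."""
    return lambda f: f(flip_first(x)) ^ f(x)

def to_operator(field):
    """Transpose w : Bn -> ((Bn -> B) -> B) into z : (Bn -> B) -> (Bn -> B)."""
    return lambda f: (lambda x: field(x)(f))

def to_field(operator):
    """Transpose z back into a map of the type of w."""
    return lambda x: (lambda f: operator(f)(x))

z = to_operator(w)
w_again = to_field(z)

# A few sample propositions on B2.
def a1(x): return x[0]
def a2(x): return x[1]
def a1_and_a2(x): return x[0] & x[1]

samples = [a1, a2, a1_and_a2]

# The two transpositions are mutually inverse on these samples.
consistent = all(
    z(f)(x) == w(x)(f) == w_again(x)(f) for f in samples for x in U
)
```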
Up to this point my terminology, to the extent that it matters for the qualitative case, has been roughly consistent with the usage in standard accounts, e.g. (Chevalley, 1946) and (Doolin & Martin, 1990). The treatment that follows is much more tentative. I am less certain here about the best way to adapt the geometric concepts to the logical context.
A word of preparation for what is to come: much of the scaffolding we need to build will seem overly definitional and lacking in substance. These te deums are not recited for their own sake merely, but are dictated by our desire for computational implementations, for which careful specifications are of course crucial, and yes, sometimes a bit excruciating.
By way of motivation, to provide something more tantalizing to muse upon while the definitions drone by, you might ask yourself why we never had a "frame problem" in physics and system dynamics with the same paralyzing severity that we still have in AI. I suggest that we did, actually, but that the mathematical developments needed to deal with relative invariance and differential dynamics had the automatic bonus of allowing us to avoid the swamp of irrelevant, and usually meaningless, absolute specifications.
I believe that the proper solution of the frame problem in AI will turn on similar developments in extending our logical representations. Of course, dealing with time and relative change in logic has been notoriously difficult since the days of Parmenides and Heraclitus. I would consider myself lucky to make even the slightest improvement in this situation.
Upon each point of the universe is built a duality of spaces, a pair of spaces that are linear duals of each other, the tangent (co-normal) and normal (co-tangent) spaces at that point. As duals, either one may be chosen to institute their reciprocal definitions. The functional bias that serves the purpose of programming computational implementations of these concepts makes it slightly more expedient to define the normal or co-tangent space first.
Original Universe of Discourse
To do this, it helps to put concrete units, qualitatively distinctive features, back into the discussion. For this we need to introduce some further notation. Let A = {a1,...,an} be an alphabet of n symbols (letters, words, or sentences). These symbols are interpreted as denoting the basic events, features, or propositions of a universe of discourse to which the logical calculus is being applied. Graphically, the ai correspond to the property circles of a Venn diagram. In functional terms A is a system of coordinate maps ai: U -> B.
The circumstance that the universe U of type Bn is generated by the alphabet A is indicated by U = <A> = <a1,...,an>. In concrete terms, we may express U as a product of distinct factors:
- U = Xi Ai = A1 x ... x An.
Here, Ai is an alphabet of two symbols, Ai = {(ai), ai}, where (ai) is a symbol with the logical value of "not ai". Each Ai has the type B, under the ordered correspondence {(ai), ai} = {0, 1}. The relation between the concrete signature and the abstract type of the universe U is indicated by the form:
- Xi Ai = A1 x ... x An -> Bn.
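As a rough illustration of the product form above, the following Python sketch enumerates U for a small n and models each ai as a coordinate map. The names `universe` and `coordinate_map`, and the encoding of {(ai), ai} as the values 0 and 1 in a tuple position, are assumptions made for this sketch, not notation from the text.

```python
from itertools import product

B = (0, 1)  # the boolean domain, under the correspondence {(ai), ai} = {0, 1}

def universe(n):
    """All points of U = A1 x ... x An, each encoded as a tuple in B^n."""
    return list(product(B, repeat=n))

def coordinate_map(i):
    """Coordinate proposition a_i : U -> B (index i is 0-based here)."""
    return lambda x: x[i]

n = 3
U = universe(n)
a = [coordinate_map(i) for i in range(n)]

# Every point of U is recovered from the values of its coordinate maps.
recovered = all(tuple(a_i(x) for a_i in a) == x for x in U)
```

With n = 3 the list `U` has 2^3 = 8 points, and `recovered` is true because the coordinate maps jointly determine each point.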
Special Forms of Propositions
Among the 2^(2^n) propositions or functions in (Bn -> B) are several fundamental sets of 2^n propositions each that take on special forms with respect to a given basis A. Three of these forms are especially common: the singular, the linear, and the simple propositions. Each set is naturally parameterized by the vectors in Bn and falls into n+1 ranks, with a binomial coefficient (n;k) giving the number of propositions of rank or weight k.
The singular propositions may be expressed as products:
- e1 ... en where ei = ai or ei = (ai).
The linear propositions may be expressed as sums:
- e1 + ... + en where ei = ai or ei = 0.
The simple propositions may be expressed as products:
- e1 ... en where ei = ai or ei = 1.
In each case the rank k ranges from 0 to n and counts the number of positive appearances of coordinate propositions ai in the expression. For example, with n = 3: the singular proposition of rank 0 is (a1)(a2)(a3); the linear proposition of rank 0 is "0"; the simple proposition of rank 0 is "1".
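The three families can be checked by brute force for small n. The sketch below is only an illustration under stated assumptions: the sum "+" is read as addition mod 2 (exclusive or), propositions are compared by their truth tables, and the function names `singular`, `linear`, `simple`, `table`, and `rank` are introduced here for convenience.

```python
from itertools import product
from math import comb

n = 3
U = list(product((0, 1), repeat=n))  # the 2^n points of Bn

def singular(c):
    """Product of e_i, with e_i = a_i if c_i = 1 and e_i = (a_i) otherwise."""
    return lambda x: int(all(xi == ci for xi, ci in zip(x, c)))

def linear(c):
    """Sum of the a_i with c_i = 1; '+' is ASSUMED to be addition mod 2."""
    return lambda x: sum(xi for xi, ci in zip(x, c) if ci) % 2

def simple(c):
    """Product of the a_i with c_i = 1; the empty product is 1."""
    return lambda x: int(all(xi for xi, ci in zip(x, c) if ci))

def table(p):
    """Truth table of p over U, used to compare propositions."""
    return tuple(p(x) for x in U)

def rank(c):
    """Number of positive appearances of coordinate propositions."""
    return sum(c)

# Each family has 2^n distinct members, one per parameter vector c in Bn.
family_sizes = {f.__name__: len({table(f(c)) for c in U})
                for f in (singular, linear, simple)}

# Number of parameter vectors of each rank k = 0..n.
rank_counts = [sum(1 for c in U if rank(c) == k) for k in range(n + 1)]
```

For n = 3 each family has 8 members and the ranks are populated according to the binomial coefficients 1, 3, 3, 1. The rank 0 cases agree with the example in the text: `singular((0, 0, 0))` holds only at the point (0, 0, 0), `linear((0, 0, 0))` is the constant 0, and `simple((0, 0, 0))` is the constant 1.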
Finally, two things are important to keep in mind with regard to the singularity, linearity, and simplicity of propositions. First, these properties are all relative to a particular basis. That is, a singular proposition with respect to a basis A will not remain singular if A is extended by a number of new and independent features. Second, the singular propositions Bn -> B, picking out as they do a single cell or vector of Bn, are the vehicles or carriers of a certain type-ambiguity, vacillating between the duals Bn and (Bn -> B) and infecting the whole system of types.
Logical Boundary Operator
I think it may be useful at this point to say a few words about the form of the bound connective, which I also call the boundary operator of this calculus.
The form of the bound connective may seem a bit ad hoc. This particular logical connective was arrived at by reflecting on Peirce's system of Existential Graphs and by trying to extend it along the lines of some principles I saw exemplified there, in order to overcome some of the problems and limitations that still affected it. These principles are not relevant here. However, some features of this connective provide a natural bridge to (anticipate in a natural way) the uses it will have in the differential extension of propositional calculus. In this guise, the bound connective is also known as the boundary operator of the calculus.
To understand this connection, consider a set of k propositional expressions, for example: e1, e2, e3. Now ask what would be the derivative p' of their logical conjunction p, which in EG becomes the multiplicative product of functions: p = e1.e2.e3. By a time-honored rule one would expect:
- p' = e1'.e2.e3 + e1.e2'.e3 + e1.e2.e3'.
Extended Universe of Discourse
The time has come to try and determine appropriate analogues in PC of tangent vectors and differential forms. I am adapting terminology to the extent possible from (Flanders, 1989) and (Bott & Tu, 1982). There are propositions p: U' -> B which are essentially no more than propositions p: U -> B in disguise. These are the p for which every p'(x) is already determined to B, that is, those p for which all the p" are constant maps on dU. These propositions are the differential forms of degree 0 on U and make up the space of 0-forms, F0(U).
With the above definitions and distinctions in mind the type of a tangent vector can be expressed more clearly as
v: (Dn -> B) -> B. v: (Bn -> B) -> B. v: F0(U) -> B. ???
This indicates that v acts on a domain of functions q: Dn -> B in the normal (co-tangent) space of U. A basis for such functions is provided by the differential alphabet dA = {da1,...,dan}. v(q) = ?
Consequently, a vector field should be a certain kind of map of type
- w : Bn -> ((Dn -> B) -> B),
or
- w : (Bn x (Dn -> B)) -> B,
and this is isomorphic to a derivation of type
- z : (Dn -> B) -> (Bn -> B).
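The correspondence between these three types is just a matter of regrouping and reordering arguments. A minimal Python sketch of that regrouping follows; the helper names and the sample field `w_example` are hypothetical, and the code shows only the type-level correspondence, not any further law a derivation might be expected to satisfy.

```python
from itertools import product

def uncurry(w):
    """From w : Bn -> ((Dn -> B) -> B) to a map (Bn x (Dn -> B)) -> B."""
    return lambda x, q: w(x)(q)

def curry(w2):
    """Inverse of uncurry."""
    return lambda x: lambda q: w2(x, q)

def to_derivation(w):
    """From w to z : (Dn -> B) -> (Bn -> B), by swapping argument order."""
    return lambda q: lambda x: w(x)(q)

def to_field(z):
    """Inverse of to_derivation."""
    return lambda x: lambda q: z(q)(x)

n = 2
points = list(product((0, 1), repeat=n))

def w_example(x):
    # HYPOTHETICAL field, for illustration only: at the point x,
    # test q on the element of Dn with the same coordinates as x.
    return lambda q: q(x)

q = lambda d: d[0] ^ d[1]  # a sample proposition on Dn

z = to_derivation(w_example)
same_values = all(
    z(q)(x) == w_example(x)(q) == uncurry(w_example)(x, q)
    for x in points
)
round_trip = all(
    to_field(z)(x)(q) == w_example(x)(q)
    and curry(uncurry(w_example))(x)(q) == w_example(x)(q)
    for x in points
)
```

All three forms carry the same information: each assigns one boolean value to every pair of a point x and a proposition q.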
The tangent vectors at a point x in U collectively form the tangent space at that point, Tx(U). The tangent vectors at all points of U collectively form the tangent space of U, also called the tangent bundle T(U) or UT.
Applying the principle of dual interpretation to these function domains would have the following implications. A tangent vector v represents a proposition about propositions. A vector field w represents a proposition about the relation of points and propositions. A derivation z represents an operation that induces a transformation between propositions. What meaning, expressed in natural logical terms, could these constructions possibly have?
If v is a tangent vector (at a point u) and q is a proposition, then v, as a proposition about propositions, is either true or false of q. Using a mixed logical and geometric metaphor, the boolean value v(q) is called the derivative of q in the direction v. If v(q) is true we say that q is in the direction v at u, or true to the direction v at u, otherwise it is outside the direction v, orthogonal to, or false to the direction v at u. In other language v splits the propositions p: U -> B into two equivalence classes, those in and out of the direction v at u. Two propositions are equivalent with respect to v, written p =v q, if and only if v(p) = v(q). The equivalence class of p with respect to v is denoted [p]v.
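To see the two equivalence classes concretely, one can enumerate every proposition on a small domain and sort them by the value that v assigns. The text does not fix a concrete v, so the sketch below assumes a purely hypothetical one, evaluation at a chosen element d, and represents each proposition by its truth table. The names `all_propositions`, `make_v`, and `split` are introduced here for illustration.

```python
from itertools import product

n = 2
points = list(product((0, 1), repeat=n))  # the 2^n elements of the domain

def all_propositions():
    """Yield every proposition on the domain as a dict: element -> 0 or 1."""
    for values in product((0, 1), repeat=len(points)):
        yield dict(zip(points, values))

def make_v(d):
    """HYPOTHETICAL tangent vector: v(q) = q(d), evaluation at element d."""
    return lambda q: q[d]

def split(v):
    """Partition the propositions into those in and out of the direction v."""
    inside, outside = [], []
    for q in all_propositions():
        (inside if v(q) else outside).append(q)
    return inside, outside

def equivalent(v, p, q):
    """p =v q if and only if v(p) = v(q)."""
    return v(p) == v(q)

v = make_v((1, 0))
inside, outside = split(v)
```

For n = 2 there are 16 propositions, and this particular v puts 8 of them in each class. A different choice of v would generally give a different split.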
To define the differential extension of a propositional universe, it is necessary to define tangent spaces and differential forms. To do this we extend the alphabet A to include differential features. In intuitive terms these may be construed as primitive features of change, or propositions about changes in the original set of features. Hence, let dA = {da1,...,dan} and dU = <dA> = <da1,...,dan>. Let U' = U x dU = <A + dA> = <a1,...,an, da1,...,dan>.
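The construction of U' can be sketched in a few lines of Python. The symbol naming scheme ("d" prefixed to each feature name) and the `step` function are assumptions of this sketch. In particular, reading dai = 1 as "feature ai flips at the next moment" is one possible interpretation of "primitive features of change", not something the text commits to.

```python
from itertools import product

def extend_alphabet(A):
    """Return A + dA, where dA holds one differential symbol per feature."""
    return list(A) + ["d" + a for a in A]

def universe(alphabet):
    """All points of <alphabet>, each as a dict from symbol to 0 or 1."""
    return [dict(zip(alphabet, bits))
            for bits in product((0, 1), repeat=len(alphabet))]

def step(point, A):
    """ASSUMED reading: da = 1 means feature a flips at the next moment."""
    return {a: point[a] ^ point["d" + a] for a in A}

A = ["a1", "a2"]
extended = extend_alphabet(A)   # a1, a2, da1, da2
EU = universe(extended)         # the points of U' = U x dU

sample = {"a1": 1, "a2": 0, "da1": 1, "da2": 0}
next_state = step(sample, A)
```

With two features the extended universe has 2^(2n) = 16 points. Under the assumed reading, the sample point moves from a1 = 1, a2 = 0 to a1 = 0, a2 = 0.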
References
Note. This bibliography belongs to a larger paper still in progress.
A
- Abelson, H., and Sussman, G.J., with Sussman, J., Structure and Interpretation of Computer Programs, Foreword by A.J. Perlis, MIT Press, Cambridge, MA, 1985.
- Aczel, P., Non-Well-Founded Sets, CSLI Lecture Notes 14, Center for the Study of Language and Information, Stanford, CA, 1988.
- Aho, A.V., Hopcroft, J.E., and Ullman, J.D., The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, MA, 1974.
- Ait-Kaci, H., "Type Subsumption as a Model of Computation", in (Kerschberg, 1986).
- Albus, J.S., Brains, Behavior, and Robotics, BYTE Books, Peterborough, NH, 1981.
- Alkon, D.L., Memory Traces in the Brain, Cambridge University Press, Cambridge, UK, 1987.
- Allen, J., Anatomy of LISP, McGraw-Hill, New York, NY, 1978.
- Amit, D.J., Modeling Brain Function : The World of Attractor Neural Networks, Cambridge University Press, Cambridge, UK, 1989.
- Andersen, H.C., The Complete Hans Christian Andersen Fairy Tales, L. Owens (ed.), H.B. Paull, J. Hersholt, and H.O. Sommer (trans.), originally published 1883, 1885, 1895. Avenel Books, New York, NY, 1981.
- Anderson, D.R., Creativity and the Philosophy of C.S. Peirce, Martinus Nijhoff Publishers, Dordrecht, Netherlands, 1987.
- Anderson, J.R. (ed.), Cognitive Skills and Their Acquisition, Lawrence Erlbaum Associates, Hillsdale, NJ, 1981.
- Anderson, J.R., The Architecture of Cognition, Harvard University Press, Cambridge, MA, 1983.
- Anderson, J.R., Cognitive Psychology and Its Implications, 3rd edition, W.H. Freeman, New York, NY, 1990.
- Anderson, J.R., and Bower, G.H., Human Associative Memory, V.H. Winston and Sons, Washington, D.C., 1973.
- Anosov, D.V., and Arnold, V.I. (eds.), Dynamical Systems 1 : Ordinary Differential Equations and Smooth Dynamical Systems, Springer-Verlag, Berlin, 1988.
- Arbib, M.A., The Metaphorical Brain : An Introduction to Cybernetics as Artificial Intelligence and Brain Theory, John Wiley and Sons, New York, NY, 1972.
- Arbib, M.A., Brains, Machines, and Mathematics, 1st ed. 1964. 2nd ed., Springer-Verlag, New York, NY, 1987.
- Arbib, M.A., The Metaphorical Brain 2 : Neural Networks and Beyond, John Wiley & Sons, New York, NY, 1989.
- Arbib, M.A., and Manes, E.G., Arrows, Structures, and Functors : The Categorical Imperative, Academic Press, New York, NY, 1975.
- Arnold, V.I., Ordinary Differential Equations, R.A. Silverman (ed. and trans.), MIT Press, Cambridge, MA, 1973.
- Arnold, V.I., Catastrophe Theory, 2nd edition, G.S. Wasserman and R.K. Thomas (trans.), Springer-Verlag, Berlin, 1986.
- Arnold, V.I., Mathematical Methods of Classical Mechanics, 2nd ed., K. Vogtmann and A. Weinstein (trans.), Springer-Verlag, New York, NY, 1989.
- Arnold, V.I., The Theory of Singularities and Its Applications, Accademia Nazionale dei Lincei and Scuola Normale Superiore, Pisa, 1991.
- Arrowsmith, D.K., and Place, C.M., Ordinary Differential Equations : A Qualitative Approach with Applications, Chapman and Hall, London, UK, 1982.
- Ascher, M., and Ascher, R., Code of the Quipu : A Study in Media, Mathematics, and Culture, University of Michigan Press, Ann Arbor, MI, 1981.
- Ash, R.B., Information Theory, 1st published, John Wiley and Sons, New York, NY, 1965. Reprinted, Dover Publications, Mineola, NY, 1990.
- Ashby, W.R., An Introduction to Cybernetics, Chapman and Hall, London, UK, 1956. Methuen and Company, London, UK, 1964.
- Awbrey, J., and Awbrey, S., "Exploring Research Data Interactively. Theme One : A Program of Inquiry", pp. 9–15 in Proceedings of the Sixth Annual Conference on Applications of Artificial Intelligence and CD-ROM in Education and Training, Society for Applied Learning Technology, Washington, DC, August 22–24, 1990.
- Awbrey, S., and Awbrey, J., "An Architecture for Inquiry : Building Computer Platforms for Discovery", pp. 874–875 in Proceedings of the Eighth International Conference on Technology and Education, G. McKye and D. Trueman (eds.), Toronto, Ontario, May 8–12, 1991.
- Awbrey, S., and Awbrey, J., "Interpretation as Action : The Risk of Inquiry", presented at The Eleventh International Human Science Research Conference, Oakland University, Rochester, MI, June 9–13, 1992. Abstract in the Proceedings, pp. 58–59.
B
- Barwise, J., and Etchemendy, J., The Liar: An Essay on Truth and Circularity, Oxford University Press, New York, NY, 1989.
- Bott, R., and Tu, L.W., Differential Forms in Algebraic Topology, Springer-Verlag, New York, NY, 1982.
- Bratko, I., Mozetic, I., and Lavrac, N., KARDIO: A Study in Deep and Qualitative Knowledge for Expert Systems, MIT Press, Cambridge, MA, 1989.
C
- Chang, C., and Lee, R.C., Symbolic Logic and Mechanical Theorem Proving, Academic Press, New York, NY, 1973.
- Charniak, E., and McDermott, D.V., Introduction to Artificial Intelligence, Addison-Wesley, Reading, MA, 1985.
- Charniak, E., Riesbeck, C.K., and McDermott, D.V., Artificial Intelligence Programming, Lawrence Erlbaum Associates, Hillsdale, NJ, 1980.
- Chevalley, C., Theory of Lie Groups, Princeton University Press, Princeton, NJ, 1946.
- Conway, J.H., On Numbers and Games, Academic Press, London, UK, 1976.
D
- Doolin, B.F., and Martin, C.F., Introduction to Differential Geometry for Engineers, Marcel Dekker, New York, NY, 1990.
E
- Easter, S.S., Jr., Barald, K.F., and Carlson, B.M. (eds.), From Message to Mind: Directions in Developmental Neurobiology, Sinauer Associates, Sunderland, MA, 1988.
- Ebbinghaus, H.-D., Flum, J., and Thomas, W., Mathematical Logic, A.S. Ferebee (trans.), Springer-Verlag, New York, NY, 1984.
F
- Flanders, H., Differential Forms with Applications to the Physical Sciences, Dover Publications, New York, NY, 1989.
G
- Garey, M.R., and Johnson, D.S., Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman, New York, NY, 1979.
H
- Harary, F., Graph Theory, Addison-Wesley, Reading, MA, 1969.
- Harary, F., and Palmer, E.M., Graphical Enumeration, Academic Press, New York, NY, 1973.
- Hoffmann, C.M., Group-Theoretic Algorithms and Graph Isomorphism, Lecture Notes in Computer Science, Volume 136, G. Goos and J. Hartmanis (eds.), Springer-Verlag, Berlin, 1982.
- Holland, J.H., Holyoak, K.J., Nisbett, R.E., and Thagard, P.R., Induction: Processes of Inference, Learning, and Discovery, MIT Press, Cambridge, MA, 1986.
I
- Ihde, D., Experimental Phenomenology: An Introduction, Paragon Books and G.P. Putnam's Sons, New York, NY, 1979.
J
- Jackson, P.C., Jr., An Introduction to Artificial Intelligence, 2nd edition, Dover Publications, Mineola, NY, 1985.
- James, W., Pragmatism: A New Name for Some Old Ways of Thinking, Longmans, Green, and Company, New York, NY, 1907.
- Johnson, M., Attribute-Value Logic and the Theory of Grammar, CSLI Lecture Notes 16, Center for the Study of Language and Information, Stanford, CA, 1988.
K
- Kohavi, Z., Switching and Finite Automata Theory, 2nd edition, McGraw-Hill, New York, NY, 1978.
L
- Lambek, J., and Scott, P.J., Introduction to Higher Order Categorical Logic, Cambridge University Press, Cambridge, UK, 1986.
- Lloyd, J.W., Foundations of Logic Programming, Springer-Verlag, Berlin, NY, 1984. 2nd, extended edition, 1987.
- Loomis, L.H., and Sternberg, S., Advanced Calculus, Addison-Wesley, Reading, MA, 1968.
M
- Maier, D., and Warren, D.S., Computing with Logic: Logic Programming with Prolog, Benjamin/Cummings, Menlo Park, CA, 1988.
- Manes, E.G., and Arbib, M.A., Algebraic Approaches to Program Semantics, Springer-Verlag, New York, NY, 1986.
- Menabrea, L.F., "Sketch of the Analytical Engine Invented by Charles Babbage", originally published 1842, with notes by the translator, Ada Augusta (nee Byron), Countess of Lovelace, in (Morrison & Morrison, 1961).
- Morrison, P., and Morrison, E. (eds.), Charles Babbage on the Principles and Development of the Calculator, And Other Seminal Writings by Charles Babbage and Others, with an introduction by the editors, Dover Publications, Mineola, NY, 1961.
N
- Newell, A., Unified Theories of Cognition, Harvard University Press, Cambridge, MA, 1990.
- Nicolis, G., and Prigogine, I., Exploring Complexity: An Introduction, W.H. Freeman, New York, NY, 1989.
- Nijenhuis, A., and Wilf, H.S., Combinatorial Algorithms: For Computers and Calculators, 2nd edition, Academic Press, New York, NY, 1978.
- Nilsson, N.J., Principles of Artificial Intelligence, Tioga Publishing, Palo Alto, CA, 1980.
O
- O'Rorke, P., "Review of AAAI 1990 Spring Symposium on Automated Abduction", SIGART Bulletin, Vol. 1, No. 3, ACM Press, October 1990, pp. 12–17.
P
- Pearl, J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, revised 2nd printing, Morgan Kaufmann, San Mateo, CA, 1991.
- Peirce, C.S., Collected Papers of Charles Sanders Peirce, 8 volumes, C. Hartshorne, P. Weiss, and A.W. Burks (eds.), Harvard University Press, Cambridge, MA, 1931–1960.
- Peng, Y., and Reggia, J.A., Abductive Inference Models for Diagnostic Problem-Solving, Springer-Verlag, New York, NY, 1990.
Q
R
S
- Smullyan, R., To Mock a Mockingbird: And Other Logic Puzzles Including an Amazing Adventure in Combinatory Logic, Alfred A. Knopf, New York, NY, 1985.
- Sowa, J.F., Conceptual Structures: Information Processing in Mind and Machine, Addison-Wesley, Reading, MA, 1984.
- Sowa, J.F. (ed.), Principles of Semantic Networks: Explorations in the Representation of Knowledge, Morgan Kaufmann, San Mateo, CA, 1991.
- Spencer-Brown, G., Laws of Form, George Allen and Unwin, London, 1969.
- Spivak, M., A Comprehensive Introduction to Differential Geometry, 2nd edition, Publish or Perish Incorporated, Houston, TX, 1979.
T
U
V
W
- Wiener, N., The Human Use of Human Beings: Cybernetics and Society, Houghton Mifflin, Boston, MA, 1950.
- Wiener, N., Cybernetics: or, Control and Communication in the Animal and the Machine, 1st edition, 1948. 2nd edition, MIT Press, Cambridge, MA, 1961.
- Wiener, N., God and Golem, Inc. A Comment on Certain Points where Cybernetics Impinges on Religion, MIT Press, Cambridge, MA, 1964.
X Y Z
- Yip, K.M., KAM: A System for Intelligently Guiding Numerical Experimentation by Computer, MIT Press, Cambridge, MA, 1991.
- Zajonc, A., Catching the Light: The Entwined History of Light and Mind, Bantam, 1993. Oxford University Press, New York, NY, 1995.
Document History
Author's Note. The initial portion of this essay is the "Interest Statement" that I submitted as a part of my application to graduate school in the Systems Engineering doctoral program at Oakland University, Rochester, Michigan in September 1992.
The above text is Version 5.5, from May 2002. I will be in the process of folding in changes from Version 8.0, from February 2004.
- Author: Jon Awbrey
- Version: Draft 8.00
- Created: 12 Nov 1991
- Relayed: 01 Sep 1992 (Version 3.0)
- Revised: 01 Mar 1997 (Version 4.0)
- Revised: 22 May 2002 (Version 5.5)
- Revised: 26 Aug 2002 (Version 6.0)
- Revised: 11 Mar 2003 (Version 7.0)
- Revised: 11 Feb 2004 (Version 8.0)
- Setting: Oakland University, Rochester, Michigan, USA