27 Sep 2011 @ 12:21 PM 


I have a starting point for my architectural model of conceptual systems (which I borrowed from the latest theorizing and research in cognitive psychology). One part of this framework can be stated as follows: cognitive competences (visible high-level capacities such as reading, categorizing, reasoning, planning, language understanding, etc.) are implemented with cognitive processes (lower-level mechanisms that operate on concepts and perhaps other mental resources). It’s a pretty simple viewpoint that will be of great help in organizing my thinking, and it also implies a methodology: enumerate the cognitive competences, determining for each one which cognitive process(es) implement it; then determine what concepts must do to support those cognitive processes. Following this methodology isn’t very straightforward (although I have only just begun to use it)… very little is known about most of these cognitive processes, and there is little consensus on what they even are. That’s expected, I suppose.
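
To make the bookkeeping behind that methodology concrete, here is a minimal sketch in Python. The competence names come from the list above, but every process name and every requirement name is a hypothetical placeholder invented for illustration; none of them is a claim about which processes actually exist.

```python
# Step 1: competence -> cognitive processes that implement it.
# All process names are hypothetical placeholders.
IMPLEMENTED_BY = {
    "categorizing": ["similarity_matching"],
    "reasoning": ["similarity_matching", "rule_application"],
    "planning": ["rule_application", "simulation"],
}

# Step 2: process -> what it needs concepts to provide.
# All requirement names are hypothetical placeholders.
REQUIRES = {
    "similarity_matching": {"feature_lists"},
    "rule_application": {"symbolic_labels", "relations"},
    "simulation": {"relations", "perceptual_detail"},
}


def concept_requirements(implemented_by, requires):
    """Map each concept requirement to the sorted competences that depend on it."""
    out = {}
    for competence, processes in implemented_by.items():
        for process in processes:
            for need in requires[process]:
                out.setdefault(need, set()).add(competence)
    return {need: sorted(cs) for need, cs in sorted(out.items())}


requirements = concept_requirements(IMPLEMENTED_BY, REQUIRES)
for need, competences in requirements.items():
    print(f"{need}: {', '.join(competences)}")
```

Inverting the two tables this way shows which requirements are shared by several competences, which is roughly where I would expect to look for common underlying mechanisms.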

My task is subtly different than modelling human concepts and conceptual processes. What I am actually trying to do (eventually) is build a conceptual system that is also modelled by the psychological models of human concepts and conceptual processes. I don’t want to make claims about how human minds work (which is what a model of human concepts and cognitive competences and processes would be). I want to build a conceptual system that operates similarly enough to the human system to be described by competent and interesting psychological models.

I become more and more convinced, as I study the research programs of various academic disciplines which are aimed at characterizing conceptual thought in human beings, that:

  • Concepts appear to be very complex things. A startlingly large number of cognitive processes operate on concepts, each of which seems to have its own subtly different way of using concepts. This implies a broad representational flexibility in the concepts themselves.
  • Despite this heterogeneity, I think that ultimately these varied representational requirements will be satisfied by a relatively small number of core mechanisms and structures. These are what I think of as “substrate features”. My ideas on the details of substrate features are very young, and I have a lot of cognitive processes to study before I feel somewhat confident about what the requirements are.
  • Many cognitive competences are acquired from human culture, through artifacts, instruction, or imitation. I have been calling these “ways to think” to distinguish them from innate competences. These competences are skills, and like other skills can often be “compiled” into efficient subconscious processes.
  • Most of our concepts are similarly acquired from human culture.
  • I think that before we can acquire concepts from our culture, we have to have a base inventory of concepts. These concepts are either innate or are acquired through interaction with our physical environment. It is critically important for me to determine what these base concepts are. A few of them are pretty obvious: object, actor, cause. It is less clear whether the concepts labeled “Image Schemas” by cognitive linguists are base-level concepts, but I expect that most of them are.

Besides the main question (what these concepts and processes can be built from), the critical question is: Where does the base inventory of concepts come from? Are these concepts innate, acquired, or some combination of both? Such a combination could involve some “parts” of the concepts being innate. It is even possible (in fact, likely) that the concept acquisition mechanism has more-or-less concept-specific biases ensuring the acquisition of certain concepts. This question has been debated by philosophers for hundreds of years!

A related question: regardless of where concepts come from, what features of non-conceptual brain function act as “inputs” to the concept acquisition, recognition, and instantiation processes? Evolution has provided us with an immense array of mental functions that can be considered as “prior” to the conceptual system, and it seems as if any part of our evolutionary heritage could be made use of.

There are many such brain systems that might feed the conceptual system, and I want to explore the possibilities in detail. But the most obvious are the sensory modalities. I have a few theories about the interface between sense modalities and base-level concepts, but I’ll save that for another time rather than getting too far off track.

This all leads us toward “tests” or challenge problems that are closer to perception and action than the language-based tests I considered in the last couple of posts. One interesting such test was posed by Steve Wozniak (who co-founded Apple Computer with Steve Jobs): a robot that can walk into an unfamiliar house and figure out how to make a cup of coffee. Amusingly, Wozniak gives this task as an example of something that will “never” happen.

It’s an interesting challenge that hovers just beyond the edge of what is technologically possible at the moment. Focusing on the role of concepts in guiding physical interaction with the world is very appealing because it is more directly linked to the issue of base-level concepts and intense interaction with hardwired brain systems for perception, motor control, mapping, attention, etc.

Unfortunately, the robotic aspect of this is far too challenging; building such a capable humanoid robot is beyond my means. So, as stated, the Wozniak Test is not something I can really attempt. However, I believe that a robotic challenge that is similar in spirit could be developed that would use a robot arm and visual system. The important aspect of such a test, besides the robust robotics capabilities, is to demonstrate flexible use of concepts and “problem solving” using those concepts. It’s worth thinking about.

These observations about the probable importance of evolutionarily-supplied brain systems in the creation of base-level concepts (or their mechanical support if such concepts are innate) are somewhat disturbing. They imply that I will need an equivalent software/hardware combination as part of my learning environment… which is going to be a lot of work to approximate.

Drat. :) Oh well, I’ve always liked building robots!

Tags Categories: Uncategorized Posted By: Derek
Last Edit: 27 Sep 2011 @ 12:21 PM


 26 Sep 2011 @ 11:30 AM 

Marvin Minsky is one of the larger-than-life heroes of AI. There aren’t many academics who continue to produce thought-provoking and important contributions to their field for fifty years, but Minsky is one of them. Read this: Minsky wrote that in 1960! It’s almost sad that, despite five decades of interesting work in AI since that was written, it still feels almost like a relevant summary of AI research today. To list just a few of his accomplishments, Minsky did foundationally important work in neural networks (Perceptrons) in the 60s, paradigm-defining work in knowledge representation (Frames) in the 70s, visionary work on architectural aspects of minds (Society of Mind) in the 80s, and far-reaching insights into mental processes (Emotion Machine) in the 00s. Once upon a time, I used to carry Society of Mind around with me as if it were Mao’s little Red Book.

But enough hero worship.

A couple years ago, with something like five million dollars of research grants, MIT announced the Mind Machine Project (MMP), an effort to “start over” and take a new hard charge at solving the Big Questions. Minsky, of course, is heavily involved in this initiative, which (having learned some lessons from the limitations of past AI research) has chosen three issues to focus on:

  • Modelling thought in many diverse ways, instead of trying to find a single “key” to intelligence
  • Embracing the messy, ambiguous, and inconsistent nature of memory
  • Incorporating modern insights into parallelism, embodiment, and alternative substrates

I have been thinking a lot about substrates as I have been studying the requirements for a conceptual modelling system, brainstorming about requirements for a type of processing unit that could support the higher-level requirements. I don’t want to write much about that yet, since it is so preliminary (and I really shouldn’t be doing it at all until I have a more complete set of concept requirements, but I can’t help myself). Because of this, I am highly interested in substrate ideas that appear in AI literature. The MMP project has produced a pretty interesting substrate, called Reconfigurable Asynchronous Logic Automata (RALA). I need to study it more closely. It seems to be very different from my own ideas (while sharing some of the same goals), which is always exciting because it is an opportunity to expand my own thinking. If anybody reading this has any opinions about RALA, I’d be very interested to hear about them.

But I’m digressing from my point.

In the context of the MMP, Minsky has proposed a “test” for AI systems that is designed to be similar to the Turing Test, but not quite as demanding. The Minsky Test:

…whether the machine can read a simple children’s book, understand what the story is about, and explain it in its own words or ask reasonable questions about it.

It’s an intriguing idea: instead of the generality of the Turing Test, it focuses more directly on particular cognitive competences and avoids the requirement to reproduce the decades of learning that go into an adult’s inventory of conceptual knowledge. It also moves beyond pure text, as most children’s books include illustrations. The picture I chose for this blog entry is a scanned pair of pages from a book called Moo, Baa, La La La! by Sandra Boynton (a prolific author of books for young children).

I bought a bunch of used books of this type (thick cardboard pages, small amounts of text, funny pictures) and was struck by how these stories focus on embodied multi-sensory experiences — which reinforces the idea that our conceptual system builds on years of basic concept formation and elaboration occurring right at the immediate sensory interface with the world… and our ability to deal with more abstract realms of thought is thoroughly grounded in gargantuan amounts of learning that take place as our bodies experience our physical environment. Also fascinating: the early stages of how we build “world models” on the fly, filling in lots of pieces given only small amounts of text and pictures. Replicating these very important cognitive capacities with AI programs would be a huge accomplishment.

So I certainly think there is a lot of value in a test like this. There are problems, too, though. This Boynton book, for example, is typically “read” by a parent to a child, interactively, maybe around age 2 or so, which is a very different process than children beginning to read on their own, maybe around age 4 or 5. I think Minsky’s test needs to take this into account, that there are important phases in the way that children learn to deal with absorbing information from cultural artifacts.

And yet a large conceptual inventory is required to understand a book even as simple as Moo, Baa, La La La!… it’s not clear to me that using books like this is really a good way to test or emulate those earliest phases of concept formation. It could be that Minsky does have in mind more advanced books (kindergarten level), which is even worse from this point of view: it doesn’t really test for or focus on the explosive beginnings of concept acquisition that children go through in their first couple years of life.

Still, we have to start somewhere, and a test like Minsky’s is a big improvement over Turing’s.

I think the best way to go about attacking a task like this is to attempt a comprehensive conceptual analysis of the material in books for young children. If we can sketch the hierarchy of concepts required for understanding these books, it will be very helpful in designing a research program focusing on early concept acquisition — which is of central importance to my research.

Last Edit: 26 Sep 2011 @ 11:30 AM


 26 Sep 2011 @ 9:11 AM 

The first long phase of developing a coherent and comprehensive set of requirements for an artificial conceptual system is primarily an application of scholarship — so many brilliant thinkers and experimenters have already worked on related projects that it would be foolish to ignore what has been discovered.

Nevertheless, it would be nice to have an application or test case to work on simultaneously:

  • Dealing with specific cases provides a concrete context for thinking about the issues in a less abstract way.
  • A clearly-defined project focuses attention on specific requirements, which are applications of general principles.
  • A more-or-less “real world” scenario provides a sanity check on the plausibility and usefulness of ideas.
  • It is easier to see what knowledge is missing, which helps prioritize the huge volume of potentially useful study materials.
  • Making progress on a test case, even if just a simple one, would demonstrate (to myself and possibly others) that something real is being accomplished.

It’s not a simple thing, though, to find such test cases. Any worthwhile project in its fully general form needs to be AGI-complete (or else it is not general enough to be distinguished from narrow AI work), but it has to rely primarily on a subset of conceptual functionality, or at least rely on something less than the full mass of knowledge and skills possessed by you or me.

I think I’ll write a series of brief posts about possible applications, which will also be a good context for discussing a few issues of more general interest.

The first and in some ways most obvious application is the Turing Test (or “imitation game”), where an AI system tries to fool a conversation partner into thinking that the program is a human being.  There are some definite advantages to this:  it is easy to understand; it has a long history in AI research; there is a yearly competition (the Loebner Prize) to showcase the work; and some degree of success at the task would be very impressive.

Unfortunately, these good things are outweighed by negatives:

  • It is too hard. The ability to hold a fully general conversation requires comprehensive implementations of many cognitive competences: language comprehension, language production, dialog (maintaining conversational context), theory of mind (understanding the mental state of a conversation partner), etc. It also requires a nearly fully-developed conceptual inventory.
  • There are inadequate opportunities to demonstrate incremental progress.  Embryonic versions of the cognitive competences will be inadequate to be of any practical use, and the earliest most basic concepts do not really support general conversation ability.
  • The Turing Test is out of fashion in AI circles.  The Loebner Prize is dominated by chatterbots, which don’t really try to solve any AGI issues; they are focused on regurgitating memorized text fragments.  Because of this, most AI researchers think of the task as a joke, and working on it is a very low-status activity.

So the Turing Test is not a very good candidate application.  There are lots of other options, though, which I will write about over the next few days.

Last Edit: 26 Sep 2011 @ 09:11 AM


 20 Sep 2011 @ 11:39 AM 

I recently read Edouard Machery’s 2009 book Doing Without Concepts, and found it quite stimulating.  Machery is a philosophy professor at the University of Pittsburgh, and the book is the “offspring” of his PhD thesis.

The point Machery builds his argument toward is: “The notion of concept ought to be eliminated from the theoretical vocabulary of psychology.”  Ouch!  On the face of it, such a conclusion would devastate my research program:  I have no intention of chasing after ghosts or building a system based on a foundation that is inconsistent or incoherent!

Luckily, the situation is not nearly as dire as that.  Machery’s real argument is that concepts have different types — in particular, that a human mind very often possesses several coreferential concepts for the same things, and that these parallel concepts are distinct in important ways.  An example he uses is water; he claims that a concept of water as H2O is critically different from a concept of water based (e.g.) on sensory features of water in our common experience.  These are different enough, he claims, that if they were to be part of a single concept for water, that would lead to all kinds of undesirable consequences — confused categorization of objects in the world, inconsistent reasoning, and so on.

Because of this, Machery suggests that the word “concept” be abandoned because it encourages psychologists to think of concepts as monolithic.

For my purposes, if it does turn out that I need several coreferential concepts for things in the world, I am not really bothered by that, so the conclusion doesn’t upset me very much. However, I am not even really convinced by Machery’s argument. He argues against a particular approach he calls “Hybrid Concepts” which could accommodate these different facets of concepts, but I find his argument against Hybrid Concepts to be unconvincing. It seems to me right now to be much better to increase the sophistication of the hybrid concepts to address the issues he raises, rather than taking the (to me) extreme step of replicating a lot of conceptual detail into distinct coreferential concepts. The details would be kind of boring so I won’t go into them. Against Machery’s alternative, though, imagine that there are two bottles in front of you. One is opaque black and labeled “H2O”, and the other is clear and appears to contain water. I have a hard time believing that you think about the contents of those two bottles in fundamentally different ways.

None of that is really critical, though… as I said before, if I need coreferential concepts then so be it, no biggie.  I certainly don’t see that possibility as justification for discarding the word “concept”!

The really great thing about this book is the introductory material.  Typical historical treatments of “concepts” in philosophy (and even psychology) focus so much on interesting details like prototype representations versus exemplar representations, that the richness of the conceptual system is rarely acknowledged (categorization being only one small facet of conceptual processing).  Recently it seems this is changing, and this very recent book reflects that by giving a really nice characterization of concepts in general.  I like it so much, in fact, that I’m going to use it as the starting point of my framework for organizing requirements for a conceptual system.  It will have to be extended, because it is not rich enough, but it’s a good start.

Thanks, Professor Machery!  I reject your thesis but love the way you organize your thinking! :)

Last Edit: 20 Sep 2011 @ 11:39 AM


 16 Sep 2011 @ 1:41 PM 

Given a viewpoint on intelligence in which accumulating knowledge is useful, it seems clear that an AGI system wants a lot of it, and we’d really like it to be automatically acquired.  I trust this is merely stating the obvious.

In a particular system, we have to think about knowledge in the context of how it is represented.  For my project, that is mainly in the form of “concepts”, although a system architecture which makes good use of a conceptual modelling subsystem would have other forms of knowledge as well (e.g. “facts” and “procedures”).  Right now I’d like to be more general though, so indulge me a little bit, dear reader, and let’s talk about an intuitively defined abstract “Knowledge Unit” (KU).  In one system, a KU might be a statement in predicate calculus; in another, it might be a classifier, or a production rule, or a neural module, or a chunk of LISP, or whatever.

A KU is a description of a part of the universe, which is useful for prediction or other “intelligent” tasks, and by necessity it is expressed in some particular knowledge representation formalism. I view the acquisition of useful knowledge as the result of a search through the possible KUs in that formalism. “Useful” means helping the system behave intelligently (e.g. achieving goals, or building comprehensive domain models). If a system can construct a KU directly (by, say, applying preconceived transformations to environmental inputs), I prefer to think of that as a very targeted search. :) Also: note that KUs can be learned as the result of processing “sensory” data (collected while interacting with the world), but learning can also occur purely as the result of internal processes (introspection, if you like).

It’s not too hard to build a system that learns something useful — by picking the right input and massaging it into an easily-digestible form, so that the “inferential distance” between a learned KU and the input data is small.  For narrow AI tasks this is not a bad thing at all, but for general AI we need to look beyond this, to productivity.  A productive system will continue to find useful KUs indefinitely, and (critically) will use what it has learned to help learn more, in a way that also increases indefinitely.  A productive system is worth running for a long time because it does not stagnate.  I don’t believe any productive systems have ever been built.  For this reason, when somebody does finally produce one, that event will be of great interest!

Because hierarchy is an inherent feature of our universe (and because intelligent systems mirror the universe), knowledge is hierarchical:  knowledge builds on other knowledge.  Also, complexity limits the ability to learn — long “descriptions” are hard to find because of the way search spaces combinatorially explode.  There are two complementary approaches to deal with this fact:

  • Bias the language (the knowledge representation formalism itself) in such a way that a lot of useful descriptions are short, so that KUs can be found more effectively.
  • Bias the search through descriptions in such a way that useful descriptions are found sooner.  One way to bias search is with the use of control heuristics (which in a sufficiently nifty* system would themselves be learned and continually improved).  The other way to bias the description search is to exploit the above-mentioned, empirically observed, hierarchical structure of reality, by finding useful KUs in layers, with newly-found useful KUs depending on older ones.  Any interesting knowledge representation formalism will have the property that learned KUs modify the search space in a way that enables the discovery of new, higher-level (often more abstract), KUs.
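
To illustrate the layering idea in the second bullet, here is a toy sketch in Python, under loud assumptions: a “KU” is just a short sequence of named integer functions, the “formalism” is function composition, and the search is brute-force enumeration up to a fixed length. The primitives (`inc`, `dbl`) and targets are invented for illustration and stand in for nothing in particular.

```python
from itertools import product


def compose(primitives, names):
    """Build one function that applies the named primitives left to right."""
    steps = [primitives[n] for n in names]

    def composed(x):
        for step in steps:
            x = step(x)
        return x

    return composed


def search(primitives, target, max_len, inputs=range(5)):
    """Return the shortest sequence of primitive names that matches `target`
    on all `inputs`, or None if nothing of length <= max_len works."""
    wanted = [target(x) for x in inputs]
    for length in range(1, max_len + 1):
        for names in product(primitives, repeat=length):
            candidate = compose(primitives, names)
            if [candidate(x) for x in inputs] == wanted:
                return names
    return None


# Toy base language (assumed for illustration).
base = {"inc": lambda x: x + 1, "dbl": lambda x: x * 2}


def times4(x):
    return 4 * x


def times16(x):
    return 16 * x


# Flat search: 16x needs four doublings, which exceeds the length bound.
flat = search(base, times16, max_len=3)

# Layered search: first learn 4x, add it to the language, then look for 16x.
found4 = search(base, times4, max_len=3)
layered = dict(base)
layered["times4"] = compose(base, found4)
found16 = search(layered, times16, max_len=3)

print("flat:", flat)
print("4x:", found4)
print("16x via layer:", found16)
```

The learned `times4` changes the search space: a description that was out of reach at the base level becomes a two-step description one layer up. Real formalisms and control heuristics are obviously far richer than this.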

The process of building these layers is productive if:

  • Deep dependency chains between KUs form (that is, there are many layers) — this will generate increasing abstraction.
  • Lots of particular KUs are used by many higher-layer KUs.
  • The goal-achieving capability of the system increases as KUs are learned.
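
The first two criteria can be measured on a dependency graph of KUs. Below is a minimal Python sketch, assuming each KU simply records which other KUs it builds on; the KU names are hypothetical and purely illustrative. The third criterion (goal-achieving capability) depends on the task and is not modelled here.

```python
# Hypothetical KU dependency graph: each KU maps to the KUs it builds on.
DEPENDS_ON = {
    "edge": [],
    "surface": ["edge"],
    "object": ["surface", "edge"],
    "container": ["object"],
    "cup": ["container", "object"],
}


def layer(ku, graph, cache=None):
    """Length of the longest dependency chain below `ku` (0 for a base KU)."""
    if cache is None:
        cache = {}
    if ku not in cache:
        deps = graph[ku]
        cache[ku] = 0 if not deps else 1 + max(layer(d, graph, cache) for d in deps)
    return cache[ku]


def reuse_counts(graph):
    """Number of other KUs that directly build on each KU."""
    counts = {ku: 0 for ku in graph}
    for deps in graph.values():
        for d in deps:
            counts[d] += 1
    return counts


depth = max(layer(ku, DEPENDS_ON) for ku in DEPENDS_ON)
reuse = reuse_counts(DEPENDS_ON)
print("deepest dependency chain:", depth)
print("direct reuse per KU:", reuse)
```

Tracking how these two numbers change as a system runs would be one crude way to tell whether it is still building new layers or has stagnated.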

I’m looking forward eagerly to the day when AGI systems begin to exhibit productivity.  And, of course, one of the requirements for my conceptual modelling system is that it must be productive — when interacting in a rich and dense way with the real world — as part of a complete architecture.


* a theoretical treatment of Niftiness will be available Real Soon.

Last Edit: 17 Sep 2011 @ 01:49 PM


 14 Sep 2011 @ 6:35 PM 

Since about 1988 or so, I have been fascinated by the prospect of making computers truly intelligent — in the “Big Picture”, general, sense of the word.  This fascination is driven primarily by profoundly deep senses of wonder and awe and curiosity about the fact of my own existence and the nature of the peculiar and magnificent universe into which I was born.

This fascination could have found expression in different ways.  I could have been drawn to study physics or cosmology; or biology or evolution; or art or music; or meditation or theology; or psychology or philosophy; or history or economics.  I could even have pursued a shallow interest in all of these astonishing things.

However, since childhood, my aptitudes and interests have led me to specialize in designing and writing computer programs.  I’m a “computer guy”, and since computers were built, from the beginning, to mirror aspects of our minds (automating mental tasks), and minds are mirrors of the universe, and my own existence (as a conscious mind) is such an unlikely fact, AI is the natural subject for me to focus on.

And I do have to focus.  Rather than just consuming interesting and entertaining material from our vast cultural reserves, I am driven to investigate, understand, and (most importantly) create.

Of course, being born a human creature leads to more than just contemplating the awe-inspiring.  We live, we love, we work, we struggle, we sing, we change, we tango… for most of those fleeting decades of my life I hardly paid any attention to AI at all.  Besides a few years as a graduate student long ago, it is only recently that I have immersed myself in the Big Question.

Since then, I have come to appreciate just how big an endeavor AGI is.  Some existing projects (like OpenCog and various academic efforts) are aiming explicitly at full AGI architectural implementations, but I don’t see answers in them for the difficult questions, or mechanisms that seem capable of scaling to real-world complexity.  What I do see is slow incremental progress, and other nearly parallel lines of inquiry slowly converging on solutions to the mysteries of mind:  neuroscience mapping brain architecture and function; robotics inching toward adequate world models; cognitive psychology and linguistics developing better descriptions of mental function; philosophy digging toward the core of existence.

Some decades down the road — maybe two, maybe four, maybe ten — we are going to get there, and I have written before about why this is a Good Thing.  It won’t be an individual achievement:  the problem is too vast for one human to solve; the mind is too complex to be characterized by a simple equivalent of a physics equation; an AGI program will require too much code for one coder to write.  And the fuzzy, exception-laden, emergence-loaded, cultural, thingy universe is too complex for a puny 2011 computer to model in sufficient detail.  Evolution gave each of us 100 billion neurons with 100 trillion synapses for a reason.  Computers will get faster over time, though, and we will get there.

I wonder:  how do I fit into all of this?

I wanted (and still do want) to break off a person-sized piece of the puzzle and click it into place.  That would be a worthy achievement for one lifetime.  But there is, as yet, no agreed-upon foundation for breaking the enterprise into such pieces.  At this stage, every AGI researcher must invent their own perspective, and strive for understanding in their own idiosyncratic way — all the while feeding back into the subconscious babble of the global brain, which will actually produce the solution.

I have chosen a speculative study of the requirements for a close analogue of the human conceptual system as the most sensible direction for my own effort.  Even that is likely too broad and deep a task for complete solution in just a few years — the mind is too integrated and interdependent for “concepts” to be isolated and targeted independently of the rest of the mind.  But progress can be made, I believe, once I absorb enough of the brilliant work of other thinkers to serve as a base for moving forward.  Even simply providing some kind of synthesis of the multidisciplinary work on this and related subjects would be a valuable achievement.

My hope is that, after a lot more such study and thought, early-stage experimental implementations of a synthetic conceptual system will prove academically and commercially interesting — enough for some kind of small business venture exploiting the technology to become worthwhile.

Because:  I would really like to be a continuing part of the AGI development effort — and to make a living while doing so!

That’s the plan, anyway.

At the moment, I am still thinking about the relationship between perception and “lowest level” concept creation, which may form the foundation for a complete conceptual system.  Having made a first pass through a lot of the literature on Image Schemas, I have moved on to collect thoughts and ideas from developmental psychology; I am currently absorbed in Susan Carey’s wonderful book The Origin of Concepts, and there are a few more books on the pile from this general academic discipline.  These readings touch on many different requirements for conceptual systems, so it is very fertile ground to be digging around in.

Until next time, dear reader, I hope your own loving, working, struggling, singing path through existence is full of joy.

Last Edit: 16 Sep 2011 @ 02:07 PM


 10 Sep 2011 @ 12:40 PM 

Since self-improvement through self-reprogramming is part of the Grand Vision of the far-future evolution of AI systems, it’s interesting to investigate how Automatic Programming is being used in the context of contemporary AGI research.  The importance of the idea was reflected in the AGI-11 conference, which had a special workshop on the subject of Self-Programming.  As an observer, I was struck primarily by the lack of such reflective capabilities in the various exploratory architectural projects, and the general lack of focus on Automatic Programming by the AGI community.  It isn’t really a shock, though… it is hardly surprising that pre-or-proto-AGI systems do not include sophisticated self-programming, because the general programming task is AGI-complete.

Mostly the workshop revealed the fuzziness of the distinction between self-programming and learning in general.  Cynically speaking, it seemed to me as if some of the papers merely adapted unrelated research, describing learning as a type of self-programming (in order to get a quick and easy publication credit).  However, there certainly is some validity to the viewpoint that learning and self-programming overlap:  people “self-program” to change their behavior; any sort of script learning is procedural; any other kind of “learning how” is like learning a program; and learning by imitation (as in Hall’s workshop paper) can reasonably be considered self-programming.  It is less clear, though, that there is any real advantage in viewing these kinds of learning tasks and mechanisms as “self-programming”.

On a smaller scale, variants of Genetic Programming (and other types of limited-capacity search through “program-space”) are surprisingly common as parts of AGI research architectures.  One example of this, presented at the workshop, is the AGINAO project.  Other more mature examples are the LIDA and OpenCog architectures.  To me, there are four interesting questions about this application of Automatic Programming:

  • The purpose of the programs:  What functions are they intended to perform?
  • The programming language used:  Why was it chosen?  What features does it have that make it appropriate for the purpose of the resulting programs?
  • The search method:  How is program-space explored to produce the programs?
  • The utility function:  How are the programs evaluated?
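
To show how the four questions map onto the parts of a program-search system, here is a toy sketch in Python. It is a plain mutation hill climber, which is a much simpler stand-in for Genetic Programming (no population, no crossover), and it does not describe AGINAO, LIDA, or OpenCog. The target function, the three-opcode language, and all names are assumptions made up for the example.

```python
import random

# 1. Purpose: programs should compute f(x) = 2x + 3 on a few integers (toy target).
EXAMPLES = [(x, 2 * x + 3) for x in range(-3, 4)]

# 2. Language: a program is a list of opcode names applied left to right.
OPS = {"inc": lambda x: x + 1, "dec": lambda x: x - 1, "dbl": lambda x: x * 2}


def run(program, x):
    for op in program:
        x = OPS[op](x)
    return x


# 4. Utility: negated total absolute error over the examples (0 is perfect).
def utility(program):
    return -sum(abs(run(program, x) - y) for x, y in EXAMPLES)


# 3. Search: mutate a single program and keep changes that do not hurt.
def mutate(program, rng):
    program = list(program)
    choice = rng.choice(["insert", "delete", "replace"])
    if choice == "insert" or not program:
        program.insert(rng.randrange(len(program) + 1), rng.choice(list(OPS)))
    elif choice == "delete":
        del program[rng.randrange(len(program))]
    else:
        program[rng.randrange(len(program))] = rng.choice(list(OPS))
    return program


def hill_climb(steps=2000, seed=0, max_len=8):
    rng = random.Random(seed)
    best = []
    best_score = utility(best)
    for _ in range(steps):
        candidate = mutate(best, rng)
        if len(candidate) > max_len:
            continue  # crude size limit keeps the search space bounded
        score = utility(candidate)
        if score >= best_score:
            best, best_score = candidate, score
    return best, best_score


program, score = hill_climb()
print(program, score)
```

Even in a toy like this, the four choices interact: with only `inc`, `dec`, and `dbl` available, the shape of the utility landscape is determined as much by the language as by the search method.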

Since I currently imagine that some functionality of this type will be useful in the conceptual modelling system I am working on, I am quite curious about the answers to these questions for all AGI-like systems that use program search.  I’ll make a blog entry another day about the results for these three systems at least.  I wish there was more research out there about theoretical issues in this area… e.g.: the structure of fitness landscapes, or search spaces more generally, for different types of languages; designing languages for evolvability; hierarchical or incremental utility functions; information-theoretic principles of search control in program-space; etc…

There are a few theoretical ideas out there about more direct approaches to self-reprogramming AGI.  One that interests me particularly is Schmidhuber’s Gödel Machines.  A Gödel Machine is a “self-referential problem solver” that can operate arbitrarily on its own source code to make improvements, if and only if it can prove that those improvements are optimal with respect to a built-in utility function (which for many AGI-like applications includes a reward signal from the environment).  Besides whatever “behavior” modules it starts with, it also has to start with a proof system capable of doing those optimality proofs, and (here’s the cool part) it will replace the code of that proof system if it finds a better prover (by searching for better ones using a search mechanism that can itself be improved, because it is all part of the “self” that the Gödel Machine is designed to modify).  This thing has some very desirable properties!  Like, for example, it could replace the utility function itself with a “better” one, but will do so only if the new one is better by the criteria laid down by the original one… that is, it can’t “wirehead” itself into fake utility, and it will preserve whatever policy was embodied in the original utility function.  This is a beautifully elegant idea that appears to be much better than any of the other formal mathematical approaches to AGI (like AIXI or Schmidhuber’s own previous OOPS system) because of the optimal self-improvement feature (which, to me, is Automatic Programming in its truest sense).
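The accept-a-rewrite-only-when-justified rule can be caricatured in a few lines of Python.  This is emphatically not Schmidhuber’s construction: a real Gödel Machine searches for formal proofs about its own code, whereas this sketch assumes a small finite set of states and substitutes exhaustive evaluation for proof.  The class `SelfRewritingAgent` and its methods are hypothetical names of my own.

```python
from typing import Callable, Iterable

Policy = Callable[[int], int]
Reward = Callable[[int, int], float]


def expected_utility(policy: Policy, states: Iterable[int], reward: Reward) -> float:
    """Average reward of a policy over a finite, fully known set of states."""
    states = list(states)
    return sum(reward(s, policy(s)) for s in states) / len(states)


class SelfRewritingAgent:
    """Toy agent that swaps in a new policy only when it is verifiably better."""

    def __init__(self, policy: Policy, states: Iterable[int], reward: Reward):
        self.policy = policy
        self.states = list(states)
        # The original utility criterion.  Candidates are always judged by
        # this one, never by a criterion the candidate brings along.
        self.reward = reward

    def proves_improvement(self, candidate: Policy) -> bool:
        # ASSUMPTION: exhaustive evaluation stands in for the proof search.
        current = expected_utility(self.policy, self.states, self.reward)
        proposed = expected_utility(candidate, self.states, self.reward)
        return proposed > current

    def consider_rewrite(self, candidate: Policy) -> bool:
        """Adopt the candidate if and only if the check above succeeds."""
        if self.proves_improvement(candidate):
            self.policy = candidate
            return True
        return False


if __name__ == "__main__":
    # Hypothetical task: the best action in state s is 2 * s.
    agent = SelfRewritingAgent(lambda s: s, range(10), lambda s, a: -abs(a - 2 * s))
    print(agent.consider_rewrite(lambda s: 0))      # worse policy, rejected
    print(agent.consider_rewrite(lambda s: 2 * s))  # better policy, adopted
```

Because every candidate is scored with the original `reward`, a rewrite cannot win by bringing its own, more flattering scoring rule, which is the anti-wireheading property in miniature.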

Unfortunately, like the other gorgeous formal approaches to AGI, Gödel Machines appear to me to be wildly impractical for actual systems:

  • The “initial system” that is envisioned bases its utility function on an expected REWARD signal from the environment (i.e. it is a Reinforcement Learning framework).  Defining such a reward signal for a general intelligence in the human world seems approximately impossible — capriciously arbitrary or unhelpfully sparse and discontinuous.
  • Even with such a REWARD signal, the practical intelligence of such a system lies in the utility function (its sophistication and accuracy as a world model expresses that intelligence), so improving it would seem to be a major goal of the Gödel Machine’s self-modification.  However, I don’t see how such changes can be made to the utility function since it has to provably model the REWARD signal (which may exhibit regularities, but not logically necessary ones).
  • So it seems most likely that the RL framework will have to be abandoned in favor of a more speculative utility function.  It isn’t at all clear what it should be….  In a system like this, the intelligence of everything, except the self-modification mechanism, ends up embedded in the utility function, which is my usual critique of all these highly formal approaches to AGI.  Perhaps there is some formal definition of intelligence (different from cumulatively maximizing the REWARD signal) which could start as a small utility kernel and be built up in this way, but I don’t know what it would be.
  • Besides these fundamental objections, the processing power required for the state of the art in this kind of proof search is insane.  Perhaps some variant using a different underlying structure designed specifically for evolvability will be invented — or, maybe, someday a fundamental change in computer technology will arrive which provides virtually infinite processing speed.

Anyway, the elegant formal nature of Gödel Machines makes the approach seem to fit well with the Friendly AI requirement of maintaining provably stable goal systems under radical self-modification.  The whole thing is so lovely that I’m now a huge Schmidhuber fan! :)  I had the pleasure of hearing him give a talk at AGI-11 (on a very different topic); he is an entertaining and eloquent speaker.

There are still two things I want to do before leaving the topic of Automatic Programming:

  1. Muse for a little while about representing knowledge with program code.
  2. Think about a way that programming (real AGI-complete programming) could form part of an AGI application test case.  I’ve been spending a lot of time lately thinking about things a conceptual modelling system could do (applications, “tests”, challenges, etc).  A system that can learn about programming-related concepts might be part of an interesting development path.  To an old programmer, computer programming seems like an obvious domain task. :)
Tags Categories: Uncategorized Posted By: Derek
Last Edit: 17 Sep 2011 @ 01:34 PM


 09 Sep 2011 @ 6:57 AM 

Stanford University is doing something very interesting this semester: they are offering three Comp Sci courses online for free.  The courses are Artificial Intelligence, Databases, and Machine Learning.  I signed up for the AI and ML courses.  Here’s why:

  • Although almost all of the material will be review for me, it’s been quite a while since I did the college courses (like 25 years!) and it won’t hurt to review it.  I used some targeted AI and ML stuff in my recent computer security career arc (because of the irritating habit companies have of buying each other, that included working for Novatix, PC Tools, and finally Symantec, although I worked with the same team for the most part throughout those five-ish years).  But that work only used a few AI techniques, and a refresher seems like a good idea.
  • My current work as an independent AGI researcher is actually pretty far away from the technical details of either mainstream AI or Machine Learning, so I don’t expect it will be very helpful to me right now, but you never know…  looking at old material through a new perspective often provokes interesting insights and ideas.
  • I am intrigued by the sheer scale of it.  So far, over 100,000 people are signed up for the AI course, and over 50,000 for the ML course.  It’s amazing!  I want to see how they manage such large courses, and I expect the lectures to be of very high quality.
  • With so many people taking these courses, I am certain to work in the future with people who have also taken them.  So it will be useful not only for water-cooler conversation :) but also because it will provide a shared context of terminology and subject matter that could be useful in working with future teams.

Even though I’m pretty familiar with most of the material already, these are real courses with homework and exams, etc — so it will take some time and effort.  But I think it will be worth it.

See you in class, dear reader!

Tags Categories: Uncategorized Posted By: Derek
Last Edit: 09 Sep 2011 @ 06:58 AM


 06 Sep 2011 @ 11:07 AM 

Long, long ago… when dreams were simple and the world was young (which of course means when I was young), Comp Sci types dreamed a dream of Automatic Programming.

The idea of computers programming themselves has that aura of self-referential loopiness that us nerdy types find delightfully irresistible.

It’s also the most easily-imagined startup mechanism behind I. J. Good’s “Intelligence Explosion”, which has memetically evolved into and flourished as the Technological Singularity. As Good wrote in 1965:

… an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion’ and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make.

Besides all that, though, from the very start of the Computer Age, people who worked with computers and wrote a lot of computer programs thought a lot about computer programming. To the uninitiated, computer programming seems like it must be an airy, abstract task, full of creative precision and Deep Thought. But programmers punching their Fortran programs onto stacks of cards knew that, in fact, most of the actual work involved in programming was frustratingly error-prone, and (as with any other sort of trade skill) it was very often repetitive and mechanical. That’s all still true, by the way.

Through the late ’80s, at least, Automatic Programming was a topic of furious interest in AI, and it followed the usual pattern of AI dreams:

  • The tractable early steps in attacking the problem generated tremendous commercial successes, after being spun off into other disciplines (programming languages, compilers, software development tools, etc).
  • The “big picture” turned out to be way harder than expected, progress became slow and murky, and interest died out.

Like many other such things, computer programming turns out to require a high degree of general intelligence (i.e., it is “AGI-complete”). To see why this is, consider what it means to write a computer program:

  • On one end of the process, we have a computer — the ultimate (so far) in physically-realized formal systems: its behavior is pretty close to being exactly characterized by an abstract formal description. That’s why and how we can program it! We can predict exactly what it will do with the specific commands we give it — if we are careful and if weird error conditions outside the formal description (like, say, cosmic rays corrupting memory values) do not occur.
  • On the other end of the process, we have a “specification” of what the program should do.
  • Programming is the task of connecting those two sides.

I had the good fortune at the AGI-11 conference to meet J. Storrs Hall, who has written (in his book Beyond AI) a good description of why the Automatic Programming task (and so many other AI tasks) fizzled out. I’m writing this from my own perspective, but I think he’d agree with my characterization. Formal systems — in mathematics and science, and their application in engineering — are very powerful and have had a huge impact not only on our ability to control our world, but also on how we view that world. It was kind of unexpected that the effort to formalize the world would crash and burn when it came time to characterize the concepts that we humans use to think and communicate. The whole project of AI (as originally conceived) went up in smoke when that turned out to be the case.

In this instance, as any programmer knows, a specification is almost never a complete formal thing. It is a human construct, often vague even by human standards (sometimes not even existing as a textual description and instead just implied by a common unstated understanding of what the goal is). Most everybody agrees that it would be nice if specifications could be more formal but we just haven’t figured out how to formalize the kinds of things human beings want — including what we want computer programs to do. So we still specify them informally.

And the ability to understand “informal” human constructs (here: program specifications) with enough skill to (for example) connect them to a formal system (here: by writing a computer program) is the Great Unsolved Problem of our day. As I keep writing about, I believe the best approach to the problem is to try to get machines to conceptualize the thingy universe in a way that is similar to (and, at some level of sufficiency, compatible with) how we do it. I think many other AGI researchers would agree with that viewpoint (even though it is not the kind of language they would usually use to describe what they are up to), but we have vast disagreements about what “similar” means, and how best to get there.

More generally, though, we might wonder: Why did the effort to formalize the thingy world fail? Have we just not yet developed the correct formal mechanisms, or is it inherently intractable?

Because Automatic Programming is such an interesting example of this general problem (at least to an old code-monkey like me), I think I will continue musing on this subject for a while longer.

Tags Categories: Uncategorized Posted By: Derek
Last Edit: 17 Sep 2011 @ 02:05 PM


 31 Aug 2011 @ 5:22 PM 

Lots of interesting Real Life stuff has limited my research time lately (including a move to Seattle), but I have been doing some researchy things, and it seems like a good idea to blog a status update.

As the scope of the requirements for a conceptual modelling system continues to broaden in my thoughts, it seems inescapably clear that I really am an AGI researcher.  I came to this conclusion somewhat reluctantly… firstly, I am not actually trying to build an AGI; and, secondly, many of the issues faced by real (i.e. academic) AGI researchers (while definitely utterly fascinating) are not particularly relevant to my own goals… e.g.:

  • “Decision Theory” — how an agent decides what to do next, given its goals and feedback from an environment.
  • Supergoal selection for wisely-constructed generally-intelligent agents.
  • Mathematical theories of “intelligence” as a formally-defined property.
  • Grand (“complete”) cognitive architectures for generally-intelligent agents.
  • Neural Networks (per se).
  • Logic, truth, probability, etc — except insofar as they help characterize reasoning in a conceptual system.

Still, just a few weeks ago I attended the AGI-11 conference at the Google complex in Mountain View, CA, and came away with the distinct impression that those researchers really are kindred spirits… even if their approaches and specific research programmes are very different from my own.  It’s kind of nice to feel like part of a group. :)

The conference definitely rekindled a desire to write more numerous (and, by necessity, briefer) blog entries — like this one.  We’ll see how that goes.

Meanwhile, in the process of trying to get a handle on the requirements for a conceptual modelling system, I have been wrestling with the issue of what should constitute a “requirement” in a domain that is far less well-specified than, say, a typical software engineering project.  It’s not just a question of vagueness or scope (although those are important and vexing concerns):  it’s not even clear what kinds of things should be considered for inclusion in the requirements list.  Such a situation is symptomatic of a field with so little common ground among its practitioners — and is part of what makes the whole enterprise so interesting and exciting and challenging.  But as a result, nearly everything I read in AI or cognitive science seems to produce yet more issues or capabilities that must be taken into account.

So I have over a hundred “requirements” to date — most of them barely fleshed-out at all, yet; and I have no doubt that this list will continue to grow.  Besides working through the required detailed analysis, it also seems like it is time to start considering one or more applications — both to help focus the research effort, and to aim toward some result with tangible value.

Since I intend to blog more often, I’ll have more to write about requirements and applications shortly — along with a variety of other things that my musing and meandering mind finds interesting.  Hopefully, dear reader, some of it will interest you as well.

Tags Categories: Uncategorized Posted By: Derek
Last Edit: 31 Aug 2011 @ 10:45 AM

