    Groupware and Corporate Repositories:

    A Proposal for Leveraging Intellectual Capital

    (c) Copyright Paul Pangaro 1994, All Rights Reserved.

    Combining the intelligence of human agents and the complexity handling of software agents to create a new level of personalized, knowledge-based systems

    This text was written while I was working for a software company whose groupware products related to the proposal contained herein. The examples were chosen to be specific to that context, and may seem a little narrow (or even obscure) for a general description.

    It is difficult to detail all that is involved or has been thought through in the many years spent designing and constructing systems along the lines described. My goal here is to be evocative, not complete. I invite further conversation on the content and would be very happy to discuss technical and philosophical details, and business consequences, to any depth.


    Organizations are poor at harnessing their internal knowledge, and present-day groupware does far too little to help -- only some corporate knowledge is captured, and so very little becomes available enterprise-wide for review and use.

    Humans will remain for a considerable time the most effective "intelligent agents", and present-day software does little to make us efficient in this role on behalf of our colleagues (and ourselves). With intelligent software agents taking all the attention, the idea of capturing our own everyday, knowledge-based desktop activities and making the results available to all is missed.

    Both of these areas can be addressed with the same software tool. Such a tool would use a small "knowledge web" (k-web) to represent the core content of any material that is produced at the desktop (e-mail, documents, spreadsheets, calendar entries, and so on). (You can think of a k-web as a set of topics that are associated in small bundles; for example, "workflow", "product information" and "Notes" would be connected in a bundle, and connected to that bundle would be a product brochure explaining the application of Notes to workflow.) Every piece of data would travel and/or be stored in a database with its corresponding k-web, and a "k-web server process" would integrate the k-webs from all users.
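
    The bundle idea can be made concrete with a minimal sketch. Everything named here (Bundle, KWeb, the methods, and the file names) is hypothetical and chosen only for illustration; the proposal does not specify a data model at this level of detail.

```python
from dataclasses import dataclass, field


@dataclass
class Bundle:
    """A small set of associated topics plus the media attached to them."""
    topics: frozenset
    media: list = field(default_factory=list)


@dataclass
class KWeb:
    """A hypothetical knowledge web: a collection of topic bundles."""
    bundles: list = field(default_factory=list)

    def add_bundle(self, topics, media):
        bundle = Bundle(frozenset(topics), list(media))
        self.bundles.append(bundle)
        return bundle

    def bundles_for(self, topic):
        """Return every bundle that contains the given topic."""
        return [b for b in self.bundles if topic in b.topics]

    def neighbours(self, topic):
        """Topics that share at least one bundle with the given topic."""
        linked = set()
        for bundle in self.bundles_for(topic):
            linked |= bundle.topics
        linked.discard(topic)
        return linked


# The example bundle from the text, plus one more (file names are made up).
web = KWeb()
web.add_bundle(["workflow", "product information", "Notes"],
               ["notes_workflow_brochure.txt"])
web.add_bundle(["workflow", "products", "legacy connectivity"],
               ["connectivity_overview.txt"])
```

    Here web.neighbours("Notes") yields the two other topics of the first bundle, and "workflow" belongs to both bundles, which is what lets a user move from one bundle to the next.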

    Such a system would:
    • capture all useful exchanges to become part of a corporate repository, becoming one tangible form of the "intellectual capital" of an organization
    • keep track of what you have seen, distinguish different work contexts that you use, and provide a level of personalization in its responses to you that includes awareness of your background, your history with the system, and your cognitive learning preferences
    • provide a conversational medium in which new problems can be formalized and solved, all the while capturing the problem-solving strategies and history
    • significantly increase the efficiency and effectiveness of existing groupware for any organization.


    We have all had the following experiences:
    • not knowing where to look on-line for a piece of data about our organization, procedures, objectives
    • not knowing whom to ask about something
    • spending considerable time responding to a question via e-mail, only to be asked the same question again by someone else
    • not getting a response from someone in the organization who probably knows what you need, and who may be the only one
    • spending lots of time browsing through today's e-mail, most of which isn't immediately relevant but we might want to see later
    • "categorizing" our e-mail and then forgetting how to find it
    This document proposes an approach to solving all of the above problems without significantly increasing our individual overhead.

    Organizational Need

    Organizations that rely on "knowledge as its chief resource and result" are increasingly interested in harnessing their "intellectual capital", defined as "intellectual material that has been formalized, captured and leveraged to produce a higher-valued asset." [Fortune October 3, 1994].

    Despite the near-saturation of information technology in organizations today, very little, if any, of it addresses the issues of intellectual capital directly. Even when an organization is in the forefront of using e-mail and other groupware products, much corporate knowledge is left completely outside the data networks. And much of what does travel those networks ends up unattached and unrelated to daily work.

    It is tempting to apply a series of "band-aid" fixes to these problems without addressing the fundamental issues. Even if there were a database of databases; even if threaded conversations were the default; even if additions to the database triggered notices to all "appropriate" users; even if we all had time to read and respond to all of our e-mail; none of these improvements and ideals would ameliorate the problem of capturing the data in a manner that allows for flexible, conversational, and personalized recall.


    The following scenario constitutes a proposal for the development of a practical software tool to address the problems noted above. This proposal is not simply a concept, but a practical piece of software that has been prototyped in a number of applications.

    Imagine that, after each e-mail composition (or any activity at the desktop) and before mailing (or saving a file), the author spends a few moments helping the system create a simple knowledge web to represent the material just typed. The incentive to do so is the same as that of "categorization", which we all do anyway because (a) it is a way to index the material, keeping it accessible, while at the same time (b) it moves the material out of our current scope of attention by getting it out of our current work queue.

    This knowledge web would travel along with the e-mail content itself. As it passed through its k-web server (and given the appropriate access control settings), the new k-web piece would be integrated with the existing k-web. Much as a mail server now reports back about delivery failures and unresolved addresses, the k-web server would report back to the author whether other entries (by the same or other authors) appear to be closely related to the new one. (None of this is semantic processing; the server acts only on the k-web structure. The calculus required is well worked out.) The author would have the option of exploring those entries that the server has flagged as possibly identical or contradictory. This activity has the very real and valuable advantage of stimulating the author to create the new entry in the context of existing thoughts, terminology and contexts. Thus the system itself stimulates further thought and refinement, helping to draw out the creative capacity of users. Hence the system provides not just a repository, but a catalyst for conversation through the medium of the software itself.
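
    As a rough illustration of a purely structural report, the sketch below compares topic sets using Jaccard overlap. This is a stand-in chosen for simplicity, not the calculus the proposal refers to; the class name KWebServer, the integrate method, the 0.5 threshold, and the author names are all assumptions.

```python
def jaccard(a, b):
    """Structural overlap of two topic sets: shared topics / all topics."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


class KWebServer:
    """Hypothetical server that integrates entries and reports related ones."""

    def __init__(self, threshold=0.5):
        self.entries = []           # list of (author, frozenset of topics)
        self.threshold = threshold  # assumed cut-off for "closely related"

    def integrate(self, author, topics):
        """Store a new entry; return existing entries that look related."""
        topics = frozenset(topics)
        related = []
        for other_author, other_topics in self.entries:
            overlap = jaccard(topics, other_topics)
            if overlap >= self.threshold:
                related.append((other_author, sorted(other_topics), overlap))
        self.entries.append((author, topics))
        return related


server = KWebServer()
first_report = server.integrate(
    "alice", ["workflow", "Notes", "product information"])
second_report = server.integrate(
    "bob", ["workflow", "Notes", "pricing"])
```

    The first entry produces an empty report because nothing is stored yet. The second shares two of four distinct topics with the first, so the server reports it back to the author for review, in the same spirit as a delivery-failure notice.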

    Once satisfied, the author releases the e-mail (and its k-web) for forwarding to its recipients. The k-web server maintains all the data for access and reuse by the author, recipients and others, according to the scenarios that follow.

    Personalization & Navigation

    There are many consequences of linking the author's original data to a k-web. Navigation through the repository becomes a traversal of hyperlinks. "Refining the query" becomes part of the navigation, as the user moves from topic to topic and sees the corresponding connections to other topics along the way. Navigating along the links displays the bundles that they are part of; each bundle has at least one piece of media connected to it (text, sound, graphic, screen cam, animation, etc.). The system itself helps decide which piece of media to display in a given context, as explained below.

    The universe of possible topics is represented in a graphical display of the available web (a direction that commercial products are moving toward rapidly, for example, Visual Recall by XSoft). For example, a user could begin at "workflow" and move to "products" (a list of product names, as distinct from "product information") and then to "legacy connectivity." Note that all of the topics in the series then constitute a context for further interactions, should the user choose to use them. If the user navigates the k-web toward the topic called "SoftSwitch" in the above context, the system retrieves details of SoftSwitch's legacy connectivity for marketed products that connect to DBMSs, and not reports of their recent acquisition by Lotus or their offerings in messaging backbones. This is an example of "user context" aiding and personalizing the navigation.
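
    One simple way to picture "user context" is to rank the media attached to a topic by how many of their topics the user has already visited. The sketch below does only that; the function name rank_media, the dictionary layout, and the topic labels attached to each item are assumptions made for illustration.

```python
def rank_media(candidates, context):
    """Order media by how many of their topics appear in the user's path."""
    context = set(context)
    return sorted(candidates,
                  key=lambda item: len(context & set(item["topics"])),
                  reverse=True)


# Hypothetical media attached to the "SoftSwitch" topic.
softswitch_media = [
    {"title": "Acquisition by Lotus",
     "topics": {"SoftSwitch", "acquisitions"}},
    {"title": "Messaging backbone offerings",
     "topics": {"SoftSwitch", "messaging"}},
    {"title": "Legacy connectivity to DBMSs",
     "topics": {"SoftSwitch", "products", "legacy connectivity"}},
]

# The navigation path from the example: each visited topic adds context.
path = ["workflow", "products", "legacy connectivity", "SoftSwitch"]
best = rank_media(softswitch_media, path)[0]
```

    With this path, the legacy-connectivity item shares three topics with the context and is ranked first, while the acquisition and messaging items share only "SoftSwitch" itself.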

    Given that certain roles and types of expertise exist within organizations, it is appropriate to pre-define classes of users (whose characteristics become individualized according to actual use by a particular individual). For example, an AA might have different uses of the navigator than a product architect or a QA expert. An AA's "user profile" might therefore indicate primary "interest" or "understanding" or "awareness" in topic bundles relating to expense reports, direct reports and locations to hold meetings. An AA's navigation through the web following the path of "meetings", "September 94" and "workflow" would at first reveal details of conference rooms and meeting organizers, whereas a product architect would first see technical outcomes or planning schedules.

    The more an individual navigates through the web, the more the web "knows" about the user's scope of interests. New navigations take place informed by this experience, in much the same way that human conversations are more directed and better matched after knowing another individual even a little. (We do this adjustment so automatically, and with everyone we meet, that we forget that we interact with one person differently than another based in large part on such a shared "user history.") A very simple example: a user's first navigation from "document management" to "standards" might offer explanations of the background of Shamrock and the acronym SGML. The next time, the user's need for such introductory material is far less likely and would be skipped. If the system is "wrong" to skip over, the user can simply navigate back along a time-line of transaction history. In practice, navigation based on user history becomes extremely specific to each user, saves a great deal of time, and creates a feeling of comfort and ease with the system.
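
    The "standards" example can be sketched as a per-user timeline: material already shown is skipped, and the user can step back if the skip was wrong. The class UserHistory and its methods are hypothetical names, and a real system would presumably weigh much more than a seen/unseen flag.

```python
class UserHistory:
    """Hypothetical record of what one user has been shown, oldest first."""

    def __init__(self):
        self.timeline = []

    def next_item(self, items):
        """Show the first item not yet seen; fall back to the last one."""
        for item in items:
            if item not in self.timeline:
                self.timeline.append(item)
                return item
        self.timeline.append(items[-1])
        return items[-1]

    def back(self):
        """Step back one entry along the timeline, if there is one."""
        if len(self.timeline) > 1:
            self.timeline.pop()
        return self.timeline[-1] if self.timeline else None


# Media on the link from "document management" to "standards",
# ordered from introductory to advanced (titles are made up).
standards = ["Background: Shamrock and SGML", "Current standards status"]

history = UserHistory()
first_visit = history.next_item(standards)
second_visit = history.next_item(standards)
after_back = history.back()
```

    On the first visit the user sees the background material; on the second it is skipped in favor of the status item; calling back() returns to the background item if the skip was unwelcome.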

    In addition to "user context", "user profile" and "user history", there is a further level of personalization that is possible: the system can take into account a user's "cognitive style." Cognitive style can be informally explained as follows: whenever we learn something new, some of us prefer the "big picture" before zooming into details; some of us like to cover every piece of detail before enlarging the scope of our query; some of us like graphics more than sound, or examples more than descriptions. Whatever our preference, a good listener (e.g., a good teacher or friend) realizes this and tends to accommodate it. It is very simple to create a series of such "user preference axes" (of which the above are only a few possibilities) and have the system track which values along each axis "please" us and which do not. Of course there must be some form of feedback to the system on these dimensions, which can be as simple as an "I like it, I get it" versus "I don't like it, I don't get it" dichotomy. The software can determine if there are patterns against the dimensions it is looking for.

    It is possible to handle this complexity because of the relationship between the k-web (which represents the structure of the knowledge in the repository) and the data in the repository (all the media itself, the text, screen cams, sound clips, etc.). All of the tracking of individualized data about a particular user is accomplished by user-specific markings on the k-web. Metrics are calculated on the k-web to determine the relevance of one piece of media attached to the web, versus another. If one is shown which is not "right" to the user, another is available with one more mouse click. If the reason the original one is not right is consistent against the user preference axes and across interactions, the system will learn to avoid later ones of the same type.
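
    The "user preference axes" and the simple like/dislike feedback can be illustrated with a small score table per user. This is only a sketch under assumptions: the axis names ("scope", "medium"), the class PreferenceTracker, and the plus-one/minus-one scoring are invented here, and the proposal's actual metrics on the k-web are not described in enough detail to reproduce.

```python
from collections import defaultdict


class PreferenceTracker:
    """Hypothetical per-user scores along a few preference axes."""

    def __init__(self):
        # scores[axis][value] rises on "I get it", falls on "I don't get it"
        self.scores = defaultdict(lambda: defaultdict(int))

    def feedback(self, item, liked):
        delta = 1 if liked else -1
        for axis, value in item["axes"].items():
            self.scores[axis][value] += delta

    def score(self, item):
        return sum(self.scores[axis][value]
                   for axis, value in item["axes"].items())

    def pick(self, items):
        """Choose the item whose axis values have pleased this user most."""
        return max(items, key=self.score)


overview = {"title": "Workflow overview diagram",
            "axes": {"scope": "big picture", "medium": "graphic"}}
detail = {"title": "Workflow field-by-field notes",
          "axes": {"scope": "detail", "medium": "text"}}

tracker = PreferenceTracker()
tracker.feedback(detail, liked=False)    # "I don't like it, I don't get it"
tracker.feedback(overview, liked=True)   # "I like it, I get it"
choice = tracker.pick([detail, overview])
```

    After one negative and one positive response, the tracker prefers big-picture graphics for this user; consistent feedback over many interactions would sharpen the pattern, and inconsistent feedback would leave the scores near zero.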

    Thus, in addition to affording a logical and powerful means for moving through the interrelations of topics via hyperlink navigation, the k-web provides the means for the system to make choices of what data in the repository to actually show the user. In the literal meaning of the term, "personalization agents" work on behalf of the user to tailor the interaction to that individual. Without this capability, all we would have for the additional effort to create the web is a clever navigation tool; what would remain would be the usual problems of standard query searching; for example, getting 50 or 500 "hits" to our SQL query. This problem does not arise here because of the combination of the k-web navigation and user personalization mechanisms.


    There are many details and consequences of this approach that are beyond the scope of this write-up. For example, the system as described provides all of the features required for delivery of corporate training of any kind. A moment's reflection reveals the importance of training capabilities for any corporation, and especially for those concerned with preserving intellectual capital. Training courses, per se, could be delivered entirely by the very same software interfaces that comprise the capabilities described here. Training maintenance and support features such as testing and statistics could be added easily.

    In terms of "value for business", many demands of the modern corporation are met by the system proposed. Support of teams, cross-functional training, "the learning organization", shared mental models -- these key requirements for the "21st Century organization" are directly supported. At a more detailed level it should be possible to develop hard-valued metrics for the "intellectual capital" composed in such a system, and to have the system calculate same in terms of increased productivity, speed of converging on solutions, process effectiveness and efficiency measures, etc.

    The notion of "intellectual capital" can also be usefully extended to include "the capacity for conversations to formulate and solve problems." The system proposed has many interesting features that "fall out" of the design which support this extended notion of intellectual capital.

    Next Steps

    A live discussion of the features and functions set out here would clarify what cannot be described easily on paper. Also, there is an old (circa 1985) demo available on an AI workstation. This demo would make many of these concepts more tangible. (While it contains nearly all the functions noted here, it does not knit them together into the architecture as proposed; nonetheless it is a valuable example.)


    What is the k-web composed of? There is a huge body of work in AI and cybernetics on semantic nets, knowledge representation schemes, frame-based reasoning, etc. This particular proposal relies on a little-known but robust and proven means for knowledge representation, originally developed for course structuring and computer-based delivery of instruction. There is no commercially-available software system based on these concepts, though the work has considerable academic pedigree and considerable research and prototyping efforts behind it. The knowledge model has only a few objects and constructs, and simple but powerful rules for assembling the k-web. Semi-automatic tools are used to make the process very fast, and actually illuminating to the author, creating further clarity and distinctions (and hence adding knowledge) to the process.

    Why would anyone want to spend time creating the k-web? Users already spend time doing "categorization", which works poorly and requires them to remember which categories are appropriate or have been used in similar cases previously. Such categories do not transfer over the net or even map to others' reference frames. For the investment of slightly more time, far greater value is reaped by the individual and by the organization as a whole. In short, the construction of the k-web will be performed by the user because the activity is in the stream of work: it is the same process one would perform in order to archive for oneself. And the added value can then be shared across the enterprise.

    Won't all this take vast amounts of processing power? The connectivity of a k-web is not very high; that is, each topic in the web is connected to a small fraction of the other topics. Navigation consists of traversing from topic to topic, and hence the processing for each transaction is not prohibitive. Inverted lists and efficient indexing keep the processing demand within reason. Integration of new materials into existing k-webs can, for many applications, take place over time and need not be "instantaneous": reports of overlap, contradiction and resolution can arrive later, just as replication and delivery failure reports do.
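
    To show why sparse connectivity keeps each step cheap, here is a minimal sketch of an inverted list from topics to bundles. The function names and the sample bundles are assumptions; the point is only that one navigation step touches the bundles containing the current topic, not the whole web.

```python
from collections import defaultdict


def build_inverted_index(bundles):
    """Map each topic to the set of bundle ids that contain it."""
    index = defaultdict(set)
    for bundle_id, topics in bundles.items():
        for topic in topics:
            index[topic].add(bundle_id)
    return index


def step(index, bundles, topic):
    """One navigation step: topics reachable from `topic` via shared bundles."""
    reachable = set()
    for bundle_id in index.get(topic, ()):
        reachable |= set(bundles[bundle_id])
    reachable.discard(topic)
    return reachable


# Hypothetical bundles, keyed by an arbitrary id.
bundles = {
    1: ["workflow", "product information", "Notes"],
    2: ["workflow", "products", "legacy connectivity"],
    3: ["document management", "standards", "SGML"],
}
index = build_inverted_index(bundles)
from_standards = step(index, bundles, "standards")
```

    A step from "standards" consults only bundle 3, however many other bundles the repository holds, so the cost of a transaction depends on local connectivity rather than on the size of the k-web.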

    Won't a lot of maintenance of the k-web be required, to keep it consistent, etc.? The process of k-web integration, described above, distributes the responsibility for this across the users; GUI tools make it fast and easy to resolve conflicts, consider synonyms, etc.

    Won't the system retrieve unwanted data a lot of the time? This depends in part on the consistency of the user's interactions and the length of time the user has used the system. In general the system is not brittle, and it suffers only "soft failures": a result that doesn't make immediate sense is never far from one that does. Each interaction is a step toward converging on what is desired.

    If the system is "self-maintained" by authors, won't the data it contains be unreliable? The k-web is simply an extension of the "conversational field" of the organization. Hence, judgments about the data come against the experience of the individual, as well as by "considering the source": because all data includes the author/source, the reputation and track record that individuals have carries into the k-web.

    - end -

    © Copyright Paul Pangaro 1994 - 2000. All Rights Reserved.