Incremental formalization

So, do we force everyone in the organization to study formal representation of knowledge and use spreadsheet-like tools or language-oriented programming tools whenever they speak or record an Insight? Of course not. (Although I will certainly be accused of proposing that, no matter how often I deny it.) People initiate and respond to questions almost automatically in natural language. Later, however, they wish they had been much more precise, and they often end up drawing pictures to illustrate their points.

There are precedents for applying models for knowledge in organizations. Few people in any organization write relational databases, but many use them. RDBMS data entry is workable because the data and the processes of capturing and using that data are deconstructed appropriately and programmed in a way that limits the user’s responsibility to (1) understanding that there are different types of data (different kinds of things) in the real world and (2) understanding the principle that many kinds of things in the real world have one-to-many relationships with other, closely related things in the real world. That’s it. It’s easy for those recording data in a relational database. It’s even easier for users of those resources. It’s a basic form of knowledge representation. What’s more, the resulting structured resource (the database) is an important organizational asset.
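To make those two user-level ideas concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables, columns, and sample names (department, employee, Ana, Raj) are hypothetical and chosen only for illustration; nothing here comes from a real system.

```python
import sqlite3

# Hypothetical schema for illustration only: two kinds of things
# (departments and employees) and one one-to-many relationship
# (one department has many employees).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE employee (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES department(id)
    );
""")

# Data entry: the person recording data only picks a type and a parent.
conn.execute("INSERT INTO department (id, name) VALUES (1, 'Engineering')")
conn.executemany(
    "INSERT INTO employee (name, department_id) VALUES (?, ?)",
    [("Ana", 1), ("Raj", 1)],
)


def employees_in(department_name):
    """Return the names of all employees in one department, sorted."""
    rows = conn.execute(
        """
        SELECT e.name
        FROM employee AS e
        JOIN department AS d ON d.id = e.department_id
        WHERE d.name = ?
        ORDER BY e.name
        """,
        (department_name,),
    )
    return [name for (name,) in rows]
```

Someone using this resource calls something like employees_in("Engineering") and never sees the join. That is the sense in which using the database is even easier than recording data in it.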

How do we get to a similar place when dealing with the immensely diverse and seemingly amorphous knowledge of an organization? Not easily. There are no off-the-shelf knowledge-representation practices, methodologies, or technologies that meet this requirement. And I assert that we must insist on an additional requirement: that using such tools gives contributors immediate rewards for doing so, just as conversations with others about a topic of interest provide feedback and are rewarding.

The short-term solution: Start informally

Shipman and McCall correctly observed in 1999 that,

Computers require formally represented information to perform computations that support users; yet users who have needed such support have often proved to be unable or unwilling to formalize it. (p. 199)

SOURCE: Supporting incremental formalization with the hyper-object substrate. Frank M. Shipman and Raymond J. McCall. ACM Transactions on Information Systems, 17(2), pp. 199-227.

Yep. It’s true. We can barely get the average worker to fill out time sheets and expense reports — even with fill-in-the-blank forms that make the process relatively easy and almost foolproof. Getting people to encode “knowledge” in ways that make it more processable is far more daunting, because knowledge itself is hard to describe … and the human-language expressions that we use to describe that knowledge are imprecise, context-dependent, and often highly metaphorical.

Shipman and McCall go on to describe their solution, a take on incremental formalization [leading to processability], as having several levels of increasing formalization, from low to high:

  1. Unstructured text.
  2. Delimitation and linking of chunks of text — a characteristic of hypertext systems (and, in particular, of many pre-Web hypertext systems) and of “structured” document formats — which may include specification of the types and properties (metadata) of such chunks.
  3. Argumentation (identification of issues, answers, arguments, and resolutions) — a key aspect usually ignored in the semantic community.
  4. Frame-based organizations of knowledge — in particular, of Concepts. (Otherwise known as computer ontologies.)
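The four levels can be sketched as data structures. This is my own minimal illustration in Python, not the hyper-object substrate that Shipman and McCall describe; every class name, field name, and sample sentence below is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Level 1: unstructured text is just a string.
note = "We should cache the report. Caching cuts load time, but stale data is a risk."


# Level 2: a delimited chunk with a type, properties (metadata), and links.
@dataclass
class Chunk:
    text: str
    kind: str = "note"                              # hypothetical type label
    properties: dict = field(default_factory=dict)  # free-form metadata
    links: list = field(default_factory=list)       # other Chunk objects


# Level 3: argumentation structure built over chunks.
@dataclass
class Issue:
    question: Chunk
    answers: list = field(default_factory=list)     # proposed answers
    arguments: list = field(default_factory=list)   # (answer, stance, chunk)
    resolution: Optional[Chunk] = None


# Level 4: a frame, that is, a named Concept with slots and values.
@dataclass
class Frame:
    concept: str
    slots: dict = field(default_factory=dict)


def split_into_chunks(text):
    """Level 1 to level 2: delimit each sentence as its own chunk."""
    return [Chunk(s.strip() + ".") for s in text.split(".") if s.strip()]


# Each step adds structure to the same material without discarding it.
chunks = split_into_chunks(note)
answer, argument = chunks
answer.kind = "answer"
argument.kind = "argument"
argument.links.append(answer)

issue = Issue(question=Chunk("How do we speed up the report?", kind="issue"))
issue.answers.append(answer)
issue.arguments.append((answer, "pro", argument))

frame = Frame("ReportCache", slots={"benefit": "shorter load time", "risk": "stale data"})
```

The point of the sketch is that nothing forces a contributor to start at level 4. The original sentence survives intact inside the chunks, and the typing, linking, and framing can be added later, by someone else, or never.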

Comments on how we adapt incremental formalization to representation of organizational and individual knowledge

The goal of Shipman and McCall is to get to ontologies. This is what they do for a living. But for me, that’s a bit ironic.

  • The first step in formalization is creating, finding, and organizing notes (usually in the form of natural language) in fragments of manageable scope [of meaning] and size. That’s just unavoidable, I believe, because meaningful communications arrive (almost invariably) in the form of unstructured information. Words are natural … and almost irresistible. But organizing words, even nicely “chunked” texts, may also be the most frustratingly unproductive part of formalization. Worse, the seductiveness of nicely organized and chunked natural language, which creates something that is “good enough” some of the time, can actually prevent you from moving forward to meaning.
  • As in design of relational databases, the most fundamental principle is deconstruction of linear streams of information into objects and relationships. In development of computer ontologies, the objects are Concepts, and the result is a graph of Concepts and the relationships among them. In the model I propose, such graphs are essential, but it is even more important to identify relationships among Concepts that participate in Facts (as higher-order constructs), among the component Facts that constitute Insights, and among the Insights that form larger knowledge constructs.
  • We have to find better methods of getting to meaning. Integration, evaluation, and improvement of “knowledge resources” all depend on new ways of getting to increasingly formalized representations of meaning. But those representations must still be highly usable by people, both by individuals for their own purposes and by groups working together.
  • Almost as important, we have to identify and avoid those hidden impediments to getting to meaning. Focusing on obscure, unique, language-specific, or cleverly problematic aspects of natural language is one kind of impediment to improvement. (The statement “All generalizations are false” is an iconic example of excessive cleverness.)
  • It might be instructive to run a Text Analytics application on the unstructured text resources of an organization — with two objectives:
    1. To show how often the same “ideas” (expressed as phrases and sentences) appear in those resources, indicating not only massive duplication of effort but also misunderstandings of those ideas.
    2. To discover candidate Insights for inclusion in a structured knowledgebase.
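As a rough idea of what the first objective might look like in practice, here is a minimal Python sketch that finds word sequences shared by more than one document. It is a stand-in for a real Text Analytics application, not a description of one: the function name shared_phrases, its parameters, and the sample documents are all hypothetical, and exact matching of this kind catches only verbatim repetition, not paraphrases or misunderstandings of the same idea.

```python
import re
from collections import defaultdict


def shared_phrases(documents, n=4, min_docs=2):
    """Map each n-word phrase to the sorted names of the documents containing it.

    Only phrases found in at least min_docs different documents are kept.
    """
    seen = defaultdict(set)
    for name, text in documents.items():
        # Lowercase and keep only word-like tokens, so punctuation is ignored.
        words = re.findall(r"[a-z0-9']+", text.lower())
        for i in range(len(words) - n + 1):
            seen[" ".join(words[i:i + n])].add(name)
    return {phrase: sorted(names)
            for phrase, names in seen.items()
            if len(names) >= min_docs}


# Hypothetical sample resources, invented for this example.
docs = {
    "memo.txt": "Customers churn when onboarding takes too long.",
    "wiki.txt": "Our data shows customers churn when onboarding is confusing.",
    "plan.txt": "Hire two support engineers next quarter.",
}

repeated = shared_phrases(docs)
# repeated == {"customers churn when onboarding": ["memo.txt", "wiki.txt"]}
```

Each phrase that surfaces this way is at best a candidate Insight (the second objective). A person still has to decide whether the two documents mean the same thing by it.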

The wrong primary focus

Shipman and McCall go on to say,

[Knowledge] need not be formalized to be used by humans; but to be used in any way by the machine, some formalization is required. [emphasis by Shipman]

And there, they go horribly wrong! Not because formalization for use by software applications isn’t useful (it is, of course), but because use by humans is even more important.

First of all, the authors simply fail to grasp that information is not the solution for the information age.

And much of the information generated in knowledge work will never reach the point of system design, yet it will still benefit from formalization.

I do agree with Shipman and McCall on the following point:

Formalization requires many extra decisions to divide information into chunks, to type and label these appropriately, to link them with useful relationships and assign properties to them. If users end up focusing their energies on formalization rather than the domain task they are supposed to perform, the use of the computer could lead to the quality of their work actually being decreased rather than enhanced.

But, again, they place too much emphasis on information.

You have to look at how knowledge became value in the past.

© Copyright 2017 Philip C. Murray

