
Unstructured Information in BI – Implementation Practicalities with Tacit Data August 30, 2007

Posted by Cyril Brookes in General, Tacit (soft) information for BI, Taxonomies, Tags, Corporate Vocabularies, Unstructured Information.

Designing a BI system based on unstructured information must take account of the distinction between explicit and tacit information. There is consensus on this; in blog-speak that means “me and my mate in the next office agree”! Feedback on my last two posts does, however, unanimously support the contention. The question remains: what do we do about it? Here’s what I propose.

Most businesses have a reasonably adequate process for collecting explicit unstructured information: the documents, news, emails, reports, etc. And if yours doesn’t, the corporate portal experience awaits your attention. For heavy hitters the UIMA (Unstructured Information Management Architecture) approach, with its multi-vendor retinue, is available, and willing, for a substantial sum.

I have opined in the last posts here and here that explicit unstructured information is not where BI relevance is at. It can be a start, but the real value lies in the qualification that the executive and professional mind-space can give to seeds of BI, both explicit and tacit. The tacit realm is the goldmine; it is where the current, relevant, actionable, validated business intelligence lies.

How then, Dear Reader, do you capitalize on your tacit resources?

It’s a 9 step process, as I see it:

  1. Encourage contributions from everyone, everywhere, based on credible rumor, opinion, assessment, etc.
  2. Scour the corporate world for knowledge building seeds, explicit and tacit – web crawlers, internal and external portals, news feeds, etc.
  3. Selectively disseminate raw data seeds to subject specialists – formally appointed for preference
  4. Encourage comments on those seeds by the specialists – facts, sources, cross-references, importance, time criticality – with discussion threads escalating in importance as appropriate
  5. Selectively disseminate comments – dynamic audience creation, so that more people, and more senior executives, are aware of more important issues
  6. Encourage issue identification by executives and professionals – implications, assessments, importance value adjustments, criticality adjustments
  7. Selectively disseminate the discussion – dynamic audience modification as business significance becomes clearer, possibly creating closed group discussions if the issue becomes strategic
  8. Propagate decisions made to the appropriate staff
  9. Store knowledge created – with time stamp, sunset clause if appropriate, to help avoid multiple solutions to the same problem
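For readers who think in code, the steps above can be sketched as a small data model. This is only an illustrative sketch, not a description of any particular product: the names `Seed`, `Comment`, `importance_delta` and `audience_tier`, the 1 to 5 importance scale and the three audience tiers are all assumptions made for the example. It shows a seed gathering comments, its importance being adjusted by the commenters, the audience widening as importance rises, and a sunset date marking when the stored knowledge should be treated as stale.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Comment:
    author: str
    text: str
    # Hypothetical: how far this commenter raises or lowers the importance.
    importance_delta: int = 0


@dataclass
class Seed:
    summary: str
    source: str                      # who or what the seed came from
    created: datetime                # time stamp (step 9)
    importance: int = 1              # assumed scale: 1 (low) to 5 (strategic)
    sunset: Optional[datetime] = None  # sunset clause (step 9), if any
    comments: List[Comment] = field(default_factory=list)

    def add_comment(self, comment: Comment) -> None:
        """Steps 4 and 6: record a comment and adjust importance, clamped to 1..5."""
        self.comments.append(comment)
        self.importance = max(1, min(5, self.importance + comment.importance_delta))

    def audience_tier(self) -> str:
        """Steps 5 and 7: more important issues reach more senior people."""
        if self.importance >= 4:
            return "executives"
        if self.importance >= 2:
            return "managers"
        return "specialists"

    def is_current(self, now: datetime) -> bool:
        """An item with no sunset date never expires; otherwise it lapses at sunset."""
        return self.sunset is None or now < self.sunset


created = datetime(2007, 8, 30, 9, 0)
seed = Seed(
    summary="Rumor of a competitor's product recall",
    source="sales representative",
    created=created,
    sunset=created + timedelta(days=30),
)
seed.add_comment(
    Comment("product specialist", "Two distributors say the same", importance_delta=2)
)
print(seed.importance, seed.audience_tier())  # prints: 3 managers
```

A real system would also need the people side of the process, which no data model supplies; the sketch only shows what would have to be recorded for the escalation and the sunset clause to work.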

Obviously this must be an explicit process, where the tacit input is first encouraged, then amplified, assessed, amplified again until either the issue dies, is resolved, or mutates into another issue. But make no mistake; it’s the tacit input that drives the successful implementation.

Essentially we are making explicit that which was tacit; but on a selective basis, right time, right people, right place.

There is a downside, however. Creating a workable tacit unstructured information BI system with the above features is non-trivial. I have done it many times, and it was never easy.

Caveats and Dependencies

Cultural Crevasses

A culture of collaboration is the all-important enabler. If the people-related barriers to sharing the knowledge creation process are not addressed, the venture will fail. No question about it. I have made an earlier post on the cultural issues and how they can be managed, but, briefly, the most critical barriers are, in my experience:

  • There’s no reward mechanism for contributing intelligence, and it’s a lot of work for no personal benefit
  • You don’t know who to tell, and it’s a lot of effort to find out
  • You don’t know if this BI snippet you have come across is accurate, you don’t want to bother someone else unnecessarily, and someone else must know it anyway
  • There’s no important person around to hear what you have to say, so keep this intelligence to yourself until there is the right audience – the more valuable it is, the longer you’ll wait
  • Tall poppies lose their heads, so keep your head down, and messengers get shot
  • You don’t want to embarrass your boss, or peer group, so keep it quiet

Source Validation

The source of intelligence is most people’s key to determining apparent accuracy of any tacit input. If you get a stock market tip, you will always check where it came from before acting. It’s the same for a rumor on a competitor’s product recall.

Audience Creation

Dissemination is completely dependent on adequate categorization. If a document, email, news item, etc. is not classified it cannot be circulated to the right audience. And everyone in the business must use the same terms for categorization, or they will miss relevant documents.
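A minimal sketch of that dependency, under stated assumptions: the synonym table, the interest profiles and the function names `normalize` and `select_audience` are all invented for illustration. Terms are first mapped onto the agreed corporate wording, and an item then reaches everyone whose interest profile shares at least one term with it.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical synonym table mapping everyday wording onto the agreed corporate terms.
SYNONYMS: Dict[str, str] = {
    "rival": "competitor",
    "competition": "competitor",
    "product withdrawal": "product recall",
}


def normalize(terms: Iterable[str]) -> Set[str]:
    """Lower-case each term and replace known synonyms with the corporate term."""
    cleaned = (t.strip().lower() for t in terms)
    return {SYNONYMS.get(t, t) for t in cleaned}


def select_audience(item_terms: Iterable[str], interests: Dict[str, Set[str]]) -> List[str]:
    """Return, in alphabetical order, everyone whose interests overlap the item's terms."""
    tags = normalize(item_terms)
    return sorted(
        person for person, wanted in interests.items() if tags & normalize(wanted)
    )


# Hypothetical interest profiles.
interests = {
    "alice": {"competitor"},
    "bob": {"pricing"},
    "carol": {"product recall"},
}

print(select_audience(["Rival", "Product withdrawal"], interests))  # ['alice', 'carol']
print(select_audience([], interests))                               # []
```

The second call is the point of the paragraph above: an item with no classification reaches nobody, however important it is.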

Crucial Taxonomies

This implies a standard comprehensive corporate vocabulary or taxonomy. Setting this up is not trivial either.

Automatic Categorization is Oversold

It’s not sufficient to classify documents by internal content references. The real, useful keywords for a document that is relevant to BI may not even appear in the text. In spite of the tremendous advances in text analysis, the personal categorization by a subject expert still wins the classification stakes, in my opinion. By all means use the automated technique to get the item to a subject expert, but he/she will always be the best determinant of cross-references, importance and time criticality.
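The division of labour suggested here might look like the following sketch. Everything in it is an assumption for illustration: the keyword lists, the specialist names, and the `Classification` fields. A crude keyword count only decides which specialist sees the item; the specialist's own judgment supplies the terms, importance and time criticality, and may include terms that never appear in the text.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set, Tuple

# Hypothetical keyword lists used only to pick a reviewer, not to classify the item.
EXPERT_KEYWORDS: Dict[str, Set[str]] = {
    "product_specialist": {"mower", "dealers", "shelves"},
    "finance_specialist": {"margin", "price", "credit"},
}


@dataclass
class Classification:
    terms: Set[str]
    importance: int
    time_critical: bool


def suggest_expert(text: str) -> Optional[str]:
    """Automated first pass: the specialist whose keywords occur most often, if any."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    best, best_hits = None, 0
    for expert, keywords in EXPERT_KEYWORDS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = expert, hits
    return best


def route_then_review(
    text: str, review: Callable[[str], Classification]
) -> Tuple[Optional[str], Classification]:
    """Route automatically, then let the human reviewer produce the classification."""
    return suggest_expert(text), review(text)


def specialist_review(text: str) -> Classification:
    # Stand-in for a person: the expert recognises a recall although the word is absent.
    return Classification({"competitor", "product recall"}, importance=4, time_critical=True)


text = "Dealers say Acme is pulling its new mower from shelves."
expert, result = route_then_review(text, specialist_review)
print(expert)                      # product_specialist
print(sorted(result.terms))        # ['competitor', 'product recall']
```

Note that neither “competitor” nor “recall” occurs in the example text, which is exactly the case where keyword matching alone would misfile the item.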

Finally

I believe that an important principle BI analysts need to fully understand is “the strategic and most valuable information in your business is in the minds of the managers and professionals”, as first enunciated by Henry Mintzberg. Turning this tacit unstructured information into explicit useful stuff is universally a high priority task. Done well, it creates the difference between learning and non-learning enterprises.


Unstructured Information – Tacit Versus Explicit for Profit and BI Best Practice August 9, 2007

Posted by Cyril Brookes in General, Issues in building BI reporting systems, Tacit (soft) information for BI, Unstructured Information.

A picture may be worth a thousand words; a news item also has about a thousand, and a marketing strategic plan may have around five thousand. OK, but a great idea for a new marketing message, an expert’s adverse comment on the marketing plan, or a chance airplane conversation about a competitor’s plans may each be worth a million dollars, for just a few hundred words. What do you want, words or dollars?

I believe there is far too much emphasis on the analysis of documented unstructured information as a BI resource. The basic important data just isn’t there for most businesses. You can search as long as you like; mine it, categorize it, summarize it, but to no avail, the well is dry.

This post follows on from my earlier, definitional, piece on this subject.

Of course, I recognize there is potential import in some written material, for example, recent emails, salesperson call reports, customer complaints and their ilk. But these are like seeds, rather than the fruit off the tree. They are the beginning of a BI story, not the whole enchilada.

At risk of making the discussion too deep, Dear Reader, I think we need to consider the basic concepts before coming to any conclusions about how a corporation should manage its unstructured data, and the tools required.

I find it valuable to characterize unstructured information with a 2 x 3 matrix.

The horizontal axis has the above two basic categories of unstructured information:

Explicit unstructured items are those that are basically unformatted, but have a physical, computable, presence; e.g. documents, pictures, emails, graphs, etc.

Tacit items are basically anything unformatted that is not explicit; they’re still in the minds of professionals and managers, but are nonetheless both real and vital; e.g. mental models, ideas, rumors, phone calls, opinions, verbal commentary, etc.

The vertical axis has the three categories of unstructured information (according to moi!): independent, qualification and reference items.

Independent items stand alone, being self-explanatory in the first instance and not requiring reference to other pieces of information, be they structured, unstructured, explicit or tacit.

Qualification items have an adjectival quality, since they add value to other items (structured or unstructured), and are therefore relatively useless without reference to the appropriate one or more Independent or other Qualification items (note there may be one or more threads to a discussion based on an Independent item).

Reference items are pointers to subject experts who can provide details or opinions, and other sources of information, structured or unstructured, together with quality assessments of the value, reliability and timeliness of those sources. As Samuel Johnson said “Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information on it”.
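If it helps to see the matrix as a data structure, here is one possible sketch. The type names `Form`, `Role` and `Item` and the helper `cell_counts` are assumptions made for the example, not part of any established scheme; the two enumerations simply mirror the two axes described above, and every item is tagged with one value from each.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, Tuple


class Form(Enum):          # horizontal axis
    EXPLICIT = "explicit"
    TACIT = "tacit"


class Role(Enum):          # vertical axis
    INDEPENDENT = "independent"
    QUALIFICATION = "qualification"
    REFERENCE = "reference"


@dataclass(frozen=True)
class Item:
    description: str
    form: Form
    role: Role


def cell_counts(items: Iterable[Item]) -> Dict[Tuple[Form, Role], int]:
    """Count how many items fall into each (form, role) cell of the matrix."""
    return dict(Counter((item.form, item.role) for item in items))


items = [
    Item("Meeting minutes", Form.EXPLICIT, Role.INDEPENDENT),
    Item("Rumors", Form.TACIT, Role.INDEPENDENT),
    Item("Comments on a rumor", Form.TACIT, Role.QUALIFICATION),
    Item("Lists of subject experts", Form.EXPLICIT, Role.REFERENCE),
]

counts = cell_counts(items)
tacit_total = sum(n for (form, _), n in counts.items() if form is Form.TACIT)
print(len(Form) * len(Role))  # 6 cells in the 2 x 3 matrix
print(tacit_total)            # 2
```

Tallying a sample of your own organisation's items this way is one quick check on how much of the valuable material sits on the tacit side.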

Here’s a descriptive tabulation.

Independent, stand alone items

Explicit:

  • Meeting minutes
  • News items
  • Analyst reports
  • Marketing call reports
  • Legal judgments
  • Proposals
  • Government regulations
  • Suggestion box items
  • Customer complaints
  • Strategic plans
  • Manuals of best practice
  • Emails about new issues or competitive intelligence

Tacit:

  • Unrecorded meeting discussions
  • Ideas
  • Suggestions (undocumented)
  • Potential problems
  • Know-how
  • Competitive intelligence from informal customer/industry contacts
  • Stock market (racehorse) tip
  • Rumors
  • Intuitions
  • Off-the-record talks with government officers

Qualification, commentary items

Explicit:

  • Written comments on a report/news/analyst item
  • Documented opinion on problem or situation
  • Formal assessments of status implications

Tacit:

  • Verbal comments on a report/news/analyst item
  • Verbal comments on emails
  • Verbal opinions on problems
  • Verbal assessments of issues
  • Possible solution options
  • Comments on a rumor

Reference, source quality items

Explicit:

  • Lists of subject experts
  • Ratings of experts
  • Document sources, catalogs
  • Written reviews of document sources

Tacit:

  • Unrecorded subject expert identity
  • Opinions on expert quality
  • People who “know-how”
  • Informal unrecorded information source documents
  • Assessments of document source utility

Ask yourself, Dear Reader, which of these cells contains high value information, likely to help your corporate executives find problems and make decisions? If they’re only on the explicit side, then you’re in the sights of UIMA and lots of enthusiastic vendors; good luck. If some are on the tacit side, please read on.

I’ve covered several of the relevant aspects of managing tacit information in earlier posts, e.g. here and here. However, there are some additional relevant observations to be made in the tacit versus explicit context.

  • The first, possibly most important, observation is apparently self-defeating to my thesis. All important, currently relevant, items of tacit unstructured information should be made explicit as soon as practicable.
  • It is not possible to identify, collect, store, disseminate, and facilitate collaboration on purely tacit items; it will happen in a “same time” meeting, of course, but wider ramifications demand that the prelude and/or outcome be made explicit.
  • Independent intelligence items, be they initially explicit (e.g. a recent email) or tacit, are very rarely complete as regards background to the issue, its importance to the business, its time criticality, and assessments of potential impact. If you will, the knowledge has not yet been created, only its seed.
  • The information required to complete the knowledge building that starts with an Independent Item is rarely in one location or person’s mind.
  • The knowledge building is based mostly on tacit information.
  • The knowledge building process is most effective if performed via collaboration between the people who have, or know where to find, the necessary Qualification Items of information.
  • Some process for collaboration audience selection is required, one based on issue content, criticality and importance. It shouldn’t be left to pure chance.
  • Desirably, the collaboration process, but certainly the end result, should be made explicit, to avoid resolving the same issue many times over.

In my previous post I offered some questions that might provoke your curiosity, Dear Reader:

  1. What are the most useful sources of unstructured information in our business? Are they Explicit or Tacit?
  2. If Explicit, how do we best marshal the information and report it?
  3. If Tacit, ditto?
  4. Is the information we get from our unstructured sources complete, and ready for promulgation, or do we need to amplify or build on it before it’s useful?

I expect that you will be able to answer 1 and 4 for your business; I’ve outlined the issues as best I can.

I’ll defer offering pointers you might consider for 2 and 3 to the next post, because I believe we still need to revisit the processes and constraints that inhabit the strange corporate world of collaborative knowledge building.