Showing posts with label healthcare IT exceptionalism. Show all posts

Sunday, January 16, 2011

IBM's Watson, Jeopardy, and "Revolutionizing Medicine"

In the news recently was a story about a new supercomputer doing amazing things. In a technological tour de force, IBM's Watson supercomputer research project has reached a milestone, beating a group of contestants in the TV gameshow Jeopardy.


The TV game show "Jeopardy"

Will predictions of advancements in medicine follow? Predictably so:

IBM's Watson Supercomputer Beats Humans in Jeopardy Practice Match
eWeek.com
By: Fahmida Y. Rashid
2011-01-13

Watson, IBM's latest DeepQA supercomputer, defeated its two human challengers during a demonstration round of Jeopardy on Jan. 13. The supercomputer will face former Jeopardy champions Ken Jennings and Brad Rutter in a two-game, men-versus-machine tournament to be aired in February.

However, the Jeopardy match-up was not the "culmination" of four years of work by IBM Research scientists that worked on the Watson project, but rather, "just the beginning of a journey," Katharine Frase, vice president of industry solutions and emerging business at IBM Research, told eWEEK.

Supercomputers that can understand natural human language—complete with puns, plays on words and slang—to answer complex questions will have applications in areas such as health care, tech support and business analytics, David Ferrucci, the lead researcher and principal investigator on the Watson project, said at the media event showcasing Watson at IBM's Yorktown Heights Research Lab.

Watson analyzes "real language," or spoken language, as opposed to simple or keyword-based questions, to understand the question, and then looks at the millions of pieces of information it has stored to find a specific answer, said Ferrucci.

This is undoubtedly a remarkable accomplishment.

Indeed, accompanying the announcements we are also seeing predictions that such supercomputers "will have applications in health care."

Indexing of the medical literature, and data mining (for better or worse) from free text come to mind.

However, the current irrational exuberance about healthcare IT in 2011 is based on several misconceptions. This leads to predictions such as this ...

The technology has to process natural language to understand "what did they mean" versus "what did they say," which has a lot of implications in the health care sector, said Frase. Patients are not using the terms doctors learned in medical school to describe their ailments, but more likely the terms they picked up from their parents growing up, she said.


"Patients are not using the terms doctors learned in medical school to describe their ailments"?

What medical school(s), exactly, are being spoken of here?

It seems as if IT folks think medicine was invented just yesterday. In fact, in medical school, internship, residency and practice we learn all about that, and learn how to 'translate' that information or use it to elicit more information as needed in order to provide care. I'm not sure a multimillion dollar supercomputer is needed for that ...

... and related "platform database" predictions such as this:

... A Watson-like system can take that information and co-relate it against all the medical journals and relevant [who decides that? - ed.] information, and say, "Here's what I think [think? -ed] and why," while showing its evidence for how it came up with the conclusion, according to Frase.

(Actually, computers don't think. A more correct statement would be "here are the results of the algorithms that your faithful machine has crunched, using the medical literature as input.")

That's quite naïve and idealistic with regard to actual medical decision making. It is a computer technician's oversimplified, reductionist, amateur view regarding biomedicine, a domain of often wicked complexity.

As I intimated above, one key issue is what is "relevant" with regard to information.

Consider that the medical literature suffers from numerous conflict-of-interest and dishonesty-related phenomena that make it increasingly untrustworthy, as pointed out by Roy Poses in a Dec. 2010 post "The Lancet Emphasizes the Threats to the Academic Medical Mission", in my Aug. 2009 post "Has Ghostwriting Infected The "Experts" With Tainted Knowledge, Creating Vectors for Further Spread and Mutation of the Scientific Knowledge Base?" and elsewhere on this blog.

Then too, there are plausibility issues in medical research, as expressed in the paper "Why Most Published Research Findings Are False", John P. A. Ioannidis, PLoS Medicine 2(8): e124, 2005. Dr. Ioannidis observes:

There is increasing concern that most current published research findings are false. The probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field. In this framework, a research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser preselection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical modes; when there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance. Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias.
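The framework in that passage reduces to a short formula, and a rough sketch makes the point concrete. With R the pre-study odds that a probed relationship is true, 1 - β the study power, α the type I error rate, and u the fraction of analyses pushed to "positive" by bias, the paper gives the probability that a claimed finding is true (the positive predictive value, PPV) as ((1 - β)R + uβR) / (R + α - βR + u - uα + uβR). The function name, parameter names, and example inputs below are my own illustrative assumptions, not figures from any study.

```python
def ppv(prior_odds: float, power: float, alpha: float = 0.05, bias: float = 0.0) -> float:
    """Post-study probability that a claimed finding is true.

    prior_odds: R, ratio of true to null relationships probed in the field
    power:      1 - beta, chance a true relationship is detected
    alpha:      type I error rate
    bias:       u, fraction of would-be negative analyses reported as positive
    """
    beta = 1.0 - power
    # Positive findings that reflect true relationships
    true_positives = (power + bias * beta) * prior_odds
    # Positive findings that reflect null relationships
    false_positives = alpha + bias * (1.0 - alpha)
    return true_positives / (true_positives + false_positives)


# Illustrative inputs only (assumed, not measured):
print(round(ppv(prior_odds=1.0, power=0.8), 3))            # 0.941
print(round(ppv(prior_odds=0.1, power=0.2), 3))            # 0.286
print(round(ppv(prior_odds=0.1, power=0.2, bias=0.3), 3))  # 0.116
```

With no bias, a finding is more likely true than false only when (1 - β)R exceeds α, which is why small, underpowered studies in fields with long-shot hypotheses fare so badly in this framework.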

Dealing with these very real-world issues in patient care requires nothing less than human judgment born of experience, critical thinking skills (emphasis on "thinking", which computers cannot do, sorry all you HAL-9000 and M5 fans) and intuition. "Garbage in, garbage out" applies in the extreme.


(The Ultimate Computer, Dr. Richard Daystrom's mid-23rd century Multitronic unit, M5. Fun to watch it beat up on starships Excalibur, Hood, Lexington and Potemkin due to malfunction. Even more fun to watch William Shatner talk it into cybernetic suicide via guilt, and yes, as a teen I knew all the ST trivia cold, but I think sci-fi has gone too far into people's heads.)

There are also other computer-vs-human mind issues at play.

First, let's look at Watson. Just to play the TV quiz show Jeopardy, a game largely about the knowledge and recall of trivia, it took the following:

  • Watson is a breakthrough human achievement in the scientific field of Question and Answering, also known as "QA." The Watson software is powered by an IBM POWER7 server optimized to handle the massive number of tasks that Watson must perform at rapid speeds to analyze complex language and deliver correct responses to Jeopardy! clues. The system incorporates a number of proprietary technologies for the specialized demands of processing an enormous number of concurrent tasks and data while analyzing information in real time.


A rack-mountable server ("blade").



Racks of servers, each with multiple advanced CPUs



Racks of servers in IBM's Watson supercomputer, all prettied up, HAL 9000-style


Second, it will be many years indeed before even the current Watson QA capabilities can be tailored to a domain as complex as biomedicine and made widespread for the hundreds of thousands of practicing physicians and the much larger number of allied healthcare professionals in the U.S. or worldwide.

Third, there's this from the same LA Times article linked above:

  • Like its human competitors, Watson won't have Internet access [or access to anything outside its immense local storage - ed.] during the games, so Googling an answer won't be an option, the report said.

I see a significant degree of machine-human unfairness right there. A computer has 100% reliable access to information in its storage media. The human mind does not. That's appropriate for a TV game show that tries to test a person's knowledge of trivia, but that's not how medicine works.

What if the match were made more fair, giving the humans error-free access to the same information Watson has stored in its own 15 terabytes and 200 million pages?

Fourth, medicine is not about recalling trivial factoids of information based on parsing natural language queries in the "puzzle format" of Jeopardy. As I wrote here and here, medicine is not a platform database information retrieval problem. (I would argue medicine, except in simple cases, is in large part a matter of filtering the irrelevant, unlikely, and unreliable, of which there is an exponentially increasing amount, from information relevant to the subtleties of a complex medical situation.)

Certainly, today's clinical IT will make little dent in healthcare quality and error reduction, as those issues are not, in the main, due to record keeping problems of paper vs. electronic, as I pointed out in my Dec. 2010 post "Is Healthcare IT a Solution to the Wrong Problem?". The expectations for today's health IT are grossly exaggerated.

How about expert systems technology such as Watson? Regarding NLP and fact retrieval 'tours de force' like IBM Watson, medicine is about cognition, about human judgment born of experience in dealing with ambiguity not just of language but of observations, findings, lab data, image interpretation, etc., about human intuition and assemblage and integration of a huge amount of disparate information in ways not well understood even by its practitioners. The end result is not just recall of a piece of information.

I consider a statement such as:

... a Watson-like system can take that information and co-relate it against all the medical journals and relevant information, and say, "Here's what I think [actually, as mentioned previously, here's the results of the algorithms - ed.] and why"

... to imply just as grandiose a valuation of the technology as the statements I heard a decade ago about the health IT of the day - or even today - "revolutionizing medicine."


I'm not even sure such a capability would be very useful; we already have DXplain, developed by domain experts over decades, and it has not had a major impact on healthcare to date.

The real breakthrough will be when a cybernetic expert system can take, say, cases from the Case Records of the Massachusetts General Hospital in the NEJM verbatim, and compete as a peer (i.e., as a peer not recognized to be a machine) with a round-table panel of expert physicians with facile access to the medical literature (e.g., PubMed) on the differential diagnosis, how to establish the diagnosis and to rule out others, the treatment strategies, and the likely outcomes, and then participate in the care.

When this happens (call this the "NEJM Turing Test"), and when such capabilities are affordable and widely available, then medicine will have been revolutionized.

Of course, the scale of the hardware and the algorithmic advances required to make a truly "revolutionary" tool such as this is obviously staggering. Considering that it takes 10 racks of multiprocessor IBM servers with 15 terabytes of memory and a team of varied domain experts writing algorithms for several years to accomplish the NLP advances and lookups to answer Jeopardy-style trivia questions, one can only imagine what a truly useful cybernetic medical assistance system would look like.

It should also be remembered that Watson does not think. Humans do. Cybernetic Jeopardy and chess playing accomplishments notwithstanding, I believe a machine even close to passing a "NEJM Turing test" will be a long time in coming. Until then, we should be encouraging better support for human physicians struggling to use their medical expertise in a sea of bureaucracy, stress and overwork (part of which will increasingly be a struggle with mission-hostile health IT).

Finally, far off as I believe it to be, I do think the hundreds of billions of dollars being devoted to today's health IT would be better devoted to developing a "Dr. Watson" that can pass a medical Turing test as described above, than to deploying mission-hostile, primitive HIT and developing an unsustainable mega-bureaucracy to support it as in my post here.

This is not to minimize the wonderful accomplishments of the Watson team. I just wish predictions of cybernetic miracles in medicine would be held in abeyance after the lessons of the past fifty years of computers in medicine.

In the meantime, perhaps the current Watson might be useful in remediating our very sclerotic and moribund "mainstream media," especially in the domain of politics. Its reporters and writers can certainly use cybernetic help in their fact-checking and logic, Jeopardy-style, far more so than physicians.

-- SS

Feb. 23, 2011 addendum:

Gevalt!

From the article "IBM's Watson could usher in new era of medicine", Sharon Gaudin, Computerworld, February 17, 2011:

Jennifer Chu-Carroll, an IBM researcher on the Watson project, said the computer system is a perfect fit for the healthcare field ... "Think of some version of Watson being a physician's assistant," Chu-Carroll said. "In its spare time, Watson can read all the latest medical journals and get updated. Then it can go with the doctor into exam rooms and listen in as patients tell doctors about their symptoms. It can start coming up with hypotheses about what ails the patient."

Gevalt indeed ... incredible irrational exuberance. We can easily go from Jeopardy to Medicine ... and then to the Moon, in a hot air balloon! (The moon is up, hot air balloons go up, what's the problem?)

How many patients has Chu-Carroll seen lately?

(Her bio at http://researcher.ibm.com/researcher/view.php?person=us-jencc was down when I first tried it; "The server at researcher.ibm.com is taking too long to respond." Watson must be taking a nap. Archive.org says "The connection has timed out. The server at web.archive.org is taking too long to respond." Just like a doctor!)

Anyway, it now appears:

I am a Research Staff Member at IBM T. J. Watson Research Center. I also manage the Knowledge Structures group which focuses on improving advanced search technology through the use of natural language processing and machine learning techniques. Prior to joining IBM in 2001, I spent 5 years as a Member of Technical Staff at Lucent Technologies Bell Laboratories. My research interests include question answering, semantic search, natural language discourse processing, and spoken dialogue management.

Please, please, please, computer scientists: STOP MAKING THESE PREDICTIONS OF CYBERNETIC MIRACLES JUST AROUND THE CORNER. YOU'VE BEEN DOING IT SINCE THE VACUUM TUBE-BASED MACHINES.

STOP! PLEASE!

(Maybe first they could give us a computer that does a perfect, lucid, coherent, fluent translation of, say, Russian-to-English, another promise made since the 1950s "Robby the Robot" years?)


Robby the Robot in "Forbidden Planet." Lost in Space fans will remember Robby as doing battle with the Class M-3 Model B9, General Utility Non-Theorizing Environmental Control Robot ("Danger, Danger, Will Robinson") as well!

I think the essay in today's WSJ by UC Berkeley philosopher John Searle is also apropos: Watson Doesn't Know It Won on 'Jeopardy!'

-- SS

Addendum: See my Wall Street Journal Letter to the Editor on these matters at this link.

I wrote on a somewhat sarcastic note at this March 2011 post: "Here Comes the Judge! A Quick Thought on Cybernetic Medicine: Why Can't Computers Also Do Law?"

Also see my Sept. 2011 followup post "Once Again, on IBM Watson, Cybernetic Miracles and Reductionist Views of Medicine."

-- SS

Tuesday, August 10, 2010

UK: ISO draft standards for the development, manufacture and deployment of healthcare IT focus on SAFETY

The UK's NHS has not had the best of success to date implementing national health IT, as indicated by reports here and here, for example.

However, they appear to have learned from their mistakes and in fact are on the way to being far ahead of the U.S. in terms of understanding what it truly takes for HIT to be efficacious - and perhaps even more importantly, as safe as possible.

From an informatics colleague who informed me of these developments:

The UK has recently adopted the ISO draft standards for the development and deployment of HIT. They don't go as far as premarket approval, but do require vendors to develop and deliver to healthcare organizations a formal hazard assessment for their products, require both to continually update their risk assessments, and require care delivery organizations to have an explicit process for identifying & mitigating risks, and formally accepting (or not) the residual risks that remain. The thinking is these standards will be adopted across the EU once the ISO approval process is completed.


These two remarkable documents are available from the UK's NHS:

http://www.isb.nhs.uk/documents/isb-0160/dscn-18-2009

"Health informatics — Guidance on the management of clinical risk relating to the deployment and use of health software"

Formerly ISO/TR 29322:2008(E)
DSCN18/2009

and

http://www.isb.nhs.uk/documents/isb-0129/dscn-14-2009

"Health Informatics — Application of clinical risk management to the manufacture of health software"
Formerly ISO/TS 29321:2008(E)
DSCN14/2009

From the first of these, the overall intro:

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

Then on to matters at hand:

Introduction

The threat to patient safety

There is mounting concern around the world about the substantial number of avoidable clinical incidents which have an adverse effect on patients, of which a significant proportion result in avoidable death or serious disability, see references [1], [2], [3], [4], [5] and [6]. A number of such avoidable incidents involved poor or "wrong" diagnoses or other decisions. A contributing factor is often missing or incomplete information, or simply ignorance, e.g. of clinical options in difficult circumstances or of the cross-reaction of treatments (a substantial percentage of clinical incidents are related to missing or incomplete information).

It is increasingly claimed that information systems such as decision support, protocols, guidelines and pathways could markedly reduce such adverse effects.

[As I have written in many places such as here and here, this may or may not be true regarding today's commercial healthcare IT as it is currently designed and deployed. Evidence supporting the assertion, especially robust studies such as randomized controlled clinical trials, is scarce, and evidence contradicting it is growing. The technology remains experimental - ed.]


If for no other reason – and there are others – this is leading to increasing deployment and use of increasingly complex health software systems, such as for decision support and disease management. It can also be anticipated that, due to pressures on time and to medico-legal aspects, clinicians will increasingly rely on such systems, with less questioning of their "output", as a "foreground" part of care delivery rather than as a "background" adjunct to it. Indeed, as such systems become integrated with medical care, any failure by clinicians to use standard support facilities may be criticised on legal grounds.

Increased use of such systems is not only in clinical treatment but also in areas just as important to patient safety, such as referral decision-making. Failure to make a "correct" referral, or to make one "in time", can have serious consequences.

Economic pressures are also leading to more decision support systems. The area of generic and/or economic prescribing is the most obvious, but achieving economy in the number and costs of clinical investigative tests is another.

Thus the use of health software and medical devices in increasingly integrated systems, e.g. networks, can bring substantial benefit to patients. However unless they are proven to be safe and fit for purpose they may also present potential for harm or at least deter clinical and other health delivery staff from making use of them, to the ultimate detriment of patients. Annex A provides some examples of the potential for harm.

Harm can of course result from unquestioning and/or non-professional use, although the manufacturers of health software products, and those in health organizations deploying and using such products within systems, can mitigate such circumstances through, for example, instructions for use, training and on-screen presentation techniques, guidance, warnings or instructions.

Some of these system deficiencies are insidious, may be invisible to the end user [an obviously perilous situation - ed.] and are typically out of the sole control of either the manufacturer or the deploying health organization.

The reports note the obvious, something that the health IT vendors' contractual gag clauses and secrecy in the health IT industry make difficult to rigorously evaluate:

A necessary pre-cursor for determining and implementing controls to minimize risks to patients, from a health software systems that is manufactured and then deployed and used within a health organization, is a clear understanding of the risks which the deployed system might present to patients if malfunction or an unintended event were to occur, and the likelihood of such a malfunction or event causing harm to the patient.

These risks cannot be properly evaluated in an industry where the flows of information are dominated by the vendors.

Some examples of potential for harm, from annex (appendix) A, will likely sound quite familiar to readers of Healthcare Renewal:

  • Patient (mis)identification
  • Inadvertent accidental prescribing of dangerous drugs (such as methotrexate)
  • Incorrect patient details retrieved from radiology information system
  • CT and MRI images could not be seen after being moved to PACS
  • Drug mapping error
  • Pre-natal screening risk computation errors
  • Radiotherapy errors
  • Slack security

I especially note the following in the first document (on deployment):

5.3 Competencies of personnel

Persons performing risk management tasks will need to have the knowledge, experience and competencies appropriate to the tasks assigned to them. This will need to include, where appropriate, knowledge and experience of the particular health software systems (or similar health software products) and applications, the technologies involved and risk management techniques. This should include appropriate registered clinical input throughout the process. Appropriate competency and experience records will need to be maintained.

Clinical risk management tasks can, and should, be performed by a project team that contains representatives of each of the functions that are involved in deploying and subsequently using the health software systems or system, with each contributing their specialist knowledge to build both awareness and consensus. Of particular importance will be clinical input from clinicians who are familiar with the practical realities of the environments within which the software system will be used and the clinical processes to which the software system is directed.

Emphasis on the last sentence is mine. At a time when U.S. CIOs and health IT "talking heads" still find the need to write touchy-feely "Master of the Obvious" articles extolling the virtues of permitting clinicians 'input' into health IT projects, usually under the aegis of unempowered "Directors of Informatics" or "Chief Medical Information Officers" (a.k.a. Directors of Nothing and Chiefs of Nothing, with no true executive presence or authority), the latter direct, definitive sentence is refreshing.

Miracle of miracles, even postmarketing surveillance is covered (the pharma and medical device industries have been mandated by regulators to conduct such studies on their products for decades):

11 Post-deployment monitoring

Both manufacturers and organizations deploying and using health software and other products within systems, have a business need to establish, document and maintain a process to collect and review information about the clinical safety performance of the products and system in the post-deployment phase, at least to help manage their liabilities but also to enable them to optimize their products and systems.

There is much more in these documents.

Download and read the PDFs. I will have more to say in future posts, but thank god someone is considering the risks to patients of this technology, touted as universally beneficent by health IT exceptionalists, in a serious manner.

Now if only we can import this thinking into the United States.

-- SS