Human-Computer Interaction Overview

By Ed Hilpert

'Whether you like it or not, whether you are ready or not, the computer is encroaching on your everyday life at an ever-increasing pace. Small or special-purpose computers that power automated teller machines, digital cellular telephones, pagers, and video games have all become widely accepted by people other than the technical gadget-hounds who originally bought them. This article describes the social interfaces that are making our increasingly technology-controlled world more friendly.'



Many futurists predict that personal computers, in one form or another, are destined for broad usage, controlling everything from your evening entertainment to your air conditioner. Vice President Al Gore's widely accepted vision of a national information infrastructure as the backbone of an America that leads the world in information technology has sparked a rush of activity.

If America is to succeed in achieving this vision of widespread computer use, a fundamental change must occur in today's computer user interface. Studies have shown that before the personal computer can be brought to the ubiquitous consumer product level, it must be made easier for the broad range of consumers who will use it. Computers must be accessible to people without formal education, people who are afraid of computers, people who are physically challenged, and people who just aren't inclined to read manuals or attend training classes. We in the computer industry need to make it possible for people to apply the skills they already have to their computers - making this interaction closer to how they interact during their normal day-to-day activities. In OS/2 development, we are studying ways we can increase the "human-ness" of human-computer interaction (HCI) to lead us into the age of natural computing.

Natural computing? An obvious oxymoron, right? In the past it certainly has been, with the proliferation of manuals, cables, command lines, and "techno-babble" that only the propeller-head crowd could love. Our HCI work concentrates on improving the way you deal with your computer in a manner that is more consistent with the way you deal with other human beings in the real world. HCI includes new graphical user interfaces, enhanced communications, collaborative applications, speech and handwriting recognition, and signature verification. On the not-so-distant horizon, additions to this list will include agent technology, on-screen actors, virtual reality, computer vision, and speaker verification. Above all, HCI must be intuitive for the average consumer. We must remember that people have trouble setting the time on their digital watches and VCRs. We cannot expect consumer-level users to deal with their CONFIG.SYS file or install a new device driver.

Promises, Promises
Just about any "futuristic" science fiction movie or TV show in the past 30 years depicts people interacting conversationally with computers, without the aid of today's standard, limiting keyboard and mouse interfaces. Gene Roddenberry's (and Paramount Pictures') Star Trek TV series and movies are perhaps the most widely recognized and popularly followed examples of this genre. Throughout the years, we have heard captains and crew verbally request data and actions by addressing the computer with the word "Computer" (or a more personal name), watched the computer independently decide how to solve the problem without human guidance, and heard the responses in a soothing voice.

Other examples outside Star Trek hold similar promises: HAL from 2001: A Space Odyssey, the holographic picture viewer from Blade Runner, and the vehicle security and operations interfaces in Demolition Man and Earth 2, just to name a few. All have a common thread woven throughout their stories: they depict verbal interfaces with flawless and immediate speech recognition, natural language analysis, independent computation and/or action on behalf of the user (often through vast networks of data), and natural-sounding speech responses.

Writers, producers, and studios show the promise of "the future" through the magic of TV, movies, and the pages of novels. Those of us in the business of delivering these capabilities to the consumer have to live up to the promises that have been made for us by these imaginative, forward-thinking people. To use an entertainment industry cliché: it's a tough act to follow.

Foundations
The increase in personal computer power, with the accompanying decrease in size, has opened up new computer application possibilities. Notebook computers are now a popular alternative to their desktop brethren, providing almost as much functionality and speed, but at a price premium (mostly for the flat-screen liquid crystal displays [LCDs]). Subnotebooks, tablets, and hand-held computers, while not enjoying as meteoric a rise in popularity, have found their niches. Wireless communications are beginning to affect portable computing and will really explode when Cellular Digital Packet Data (CDPD) and Personal Communications Services (PCS) are fully deployed and pricing re-enters earth's atmosphere from its current orbital levels. The stage is nearly set for "anytime, anywhere" computing.

Even with all the advances, computer hardware has not yet provided enough processing power to fulfill all the promises made by science fiction writers. We are still limited by the speed and storage space of today's machines, especially for comprehensive natural language processing. Much of the interesting university research is being conducted with workstations considerably more powerful than today's average personal computer. Once the researchers discover new algorithms and applications, it will take several years for average computer equipment to attain the computing power these functions require. Dr. Raj Reddy of Carnegie Mellon University, in a recent talk at IBM in Boca Raton, Florida, predicted a "giga-computer" by the end of the millennium: one billion instructions per second and one billion bytes of RAM. In his speech he also predicted that even this advanced computer will still limit what computer scientists will be able to achieve.

Recent trends in personal computing have brought us multimedia computers, with larger, brighter, denser displays, high-fidelity speakers, and CD-ROMs. The next advance promises the marriage of telephones, cable TV, and computers and will integrate--in the same box--voice-activated automatic dialers, speaker phones, answering machines, video phones, video on demand, and two-way communication with the mass media.

Personal computer software has moved with hardware into the 32-bit realm of nearly unlimited addressing capabilities, preemptive multitasking, and breathtakingly beautiful graphics. But if computers are truly going to move into the consumer electronics marketplace, software must be easy to write (for the producing companies), and it must be "playable" on a variety of systems, just as a music CD can be played on CD players from a multitude of providers. The competing standards of Object Linking and Embedding (OLE) from Microsoft and OpenDoc from the consortium that includes IBM, WordPerfect, Lotus, and Apple, together with the principles of code reuse that object-oriented programming and C++ provide, are ushering in a new era in software that promises faster time to market, fewer programming bugs, and code that operates on more than one type of machine and operating system.

HCI Today
So if HCI means dealing with computers using the interpersonal skills you have developed in your lifetime, how have advances in hardware and software improved this interface, and what skills can you use today in dealing with your computer?

Much of the data absorbed by the average person in a normal day comes through sight. It is not surprising, then, that considerable attention has been paid to the computer's graphical user interface (GUI). Apple Computer's Macintosh, originally touted as "the computer for the rest of us" because of its innovative graphics and plug-and-play ease of use, has lost much of its technological lead in the interface race. At the time this article was written, Microsoft had just announced "Bob," its first attempt at a "social interface" on a personal computer. This interface has drawn quite a bit of flak from the computer trade press and to date has been a marketing failure. Nevertheless, it is an interesting attempt at making the computer more accessible to "normal" people.

Most other computer manufacturers and software producers have added or will be adding a navigable real-world interface to their software. This means that the user is greeted by a picture, then chooses objects representing programs to accomplish tasks. To date these are mostly "launcher" technologies that thinly wrap the underlying window interface with a rich graphical representation of the world - not a fundamental change in the way the user interacts with the computer. Representative examples of these offerings are General Magic's Magic Cap, Packard Bell's Navigator, and Computer Associates' Simply Village.

Writing and speech are also common ways for people to interact. Several companies, including IBM, have products today that recognize a person's handwriting and speech. Touch, or pointing with a pen, is a more natural method of relating to objects than using a mouse or arrow keys, even though people have adapted well to using them. To date, however, there has been no compelling reason for computer manufacturers to include pen hardware on personal computer systems for general consumption. Internet commerce will create the need to verify an individual's signature. The technology exists today to capture a signature and verify it with great accuracy against a previously stored signature. IBM's Pen for OS/2 and the pen computing market were highlighted in Personal Systems' November/December 1994 issue, so we won't focus much attention on them in this issue. Microsoft's Windows for Pen Computing, Telxon's PenRight!, and IBM's Pen for OS/2 are the major forces in the pen computing arena today.

IBM's VoiceType speech recognition system is the focus of Andy Hirshik's article in this issue. Speech is beginning to find its way into major manufacturers' preloaded offerings, with speech navigation of the GUI as well as control of well-behaved applications and leading-edge games. These functions are considerably easier than free-form dictation because the limitations on the grammar and vocabulary allow the software to make better guesses at ambiguous input. The same limitations also lessen the strain on computer resources by requiring less memory and disk storage. Speech recognition has the potential to be a breakthrough technology for computer neophytes. Figuring out how to do something on a computer has always been the user's burden. Wouldn't it be nice to have the computer worry about how to fax a letter or find a file for a change? This interaction is modeled more after that of an assistant or associate and represents the horizon of the new HCI.
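To see why a limited vocabulary makes ambiguous input easier to resolve, consider the following minimal sketch in Python. It is an illustration only, not how VoiceType or any other product works: the command list and the recognize_command function are hypothetical, and ordinary text similarity from the standard difflib module stands in for acoustic matching.

```python
import difflib

# Hypothetical command vocabulary for speech navigation of a GUI.
# A real recognizer would compare sounds; text similarity is a stand-in.
COMMANDS = ["open file", "close window", "print document", "send fax", "find file"]


def recognize_command(heard, vocabulary=COMMANDS, cutoff=0.6):
    """Return the vocabulary entry closest to the noisy input, or None.

    Because only a handful of commands are legal, a garbled phrase can
    still be resolved to the one command it most resembles.
    """
    matches = difflib.get_close_matches(heard.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With this vocabulary, the misheard phrase "send facts" resolves to "send fax", while input that resembles no command at all yields None, so the software can ask the user to repeat the request. Free-form dictation has no such short list to fall back on, which is one reason it is the harder problem.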

Two final areas of interface that are being incorporated into HCI are agents and collaboration. Agents are people who undertake tasks for you within a specific set of guidelines that you provide. Travel agents are a good example of this type of interaction. You provide guidelines to a travel agent for dates, times, airlines, meals, and seating preferences. Using your guidelines, the travel agent books travel on your behalf if a critical mass of your criteria is met. Agent technology on personal computers works the same way. Some of the agents available today help filter and prioritize e-mail, remind you of appointments, and react to conditions on your computer. Future agent technology promises a much richer set of functions, including replacing the human travel agent for making travel arrangements and researching information in remote databases. Agent technology is still in its infancy, with General Magic's Telescript being the first and most prominent agent technology in the industry today. IBM offers a product called IntelliAgent, which will perform some tasks for you on your desktop without your intervention. The result is that you can spend less time doing mundane tasks.
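The travel agent analogy can be sketched in a few lines of Python. This is a hypothetical illustration, not the design of Telescript or IntelliAgent: the Flight fields, the choose_flight function, and the 75 percent "critical mass" threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Flight:
    # Hypothetical attributes a traveler might state preferences about.
    airline: str
    departure_hour: int
    meal: str
    seat: str


def criteria_met(flight, preferences):
    """Count how many of the traveler's stated preferences this flight satisfies."""
    return sum(1 for name, wanted in preferences.items()
               if getattr(flight, name) == wanted)


def choose_flight(flights, preferences, required_fraction=0.75):
    """Return the best-matching flight if it meets a critical mass of criteria.

    Returns None when no flight is good enough, which is the point at
    which a real agent would come back to the user for guidance.
    """
    if not flights or not preferences:
        return None
    best = max(flights, key=lambda f: criteria_met(f, preferences))
    if criteria_met(best, preferences) / len(preferences) >= required_fraction:
        return best
    return None
```

The threshold is the interesting design choice: set it too low and the agent books trips you did not want, set it at 100 percent and it rarely acts without asking, which defeats the purpose of delegating the task.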

People work together to accomplish shared goals. They collaborate using meetings and processes that automate the work flow in a business. Recent articles in computer trade magazines indicate that the marketplace for solutions involving teleconferencing and groupware is heating up. Teleconferencing requires multimedia computers, microphones, cameras, and relatively high-speed communications lines to create virtual meeting spaces. Teleconferencing has the potential to reduce travel budgets while increasing the quality of communications between geographically dispersed parties who have to cooperate to get something done. This technology is being used today (although it is still expensive and the required communications services are not pervasive) to allow many people to work from home. The term coined for this form of working is "telecommuting." Many municipalities are encouraging businesses to adopt this technology because of the potential impact it has on road use, energy consumption, and air pollution - not to mention giving people more time at home with their families.

Groupware is software that automates a process within a business. The process may be creating a document or specification that many people must contribute to or review, or a progression of stages in a pipeline, like a mortgage approval. Groupware allows computer users to conduct business more as they would if there were no computers - collaboratively - with unencumbered communications. Today, Lotus Development Corporation's Notes is the most popular groupware product on the market.
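A pipeline of this kind can be pictured as a work item that moves from stage to stage as each person signs off. The Python sketch below is a minimal illustration under assumed names: the stage list and the WorkItem class are hypothetical and do not describe how Notes or any other groupware product is built.

```python
# Hypothetical stages for a mortgage approval; a real process would differ.
STAGES = ["application", "credit check", "appraisal", "underwriting", "approved"]


class WorkItem:
    """A document that advances through the pipeline one sign-off at a time."""

    def __init__(self, name):
        self.name = name
        self.stage_index = 0
        self.history = []  # (stage, reviewer) pairs, in the order they occurred

    @property
    def stage(self):
        return STAGES[self.stage_index]

    def sign_off(self, reviewer):
        """Record the reviewer's approval and move to the next stage."""
        if self.stage_index == len(STAGES) - 1:
            raise ValueError("work item has already completed the pipeline")
        self.history.append((self.stage, reviewer))
        self.stage_index += 1
        return self.stage
```

Because every sign-off is recorded in the history, anyone involved can see where the item stands and who has already handled it, which is much of what groupware contributes over passing paper from desk to desk.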

Much more work will be devoted in the coming years towards making HCI more natural. These technologies, in combination or individually, have the potential to profoundly impact our everyday lives. Just think, someday you will be able to say "Computer, plot a course to the nearest pizza restaurant, top speed ... Engage."