IVR Technology: IVR and XML (Page 3) - Technical Library IVR Outsourcing Services Answering Services and Voice Broadcasting

IVR Software Solutions

Our technical library presents information and documentation relating to IVR software solutions and custom IVR software and products. Business phone systems and 800 answering systems are very popular with service and sales organizations, allowing customers and prospects to call your organization from anywhere in the country.

What Is IVR? Interactive Voice Response (IVR) processes inbound phone calls, plays recorded messages, including information extracted from databases and the internet, and can route calls either to in-house service agents or to an outside extension.

XML Meets "IVR With An LCD"

By Robert Richardson, Computer Telephony

Page 3 of 3

But that's only half of the problem. If application providers really want to give users choices on their own terms, they'll need to offer voice and onscreen menuing options that mirror each other. If the WAP side loses the context when it hands off to a VoiceXML application, the user is forced to repeat menu choices made just moments before, this time using voice commands.

Multi-modal options look pretty dorky so far, given these problems, but you can't lay the blame entirely at the feet of the WAP Forum. VoiceXML shares some of WAP's "early-days" shortcomings (see The New IVR: Talking to You, September 2000, for a rundown on the basics of the scheme). In some respects, though, VoiceXML is slightly better prepared for a multi-modal world. For one thing, VoiceXML supports a tag that will initiate a call and provide rudimentary monitoring of call progress. Unlike a WAP-initiated call, this kind of call can be placed as either a bridged or a blind transfer. If blind, it's no different from the WAP call to a voice number. If bridged, however, the new phone line is conferenced into the existing call. A bridged transfer assumes that the call will terminate within a preset time limit and that control will then return to the current VoiceXML page (and, in fact, voice options from that page are still in operation within the call). The VoiceXML server never hangs up, so the context of the call isn't lost.

The fly in this ointment is that a bridged call, by virtue of the fact that the call that's already in progress is a voice call, can't handle data packets. So you won't be updating your WAP deck with a bridged call.
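
To make the bridged-versus-blind distinction concrete, here is a minimal Python sketch that builds a VoiceXML page around the <transfer> element, presumably the tag referred to above, whose bridge attribute selects between the two behaviors. The function name, the form id, the phone number, the prompt wording, and the 60-second limit are all illustrative assumptions, and the page leaves out details (such as the namespace declaration) that a real deployment would need.

```python
import xml.etree.ElementTree as ET


def build_transfer_page(dest, bridged, max_seconds=60):
    """Build a small VoiceXML page containing one <transfer> element.

    `dest`, `max_seconds`, and the ids used here are hypothetical values
    chosen for illustration only.
    """
    vxml = ET.Element("vxml", version="2.0")
    form = ET.SubElement(vxml, "form", id="call_out")
    transfer = ET.SubElement(
        form,
        "transfer",
        name="outcome",
        dest=dest,
        # bridge="true": the platform stays on the line during the call.
        # bridge="false": blind transfer, the platform drops out.
        bridge="true" if bridged else "false",
    )
    if bridged:
        # A bridged call is expected to end within a preset time limit,
        # after which control returns to this page.
        transfer.set("maxtime", f"{max_seconds}s")
        filled = ET.SubElement(transfer, "filled")
        prompt = ET.SubElement(filled, "prompt")
        prompt.text = "Your call has ended. Returning to the menu."
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_transfer_page("tel:+15555550100", bridged=True))
    print(build_transfer_page("tel:+15555550100", bridged=False))
```

Only the bridged variant carries a time limit and a follow-up prompt, since a blind transfer never returns control to the page.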

Common Engine of Change

So what's fueling hope for a multi-modal universe? For one thing, WAP and VoiceXML share deep common roots in the industry's general move to XML. As XML moves forward, the two protocols are tending to track each other more closely. If you have any doubts about this, just imagine trying to talk about matching any pre-Internet-era IVR system to the display of dynamic web content on a wireless LCD screen.

Says Jeff Kunins, manager of developer products and evangelism at Tellme: "One of the hugest benefits for VoiceXML is this amazing convergence of voice recognition with using the web development paradigm." What's driving acceptance of VoiceXML - and the broader concept of supporting voice access to Internet content - are web-facing servers capable of formatting markup output on the fly. This trick is a staple task of today's web servers.
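
As a rough illustration of a server formatting markup on the fly, the Python sketch below keeps one device-neutral menu definition and renders it either as a WML card or as a VoiceXML menu. The menu contents, the /menu/... URLs, and every function name are hypothetical; this is one possible arrangement, not a description of any vendor's platform.

```python
import xml.etree.ElementTree as ET

# Shared, device-neutral definition (hypothetical sample data).
MENU = {
    "title": "Flight status",
    "choices": [("arrivals", "Arrivals"), ("departures", "Departures")],
}


def render_wml(menu):
    """Render the menu as a WML card of links for a WAP handset."""
    wml = ET.Element("wml")
    card = ET.SubElement(wml, "card", id="menu", title=menu["title"])
    para = ET.SubElement(card, "p")
    for key, label in menu["choices"]:
        link = ET.SubElement(para, "a", href=f"/menu/{key}")
        link.text = label
        ET.SubElement(para, "br")
    return ET.tostring(wml, encoding="unicode")


def render_vxml(menu):
    """Render the same menu as a spoken VoiceXML <menu>."""
    vxml = ET.Element("vxml", version="2.0")
    menu_el = ET.SubElement(vxml, "menu", id="menu")
    prompt = ET.SubElement(menu_el, "prompt")
    spoken = " or ".join(label for _, label in menu["choices"])
    prompt.text = f"{menu['title']}. Say {spoken}."
    for key, label in menu["choices"]:
        choice = ET.SubElement(menu_el, "choice", next=f"/menu/{key}")
        choice.text = label
    return ET.tostring(vxml, encoding="unicode")


# One renderer per presentation mode; adding a device means adding an entry.
RENDERERS = {"wml": render_wml, "vxml": render_vxml}


def render(menu, mode):
    """Pick the markup for the requesting device's presentation mode."""
    return RENDERERS[mode](menu)


if __name__ == "__main__":
    print(render(MENU, "wml"))
    print(render(MENU, "vxml"))
```

Both renderers point at the same /menu/... targets, so a choice made by keypad and a choice made by voice lead to the same place on the server.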

"You basically can author shared business logic one time on the web server and then target the device-specific markup language to create the appropriate user interface for whatever dizzying array of devices you want to support," says Kunins. Not only is this a write-once, publish-however-you-like solution, but this should also mean that when a user moves through the business logic stored on the web server, it should be possible to switch back and forth from one presentation mode to another without losing one's place.

Says Kunins: "You can create holistic, truly multi-mode applications that are all aware of one another on the back end. This has huge economies of scale for cost and integration and customer relationship management. VoiceXML brings speech recognition into that game.
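
One way to picture applications that are "aware of one another on the back end" is a session record held on the server and shared by every presentation mode. The Python sketch below is a minimal, in-memory version of that idea; the class names, the user id, and the menu choices are assumptions made for illustration, and a production system would need persistence, expiry, and authentication.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Per-user state kept on the server, independent of any one device."""
    path: list = field(default_factory=list)  # menu choices made so far
    mode: str = "wml"                          # last presentation mode used


class Orchestrator:
    """Hypothetical back-end helper that tracks where each user is."""

    def __init__(self):
        self._sessions = {}

    def session(self, user_id):
        # Create the session on first contact, reuse it afterwards.
        return self._sessions.setdefault(user_id, Session())

    def choose(self, user_id, choice, mode):
        """Record a menu choice, whichever device it arrived from."""
        state = self.session(user_id)
        state.mode = mode
        state.path.append(choice)
        return state

    def resume(self, user_id, mode):
        """Switch presentation mode and report the choices already made."""
        state = self.session(user_id)
        state.mode = mode
        return list(state.path)


if __name__ == "__main__":
    server = Orchestrator()
    server.choose("caller-1", "flights", mode="wml")   # tapped on the handset
    server.choose("caller-1", "arrivals", mode="wml")
    # The user switches to a voice call; earlier choices are still known.
    print(server.resume("caller-1", mode="vxml"))
```

Because the path lives on the server rather than in the WAP deck or the VoiceXML page, neither markup language has to know about the other.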

"From a technology perspective, I expect us to wind up in a world where the protocols for interacting with all devices have both a push and pull semantic available to them, in the same way that a phone does with outbound calls versus calls that come in. Rather than having a universal markup language that lets you somehow string together the logic for how those different interfaces come together," Kunins says, the interweaving of logic "just gets done on the server." He expects the development of "some standard techniques that get used on the server to help orchestrate the interactions among the different devices in real time, because, fundamentally, we want to keep markup languages as simple as possible and very specific to the presentation layer for the individual device."

Further Developments

Meanwhile, the World Wide Web Consortium (W3C) has taken at least two steps that are sure to have an impact on future multi-modality. First, the group officially adopted XHTML Basic as a W3C recommendation. This puts the specification on track for IETF adoption and general use across the Internet. A key feature of XHTML Basic is its cross-device usability. It's designed to work on cell phones, PDAs, pagers, and WebTV, in addition to the traditional PC-with-a-VGA-screen.

Second, the W3C's working group held a session in conjunction with the WAP Forum at a recent meeting in Hong Kong to discuss precisely the problem of making WAP and VoiceXML aware of each other. The upshot was a decision to form a multi-modal working group. Interested parties presenting at the Hong Kong workshop included Nuance, Philips, NTT DoCoMo, IBM, NEC, PipeBeach, and OpenWave.

The WAP Forum, a technical consortium representing manufacturers and service providers for over 95% of the handsets in the global wireless market, is already taking steps toward interoperability with other XML-based protocols. Scott Goldman, CEO of the WAP Forum (Mountain View, CA - 650-949-6760), says the "conformance release" expected in June of this year "moves toward a couple of fundamental, next-generation changes for WAP. Probably the most notable one is that we're migrating the markup language from WML to XHTML." This move, he says, "presents a couple of real advantages. First, you have the opportunity to use the same tools when you are writing an application." Second, he says, developers can get far more specific information about the type of device they are sending their XHTML pages to by referring to a new function in XHTML called the user agent profile. It's nothing more than a few lines of code that describe the device in terms of its screen resolution and dimension, the kinds of input it allows, and so on. But it makes easy work out of what is now a complex and gotcha-laden task of inferring what the device must be based on what it says about its browser name and version.
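
The practical difference between a device profile and browser-name guesswork can be sketched in a few lines of Python. The field names, the pixel thresholds, and the layout labels below are invented for illustration and do not reproduce the actual profile vocabulary; the point is only that the server branches on stated capabilities instead of inferring them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceProfile:
    """Hypothetical subset of what a device might report about itself."""
    screen_width: int        # pixels
    screen_height: int       # pixels
    supports_color: bool
    input_modes: frozenset   # e.g. frozenset({"keypad"})


def choose_layout(profile):
    """Pick a page layout from declared capabilities (thresholds assumed)."""
    if profile.screen_width < 120:
        return "single-column-text"
    if not profile.supports_color:
        return "compact-mono"
    return "full-color"


if __name__ == "__main__":
    handset = DeviceProfile(96, 65, False, frozenset({"keypad"}))
    pda = DeviceProfile(160, 160, False, frozenset({"stylus"}))
    desktop = DeviceProfile(640, 480, True, frozenset({"keyboard", "mouse"}))
    for device in (handset, pda, desktop):
        print(choose_layout(device))
```

Nothing here inspects a browser name or version string, which is the gotcha-laden step the profile is meant to replace.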

For WAP watchers, Goldman notes "a number of other things are coming down the road in WAP 2.0 that will be more recognizable from a consumer standpoint. Those will include support for color, graphics, animation, large-file downloading, and synchronization with desktop PIMs."

What Goldman doesn't expect to see, interestingly enough, is code-level convergence of WAP and VoiceXML - a slightly different point of view, it seems, from the talk at the W3C's multi-modal working group. "We at the WAP Forum are trying to stick to our knitting, to doing what we know how to do best, and letting the guys who are working on voice recognition and voice markup language do what they do best. We'll just put the hooks on both sides of the equation so that they can connect to each other without being dependent on each other."
