Thursday, June 4, 2009
XML - INDEX
XML-RPC
Data Typing ZwiftBooks and XML-RPC
XML-RPC Responses
SOAP
Road to SOAP
HTTP
OVERVIEW OF SOAP
SOAP Protocol
SOAP Overview
SOAP Message Structure
SOAP Messaging Example
SOAP and Actors
SOAP Design Patterns
SOAP Faults
SOAP with Attachments
The W3C and SOAP
UNIT IV
What Is Web Services
Web Services - A ZwiftBooks Perspective
Web Services Technologies
Key Technologies
UDDI
UDDI Failure and Recovery
WSDL
From Abstraction to Reality
XML Index
ebXML
UN/CEFACT and OASIS are key players behind ebXML.
ebXML Technologies
The technical architecture consists of several pie
.NET, J2EE, and Beyond
Transactions
NET and J2EE
Sun ONE and Web Services
.NET
The .NET Platform
The .NET Framework
J2EE
Object Oriented Programming
SCRIPTING LANGUAGE
XML-RPC
• In confronting the communication problem of how a program on machine A can get some code on machine B to run, XML-RPC ignores the difficulty entirely and delegates the transport to HTTP, focusing instead on the details of what to say, not how to get the message there.
• Early work on XML-RPC was done by Dave Winer of UserLand Software.
Winer had been working on one of the classic problems of distributed computing:
• how to get software running on different platforms to communicate.
• Shortly after XML came out in 1998, Winer demonstrated cross-platform communication by placing XML remote procedure commands in the body of an HTTP POST request.
• Because XML-RPC depends on HTTP to move data from one server to another, it only needs to define an XML vocabulary that specifies the name of some piece of code to execute remotely and any parameters the code might need.
Data Typing ZwiftBooks and XML-RPC
• XML-RPC defines a small set of data types of its own to specify the parameter types of the procedure call; it does not depend on XML Schema.
• Data types include scalars such as numbers, strings, booleans, and dates, as well as complex record (struct) and list (array) structures.
ZwiftBooks and XML-RPC
• ZwiftBooks wants to allow other computer systems to query its server about the availability and delivery time of a badly needed book.
• The basic idea is that the user supplies an ISBN and a zip code and ZwiftBooks returns the guaranteed delivery time.
• The figure illustrates the use of XML-RPC over HTTP to trigger the execution of a procedure called getGuaranteedDeliveryTime based on an ISBN and a zip code.

The XML-RPC specification places a number of minimal requirements on the XML, including the following:
• The XML payload must be well-formed XML and contain a single methodCall structure.
• The methodCall element must contain a methodName sub-item consisting of a string that names the method to be called.
• If parameters are required, the methodCall element must contain a params sub-item that contains individual param elements, each of which contains a single value.
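The requirements above can be sketched with Python's standard xmlrpc.client module. This is a minimal illustration, not the ZwiftBooks implementation: the method name comes from the example above, while the ISBN and zip code values are made up.

```python
import xmlrpc.client

# Illustrative argument values (assumptions, not from the original text).
isbn = "0-123-45678-9"
zip_code = "75275"

# dumps() serializes the parameters into a single methodCall structure
# with a methodName and one param element per argument.
payload = xmlrpc.client.dumps((isbn, zip_code),
                              methodname="getGuaranteedDeliveryTime")
print(payload)

# loads() reverses the process, much as a receiving server would.
params, method = xmlrpc.client.loads(payload)
print(method, params)
```

The printed payload is the XML that would travel in the body of the HTTP POST request.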
XML-RPC Responses
• According to the rules of XML-RPC, a server must return either the result of the procedure execution or a fault element.
• Figure also illustrates the return value of an XML-RPC packaged in the data area of an HTTP reply. Again, as far as HTTP is concerned, it's just data.
• XML-RPC specifies that the response to a procedure call must be a single XML structure, a methodResponse, which can contain either the return value packaged in a single params element or a fault element which contains information about why the fault occurred.
• Figure also illustrates returning a fault element as the payload of an HTTP response.
• As we'll see with SOAP, the specification for describing failure is an important aspect of XML-based protocols.
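The two possible outcomes, a return value or a fault, can be sketched with the same standard-library module. The delivery time, fault code, and fault text below are invented for illustration.

```python
import xmlrpc.client

# A successful methodResponse carries exactly one return value in params.
ok_xml = xmlrpc.client.dumps(("24 hours",), methodresponse=True)

# A failed call is reported with a fault element instead.
# The code and message are hypothetical.
fault_xml = xmlrpc.client.dumps(xmlrpc.client.Fault(4, "Unknown ISBN"))

def read_response(xml_text):
    """Return the value from a methodResponse, or a description of the fault."""
    try:
        (value,), _ = xmlrpc.client.loads(xml_text)
        return value
    except xmlrpc.client.Fault as fault:
        return f"fault {fault.faultCode}: {fault.faultString}"

print(read_response(ok_xml))
print(read_response(fault_xml))
```

A client written this way has to be prepared for both shapes of methodResponse, which is the point the specification makes.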
SOAP
• XML is used as a protocol language that has enabled communication and data exchange across the Web.
• XML has proven effective in promoting data exchange between partners and collaborators across a wide range of industries.
• XML protocols such as XML Remote Procedure Call (XML-RPC) and Simple Object Access Protocol (SOAP) bring a new perspective to middleware: they offer platform, language, and transport independence for data exchange between partners and suppliers.
• Transporting XML inside HTTP requests has led to Web-based distributed computing and the emergence of Web services.
What Is SOAP and Why Is it Important?
• SOAP is an XML-based protocol for exchanging information in a decentralized, distributed environment.
• Made for the Web, a combination of XML and HTTP that opens up new options for distributed data exchange and interaction in a loosely coupled Web environment.
• SOAP is a technology that allows XML to move easily over the Web.
• SOAP does this by defining an XML envelope for delivering XML content and specifying a set of rules for servers to follow when they receive a SOAP message.
• SOAP has opened opportunities for extending the enterprise.

• Before SOAP, there were two options for moving data between partners.
Option-I
• The first option was to build a wide area network spanning a broad geographic region and let partners plug into it.
• This was the approach taken by Electronic Data Interchange (EDI), which defined messages and protocols for data transfer but left the network details up to the partners.
• The result was a collection of networks that pretty much locked the partners in, making it difficult and expensive to reach out to other EDI networks and costly to bring in new partners.
Option-II
• The second option was to build a distributed object infrastructure that ran over the Internet.
• Examples include Common Object Request Broker Architecture (CORBA), Remote Method Invocation (RMI), and Distributed Component Object Model (DCOM).
• The problem was that each had to decide on a protocol that could sit on top of TCP/IP and handle interobject communication.
• CORBA chose Internet Inter-ORB Protocol (IIOP), DCOM chose Object Remote Procedure Call (ORPC), and RMI chose Java Remote Method Protocol (JRMP).
• The drawback was that CORBA could talk to CORBA, RMI to RMI, and DCOM to DCOM, but they could not talk to each other nor directly to the Web except through special sockets.
SOAP is one of several options for moving data across the Web.

Option-III
• SOAP combines the data capabilities of XML with the transport capability of HTTP, overcoming the drawbacks of both EDI and tightly coupled distributed object systems such as CORBA, RMI, and DCOM.
• It does this by breaking the dependence between data and transport and in doing so opens up a new era of loosely coupled distributed data exchange.
Since its inception in 1998, SOAP has gained wide acceptance across the software industry.
Its impact is evident from the following observations:
• Web services frameworks use SOAP as the transport technology for delivering data and XML-RPC messages across distributed networks.
• Microsoft is committed to SOAP as part of its .NET initiative.
• Sun is using SOAP in its Sun Open Net Environment (Sun ONE) Web services framework.
• IBM, which has played a major role in the SOAP specification, has numerous SOAP support tools, including a SOAP toolkit for Java programmers.
• IBM has donated the toolkit to Apache Software Foundation's XML Project, which has published an Apache-SOAP implementation based on the toolkit.
• CORBA Object Request Broker (ORB) vendors such as Iona are actively supporting SOAP in the form of CORBA-to-SOAP bridges.
Road to SOAP
• As far back as the 1960s, companies turned to computer automation to reduce the paperwork burden associated with purchase orders, bills of lading, invoices, shipping orders, and payments.
• Driven by a need to standardize the exchange of data between companies doing business with each other, in 1979 the American National Standards Institute (ANSI) chartered the Accredited Standards Committee X12 (ASC X12) group to develop uniform standards for interindustry electronic interchange of business transactions.
• The result was a collection of standards known as the Electronic Data Interchange, better known today as EDI.
• As the figure shows, EDI is built around point-to-point networks that require partners to use software that implements EDI's data and messaging specifications.
• EDI is expensive both to develop and to maintain. In addition, once an EDI system is in place, changes must be agreed upon and implemented by all participants.
• For medium- and small-size businesses, EDI's cost is prohibitive.
• There will always be a need for a WAN wrapper, a network over which to deliver the data.
• Using the Internet as the global WAN wrapper and XML as the data format, the problem of data distribution is greatly simplified. The missing piece is how to get the data from point A to point B, which leads us to HTTP.
HTTP
• Although the Internet and various protocols such as FTP and TELNET had been in existence since the 1970s for moving files, sending email, and allowing individuals to connect remotely, it wasn't until 1992 that the face of the Internet was changed through the use of a simple request-response protocol known as HTTP.
• Figure shows that HTTP works much like FTP except that the contents of a file are delivered to a browser instead of a filesystem.
• EDI works by providing a collection of standard message formats and element dictionaries so that businesses can exchange data using networks of their choice.
• EDI's early success in the transportation industry led to its adoption by other industries, including health care insurance, management, financial services, and government procurement.
• Over the past two decades over 100,000 organizations have used EDI to conduct business with partners and suppliers.
• EDI suffers from the same problem faced by all pre-Web, tightly coupled technologies: network lock-in.
Both HTTP and FTP move data across the Internet. FTP delivers data directly to disk while HTTP delivers it to a browser. When the data is in HTML or a format the browser understands, we have the Web.

• To understand how XML is used as a protocol language it is instructive to take a look at how HTTP works. The first HTTP specification written by Tim Berners-Lee is a study in simple elegance.
• Clients request files from servers using a simple text string of the form:
GET filename
• This command is interpreted as a request to a server listening on port 80.
• The response of the server is either the contents of the requested file or a string indicating an error
• HTTP gains its power from its simplicity and its explicit avoidance of transport lock-in.
• HTTP sits on top of TCP/IP, which is responsible for reliably moving data between Internet nodes.
• HTTP, a simple request-response Web protocol, has been the catalyst for XML's widespread use.
• The HTTP GET command requests a Web page. The HTTP POST command delivers information and receives information back.

POST Me Some Data
• The POST command is a request for a server to do something with data delivered as part of the POST message.
• POST was included in the HTTP specification in order to deliver HTML form data to a server for processing by some server program.
• The structure of a POST request is similar to a GET, except that data intended for the server appears after the header and is referred to as the body or payload of the request.
• Figure illustrates the structure of an HTTP request showing the difference between GET and POST.
• When a POST request arrives at a server, the server looks for data following the blank line that signals the end of header information.
• This data delivery mechanism turns out to be the key element in moving XML across the Internet.
• Instead of supplying data from an HTML form, the payload slot of an HTTP request can just as easily be packaged with XML.
• The structure of an HTTP request provides an opportunity for delivering XML.
• As far as HTTP is concerned, it's just data.
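The request structure described above (header lines, a blank line, then the payload) can be sketched as follows. This is an illustration only: the path, host name, and XML content are hypothetical, and the request line uses the later HTTP/1.1 form rather than the original bare GET.

```python
def build_post(path, host, body, content_type="text/xml"):
    """Assemble a raw HTTP/1.1 POST request carrying the given payload."""
    data = body.encode("utf-8")
    headers = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(data)}",
    ]
    # A blank line (CRLF CRLF) marks the end of the header section.
    return ("\r\n".join(headers) + "\r\n\r\n").encode("ascii") + data

def split_request(raw):
    """Split a raw request into header text and payload at the first blank line."""
    head, _, payload = raw.partition(b"\r\n\r\n")
    return head.decode("ascii"), payload.decode("utf-8")

# Illustrative XML payload; HTTP treats it as opaque data.
xml_payload = "<order><isbn>0-123-45678-9</isbn></order>"
raw = build_post("/books", "www.example.com", xml_payload)
head, payload = split_request(raw)
print(head)
print(payload)
```

Nothing in build_post inspects the body, which mirrors the observation that HTTP neither knows nor cares that the payload is XML.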

• As Figure 4.6 shows, XML's transport independence means that it may be carried by any Internet protocol, including HTTP and FTP, or even sent via mail using Simple Mail Transfer Protocol (SMTP). This freedom to move data has opened the door to XML-RPC, SOAP, and the entire Web services initiative.
• XML and HTTP are loosely coupled, with no internal dependencies on each other. Distributed infrastructures such as CORBA, RMI, and DCOM are tightly coupled, with dependencies between data and transport.

OVERVIEW OF SOAP
SOAP carries on the XML-RPC tradition by defining an XML language for packaging arbitrary XML inside an XML envelope.
HTTP will usually be used as the transport protocol for SOAP messages.
SOAP is important for seeing how XML can be used to move information across the Web and how it fits in the grand vision of making XML-based distributed computing a reality.
SOAP Background
• The SOAP 1.0 specification was developed by Microsoft, DevelopMentor, and Dave Winer of UserLand Software and released in the spring of 1998.
• Prior to the release of SOAP 1.0, Winer released his work on RPC as the XML-RPC specification (http://www.xmlrpc.com), which is very close to SOAP 1.0.
• Following the release of SOAP 1.0, IBM and Lotus joined the original developers, along with a group of partners including Ariba, Commerce One, Compaq, IONA, Intel Corp., ObjectSpace, Rogue Wave, and others.
• SOAP 1.1 was published by the W3C as a Note in May 2000. In July 2001, the W3C released the first public Working Draft for SOAP Version 1.2 based on the work of the W3C's XML Protocol Working Group.
• The SOAP specification and its influences

SOAP Protocol
• As Figure illustrates, SOAP is a transport protocol similar to IIOP for CORBA, ORPC for DCOM, or JRMP for RMI.
SOAP differs from CORBA, RMI, or DCOM in several ways:
• IIOP, ORPC, and JRMP are binary protocols, while SOAP is a text-based protocol that uses XML. Using XML for data encoding makes SOAP easier to debug and easier to read than a binary stream.
• Because SOAP is text-based, it is able to move more easily across firewalls than IIOP, ORPC, or JRMP.
• SOAP is based on XML, which is standards-driven rather than vendor-driven.
• Figure 4.9. The SOAP protocol opens up new options for data exchange across the Web

• The net effect is that SOAP can be picked up by different transport protocols and delivered in different ways.
• For example, when used with HTTP it can be delivered to a Web server; when used over FTP it can be deposited directly into a file system; and when used with SMTP it can be delivered to a user's mailbox.
• Figure 4.10 illustrates that SOAP can be used for direct connection between sender and receiver, or, with the use of messaging middleware, SOAP messages can be stored for subsequent delivery and/or broadcast to multiple receivers.
• SOAP extends the Web from server-to-browser to server-to-server interaction.

• For many companies, using SOAP as a protocol for exchanging data between established partners is proving a totally satisfactory way to leverage the benefits of XML and the Web.
• All that is required is an agreed-upon schema, either a DTD or an XML Schema, for the XML data being exchanged and a SOAP server capable of handling the incoming XML as it arrives over the Web.
• Details about what kind of schemas to expect and who will check that the XML conforms to the schemas are decided offline by individuals participating in the process.
• On the software side, senders need to be involved in packaging their data in an XML document.
• For those companies already storing data in XML, this should require only minimal effort.
• If the stored XML data is not in the form required by the agreement, an XSL Transformations (XSLT) style sheet can be programmed to automate the transformation.
SOAP Overview
SOAP consists of three parts:
• Encoding rules that control XML tags that define a SOAP message and a framework that describes message content.
• Rules for exchanging application-defined data types, including when to accept or discard data or return an exception to the sender.
• Conventions for representing remote procedure calls and responses.
SOAP messages define one-way data transmission from a sender to a receiver. However, SOAP messages are often combined to implement patterns such as request-response. When using HTTP bindings with SOAP, SOAP response messages can use the same connection as the inbound request.
SOAP Message Structure
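• Figure 4.11 illustrates the structure of a SOAP message, consisting of three parts:
• The SOAP Envelope: The top-level element of a SOAP message, which wraps the Header and the Body.
• The SOAP Header: An optional element that carries information about how the message should be processed. For example, it is possible to add SOAP header information that instructs a server to add transaction or authentication information.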
• Headers are also important in building piped architectures where processing is done in stages and data is modified as it is passed from handler to handler.
• The SOAP Body: An element that must appear in a SOAP message. The Body element is where the transported XML is loaded. SOAP makes no assumptions about the kind of XML transported in the body of a SOAP message. The data may be domain-specific XML or it may take the form of a remote procedure call.

• Figure 4.11. SOAP messages have a common format that includes a SOAP Envelope, an optional Header, and a Body section that contains the message content. SOAP also defines a message path and set of roles that SOAP nodes can adopt along a path.
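The Envelope, optional Header, and Body layout can be sketched with Python's standard ElementTree module. The envelope namespace is the SOAP 1.1 one; the payload element, the header entry, and their urn:example namespaces are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ET.register_namespace("SOAP-ENV", SOAP_ENV)

def make_envelope(body_element, header_elements=()):
    """Wrap an XML element in a SOAP Envelope with an optional Header."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    if header_elements:  # the Header is optional
        header = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
        header.extend(header_elements)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")  # the Body is required
    body.append(body_element)
    return envelope

# Hypothetical payload and header entry in made-up namespaces.
payload = ET.Element("{urn:example:books}BookForSale")
ET.SubElement(payload, "{urn:example:books}isbn").text = "0-123-45678-9"
transaction = ET.Element("{urn:example:tx}Transaction")
transaction.text = "5"

envelope = make_envelope(payload, [transaction])
print(ET.tostring(envelope, encoding="unicode"))
```

Because make_envelope never looks inside body_element, the sketch reflects the point that SOAP makes no assumptions about the XML it transports.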
SOAP Messaging Example
• In the previous section we looked at sending XML-RPC over HTTP to execute specific procedures on our ZwiftBooks server. With SOAP we may still use XML-RPC to trigger specific methods on the server, or we may simply define XML elements that get processed by our SOAP server.
• To understand how SOAP works, let's continue our ZwiftBooks example and look at how SOAP may be used to expand business functionality by opening our server up to collectors who wish to notify ZwiftBooks about books they have for sale. ZwiftBooks will then add the providers and their books to the ZwiftBooks database.
• To make this happen, several things must be done by ZwiftBooks:
1. Define a top-level element and related subelements that will trigger processing of the book availability data by the SOAP server.
2. Define a schema (DTD or XML Schema) that dictates the form of the XML that will arrive from collectors and book providers.
3. Specify a namespace that is unique to ZwiftBooks. This may be the ZwiftBooks Web site or any URI.
4. Configure the server to return a fault if the incoming SOAP message does not contain one of the special elements defined in step 1.
• Figure 4.12 illustrates a SOAP request for the ZwiftBooks guaranteed delivery time for a specific book, specified by ISBN. Note that the SOAP Envelope element is the top-level element of the SOAP message, and within the SOAP Body element is found the request for GetGuaranteedDeliveryTime, packaged as an element including the ISBN.
Figure 4.12. A SOAP request sent to the ZwiftBooks server.

• A SOAP response takes the form illustrated in Figure 4.13. Here the XML response indicating the best delivery time is packaged in a SOAP message that is delivered using the standard HTTP response protocol. As far as HTTP is concerned, it's just data being returned to the client that initiated the request. However, for a client that understands SOAP, the data becomes useful information.
Figure 4.13. A SOAP response to a request to the ZwiftBooks server.
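The request, response, and fault behavior described in this section can be sketched as a small handler. This is a rough illustration under stated assumptions: the urn:example:zwiftbooks namespace, the response element names, and the fixed "24 hours" answer are all invented, and a real server would look the ISBN up.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ZWIFT_NS = "urn:example:zwiftbooks"  # hypothetical ZwiftBooks namespace
ET.register_namespace("SOAP-ENV", SOAP_ENV)

def soap(body_child):
    """Wrap one element in a minimal SOAP Envelope and Body."""
    env = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    ET.SubElement(env, f"{{{SOAP_ENV}}}Body").append(body_child)
    return ET.tostring(env, encoding="unicode")

def fault(code, text):
    """Build a SOAP 1.1 Fault element with faultcode and faultstring."""
    element = ET.Element(f"{{{SOAP_ENV}}}Fault")
    ET.SubElement(element, "faultcode").text = code
    ET.SubElement(element, "faultstring").text = text
    return element

def handle(request_xml):
    """Answer a GetGuaranteedDeliveryTime request, or return a fault."""
    body = ET.fromstring(request_xml).find(f"{{{SOAP_ENV}}}Body")
    request = body[0] if body is not None and len(body) else None
    if request is None or request.tag != f"{{{ZWIFT_NS}}}GetGuaranteedDeliveryTime":
        return soap(fault("SOAP-ENV:Client", "Unrecognized request element"))
    response = ET.Element(f"{{{ZWIFT_NS}}}GetGuaranteedDeliveryTimeResponse")
    # Placeholder answer standing in for a real lookup.
    ET.SubElement(response, f"{{{ZWIFT_NS}}}deliveryTime").text = "24 hours"
    return soap(response)

req = ET.Element(f"{{{ZWIFT_NS}}}GetGuaranteedDeliveryTime")
ET.SubElement(req, f"{{{ZWIFT_NS}}}isbn").text = "0-123-45678-9"
reply = handle(soap(req))
bad_reply = handle(soap(ET.Element("unknown")))
print(reply)
print(bad_reply)
```

In practice the strings returned by handle would be sent back as the body of the HTTP response.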

Message Paths
• An important aspect of SOAP is the provision for message paths. Independent of the transport protocol used to send SOAP messages, messages may be routed from server to server along a so-called message path. As we saw in Figure 4.11, message paths support message processing at one or more intermediate nodes in addition to the ultimate destination.
SOAP Intermediaries
• SOAP intermediaries are an essential aspect of building scalable Web-based distributed systems. Intermediaries can act in different roles, including proxies, caches, store-and-forward hops, and gateways. Again, experience with HTTP has shown that intermediaries cannot be implicitly defined but must be provided as an explicit part of the messaging path model. Thus, one of the key motivations of the Working Group is to ensure that an XML protocol supports composability between peers.
• A SOAP-compliant server must be able to act as a SOAP intermediary capable of processing and forwarding a SOAP message on a path from its origin to a final destination. SOAP intermediaries may be explicitly specified by providing their URIs as the value of the SOAP actor attribute within a SOAP header.
SOAP and Actors
A SOAP application that receives a message must take the following steps:
• Identify the parts of the SOAP message intended for that application. This means checking the header for an actor attribute that is either the URI of the application or the URI http://schemas.xmlsoap.org/soap/actor/next, which means that the application must process the header.
• Verify that all parts of the header intended for the application and associated with a mustUnderstand="true" attribute are supported by the application. If the application cannot process the message, then it must discard the message and return a SOAP fault (see the section on SOAP faults).
• Process the parts of the header intended for the application. If there are elements that include the attribute mustUnderstand="false" or that do not specify the mustUnderstand attribute, then the application may ignore those elements.
• If the application is not the ultimate destination of the message, then it must remove all header elements intended for it before forwarding the message.
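The four processing steps above can be sketched as one function. This is an illustration, not a conforming SOAP processor: the node URIs and header element names are hypothetical, the actor and mustUnderstand attributes are assumed to be qualified with the SOAP 1.1 envelope namespace, and both "1" and "true" are accepted as mandatory markers because the two spellings appear in different SOAP versions.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
NEXT = "http://schemas.xmlsoap.org/soap/actor/next"
ACTOR = f"{{{SOAP_ENV}}}actor"
MUST = f"{{{SOAP_ENV}}}mustUnderstand"

class MustUnderstandFault(Exception):
    """Raised when a mandatory header entry is not supported by this node."""

def process_headers(envelope, my_uri, supported_tags, is_intermediary=True):
    """Apply the header rules for one node; return the tags it handled."""
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        return []
    # Step 1: pick out the entries addressed to this node or to "next".
    mine = [h for h in header if h.get(ACTOR) in (my_uri, NEXT)]
    # Step 2: every mandatory entry must be supported, otherwise fault.
    for h in mine:
        if h.get(MUST) in ("1", "true") and h.tag not in supported_tags:
            raise MustUnderstandFault(h.tag)
    # Step 3: handle supported entries; optional unsupported ones are ignored.
    handled = [h.tag for h in mine if h.tag in supported_tags]
    # Step 4: an intermediary removes its own entries before forwarding.
    if is_intermediary:
        for h in mine:
            header.remove(h)
    return handled

doc = f"""<e:Envelope xmlns:e="{SOAP_ENV}" xmlns:x="urn:example:hdr">
  <e:Header>
    <x:Log e:actor="{NEXT}" e:mustUnderstand="0"/>
    <x:Auth e:actor="urn:example:node-a" e:mustUnderstand="1"/>
    <x:Audit e:actor="urn:example:node-b"/>
  </e:Header>
  <e:Body/>
</e:Envelope>"""

envelope = ET.fromstring(doc)
handled = process_headers(envelope, "urn:example:node-a", {"{urn:example:hdr}Auth"})
remaining = [h.tag for h in envelope.find(f"{{{SOAP_ENV}}}Header")]
print(handled, remaining)
```

Header entries without an actor attribute are left untouched by this sketch, on the assumption that they are meant for the ultimate destination.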
SOAP Design Patterns

SOAP Faults
SOAP with Attachments
• SOAP provides a protocol to deliver XML across the Internet. However, requirements often dictate that not just XML needs to be transported but also other related documents such as DTDs, schema, Unified Modeling Language diagrams, faxes, public and private keys, and digests that may be related to the XML. In keeping with the spirit of the Web not to introduce new technologies when existing ones are available, SOAP relies on the existing rules for HTTP attachments to deliver auxiliary data with a primary SOAP message, allowing a SOAP message to reference the attachments.
• The SOAP with Attachments (see Figure 4.15) document defines a binding for a SOAP message to be carried within a Multi-Purpose Internet Mail Extensions (MIME) multipart/related message in such a way that the processing rules for the SOAP message are preserved. The MIME multipart mechanism for encapsulation of compound documents can be used to bundle entities related to the SOAP message, such as attachments.
• Figure 4.15. SOAP with Attachments lets additional documents travel with SOAP-based XML content using HTTP as the transport protocol.
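The multipart/related packaging can be sketched with Python's standard email package. This is only an illustration of the MIME structure: the Content-ID values, the scan element that references the attachment, and the attachment bytes are all made up.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

# The primary SOAP message refers to the attachment through a cid: URI
# that matches the attachment's Content-ID (hypothetical element name).
soap_xml = (
    f'<SOAP-ENV:Envelope xmlns:SOAP-ENV="{SOAP_ENV}">'
    '<SOAP-ENV:Body><scan href="cid:cover-scan@example.com"/></SOAP-ENV:Body>'
    '</SOAP-ENV:Envelope>'
)

# The multipart/related container names the root part via the start parameter.
package = MIMEMultipart("related", type="text/xml",
                        start="<soap-root@example.com>")

root = MIMEText(soap_xml, "xml", "utf-8")  # content type becomes text/xml
root["Content-ID"] = "<soap-root@example.com>"
package.attach(root)

attachment = MIMEApplication(b"\x00\x01placeholder-bytes", "octet-stream")
attachment["Content-ID"] = "<cover-scan@example.com>"
package.attach(attachment)

parts = package.get_payload()
print(package.get_content_type(), [p["Content-ID"] for p in parts])
```

The SOAP envelope itself is unchanged by the packaging, so its processing rules are preserved while the extra document travels alongside it.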
SOAP and Firewalls
• SOAP's global reach is made possible by its alliance with HTTP, the Internet protocol that is the basis for moving data back and forth from Web servers to browsers. HTTP works by accessing Web servers on port 80, which is kept open for Web traffic. Most servers shut down other ports for security purposes.
• SOAP's use of port 80 is a double-edged sword. While an open port 80 makes SOAP messaging possible, it also makes system managers nervous about incoming SOAP traffic, since SOAP messages traveling on port 80 bypass the protection afforded by firewalls. SOAP messages can contain XML-RPC commands to execute code on the server, which requires caution to protect the server from unwanted attacks, the form of which is difficult to anticipate.
• It should be noted that while XML-RPC calls can easily pass through firewalls, XML-RPC distinguishes itself from other server traffic by including a header element that specifies content-type as text/xml. This at least alerts the server and associated firewall software that XML is being POSTed to the server.
The W3C and SOAP
Taking SOAP to the Next Level

Monday, June 1, 2009
What Is Web Services
Web services is at once a technology, a process, and a phenomenon.
As a technology it is a set of protocols that builds on the global connectivity made possible by SOAP and the synergies of XML and HTTP.
As a process, it is an approach to software discovery and connection over the Web.
As a phenomenon, it's an industry-wide realization that the decentralized, loosely coupled, synergistic Web can't be ignored.
Web services builds on SOAP's capability for distributed, decentralized network communication by adding new protocols and conventions that expose business functions to interested parties over the Internet from any Web-connected device.
SOAP, for example, is not a stand-alone technology, but the result of synergies between XML and HTTP.

Web services is a technology and process for discovery and connection.
It includes:
Describing: Web services describes its functionality and attributes so that other applications can figure out how to use it.
Web Services - A ZwiftBooks Perspective
ZwiftBooks uses a Web services repository to list its offerings.
To list its book delivery service, it must:
Decide on the service it wants to provide
Pick a registry (or registries) for uploading its information
Decide how to list its service at the registry
Define explicitly how users can connect to its service
Deciding on a Service
Picking a Registry
Deciding How to List
Table outlines the options for storing information in a UDDI repository.
Web services registries support white, yellow, and green pages.
Of course, ZwiftBooks will be in the white pages under "Z."
Individuals and software agents will search repositories.
Remember, Web services is intended for computer-to-computer interactions, even though there may be a human on the other end of the computer trying to find a book service.
Web services yellow pages will list companies according to conformant standards.
What will be important in attracting software agents
Defining How to Connect
Operation: Publish. How the provider of Web services registers itself.
Operation: Find. How an application finds a particular Web service.
Information: Service information. Describes a group of Web services. These are contained in a businessService object.
Operation: Bind. How an application connects to and interacts with Web services after it's been found.
Directory: Green pages. Technical information about the Web services provided by a given business.
Information: Binding information. The technical details necessary to invoke Web services. This includes URLs, information about method names, argument types, and so on. The bindingTemplate object represents this data.
Information: Service specification detail. This is metadata about the various specifications implemented by a given Web service. These are called tModels in the UDDI specification.
Web Services Technologies
XML,
SOAP,
UDDI,
WSDL.
The Web Services Architecture
Three major aspects to Web services:

Key Technologies
key technologies
UDDI
WSDL
SOAP

UDDI
The core of UDDI is the UDDI Business Registry.
UDDI defines an XML-based infrastructure for software to automatically discover available services on the Web, using SOAP as the protocol to invoke services.
UDDI: Public versus Private Registries
Version 2 supports both public and private Web service registries, allowing enterprises to deploy private registries to manage internal Web services using the UDDI specification.
Many IT companies are beginning to use Web services technologies behind their firewalls for application-to-application integration.
The UDDI family of specifications describes how a program can interact with a registry, including the following:
The UDDI Programmer's API Specification
The UDDI Data Structure Specification
The data structures include:
businessEntity
businessService
bindingTemplate
tModel
Using UDDI to Make the ZwiftBooks Connection
Scenario: connecting to our ZwiftBooks server using UDDI discovery.
A company is writing software that connects to several book-service providers and compares price and delivery times for each.
businessEntity
bindingTemplate
The company sets up its program to interact with the ZwiftBooks Web service.
The semantics of the service may be obtained by accessing the tModel contained in the bindingTemplate for the service.
At runtime, the program invokes the Web service based on the connection details provided in the bindingTemplate.
If the remote Web services and the calling program each accurately implement the required interface conventions, calls to the remote service will be successful.
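How the four UDDI data structures relate in this scenario can be sketched with a toy in-memory registry. This is not the UDDI API: the class names only echo the UDDI structure names, and the ToyRegistry class, the category label, and all URLs are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class TModel:
    """Stand-in for a tModel: points at a specification the service implements."""
    name: str
    overview_url: str

@dataclass
class BindingTemplate:
    """Stand-in for a bindingTemplate: where and how to invoke the service."""
    access_point: str
    tmodels: list = field(default_factory=list)

@dataclass
class BusinessService:
    """Stand-in for a businessService: one logical service offering."""
    name: str
    category: str
    bindings: list = field(default_factory=list)

@dataclass
class BusinessEntity:
    """Stand-in for a businessEntity: the company that publishes services."""
    name: str
    services: list = field(default_factory=list)

class ToyRegistry:
    """In-memory stand-in for a registry, for illustration only."""
    def __init__(self):
        self._entities = {}

    def publish(self, entity):
        self._entities[entity.name] = entity

    def find(self, category):
        return [(e, s) for e in self._entities.values()
                for s in e.services if s.category == category]

registry = ToyRegistry()
interface = TModel("DeliveryTimeInterface", "http://www.example.com/delivery.wsdl")
binding = BindingTemplate("http://www.example.com/soap", [interface])
service = BusinessService("Guaranteed delivery time", "book-delivery", [binding])
registry.publish(BusinessEntity("ZwiftBooks", [service]))

# Find by category, then read the binding details needed at runtime.
entity, found = registry.find("book-delivery")[0]
endpoint = found.bindings[0].access_point
print(entity.name, endpoint, found.bindings[0].tmodels[0].overview_url)
```

The lookup order mirrors the scenario: locate the business and its service, then follow the bindingTemplate to the connection details and the tModel to the interface description.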
UDDI Failure and Recovery
The following scenario describes how error recovery fits into Web services:
WSDL
WSDL is the part of the Web services framework that describes how to connect to Web services providers.
WSDL is an XML format for describing how one software system can connect to and utilize the services of another software system over the Internet.
WSDL uses an XML-based syntax to describe the specifics of accessing a Web service, such as the type and number of parameters passed to a service and the type and structure of the result returned.
After a Web service has been discovered (via UDDI), WSDL provides the details of how to actually bind to and interact with that service.
WSDL supports direct client interaction with a Web service provider by building on the infrastructure provided by HTTP and SOAP.
WSDL ports define data bindings.
WSDL defines services as collections of network endpoints or ports.
Figure 5.4 illustrates that in WSDL the abstract definition of endpoints and messages is separated from their concrete network-based data bindings.

From Abstraction to Reality
WSDL also relies on XML Schema.
To define an operation, we add an operation element with the same name as the previously defined operation.
Within this operation element, we now add a soap:operation element with the soapAction attribute.
Finally, we must specify how the input and output messages of this operation are encoded.
The complete binding looks like this:

The URI http://schemas.xmlsoap.org/soap/encoding/ indicates the SOAP encoding style as described in the SOAP specification.
The WSDL file is now complete, and ZwiftBooks clients can use it to figure out how to connect to the ZwiftBooks SOAP server and begin to use ZwiftBooks services.
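What a client might read out of such a binding can be sketched in Python. The WSDL fragment below is a hypothetical stand-in, not the ZwiftBooks WSDL: the binding name, port type, soapAction value, and urn:example namespace are invented, while the wsdl and soap namespace URIs are the standard WSDL 1.1 ones.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Hypothetical binding with one operation, a soapAction, and encoded bodies.
WSDL_BINDING = """
<wsdl:binding xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
              xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
              xmlns:tns="urn:example:zwiftbooks"
              name="DeliveryBinding" type="tns:DeliveryPortType">
  <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
  <wsdl:operation name="GetGuaranteedDeliveryTime">
    <soap:operation soapAction="urn:example:zwiftbooks#GetGuaranteedDeliveryTime"/>
    <wsdl:input>
      <soap:body use="encoded" namespace="urn:example:zwiftbooks"
                 encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </wsdl:input>
    <wsdl:output>
      <soap:body use="encoded" namespace="urn:example:zwiftbooks"
                 encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </wsdl:output>
  </wsdl:operation>
</wsdl:binding>
"""

def describe_operations(binding_xml):
    """Map each operation name to its (soapAction, input encodingStyle)."""
    binding = ET.fromstring(binding_xml.strip())
    result = {}
    for op in binding.findall("wsdl:operation", NS):
        action = op.find("soap:operation", NS).get("soapAction")
        encoding = op.find("wsdl:input/soap:body", NS).get("encodingStyle")
        result[op.get("name")] = (action, encoding)
    return result

ops = describe_operations(WSDL_BINDING)
print(ops)
```

A client would combine this information with the service's port address to decide where to POST the SOAP request and which SOAPAction header to send.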
Web Services Caveats
Web services risks:

Maturity
Security
Configuration Management