For decades the world's leading research organisations invested massively in the pursuit of hypertext documents, ubiquitous connectivity and universal access to resources. Then, almost without realising what they'd wrought, developers let the genie of the World Wide Web out of the bottle. They thought they had a nice little tool for processing and exchanging information in universities and laboratories - and perhaps of some military value. But who could have predicted (apart from maybe Nick Negroponte) the emergence of the great hyperspace shopping mall or the wave of turmoil and forced evolution in the basic infrastructure that drives the modern enterprise? This transformation has significantly raised the stakes for application availability, scalability, load predictability and security because applications are exposed to a wider audience with unpredictable demand.
From a basic architecture and infrastructure standpoint, the Web is pushing enterprises toward a fundamental three-tier architecture consisting of a Web-facing set of services, a middle tier of application logic and a back-end tier of database servers. In addition to all the traditional services required of any enterprise transaction application, this architecture must also cope with the new Web-driven demands for dynamic scaling of resources in response to unpredictable demand. And, of course, the need for reliability has never been greater, as evidenced by - among other things - eBay's much-reported multiple crashes last [northern] summer.
In response to these pressures, IT architects must learn new methods of managing the stream of enterprise-critical Web applications that will be required to compete in the 21st century. It won't be easy, but if architects follow some fundamental principles of partitioning and build systems knowing that they will have to scale and protect services, their chances of success are far greater.
Web Services for the Future
Web services are the combination of Web servers, cache servers, front-end security and IP traffic management functions that are used to process and manage the HTTP requests coming into the application system.
Despite the industry's focus on the Web server, that's seldom where the bottleneck is. Giga believes that other architectural elements are far more critical. For example, caching, which nobody paid much attention to in the past, has emerged as an important architectural element. There are two major divisions of caching systems: pure software on general-purpose platforms and dedicated appliances with specialised software and file systems. In general, expect substantially higher performance from the dedicated devices. In 2000, we will see more products for cache refresh and page propagation as well as more product announcements involving both higher-performing and lower-cost caches, and value-added services that come from integrating caches with Internet services. Already Exodus Communications and Inktomi have jointly announced that the former will use the latter's cache product. That means faster response time, such that if Exodus is carrying your Web site, viewers can open your pages quicker than before.
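To make the caching idea concrete, here is a minimal sketch of a page cache sitting in front of an origin Web server. It is an illustration only, not a description of any vendor's product: the class name PageCache, the fetch_from_origin callback and the fixed time-to-live refresh policy are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class PageCache:
    """Toy page cache in front of an origin Web server (hypothetical names)."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin  # assumed callback that hits the origin
        self._ttl = ttl_seconds          # assumed fixed time-to-live per page
        self._clock = clock
        self._pages: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._pages.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            # Fresh copy held in the cache: the origin server is not touched.
            self.hits += 1
            return entry[1]
        # Missing or stale: refresh from the origin and remember the result.
        self.misses += 1
        body = self._fetch(url)
        self._pages[url] = (now, body)
        return body
```

Every request answered from the cache is one the Web server never sees, which is why the cache rather than the Web server often determines response time.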
The next frontier for Web-resource management vendors is content management to allow Web site managers to specify how content should be replicated across multiple servers. This replication works in conjunction with the load balancing function (which directs demands on the system to the most appropriate server) to ensure that requests are not dispatched to servers hosting out-of-date content. A new wave of switch vendors has added a further wrinkle by offering "Web switching" technologies, where the load balancing is built into the routers and switches, as opposed to being a separate function on server-based software or a separate box altogether. Some products such as switches from ArrowPoint Communications can also selectively detect and correctly dispatch cacheable content to a cache and detour non-cacheable content around the cache to a Web server.
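The interplay between replication and load balancing can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: the Server record, the integer content_version and the "fewest active requests" rule are invented for the sketch and do not describe how any particular switch or load balancer works.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    name: str
    content_version: int      # assumed: version of the content replicated here
    active_requests: int = 0  # assumed: simple measure of current load


def pick_server(servers: List[Server], current_version: int) -> Server:
    """Choose the least-loaded server that holds the current content."""
    # Servers still hosting out-of-date content are never considered.
    fresh = [s for s in servers if s.content_version == current_version]
    if not fresh:
        raise RuntimeError("no server holds the current content")
    # Assumed policy: "most appropriate" means fewest requests in flight.
    chosen = min(fresh, key=lambda s: s.active_requests)
    chosen.active_requests += 1
    return chosen
```

A server that has not yet received the latest content drops out of the candidate list until replication catches up, however lightly loaded it is.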
The challenge for technology architects is twofold: to identify and select the necessary functions for their systems and to track the rapidly evolving choices for implementation, which cross traditional disciplines as the lines between network-resident and IT-resident resources become increasingly blurry. The incredible pressure for quick time to the Web also leads to crossing organisational barriers within companies as the development process becomes highly compressed.
Multiple categories of Web application development tools have recently emerged. When advancing guidelines for tool selection, managers should focus on providing their developers with a range of solutions. At the same time, however, selecting a set of solutions built on a common architecture - for example, Java/Enterprise JavaBeans, Component Object Model (Com) or Common Object Request Broker Architecture (Corba) - will simplify integration between applications and migration of applications between tools, if that becomes necessary.
While there are still a lot of home-brew applications out there, the use of integrated application infrastructure products such as IBM's WebSphere Application Server Enterprise Edition or the Sun-Netscape Alliance's Netscape Application Server is becoming more prevalent. These products come with pre-integrated services that are difficult to implement from scratch, such as failover (switching a transaction to a different server if one crashes) and state management (tracking a transaction).
Knowing where to locate data for electronic commerce can be a conundrum. In general, a complete e-commerce system has data residing in a number of separate repositories, including the Web server, external object request broker, specialised memory-resident relational databases and traditional back-end databases. Almost every significant application system today ends up at back-end database servers, which, for large e-commerce applications, are still predominantly Unix and/or mainframes. The vast majority of transaction systems have some eventual tie-in to mainframe databases in the corporate legacy systems, which means you'll always have to interface with the mainframe.
An interesting emerging trend is the increased use of memory-resident databases - where the database is kept in system memory with no disk accesses during execution (except for creating a log file for recovery). In more and more commerce applications, there are opportunities to employ databases that do not involve transactions against the back-end databases.
These are cases where access to the data must be quick but it doesn't make sense to go all the way back to the mainframe, so developers put a large chunk of the data, such as a product catalogue, in system memory. System memory is 100 to 1000 times faster than disk storage. Uses include session and state information, and maintaining extracts from customer files for applications such as cross-selling and promotion, affinity-club activity and user-activity logging.
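As a rough illustration of the memory-resident approach, the sketch below keeps all data in a dictionary and touches the disk only to append to a recovery log, which is replayed at start-up. The MemoryStore class, the JSON-lines log format and the file name used later are assumptions made for the example; a real product would add durability guarantees, concurrency control and log compaction.

```python
import json
import os
from typing import Dict, Optional


class MemoryStore:
    """Toy memory-resident key-value store with a recovery log (hypothetical)."""

    def __init__(self, log_path: str) -> None:
        self._log_path = log_path
        self._data: Dict[str, str] = {}
        self._replay()  # rebuild the in-memory state after a restart
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self) -> None:
        if not os.path.exists(self._log_path):
            return
        with open(self._log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self._data[record["key"]] = record["value"]

    def put(self, key: str, value: str) -> None:
        # Append to the log first, then update memory. flush() hands the data
        # to the operating system; it does not by itself force it to disk.
        self._log.write(json.dumps({"key": key, "value": value}) + "\n")
        self._log.flush()
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        # Reads are served from memory only, with no disk access.
        return self._data.get(key)

    def close(self) -> None:
        self._log.close()
```

Reads never leave memory, and the log exists solely so that a restarted server can rebuild what it held before it went down.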
Some of these are transient but need to be continuously maintained. State information, for example, tracks how far along you are in a transaction; if you've selected the red sweater and chosen Fed Ex delivery and the system goes down, that much of the transaction can be resurrected when another server picks it up.
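The sweater scenario can be sketched as follows. This is a simplified illustration, assuming that session state lives in a store shared by all application servers; SessionStore, AppServer and dispatch are hypothetical names, and the in-process dictionary merely stands in for whatever replicated or memory-resident store a real system would use.

```python
from typing import Dict, List


class SessionStore:
    """Stand-in for a state store shared by every application server."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Dict[str, str]] = {}

    def save(self, session_id: str, key: str, value: str) -> None:
        self._sessions.setdefault(session_id, {})[key] = value

    def load(self, session_id: str) -> Dict[str, str]:
        return dict(self._sessions.get(session_id, {}))


class AppServer:
    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self.store = store
        self.alive = True  # flipped to False to simulate a crash

    def handle(self, session_id: str, key: str, value: str) -> Dict[str, str]:
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        # State is written to the shared store, not kept inside this server.
        self.store.save(session_id, key, value)
        return self.store.load(session_id)


def dispatch(servers: List[AppServer], session_id: str, key: str, value: str) -> Dict[str, str]:
    """Failover: try each server in turn until one accepts the step."""
    for server in servers:
        try:
            return server.handle(session_id, key, value)
        except ConnectionError:
            continue
    raise RuntimeError("no application server available")
```

Because the selections live outside the server that first recorded them, the second server sees the sweater and the delivery choice as soon as it takes over the transaction.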
In 1998, we saw significant changes in server technology as vendors delivered more scalable Unix symmetric multiprocessors and the second generation of non-uniform memory access. Microsoft's Windows NT servers took a detour, with only a few vendors introducing eight-way or larger NT servers based on Intel's Pentium Pro chip. The introduction of the four-way Pentium II Xeon servers in 1999 temporarily defused the demand for larger servers because performance was much better than expected, and the long-delayed volume shipment of eight-way servers in 1999 further ameliorated that demand. When Windows 2000 Advanced Server and Windows 2000 Datacenter Server ship and are proven to be stable, NT's single-systems capacity will expand to 32 CPUs.
The emerging frontier for large-server technology, given the increased emphasis on continuous operations and high availability, is clearly multi-domain processing. A multi-domain system is one where multiple protected copies of the operating environment are simultaneously co-resident on the same physical server but isolated from each other so that failures within one region will not propagate to another. Multiple domains within a single system, especially as systems grow increasingly scalable, offer most of the advantages of clusters with higher performance.
In the operating systems arena, Unix and NT continue to do battle. The environments remain clearly differentiated in terms of high-end scalability and stability, with Unix still in the lead. What has changed is the absolute level of performance and the future prospects for the maturation of NT. Giga has done extensive surveys of NT reliability, and the results, while better than anticipated, indicate that NT is still not suitable for many enterprise applications, such as very large database servers. A significant number of systems experienced problems once a month or more, and a much higher incidence of data corruption was reported than we are comfortable with. Performance levels for both Unix and NT have almost doubled during the last year, but Unix continues to outperform NT by a factor of about three to four. The outlook for the near future is for both to continue to grow in capabilities, with shipment of Windows 2000, volume availability of eight-way Intel NT platforms and even more scalable Unix platforms due in early 2000.
Once dubbed "the living dead", mainframes have regained some recognition as the reliable workhorses of the industry. Mainframes continue to ship in quantity to a relatively stable base of installed clients whose processing needs continue to mushroom at a far greater rate than was anticipated either by them or by vendors as recently as five years ago. Rather than letting the platform fade into history, mainframe vendors have responded with a steady stream of developments - such as continued performance and price improvements, and enhancements in the interoperability of mainframes with IP-based environments - not only to keep the mainframe secure in its role as a high-end data repository but also to position it as a good Web citizen. Giga's experience with our client base leads us to believe that the mainframe fills a vital niche in the Web ecology because a significant fraction of enterprise data still resides there.
The one area where we continue to be cautious about mainframes is their use as primary Web servers. Besides cost, there are other factors involved in this decision, which are largely influenced by the degree of Web application dependency on legacy-application logic and data stores. Exceptions to our usual recommendation will almost always involve those applications where the majority of the data, and possibly the logic, resides on the mainframe.
The primary emerging trends in storage systems are the advent of storage area networks (SANs), fibre channel technology and the beginnings of true heterogeneous storage environments, along with management tools for NT storage. With constantly declining costs, many advanced features from high-end proprietary systems are now appearing on NT.
The transformation of the Web into an integral part of the enterprise infrastructure has resulted in a number of fundamental changes in the characteristics of enterprise IT infrastructure. IT architects must plan to incorporate new architectural elements, including caching, IP traffic management, load balancing and content replication for many large applications. The addition of new architectural layers - Web services - has changed the way entire applications are partitioned, and forces architects and developers to make new choices about partitioning workload and where various processes are performed. Another casualty of the Web era is traditional methods of capacity planning. In a time of ubiquitous connectivity, capacity planning is extremely difficult, and old models of growth, backup and contingency resources are no longer adequate.
The final impact of the Web on IT environments and infrastructure will not be fully understood for many years, but it is clearly leading IT into a world where it must be more intimately connected with the business drivers and competitive issues of its business users, as well as more directly connected with its customers. In short, the Web is a powerful catalyst for IT's transformation from a service to a core competitive competency. With attention to fundamentals of architecture and an eye for the future, IT organisations can deliver the effective underpinnings of tomorrow's growth.
Richard Fichera is a vice president at Giga Information Group and can be reached at firstname.lastname@example.org