"Binding" database functionality to your Web site, with a little help from Java
While young Bob Epstein was taking notes in his computer science class at UC-Berkeley, little did he know that one day he would compete head-on for leadership in the database-software market with his own professor, Michael Stonebraker.
Epstein graduated and went on to found Sybase, a company whose database sales are exceeded only by those of Larry Ellison's Oracle.
Professor Stonebraker, after inventing Ingres, one of the industry's first relational databases, today challenges the dominance of both Sybase and Oracle with his feisty new startup, Illustra.
Stonebraker envisioned an entirely new type of database architecture, one that made absolutely no assumptions about the type of data it would contain, and Bob Epstein isn't the only one studying the professor's latest invention. The database industry is scrambling to mimic the extensibility that Illustra (now a subsidiary of Informix) has pioneered in a revolutionary new database architecture.
For those serious developers who want to "bind" database functionality within and underneath their Web site's content, there are some powerful new tools from these three companies and others. And the good news for JavaSoft and its partners is that Java promises to play an integral role in all of them.
Relational databases are good at handling the stuff that fits neatly into spreadsheets, like short strings of text and numbers. But Web sites will soon have more complex data types like VRML, video, animation, and sound.
Professor Stonebraker's vision was to build an architecture that was infinitely extensible: one that could empower developers to create types, functions, and rules for new data types that would be too difficult to manage with conventional database tools.
This month in The Cyberstruction Zone, let's take a look under the hoods of Illustra, Sybase, and Versant to see where the most progressive of the database engine guys are headed.
A two-minute history of RDBMS on the Web
Ninety percent of all the content on the World Wide Web resides within simple, hierarchical filesystems.
Web sites today are emerging out of the "design & build" phase and into the "manage & plan" era. Integral to this migration will be the successful integration of Web-database engine technologies. (See Figure 1.)

If Web sites are going to be anything more than online marketing brochures presented in HTML format, developers need to learn about "binding," the marriage of Web content to underlying relational and object-oriented database engines.
Parallel to the evolution of Web sites from the design stage to the planning stage is the migration from a free to a commerce-driven paradigm (see Figure 2), and each step involves a further degree of sophistication in the underlying database architecture for a site.
As time unfolds, we're going to see this maturation expressed first in online catalogs, then in online publishing, and eventually in fully functional, commerce-driven site environments. (See Figure 3.)

Most of the popular relational database architectures have a 25-year-old structure, which was a dramatic improvement upon the flat-file storage of simple filesystems. In its simplest form, a relational database has an architecture that looks like this:
PARSER
---------
OPTIMIZER
---------
FUNCTION MANAGER
---------
STORAGE MANAGER
---------
DISK STORAGE
What makes one RDBMS better than another is how tightly these layers are bolted together. The more efficient the design, the less "latency" there is to slow down the system.
Once you start introducing new data types into an RDBMS engine (like sound and animation), the storage manager starts working overtime to pull out the files in response to SQL queries. This affects the latency dramatically, and many RDB systems break down.
Sybase's and Oracle's engines (Figures 4 and 5, respectively) have not been fundamentally rebuilt (particularly in the case of Oracle) for a long time; the Informix code, on the other hand, was completely rewritten and redesigned in 1991, a significant advantage to the team of Informix programmers who are now working to integrate Illustra's Web-oriented engine.
Multi-blobs of complex objects within Web content that relational databases don't understand are easier to handle for a tool like Illustra, which was designed from the ground up for data type extensibility. (See Figure 6.)
Only a few Web database architectures can boast a tight coupling of their relational engine to their Web content: Navisoft (a subsidiary of America Online that uses an architecture based on Illustra) and Netscape's LiveWire Pro (based on Informix's architecture) are among the few examples today.
"There are no turn-key solutions out there, and I wouldn't consider LiveWire Pro to be turn-key at this juncture," said Bill Ray, director of digital media distribution at Illustra. LiveWire Pro is a development environment that is early in its evolution. Two things to keep in mind: It is bundled with Informix's Online Workgroup Server, and Netscape could've chosen Oracle or Sybase instead.
Finally, there is legitimate concern on the part of those companies who want to connect their large, back-office production databases to the World Wide Web. Security is only the first of many issues that face companies with decades' worth of proprietary data behind their corporate firewalls.
Jumpstarting Java for Web databases
Someday, a SQL server will be able to externally call a Java stored procedure, and Java-enabled browsers will have functionality that takes dynamically stored data into a new realm. Today, however, server-side Java applets (Sybase engineers call them "datalets") can call SQL server databases using Sybase's OpenClient and ObjectConnect frameworks. Take a look at Illustra's demo called "UMV" on www.illustra.com.
All the companies interviewed for this column agreed that Java will soon become the tool of choice for creating "hooks" that will allow HTML-formatted user interfaces to passively or actively capture clickstream data from online Web travelers.
While both Microsoft and Oracle have licensed Java, it's not clear whether they and others will eventually choose to implement it in their Web-database engine technologies.
The following questions remain: At what point is it safe to build a server-side commercial application with Java? And at what point will there really be portability to platforms other than Sun or Intel/Windows? When will Apple, HP, IBM, and DEC jump on board the Java train? With at least five hardware vendors designing Java chips, how will this affect portability and Java's role in speeding up queries to Web databases?
"Microsoft certainly has the ability to set standards," said Rich Mironov, director of Sybase's Internet products group. "I think they're going to twist OLE very hard, and we'll have [at least] two standards: plug-ins for Netscape (NSAPI) and OLE for Windows (ISAPI). And maybe one more. Microsoft's standards will always be there, it's just a question of whether they will dominate. But the Netscape folks have been reading the Microsoft business plan for years, and who knows where Netscape will go next?"
Most of the database vendors interviewed are concerned that premature implementations of Java could backfire.
"Until there's a stable, efficient Java compiler on more than a few platforms, our customers are not willing to gamble on untested code," said Mironov. He added that, because of market pressure and the need for a solution like Java, he believes Java will mature in a brief two-year cycle rather than the 10-year cycle typical of other new technology standards.
"Most of our customers are going to experiment rather than implement, except where they are doing media-oriented or client-oriented stuff where the sizzle is the important thing," said Mironov.
C was created and promoted by AT&T, which had no hardware platform to worry about. Java is being pushed by Sun, which has a lot of mixed incentives. HP and IBM might be forced to do Java, for example, but will they like it?
"Java is still only 50 percent as fast as C at best," said Mironov. "The reality is going to be limited until there are native compilers that optimize and speed things up a lot more."
Microsoft doesn't have to be the company that writes a third-party Java implementation for Windows, but it's instructive to learn from Microsoft's success in bending the SQL Access Group sufficiently that it helped create a Microsoft-owned ODBC standard. After the standardization committees disbanded, everybody else had to follow this direction because Microsoft had put its official stamp of approval on ODBC.
The Microsoft strategy is very clear and very well established: Every year, the company takes one new product area and makes it cease to exist. The next target may be to take the DBMS market and the Web server market and make them an integral part of Microsoft's operating systems.
"The word on the street is that the next thing Microsoft will do with its OS will be to make its browser more integral," said Mironov. "You'll be able to look at all your files with the browser and it'll be more seamless whether you are actually on the Net or not."
"Compared to Java, Perl and CGI create a great big hairball of code that you have to maintain," said Bill Ray of Illustra. "They're both slow, they're insecure, they're middleware, and they're difficult to manage. You can write two stupid scripts to get data in and out, or you can write one smart one to do both."
Some drawbacks to CGI and Perl: They're less efficient, the scripts can be difficult to manage and maintain, the middleware layers are potential security holes, and there is process overhead associated with CGI programming.
Illustra's Web DataBlade eliminates the need for Perl programming. Web content and code are both stored in the database itself, eliminating the middleware level. Web DataBlade introduces an application page paradigm: an application page is passed as an argument to a database server function, WebExplode, which parses and executes it. Application pages can contain HTML, SQL, and Java; and, since they are objects managed by Illustra, they can be protected, reused, and combined with other application pages in building Web-based applications.
In contrast, Sybase thinks more highly of Perl, at least for now. The first version of Sybase's web.sql will include the OpenClient tool, which gives developers secure access to the whole back-office production database. However, it is via a Perl interpreter that all this is made accessible. According to Sybase engineers, the second version of web.sql will become more "Java centric."
"Would you be willing to put your 0 million production database on the hook with somebody's compiler that's really only had 200 person-hours of testing?" asked Mironov. "We're going to do our next version of web.sql in Java probably, and it will access Java applets and have all the appropriate underpinnings so that you can get to Javaish interfaces for object connecting on a SQL server. But we're under the classic pressure of 'get something done so that our customers can get real work done today.'"
While the Perl language is only 8-bit, and while it's not a real-time architecture, it has been in use by Web developers for a long time, and a lot of people know how to use it.
Illustra pushes the envelope
As a small, start-up company trying to gain traction against Oracle, Sybase, Ingres, and Informix, the only way Illustra can compete is to narrowly focus and try to identify those areas where it can play ball with a competitive advantage.
Illustra's architecture has sought to "have its cake and eat it too." From relational design, it borrowed functions like row and column SQL querying, commit, rollback, use, grant, and revoke; from object-oriented databases, it borrowed polymorphism, inheritance, and type extensibility. It is heading into the upper-right portion of the graph presented in Figure 7.
Before it decided to purchase the Illustra company, Informix had a project nicknamed the "Illustra Killer Project."
After doing the "buy vs. build" calculus, the executives at Informix decided to acquire Illustra to build upon the tool's extensible architecture. Previously, Informix had made its mark on the strength of its scalability, parallelization of queries, manageability, and performance. Today, Informix and Illustra engineers are working frantically to couple the two database engines, with a target completion date of year-end 1996.
"At Illustra, we are building on the database's inherent extensibility and providing the connectivity to the adjacent technology that you need to build these second- and third-generation sites," said Ray.
As the Web migrates from "state-less" to "state-full" Web applications, and from "publish & subscribe" to "client/server" architectures, developers must learn to build small modules that will surround where the Web server used to be. Figure 8 illustrates the kinds of functions and tool vendors in these two application class areas.
"We're competing in certain 'core services' areas," said Ray. "We want to give developers reusable code that provides a framework upon which we can build turn-key templates." Figure 9 illustrates the underlying architecture for Illustra's strategy.
Illustra is currently delivering three layers of technology to Web developers. The first layer is a completely extensible database engine that makes no assumptions about the types of data to be managed. The next layer is the Web DataBlade itself, which facilitates the connection between Illustra and any Web server. On top of these two layers are application templates, which are managed collections of application pages available for customization for a specific client implementation.
Illustra refers to a new breed of specialized, add-on DataBlades as tools that slice and dice a database in unique ways, similar to how class libraries operate. "You take the complexity out of the application, and let the DataBlades execute directly within the engine," said Ray. DataBlades (see Figure 10) will let developers define functions that "teach" their databases how to deal with user-defined data types.
Further, DataBlades allow end users and developers to innovate as opposed to waiting for Illustra to add support for a specific data type. Anyone can develop DataBlades. Illustra builds DataBlades that are horizontal in nature (e.g., TimeSeries for the financial markets, 2D and 3D for geospatial markets, etc.), but third parties can also provide DataBlades (e.g., PLS for text retrieval and Verity for searching).
When a developer creates a "type" in Illustra, what he's really doing is inserting a type definition into a type meta-table; defining a function works in much the same way.
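To make the meta-table idea concrete, here is a minimal Java sketch. It is not Illustra's API: the class `TypeCatalog`, its methods, and the `sound_clip` example are all hypothetical names chosen for illustration. It only shows the general pattern of an engine that treats type and function definitions as rows it can look up at run time.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a catalog that keeps type and function definitions
// as rows, loosely mirroring the "type meta-table" idea. Not Illustra's API.
class TypeCatalog {
    // One row of the hypothetical type meta-table.
    static final class TypeRow {
        final String name;
        final String storageFormat;
        TypeRow(String name, String storageFormat) {
            this.name = name;
            this.storageFormat = storageFormat;
        }
    }

    private final Map<String, TypeRow> types = new LinkedHashMap<>();
    private final Map<String, Function<byte[], Object>> functions = new LinkedHashMap<>();

    // "Creating a type" is just inserting a row that describes it.
    void createType(String name, String storageFormat) {
        if (types.containsKey(name)) {
            throw new IllegalArgumentException("type already defined: " + name);
        }
        types.put(name, new TypeRow(name, storageFormat));
    }

    // Defining a function is similar: a row keyed by function name and argument type.
    void createFunction(String functionName, String typeName, Function<byte[], Object> body) {
        if (!types.containsKey(typeName)) {
            throw new IllegalArgumentException("unknown type: " + typeName);
        }
        functions.put(functionName + "(" + typeName + ")", body);
    }

    // The engine looks the function up in the catalog instead of hard-coding it.
    Object call(String functionName, String typeName, byte[] value) {
        Function<byte[], Object> body = functions.get(functionName + "(" + typeName + ")");
        if (body == null) {
            throw new IllegalArgumentException("no such function: " + functionName);
        }
        return body.apply(value);
    }

    boolean hasType(String name) {
        return types.containsKey(name);
    }
}
```

In this sketch, registering a `sound_clip` type and a `length_bytes` function for it takes two calls, and the "engine" needs no code changes to start handling the new type.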
Illustra was the content-management system for photojournalist Rick Smolan's recent experiment, 24 Hours in Cyberspace. Illustra designed and built the digital editing system and managed all content (text, images, audio) for the site. There were three phases to Smolan's experiment:
- Collection: e-mail submitted with self-describing data was digested, parsed, and automatically routed to six pods, which mapped to the six themes of the site. Application pages and the Illustra Rules system facilitated this process.
- Staging of media contributions: through the filesystem, where the referential integrity between the links was handled by a tool called NetObjects, and then mirrored to 18 sites throughout the world.
- Publishing: once the content was finalized, NetObjects staged the site locally, then Sun Microsystems Integration Services mirrored the content around the world.
Sybase's Web strategy
In Sybase's Web strategy, there are three interoperable and intercommunicating "zones": the client software, the middle-tier server software, and the back-end production databases with multiple formats and standards for data sources.
"We let you keep your logic and data in the middle or the back end so that the solution is client-side independent," said Sybase's Mironov. "This method will keep proprietary data secure. If you don't plan ahead by having the right database architecture, you're hosed in this fast-moving market."
A big part of Sybase's Web strategy centers on its acquisition of PowerSoft, the leading Windows applications tools company. Its PowerBuilder product is the top client/server tool for building Sybase applications, as well as Oracle applications (where it is the market share leader). PowerSoft has also announced a C++/Java development environment called Optima++ that builds upon the concept of "components": self-describing and self-helping gadgets.
PowerBuilder's largest user base is Oracle developers, and Sybase has the largest installed base of gateways to DB2, RDB, and other popular data-source formats at the present time.
Sybase also purchased Visual Components Inc., creators of the Formula One spreadsheet, a "rich text" (RTF) plug-in, and a Web-oriented spell checker.
The technology from Visual Components gives Sybase a competitive advantage in the use of innovative "data windows" and certain graphical tools that allow data to be formatted in charts, columns, and so on. The spreadsheet application works like Excel, and the rich-text description is Microsoft Word-like. Sybase plans to embed spreadsheet and word-processing client software applications in a network-savvy manner for its users.
"In a world where the network computer wins, we hope to have all the software that runs on it," said Mironov. "You could effectively run a browser application in background mode, get the data, and drop it into your application if that's what you want to do."
In the "middleware" space between the Web server and the database, Sybase plans four different initiatives:
- web.sql: the "lightest, cheapest way to get into the game," this tool is called from one line of HTML on the server side;
- PowerBuilder 5: a tool for building server-side, non-trivial objects for rich, complex applications that will read data, make decisions, do joins, and analyze data in a non-object-oriented manner;
- Optima++: a tool for doing Java-equivalent applications;
- ObjectConnect: a series of object bindings plus object/relational database functionality for Microsoft's OLE (and eventually C++ and Java).
Using Sybase's ObjectConnect, developers can make an object request that will automatically determine whether the result will come from an "object store," a "relational store," or a "directory store." If it's going to a relational store, ObjectConnect does the mapping, goes into the relational database, and retrieves the object, freeing the developer from directly having to do any of the mundane, customized coding.
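The routing idea described for ObjectConnect can be sketched in a few lines of Java. This is only an illustration of the general pattern and makes no claim about how ObjectConnect is built; `Customer`, `Store`, `StoreRouter`, and both in-memory stores are hypothetical names, and the "relational" store here is just a map of rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Optional;

// Hypothetical domain object used only for this illustration.
class Customer {
    final String id;
    final String name;
    Customer(String id, String name) { this.id = id; this.name = name; }
}

// Any back end that can try to satisfy an object request.
interface Store {
    String kind();
    Optional<Customer> fetch(String id);
}

// Holds ready-made objects.
class InMemoryObjectStore implements Store {
    private final Map<String, Customer> objects = new HashMap<>();
    void put(Customer c) { objects.put(c.id, c); }
    public String kind() { return "object store"; }
    public Optional<Customer> fetch(String id) {
        return Optional.ofNullable(objects.get(id));
    }
}

// Holds flat rows and maps a row to an object on the way out.
class InMemoryRelationalStore implements Store {
    private final Map<String, String[]> rows = new HashMap<>();
    void insertRow(String id, String name) { rows.put(id, new String[] {id, name}); }
    public String kind() { return "relational store"; }
    public Optional<Customer> fetch(String id) {
        String[] row = rows.get(id);
        if (row == null) {
            return Optional.<Customer>empty();
        }
        // The row-to-object mapping the developer would otherwise hand-code.
        return Optional.of(new Customer(row[0], row[1]));
    }
}

// The caller asks for an object; the router works out which store has it.
class StoreRouter {
    private final List<Store> stores = new ArrayList<>();
    StoreRouter add(Store s) { stores.add(s); return this; }
    Customer request(String id) {
        for (Store s : stores) {
            Optional<Customer> hit = s.fetch(id);
            if (hit.isPresent()) {
                return hit.get();
            }
        }
        throw new NoSuchElementException("no store holds object " + id);
    }
}
```

The calling code only ever says `router.request("c2")`; whether the answer came from an object store or was assembled from a relational row stays hidden behind the `Store` interface.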
On the deeper back end of Sybase's Web strategy, there is a set of projects called EnterpriseCONNECT, which offers a gateway to 35 kinds of foreign data, from DB2 on the mainframe to Oracle, Informix, Ingres, and a number of others. "We're deep into the 90th percentile on data sources," said Mironov. "We assume a heterogeneous world and that our customers are always discovering a new source of heterogeneity that they forgot to tell us about. We're proud to say that we have approximately 35 different back-end data sources. That's 30 more than everybody else."
The real key to Sybase's gamble is web.sql itself. When the web.sql interpreter links in with the HTTP server, web.sql preprocesses the page after it discovers the "sybase type = sql" tag. What immediately comes back is text. When the text executes, it passes the result back to the Web server (which never sees any SQL) and the Web browser (which also never sees any SQL), and the result is presented to the user as a static page that is formatted with all the resulting data from the original query.
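The preprocessing step described above can be approximated with a short Java sketch. It does not reproduce web.sql's real tag syntax or internals: the `<sql>...</sql>` marker, the `PagePreprocessor` class, and the `runQuery` callback are placeholders invented for this example. The point is only that embedded queries are replaced by their results before the page leaves the server, so the browser receives plain HTML.

```java
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical server-side page preprocessor. The <sql> tag is a placeholder,
// not the actual web.sql syntax.
class PagePreprocessor {
    private static final Pattern SQL_BLOCK =
            Pattern.compile("<sql>(.*?)</sql>", Pattern.DOTALL);

    // Replaces every embedded query with an HTML table built from its result rows.
    // runQuery stands in for whatever actually talks to the database.
    static String expand(String page, Function<String, List<List<String>>> runQuery) {
        Matcher m = SQL_BLOCK.matcher(page);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            List<List<String>> rows = runQuery.apply(m.group(1).trim());
            StringBuilder html = new StringBuilder("<table>");
            for (List<String> row : rows) {
                html.append("<tr>");
                for (String cell : row) {
                    html.append("<td>").append(escape(cell)).append("</td>");
                }
                html.append("</tr>");
            }
            html.append("</table>");
            // quoteReplacement keeps '$' and '\' in the data from being read as regex syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(html.toString()));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Minimal HTML escaping so query results cannot inject markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Because the query text is consumed on the server, neither the Web server's response nor the browser ever contains SQL, which matches the behavior the column describes.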
"If a Web developer writes a page in web.sql," said Mironov, "Sybase will guarantee that it'll work on the other side of the firewall."
web.sql assumes that you have a pre-existing infrastructure in the back end that is opened up with OpenClient and OpenServer. PowerBuilder 5 assumes you're going to design an application with logic and business rules and you're going to put that in its perfect place. Finally, Sybase's ObjectConnect tool assumes that you're going to do this same thing, but that you're going to be using an object interface that gives you one more level of indirection (in case some of your data resides in an object or relational store) and you're moving it in or out.
Versant takes the high road
Visionaries describe the world's telephone systems as the "ultimate accounting systems" of the future. Not only will they move voice, data, and video; they'll also be performing a function that they do best of all: sending customers a bill and collecting the money.
What if the telecommunications industry giants could provide billing and fulfillment services to corporations in a variety of industries? Speeding through the trunks of their high-bandwidth telephone lines someday will be the digital content of next-generation TCP/IP networks that take the Internet to another level.
Doing databases at this level of complexity and volume is no simple task. Enter a company called Versant Object Technologies.
Founded in 1988 by a small group of relational database architects who foresaw the limitations of existing RDB technology when put to the test of high-bandwidth, high-capacity ATM switches, Versant sought to solve this problem: database-intensive applications mapped poorly to the object-oriented programming languages of today.
Versant's founders set out to build a database that would store "objects" as opposed to a persistent storage method of like-kind elements.
"Our focus is on network management and high-demand database architectures," said Craig Russell, director of product management at Versant.
With the big telecommunications clients that Versant serves, its object-oriented database magic can be embedded directly into the switches, the routes, and the fiber lines that bind everything together. With big transportation companies, it's the concrete and the trucks. With big utilities, it's the power grid itself (the poles, the wires, the transformers and how they interoperate with each other).
A new standard has been created by the International Telecommunication Union (ITU) called "TMN," for Telecommunications Management Network. Intended to help standardize how large companies manage their infrastructures and constantly flowing digital assets, TMN promises to help companies deliver real database-intensive services to real customers in terms of business management.
The structure of how to manage a business's data, along with the unique types of functions and procedures for managing it, is best described by the word "objects."
"We have 'business objects' that use 'service objects' that use 'operations objects' that use 'element objects,'" said Versant's Russell. "It's a completely integrated stack of object-based abstractions of a given business model, its service model, and the actual operations of its network itself."
If you combine all those things you come up with a particularly elite, small number of large customers who are embedding Versant's object-oriented database technology in areas like billing systems, operations management, and even the embedded management of the fabric inside ATM switches themselves.
From top to bottom, Versant has invaded the telecommunications industry via its relationships with clients like AT&T, MCI, Sprint, Ericsson, Alcatel, and others.
"We tend to not compete head-on with SQL database server companies," Russell said. "After they've tried to deploy applications with relational tools, they often find it too difficult to cost/justify the continuing management of relational code to do what Versant does with object code."
Relational database tools fall apart when confronted with the three areas that Versant specializes in: complex data (multimedia and other data types), complex relationships among data sources, and distribution.
"Some of our clients have come to grips with how they've been 'force-fitting' this relational model into something that it wasn't really designed to do," Russell claimed. Versant handles complex relationships among objects that facilitate rapid access from one part of the database to another. It also provides for distribution of data over local- or wide-area networks.
"More and more, we're finding some of the more demanding applications (for example, managed-care health systems) where consumers or providers have a need for object-oriented database technology," said Russell. A health service provider has three complex functions: a billing organization, a health-care provider, and a big conglomerate that performs a medical service for the consumer.
"When you peel a really big onion, it doesn't fit nicely into rows and columns," said Russell. "You've got this huge mismatch when you try to present disparate information in a relational model."
Versant's technology treats the Web as merely another medium, and there will be two ways of accessing the data: Java and declarative definitions with HTML formatting.
"We believe that Java is the language of the Internet," said Russell. "What you really need for complex applications is intelligence in your 'network toaster.' That's where we see the role of Java; and it's ideally suited for multiple, dynamic data sources in the same screen by providing the intelligence there that says, 'Not only do I have multiple, disparate data sources, but I'm going to invoke a transaction that will combine all information from those together into a single display and help the user through the massaging of that information.'"
Russell envisions a future where consumers will "rent" a Java applet from Fidelity Investments that will perform an analysis of live stock portfolio values, and then make suggestions on actions to be taken.
Versant will be supporting not only Java, but also CGI and JDBC. (JDBC is a Java API for SQL database access, modeled on ODBC. It's a wrapper for SQL that extends standards work to define a single API usable for accessing a number of SQL data sources.)
From Versant's perspective, a Java applet out on the Web will have an execution environment that will be very limited by current standards. "It'll be a couple megabytes of RAM, a Pentium-sized processor: what we call a 'network toaster,'" said Russell. "In such a limited execution environment, you're going to need to get your data from somewhere else."
There are four "somewhere else" sources from which the data will come on the Web:
- HTML/HTTP route: a URL is sent back to the Web site where data is requested, and SQL scripts pull out static data to be displayed in HTML. (Versant has a declarative method of retrieving that information, very much like the kind of script that is used to get data from an Oracle server.)
- A Java application program on the server's back end that is tied in to and invoked from a Web server's script, and that will poke into a Versant database, "massage" the data, and format it as HTML output for display.
- A JDBC interface where your Java application will go through a package of Java code that was designed to translate the information into a JDBC call, which in turn would open a data source, retrieve the info, and send it back for display.
- "ORB" technology for heavy-duty Web applications. It's the combination of Java IDL and NEO, called Joe, that allows Web developers to define an interface to an object (that is built in Java on the front end) and execute behavior on that Java object using a standard ORB implementation language. These kinds of apps would involve data sources on mainframes, in relational databases, or in other object-oriented databases.
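For the JDBC route in the list above, a rough Java sketch might look like the following. It uses only the standard `java.sql` interfaces; the class name `JdbcHtmlReport` and its two methods are illustrative assumptions, and opening the `Connection` (driver, URL, credentials) is left to the caller because those details depend on the data source and are not given in this column.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: run a query through JDBC and format the rows as HTML.
class JdbcHtmlReport {
    // Runs the query on an already-open connection; the first returned row holds column labels.
    static List<List<String>> fetchRows(Connection conn, String sql) throws SQLException {
        List<List<String>> rows = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            ResultSetMetaData meta = rs.getMetaData();
            int columns = meta.getColumnCount();
            List<String> header = new ArrayList<>();
            for (int i = 1; i <= columns; i++) {   // JDBC columns are 1-based
                header.add(meta.getColumnLabel(i));
            }
            rows.add(header);
            while (rs.next()) {
                List<String> row = new ArrayList<>();
                for (int i = 1; i <= columns; i++) {
                    row.add(String.valueOf(rs.getString(i)));  // SQL NULL becomes "null"
                }
                rows.add(row);
            }
        }
        return rows;
    }

    // Formats rows as an HTML table; the first row is rendered as header cells.
    static String toHtml(List<List<String>> rows) {
        StringBuilder html = new StringBuilder("<table>");
        for (int r = 0; r < rows.size(); r++) {
            String cellTag = (r == 0) ? "th" : "td";
            html.append("<tr>");
            for (String cell : rows.get(r)) {
                html.append('<').append(cellTag).append('>')
                    .append(escape(cell))
                    .append("</").append(cellTag).append('>');
            }
            html.append("</tr>");
        }
        return html.append("</table>").toString();
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Keeping the query step and the formatting step separate means the same `toHtml` method could serve the server-side Java route as well, with only the data-fetching half swapped out.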
Next month...
Next month in The Cyberstruction Zone, we'll take a look at how to use Java in conjunction with Moving Worlds, the next-generation implementation of VRML by a consortium of companies including Netscape, SGI, Sony, and Worlds Inc.


