overview



a wise programmer once said, "the one constant in computing is change." there couldn't be a truer


statement. this article is about such change, specifically moving from html to the next generation, xhtml


(extensible hypertext markup language).



this article includes the following sections:



an introduction to xhtml


implementing xhtml today


changing html to xhtml


conclusion


additional xhtml resources and facts


the analysis is from a server-side perspective, meaning it applies equally well to asp, jsp, php or other


server-side driven projects.





an introduction to xhtml



xhtml (now in version 1.1) is the merging of html 4 and xml. it represents such an important advancement


that the world wide web consortium (w3c), the international standards body for the web, is replacing html


with xhtml as the standard tool for creating web pages.



xhtml is built to open doors to other formats. for example, xhtml can be used to format content for


pagers, whereas html cannot. xhtml is expected to replace wml (the markup language used with wap) and other markup languages. it is a cornerstone in the


revolutionary change in thinking beginning to occur in web site design. instead of viewing a web site as a


stand-alone data island, xhtml will expand web applications, allowing web sites to control and send


information which will drive countless devices, presentation styles and other web sites. xhtml is the


starting point for this tremendous change we are about to experience in how we use the web.



using xhtml has many advantages over using html. because of its strict structure, xhtml can be processed
faster. well-formed documents allow quicker and smaller parsers, which waste less time on the verification
and error-recovery logic that hodgepodge html documents require. while faster results are not
available yet, expect improved performance from the next generation of xhtml-based browsers.



the architecture of xhtml allows tags, attributes and document types to be uniquely defined by the users


of xhtml. html restrictions no longer apply. over time, this will allow for the development of industry


and project specific xhtml documents. to explore this idea more fully, see the w3c page.



a significant limitation of html today is the form field. the w3c established special task groups to


expand the functionality of xhtml and one of these is working to improve form field usage. the


xhtml/xforms specifications are still under development but when done will dramatically change the way we


use forms. a list of some of the great features xforms will add includes:



pre-built functions remove the need to use javascript as heavily as in the past. this will be a great boon
for supporting small devices where javascript may not be available.

elements are device independent, allowing flexibility to add voice or other input methods.

data is transmitted from the form in xml format.

data types are predefined.

forms will be separated into three distinct layers: presentation, logic and data. splitting forms into these
logical partitions will make it easy for forms to work on different kinds of browsers and devices while
maintaining a standard back end.


what other advancements does the future hold for forms? only the final specifications will tell the full


story on all the features. the draft specifications for xforms were released in april 2000. the final


specifications are expected by year end. xforms will likely be one of the driving forces to upgrade to


xhtml in the future. for more information on xforms see w3c and w3schools.



another advantage of xhtml is that it is an xml-based system. xml is a great technology and it is being


used in many exciting ways. while programmers would like to use xml in a variety of applications, it still


isn't practical to use for many projects. xhtml changes this because it makes xml easy to use with any


project. learning xhtml means expanding xml knowledge and skills. it means learning to think in xml. xhtml


enables sites to use xml conveniently in day-to-day web business. it is the stepping stone that will


finally give everyone easy access to the power and convenience of xml.



implementing xhtml today



how soon does xhtml need to be implemented? that depends on a number of factors, many of them


infrastructure related. the current generation of tools, such as editors and browsers, needs updating to


use xhtml efficiently and smoothly. then these updated tools need to make their way into common use.


furthermore, some of the standards, like xforms, are still under development, and once developed will


likely change (much like any new software) soon after the first full release. addressing these


infrastructure issues will likely take from one to four years.



nothing, however, is stopping conversion from beginning now, and, in fact, it's a good idea to start


learning the basics of xhtml, incorporating it into current projects and planning for it in new projects.


it's a good time to begin changing programming habits to enable a smooth transition in the future. this is


possible for a few reasons.



for the most part, xhtml content which doesn't match standard html will still usually work with html


parsers. this is because the parsers ignore most errors. when a parser encounters something that isn't


quite right in the page it usually won't cause a failure. this isn't always true, such as for scripting


(discussed below), but it is possible to at least make most end pages of projects completely xhtml


compliant.
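as a small illustration of that tolerance, here is a minimal sketch in python (chosen only as a neutral server-side language; the article itself targets asp, jsp and php). it feeds xhtml-style and old-style markup to python's lenient html.parser and lists the start tags it recognises. the class name, function name and sample strings are hypothetical, and a real browser's parser may behave differently from this one.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the start tags seen by Python's lenient HTML parser."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Self-closed tags such as <br /> also arrive here by default.
        self.tags.append(tag)


def start_tags(markup):
    """Return the start tags the HTML parser recognised, in document order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


# xhtml-style markup: lower case, quoted attribute, self-closed empty tag
XHTML_STYLE = '<p class="note">first<br />second</p>'
# old-style html: upper case, unquoted attribute, unclosed tags
HTML_STYLE = "<P CLASS=note>first<BR>second"

print(start_tags(XHTML_STYLE))
print(start_tags(HTML_STYLE))
```

both inputs yield the same tag list, which is the point: a forgiving html parser does not care whether the stricter xhtml conventions were followed.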



as another example, because xhtml is case sensitive, tags are written in lower case. while this may seem


like a relatively minor change, it is one, nonetheless, which can be implemented immediately, creating a


good programming habit. similarly, nesting rules are strict in xhtml and can be followed in html to


ingrain good programming habits. both of these topics are discussed more below.



implementing xhtml in html web applications now also helps ensure that the output will be xhtml compatible


later. designing with an eye to the future is important whether that future be 6 months or 10 years from


now. changing a web page is easy, but updating the components takes more thought and time.



how do you implement a migration plan? begin to write code which is xhtml compliant but don't require the


end pages to be completely compliant at this stage. you may find you need to make significant changes in


how your dynamic server pages are written. if you use code from your library, make sure xhtml rather than


html is being produced. when the html pages are done with the components integrated, run them through a


conversion tool to update them and check for ide-generated html that doesn't match the xhtml standard.


that's it! just remember, the goal isn't necessarily to be 100% xhtml compliant, but rather to begin


learning and applying xhtml where it makes sense for your projects and web sites. it can be applied in


stages, so take advantage of this flexibility where it benefits you.
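as a rough stand-in for the checking step, the sketch below uses python's standard xml parser to report the first place where generated markup stops being well formed. this is an assumption-laden illustration, not one of the conversion tools the article refers to: it tests xml well-formedness only, not validity against an xhtml dtd, and the function name and sample pages are hypothetical.

```python
import xml.etree.ElementTree as ET


def first_xml_error(markup):
    """Return None if markup is well-formed XML, else (line, column, message)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        line, column = err.position
        return (line, column, str(err))
    return None


# Hypothetical fragments of generated output, one per common mistake.
pages = {
    "good": "<div><p>hi</p></div>",
    "crossed tags": "<div><p>hi</div></p>",
    "unquoted attribute": "<table border=1><tr><td>hi</td></tr></table>",
    "unclosed tag": "<p>line one<br>line two</p>",
}

for name, markup in pages.items():
    problem = first_xml_error(markup)
    print(name, "->", "ok" if problem is None else problem)
```

running a check like this over the output of your server-side components catches the mistakes that a clean-up tool for hand-edited pages would never see.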



changing html to xhtml



here are some of the particulars you should consider in getting started with your conversion from html to


xhtml. this isn't a comprehensive list or discussion, but covers the major changes using the strict


document definition.



xhtml is based on xml standards. this means a document must follow "well formed" rules, that is, xml


syntax. the rules of most concern include:



xml is case sensitive. in xhtml this means every html tag must be written in lower case. for example, write
<table>, not <TABLE>. current html editing tools will fight you here. don't worry about the case that is
auto-generated. instead, when hand typing html tags, get used to using lower case. also, when generating
html dynamically, make sure to use lower case.



important! while you should get used to writing your tags in lower case, don't worry about the case of


html tags that are automatically generated by the current generation of html editors. tools are available


to clean up html pages to make them xhtml compliant. these tools, however, will not catch tags with


improper syntax generated by your code! get in the habit of using the right syntax within your scriptlets,


javabeans, com objects or wherever else you are generating your own html content.




non-empty tags must be properly nested. this means tags do not cross over each other. in the invalid
example below notice that the form and table tags are improperly nested, that is, they cross over one
another. then see how this is rectified in the correct example.

invalid example:

    <table>
    <form action="page.jsp">
    <tr><td>hi</td></tr>
    </table>
    </form>

correct example:

    <form action="page.jsp">
    <table>
    <tr><td>hi</td></tr>
    </table>
    </form>

attribute values must be quoted. so <td width=100> is not legal but <td width="100"> is legal.




all tags must be closed. for tags which don't normally have a closing element, end the tag within itself.
for example, <br> by itself is not legal. rather, use <br />. these tags may also end like <br></br>, but
the <br /> syntax seems to work better with current browsers.




no attribute may appear more than once in the same tag. this shouldn't be a problem.
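one way to get these rules right in server-generated markup is to let an xml library do the serialising instead of concatenating strings by hand. the sketch below is a minimal python example under that assumption; the helper name tag and its parameters are hypothetical and not part of any xhtml tool. element and attribute names are lower-cased by the helper, while quoting, escaping and closing of empty tags are handled by the standard library.

```python
import xml.etree.ElementTree as ET


def tag(name, attrs=None, text=None):
    """Serialize one element with a lower-case name and lower-case attribute names."""
    element = ET.Element(name.lower())
    for key, value in (attrs or {}).items():
        # set() overwrites an existing key, so no attribute can appear twice.
        element.set(key.lower(), str(value))
    element.text = text
    return ET.tostring(element, encoding="unicode")


print(tag("BR"))
print(tag("INPUT", {"NAME": "q", "TYPE": "text"}))
print(tag("p", text="1 < 2"))
```

the trade-off is that an xml serialiser also writes empty non-void elements in the short form, so elements that browsers expect to have an explicit closing tag should be given content or handled separately.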



in addition to the changes driven by xml, more changes in tags are driven by xhtml's own dtd (document


type definition). here are the highlights.


the first tag in an xhtml document must be <!DOCTYPE>. this tag informs the reader which definition to use
in describing the xhtml document. xhtml uses dtd modules to translate tags. in selecting among the three
dtds follow these rules.




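when writing pure xhtml use the strict dtd:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

when writing for the most html compatibility use the transitional dtd:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

when using frames use the frameset dtd:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

the transitional dtd will be used for most pages.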




the second tag in an xhtml document must be <html> and the xmlns attribute is mandatory:
<html xmlns="http://www.w3.org/1999/xhtml">.




the




form tags must have an action attribute. for example, <form action="page.jsp">.





style tags such as <font> and <center> have been removed! use style sheets for formatting.



data (which in an html page would be text) must be enclosed within a set of valid tags. a partial list of
valid tags to enclose free-standing data (text) includes "p", "h1", "div" and "pre".

the first example is wrong because the data (text) is not enclosed within a defined tag set:

    <body>
    hi, this is wrong.
    </body>

this is the correct way to include data using the div tag:

    <body>
    <div>hi, this is right.</div>
    </body>







every <img> tag must have an alt attribute.



every <script> tag must have a type attribute.



no stand-alone attributes (also known as minimized attributes) are allowed. for example,
<option selected> is no longer valid. instead, it will look like <option selected="selected">.

"inline" tags cannot contain "block-level" tags. for example, an anchor tag can't enclose a
block-level tag such as <div> or <table>.



scripting elements pose a problem for xhtml compatibility. the xml parser will parse the script as an xml
document unless you enclose your script in a cdata block. therefore, a javascript element would now look
like:

    <script type="text/javascript">
    <![CDATA[
    ... your script ...
    ]]>
    </script>
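to see why the cdata block matters to an xml parser, here is a minimal python sketch, again with hypothetical names (script_element, parses_as_xml) and a made-up line of javascript. it wraps generated script text in a cdata block and compares how python's standard xml parser treats the wrapped and unwrapped versions. it says nothing about how any particular browser handles the result.

```python
import xml.etree.ElementTree as ET


def script_element(js):
    """Wrap generated JavaScript in a script tag with a CDATA block."""
    if "]]>" in js:
        # The sequence ]]> would end the CDATA block early.
        raise ValueError("script text must not contain ']]>'")
    return (
        '<script type="text/javascript">\n<![CDATA[\n'
        + js
        + "\n]]>\n</script>"
    )


def parses_as_xml(markup):
    """Return True if the markup is well-formed XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Typical script text: the < and && characters are markup to an XML parser.
js = "if (a < b && c) { run(); }"
bare = '<script type="text/javascript">' + js + "</script>"

print(parses_as_xml(bare))
print(parses_as_xml(script_element(js)))
```

the unwrapped version is rejected because the comparison and logical operators look like broken markup, while the wrapped version parses and hands the script text back unchanged.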





this causes a hassle for all the current browsers as they will not like the cdata block. for now, the only
solution is to call the javascript from an external file. for example:

    <script type="text/javascript" src="myscript.js"></script>


for the server-side programmer this is a problem when you modify the javascript dynamically. using a

separate file source for your javascript prevents you from being able to dynamically change your

javascript. this is because the javascript is being included on the client side so the server side won't

be able to touch it now. when modifying javascript using asp, jsp or php scripting, use the standard html

method of script declaration. this is the one place where making jsp or asp 100% compatible with xhtml

will be most problematic. remember, however, the goal is not to be 100% compatible with xhtml, but to

begin incorporating xhtml where feasible, allowing a quick and easy transition when the time comes. when

that time arrives, new compatible browsers should be available and you'll be set to make the jump to 100%

compatibility.
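several of the dtd-driven highlights above (the doctype, the xmlns attribute on the html tag, the form action, the img alt attribute and expanded attributes) can be pulled together in a small page generator. the python sketch below is only an illustration under stated assumptions: the function name page, the file names search.jsp and logo.gif, and the sample form are all hypothetical, and the final parse confirms well-formedness only, not validity against the chosen dtd.

```python
import xml.etree.ElementTree as ET

# The three XHTML 1.0 document type declarations.
DOCTYPES = {
    "strict": '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
              '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">',
    "transitional": '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
                    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">',
    "frameset": '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" '
                '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">',
}
XHTML_NS = "http://www.w3.org/1999/xhtml"


def page(title, body, dtd="transitional"):
    """Build a page skeleton; title and body are assumed to be valid markup already."""
    return (
        DOCTYPES[dtd] + "\n"
        + '<html xmlns="' + XHTML_NS + '">\n'
        + "<head><title>" + title + "</title></head>\n"
        + "<body>\n" + body + "\n</body>\n"
        + "</html>"
    )


# Hypothetical form: action present, img has alt, checked is written in full.
body = (
    '<form action="search.jsp">\n'
    "<div>\n"
    '<img src="logo.gif" alt="site logo" />\n'
    '<input type="checkbox" name="exact" checked="checked" />\n'
    "</div>\n"
    "</form>"
)

document = page("search", body)
root = ET.fromstring(document)  # raises ParseError if the page is not well formed
print(root.tag)
```

keeping the declarations in one table makes the later switch from the transitional dtd to the strict dtd a one-word change in the calling code.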

conclusion


in this article we've explored some advantages of xhtml and how to start using it right now with very

little hassle. xhtml is far more than a replacement for html. thinking of it as html 5.0 unnecessarily

limits its power and the possibilities it will introduce. xhtml is meant to be expanded by the user

community. it creates xml documents which contain, define and manipulate data, going far beyond the

capabilities of html-based documents. it makes xml easy to use. to fully realize the potential xhtml

presents will require a new way of thinking about future applications. it creates fresh possibilities.

xhtml really is a new thing (not merely an upgrade) and the challenge ahead of us is to experiment and

discover where it can take us.