MAMA: Markup report, part 2: Primary functional and structural markup
Introduction
This time we will look at some of the basic document structural elements. These
are the elements that form the backbone of most documents. Some of the topics
mentioned this week carry so much detail (such as the child elements of the HEAD
element) that we can only touch on them briefly here. For a deeper look at these
areas and more, the following MAMA article topics are also available this week:
Some of the topics involved with the HEAD element, such as CSS (the LINK and
STYLE elements) and script (the SCRIPT element), will receive much more
attention in other articles coming soon.
To read more details of MAMA's findings, check out the MAMA home page.
Frames
The document layout concept for Web pages known as "frames" was first implemented
in Netscape 2.0 in 1995. It allows the browser window to be sub-divided into any
number of rows or columns of smaller windowed documents. The concept has many
design and usability problems, yet it is popular enough (and easy enough) that
its usage has blossomed over the years. Many authors and designers have a special
place of fury in their hearts for frames, the same place where disdain for other
reviled constructs like the BLINK and MARQUEE elements lives. Even so, the current
version of frames enjoys wide deployment "in the wild" and defiantly maintains a
degree of authoring inertia, despite its many drawbacks and the general disfavor.
Many authors probably do not care enough about the arguments against frames to
look for alternatives, or else they simply have not come up with an alternative
design. Despite being dropped in XHTML 1.1, frames are not going to go away any
time soon.
Usage of Frame-related elements
The almost identical usage numbers for the FRAMESET and FRAME elements are an
obvious result: neither element does anything useful without the other. The FRAME
and IFRAME elements, on the other hand, are not used together very often. Only
19,472 of the IFRAME cases (8.8%) use the two elements together. Although the
IFRAME numbers detected in markup are lower than either the FRAMESET or FRAME
totals, actual IFRAME use is likely much higher, because many Web page ad systems
create IFrames dynamically through script.
ELEMENT | Frequency |
---|---|
FRAMESET | 378,033 |
FRAME | 378,107 |
IFRAME | 222,462 |
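The percentages quoted above can be rechecked with a few lines of Python. This is only a sketch of the arithmetic, using the figures from the table and the preceding paragraph; the variable and function names are our own.

```python
# Figures copied from the table and the paragraph above.
frameset_urls = 378_033
frame_urls = 378_107
iframe_urls = 222_462
frame_and_iframe_urls = 19_472  # IFRAME cases that also use FRAME


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of IFRAME cases that also use FRAME.
iframe_overlap_pct = share(frame_and_iframe_urls, iframe_urls)

# How far apart the FRAME and FRAMESET totals are.
frame_gap = frame_urls - frameset_urls

print(f"IFRAME cases that also use FRAME: {iframe_overlap_pct}%")
print(f"FRAME total minus FRAMESET total: {frame_gap}")
```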
An interesting frame-related attribute: Target
Use of the Target attribute is far, far greater than the general usage of frames
would indicate. It was detected in 2,077,198 of MAMA's URLs, with the A element
leading the way at 1,978,018. That is more than 3 times the combined use of FRAME
and IFRAME. Why is this? Authors are most likely concerned about which frame
context the hyperlinks in their documents will open in, so they take steps to
control it with the Target attribute.
ELEMENT | frequency |
---|---|
A | 1,978,018 |
FORM | 199,085 |
BASE | 159,479 |
AREA | 146,703 |
LINK | 1,585 |
Popular Target attribute values
The Target attribute can accept a wide variety of values, but it also has several
special reserved keywords, all beginning with the underscore character ("_"):
"_blank", "_top", "_self" and "_parent". The value "_new" follows the same pattern
and is widely used, although HTML 4.01 does not define it as a reserved keyword.
Naturally, these values are the most popular. Values resembling these keywords
(such as "blank" or "new") are also very common, as are those which stress the
parent-child relationship of frame documents to their content documents
(including "main" and "contents", and even German equivalents of the same:
"hauptframe" and "inhalt").
Target attribute value | frequency | Target attribute value | frequency | |
---|---|---|---|---|
_blank | 1,548,594 | blank | 43,287 | |
_top | 550,637 | mainframe | 31,691 | |
_self | 306,182 | google_window | 20,905 | |
_parent | 121,225 | contents | 18,076 | |
_new | 84,293 | hauptframe | 15,829 | |
main | 82,075 | inhalt | 12,828 | |
new | 52,756 | content | 10,316 |
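For readers who want to try a similar tally on their own pages, here is a minimal sketch using Python's standard `html.parser` module. It is not MAMA's implementation. The class name `TargetCounter`, the sample markup, and the choice to lowercase values before counting are all assumptions made for illustration. It also counts every occurrence, whereas MAMA's figures are per URL. Only the four keywords defined in HTML 4.01 are treated as reserved, so a value such as "_new" would be grouped with the author-chosen names.

```python
from collections import Counter
from html.parser import HTMLParser

# Keywords that HTML 4.01 reserves for the target attribute.
RESERVED_TARGETS = {"_blank", "_self", "_parent", "_top"}


class TargetCounter(HTMLParser):
    """Collect target attribute values and the elements that carry them."""

    def __init__(self):
        super().__init__()
        self.by_element = Counter()
        self.by_value = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names already lowercased.
        for name, value in attrs:
            if name == "target" and value:
                self.by_element[tag] += 1
                # Assumption: fold case so "_BLANK" and "_blank" count together.
                self.by_value[value.strip().lower()] += 1


# Hypothetical sample markup, not taken from MAMA's URL set.
sample = """
<base target="main">
<a href="a.html" target="_blank">A</a>
<a href="b.html" target="_BLANK">B</a>
<a href="c.html" target="blank">C</a>
<form action="/search" target="_top"></form>
"""

counter = TargetCounter()
counter.feed(sample)
counter.close()

# Split the tally into reserved keywords and author-chosen frame names.
reserved = {v: n for v, n in counter.by_value.items() if v in RESERVED_TARGETS}
custom = {v: n for v, n in counter.by_value.items() if v not in RESERVED_TARGETS}

print("By element:", dict(counter.by_element))
print("Reserved keywords:", reserved)
print("Other values:", custom)
```

Keeping separate counters for elements and values mirrors the two tables above: one answers "which elements carry Target", the other "which values do authors choose".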
The HEAD element and its children
HEAD is the most popular element in MAMA's URL set, found in 98.7% of MAMA's
URLs. Its top 5 sub-elements are also in the top 20 of all markup elements used.
This overview will not spend too much time on this topic, because many of these
child elements belong to very important Web page topics, such as CSS and
scripting, that receive their own articles.
ELEMENT | frequency | ELEMENT | frequency | |
---|---|---|---|---|
HEAD | 3,464,519 | LINK | 2,018,510 | |
TITLE | 3,459,207 | STYLE | 1,313,454 | |
META | 3,276,347 | BASE | 266,149 | |
SCRIPT | 2,528,823 | ISINDEX | 63 |
The META Name and Http-equiv attributes
The META element is a popular way to assign and designate extra information
about the document. It accomplishes important authoring tasks that
are not possible in any other way, so its use is very high. This usage
is rather evenly divided between two functional attributes: Http-equiv and Name.
Http-equiv attribute value | frequency | Name attribute value | frequency | |
---|---|---|---|---|
content-type | 2,679,505 | keywords | 2,170,259 | |
content-language | 456,078 | description | 2,098,529 | |
pragma | 167,801 | generator | 942,051 | |
refresh | 163,413 | robots | 931,622 | |
expires | 163,350 | author | 815,415 |
Common attributes
A number of attributes are nearly universal in scope and usage in HTML; they can be applied to most, if not all, elements. The following sections examine some of these in more detail.
Attribute | Frequency |
---|---|
Name | 3,220,308 |
Class | 2,139,184 |
Style | 1,878,916 |
Id | 1,782,769 |
Event handlers ("on*" attributes) | 1,692,823 |
Title | 1,010,147 |
The Name and Id attributes
These are two similar attributes that both assign unique identifiers to individual
elements. Of the two, Name is encountered more often; it is actually the most
popular of all the common attributes (used in some form on 91.8% of MAMA's URLs).
The Id attribute is the newer method for uniquely labeling an element, while the
Name attribute has considerable historical traction with authors under a variety
of different uses.
ELEMENTs using Name | frequency | % Total element usage | ELEMENTs using Id | frequency | % Total element usage | |
---|---|---|---|---|---|---|
META | 2,710,638 | 82.7% | DIV | 1,085,482 | 43.4% | |
INPUT | 990,058 | 98.2% | TABLE | 482,760 | 16.7% | |
IMG | 875,460 | 27.2% | IMG | 471,807 | 14.7% | |
PARAM | 576,508 | 99.97% | INPUT | 372,905 | 37.0% | |
FORM | 570,643 | 54.8% | A | 319,619 | 9.7% | |
A | 485,168 | 14.7% | FORM | 266,886 | 25.6% | |
MAP | 456,648 | 99.7% | TD | 230,312 | 8.0% | |
FRAME | 349,820 | 92.5% | UL | 192,453 | 23.8% | |
SELECT | 275,323 | 96.5% | SPAN | 180,553 | 11.8% | |
EMBED | 138,809 | 25.4% | OBJECT | 165,628 | 31.1% |
Name and Id attribute values
There are extreme differences between the most popular values these two attributes
carry. The top values of the Name attribute reflect its history of specific usage
in the popular META, IMG, A, PARAM, and form elements. On the other hand, top
values for the Id attribute show a templating or classification behavior akin to
the use of the Class attribute. The most frequent Id values show sequential unique
labels for certain categories; for instance, the images in a typical document
might all sport successive Id attributes (e.g. "image1", "image2", "image3"...).
The full attribute value lists for Name and Id demonstrate these behaviors more
clearly than the shorter top 10 lists here are able to do.
Name attribute value | frequency | Id attribute value | frequency | |
---|---|---|---|---|
keywords | 2,189,708 | footer | 288,061 | |
description | 2,100,858 | content | 228,661 | |
generator | 943,496 | header | 223,726 | |
robots | 937,844 | logo | 121,351 | |
author | 818,017 | container | 119,877 | |
movie | 530,989 | main | 106,327 | |
quality | 504,666 | table1 | 101,677 | |
revisit-after | 475,765 | menu | 96,161 | |
copyright | 423,210 | layer1 | 93,920 | |
progid | 281,339 | autonumber1 | 77,350 |
The Class attribute
This attribute offers a degree of categorization and classification not possible
with the inherent element semantics of a markup language. The Class attribute
allows multiple elements to share the same grouping, and a single element
instance can belong to multiple categories. The attribute sees its greatest
expression with CSS (which we will cover more later), but the category names
themselves that authors assign are interesting to examine.
ELEMENT | frequency | % Total element usage | ELEMENT | frequency | % Total element usage | |
---|---|---|---|---|---|---|
A | 1,111,526 | 33.6% | TABLE | 580,281 | 20.1% | |
TD | 1,082,979 | 37.5% | INPUT | 438,516 | 43.5% | |
SPAN | 1,046,840 | 68.5% | IMG | 320,281 | 10.0% | |
DIV | 1,031,384 | 41.3% | LI | 228,422 | 27.1% | |
P | 736,885 | 27.3% | UL | 197,729 | 24.4% |
Class attribute values
The most popular Class value, "footer", is twice as popular as its natural
companion "header". One noticeable trend from the full Class value list is the
high number of class names of the form /style\d+/. The popularity of each class
value decreases as the integer value at the end increases. MAMA detected values
like this going at least up to "style117" and probably higher. A high (but
untested) correlation was noticed between class names of this type and the use
of Macromedia Dreamweaver scripting library functions. As Macromedia Dreamweaver
is not always the easiest editor to detect, this correlation will remain a theory.
Value | frequency | Value | frequency | |
---|---|---|---|---|
footer | 179,528 | content | 113,951 | |
menu | 146,673 | title | 91,957 | |
style1 | 138,308 | style2 | 89,851 | |
msonormal | 123,374 | header | 89,274 | |
text | 122,911 | copyright | 86,979 |
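The /style\d+/ pattern mentioned above is easy to experiment with. The following is a rough sketch in Python, not MAMA's own code. The function name `numbered_style_classes` and the sample values are hypothetical. Two choices here are ours rather than the article's: the pattern is required to match a whole class name, and letter case is ignored.

```python
import re
from collections import Counter

# Same pattern as /style\d+/ in the text. Assumptions: it must match the
# whole class name, and letter case is ignored.
STYLE_N = re.compile(r"style(\d+)", re.IGNORECASE)


def numbered_style_classes(class_attr: str) -> list[int]:
    """Return the integer suffixes of styleN names in one class attribute value."""
    suffixes = []
    # A class attribute holds a whitespace-separated list of names.
    for name in class_attr.split():
        match = STYLE_N.fullmatch(name)
        if match:
            suffixes.append(int(match.group(1)))
    return suffixes


# Hypothetical class attribute values, not MAMA data.
samples = ["style1", "footer style2", "MsoNormal", "style1 style117", "mystyle3"]

# Tally how often each numeric suffix appears across the samples.
tally = Counter()
for value in samples:
    tally.update(numbered_style_classes(value))

print(sorted(tally.items()))
```

Using `fullmatch` keeps names such as "mystyle3" out of the tally. A plain `search` would count them, which may or may not be what a given survey wants.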
Event-handler attributes
As mentioned previously, we will discuss scripting in greater detail soon. For now, we will take a look at the event-handler attributes, which are HTML markup's gateway to scripting. Event handlers were detected in roughly two-thirds of the 2,617,305 MAMA URLs using script. MAMA found 52 unique event-handler attribute names occurring more than 4 times. With each event-handler attribute, there was generally a single element with which it showed the greatest affinity.
Event handler | Element with highest usage | Frequency when used with element | Total overall attribute frequency |
---|---|---|---|
Onmouseover | A | 829,262 | 1,051,631 |
Onmouseout | A | 781,567 | 998,854 |
Onload | BODY | 741,946 | 772,567 |
Onclick | A | 492,092 | 684,117 |
Onchange | SELECT | 158,761 | 163,476 |
Onsubmit | FORM | 151,699 | 152,286 |
Onfocus | INPUT | 146,043 | 197,235 |
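One way to put a number on that "affinity" is to divide each handler's frequency on its top element by its overall frequency. The sketch below does this with the figures from the table above. The `affinity` helper and the one-decimal rounding are our own choices, not part of MAMA.

```python
# (handler, element with highest usage, frequency on that element, total frequency)
# Figures copied from the table above.
ROWS = [
    ("onmouseover", "A", 829_262, 1_051_631),
    ("onmouseout", "A", 781_567, 998_854),
    ("onload", "BODY", 741_946, 772_567),
    ("onclick", "A", 492_092, 684_117),
    ("onchange", "SELECT", 158_761, 163_476),
    ("onsubmit", "FORM", 151_699, 152_286),
    ("onfocus", "INPUT", 146_043, 197_235),
]


def affinity(on_element: int, total: int) -> float:
    """Share of a handler's total usage that falls on its top element, in percent."""
    return round(100 * on_element / total, 1)


shares = {handler: affinity(on_el, total) for handler, _element, on_el, total in ROWS}

# List handlers from most to least concentrated on a single element.
for handler, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{handler:<12} {pct:5.1f}%")
```

Handlers tied to one kind of element by design, such as Onsubmit with FORM, come out near the top of this list, while general-purpose handlers such as Onclick are spread more widely.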
Conclusion
Now that we are starting to see the general shape that markup documents take,
we should pause to consider what to look at next. The full writeups for this
week offer our first real glimpses of what makes most documents tick (especially
the thorough treatment of a document's HEAD structure and the use of common
attributes). Looking ahead to next week, our natural progression leads us to the
elements that most authors use in the BODY section of their documents: images
(IMG) and hyperlinks (A). Next week's overview will also start dipping into the
bulk of the basic semantic phrase and block markup. See you soon!
This article is licensed under a Creative Commons Attribution, Non Commercial - Share Alike 2.5 license.