MAMA: Forms

By Brian Wilson

Index:

  1. Introduction
  2. FORM element
  3. INPUT element
  4. SELECT element
  5. OPTGROUP element
  6. OPTION element
  7. LABEL element
  8. TEXTAREA element
  9. LEGEND element
  10. BUTTON element

Introduction

Aside from hyperlinks, forms are the main way users interact with the Web. Among their varied critical uses, forms allow people to:

  • Find what they want via search engines
  • Publish their thoughts online via blogs
  • Enter their personal details and make purchases via e-commerce sites

The standards bodies keep trying to create successors to the current popular incarnation of forms in order to make things easier for creators and to provide a richer experience for users (check out the XForms spec and the Web Forms 2.0 spec). By MAMA's representation statistics, authors do not yet seem to be embracing the newer forms features in significant numbers. The XForms namespace was found in only 16 of MAMA's URLs, and syntax from the most popular new features in Web Forms 2.0 was detected in just over 100 cases. (Bear in mind that, at the time of writing, Web Forms 2.0 was a nascent technology with fairly limited browser support.) Forms in general are very popular: they were found on up to one-third of all pages analyzed.

The popularity of the main types of form elements varies widely, and sometimes surprisingly. For example, almost every FORM has an INPUT element, but relatively few make use of TEXTAREA. Such variations may be due to a number of factors, including inherent biases in MAMA's current URL set (a majority of MAMA's URLs are Surface/Home pages). The intended use of a Web page often dictates the types of elements used, including form elements.

Form element frequencies

Fig 1-1: Frequency of forms-related elements
ELEMENT   Frequency   ELEMENT    Frequency
FORM      1,040,771   TEXTAREA   36,410
INPUT     1,008,545   FIELDSET   31,673
SELECT    285,362     LEGEND     18,269
OPTION    281,923     BUTTON     11,455
LABEL     159,631     OPTGROUP   5,348

The FORM element

We will start our look at form elements with their main container element: FORM. Notice that the Action attribute, which specifies what to do with the information the form is collecting, is used on most pages. This attribute is required, so its dominance is understandable. The Method attribute (found in 89.39% of all FORM usage) is only slightly less popular than the required Action attribute. The Name attribute is just over twice as popular as the Id attribute for this element.

Fig 2-1: FORM element/attribute usage
ELEMENT/Attribute   Frequency   ELEMENT/Attribute   Frequency
FORM                1,040,771      Target           199,085
   Action           977,934        Enctype          31,845
   Method           930,343        Accept-charset   8,775
   Name             570,643        Align            1,569
   Id               266,886        Accept           0

The Method attribute

Approximately 70% of pages that specify an explicit HTTP Method use the "post" method, while ~46% use the "get" method (the totals exceed 100% because some pages had multiple forms with a mixture of methods). This would indicate a clear authoring preference for the "post" method, but there are a few factors to consider. Up to 15% of the pages specifying the Method attribute contain multiple forms that mix both "post" and "get" methods. There are also 110,428 URLs that used the FORM element with no Method attribute; "get" is the implied default value in such cases. Counting those brings the relative preferences for Method among all FORM usages much closer: 62.19% for "post" and 51.56% for an explicit or implied "get" value. The full frequency table for this attribute shows other values, including a number of typos, but they appear inconsequential next to the two main, accepted values.

Fig 2-2: FORM Method explicit values
Attribute value   Frequency
post              647,234
get               426,192
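
The percentages quoted for the Method attribute can be reproduced from the counts in Fig 2-1 and Fig 2-2. The following is a minimal sketch of that arithmetic; the variable names and the pct helper are illustrative, and the overlap estimate assumes that nearly every explicit Method value is either "post" or "get".

```python
# Counts taken from Fig 2-1 and Fig 2-2 (numbers of URLs).
FORM_URLS = 1_040_771    # URLs using the FORM element
METHOD_URLS = 930_343    # URLs with an explicit Method attribute
POST_URLS = 647_234      # URLs with Method="post"
GET_URLS = 426_192       # URLs with Method="get"


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# URLs with a FORM but no Method attribute; "get" is implied there.
no_method = FORM_URLS - METHOD_URLS

# Shares among URLs that state a Method explicitly.
post_explicit = pct(POST_URLS, METHOD_URLS)
get_explicit = pct(GET_URLS, METHOD_URLS)

# URLs counted under both "post" and "get" (assumes other values are negligible).
mixed = POST_URLS + GET_URLS - METHOD_URLS
mixed_share = pct(mixed, METHOD_URLS)

# Shares among all FORM URLs, counting the implied "get" default.
post_all = pct(POST_URLS, FORM_URLS)
get_all = pct(GET_URLS + no_method, FORM_URLS)

print(no_method, post_explicit, get_explicit, mixed_share, post_all, get_all)
```

The two "all FORM" shares come out at 62.19 and 51.56, and the difference between FORM and Method usage is the 110,428 URLs mentioned in the text.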

The Accept-charset attribute

MAMA kept track of the values for this attribute, although it was not known in advance if it was used in significant quantities (it was detected only 8,775 times). The most popular value is clearly "utf-8", with "iso-8859-1" also being very common. Other than those, Japanese encodings held sway with 3 of the next 4 most popular values.

Fig 2-3: FORM Accept-charset values
[Also see the full frequency table.]
Attribute value   Frequency
utf-8             5,683
iso-8859-1        2,185
x-euc-jp          286
iso-8859-2        147
euc-jp            138
shift_jis         86

The INPUT element

This popular element is used in 96.90% of all documents using forms. With the element's functionality being as overloaded as it is, this popularity is both understandable and expected. Many of the attributes listed below in Fig 3-1 are only used with specific Type attribute values, so we will look at the Type attribute before saying anything more about the other attributes.

Fig 3-1: INPUT element/attribute usage
ELEMENT/Attribute   Frequency   ELEMENT/Attribute   Frequency   ELEMENT/Attribute   Frequency
INPUT               1,008,545      Border           172,843        Vspace           8,358
   Type             1,005,152      Checked          135,049        Autocomplete     5,053
   Name             990,058        Width            120,420        Readonly         3,936
   Value            947,403        Height           119,902        Language         3,314
   Size             656,354        Align            70,163         Valign           3,184
   Src              335,990        Accesskey        35,501         Disabled         2,688
   Maxlength        329,415        Tabindex         34,725         Dir              1,892
   Alt              213,924        Hspace           10,193         Required         929

The Type attribute

Fig 3-2: Top INPUT Type attribute usage
[Please also see the complete frequency table.]
Attribute value   Frequency   Attribute value   Frequency
text              806,926     checkbox          81,260
hidden            733,126     button            71,031
submit            568,445     reset             17,417
image             337,286     search            1,102
password          167,098     textbox           864
radio             159,626     input             796
empty             110,971     file              791

Now, our discussion of the INPUT element gets more interesting:

  • The "empty" value indicates that an INPUT element did not have a Type attribute at all. In such situations, a widget is interpreted as Type="Text". In all, 79,050 URLs used INPUT elements where none of them specified a Type attribute.
  • The most popular attribute value is Type="Text", but "Hidden" is also very popular.
  • The next most popular Type values are "Submit" and then "Image". Because "Image" is a type of submittal, and each of the two mentioned will often be used to the exclusion of the other, looking at their combined totals shows that submittal is the most popular function of forms (more popular than "Text"). This is actually an expected result.
  • Type="Image" has a much higher representation than expected, with up to one-third of INPUT instances using graphical submit buttons instead of the default "Submit" widget.
  • Type="Image" related attributes: Width and Hspace (horizontal dimensions) have just a slight edge over Height and Vspace (vertical dimensions), the same as in the case of the IMG element.
  • Type="Image": The Src attribute is used 335,990 times, compared with 337,286 times for Type="Image", a difference of 1,296 URLs that use the type without any Src. This does not make a lot of sense and might warrant further investigation.
  • In the early days of forms, most "Submit" buttons were paired with a "Reset" button, but today, that seems to be passé. By comparison, "Reset" is rarely encountered now.
  • The Size and Maxlength attributes (used with Type="Text") are both quite popular overall, but Size is about twice as popular as Maxlength.
  • The exclusive choice widget, Type="Radio", is twice as popular as the multi-choice Type="Checkbox" widget.
  • The invalid sequence Type="Input" occurs more often than Type="File". On the surface, this seems unusual, but this outcome may be quite reasonable. The Type="File" sequence is often used with more complex Web application pages that are not presented as often on the main Surface/Home pages that compose the majority of MAMA's URL set.
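
The "empty" bucket and the invalid values in the list above suggest how such a tally might be gathered. The following is a minimal sketch, not MAMA's actual implementation: it assumes that a frequency counts each document once per Type value, that values are compared case-insensitively, and that a missing or blank Type is recorded as "empty". The InputTypeTally class, the tally_types helper and the sample pages are all hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class InputTypeTally(HTMLParser):
    """Collect the set of INPUT Type values seen in one document."""

    def __init__(self):
        super().__init__()
        self.types = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "input":
            return
        value = dict(attrs).get("type")
        # A missing or blank Type goes into the "empty" bucket
        # (an assumption about how such cases might be labeled).
        self.types.add(value.strip().lower() if value else "empty")


def tally_types(documents):
    """Count, for each Type value, how many documents use it at least once."""
    counts = Counter()
    for html in documents:
        parser = InputTypeTally()
        parser.feed(html)
        parser.close()
        counts.update(parser.types)
    return counts


# Hypothetical sample pages, one string per document.
sample_pages = [
    '<form><input name="q"><input type="Submit" value="Go"></form>',
    '<form><input type="text" name="q"><input type="image" src="go.gif"></form>',
    '<form><input type="TEXT"><input type="text"><input type="input"></form>',
]
type_counts = tally_types(sample_pages)
print(type_counts.most_common())
```

Because each document contributes a set of values, a page with several Type="Text" boxes is counted once for "text". Invalid values such as "input" are kept as written and not folded into a valid type.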

The Size attribute

The popular values for the Size attribute (used with Type="Text" boxes) definitely show a pattern. Authors really like multiples of 5, although the most popular explicit value - "20" - is also the default size used in most browsers.

Fig 3-3: Input Size values
[Also see the full frequency table.]
Attribute value   Frequency
20                137,644
15                114,750
10                109,592
12                54,690
25                44,623
30                34,639

The SELECT element

Aside from the overloaded INPUT element, the SELECT element is the next most popular of the form widgets. The use of the Multiple attribute was much lower than I expected; it only just beats out Disabled for last place with usage in 0.64% of all SELECT lists. The Name attribute is used with most SELECT elements (96.48%), and Name dominates over the Id attribute again by a 4-to-1 ratio.

Fig 4-1: SELECT element/attribute usage
ELEMENT/Attribute   Frequency   ELEMENT/Attribute   Frequency
SELECT              285,362        Tabindex         5,282
   Name             275,323        Multiple         1,826
   Size             70,201         Disabled         1,515
   Id               68,087

The Size attribute

The Size attribute is only used in ~25% of all SELECT lists, but as you can see from the frequency data for the attribute value, the size that is specified the most is 1 (93.26% of the time). Since this is the typical default size in most browsers, the value is probably inserted automatically by many Web page editors. Also of note is that the legal Size="2" value does not even make the top 10 values, whereas some questionable values like "0" and "-1" rank higher.

Fig 4-2: SELECT Size frequency
[See also the full frequency table.]
Attribute value   Frequency   Attribute value   Frequency
1                 65,472      0                 576
5                 1,164       10                510
3                 869         8                 417
4                 861         7                 323
6                 725         -1                246

The OPTGROUP element

OPTGROUP was introduced in HTML 4.0, and it still has not gained a lot of traction—it was found to be the least popular of any of the form-related elements. When it is used, it almost always uses the Label attribute. The other attribute specifically defined for this element, Disabled, was only detected a paltry 4 times.

Fig 5-1: OPTGROUP element/attribute usage
ELEMENT/Attribute   Frequency
OPTGROUP            5,348
   Label            5,327
   Disabled         4

The OPTION element

There are over 3,000 URLs in MAMA that use the SELECT element but not the OPTION element. It is possible that these cases are creating the OPTION elements for these lists dynamically using SCRIPT, but that would need some further scrutiny. Almost 97% of all URLs having OPTION elements also use a Value attribute with at least one of them. The Label and Disabled attributes are (comparatively) rarely used, proving to be only slightly more popular than the Name attribute.

Fig 6-1: OPTION element/attribute usage
ELEMENT/Attribute   Frequency   ELEMENT/Attribute   Frequency
OPTION              281,923        Label            1,735
   Value            273,138        Disabled         1,325
   Selected         163,967        Name             1,017
   Id               4,615

The LABEL element

It is not necessary for a LABEL element to have a For attribute in order to directly associate it with another element, but most authors do so (88% of the time). The representation of the Accesskey attribute seems quite low, but that is a little deceiving; only the A and INPUT elements had higher numbers for the attribute.

Fig 7-1: LABEL element/attribute usage
ELEMENT/Attribute   Frequency
LABEL               159,631
   For              140,576
   Accesskey        5,330
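
A LABEL can be tied to a control in two ways: explicitly through the For attribute, or implicitly by wrapping the control inside the LABEL. The sketch below shows one way the two cases might be told apart when scanning markup. It is an illustration under simplifying assumptions (well-formed nesting, and a For value that is taken at face value without checking that a matching Id exists); the LabelAssociations class, the classify_labels helper and the sample markup are hypothetical.

```python
from html.parser import HTMLParser

# Elements treated as labelable form controls in this sketch.
CONTROLS = {"input", "select", "textarea", "button"}


class LabelAssociations(HTMLParser):
    """Classify each LABEL as explicit (For), implicit (nested control), or unassociated."""

    def __init__(self):
        super().__init__()
        self.results = []   # one entry per LABEL, in document order
        self._open = []     # indexes of LABEL elements currently open

    def handle_starttag(self, tag, attrs):
        if tag == "label":
            kind = "explicit" if dict(attrs).get("for") else "unassociated"
            self.results.append(kind)
            self._open.append(len(self.results) - 1)
        elif tag in CONTROLS and self._open:
            # A control nested inside a LABEL without For is labeled implicitly.
            index = self._open[-1]
            if self.results[index] == "unassociated":
                self.results[index] = "implicit"

    def handle_endtag(self, tag):
        if tag == "label" and self._open:
            self._open.pop()


def classify_labels(html):
    """Return the classification of every LABEL found in the markup."""
    parser = LabelAssociations()
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical markup with one label of each kind.
sample = (
    '<label for="email">Email</label><input id="email" name="email">'
    '<label>Name <input name="name"></label>'
    '<label>Orphaned text</label>'
)
label_kinds = classify_labels(sample)
print(label_kinds)
```

On the sample markup, the first label is explicit, the second is implicit, and the third has no associated control at all.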

The TEXTAREA element

I found it rather surprising that TEXTAREA was used in only 3.50% of all pages using the FORM element. Perhaps the dominance of top-level pages in MAMA's URL set had something to do with the usage numbers. The TEXTAREA element is often a workhorse for more serious applications than what you would likely find on a glossy/glitzy home page. The Rows attribute has a slight edge over the Cols attribute in usage, but both attributes are found at the same time in 31,046 URLs. In all, 3,136 URLs used TEXTAREA elements without any Rows or Cols attributes (less than 10% of all URLs using TEXTAREA), leaving the browser to use the default dimensions. Note that the Name attribute maintains its dominance again over the Id attribute by a wide margin.

Fig 8-1: TEXTAREA element/attribute usage
ELEMENT/Attribute   Frequency   ELEMENT/Attribute   Frequency
TEXTAREA            36,410         Wrap             7,848
   Rows             32,754         Readonly         1,668
   Name             32,500         Tabindex         1,570
   Cols             31,566         Accesskey        91
   Id               9,183          Disabled         79
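
The count of TEXTAREA pages with neither Rows nor Cols follows from the figures above by inclusion-exclusion. A minimal sketch of that step, using the counts from Fig 8-1 and the overlap stated in the text (the variable names are illustrative):

```python
# Counts of URLs, from Fig 8-1 and the accompanying text.
TEXTAREA_URLS = 36_410   # URLs using TEXTAREA
ROWS_URLS = 32_754       # ... with a Rows attribute
COLS_URLS = 31_566       # ... with a Cols attribute
BOTH_URLS = 31_046       # ... with both Rows and Cols

# Inclusion-exclusion: URLs with at least one of the two attributes.
either = ROWS_URLS + COLS_URLS - BOTH_URLS

# URLs that leave both dimensions to the browser default.
neither = TEXTAREA_URLS - either
neither_share = round(100 * neither / TEXTAREA_URLS, 2)

print(either, neither, neither_share)
```

This gives 3,136 URLs with neither attribute, which is under 10% of all URLs using TEXTAREA, as stated.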

TEXTAREA Wrap attribute values

The values for this attribute have never been well documented. In addition to a number of other values, there are three combinations that seem to be an attempt to control the same behavior of a TEXTAREA box: "Virtual"/"Physical", "Soft"/"Hard" and "Off"/"On". "Virtual", "Soft" and "Off" all seem to be much more popular than their corresponding opposite attribute value. HTML5's Web Forms 2.0 codifies the "Soft"/"Hard" values, but as can be seen, these are not the values that are the most widely used.

Fig 8-2: Popular TEXTAREA Wrap values
[Please also see the complete frequency table.]
Wrap Attribute value   Frequency
virtual                3,608
physical               1,886
soft                   1,299
hard                   376
off                    252
on                     201

The LEGEND element

The "bounding box" visual effect that the FIELDSET element creates is usually paired with a LEGEND in traditional user interface usage, so it is odd that only 57.68% of FIELDSETs have a LEGEND element. Additionally, very few authors subsequently use the Align or Accesskey attributes.

Fig 9-1: LEGEND element/attribute usage
ELEMENT/Attribute   Frequency
LEGEND              18,269
   Align            546
   Accesskey        91

The BUTTON element

This element is rarely used in comparison to the INPUT Type attribute values that it subsumes. The INPUT Type values "Submit", "Reset" and "Image" are still the preferred methods for accomplishing their respective tasks. Even so, this forms latecomer still has some respectable numbers given its relatively recent arrival. Notice that the Name to Id ratio is much closer than with other, older forms widgets.

Given that the Type attribute has a default value of "Submit", and submittal is the top function from the INPUT element that BUTTON replicates, it seems a little peculiar that an explicit Type attribute has such high usage (~80%). MAMA did not track Type values for this element the way it did for INPUT, but it is expected that Type="Submit" would be the dominant value here as well.

Fig 10-1: BUTTON element/attribute usage
ELEMENT/Attribute   Frequency   ELEMENT/Attribute   Frequency
BUTTON              11,455         Value            2,271
   Type             9,079          Tabindex         464
   Name             4,246          Disabled         59
   Id               3,387

This article is licensed under a Creative Commons Attribution, Non Commercial - Share Alike 2.5 license.
