Validation of live site (not localhost one).
6 errors are SVG validator bugs.
Information is about HTML5 by default, unless stated otherwise. Backwards compatibility will also be considered.
Considers both W3C standards and browser support.
HTML is standardized by the World Wide Web Consortium (W3C): http://www.w3.org
The w3schools website is not affiliated in any way with W3C. It is said that it contains many errors, and its certification is worthless: it is the cplusplus.com of the web. Its major flaw is not pointing to the standards that corroborate its claims.
Formally, only the HTML-related material described in the HTML5 standard: Working Draft (WD) in 2008, Recommendation (REC) expected for 2014. The huge time lapse is in part due to wars over the future replacement for Flash.
Informally, an umbrella term for new web standards in general, e.g. as used in the name of the HTML5 Rocks website.
HTML5 is not XML, just pretty close to it. It is however possible to write HTML5 code that is valid XML. XHTML5 is a version of HTML5 which is always XML.
HTML5 is not SGML, but it was influenced by it. HTML 4 is SGML.
XHTML5 is a strictly XML version of HTML5.
The standard is developed together with HTML5.
Browsers differentiate between the two based on the HTTP content type:
text/html
is for HTML5 and application/xhtml+xml
is for XHTML5.
The DOCTYPE
is not used to differentiate between HTML and XHTML.
The distinction is made by the MIME type on the content-type
field of the HTTP header:
application/xhtml+xml
is strict XML
text/html
is not XML
There is no inclusion relation in any direction between XML and HTML.
Features that only exist in HTML and not in XML: http://stackoverflow.com/questions/7092236/what-is-cdata-in-html/39559758#39559758
Features that only exist in XML and not in HTML: (http://stackoverflow.com/questions/5558502/is-html5-valid-xml)
?xml
The XML declaration is not needed in HTML, and cannot be used there.
!DOCTYPE
and comments which are present in both.
CDATA
is not treated specially in HTML5.
Its main use in XHTML is to allow characters like &
to be written in script
elements.
However in HTML5 this is not necessary, as opening script tags make the parser ignore everything up to the closing tag.
To write code that is portable to both HTML5 and XHTML5, a common technique is to write the CDATA inside script on commented out lines.
HTML5 cannot be described by a DTD or XSD, as they are not expressive enough: http://stackoverflow.com/questions/4053917/where-is-the-html5-document-type-definition
W3C offers an HTML validation service at: http://validator.w3.org/
You can either upload your file, or give it a URL.
There are also tools that make it faster to validate a page, for example the Firefox Web Developer extension.
Google style guide: http://google-styleguide.googlecode.com/svn/trunk/htmlcssguide.xml
In order to understand what is going on, you will want to use browser tools.
Firefox has built-in Web Developer Tools, which are quite good. Shortcut: Ctrl + Shift + I.
It also has the popular third-party Firebug tool. Firebug was more popular in the past when the built-in tools were worse; now using just the built-in tools should be enough.
To validate HTML easily, try: https://addons.mozilla.org/en-US/firefox/addon/web-developer/ Shortcut: Ctrl + Alt + L
http://www.w3.org/TR/html5/syntax.html
<p>a</p>
is an element
<p>
and </p>
are a start and end tag respectively.
The standard does not mention the terms "opening" or "closing" tags, so don't use them.
All tag names are case insensitive.
Lower case names are much more common now. They are easier to type, and tags + syntax highlighting already gives them enough visibility.
XHTML5 is simple, just like XML:
<br>
In HTML5, things are much messier. There are five different kinds of elements to consider, and tons of rules.
void elements are the only elements which *cannot* have an end tag.
As a consequence, they cannot have any content, which justifies their name.
<br />
== <br>
: the slash is omitted
<p />
== <p /></p>
<p />
== <p>>
. The slash can appear on any element,
and it is the same as the greater than sign.
p
elements have by far the most complex rule. Quoting:
A p element's end tag may be omitted if the p element is immediately followed by an address, article, aside, blockquote, div, dl, fieldset, footer, form, h1, h2, h3, h4, h5, h6, header, hgroup, hr, main, nav, ol, p, pre, section, table, or ul, element, or if there is no more content in the parent element and the parent element is not an a element.
An li element's end tag may be omitted if the li element is immediately followed by another li element or if there is no more content in the parent element.
div, ul, table and blockquote elements must have an end tag.
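The p rule quoted above can be sketched in Python with the standard library's html.parser. This is only an illustration, not a conforming HTML5 parser: it covers the "immediately followed by" half of the rule plus end of input, ignores the parent-element case, and drops comments and the doctype. It also assumes the text contains no character references. The names ExplicitPEndTags and add_p_end_tags are made up for this example.

```python
from html.parser import HTMLParser

# Start tags that imply the end of an open p element, per the quoted rule.
CLOSES_P = frozenset(
    "address article aside blockquote div dl fieldset footer form "
    "h1 h2 h3 h4 h5 h6 header hgroup hr main nav ol p pre section "
    "table ul".split()
)


class ExplicitPEndTags(HTMLParser):
    """Re-emit markup with omitted </p> end tags written out (sketch)."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.p_open = False

    def handle_starttag(self, tag, attrs):
        if self.p_open and tag in CLOSES_P:
            self.out.append("</p>")  # the implied end tag
            self.p_open = False
        if tag == "p":
            self.p_open = True
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # <br/> style tags: emit the start tag only, no separate end tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag == "p":
            self.p_open = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)


def add_p_end_tags(markup):
    parser = ExplicitPEndTags()
    parser.feed(markup)
    parser.close()
    if parser.p_open:  # input ended while a p was still open
        parser.out.append("</p>")
    return "".join(parser.out)


print(add_p_end_tags("<p>a<p>b<div>c</div>"))  # <p>a</p><p>b</p><div>c</div>
```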
http://stackoverflow.com/questions/9797046/whats-a-valid-html5-document
Example at: min.html.
Works because of tag omission.
Such minimalism may break on IE8 as noted in the SO thread.
A saner and more portable minimal template is main2.html.
HTML defines a valid URL as: http://url.spec.whatwg.org/
Valid URLs are required on all attributes that take URLs like href
or src
:
in particular you must escape special chars like spaces,
even if the parameter quotes would be enough to determine the URL.
However it seems that this definition of URL allows IRIs, which allow many Unicode characters, to appear directly. TODO check. See: http://stackoverflow.com/questions/2742852/unicode-characters-in-urls http://stackoverflow.com/questions/1547899/which-characters-make-a-url-invalid
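A minimal sketch of the escaping step, using Python's urllib.parse.quote to percent-encode a path before putting it in href. The file names are hypothetical, and whether the raw Unicode form would also be valid is the open TODO above.

```python
from urllib.parse import quote

# Hypothetical file names: one with a space, one with a non-ASCII character.
spaced = quote("my file.html")
accented = quote("é.html")  # percent-encodes the UTF-8 bytes of é

print(f'<a href="{spaced}">link</a>')  # <a href="my%20file.html">link</a>
print(accented)  # %C3%A9.html
```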
HTML5 uses only the term "Character references": http://www.w3.org/TR/html5/syntax.html#character-references
The term "entity" is only used in XML, and has no exact analogue in HTML,
since XML entities can be defined in the document with the !ENTITY
construct,
and can be defined to mean multiple characters, including other entities, which leads to the billion laughs vulnerability.
Even so, some authors informally say "entity" when talking about HTML character references.
The term entity is not used: it is only used for a similar concept in XML.
There are two kinds of character references:
Named. E.g.: &gt; renders as >
Full list: http://www.w3.org/TR/html5/syntax.html#named-character-references
There is a bunch of accents and mathematical characters.
XML on the other hand only defines the 5 that need to be escaped in some context: & < > " '.
HTML 4.01 did not have &apos;
: you had to use a numeric reference: &#39;.
It was introduced in HTML5. The insanity!
Numeric: refers to Unicode code points, not UTF-8 or the current encoding.
Each reference encodes one entire Unicode code point.
Starts with the number sign #
.
Not all numbers are valid! The following are explicitly disallowed: U+0000, U+000D, permanently undefined Unicode characters (noncharacters), surrogates (U+D800–U+DFFF), and control characters other than space characters.
It seems that XML has similar restrictions. TODO check.
This is not so surprising, considering that HTML should treat references almost exactly as if they were characters, so if the raw characters are not allowed, the corresponding references should not be allowed either.
Commented out invalid examples: Control:
HTML5 adopts exactly the same references as XML, but earlier versions do not, so watch out. TODO check.
Notably: &quot; is on the official list of valid HTML 4 entities, but &apos; is not.
Entities are considered in both data and attribute values.
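As a quick way to experiment with these rules, Python's html.unescape decodes references according to the HTML5 rules, including its error handling for invalid numbers. The sketch below only shows decoding in Python. It does not show what a given browser renders.

```python
import html

# Named, decimal and hexadecimal references for the same character.
same_char = html.unescape("&gt; &#62; &#x3E;")
print(same_char)  # > > >

# &apos; is in the HTML5 named reference table.
apostrophe = html.unescape("&apos;")

# A numeric reference names a Unicode code point, not encoded bytes:
# the euro sign is one reference even though it takes 3 bytes in UTF-8.
euro = html.unescape("&#x20AC;")

# A surrogate is not a valid reference: it decodes to U+FFFD instead.
surrogate = html.unescape("&#xD800;")
print(repr(surrogate))  # '\ufffd'
```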
Escape sequences are intended for characters that:
Are not present in the current encoding, such as a Unicode character in an ASCII-encoded file.
Prefer Unicode for all non-whitespace characters.
Useful whitespace characters:
TODO: heard it somewhere that semicolons can be omitted on certain references. Check.
Similar to a regular space, but:
overflow-wrap
CSS property to allow breaks anywhere.
It seems to be the only whitespace with a named reference, but other whitespace character width variations exist in Unicode: http://en.wikipedia.org/wiki/Non-breaking_space#Width_variations like the en and em spaces.
Before After
Not in HTML4, so avoid it and use "
instead. In HTML5 and XML.
Using non standard tags makes your document invalid.
It is still likely to render correctly on all browsers.
You can also cheat this by referencing a custom !DOCTYPE declaration, but this is really overkill.
HTML allows you to use any attribute prefixed by data-
. Great addition!
There is even JavaScript support with dataset
:
https://developer.mozilla.org/en-US/docs/Web/Guide/HTML/Using_data_attributes#JavaScript_Access
Each element can have a specific set of attributes: these are specified on HTML5 on the documentation of each element under the "Content attributes" section.
There are attributes which can be used for every element: those are called Global attributes
Attribute names are case insensitive in HTML: inside a single element there can be no two attributes which are the same lowercased.
The only characters that must be escaped inside quotes are the quote itself (same type), and &.
&apos; is not in HTML 4, even if it is in XML and HTML5! This is a very strong argument towards using double quotes for attribute values.
Besides the quotes, only the ampersand & has to be escaped, otherwise it would be parsed as the start of a character reference.
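A minimal sketch of attribute value escaping with Python's html.escape. The attr helper is hypothetical. Note that html.escape does more than strictly required: it also escapes < and >, and it writes the single quote as the numeric reference &#x27; rather than &apos;, which sidesteps the HTML 4 problem mentioned above.

```python
import html


def attr(name, value):
    """Serialize one double-quoted attribute (hypothetical helper)."""
    return f'{name}="{html.escape(value, quote=True)}"'


print(attr("title", 'Tom & "Jerry"'))  # title="Tom &amp; &quot;Jerry&quot;"
print(attr("title", "it's"))  # title="it&#x27;s"
```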
Omitting quotes is valid for values that don't contain spaces, but quite insane: http://www.w3.org/TR/html5/syntax.html#attributes-0
This is not possible in XML.
Before HTML5, id attributes had to start with a-zA-Z
. In particular, no digits.
This restriction was lifted with HTML5, but it is still recommended to follow it for backwards compatibility.
id
attributes must be unique across the entire document or it is invalid.
Implementation statuses:
http://programmers.stackexchange.com/questions/127178/two-html-elements-with-same-id-attribute-how-bad-is-it-really
Multiple classes are space separated as in: "class0 class1 class2"
Most user agents show it as a tooltip on hover. HTML5 discourages relying on it to show information, as it cannot be seen in some user agents, notably those without a mouse, like a tablet or cell phone.
It is not possible to style the tooltip directly, but there are good CSS ways to reproduce the tooltip effect allowing arbitrary style: http://stackoverflow.com/questions/2011142/how-to-change-the-style-of-title-attribute-inside-the-anchor-tag
HTML5
Boolean attribute.
IE11+
Vs CSS display:none
:
http://stackoverflow.com/questions/6708247/what-is-the-difference-between-the-hidden-attribute-html5-and-the-displaynone
hidden
never shows, display:none
might show in some devices but not on others.
TODO understand better with examples.
Before hidden. Hidden. After hidden.
Often used with the meta
element.
Used for RDFa.
Many proprietary values are used.
TODO vs property
?
http://stackoverflow.com/questions/22350105/whats-the-difference-between-meta-name-and-meta-property
http://www.w3.org/TR/rdfa-lite
Allows adding arbitrary metadata to a document.
TODO: vs meta?
Not content attributes that are written in HTML tags, but rather internal properties that are accessible through JavaScript properties of the DOM model. For this reason they are often called properties.
Important confusing case: checked
attribute vs property.
Changing the attribute also changes the property, but clicking on the checkbox
only changes the property, not the attribute.
The first line determines the document type.
HTML5 has one single possibility: <!DOCTYPE html>
. It is case insensitive.
HTML4 has multiple possibilities
XHTML5 only accepts the uppercase version, so just stick to that.
Can be omitted: http://stackoverflow.com/questions/5641997/is-it-necessary-to-write-head-body-and-html-tags
Contains document metadata.
Can be omitted: http://stackoverflow.com/questions/5641997/is-it-necessary-to-write-head-body-and-html-tags TODO how does HTML decide what is inside of it and what is inside the body? Can all head tags only be inside it?
Required.
On all modern browsers, appears on the tab name.
Has big importance to search engines.
Document metadata.
The property
global attribute is often used on it.
meta
must have at least one of the following attributes:
http-equiv
, itemprop
, name
, property
.
meta
can be used to define a key value dictionary,
where the keys are stored in the name
attribute,
and the value on the content
attribute.
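The key value view of meta can be sketched with Python's html.parser by collecting every name / content pair into a dict. The MetaCollector class, the collect_meta function and the sample document are made up for illustration, and the csrf-param value is only an example.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect meta name -> content pairs (illustrative sketch)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        # meta elements without a name (e.g. charset) are skipped.
        if "name" in attrs and "content" in attrs:
            self.meta[attrs["name"]] = attrs["content"]


def collect_meta(markup):
    collector = MetaCollector()
    collector.feed(markup)
    collector.close()
    return collector.meta


sample = """<!DOCTYPE html>
<title>t</title>
<meta charset="utf-8">
<meta name="description" content="An HTML cheat sheet">
<meta name="csrf-param" content="authenticity_token">
"""
print(collect_meta(sample))
```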
Besides a few standard name
arguments, HTML5 says that anyone
can recommend a new name
usage:
http://wiki.whatwg.org/wiki/MetaExtensions.
For instance, Rails defines csrf-param
and csrf-token
there.
However, for the validator to pass you must add your name
value to the WHATWG wiki,
which is peer reviewed. The validator parses the wiki every time for the validation.
Many proprietary values are registered, e.g. from Twitter.
Commented out failing example: custom-name
.
An important metadata, as it is likely to show on search engine result snippets. https://support.google.com/webmasters/answer/35624?hl=en
Indicates the document's encoding.
Very important document wide, so it must be within the first 1024 bytes of the document.
Keep your sanity and use UTF-8 *everywhere*.
There seems to be no requirement at all that certain glyphs are available to user agents or not: http://stackoverflow.com/questions/2702447/which-of-the-following-unicode-characters-should-be-used-in-html
The only official recommendation we could find seems very vague: http://www.w3.org/TR/REC-html40/charset.html#h-5.4
Set values of things that can be set on the HTTP headers.
Only a few values are supported: http://www.w3.org/TR/html51/document-metadata.html#attr-meta-http-equiv
Content-Language
is deprecated in favor of the global lang
attribute.
Automatically refresh page after 1 second or redirect to another page!
This is a non standard HTTP header that is standard on HTML.
Interesting way to automatically reload news feeds without JavaScript, reload modifications on this cheat, or just annoy your users to hell.
Too annoying even for tests!
Redirect instead of refresh. Evil idea: HTML-only infinite loop of redirects!
https://developer.mozilla.org/en/docs/Mozilla/Mobile/Viewport_meta_tag
See also: CSS's device-pixel-ratio
.
Contains document "visible" content.
Can be omitted: http://stackoverflow.com/questions/5641997/is-it-necessary-to-write-head-body-and-html-tags
It is recommended that you always set the lang
attribute of the html
tag to help search engines.
In HTML5, it can be used on any tag and sets the language of the element, but the most common place to put it by far is on the html element.
meta http-equiv="Content-Language"
is deprecated.
HTML5 has many element categories.
Flow content and phrasing are two of the most important categories.
Elements can be both flow content and phrasing at the same time.
One of the most important things this determines is which elements an element can contain. Each element has:
Important conclusions that can be taken from content models and categories include:
p
, ul
.
HTML4 uses the terms block-level, inline and inline-block. In most senses, all elements are either block-level or inline.
HTML5 has many more categories.
Inline replaced elements such as images which behave like elements with CSS display:inline-block
property.
They can be put inside other inline elements.
http://www.w3.org/TR/html5/dom.html#sectioning-content-0
http://www.w3.org/TR/html5/sections.html#headings-and-sections
Organizes the structure of the document.
Most are only semantic and don't render specially in most browsers.
http://www.w3.org/TR/html5/grouping-content.html#the-main-element
Authors must not include more than one main element in a document.
Authors must not include the main element as a descendant of an article, aside, footer, header or nav element.
Content can be either directly inside main
, or inside articles inside of it
.
TODO
TODO
http://www.w3.org/TR/html5/sections.html#the-section-element
Usually a child of article
or main
,
that contains a heading. Usually there are multiple sibling sections.
Usually a sibling of header
.
TODO
Can be inside any article
or main
or body
,
so there can be multiple per document.
Can contain h1
, nav
and summaries,
although those elements can also be placed outside.
Usually a sibling of section
.
White spaces are taken into consideration when rendering.
Can only contain phrasing content elements like span
or b
.
Since it can contain elements, less-than and greater-than signs are *not* automatically escaped.
A similar effect can be achieved with the CSS white-space:pre;
property,
although there are some behavior differences.
Newline space:
a b
Newline before and after:
a b
Two newlines before and after:
a b
Whitespace newline before and after:
a b
a
b
You cannot put HTML4 block-level elements such as p
inside pre
(another block level element)
2 Newlines before and after:
a b
Line wrapping
a b c d a b c d a b c d a b c d a b c d a b c d a b c d a b c d a b c d
Term used in the spec to group tags that add semantics to text. Some of those tags do have special browser styles, so it is not just semantics.
In the past, some tags were used to indicate format and not semantics.
Many were marked obsolete on HTML5 such as center
in favor of CSS, but some acquired
semantic meaning related to the context in which they were most often used in, such as b
for emphasis.
#code tag
: computer code. The most widely used of this list.
HTML5
Main purpose: give phonetic reading for languages like Chinese, which are very hard to learn to read because they have weak correspondence between text and sound. Used mostly in texts targeting students, teenagers or foreign learners.
Low support as of 2014.
Chinese pinyin: 河 蟹
Nested: Japanese + Japanese phonetic symbols + Japanese romanisation: 攻殻機動隊
I bet this will be used by the Ruby programming language community at some point.
An extra place where words may break if container too small without adding spaces:
With in small div:
Without in small div:
With in large div:
Bi Directional override: This text will go right-to-left.
The Inline element code
is still often used for computer code representation.
It does not imply whitespace preservation:
a
b
Common combo with pre
to preserve spaces and make it a block element:
a
b
HTML5.
Semantic for times.
Does not render as anything special in major browsers.
Can have the datetime
attribute to specify time for inner content.
No attribute:
datetime
attribute .
The ins
and del
elements indicate changes in the text.
They can also have the cite
and datetime
attributes to contain extra information about the changes,
but on Firefox 31 at least those have no special default effects.
#ins: inserted in document from last version.
#del: removed from document from last version.
Set of elements used in the HTML5 spec.
Groups elements that require downloading extra data through new requests.
Some of those elements can be embedded either directly in the HTML if they are XML based like SVG, or with data URIs for all types, but the downside of that is that the browser can't cache them, so don't do it unless the entire page can be cached and you won't be reusing the element across pages.
img elements are neither inline nor block: they are replaced inline elements.
Must be non-empty or invalid HTML5:
An image pointing to HTML:
It is possible to set width and height and the browser will resize as requested. height="100" width="200":
Setting only width or height makes browser keep proportion. width=300
What is the best way to set image size: CSS (width and height apply to replaced inline elements) or HTML? http://stackoverflow.com/questions/2414506/should-image-size-be-defined-in-the-img-tag-height-width-attributes-or-in-css
If:
then the browser can show the alt text instead of the image
As of 2013, Chrome replaces a missing image with a dummy image, therefore ignoring alt. Therefore don't rely on alt being shown in your app logic: alt should only be shown in case of error. If an image should not be present, replace it with something else explicitly.
According to the HTML5 validator, the alt
attribute is mandatory except under certain conditions,
so just use it always for your sanity.
With alt text:
figure + figcaption
is good practice in HTML5.
figcaption
has no special display,
It is only semantic and the real work must be done with CSS.
Browsers may however place it in a sensible place such as the bottom if no style data is found.
Some examples to show that data URLs work:
Single long line (clutters less the screen):
Also possible for SVG, but remember: src
must be a valid URL,
so it can't contain things like spaces.
Added in HTML5.
Takes multiple formats, plays the first supported only.
controls
attribute adds controls like play / pause:
Autoplay (commented out:)
autoplay loop: great way to annoy users to death! (commented out):
Firefox right click does "Save Audio As" just like it does for images.
Add some JavaScript, and you've got a music player:
Multiple tracks: just change the src of the audio tag.
Powerful things can be done with it using the JavaScript Audio API.
Stream: nope:
Include another HTML document in the current one.
This element therefore automatically makes an HTTP request to another page, so it is an interesting way to include HTML content without any JavaScript.
In the past, before server side scripting, widely used to reuse code fragments such as indexes.
Now generally frowned upon for navigation.
frame
was invalidated in HTML5.
iframe
, inline frame, is still valid in HTML5.
iframe always has a fixed size, which you can change with CSS:
Disable scroll in HTML5 CSS3 is not yet widely supported: http://stackoverflow.com/questions/1691873/safari-chrome-webkit-cannot-hide-iframe-vertical-scrollbar
The frame content is like alt
for images.
Your browser may simply ignore it in case of failure if it has something better to do
(Firefox 27 shows the not found page with a reload button.)
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/object
HTML4, but lost many attributes when HTML5 was introduced.
TODO iframe
and embed
:
http://stackoverflow.com/questions/16660559/difference-between-iframe-embed-and-object-elements
Pass parameters to script running in object
.
Can only be contained inside of object
.
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/embed
Standardized in HTML5, but widely supported in incompatible ways before that.
TODO: whatever is inside it does not show: abc
Standard way to represent mathematics.
XML like language not suitable for direct user input, not LaTeX.
A converter from LaTeX to MathML such as Blahtexml or itex2MML is unavoidable for content writers.
TODO examples
This section shall only consider the relationship between SVG and HTML5 technologies: it shall not go into how SVG works itself.
Specified by W3C at: http://www.w3.org/TR/SVG/.
Can be used from either:
img
object
The trade-offs are:
img
and object
object
So the object
approach is the best in general.
Embedded:
img
:
Unlike the embedded version, img
svg
element
must have the xmlns
and version
attributes!
TODO: what is the version of the embedded SVG? Can it be specified on the element?
object
:
Great article: http://css-tricks.com/using-svg/
Even though HTML is not XML, CSS can be used to style any XML as well as HTML.
Just use selectors and properties analogously to how they are used in HTML!
Even complex effects like hover
work!
Inline:
object
styling only works with styles embedded in the SVG or external styles. TODO: get working.
Embedded:
External:
You can include script tags inside SVG and compliant browsers must execute it.
This allows for interactive SVGs, but also XSS attacks, which is probably why GitHub does not serve SVGs. http://stackoverflow.com/questions/13808020/include-an-svg-hosted-on-github-in-markdown/25606546#25606546
It seems that most browsers will only execute inline SVG,
not embedded in objects like img
but this behaviour is not standardized.
Click for Js embedded:
Click for Js inside object
:
Click for Js inside img
:
External JavaScript:
https://bugzilla.mozilla.org/show_bug.cgi?id=798374
Do ulimit -Sv 500
and open
/svg-billion-laughts.svg
to see Firefox 44 die.
Chromium 36 and Firefox 31 both crash (and may make your system crash) on a billion laughs exploit: https://bugzilla.mozilla.org/show_bug.cgi?id=798374 https://code.google.com/p/chromium/issues/detail?id=231562
http://stackoverflow.com/questions/7458546/html-in-svg-in-html
The link element links to different types of external documents.
One notable exception of external document that is not referred to via
link rel are scripts such as JavaScript, which must be referred to with script src
.
Why? http://stackoverflow.com/questions/2631635/can-i-load-javascript-code-using-link-tag
The rel
attribute specifies what is the main meaning of the linked element.
Only the white-listed rel
values generate valid HTML5,
although major websites use invalid extensions, e.g. GitHub uses xhr-socket
.
Vendors have also created extensions, e.g. apple-touch-icon
from Apple.
The image will show on most browsers on the tab before the title.
It is like an icon of a desktop application.
A standard name for the icon image is favicon.ico
.
http://www.html5rocks.com/en/tutorials/webcomponents/imports/
link destination
img:
Anchors without an href
are specified to look like regular text:
I am an anchor without href
! There are situations
in which one wants to use them as event generators.
Not possible: http://stackoverflow.com/questions/9882916/are-you-allowed-to-nest-a-link-inside-of-a-link because anchor cannot have interactive content descendants, and anchor is interactive content.
a
cannot contain a
.
Has no special semantics to browsers, but may instruct users and search engines not to follow links. rel="nofollow"
Where to open the link.
The other values can only be seen with frames and I'm too lazy to do it.
If present, the linked resource will be downloaded with the name of the value of this attribute instead of navigated to.
For certain MIME types, in particular those that the browser cannot display, and possibly user configured, downloading may be the default browser action.
Download an HTML: html2.html html2.html as newname
Cannot be used of course for XSS protection if HTML pages may be downloaded:
TODO does HTTP Content-Disposition
work for that?
TODO check if each of them is supported in HTML5.
Contain the data of the link inline.
Defined in: http://tools.ietf.org/html/rfc2397
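A minimal sketch of building a base64 data URL in Python, following the data:[mediatype][;base64],data shape from the RFC. The data_url helper name and the payload are made up for illustration.

```python
import base64


def data_url(payload, mime):
    """Build a base64 data URL for the given bytes (hypothetical helper)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime};base64,{encoded}"


print(data_url(b"hello", "text/plain"))  # data:text/plain;base64,aGVsbG8=
```

The result can be used anywhere a URL is expected, such as the src of an img, at the cost of the caching downside discussed here.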
Good or bad: http://stackoverflow.com/questions/1124149/is-embedding-background-image-data-into-css-as-base64-good-or-bad-practice Mostly bad because browsers have a little feature called cache.
TODO: does HTML5 say you can use them in anchors? HTML5 does mention it in some parts, but I could not find the passage.
Cannot work from the browser because it would be a security gap.
Not doing anything in Firefox 31.
Open default email client with the email pre-filled.
Header with th:
head 0 | head 1 |
---|---|
0 | 1 |
2 | 3 |
4 | 5 |
thead
must come *before* tbody
and tfoot
in valid HTML,
as any sane programmer should do.
head 0 | head 1 |
---|---|
foot 0 | foot 1 |
body 0 | body 1 |
tbody
is automatically inserted on the DOM
around groups of rows without a surrounding tbody
.
http://stackoverflow.com/questions/938083/why-do-browsers-insert-tbody-element-into-table-elements
Automatic addition can be verified with CSS, JavaScript or Firebug like tools.
However, the sanest thing to do is to only omit tbody
for a trivial table where there is a single one:
handle any more complex case explicitly with multiple tbody
elements.
There can be multiple tbody
elements on a single table.
Header with th:
Merge table cells.
0 | 1 |
2 | |
4 | 5 |
0 | 1 |
2 | 3 |
5 |
Can only contain table related elements like tr
and script supporting.
To group rows, use tbody
, which can be used multiple times.
Don't forget that tbody
is added automatically
in most cases around groups of rows without a surrounding tbody
0 | 1 |
2 | 3 |
4 | 5 |
6 | 7 |
8 | 9 |
Caption must be the first child of table.
Caption default position:
body 0 | body 1 |
Caption on bottom. The align
attribute is not in HTML5 (deprecated in 4.01). Use CSS caption-side
instead.
body 0 | body 1 |
Contains everything needed to send data to the server, including inputs, input labels and a submit button.
Important attributes:
action
: where to send the request to
method
: what HTTP method to use. Can only be GET or POST.
Why this is so: http://programmers.stackexchange.com/questions/114156/why-there-are-no-put-and-delete-methods-in-html-forms
Rails gets around this by sending a POST with a _method
param for other methods.
enctype
: value of the content-type
header:
application/x-www-form-urlencoded
: default. URL encode.
multipart/form-data
: can only be used if the method
is post
.
Best option for uploading large data such as files via input type="file"
.
text/plain
: useless?
To see what forms do exactly, consider using the netcat
utility or an ECHO server.
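To get a feel for the default application/x-www-form-urlencoded encoding without opening a socket, Python's urllib.parse.urlencode produces the same kind of name=value&name=value string. The field names and values are hypothetical, and a real browser request also carries headers that this sketch does not show.

```python
from urllib.parse import urlencode

# Hypothetical form fields: spaces become + and & is percent-encoded.
fields = [("q", "a b&c"), ("lang", "pt")]
body = urlencode(fields)
print(body)  # q=a+b%26c&lang=pt
```

For a GET form this string goes after the ? in the URL, and for a POST form it is the request body.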
Multipart form to http://example.com:
Sending the form is like clicking a link: the browser attempts to render the reply, sets the URL on the address bar, etc. For this reason, the most common action after submitting a form is to redirect to somewhere else.
Useless historical artifact? http://stackoverflow.com/questions/8946320/whats-the-point-of-html-forms-name-attribute http://stackoverflow.com/questions/11111670/is-it-ok-to-have-multiple-html-forms-with-the-same-name
If missing or empty, send form to current page: http://stackoverflow.com/questions/1131781/is-it-a-good-practice-to-use-an-empty-url-for-a-html-forms-action-attribute-a
For most form elements, gives the name of the parameter that will be sent on the request.
If missing:
Such an input could still be used by JavaScript.
There are two magic values: isindex
and _charset_
:
https://www.w3.org/TR/html5/forms.html#naming-form-controls:-the-name-attribute
If the first text field has name isindex
, the value is sent just as ?value
instead of ?isindex=value
.
TODO: if missing HTML is valid but input is ignored on submit, like name
?
Any form field can have the disabled
boolean attribute.
If present, it:
The disabled IDL attribute (JavaScript property) always matches the content attribute.
Even so, it is better to modify the IDL attribute instead of the content attribute,
making things more uniform with checkbox
operation, and avoiding unnecessary HTML modification.
pointer-events:none
but not yet standard.active
class does it,
but currently relies on pointer-events:none
, so it is useless.
Examples: disabled
, checkbox's checked
,
select
's selected
.
If not present: false. If present, true.
Do not need a value, but if a value is given there are only two valid possibilities:
The best option is using an exact lowercase match of the attribute, which is also XHTML5 portable.
Mostly historical attributes: new attributes like contenteditable
get the saner true
and false
valid values.
Boolean attribute.
If true, field cannot be modified, but unlike disabled will be sent with the form and its text can be selected.
Invalid for checkboxes: http://stackoverflow.com/questions/155291/can-html-checkboxes-be-set-to-readonly Just disable them.
type
: should always be specified to avoid browser inconsistencies:
action
.
Default value if the type attribute is missing.
The visible name of the form field.
Does not lead to many programmatic effects, except that clicking on the label may focus or toggle its corresponding form control.
The most flexible way to use labels is with the for
attribute pointing to the input id:
It is also possible to put the input inside the label. This allows not to give IDs to inputs and to click on the space between the checkbox and the label, but is harder to style and less logical. Discussions: http://ux.stackexchange.com/questions/35289/input-checkboxes-wrapped-inside-labels
It is valid HTML5 to omit the type attribute, but a bad practice due to browser inconsistencies.
TODO default is text?
value="default":
initial content in the text field:
size="3"
size="5"
placeholder="asdf": text that shows when the field is empty:
placeholder="asdf", value="default"
The value sent is the name=value pair, or name=on if there is no value attribute.
TODO: can you not send the value since it is useless? http://stackoverflow.com/questions/4557387/is-a-url-query-parameter-valid-if-it-has-no-value
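A rough sketch of what ends up in the submitted data for checkboxes. `serializeCheckboxes` and the `{ name, value, checked }` objects are hypothetical stand-ins for real form controls, and the sketch also assumes the usual rule that unchecked boxes are not sent at all.

```javascript
// Hypothetical helper: serializes checkbox-like objects roughly the way a
// form submission would. Each item is { name, value, checked }.
function serializeCheckboxes(checkboxes) {
  const params = new URLSearchParams();
  for (const box of checkboxes) {
    // Unchecked boxes contribute nothing to the submitted data.
    if (!box.checked) continue;
    // Without a value attribute, the submitted value falls back to "on".
    params.append(box.name, box.value === undefined ? 'on' : box.value);
  }
  return params.toString();
}

console.log(serializeCheckboxes([
  { name: 'a', value: '1', checked: true },
  { name: 'b', checked: true },
  { name: 'c', value: '3', checked: false },
])); // a=1&b=on
```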
http://www.w3.org/TR/html5/forms.html#attr-input-checked
Boolean attribute that determines the checkbox's initial checked state.
If modified, it also modifies the checked state, but only if the checkbox hasn't been clicked on. If it has, a dirty flag is set, and adding or removing the attribute does not modify the checked state.
Resetting a form resets the dirty flag.
The conclusion is simple: only use this attribute to set the initial value of the checkbox, and never modify it with JavaScript. Use the checked property instead.
Contrast this complex behavior with that of the disabled attribute, which always matches the IDL attribute.
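The dirty flag behavior is easier to follow with a toy model. `CheckboxModel` below is a simplified, hypothetical class written for illustration, not a browser API; it only models the attribute, the checked state, user clicks and form reset.

```javascript
// Simplified, hypothetical model of a checkbox; not a browser API.
class CheckboxModel {
  constructor(hasCheckedAttribute) {
    this.attribute = hasCheckedAttribute; // the checked content attribute
    this.checked = hasCheckedAttribute;   // the checked state the user sees
    this.dirty = false;                   // set once the user has clicked
  }
  // Adding or removing the attribute only changes the state while not dirty.
  setAttribute(present) {
    this.attribute = present;
    if (!this.dirty) this.checked = present;
  }
  // A user click toggles the state and sets the dirty flag.
  click() {
    this.checked = !this.checked;
    this.dirty = true;
  }
  // Resetting the form clears the dirty flag and restores the default.
  reset() {
    this.dirty = false;
    this.checked = this.attribute;
  }
}

const box = new CheckboxModel(false);
box.setAttribute(true);
console.log(box.checked); // true: not dirty, so the attribute is followed
box.click();
console.log(box.checked); // false: the user unchecked it, dirty is now set
box.setAttribute(false);
box.setAttribute(true);
console.log(box.checked); // false: attribute changes are ignored when dirty
box.reset();
console.log(box.checked); // true: reset clears dirty and reapplies the attribute
```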
Beware! Some browsers like Firefox might remember the checkbox value from old submits, and override this attribute. Your logic should not depend on its initial state. Ways to disable this: http://stackoverflow.com/questions/299811/why-does-the-checkbox-stay-checked-when-reloading-the-page
A good way to give a title for all the buttons is through a fieldset + legend pair.
Shows an image. When the user clicks it, the form is submitted and the click coordinates are sent as name.x=1&name.y=2.
Vs buttons: http://stackoverflow.com/questions/7117639/input-type-submit-vs-button-tag-are-they-interchangeable http://stackoverflow.com/questions/469059/button-vs-input-type-button-which-to-use
Interaction identical to button type="submit" and type="button" respectively.
input has fewer styling capabilities: to start with, it cannot contain other elements.
Appears to be the oldest method for adding buttons to pages, and some legacy browsers only support it but not button. But today it is probably better to use buttons instead.
Multiple submit buttons on the same form: http://stackoverflow.com/questions/547821/two-submit-buttons-in-one-form
Suggests autocomplete values from a fixed data set for a text field. Shows the possibilities that match what the user is typing.
Also shows autocomplete values if enabled.
Options: "abcdef", "defghi", "ghijkl".
In theory, should set the visible size of elements to show exactly that many characters / items.
In practice, modern web browsers don't implement this consistently, so don't use it: http://stackoverflow.com/questions/1077483/html-input-size-attribute-not-working http://stackoverflow.com/questions/15760089/select-size-attribute-size-not-working-in-chrome-safari
size="2"
size="4"
Limits, in the UI, the number of characters the user can input.
maxlength="6"
textarea maxlength="6". Newlines count:
maxlength="6" value="1234576". TODO: valid?:
Should behave like maxlength, but hasn't been implemented on Firefox 31 nor Chrome 36 nor in the Validator.
Large box for inputting text.
How many characters will fit into the textarea vertically and horizontally.
Many browsers set the textarea font to monospace by default, and in that case cols is accurate. Otherwise, cols is not accurate since different characters can have different widths.
cols="3" rows="3"
TODO what does cols mean, since font widths vary with different characters?
Opens a file selector popup.
Should almost always be used with form method="post" enctype="multipart/form-data".
Accepts the multiple attribute.
Indicates that the input field takes multiple values.
Introduced reasonably recently: only available on IE10+: http://caniuse.com/#feat=input-file-multiple So as of 2014 it is better to use libraries for that.
Currently only valid for file and email.
The select element also supports it.
Multiple files: Ctrl-click to select multiple files in the popup, then submit.
Data is sent as: file=path1&file=path2.
Multiple emails show as a single text field, but it forces the user to enter comma + space separated email addresses. Works because emails cannot contain commas.
Values are sent as repeated keys with different values.
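The repeated-key shape can be reproduced with the standard URLSearchParams class. This sketch only illustrates the key=value layout described above; the names `sent` and `received` are arbitrary, and real file uploads would normally use multipart/form-data instead of a query string.

```javascript
// Sending side: append the same key once per selected value.
const sent = new URLSearchParams();
sent.append('file', 'path1');
sent.append('file', 'path2');
console.log(sent.toString()); // file=path1&file=path2

// Receiving side: get() returns only the first value, getAll() returns all.
const received = new URLSearchParams('file=path1&file=path2');
console.log(received.get('file'));    // path1
console.log(received.getAll('file')); // [ 'path1', 'path2' ]
```

Code that reads such data with a single-value accessor silently drops every value after the first, so the receiving side has to ask for all values explicitly.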
Boolean attribute.
The following options seem not to be possible in HTML. Some are possible with the File API, others need JavaScript libraries that hide the file uploader and do operations with divs and the File API.
multiple
is used:
http://stackoverflow.com/questions/9337793/remove-selected-files-before-upload-with-javascript
The only solution is to hide the file input, and create divs with Javascript.
Some good libraries include:
The boolean attribute selected marks the default. Dirtiness analogous to checkbox's checked.
Shift-click for ranges, Ctrl-click for unions.
Data is sent as if these were multiple inputs with the same name.
Similar to multiple checkboxes with the same name, but:
Organize select options in categories.
Hides the value. Does not autocomplete, although Firefox offers to remember the password.
Generates an RSA private / public key pair, sends the public key, and stores the private key.
TODO how to get the private key?
HTML5
Groups several form elements.
If disabled, all its elements are disabled.
TODO any more behavior changes? What is the name attribute for?
HTML5 adds great form validation abilities through:
input types like email
attributes like required, min and pattern
pseudo-classes like :invalid and :required, which can be used from CSS without Javascript
Invalid form fields also fire the invalid event (oninvalid handler) if you need some Javascript.
Browsers like Firefox show invalid fields with red default CSS.
On submit, browsers can open a speech balloon with an error message over invalid fields.
It does not seem possible to customize the message without Javascript:
http://stackoverflow.com/questions/5272433/html5-form-required-attribute-set-custom-validation-message
With Javascript, the message can be customized with setCustomValidity inside the oninvalid callback.
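A minimal sketch of customizing the message, assuming `input` is a form control that supports addEventListener and setCustomValidity. `useCustomMessage` is a hypothetical helper name, and the selector and message in the usage comment are made up.

```javascript
// Hypothetical helper: `input` is expected to be a form control element.
function useCustomMessage(input, message) {
  // Set the custom message when the browser reports the field as invalid.
  input.addEventListener('invalid', () => input.setCustomValidity(message));
  // Clear it again when the user edits the field: a non-empty custom
  // message keeps the field invalid until it is reset to ''.
  input.addEventListener('input', () => input.setCustomValidity(''));
}

// In a page (the selector and the message are assumptions):
// useCustomMessage(document.querySelector('input[required]'), 'Please enter a name.');
```

Clearing the message on input matters because a field with a non-empty custom validity message counts as invalid even after its value has been fixed.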
Try to submit with an empty field:
Firefox 30 shows an arrow box with "Please fill out this field."
Field must match a Javascript regexp to be valid.
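One detail worth illustrating is that the whole value must match, not just a part of it. The sketch below approximates this with a hypothetical `matchesPattern` helper that wraps the pattern in `^(?:...)$`; the `u` flag is an assumption, and browsers may compile the pattern with a different flag.

```javascript
// Hypothetical helper approximating how a pattern is applied to a value.
function matchesPattern(pattern, value) {
  // An empty value does not fail the pattern check; use required for that.
  if (value === '') return true;
  // The whole value must match, as if the pattern were wrapped in ^(?: )$.
  return new RegExp('^(?:' + pattern + ')$', 'u').test(value);
}

console.log(matchesPattern('[a-z]{3}', 'abc'));  // true
console.log(matchesPattern('[a-z]{3}', 'abcd')); // false: partial matches do not count
console.log(matchesPattern('a|b', 'ab'));        // false: the alternation is grouped
console.log(matchesPattern('[a-z]{3}', ''));     // true: empty values are not checked
```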
Matches a huge regexp that represents valid emails http://www.w3.org/TR/html5/forms.html#valid-e-mail-address
HTML5 says it does not enforce a syntax like URL or email, probably because there is no single standard for telephones, so this is mostly semantic.
TODO what validation is done? HTML5 links to the URL standard: http://url.spec.whatwg.org/ Validation seems not implemented in Firefox 31.
There are several time validators: month, week, etc.
Effects:
valueAsNumber in Javascript
Between 10 and 20:
Can also specify a step, which allows only certain multiples to be valid.
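How min, max and step combine can be sketched with a small function. `isAllowedNumber` is a hypothetical helper, restricted to integers to avoid floating point rounding issues, and it assumes that steps are counted starting from min.

```javascript
// Hypothetical helper: is `value` allowed for the given min, max and step?
// Integers only, to keep the remainder check exact.
function isAllowedNumber(value, min, max, step) {
  if (value < min || value > max) return false;
  // Steps are counted from min, not from zero.
  return (value - min) % step === 0;
}

console.log(isAllowedNumber(14, 10, 20, 2)); // true
console.log(isAllowedNumber(15, 10, 20, 2)); // false: not a multiple of the step
console.log(isAllowedNumber(22, 10, 20, 2)); // false: above max
console.log(isAllowedNumber(13, 10, 20, 3)); // true: 10 + 1 * 3
```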
In Javascript, input.value is still a string even for numeric inputs like type=number or type=range, and you would have to use parseInt to get an integer.
For those elements, however, it is possible to use the valueAsNumber property to get a number directly.
Draggable slider:
Seems that you need Javascript to show the current value: http://stackoverflow.com/questions/10004723/html5-input-type-range-show-range-value
HTML5
A visual percentage measure.
value="0.6" :
value="2" min="0" max="10" :
value="2" min="0" max="10" low=3 :
value="2" min="0" max="10" high=1:
Indicates progress of a process. Meant to be modified with Javascript.
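How is it different from meter? HTML5 says that progress represents the completion progress of a task, while meter represents a scalar measurement within a known range and should not be used to indicate progress.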
22 out of 100:
Determines if the browser will spellcheck and highlight errors or not.
Works for:
Can be turned on or off through the spellcheck Javascript IDL attribute.
Boolean attribute: automatically focuses the element on page load.
As a consequence, the page will scroll to the element.
No demo here, so as not to annoy readers: see the dedicated demo at autofocus.html
Not in HTML5: http://stackoverflow.com/questions/5333416/autocapitalize-attribute-on-input-element-used-for-ios-breaks-validation
HTML5
Determines if the browser should suggest autocomplete values based on previous form submits.
Consider the datalist element if you have a predetermined suggestion list for the user.
Default: on.
TODO: what do browsers use exactly for the prediction? URL + name attribute?
Output of a calculation with inputs given by the user, often in forms and with Javascript input.
TODO: is it purely semantic without any special behaviors, like a span? It seems so: http://stackoverflow.com/questions/20700499/what-is-the-purpose-of-for-attribute-in-output-element
HTML5 rendered some elements and attributes obsolete. This means that:
Add buttons to the right click context menu.
Currently only supported in Firefox.
TODO: is it defined in HTML5? http://stackoverflow.com/questions/4447321/adding-to-browser-context-menu Appears to be in early draft, or didn't make it into HTML5.
Content that the browser may show if Javascript is not supported.
Block element.
Determines the order that hitting tab will cycle focus through elements.
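The possible integer values are:
negative, usually -1: the element can be focused from Javascript, but tab skips it.
0: the element is reached in document order, after all elements with a positive tabindex.
positive: the element is reached before the tabindex 0 elements, in increasing order of tabindex.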
If multiple elements are on the same precedence level, the one that comes first on the document tree goes first.
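The resulting order can be sketched as a sort, assuming the usual rules: positive values go first in increasing order, then the elements with value 0 in document order, and negative values are skipped by tab. `tabOrder` and the `{ id, tabindex }` objects are hypothetical stand-ins for real elements.

```javascript
// Hypothetical helper: `elements` are { id, tabindex } objects listed in
// document order. Returns the ids in the order tab would visit them.
function tabOrder(elements) {
  const positive = elements
    .filter((el) => el.tabindex > 0)
    // Array.prototype.sort is stable, so ties keep their document order.
    .sort((a, b) => a.tabindex - b.tabindex);
  const zero = elements.filter((el) => el.tabindex === 0);
  // Negative values are left out: tab never reaches them.
  return [...positive, ...zero].map((el) => el.id);
}

console.log(tabOrder([
  { id: 'a', tabindex: 0 },
  { id: 'b', tabindex: 2 },
  { id: 'c', tabindex: -1 },
  { id: 'd', tabindex: 1 },
  { id: 'e', tabindex: 2 },
])); // [ 'd', 'b', 'e', 'a' ]
```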
Example:
Sets single-letter keyboard shortcuts that activate elements (e.g. follow links) or focus them (e.g. form fields).
Not very portable, since the Ctrl Shift Alt prefix combination for the shortcuts varies across browsers, making this useless. The prefixes are:
Example:
If true, you can edit the content of the element like for a textarea. Tab cycles through such elements.
p contenteditable=true
p contenteditable=false
TODO
TODO
TODO
HTML5 adds built-in drag-and-drop (DnD) support, complete with a Javascript API. Before this, DnD was implemented by libraries through mouse events.
http://www.html5rocks.com/en/tutorials/dnd/basics/
By default for a long time now, images and anchors have had a drag effect that produces a ghost image, but with no specified effects. This is probably done to support drag and drop to other applications.
Drag and drop is made available through the draggable attribute, although in some browsers like Firefox you won't see any difference unless Javascript bindings are being used.