Forms

Forms and Functions

If you've grasped the basics of listening for page events with jQuery, working with forms should be a piece of cake. Most of the same skills are involved: finding form elements, using on() to register for events, and processing the results. Unfortunately, forms are one of the oldest parts of HTML, and they can be a little quirky. jQuery provides a set of specialized methods for dealing with those quirks.

Normally, for HTML to interact with a server, we write a form element and put some elements inside to capture user input. It might look a little something like this:

    <form action="feedback.php">
      <h2>Submit your feedback!</h2>
      <textarea name="comments"></textarea>
      <input type="submit">
    </form>

Forms written this way have a number of usability flaws. They don't tell the user when one of the form elements is required or when the input needs to be in a certain format. They require a server to process the input, even if it's not something that needs to be stored or sent elsewhere. And of course, the entire page has to be refreshed whenever the form is submitted, losing all the other state on the page.

We won't learn how to submit forms without a page refresh until we reach our chapter on AJAX, but we can certainly address the other issues with JavaScript. As a result, our forms will be much friendlier and easier to use. We can also write client-side applications that use form elements for input, but never try to submit those elements to a server. They can do everything through JavaScript in the browser, while leveraging the rich set of building blocks that forms give us for user interaction.

The Value of val()

The elements that we can put in a form express their values in very different ways. <input> and <button> tags have a value attribute, although it's sometimes used in odd ways (as in radio buttons and checkboxes, which combine it with the checked attribute). <textarea> elements have no value attribute at all: they expose a value property to JavaScript, but they get their initial value from the content between the opening and closing tags. Select boxes (<select>) don't intrinsically have a value themselves, but they contain <option> tags that do, and the selected option donates its value to the select box as a whole.

We should also distinguish between attributes and properties when it comes to elements. Technically, only tags have attributes--they are the x="y" that you write into your HTML. When the tag is parsed by the browser and converted into an element in the Document Object Model, or DOM, some of those attributes are converted to properties on the element object. Changing the property sometimes changes the attribute as well, but in many cases--and the value attribute used by a number of form elements is one of these--it doesn't.

Confused yet? Most of the time, this won't actually matter, because we can let jQuery's val() method do the heavy lifting. val() handles the differences between various elements, and makes sure that you get the right information regardless of how screwy it is under the hood (most of the time--there are a few exceptions). Let's see how this works in practice.

    <input type="text" id="first-name" value="Thomas">
    <input name="last-name" value="Wilburn">
    <select id="selection">
      <option value="one" selected>One</option>
      <option value="two">Two</option>
    </select>
    <textarea>Hello, World</textarea>

    <script>
    $('#first-name').val(); //Thomas
    $('input[name="last-name"]').val(); //Wilburn
    $('#selection').val(); //one
    $('textarea').val(); //Hello, World

    //like css(), html(), and other jQuery functions, val() is a getter
    //and a setter, depending on whether or not you feed it a value.
    $('textarea').val('Fnord'); //we've set the value of the textarea
    </script>

By using jQuery's val() function, we can sidestep some of the messiness involved in working with HTML form elements. The real exceptions, however, are checkboxes and radio buttons. These elements are linked by a shared name attribute, so you should use the :checked selector to find the selected item and then get its value. Keep in mind that val() reads only the first matched element, so if several checkboxes are checked, you'll need to loop over the matches to collect every value.

    <input type="checkbox" name="pets" value="socks">Socks
    <input type="checkbox" name="pets" value="bo" checked>Bo
    <input type="checkbox" name="pets" value="checkers">Checkers

    <script>
    var checkedVal = $('input[name="pets"]:checked').val(); //bo
    </script>

If we work with forms enough, we might want a function that could be pointed at a form, and would return an object containing the values of all the elements inside. One version might look something like this:

    var getForm = function(form) {
      form = $(form); //make sure the form object is jQuery, just to be safe
      var output = {};
      var i, item, name, value;

      //find all the simple inputs, which we can read with val()
      //(this selector skips inputs that have no type attribute)
      var easy = form.find('input[type=text],textarea,select');
      for (i = 0; i < easy.length; i++) {
        //eq() pulls out a single item, like [], but keeps it wrapped in jQuery
        item = easy.eq(i);
        name = item.attr('name');
        value = item.val();
        //set a property on the output item with the name and value of the input
        output[name] = value;
      }

      //we also need to find the checkboxes and radio buttons, but
      //they need an additional filter applied
      var hard = form.find('input[type=checkbox], input[type=radio]');
      //we only care about selected items, which match :checked
      hard = hard.filter(':checked');
      for (i = 0; i < hard.length; i++) {
        item = hard.eq(i);
        name = item.attr('name');
        //since we've filtered down to checked items, now we can use val()
        value = item.val();
        //if several checked boxes share a name, the last one wins
        output[name] = value;
      }

      //return the completed form object
      return output;
    };
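One gap in a helper like getForm is that several checked checkboxes sharing a name overwrite each other in the output. As a rough sketch of one way around that, the hypothetical foldPairs() function below takes an array of {name, value} objects, which is the shape that jQuery's serializeArray() method returns for a form, and groups repeated names into arrays. The function name and the sample data are made up for illustration.

```javascript
//foldPairs is a hypothetical helper, not part of jQuery.
//It expects an array shaped like the output of $(form).serializeArray():
//[{name: 'pets', value: 'socks'}, ...]
var foldPairs = function(pairs) {
  var output = {};
  for (var i = 0; i < pairs.length; i++) {
    var name = pairs[i].name;
    var value = pairs[i].value;
    if (!Object.prototype.hasOwnProperty.call(output, name)) {
      //first time we've seen this name: store the plain value
      output[name] = value;
    } else if (Array.isArray(output[name])) {
      //third or later repeat: add to the existing array
      output[name].push(value);
    } else {
      //second time: convert the single value into an array
      output[name] = [output[name], value];
    }
  }
  return output;
};

//sample data standing in for a real form
var sample = [
  {name: 'comments', value: 'Hello'},
  {name: 'pets', value: 'socks'},
  {name: 'pets', value: 'bo'}
];
var folded = foldPairs(sample);
//folded.comments is 'Hello', and folded.pets is ['socks', 'bo']
```

In the browser, something like foldPairs($('form').serializeArray()) would feed it real data. serializeArray() only reports checkboxes and radio buttons that are checked, and it skips elements that have no name attribute.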

Regular Expressions

In addition to the technical quirks dating back to the earliest HTML, forms offer us another challenge: human beings. A form input is an invitation to chaos. It's hard to say what people will type into it, but it's almost guaranteed that it won't be what you expect, from the overlooked (someone types their phone number with dots, and you expect dashes) to the incorrect (the user types their home city and state into a field that's only meant for the city) to the malicious (hackers who attempt to feed dangerous code into your form in the hopes of corrupting or accessing your server's database). How do we deal with this chaos?

The answer, especially when it comes to good web security, is an entire book all on its own. But for simple input validation, we can use regular expressions (or "regexes" for short) to make sure that the format of the input matches what we expect, and let users know if they need to change it before they submit the form. These are not only helpful in JavaScript, but they're also common in almost all other programming languages, which means writing them is a versatile skill to have.

Regular expressions are a kind of language for wildcards in text. You may have used wildcard characters before, say on a command line, where you might type dir *.txt to find any file that ends in ".txt". These wildcards are pretty crude, however: an asterisk stands for any character, repeated any number of times. If we're looking for files that start with one of three letters, and are fewer than five letters long, an asterisk just isn't going to suffice. We need something with a little more granularity in its search pattern.

Let's start by capturing a specific phrase. The following code creates a regex that looks for the letters "java", and executes it on a sample string. Regular expressions in JavaScript are objects with their own literal syntax: instead of being quoted with apostrophes or double-quotes the way strings are, they're wrapped in slashes.

    var java = /java/; //here's our regex
    var sample = "Let's learn some javascript";
    //String.search() returns where the pattern was matched, i.e. index 17
    var index = sample.search(java);
    //String.match() returns the matching substring in an array
    //(only the first match, unless the regex has the g flag)
    var matches = sample.match(java);
    //The test() method on the regex just says whether the pattern was found.
    var found = java.test(sample); //true!

If we want to match any one character, we can use a dot, which will stand in for any character (except a line break).

    var bax = /ba./;
    bax.test('bat'); //true
    bax.test('bar'); //true
    bax.test('cat'); //false
    bax.test('baseball'); //true, since both "bas" and "bal" fit the pattern

We can match repeated characters, including the dot wildcard, by using metacharacters. These include *, which matches the previous character 0 or more times, and +, which matches the previous character 1 or more times. We can also ask for a specific number of repeats by putting a number, or a comma-separated pair of numbers, inside curly braces, like so:

    /bal+/; // Matches "balk" and "ballet", but not "bard"
    //asterisks can make a character optional
    /bar*/; // Matches "barrel" and "baby", since the "r" is required 0+ times
    //a single number between braces requires exactly that many repeats
    /bat{2}/; // Matches "battle" and "batten" but not "batch"
    //(it would still find "batt" inside "battt", because nothing in the
    //pattern says what has to come next)
    //comma-separated numbers allow a range of repeat values; don't put a
    //space after the comma, or the braces will be treated as literal text
    /ban{1,2}/; // Matches "bane" and "banner"
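To see how the repeat counts differ, it can help to run several patterns over the same list of words. The following is a small sketch with made-up patterns and sample words: each regex looks for an "l", some number of "o"s, and another "l", and the hypothetical matching() helper reports which words pass.

```javascript
//each pattern looks for "l", some number of "o"s, then another "l"
var zeroOrMore = /lo*l/;
var oneOrMore = /lo+l/;
var exactlyTwo = /lo{2}l/;
var twoOrThree = /lo{2,3}l/;

var words = ['ll', 'lol', 'lool', 'loool', 'looool'];

//matching() is a hypothetical helper that lists the words a regex accepts
var matching = function(regex) {
  return words.filter(function(word) {
    return regex.test(word);
  });
};

var starWords = matching(zeroOrMore);  //all five, since zero "o"s is allowed
var plusWords = matching(oneOrMore);   //everything except "ll"
var twoWords = matching(exactlyTwo);   //only "lool"
var rangeWords = matching(twoOrThree); //"lool" and "loool"
```

Because the "o"s are fenced in by an "l" on each side, the counts here are strict: a word with four "o"s can't satisfy {2} or {2,3} by matching only part of the run.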

In addition to repeating single characters, we can put parts of our regular expression in parentheses to combine them into a group that's treated as a single unit for repetition purposes.

    /ba(na)+na/; // Matches banana, banananana, etc.
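The difference between repeating a group and repeating a single character is easiest to see side by side. This sketch uses two made-up patterns: one repeats the whole group "ha", and the other repeats only the letter "a". It also shows that parentheses capture what they matched, which String.match() reports after the full match.

```javascript
//with parentheses, + repeats the whole "ha" group
var laugh = /(ha)+!/;
//without them, + only repeats the "a"
var stretched = /ha+!/;

var groupOnRepeat = laugh.test('hahaha!');   //true: "ha" three times
var groupOnStretch = laugh.test('haaa!');    //false: "aaa" isn't a repeat of "ha"
var charOnStretch = stretched.test('haaa!'); //true: "h", then three "a"s

//match() returns the full match first, then what each group captured.
//A repeated group remembers only its last repetition.
var found = 'hahaha!'.match(laugh);
var whole = found[0];     //"hahaha!"
var lastGroup = found[1]; //"ha"
```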

Just as you can be very specific about the number of repeated characters, you can also specify subsets of the alphabet that are more restrictive than the dot character's "everything matches" rule. The following table lays out a few helpful metacharacters for matching only certain letters.

Sequence   Notes
\d         Matches any digit. Capitalizing it, as \D, matches anything that's not a digit.
\w         Matches any "word" character, meaning all English letters, digits, and underscores, but not spaces or punctuation. Capitalizing this pattern (\W) matches anything that's not a word character. It's not a coincidence that \w matches characters that are valid in variable names in many languages, including JavaScript.
\s         Matches any whitespace, including spaces, tabs, and line breaks.
[abc]      Putting characters between square brackets means that it will match any one of the characters inside. In this case, it will match "a", "b", or "c".
[a-m]      You can also express character ranges. This set matches the first half of the lower-case alphabet. To capture the same characters as \w, for example, we would write [a-zA-Z0-9_].
[^abc]     Negated character set, which will match anything that's not one of the characters between the brackets.

If you want to use any of the metacharacters in your actual pattern (for example, if you wanted to search for "("), you can escape them using backslashes, just like you can escape quotes in strings. You don't have to escape most metacharacters inside of square brackets, however; the exceptions are the backslash, the closing bracket, and (depending on where they sit) the caret and the hyphen. These are only a few of the metacharacters available to you.
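Here is a short sketch that exercises the character classes and escaping rules described above. The sample strings are made up, and the results are collected in a checks object only so they're easy to inspect in a console.

```javascript
var checks = {
  //\d needs a digit somewhere in the string
  threeDigits: /\d\d\d/.test('Room 101'),      //true
  //\D needs at least one character that is not a digit
  nonDigit: /\D/.test('12345'),                //false
  //\s finds the space between the words
  whitespace: /\s/.test('two words'),          //true
  noWhitespace: /\s/.test('no_spaces'),        //false
  //[a-m] accepts "k" but not "x"
  firstHalf: /[a-m]/.test('k'),                //true
  secondHalf: /[a-m]/.test('x'),               //false
  //[^abc] needs a character other than a, b, or c
  onlyAbc: /[^abc]/.test('cab'),               //false
  somethingElse: /[^abc]/.test('cabs'),        //true
  //an unescaped dot is a wildcard; an escaped dot is a literal period
  wildcardDot: /a.c/.test('abc'),              //true
  literalDot: /a\.c/.test('abc'),              //false
  //a parenthesis can be escaped, or placed in brackets without escaping
  escapedParen: /\(/.test('f(x)'),             //true
  bracketedParen: /[(]/.test('f(x)')           //true
};
```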

Writing a regular expression between slashes creates a regex literal, but we can also generate regular expressions from strings by calling the RegExp() function. If you create them this way, you don't have to use the opening and closing slashes, but you do have to double-escape the backslashes that precede many metacharacters, because otherwise they'll be treated as escaped string characters.

    var fromString = RegExp('\\d+'); //match 1 or more digits
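The double-escaping rule is easy to get wrong, so here is a brief sketch of what happens with and without the extra backslash. It also includes a hypothetical escapeForRegex() helper (not a built-in function) for the opposite problem: turning arbitrary text, which may contain metacharacters, into a pattern that matches that text literally.

```javascript
//with the doubled backslash, the regex receives \d+
var escaped = RegExp('\\d+');
//with a single backslash, the string is just "d+", so the regex
//looks for one or more letter "d"s instead of digits
var unescaped = RegExp('\d+');

var escapedFindsDigits = escaped.test('abc123');     //true
var unescapedFindsDigits = unescaped.test('abc123'); //false
var unescapedFindsDs = unescaped.test('odd');        //true

//escapeForRegex is a hypothetical helper: it puts a backslash in front
//of every character that has a special meaning in a regex
var escapeForRegex = function(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
};

var price = RegExp(escapeForRegex('$5.00'));
var exactPrice = price.test('It costs $5.00'); //true
var nearMiss = price.test('It costs $5x00');   //false: the dot is literal now
```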

If you're interested in learning more, you should check out the Mozilla Developer Network documentation. Now, let's put them to use.

Pattern Matching

The HTML5 pattern attribute lets authors specify a regular expression (without the outside slashes) that an input's value must match before the browser will let you submit a form. Of course, pattern is fairly new, and many older browsers do not support it. In this exercise, we'll write a polyfill that simulates the feature using JavaScript when support for pattern isn't available. Polyfills are a valuable part of the modern JavaScript landscape, since they let us "patch" older browsers to have the newest features, albeit sometimes in a slower or degraded form.

First, we need to write our regular expression patterns to attach to our elements. Regular expressions can be difficult to write, but there are lots of tools that let us test them in an interactive fashion: you type in a regular expression, supply some sample text, and the matches are highlighted. Using the sample text below, try to write two patterns: one that matches all the phone numbers in the sample, and one that matches all the e-mail addresses.

    Phone numbers:
    (555) 012-1234
    555.012.1234
    5550121234

    Email:
    a@b.com
    a@b.co.uk
    a.b@c.com
    a.b+c@d.com

Writing a regular expression that can match all the variations is not easy, is it? That's why, for the purposes of form inputs, it is probably best to be forgiving. An alternate tactic is to concentrate less on matching all the possible patterns that someone could type, and more on ruling out inputs that we know are absolutely not allowed. For our phone and e-mail form inputs, we'll use these two regular expressions, which I've annotated below.

    /*
    Telephone numbers: since these can be written in so many ways, we're just
    going to require two simple rules: no letters, and at least 7 characters.
    */
    var phoneRegex = /^[^a-zA-Z]{7,}$/;

    //Let's break that down into parts:
    //By starting our regex with ^ and ending with $, we require it to match the
    //whole line: no partials
    ^
    [^       //a negated set: match anything that's not inside
    a-zA-Z   //a range covering all the letters, uppercase and lower
    ]
    {        //require a certain number of the previous, not-letter characters
    7,       //we require at least seven, but we leave out the upper limit
    }
    $

    /*
    E-mail addresses: e-mail is fantastically complicated, and regular
    expressions that try to follow the standard exactly are monstrous. However,
    the real test of a valid address is whether or not mail can be sent to it.
    So we'll check for just a few requirements, then leave it up to our server
    to actually verify.
    */
    var emailRegex = /^.+@.+\..+$/;

    //Again, we'll break this down:
    //Start and end with ^ and $ to require a whole-line match
    ^
    .+   //Match anything before the @, as long as there's at least one character
    @    //Literally, match the @
    .+   //Following the @, there's the domain, which could be almost anything.
    \.   //However, it must have at least one dot (escaped here, not a wildcard)
    .+   //And the dot must be followed by at least one of anything
    $
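As a quick check, the two patterns can be run against the sample text from the exercise, plus a few inputs that ought to fail. The extra inputs below are made up for illustration; the last two checks show just how forgiving these patterns are.

```javascript
var phoneRegex = /^[^a-zA-Z]{7,}$/;
var emailRegex = /^.+@.+\..+$/;

//the sample inputs from the exercise
var phones = ['(555) 012-1234', '555.012.1234', '5550121234'];
var emails = ['a@b.com', 'a@b.co.uk', 'a.b@c.com', 'a.b+c@d.com'];

var allPhonesPass = phones.every(function(phone) {
  return phoneRegex.test(phone);
}); //true
var allEmailsPass = emails.every(function(email) {
  return emailRegex.test(email);
}); //true

//inputs that break one of our rules
var lettersInPhone = phoneRegex.test('555-CALL'); //false: contains letters
var shortPhone = phoneRegex.test('555012');       //false: only six characters
var noDotInDomain = emailRegex.test('a@b');       //false: no dot after the @
var noAtSign = emailRegex.test('b.com');          //false: no @ at all

//forgiving patterns also let some nonsense through
var sevenSpaces = phoneRegex.test('       ');      //true: no letters, 7 characters
var punctuation = emailRegex.test('!!!@???.###');  //true: right shape, wrong content
```

That leniency is the trade-off of ruling out obviously bad input instead of describing every valid one, and it is why the server still has to verify what it receives.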

Using these two regular expressions, we're going to write a function that provides form validation for older browsers that don't support the HTML5 pattern attribute. Adding features this way is called a polyfill if it works the same way as the standard feature, and a shim if it provides a new API (you can think of jQuery as a shim for the low-level page functions). Normally, you would want to test for the existence of the feature, similar to the way Modernizr does, before adding your functionality. In this case, since the browser's built-in UI for pattern tends to be poor, we'll just run our code no matter what.
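For reference, a feature test for pattern can be as small as checking whether an input element has a pattern property. The sketch below wraps that check in a hypothetical supportsPattern() function that takes the element as an argument, so it can be tried with stand-in objects outside of a browser. It follows the general idea behind feature-detection libraries, but it is not copied from Modernizr.

```javascript
//supportsPattern is a hypothetical helper: it reports whether the
//element it is given exposes a pattern property
var supportsPattern = function(input) {
  return 'pattern' in input;
};

//plain objects standing in for input elements from two kinds of browser
var modernInput = {value: '', pattern: ''};
var olderInput = {value: ''};

var modernResult = supportsPattern(modernInput); //true
var olderResult = supportsPattern(olderInput);   //false
```

In a real page, the argument would be a freshly created element, as in supportsPattern(document.createElement('input')), and the polyfill would only be installed when the result is false.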

From a high level, here's our plan. We want to blend in with HTML features, so we'll use the same pattern attribute to store our regular expressions. When the submit button is clicked, we'll loop through the input elements with pattern attributes, creating a regular expression from each pattern and checking it against that input's value. If an input fails, we'll cancel the click event on the submit button and put up a helpful error message for our user. If they all check out, the event goes through and the form submits without a problem.
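The checking step of that plan can be sketched without touching the page at all. The hypothetical findInvalid() function below takes plain objects holding each input's name, pattern, and value (in a real page these would be read from the elements with attr() and val()) and returns the names of the inputs that fail. It assumes two behaviors borrowed from the native feature: the pattern has to match the entire value, and empty values are left alone.

```javascript
//findInvalid is a hypothetical helper. Each field is a plain object
//with the name, pattern, and value read from one input element.
var findInvalid = function(fields) {
  var failures = [];
  for (var i = 0; i < fields.length; i++) {
    var field = fields[i];
    //skip inputs with no pattern, and let empty values pass
    if (!field.pattern || field.value === '') {
      continue;
    }
    //anchor the pattern so it must match the whole value; (?: ) is a
    //group that keeps any | inside the pattern within the anchors
    var regex = new RegExp('^(?:' + field.pattern + ')$');
    if (!regex.test(field.value)) {
      failures.push(field.name);
    }
  }
  return failures;
};

//sample data standing in for a real form
var fields = [
  {name: 'phone', pattern: '[^a-zA-Z]{7,}', value: '555.012.1234'},
  {name: 'email', pattern: '.+@.+\\..+', value: 'not an address'},
  {name: 'nickname', pattern: '\\w+', value: ''},
  {name: 'comments', pattern: null, value: 'anything goes'}
];
var failed = findInvalid(fields); //['email']
```

Wired into a page, the event handler would build the fields array from the inputs that have a pattern attribute, call the function, and cancel the event with preventDefault() whenever the returned list isn't empty, using the names to decide where to show error messages.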

Exercises and Practice Questions

  1. Try to write a regular expression that will check addresses. It may be easier if you break the address into several form inputs.
  2. Along with the required and pattern attributes, HTML5 introduced a number of new input tag types. What are these new types, and which ones are available in the browsers you target?
  3. Forms can be validated using JavaScript, but they can also be created in response to user actions. Using jQuery, create a form where a drop-down menu for "city" changes when the user picks a state.