Your company develops a variety of web-based products. Many tools for transforming HTML (the language of web pages) require those pages to be written in a “disciplined” dialect of HTML called XHTML. One of the key factors distinguishing XHTML from ordinary HTML is that all opening and closing tags in XHTML must be properly balanced.

You have been asked to write a validator that can examine an XHTML file for this balancing.

HTML is a “mark-up” language in which ordinary text is interspersed with “tags” that introduce formatting properties. Tags are written inside < > brackets. There are three forms of tags:

  • opening tags: ‘<whatever…>’ The ... may be empty or may contain one or more attributes, separated from the tag name by at least one whitespace character. Attributes will be ignored in this assignment.
  • closing tags: ‘</whatever…>’ The ... may be empty or whitespace characters.
  • singleton tags: ‘<whatever…/>’ The ... may be empty or may contain one or more attributes, separated from the tag name by at least one whitespace character.
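The three forms can be told apart by looking at the characters next to the brackets. The following is a minimal sketch, not part of the provided code: the names TagKind, classifyTag, and tagName are hypothetical, and it assumes the tag text always begins with '<' and ends with '>'.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical names for illustration; the assignment does not prescribe them.
enum class TagKind { Opening, Closing, Singleton };

// Classify the full text of a tag such as "<p>", "</p>", or "<hr/>".
// Assumes the text starts with '<' and ends with '>'.
TagKind classifyTag(const std::string& tag) {
    if (tag.size() >= 2 && tag[1] == '/') {
        return TagKind::Closing;            // '/' right after '<'
    }
    if (tag.size() >= 3 && tag[tag.size() - 2] == '/') {
        return TagKind::Singleton;          // '/' right before '>'
    }
    return TagKind::Opening;
}

// Extract the tag name: skip '<' and an optional '/', then read up to the
// first whitespace, '/', or '>'. Any attributes are dropped.
std::string tagName(const std::string& tag) {
    if (tag.size() < 2) {
        return "";
    }
    std::size_t i = 1;
    if (tag[i] == '/') {
        ++i;
    }
    const std::size_t start = i;
    while (i < tag.size() &&
           !std::isspace(static_cast<unsigned char>(tag[i])) &&
           tag[i] != '/' && tag[i] != '>') {
        ++i;
    }
    return tag.substr(start, i - start);
}
```

Because attributes are ignored, only the name returned by tagName matters when pairing an opening tag with its closing tag.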

For example, here is a snippet of HTML in which I have highlighted the opening tags, closing tags, and singleton tags:

<p>
For example, here is a snippet of HTML in which
I have highlighted the <span class="high1">opening
tags</span>, <span class="high2">closing
tags</span>, and
<span class="high3">singleton tags</span>:
</p>
<hr/>

Most tags occur in pairs, in which an opening tag <whatever...> is closed by a similar tag with a / at the start of its name: </whatever>. For example, web pages can get bold text this way: <b>bold</b> and italic text this way: <i>italic</i>. Notice that tag names are case sensitive in XHTML, so <b> is not closed by </B>. Pairs of tags may be nested: <i>like <b>this</b></i> to produce text _like **this**_, but may not overlap: <b>like <big>this</b></big>. Consequently, a closing tag must always match the most recently seen opening tag that has not yet been closed. That rule suggests the use of stacks.

Some tags are singletons, not occurring in pairs. These are signaled by a ‘/’ just before the closing >. For example, <hr/> produces a horizontal rule.
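The matching rule can be tried out on bare tag names before dealing with full tag text. The sketch below is only an illustration under a made-up token convention ("b" opens, "/b" closes, "hr/" is a singleton), and the function name properlyNested is hypothetical.

```cpp
#include <stack>
#include <string>
#include <vector>

// Illustrative token convention (an assumption, not the driver's format):
//   "b"   opening tag named b
//   "/b"  closing tag named b
//   "hr/" singleton tag named hr
bool properlyNested(const std::vector<std::string>& tokens) {
    std::stack<std::string> open;  // names of opening tags not yet closed
    for (const std::string& t : tokens) {
        if (t.empty()) {
            continue;
        }
        if (t.front() == '/') {
            // A closing tag must match the most recently opened tag.
            const std::string name = t.substr(1);
            if (open.empty() || open.top() != name) {
                return false;
            }
            open.pop();
        } else if (t.back() == '/') {
            // Singletons neither open nor close anything.
            continue;
        } else {
            open.push(t);
        }
    }
    return open.empty();  // leftover names are unmatched opening tags
}
```

With this convention, the nested example {"i", "b", "/b", "/i"} is accepted, while the overlapping example {"b", "big", "/b", "/big"} is rejected at "/b" because "big" is on top of the stack. The comparison is an exact string comparison, so "b" and "B" do not match.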

  1. Files for this assignment may be found here or, if you are logged in to one of the CS Dept. Linux machines, in ~zeil/Assignments/cs361/stack_xhtml/.
  2. You are given a driver for the XHTML validator. The code you are provided with will read an XHTML file and scan for any tags. It passes each tag it encounters to a Balancer whose job is to make sure that the tags are occurring in properly balanced pairs.
    The driver is designed to take the path to the file to be checked as a command line parameter.
    ./xmlcheck test0.html 
  3. You must supply the Balancer class. This class must use a std::stack as its primary data member. (You may have additional simple data members, bools or ints, but no containers other than the stack.) The two most critical operations the Balancer class must support are
    • tag(string atag): indicates that the driver has encountered a tag. The full text of the tag is passed as the parameter atag. This may be an opening tag, a closing tag, or a singleton tag.
    • status(): returns a single int indicating the status of what has been observed so far:
      • -1 indicates that an error has been detected.
      • 0 indicates that no error has been detected, but that one or more opening tags have yet to be matched.
      • 1 indicates that no error has been detected and there are no opening tags that have yet to be matched against a closing tag.
  4. As in the prior assignments, you will also be provided with unit tests for the class that you are working on (Balancer).
    There is also a pair of test files, test0.html and test1.html, that you can use to test the full xmlcheck program. test0.html should be reported as OK, but test1.html has mismatched tags. As always, you should conduct additional systems testing with your own test cases.
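One possible shape for the Balancer is sketched below. It is a sketch under stated assumptions, not the reference solution: the member names open_ and error_ and the helper nameOf are hypothetical, malformed tag text is treated as an error, and an error is assumed to be permanent once detected. The provided unit tests may expect additional operations that are not described here.

```cpp
#include <cctype>
#include <cstddef>
#include <stack>
#include <string>

class Balancer {
public:
    // Record one tag; atag is the full text, e.g. "<p>", "</p>", "<hr/>".
    void tag(std::string atag) {
        if (error_) {
            return;  // assumption: an error, once seen, is never cleared
        }
        if (atag.size() < 3 || atag.front() != '<' || atag.back() != '>') {
            error_ = true;  // assumption: malformed tag text counts as an error
            return;
        }
        const bool closing = (atag[1] == '/');
        const bool singleton = !closing && atag[atag.size() - 2] == '/';
        if (singleton) {
            return;  // singletons do not affect the balance
        }
        const std::string name = nameOf(atag, closing ? 2 : 1);
        if (!closing) {
            open_.push(name);
        } else if (open_.empty() || open_.top() != name) {
            error_ = true;  // closing tag does not match the latest opening tag
        } else {
            open_.pop();
        }
    }

    // -1: error detected; 0: no error, unmatched opening tags remain;
    //  1: no error and every opening tag has been matched.
    int status() const {
        if (error_) {
            return -1;
        }
        return open_.empty() ? 1 : 0;
    }

private:
    // Read the tag name starting at index start, stopping at whitespace or '>'.
    static std::string nameOf(const std::string& t, std::size_t start) {
        std::size_t end = start;
        while (end < t.size() &&
               !std::isspace(static_cast<unsigned char>(t[end])) &&
               t[end] != '>') {
            ++end;
        }
        return t.substr(start, end - start);
    }

    std::stack<std::string> open_;  // names of opening tags not yet closed
    bool error_ = false;
};
```

The only container is the std::stack, with a single bool alongside it, which stays within the restriction in item 3. Feeding it "<p>", "<hr/>", "</p>" in order moves status() from 1 to 0, leaves it at 0 for the singleton, and returns it to 1 after the closing tag.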