Ticket #1265 (closed defect: fixed)

Opened 5 years ago

Last modified 5 years ago

[TransformInnerHTML] Patch for correct handling of definition lists (dl, dt, dd)

Reported by: guest Owned by:
Priority: normal Milestone: 0.96
Component: HTML Output Version: trunk
Severity: normal Keywords: dl, dt, dd, definition lists
Cc:

Description

Currently the TransformInnerHTML module does not handle <dl> <dt> and <dd> tags correctly for IE. Basically, IE does not close <dt> and <dd> tags in a definition list, with the exception of the last <dd> in a list. Firefox has no problem with these tags. The situation with IE mirrors exactly the bug it has with <li> tags in unordered and ordered lists. TransformInnerHTML has a workaround for <li> tags (see the comment "//IE drops all </li> tags in a list except the last one" in the current TransformInnerHTML.js).

I have produced a diff patch (attached) against TransformInnerHTML.js which applies the <li> workaround also to <dt> and <dd> tags, by tweaking the code for the original workaround. The patch adds <dl><dt><dd> tags in the appropriate places of the regexes 14, 15, 16, and 17 in that file, and adds these tags to the series of regexes that tidy up list items. Definition lists are now properly indented too. I've also added an extra tidy-up that cleans up the redundant white space which Xinha adds before </li></dt></dd> in list items that contain hyperlinks (and possibly other tags). Tested in IE7 and Firefox 2. I can explain the regex tweaks in detail if anyone wants me to. The boxes below show the main change to the tidyup routine:

Original code:
//IE drops  all </li> tags in a list except the last one
if(Xinha.is_ie) {
	html = html.replace(/<li( [^>]*)?>/g,'</li><li$1>').
	replace(/(<(ul|ol)[^>]*>)[\s\n]*<\/li>/g, '$1').
	replace(/<\/li>([\s\n]*<\/li>)+/g, '<\/li>');
}

New code:

//IE drops  all </li>,</dt>,</dd> tags in a list except the last one
if(Xinha.is_ie) {
	html = html.replace(/<(li|dd|dt)( [^>]*)?>/g,'</$1><$1$2>').
	replace(/(<[uod]l[^>]*>[\s\S]*?)<\/(li|dd|dt)>/g, '$1').
	replace(/\s*<\/(li|dd|dt)>(\s*<\/(li|dd|dt)>)+/g, '</$1>').
	replace(/(<dt[\s>][^<]*)(<\/d[dt]>)+/g, '$1</dt>');
}

The extra cleanup regex comes before "if(outputroot)" and is as follows:

//Cleanup redundant whitespace before </li></dd></dt> in IE and Mozilla
html = html.replace(/\s*(<\/(li|dd|dt)>)/g, '$1');

--GeoffreyK

Attachments

Definition_List_Patch.patch (0.7 kB) - added by guest 5 years ago.
Revised patch to allow for html inside <dt> tags

Change History

Changed 5 years ago by guest

I forgot to add that this patch also prevents comments from being stripped after <ul>, <ol> and <dl> tags. The old regexes stripped anything between <ul>...<li>, which is wrong. The new code allows for:

  <ul><!-- this is marked up in reverse order -->
    <li class="last">...</li>
    <li>...</li>
  </ul>

And for:

      <dl class="links"><!--Insert external links here -->
        <dt>Quick links</dt>
        <dd><a href="/">...</a></dd>
        <dd><a href="/">...</a></dd>
      </dl>

--GeoffreyK

Changed 5 years ago by ray

  • owner deleted
  • component changed from Xinha Core to HTML Output

Changed 5 years ago by ray

  • status changed from new to closed
  • resolution set to fixed
  • summary changed from Patch for correct handling of definition lists (dl, dt, dd) to [TransformInnerHTML] Patch for correct handling of definition lists (dl, dt, dd)

rev [1025]: trusting Geoffrey, aplied patch untestedly

Changed 5 years ago by guest

Revised patch to allow for html inside <dt> tags

Changed 5 years ago by guest

  • status changed from closed to reopened
  • resolution deleted

rev [1025]: trusting Geoffrey, aplied patch untestedly

Thank you, Ray. HOWEVER, after some further testing of my own, I've found a slightly unusual use of <dt> tags which contain further html tags within them. So we have to do a very minor tweak one of the regexes above. I've updated the patch (which now takes into account rev 1025).

Detail:
The Regex:

replace(/(<dt[\s>][^<]*)(<\/d[dt]>)+/g, '$1</dt>');

in TransformInnerHTML needs to be replace with:

replace(/(<dt[\s>][\s\S]*?)(<\/d[dt]>)+/g, '$1</dt>');

--Geoffrey

Changed 5 years ago by ray

  • status changed from reopened to closed
  • resolution set to fixed

rev [1034]: OK, thanks

Note: See TracTickets for help on using tickets.