Opened 10 years ago

Closed 9 years ago

Last modified 9 years ago

#287 closed enhancement (worksforme)

Making Xinha leave HTML for Flash and scripting intact (object, script, and noscript)

Reported by: mharrisonline Owned by: gogo
Priority: normal Milestone:
Component: Xinha Core Version:
Severity: normal Keywords:
Cc:

Description

When HTMLArea 3 was adapted by the Jones Standard user group to be a WYSIWYG editor for content in that course management system, it had to be changed to prevent IE from corrupting existing HTML that contained Flash movies, and also (as far as possible) to avoid removing scripting. This modification has been applied to Xinha; however, Xinha's code changes rapidly, and the modification is adapted to the nightly build from perhaps a month ago. It stops Xinha from removing the nested <embed> code for the Flash movie from the <object>.

Although it is possible to display Flash with just <object>, doing so prevents the movie from streaming, doesn't alert users if they need a newer plugin, and doesn't work for everybody. Placeholders would not work, since they would not already be in legacy content. This modification simply makes regular HTML for Flash work without changes.

Also, in IE Xinha will remove the contents of a <script> and corrupt <noscript> as well. This modification allows it to leave <script> and <noscript> alone. Probably not everyone will want JavaScript to survive being opened in Xinha, but here is how to do it. It doesn't prevent onLoad events from being stripped out of the body tag, however.

I hope this will be useful to other members of the Xinha community, or at least save someone the pain of trying to figure out how to make Flash work in Xinha.

Mike Harris
Jones International University


Within the function HTMLArea.Config,

this.fullPage = false;

was changed to

this.fullPage = true;

This might work with just the full page plugin instead, and perhaps this modification could be adapted to work with the default setting. Content in Jones Standard consists of actual HTML files, but this obviously won't work for everyone. Xinha in IE with fullPage already preserves JavaScript in the head of the document; this modification allows it to preserve JavaScript in the body as well.
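
As a rough illustration of the alternative to editing the core file, the flag can be set on a per-editor configuration object instead of changing the default. The sketch below uses a stand-in Config constructor (hypothetical, mirroring only the fullPage default quoted above) so that it runs on its own; in a real page the object would presumably come from new HTMLArea.Config().

```javascript
// Stand-in for HTMLArea.Config, reduced to the one default discussed here.
// Assumption: the real constructor sets many more options.
function Config() {
  this.fullPage = false; // default quoted in the ticket
}

// Build a configuration for one editor and switch it to full-page mode,
// leaving the shared default untouched.
function makeFullPageConfig() {
  var config = new Config();
  config.fullPage = true;
  return config;
}

var fullPageConfig = makeFullPageConfig();
var defaultConfig = new Config();
```

Setting the flag on an instance keeps the change local to editors that need whole-document editing, which may matter on sites where most content is an HTML fragment.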


Within the function HTMLArea.prototype.forceRedraw,

 this._doc.body.innerHTML = this.getInnerHTML();


was uncommented, otherwise for some reason the line above it

  this._doc.body.style.visibility = "visible";

would sometimes appear in the content.


HTMLArea._blockTags had additional tags added:

HTMLArea._blockTags = " body form textarea fieldset ul ol dl li div embed " +
"p h1 h2 h3 h4 h5 h6 quote pre button table thead object script " +
"tbody tfoot tr td th iframe address noscript blockquote ";


HTMLArea._closingTags had more tags added:

HTMLArea._closingTags = " head body script form style div span tr td th tbody table em strong button object b i code cite dfn abbr acronym font a title ";
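
Both lists are space-delimited strings, so every tag needs a space on each side, including where string fragments are concatenated. The sketch below assumes that membership is tested by searching for " tag " in the string (an assumption about how the lists are consumed; isInTagList is a hypothetical helper, not a Xinha function) and shows how fragments without trailing spaces fuse neighbouring tag names.

```javascript
// Assumption: a tag counts as listed when " tag " occurs in the string.
function isInTagList(tagList, tagName) {
  return tagList.indexOf(" " + tagName.toLowerCase() + " ") !== -1;
}

// Fragments that each end with a space concatenate safely.
var goodList = " div embed " + "p h1 object script " + "tbody noscript ";

// Without the trailing spaces, "embed" + "p" becomes "embedp"
// and "script" + "tbody" becomes "scripttbody".
var fusedList = " div embed" + "p h1 object script" + "tbody noscript ";

var embedInGood = isInTagList(goodList, "EMBED");
var embedInFused = isInTagList(fusedList, "embed");
var scriptInFused = isInTagList(fusedList, "script");
var noscriptInFused = isInTagList(fusedList, "noscript");
```

Under that assumption, a fused list silently drops the very tags that were meant to be added, without raising any error.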


There were many changes in the function HTMLArea.getHTMLWrapper, but this function seems to have been greatly modified recently. So, here is the entire now one month(!) obsolete HTMLArea.getHTMLWrapper function with the new code commented as !!. It's a pretty bloated and redundant hack, but it worked great for our purposes:

HTMLArea.getHTMLWrapper = function(root, outputRoot, editor) {
  var html = "";
  switch (root.nodeType) {
    case 10:// Node.DOCUMENT_TYPE_NODE
    case 6: // Node.ENTITY_NODE
    case 12:// Node.NOTATION_NODE
      // this all are for the document type, probably not necessary
      break;

    case 2: // Node.ATTRIBUTE_NODE
   
      break;

    case 4: // Node.CDATA_SECTION_NODE
      // Mozilla seems to convert CDATA into a comment when going into wysiwyg mode,
      //  don't know about IE
      html += '<![CDATA[' + root.data + ']]>';
      break;

    case 5: // Node.ENTITY_REFERENCE_NODE
      html += '&' + root.nodeValue + ';';
      break;

    case 7: // Node.PROCESSING_INSTRUCTION_NODE
      // PI's don't seem to survive going into the wysiwyg mode, (at least in moz)
      // so this is purely academic
      html += '<?' + root.target + ' ' + root.data + ' ?>';
      break;


      case 1: // Node.ELEMENT_NODE
      case 11: // Node.DOCUMENT_FRAGMENT_NODE
      case 9: // Node.DOCUMENT_NODE
      {
    var closed;
    var i;
    var root_tag = (root.nodeType == 1) ? root.tagName.toLowerCase() : '';
    if (root_tag == 'br' && !root.nextSibling)
      break;
    if (outputRoot)
      outputRoot = !(editor.config.htmlRemoveTags && editor.config.htmlRemoveTags.test(root_tag));
    if (HTMLArea.is_ie && root_tag == "head") {
      if (outputRoot)
        html += "<head>";
      // lowercasize
      var save_multiline = RegExp.multiline;
      RegExp.multiline = true;
      var txt = root.innerHTML.replace(HTMLArea.RE_tagName, function(str, p1, p2) {
        return p1 + p2.toLowerCase();
      });
      RegExp.multiline = save_multiline;
      html += txt;
      if (outputRoot)
        html += "</head>";
      break;
	  
// !!Without this code the beginning of script tags are sometimes corrupted:

			} else if (HTMLArea.is_ie && root_tag == "body") {
			if (outputRoot)
				html += "<body";
			closed = (!(root.hasChildNodes() || HTMLArea.needsClosingTag(root)));
			html = "<" + root.tagName.toLowerCase();
			var attrs = root.attributes;
			for (i = 0; i < attrs.length; ++i) {
				var a = attrs.item(i);
				if (!a.specified) {
					continue;
				}
									

				var name = a.nodeName.toLowerCase();
				if (/_moz|contenteditable|_msh/.test(name)) {
					// avoid certain attributes
					continue;
				}
				var value;
				if (name != "style") {
					// IE5.5 reports 25 when cellSpacing is
					// 1; other values might be doomed too.
					// For this reason we extract the
					// values directly from the root node.
					//
					// Using Gecko the values of href and src are converted to absolute links
					// unless we get them using nodeValue()
					if (typeof root[a.nodeName] != "undefined" && name != "href" && name != "src" && name !="onclick" && name !="onmouseover" && name !="onmouseout" && name !="onmousedown") {
						value = root[a.nodeName];
					} else {
						value = a.nodeValue;
						// IE seems not willing to return the original values - it converts to absolute
						// links using a.nodeValue, a.value, a.stringValue, root.getAttribute("href")
						// So we have to strip the baseurl manually -/
						if (HTMLArea.is_ie && (name == "href" || name == "src")) {
							value = editor.stripBaseURL(value);
						}
					}
				} else { // IE fails to put style in attributes list
					// FIXME: cssText reported by IE is UPPERCASE
					value = root.style.cssText;
				}
				if (/(_moz|^$)/.test(value)) {
					// Mozilla reports some special tags
					// here; we don't need them.
					continue;
				}
	 html += " " + name + '="' + HTMLArea.htmlEncode(value) + '"';
      }
      if (html != "") {
        html += closed ? " />" : ">";}
		
// !! htmlarea formats HTML in IE, formatting gets lost otherwise, code doesn't work in Mozilla though 
        if (!HTMLArea.is_ie) { html += "";} else {html += "\r";}
		

// end of new code
			// lowercasize
			var save_multiline = RegExp.multiline;
			RegExp.multiline = true;
			var txt = root.innerHTML.replace(HTMLArea.RE_tagName, function(str, p1, p2) {
				return p1 + p2.toLowerCase();
			});
			RegExp.multiline = save_multiline;
			
			html += txt;
			if (outputRoot)
				html += "</body>";
			break;
	  
// !! Teach HTMLArea to not empty code between script tags and remove script attributes	

					} else if (HTMLArea.is_ie && root_tag == "script") {
			if (outputRoot)
				html += "<script";
			closed = (!(root.hasChildNodes() || HTMLArea.needsClosingTag(root)));
			html = "<" + root.tagName.toLowerCase();
			var attrs = root.attributes;
			for (i = 0; i < attrs.length; ++i) {
				var a = attrs.item(i);
				if (!a.specified) {
					continue;
				}
									

				var name = a.nodeName.toLowerCase();
				if (/_moz|contenteditable|_msh/.test(name)) {
					// avoid certain attributes
					continue;
				}
				var value;
				if (name != "style") {
					// IE5.5 reports 25 when cellSpacing is
					// 1; other values might be doomed too.
					// For this reason we extract the
					// values directly from the root node.
					//
					// Using Gecko the values of href and src are converted to absolute links
					// unless we get them using nodeValue()
					if (typeof root[a.nodeName] != "undefined" && name != "href" && name != "src" && name !="onclick" && name !="onmouseover" && name !="onmouseout" && name !="onmousedown") {
						value = root[a.nodeName];
					} else {
						value = a.nodeValue;
						// IE seems not willing to return the original values - it converts to absolute
						// links using a.nodeValue, a.value, a.stringValue, root.getAttribute("href")
						// So we have to strip the baseurl manually -/
						if (HTMLArea.is_ie && (name == "href" || name == "src")) {
							value = editor.stripBaseURL(value);
						}
					}
				} else { // IE fails to put style in attributes list
					// FIXME: cssText reported by IE is UPPERCASE
					value = root.style.cssText;
				}
				if (/(_moz|^$)/.test(value)) {
					// Mozilla reports some special tags
					// here; we don't need them.
					continue;
				}
					html = HTMLArea.htmlEncode(html);
				html += " " + name + '="' + value + '"';
			}
// close standalone tags like <br> (<br />)
		html += closed ? " />" : ">"; 
			// lowercasize
			var save_multiline = RegExp.multiline;
			RegExp.multiline = true;
			var txt = root.innerHTML.replace(HTMLArea.RE_tagName, function(str, p1, p2) {
				return p1 + p2.toLowerCase();
			});
			RegExp.multiline = save_multiline;
			html += txt;
			if (outputRoot)
				html += "</script>";
			break;
// !!  Teach HTMLArea to not remove noscript nodes				
					} else if (HTMLArea.is_ie && root_tag == "noscript") {
			if (outputRoot)
				html += "<noscript>";
			// lowercasize
			var save_multiline = RegExp.multiline;
			RegExp.multiline = true;
			var txt = root.innerHTML.replace(HTMLArea.RE_tagName, function(str, p1, p2) {
				return p1 + p2.toLowerCase();
			});
			RegExp.multiline = save_multiline;
			html += txt;
			if (outputRoot)
				html += "</noscript>";
			break;
			
			
// !!  Teach HTMLArea to not corrupt object 
// parameters and expel nested embed nodes, thus preventing Flash code
// from being destroyed

		} else if (HTMLArea.is_ie && root_tag == "object") {
			if (outputRoot)
				html += "<object";
				
			closed = (!(root.hasChildNodes() || HTMLArea.needsClosingTag(root)));
			html = "<" + root.tagName.toLowerCase();
			var attrs = root.attributes;
			for (i = 0; i < attrs.length; ++i) {
				var a = attrs.item(i);
				if (!a.specified) {
					continue;
				}

				var name = a.nodeName.toLowerCase();
				if (/_moz|contenteditable|_msh/.test(name)) {
					// avoid certain attributes
					continue;
				}
				var value;
				if (name != "style") {
					// IE5.5 reports 25 when cellSpacing is
					// 1; other values might be doomed too.
					// For this reason we extract the
					// values directly from the root node.
					//
					// Using Gecko the values of href and src are converted to absolute links
					// unless we get them using nodeValue()
					if (typeof root[a.nodeName] != "undefined" && name != "href" && name != "src" && name !="onclick" && name !="onmouseover" && name !="onmouseout" && name !="onmousedown") {
						value = root[a.nodeName];
					} else {
						value = a.nodeValue;
						// IE seems not willing to return the original values - it converts to absolute
						// links using a.nodeValue, a.value, a.stringValue, root.getAttribute("href")
						// So we have to strip the baseurl manually -/
						if (HTMLArea.is_ie && (name == "href" || name == "src")) {
							value = editor.stripBaseURL(value);
						}
					}
				} else { // IE fails to put style in attributes list
					// FIXME: cssText reported by IE is UPPERCASE
					value = root.style.cssText;
				}
				if (/(_moz|^$)/.test(value)) {
					// Mozilla reports some special tags
					// here; we don't need them.
					continue;
				}
				
				html += " " + name + '="' + value + '"';
			}
// close standalone tags like <br> (<br />)
		html += closed ? " />" : ">"; 
			// lowercasize
			var save_multiline = RegExp.multiline;
			RegExp.multiline = true;
			var txt = root.innerHTML.replace(HTMLArea.RE_tagName, function(str, p1, p2) {
				return p1 + p2.toLowerCase();
			});
			RegExp.multiline = save_multiline;
			html += txt;
			if (outputRoot)
				html += "</object>";
			break;
			
//!! end of  Jones Standard modifications for script and multimedia support

    } else if (outputRoot) {
      closed = (!(root.hasChildNodes() || HTMLArea.needsClosingTag(root)));
      html = "<" + root.tagName.toLowerCase();
      var attrs = root.attributes;
      for (i = 0; i < attrs.length; ++i) {
        var a = attrs.item(i);
        if (!a.specified) {
          continue;
        }
        var name = a.nodeName.toLowerCase();
        if (/_moz_editor_bogus_node/.test(name)) {
          html = "";
          break;
        }
        if (/(_moz)|(contenteditable)|(_msh)/.test(name)) {
          // avoid certain attributes
          continue;
        }
        var value;
        if (name != "style") {
          // IE5.5 reports 25 when cellSpacing is
          // 1; other values might be doomed too.
          // For this reason we extract the
          // values directly from the root node.
          // I'm starting to HATE JavaScript
          // development.  Browser differences
          // suck.
          //
          // Using Gecko the values of href and src are converted to absolute links
          // unless we get them using nodeValue()
          if (typeof root[a.nodeName] != "undefined" && name != "href" && name != "src" && !/^on/.test(name)) {
            value = root[a.nodeName];
          } else {
            value = a.nodeValue;
            // IE seems not willing to return the original values - it converts to absolute
            // links using a.nodeValue, a.value, a.stringValue, root.getAttribute("href")
            // So we have to strip the baseurl manually :-/
            if (HTMLArea.is_ie && (name == "href" || name == "src")) {
              value = editor.stripBaseURL(value);
            }
          }
        } else { // IE fails to put style in attributes list
          // FIXME: cssText reported by IE is UPPERCASE
          value = root.style.cssText;
        }
        if (/^(_moz)?$/.test(value)) {
          // Mozilla reports some special tags
          // here; we don't need them.
          continue;
        }
        html += " " + name + '="' + HTMLArea.htmlEncode(value) + '"';
      }
      if (html != "") {
        html += closed ? " />" : ">";
      }
    }
    for (i = root.firstChild; i; i = i.nextSibling) {
      html += HTMLArea.getHTMLWrapper(i, true, editor);
    }
    if (outputRoot && !closed) {
      html += "</" + root.tagName.toLowerCase() + ">";
    }
    break;
      }
      case 3: // Node.TEXT_NODE

    // If a text node is alone in an element and all spaces, replace it with an non breaking one
    // This partially undoes the damage done by moz, which translates '&nbsp;'s into spaces in the data element

    html = /^(script|style)$/i.test(root.parentNode.tagName) ? root.data : HTMLArea.htmlEncode(root.data);
    break;

      case 8: // Node.COMMENT_NODE
    html = "<!--" + root.data + "-->";
    break;		// skip comments, for now.
  }
  return html;
};
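
Each of the IE-only branches above copies root.innerHTML verbatim and only lowercases the tag names, since IE reports them in upper case. A minimal sketch of that step follows; the regular expression is an assumption modelled on what HTMLArea.RE_tagName appears to do, and the exact expression in htmlarea.js may differ.

```javascript
// Assumption: a pattern along the lines of HTMLArea.RE_tagName.
var RE_TAG_NAME = /(<\/|<)\s*([^ \t\n>]+)/ig;

// Lowercase tag names only; attribute names, values and text are untouched.
function lowercaseTagNames(markup) {
  return markup.replace(RE_TAG_NAME, function (str, open, tagName) {
    return open + tagName.toLowerCase();
  });
}

var ieStyle = '<OBJECT WIDTH="300"><PARAM NAME="movie" VALUE="testmovie.swf"></OBJECT>';
var normalized = lowercaseTagNames(ieStyle);
```

Because nothing else is rewritten, nested markup such as an embed inside an object passes through intact, which is the point of these branches. One consequence of this particular pattern is that a "<" in script text followed directly by an identifier would be lowercased as well.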

Attachments (1)

htmlarea.js (176.2 KB) - added by mharrisonline 9 years ago.
Patch for JavaScript and Flash applied to current htmlarea.js (August 3, 2005)


Change History (20)

comment:1 Changed 10 years ago by gogo

Please tell us the exact revision of the file you modified (should be at the top of the file, unless it's really old).

comment:2 Changed 10 years ago by mharrisonline

HTMLArea.getHTMLWrapper was taken from:

-- $LastChangedDate: 2005-05-04 04:37:33 +1200 (Wed, 04 May 2005) $
-- $LastChangedRevision: 108 $
-- $LastChangedBy: niko $

I tried all these changes on the daily build from 5/18 and they worked, but I totally replaced HTMLArea.getHTMLWrapper with what I pasted here.

Mike

comment:3 Changed 10 years ago by mharrisonline

The new code in HTMLArea.getHTMLWrapper was based upon code adapted from HTMLArea3 RC1. By the time I got the object parameters to stay valid and stopped embed from being deleted, the forum had been shut down.

comment:4 Changed 10 years ago by mharrisonline

Actually, I just tested today's example and the problem with object parameters missing seems to be fixed, and the Flash movie will play. However, the embed code is still being removed, which causes the movie to not be able to play as it downloads, and will prevent some browsers from seeing it.

Here is what happens to code for Flash in Xinha now.

We start with this sample HTML:

<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase="http://fpdownload.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=7,0,0,0" width="300" height="85" id="testmovie" align="middle">
<param name="allowScriptAccess" value="sameDomain" />
<param name="movie" value="testmovie.swf" />
<param name="quality" value="high" />
<param name="bgcolor" value="#000099" />
<embed src="testmovie.swf" quality="high" bgcolor="#000099" width="300" height="85" name="testmovie" align="middle" allowScriptAccess="sameDomain" type="application/x-shockwave-flash" pluginspage="http://www.macromedia.com/go/getflashplayer" />
</object>

Paste it into the source view of today's Xinha example in IE, switch to the design view and then back to the source view, and we now have, in one long string:

<object id="testmovie" codebase="http://fpdownload.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=7,0,0,0" height="85" width="300" align="middle" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000"><param value="sameDomain" name="allowScriptAccess" /><param value="testmovie.swf" name="movie" /><param value="high" name="quality" /><param value="#000099" name="bgcolor" /></object>

...and anything between script tags is still removed. Basically, what I submitted makes Xinha in IE treat object, embed nested in object, script, and noscript exactly as it treats them in Mozilla (it leaves them alone).

comment:5 Changed 10 years ago by guillaumed

What is the status of this modification on FF?

comment:6 Changed 9 years ago by mharrisonline

...if you mean, what does it do in Firefox? Currently nothing, as it is only used if the browser is IE and you are in full-page mode. Xinha in Firefox doesn't break these things when in full-page mode.

comment:7 Changed 9 years ago by mharrisonline

If we take this in small steps, below is the only code change needed to make tonight's download retain the normal HTML needed for Flash in IE, in regular or full screen mode. I made no other modification to htmlarea.js:

Insert at line 4666, just below

4664 html += (HTMLArea.is_ie ? ('\n' + indent) : '') + "</head>";
4665 break;

		} else if (HTMLArea.is_ie && root_tag == "object") {
			if (outputRoot)
				html += "<object";
				
			closed = (!(root.hasChildNodes() || HTMLArea.needsClosingTag(root)));
			html = "<" + root.tagName.toLowerCase();
			var attrs = root.attributes;
			for (i = 0; i < attrs.length; ++i) {
				var a = attrs.item(i);
				if (!a.specified) {
					continue;
				}

				var name = a.nodeName.toLowerCase();
				if (/_moz|contenteditable|_msh/.test(name)) {
					// avoid certain attributes
					continue;
				}
				var value;
				if (name != "style") {
					if (typeof root[a.nodeName] != "undefined" && name != "href" && name != "src" && name !="onclick" && name !="onmouseover" && name !="onmouseout" && name !="onmousedown") {
						value = root[a.nodeName];
					} else {
						value = a.nodeValue;
						
						if (HTMLArea.is_ie && (name == "href" || name == "src")) {
							value = editor.stripBaseURL(value);
						}
					}
				} else { 
					value = root.style.cssText;
				}
				if (/(_moz|^$)/.test(value)) {
					
					continue;
				}
				
				html += " " + name + '="' + value + '"';
			}

		html += closed ? " />" : ">"; 
			// lowercasize
			var save_multiline = RegExp.multiline;
			RegExp.multiline = true;
			var txt = root.innerHTML.replace(HTMLArea.RE_tagName, function(str, p1, p2) {
				return p1 + p2.toLowerCase();
			});
			RegExp.multiline = save_multiline;
			html += txt;
			if (outputRoot)
				html += "</object>";
			break;

and that's all you need to do to make Flash markup be preserved in Xinha!

Could this be applied to Xinha?
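
The attribute loop in this branch is the same one repeated for body, script, and object in the description. As a sketch of how it could be factored out, the helper below works on plain records rather than a DOM attribute collection. The names serializeAttributes and encodeAttr and the record shape are hypothetical; the value filter follows the stricter ^(_moz)?$ form used in the default branch, and values are escaped, which the object branch above does not do.

```javascript
// Minimal escaping for attribute values (stand-in for HTMLArea.htmlEncode).
function encodeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// attrs: array of { name, value, specified } records (hypothetical shape).
function serializeAttributes(attrs) {
  var out = "";
  for (var i = 0; i < attrs.length; ++i) {
    var a = attrs[i];
    if (!a.specified) continue;                 // skip browser defaults
    var name = a.name.toLowerCase();
    if (/_moz|contenteditable|_msh/.test(name)) continue; // editor-internal
    var value = String(a.value);
    if (/^(_moz)?$/.test(value)) continue;      // empty or Mozilla marker
    out += " " + name + '="' + encodeAttr(value) + '"';
  }
  return out;
}

var openTag = "<object" + serializeAttributes([
  { name: "WIDTH", value: 300, specified: true },
  { name: "contentEditable", value: "true", specified: true },
  { name: "id", value: "testmovie", specified: true },
  { name: "title", value: "", specified: true },
  { name: "lang", value: "en", specified: false }
]) + ">";
```

With one shared helper, a fix such as escaping quotes in attribute values would only need to be made once rather than in every copied branch.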

comment:8 Changed 9 years ago by mharrisonline

Excuse me, I meant regular or Full HTML mode, not Full Screen.

comment:9 Changed 9 years ago by derekcopelin@…

The above fix works great from the small amount of testing I have done. Is there a way to have Xinha rewrite the non-IE height and width fields to be the same as the IE fields?

comment:10 Changed 9 years ago by mharrisonline

If you place a base href into the HTML while it is in the editor, the Flash movie will actually play inside the editor.

comment:11 Changed 9 years ago by mharrisonline

Ticket 253 also makes Xinha preserve Flash code in IE, as well as scripting, etc.

Changed 9 years ago by mharrisonline

Patch for JavaScript and Flash applied to current htmlarea.js (August 3, 2005)

comment:12 Changed 9 years ago by mharrisonline

I applied the patch to the current htmlarea.js. The only bug I know of is that it removes onLoad events from the body tag. Otherwise, it works great with JavaScript. It is a good idea, though, to change the word javascript before your HTML loads into the editor, and then back after submit, to prevent document.write statements from writing to the HTML. Oh, and it works with Flash, too.

comment:13 Changed 9 years ago by mharrisonline

Derek,

You asked,

The above fix works great from the small amount of testing I have done. Is there a way to have Xinha rewrite the non-IE height and width fields to be the same as the IE fields?


I don't know how to do that, maybe someone else does.

comment:14 Changed 9 years ago by mharrisonline

It is a good idea though to change the word javascript before your HTML loads into the editor, and then back after submit, to prevent document.write statements from writing to the HTML

This is how I make this work on the PHP page that loads the editor:

Before Xinha starts:

 // Stop JavaScript from executing; it is changed back at submission.
 // str_replace is case-sensitive, so each spelling gets its own placeholder
 // (other spellings, such as JAVASCRIPT, are left as they are).
 $text_box=str_replace("javascript","freezescript",$text_box);
 $text_box=str_replace("JavaScript","FreezeScript",$text_box);

After Xinha submits:

 // Remove the JavaScript freeze.
 $text_box=str_replace("freezescript","javascript",$text_box);
 $text_box=str_replace("FreezeScript","JavaScript",$text_box);

Then, Xinha preserves the script, but doesn't execute document.writes, etc. On save, the code is restored to its original form. Ideally, it should be possible to do this in the script node within Xinha; with this solution, if the word javascript appears in the text, it too would be changed while in the editor.
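
The same freeze-and-restore idea can be sketched on the client side. This is only an illustration of the placeholder swap, not something Xinha provides; freezeScripts, thawScripts, and replaceAll are hypothetical names. Matching is case-sensitive so that each spelling is restored exactly.

```javascript
// Placeholder spellings follow the PHP snippet above.
var FREEZE_PAIRS = [
  ["javascript", "freezescript"],
  ["JavaScript", "FreezeScript"]
];

// Replace every occurrence of `from` with `to` (plain strings, no regex).
function replaceAll(text, from, to) {
  return text.split(from).join(to);
}

// Swap the real spellings for placeholders before loading the editor.
function freezeScripts(html) {
  for (var i = 0; i < FREEZE_PAIRS.length; ++i) {
    html = replaceAll(html, FREEZE_PAIRS[i][0], FREEZE_PAIRS[i][1]);
  }
  return html;
}

// Swap the placeholders back after the editor submits.
function thawScripts(html) {
  for (var i = 0; i < FREEZE_PAIRS.length; ++i) {
    html = replaceAll(html, FREEZE_PAIRS[i][1], FREEZE_PAIRS[i][0]);
  }
  return html;
}

var original = '<script language="JavaScript" type="text/javascript">document.write("hi")</script>';
var frozen = freezeScripts(original);
var restored = thawScripts(frozen);
```

The sketch shares the limitations of the original approach: content that already contains one of the placeholder words would be altered on restore, and spellings other than the two listed are not frozen.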

comment:15 Changed 9 years ago by mharrisonline

  • Resolution set to fixed
  • Status changed from new to closed

This code is no longer needed, thanks to the GetHtml plugin. : )

comment:16 Changed 9 years ago by mharrisonline

I am reopening this for now, because until the GetHtml plugin retains formatting in comments, any JavaScript that is hidden from older browsers inside a comment will be collapsed into a single line and broken.

This patch handles code like the following quite well:

<script language="JavaScript" type="text/javascript">
        <!-- Hide script from old browsers
        //Choose stylesheet based on browser type
        if (navigator.appName == "Netscape") {
            document.write("<link rel='STYLESHEET' type='text/css' href='style_ns.css'>")
        }
        else {

However, currently the GetHtml plugin changes the second, third and fourth lines into a single line. Until that is fixed, this patch is the safest way to handle JavaScript in the HTML. Note: the part of this patch that retains scripts works best in full page mode (it places a body tag into the HTML when fullPage is false); the part that retains embedded media works in any mode.
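
To see why collapsing lines breaks such scripts: a single-line comment runs to the end of its line, so once the line breaks are gone, everything after the comment marker is swallowed. The following self-contained sketch (finalValueOfA is a hypothetical helper, and the snippet is a simplified stand-in for the stylesheet script above) compares a script before and after its lines are joined.

```javascript
// Run a snippet that assigns to `a` and report the final value.
function finalValueOfA(source) {
  return new Function(source + "\nreturn a;")();
}

var multiLine = [
  "var a = 1;",
  "// choose a value based on some condition",
  "a = 2;"
].join("\n");

// Simulate a serializer that joins the script's lines with spaces.
var singleLine = multiLine.replace(/\n/g, " ");

var before = finalValueOfA(multiLine);  // the assignment after the comment runs
var after = finalValueOfA(singleLine);  // the assignment is swallowed by the comment
```

In the joined version the second assignment has become part of the comment, so the script still parses but behaves differently, which is harder to notice than a syntax error.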

comment:17 Changed 9 years ago by mharrisonline

  • Resolution fixed deleted
  • Status changed from closed to reopened

comment:18 Changed 9 years ago by mharrisonline

  • Resolution set to worksforme
  • Status changed from reopened to closed

The comment issue was fixed in the GetHtml plugin, so it is again the best choice for JavaScript.

comment:19 Changed 9 years ago by mharrisonline

After changeset:425 I can't find any JavaScript or HTML, no matter how old, that breaks with the GetHtml plugin. I can no longer find any reason (unless you can't use XHTML) not to use the GetHtml plugin instead of the extended GetHTML function I provided in this ticket.
