El Dorado

A full-stack community web application written in Ruby/Rails
Updates for bbcodeizer « El Dorado « almost effortless
 
Tue, 27 May 2008, 11:55am #1
irkenInvader
Member
Zim
Registered: May, 2008
Last visit: Wed, 04 Jun 2008
Posts: 85

Hi Trevor,

First, I'm not sure if this is the 'forum' that you want to discuss code changes/ideas/updates. If not, let me know what would be more appropriate. google group? email?

Second, I spent a lot of time (too much time - I'm obsessed!) over the long weekend thinking about how to clean up bbcodeizer so it works with simple_format better.

The problem, as I saw it, was that bbcodeizer outputs posts (any strings, really) with lots of newlines strewn about that shouldn't have been there. The three most obvious examples were in the [quote], [code], and my new [list] blocks. When rails handed the bbcodeized post over to simple_format, it would just append <br /> to any newlines, not replace the newlines. This would result (especially in those blocks) in extra double-spaces.

So, I've learned way more about strings, regex objects, matchData objects, and working with all three together in Ruby than I ever wanted! The end result is some significant changes to the way that bbcodeizer handles those blocks so as to clean up any extraneous newlines entered by a poster before handing the post over to simple_format. This way, bbcodeizer cleans up it's own generated HTML and lets simple_format do what it should without accidentally adding needless double-spaces.

I also implemented your idea about how to implement the list item (<li>) tags. You'll see when I get the patch done. I'll also upload my changes to the bbcode docs that encompass all the changes I made.

On a side note, I also added a literal tag to bbcodeizer so that bbcode tags could be referenced in posts without being parsed by bbcodeizer. In other words, you could write [code] and bbcodeizer translates that into &#91code&#93, the XHTML and HTML4 literals for open-square-bracket and close-square-bracket.

I'm going to fork your el-dorado github project, make the changes, check them in, and then attempt to send you a pull request. We'll see how that all works out this evening when I get home and can access github correctly (my work's network firewall doesn't allow traffic on the git:// ports)


Zim wrote:

Invader’s blood marches through my veins like giant, radioactive rubber pants! The pants command me! Do not ignore my veins!

Offline
Tue, 27 May 2008, 12:46pm #2
Trevor
Administrator
Wait-ill-fix-it
Registered: Sep, 2005
Posts: 240

Sounds good to me. I still think we may want to spin off this BBCodeizer plugin into a new one with a different name and host it on github. Perhaps we could do that and share commit access? I think we've diverged from the original plugin enough to warrant it.

Also, keep in mind that we don't need to use simple_format or anything really. I haven't spent much time on this part of the app (just enough to get it working) and it sounds like you've done a lot of digging. So, if there's a better way to go about all of this, please go for it!

Online
Tue, 27 May 2008, 1:41pm #3
irkenInvader
Member
Zim
Registered: May, 2008
Last visit: Wed, 04 Jun 2008
Posts: 85

Well, I think simple_format has an important purpose and does what it does well. The only problem is when you have blocks that don't need it's help mixed in with blocks that do need its help.

The line in question is in the bb function in app/helpers/application_helper.rb:
[code]
...
text = simple_format(bbcodeize(sanitize(h(text))))
...
[/code]

Let's break it out one function at a time, from the inside out.

the h() function (really, just an alias to html_escape()) will escape (dangerous!) html code we don't want our posters to be able to inject. So, any text like:
[code]
<p>This is a paragraph</p>
[/code]
gets translated into:
[code]
&lt;p&gt;This is a paragraph&lt;/p&gt;
[code]

This is a "good thing" and we don't want to get rid of it. To keep up with our current topic, h (or html_escape) doesn't touch any newlines in our post, though it does translate '<br />' or '<br>' into '&lt;br /&gt;' or '&lt;br&gt;'

Next up is sanitize(). sanitize() is designed to remove any html tags or attributes in those tags that the you don't want. You can explicitly allow some if you want, but by default is removes them all. This seems a little redundant as html_escape() is already covering this, but there might be some hacker-ish way of getting around the escaping being done and sanitize() should catch that and remove the "bad" tags.

After that, we pass the escaped and sanitized text to bbcodeizer to translate any bbcode tags into appropriate html tags to markup the post the way the poster wanted. We assume the html_escape() and sanitize() have "secured" our text and now we want bbcodizer to mark it up.

Finally, simple_format is called on the resulting text of the post to inject '<br />' and '<p>...</p>' elements in the "right" places. This makes it easy for us to hook into those elements with CSS if we want and "pretty"s up the post for browser consumption.

The problem comes with certain HTML elements that do need "pretty"fying and other's that don't. For example, the bbcode [code] tag gets translated into '<pre>...</pre>' elements. Those tag names are short for "pre-formated" and are designed to allow an author who has already formated his text with appropriate whitespace and newlines to ignore the browser's desire to "squash" all the whitespace. This is a "good thing" for preformated text, but simple_format doesn't distinguish between <pre> tags and non <pre> tags. Therefore it just translates all newlines, inside or outside of <pre> elements, from "\r\n" to "\r\n<br />", which is (usually) the opposite of what the author intended when he used the <pre> element in the first place. Also, simple format will translate two newlines ("\r\n\r\n") into <p> (paragraph) elements. Again, this is a "good thing" when outside of a <pre> element, but not so much when the text is already "pre-formatted".

I thought strongly about modifying the simple_format function (overriding it) to check for those tags that shouldn't have <br> elements injected and ignoring them, but I felt like that was the wrong thing to do for many reasons. One, I don't dare presume to change what the rails developers have designed! Also, those tags I was having problems with are being generated in the bbcodeizer function in the first place. The problem was being introduced there so I felt like that's where it should be fixed.

I realized that well formed HTML concerning preformated code or lists or whatever shouldn't need any newlines at all unless explicitly desired. What I mean is, inside the bbcodeizer function I could "squash" all the newlines that weren't needed, making simple_format()'s job easier. If we have the following bbcode marked-up text:
[code]
[ code]\r\n
This is some pre-formatted text\r\n
I want it to look just like this\r\n
[ /code]
[code]
I could squash it down before it gets passed to simple_format(), like this:
[code]
[ code]This is some pre-formatted text\r\n
I want it to look just like this[ /code]
[/code]
thereby eliminating unnecessary newlines before simple_format() gets a hold of the post.

I hope all that (lengthy discussion!) makes sense.


Zim wrote:

Invader’s blood marches through my veins like giant, radioactive rubber pants! The pants command me! Do not ignore my veins!

Offline
Tue, 27 May 2008, 3:15pm #4
Trevor
Administrator
Wait-ill-fix-it
Registered: Sep, 2005
Posts: 240

I think I get what you're saying (maybe), and it sounds good!

Online
Tue, 27 May 2008, 9:53pm #5
irkenInvader
Member
Zim
Registered: May, 2008
Last visit: Wed, 04 Jun 2008
Posts: 85

Ok, after much experimenting, I've forked your repo, made changes, pushed them, and sent you a pull request. Two, actually. So, let me know what you think.


Zim wrote:

Invader’s blood marches through my veins like giant, radioactive rubber pants! The pants command me! Do not ignore my veins!

Offline
Tue, 27 May 2008, 10:13pm #6
Trevor
Administrator
Wait-ill-fix-it
Registered: Sep, 2005
Posts: 240

Sweet!

http://github.com/trevorturk/el-dorado/commits/...

I love it.

Online
Tue, 27 May 2008, 10:17pm #7
Trevor
Administrator
Wait-ill-fix-it
Registered: Sep, 2005
Posts: 240

Checking it out...

  • item
  • item 2

[youtube]example[/youtube]

First line.

Second line.

Third line.

Sweeeeet.

Online
Wed, 28 May 2008, 8:02am #8
irkenInvader
Member
Zim
Registered: May, 2008
Last visit: Wed, 04 Jun 2008
Posts: 85

Testing:

First line.
Second line.
Third line.

First line.

Second line.

Third line.


Zim wrote:

Invader’s blood marches through my veins like giant, radioactive rubber pants! The pants command me! Do not ignore my veins!

Offline
Wed, 28 May 2008, 8:52am #9
Trevor
Administrator
Wait-ill-fix-it
Registered: Sep, 2005
Posts: 240

Some of my users are testing things out and still might have an issue with the simple_format helper worth looking at (if you're interested). I'll ask them to confirm and hop over to this thread if they still see a problem. Thanks again!

[edit] Maybe not. I'll let you know if I come across anything else.

Last edited Wed, 28 May 2008, 9:07am by Trevor

Online