This post requires a certain degree of technical savvy.
When experimenting with various shortcode locations in the post content, I noticed that under some conditions the shortcode behaved a bit strangely with regard to paragraph arrangement. So I implemented a couple of “diagnostic” shortcodes and filters to investigate the problem further. But before I go into that, follow me on a short tour of what the Shortcode API is.
The Shortcode API
Now, plug-ins had been providing functionality through similar (but varying) notation long before there was a Shortcode API. But they had to parse the entire post content as a whole and determine on their own what was inside and what was outside the shortcode tags.
Because this pattern was so widespread, the WordPress team decided to make everybody’s life easier by (a) standardizing the notation and (b) providing standard protocols to follow and built-in mechanisms to access shortcode elements: the Shortcode API. Here’s what WordPress says about it:
The API handles all the tricky parsing, eliminating the need for writing a custom regular expression for each shortcode. Helper functions are included for setting and fetching default attributes. The API supports both self-closing and enclosing shortcodes.
All a shortcode plug-in needs to do is handle what’s in the shortcode. In fact, it doesn’t even see anything outside the shortcode. It gets its meal served nicely pre-digested by a friendly WordPress. Now, this sounds anything but bad, doesn’t it? It actually sounds pretty great!
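To make the division of labour concrete, here is a minimal sketch of the idea in Python. It is a toy model, not WordPress’ actual (PHP) implementation: the names `HANDLERS`, `add_shortcode` and `do_shortcode` in this block are stand-ins I chose for illustration, and the regular expression is a deliberately simplified assumption that ignores nested and escaped shortcodes.

```python
import re

# Hypothetical registry for this sketch: tag name -> handler(attrs, content).
HANDLERS = {}


def add_shortcode(tag, handler):
    """Register a handler for a shortcode tag (toy model)."""
    HANDLERS[tag] = handler


def do_shortcode(text):
    """Replace every registered shortcode in text with its handler's output."""
    if not HANDLERS:
        return text
    tags = "|".join(re.escape(tag) for tag in HANDLERS)
    # Matches [tag attrs]...[/tag] or a self-closing [tag attrs].
    pattern = re.compile(
        r"\[(%s)\b([^\]]*)\](?:(.*?)\[/\1\])?" % tags, re.DOTALL
    )

    def dispatch(match):
        tag, raw_attrs, content = match.group(1), match.group(2), match.group(3)
        # Only key="value" attributes are understood in this sketch.
        attrs = dict(re.findall(r'(\w+)="([^"]*)"', raw_attrs))
        # The handler sees attributes and enclosed content, nothing else.
        return HANDLERS[tag](attrs, content)

    return pattern.sub(dispatch, text)
```

The point of the sketch is the handler signature: it receives only the attributes and the enclosed content (or `None` for a self-closing tag), so whatever surrounds the shortcode is out of its reach.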
That is… if only it worked! 🙁
Let’s take a look at this example entry as it is entered in the editor:
Some text here [SHORTCODE]
Enclosed
text[/SHORTCODE] more text.
Before the plugin that handles the shortcode is activated, the entry is internally represented as:
<p>Some text here [SHORTCODE]</p>
<p>Enclosed</p>
<p>text[/SHORTCODE] more text.</p>
Assuming the plug-in, once activated, replaces the shortcode with, say, “XXX”, this is the result:
<p>Some text here XXX more text.</p>
Now, assume the plug-in wanted to do something more complicated and wrapped the shortcode content in a block element, say
<div class="my_fancy_css_class">shortcode content</div>
This will result in:
<p>Some text here <div class="my_fancy_css_class"></p>
<p>Enclosed</p>
<p>text</div> more text.</p>
Oops! Bad luck! That is one mess of HTML: the <div> opens inside one paragraph and closes inside another, so browsers do not even parse it consistently.[2] (I already faced this problem when I was working on my Sliding Notes plug-in. On the plug-in homepage I’ve explained some aspects of it, too.)
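One way to see what is wrong with that markup is to check whether every closing tag matches the most recently opened one. The following is a rough sketch using Python’s standard `html.parser` module; the class `NestingChecker` and the function `nesting_errors` are names made up for this example, and the check is far stricter and simpler than what a real browser or validator does.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag; ignored by the check.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class NestingChecker(HTMLParser):
    """Records closing tags that do not match the innermost open tag."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, innermost last
        self.errors = []  # (closing tag found, tag that was open instead)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            open_tag = self.stack[-1] if self.stack else None
            self.errors.append((tag, open_tag))


def nesting_errors(html):
    """Return mismatched closing tags plus tags left open at the end."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.errors + [(None, tag) for tag in checker.stack]
```

Fed the “XXX” result from above, the check finds nothing to complain about. Fed the `<div>` variant, the first mismatch it reports is a closing `p` arriving while the `div` is still open, which is exactly the overlap described above.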
Dangling And Leaking Paragraphs
O.K., I thought, the Shortcode API is not as useful as I first assumed. It does relieve me of some work on the parsing side, but I have to invest effort into working around its shortcomings instead. But at least it is consistent, and once I’ve found a workaround…
Yeah, it would have been nice if it were! But nope! It is not!
See this example, with a sample shortcode that simply substitutes itself with itself (and hence should not make a difference if the shortcode is active or not):
Some text here
[SHORTCODE] the tag is first on a line
Enclosed
text[/SHORTCODE] more text.
This is what it internally looks like, before it’s been through the Shortcode API:
<p>Some text here </p>
<p>[SHORTCODE] the tag is first on a line</p>
<p>Enclosed</p>
<p>text[/SHORTCODE] more text.</p>
And this is the HTML after pre-digestion by the Shortcode API:
<p>Some text here </p>
[SHORTCODE] the tag is first on a line</p>
<p>Enclosed</p>
<p>text[/SHORTCODE] more text.</p>
Note the missing opening paragraph tag in front of [SHORTCODE] in line 2?!
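A crude way to spot such a loss without eyeballing the markup is to count the paragraph tags before and after processing and look at the difference. This is only a sketch of that kind of check in Python, not the diagnostic shortcodes and filters I actually used; the function names are invented for this example.

```python
import re
from collections import Counter


def paragraph_tag_counts(html):
    """Count opening and closing paragraph tags, ignoring attributes."""
    slashes = re.findall(r"<(/?)p\b[^>]*>", html)
    return Counter("<%sp>" % slash for slash in slashes)


def lost_paragraph_tags(before, after):
    """Return the paragraph tags present in before but missing in after."""
    return paragraph_tag_counts(before) - paragraph_tag_counts(after)
```

Applied to the two listings above, the “before” text has four opening and four closing paragraph tags, while the “after” text has only three opening ones, so the difference is a single `<p>`.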
On other occasions, it was the closing paragraph tag that was missing. On others yet, both were, like here:
<p>Some text</p>
[SHORTCODE]
<p>Enclosed text</p>
<p>[/SHORTCODE]</p>
<p>More text.</p>
The corresponding text as entered in the editor is:
Some text.
[SHORTCODE]
Enclosed text
[/SHORTCODE]
More text.
Sometimes it ate a paragraph tag near the opening shortcode tag, sometimes near the closing one. It was exasperating. Hence my conclusions:
- The Shortcode API “eats” the opening or closing paragraph tag, or both.
- In none of these cases can the plug-in handler do anything about it, because it has no influence on what’s going on outside of the shortcode tags.
- The conditions under which the API misbehaves are unreproducible. You never know when it’ll hit you, until it does. In this regard, it is similar to a “chaotic system”.
- Life is too short to mess around with stupid issues like this.
Ergo: I’m resorting to “good old” filters until the issue has been resolved.
- [1] This is a simplified representation, sufficient for the purpose of this article. A full description is available at the WordPress site.
- [2] A careful reader adept with WordPress APIs might have correctly noticed that this effect is not entirely the fault of the Shortcode API. At first sight it may seem this has something to do with how TinyMCE, WordPress’ visual editor, processes the entered text. But my diagnosis shows that the aforementioned processing is done by WordPress’ filters, not by TinyMCE itself. (Filters of priority higher than normal see a “normalized” version of the text, which is pretty close to what the text looks like in the editor. Ergo, the text stored in the database does not manifest the issue.) I still see it as a shortcoming of the Shortcode API, because it lacks awareness of the processing that takes place in its native environment.