ebook cover of The Strange Case of Dr. Jekyll and Mr. Hyde from Gutenberg

Gutenberg broke my site

Trying to use my syntax highlighter implementation made me realise how broken my site has been since the new Gutenberg editor became an integral part of WordPress.

https://blog.xarta.co.uk/2017/06/wordpress-theme-tinkering-featured-div-vs-featured-image/

… didn’t render properly. UPDATE: fixed. I had to disable the “MediaElement.js – HTML5 Audio and Video” plugin (I’m not sure where I’ve used it yet; still checking). It was throwing a JavaScript error that stopped my syntax-highlighting JavaScript from running.

And it shared more issues with the rest of the site.

Patched basic syntax highlighting. Two main issues:

  • Quotation marks were being replaced with fancy ones on the PHP side, e.g. the straight single quote (ASCII 39) became a curly one (code 145 in Windows-1252, which is outside the ASCII range). That upset the JavaScript side of the Syntaxhighlighting plug-in.

  • Old posts left completely alone were still parsed correctly by my PHP function that finds shortcodes, but I had to change the function for anything “touched” (altered in the editor in any way). After adding the classic-editor plugin, I can detect new posts written with it by looking for a custom field it sets, and I can separately detect Gutenberg block use. But any kind of editing of old posts also sets that custom field, so I had to further discriminate those posts with a manual custom-field override.
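
To make the first issue concrete, here is a minimal sketch of a check for typographic quotes in a code snippet. The function name `xarta_has_fancy_quotes` is hypothetical and not part of the plugin. It assumes the curly quotes may arrive either as UTF-8 characters or as numeric HTML entities (U+2018, U+2019, U+201C, U+201D).

```php
<?php
// Hypothetical helper, not from the plugin: reports whether a snippet contains
// typographic quotes, either as UTF-8 characters or as numeric HTML entities
// (&#8216; &#8217; &#8220; &#8221;).
function xarta_has_fancy_quotes(string $code): bool
{
    $pattern = '/[\x{2018}\x{2019}\x{201C}\x{201D}]|&#821[67];|&#822[01];/u';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $code) === 1;
}

// A straight apostrophe (ASCII 39) is left alone; the curly versions are flagged.
var_dump(xarta_has_fancy_quotes("var a = 'b';"));
var_dump(xarta_has_fancy_quotes("var a = \u{2018}b\u{2019};"));
```

A check like this is only a diagnostic. The fix described below keeps the quotes from being converted in the first place.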

Detect Gutenberg blocks:

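function is_gut() {

    // has_blocks() only exists from WordPress 5.0 onwards. Called with no
    // argument it checks the current global post, and returns false when
    // there isn't one (avoids reading ->ID from an empty $post).
    return function_exists( 'has_blocks' ) && has_blocks();
}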

Issue 1 – quotation marks

Copied wptexturize() from wp-includes\formatting.php to functions.php and renamed it my_wptexturize().

function my_wptexturize( $text, $reset = false ) {

    // added global $xartaCodesToCheckGlobal
    global $wp_cockneyreplace, $shortcode_tags, $xartaCodesToCheckGlobal;
    // ----
    // added: treat my shortcodes like the defaults that never get texturized
    $default_no_texturize_shortcodes = array_merge($xartaCodesToCheckGlobal, $default_no_texturize_shortcodes );
    // ... (rest of the copied wptexturize() body)
}

remove_filter('comment_text',    'wptexturize');
remove_filter('the_excerpt',     'wptexturize');
remove_filter('the_content',     'wptexturize');
remove_filter('the_rss_content', 'wptexturize');
 
add_filter(   'comment_text',    'my_wptexturize');
add_filter(   'the_excerpt',     'my_wptexturize');
add_filter(   'the_content',     'my_wptexturize');
add_filter(   'the_rss_content', 'my_wptexturize');

The global $xartaCodesToCheckGlobal is assigned in the PHP part of my Syntaxhighlighter plugin, before my_wptexturize() is called. This is also covered in Issue 2.
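
As an alternative to copying the whole function, WordPress core exposes a `no_texturize_shortcodes` filter that wptexturize() applies to its list of protected shortcodes. The sketch below is not what I did above. The callback name `xarta_no_texturize_shortcodes` is hypothetical, and it assumes the shortcodes in the global list are registered with WordPress.

```php
<?php
// Hypothetical callback: adds the plugin's shortcode names to the list that
// wptexturize() leaves untouched.
function xarta_no_texturize_shortcodes(array $shortcodes): array
{
    global $xartaCodesToCheckGlobal;

    // Assumption: the global may not be set yet, so fall back to an empty list.
    $extra = is_array($xartaCodesToCheckGlobal) ? $xartaCodesToCheckGlobal : array();

    // Merge, drop duplicates, and re-index.
    return array_values(array_unique(array_merge($shortcodes, $extra)));
}

// Only register the callback when running inside WordPress.
if (function_exists('add_filter')) {
    add_filter('no_texturize_shortcodes', 'xarta_no_texturize_shortcodes');
}
```

The trade-off is that this relies on core's own shortcode detection inside wptexturize(), whereas the copied function can be changed freely.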

Issue 2 – square brackets

Below are excerpts from a class I called “TheContent”.
To fix my issue, I made $xartaCodesToCheck public and added detection for three things: Gutenberg blocks, evidence of the classic-editor plugin, and a custom field that insists the post is pre-Gutenberg. Based on those, $searchString has to look for different formatting to accommodate the changes Gutenberg makes to the content!
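
The decision built from those three signals can be sketched as a pure function, which makes it easy to test outside WordPress. The name `xarta_needs_gutenberg_fix` is hypothetical. `$custom` stands in for the array returned by get_post_custom(), and `$hasBlocks` for the result of is_gut().

```php
<?php
// Hypothetical helper: decides whether the Gutenberg-style search pattern is needed.
function xarta_needs_gutenberg_fix(array $custom, bool $hasBlocks): bool
{
    // Custom field left behind by the classic-editor plugin.
    $touchedInClassicEditor = isset($custom['classic-editor-remember']);

    // Manual override: this post still uses the pre-Gutenberg formatting.
    $forcedOldStyle = isset($custom['old-syntaxhighlight-style']);

    return ($touchedInClassicEditor && !$forcedOldStyle) || $hasBlocks;
}

// An old post that was edited, then manually marked as old style.
$custom = array(
    'classic-editor-remember'   => array('classic-editor'),
    'old-syntaxhighlight-style' => array('1'),
);
var_dump(xarta_needs_gutenberg_fix($custom, false)); // bool(false)
```

Block detection wins over the manual override, which matches the condition used in the class below.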

...

class TheContent
{

    public $xartaCodesToCheck;

    public function __construct($xartaLangs)
    {
            $this->xartaCodesToCheck = $xartaLangs;

            // high priority / early filter "4"
            add_filter('the_content', array($this, 'xarta_before_the_content_normal_filters'), 4);
    }

    public function xarta_before_the_content_normal_filters($content)
    {	
        // TASK TWO (see notes above class)
        $potentialShortcode = strpos($content, '[');
        if($potentialShortcode !== FALSE)
        {

            array_push($this->xartaCodesToCheck, "xsyntax");    // additional shortcode to check
                                                                // (doesn't count as language alias)
            
            array_push($this->xartaCodesToCheck, "gedit");      // also protect this styling shortcode

            array_push($this->xartaCodesToCheck, "crt");        // also protect this styling shortcode

            array_push($this->xartaCodesToCheck, "dos");        // also protect this styling shortcode
			
            $custom = get_post_custom();
            // "touched" in the classic editor (unless manually overridden), or uses Gutenberg blocks
            $gutenbergfix = ( isset($custom['classic-editor-remember']) && !isset($custom['old-syntaxhighlight-style']) )
                            || is_gut();

            // override for testing
            //$gutenbergfix = false;
			
			
            foreach ($this->xartaCodesToCheck as $searchLang)
            {
				//echo($searchLang);
					
                // e.g. $searchLang = 'code' or $searchLang = 'js' or $searchLang = 'c#' etc.
                if(strpos($content,'['.$searchLang, $potentialShortcode) !== FALSE)
                {
                    // remember attributes e.g. [js  some attributes]content[/js] etc.

                    // escape regex metacharacters in the shortcode name (e.g. a '+' or '.')
                    $quotedLang = preg_quote($searchLang, '/');

                    if($gutenbergfix == true)
                    {
                       $searchString   = '/\['.$quotedLang.'(.*)\](.*)\]/'; // https://regex101.com/
                       $replaceString = "[$searchLang $1]<pre class=\"xprotect\">$2]";
                    }
                    else
                    {
                       $searchString   = '/\['.$quotedLang.'(.*)\]/'; // https://regex101.com/
                       $replaceString  = "[$searchLang $1 ]<pre class=\"xprotect\">";
                    }


					
                    // using preg_replace to cope with attributes (no wildcard in strpos)
                    // can't easily do the whole [shortcode atts]code-to-highlight[/shortcode]
                    // in one go though as it gets complicated when the shortcode appears
                    // more than once, successively (have to look at occurrences etc.)
                    // and computationally gets expensive.  This is a compromise.
                    $content = preg_replace(    $searchString,
                                                $replaceString, 
                                                $content );


                    $content = str_replace( '[/'.$searchLang.']',   
                                            '[/'.$searchLang.']', 
                                            $content);

											
                }

            }
        }

        return $content;
    }

    // ETC
}

NOTE: in the code above, the Syntaxhighlighter plugin is displaying “itself”, so its string replacements also act on the text inside the shortcodes, and some string literals are not displayed! For example, the replacement string in the final str_replace() call has lost its closing </pre> markup.

So here is the same code again, this time in a Gutenberg code block:

class TheContent
{

    public $xartaCodesToCheck;

    public function __construct($xartaLangs)
    {
            $this->xartaCodesToCheck = $xartaLangs;

            // high priority / early filter "4"
            add_filter('the_content', array($this, 'xarta_before_the_content_normal_filters'), 4);
    }

    public function xarta_before_the_content_normal_filters($content)
    {	
        // TASK TWO (see notes above class)
        $potentialShortcode = strpos($content, '[');
        if($potentialShortcode !== FALSE)
        {

            array_push($this->xartaCodesToCheck, "xsyntax");    // additional shortcode to check
                                                                // (doesn't count as language alias)
            
            array_push($this->xartaCodesToCheck, "gedit");      // also protect this styling shortcode

            array_push($this->xartaCodesToCheck, "crt");        // also protect this styling shortcode

            array_push($this->xartaCodesToCheck, "dos");        // also protect this styling shortcode
			
            $custom = get_post_custom();
            // "touched" in the classic editor (unless manually overridden), or uses Gutenberg blocks
            $gutenbergfix = ( isset($custom['classic-editor-remember']) && !isset($custom['old-syntaxhighlight-style']) )
                            || is_gut();

            // override for testing
            //$gutenbergfix = false;
			
			
            foreach ($this->xartaCodesToCheck as $searchLang)
            {
				//echo($searchLang);
					
                // e.g. $searchLang = 'code' or $searchLang = 'js' or $searchLang = 'c#' etc.
                if(strpos($content,'['.$searchLang, $potentialShortcode) !== FALSE)
                {
                    // remember attributes e.g. [some-language some-attributes]content[/js] etc.

                    // escape regex metacharacters in the shortcode name (e.g. a '+' or '.')
                    $quotedLang = preg_quote($searchLang, '/');

                    if($gutenbergfix == true)
                    {
                       $searchString   = '/\['.$quotedLang.'(.*)\](.*)\]/'; // https://regex101.com/
                       $replaceString = "[$searchLang $1]<pre class=\"xprotect\">$2]";
                    }
                    else
                    {
                       $searchString   = '/\['.$quotedLang.'(.*)\]/'; // https://regex101.com/
                       $replaceString  = "[$searchLang $1 ]<pre class=\"xprotect\">";
                    }


					
                    // using preg_replace to cope with attributes (no wildcard in strpos)
                    // can't easily do the whole [shortcode atts]code-to-highlight[/shortcode]
                    // in one go though as it gets complicated when the shortcode appears
                    // more than once, successively (have to look at occurrences etc.)
                    // and computationally gets expensive.  This is a compromise.
                    $content = preg_replace(    $searchString,
                                                $replaceString, 
                                                $content );


                    $content = str_replace( '[/'.$searchLang.']',   
                                            '</pre><!-- end xprotect -->[/'.$searchLang.']', 
                                            $content);

											
                }

            }
        }

        return $content;
    }

    // ETC
}
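
To see what the two search patterns actually do, here is a standalone sketch that applies the same replace steps to one shortcode name, outside WordPress. The function name `xarta_protect_demo` and the sample content are made up for illustration. Because `(.*)` is greedy and does not cross line breaks, the pre-Gutenberg pattern assumes the opening shortcode sits on a line of its own, while the Gutenberg pattern copes with the opening tag, the code and the closing tag sharing one line.

```php
<?php
// Hypothetical standalone helper mirroring the search/replace steps above.
function xarta_protect_demo(string $content, string $lang, bool $gutenbergfix): string
{
    // Escape any regex metacharacters in the shortcode name.
    $quoted = preg_quote($lang, '/');

    if ($gutenbergfix) {
        // Opening tag, code and closing tag all on one line.
        $search  = '/\[' . $quoted . '(.*)\](.*)\]/';
        $replace = '[' . $lang . ' $1]<pre class="xprotect">$2]';
    } else {
        // Opening tag alone on its line.
        $search  = '/\[' . $quoted . '(.*)\]/';
        $replace = '[' . $lang . ' $1 ]<pre class="xprotect">';
    }

    $content = preg_replace($search, $replace, $content);

    // Close the protective <pre> just before the closing shortcode tag.
    return str_replace(
        '[/' . $lang . ']',
        '</pre><!-- end xprotect -->[/' . $lang . ']',
        $content
    );
}

// Gutenberg-style content: everything on one line.
echo xarta_protect_demo('[js gutter="false"]alert(1);[/js]', 'js', true), "\n";
// [js  gutter="false"]<pre class="xprotect">alert(1);</pre><!-- end xprotect -->[/js]
```

Running the one-line sample through the pre-Gutenberg pattern instead would swallow the code into the attribute capture, which is why the two cases need different patterns.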

The public property xartaCodesToCheck is copied to a global variable outside the class so that my_wptexturize() can access it.

...

$xartaSyntaxHLthecontent =  new TheContent($xartaLangs);
$xartaCodesToCheckGlobal =  $xartaSyntaxHLthecontent->xartaCodesToCheck;
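
One thing worth keeping in mind with this arrangement: PHP assigns arrays by value, so the global is a snapshot of the list as it was when the object was constructed. Names pushed onto the property later (such as "xsyntax" inside the content filter) do not show up in the global. The sketch below uses a hypothetical stand-in class, `CodesHolder`, to show the effect. Whether it matters depends on whether those extra shortcodes also need protecting in my_wptexturize().

```php
<?php
// Hypothetical stand-in for TheContent, reduced to the one property.
class CodesHolder
{
    public $xartaCodesToCheck;

    public function __construct(array $langs)
    {
        $this->xartaCodesToCheck = $langs;
    }

    // Mimics the array_push() calls made inside the content filter.
    public function addExtras(): void
    {
        array_push($this->xartaCodesToCheck, 'xsyntax', 'gedit');
    }
}

$holder   = new CodesHolder(array('js', 'php'));
$snapshot = $holder->xartaCodesToCheck; // by-value copy, like the global
$holder->addExtras();

echo count($snapshot), ' vs ', count($holder->xartaCodesToCheck), "\n"; // prints "2 vs 4"
```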