Published in Laravel

Parsing Blade Comments in Forte

By John Koster

Comments: That's the topic for today. As I continue working on the HTML parser for Forte, a project I intend to be the foundation for a suite of tools to create the best Laravel Blade extensibility framework ever, I am tackling smaller projects and working through them so that I do not lose my mind with the sheer scope of this endeavor. And today I have chosen comments.

There are two types of comments in Blade: HTML comments and Blade comments, sometimes referred to as client-side and server-side comments, respectively.

<!-- An HTML Comment, which can {{ $outputThings }} -->

{{-- A Blade Comment, where things are {{ $notParsed }} --}}

Let's start by thinking through Blade's comments. These are simpler to parse, as we just need to stop parsing everything until we see the end token. The tokenizer does most of the heavy lifting here: once it encounters the start of a Blade comment, it treats everything as plain text until it finds the end of the comment.
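To make the scanning idea concrete, here is a minimal sketch in plain PHP. It is not Forte's tokenizer: the function name `tokenizeBladeComments`, the array-based token shape, and the choice to let an unclosed comment run to the end of the input are all assumptions made for illustration.

```php
<?php

// Hypothetical helper, not Forte's tokenizer: split a template into plain
// text and Blade comment tokens by scanning for the "{{--" and "--}}" markers.
function tokenizeBladeComments(string $template): array
{
    $tokens = [];
    $offset = 0;
    $length = strlen($template);

    while ($offset < $length) {
        $start = strpos($template, '{{--', $offset);

        if ($start === false) {
            // No more comments: the remainder is plain text.
            $tokens[] = ['type' => 'text', 'content' => substr($template, $offset)];
            break;
        }

        if ($start > $offset) {
            // Text that sits before the comment opener.
            $tokens[] = ['type' => 'text', 'content' => substr($template, $offset, $start - $offset)];
        }

        // Everything up to the closing marker belongs to the comment, even if
        // it looks like Blade. Assumption: an unclosed comment runs to the end.
        $end = strpos($template, '--}}', $start + 4);
        $stop = $end === false ? $length : $end + 4;

        $tokens[] = ['type' => 'blade_comment', 'content' => substr($template, $start, $stop - $start)];
        $offset = $stop;
    }

    return $tokens;
}

$tokens = tokenizeBladeComments('a {{-- b {{ $c }} --}} d');
```

For that input the sketch yields three tokens: the text `a `, one comment token holding `{{-- b {{ $c }} --}}` (the inner echo is never looked at), and the text ` d`.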

As a consequence, the parser implementation is quite simple:

function parseNode(): ?Node
{
    switch ($this->current->type) {
        // ...

        case Tokens::T_BLADE_COMMENT:
            // Capture the start offset before advancing past the token.
            $commentStart = $this->current->startOffset;
            // Drop the 4-character {{-- and --}} delimiters.
            $bladeCommentContent = str($this->getContentAndAdvance())->substr(4, -4)->value();

            return $this->finalizeNode(
                new CommentNode($bladeCommentContent, CommentType::BLADE_COMMENT),
                $commentStart,
                $this->getCurrentOffset(),
            );

        // ...
    }
}

There isn't much interesting happening in that code sample: getContentAndAdvance() retrieves the content of the current token and advances the parser, while the finalizeNode method ensures the emitted nodes have the correct start and end offsets and populates some additional properties common to all node types.
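The substr(4, -4) call is the only slightly cryptic part, so here is a tiny plain-PHP illustration of what it does to a comment token's content. This uses PHP's built-in `substr` rather than Laravel's string helper; the two behave the same here because the delimiters are plain ASCII.

```php
<?php

// Sample token content, delimiters included.
$raw = '{{-- A Blade Comment --}}';

// Start at offset 4 to skip "{{--"; a length of -4 stops 4 characters
// before the end, which drops "--}}".
$inner = substr($raw, 4, -4);
```

Note that the surrounding whitespace is preserved: `$inner` is ` A Blade Comment `, with its leading and trailing space intact.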

Given the following input:

{{-- A Blade Comment, where things are {{ $notParsed }} --}}

The Forte parser produces:

HTML comments are more nuanced. We need to shut down the parsing of most things, but we still need to continue parsing Blade. The reasoning behind this is that we may want to send some dynamic bit to the client, but not have it rendered.
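As a rough illustration of "text, but with Blade still live", the sketch below splits the body of an HTML comment into literal text and echo segments. It is an assumption-laden toy, not how Forte does it: `splitCommentBody` is a hypothetical name, only `{{ ... }}` echoes are recognized (no raw echoes or directives), and it relies on PHP 8 for `str_starts_with` and `str_ends_with`.

```php
<?php

// Hypothetical sketch, not Forte's implementation: split the body of an HTML
// comment into literal text segments and Blade echo segments.
function splitCommentBody(string $body): array
{
    // The capture group makes preg_split keep the {{ ... }} pieces in the
    // result; PREG_SPLIT_NO_EMPTY drops empty strings between matches.
    $parts = preg_split(
        '/(\{\{.*?\}\})/s',
        $body,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    ) ?: [];

    return array_map(
        fn (string $part): array => [
            // A piece that is wrapped in {{ and }} is an echo; the rest is text.
            'type' => str_starts_with($part, '{{') && str_ends_with($part, '}}') ? 'echo' : 'text',
            'content' => $part,
        ],
        $parts
    );
}

$segments = splitCommentBody(' Build: {{ $version }} ');
```

Here `$segments` holds three entries: the text ` Build: `, the echo `{{ $version }}`, and a trailing text segment containing a single space.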

A common example might be something like this:

<!-- Start include: {{ $viewName }} -->
@include($viewName)
<!-- End include: {{ $viewName }} -->

and the parser's output:

Forte's parser is a recursive parser; the simplest way to handle this was to just set a flag when parsing HTML comments.

private function parseHtmlComment()
{
    $this->parsingComment = true;

    $commentStartToken = $this->expect(Tokens::T_HTML_COMMENT_START);
    $commentStart = $this->finalizeNodeFromToken(
        new CommentStartNode($commentStartToken->content, CommentType::HTML_COMMENT),
        $commentStartToken
    );

    // ...

    $comment = new CommentNode(
        $body,
        CommentType::HTML_COMMENT
    );

    $this->parsingComment = false;

    return $this->finalizeNode(
        $comment,
        $commentStartToken->startOffset,
        $commentEndToken->endOffset
    );
}

With that out of the way, it's a simple matter of creating a helper method to indicate whether we should parse HTML-like structures, and adjusting the parser's strategy along the way:

function parseNode(): ?Node
{
    switch ($this->current->type) {
        // ...

        case Tokens::T_CDATA:
            if (! $this->canParseHtmlLikeStructures()) {
                return $this->makeText();
            }

            return new CdataNode($this->getContentAndAdvance());
        case Tokens::T_DOCTYPE:
            if (! $this->canParseHtmlLikeStructures()) {
                return $this->makeText();
            }

            return new DoctypeNode($this->getContentAndAdvance());

        // ...
    }
}

I could have used the parsingComment flag directly there, but in my experience, there will inevitably be edge cases I need to handle later. Having the check already in a utility method makes that refactor much simpler later on.
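The helper itself isn't shown above, so here is a guess at its simplest possible shape, wrapped in a stand-in class so it can run on its own. Everything in it is hypothetical: `ParserSketch`, `nodeKindFor`, and `insideComment` are names invented for this sketch, and the try/finally reset is an addition of the sketch, not something taken from the parser code shown earlier.

```php
<?php

// Hypothetical stand-in for the parser, reduced to the comment flag.
class ParserSketch
{
    private bool $parsingComment = false;

    // Assumed body: the real method is not shown in this post. Later edge
    // cases would be added here instead of at every call site.
    private function canParseHtmlLikeStructures(): bool
    {
        return ! $this->parsingComment;
    }

    // Mimics one branch of parseNode(): reports the kind of node produced.
    public function nodeKindFor(string $tokenType): string
    {
        if ($tokenType === 'doctype' && $this->canParseHtmlLikeStructures()) {
            return 'doctype';
        }

        // Inside a comment, HTML-like tokens fall back to plain text.
        return 'text';
    }

    // Runs a callback as if it were the body of parseHtmlComment().
    public function insideComment(callable $callback): mixed
    {
        $this->parsingComment = true;

        try {
            return $callback($this);
        } finally {
            // Reset the flag even if the callback throws.
            $this->parsingComment = false;
        }
    }
}

$parser = new ParserSketch();
$outside = $parser->nodeKindFor('doctype');
$inside = $parser->insideComment(fn (ParserSketch $p) => $p->nodeKindFor('doctype'));
```

Outside a comment the doctype token becomes a `doctype` node; inside one it degrades to `text`. Because every branch goes through the one helper, a future rule only has to change that single method.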

That's all for this post!