bool BlockRules::RuleParagraph(StateBlock& state, int start_line, int end_line,
                               bool silent) {
  // If this is an empty line -> not a paragraph
  if (state.IsEmpty(start_line)) return false;

  std::vector<std::pair<std::string, RuleBlock>> terminator_rules =
      state.md.block_parser.ruler.GetRules("paragraph");

  ParentType old_parent_type = state.parent_type;
  state.parent_type = ParentType::kParagraph;

  int next_line = start_line + 1;

  // Move line-by-line until we find a terminator
  for (; next_line < end_line && !state.IsEmpty(next_line); next_line++) {
    // Code-indented line after a paragraph = lazy continuation
    if (state.s_count[next_line] - state.blk_indent > 3) continue;

    // Blockquote marker quirk (negative indent)
    if (state.s_count[next_line] < 0) continue;

    // Run terminator rules
    bool terminate = false;
    for (std::pair<std::string, RuleBlock>& rule : terminator_rules) {
      if (rule.second(state, next_line, end_line, true)) {
        terminate = true;
        break;
      }
    }
    if (terminate) break;
  }

  // Extract raw paragraph text
  std::string content =
      state.GetLines(start_line, next_line, state.blk_indent, false);
  content = Utils::Trim(content);

  state.line = next_line;

  // Build tokens
  Token& token_open = state.Push("paragraph_open", "p", Nesting::kOpening);
  token_open.map =
      std::optional<std::pair<float, float>>({start_line, state.line});

  Token& token_inline = state.Push("inline", "", Nesting::kSelfClosing);
  token_inline.content = content;
  token_inline.map =
      std::optional<std::pair<float, float>>({start_line, state.line});

  state.Push("paragraph_close", "p", Nesting::kClosing);

  state.parent_type = old_parent_type;

  return true;
}

1. Input Alphabet

Let the input be the sequence of lines $\Sigma = \{ L_0, L_1, \dots, L_{n-1} \}$. Each line $L_i$ is a tuple

$$L_i = (text_i, empty_i, s\_count_i)$$

where:
  • $empty_i \in \{0,1\}$ indicates whether the line is empty
  • $s\_count_i$ is the leading space count
  • $text_i$ is the raw text of the line
Let $blk\_indent$ be the block indentation of the paragraph, $end\_line$ the exclusive upper bound on the lines to scan, and $\mathcal{T}$ the set of terminator rules:

$$\mathcal{T} = \{ \tau_j : (state, i) \to \{0,1\} \}$$

Each $\tau_j$ is a predicate corresponding to a block-level terminator.
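The alphabet above can be written down as plain C++ types. This is a minimal sketch, not the parser's real data model: `Line`, `Terminator`, and `HeadingTerminator` are hypothetical names introduced for illustration, and the heading check is an assumed example of one $\tau_j$.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one input symbol L_i.
struct Line {
  std::string text;  // text_i: raw text of the line
  bool empty;        // empty_i: true if the line is blank
  int s_count;       // s_count_i: leading space count (may be negative)
};

// A terminator tau_j: true when the line at index i starts another block.
using Terminator = std::function<bool(const std::vector<Line>&, int)>;

// Assumed example terminator: the first non-space character is '#'.
bool HeadingTerminator(const std::vector<Line>& lines, int i) {
  assert(i >= 0 && i < static_cast<int>(lines.size()));
  const std::string& t = lines[i].text;
  std::string::size_type pos = t.find_first_not_of(' ');
  return pos != std::string::npos && t[pos] == '#';
}
```

Because `Terminator` is a `std::function`, a free function such as `HeadingTerminator` or a lambda can both be stored in the set $\mathcal{T}$.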

2. State Machine Definition

We model RuleParagraph as a deterministic state machine: $M_{paragraph} = (Q, \Sigma, \delta, q_0, F)$

States

$Q = \{ S_0, S_1, S_2, S_3 \}$
  • $S_0$: initial line evaluation
  • $S_1$: scanning subsequent lines until termination
  • $S_2$: extract paragraph content and build tokens
  • $S_3$: final state

Initial State

$q_0 = S_0$

Final State

$F = \{ S_3 \}$

State Variables

  • $i$: current line index
  • $next\_line$: scanning index
  • $parent\_type$: current block context
  • $terminate \in \{0,1\}$: flag indicating that a terminator rule fired

3. Transition Rules

3.1 Start State ($S_0$)

At $S_0$, reject if the start line is empty. If $empty_{start\_line} = 1$, then:

$$\delta_1: \delta(S_0, start\_line) = (S_3)$$

Otherwise:

$$\delta_2: \delta(S_0, start\_line) = (S_1, start\_line + 1)$$

3.2 Scan State ($S_1$)

For $i = next\_line$, normal continuation applies while all of the following hold:

$$\begin{aligned} i &< end\_line &\land \\ empty_i &= 0 &\land \\ s\_count_i - blk\_indent &\le 3 &\land \\ s\_count_i &\ge 0 &\land \\ \forall \tau_j \in \mathcal{T},\ \tau_j(state,i) &= 0 \end{aligned}$$

Then:

$$\delta_3: \delta(S_1, i) = (S_1, i+1)$$

Two cases bypass the terminator rules entirely, matching the two continue statements in the code: a code-indented line (lazy continuation) and the blockquote quirk of a negative indent. For $i < end\_line$ and $empty_i = 0$:

$$\delta_6:\ \delta(S_1, i) = (S_1, i+1) \quad\text{if } s\_count_i - blk\_indent > 3 \ \lor\ s\_count_i < 0$$

Otherwise, scanning stops when any of the following holds:

$$(i \ge end\_line) \ \lor\ (empty_i = 1) \ \lor\ \big(s\_count_i - blk\_indent \le 3 \ \land\ s\_count_i \ge 0 \ \land\ \exists \tau_j \in \mathcal{T}:\ \tau_j(state,i) = 1\big)$$

$$\delta_4: \delta(S_1, i) = (S_2, i)$$
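The scan state can be exercised in isolation. The following is a sketch under stated assumptions: `Line`, `Terminator`, and `ScanParagraph` are hypothetical simplifications, not the parser's API, and the loop mirrors the structure of the C++ listing for RuleParagraph, where a line indented more than three columns past the block indent, or with a negative indent, stays in $S_1$ without consulting the terminator rules.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical simplified line record (text, empty flag, indent).
struct Line {
  std::string text;
  bool empty;
  int s_count;
};
using Terminator = std::function<bool(const std::vector<Line>&, int)>;

// Returns the index at which scanning stops, i.e. the value handed to S_2.
int ScanParagraph(const std::vector<Line>& lines, int start_line, int end_line,
                  int blk_indent, const std::vector<Terminator>& terminators) {
  assert(start_line >= 0 && end_line <= static_cast<int>(lines.size()));
  int i = start_line + 1;  // S_0 -> S_1
  for (; i < end_line && !lines[i].empty; ++i) {
    // Code-indented line: lazy continuation, terminators are not consulted.
    if (lines[i].s_count - blk_indent > 3) continue;
    // Negative indent (blockquote quirk): also skipped.
    if (lines[i].s_count < 0) continue;
    bool terminate = false;
    for (const Terminator& tau : terminators) {
      if (tau(lines, i)) {
        terminate = true;
        break;
      }
    }
    if (terminate) break;  // S_1 -> S_2
  }
  return i;
}
```

The returned index is exclusive: the paragraph covers lines `start_line` up to, but not including, the result.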

3.3 Extract State ($S_2$)

At $S_2$, the machine:
  1. Extracts the paragraph content:
content = Trim(state.GetLines(start_line, i, blk_indent, false))
  2. Builds the tokens:
  • Open token: paragraph_open
  • Inline token: inline with content
  • Close token: paragraph_close
It then transitions to the final state: $\delta_5: \delta(S_2, i) = (S_3, i)$
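The extract step can be sketched with a simplified token type. Everything here is an assumption made for illustration: `SimpleToken`, `TrimWhitespace`, and `BuildParagraphTokens` are hypothetical, the trim is a plain ASCII whitespace trim that may differ from the real Utils::Trim, and the indent stripping that GetLines performs is omitted.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical reduced token: nesting is +1 (open), 0 (self-closing), -1 (close).
struct SimpleToken {
  std::string type;
  std::string tag;
  int nesting;
  std::string content;
};

// Assumed trim: strips leading and trailing ASCII whitespace.
std::string TrimWhitespace(const std::string& s) {
  const char* ws = " \t\n\r";
  std::string::size_type begin = s.find_first_not_of(ws);
  if (begin == std::string::npos) return "";
  std::string::size_type end = s.find_last_not_of(ws);
  return s.substr(begin, end - begin + 1);
}

// Joins lines [start_line, next_line) and wraps them in the three tokens.
std::vector<SimpleToken> BuildParagraphTokens(
    const std::vector<std::string>& lines, int start_line, int next_line) {
  assert(0 <= start_line && start_line <= next_line &&
         next_line <= static_cast<int>(lines.size()));
  std::string content;
  for (int i = start_line; i < next_line; ++i) {
    if (i > start_line) content += '\n';
    content += lines[i];
  }
  content = TrimWhitespace(content);
  return {
      {"paragraph_open", "p", 1, ""},
      {"inline", "", 0, content},
      {"paragraph_close", "p", -1, ""},
  };
}
```

Only the inline token carries text; the open and close tokens exist to bracket it, which is why the trim is applied once to the joined content rather than per line.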

4. Tabular Format

| Current State | Input / Condition | Next State | Output / Action | Transition |
|---|---|---|---|---|
| $S_0$ (Start) | $empty_{start\_line} = 1$ | $S_3$ (Done) | None | $\delta_1$ |
| $S_0$ (Start) | $empty_{start\_line} = 0$ | $S_1$ (Scan) | Init $next\_line = start\_line + 1$; load $\mathcal{T}$; set parent_type = Paragraph | $\delta_2$ |
| $S_1$ (Scan) | $i \ge end\_line$ | $S_2$ (Extract) | End-of-block reached | $\delta_4$ |
| $S_1$ (Scan) | $empty_i = 1$ | $S_2$ (Extract) | Empty line terminates paragraph | $\delta_4$ |
| $S_1$ (Scan) | $s\_count_i - blk\_indent > 3$ | $S_1$ (Scan) | Code-indented line → lazy continuation; terminator rules skipped | $\delta_6$ |
| $S_1$ (Scan) | $s\_count_i < 0$ | $S_1$ (Scan) | Blockquote quirk → skip line | $\delta_6$ |
| $S_1$ (Scan) | $\exists \tau_j \in \mathcal{T}: \tau_j(state,i) = 1$ | $S_2$ (Extract) | Terminator rule fires | $\delta_4$ |
| $S_1$ (Scan) | $(i < end\_line) \land (empty_i = 0) \land (s\_count_i - blk\_indent \le 3) \land (s\_count_i \ge 0) \land (\forall j: \tau_j = 0)$ | $S_1$ (Scan) | Normal continuation: advance to $i+1$ | $\delta_3$ |
| $S_2$ (Extract) | (unconditional) | $S_3$ (Done) | Extract text, trim, generate tokens | $\delta_5$ |
| $S_3$ (Done) | | | Restore parent_type, assign state.line, return true (when reached via $\delta_1$: return false, no tokens emitted) | |
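The table can be read as a small explicit state machine. The sketch below is illustrative only: `State`, `Line`, `Result`, and `RunMachine` are hypothetical names, token building is reduced to recording the stop index, and, following the C++ listing, a line indented more than three columns past the block indent is treated as lazy continuation that stays in the scan state.

```cpp
#include <cassert>
#include <functional>
#include <vector>

enum class State { kStart, kScan, kExtract, kDone };  // S_0 .. S_3

struct Line {
  bool empty;
  int s_count;
};
using Terminator = std::function<bool(const std::vector<Line>&, int)>;

struct Result {
  bool matched;              // value RuleParagraph would return
  int next_line;             // exclusive end of the paragraph
  std::vector<State> trace;  // states visited, in order
};

Result RunMachine(const std::vector<Line>& lines, int start_line, int end_line,
                  int blk_indent, const std::vector<Terminator>& terminators) {
  assert(start_line >= 0 && start_line < end_line &&
         end_line <= static_cast<int>(lines.size()));
  Result r{false, start_line, {}};
  State q = State::kStart;
  int i = start_line;
  while (q != State::kDone) {
    r.trace.push_back(q);
    switch (q) {
      case State::kStart:
        if (lines[i].empty) {
          q = State::kDone;  // reject: not a paragraph
        } else {
          ++i;
          q = State::kScan;
        }
        break;
      case State::kScan: {
        if (i >= end_line || lines[i].empty) {
          q = State::kExtract;
          break;
        }
        bool skip = lines[i].s_count - blk_indent > 3 || lines[i].s_count < 0;
        bool fired = false;
        if (!skip) {
          for (const Terminator& tau : terminators) {
            if (tau(lines, i)) {
              fired = true;
              break;
            }
          }
        }
        if (fired) q = State::kExtract; else ++i;  // otherwise stay in scan
        break;
      }
      case State::kExtract:
        r.matched = true;
        r.next_line = i;
        q = State::kDone;
        break;
      case State::kDone:
        break;
    }
  }
  r.trace.push_back(State::kDone);
  return r;
}
```

Recording the trace makes it easy to check a given input against the table row by row, at the cost of an allocation the real rule would not need.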
