OpenBSD/src lyUf2AC usr.bin/mandoc html.c mdoc_html.c

   The .UR and .MT blocks in man(7) are represented by <a> elements
   which establish phrasing context, but they can contain paragraph
   breaks (which is relevant for terminal formatting, so we can't just
   change the structure of the syntax tree), which are represented
   by <p> elements and cannot occur inside <a>.

   Fix this by prematurely closing the <a> element in the HTML formatter.
   This means that the clickable text in HTML output is shorter than
   what is represented as the link text in terminal output, but in
   HTML, it is frankly impossible to have the clickable area of a
   hyperlink extend across a paragraph break.  The difference in
   presentation is not a major problem, and besides, paragraph breaks
   inside .UR are rather poor style in the first place.

   The implementation is quite tricky.  Naively closing out the <a>
   prematurely would result in accessing a stale pointer when later
   reaching the physical end of the .UR block.  So this commit separates
   visual and structural closing of "struct tag" stack items.  Visual
   closing means that the HTML element is closed but the "struct tag"
   remains on the stack, to avoid later access to a stale pointer and
   to avoid closing the same HTML element a second time later.

   This also needs reference counting of pointers to "struct tag" stack
   items because often more than one child holds a pointer to the same
   parent item, and only the outermost child can safely do the physical
   closing.
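
   The separation described above can be illustrated with a minimal,
   self-contained sketch.  It is not the code from this commit: the
   field names (refcnt, closed), the helper names (tag_open,
   tag_close_visual, tag_release, demo) and the fixed-size output
   buffer are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a "struct tag" stack item. */
struct tag {
	struct tag	*next;   /* next item further down the stack */
	const char	*name;   /* HTML element name */
	int		 refcnt; /* number of holders of a pointer to this item */
	int		 closed; /* nonzero once the closing tag was printed */
};

/* Hypothetical formatter state: tag stack plus an output buffer. */
struct html {
	struct tag	*stack;
	char		 out[256];
	size_t		 len;
};

static void
emit(struct html *h, const char *s)
{
	size_t	 n = strlen(s);

	if (h->len + n >= sizeof(h->out))
		abort();
	memcpy(h->out + h->len, s, n + 1);
	h->len += n;
}

/* Print the opening tag and push an item holding one reference. */
static struct tag *
tag_open(struct html *h, const char *name)
{
	struct tag	*t;

	if ((t = calloc(1, sizeof(*t))) == NULL)
		abort();
	t->name = name;
	t->refcnt = 1;
	t->next = h->stack;
	h->stack = t;
	emit(h, "<"); emit(h, name); emit(h, ">");
	return t;
}

/* Visual closing: print the closing tag once, keep the item on the stack. */
static void
tag_close_visual(struct html *h, struct tag *t)
{
	if (t->closed)
		return;
	t->closed = 1;
	emit(h, "</"); emit(h, t->name); emit(h, ">");
}

/* Structural closing: drop one reference; the last holder unlinks and frees. */
static void
tag_release(struct html *h, struct tag *t)
{
	struct tag	**pp;

	assert(t->refcnt > 0);
	if (--t->refcnt > 0)
		return;
	tag_close_visual(h, t);	/* no-op if already closed visually */
	for (pp = &h->stack; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == t) {
			*pp = t->next;
			break;
		}
	}
	free(t);
}

/* A link block interrupted by a paragraph break, with two holders of <a>. */
static const char *
demo(struct html *h)
{
	struct tag	*a, *p;

	h->stack = NULL;
	h->len = 0;
	h->out[0] = '\0';

	a = tag_open(h, "a");
	a->refcnt++;			/* a second child keeps a pointer */
	emit(h, "link text");
	tag_close_visual(h, a);		/* paragraph break: close <a> early */
	p = tag_open(h, "p");
	emit(h, "more");
	tag_release(h, p);
	tag_release(h, a);		/* inner holder: item stays valid */
	tag_release(h, a);		/* end of block: freed, no second </a> */
	return h->out;
}
```

   In this sketch, calling demo() on a "struct html" produces
   "<a>link text</a><p>more</p>" and leaves the stack empty: the early
   visual close is remembered in the "closed" flag, and the item is
   only freed when the last holder releases it.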

   In the whole corpus of nearly half a million manual pages on
   man.openbsd.org, this problem occurs in exactly one page: the
   groff(1) version 1.20.1 manual contained in DragonFly-3.8.2, which
   contains a formatting error triggering the bug.
Version  Delta    File
1.122    +50 -42  usr.bin/mandoc/html.c
1.202    +11 -8   usr.bin/mandoc/mdoc_html.c
1.123    +10 -8   usr.bin/mandoc/man_html.c
1.63     +3 -1    usr.bin/mandoc/html.h
         +74 -59  4 files
