gcc/libcpp/include
David Malcolm 148066bd05 diagnostics: SARIF output: potentially add escaped renderings of source (§3.3.4)
This patch adds support to our SARIF output for cases where
rich_loc.escape_on_output_p () is true, such as for -Wbidi-chars.

In such cases, the pertinent SARIF "location" object gains a property
bag with property "gcc/escapeNonAscii": true, and the "artifactContent"
within the location's physical location's snippet" gains a "rendered"
property (§3.3.4) that escapes non-ASCII text in the snippet, such as:

"rendered": {"text":

where "text" has a string value such as (for a "trojan source" attack):

  "9 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */\n"
  "  |       ~~~~~~~~                                ~~~~~~~~                    ^\n"
  "  |       |                                       |                           |\n"
  "  |       |                                       |                           end of bidirectional context\n"
  "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)         U+2066 (LEFT-TO-RIGHT ISOLATE)\n"

where the escaping is affected by -fdiagnostics-escape-format=; with
-fdiagnostics-escape-format=bytes, the rendered text of the above is:

  "9 |     /*<e2><80><ae> } <e2><81><a6>if (isAdmin)<e2><81><a9> <e2><81><a6> begin admins only */\n"
  "  |       ~~~~~~~~~~~~                                        ~~~~~~~~~~~~                    ^\n"
  "  |       |                                                   |                               |\n"
  "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)                     U+2066 (LEFT-TO-RIGHT ISOLATE)  end of bidirectional context\n"

The patch also refactors/adds enough selftest machinery to be able to
test the snippet generation from within the selftest framework, rather
than just within DejaGnu (where the regex-based testing isn't
sophisticated enough to verify such properties as the above).

gcc/ChangeLog:
	* Makefile.in (OBJS-libcommon): Add selftest-json.o.
	* diagnostic-format-sarif.cc: Include "selftest.h",
	"selftest-diagnostic.h", "selftest-diagnostic-show-locus.h",
	"selftest-json.h", and "text-range-label.h".
	(class content_renderer): New.
	(sarif_builder::m_rules_arr): Convert to std::unique_ptr.
	(sarif_builder::make_location_object): Add class
	escape_nonascii_renderer.  If rich_loc.escape_on_output_p (),
	pass a nonnull escape_nonascii_renderer to
	maybe_make_physical_location_object as its snippet_renderer, and
	add a property bag property "gcc/escapeNonAscii" to the SARIF
	location object.  For other overloads of make_location_object,
	pass nullptr for the snippet_renderer.
	(sarif_builder::maybe_make_region_object_for_context): Add
	"snippet_renderer" param and pass it to
	maybe_make_artifact_content_object.
	(sarif_builder::make_tool_object): Drop "const".
	(sarif_builder::make_driver_tool_component_object): Likewise.
	Use typesafe unique_ptr variant of object::set for setting "rules"
	property on driver_obj.
	(sarif_builder::maybe_make_artifact_content_object): Add param "r"
	and use it to potentially set the "rendered" property (§3.3.4).
	(selftest::test_make_location_object): New.
	(selftest::diagnostic_format_sarif_cc_tests): New.
	* diagnostic-show-locus.cc: Include "text-range-label.h" and
	"selftest-diagnostic-show-locus.h".
	(selftests::diagnostic_show_locus_fixture::diagnostic_show_locus_fixture):
	New.
	(selftests::test_layout_x_offset_display_utf8): Use
	diagnostic_show_locus_fixture to simplify and consolidate setup
	code.
	(selftests::test_diagnostic_show_locus_one_liner): Likewise.
	(selftests::test_one_liner_colorized_utf8): Likewise.
	(selftests::test_diagnostic_show_locus_one_liner_utf8): Likewise.
	* gcc-rich-location.h (class text_range_label): Move to new file
	text-range-label.h.
	* selftest-diagnostic-show-locus.h: New file, based on material in
	diagnostic-show-locus.cc.
	* selftest-json.cc: New file.
	* selftest-json.h: New file.
	* selftest-run-tests.cc (selftest::run_tests): Call
	selftest::diagnostic_format_sarif_cc_tests.
	* selftest.h (selftest::diagnostic_format_sarif_cc_tests): New decl.

gcc/testsuite/ChangeLog:
	* c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c: Verify
	that we have a property bag with property "gcc/escapeNonAscii": true.
	Verify that we have a "rendered" property for a snippet.
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Include
	"text-range-label.h".

gcc/ChangeLog:
	* text-range-label.h: New file, taking class text_range_label from
	gcc-rich-location.h.

libcpp/ChangeLog:
	* include/rich-location.h
	(semi_embedded_vec::semi_embedded_vec): Add copy ctor.
	(rich_location::rich_location): Remove "= delete" from decl of
	copy ctor.  Add deleted decl of move ctor.
	(rich_location::operator=): Remove "= delete" from decl of
	copy assignment.  Add deleted decl of move assignment.
	(fixit_hint::fixit_hint): Add copy ctor decl.  Add deleted decl of
	move.
	(fixit_hint::operator=): Add copy assignment decl.  Add deleted
	decl of move assignment.
	* line-map.cc (rich_location::rich_location): New copy ctor.
	(fixit_hint::fixit_hint): New copy ctor.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-07-24 18:07:54 -04:00
..
cpplib.h c: Add -std=c2y, -std=gnu2y, -Wc23-c2y-compat, C2Y _Generic with type operand 2024-06-11 23:00:04 +00:00
label-text.h libcpp: move label_text to its own header 2024-05-28 15:55:24 -04:00
line-map.h Update copyright years. 2024-01-03 12:19:35 +01:00
mkdeps.h Update copyright years. 2024-01-03 12:19:35 +01:00
rich-location.h diagnostics: SARIF output: potentially add escaped renderings of source (§3.3.4) 2024-07-24 18:07:54 -04:00
symtab.h Update copyright years. 2024-01-03 12:19:35 +01:00