gcc/gcc/gcc-rich-location.h
David Malcolm 148066bd05 diagnostics: SARIF output: potentially add escaped renderings of source (§3.3.4)
This patch adds support to our SARIF output for cases where
rich_loc.escape_on_output_p () is true, such as for -Wbidi-chars.

In such cases, the pertinent SARIF "location" object gains a property
bag with property "gcc/escapeNonAscii": true, and the "artifactContent"
within the location's physical location's "snippet" gains a "rendered"
property (§3.3.4) that escapes non-ASCII text in the snippet, such as:

"rendered": {"text":

where "text" has a string value such as (for a "trojan source" attack):

  "9 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */\n"
  "  |       ~~~~~~~~                                ~~~~~~~~                    ^\n"
  "  |       |                                       |                           |\n"
  "  |       |                                       |                           end of bidirectional context\n"
  "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)         U+2066 (LEFT-TO-RIGHT ISOLATE)\n"

where the escaping is affected by -fdiagnostics-escape-format=; with
-fdiagnostics-escape-format=bytes, the rendered text of the above is:

  "9 |     /*<e2><80><ae> } <e2><81><a6>if (isAdmin)<e2><81><a9> <e2><81><a6> begin admins only */\n"
  "  |       ~~~~~~~~~~~~                                        ~~~~~~~~~~~~                    ^\n"
  "  |       |                                                   |                               |\n"
  "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)                     U+2066 (LEFT-TO-RIGHT ISOLATE)  end of bidirectional context\n"

The patch also refactors/adds enough selftest machinery to be able to
test the snippet generation from within the selftest framework, rather
than just within DejaGnu (where the regex-based testing isn't
sophisticated enough to verify such properties as the above).

gcc/ChangeLog:
	* Makefile.in (OBJS-libcommon): Add selftest-json.o.
	* diagnostic-format-sarif.cc: Include "selftest.h",
	"selftest-diagnostic.h", "selftest-diagnostic-show-locus.h",
	"selftest-json.h", and "text-range-label.h".
	(class content_renderer): New.
	(sarif_builder::m_rules_arr): Convert to std::unique_ptr.
	(sarif_builder::make_location_object): Add class
	escape_nonascii_renderer.  If rich_loc.escape_on_output_p (),
	pass a nonnull escape_nonascii_renderer to
	maybe_make_physical_location_object as its snippet_renderer, and
	add a property bag property "gcc/escapeNonAscii" to the SARIF
	location object.  For other overloads of make_location_object,
	pass nullptr for the snippet_renderer.
	(sarif_builder::maybe_make_region_object_for_context): Add
	"snippet_renderer" param and pass it to
	maybe_make_artifact_content_object.
	(sarif_builder::make_tool_object): Drop "const".
	(sarif_builder::make_driver_tool_component_object): Likewise.
	Use typesafe unique_ptr variant of object::set for setting "rules"
	property on driver_obj.
	(sarif_builder::maybe_make_artifact_content_object): Add param "r"
	and use it to potentially set the "rendered" property (§3.3.4).
	(selftest::test_make_location_object): New.
	(selftest::diagnostic_format_sarif_cc_tests): New.
	* diagnostic-show-locus.cc: Include "text-range-label.h" and
	"selftest-diagnostic-show-locus.h".
	(selftests::diagnostic_show_locus_fixture::diagnostic_show_locus_fixture):
	New.
	(selftests::test_layout_x_offset_display_utf8): Use
	diagnostic_show_locus_fixture to simplify and consolidate setup
	code.
	(selftests::test_diagnostic_show_locus_one_liner): Likewise.
	(selftests::test_one_liner_colorized_utf8): Likewise.
	(selftests::test_diagnostic_show_locus_one_liner_utf8): Likewise.
	* gcc-rich-location.h (class text_range_label): Move to new file
	text-range-label.h.
	* selftest-diagnostic-show-locus.h: New file, based on material in
	diagnostic-show-locus.cc.
	* selftest-json.cc: New file.
	* selftest-json.h: New file.
	* selftest-run-tests.cc (selftest::run_tests): Call
	selftest::diagnostic_format_sarif_cc_tests.
	* selftest.h (selftest::diagnostic_format_sarif_cc_tests): New decl.

gcc/testsuite/ChangeLog:
	* c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c: Verify
	that we have a property bag with property "gcc/escapeNonAscii": true.
	Verify that we have a "rendered" property for a snippet.
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Include
	"text-range-label.h".

gcc/ChangeLog:
	* text-range-label.h: New file, taking class text_range_label from
	gcc-rich-location.h.

libcpp/ChangeLog:
	* include/rich-location.h
	(semi_embedded_vec::semi_embedded_vec): Add copy ctor.
	(rich_location::rich_location): Remove "= delete" from decl of
	copy ctor.  Add deleted decl of move ctor.
	(rich_location::operator=): Remove "= delete" from decl of
	copy assignment.  Add deleted decl of move assignment.
	(fixit_hint::fixit_hint): Add copy ctor decl.  Add deleted decl of
	move.
	(fixit_hint::operator=): Add copy assignment decl.  Add deleted
	decl of move assignment.
	* line-map.cc (rich_location::rich_location): New copy ctor.
	(fixit_hint::fixit_hint): New copy ctor.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-07-24 18:07:54 -04:00


/* Declarations relating to class gcc_rich_location
   Copyright (C) 2014-2024 Free Software Foundation, Inc.

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.

GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */

#ifndef GCC_RICH_LOCATION_H
#define GCC_RICH_LOCATION_H

#include "rich-location.h"

/* A gcc_rich_location is libcpp's rich_location with additional
   helper methods for working with gcc's types. The class is not
   copyable or assignable because rich_location isn't. */

class gcc_rich_location : public rich_location
{
 public:
  /* Constructors. */

  /* Constructing from a location. */
  explicit gcc_rich_location (location_t loc)
  : rich_location (line_table, loc, nullptr, nullptr)
  {
  }

  /* Constructing from a location with a label and a highlight color. */
  explicit gcc_rich_location (location_t loc,
                              const range_label *label,
                              const char *highlight_color)
  : rich_location (line_table, loc, label, highlight_color)
  {
  }

  /* Methods for adding ranges via gcc entities. */
  void
  add_expr (tree expr,
            range_label *label,
            const char *highlight_color);
  void
  maybe_add_expr (tree t,
                  range_label *label,
                  const char *highlight_color);

  void add_fixit_misspelled_id (location_t misspelled_token_loc,
                                tree hint_id);

  /* If LOC is within the spans of lines that will already be printed for
     this gcc_rich_location, then add it as a secondary location
     and return true.

     Otherwise return false.

     This allows for a diagnostic to compactly print secondary locations
     in one diagnostic when these are near enough the primary locations for
     diagnostic-show-locus.cc to cope with them, and to fall back to
     printing them via a note otherwise e.g.:

        gcc_rich_location richloc (primary_loc);
        bool added_secondary = richloc.add_location_if_nearby (*global_dc,
                                                               secondary_loc);
        error_at (&richloc, "main message");
        if (!added_secondary)
          inform (secondary_loc, "message for secondary");

     Implemented in diagnostic-show-locus.cc. */

  bool add_location_if_nearby (const diagnostic_context &ctxt,
                               location_t loc,
                               bool restrict_to_current_line_spans = true,
                               const range_label *label = NULL);

  /* Add a fix-it hint suggesting the insertion of CONTENT before
     INSERTION_POINT.

     Attempt to handle formatting: if INSERTION_POINT is the first thing on
     its line, and INDENT is sufficiently sane, then add CONTENT on its own
     line, using the indentation of INDENT.
     Otherwise, add CONTENT directly before INSERTION_POINT.

     For example, adding "CONTENT;" with the closing brace as the insertion
     point and using "INDENT;" for indentation:

       if ()
         {
           INDENT;
         }

     would lead to:

       if ()
         {
           INDENT;
           CONTENT;
         }

     but adding it to:

       if () {INDENT;}

     would lead to:

       if () {INDENT;CONTENT;}
  */
  void add_fixit_insert_formatted (const char *content,
                                   location_t insertion_point,
                                   location_t indent);
};

#endif /* GCC_RICH_LOCATION_H */