Add string escape function or escape functionality built into CAPTURE macro #2967

toughengineer · 2025-03-28T12:31:57Z

The problem

This:

const auto asciiControlsString =
  "null \0 tab \t carriage return \r line feed \n whatever this is \xf quotation mark \" chars"sv;
CAPTURE(asciiControlsString);

results in a message like this:

with message:
 line feed rolsString := "null  tab      carriage return
   whatever this is � quotation mark " chars"

Notice that the carriage return character \r garbles the message:

 line feed rolsString := "null  tab      carriage return
 ^         ^
text       remnants of the variable name
after \r
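
The overwrite effect can be reproduced with a small model of how a terminal treats \r. This is only an illustrative sketch: renderLine is a hypothetical name, and real terminals handle many more control characters than this.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Illustrative model of one terminal line (hypothetical helper):
// '\r' moves the cursor back to column 0, and the characters that
// follow overwrite whatever was already printed there.
std::string renderLine(std::string_view text) {
    std::string line;
    std::size_t cursor = 0;
    for (const char c : text) {
        if (c == '\r') {
            cursor = 0;                 // back to the start of the line
        } else {
            if (cursor < line.size()) {
                line[cursor] = c;       // overwrite an existing column
            } else {
                line.push_back(c);      // extend the line
            }
            ++cursor;
        }
    }
    return line;
}
```

Feeding it the variable name followed by text containing \r leaves the tail of the name ("rolsString") visible after the overwritten prefix, as in the output above.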

Proposed solution

It would be wonderful if Catch2 provided an escape() function that escapes non-printable and some special characters (like carriage return), so people don't have to reinvent the wheel every time.

The result of code like this:

CAPTURE(escape(asciiControlsString));

can look like this:

with message:
  escape(asciiControlsString) := "null \0 tab \t carriage return \r
  line feed \n whatever this is � quotation mark " chars"

Alternatively, the CAPTURE() macro could incorporate this functionality, or Catch2 could provide an additional macro with such functionality, e.g. CAPTURE_ESCAPED().

ChrisThrasher · 2025-04-08T22:31:44Z

Could you solve your problem by using raw string literals? That way you can type \0 without that being converted into a null character. Instead you'll have \ and 0 end up in the string itself.

toughengineer · 2025-04-08T22:50:38Z

Sometimes you need to have strings containing special characters as test input, and you want to have a message containing the input as a context to make diagnosing easier when the test fails, e.g.

const auto asciiControlsString =
  "null \0 tab \t carriage return \r line feed \n whatever this is \xf quotation mark \" chars"sv;
CAPTURE(asciiControlsString);

@ChrisThrasher, I don't understand how your suggestion to use raw string literals solves the problem of garbled up output of strings containing special characters.

ChrisThrasher · 2025-04-08T22:53:16Z

Try this

const auto asciiControlsString =
  R"(null \0 tab \t carriage return \r line feed \n whatever this is \xf quotation mark " chars)"sv;

Raw string literals ensure those characters do not get escaped. This is the canonical C++ solution for writing strings that contain escape sequences that you don't want to get escaped.

toughengineer · 2025-04-08T23:22:28Z

Again, the point is to have special characters in the input string in order to test some functionality.
At the same time straightforwardly printing such a string (e.g. capturing it with CAPTURE() macro) as context to help diagnose problems in case the test fails garbles up the output.

To try to completely eliminate misunderstanding, imagine that the input is defined like this:

const auto asciiControlsString =
  std::string{'\x6e', '\x75', '\x6c', '\x6c', '\x20', '\x0', '\x20', '\x74', '\x61', '\x62',
              '\x20', '\x9', '\x20', '\x63', '\x61', '\x72', '\x72', '\x69', '\x61', '\x67',
              '\x65', '\x20', '\x72', '\x65', '\x74', '\x75', '\x72', '\x6e', '\x20', '\xd',
              '\x20', '\x6c', '\x69', '\x6e', '\x65', '\x20', '\x66', '\x65', '\x65', '\x64',
              '\x20', '\xa', '\x20', '\x77', '\x68', '\x61', '\x74', '\x65', '\x76', '\x65',
              '\x72', '\x20', '\x74', '\x68', '\x69', '\x73', '\x20', '\x69', '\x73', '\x20',
              '\xf', '\x20', '\x71', '\x75', '\x6f', '\x74', '\x61', '\x74', '\x69', '\x6f',
              '\x6e', '\x20', '\x6d', '\x61', '\x72', '\x6b', '\x20', '\x22', '\x20', '\x63',
              '\x68', '\x61', '\x72', '\x73'};
CAPTURE(asciiControlsString);

Your suggestion to not have special characters in the input string when you want to test input with special characters does not solve any problems.

ChrisThrasher · 2025-04-09T00:19:05Z

Okay, I see. It took me a minute to understand what you're asking for. You just want a free function that takes a given string and returns that same string but with certain ASCII characters like \n converted to \ and n. Can you elaborate more on the use case for this? What makes this worth including in the library when it could be a free function in a given codebase's test suite?

toughengineer · 2025-04-09T07:44:24Z

As I stated in the initial message, having to reinvent such an escape function in every codebase's test suite that needs it is cumbersome,
so it would be worthwhile for Catch2 to offer this functionality out of the box.

ChrisThrasher · 2025-04-09T14:07:57Z

Right, that’s just repeating what your issue originally stated. I’m asking for use cases. You have described what you want the function to do but haven’t really talked about the use case that is motivating you in the first place.

Does your use case apply beyond just you? Are you aware of any other codebases that have the same problem and would benefit from such a feature?

toughengineer · 2025-04-09T15:37:17Z

I thought the use case was self-evident.

Sometimes you need to have strings containing special characters as test input, and you want to have a message containing the input as context to make diagnosing easier when the test fails.
Here is a slightly expanded example from my specific case:

//...
  SECTION("strings with ASCII control characters and \"") {
    const auto asciiControlsString =
      "null \0 tab \t carriage return \r line feed \n whatever this is \xf quotation mark \" chars"sv;

    SECTION("unchanged in relaxed mode") {
      CAPTURE(asciiControlsString);
      CHECK(minjson::unescape(asciiControlsString, minjson::UnescapeMode::Relaxed) == asciiControlsString);
    }


    SECTION("error in strict mode and during parsing") {
      SECTION("unescape") {
        CAPTURE(asciiControlsString);
        CHECK(minjson::unescape(asciiControlsString).empty());
        CHECK(minjson::unescape(asciiControlsString, minjson::UnescapeMode::Strict).empty());
      }
//...

I imagine this situation is quite common for certain types of code, like a JSON library, although I don't have a list of codebases that are in a similar situation.

A quick search turned up https://github.com/nlohmann/json. Although they use doctest instead of Catch, their usage of CAPTURE() in tests illustrates the use case, e.g.:
https://github.com/nlohmann/json/blob/00ecc7ed7ab2f37fbdf5bc7eca46503301999547/tests/src/unit-testsuites.cpp#L241-L252

        auto TEST_STRING = [](const std::string & json_string, const std::string & expected)
        {
            CAPTURE(json_string)
            CAPTURE(expected)
            CHECK(json::parse(json_string)[0].get<std::string>() == expected);
        };


        TEST_STRING("[\"\"]", "");
        TEST_STRING("[\"Hello\"]", "Hello");
        TEST_STRING(R"(["Hello\nWorld"])", "Hello\nWorld");
        //TEST_STRING("[\"Hello\\u0000World\"]", "Hello\0World");
        TEST_STRING(R"(["\"\\/\b\f\n\r\t"])", "\"\\/\b\f\n\r\t");

In this particular example, when expected == "\"\\/\b\f\n\r\t", the output of CAPTURE(expected) would be unreadable.

ChrisThrasher · 2025-04-12T17:58:28Z

Can you elaborate on how certain escape sequences will be replaced? One concern I have is different users will want to replace escape sequences with different replacement strings. For example, one user may want to print \n as an actual newline while someone else wants to replace that with the string "\n" or even the hex value of the newline character. I'm not sure how we can make a central API that satisfies all those use cases, not to mention other non-printable characters like ASCII control codes.

toughengineer · 2025-04-12T19:37:35Z

Anything that makes the output not garbled and understandable is acceptable to me.

The obvious way is to use the same rules as in C++, e.g. \0 \b \f \n \r \t \\ for standard escapes and \xN for other non-printable characters.

In this case you want to escape the backslash itself as \\ to distinguish e.g. a single newline character, shown as \n, from the literal sequence of a backslash followed by a lowercase 'n', shown as \\n.

You can also decide to escape \" and maybe \', which makes the result copy-pastable right into code, a nice feature.
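
A minimal sketch of what such a function might look like under these rules. It is not part of Catch2: the name escape and the exact set of characters handled are assumptions taken from this discussion, and bytes at or above 0x80 are passed through untouched.

```cpp
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper, not part of Catch2: escapes ASCII control
// characters, backslashes and double quotes in C++ literal notation.
std::string escape(std::string_view input) {
    std::string out;
    out.reserve(input.size());
    for (const char c : input) {
        switch (c) {
            case '\0': out += "\\0"; break;
            case '\b': out += "\\b"; break;
            case '\f': out += "\\f"; break;
            case '\n': out += "\\n"; break;
            case '\r': out += "\\r"; break;
            case '\t': out += "\\t"; break;
            case '\\': out += "\\\\"; break;
            case '"':  out += "\\\""; break;
            default: {
                const auto byte = static_cast<unsigned char>(c);
                if (byte < 0x20 || byte == 0x7f) {
                    // Remaining ASCII control characters become \xN.
                    char buf[8];
                    std::snprintf(buf, sizeof buf, "\\x%x",
                                  static_cast<unsigned>(byte));
                    out += buf;
                } else {
                    out += c;  // printable ASCII and bytes >= 0x80
                }
            }
        }
    }
    return out;
}
```

One caveat for the copy-paste use case: in a C++ literal, \xN consumes every hex digit that follows and \0 followed by a digit is read as an octal escape, so output such as \xfa or \01 would not round-trip exactly. A fixed-width form or splitting the literal would be needed for that.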

A side note.

Detecting invalid UTF-8 sequences and outputting the offending code units escaped would be amazing, e.g.:

this code point is missing one continuation byte: "\xF0\x9F\x98"

But that probably should not be the responsibility of this function.
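
For illustration only, here is a rough sketch of how truncated sequences like the one above could be detected. The name escapeInvalidUtf8 is hypothetical, and the check is deliberately simplified: it looks only at the lead byte and the number of continuation bytes, so overlong encodings and surrogates are not rejected.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper, not part of Catch2: copies structurally complete
// UTF-8 sequences through and writes every other byte as \xHH.
std::string escapeInvalidUtf8(std::string_view input) {
    const auto isContinuation = [](char c) {
        return (static_cast<unsigned char>(c) & 0xC0) == 0x80;
    };
    std::string out;
    std::size_t i = 0;
    while (i < input.size()) {
        const auto lead = static_cast<unsigned char>(input[i]);
        // Sequence length implied by the lead byte; 0 means "not a lead byte".
        std::size_t length = 0;
        if (lead < 0x80) length = 1;
        else if ((lead & 0xE0) == 0xC0) length = 2;
        else if ((lead & 0xF0) == 0xE0) length = 3;
        else if ((lead & 0xF8) == 0xF0) length = 4;

        // The sequence is complete if all continuation bytes are present.
        bool complete = length != 0 && i + length <= input.size();
        for (std::size_t k = 1; complete && k < length; ++k) {
            complete = isContinuation(input[i + k]);
        }

        if (complete) {
            out.append(input.substr(i, length));
            i += length;
        } else {
            // Escape this byte only, then try to resynchronise at the next one.
            char buf[8];
            std::snprintf(buf, sizeof buf, "\\x%02X",
                          static_cast<unsigned>(lead));
            out += buf;
            ++i;
        }
    }
    return out;
}
```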

The purpose of the hypothetical function is to offer functionality, not to satisfy all potential users.
The same logic applies to Catch's current functionality: you had to decide which functionality to provide and stop there, instead of wondering what one user or another might want it to be.

toughengineer · 2025-04-30T13:57:48Z

Today I discovered the -i option, so similar functionality already exists, but one must explicitly turn it on.
It would be great if this functionality could be turned on for specific invocations of e.g. CAPTURE().

philsquared · 2025-04-30T16:17:58Z

Hey, remember me? :-) I still get notifications for these issues bubbling up to me sometimes.
In this case, since it was me that added -i originally I thought I'd chip in.

The purpose of that switch was for exactly what you describe (AFAICS) - although do say if you think there is anything missing in terms of what it converts (I think I added that back in the day when I was the "primary customer").

Sounds like you want to be able to apply the same feature, but programmatically for certain strings, only?

IIRC the conversion happens at the reporting stage, so it might not be trivial to accomplish that by just building on what's already there. Would a per-test-case option be sufficient? (no promises, just asking)

toughengineer · 2025-04-30T16:56:43Z

> Sounds like you want to be able to apply the same feature, but programmatically for certain strings, only?

Pretty much.

Anything per test case/per section/per macro invocation would be sufficient for me.
Per test case (or even per section) escaping of "invisibles" makes a lot of sense now that you mention it, but a separate escape function would also do.

I want it to be baked into the test code itself, so e.g. someone (probably me a year from now) doesn't have to stare at the garbled output and somehow discover that -i option is a thing.
