Commit d37e1f3

vdmitrienko authored and marcphilipp committed
Introduce commentCharacter attribute in @Csv{File}Source (#5031)
The `@CsvSource` and `@CsvFileSource` annotations now allow specifying a custom comment character using the new `commentCharacter` attribute to resolve conflicts with using `#` as delimiter. Resolves #5028.
1 parent 77a900d commit d37e1f3

9 files changed: +233 -19 lines changed


documentation/src/docs/asciidoc/release-notes/release-notes-6.0.1.adoc

Lines changed: 8 additions & 2 deletions
@@ -35,7 +35,12 @@ repository on GitHub.
 [[release-notes-6.0.1-junit-jupiter-bug-fixes]]
 ==== Bug Fixes
 
-* ❓
+* A regression introduced in version 6.0.0 caused an exception when using `@CsvSource` or
+  `@CsvFileSource` if the `delimiter` or `delimiterString` attribute was set to `+++#+++`.
+  This occurred because `+++#+++` was used as the default comment character without an
+  option to change it. To resolve this, a new `commentCharacter` attribute has been added
+  to both annotations. Its default value remains `+++#+++`, but it can now be customized
+  to avoid conflicts with other control characters.
 
 [[release-notes-6.0.1-junit-jupiter-deprecations-and-breaking-changes]]
 ==== Deprecations and Breaking Changes
@@ -45,7 +50,8 @@ repository on GitHub.
 [[release-notes-6.0.1-junit-jupiter-new-features-and-improvements]]
 ==== New Features and Improvements
 
-* ❓
+* The `@CsvSource` and `@CsvFileSource` annotations now allow specifying
+  a custom comment character using the new `commentCharacter` attribute.
 
 
 [[release-notes-6.0.1-junit-vintage]]

documentation/src/docs/asciidoc/user-guide/writing-tests.adoc

Lines changed: 12 additions & 8 deletions
@@ -2270,12 +2270,16 @@ The generated display names for the previous example include the CSV header name
 ----
 
 In contrast to CSV records supplied via the `value` attribute, a text block can contain
-comments. Any line beginning with a `+++#+++` symbol will be treated as a comment and
-ignored. Note, however, that the `+++#+++` symbol must be the first character on the line
-without any leading whitespace. It is therefore recommended that the closing text block
-delimiter (`"""`) be placed either at the end of the last line of input or on the
-following line, left aligned with the rest of the input (as can be seen in the example
-below which demonstrates formatting similar to a table).
+comments. Any line beginning with the value of the `commentCharacter` attribute (`+++#+++`
+by default) will be treated as a comment and ignored. Note that there is one exception
+to this rule: if the comment character appears within a quoted field, it loses
+its special meaning.
+
+The comment character must be the first character on the line without any leading
+whitespace. It is therefore recommended that the closing text block delimiter (`"""`)
+be placed either at the end of the last line of input or on the following line,
+left aligned with the rest of the input (as can be seen in the example below which
+demonstrates formatting similar to a table).
 
 [source,java,indent=0]
 ----
@@ -2325,8 +2329,8 @@ The default delimiter is a comma (`,`), but you can use another character by set
 cannot be set simultaneously.
 
 .Comments in CSV files
-NOTE: Any line beginning with a `+++#+++` symbol will be interpreted as a comment and will
-be ignored.
+NOTE: Any line beginning with the value of the `commentCharacter` attribute (`+++#+++`
+by default) will be interpreted as a comment and will be ignored.
 
 [source,java,indent=0]
 ----
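
As a rough illustration of the rule documented in the user guide change above, the following plain-Java sketch treats a line as a comment only when its very first character equals the comment character. It does not use JUnit or FastCSV and is not their implementation. `CommentLineSketch`, `isCommentLine`, and `withoutComments` are hypothetical names introduced for this example, and the sketch assumes every record fits on a single line (quoted fields that wrap across lines are not handled).

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch only; not the JUnit or FastCSV implementation.
final class CommentLineSketch {

    // A line is a comment only if its first character is the comment
    // character. Leading whitespace or a leading quote disqualifies it.
    static boolean isCommentLine(String line, char commentCharacter) {
        return !line.isEmpty() && line.charAt(0) == commentCharacter;
    }

    // Drops comment lines and keeps all other lines unchanged.
    static List<String> withoutComments(List<String> lines, char commentCharacter) {
        return lines.stream()
                .filter(line -> !isCommentLine(line, commentCharacter))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of("*foo", "bar, *baz", "'*bar', baz", "  *indented");
        // Only the first line starts with '*', so only that line is dropped.
        System.out.println(withoutComments(lines, '*'));
    }
}
```

The input lines mirror the text block used in the `honorsCustomCommentCharacter` test in this commit: `*foo` is skipped, while `bar, *baz` and `'*bar', baz` survive because the comment character is not the first character on those lines.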
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+org.junit.jupiter.params.provider.CsvFileSource#commentCharacter
+org.junit.jupiter.params.provider.CsvSource#commentCharacter

junit-jupiter-params/src/main/java/org/junit/jupiter/params/provider/CsvFileSource.java

Lines changed: 25 additions & 2 deletions
@@ -10,6 +10,7 @@
 
 package org.junit.jupiter.params.provider;
 
+import static org.apiguardian.api.API.Status.EXPERIMENTAL;
 import static org.apiguardian.api.API.Status.STABLE;
 
 import java.lang.annotation.Documented;
@@ -35,8 +36,8 @@
  * that the first record may optionally be used to supply CSV headers (see
  * {@link #useHeadersInDisplayName}).
  *
- * <p>Any line beginning with a {@code #} symbol will be interpreted as a comment
- * and will be ignored.
+ * <p>Any line beginning with a {@link #commentCharacter}
+ * will be interpreted as a comment and will be ignored.
  *
  * <p>The column delimiter (which defaults to a comma ({@code ,})) can be customized
  * via either {@link #delimiter} or {@link #delimiterString}.
@@ -63,6 +64,10 @@
  * column is trimmed by default. This behavior can be changed by setting the
  * {@link #ignoreLeadingAndTrailingWhitespace} attribute to {@code true}.
  *
+ * <p>Note that {@link #delimiter} (or {@link #delimiterString}),
+ * {@link #quoteCharacter}, and {@link #commentCharacter} are treated as
+ * <em>control characters</em> and must all be distinct.
+ *
  * <h2>Inheritance</h2>
  *
  * <p>This annotation is inherited to subclasses.
@@ -235,4 +240,22 @@
     @API(status = STABLE, since = "5.10")
     boolean ignoreLeadingAndTrailingWhitespace() default true;
 
+    /**
+     * The character used to denote comments when reading the CSV files.
+     *
+     * <p>Any line that begins with this character will be treated as a comment
+     * and ignored during parsing. Note that there is one exception to this rule:
+     * if the comment character appears within a quoted field, it loses its
+     * special meaning.
+     *
+     * <p>The comment character must be the first character on the line without
+     * any leading whitespace.
+     *
+     * <p>Defaults to {@code '#'}.
+     *
+     * @since 6.0.1
+     */
+    @API(status = EXPERIMENTAL, since = "6.0.1")
+    char commentCharacter() default '#';
+
 }

junit-jupiter-params/src/main/java/org/junit/jupiter/params/provider/CsvReaderFactory.java

Lines changed: 34 additions & 2 deletions
@@ -18,7 +18,9 @@
 import java.nio.charset.Charset;
 import java.util.Set;
 import java.util.UUID;
+import java.util.stream.Stream;
 
+import de.siegmar.fastcsv.reader.CommentStrategy;
 import de.siegmar.fastcsv.reader.CsvCallbackHandler;
 import de.siegmar.fastcsv.reader.CsvReader;
 import de.siegmar.fastcsv.reader.CsvRecord;
@@ -65,15 +67,20 @@ private static void validateDelimiter(char delimiter, String delimiterString, An
 
     static CsvReader<? extends CsvRecord> createReaderFor(CsvSource csvSource, String data) {
         String delimiter = selectDelimiter(csvSource.delimiter(), csvSource.delimiterString());
+        var commentStrategy = csvSource.textBlock().isEmpty() ? NONE : SKIP;
         // @formatter:off
+        validateControlCharactersDiffer(
+                delimiter, csvSource.quoteCharacter(), csvSource.commentCharacter(), commentStrategy);
+
         var builder = CsvReader.builder()
                 .skipEmptyLines(SKIP_EMPTY_LINES)
                 .trimWhitespacesAroundQuotes(TRIM_WHITESPACES_AROUND_QUOTES)
                 .allowExtraFields(ALLOW_EXTRA_FIELDS)
                 .allowMissingFields(ALLOW_MISSING_FIELDS)
                 .fieldSeparator(delimiter)
                 .quoteCharacter(csvSource.quoteCharacter())
-                .commentStrategy(csvSource.textBlock().isEmpty() ? NONE : SKIP);
+                .commentStrategy(commentStrategy)
+                .commentCharacter(csvSource.commentCharacter());
 
         var callbackHandler = createCallbackHandler(
                 csvSource.emptyValue(),
@@ -90,15 +97,20 @@ static CsvReader<? extends CsvRecord> createReaderFor(CsvFileSource csvFileSourc
             Charset charset) {
 
         String delimiter = selectDelimiter(csvFileSource.delimiter(), csvFileSource.delimiterString());
+        var commentStrategy = SKIP;
         // @formatter:off
+        validateControlCharactersDiffer(
+                delimiter, csvFileSource.quoteCharacter(), csvFileSource.commentCharacter(), commentStrategy);
+
         var builder = CsvReader.builder()
                 .skipEmptyLines(SKIP_EMPTY_LINES)
                 .trimWhitespacesAroundQuotes(TRIM_WHITESPACES_AROUND_QUOTES)
                 .allowExtraFields(ALLOW_EXTRA_FIELDS)
                 .allowMissingFields(ALLOW_MISSING_FIELDS)
                 .fieldSeparator(delimiter)
                 .quoteCharacter(csvFileSource.quoteCharacter())
-                .commentStrategy(SKIP);
+                .commentStrategy(commentStrategy)
+                .commentCharacter(csvFileSource.commentCharacter());
 
         var callbackHandler = createCallbackHandler(
                 csvFileSource.emptyValue(),
@@ -121,6 +133,26 @@ private static String selectDelimiter(char delimiter, String delimiterString) {
         return DEFAULT_DELIMITER;
     }
 
+    private static void validateControlCharactersDiffer(String delimiter, char quoteCharacter, char commentCharacter,
+            CommentStrategy commentStrategy) {
+
+        if (commentStrategy == NONE) {
+            Preconditions.condition(stringValuesUnique(delimiter, quoteCharacter),
+                () -> ("delimiter or delimiterString: '%s' and quoteCharacter: '%s' " + //
+                        "must differ").formatted(delimiter, quoteCharacter));
+        }
+        else {
+            Preconditions.condition(stringValuesUnique(delimiter, quoteCharacter, commentCharacter),
+                () -> ("delimiter or delimiterString: '%s', quoteCharacter: '%s', and commentCharacter: '%s' " + //
+                        "must all differ").formatted(delimiter, quoteCharacter, commentCharacter));
+        }
+    }
+
+    private static boolean stringValuesUnique(Object... values) {
+        long uniqueCount = Stream.of(values).map(String::valueOf).distinct().count();
+        return uniqueCount == values.length;
+    }
+
     private static CsvCallbackHandler<? extends CsvRecord> createCallbackHandler(String emptyValue,
             Set<String> nullValues, boolean ignoreLeadingAndTrailingWhitespaces, int maxCharsPerColumn,
             boolean useHeadersInDisplayName) {
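
The validation added in `CsvReaderFactory` can be tried out in isolation. The sketch below is a standalone approximation, assuming plain Java with no JUnit or FastCSV on the classpath: `ControlCharacterSketch`, `allDistinct`, and `isValid` are hypothetical names, and the `commentsEnabled` flag is a stand-in for the `CommentStrategy` argument (`true` roughly corresponds to `SKIP`, `false` to `NONE`). It returns a boolean instead of throwing a precondition violation.

```java
import java.util.stream.Stream;

// Illustrative sketch only; mirrors the shape of the uniqueness check in this
// commit but is not the JUnit source.
final class ControlCharacterSketch {

    // Compares the values by their string form, so a String delimiter and a
    // char quote or comment character can be checked together.
    static boolean allDistinct(Object... values) {
        long uniqueCount = Stream.of(values).map(String::valueOf).distinct().count();
        return uniqueCount == values.length;
    }

    // commentsEnabled is a hypothetical stand-in for the CommentStrategy
    // argument: when comments are not parsed, the comment character is
    // left out of the comparison.
    static boolean isValid(String delimiter, char quoteCharacter, char commentCharacter,
            boolean commentsEnabled) {
        return commentsEnabled
                ? allDistinct(delimiter, quoteCharacter, commentCharacter)
                : allDistinct(delimiter, quoteCharacter);
    }

    public static void main(String[] args) {
        // Delimiter and comment character collide, but comments are disabled.
        System.out.println(isValid("#", '\'', '#', false));
        // Same collision with comments enabled.
        System.out.println(isValid("#", '\'', '#', true));
    }
}
```

Because the comparison in this sketch works on whole strings, a multi-character delimiter string such as `##` counts as different from a `#` comment character.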

junit-jupiter-params/src/main/java/org/junit/jupiter/params/provider/CsvSource.java

Lines changed: 37 additions & 5 deletions
@@ -10,6 +10,7 @@
 
 package org.junit.jupiter.params.provider;
 
+import static org.apiguardian.api.API.Status.EXPERIMENTAL;
 import static org.apiguardian.api.API.Status.STABLE;
 
 import java.lang.annotation.Documented;
@@ -62,6 +63,16 @@
  * physical line within the text block. Thus, if a CSV column wraps across a
  * new line in a text block, the column must be a quoted string.
  *
+ * <p>Note that {@link #delimiter} (or {@link #delimiterString}),
+ * {@link #quoteCharacter}, and {@link #commentCharacter} (when
+ * {@link #textBlock} is used) are treated as <em>control characters</em>.
+ *
+ * <ul>
+ * <li>{@link #delimiter} and {@link #quoteCharacter} must always be distinct.</li>
+ * <li>{@link #commentCharacter} must be distinct from the others only when
+ * {@link #textBlock} is used.</li>
+ * </ul>
+ *
  * <h2>Inheritance</h2>
  *
  * <p>This annotation is inherited to subclasses.
@@ -132,17 +143,20 @@
      * {@link #useHeadersInDisplayName}).
      *
      * <p>In contrast to CSV records supplied via {@link #value}, a text block
-     * can contain comments. Any line beginning with a hash tag ({@code #}) will
-     * be treated as a comment and ignored. Note, however, that the {@code #}
-     * symbol must be the first character on the line without any leading
-     * whitespace. It is therefore recommended that the closing text block
+     * can contain comments. Any line beginning with a {@link #commentCharacter}
+     * will be treated as a comment and ignored. Note that there is one exception
+     * to this rule: if the comment character appears within a quoted field,
+     * it loses its special meaning.
+     *
+     * <p>The comment character must be the first character on the line without
+     * any leading whitespace. It is therefore recommended that the closing text block
      * delimiter {@code """} be placed either at the end of the last line of
      * input or on the following line, vertically aligned with the rest of the
      * input (as can be seen in the example below).
      *
      * <p>Java's <a href="https://docs.oracle.com/en/java/javase/15/text-blocks/index.html">text block</a>
      * feature automatically removes <em>incidental whitespace</em> when the code
-     * is compiled. However other JVM languages such as Groovy and Kotlin do not.
+     * is compiled. However, other JVM languages such as Groovy and Kotlin do not.
      * Thus, if you are using a programming language other than Java and your text
      * block contains comments or new lines within quoted strings, you will need
      * to ensure that there is no leading whitespace within your text block.
@@ -296,4 +310,22 @@
     @API(status = STABLE, since = "5.10")
     boolean ignoreLeadingAndTrailingWhitespace() default true;
 
+    /**
+     * The character used to denote comments in a {@linkplain #textBlock text block}.
+     *
+     * <p>Any line that begins with this character will be treated as a comment
+     * and ignored during parsing. Note that there is one exception to this rule:
+     * if the comment character appears within a quoted field, it loses its
+     * special meaning.
+     *
+     * <p>The comment character must be the first character on the line without
+     * any leading whitespace.
+     *
+     * <p>Defaults to {@code '#'}.
+     *
+     * @since 6.0.1
+     */
+    @API(status = EXPERIMENTAL, since = "6.0.1")
+    char commentCharacter() default '#';
+
 }

jupiter-tests/src/test/java/org/junit/jupiter/params/provider/CsvArgumentsProviderTests.java

Lines changed: 77 additions & 0 deletions
@@ -400,6 +400,83 @@ void honorsCommentCharacterWhenUsingTextBlockAttribute() {
         assertThat(arguments).containsExactly(array("bar", "#baz"), array("#bar", "baz"));
     }
 
+    @Test
+    void honorsCustomCommentCharacter() {
+        var annotation = csvSource().textBlock("""
+                *foo
+                bar, *baz
+                '*bar', baz
+                """).commentCharacter('*').build();
+
+        var arguments = provideArguments(annotation);
+
+        assertThat(arguments).containsExactly(array("bar", "*baz"), array("*bar", "baz"));
+    }
+
+    @Test
+    void doesNotThrowExceptionWhenDelimiterAndCommentCharacterTheSameWhenUsingValueAttribute() {
+        var annotation = csvSource().lines("foo#bar").delimiter('#').commentCharacter('#').build();
+
+        var arguments = provideArguments(annotation);
+
+        assertThat(arguments).containsExactly(array("foo", "bar"));
+    }
+
+    @ParameterizedTest
+    @MethodSource("invalidDelimiterAndQuoteCharacterCombinations")
+    void doesNotThrowExceptionWhenDelimiterAndCommentCharacterAreTheSameWhenUsingValueAttribute(Object delimiter,
+            char quoteCharacter) {
+
+        var builder = csvSource().lines("foo").quoteCharacter(quoteCharacter);
+
+        var annotation = delimiter instanceof Character c //
+                ? builder.delimiter(c).build() //
+                : builder.delimiterString(delimiter.toString()).build();
+
+        var message = "delimiter or delimiterString: '%s' and quoteCharacter: '%s' must differ";
+        assertPreconditionViolationFor(() -> provideArguments(annotation).findAny()) //
+                .withMessage(message.formatted(delimiter, quoteCharacter));
+    }
+
+    static Stream<Arguments> invalidDelimiterAndQuoteCharacterCombinations() {
+        return Stream.of(
+            // delimiter
+            Arguments.of('*', '*'), //
+            // delimiterString
+            Arguments.of("*", '*'));
+    }
+
+    @ParameterizedTest
+    @MethodSource("invalidDelimiterQuoteCharacterAndCommentCharacterCombinations")
+    void throwsExceptionWhenControlCharactersAreTheSameWhenUsingTextBlockAttribute(Object delimiter,
+            char quoteCharacter, char commentCharacter) {
+
+        var builder = csvSource().textBlock("""
+                foo""").quoteCharacter(quoteCharacter).commentCharacter(commentCharacter);
+
+        var annotation = delimiter instanceof Character c //
+                ? builder.delimiter(c).build() //
+                : builder.delimiterString(delimiter.toString()).build();
+
+        var message = "delimiter or delimiterString: '%s', quoteCharacter: '%s', and commentCharacter: '%s' " + //
+            "must all differ";
+        assertPreconditionViolationFor(() -> provideArguments(annotation).findAny()) //
+                .withMessage(message.formatted(delimiter, quoteCharacter, commentCharacter));
+    }
+
+    static Stream<Arguments> invalidDelimiterQuoteCharacterAndCommentCharacterCombinations() {
+        return Stream.of(
+            // delimiter
+            Arguments.of('#', '#', '#'), //
+            Arguments.of('#', '#', '*'), //
+            Arguments.of('*', '#', '#'), //
+            Arguments.of('#', '*', '#'), //
+            // delimiterString
+            Arguments.of("#", '#', '*'), //
+            Arguments.of("#", '*', '#') //
+        );
+    }
+
     @Test
     void supportsCsvHeadersWhenUsingTextBlockAttribute() {
         var annotation = csvSource().useHeadersInDisplayName(true).textBlock("""

jupiter-tests/src/test/java/org/junit/jupiter/params/provider/CsvFileArgumentsProviderTests.java

Lines changed: 30 additions & 0 deletions
@@ -134,6 +134,36 @@ void ignoresCommentedOutEntries() {
         assertThat(arguments).containsExactly(array("foo", "bar"));
     }
 
+    @Test
+    void honorsCustomCommentCharacter() {
+        var annotation = csvFileSource()//
+                .resources("test.csv")//
+                .commentCharacter(';')//
+                .delimiter(',')//
+                .build();
+
+        var arguments = provideArguments(annotation, "foo, bar \n;baz, qux");
+
+        assertThat(arguments).containsExactly(array("foo", "bar"));
+    }
+
+    @ParameterizedTest
+    @MethodSource("org.junit.jupiter.params.provider.CsvArgumentsProviderTests#"
+            + "invalidDelimiterQuoteCharacterAndCommentCharacterCombinations")
+    void throwsExceptionWhenControlCharactersNotDiffer(Object delimiter, char quoteCharacter, char commentCharacter) {
+        var builder = csvFileSource().resources("test.csv") //
+                .quoteCharacter(quoteCharacter).commentCharacter(commentCharacter);
+
+        var annotation = delimiter instanceof Character c //
+                ? builder.delimiter(c).build() //
+                : builder.delimiterString(delimiter.toString()).build();
+
+        var message = "delimiter or delimiterString: '%s', quoteCharacter: '%s', and commentCharacter: '%s' "
+                + "must all differ";
+        assertPreconditionViolationFor(() -> provideArguments(annotation, "foo").findAny()) //
+                .withMessage(message.formatted(delimiter, quoteCharacter, commentCharacter));
+    }
+
     @Test
     void closesInputStreamForClasspathResource() {
         var closed = new AtomicBoolean(false);
