Skip to content

ByteSourceJsonBootstrapper uses CharBufferReader for byte[] inputs #1079

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
66 changes: 66 additions & 0 deletions src/main/java/com/fasterxml/jackson/core/io/CharBufferReader.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
package com.fasterxml.jackson.core.io;

import java.io.Reader;
import java.nio.CharBuffer;

public class CharBufferReader extends Reader {
Copy link
Member

@pjfanning pjfanning Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do anything yet but since CharBufferReader is only used in one place - I think it would be a good idea to move it to the same package as ByteSourceJsonBootstrapper and to make it package private and final to discourage its use by other users. We should at least add javadoc to the class saying that this class is only for internal use of jackson-core.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do, wanted to figure out where all this might be needed then encapsulate.

private final CharBuffer charBuffer;

public CharBufferReader(CharBuffer buffer) {
this.charBuffer = buffer.duplicate();
Copy link
Member

@pjfanning pjfanning Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to duplicate the char buffer? If we strongly discourage the use of this class and we only have 1 usage of this class - that usage will not modify the input buffer. If we know the input buffer won't be modified, then theoretically, we don't need to duplicate it.

Copy link
Contributor Author

@schlosna schlosna Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will refactor this when encapsulating, was being defensive.

}

@Override
public int read(char[] chars, int off, int len) {
int remaining = this.charBuffer.remaining();
if (remaining <= 0) {
return -1;
}
int length = Math.min(len, remaining);
this.charBuffer.get(chars, off, length);
return length;
}

@Override
public int read() {
if (this.charBuffer.hasRemaining()) {
return this.charBuffer.get();
}
return -1;
}

@Override
public long skip(long n) {
if (n < 0L) {
throw new IllegalArgumentException("number of characters to skip cannot be negative");
}
int skipped = Math.min((int) n, this.charBuffer.remaining());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please don't cast long to int - there is an edge case where n is too big and the cast value could be nonsense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will change this to:

Suggested change
int skipped = Math.min((int) n, this.charBuffer.remaining());
int skipped = Math.min(Math.toIntExact(n), this.charBuffer.remaining());

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion works for values of n greater than max-int. Your suggestion will raise an exception. If the API defines n to be long, I think we should respect the API and handle large longs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies - I had planned to suggest a change but then didn't. The suggestion is

 Math.min((int) Math.min(n, Integer.MAX_VALUE), this.charBuffer.remaining());

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a funny set of APIs, CharBuffer.remaining returns an int so that suggests you can't have more the max-int chars. So maybe, your suggestion works fine. The skip API accepts longs but it appears that it not something we should worry about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the CharBuffer is wrapping a char[] so is practically limited to Integer.MAX_VALUE - 8

this.charBuffer.position(this.charBuffer.position() + skipped);
return skipped;
}

@Override
public boolean ready() {
return true;
}

@Override
public boolean markSupported() {
return true;
}

@Override
public void mark(int readAheadLimit) {
this.charBuffer.mark();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this ignores the input which seems wrong - if we can't work out the right impl, then you should change the markSupported to return false

}

@Override
public void reset() {
this.charBuffer.reset();
}

@Override
public void close() {
this.charBuffer.position(this.charBuffer.limit());
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a new line at end of file?

Original file line number Diff line number Diff line change
@@ -1,6 +1,9 @@
package com.fasterxml.jackson.core.json;

import java.io.*;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

import com.fasterxml.jackson.core.*;
import com.fasterxml.jackson.core.format.InputAccessor;
Expand Down Expand Up @@ -230,7 +233,9 @@ public Reader constructReader() throws IOException
InputStream in = _in;

if (in == null) {
in = new ByteArrayInputStream(_inputBuffer, _inputPtr, _inputEnd);
// [jackson-core#488] Avoid overhead of InputStreamReader allocating ByteBuffer
ByteBuffer byteBuffer = ByteBuffer.wrap(_inputBuffer, _inputPtr, _inputEnd);
return new CharBufferReader(Charset.forName(enc.getJavaName()).decode(byteBuffer));
} else {
/* Also, if we have any read but unused input (usually true),
* need to merge that input in:
Expand Down
103 changes: 103 additions & 0 deletions src/test/java/com/fasterxml/jackson/core/io/CharBufferReaderTest.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
package com.fasterxml.jackson.core.io;

import java.io.IOException;
import java.io.Reader;
import java.nio.CharBuffer;
import java.util.Arrays;

import static org.junit.Assert.assertArrayEquals;
import static org.junit.Assert.assertThrows;

public class CharBufferReaderTest extends com.fasterxml.jackson.core.BaseTest {

public void testSingleCharRead() throws IOException {
CharBuffer buffer = CharBuffer.wrap("aB\u00A0\u1AE9\uFFFC");
Reader charBufferReader = new CharBufferReader(buffer);
try (Reader reader = charBufferReader) {
assertEquals('a', reader.read());
assertEquals('B', reader.read());
assertEquals('\u00A0', reader.read());
assertEquals('\u1AE9', reader.read());
assertEquals('', reader.read());
assertEquals(-1, reader.read());
}
assertEquals(-1, charBufferReader.read());
}


public void testBulkRead() throws IOException {
CharBuffer buffer = CharBuffer.wrap("abcdefghijklmnopqrst\u00A0");
char[] chars = new char[12];
Reader charBufferReader = new CharBufferReader(buffer);
try (Reader reader = charBufferReader) {
assertEquals(12, reader.read(chars));
assertArrayEquals("abcdefghijkl".toCharArray(), chars);
assertEquals(9, reader.read(chars));
assertArrayEquals("mnopqrst\u00A0".toCharArray(), Arrays.copyOf(chars, 9));
assertEquals(-1, reader.read(chars));
}
assertEquals(-1, charBufferReader.read(chars));
}


public void testSkip() throws IOException {
CharBuffer buffer = CharBuffer.wrap("abcdefghijklmnopqrst\u00A0");
Reader charBufferReader = new CharBufferReader(buffer);
char[] chars = new char[12];
try (Reader reader = charBufferReader) {
assertEquals(12, reader.read(chars));
assertArrayEquals("abcdefghijkl".toCharArray(), chars);
assertEquals(4, reader.skip(4));
assertEquals(5, reader.read(chars, 3, 9));
assertArrayEquals("qrst\u00A0".toCharArray(), Arrays.copyOfRange(chars, 3, 8));
assertEquals(0, reader.skip(0));
assertEquals(0, reader.skip(1));
}
assertEquals(0, charBufferReader.skip(1));
}


public void testInvalidSkip() throws IOException {
try (Reader reader = new CharBufferReader(CharBuffer.wrap("test"))) {
IllegalArgumentException exception = assertThrows(IllegalArgumentException.class, () -> reader.skip(-1));
assertEquals("number of characters to skip cannot be negative", exception.getMessage());
}
}


public void testReady() throws IOException {
try (Reader reader = new CharBufferReader(CharBuffer.wrap("test"))) {
assertEquals(true, reader.ready());
}
}


public void testMarkReset() throws IOException {
char[] chars = new char[8];
try (Reader reader = new CharBufferReader(CharBuffer.wrap("test"))) {
assertEquals(true, reader.markSupported());
reader.mark(3);
assertEquals(3, reader.read(chars, 0, 3));
assertArrayEquals("tes".toCharArray(), Arrays.copyOf(chars, 3));
reader.reset();
assertEquals(4, reader.read(chars));
assertArrayEquals("test".toCharArray(), Arrays.copyOf(chars, 4));
reader.reset();
Arrays.fill(chars, '\0');
assertEquals(4, reader.read(chars));
}
}


public void testClose() throws IOException {
char[] chars = new char[2];
try (Reader reader = new CharBufferReader(CharBuffer.wrap("test"))) {
assertEquals(2, reader.read(chars));
assertArrayEquals("te".toCharArray(), chars);
reader.close();
Arrays.fill(chars, '\0');
assertEquals(-1, reader.read(chars));
assertArrayEquals("\0\0".toCharArray(), chars);
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

files should end with new lines