-
Notifications
You must be signed in to change notification settings - Fork 225
fix: hdfs read into buffer fully #2031
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -88,19 +88,18 @@ impl HadoopFileSystem { | |
|
||
fn read_range(range: &Range<u64>, file: &HdfsFile) -> Result<Bytes> { | ||
let to_read = (range.end - range.start) as usize; | ||
let mut total_read = 0u64; | ||
let mut buf = vec![0; to_read]; | ||
let read = file | ||
.read_with_pos(range.start as i64, buf.as_mut_slice()) | ||
.map_err(to_error)?; | ||
assert_eq!( | ||
to_read as i32, | ||
read, | ||
"Read path {} from {} with expected size {} and actual size {}", | ||
file.path(), | ||
range.start, | ||
to_read, | ||
read | ||
); | ||
while total_read < to_read as u64 { | ||
let read = | ||
file | ||
.read_with_pos((range.start + total_read) as i64, buf[total_read as usize..].as_mut()) | ||
.map_err(to_error)?; | ||
if read == -1 { | ||
break; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't this return an unexpected eof error? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agree, otherwise the function result is probably not predictable. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually this implementation is incorrect. libhdfs behaves strangely, and so does fs-hdfs. References: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for catching this @Kontinuation! I based this on the return value of |
||
} | ||
total_read = total_read + read as u64; | ||
} | ||
Ok(buf.into()) | ||
} | ||
} | ||
|
@@ -141,13 +140,15 @@ impl ObjectStore for HadoopFileSystem { | |
let file_status = file.get_file_status().map_err(to_error)?; | ||
|
||
let to_read = file_status.len(); | ||
let mut total_read = 0u64; | ||
let mut buf = vec![0; to_read]; | ||
let read = file.read(buf.as_mut_slice()).map_err(to_error)?; | ||
assert_eq!( | ||
to_read as i32, read, | ||
"Read path {} with expected size {} and actual size {}", | ||
&location, to_read, read | ||
); | ||
while total_read < to_read as u64 { | ||
let read = file.read(buf.as_mut_slice()).map_err(to_error)?; | ||
if read == -1 { | ||
break; | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is also problematic. |
||
total_read = total_read + read as u64; | ||
} | ||
|
||
file.close().map_err(to_error)?; | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wondering would be that better to have
total_read
as usize?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to usize