Increase speed of Table#to_csv #348

Open: 4 commits to merge into base: main

27 changes: 27 additions & 0 deletions benchmark/table.yaml
@@ -0,0 +1,27 @@
+loop_count: 100
+contexts:
+  - gems:
+      csv: 3.3.0
+  - name: "master"
+    prelude: |
+      $LOAD_PATH.unshift(File.expand_path("lib"))
+      require "csv"
+prelude: |-
+  n_columns = Integer(ENV.fetch("N_COLUMNS", "5"), 10)
+  n_rows = Integer(ENV.fetch("N_ROWS", "100"), 10)
+  fields = ["AAAAA"] * n_columns
+  headers = n_columns.times.collect do |i|
+    "header#{i}"
+  end
+  row = CSV::Row.new(headers, fields)
+  rows = [row] * n_rows
+  table = CSV::Table.new(rows)
+  rows = [row] * n_rows * 10
+  large_table = CSV::Table.new(rows)
+benchmark:
+  "to_csv: no encoding": |-
+    table.to_csv
+  "to_csv: encoding": |-
+    table.to_csv(encoding: 'UTF-8')
+  "to_csv: encoding - 10 x rows": |-
+    large_table.to_csv(encoding: 'UTF-8')
26 changes: 15 additions & 11 deletions lib/csv.rb
@@ -1506,17 +1506,7 @@ def generate_line(row, **options)
       if options[:encoding]
         str.force_encoding(options[:encoding])
       else
-        fallback_encoding = nil
-        output_encoding = nil
-        row.each do |field|
-          next unless field.is_a?(String)
-          fallback_encoding ||= field.encoding
-          next if field.ascii_only?
-          output_encoding = field.encoding
-          break
-        end
-        output_encoding ||= fallback_encoding
-        if output_encoding
+        if output_encoding = row_encoding(row)
           str.force_encoding(output_encoding)
         end
       end
@@ -1960,6 +1950,20 @@ def table(path, **options)
     private_constant :ON_WINDOWS
 
     private
+
+    def row_encoding(row)
+      fallback_encoding = nil
+      output_encoding = nil
+      row.each do |field|
+        next unless field.is_a?(String)
+        fallback_encoding ||= field.encoding
+        next if field.ascii_only?
+        output_encoding = field.encoding
+        break
+      end
+      output_encoding || fallback_encoding
+    end
+
     def may_enable_bom_detection_automatically(filename_or_io,
                                                mode,
                                                options,
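
Note on the extracted helper: the selection rule in row_encoding can be tried outside the gem. The sketch below is an illustration only, not the gem's code. pick_row_encoding is a hypothetical name, and it takes a plain Array rather than a CSV::Row. The first String field containing non-ASCII bytes decides the encoding; if every String is ASCII-only, the first String's encoding is used; a row with no String fields yields nil.

```ruby
# Hypothetical standalone version of the rule in row_encoding (not the gem's API).
def pick_row_encoding(fields)
  fallback_encoding = nil
  fields.each do |field|
    next unless field.is_a?(String)
    # Remember the first String's encoding in case every field is ASCII-only.
    fallback_encoding ||= field.encoding
    # The first field with non-ASCII bytes decides the output encoding.
    return field.encoding unless field.ascii_only?
  end
  fallback_encoding
end

ascii  = "abc".encode("US-ASCII")
accent = "caf\u00e9"       # \u escape, so always tagged UTF-8
binary = "\x00\xac".b      # ASCII-8BIT, contains a non-ASCII byte

p pick_row_encoding([ascii, accent])  # the non-ASCII UTF-8 field wins
p pick_row_encoding([ascii, binary])  # the non-ASCII binary field wins
p pick_row_encoding([ascii, 1, nil])  # only ASCII Strings: first String's encoding
p pick_row_encoding([1, nil])         # no Strings at all: nil
```

Returning early instead of using break plus a second variable is a stylistic choice in this sketch; the result is the same as output_encoding || fallback_encoding in the diff.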
11 changes: 8 additions & 3 deletions lib/csv/table.rb
@@ -1006,10 +1006,15 @@ def to_csv(write_headers: true, limit: nil, **options)
       limit ||= @table.size
       limit = @table.size + 1 + limit if limit < 0
       limit = 0 if limit < 0
-      @table.first(limit).each do |row|
-        array.push(row.fields.to_csv(**options)) unless row.header_row?
-      end
 
+      if options[:encoding]
+        rows = @table.first(limit).select { |row| !row.header_row? }
+        array.push(CSV.generate_lines(rows, **options))
+      else
+        @table.first(limit).each do |row|
+          array.push(row.fields.to_csv(**options)) unless row.header_row?
+        end
+      end
       array.join("")
     end
     alias_method :to_s, :to_csv
9 changes: 9 additions & 0 deletions test/csv/test_table.rb
@@ -373,6 +373,15 @@ def test_to_csv_limit_negative_over
     CSV
   end
 
+  def test_to_csv_encoding
+    rows = [ CSV::Row.new(%w{A}, ["\x00\xac".force_encoding("ASCII-8BIT")]),
+             CSV::Row.new(%w{A}, ["\x00\xac"]) ]
+    table = CSV::Table.new(rows)
+
+    assert_equal(Encoding::UTF_8, table.to_csv(encoding: 'UTF-8').encoding)
+    assert_raises(Encoding::CompatibilityError) {table.to_csv}
+  end
+
   def test_append
     # verify that we can chain the call
     assert_equal(@table, @table << [10, 11, 12])
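
Note on test_to_csv_encoding: the assert_raises expectation appears to rest on plain String behavior rather than anything specific to the csv gem. Joining a non-ASCII ASCII-8BIT string with a non-ASCII UTF-8 string fails, while tagging both as UTF-8 first succeeds. A minimal sketch of that behavior without the gem; the variable names are illustrative only:

```ruby
binary_line = "\x00\xac\n".b                                   # bytes tagged ASCII-8BIT
utf8_line   = "\x00\xac\n".b.force_encoding(Encoding::UTF_8)   # same bytes, tagged UTF-8

# Mixed encodings with non-ASCII bytes on both sides cannot be joined.
begin
  [binary_line, utf8_line].join("")
  join_error = nil
rescue Encoding::CompatibilityError => e
  join_error = e
end
p join_error.class

# Tagging every line with one encoding first makes the join succeed.
forced = [binary_line, utf8_line].map { |line| line.dup.force_encoding("UTF-8") }
joined = forced.join("")
p joined.encoding
```

force_encoding only relabels the bytes; it does not transcode or validate them, which is why the joined string keeps all six original bytes.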