adding feature to output CSV files (issue #44) #278

Merged
merged 7 commits into from
Jul 2, 2024
3 changes: 3 additions & 0 deletions .rubocop.yml
@@ -112,6 +112,9 @@ Style/SlicingWithRange:
Style/SpecialGlobalVars: # DANGER: unsafe rule!!
Enabled: false

Style/StringConcatenation:
Enabled: false

Style/StringLiterals:
Enabled: false
EnforcedStyle: double_quotes
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,9 @@

# SmarterCSV 1.x Change Log

## 1.11.0 (2024-07-02)
* added SmarterCSV::Writer to output CSV files ([issue #44](https://github.com/tilo/smarter_csv/issues/44))

## 1.10.3 (2024-03-10)
* fixed issue when frozen options are handed in (thanks to Daniel Pepper)
* cleaned-up rspec tests (thanks to Daniel Pepper)
5 changes: 3 additions & 2 deletions README.md
@@ -3,10 +3,11 @@

[![codecov](https://codecov.io/gh/tilo/smarter_csv/branch/main/graph/badge.svg?token=1L7OD80182)](https://codecov.io/gh/tilo/smarter_csv) [![Gem Version](https://badge.fury.io/rb/smarter_csv.svg)](http://badge.fury.io/rb/smarter_csv)

This library provides a complete interface to CSV files and data. It offers tools to enable you to read and write to and from Strings or IO objects, as needed.

#### LATEST CHANGES
#### BREAKING CHANGES

* Version 1.10.0 has BREAKING CHANGES:
* Version 1.10.0 had BREAKING CHANGES:

Changed behavior:
+ when `user_provided_headers` are provided:
1 change: 1 addition & 0 deletions lib/smarter_csv.rb
@@ -10,6 +10,7 @@
require "smarter_csv/headers"
require "smarter_csv/hash_transformations"
require "smarter_csv/parse"
require "smarter_csv/writer"

# load the C-extension:
case RUBY_ENGINE
2 changes: 1 addition & 1 deletion lib/smarter_csv/version.rb
@@ -1,5 +1,5 @@
# frozen_string_literal: true

module SmarterCSV
VERSION = "1.10.3"
VERSION = "1.11.0"
end
102 changes: 102 additions & 0 deletions lib/smarter_csv/writer.rb
@@ -0,0 +1,102 @@
# frozen_string_literal: true

require 'tempfile' # Tempfile is used below to buffer rows

module SmarterCSV
#
# Generate CSV files
#
# Create an instance of the Writer class with the filename and options.
# Call `<<` one or multiple times to append data to the file.
# Call `finalize` to save the file.
#
# The `<<` method can take different arguments:
# * a single Hash
# * an array of Hashes
# * nested arrays of arrays of Hashes
#
# By default, SmarterCSV::Writer automatically discovers all headers that are present
# in the data on the fly. If this is disabled, only the given headers are used.
# Disabling can be useful when you want to select attributes from hashes or ActiveRecord instances.
#
# If `discover_headers` is enabled, and headers are given, any new headers that are found in the data will still be appended.
#
# The Writer automatically quotes fields containing the col_sep, row_sep, or the quote_char.
#
# Options:
# col_sep : defaults to , but can be set to any other character
# row_sep : defaults to LF \n , but can be set to \r\n or \r or anything else
# quote_char : defaults to "
# discover_headers : defaults to true
# headers : defaults to []
# force_quotes: defaults to false
# map_headers: defaults to {}, can be a hash of key -> value mappings

# IMPORTANT NOTES:
# * Data hashes can contain strings or symbols as keys.
# Make sure to use the matching form (String or Symbol) when specifying headers manually,
# in combination with the :discover_headers option

class Writer
def initialize(file_path, options = {})
@options = options
@discover_headers = options.has_key?(:discover_headers) ? (options[:discover_headers] == true) : true
@headers = options[:headers] || []
@row_sep = options[:row_sep] || "\n" # RFC4180 "\r\n"
@col_sep = options[:col_sep] || ','
@quote_char = options[:quote_char] || '"' # honor the documented :quote_char option
@force_quotes = options[:force_quotes] == true
@map_headers = options[:map_headers] || {}
@output_file = File.open(file_path, 'w+')
# hidden state:
@temp_file = Tempfile.new('tempfile', '/tmp')
@quote_regex = Regexp.union(@col_sep, @row_sep, @quote_char)
end

def <<(data)
case data
when Hash
process_hash(data)
when Array
data.each { |item| self << item }
when NilClass
# ignore
else
raise ArgumentError, "Invalid data type: #{data.class}. Must be a Hash or an Array."
end
end

def finalize
# Map headers if :map_headers option is provided
mapped_headers = @headers.map { |header| @map_headers[header] || header }

@temp_file.rewind
@output_file.write(mapped_headers.join(@col_sep) + @row_sep)
@output_file.write(@temp_file.read)
@output_file.flush
@output_file.close
@temp_file.delete
end

private

def process_hash(hash)
if @discover_headers
hash_keys = hash.keys
new_keys = hash_keys - @headers
@headers.concat(new_keys)
end

# Reorder the hash to match the current headers order and fill missing fields
ordered_row = @headers.map { |header| hash[header] || '' }

@temp_file.write ordered_row.map { |value| escape_csv_field(value) }.join(@col_sep) + @row_sep
end

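def escape_csv_field(field)
text = field.to_s
if @force_quotes || text.match?(@quote_regex)
# wrap in the quote character and double any embedded quote characters
@quote_char + text.gsub(@quote_char, @quote_char * 2) + @quote_char
else
text
end
end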
end
end
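
The approach taken in `lib/smarter_csv/writer.rb` (rows are buffered in a temp file, and the header line is written last because headers are discovered on the fly) can be sketched in isolation. What follows is a minimal standalone sketch, not the gem's code: `SketchCsvWriter` and its keyword arguments are hypothetical names, it writes to any IO object instead of a file path, and it assumes that embedded quote characters should be doubled in the style of RFC 4180.

```ruby
require "tempfile"
require "stringio"

# Hypothetical stand-in for illustration only; this is NOT SmarterCSV::Writer.
class SketchCsvWriter
  def initialize(io, col_sep: ",", row_sep: "\n", quote_char: '"')
    @io = io
    @col_sep = col_sep
    @row_sep = row_sep
    @quote_char = quote_char
    @headers = []
    # Rows are buffered here until the full set of headers is known.
    @rows = Tempfile.new("sketch_rows")
    @needs_quoting = Regexp.union(col_sep, row_sep, quote_char)
  end

  # Accepts a Hash, an Array of Hashes (possibly nested), or nil.
  def <<(data)
    case data
    when Hash then append_row(data)
    when Array then data.each { |item| self << item }
    when nil then nil # ignored
    else raise ArgumentError, "Invalid data type: #{data.class}"
    end
    self
  end

  # Writes the header line first, then the buffered rows.
  def finalize
    @rows.rewind
    @io.write(@headers.map { |h| quote(h) }.join(@col_sep) + @row_sep)
    @io.write(@rows.read)
  ensure
    @rows.close! # close and remove the temp file
  end

  private

  def append_row(hash)
    @headers.concat(hash.keys - @headers) # discover new headers on the fly
    line = @headers.map { |h| quote(hash[h]) }.join(@col_sep)
    @rows.write(line + @row_sep)
  end

  # Quotes a field only when it contains a separator or the quote character.
  def quote(value)
    text = value.to_s
    return text unless text.match?(@needs_quoting)

    @quote_char + text.gsub(@quote_char, @quote_char * 2) + @quote_char
  end
end

out = StringIO.new
writer = SketchCsvWriter.new(out)
writer << { name: "Ann", city: "Paris, FR" }
writer << [{ name: "Bob", note: 'said "hi"' }]
writer.finalize
puts out.string
```

One consequence of buffering rows before all headers are known: a row written before a later header is discovered has fewer fields than the final header line (here the `Ann` row has two fields while the header has three). Callers that need rectangular output could pass the complete header list up front.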
4 changes: 2 additions & 2 deletions smarter_csv.gemspec
@@ -9,8 +9,8 @@ Gem::Specification.new do |spec|
spec.authors = ["Tilo Sloboda"]
spec.email = ["tilo.sloboda@gmail.com"]

spec.summary = "Ruby Gem for smarter importing of CSV Files (and CSV-like files), with lots of optional features, e.g. chunked processing for huge CSV files"
spec.description = "Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, with optional features for processing large files in parallel, embedded comments, unusual field- and record-separators, flexible mapping of CSV-headers to Hash-keys"
spec.summary = "CSV Reading and Writing"
spec.description = "Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, with lots of features for processing large files in parallel, embedded comments, unusual field- and record-separators, flexible mapping of CSV-headers to Hash-keys"
spec.homepage = "https://github.com/tilo/smarter_csv"
spec.license = 'MIT'
