Describe the issue
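Although the SAV format allows notes longer than 80 characters (stored as multiple lines of up to 80 characters each), pyreadstat.write_sav throws the error ‘The provided note is too long for the file format’ when given such a note. Since note is only a string field, I also tried word-wrapping the text so that each line stays within 80 characters. This fails as well.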
To Reproduce
import pyreadstat
import pandas as pd
import textwrap

# Sample DataFrame
df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35]
})

# Note text
note_text_1 = ("This is a note for the SPSS file. IBM allows SAVs to have up to 80 chars a line.")
note_text_2 = ("This is a note for the SPSS file. SPSS allows SAVs to have up to 80 chars a line.")

# Wrap the note text to 80 characters per line
wrapped_note_1 = '\n'.join(textwrap.wrap(note_text_1, width=80))
wrapped_note_2 = '\n'.join(textwrap.wrap(note_text_2, width=80))

# Write to SPSS file with note 1 -- exactly 80 characters, works
output_path = 'output_1.sav'
pyreadstat.write_sav(
    df,
    output_path,
    note=wrapped_note_1
)

# Write to SPSS file with note 2 -- 75 characters plus newline, fails
output_path = 'output_2.sav'
pyreadstat.write_sav(
    df,
    output_path,
    note=wrapped_note_2
)
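For reference, the lengths involved can be checked without pyreadstat. The sketch below uses only the standard library; `describe` is a hypothetical helper introduced here for illustration. It shows that the wrapped second note is still 81 characters in total once the newline is counted, which would be consistent with (but does not prove) the length check being applied to the whole string rather than per line.

```python
import textwrap

note_text_1 = "This is a note for the SPSS file. IBM allows SAVs to have up to 80 chars a line."
note_text_2 = "This is a note for the SPSS file. SPSS allows SAVs to have up to 80 chars a line."


def describe(note, width=80):
    """Hypothetical helper: wrap `note` and return (total length, per-line lengths)."""
    wrapped = "\n".join(textwrap.wrap(note, width=width))
    return len(wrapped), [len(line) for line in wrapped.split("\n")]


print(describe(note_text_1))  # (80, [80])
print(describe(note_text_2))  # (81, [75, 5])
```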
File example
Straight out of SPSS 28: output_2_spss.zip
Hint: The timestamp gets added by SPSS automatically.
Expected behavior
Be able to write notes with multiple lines of up to 80 characters each, like SPSS does (since note is a string, maybe add word-wrapping?).
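A minimal sketch of the kind of word-wrapping meant here, using only the standard library. `split_note_lines` is a hypothetical name and not part of pyreadstat; how the resulting lines would be handed to the writer is left open, since that depends on the library internals.

```python
import textwrap


def split_note_lines(note, width=80):
    """Hypothetical helper: split a note into lines of at most `width` characters.

    Existing newlines are kept as line breaks; longer lines are word-wrapped.
    """
    lines = []
    for paragraph in note.split("\n"):
        # textwrap.wrap returns [] for an empty string, so keep blank lines explicitly
        lines.extend(textwrap.wrap(paragraph, width=width) or [""])
    return lines


note = "This is a note for the SPSS file. SPSS allows SAVs to have up to 80 chars a line."
lines = split_note_lines(note)
print([len(line) for line in lines])  # [75, 5]
```

Each returned line fits within the 80-character limit, so a writer that accepted one line per entry could store the note without raising the length error.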
Setup Information
- How did you install pyreadstat? (pip, conda, directly from repo):
directly from repo
- Platform (windows, macOS, linux, 32 or 64 bit):
Debian Linux, 64 bit
- Python Version:
Python 3.11.2
- Python Distribution (System, plain python, Anaconda):
System
- Using Virtualenv or condaenv?:
Virtualenv