fix: Excel export prohibits inputting external links or formulas #3106

Merged
merged 1 commit into from
May 19, 2025
7 changes: 3 additions & 4 deletions apps/dataset/serializers/document_serializers.py
@@ -661,10 +661,9 @@ def get_workbook(data_dict, document_dict):
         cell = worksheet.cell(row=row_idx + 1, column=col_idx + 1)
         if isinstance(col, str):
             col = re.sub(ILLEGAL_CHARACTERS_RE, '', col)
-        if col.startswith(('=', '+', '-', '@')):
-            cell.value = '\ufeff' + col
-        else:
-            cell.value = col
+            if col.startswith(('=', '+', '-', '@')):
+                col = '\ufeff' + col
+        cell.value = col
     # Create an HttpResponse object to return the Excel file
     return workbook
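
The effect of moving the startswith check under the isinstance guard can be sketched in isolation. This is a minimal illustration, not the project's code: `sanitize_cell`, `CONTROL_CHARS_RE`, and `FORMULA_PREFIXES` are hypothetical names, and the regex is a local stand-in that assumes the same ASCII control-character ranges as openpyxl's `ILLEGAL_CHARACTERS_RE`.

```python
import re

# Assumption: local stand-in for openpyxl's ILLEGAL_CHARACTERS_RE
# (ASCII control characters that cannot be stored in a worksheet cell).
CONTROL_CHARS_RE = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')

# Leading characters that spreadsheet applications may treat as a formula.
FORMULA_PREFIXES = ('=', '+', '-', '@')


def sanitize_cell(value):
    """Hypothetical helper mirroring the guarded logic in the diff."""
    if isinstance(value, str):
        # Both string-only steps live under the same guard.
        value = CONTROL_CHARS_RE.sub('', value)
        if value.startswith(FORMULA_PREFIXES):
            value = '\ufeff' + value
    # Non-string values (int, float, None, ...) pass through untouched.
    return value


print(repr(sanitize_cell('=HYPERLINK("http://example.com")')))
print(repr(sanitize_cell(42)))
```

With the check outside the guard, as in the removed lines, a non-string value such as `42` would raise `AttributeError` on `startswith`; nesting it avoids that while still writing every value to the cell.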

Contributor Author

Your code looks generally correct, but there is one possible simplification:

  • The isinstance(col, str) check is only needed because a cell can hold a non-string value. If every value is converted to a string before it is written, the check can be dropped and the sanitizing steps can run unconditionally.

Here's an optimized version of your function:

from openpyxl import Workbook
from openpyxl.cell.cell import ILLEGAL_CHARACTERS_RE


def get_workbook(data_dict, document_dict):
    sheet_name = document_dict['sheetName']
    workbook = Workbook()
    worksheet = workbook.create_sheet(title=sheet_name)

    # Fetch the data and fill the worksheet cells
    for row_idx, data_row in enumerate(data_dict[sheet_name], start=1):
        for col_idx, value in enumerate(map(str, data_row), start=1):
            # Strip control characters that openpyxl rejects
            value = ILLEGAL_CHARACTERS_RE.sub('', value)
            # Prefix formula-like text so Excel does not evaluate it
            if value.startswith(('=', '+', '-', '@')):
                value = '\ufeff' + value
            worksheet.cell(row=row_idx, column=col_idx).value = value

    # The caller wraps the workbook in an HttpResponse to return the Excel file
    return workbook

Key Changes:

  1. Simplified Instance Check: Dropped the isinstance check, since map(str, data_row) guarantees that every value is a string before it is sanitized.
  2. Simpler Indexing: Replaced the manual row and column counters with enumerate(..., start=1), which matches openpyxl's 1-based cell coordinates.
  3. Uniform String Conversion: Used map(str, data_row) so that every value goes through the same sanitizing path. The trade-off is that numbers are written as text.

These changes should make the function more concise. Behavior is unchanged for string cells, while non-string cells are now written as text.
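
To see what converting every value to a string implies in practice, the two approaches can be compared side by side. This is a rough sketch with hypothetical helper names (`escape_as_text`, `escape_typed`), not code from the pull request; it leaves out control-character stripping to keep the comparison short.

```python
FORMULA_PREFIXES = ('=', '+', '-', '@')


def escape_as_text(value):
    """Hypothetical: convert everything to str first, then escape."""
    text = str(value)
    if text.startswith(FORMULA_PREFIXES):
        text = '\ufeff' + text
    return text


def escape_typed(value):
    """Hypothetical: escape strings only, keep other types as they are."""
    if isinstance(value, str) and value.startswith(FORMULA_PREFIXES):
        return '\ufeff' + value
    return value


row = ['=SUM(A1:A2)', -5, 3.14]
print([escape_as_text(v) for v in row])
print([escape_typed(v) for v in row])
```

Both variants neutralize the formula-like string, but the string-first variant also turns `-5` into prefixed text, because `str(-5)` starts with `'-'`. Keeping the isinstance guard leaves numeric cells numeric, which matters if the exported sheet is later used for calculations.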
