21 changes: 12 additions & 9 deletions server/api/knowledgeBase.ts
@@ -106,7 +106,7 @@ const { JwtPayloadKey } = config
// Storage configuration for Knowledge Base feature files
const KB_STORAGE_ROOT = join(process.cwd(), "storage", "kb_files")
const MAX_FILE_SIZE = 100 // 100MB max file size
const MAX_ZIP_FILE_SIZE = 25 // 25MB max zip file size
const MAX_ZIP_FILE_SIZE = 35 // 35MB max zip file size
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the increased zip upload limit.

The MAX_ZIP_FILE_SIZE has been increased from 25MB to 35MB (40% increase). While the comment now correctly reflects this value, ensure that:

  • This increase has been tested with actual 35MB zip files
  • Server resources (memory, processing time) can handle larger zip extractions
  • The change aligns with infrastructure capacity and user requirements

🏁 Script executed:

#!/bin/bash
# Search for any configuration or documentation that might reference the old 25MB limit
rg -n -C3 --type=md --type=yaml --type=json "25.*MB|25MB" 

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Search for "25" in TypeScript/JavaScript files to find old limit references
rg -n --type=ts --type=js "25" | head -50

Length of output: 3763


🏁 Script executed:

#!/bin/bash
# Look for test files related to zip or upload functionality
fd -e test.ts -e spec.ts -e test.js -e spec.js | grep -i "zip\|upload\|knowledge"

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check the actual file and surrounding context for both limit changes
rg -n "MAX_ZIP_FILE_SIZE|maxFilesLimit" server/api/knowledgeBase.ts -A2 -B2

Length of output: 1134


🏁 Script executed:

#!/bin/bash
# Search for any README or documentation that might reference these limits
fd -e md -e txt | xargs grep -l "zip\|upload\|file.*size" 2>/dev/null | head -10

Length of output: 118


🏁 Script executed:

#!/bin/bash
# Check the documentation files for references to file limits
for file in observability/docs/observability.md server/CHANGELOG.md server/eval/README.md server/integrations/dataSource/README.md; do
  echo "=== $file ==="
  grep -n "zip\|upload\|file.*size\|3000\|25.*MB" "$file" 2>/dev/null | head -20
done

Length of output: 1842


🏁 Script executed:

#!/bin/bash
# Search for any comments around the maxFilesLimit change for context
rg -n "maxFilesLimit|3000.*file|10000.*file" server/api/knowledgeBase.ts -B5 -A5

Length of output: 542


🏁 Script executed:

#!/bin/bash
# Check if there are any related configuration or environment variables
rg -n "MAX_FILE|MAX_ZIP|maxFiles" --type=ts --type=js | grep -v node_modules | head -30

Length of output: 2092


Fix outdated comment and verify both limit increases have been tested.

The comment at line 1288 is outdated: it claims "allow up to 3000 files" while the code enforces 10000 (line 1289). The MAX_ZIP_FILE_SIZE comment at line 109 is now correct (35MB).

However, both limit increases lack test coverage:

  • MAX_ZIP_FILE_SIZE: 25MB → 35MB (40% increase)
  • maxFilesLimit: 3000 → 10000 (233% increase)

Update the comment at line 1288 to reflect 10000, and ensure both limit increases have been stress-tested with actual workloads to verify server memory and processing capacity can handle the larger volumes.
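
For reference, the comment fix itself is a one-line change; a sketch of the corrected lines around 1288 (line numbers per the reviewed file):

// Validate file count - allow up to 10000 files for zip extractions
const maxFilesLimit = 10000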

🤖 Prompt for AI Agents
In server/api/knowledgeBase.ts around lines 109 and 1288, the MAX_ZIP_FILE_SIZE
comment at line 109 is accurate (35MB) but the inline comment at line 1288 is
outdated (it says "allow up to 3000 files" while the code enforces 10000);
update that comment to state 10000 files, and add/modify automated and/or manual
stress tests to validate both increases (MAX_ZIP_FILE_SIZE 25→35MB and
maxFilesLimit 3000→10000) under realistic workloads to confirm memory and
processing capacity are acceptable, documenting test results.
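
As a starting point for the requested coverage, a minimal boundary-test sketch could accompany the change. This assumes bun:test as the runner and duplicates the guard constants locally, since they are not currently exported; a real stress test would additionally have to exercise the upload endpoint with an actual ~35MB archive and a 10000-file extraction:

import { test, expect } from "bun:test"

// Local mirror of the guards in UploadFilesApi; swap in the exported constants
// if/when server/api/knowledgeBase.ts exposes them.
const MAX_ZIP_FILE_SIZE = 35 // MB
const maxFilesLimit = 10000

const zipTooLarge = (sizeBytes: number) =>
  sizeBytes > MAX_ZIP_FILE_SIZE * 1024 * 1024

test("accepts a zip at exactly 35MB and rejects one byte over", () => {
  expect(zipTooLarge(35 * 1024 * 1024)).toBe(false)
  expect(zipTooLarge(35 * 1024 * 1024 + 1)).toBe(true)
})

test("accepts 10000 extracted files and rejects 10001", () => {
  expect(10000 > maxFilesLimit).toBe(false)
  expect(10001 > maxFilesLimit).toBe(true)
})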


// Initialize storage directory for Knowledge Base files
;(async () => {
@@ -524,7 +524,6 @@ export const GetCollectionNameForSharedAgentApi = async (c: Context) => {
agentExternalId,
user.workspaceId,
)


if (!agent) {
throw new HTTPException(404, { message: "Agent not found" })
@@ -1212,15 +1211,15 @@ export const UploadFilesApi = async (c: Context) => {
const file = files[i]
const ext = extname(file.name).toLowerCase()

if (ext === '.zip') {
if (ext === ".zip") {
// Check zip file size before extraction
const zipSizeMB = Math.round(file.size / 1024 / 1024)
if (file.size > MAX_ZIP_FILE_SIZE * 1024 * 1024) {
loggerWithChild({ email: userEmail }).warn(
`Zip file too large: ${file.name} (${zipSizeMB}MB). Maximum is ${MAX_ZIP_FILE_SIZE}MB`,
)
throw new HTTPException(400, {
message: `Zip file too large (${zipSizeMB}MB). Maximum size is ${MAX_ZIP_FILE_SIZE}MB`
message: `Zip file too large (${zipSizeMB}MB). Maximum size is ${MAX_ZIP_FILE_SIZE}MB`,
})
}

@@ -1287,7 +1286,7 @@ export const UploadFilesApi = async (c: Context) => {
paths = extractedPaths

// Validate file count - allow up to 3000 files for zip extractions
const maxFilesLimit = 3000
const maxFilesLimit = 10000

if (files.length > maxFilesLimit) {
throw new HTTPException(400, {
@@ -1332,7 +1331,7 @@ export const UploadFilesApi = async (c: Context) => {
let storagePath = ""
try {
// Validate file size
try{
try {
checkFileSize(file.size, MAX_FILE_SIZE)
} catch (error) {
uploadResults.push({
@@ -1742,7 +1741,7 @@ export const DeleteItemApi = async (c: Context) => {
} catch (error) {
loggerWithChild({ email: userEmail }).error(
`Failed to delete file from Vespa: ${id}`,
{ error: getErrorMessage(error) }
{ error: getErrorMessage(error) },
)
}
}
@@ -1937,7 +1936,7 @@ export const GetChunkContentApi = async (c: Context) => {
const chunkContent = resp.fields.chunks[index]
let pageIndex: number | undefined

const isSheetFile = getFileType({ type: resp.fields.mimeType || "", name: resp.fields.fileName || "" }) === FileType.SPREADSHEET
const isSheetFile =
getFileType({
type: resp.fields.mimeType || "",
name: resp.fields.fileName || "",
}) === FileType.SPREADSHEET
if (isSheetFile) {
const sheetIndexMatch = docId.match(/_sheet_(\d+)$/)
if (sheetIndexMatch) {
@@ -1952,7 +1955,7 @@ export const GetChunkContentApi = async (c: Context) => {
? pageNums[0]
: 0
}

if (!chunkContent) {
throw new HTTPException(404, { message: "Chunk content not found" })
}