docs: add guide on file uploads #2017

Open. Wants to merge 11 commits into base branch `source`.
1 change: 1 addition & 0 deletions src/pages/learn/_meta.ts
@@ -19,6 +19,7 @@ export default {
"best-practices": "",
"thinking-in-graphs": "",
"serving-over-http": "",
"file-uploads": "",
authorization: "",
pagination: "",
"schema-design": "Schema Design",
130 changes: 130 additions & 0 deletions src/pages/learn/file-uploads.mdx
@@ -0,0 +1,130 @@
# Handling File Uploads in GraphQL

GraphQL was not designed with file uploads in mind. While it’s technically possible to implement them, doing so requires
extending the transport layer and introduces several risks, both in security and reliability.

This guide explains why file uploads via GraphQL are problematic and presents safer alternatives.

## Why uploads are challenging

The [GraphQL specification](https://spec.graphql.org/draft/) is transport-agnostic and does not define how requests are
serialized. In practice, most servers accept JSON-encoded requests over HTTP. File uploads, by contrast, are typically sent
as `multipart/form-data`, because JSON has no native way to carry binary data.

Supporting uploads over GraphQL usually involves adopting community conventions, like the
[GraphQL multipart request specification](https://github.com/jaydenseric/graphql-multipart-request-spec). While useful in some
environments, these solutions often introduce complexity, fragility, and security risks.

## Risks to be aware of

### Memory exhaustion from repeated variables

GraphQL operations allow the same variable to be referenced multiple times. If a file upload variable is reused, the underlying
stream may be read multiple times or prematurely drained. This can result in incorrect behavior or memory exhaustion.

A safe practice is to use trusted documents or a validation rule to ensure each upload variable is referenced exactly once.
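
A validation check along these lines could be sketched as follows. This is a minimal, dependency-free walk over a parsed
document AST in the shape produced by graphql-js `parse` (nodes with `kind`, `name.value`, and `variableDefinitions`). The
function names, the `Upload` scalar name, and the assumption that the document holds a single operation are illustrative
choices, not part of any library.

```js
// Unwrap NonNullType / ListType wrappers to reach the named type.
function namedTypeOf(typeNode) {
  let node = typeNode;
  while (node.kind !== 'NamedType') node = node.type;
  return node.name.value;
}

// Returns the names of upload variables that are NOT referenced exactly once.
// Assumes `documentAst` contains one operation (fragments are also walked).
function findMisusedUploadVariables(documentAst, uploadTypeName = 'Upload') {
  const uploadNames = new Set(); // variables declared with the upload type
  const references = new Map(); // variable name -> number of references

  const walk = (node) => {
    if (Array.isArray(node)) {
      node.forEach((child) => walk(child));
    } else if (node && typeof node === 'object') {
      if (node.kind === 'VariableDefinition') {
        if (namedTypeOf(node.type) === uploadTypeName) {
          uploadNames.add(node.variable.name.value);
        }
        return; // the declaration itself is not a reference
      }
      if (node.kind === 'Variable') {
        const name = node.name.value;
        references.set(name, (references.get(name) ?? 0) + 1);
        return;
      }
      for (const [key, child] of Object.entries(node)) {
        // `loc` holds linked token lists in graphql-js; skip it.
        if (key !== 'loc') walk(child);
      }
    }
  };

  walk(documentAst);
  return [...uploadNames].filter((name) => references.get(name) !== 1);
}
```

A server could run this after parsing and reject the request when the returned list is non-empty.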

### Stream leaks on failed operations

A GraphQL server processes a request in phases: parsing, validation, then execution. If validation fails or an authorization
check blocks execution, resolvers never run, so uploaded file streams may never be consumed. A server that buffers or retains
those streams can leak memory.

To avoid this, consider writing incoming files to temporary storage immediately, and passing references (like filenames) into
resolvers. Ensure this storage is cleaned up after request completion, regardless of success or failure.
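
One way to get this cleanup guarantee is to scope the temporary storage to the request. The sketch below uses only Node.js
built-ins; `withUploadDir` and `spool` are hypothetical names chosen for illustration, and the integration with a particular
multipart parser is left out.

```js
import { createWriteStream } from 'node:fs';
import { mkdtemp, rm } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { pipeline } from 'node:stream/promises';

// Runs `handler` with a `spool` helper that writes a readable stream into a
// per-request temp directory and resolves to the resulting file path.
// The directory is removed afterwards, whether `handler` resolves or rejects.
async function withUploadDir(handler) {
  const dir = await mkdtemp(join(tmpdir(), 'upload-'));
  let counter = 0;

  const spool = async (stream) => {
    // Use a server-generated name, never the client-supplied filename.
    const filePath = join(dir, `file-${counter++}`);
    await pipeline(stream, createWriteStream(filePath));
    return filePath;
  };

  try {
    return await handler(spool);
  } finally {
    await rm(dir, { recursive: true, force: true });
  }
}
```

Resolvers then receive the path returned by `spool` rather than a live stream, so a failed validation or authorization check
leaves nothing behind once the `finally` block has run.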

### Cross-Site Request Forgery (CSRF)

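`multipart/form-data` is one of the content types allowed on a “simple” request in the CORS spec, so a cross-origin POST with
this encoding does not trigger a preflight check. Without explicit CSRF protection, your GraphQL server may unknowingly accept
uploads from malicious origins. A common mitigation is to require a custom request header, which forces the browser to preflight.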
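
A header-based check of that kind might look like the sketch below. The header name `x-upload-preflight` and the function
name are assumptions made for illustration; any non-safelisted header the client is configured to send would serve the same
purpose. Header names are expected in lowercase, as Node.js provides them in `req.headers`.

```js
// Content types a browser may send cross-origin without a preflight.
const SIMPLE_CONTENT_TYPES = new Set([
  'application/x-www-form-urlencoded',
  'multipart/form-data',
  'text/plain',
]);

// Returns true when the request either uses a content type that forces a
// preflight (such as application/json) or carries the required custom header.
function passesCsrfCheck(headers, requiredHeader = 'x-upload-preflight') {
  const mediaType = (headers['content-type'] ?? '')
    .split(';')[0]
    .trim()
    .toLowerCase();

  // A missing content type is also treated as "simple".
  const isSimple = mediaType === '' || SIMPLE_CONTENT_TYPES.has(mediaType);
  if (!isSimple) return true;

  const value = headers[requiredHeader];
  return typeof value === 'string' && value !== '';
}
```

A server would call this before parsing the body and respond with an error status when it returns `false`.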

### Oversized or excess payloads

Attackers may submit very large uploads or include extraneous files under unused variable names. Servers that accept and
buffer these can be overwhelmed.

Enforce request size caps and reject any files not explicitly referenced in the `map` field of the multipart payload.
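
The `map` check can be sketched as a small pure function. In the multipart request convention, `map` associates each file
field name with a list of paths inside `operations`. The function name `checkUploadMap`, the `maxFiles` default, and the
one-path-per-file rule are illustrative assumptions.

```js
// `map` is the parsed multipart `map` field, e.g. { "0": ["variables.file"] }.
// `receivedFileFields` lists the file part names actually seen in the request.
// Returns a list of problems; an empty list means the files match the map.
function checkUploadMap(map, receivedFileFields, { maxFiles = 5 } = {}) {
  const problems = [];

  const declared = Object.keys(map);
  if (declared.length > maxFiles) {
    problems.push(`too many files declared: ${declared.length} > ${maxFiles}`);
  }

  for (const [field, paths] of Object.entries(map)) {
    if (!Array.isArray(paths) || paths.length !== 1) {
      problems.push(`file "${field}" must map to exactly one path`);
    }
  }

  for (const field of receivedFileFields) {
    if (!Object.hasOwn(map, field)) {
      problems.push(`unexpected file part: "${field}"`);
    }
  }

  return problems;
}
```

Byte-level caps are usually best enforced by the multipart parser itself. busboy, for instance, accepts a `limits` option
with settings such as `fileSize` and `files`.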

### Untrusted file metadata

Information such as file names, MIME types, and contents should never be trusted. To mitigate risk:

- Sanitize filenames to prevent path traversal or injection issues.
- Sniff file types independently of declared MIME types, and reject mismatches.
- Validate file contents. Be aware of format-specific exploits like zip bombs or maliciously crafted PDFs.
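
The first two points can be illustrated with a short sketch. The allow-list of three formats, the 100-character cap, and the
function names are assumptions for the example. Magic-byte sniffing only establishes the container format and does not
replace content validation.

```js
// Magic-byte signatures for a small allow-list of formats.
const SIGNATURES = [
  { mimeType: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mimeType: 'image/jpeg', bytes: [0xff, 0xd8, 0xff] },
  { mimeType: 'application/pdf', bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
];

// `head` is a Buffer (or byte array) holding the first bytes of the file.
// Returns the detected MIME type, or null when no signature matches.
function sniffMimeType(head) {
  const match = SIGNATURES.find(({ bytes }) =>
    bytes.every((byte, i) => head[i] === byte),
  );
  return match ? match.mimeType : null;
}

// Keeps only the final path segment and a conservative character set.
function sanitizeFilename(name) {
  const base = name.split(/[\\/]/).pop(); // drop any directory components
  const cleaned = base
    .replace(/[^A-Za-z0-9._-]/g, '_') // replace anything outside the allow-list
    .replace(/^\.+/, ''); // no leading dots (hidden files, "..")
  return cleaned.slice(0, 100) || 'upload';
}
```

A caller would compare the result of `sniffMimeType` with the declared MIME type and reject the upload when they differ or
when the result is `null`.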

## Recommendation: Use signed URLs

The most secure and scalable approach is to avoid uploading files through GraphQL entirely. Instead:

1. Use a GraphQL mutation to request a signed upload URL from your storage provider (e.g., Amazon S3).
2. Upload the file directly from the client using that URL.
3. Submit a second mutation to associate the uploaded file with your application’s data.

This separates responsibilities cleanly, protects your server from binary data handling, and aligns with best practices for
modern web architecture.
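
To make the idea behind step 1 concrete, the sketch below signs an object key and an expiry time with a shared secret, and
shows the check a storage endpoint would perform before accepting the upload. This is a simplified HMAC scheme for
illustration only. It is not the signing algorithm of Amazon S3 or any other provider, and real integrations should use the
provider's own SDK. The host `uploads.example.com` and all function names are hypothetical.

```js
import { createHmac, timingSafeEqual } from 'node:crypto';

// HMAC over the object key and expiry (seconds since the epoch).
const sign = (secret, key, expires) =>
  createHmac('sha256', secret).update(`${key}\n${expires}`).digest('hex');

// What the "request upload URL" mutation's resolver might return.
function signUploadUrl({ secret, key, ttlSeconds, now = Date.now() }) {
  const expires = Math.floor(now / 1000) + ttlSeconds;
  const url = new URL(`https://uploads.example.com/${encodeURIComponent(key)}`);
  url.searchParams.set('expires', String(expires));
  url.searchParams.set('signature', sign(secret, key, expires));
  return url.toString();
}

// What the storage side checks before accepting the upload.
function verifyUploadUrl({ secret, url, now = Date.now() }) {
  const parsed = new URL(url);
  const key = decodeURIComponent(parsed.pathname.slice(1));
  const expires = Number(parsed.searchParams.get('expires'));
  if (!Number.isFinite(expires) || expires * 1000 < now) return false;

  const given = Buffer.from(parsed.searchParams.get('signature') ?? '', 'hex');
  const expected = Buffer.from(sign(secret, key, expires), 'hex');
  // Constant-time comparison; lengths must match first.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

Because the key and expiry are covered by the signature, a client cannot reuse the URL for a different object or after it
has expired.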

## If you still choose to support uploads

If your application truly requires file uploads through GraphQL, proceed with caution. At a minimum, you should:

- Use a well-maintained implementation of the
[GraphQL multipart request spec](https://github.com/jaydenseric/graphql-multipart-request-spec).
- Enforce a rule that each upload variable is referenced exactly once.
- Always stream uploads to disk or cloud storage—never buffer them in memory.
- Apply strict request size limits and validate all fields.
- Treat file names, types, and contents as untrusted data.

## Example (not recommended for production)

The example below demonstrates how uploads could be wired up using Express, `graphql-http`, and busboy.
It’s included only to illustrate the mechanics and is not production-ready.

<Callout type="warning" emoji="⚠️">

We strongly discourage using this code in production.
</Callout>

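```js
import express from 'express';
import busboy from 'busboy';
import { createHandler } from 'graphql-http/lib/use/express';
import { schema } from './schema.js';

const app = express();

app.post(
  '/graphql',
  (req, res, next) => {
    const contentType = req.headers['content-type'] || '';
    if (!contentType.startsWith('multipart/form-data')) return next();

    const bb = busboy({ headers: req.headers });
    let operations;
    let map;
    const files = {};

    // Text fields: `operations` holds the GraphQL request, `map` links each
    // file field to the variable paths it should fill.
    bb.on('field', (name, val) => {
      try {
        if (name === 'operations') operations = JSON.parse(val);
        else if (name === 'map') map = JSON.parse(val);
      } catch {
        // Leave the value undefined; the request is rejected in `close`.
      }
    });

    // busboy never emits `close` unless every file stream is consumed.
    // Buffering in memory keeps this demo short; real code should stream to
    // disk or cloud storage instead.
    bb.on('file', (fieldname, file, { filename, mimeType }) => {
      const chunks = [];
      file.on('data', (chunk) => chunks.push(chunk));
      file.on('end', () => {
        files[fieldname] = { buffer: Buffer.concat(chunks), filename, mimeType };
      });
    });

    bb.on('error', next);

    bb.on('close', () => {
      try {
        if (!operations || !map) throw new Error('missing operations or map');
        // Replace each placeholder named in `map` with its uploaded file.
        for (const [key, paths] of Object.entries(map)) {
          for (const path of paths) {
            const keys = path.split('.');
            let target = operations;
            while (keys.length > 1) target = target[keys.shift()];
            target[keys[0]] = files[key];
          }
        }
      } catch {
        res.status(400).json({ errors: [{ message: 'Malformed multipart request' }] });
        return;
      }
      req.body = operations;
      next();
    });

    req.pipe(bb);
  },
  createHandler({ schema }),
);

app.listen(4000);
```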