Sophisticated Simplicity of Modern SQLite
A benchmark-driven guide to tuning SQLite for production workloads


A recent post claiming SQLite reaches 100k TPS using batched BEGIN IMMEDIATE transactions sparked a question: Can I build everything with SQLite?
My experience with SQLite since 2014 suggested it lacks concurrent writer support, yet I wasn't alone in this curiosity. A 2024 Rails talk by @fractalmind clarified matters, debunking the concurrency myths and explaining that SQLite simply needs tuning for modern hardware:
SQLite was built in 2004 and is the most backward compatible software ever. A new database is configured for 2004 hardware by default. But in the last 21 years, computers have evolved—SSDs are omnipresent. To run SQLite in production, you simply need to tune it to 2025 standards.
-- Stephen Margheim aka Fractal Mind (paraphrased)
This is not SQLite vs Postgres
This isn't an argument for SQLite over Postgres, but an exploration of vertical scaling for medium-sized apps. Inspired by Rails' "one-person framework" philosophy, I want to test the limits of a single machine running an embedded database.
While a single instance introduces geographic latency, eliminating network DB latency offers a compelling trade-off. Most of my projects don't hit millions of users, so I'm prioritizing velocity over theoretical scale. If I ever need to migrate to Postgres, I'll count that as a success.
Why SQLite?
The client-server model of databases like Postgres requires extra infrastructure to provision and maintain. Because every query crosses the network before it is processed, there is network latency. Running the database on the same server as the app reduces that latency but doesn't eliminate it, and it shifts the burden of DB ops onto you: running the server, configuring Postgres, backups, and restores.
SQLite, on the other hand, is embedded inside your app. You can create as many SQLite databases as you want because it's just a file. The burden of DB ops like setting up the database and backups is still on you. But that's where this article comes in. I'll set up the database and explain how to do backup ops with Litestream. Everything else will just work.
Benefits:
- The app is self-contained. Anyone can work on the app locally: just pull and `npm install`. No infra.
- Simplified deployment: the app can be compiled as an executable.
- Easier testing: creating databases is as cheap as creating a variable.
- New databases are easy to spin up: have a DB for app data, for queues, for analytics, etc. (DuckDB!).
- Reads are faster than network DBs.
- Writes are faster in low concurrency.
About the benchmarking methodology
Benchmarks often optimize for peak numbers rather than real-world scenarios. Real applications involve connected data, joins, and concurrent writes. To simulate this, I used a blog model (Users, Posts, Tags) with realistic relationships (User has-many Posts, Post has-many Tags) and indices.
The benchmark mimics a multi-process web server with realistic queries:
- Users by time range
- Paginated posts and tags (100 rows per read)
- Posts with joined users and tags
All code (co-authored by Composer 1, Opus 4.5, and Gemini 1.5 Pro) is available on GitHub.
Runtime
Benchmarks run on a 2.4GHz 8-core Intel i9 MacBook (32GB RAM). Node.js and Piscina simulate a concurrent web server (like pm2/gunicorn running with n workers).
I assumed a practical limit of 16 workers to match the logical cores, but pushed the tests past that to observe saturation.
Like a real app, all workers can read and write.
Write Phase
SQLite is a single-writer, multi-reader database. I began by tuning for write throughput, then introduced a mixed workload of 80% reads and 20% writes.
If your workload involves fewer than 20% writes, expect better performance than this benchmark. Higher write ratios will likely degrade performance.
In this phase, we start with a vanilla SQLite database and tune it for writes.
Vanilla SQLite: Concurrency vs Error Counts
Out of the box, SQLite starts failing as soon as write concurrency exceeds one. The chart below shows the error count for approximately 100k writes across increasing worker concurrency.
Note: When the error count is high, the latency metric is inaccurate because it only measures the latency of successful writes.
SQLite fails immediately when concurrent writes collide; it does not automatically queue transactions.
Mitigating Errors with Busy Timeout
Setting PRAGMA busy_timeout instructs the connection to retry for a specified duration before throwing a lock error.
I tested 400ms, 2000ms, and 5000ms timeouts:
There are no significant changes in p99 latency across the various busy timeout settings. The apparent latency drop beyond 32 workers in the 400ms case is caused by lock errors: failed writes aren't measured, which skews the latency metric downward.
Increasing the busy timeout prevents these errors without affecting latency. I recommend a setting between 5 and 10 seconds. This gives us our first knob:
| SQLite Production Configuration Knob | Value | Description |
|---|---|---|
| `PRAGMA busy_timeout` | 5s - 10s | Prevents lock errors |
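For instance, with the better-sqlite3 driver in Node.js (the pragma itself is identical in any driver), a minimal sketch looks like this:

```ts
import Database from 'better-sqlite3';

const db = new Database('app.db');

// Wait up to 5 seconds for a competing writer to release the lock
// before surfacing SQLITE_BUSY to the application.
db.pragma('busy_timeout = 5000');
```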
Enable WAL Journal
By default, SQLite uses DELETE journal mode. DELETE mode locks the database during write operations. Before a change is made to the database file, the original content of the modified database pages is copied into a separate rollback journal file (.db-journal). If a crash occurs, this journal file is used to restore the database to its original, consistent state.
SQLite’s Write-Ahead Logging (WAL) mode does not write changes directly to the database file. Every write transaction appends modified pages immediately to the *.wal file, leaving the main database file unchanged.
Readers see a consistent view by reading from the database file plus any newer pages present in the WAL. This separation is the core reason WAL dramatically improves write latency and read concurrency: writers append sequentially, and readers never block on writers.
The WAL file is therefore a first-class data store, not a temporary buffer. At any moment, the authoritative state of the database is: database file + WAL file
Changes are copied from the WAL back into the database file later during a checkpoint. A checkpoint is not time-based but is triggered by:
- WAL file size thresholds
- Auto-checkpoint settings or an explicit `PRAGMA wal_checkpoint`
- Database close
- Page cache pressure
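Enabling WAL is a one-line, persistent change. A minimal sketch, again assuming the better-sqlite3 driver:

```ts
import Database from 'better-sqlite3';

const db = new Database('app.db');

// journal_mode is persisted in the database file: once switched,
// every future connection opens it in WAL mode.
const mode = db.pragma('journal_mode = WAL', { simple: true });
console.log(mode); // 'wal' when the switch succeeded
```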
The charts below show the average and p99 write latency across different numbers of workers. Lock errors were negligible.
Enabling WAL mode reduces p99 latency by 30-60% for more than 2 concurrent writers. Real-world applications with mixed read/write workloads should see even better performance improvements.
WAL Mode Throughput (writes/sec)
At 1 and 2 concurrent connections, writing to WAL reduces throughput by 43% and 17% respectively. Anything above that improves writes/second significantly.
WAL is the strongest knob we can tune to improve SQLite production performance.
| SQLite Production Configuration Knob | Value | Description |
|---|---|---|
| `PRAGMA busy_timeout` | 5s - 10s | Prevents lock errors |
| `PRAGMA journal_mode` | WAL | Improves write concurrency |
Sync Normal vs Full
By default, SQLite uses `PRAGMA synchronous = FULL`, ensuring every transaction is durably written to disk before returning. In WAL mode, switching to `PRAGMA synchronous = NORMAL` separates logical commits from durable commits. Atomicity and consistency are still guaranteed (the database is never corrupted), but you accept a tiny risk of losing the most recent commits on power failure in exchange for performance.
Trade-offs:
- Safety: Database corruption is impossible.
- Durability: Recent commits (last few milliseconds) may roll back on kernel panic or power loss.
- Performance: `fsync` is removed from the transaction commit critical path.
On my machine, I didn't notice any significant difference between NORMAL and FULL in terms of latency. But these are write-only workloads, so the difference might be more visible in read-heavy workloads. It could also mean that fsync on a Mac is not as expensive as I expected:
| SQLite Production Configuration Knob | Value | Description |
|---|---|---|
| `PRAGMA busy_timeout` | 5s - 10s | Prevents lock errors |
| `PRAGMA journal_mode` | WAL | Improves write concurrency |
| `PRAGMA synchronous` | NORMAL | Reduces fsync (trade durability for speed) |
ChatGPT recommended using NORMAL for most workloads (web APIs, caches, event ingestion), and FULL only if you strictly cannot afford to lose the last committed transaction, even at the cost of higher latency.
Advanced WAL Tuning: Latency Comparison
Next we'll tune some advanced knobs that are very specific to your workload.
- `PRAGMA wal_autocheckpoint`: Adjusts how often we "checkpoint" (transfer) data from the WAL to the main database file.
- `PRAGMA mmap_size`: Reduces system call overhead by mapping the database file directly into memory.
- `PRAGMA temp_store`: Keeps temporary tables and indices entirely in RAM to avoid disk I/O.
Below is a chart of the following configurations, with WAL + Sync NORMAL as the reference:
- WAL + Sync NORMAL
- WAL + NORMAL + Checkpoint 2k
- WAL + NORMAL + Checkpoint 4k
- WAL + NORMAL + Checkpoint 4k + 1GB MMAP
On my machine, the difference in p99 and average latency between these configurations was not very visible. This is likely because this particular test run emphasized write operations. The benefits of MMAP and larger checkpoints are typically more pronounced in read-heavy or mixed workloads, where read lock contention and system call overhead become bottlenecks.
Checkpoints: Quick vs Delayed "Bill Payments"
To understand PRAGMA wal_autocheckpoint, use the "billing analogy". Think of writing to the WAL as accumulating a tab, and running a checkpoint as paying the bill (syncing data to the main database file).
The wal_autocheckpoint setting defines how often you pay that bill (in number of pages).
- Paying Often (Low Threshold, e.g., 1000 pages): The default setting. You pay your bill frequently. This keeps your debt (WAL file size) small and manageable, ensuring that readers have less WAL data to scan. However, the administrative overhead of logging in to pay (invoking the checkpointer and fsyncing) happens frequently, stealing cycles from your application.
- Paying Later (High Threshold, e.g., 2000 or 4000 pages): You let the tab run up higher before paying. This is more efficient because you batch the "payment" work—you perform the expensive checkpoint operation less often. The tradeoff is that the "bill" (WAL file) gets larger. A larger WAL file can slightly slow down readers and means the eventual payment will take longer to process, potentially causing a minor latency spike.
For high-throughput write applications, increasing the checkpoint threshold (e.g., from 1000 to 4000) strikes a better balance, reducing the frequency of checkpoint freezes.
Memory Mapped I/O (MMAP) and Temp Store
The final optimizations involve how SQLite interacts with memory.
- MMAP (`PRAGMA mmap_size`): By default, SQLite reads data from disk using standard system calls (`read()`). This is like asking a librarian to fetch a book for you every time you need to look up a fact. Enabling MMAP (`mmap_size > 0`) lets the OS map the database file directly into the process's memory space. It's equivalent to spreading the books out on your desk; you can access the data instantly without the overhead of asking the librarian. A 1GB MMAP limits syscalls and can significantly reduce CPU usage for reads.
- Temp Store (`PRAGMA temp_store`): Complex queries often create temporary tables or indices. By setting `temp_store = 2` (MEMORY), you force SQLite to build these temporary structures in RAM rather than on disk. This is a "free" performance win if you have memory to spare, preventing unnecessary I/O for transient data.
| SQLite Production Configuration Knob | Value | Description |
|---|---|---|
| `PRAGMA busy_timeout` | 5s - 10s | Prevents lock errors |
| `PRAGMA journal_mode` | WAL | Improves write concurrency |
| `PRAGMA synchronous` | NORMAL | Reduces fsync (trade durability for speed) |
| `PRAGMA wal_autocheckpoint` | 4000 | Checkpoints less often to improve write throughput |
| `PRAGMA mmap_size` | 1073741824 (1GB) | Reduces syscalls by mapping DB to RAM |
| `PRAGMA temp_store` | MEMORY | Stores temp tables in RAM instead of disk |
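Putting the knobs so far together, here's a sketch of a connection factory that applies them on open. It assumes the better-sqlite3 driver; the values simply mirror the table above:

```ts
import Database from 'better-sqlite3';

export function openTunedDatabase(path: string) {
  const db = new Database(path);

  db.pragma('busy_timeout = 5000');       // retry instead of failing on lock contention
  db.pragma('journal_mode = WAL');        // persistent; readers stop blocking on writers
  db.pragma('synchronous = NORMAL');      // fsync at checkpoints, not on every commit
  db.pragma('wal_autocheckpoint = 4000'); // checkpoint every ~4000 pages instead of 1000
  db.pragma('mmap_size = 1073741824');    // 1GB memory map to cut read() syscalls
  db.pragma('temp_store = MEMORY');       // temp tables and indices live in RAM

  return db;
}
```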
Mixed Read-Write Phase
In this phase of the benchmark, we introduce 4 different kinds of read queries to the mix. These queries are:
- `posts_for_user`: Fetches the latest 100 posts for a specific user. This query involves a JOIN with the `user_posts` junction table to filter posts by user ID.
- `posts_in_timeframe`: Retrieves a paginated list of 100 posts created within a specific start and end date range.
- `single_post_with_details`: Fetches a single post along with its author and tags. This is a more complex query requiring multiple LEFT JOINs to bring in data from `users`, `user_posts`, `posts_tags`, and `tags`.
- `users_in_timeframe`: Selects a list of 100 users who joined the platform within a specified time window.
Along with the 4 read queries, we continue to write data to the database. 80% of total queries are read queries and 20% are write queries.
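To make the shape of these queries concrete, here's a sketch of two of them as prepared statements. The table and column names (`user_posts.user_id`, `posts.created_at`, and so on) are assumptions based on the model described earlier, not the exact benchmark schema; see the GitHub repo for the real code.

```ts
import Database from 'better-sqlite3';

const db = new Database('app.db');

// Hypothetical schema, mirroring the Users / Posts / Tags model above.
const postsForUser = db.prepare(`
  SELECT p.*
  FROM posts p
  JOIN user_posts up ON up.post_id = p.id
  WHERE up.user_id = ?
  ORDER BY p.created_at DESC
  LIMIT 100
`);

const usersInTimeframe = db.prepare(`
  SELECT * FROM users
  WHERE created_at BETWEEN ? AND ?
  LIMIT 100
`);

const posts = postsForUser.all(42);
const users = usersInTimeframe.all('2025-01-01', '2025-06-30');
console.log(posts.length, users.length);
```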
Mixed Workload Database setup
The mixed workload database is configured with:
- Busy Timeout: 5s
- Journal Mode: WAL
- Synchronous: NORMAL
- WAL Autocheckpoint: 4000
- MMAP Size: 1GB
- Temp Store: MEMORY
- Page Cache: 64MB
Page Cache is similar to MMAP in that both utilize memory to accelerate data retrieval, but they function differently. The Page Cache (PRAGMA cache_size) is a user-space buffer managed by SQLite, meaning data is copied from the operating system's kernel cache into SQLite’s memory. MMAP, on the other hand, maps the file directly into the process's address space, allowing zero-copy access where the OS manages paging transparently. While MMAP acts as a "second layer" of caching that reduces syscall overhead, the Page Cache remains essential for managing dirty pages and handling write operations effectively.
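One detail worth calling out: `cache_size` accepts either a positive page count or a negative kibibyte value, so the 64MB cache used here corresponds to -65536. A sketch, assuming better-sqlite3:

```ts
import Database from 'better-sqlite3';

const db = new Database('app.db');

// Negative cache_size values are interpreted as kibibytes rather than pages:
// -65536 KiB = 64MB of page cache managed by SQLite in user space.
db.pragma('cache_size = -65536');

// The OS-managed mmap region sits alongside it as a second cache layer.
db.pragma('mmap_size = 1073741824'); // 1GB
```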
Ops per second
The benchmarks start after seeding 50,000 records in the database. Then we run 1,048,576 (2^20) reads and 131,072 (2^17) writes.
The drop from 14 to 20 workers looks anomalous; it could be my laptop throttling under sustained load. The sweet spot for ops per second is between 8 and 14 workers, roughly 80% of the thread count (0.8 x 16 = 12.8).
Detailed Read vs Write Latency
The p99 read latency consistently stays under 6ms, even at 60+ concurrent workers. This is a testament to SQLite's performance.
For writes, the p99 stays under 10ms while concurrency is below the thread count. Beyond that, it starts increasing logarithmically.
I was honestly surprised to see sub 10ms write latency, and sub 5ms read latency. This is just a file on my machine—it's radically simple, faster and easier to work with.
Impact of Page Cache Size on Ops/sec and Latency
In the next step of performance testing, I varied the page cache size and measured the ops/sec and latency. The following page cache sizes were tested: 8MB, 16MB, 32MB, 48MB, 56MB, 64MB, 128MB, 256MB, 512MB, 1024MB, 2048MB, 4096MB.
Ops/sec were highest and latency was lowest at a 256MB cache size. The data doesn't have a conclusive shape, and the variation could be caused by factors beyond the benchmark's control. One likely confounder is the random timing of WAL checkpoints: checkpoints are I/O-intensive events, and if one test run happens to trigger them more often than another purely because of timing alignment, it can significantly skew the average throughput.
Since I got lucky with 256MB, here's the final set of fine tunes for a SQLite DB:
| SQLite Production Configuration Knob | Value | Description |
|---|---|---|
| `PRAGMA busy_timeout` | 5s - 10s | Prevents lock errors |
| `PRAGMA journal_mode` | WAL | Improves write concurrency |
| `PRAGMA synchronous` | NORMAL | Reduces fsync (trade durability for speed) |
| `PRAGMA wal_autocheckpoint` | 4000 | Checkpoints less often to improve write throughput |
| `PRAGMA mmap_size` | 1073741824 (1GB) | Reduces syscalls by mapping DB to RAM |
| `PRAGMA temp_store` | MEMORY | Stores temp tables in RAM instead of disk |
| `PRAGMA cache_size` | -262144 (256MB) | Caches hot pages in RAM to reduce disk I/O |
Tuning at the query level
Beyond database configuration, how you structure your queries and transactions plays a massive role in performance.
BEGIN IMMEDIATE
By default, an SQLite transaction started with BEGIN is deferred. This means it starts as a read transaction and only upgrades to a write transaction when you actually try to write something.
In a high-concurrency environment, this can lead to frequent SQLITE_BUSY errors. Picture two connections both starting a transaction (reading) and then both trying to upgrade to a write lock at the same time. One will fail.
BEGIN IMMEDIATE solves this by acquiring the write lock at the start of the transaction. If it succeeds, you are guaranteed to be able to complete your writes without being blocked by another writer upgrading their lock mid-way.
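Here's a sketch of how that looks with better-sqlite3, whose transaction helpers expose an `.immediate()` variant (the table and column names are illustrative); other drivers let you issue `BEGIN IMMEDIATE` yourself:

```ts
import Database from 'better-sqlite3';

const db = new Database('app.db');
db.pragma('busy_timeout = 5000');

const createPost = db.transaction((userId: number, title: string) => {
  const { lastInsertRowid } = db
    .prepare('INSERT INTO posts (title) VALUES (?)')
    .run(title);
  db.prepare('INSERT INTO user_posts (user_id, post_id) VALUES (?, ?)')
    .run(userId, lastInsertRowid);
});

// .immediate() takes the write lock up front, so the transaction never
// fails halfway through while upgrading from a read to a write lock.
createPost.immediate(42, 'Hello WAL');
```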
Batching Transactions
Every individual INSERT or UPDATE statement in SQLite runs inside its own implicit transaction. If you run 1,000 inserts sequentially without an explicit transaction, SQLite begins, commits, and syncs (fsync) 1,000 times. This is incredibly slow.
The single most effective optimization for writes is to wrap multiple operations in a single transaction:
BEGIN;
INSERT INTO posts ...;
INSERT INTO posts ...;
-- ... 1000 more inserts ...
COMMIT;
This reduces 1,000 fsync operations to just one, often resulting in a 100x-1000x speedup for bulk operations.
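The same batching can be expressed at the driver level. A sketch assuming better-sqlite3, whose `db.transaction()` helper wraps the whole loop in a single `BEGIN ... COMMIT`:

```ts
import Database from 'better-sqlite3';

const db = new Database('app.db');
const insertPost = db.prepare('INSERT INTO posts (title) VALUES (?)');

// One transaction, one fsync, regardless of how many rows are inserted.
const insertMany = db.transaction((titles: string[]) => {
  for (const title of titles) insertPost.run(title);
});

const titles = Array.from({ length: 1000 }, (_, i) => `Post ${i}`);
insertMany(titles);
```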
Implementing a write queue
Even with WAL mode, SQLite enforces a "single-writer" policy. While multiple readers can operate concurrently, only one write transaction can commit at a time. If your code indiscriminately fires off write transactions from multiple concurrent web requests, they will eventually contend for the lock.
While busy_timeout helps by making them wait, a more robust architectural pattern for write-heavy applications is to serialize writes at the application level using a queue.
How it works
- Dedicated Writer: Create a single, dedicated worker (thread or process) whose only job is to write to the database.
- Job Queue: When a web request needs to modify data, it doesn't touch the database directly. Instead, it pushes a "write job" (e.g., a closure or a data object) into an in-memory queue.
- Processing: The dedicated writer pulls jobs off the queue one by one and executes them.
Benefits
- Zero Contention: Since there is only one writer, `SQLITE_BUSY` errors due to write-write conflicts become impossible.
- Smart Batching: The writer can peek at the queue. If there are 50 pending insert jobs, it can wrap them all in a single `BEGIN IMMEDIATE ... COMMIT` transaction, dramatically reducing I/O overhead.
- Backpressure: If the queue fills up, you can handle backpressure gracefully (e.g., return 503 Service Unavailable) rather than timing out threads deep in the database driver.
This pattern essentially turns SQLite into a highly efficient, single-threaded append-log engine for writes, while utilizing its multi-threaded read capabilities for serving data.
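Below is a sketch of this pattern in Node.js with better-sqlite3. The queue cap, batch size, and flush interval are illustrative choices, not values taken from the benchmark:

```ts
import Database from 'better-sqlite3';

type WriteJob = { sql: string; params: unknown[] };

const db = new Database('app.db');
db.pragma('journal_mode = WAL');
db.pragma('busy_timeout = 5000');

const queue: WriteJob[] = [];
const MAX_QUEUE = 10_000;

// Request handlers call this instead of touching the database directly.
export function enqueueWrite(job: WriteJob): boolean {
  if (queue.length >= MAX_QUEUE) return false; // backpressure: caller can return 503
  queue.push(job);
  return true;
}

// Single dedicated writer: drain pending jobs in batches inside one
// IMMEDIATE transaction so they share a single commit and fsync.
const flushBatch = db.transaction((jobs: WriteJob[]) => {
  for (const job of jobs) db.prepare(job.sql).run(...job.params);
});

setInterval(() => {
  if (queue.length === 0) return;
  const batch = queue.splice(0, 500);
  flushBatch.immediate(batch);
}, 10);
```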
Production Readiness
You've tuned SQLite for performance, but production deployment requires addressing a few more practical concerns: accessing the database for debugging, handling complex analytical queries, and ensuring data safety through backups.
Accessing the Production Database
In a "one-person framework" philosophy, sometimes you need to jump into the production database to inspect or fix data quickly. While direct database access in production is generally discouraged in larger teams, for solo developers or small teams, pragmatism wins.
Options for safe production access:
- SSH + SQLite CLI: SSH into your server and use the `sqlite3` command-line tool to open your database file. Set `.mode` to something readable like `column` or `json`.
- Read-only connection string: When just inspecting data, open the database in read-only mode to prevent accidental writes: `sqlite3 'file:/path/to/production.db?mode=ro'`
- Web-based admin panel: Build a lightweight admin interface in your app (protected by authentication) that lets you run read-only queries or perform common operations. Tools like sqlite-web can be run locally by copying the database file down first.
- Litestream restore to local: Use Litestream to restore a recent snapshot locally, then explore it without touching production.
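For programmatic inspection, most drivers can also open the file read-only; a sketch assuming better-sqlite3:

```ts
import Database from 'better-sqlite3';

// A read-only handle for debugging scripts: accidental writes throw
// instead of modifying production data.
const db = new Database('/path/to/production.db', { readonly: true });

const row = db.prepare('SELECT COUNT(*) AS total FROM posts').get();
console.log(row);
```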
What About Complex Queries?
SQLite excels at OLTP (transactional) workloads—the kinds of queries you see in this benchmark: lookups by ID, paginated lists, simple joins. It is not optimized for OLAP (analytical) workloads like complex aggregations, multi-table joins across millions of rows, or heavy GROUP BY operations.
Why? Postgres runs as a separate process with sophisticated query planning, parallel execution, and extensive memory for intermediate results. SQLite is embedded inside your application. When you run a complex query in SQLite, it consumes your application's CPU and memory, potentially blocking other requests.
The SQLite way: Lift complexity to the application layer
Instead of writing a single massive SQL query, break it into multiple smaller, focused queries and compose the results in your application code:
- Bad (OLAP-style): `SELECT category, AVG(price), COUNT(*) FROM products JOIN orders ... GROUP BY category HAVING ...`
- Better (OLTP-style): Run separate queries per category or per time window, then aggregate in JavaScript/Python/etc.
This might feel inefficient, but remember: SQLite has sub-millisecond query latency for simple operations. Running 10 small queries at 1ms each (10ms total) is often faster and more predictable than one complex query that takes 200ms and blocks your app.
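As an illustration of the decomposition, here's a sketch that replaces the single GROUP BY above with one small query per category plus aggregation in application code. The products/orders schema is hypothetical, mirroring the example:

```ts
import Database from 'better-sqlite3';

const db = new Database('app.db');

// Hypothetical products/orders schema mirroring the OLAP example above.
const categories = db
  .prepare('SELECT DISTINCT category FROM products')
  .all() as { category: string }[];

const statsForCategory = db.prepare(`
  SELECT AVG(p.price) AS avgPrice, COUNT(*) AS orderCount
  FROM products p
  JOIN orders o ON o.product_id = p.id
  WHERE p.category = ?
`);

// Many small, fast queries, aggregated in application code.
const report = categories.map(({ category }) => ({
  category,
  ...(statsForCategory.get(category) as { avgPrice: number; orderCount: number }),
}));
console.log(report);
```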
For true analytics:
If you need real analytical workloads, consider:
- Separate analytics database: Replicate your SQLite data to DuckDB nightly for OLAP queries. DuckDB is designed for analytics and can query SQLite databases directly.
- Materialized views: Precompute expensive aggregations in a background job and store results in a summary table.
- Export to data warehouse: For serious BI, export to BigQuery, Snowflake, or ClickHouse periodically.
The key insight: SQLite's embedded nature is a feature, not a limitation. Embrace it by keeping queries simple and doing data processing in your application where you have full control.
WAL Checkpoint Behavior Under Load
Earlier, we tuned PRAGMA wal_autocheckpoint = 4000 to improve write throughput by reducing checkpoint frequency. But what actually happens when a checkpoint runs during peak traffic?
Checkpoints can block writes. During a checkpoint, SQLite copies modified pages from the WAL file back into the main database file. While this happens, the database briefly acquires locks that can stall concurrent write transactions. The impact depends on checkpoint mode:
- PASSIVE mode (the default for auto-checkpoints): Never blocks readers or writers. It copies whatever pages it safely can; if the database is busy, the checkpoint stops short and retries later. This is safe, but it means the WAL can grow unbounded during sustained write load.
- FULL mode: Blocks until all readers finish, then performs the checkpoint. Can cause latency spikes.
- TRUNCATE mode: Like FULL, but also truncates the WAL file to zero bytes afterwards, so the file doesn't keep growing on disk.
Best practices:
- Manual checkpoints during quiet periods: If your application has predictable low-traffic windows (e.g., 3-5 AM), schedule a `PRAGMA wal_checkpoint(TRUNCATE)` to reset the WAL, for example from a nightly cron job. This prevents unbounded growth; see the sketch below.
- Monitor WAL size: Track the size of your `*.db-wal` file. If it regularly exceeds 100MB, your autocheckpoint threshold might be too high for your write volume, or you need manual checkpoints.
- Separate checkpoint worker: Consider a dedicated background worker that runs checkpoints independently, outside your request-handling workers.
- Read-heavy apps: If you have many long-running read transactions, they can prevent checkpoints from completing. Ensure read transactions are short-lived.
The takeaway: Checkpoints are necessary I/O "bill payments" for WAL mode's performance gains. Tune wal_autocheckpoint based on your write volume, but don't ignore manual checkpoints during off-peak hours.
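Here's a sketch of the nightly checkpoint as an in-process job, assuming better-sqlite3; a cron job calling the `sqlite3` CLI works just as well:

```ts
import Database from 'better-sqlite3';

const db = new Database('app.db');

function nightlyCheckpoint() {
  // Copies all WAL pages into the main database file and truncates the
  // WAL to zero bytes. Run during a known low-traffic window.
  const result = db.pragma('wal_checkpoint(TRUNCATE)');
  console.log('checkpoint:', result);
}

// Illustrative scheduling: in production you would anchor this to an
// off-peak wall-clock time (e.g. 03:00) via cron or a scheduler library.
setInterval(nightlyCheckpoint, 24 * 60 * 60 * 1000);
```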
Pitfalls and Observability Gaps
SQLite's simplicity comes with a downside: the observability ecosystem is weak. Unlike Postgres with pg_stat_statements, slow query logs, and rich monitoring tools (pganalyze, Datadog integrations), SQLite offers minimal built-in instrumentation. Without observability, you're flying blind—a production incident caused by a runaway WAL or checkpoint blocking writes will be hard to diagnose.
For production deployments, you'll need to build custom monitoring for:
- WAL size metrics: Track your `*.db-wal` file size over time.
- Checkpoint frequency and duration: How often checkpoints run and how long they take.
- Lock contention metrics: Count `SQLITE_BUSY` errors to identify write conflicts.
- Query latency percentiles: Track p50, p99, and p999 for reads and writes separately.
- Cache hit ratios: Approximate via OS-level disk I/O monitoring tools like `iostat`.
Budget time to build a lightweight monitoring layer that exports metrics to a time-series database (Prometheus, CloudWatch, Grafana Cloud) and set alerts for anomalies like WAL size exceeding 100MB, SQLITE_BUSY error rates above 10/min, or p99 write latency over 50ms. The effort is small compared to the visibility you gain.
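As a starting point, here's a sketch of such a monitoring layer: WAL size from the filesystem plus a counter for `SQLITE_BUSY` errors. The metric names and the `pushMetric` helper are placeholders for whatever exporter you use, and the `err.code` check assumes better-sqlite3, which surfaces the SQLite error code on thrown errors:

```ts
import { existsSync, statSync } from 'node:fs';

// Placeholder for your exporter (Prometheus, CloudWatch, Grafana Cloud, ...).
function pushMetric(name: string, value: number) {
  console.log(`${name}=${value}`);
}

let busyErrors = 0;

// Wrap write calls so lock contention is counted, not just thrown.
export function trackedWrite(fn: () => void) {
  try {
    fn();
  } catch (err: any) {
    if (err?.code === 'SQLITE_BUSY') busyErrors++;
    throw err;
  }
}

// Sample WAL size and error counts periodically.
setInterval(() => {
  if (existsSync('app.db-wal')) {
    pushMetric('sqlite_wal_bytes', statSync('app.db-wal').size);
  }
  pushMetric('sqlite_busy_errors_total', busyErrors);
}, 60_000);
```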
Backup and Replication with Litestream
As mentioned earlier, backups are non-negotiable for production. Litestream is the de facto solution for SQLite replication.
Litestream continuously monitors your SQLite database's WAL (Write-Ahead Log) file and streams changes to cloud storage (S3, Azure Blob, GCS, etc.) in near real-time. It's not a snapshot-based backup—it's continuous replication of every transaction.
If Litestream crashes or loses connection to S3, your application continues working normally—SQLite doesn't depend on Litestream. You just lose backup coverage until Litestream reconnects. Monitor Litestream with health checks and alerting.
For detailed setup instructions, see the Litestream documentation.
Final Thoughts
If tuned correctly, SQLite is an incredible database. It can handle a significant amount of traffic and is particularly well suited for read heavy workloads. But it's not a silver bullet for every application. It can easily be ruled out for OLAP requirements. The observability ecosystem is weak and disaster recovery is not as simple as it should be.
It might be a good choice for you. You can always give an LLM a link to this article and pitch it your specific use case; the information here should help it make a rational choice.
The code for these benchmarks is available at https://github.com/shivekkhurana/sqlite-test.
Shivek Khurana
I make things. Mostly software, but sometimes clothes, courses, videos, or essays.