IndexedDB seems inefficient at bulk mutations, like dumping a huge list of items into an object store – at least that was my first impression from the MDN docs. Unlike SQL, it provides no explicit API for the job (in SQL you may append as many records as you like after an INSERT INTO ... VALUES clause), so all we can do is loop on the client side, and thus we cannot benefit from any internal database optimization (if there is any). Moreover, the mutation requests appear to be issued sequentially only: the tutorial recommends a paradigm of raising each request within the success event callback of the previous one, which is in fact sequential execution. Such code will definitely be slow.

We may conduct a quick benchmark on the above approach:

;(async () => {
  // Start from a clean database on every run
  await new Promise((resolve) => {
    const r = indexedDB.deleteDatabase("test")
    r.onsuccess = r.onerror = resolve
  })
  const items = Array.from({ length: 100000 }, (_, i) => ({ id: i }))
  const db = await new Promise((resolve) => {
    const req = indexedDB.open("test", 1)
    req.onupgradeneeded = (event) => {
      const db = event.target.result
      const store = db.createObjectStore("store", { keyPath: "id" })
      store.createIndex("id", "id")
    }
    req.onsuccess = (event) => resolve(event.target.result)
  })
  // The versionchange transaction auto-commits once onupgradeneeded
  // returns, so open a fresh readwrite transaction for the benchmark
  const store = db.transaction("store", "readwrite").objectStore("store")
  console.time("bulkAdd")
  await bulkAdd(store, items)
  console.timeEnd("bulkAdd")
})()

function bulkAdd(store, items) {
  const failures = []
  return new Promise((resolve) => {
    function _perform(idx) {
      const req = store.add(items[idx])
      const next = () => {
        if (idx === items.length - 1) resolve(failures)
        else _perform(idx + 1)
      }
      req.onsuccess = next
      req.onerror = (event) => {
        // Prevent the unhandled error from aborting the whole
        // transaction, then keep going with the remaining records
        event.preventDefault()
        failures.push(items[idx].id)
        next()
      }
    }
    _perform(0)
  })
}

In practice, we care more about the records that failed than the ones inserted successfully. We thus only record the ids of the failed records, which also improves efficiency at least a little.

The timing is rather unstable, but on average it takes 30~40 seconds to insert 100k records, i.e. 2000~3000 records per second, which is far from ideal.

So why should the requests be arranged like this? I found a statement in the MDN docs:

Note that commit() doesn’t normally have to be called — a transaction will automatically commit when all outstanding requests have been satisfied and no new requests have been made. commit() can be used to start the commit process without waiting for events from outstanding requests to be dispatched.

Unlike many other databases, an IndexedDB transaction commits implicitly, at the instant when the callbacks of all previous requests have been invoked (and returned) and no new requests have been made. Hence a later request must be raised before the previous one completes, typically within its success or error callback; otherwise the transaction will be closed. Raising new requests in callbacks is safe, but not very efficient.
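To see the auto-commit in action, consider this minimal browser-only sketch (the database handle, store name, and function name are hypothetical, for illustration): awaiting anything other than an IndexedDB request lets the transaction commit underneath us.

```javascript
// Browser-only sketch, assuming `db` is an already-open IDBDatabase
// with an object store named "store" keyed by "id".
async function demonstrateAutoCommit(db) {
  const store = db.transaction("store", "readwrite").objectStore("store")
  store.add({ id: 1 }) // fine: the transaction is still active
  // Yield to the event loop via a macrotask; the pending request's
  // callback fires, no new request exists, so the transaction commits.
  await new Promise((resolve) => setTimeout(resolve))
  store.add({ id: 2 }) // throws TransactionInactiveError
}
```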

Can we do better? Absolutely. We can perform the mutations “concurrently”, that is, raise all the requests at once.

function bulkAdd(store, items) {
  const failures = []
  const promises = []
  for (const item of items) {
    const req = store.add(item)
    promises.push(
      new Promise((resolve) => {
        req.onsuccess = resolve
        req.onerror = (event) => {
          // Prevent the error from aborting the shared transaction
          event.preventDefault()
          failures.push(item.id)
          resolve()
        }
      })
    )
  }
  return Promise.all(promises).then(() => failures)
}

The execution time is ~15 seconds now, around 2x faster than the first implementation.

But wait … is this implementation reliable? Is it possible that at some point all previous requests have completed, but there are still new requests to be raised? The answer is yes, but the transaction won’t be closed for this, since the callbacks have not yet been invoked.

Recall that JavaScript has a single-threaded execution model. All callbacks reside in a queue and are executed sequentially when the thread is idle. This mechanism ensures that the mutation callbacks won’t run before all requests have been raised, as the raising operations and the callbacks execute on the same thread. So even if some requests complete early, their success or error callbacks will still run only after all requests are created. The implementation is therefore reliable.
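The run-to-completion guarantee is easy to verify without IndexedDB at all. In this sketch, callbacks scheduled during a synchronous loop cannot run until the loop (and the whole current job) has finished:

```javascript
// Callbacks scheduled while synchronous code is running are only
// queued; none of them can execute until the current job finishes.
const order = []
for (let i = 0; i < 3; i++) {
  // In a multithreaded model this callback could fire mid-loop;
  // in JavaScript it is merely queued.
  Promise.resolve().then(() => order.push(`callback ${i}`))
  order.push(`request ${i}`)
}
// The loop has finished, yet no callback has run.
console.log(order) // → ["request 0", "request 1", "request 2"]
```

The same reasoning carries over to IndexedDB: every `store.add()` call is raised synchronously before any success or error callback gets a chance to fire.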

Now we have reached a speed of ~6000 records per second. Can we make it even better? Of course yes!

If you read the code carefully, you may notice that req.onsuccess does redundant work. A resolve function is passed as the success callback just to signal when the request completes. However, we are not interested in the completion time of each individual request; only the completion of the last one matters.

We may expect the callbacks to be executed in the order the requests were raised. They are, actually. Weirdly, though, the MDN docs do not mention this, and we have to dive into the IndexedDB specification to find clues. After struggling through the verbose document, I observed something in the section “Asynchronously executing a request”. Briefly speaking,

  • A request won’t be processed before the processed flags of all previous requests are set. (See 5.6.5.1: Wait until request is the first item in transaction’s request list that is not processed.)
  • A request queues a task immediately after its processed flag is set, within which its callbacks are invoked. (See 5.6.5.6: Queue a task to run these steps.)

which means the event-firing tasks are queued in the same order as the requests were raised. Consequently, we can learn when all requests have completed by listening only to the success/error events of the last request:

function bulkAdd(store, items) {
  return new Promise((resolve) => {
    const failures = []
    let req
    for (const item of items) {
      req = store.add(item)
      req.onerror = (event) => {
        // Prevent a failed add from aborting the whole transaction
        event.preventDefault()
        failures.push(item.id)
      }
    }
    // Only the last request gets extra listeners; its events fire after
    // every other request's events thanks to the ordering guarantee
    req.onsuccess = () => resolve(failures)
    req.onerror = (event) => {
      event.preventDefault()
      failures.push(items[items.length - 1].id)
      resolve(failures)
    }
  })
}

This time we have a better running time of 10~12 seconds, i.e. 8000~10000 records per second, nearly 4x faster than the first implementation! The table below summarizes how the performance improved along the way:

impl.                         time (s)   speed (r/s)
sequential                    30~40      2000~3000
concurrent                    ~15        ~6000
concurrent + fewer listeners  10~12      8000~10000

The last implementation is also what Dexie.js does for efficient bulk mutations. The library states on its homepage:

Its bulk operations utilize an often-overlooked feature in IndexedDB, ignoring success callbacks when possible.

but hardly any documentation or posts can be found that elaborate on the mechanism. That is also why this post was written.

References