org.openntf.domino.transactions.DatabaseTransaction

This works in my current branch of the OpenNTF Domino project…

private void iterateAllDocuments(Set<Document> secondReference) {
    // db and docCount are fields on the enclosing class
    org.openntf.domino.transactions.DatabaseTransaction txn = db.startTransaction();
    DocumentCollection dc = db.getAllDocuments();
    for (Document doc : dc) {
        docCount++;
        if (docCount % 100 == 0) {
            // hold a second reference to every 100th document for later processing
            secondReference.add(db.getDocumentByID(doc.getNoteID()));
        }
        doc.replaceItemValue("TxnTest", new Date());
    }
    txn.rollback(); // discard every TxnTest write made above
}

…followed by…

private void iterateSecondReferences(Set<Document> secondReference) {
    org.openntf.domino.transactions.DatabaseTransaction txn = db.startTransaction();
    for (Document doc : secondReference) {
        doc.replaceItemValue("TxnTest2", new Date());
    }
    txn.commit(); // save every TxnTest2 write made above
}

Can you see what’s happening? Let me be a little more explicit.
In the first loop, every document gets the current time written into a field called TxnTest, and every 100th document is added to a set for later processing. When the loop is complete, all of the TxnTest writes are rolled back.

In the second loop, every document that was set aside in the first loop gets the current time written into a field called TxnTest2. When the loop is complete, all of these writes are committed and saved to the documents.

The result: transactional boundaries for Documents being updated in batch.

If the database supports document locking, the lock is respected. The first time any operation changes content in a Document, the API locks that Document. The lock is held until the transaction is committed, and after the updates are saved, the lock is released.
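To make that lifecycle concrete, here is a minimal sketch (someNoteId and the field names are mine, purely for illustration):

DatabaseTransaction txn = db.startTransaction();
Document doc = db.getDocumentByID(someNoteId); // someNoteId: any note ID you already hold
doc.replaceItemValue("Status", "processed"); // first write: the API takes the lock here
doc.replaceItemValue("Modified", new Date()); // further writes reuse the same lock
txn.commit(); // the updates are saved, then the lock is released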

If the document is being removed instead, then the remove is queued into the transaction and only executed when the transaction itself is committed. If the transaction is not committed, the remove is never actually executed on the Database.
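As a sketch of that behavior (the "Obsolete" field is just an illustrative delete condition):

DatabaseTransaction txn = db.startTransaction();
for (Document doc : db.getAllDocuments()) {
    if (doc.hasItem("Obsolete")) {
        doc.remove(true); // queued in the transaction, not yet executed
    }
}
txn.rollback(); // the queued removes are discarded; nothing is deleted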

A Document that is processed during a transaction but not changed in any way is also not locked. So you can safely do the following…

DatabaseTransaction txn = db.startTransaction();
for (Document doc : db.getAllDocuments()) {
    if ("foo".equals(doc.getItemValueString("foo"))) {
        doc.replaceItemValue("foo", "bar");
    }
}
txn.commit();

And that will work the way you would expect, locking and updating only the documents where the field “foo” has a value of “foo”.

Interestingly, since the DatabaseTransaction doesn’t block a normal Document.save() operation, you can still just perform a save directly in your code. Doing so is faster: on 10,000 documents it cuts execution time by a little less than half, so batch transactions are definitely slower than individual saves. The performance differences are interesting in their own right. In my test environment, I see the following for 10,000 documents…

No transaction, just .save() during iteration: 5.9s
Transaction boundary with rollback: 7.3s
Transaction boundary with commit: 9.5s
Transaction boundary with commit and document locking: 17.7s
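For reference, the no-transaction baseline in that comparison is just a plain save loop, along these lines:

for (Document doc : db.getAllDocuments()) {
    doc.replaceItemValue("TxnTest", new Date());
    doc.save(); // individual save per document; no transaction involved
}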

Thanks to the streamlined development of the OpenNTF Domino API, this solution took me 9 hours and 43 minutes to write, including a break for lunch and a shower. I’m not telling you that to claim that I’m some magical hacker-of-legend, but to point out that having a modern API to work with delivers a dramatic increase in developer productivity. It would have taken ten times that long to reach the same goal with the original API.

So if you’re on the fence about whether to work with this library, not only should you consider the performance and features of the API itself, but you should consider how it affects your productivity. Just this past Tuesday, I was writing some code and found myself dealing with a ViewEntryCollection.getNextEntry() pattern — and I almost wept at the idea that I was still coding this way.

Let’s climb out of the primordial ooze of 1.0 APIs and discover the amazing productivity that comes from walking upright and having opposable thumbs. Let’s ascend together into productivity and reliability. Let’s move Domino to its rightful place in the evolutionary chain.

5 comments on “org.openntf.domino.transactions.DatabaseTransaction”
  1. Very nice! It surprised me at first that there’s no .save() in the loop, but it makes sense.

  2. lehmannkarsten says:

    Looks very cool! We have a similar concept of transactions in our framework, but a bit more high-level, since it abstracts from the DBMS. What I would suggest (if it’s not already there; I’m currently on vacation and can’t look into the code) is adding a validation chain for the whole transaction (when it’s about to be committed), so that the transaction can be validated as a whole with old/new values for each changed document, and if validation fails, the transaction gets rolled back. Have you added some checks so that the code does not end in a deadlock when two threads lock the same docs in parallel in a different order? And what happens if I change more docs in a transaction than there are free handles? If only the Notes C API would expose real transactions with translog support…

    • thentf says:

      Lots of great ideas there, Karsten. A few thoughts…

      re: validation. Adding schemas to validate against is on the agenda for the API. You can expect me to crow very loudly about it when it’s implemented. I’m eager to see it happen.

      re: deadlock check. No, not yet. I would not regard the current design as thread-safe. This was the one-day implementation. However, the document is locked at the first time there is a write on the document, so unless the two threads were running under the same credentials (which is a horrible idea for other reasons, see here https://github.com/OpenNTF/org.openntf.domino/wiki/On-Recycling) the second attempt to get the lock would fail.

      However, there is no behavior yet for what happens if the lock attempt fails. So that needs to get done at some point.

      re: handle count. I’m not tracking this yet, but I intend to add it in the next few days. I’m going to just issue logging events until it reaches some high-risk numbers, probably in the 50,000 range. (Actually, now that I type that, it occurs to me that I can track outstanding Cpp handles for the entire API. I should just report at that level.)

      re: real translog support. Yes, I agree. I’ve asked for this. IBM didn’t seem interested.

  3. lehmannkarsten says:

    Another idea for transactions or the API in general: a chain of processing classes that get called before each document gets saved / the transaction gets committed. It would be useful for computing fields or for logging. In our framework, we use this pattern together with Eclipse extension points to make document modification extensible and to log the username, HTTP request info (if running in a web server), and before/after values for the data objects/documents. We first apply all changes to the transaction, then seal it so that it’s read-only, and validate. If nothing fails, we send a ‘queryToCommit’ to the database wrapper and then finally commit the changes.

    • thentf says:

      That’s a good idea. In fact, along with the schema, maybe we’ll just do a Java-level notification implementation. That way we can just add EventListeners to various things, like NSF opens, document opens, creates, updates and deletes, view updates, pretty much anything we can think of.

      I think what we might do is create some interfaces for “TransactionDb”, “EventDb”, etc., and then build some wrappers that can be called statically to implement the behavior on an existing Database object, kind of like the wrappers for synchronized collections.
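
      Purely as a hypothetical sketch of that wrapper idea (none of these names exist in the API yet; they only illustrate the shape, in the spirit of Collections.synchronizedList):

      import java.util.ArrayList;
      import java.util.List;

      import org.openntf.domino.Database;
      import org.openntf.domino.Document;

      public final class EventDatabases {
          private EventDatabases() {}

          public interface DocumentListener {
              void documentUpdated(Document doc);
          }

          // Decorates an existing Database with event notification, the way
          // Collections.synchronizedList decorates an existing List.
          public static EventDatabase wrap(Database db) {
              return new EventDatabase(db);
          }

          public static final class EventDatabase {
              private final Database delegate; // pass-through calls would delegate here
              private final List<DocumentListener> listeners = new ArrayList<DocumentListener>();

              EventDatabase(Database delegate) {
                  this.delegate = delegate;
              }

              public void addListener(DocumentListener listener) {
                  listeners.add(listener);
              }

              // Any write funneled through the wrapper notifies every listener.
              public void update(Document doc, String item, Object value) {
                  doc.replaceItemValue(item, value);
                  for (DocumentListener listener : listeners) {
                      listener.documentUpdated(doc);
                  }
              }
          }
      }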
