Agent Profiling is a new feature in R7 which can help to analyse,troubleshoot and monitor the lotus script code.Profile the agent help us more accurately target the performance problems.

To profile an agent, the agent profiling has to be turned on for that particular agent. This setting is on the second tab of the Agent Properties box.



After the profiling toggle is turned on, the next time the agent runs it will be profiled. Agents can be profiled regardless of how they run (for example, as a scheduled agent, as a Web agent, or manually from the Action menu). The profiling information is stored in a Profile document in the database associated with the agent.

To view profiling information, select the agent you are profiling in Domino Designer, and then choose Agent - View Profile Results



you see the name of the agent and the time stamp of when profiling was done. Elapsed time is the total amount of time the agent ran, followed by a total measured time, which is typically somewhat smaller because time values are rounded down for display purposes. For example, the values under one millisecond are displayed as zeros in the following table. The profiling table contains the class, the method, the operation, the total number of calls to that method, and the total amount of time spent on all calls to that method. The information in the table is sorted in descending order, showing the methods where the most amount of time was spent at the top.

Profiling results are presented in a document based on the hidden form $BEProfile. The result document contains a heading listing the name of the agent and the creation time. The Body item of the result document contains a table with a row for each Domino Objects method called and five columns:


  • Class -- The name of a Domino Objects class using normalized names such as Session, Database, and Document.
  • Method -- The name of a Domino Objects method or property using normalized names such as CurrentDatabase, AppendItemValue, and Save.
  • Operation -- For properties, the type of operation: Get or Set.
  • Calls -- The number of times the method or property was called.
  • Time -- The amount of time the calls consumed in milliseconds. The symbol "<" means not enough time to calculate.

One of the handy option Quickplace provides to view Office documents as Inline attachment.It can be achieved by selecting following options while creating a new page.



Quickplace opens Office documents as a inline attachment.It is easy and time saving for end users to access those attachments.

MS-Word


MS-Excel


MS-Powerpoint


But,Sometimes user comes with the requirement saying I wish to view pdf in a same way as office documents.Here are the code and steps to achieve the same.


  • Create a new form (Customize-> Forms ->Imported HTML form).

  • Create html file with following code

    (HTML code is not allowed here,Drop me a mail for HTML file)

  • Select newly created html file,name as "Inline Pdf" and publish the page.


  • Create new page with the "Inline pdf" option.


    Here is a result


This article introduces pre-delivery mail Agents that can process incoming mail before it reaches a user's mail box. These Agents can file new messages in folders other than the Inbox, remove large attachments to conserve disk space, or even delete it.

Notes R5 includes new pre-delivery mail agents that you can set up to run before new mail arrives. This means that you can do things like:


  • File newly delivered messages into a folder other than the Inbox when those messages meet certain conditions. For example, you may want to file bug reports into a Bugs folder. Because the agent runs when the message is delivered, the user does not see the message appear in the Inbox and then move to another folder.
  • Remove large attachments from messages before delivering those messages to the user's mail box. The agent removes the attachment before the entire message writes to the user's mail box; thus, you can conserve disk space and eliminate the need to replicate extra data.
  • Determine (based on something about the incoming message) that the message should not be delivered and delete it. For example, you may want to delete messages from a known "Spam" user. Again, you can save both your disk and replication resources.

This article will talk more about pre-delivery mail agents, and how you can use them to perform these operations. We will cover the agent's behavior, the restrictions imposed on this type of agent, and the situations where you should use the agents, versus when you should use the post-delivery mail agents. Then, we'll look at how to debug the agents, and go over some specific examples. You'll see how these agents can be useful for efficiency and storage capacity, as well as usability.

Note: Prior releases of Domino and Notes are not aware of pre-delivery mail agents and the new trigger type "Before new mail arrives." If an R5 server that has pre-delivery agents enabled fails over to a pre-R5 server, the router will deliver the mail without executing the pre-delivery agent. If you create a pre-delivery agent on a Notes R5 client and then attempt to edit or turn it off from a pre-R5 client, you will receive an error that the agent has an unknown trigger.

An introduction to pre-delivery mail agents

Mail agents in R5 now come in two types of flavors according to when they run: before new mail arrives or after new mail arrives. The new pre-delivery mail agents are identified by the new agent trigger "Before new mail arrives." The old-style R4.x mail agents are still supported in R5, and are now identified by the agent trigger "After new mail arrives." We will refer to these old-style agents as post-delivery mail agents. The following screen shows the new mail agent triggers:

Figure 1. New agent triggers


The pre-delivery mail agents process mail before it arrives in the user's mail database; for example, to move incoming mail to a folder. The agents are run by the mail router, so they are guaranteed to run before a new mail message deposits into the user's mail file. In contrast, post-delivery agents run after a mail message arrives in the user's mail database. So, you can use post-delivery agents for operations that do not depend on the timing of mail delivery; for example, to respond to mail messages.

Although you can use both types of mail agents in the same mail database, you can enable only one pre-delivery mail agent at a time. You can have an unlimited number of disabled pre-delivery mail agents. In addition, you can have an unlimited number of enabled and disabled post-delivery agents. The following screen shows the agent view of a mail database containing both pre-delivery and post-delivery mail agents. Notice that the blue arrow below the checkbox identifies the currently active (enabled) pre-delivery agent:

Figure 2. Agent list with enabled arrow


The order of execution

If you enable both pre-delivery and post-delivery mail agents in a database, the pre-delivery agent always execute first (before the mail message is deposited into the database). Then, the post-delivery agents execute (after the mail message has been deposited into the database). The post-delivery agents are triggered by the new message, even if the pre-delivery agent filed it into a folder other than the Inbox. If the pre-delivery mail agent deletes a mail message, the message is never delivered to the database. This means that the post-delivery agents are not invoked for this particular message.

In addition to mail agents, R5 provides two other mail filtering options: new router controls and mail template rules. You can use the built-in router controls to allow mail only from designated domains, allow mail from designated organizations, deny mail from designated organizations, and to impose size restrictions on incoming mail. With the mail template rules, you can specify to watch for messages from a certain sender (such as your boss), or messages that contain a certain subject, and then select what to do when those messages arrive (copy or move them to a folder, delete them, or change the importance of the message).

So, if you use all of the mail filtering options in R5, the options execute in the following order:

  1. The router controls verify that a message is allowed to route through the domain.
  2. The pre-delivery mail agent processes the incoming message.
  3. The action specified in the mail template rules occurs.
  4. The post-delivery mail agents process the message.

Configuration settings

Because pre-delivery mail agents are run by the router, Agent Manager settings have no effect on these agents. Therefore, these agents have separate configuration settings. First, the pre-delivery mail agents have a separate maximum execution time, specified in the "Pre-delivery agent timeout" field in the "Router/SMTP" /Restrictions and Controls/Delivery Controls tab of a server's Configuration Settings document. (shown below) By default, this setting is 30 seconds. (LotusScript and Java agents that run by all other triggers use the execution time specified in the Server Tasks/Agent Manager tab of the Server document.)


Figure 3. Configuration Settings document


If the pre-delivery agent exceeds the maximum timeout, it aborts in the middle of its execution. The terminate event fires, and thus gives you, the agent writer, a chance to do some minimum amount of work to clean up. If you write the agent in LotusScript, any files that the agent opened close automatically. If you write the agent in Java, the JVM (Java Virtual Machine) schedules all objects used in the agent for garbage collection. Although it may not happen immediately, the JVM closes any open files when the corresponding objects are collected.

In addition, you can control the number of concurrent pre-delivery mail agents that can execute by configuring the maximum delivery threads for the router, specified in the "Router/SMTP" /Restrictions and Controls/Delivery Controls tab of a server's Configuration Settings document. The number of delivery threads can vary from 3 to 25. By default, Domino determines the number of delivery threads based on the size and performance characteristics of the server.




Designing with pre-delivery mail agents

When designing your mail processing application, you can put both pre-delivery and post-delivery mail agents to work for you. If possible, try to separate your processing into whatever is "critical" to do before a message reaches the user's mail box, and whatever can occur after the message reaches the user's mail box. For example, you might want to have a pre-delivery agent strip off large attachments, and set a flag when it performs the operation. Then, the post-delivery agent can notify the mail sender that the message contained an attachment that was too large to deliver.

When you begin designing your pre-delivery mail agents, you'll need to keep the following things in mind:

  1. Pre-delivery agents have built-in restrictions to ensure their efficiency
  2. There are some key coding differences for pre-delivery and post-delivery mail agents
  3. Pre-delivery agents can run automatically when servers failover
  4. Folder operations work differently for pre-delivery agents
  5. Debugging pre-delivery mail agents is different than other agents

The next sections describe each of these areas in more detail.

Built-in restrictions for pre-delivery mail agents

Because the router executes the pre-delivery mail agents, it's vital for these agents to be as short and efficient as possible. Otherwise, they may slow down the router, which would be highly undesirable for server performance. So, to help ensure the efficiency of the pre-delivery agents, we placed the following restrictions on these types of agents:

  • As mentioned earlier, you can enable only one pre-delivery agent per database. This allows Domino to retrieve an agent in an efficient manner for any number of users.
  • You cannot set up pre-delivery mail agents to call other agents. This facilitates an efficient caching scheme.
  • You cannot set up pre-delivery mail agents to modify attachments. The agents can only examine attachments and detach them. Currently, there are no methods that can modify attachments; you can only modify them by using an OLE operation. Therefore, this restriction means that the methods Activate and DoVerb on NotesEmbeddedObject are not allowed.
  • As mentioned earlier, the maximum execution time for pre-delivery agents is separate from other agents, and we recommend you keep the setting at a much smaller number than for other agents. By default, the setting is 30 seconds.

Coding differences for pre-delivery and post-delivery mail agents

Remember that pre-delivery agents execute before the new mail message is written to the user's mail box. This means that as an agent writer, you will not find this new document in any collection obtained from the user's mail database. The new document is available only through the document context. In general, you will need to slightly modify the logic that you used in R4.x mail agents to work with the new mail trigger (that is, you will need to change how you obtain the document that is being delivered).

Another difference between post-delivery (R4.x) mail agents and the new pre-delivery agents is that with the new agent, you know that it will operate on only one document -- the new mail message that the router is delivering. In contrast, a post-delivery mail agent can operate on any number of documents, because the agent runs on all new documents that arrived in the user's mail database since the last time it ran.

Agent failover for pre-delivery mail agents

The router executes pre-delivery mail agents on the same server where it delivers mail. If mail delivery fails over to another server in a cluster, the agent execution fails over as well (provided the second server is also an R5 server). That is, the router on the second server automatically runs the pre-delivery mail agent on any mail it delivers.

In contrast, post-delivery agents are designed by default to run on the home mail server of the agent signer (the person who last modified the agent). Before the agents run, they perform a check to see if the current server is the home mail server of the agent signer. The home mail server is determined by taking the name of the agent signer, performing a lookup in the Domino Directory, and retrieving the mail server from the user's Person document. If the home mail server is not the same as the server on which the agent is attempting to run, the agent does not run. This means that the post-delivery agent does not run unless it resides on the user's mail server. Mail-in databases also should reside on the same server as the mail server listed in the user's Person document. You can change this default behavior by setting the NOTES.INI AMgr_DisableMailLookup variable to 1. This setting suppresses the check for the user's home mail server and allows the agent to run on any server. Be aware that depending on the logic of your agent, allowing the agent to run on any server may cause replication conflicts. (This Agent Manager setting has no effect for pre-delivery mail agents, because the router runs those agents.)

Managing folders in pre-delivery mail agents

Pre-delivery mail agents operate on a message while it's in transit -- that is, before the router delivers the message to the user's mail box. This results in the following special behavior in folder manipulation operations:

  • When the router delivers the message to the user's mail database, only one (the first) PutInFolder operation takes effect. If your agent uses LotusScript or Java, the following runtime error appears if more than one PutInFolder operation is detected: "Invalid sequence of operations in mail pre-delivery agent." If you are using a simple agent and have multiple move operations, the Agent Log generates a warning notifying you which folder operations have been ignored.
    Because only one PutInFolder operation can take effect, any newly delivered documents can appear in a maximum of two folders (Inbox and any other folder), but in any number of views at the same time. Note that views and folders are two different things. Membership in a view is determined by the selection formula for the view (All Documents selects all documents). For example, you can create a view called "All Mail From Julie" that selects all documents that have "Julie" in the From field. You can define any number of such views, and one document can match any number of these criteria so that it can appear in any number of views. Folders, on the other hand, are populated explicitly. By default, the router inserts a document into the Inbox folder on delivery. The agent can decide not to do that, and can also decide to put the document into another folder. So, when you use a pre-delivery mail agent, the document can appear in zero, one, or two folders, but in any number of views.

  • If the pre-delivery mail agent removes the message from the Inbox via the operation RemoveFromFolder("($InBox)"), the mail message is delivered, but only appears in the All Documents view.
  • If you want a new mail document to appear in only a different folder and not in the Inbox, the agent needs to perform two folder operations: PutInFolder("NewFolder") and RemoveFromFolder("($InBox)").

The following table summarizes the sequences and results of folder methods.


Figure 4. Folder operations table


Debugging pre-delivery mail agents

Because the pre-delivery mail agent operates on the message being delivered, you cannot test or run the logic of this agent effectively as a manual agent. If you do, you will receive the error "No documents have been selected" since the context is not set properly.

However, you have the following options for debugging pre-delivery mail agents:

  • You can change the agent trigger to another type, such as "Selected documents," and then verify that the agent logic works correctly when the agent context is the document selected in a view. When you are satisfied with the logic, switch to the "Before mail arrives" trigger.
  • You can debug the logic of the agent by writing debug information to the Agent Log.
  • You can use MessageBox or Print methods to print debug information to the server console and server log.

Since the router executes pre-delivery mail agents, the Agent Manager NOTES.INI settings for logging and debugging have no effect. To control the errors that log to the server console for pre-delivery agents, you can specify the "Logging level" setting for the router in the "Router/SMTP" /Advanced/Controls tab of a server's Configuration Settings document. You can set the Logging level to Minimal, Normal (the default), Informational, or Verbose. If you set this variable to Verbose, your agent errors log to the console. Note that you will also get a very verbose output of the router operations. By using this separate setting to control the output of pre-delivery mail agents, you can tune the performance of the router, while still having your other agents generate as much output as needed for your other operations.

You will notice that the output from the pre-delivery agent on the server console is prefixed by "Addin:", as shown in the following:

"Addin: Agent message box: Message generated by mail pre-delivery agent."
This happens because the messages are generated by the router task (which is executing Agent Manager routines via API calls) and the router task is a server add-in. The same prefix appears on the error messages generated by the agent to the server console.




Examples of pre-delivery mail agents

Now let's put everything we learned about the pre-delivery mail agents into action. We will go over the following examples of some typical operations you can perform with the pre-delivery agents:

  • Filing a message into a folder other than Inbox based on the subject of the message
  • Detaching attachments if they are larger than our threshold
  • Deleting a message and forwarding a copy to another person
  • Splitting a task into critical and non-critical portions, and using both types of mail agents to perform the task

Please note that the sample agents include a lot of debugging code for illustration purposes. To write these agents for a production system, you should remove all debug code (including the writing to the NotesLog, print, and MessageBox statements) after you complete your testing to make the agents as small and as efficient as possible.

Example One: Filing a message

This LotusScript agent files a message into a folder other than Inbox if the subject of the message is "Vacation request."

Sub Initialize
Dim session As New NotesSession
Dim note As NotesDocument
Dim dbug As NotesLog
Dim db As NotesDatabase
Dim it As NotesItem
Set session = New NotesSession
Set sourcedb = session.CurrentDatabase
REM Log steps in our processing for debug purposes
Set dbug = New NotesLog("Router log")
dbug.LogActions = True
dbug.OpenAgentLog
dbug.LogAction("begin")
Set db = session.CurrentDatabase
REM Make sure we have the note set correctly
If db Is Nothing Then dbug.LogAction("db is not set") Else dbug.LogAction("db is set")
Set note = session.DocumentContext
If note Is Nothing Then dbug.LogAction("note is not set") Else dbug.LogAction("note is set")
REM Note the Subject of all messages
dbug.LogAction("Subject ->" + note.Subject(0))
REM Is this message has the special subject, store it in the special folder
If note.Subject(0) = "vacation request" Then
Call note.PutInFolder( "Vacation" )
REM PutInFolder leaves a message in the Inbox view as well.
REM Since we want to have it only the Vacation Folder we need to remove it from Inbox
Call note.RemoveFromFolder("($InBox)")
dbug.LogAction("File into Vacation Folder")
End If
dbug.LogAction("done")
dbug.Close
End Sub


Example Two: Detaching attachments

This LotusScript agent detaches attachments that are bigger than a certain size (MaxSize).

Sub Initialize
Dim session As New NotesSession
Dim db As NotesDatabase
Dim doc As NotesDocument
Dim dbug As NotesLog
Dim rtitem As Variant
Dim fileCount As Integer
Dim it As NotesItem
REM Specify the size limit for attachments
Const MaxSize = 5000
fileCount = 0
Set dbug = New NotesLog("Router log")
dbug.LogActions = True
dbug.OpenAgentLog
dbug.LogAction("begin")
REM get the incoming mail message
Set doc = session.documentcontext
REM Log the subject name of the message for debug purposes
Set it = doc.GetFirstItem("Subject")
dbug.LogAction("doc subject from context" + "-> " + it.Text)
Set rtitem = doc.GetFirstItem( "Body" )
If ( rtitem.Type = RICHTEXT ) Then
Forall o In rtitem.EmbeddedObjects
REM Note how many files we have processed
fileCount = fileCount + 1
dbug.LogAction("file count:"+Cstr(fileCount))
If ( o.Type = EMBED_ATTACHMENT ) And ( o.FileSize > MaxSize ) Then
Call o.ExtractFile( "c:\tmp\newfile" & Cstr( fileCount ) )
Call o.Remove
REM Note that we removed an attachment
dbug.LogAction("attachment removed")
REM Created a field noting that we removed an attachment
doc.stripped = "yes"
Call doc.Save( True, True )
End If
End Forall
End If
REM Finish up agent log processing
dbug.LogAction("Mail preprocessing agent is done")
dbug.Close
End Sub



Example Three: Deleting messages

This LotusScript agent deletes messages that come from "Joe Spam" after forwarding them to the postmaster.

Sub Initialize
Dim session As New NotesSession
Dim note As NotesDocument
Dim dbug As NotesLog
Dim db As NotesDatabase
Dim it As NotesItem
Set session = New NotesSession
Set sourcedb = session.CurrentDatabase
Set dbug = New NotesLog("Router log")
dbug.LogActions = True
dbug.OpenAgentLog
dbug.LogAction("begin")
Set db = session.CurrentDatabase
Set note = session.DocumentContext
dbug.LogAction("Subject ->" + note.From(0))
If note.From(0) = "CN=Joe Spam/O=SpamFactory" Then
Call note.Send(False, "Administrator/Lily")
dbug.LogAction("Send memo")
Call note.Remove(True)
End If
dbug.Close
End Sub


Example Four: Using both pre-delivery and post-delivery agents

This example shows how to perform only critical mail processing in the pre-delivery agent, and all other processing in the post-delivery agent. Our overall goal is to strip any large attachments from incoming mail and to notify the mail senders that their attachments weren't delivered due to their size. The pre-delivery agent strips off large attachments, since this reduces the overhead and disk space. The post-delivery agent notifies the mail sender, since this part of the task is not time critical.

For the first part of the task, we will use the pre-delivery agent created in Example Two. Notice in that agent that when we remove an attachment, we set the status of a field "stripped" to "yes." In the post-delivery agent, we will check that field and if it was set, we will generate a reply to the sender.

Sub Initialize
Dim session As New NotesSession
Dim db As NotesDatabase
Set session = New NotesSession
Set db = session.CurrentDatabase
Set docs = db.UnprocessedDocuments
Count = docs.Count
REM if we have new mail start processing
If docs.Count > 0 Then
For n = 1 To docs.Count
Set memo = docs.GetNthDocument(n)
If Not( memo.SentByAgent ) Then
REM if attachemetns were stripped, send a reply
If (memo.stripped(0) = "yes") Then
Set reply = memo.CreateReplyMessage( False )
reply.Subject = "Re: " & memo.Subject( 0 )
reply.Body = "The message you mailed contained attachments that were too large.
They were removed before mail delivery."
Call reply.Send( False )
End If
End If
Call session.UpdateProcessedDoc(memo)
Next
End If
End Sub

Some important points to note before using a "Before new mail arrives" triggered agents in applications (Mostly in mail in database apps or customized mail boxes). These agents are different from normal agents which are scheduled or event triggered. The differences are

1. only one agent is allowed per database.

2. This agent cannot call another agent (Cascading is not possible).

3. The maximum execution time of this agent is NOT controled by the agent execution time parameter in the server document.

4. This agent execution time is part of router setting in the server document as these agents are executed by the router not by the agent manager.

5. The default time for maximum execution is 30 seconds. The parameter can be changed (mail predelivery agent execution timelimit parameter in the router/smtp settings.) but not advisable as it will adversly impact the router performance.

No Lotus Notes/Domino developer wants to hear the following comment: "Beautiful application, too bad it's so slow!" In this two-part article series, we explain how you can avoid this embarrassment by building Notes/Domino applications optimized for performance.

One of the saddest sights we know is a beautiful application that is so slow it's unusable -- all the long hours and hard work wasted because users are frustrated by slow response times. Over the past 12 years, we've spent a lot of time researching and testing Domino applications and functionality to understand how and where features can best be used to optimize performance. We started supporting and developing Domino applications in the early 1990s, and quickly became fascinated with application performance. It seemed to us then (as it still does today) that much of what we perceive as server performance problems are actually application performance problems. And the solutions, therefore, are often found within the application rather than on the server.

In this two-part article series, we will share some of what we've learned with you. This series covers three areas of application performance: database properties, document collections, and views. In part 1, we will discuss database properties and document collections. In each case, we will point out areas that are most significant and provide concise, real-world examples to help you understand what to do in your own applications. We'll use examples from many applications; you'll probably find that at least one of them closely matches something that you do or that you use. Our goal is to help you build applications that are as fast as they are beautiful.

This article assumes that you're an experienced Notes/Domino application developer.


Database properties


There are a handful of database properties that are relevant to the performance of your application.

Don't maintain unread marks

If you check this box, unread marks will not be tracked in your application regardless of the settings you have for each view. We used client_clock to track the time spent opening a database, and what we saw surprised us. For a large application (say 20 GB with 200,000 documents), our Notes client could open the database in about five seconds without unread marks, including network traffic. With unread marks turned on, we had to wait an additional six seconds or more. This additional time was spent in GET_UNREAD_NOTE_TABLE and RCV_UNREAD. With unread marks turned off, these calls aren't made.

In a smaller database (less than 1 GB), we saw savings of maybe 0.5 seconds with unread marks turned off. Of course, it was faster to open that database with or without unread marks compared to the larger database. So you should consider whether or not your application needs the unread marks feature before you roll it out into production.

Optimize document table map

This feature has not changed for the past few releases of Lotus Notes/Domino. This feature is designed to speed up view indexing for applications with structures that resemble the Domino Directory. (In other words, they contain many documents that use one form and a small number of documents using a different form. Think of Person documents versus Server documents in the Domino Directory.)

The idea is that, instead of checking every document note to see whether or not it should be included in a view index, we make two passes. The first pass merely checks to see if the correct form name is associated with that document. The second pass, if needed, checks for the various other conditions that must be met to include this document note in the view index.

Note: Currently, this feature does not appear to improve indexing times, not even for Domino Directories.

Don't overwrite free space

This feature has not changed for the past few releases of Lotus Notes/Domino. If you uncheck this box, then whenever documents are deleted, Lotus Notes will actually overwrite the bits of data instead of merely deleting the pointer to that data. The goal is to make the data unrecoverable. You would only use this feature if you feared for the physical safety of your hard disk. For virtually every application, this extra physical security is unwarranted and is merely an extra step when data is deleted.

Maintain LastAccessed property

This feature has not changed for the past few releases of Lotus Notes/Domino. If you check this box, Notes will track the last time a Notes client opened each document in the database. Lotus Notes always tracks the last save, of course, in the $UpdatedBy field, but this feature tracks the last read as well. (It does not track Web browser reads, however.)

We have not seen this feature used by developers other than in knowledge base applications, where data is archived if it has not been read within a certain number of months or years.

Document collections

We've looked at customer code for many years in agents, in views, in form field formulas, and so on. In our experience, frontend performance problems tend to be more troublesome than backend performance problems for a number of reasons:

1-backend processes are typically monitored more rigorously.
2-backend processes frequently do not have to worry about network traffic.
3-frontend problems can be confusing to decipher. Users often are not sure which actions are relevant, causing them to report unimportant and even unrelated actions to your Support desk.

But regardless of where the code comes from, if we find that something is slow and we open up the code to examine it, we will likely find the following as a common denominator:

1- The code establishes certain criteria from context, such as user's name, status of document that user is in, today's date, and so on.
2- The code gets a collection of documents from this or another database.
3- The code reads from, and/or writes to, these documents.

From performing tests over many years, we have found that typically, the first step is very fast and not worth optimizing, at least not until bigger battles have been fought and won. The third step is often slow, but unfortunately it is not very elastic. That is, you are unlikely to find that your code is inefficiently reading information from or saving information to a set of documents. For example, if you are trying to save today's date to a field called DateToday, you would likely use one of the following methods:

Extended class

Set Doc = dc.getfirstdocument
Do while not (Doc is Nothing)
Doc.DateToday = Today
Call Doc.Save
Set Doc = dc.getnextdocument ( Doc )
Loop
ReplaceItemValue

Set Doc = dc.getfirstdocument
Do while not (Doc is Nothing)
Call Doc.ReplaceItemValue ("DateToday", Today)
Call Doc.Save
Set Doc = dc.getnextdocument (Doc)
Loop

StampAll

Call dc.StampAll ("TodayDate", Today)
In our testing, we've never found a difference in performance between the first two of the three preceding examples. Using the extended class syntax, doc.DateToday = Today, appears to be just as fast as using doc.ReplaceItemValue ("DateToday", Today). In theory, we should see some performance difference because in one case, we are not explicitly telling Lotus Notes that we will update a field item, so Lotus Notes should spend a bit longer figuring out that DateToday is, in fact, a field. However, practical tests show no difference.

The dc.StampAll method is faster if you are updating many documents with a single value as in the preceding example. There were some point releases in which a bug made this method much slower, so if you're not using the latest and greatest, please confirm this is working optimally (either with testing or by checking the fix list). But as of Lotus Notes/Domino 6.5 and 7, this is once again fast. However, there are often so many checks to perform against the data or variable data to write to the documents that dc.StampAll is not always a viable option. We would put it into the category of a valuable piece of information that you may or may not be able to use in a particular application.

As for deciding which of the three methods we should focus on, our experience says that the ReplaceItemValue example (getting a collection of documents) is the one. It turns out that this is often, by far, the largest chunk of time used by code and fortunately, the most compressible. This was the focus of oiur testing and will be discussed in the remainder of this section.

Testing

Our testing methodology was to create a large database with documents of fairly consistent size (roughly 2K) and with the same number of fields (approximately 200). We made sure that the documents had some carefully programmed differences, so that we could perform lookups against any number of documents. Specifically, we made sure that we could do a lookup against 1, 2, 3, … 9, 10 documents; and also 20, 30, 40, … 90, 100; and also 200, 300, 400, … 900, 1000; and so on. This gave us a tremendous number of data points and allowed us to verify that we were not seeing good performance across only a narrow band. For example, db.search is an excellent performer against a large subset of documents in a database, but a poor performer against a small subset. Without carefully testing against the entire spectrum, we might have been misled as to its performance characteristics.

We ran tests for many hours at a time, writing out the results to text files which we would then import into spreadsheets and presentation programs for the purpose of charting XY plots. After many such iterations and after trying databases that were small (10K documents) and large (4 million documents), we came up with a set of guidelines that we think are helpful to the application developer.

Which methods are the fastest?

The fastest way to get a collection of documents for reading or writing is to use either db.ftsearch or view.GetAllDocumentsByKey. It turns out that other methods (see the following list) may be close for some sets of documents (discussed later in this article), but nothing else can match these methods for both small and large collections. We list the methods with a brief explanation here and go into more detail later.

1- view.GetAllDocumentsByKey gets a collection of documents based upon a key in a view, then iterates through that collection using set doc = dc.GetNextDocument (doc).
2- db.ftsearch gets a collection of documents based upon full-text search criteria in a database, then iterates through that collection using set doc = dc.GetNextDocument (doc).
3- view.ftsearch gets a collection of documents based upon full-text search criteria, but constrains the results to documents that already appear in a view. It then iterates through the collection using set doc = dc.GetNextDocument (doc).
4- db.search gets a collection of documents based upon a non-full-text search of documents in a database, then iterates through the collection using set doc = dc.GetNextDocument (doc).
5- view.GetAllEntriesByKey gets a collection of view entries in a view, then either reads directly from column values or gets a handle to the backend document through the view entry. It then iterates through the collection using set entry = nvc.GetNextEntry (entry).

If you have a small collection of documents (for example, 10 or so) and a small database (for instance, 10,000 documents), many different methods will yield approximately the same performance, and they'll all be very fast. This is what you would call the trivial case, and unless this code is looping many times (or is used frequently in your application), you might leave this code intact and move on to bigger problems.

However, you still may find small differences, and if you need to get many collections of documents, then even saving a fraction of a second each time your code runs will become meaningful. Additionally, if your application is large (or growing), you'll find the time differences can become substantial.

Here are two customer examples: first, scheduled agents that are set to run very frequently (every few minutes or whenever documents have been saved or modified) that iterate through every new document to get search criteria, and then perform searches based on that criteria. If 10 new documents were being processed, then 10 searches were performed -- and if 100 new documents were processed, then 100 searches were performed. For this customer, if we could shave 0.5 second off the time to get a collection of documents, that savings was really multiplied by 10 or 100, and then multiplied again by the frequency of the execution of the agent. It could easily save many minutes per hour during busy traffic times of the day, which is meaningful. Another case is a principal form has a PostOpen or QuerySave event that runs this code. If you have hundreds of edits per hour (or more), this 0.5 second saved will be multiplied to become a noticeable savings.

Pros and cons of each method

When we explained to colleagues or customers why some of these methods are faster or easier to use than other methods, we often engaged in a spirited debate, complete with "on the one hand" and "on the other hand" arguments. To our great satisfaction, the deeper we've taken these arguments, the clearer the issues have become. We will attempt to invoke that same spirit in this article with two mythical debating opponents, Prometheus ("Pro" to his friends) and his skeptical colleague Connie (a.k.a "Con").

Prometheus: view.GetAllDocumentsByKey looks very fast. I think I'm sold on using it wherever I can.

Connie: All well and good, my friend, but what if you're looking up data in the Domino Directory? You can't get permission to create new views there easily.

Pro: Great point. OK, in applications where I control the lookup database, that's where I'll use this method.

Con: Oh? And if you end up creating an additional 10 views in that database, is it still a good method? Think of all the additional view indexing required.

Pro: That might appear to be a nuisance, but if I build streamlined views, they will likely index in less than 100 milliseconds every 15 minutes when the UPDATE task runs -- more if required by lookups. Surely we can spare a few hundred milliseconds every few minutes?

Con: How do you streamline these views? Is that hard? Will it require much upkeep?

Pro: Not at all. To streamline a lookup view, you first make the selection criteria as refined as possible. This reduces the size of the view index and therefore, the time to update the index and perform your lookup. Then, think about how you'll do your lookups against this view. If you're going to get all documents, consider simply using a single sorted column with a formula of "1." Then it's trivial to get all the documents in the view. If you need many different fields of information, consider making a second column that concatenates those data points into a single lookup. One lookup is much faster than several, even if the total data returned is the same.

Con: OK, I might be sold on that method. But you have also touted db.ftsearch as being very fast, and I'm not sure I'm ready to use that method. It seems like it requires a lot of infrastructure.

Pro: It is true that to use db.ftsearch reliably in your code, you'll need to both maintain a full-text index and also make sure that your Domino server's configuration includes FT_MAX_SEARCH_RESULTS=n, where n is a number larger than the largest collection size your code will need to return. Without it, you are limited to 5,000 documents.

Con: And what happens if the full-text index isn't updated fast enough?

Pro: In that case, your code can include db.UpdateFTIndex to update the index.

Con: My testing indicates that this can be quite time consuming, far outweighing any performance benefits you get from using db.ftsearch in the first place. And what happens if the full-text index hasn't even been created?

Pro: If the database has fewer than 5,000 documents, a temporary full-text index will be created on-the-fly for you.

Con: I have two problems with that. First, a temporary full-text index is very inefficient because it gets dumped after my code runs. Second, 5,000 documents isn't a very high threshold. Sounds like that would only be some mail files in my organization. What if there are more than 5,000 documents in the database?

Pro: In that case, using db.UpdateFTIndex (True) will create a permanent full-text index.

Con: OK, but creating a full-text index for a larger database can be very time consuming. I also know that the full-text index will only be created if the database is local to the code -- that is, on the same server as the code that is executing.

Pro: True enough. Fortunately, Lotus Notes/Domino 7 has some improved console logging as well as the ability to use Domino Domain Monitoring (DDM) to more closely track issues such as using ftsearch methods against databases with no full-text index. Here are a couple of messages you might see on your console log. As you can see, they are pretty clear:

Agent Manager: Full text operations on database 'xyz.nsf' which is not full text indexed. This is extremely inefficient.
mm/dd/yyyy 04:04:34 PM Full Text message: index of 10000 documents exceeds limit (5000), aborting: Maximum allowable documents exceeded for a temporary full text index
Con: While I'm at it, I see that you haven't said much positive about view.ftsearch, view.GetAllEntriesByKey, or db.search. And I think I know why. The first two are fast under some conditions, but if the view happens to be structured so that your lookup data is indexed towards the bottom, they can be very slow. And db.search tends to be very inefficient for small document collections.

Pro: All those points are true. However, db.search is very effective at time/date sensitive searches, where you would not want to build a view with time/date formulas and where you might not want to have to maintain a full-text index to use the db.ftsearch method. Also, if you are performing lookups against databases not under your control and if those databases are not already full-text indexed, it is possible that db.search is your only real option for getting a collection of documents.

Here are some charts to help quantify the preceding points made by Pro. These charts show how long it takes to simply get a collection of documents. Nothing is read from these documents and nothing is written back to them. This is a test application in our test environment, so the absolute numbers should be taken with a grain of salt. However, the relationships between the various methods should be consistent with what you would find in your own environment.

In Figure 1, db.ftsearch and view.GetAllDocumentsByKey are virtually indistinguishable from each other, both being the best performers. Call that a tie for first place. A close third would be view.GetAllEntriesByKey, while view.ftsearch starts out performing very well, but then rapidly worsens as the number of documents hits 40 or so.

Figure 1. Document collections, optimized views (up to 100 documents)


In Figure 2, the only difference worth noting from Figure 1 is that db.search looks better and better as the number of documents increases. It turns out that at approximately 5 to 10 percent of the documents in a database, db.search will be just as fast as the front runners. As we saw in Figure 1, view.ftsearch is getting worse and worse as the document collection size increases.

Figure 2. Document collections, optimized views (100 to 1,000 documents)


In Figure 3, the views are no longer optimized to put the results towards the top. That is, if we are getting a collection of only a few documents, then in our test environment, we can try to skew the results by making sure those few documents are towards the top or bottom of the lookup view. In Figures 1 and 2, those documents tended to be towards the top of the view, but in Figure 3, those documents are at the bottom. For three of the methods, this is immaterial (db.search, db.ftsearch, and view.GetAllDocumentsByKey). However, for view.ftsearch and view.GetAllEntriesByKey, this switch is catastrophic in terms of performance. The scale on Figures 2 and 3 had to be changed -- instead of the Y-axis going up to one second, it has to go up to six seconds!

Figure 3. Document collections, non-optimized views


Conclusion

Whenever feasible, use view.GetAllDocumentsByKey to get a collection of documents. In conjunction with this method, streamline your lookup views so that they are as small and efficient as possible. Part 2 of this article series has some tips for doing this.

If your lookups need to go against rich text fields, or if your database is already full-text indexed, db.ftsearch is an excellent performer and well worth considering. Be sure that your results will always be less than 5,000 documents or use the Notes.ini parameter FT_MAX_SEARCH_RESULTS=n (where n is the maximum number of documents that can be returned) to guarantee that you do not lose data integrity due to this limit.

Part 2: Optimizing database views


In "Lotus Notes/Domino 7 Application Performance, Part 1," we examined how you can improve the performance of Lotus Notes/Domino 7 applications through the efficient use of database properties and document collections. In part 2, we explain how you can build high-performing views. As in part 1, this article provides many code snippets that you can re-use and adapt to your own requirements.

Over many years of analyzing application performance issues, we found that views are frequently involved in both the problem and the solution. Often, view indexing is the issue. This article explains how this can happen and what you can do to troubleshoot and resolve this type of problem. But there is another kind of view performance problem that has been popping up more frequently over the past few years. This involves views that display reader access controlled documents. The performance problems seen in these views are often not indexing related, so we’ll take a little time to discuss these separately.

This article assumes that you're an experienced Notes/Domino application developer.

Understanding view indexing (the Update task)

The first thing you have to know before troubleshooting a performance problem that may involve view indexing is how the indexing process works. Indexing is typically done by the Update task, which runs every 15 minutes on the Domino server. Technically, it is possible to tune that interval, but it involves renaming files, so it is rarely done.

When the Update task runs, it looks for every database on the server with a modified date more recent than the last time the Update task ran. Then the task refreshes the views in those databases. Based on our experience, it is reasonable to assume that it takes approximately 100 milliseconds to refresh a normal view in a production database in a production environment.

The logical question to ask is, "What flags a view as needing to be updated?" Every time any of the following occurs a view requires updating:

1- Replication sends a document modification to a database
2- A user saves or deletes a document (and then exits the database)
3- The router delivers a document

The Update task is very liberal in how it determines if a view needs to be updated. For example, imagine that you’re in your mail file and that you change the BCC field of a memo and nothing else. No view displays the contents of that field, so in fact, no view needs to be refreshed. But nevertheless, views that contain this memo will at least be examined simply because the server is not sure whether or not the edits you made will force a change in those views.

There is a twist to this: Users can force an immediate high-priority request to have a view refreshed by opening the view or by being in a view when they make a document change. Let’s look at an example.

Suppose Alice opens the Contact Management database to view #1 at 9:02 AM. Suppose also that the Update task conveniently ran at 9:00:00 and will run again at 9:15:00, 9:30:00, and so on. Alice updates a couple of documents at 9:05 AM. She creates new documents, edits existing ones, or maybe deletes documents. In any case, the server immediately updates view #1 because she is in that view. It would be strange if you deleted a document and still saw it in the view, right? So that view is updated right away. But the other views are queued for updating at 9:15 AM.

Now imagine that Bob is also in our Contact Management database at 9:02 AM. He is working in view #1. At 9:05 AM, he sees the blue reload arrow in the upper left corner, telling him that the view has been updated. He can press the F9 key or click the reload arrow, and the view will quickly display the updated contents. He doesn’t have to wait for a refresh.

Further, suppose Cathy is also in the database at 9:02 AM, but working in view #2. If she does not do anything to force a refresh in the interim (which is possible, but unlikely), then she sees the blue reload arrow at 9:15 AM when her view is refreshed by the Update task. More likely, though, Cathy makes some data updates of her own, scrolls through the view enough to force an update, or switches views. Any of these actions would force an immediate update.

At 9:15 AM, all the views that these clients have not already forced to refresh are refreshed, and we start all over again.

Additional information on view indexing

There are two additional pieces of information that we find helpful. The first is the full-text indexer. Typically, full-text indexes that are marked to update immediately are updated within seconds (or minutes if the server is busy). Full-text indexes that are hourly are updated (triggered) by the Update task at the appropriate interval. Full-text indexes that are daily are updated (triggered) by the Updall task when Updall runs at night.

The second additional point is that the developer can set indexing options in the view design properties. Many people misunderstand these to be settings that affect how the Update or Updall tasks will run. This is not so. The Update and Updall tasks will completely ignore these settings, updating views based solely on the criteria described in the preceding example.

The indexing options affect what happens when a client opens a view. If the options are set to manual, for example, then the user gets a (possibly stale) version of the view very quickly. If the options are set to "Auto, at most every n hours" and if it has been more than the specified time interval, then the user will have to wait a moment while the view is updated, just as if it were a regular view with Automatic as the view indexing option. We will discuss how these indexing options can be used to help craft useful views later in this article.

Quick tips on troubleshooting view indexing

In a well-tuned environment, indexing should be relatively transparent; views and full-text indexes should be updated quickly. In an environment experiencing performance problems, you may see the following symptoms:

1- Long delays when opening a database, opening a view, switching views, scrolling through a view, or saving a document
2- Delays when opening a document that uses lookups (In fact, you may see the form pause at certain points as it waits for these lookups to compute.)
3- Performance problems throughout the working day, but excellent performance during the off-hours
4- Out-of-date full-text indexes

The following logging and debugging Notes.ini parameters can help you address these issues:

log_update=1
Writes extra information to the log.nsf Miscellaneous Events view. This consists of one line of information every time the Update task starts updating a database, and a second line when the Update task finishes with that database. Each line has a time/date stamp, so by subtracting the times, you can get an approximate time (rounded to the nearest second) required to update that database.
log_update=2
Writes more information to the log. In addition to the data generated by log_update=1, this setting adds one line of information for each view in each database. Thus, a database with 75 views would have 77 lines -- one line to signal the start of the Update task against this database, 75 lines -- one for each view, and then a final line to denote the end of indexing in this database.

debug_nif=1
By far the most verbose way to collect information on indexing, debug_nif will write to a text file (which you specify using debug_outfile=c:\temp, for example). This can easily generate gigabytes of data per hour, so it should be used sparingly, if at all. The value of this debug variable is that it gives you the millisecond breakdown of all indexing activity, not just the Update task running every 15 minutes.
client_clock=1
Used on your client machine, not the server, this will write verbose information to a text file (which you specify using debug_outfile=c:\temp, for example), breaking down every client action you perform. This can be used to determine, for instance, whether or not a delay that the client sees is caused by long wait times for indexing.

To troubleshoot view indexing problems, start by collecting some data from log.nsf by setting log_update=1 (if not already set) and allow the server to collect information on view updates for a day. Then review the log.nsf Miscellaneous Events view for that day and look for patterns. Some meaningful patterns that may emerge are:

1- Very long update times for a specific database. This may indicate that the database has too much data being updated, too many complex views, or time/date-sensitive views (if the server is release 5 or earlier). The logical next step is to examine the business use and design of that database and perhaps to use log_update=2 for a day to pinpoint which views are problematic in that database.
2- Very long update cycles. We’ve seen cycles that have lasted four to five hours, meaning that instead of passing through all modified databases every 15 minutes, the Update task may only do so two or three times every day. This may indicate general performance problems affecting all tasks, or it may indicate a very high user or data load on the server. The logical next step would be to assess the general state of the server and the business use on it.

Whenever you find long update times, it is helpful to note which other tasks are running concurrently and whether or not those tasks are behaving normally. It is also helpful to note if the slow indexing times are universal, only during business hours, or only during certain peak times, such as every two hours. It is rare that a single observation will give you all the information you need to solve your problems, but it usually puts you on the right path.

View performance testing

To test the performance of various features and ways of building a view, we created a database with 400,000 documents and ran a scheduled agent to update 4,000 documents every five minutes. We then built approximately 20 views, each with slightly different features, but each displaying all 400,000 documents, using five columns. We will list the details later, but the big picture looks like this:

1- The size of your view will correlate strongly with the time required to rebuild and refresh that view. If your view doubles in size, so too will the time to rebuild or refresh that view. So, for example, if you have a 100 MB database that you expect will double in size every six months, and if you find that indexing takes approximately 30 seconds now, then you should anticipate that indexing will be 60 seconds in six months, then 120 seconds six months later, and so on.
2- The biggest "performance killers" in terms of view indexing are categorized columns. Sort columns add a small amount of overhead as do some other features, such as the Generate unique keys option (more on this in the tips at the end of the article).

Figures 1 and 2 demonstrate both the relationship between size and refresh time and the significant overhead that categorized columns add to your view performance. Figure 1 plots view index size against response time.

Figure 1. View index size versus response times


And Figure 2 shows view size compared to refresh times.

Figure 2. View size versus refresh time


In both charts, the first and third categorized views are initially expanded, and the second and fourth views are initially collapsed. Collapsing your categorized views shows a small, but discernible savings in refresh time.




Reader Names field

Over the past few years, we have seen a dramatic increase in the number of critical situations related to Reader Names fields and views. Customers find that performance is unacceptable frequently throughout the day. They cannot understand why because the code reviews and pilot tests all went smoothly. But a few months after rollout to their entire department/division/company, they find that complaints about performance are dominating conversations with their employees, and something has to be done.

Examples

Our favorite examples are HR and Sales Force applications -- both of which typically have stringent Reader Names controls. The following table illustrates a hypothetical scenario for a company using a database of 400,000 documents.










Title/role Number of documents user can see (out of 400,000) Percent of database
Corporate HQ, CEO, CIO, Domino administrators, developers400,000100 percent
District Manager4,0001 percent
Manager4000.1 percent
Employee400.01 percent


Upper management and the Domino administrators and developers typically experience pretty good performance, which makes the first reports of poor performance easy to ignore. Performance problems are typically exacerbated by weaker connectivity (WAN versus LAN), which may also make the early complaints seem invalid. But after some critical number of complaints have been registered, someone sits down at a workstation with an employee and sees just how long it takes to open the database or to save a document, and then the alarm bells start ringing. Figure 3 shows how long it took a user to open a view in the sample database with 400,000 documents. All views were already refreshed. No other activity was taking place on the test server, and only one user at a time was accessing the server.

Figure 3. Time required to open a view in the sample database


Users are denoted by the percentage of documents they can see. If you cannot see a bar, it means that it’s very close to zero.

Before moving on to an explanation of Reader Names fields and performance, we want to leave you with one thought about Figure 3: You can quickly see why users of the flat (sorted, not categorized) views would have such radically different impressions of the performance of this application if they were 0.01 percent users compared to 100 percent users.

Understanding Reader Names

When an application uses Reader Names, it means that some or all documents have Reader Names fields -- that is, fields that are Summary Read Access. Summary means that the data can be displayed in views. Technically, you can create Reader Names fields and prevent them from having the summary flag, but then Lotus Domino cannot properly ensure security of the documents at the view level. Read Access means just that -- who can see this document? Typically, individual names, group names, and [roles] populate these kinds of fields. We encourage the use of [roles] over groups whenever possible for reasons of maintenance and control, but in terms of performance, these are all equivalent.

When a user opens a view that contains these reader access controlled documents, the server has to determine whether or not this user is allowed to see each document. The server is very methodical, starting at the top and checking each document. The process is similar to the following:

1- Lotus Domino examines document #1, determines that the user can see it, and then displays it.
2- Lotus Domino examines document #2, determines that the user can see it, and then displays it.
3- Lotus Domino examines document #3, determines that the user cannot see it, and then hides the document from the user.

The process continues until Lotus Domino examines all documents in the view. Of course, the view has to be refreshed first, otherwise the server wouldn’t know which document was #1 or what its Reader Names values were. Let’s pause in this explanation to consider a classic example of poor performance. Imagine for a moment that our database is a Sales Tracking application, and it has a By Revenue view that is sorted in descending order by size of contract. Imagine that the view has five columns with the first two columns being sorted (for example, Revenue and SalesRep). There are no categories. In this case, when a user opens the view (a view that potentially displays 400,000 documents), this user forces a refresh first, if needed, and then the server starts with document #1 and checks to see whether or not that document can be displayed to the user, then it checks documents #2, #3, and so on.

You are presumably reading this article via a Web browser, and you’re perhaps familiar with the fact that browsers render portions of a page quickly (text) and other portions (such as graphics) more slowly. But a Domino server does not serve up views this way. It doesn’t send any portion of the view data to the user until it has what it considers a full screen of information. For practical purposes, this may be, for instance, 60 rows of data. You can test this yourself with a Notes client by opening a flat view (no categorized columns) and pressing the down arrow key. At first, the view scrolls quickly. Then there is a pause. This is when the server is dishing up another few KB of data for you. Then the view scrolls quickly again for another 60 or so rows. This repeats as long as you hold down the arrow key.

Back to our user: It turns out that he can see document #1, but perhaps he cannot see the next 10,000 documents. Why? Maybe he can see only 40 documents in the whole database. Because the Domino server is not going to show him anything until it can build about 60 rows of information (or until it gets all the way to the bottom of the view), it’s going to be a long wait.

This is exactly what users experience. They wait for minutes sometimes, while the server checks each document. In the case of a user who can see so few documents that they don’t fill the screen, the server goes through the entire view before sending the user the partial screen of data that he is allowed to see.

Some readers are aware of this functionality. Some are now screaming in pain. You can imagine that as someone who routinely analyzes applications for performance problems that this kind of problem would be one that we spot quickly and look to resolve with workarounds.

One workaround is to make the view categorized. The reason this helps performance is that the categories will always be displayed, and they take up a row of data. So in our example of a view By Revenue, the user would quickly see dozens of categories displaying, say, Revenue and SalesRep. Clicking the twistie to open any category results in a quick round trip to the server to determine that the underlying document is not viewable for this user, but that’s a far cry from having to examine 400,000 documents in one go.

On the other hand, finding that virtually every twistie the user clicks expands, but doesn’t show any documents, may be frustrating. In our example, there are some other possible workarounds, but we would assert that a view showing all documents by revenue is probably not a view that users with access to only 0.01 percent of the database should have access to. It is a contradiction of their access. But there are many cases in which it makes business sense to have many users accessing a view that contains more data than they should see. At the bottom of this section, we list some tips for building fast views in databases with reader access controlled documents.

We have two closing thoughts about Reader Names and views. First, in a normal production environment, it’s rare to get such clear data as we have in Figure 3. You’re much more likely to see that the performance problems caused by Reader Names cause a host of other performance delays, such as agents, view indexing, and so on. These problems may cause other problems. For example, you may notice that the views are slow for everyone and deduce that the problem is view indexing -- when the underlying problem is that Reader Names are forcing long computations by the server when some users open views.

Second, our tests (and Figure 3) represent an absolutely best-case scenario. The reason is that we had only one user at a time. We could duplicate the volume of data and even indexing issues with our scheduled agents, but our tests could not show the spiraling performance problem that occurs when hundreds of users simultaneously try to access views with reader access controlled documents. If your application uses Reader Names, you absolutely have to pay attention to view performance if you want to avoid a crisis.

Performance enhancing tips with reader access controlled documents

The following are some tips for making applications/views that perform well even with reader access controlled documents:

1- Embedded view using Show Single Category. This is the winner, hands down. If your data is structured so that users can see all the documents in a category, then you can display just the contents of that category very quickly to the user. In some cases, it may make sense to let the user switch categories, in which case you have to consider whether or not he can see the contents of the other categories. But in most cases, the view would be something like My Sales and would show all the sales documents for the current user. The caveat for this kind of view is that the user interface for the Notes client is not quite as nice as the native view display. For Web browsers, it is just as good, and we have never seen a reason not to use this kind of view for Web browser applications. In fact, the performance is so good that it’s faster to open one of these with reader access controlled documents than to open a native view without reader access controlled documents!
2- Local Replicas. For a sales tracking database, many companies use local replication to ensure that their sales reps can use the database when disconnected. This is a great solution in many ways, but reader access controlled documents can be tricky when their Reader Names values change, and they need to disappear from some local replicas and appear on others.
3- Shared, Private on first use views. This is an elegant solution for Notes clients, but there are some drawbacks. First, it cannot be used for browsers. Second, the views need either to be stored locally, which can be problematic for some customers, or on the server, which can be a performance and maintenance problem of its own. Third, as the design of the application changes, it can be tricky updating the design of these (now) private views. And fourth, some customers have experienced performance problems, which may be related to having large numbers of these private views being created and used.
4- Categorized views. As seen in Figure 3, categorized views can be very fast to open with respect to Reader Names. They are bigger and slower to index, but typically they eliminate the Reader Names performance issue. The real caveat here is that users may find these views to be unfriendly, a label no one wants to have on their application.

The final tip concerns something to avoid. The feature "Don’t show empty categories" could, in theory, be used very successfully with the preceding tip to make a categorized view that would only display the categories containing documents that a user can see. However, in practice, it will result in a view with performance characteristics akin to a flat view, so it is probably a feature to avoid if performance is important.




General view performance tips

Here are some tips about view performance, regardless of whether or not reader access controlled documents are present.

Time/date formulas

Using @Now or @Today in a column or selection formula forces a view rebuild every time the view is refreshed. Therefore, if you use this kind of formula, consider making the view refresh Manually or "Auto, at most every n hours." That way, when users open the view, they will not have to wait for a view rebuild (or rarely so, in the latter case). The downside to this is that the view may be out-of-date because it opened without trying to refresh. Consider the contents of these views and how fast they change to determine whether or not you can use these indexing options safely.

Use click-sort judiciously

Click-sort is a brilliant feature, one that we use in most public views in our applications. But it’s worth checking your hidden lookup views to be sure that no columns have click-sort enabled because having it increases the disk space and indexing requirements without adding any functionality for your users. On this same topic, consider using ascending or descending, but not both, when you use this feature to save on disk space and indexing time without impairing functionality. We would argue that the functionality is preferable when there is only one arrow because it is an on-off toggle rather than a three-way toggle.

Generate unique keys in index

One of the lesser known performance enhancing features, the "Generate unique keys in index" option for lookup views can dramatically improve lookup times. When selected, the view discards all duplicate entries. For example, if you have 400,000 documents all displaying only CompanyName, and there are only 1,000 unique company names, then selecting this feature results in only the first document for each company being held in the view index. Although there is a slight overhead to this feature, it is easily outweighed by the fact that your view may dramatically decrease in size. One warning is that if you are displaying the results of a multi-value field, you need to use a code trick to avoid losing data.

For example, imagine a database has 100,000 documents, all containing a multi-value field called Industry. You want to look up a unique list of industries already selected so that the Industry drop-down list can display the values when users create new documents. If you use this unique keys feature in a view that has a single column displaying Industry, it is possible that a new document containing the values Automotive and Air would not display because another document containing Automotive was already being displayed in the view. Thus, your new value of Air would never be picked up by the lookup formula.

To avoid this problem, use a formula such as @Implode(Industry; "~") in the column formula, and then the drop-down field lookup formula may be @Explode(@DbColumn("Notes"; ""; "(LookupViewName)"; 1); "~"). Although you will have some minor duplication of data in your view, you will be guaranteed no data loss.

Color profiles

If you use color profiles (as in the mail template), then any change to that color profile document necessitates a view rebuild for any views that reference that color profile.

Default view

Whenever your application is opened, your default view is opened. Always be sure that this is a view that opens quickly. An interesting example of this is your mail file. For many users, the view that opens by default is the Inbox (folder). If a user is not in the habit of cleaning out her Inbox and if it has many thousands of documents, then it will be slower to open her mail file than if she regularly cleaned out her Inbox and if the view/folder had only dozens or hundreds of documents. Incidentally, this is why some companies have a policy that old documents in the Inbox are deleted periodically. It forces users to move those documents into other folders, thus improving performance both for that user and the server.

Conclusion

The conclusion we hope you draw from this article series is that application performance is something you can influence. With thoughtful application of common sense (and perhaps some of the information and tips in this article), you can keep the performance of your applications at a level that is more than acceptable for your users.

By : IBM

Here's a general-purpose Author and Reader field troubleshooting guide. This document explains how these fields work and the causes of common perplexities.

There's a helpful tool that analyzes and explains your access to read or edit the document, but you should still read the information below so you'll understand what the tool does and what can go wrong.

To begin, look at one of the documents that has the readers fields in it. If the documents are hidden even from you, the database manager, then you will have to access the database from the server console, or use full access administration, which lets you see all documents regardless of Reader access. You may also have to temporarily disable the local security property ("Use consistent access control") in the database ACL.

TIP: To prevent the database manager/administrator ever losing read access to documents, put a Computed when Composed Authors field (not a Readers field) on every form, with a formula such as "[Admin]". This will allow all members of [Admin] role to read the document, without preventing others from reading it as a Readers field would.

From a view, open the Document Properties box. Here's an example of what it should look like:


Highlight each of your reader fields in turn and look for differences from this ideal image. Note that:

- Field flags must say READ-ACCESS NAMES. If not, read in Domino Designer help about IsReaders property of NotesItem.
- If there are multiple values, they should appear as separate quoted strings on separate lines. In this case there are two values, "CN=Rex Rightway/O=Pings" and "[Admin]". If instead the value were listed as "CN=Rex Rightway/O=Pings, [Admin]" (or any other character instead of comma) then the problem is that you are storing multiple values as a string instead of a proper multivalue. If this is the problem:
- Make sure the multivalue checkbox is checked in the field properties.
- If you assign the field with LotusScript, assign it from an array with one value in each element, as opposed to a string with some character in it as the delimiter. Example: Dim readerNames(0 to 1) As String
readerNames(0) = "CN=Rex Rightway/O=Pings"
readerNames(1) = "[Admin]"
doc.Readers = readerNames
' If field did not previously exist, tell Notes
' it's a Readers field.
doc.GetFirstItem("Readers").IsReaders = True

- If assigning the field from macro language, say FIELD Readers := "CN=Rex Rightway/O=Pings" : "[Admin]" as opposed to "CN=Rex Rightway/O=Pings;[Admin]" (for instance).
- If you use a username, you must use the full canonical username. If you store "Rex Rightway/Pings" the readers field will not work in all situations. You need that CN= garbage.
- Make sure there are not excess characters in the name, e.g. leading or trailing or multiple consecutive spaces. In other words, always use @Trim.
- If the field is not visible in the document properties from the view, but only when the document is open, you've probably made the field Computed for Display, or created the document using an agent that neglected to set the field as shown above, so that it's not being stored in the document when it's saved. To take effect, a readers field must actually be saved in the document, before the user tries to access it. Notes looks at the document, not the form, when deciding whether to grant access.
- Should you have a problem with users being able to read the document whom you think should not have access:
- Check whether the READ-ACCESS flag is set on your Readers fields as shown above.
- If all Readers fields are blank, anyone can read the document.
- An Authors field also allows read access (see below).

Author fields work like Readers fields for all the above tests, except that:

- the Field flags should read READ/WRITE-ACCESS NAMES instead of READ-ACCESS NAMES, and
- a blank Authors field does not enable everyone to edit the document. "*" does, though.

View - A view is one way of looking at all or some of the documents in the database. By selecting the View menu, you'll see a list of available views for the selected database.

Collection - Each view in a Notes database has a corresponding collection in the database. A collection contains a series of indexes related to the particular view. The collection indexes consist of:

1. An index sorted by note number.
2. The user-defined view collection (defined in the Design View Sort dialog)
3. An index of parent-child documents within the view.

Refresh - Refresh a view by pressing F9. Refreshing the view reads from the index and repaints the screen accordingly. It does not rebuild the index.

Rebuild - Rebuild a view by pressing Shift+F9. Rebuilding a view makes a call to NIF, and causes the current view's collection to be completely rebuilt.

Update - Pressing CTRL+SHIFT+F9 updates or builds all the views in the database. This includes both views which are hidden and views which are open. Pressing CTRL+SHIFT+F9 will work on server-based or local databases. If the views are not built, Notes will build them. If the views are already built, Notes will update them, not rebuild them.

The Notes 'Indexer' is actually a composite of three different subsystems. They are described below.


Discussion of NIF

The Notes Indexing Facility is comprised of a set of functions that the Notes server uses to manipulate indexes. Most of these calls are made by the server as users access databases (remember that Notes is client-server).

As a user makes changes to documents in a database, the current view for that user is updated immediately. This process is necessary so that the user sees their changes as they occur.

When a user changes documents within one view, and then switches to another view, NIF sees that the database has been modified since the last time the collection was rebuilt, so it forces an update to the collection. When the new view is open, it will reflect the updated documents. Note that this update may take some time if many documents are involved (other user's updates are included), or if the view is complex.

An indicator at the top left corner of the view (a reload arrow) signals that a view is out of date. That is, the database contains newer information than that reflected on the screen. Pressing F9 refreshes the view. Refreshing the view reads the index from the database, and paints it on your screen. Clicking the indicator with the mouse button also refreshes the view. (In Notes 3.x, the indicator that a view is out of date is a black question mark against a yellow background.)

The workstation normally stores the current chunk (called a page) of the index in memory. When you scroll back and forth within the view, you notice that scrolling down once or twice is quick, but the third times takes a few moments. This is because that third scroll has pushed you past the page of the index held in the workstations memory. A new page must be read from the database. This is when you may notice the refresh arrow. It will appear when you read a page of the index, and documents have been modified in the database since the view was last updated.

As you scroll up and down, you will see new documents in the view as they are updated by other users. The refresh arrow will display until you press F9. The view you have open may or may not change after pressing F9. This will depend on whether the modified document are supposed to appear in this view, or some other.

Discussion of Update

$Update.EXE is a process that runs continually, checking a work queue for databases to update. It checks the queue every 5 seconds. If it finds a queued request, it opens each collection within the specified database, forcing the update of the collections' indexes.

Update is normally specified on the ServerTasks= line of the NOTES.INI. For example:

ServerTasks=Replica, Router, Update

The Update process works off a queue, and updates database indexes as requested. Requests normally comes from the Replicator, Router, or a user.

Replicator When a database replicates, the Replicator adds an entry to the Update queue.
Router When the Router adds a document to a database, it also adds an entry to the Update queue.
User When a user closes a database, the workstation adds an entry to the Update queue if the database has been modified.

Update will wait to combine multiple request for the same job. About 15 minutes after the replication ends, the Update queue kicks in, updating all the indexes on the database. When Update has refreshed all the collections, it updates any databases that have Full Text indexes with the Index Update Frequency set to 'hourly'.

When a user closes a database, an entry is added to the Update queue. About 15 minutes later, the collections for that database will be updated to reflect the new documents that the user changed. If another user goes into the database before Update has updated the collections, the collection that the user opens is forced to update immediately. Note that this update may take some time if many documents are involved, or if the view is complex.


Discussion of Updall

$Updall.EXE is the single-instance version of $Update. It runs until it has processed each database, and terminates. It does not work from a queue. Like $Update, $Updall opens databases, forcing the update of the collection. When Update has rebuilt all the collections, it updates any databases that have Full Text indexes. Updall also purges delete stubs from databases.

Updall is normally specified on the ServerTasksat2= line of the Notes.INI. For example:

ServerTasksat2=Updall


Glossary of Notes Indexer Terms:

preload preload preload