Not missing a hit in SAP BusinessObjects using HTTP traffic logs

Monitoring the number of views of an important corporate document is a key need for any company that is aiming for better resource management and company-wide alignment. This is especially important in a migration project, to define what content to migrate, or in a document maintenance process, to prioritize assignments. SAP BusinessObjects Auditing is able to register when documents are viewed or retrieved, but it has certain limitations, so most of the user activity is lost. This article explains an alternative method to collect the number of views of any SAP BusinessObjects file type or web page that sits on our web server, so that this data can complement our current auditing reports.

The Scenario

Our case was a customer running SAP BusinessObjects 3.1 who wanted to migrate to SAP BI4, and the first question they asked was about usage load per document type. The documents they wanted to monitor were:

  1. Web Intelligence and Desktop Intelligence documents
  2. Web Intelligence and Desktop Intelligence instances
  3. PDF or Excel instances
  4. SAP Dashboards (Xcelsius) (flash format .swf)
  5. Agnostic documents (pdf, xls, ppt, doc)
  6. Open document calls to any of the above
  7. Explorer Information Spaces
  8. HTML ad-hoc portals located in Tomcat Server

 

The SAP BusinessObjects Audit database contains part of this information, but it has strong limitations:

  • If a document is deleted the historical hits go away
  • If an instance disappears – which is very common, because instances are automatically deleted when the limit is reached – historical hits go away
  • It does not register hits of any other kind (such as items 3-8 from the list indicated above)

 

Given the need for a complete set of information, we decided to explore the HTTP traffic logs coming from the SAP BusinessObjects Tomcat Server to see if we could get this missing information. Here is the result.

 

Tomcat Logs

The files containing the HTTP traffic are generated every day in the following folder: $TOMCAT_HOME/logs. This logging is not enabled by default; you can enable it by uncommenting the following element in the server.xml file located at $TOMCAT_HOME/conf:

<Valve className="org.apache.catalina.valves.FastCommonAccessLogValve" directory="logs" prefix="localhost_access_log." suffix=".txt" pattern="combined" resolveHosts="true"/>

 

The files immediately start accumulating a lot of information. Below is an example of a “View” hit on a WebI report:

10.60.89.150 - - [04/Apr/2013:08:19:18 +0200] "GET /AnalyticalReporting/WebiView.do?cafWebSesInit=true&appKind=InfoView&service=/InfoViewApp/common/appService.do&loc=en&pvl=nl_NL&ctx=standalone&actId=533&objIds=444645&containerId=444634&pref=maxOpageU%3D10%3BmaxOpageUt%3D200%3BmaxOpageC%3D10%3Btz%3DEurope%2FBerlin%3BmUnit%3Dinch%3BshowFilters%3Dtrue%3BsmtpFrom%3Dtrue%3BpromptForUnsavedData%3Dtrue%3B HTTP/1.1" 200 4003 "http://boserver/InfoViewApp/listing/headerPlus.do?lastPage=home" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E; InfoPath.3; MS-RTC LM 8)"

 

A quick check shows that several fields are very interesting for us:

  • User Name: Remote host name (or IP address if resolveHosts is false), which can later be mapped to a user name
  • Date and Time
  • First line of the request (method and request URI)
  • HTTP status code of the response

 

Most of the information is located in the last two fields. All the hits to different objects and their object IDs, as well as calls to HTTP pages, are registered here, and occasionally user names too. If you wish to customize the format of the file there are many possibilities; check the Apache Tomcat documentation for the available options.
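
As an illustration of how these fields can be picked out of a line, here is a minimal Python sketch. It assumes the log uses the "combined" pattern shown above; the regular expression, the function name parse_line and the returned dictionary keys are our own choices for this example, not part of Tomcat or SAP BusinessObjects. The sample line is a shortened version of the hit shown earlier.

```python
import re
from datetime import datetime
from urllib.parse import urlsplit, parse_qs

# Combined format: host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the fields of one access-log line as a dict, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    hit = match.groupdict()
    # Month abbreviations are parsed with the current locale (English assumed here)
    hit["time"] = datetime.strptime(hit["time"], "%d/%b/%Y:%H:%M:%S %z")
    hit["status"] = int(hit["status"])
    # The request field has the shape "<method> <uri> <protocol>"
    parts = hit["request"].split(" ")
    hit["uri"] = parts[1] if len(parts) >= 2 else ""
    # Object IDs travel in the objIds query parameter of the sample hit
    hit["object_ids"] = parse_qs(urlsplit(hit["uri"]).query).get("objIds", [])
    return hit

sample = (
    '10.60.89.150 - - [04/Apr/2013:08:19:18 +0200] '
    '"GET /AnalyticalReporting/WebiView.do?actId=533&objIds=444645&containerId=444634 HTTP/1.1" '
    '200 4003 "http://boserver/InfoViewApp/listing/headerPlus.do?lastPage=home" "Mozilla/4.0"'
)
hit = parse_line(sample)
print(hit["host"], hit["time"], hit["status"], hit["object_ids"])
```

Lines that do not follow the combined layout return None, so they can simply be skipped during loading.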

So now we have the data we need. The next challenge is how to pick up the hits and transform them so that they fit into our auditing reports.

 

Data Transformation

The stages for data transformation & enrichment proposed here are as follows:

  • Consolidation of the source and hits filtering
  • Data enrichment & normalization:
    • Storage of deleted files and instances information
    • Assignment of the hit to the document – not the instance
    • Enrichment of this info with attributes like Kind, Name, Owner
    • Replacement of the IP with the User Name
    • Incorporation of other hits like Client Tools coming from the BO Audit database
    • Final hits filtering: Exclusion of hits from Administrators or instance owners
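
The first stage, consolidation of the source, can be sketched in Python as below. This is only an assumption about how the daily files might be gathered, not the tooling used in the project: the function name iter_log_lines is hypothetical, the prefix and suffix are the ones from the Valve configuration above, and sorting by file name assumes the date in the name sorts chronologically.

```python
import glob
import os

def iter_log_lines(log_dir, prefix="localhost_access_log.", suffix=".txt"):
    """Yield (file name, line) for every non-empty line of every daily access log."""
    pattern = os.path.join(log_dir, prefix + "*" + suffix)
    # Assumption: sorting by file name yields the files in date order
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                line = line.rstrip("\n")
                if line:
                    yield os.path.basename(path), line
```

Keeping the file name next to each line makes it easy to trace a consolidated hit back to its daily source file.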

 

For the full sequence of the transformation, see the following diagram (Fig.1):

 

Fig.1 Flow to obtain the number of Views of any SAP BusinessObjects document

 

The tricky part here is the initial “hits filtering”, which is mainly based on experience. Whatever hit you want to visualize, just reproduce it, then open the log file right afterwards and search for it to see what it looks like and determine the best filtering pattern.

In our specific case, a single SAP BusinessObjects Universe was used, and for each new hit type we wanted to monitor, a predefined condition was added with an OR statement. Below are a few examples based on the source fields Method and HTTP:

  • Open document calls:
    • HTTP is equal to “-”
    • Method contains {“/OpenDocument/opendoc/openDocument.jsp?”}
  • Web Intelligence hits:
    • Method contains {“/WebiModify.do?”, “WebiView.do?”}
  • Desktop Intelligence hits:
    • Method contains {“FullClientModify.do?”, “FullClientView.do?”}
  • Agnostic documents and hyperlinks:
    • Method contains {“PlatformServices/content/view.do?”, “/hyperlink/view.do?”}
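
The conditions listed above were built in a Universe; the same logic can be sketched in Python as follows. The substrings are the ones from the list, while the function name classify_hit and the parameters method_field and http_field are hypothetical names that mirror the source fields Method and HTTP. How those two fields map onto the raw log columns is not specified here, so the sketch simply takes them as inputs.

```python
# Substrings taken from the examples above; add an entry for each new hit type.
HIT_PATTERNS = {
    "Open document": ["/OpenDocument/opendoc/openDocument.jsp?"],
    "Web Intelligence": ["/WebiModify.do?", "WebiView.do?"],
    "Desktop Intelligence": ["FullClientModify.do?", "FullClientView.do?"],
    "Agnostic documents and hyperlinks": [
        "PlatformServices/content/view.do?",
        "/hyperlink/view.do?",
    ],
}

def classify_hit(method_field, http_field):
    """Return the hit type of a log record, or None if it should be filtered out."""
    for hit_type, needles in HIT_PATTERNS.items():
        if any(needle in method_field for needle in needles):
            # Open document calls carry the extra condition: HTTP is equal to "-"
            if hit_type == "Open document" and http_field != "-":
                continue
            return hit_type
    return None

print(classify_hit("GET /AnalyticalReporting/WebiView.do?objIds=444645 HTTP/1.1", "-"))
```

Records that match none of the patterns return None and are dropped, which corresponds to the OR of all predefined conditions.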

 

Once all the filtering patterns for the hits we need to monitor have been determined, we can enrich this information with data coming from the CMS System database as well as the CMS Audit database. The result of all these data transformations is a single file containing hit information by:

  • Day & Time
  • User
  • Object Code & its attributes (kind, name, owner)
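
To make the enrichment stages more concrete, here is a minimal Python sketch under stated assumptions. The four lookup structures (ip_to_user, instances, documents, administrators) are hypothetical stand-ins for data extracted from the CMS System and Audit databases, and all IDs, names and users in the sample are invented for illustration.

```python
def enrich_hit(hit, ip_to_user, instances, documents, administrators):
    """Return the enriched record for one hit, or None if the hit is excluded."""
    # Replace the IP with the user name (fall back to the raw IP if unknown)
    user = ip_to_user.get(hit["host"], hit["host"])
    if user in administrators:
        return None  # final filtering: hits from Administrators
    object_id = hit["object_id"]
    instance = instances.get(object_id)
    if instance is not None:
        if user == instance["owner"]:
            return None  # final filtering: hits from the instance owner
        object_id = instance["parent"]  # assign the hit to the document, not the instance
    attributes = documents.get(object_id, {"kind": None, "name": None, "owner": None})
    return {"time": hit["time"], "user": user, "object_id": object_id, **attributes}

# Hypothetical sample data
ip_to_user = {"10.60.89.150": "jdoe"}
instances = {"1002": {"parent": "1001", "owner": "scheduler"}}
documents = {"1001": {"kind": "Webi", "name": "Sales report", "owner": "asmith"}}
administrators = {"Administrator"}

hit = {"host": "10.60.89.150", "time": "2013-04-04 08:19:18", "object_id": "1002"}
record = enrich_hit(hit, ip_to_user, instances, documents, administrators)
print(record)
```

Because the document and instance attributes are kept in separate lookup tables, hits on deleted files and instances can still be resolved as long as their information was stored beforehand.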

 

Data monitoring

The resulting data set can be monitored in any BI visualization tool. Below is a sample in Web Intelligence (Fig.2):

 

Summary

In this exercise of data exploitation, transformation and monitoring, we have improved the auditing capabilities of SAP BusinessObjects to meet the customer requirement of “not missing a hit”.

The benefit of this method is that it captures hits that were previously impossible to detect:

  • Deleted content
  • Agnostic documents like pdf, xls, ppt, doc, flash
  • Explorer Information Spaces
  • Open Document calls
  • HTML portals and any other web page accesses

This will give Administrators and IT managers a feel for what content to target and prioritize in migrations or maintenance tasks, which will drastically improve their efficiency.

If you have questions about this article, or if you want to share your experience or tips, please feel free to leave a comment.