Troubleshooting an Unresponsive Blackboard Application

When troubleshooting any problem in Blackboard it is advisable to first check the system logs, especially if the cause of the problem is not obvious. The logs can indicate the source of the problem or at least provide a starting point. The bb-services log is the primary Blackboard application log and is often the best place to start. In the case of an unresponsive application the stdout-stderr log is particularly valuable for determining whether the application server’s JVM failed to start or crashed, and it may indicate why.

Although it would be difficult to exhaustively list every possible cause of an unresponsive application, the following are the more common reasons that Blackboard may be unresponsive (e.g. “Service Unavailable”, “Page cannot be displayed”, etc.):

 

STOPPED SERVICES

The Blackboard application will be unresponsive if its services are not running. Check whether the Blackboard services are running by navigating to Administrative Tools > Services. The following services should be running and set to automatic startup:

1. Bb-Tomcat

2. Bb-Collab*

3. IIS Admin Service

4. World Wide Web Publishing

If any of these services are not running, open a command prompt and navigate to X:\blackboard\tools\admin and run the following: “ServiceController.bat services.start”

If the services start, check the website again (it may take a few minutes for Tomcat to begin serving the Blackboard application). If the services fail to start, check the bb-services log and the stdout-stderr log, located under X:\blackboard\logs and X:\blackboard\logs\tomcat respectively, for details.

*On collaboration server only.

 

SERVICE OWNERSHIP

The Blackboard application may be non-responsive or buggy if its services, application pool, and website are not owned by a domain account (this should be a domain account dedicated to Blackboard). Make sure that the Blackboard services are owned by the Blackboard service account and not a local account: navigate to Administrative Tools > Services and check the “Log On As” field of the Bb-Tomcat and Bb-Collab services. If this is set to “Local System”, use Right Click > Properties > Log On and enter the credentials of the Blackboard service account. After setting the ownership, restart both services.

Ensure that the Default Application Pool is owned by the Blackboard service account by Right Click > Properties > Identity within IIS Manager.

Ensure that anonymous access is enabled with the Blackboard service account for the Blackboard website by Right Click > Directory Security > Authentication and access control > Edit within IIS Manager.

 

HOSTNAMES

Incorrectly specifying the application server, database server, or content share location in bb-config.properties will result in a non-responsive or malfunctioning application.

The Blackboard application \ webserver should be configured with the following syntax:

bbconfig.webserver.fullhostname=AppMachineName.DomainName.edu

bbconfig.appserver.fullhostname=AppMachineName.DomainName.edu

bbconfig.appserver.machinename=AppMachineName

bbconfig.appserver.domainname=DomainName.edu
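The relationship between these four entries can be checked mechanically. The following is a minimal Python sketch, not a Blackboard tool: the function names are hypothetical, the parser handles only plain key=value lines (no Java properties escapes), and it assumes a single-server setup in which both full hostnames should equal the machine name joined to the domain name, as in the syntax shown above.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comment lines."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_appserver_hostnames(props):
    """Return a list of problems; an empty list means the entries agree."""
    machine = props.get("bbconfig.appserver.machinename", "")
    domain = props.get("bbconfig.appserver.domainname", "")
    expected = f"{machine}.{domain}"
    problems = []
    # Assumption: web server and app server share one host name.
    for key in ("bbconfig.webserver.fullhostname",
                "bbconfig.appserver.fullhostname"):
        actual = props.get(key, "")
        if actual.lower() != expected.lower():
            problems.append(f"{key} is {actual!r}, expected {expected!r}")
    return problems


# Placeholder values copied from the syntax above.
sample = """
bbconfig.webserver.fullhostname=AppMachineName.DomainName.edu
bbconfig.appserver.fullhostname=AppMachineName.DomainName.edu
bbconfig.appserver.machinename=AppMachineName
bbconfig.appserver.domainname=DomainName.edu
"""

print(check_appserver_hostnames(parse_properties(sample)))
```

In a load-balanced environment the web server host name may legitimately differ from the application server host name, so a reported mismatch is a prompt to look, not proof of an error.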

 

The database server should be configured with the following syntax:

bbconfig.database.bbadmin.machine.machinename=DBMachineName

bbconfig.database.bbadmin.machine.fullhostname=DBMachineName.DomainName.edu

bbconfig.database.bbadmin.machine.instancename=

NOTE: If using the default instance, then “instancename” should be left blank.

 

The content share should be configured to point at the shared content folder (UNC path if content share is on a different server.) For example:

bbconfig.base.shared.dir=//BbFileServer/BbContent

bbconfig.base.shared.dir.win=\\\\BbFileServer\\BbContent

 

Although not critical to the function of Blackboard, collaboration and mail must be configured properly within bb-config.properties for these functions to work.

Collaboration should be configured with the following syntax:

bbconfig.collabserver.fullhostname.default=CollaborationMachineName.DomainName.edu

bbconfig.collabserver.run.on.localhost=false

NOTE: In a loadbalanced environment run.on.localhost should only be ‘true’ on the dedicated collaboration server. In a single-server environment run.on.localhost should be ‘true’ on the appserver and the collab hostname should be the FQDN of the localhost.

 

Mail should be configured with the following syntax:

bbconfig.smtpserver.hostname=SMTPMachineName.DomainName.edu

 

PASSWORDS

Mismatched passwords between a Blackboard application and Blackboard databases may prevent the application from communicating with its databases. Ensure that the passwords match within bb-config.properties and the databases. The following entries in bb-config.properties should match the passwords of the Blackboard databases’ security logins (bb_bb60, bb_bb60_report, bb_bb60_stats, bbadmin) and the passwords listed in the instance table of the bbadmin database:

bbconfig.database.bbadmin.db.password

antargs.default.users.administrator.password

antargs.default.users.integration.password

antargs.default.users.rootadmin.password

antargs.default.vi.db.password

antargs.default.vi.stats.db.password

antargs.default.vi.report.user.password

The following parameter in bb-config.properties should match the sa password of the database instance where the Blackboard databases reside:

bbconfig.database.bbadmin.machine.systemuserpassword


BLACKBOARD WEBSITE

If the Blackboard services are running, make sure that the Blackboard website is running and that any other site bound to port 80 (e.g. the default website) is stopped or removed from the IIS Manager.

Check whether the Blackboard website is available internally. This can be accomplished by navigating to ‘localhost’ from a browser on the server, or by browsing the Blackboard website from IIS. If the Blackboard site can be viewed on the server but not via the internet, then a firewall or other network device is probably blocking communication over port 80.

 

JAVA_HOME

If Java is incorrectly configured the Blackboard application will be unresponsive. Ensure that the JAVA_HOME variable is pointing to the Java JDK folder and the slashes are correct in the bb-config.properties file. Also check that a JAVA_HOME environment variable exists and points to the same folder location. An environment variable can be added by Right-Click My Computer > Properties > Advanced > Environment Variables. The correct syntax in bb-config is as follows:

bbconfig.java.home=C:/Java/jdkX.X.x_xx

bbconfig.java.home.win=C:\\Java\\jdkX.X.x_xx
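Because the same folder is written three different ways (forward slashes, escaped backslashes, and the plain Windows path in the environment variable), a stray character is easy to miss. The sketch below, in Python, normalizes the three forms and compares them. The function names and the JDK folder name are hypothetical, chosen only for illustration.

```python
def normalize_java_home(value):
    """Reduce a Java home path to one comparable form."""
    # Escaped backslashes (as in bb-config.properties) first, then single ones.
    path = value.strip().replace("\\\\", "/").replace("\\", "/")
    return path.rstrip("/").lower()


def java_homes_match(home, home_win, env_value=None):
    """True when all supplied values point at the same folder."""
    forms = {normalize_java_home(home), normalize_java_home(home_win)}
    if env_value is not None:
        forms.add(normalize_java_home(env_value))
    return len(forms) == 1


# Hypothetical JDK folder; substitute the values from your own system.
print(java_homes_match("C:/Java/jdk1.6.0_21",
                       r"C:\\Java\\jdk1.6.0_21",
                       r"C:\Java\jdk1.6.0_21"))
```

A value with an accidental space inside the path, such as `C:\\Java\\ jdk1.6.0_21`, normalizes to a different string and is reported as a mismatch.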

 

SSL

If Blackboard’s SSL option is enabled and an SSL certificate is not associated with each application server then the Blackboard application will be unresponsive. The SSL certificate may need to be reapplied after reboots. If the Blackboard installation uses SSL and SSL is handled on each application server (rather than externally, e.g. at a load balancer), make sure that the SSL certificate is associated with the site:

1. Right-Click the Blackboard website from IIS Manager.

2. Select ‘Properties’.

3. Click the ‘Directory Security’ tab.

4. Under ‘Secure Communication’ check whether the SSL certificate is applied. If not, click on ‘Server Certificate…’ and select the certificate.

Part 2 – Tracking Blackboard Activity

This is Part 2 of 2 in this series:

Part 1 – Analyzing the Blackboard Access Logs

Part 2 – Analyzing the Activity Accumulator


PREFACE

When statistics tracking is turned on, all users’ actions within the Blackboard application are recorded in the activity_accumulator table of Blackboard’s “core” and “stats” databases. Newer access information is stored in the core database (bb_bb60 or BBLEARN) and access information older than a threshold (default 180 days) resides in the stats database (bb_bb60_stats or BBLEARN_stats). Each row in these activity_accumulator tables contains information regarding the user, course, group, tab, module, content, discussion board, and so forth involved in a specific access.

ANALYZING THE ACTIVITY_ACCUMULATOR

The following are the fields of the activity_accumulator tables:

  1. event_type – High level classification of an access. Common event_types are:
    • LOGIN_ATTEMPT
    • LOGOUT
    • SESSION_INIT
    • TAB_ACCESS
    • PAGE_ACCESS
    • MODULE_ACCESS
    • COURSE_ACCESS
  2. user_pk1 – The user’s primary key, which can be used to look up more detailed information in the users table about the user that performed the access (e.g. SELECT * FROM users WHERE pk1='SomePrimaryKey')
  3. course_pk1 – The course’s primary key (if the access was of a course), which can be used to look up more detailed information in the course_main table about the course that was accessed (e.g. SELECT * FROM course_main WHERE pk1='SomePrimaryKey')
  4. group_pk1 – The group’s primary key (if the access was made by a group), which can be used to look up more detailed information in the groups table about the group that accessed the course (e.g. SELECT * FROM groups WHERE pk1='SomePrimaryKey')
  5. forum_pk1 – The forum’s primary key (if the access was made to a forum), which can be used to look up more detailed information in the forum_main table about the forum that was accessed (e.g. SELECT * FROM forum_main WHERE pk1='SomePrimaryKey')
  6. internal_handle – Detailed classification of an access. An example of a few internal_handles that could be listed alongside an event_type of “COURSE_ACCESS”:
    • control_panel
    • cp_gradebook
    • cp_gradebook2_modify_item
    • cp_discussion_board
    • cp_staff_information
    • cp_announcements
    • cp_design
    • cp_groups
    • cp_collaboration
    • content
    • check_grade
    • grade_individual_attempt
    • discussion_board
    • discussion_board_entry
    • db_grade_list
    • course_tools_area
    • send_email
    • admin_course_list_users
  7. content_pk1 – The content item’s primary key (if the access was made to a content item), which can be used to look up more detailed information in the course_contents table about the content item that was accessed (e.g. SELECT * FROM course_contents WHERE pk1='SomePrimaryKey')
  8. data – Contains event-specific information that provides greater detail about the event_type and internal_handle. For example:
    • A row with an event_type of “LOGIN_ATTEMPT” will have an entry in the data field that is either “Login Succeeded” or “Login Failed.”
    • A row with an event_type of “TAB_ACCESS” or “MODULE_ACCESS” will have an entry in the data field that is the primary key of the tab or module. The primary key can then be used to look up detailed information in the tab or module tables.
    • A row with an event_type of “PAGE_ACCESS” will have an entry in the data field identifying which page was accessed (e.g. “Create Link Tab” or “Manage Brands”).
  9. timestamp – The date and time that the access occurred.

 

From knowledge of these fields and the relationship of the activity_accumulator table to others (e.g. users, course_main, tab, module, forum_main, and so on) an administrator can construct queries to analyze access of a specific type, from a specific time, by a specific user or group. If, for example, an administrator wants to retrieve all access by a user with a user name of ‘jsmith’ in a course with an id of ‘History101’ after October 22, 2010, they could use the following:

SELECT * FROM activity_accumulator aa

INNER JOIN users u

ON aa.user_pk1 = u.pk1

INNER JOIN course_main cm

ON aa.course_pk1 = cm.pk1

WHERE u.user_id='jsmith' AND cm.course_id='History101' AND aa.timestamp > '2010-10-22'
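When such a query is run from a script, the user name, course id, and date are better passed as bound parameters than pasted into the SQL text. The Python sketch below illustrates this against an in-memory SQLite database standing in for the Blackboard database. The three tables contain only the columns the query touches, and every row is invented sample data, so this demonstrates the join and nothing about the real schema beyond the columns named in this article.

```python
import sqlite3

QUERY = """
SELECT aa.event_type, aa.timestamp
FROM activity_accumulator aa
INNER JOIN users u ON aa.user_pk1 = u.pk1
INNER JOIN course_main cm ON aa.course_pk1 = cm.pk1
WHERE u.user_id = ? AND cm.course_id = ? AND aa.timestamp > ?
ORDER BY aa.timestamp
"""


def build_demo_db():
    """Create a tiny stand-in database with invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (pk1 INTEGER PRIMARY KEY, user_id TEXT);
        CREATE TABLE course_main (pk1 INTEGER PRIMARY KEY, course_id TEXT);
        CREATE TABLE activity_accumulator (
            event_type TEXT, user_pk1 INTEGER, course_pk1 INTEGER,
            timestamp TEXT);
        INSERT INTO users VALUES (5, 'jsmith');
        INSERT INTO users VALUES (6, 'adoe');
        INSERT INTO course_main VALUES (2, 'History101');
        INSERT INTO activity_accumulator
            VALUES ('COURSE_ACCESS', 5, 2, '2010-10-20 09:00:00');
        INSERT INTO activity_accumulator
            VALUES ('COURSE_ACCESS', 5, 2, '2010-11-01 14:30:00');
        INSERT INTO activity_accumulator
            VALUES ('COURSE_ACCESS', 6, 2, '2010-11-02 08:15:00');
    """)
    return conn


def accesses(conn, user_id, course_id, after):
    """Return (event_type, timestamp) rows for one user in one course."""
    return conn.execute(QUERY, (user_id, course_id, after)).fetchall()


print(accesses(build_demo_db(), "jsmith", "History101", "2010-10-22"))
```

Only the November access by jsmith is returned: the October 20 row falls before the cutoff and the third row belongs to another user.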

Part 1 – Tracking Blackboard Activity

This is Part 1 of 2 in this series:

Part 1 – Analyzing the Blackboard Access Logs

Part 2 – Analyzing the Activity Accumulator

 

PREFACE

All users’ actions within the Blackboard application are recorded in Blackboard’s access logs. Each log entry contains information about the user, their environment, the nature of the activity, and the time of the activity. Prior to the release of Blackboard version 9.0 (versions 6 through 8) the httpd logs located under blackboard\logs\httpd contained this information. Blackboard 9.0 introduced the bb-access-log, located under blackboard\logs\tomcat. In version 9.0 and later the httpd log still exists, but the bb-access-log is more useful to the administrator because it automatically cross-references the session id in the httpd logs and the modperl logs to identify the primary key of the user that performed a given access. In versions 6 through 8 this cross-referencing must be done manually.

 

ANALYZING THE BB-ACCESS-LOG

192.168.90.128 - _5_1 [03/Dec/2010:10:40:21 -0600] "GET /webapps/portal/frameset.jsp?tab_tab_group_id=_2_1&url=%2Fwebapps%2Fblackboard%2Fexecute%2Flauncher%3Ftype%3DCourse%26id%3D_2_1%26url%3D HTTP/1.1" 200 7620 "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12" 132E22BA1D45421822B981B8326AC471 0.017 17 7620

- 192.168.90.128 is the IP address of the user that accessed the page. In a load balanced environment this will be the IP of a server and not the user’s PC.

 

- _5_1 is the primary key (pk1) of the user that accessed the page. The primary key can be used to determine the username and name of the user by searching the users table in the bb_bb60 or BBLEARN database. For example:

SELECT user_id, firstname, lastname FROM users WHERE pk1='5'

Please note that the httpd log does not list users’ primary keys. To determine the user with an httpd log, the session id must be cross-referenced with the modperl log.

 

- [03/Dec/2010:10:40:21 -0600] is the date, time, and time zone offset at which the page was accessed.

 

- GET is the cs-method, which specifies whether the call was a GET or a POST.

 

- The portion beginning with /webapps/portal/frameset.jsp is the cs-uri-stem, usually the URL or portion of the URL that was accessed. This, along with the user’s primary key, is the most valuable information contained in a log entry. It is the HTTP call that was made as a result of a user’s access (e.g. clicking a tab or link). From this example we can tell exactly what the user accessed. By examining “tab_tab_group_id=_2_1” we know that the user clicked on a tab. To determine which tab, we can look up the row in the “tab_tab_group” table with a primary key of 2.

SELECT * FROM tab_tab_group WHERE pk1='2'

The result:

tab_pk1    pk1    position    tab_group_pk1
2          2      0           2

From here we know the primary key of the tab and its tab group, which we can in turn look up.

SELECT * FROM tab WHERE pk1='2'

The “label” field from the returned row is “Courses.label” and tells us that the tab that was clicked was the Courses tab.

The cs-uri-stem will vary greatly based on the type of access and may contain more or less detailed information than the above example. However, the details of most accesses can be determined either by querying the Blackboard databases or by replicating the access and observing the respective cs-uri-stem for comparison.

 

- Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12 is the User Agent, which describes the client application that made the request. This can include the operating system, browser and version, language and dialect, version of the .NET CLR, etc.

 

- 132E22BA1D45421822B981B8326AC471 is the session id of the user that accessed the page. If using a version earlier than 9.0, the session id can be cross-referenced in the modperl log to acquire the user’s primary key.
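The field breakdown above can be turned into a small parser. The following Python sketch is one way to do it, under stated assumptions: the regular expression is derived from the single sample entry in this article, not from a published format specification, so other entries may need adjustments. The three trailing values (0.017 17 7620) and the value after the status code are left unparsed because they are not described here, and the function names are hypothetical.

```python
import re
from datetime import datetime
from urllib.parse import parse_qs, urlsplit

# Field order inferred from the sample entry; trailing fields are ignored.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user_id>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<agent>[^"]*)" (?P<session>\S+)'
)

SAMPLE = (
    '192.168.90.128 - _5_1 [03/Dec/2010:10:40:21 -0600] '
    '"GET /webapps/portal/frameset.jsp?tab_tab_group_id=_2_1&url=%2Fwebapps'
    '%2Fblackboard%2Fexecute%2Flauncher%3Ftype%3DCourse%26id%3D_2_1%26url%3D '
    'HTTP/1.1" 200 7620 "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; '
    'rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12" '
    '132E22BA1D45421822B981B8326AC471 0.017 17 7620'
)


def pk1_from_id(bb_id):
    """Turn an id such as '_5_1' into the primary key 5, else None."""
    match = re.fullmatch(r"_(\d+)_\d+", bb_id)
    return int(match.group(1)) if match else None


def parse_access_line(line):
    """Return a dict of the described fields, or None if the line differs."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["user_pk1"] = pk1_from_id(entry.pop("user_id"))
    entry["query"] = parse_qs(urlsplit(entry["uri"]).query)
    return entry


entry = parse_access_line(SAMPLE)
print(entry["ip"], entry["user_pk1"], entry["method"])
print(entry["query"]["tab_tab_group_id"])
```

The `user_pk1` value can be used directly in the users table lookup, and the `tab_tab_group_id` query parameter feeds the tab lookup shown earlier.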

Blackboard Maintenance

The most important factor to sustaining performance and uptime of the Blackboard application is regular and thorough maintenance. This article will discuss tasks necessary for maintaining an active Blackboard application.

BLACKBOARD APPLICATION

Some maintenance tasks are automated by Blackboard and can be configured within blackboard\config\bb-tasks.xml. The successful execution of these tasks is critical to the continued operation and optimal performance of the Blackboard application. It is important to regularly check to ensure that the tasks successfully execute.

- The most important automated task is the rotate logs task. It is configured by “bb.log.rotation” in bb-tasks.xml (Bb 8 and prior) or bb-tasks.windows.xml (Bb 9 and later), and can be executed manually by running the RotateLogs batch file located under blackboard\tools\admin. The task archives all logs to a compressed directory named with the current timestamp. Its successful execution matters because performance diminishes as the Blackboard logs grow, and if the logs are never deleted or archived the accumulated logging data will eventually cause the application to crash. Check the task frequently: particularly large logs cause it to fail, after which the logs keep growing unchecked (the size at which the task fails varies from system to system). If this occurs the logs must be manually archived or deleted. To prevent the issue from recurring, increase the frequency of log rotation or turn down the logging verbosity (service-config.properties). I recommend setting the logging verbosity no higher than “error”, unless troubleshooting a specific issue, to prevent excessive logging.

Optionally the rotate logs task can be disabled and an external scheduling tool (e.g. Windows Task Scheduler) can be used to provide greater control.

- Another important automated task is the purge accumulator task. It is configured in bb-tasks.xml (Bb 8 and prior) or bb-tasks.windows.xml (Bb 9 and later) by “blackboard.platform.tracking.PurgeApplicationTask”, and can be executed manually by running the PurgeAccumulator batch file under blackboard\tools\admin. The Activity_Accumulator table in the Blackboard database contains statistics tracking data used for reporting. The primary purpose of the purge accumulator task is to migrate old Activity_Accumulator data from the “core” database to the “stats” database. For Enterprise applications the “stats” database contains all historic statistics data (for Basic licenses the data is simply deleted). As with the rotate logs task, the purge accumulator task can fail if there is too much data, in this case too many rows in the Activity_Accumulator tables. If the task fails and rows continue to accumulate, a variety of statistics tracking and reporting failures can occur. This can be resolved by manually running the PurgeAccumulator batch file to purge a small number of rows at a time until the table is below the failure threshold. To prevent the issue from recurring, decrease the time period for which rows are kept. To determine whether the task is failing, the following queries can be used:

SELECT MIN(TIMESTAMP) FROM ACTIVITY_ACCUMULATOR;
SELECT MAX(TIMESTAMP) FROM ACTIVITY_ACCUMULATOR;

These queries should be executed against the bb_bb60 database (versions 6 – 9, or in-place upgrade to 9.1) or BBLEARN database (fresh install of version 9.1.) If the difference between these timestamps is greater than the number of days to keep (default 180 days), then the purge accumulator task is failing.
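The comparison of the two timestamps is simple date arithmetic. As a minimal Python sketch, assuming the MIN and MAX values have already been fetched from the database: the function name is hypothetical, and the one-day slack (to allow for rows written since the last scheduled run) is an assumption of this sketch, not a Blackboard setting.

```python
from datetime import datetime, timedelta


def purge_is_failing(oldest, newest, days_to_keep=180, slack_days=1):
    """True when the table spans more days than the retention window."""
    # slack_days is an illustrative allowance, not a Blackboard parameter.
    return (newest - oldest) > timedelta(days=days_to_keep + slack_days)


# Invented timestamps standing in for the MIN and MAX query results.
oldest = datetime(2010, 1, 1)
newest = datetime(2010, 12, 3)
print(purge_is_failing(oldest, newest))
```

Here the table spans 336 days against a 180-day window, so the check reports a failing purge.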

Optionally the purge accumulator task can be disabled and an external scheduling tool (e.g. Windows Task Scheduler) can be used to provide greater control.

 

Although the preceding tasks can be scheduled to run automatically, several others should be performed on a schedule and require manual execution:

- The administrator should regularly clear out old data within the Blackboard content recycle bin located on the content share.

- All courses older than a chosen age (e.g. one year) should be archived and removed from the system. This process should run at the conclusion of every term. These archives should be kept safely on another storage device in case of contested grades, audits, etc.

- Blackboard and Java updates should be assessed and applied if deemed appropriate. Java updates are generally reliable, but the administrator must be sure to update the JAVA_HOME parameter in bb-config, update the JAVA_HOME environment variable, and run the PushConfigUpdates batch file for the change to take effect. Blackboard hotfixes, service packs, and major version upgrades should be considered carefully. The administrator should examine the release notes and the list of all known issues for the version (all of this information is available at the Behind the Blackboard website, http://behind.blackboard.com). Blackboard updates should not necessarily be adopted immediately after their release. I recommend allowing sufficient time for new issues to be reported and patched. After an update or version has been available for sufficient time the administrator can objectively weigh the benefits and the risks based on the aforementioned documentation, reports on the listservs, and other Blackboard forums.

- The administrator should regularly audit the Blackboard logs for errors and anomalies. It should be noted that Blackboard can throw a number of benign errors. The following are examples of significant problems to look for:

bb-services-log:
“FATAL” or “SEVERE” errors
“Error getting connection”

stdout-stderr:
Particularly long garbage collections or JVM startups
“JVM exited unexpectedly”
“There were 5 failed launches in a row, each lasting less than 300 seconds. Giving Up.”
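A log audit for the messages listed above can be partly automated. The Python sketch below scans lines for those strings and reports the line numbers. The pattern table and function name are hypothetical, the sample lines are invented, and long garbage collections or slow JVM startups are not covered because detecting them requires interpreting timings rather than matching text.

```python
import re

# Patterns taken from the examples listed above; extend as needed.
PATTERNS = {
    "bb-services": [r"\bFATAL\b", r"\bSEVERE\b", r"Error getting connection"],
    "stdout-stderr": [r"JVM exited unexpectedly", r"failed launches in a row"],
}


def audit_log(lines, kind):
    """Yield (line_number, text) for each line matching a known pattern."""
    regexes = [re.compile(pattern) for pattern in PATTERNS[kind]]
    for number, line in enumerate(lines, start=1):
        if any(regex.search(line) for regex in regexes):
            yield number, line.rstrip("\n")


# Invented sample lines for illustration only.
sample_lines = [
    "INFO application started",
    "FATAL could not initialize",
    "Error getting connection from pool",
]
for number, text in audit_log(sample_lines, "bb-services"):
    print(number, text)
```

Since `audit_log` accepts any iterable of lines, an open file object can be passed in place of the sample list to scan a real log.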

 

BLACKBOARD DATABASES

Just as it is important to maintain the Blackboard application, it is important to maintain Blackboard’s data. The following are suggested maintenance tasks for maintaining the Blackboard databases on MSSQL Server:

- Regularly truncate Blackboard’s transaction logs.

- Reorganize indexes daily.

- Rebuild indexes weekly.

- Regularly back up and truncate the System Tasks Status logs (the queued_tasks table). If too many entries accumulate, course copies, archives\restores, and import\exports will suffer in performance.

Copy all queued_tasks into a new table:

SELECT *

INTO queued_tasks_backup

FROM queued_tasks

OR copy all queued_tasks into an existing backup table:

INSERT INTO queued_tasks_backup(pk1, dtcreated, dtmodified, title, task_type, status, users_pk1, entry_node, process_node, start_date, end_date, arguments, results)
SELECT qt.pk1, qt.dtcreated, qt.dtmodified, qt.title, qt.task_type, qt.status, qt.users_pk1, qt.entry_node, qt.process_node, qt.start_date, qt.end_date, qt.arguments, qt.results
FROM queued_tasks qt

Delete queued_tasks:

DELETE FROM queued_tasks

- Audit the Blackboard databases on a schedule (e.g. annually or semi-annually) for orphaned courses and users. Such orphans may arise from removing old data (e.g. archiving old courses). For instance:

--ORPHANED USERS
SELECT pk1, user_id
FROM users
EXCEPT
SELECT users.pk1, user_id
FROM users
INNER JOIN course_users
ON users.pk1 = course_users.users_pk1

--COURSES WITHOUT INSTRUCTORS
SELECT pk1, course_id
FROM course_main
EXCEPT
SELECT cm.pk1, cm.course_id
FROM course_main cm, course_users cu
WHERE cu.role='p' AND cm.pk1=cu.crsmain_pk1

 

The following articles provide the details of these tasks for MS SQL Server:

http://kb.blackboard.com/display/KB/SQL+Server+2005+Maintenance+Plan+Wizard

http://kb.blackboard.com/display/KB/SQL+Server+2008+Maintenance+Plan+Wizard

HOST OPERATING SYSTEM

Appropriate maintenance for the host Operating System and environment will vary. The administrator may wish to develop a custom maintenance plan for this purpose. The following is suggested maintenance for all application, collaboration, and file servers:

- Clean up temporary data.

- Scheduled disk defragmentation.

- Implementation of security updates and hotfixes.

- Audit server and network logs for errors or anomalies.

 

ADDITIONAL RESOURCES

http://kb.blackboard.com/display/KB/Description+of+bb-tasks.xml

http://kb.blackboard.com/display/KB/The+PurgeAccumulator+task

http://kb.blackboard.com/display/KB/SQL+Server+2005+Maintenance+Plan+Wizard

http://kb.blackboard.com/display/KB/SQL+Server+2008+Maintenance+Plan+Wizard

http://kb.blackboard.com/display/KB/cleanbb

Part 3 – Blackboard Performance Tuning: An Iterative Approach

This is Part 3 of 3 in this series:

Part 1 – Overview & Architecture

Part 2 – JVM Tuning Methods

Part 3 – Additional Tuning Methods

 

In Part 1 the following was presented as the top-down list of resources related to the Blackboard application that affect its performance:

  1. Software
    • Blackboard Application
    • JVM
    • Tomcat
    • Web Server (IIS, Apache)
  2. Operating System and Services
  3. Server Hardware
  4. Network Architecture and Hardware

Part 2 discussed in detail tuning a Java Virtual Machine for use with the Blackboard application. The JVM was discussed first because tuning it offers greater potential performance gains than tuning any other single item on the list. This section contains suggestions for tuning the other items on the list. Please note that this section and this series are not exhaustive. For complete tuning practices please consult the documentation for each individual technology. There are additional resources listed at the end of this section.

 

BLACKBOARD APPLICATION TUNING

Garbage Collection Timeout – This value determines how long the JVM may remain unresponsive, for example during a long garbage collection, before the service wrapper forces it to terminate (default is 30 seconds). A system under heavy load will sometimes require a garbage collection that takes longer than the default timeout permits. If this occurs the JVM is treated as hung and is terminated, taking the Blackboard application down with it. A good practice is to double the timeout to 60 seconds, although this may need further adjustment for some systems.

The timeout is set by the wrapper.ping.timeout parameter and must be changed in both apps/tomcat/conf/wrapper.conf and config/tomcat/conf/wrapper.conf.bb.
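Because the value lives in two files, it is easy to change one and forget the other. The following Python sketch compares the two settings, assuming both files use plain `name=value` lines with `#` comments. The function names are hypothetical and the sample text is invented, not copied from a real wrapper.conf.

```python
def read_ping_timeout(conf_text):
    """Return wrapper.ping.timeout as an int, or None if it is not set."""
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "wrapper.ping.timeout":
            return int(value.strip())
    return None


def timeouts_agree(active_conf, template_conf):
    """True when both files set the same wrapper.ping.timeout value."""
    active = read_ping_timeout(active_conf)
    template = read_ping_timeout(template_conf)
    return active is not None and active == template


# Invented file contents for illustration.
active = "# wrapper settings\nwrapper.ping.timeout=60\n"
template = "wrapper.ping.timeout=30\n"
print(timeouts_agree(active, template))
```

In this example the two files disagree (60 versus 30), which is the situation the check is meant to catch.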

Other Timeouts – Timeouts for session, assessments, and so on can be set within Blackboard. Generally, shorter timeouts result in improved performance but diminished stability and longer timeouts result in diminished performance but improved stability.

Assessment Timeout – For example “<session-timeout>20</session-timeout>” can be set within blackboard/webapps/assessment/WEB-INF/web.xml. See http://kb.blackboard.com/display/KB/Preventing+assessment+timeouts

Session Timeout – See http://kb.blackboard.com/display/KB/Modifying+the+default+timeout+session.

OPERATING SYSTEM TUNING (e.g. Windows Server)

Processor Scheduling – Performance gain may be achieved if processor scheduling is set to “Background Services.”

Disable Unused Services – Performance gain may be achieved by disabling any unnecessary services.

Paging File – Paging file size may be considered as another variable in the tuning process to test against the baseline configuration. Properly tuning the paging file size may achieve performance gains.

TCP Chimney – Disabling TCP Chimney may achieve performance gain. This can be accomplished by running “netsh int tcp set global chimney=disabled” at a command prompt.

WEB SERVER TUNING (e.g. IIS)

Allow Persistent Connections – Represented by the parameter bbconfig.webserver.keepalive in bb-config.properties. The recommended setting is 1.

Persistent Connection Timeout – Represented by the parameter bbconfig.webserver.keepalivetimeout in bb-config.properties. The recommended setting is 15.

HTTP Compression – Represented by the parameter bbconfig.webserver.compression in bb-config.properties. The recommended setting is ‘Yes’.

 

ADDITIONAL RESOURCES

Blackboard

http://kb.blackboard.com/display/KB/Windows+2008+Performance+Guide

http://kb.blackboard.com/display/KB/Windows+2003+Performance+Guide

http://kb.blackboard.com/display/KB/The+Java+Service+Wrapper

http://kb.blackboard.com/display/KB/Preventing+assessment+timeouts

http://kb.blackboard.com/display/KB/Modifying+the+default+timeout+session

https://behind.blackboard.com

Document Library > Blackboard Learn 9.1: Performance Optimization Guide

Document Library > Hardware Sizing Guides

Java

http://blogs.sun.com/watt/resource/jvm-options-list.html

http://www.oracle.com/technetwork/java/tuning-139912.html

Windows Server

http://www.microsoft.com/whdc/system/sysperf/perf_tun_srv.mspx

http://www.microsoft.com/whdc/system/sysperf/Perf_tun_srv-R2.mspx

Part 2 – Blackboard Performance Tuning: An Iterative Approach

This is Part 2 of 3 in this series:

Part 1 – Overview & Architecture

Part 2 – JVM Tuning Methods

Part 3 – Additional Tuning Methods

 

JVM PERFORMANCE TUNING

The most significant application performance gains come from properly tuning each Blackboard application server’s JVM, so Part 2 is dedicated to JVM tuning. Before examining the tuning options it is valuable to understand the life cycle of an object in the JVM heap:

  1. Object is created in ‘Eden’
  2. Object is transferred to the free survivor space (there are two survivor spaces) within the ‘young’ generation.
  3. Object is transferred between the survivor spaces until garbage collection frees it or it has been copied a predefined number of times (default is 31.)
  4. After the object reaches the threshold it is transferred to the ‘tenured’ generation where it resides until it is ready to be freed by garbage collection.

The function of this architecture and its relevance to tuning:

  • Accessing the young generation (read, write) is inexpensive but garbage collection within the young generation is expensive. If there is not enough space in Eden or a survivor space new objects will be prematurely tenured and their access will be more expensive.
  • Accessing the tenured generation (read, write) is expensive but garbage collection within the tenured generation is inexpensive. If the tenuring threshold is too high then a large number of objects will die in the young generation and garbage collection will be more expensive.
  • The larger the young generation is the fewer minor garbage collections will take place. A larger young generation implies a smaller tenured generation which increases the frequency of major collections.

To attain ideal JVM performance, objects should reside in the young generation for as long as they are actively accessed and should be tenured when they are about to die or access to them drops off. Depending on the nature of the application it may be valuable to control the distribution of garbage collection as well (smaller GCs in the young generation result in shorter pauses than large collections in the tenured generation.) The following can be configured to affect the behavior and potentially improve the performance of a JVM (these settings can be configured in Blackboard’s main configuration file, blackboard\config\bb-config.properties):

NOTE: The following recommended configurations are the result of analyzing Blackboard’s tuning recommendations and case studies and the author’s experience. These are not the direct recommendations of Blackboard. Furthermore, these are recommended baseline configurations to be adjusted by a test-driven approach to tuning.

REQUIRED CONFIGURATIONS

Heap Size – The size of the heap is the combined size of the young and tenured generation. This value is limited by available hardware and the memory requirements of the OS and other applications. The heap should be as large as possible without constricting the host OS.

Parameters (should be set equal):

bbconfig.min.heapsize.tomcat=<value>

bbconfig.max.heapsize.tomcat=<value>

Permanent Generation Size – The permanent generation holds data that the JVM keeps for the life of the application, such as class metadata, rather than ordinary application objects.

Parameter:  bbconfig.max.permsize.tomcat=<value>

The recommended baseline combined heap and permanent generation size is approximately one-half the total available RAM. For every 4 GB of heap memory a minimum of 256 MB should be dedicated to the permanent generation. For instance, if the total available RAM is 24 GB, then the heap might be about 12 GB and the permanent generation would be 768 MB (3 x 256 MB.)
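The sizing rule above can be written as a small calculation. This is a sketch under two assumptions that are not stated in the text: the heap is taken as half of the RAM (as in the 24 GB example), and a partial 4 GB block of heap is rounded up to a whole block when sizing the permanent generation. The function name is hypothetical.

```python
import math


def baseline_memory(total_ram_gb):
    """Return (heap size in GB, permanent generation size in MB)."""
    heap_gb = total_ram_gb / 2
    # 256 MB of permanent generation per 4 GB of heap, rounding a
    # partial block up (the rounding is an assumption of this sketch)
    perm_mb = math.ceil(heap_gb / 4) * 256
    return heap_gb, perm_mb


print(baseline_memory(24))
print(baseline_memory(20))
```

With 24 GB of RAM this gives a 12 GB heap and a 768 MB permanent generation; with 20 GB it gives a 10 GB heap and, after rounding up, again 768 MB.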

Stack Size – The size of each thread’s stack. This size should be set based on the requirements of the specific version of the Blackboard application. The default stack size for a given version is sufficient, but Blackboard provides further analysis of suggested stack sizes for various versions in its knowledge base on the behind the blackboard website.

Parameter: bbconfig.max.stacksize.tomcat=<value>

Maximum Threads – The maximum number of threads permitted to exist within the heap. This cap should prevent the heap from running out of memory. The total number of threads that can exist without risking overflow is determined by the size of the Java heap and the maximum thread size.

The recommended baseline maximum number of threads is approximately 150 – 170 times the Java heap size (in GB.) Thus, if the heap is 2.5 GB an appropriate maximum threads setting would be 160 * 2.5 or 400 threads.

Parameter: bbconfig.appserver.maxthreads=<value>
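The thread guideline is simple arithmetic. As a sketch (the function names are hypothetical, and 160 is just the midpoint used in the example above):

```python
def baseline_max_threads(heap_gb, threads_per_gb=160):
    """Baseline thread cap for a heap of the given size in GB."""
    return round(heap_gb * threads_per_gb)


def max_threads_range(heap_gb):
    """Lower and upper end of the 150 to 170 threads-per-GB guideline."""
    return round(heap_gb * 150), round(heap_gb * 170)


print(baseline_max_threads(2.5))
print(max_threads_range(10))
```

For a 10 GB heap the guideline range is 1500 to 1700 threads.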

OPTIONAL CONFIGURATIONS

Young to Tenured Generation Ratio – Determines the amount of space available to each generation. The young generation should be large enough to ensure that objects are not prematurely tenured. It is important to fine tune this ratio but it is often better to configure a slightly larger young generation than necessary rather than risk objects being prematurely tenured.

The recommended baseline ratio of young to tenured generation is approximately 1:4. It is often more desirable to specify the size of the young generation explicitly because it allows for a more precise configuration (the tenured generation will be automatically sized to heap minus young generation.)

Parameters:

-XX:NewRatio=<value> – The ratio of the tenured generation to the young generation.

-XX:NewSize=<value> – The initial size of the young generation, e.g. 2500m for 2500 megabytes (tenured generation is inferred from heap and young generation sizes.)

-XX:MaxNewSize=<value> – The maximum size of the young generation, e.g. 2500m for 2500 megabytes (tenured generation is inferred from heap and young generation sizes.)
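To see how a NewRatio value translates into generation sizes, the heap can be split as young = heap / (NewRatio + 1). A minimal sketch, with a hypothetical function name:

```python
def generations_from_new_ratio(heap_mb, new_ratio):
    """Split a heap into (young, tenured) sizes in MB for a given NewRatio."""
    # NewRatio is tenured divided by young, so the heap is (new_ratio + 1) parts
    young_mb = heap_mb / (new_ratio + 1)
    tenured_mb = heap_mb - young_mb
    return young_mb, tenured_mb


print(generations_from_new_ratio(10000, 4))
print(generations_from_new_ratio(10000, 3))
```

For a 10000m heap, a NewRatio of 4 gives a 2000m young generation, while a NewRatio of 3 gives the 2500m young generation that the case study sets explicitly with NewSize and MaxNewSize.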

Survivor Space Size – Determines the amount of space available in each survivor space (and indirectly the size of Eden.) The survivor spaces should be large enough to ensure that objects are not prematurely tenured. The default value is adequate for a baseline configuration.

Parameter:

-XX:SurvivorRatio=<value> – The ratio of the size of Eden to the size of each survivor space (there are two.) For example a value of 10 means that each survivor space is 1/10 the size of Eden and therefore 1/12 the size of the young generation, which divides into 1/12 (space 0) + 1/12 (space 1) + 10/12 (Eden.)
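The survivor ratio arithmetic can be checked with a short calculation. This is a sketch with a hypothetical function name; the 1200 MB young generation is chosen only because it divides evenly.

```python
def young_generation_layout(young_mb, survivor_ratio):
    """Return (Eden size, size of each survivor space) in MB."""
    # the young generation is Eden plus two survivor spaces,
    # i.e. (survivor_ratio + 2) survivor-sized parts
    survivor_mb = young_mb / (survivor_ratio + 2)
    eden_mb = survivor_mb * survivor_ratio
    return eden_mb, survivor_mb


print(young_generation_layout(1200, 10))
```

A 1200 MB young generation with a ratio of 10 yields a 1000 MB Eden and two 100 MB survivor spaces.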

Tenuring Threshold – Determines the number of times that an object can be copied between survivor spaces before being tenured. This value should be high enough to ensure that as many objects as possible stay in the young generation as long as they are actively needed but low enough to ensure that objects are tenured when they are no longer needed or about to die. The default value is adequate for a baseline configuration.

Parameter:

-XX:MaxTenuringThreshold=<value> – The number of times that an object is copied before being tenured (the default is 31 on older JVMs; later HotSpot releases cap this value at 15.)

Garbage Collector Type – The size, lifespan, and manipulation of the objects that reside within the heap determine the effectiveness of garbage collection. There are a variety of garbage collectors to suit different applications (e.g. throughput collector, concurrent low-pause collector, incremental low-pause collector.) The default collector is adequate for a baseline configuration; however, after testing most institutions find that the concurrent low-pause collector is the most effective choice for Blackboard.

Parameters:

-XX:+UseParallelGC – The throughput collector uses the default tenured collector and a parallel version of the young generation collector.

-XX:+UseConcMarkSweepGC – The concurrent low-pause collector performs tenured collection concurrently with the execution of the application. The application is briefly paused during collection.

-XX:+UseParNewGC – Can be used in conjunction with the concurrent low-pause collector. This switch enables parallel young generation garbage collection alongside the concurrent collections and is valuable in multi-core or multi-processor environments.

-Xincgc – The incremental low-pause collector collects a portion of the tenured generation at each minor collection. It is slower than the default collector but minimizes long pauses from major collections.

Note that -XX:+UseParallelGC, -XX:+UseConcMarkSweepGC, and -Xincgc are mutually exclusive. Using any of these switches together will result in unpredictable behavior.
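Before applying a set of options it can be worth checking that no two of these mutually exclusive switches appear together. The following is a sketch only; the function names are hypothetical and the check looks at nothing beyond the three switches named above.

```python
EXCLUSIVE_COLLECTORS = ("-XX:+UseParallelGC", "-XX:+UseConcMarkSweepGC", "-Xincgc")


def selected_collectors(jvm_options):
    """Return the mutually exclusive collector switches found in an options string."""
    tokens = jvm_options.split()
    return [flag for flag in EXCLUSIVE_COLLECTORS if flag in tokens]


def has_collector_conflict(jvm_options):
    """True when more than one of the exclusive switches is present."""
    return len(selected_collectors(jvm_options)) > 1


print(has_collector_conflict("-XX:+UseConcMarkSweepGC -XX:+UseParNewGC"))
print(has_collector_conflict("-XX:+UseParallelGC -Xincgc"))
```

The first string combines the concurrent collector with -XX:+UseParNewGC, which is permitted, so no conflict is reported; the second combines two exclusive switches.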


Additional Switches – There are a very large number of additional JVM tuning options, most of which are not covered here. This article covers those options found to be most common and valuable to tuning the Blackboard application. Sun’s website should be consulted for more information and an exhaustive list of these options (e.g. http://blogs.sun.com/watt/resource/jvm-options-list.html.) None of these additional options should be introduced to the baseline configuration. They should be added and tested one at a time to determine their impact on performance.

Common Parameters:

-XX:+UseTLAB – Enables thread-local object allocation and is faster than the default atomic operation.

-XX:+UseISM – Enables intimate shared memory, which reduces the overhead of virtual to physical address translation when using larger heaps (Solaris only.)

-XX:+CMSParallelRemarkEnabled (only used in combination with ParNewGC) – Decreases remark pauses.

-XX:ParallelGCThreads=<value> – The number of parallel threads that the JVM uses to perform garbage collection in the young generation (the default is the number of processors.)

 

CASE STUDY

You are the Blackboard Server Administrator for a large institution of several hundred thousand users. Assuming that the application is already in place, tune the Blackboard application and its environment. The environment is as follows:

8 application servers

2 collaboration servers

8 processors per server (3.2 GHz)

20 GB RAM per server

Windows 2008 R2

Blackboard 9.1 Enterprise with Content and Community


The following could be an appropriate baseline configuration for each server:

Heap – 10000m

Perm Gen Size – 768m

Stack Size – 320k

Max threads – 1500

Young Generation – 2500m

Additional Switches – none

 

This would appear as the following in bb-config.properties:

bbconfig.appserver.maxthreads=1500

bbconfig.min.heapsize.tomcat=10000m

bbconfig.max.heapsize.tomcat=10000m

bbconfig.max.permsize.tomcat=768m

bbconfig.max.stacksize.tomcat=320k

bbconfig.jvm.options.extra.tomcat=-XX:NewSize=2500m -XX:MaxNewSize=2500m
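When the same baseline has to be written out for several servers, the lines above can be generated from the chosen values. This is a sketch: the function name and its parameters are hypothetical, and only the property keys are taken from the configuration shown above.

```python
def baseline_properties(heap_mb, perm_mb, stack_kb, max_threads, young_mb):
    """Build the bb-config.properties lines for a baseline configuration."""
    return [
        "bbconfig.appserver.maxthreads=%d" % max_threads,
        # minimum and maximum heap are set equal, as recommended above
        "bbconfig.min.heapsize.tomcat=%dm" % heap_mb,
        "bbconfig.max.heapsize.tomcat=%dm" % heap_mb,
        "bbconfig.max.permsize.tomcat=%dm" % perm_mb,
        "bbconfig.max.stacksize.tomcat=%dk" % stack_kb,
        "bbconfig.jvm.options.extra.tomcat=-XX:NewSize=%dm -XX:MaxNewSize=%dm"
        % (young_mb, young_mb),
    ]


for line in baseline_properties(10000, 768, 320, 1500, 2500):
    print(line)
```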

 

The next step is to load test the application with the baseline configuration. From the load test the application responsiveness should be recorded. It is also valuable to analyze and record garbage collection performance, which is logged in blackboard\logs\tomcat\gc.log:

Total time for which application threads were stopped: 0.0000696 seconds

Application time: 119.2037722 seconds

Total time for which application threads were stopped: 0.0003771 seconds

Application time: 600.1297291 seconds

Total time for which application threads were stopped: 0.0006816 seconds

Application time: 0.0184405 seconds

Total time for which application threads were stopped: 0.0000854 seconds

Application time: 335.6940502 seconds

81399.961: [GC 81399.961: [ParNew: 249951K->3801K(276480K), 0.0107200 secs] 585391K->339241K(1198080K), 0.0108022 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]

Total time for which application threads were stopped: 0.0112471 seconds

Application time: 0.0004188 seconds

Total time for which application threads were stopped: 0.0001240 seconds

Application time: 0.0000176 seconds

Total time for which application threads were stopped: 0.0000670 seconds

Application time: 0.0000276 seconds

Total time for which application threads were stopped: 0.0000674 seconds

Application time: 0.0000260 seconds

Total time for which application threads were stopped: 0.0000676 seconds

Application time: 0.0000139 seconds

“Total time for which application threads were stopped” indicates an application pause.

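“81399.961: [GC 81399.961: [ParNew: 249951K->3801K(276480K), 0.0107200 secs] 585391K->339241K(1198080K), 0.0108022 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]” indicates a minor (young generation) garbage collection performed by the ParNew collector. A full garbage collection is labeled “Full GC” in the log instead.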
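Recording these figures by hand is tedious, so a short script can total them. The following is a sketch that assumes the log lines look exactly like the sample above; the function name and the summary keys are hypothetical, and other JVM versions or logging options may produce different formats.

```python
import re

STOPPED = re.compile(r"Total time for which application threads were stopped: ([\d.]+) seconds")
RUNNING = re.compile(r"Application time: ([\d.]+) seconds")
PARNEW = re.compile(r"\[ParNew: (\d+)K->(\d+)K\((\d+)K\), ([\d.]+) secs\]")


def summarize_gc_log(lines):
    """Total up pause time, running time, and ParNew collections."""
    summary = {"stopped": 0.0, "running": 0.0, "longest_pause": 0.0, "parnew": 0}
    for line in lines:
        stopped = STOPPED.search(line)
        running = RUNNING.search(line)
        if stopped:
            pause = float(stopped.group(1))
            summary["stopped"] += pause
            summary["longest_pause"] = max(summary["longest_pause"], pause)
        elif running:
            summary["running"] += float(running.group(1))
        elif PARNEW.search(line):
            summary["parnew"] += 1
    return summary


sample = [
    "Application time: 335.6940502 seconds",
    "81399.961: [GC 81399.961: [ParNew: 249951K->3801K(276480K), 0.0107200 secs] "
    "585391K->339241K(1198080K), 0.0108022 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]",
    "Total time for which application threads were stopped: 0.0112471 seconds",
    "Application time: 0.0004188 seconds",
]
print(summarize_gc_log(sample))
```

Because the function accepts any iterable of lines, an open file object for gc.log could be passed in place of the sample list.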

After the baseline configuration is tested and the results recorded, tests of individual variables can begin. The most obvious method of tuning is to change the value of one variable, load test, and repeat. However, it is possible to save a great deal of time in a load balanced environment with identical application servers (ideally clones.) Since the test applies a controlled load, setting the load balancer to a round-robin routing configuration will apply an identical load to each of the identical servers. Configuring a single variable differently on each node allows the complete testing of that variable with a single load test. In our example one might choose to tune the size of the young generation first, setting it as follows on each of the eight application servers: 2300, 2350, 2400, 2450, 2550, 2600, 2650, 2700 (the baseline test for 2500 already exists.) If ideal performance does not fall in this range (performance continues to increase from 2300 to 2700 or from 2700 to 2300) another test should be performed in another range. If the test of, say, 2650 performed the best, one might wish to narrow the value further by running a test of: 2610, 2620, 2630, 2640, 2660, 2670, 2680, 2690.
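The per-server values for such a test can be generated rather than typed. As a sketch (the function name is hypothetical), the following spreads one value per server evenly around a baseline that has already been tested, here the 2500m young generation from the baseline configuration:

```python
def spread_values(baseline, step, servers):
    """Return one test value per server, spaced evenly around the baseline.

    The baseline itself is skipped because it has already been tested.
    """
    half = servers // 2
    below = [baseline - step * i for i in range(half, 0, -1)]
    above = [baseline + step * i for i in range(1, servers - half + 1)]
    return below + above


print(spread_values(2500, 50, 8))
print(spread_values(2650, 10, 8))
```

The second call shows the narrowing pass: once a value stands out, the same function is reused with that value as the new center and a smaller step.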

Once the first variable has been adequately tested and tuned it should be set back to its baseline configuration and a test of the next variable should begin. This process should proceed until all variables that the admin wishes to test are tuned. Optionally, the system can be tuned further by using the new configuration as a new baseline and repeating the testing and tuning process as many times as needed. This helps to account for the unpredictable cumulative effect that changes to multiple variables can have. For example, the configuration of two separate variables may independently improve performance, but the combined change may not yield the same performance gain or, worse, may cause a performance loss.

A sample configuration of the tuned JVM might be:

bbconfig.appserver.maxthreads=1500

bbconfig.min.heapsize.tomcat=10000m

bbconfig.max.heapsize.tomcat=10000m

bbconfig.max.permsize.tomcat=768m

bbconfig.max.stacksize.tomcat=320k

bbconfig.jvm.options.extra.tomcat=-XX:NewSize=2500m -XX:MaxNewSize=2500m -XX:+CMSParallelRemarkEnabled -XX:+UseTLAB -XX:ParallelGCThreads=6 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:SurvivorRatio=16 -XX:MaxTenuringThreshold=28

Part 1 – Blackboard Performance Tuning: An Iterative Approach

This is Part 1 of 3 in this series:

Part 1 – Overview & Architecture

Part 2 – JVM Tuning Methods

Part 3 – Additional Tuning Methods


PREFACE

Blackboard provides performance tuning documentation for its application in the document library at the behind the blackboard website (https://behind.blackboard.com.) Blackboard’s guidelines are a valuable starting point and are a sufficiently comprehensive solution for smaller institutions. However, a test-driven approach that relies on an iterative cycle of tuning and load tests will yield more substantial performance gains and adapt to the variety of host architectures. The following steps are a top-level overview of an iterative approach to tuning the Blackboard application and its host environment:

  1. Use Blackboard and Java tuning practices as a guideline to develop a baseline configuration.
  2. Isolate a single configuration variable (e.g. garbage collector choice, heap generation ratio, survivor space size, optional switches, etc.)
  3. Load test a variety of configurations for the chosen variable (while maintaining the base configuration for all other variables) to determine the most effective setting.
  4. Repeat 2-3 for all variables.

Following these steps will provide a more efficient configuration than the base configuration. However, changing one variable may alter the effect of a change in another variable. The optional steps below account for the interdependency of the variables and home in on the most effective configuration possible:

  5. Use the tuned configuration arrived at in step 4 as the new baseline configuration.
  6. Repeat steps 2-4.
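The cycle above can be sketched as a loop. This is an illustration rather than a tool: `measure` is a hypothetical stand-in for running a load test and returning a response time (lower is better), and the variable names in the usage example are assumptions.

```python
def tune_once(baseline, candidates, measure):
    """One pass: tune each variable alone while the others stay at baseline."""
    tuned = dict(baseline)
    baseline_score = measure(baseline)
    for name, values in candidates.items():
        best_value, best_score = baseline[name], baseline_score
        for value in values:
            trial = dict(baseline)  # every other variable keeps its baseline value
            trial[name] = value
            score = measure(trial)
            if score < best_score:
                best_value, best_score = value, score
        tuned[name] = best_value
    return tuned


def tune(baseline, candidates, measure, passes=2):
    """Optional extra passes: use each tuned result as the next baseline."""
    config = dict(baseline)
    for _ in range(passes):
        config = tune_once(config, candidates, measure)
    return config


# A made-up scoring function standing in for a real load test.
def fake_load_test(config):
    return abs(config["young_mb"] - 2600) + abs(config["survivor_ratio"] - 16)


start = {"young_mb": 2500, "survivor_ratio": 8}
options = {"young_mb": [2400, 2600], "survivor_ratio": [6, 16]}
print(tune(start, options, fake_load_test))
```

Testing each variable against the unchanged baseline keeps the individual results comparable, and the repeated passes are what pick up interactions between variables.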


ARCHITECTURE

To understand effective performance tuning configurations of the Blackboard application and its host environment it is valuable to define the function and architecture of the application and its environment. The following is a top-down list of the resources that impact net performance of the application:

  1. Software
    • Blackboard Application
    • JVM
    • Tomcat
    • Web Server (IIS, Apache)
  2. Operating System and Services
  3. Server Hardware
  4. Network Architecture and Hardware

The Blackboard application is a collection of JavaServer Pages and Servlets (accompanied by occasional Perl.) This code is nothing more than a set of instructions. The performance of the application is determined by the methods used to execute the instructions. It is executed within a virtual machine called a Java Virtual Machine (JVM.) The JVM provides an environment capable of executing Java bytecode (compiled Java source code) and storing information and complex data structures within its heap.

The web server and Tomcat serve requests to the Blackboard application. The web server hosts websites and delivers webpages to users, in this case the Blackboard website. However the web server does not directly make requests or receive responses from the JVM. The web server interfaces with Tomcat, a servlet container, which in turn interfaces with the JVM. Tomcat behaves as a Java-exclusive web server and acts as an adapter between the web server and the JVM.

The Operating System interfaces with the hardware on which it resides to expose the resources to Blackboard. The greatest effect that the operating system has on performance will be in the choice of operating system. This article will not promote any operating system over another and acknowledges that each has advantages (for simplicity this series’ examples will assume Windows Server OS.) Beyond the choice of operating system, OS tuning may impact Blackboard (e.g. processor scheduling, virtual memory configuration and paging file size), and these settings should be considered variables in the tuning process.

Network and server hardware are peripheral to the discussion of tuning the Blackboard application. These resources usually exist by the time tuning is an active topic of concern for an institution. For those institutions interested in hardware sizing, this article will not discuss the topic for it is worthy of its own comprehensive discussion. Blackboard’s documentation on hardware sizing can be found on the behind the blackboard website (https://behind.blackboard.com.) The purpose of mentioning both network and server in this discussion is the influence each has on tuning:

The hardware of individual servers directly affects the resources available to the Blackboard application and related processes. The number of processors and the speed of the processors and bus impact processing speed, internal communication, memory management, garbage collection, etc. Memory speed and quantity affect the potential size and composition of the Blackboard heap (JVM) and Blackboard memory management and garbage collection.

The network determines the load and rate at which resources on different servers can communicate with one another, external resources, and the end user. The larger the institution, the greater the potential impact its network infrastructure will have on performance, because hardware resources are often discrete and many.
