Sunday, June 19, 2011

Problem Determination Steps

The following questions should be considered and, if possible, answered when trying to diagnose a problem:

1. What is the problem?
2. Where did it occur?
3. When did it begin happening?
4. What action was being performed?
5. Were any messages issued?

Check the server activity log for error messages.

If the server activity log contains error messages, review the entries from 30 minutes before to 30 minutes after the time the error message was issued. The problem encountered is often a symptom of another problem, and the other messages issued around that time may help isolate the underlying cause.

Did the Explanation or User Response section of the TSM message offer any suggestions on how to resolve the problem?

6. How frequently does this error occur?
7. Check any system error logs:

On Windows(R): Check the application log.
On AIX(R) and other UNIX(R) platforms: Check the error report.

8. Check with others who may have made changes in the environment that could affect TSM. People to ask in a typical IT environment include:

SAN Administrator
Network Administrator
Database Administrator
Client or machine owners

9. Check the TSM error logs. These include the following:

dsmserv.err - Server error file. This is located on the same machine as the server. The dsmserv.err file is typically in the server install directory. Note that the storage agent may also create a dsmserv.err file to report errors.
dsmerror.log - Client error log. This is located on the same machine as the client.
dsmsched.log - Client log for scheduled client operations. This is located on the same machine as the client.
db2diag.log, db2alert.log, userexit.log - DB2(R) log files. These are useful when troubleshooting a problem with backing up a DB2 database using Tivoli Data Protection for DB2. These are located on the machine where DB2 is installed.

tdpess.log - Default error log file used by the Data Protection for Enterprise Storage Server(R) client.
tdpexc.log - Default error log file used by the Data Protection for Exchange client.
dsierror.log - Default error log for the client API.
tdpoerror.log - Default error log for the Data Protection for Oracle client.
tdpsql.log - Default error log for the Data Protection for SQL client.

10. Verify that devices are still accessible to the system and to TSM.
11. Search the online Knowledge Base for matching error messages or problem descriptions.
12. Test other operations to better determine the scope and impact of the problem. This may also help to determine if it is a specific sequence of events that causes the problem.
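
The advice in step 5 (look 30 minutes before and after the error) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the activity log has been saved as text with each line starting with a "MM/DD/YYYY HH:MM:SS" timestamp, and that message IDs follow the usual prefix, four-digit number, severity-letter shape (for example ANRnnnnE). The function names, the sample lines, and the message numbers in them are hypothetical and are not taken from any real log.

```python
import re
from datetime import datetime, timedelta

# Assumed line layout: "MM/DD/YYYY HH:MM:SS <message text>".
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+(.*)$")
# Assumed message ID shape: three-letter prefix, four digits, severity letter.
MSG_ID_RE = re.compile(r"\b([A-Z]{3}\d{4}[IWES])\b")


def parse_line(line):
    """Return (timestamp, message_id, text), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if not match:
        return None
    timestamp = datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S")
    text = match.group(2)
    id_match = MSG_ID_RE.search(text)
    return timestamp, (id_match.group(1) if id_match else None), text


def messages_near(lines, error_time, minutes=30):
    """Keep warning, error, and severe entries within +/- `minutes` of error_time."""
    window = timedelta(minutes=minutes)
    hits = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip headers, blank lines, and anything unparseable
        timestamp, msg_id, text = parsed
        if msg_id is None or msg_id[-1] not in "WES":
            continue  # ignore informational messages
        if abs(timestamp - error_time) <= window:
            hits.append((timestamp, msg_id, text))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        "06/19/2011 09:20:00 ANR0001E hypothetical error, too early",
        "06/19/2011 09:45:10 ANR0002W hypothetical warning",
        "06/19/2011 09:50:00 ANR0003I hypothetical informational message",
        "06/19/2011 10:00:00 ANR0004E hypothetical reported error",
    ]
    reported = datetime(2011, 6, 19, 10, 0, 0)
    for timestamp, msg_id, text in messages_near(sample, reported):
        print(timestamp, msg_id, text)
```

Listing the warnings and errors around the reported failure in time order makes it easier to see whether an earlier message points to the underlying cause. The same filter could be pointed at dsmerror.log or dsmsched.log, provided the timestamp pattern is adjusted to match those files.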
