CAT Chat Minutes

November 4, 2002

 

With deep sadness, Roger Klaffky announced the passing of Tony Rauchas, AOD Division Director, on Sunday evening. As information is gathered, an e-mail notice will be sent regarding visitation and funeral arrangements. Tony will be sincerely missed.

 

Information and Follow-up Items

There was one fault in the front end, similar to one the User had experienced previously. One hour was lost in getting the beam up and running again. Can this lost time be reduced?

Kevin Beyer will look into this.

When the gateway is down can APS notify Users?

Yes, a notice is automatically sent to CATs, but it should be arranged so that a notice goes to floor coordinators at the same time. Ned Arnold will investigate.

A User computer appears to have been security scanned. Was this a scheduled scan, given that scans are usually done only during shutdown? The User checked the source of the scan and reports it to be ANLSCAN6.

APS computer security contact, Bill McDowell, was notified. His reply: the Argonne Cyber Security office also runs security scans on machine studies days. ANLSCAN6 is an ANL Cyber Security computer. Once we know the date on which this scan took place, we can communicate the designated machine studies dates to that office.

A User is having trouble with VPN log-in.

Ken Sidorowicz will be notified. Floor coordinators will set up a meeting with Ken and the affected CAT.

Watchman Data: Follow-up on Watchman data from Marcia Wood, AOD-MIS:
Files have been loaded as of October 31, 2002 (shift 3).
Data can be retrieved from Oracle based on station, sector, etc.
The goal was to get a program written to load the data.

Next steps:
Create a process to automatically move shift files to Oracle; create record-level security to allow querying of this data via a web-based user interface.

 

User Operations

Operation Stats: Glen Decker reported 98.76% availability. Mean Time Between Faults (MTBF) is 49.26 hours. There were five hours of downtime out of 400. Five faults were reported, with the following causes: vacuum valve close, corrector 19BH1 trip, corrector 19BH1 trip, Sector 13A:Q5 trip, and fill ongoing.
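For readers who want to check the availability figure, the following is a minimal sketch of the usual calculation, assuming availability is simply scheduled hours minus downtime, divided by scheduled hours. The function name is hypothetical, and the inputs are the rounded hours quoted in these minutes, so the result may differ slightly from the reported 98.76%.

```python
def availability_percent(scheduled_hours: float, downtime_hours: float) -> float:
    """Return availability as a percentage of scheduled hours.

    Assumes availability = (scheduled - downtime) / scheduled.
    """
    return 100.0 * (scheduled_hours - downtime_hours) / scheduled_hours


# Rounded figures from the minutes: 5 hours of downtime out of 400.
print(round(availability_percent(400, 5), 2))
```

With the rounded inputs this gives about 98.75%; the small gap from the reported value is presumably due to rounding of the hours.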

 

General Information

Future Operation Modes Workshop: Reminder that the FOM Workshop will be held on November 6th, beginning at 8:00 a.m. and continuing until noon. If further discussion is needed, the meeting will continue in the B4100 conference room in building 401.

When the workshop is over, a website will be created containing links to individual presentations.

Oracle Database Backup: Steve Leatherman of AOD-MIS explained that there are two scheduled backups: the Oracle database backup at 10:00 p.m., which lasts approximately one hour, and the system backup, which begins at 12:00 a.m. See the handout, which has been added to the end of today's minutes.

When the new software and hardware are in place (in about two to three weeks), they will back up the Oracle database between 10:00 p.m. and midnight and expect to reduce the backup time to five to ten minutes.

Beamline Report: Roger presented a sample report on a viewgraph. He explained the floor coordinators' station recordings.


Oracle Database Backup

The Oracle database is shut down during nightly backups. The process of copying the database files takes about one hour. During that time the database is unavailable for use. The Oracle database backup must complete prior to the initiation of the system backup, which begins at 12:30 a.m. nightly and runs throughout the early morning hours. There are occasions when the nightly system backup is still running when users come into work in the morning.

Historically, the MIS department has moved the timing of the database backup several times to accommodate various constituents. Also, various batch loading jobs run from 5 p.m. until 9 p.m. These are data loading programs, which would interfere with interactive use if run during the daytime.

Currently the database backup executes at 10 p.m. and lasts for one hour. It can be moved to begin as late as 11 p.m. and still finish prior to the system backup. It cannot be moved to after the system backup, as that would leave the division vulnerable with a database that is not backed up (for the latest 24-hour period). So, the backup must start and complete between the hours of 10 p.m. and midnight.
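The scheduling constraint above can be sketched as a short calculation. This is only an illustration: the function names and calendar dates are hypothetical, and the window boundaries (10 p.m. and midnight) and the one-hour duration are taken from the paragraph above.

```python
from datetime import datetime, timedelta


def latest_backup_start(deadline: datetime, duration: timedelta) -> datetime:
    """Latest time the database backup can start and still finish by the deadline."""
    return deadline - duration


def fits_window(start: datetime, duration: timedelta,
                earliest: datetime, deadline: datetime) -> bool:
    """True if a backup starting at `start` begins and ends inside the window."""
    return earliest <= start and start + duration <= deadline


# Hypothetical dates; times follow the paragraph above.
window_open = datetime(2002, 11, 4, 22, 0)   # 10 p.m.
window_close = datetime(2002, 11, 5, 0, 0)   # midnight
one_hour = timedelta(hours=1)

print(latest_backup_start(window_close, one_hour).strftime("%H:%M"))
print(fits_window(datetime(2002, 11, 4, 23, 30), one_hour, window_open, window_close))
```

With a one-hour backup the latest start works out to 11 p.m., and a start at 11:30 p.m. would not fit the window.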

MIS is very close to implementing a solution to the problem of the one-hour service outage. We are in the process of mirroring the database files. Without going into detail, the backup will still take as long, but users will not experience the one-hour outage. The outage duration will be shortened to about two to five minutes. The backup will still have to occur within the time window stated above, but the downtime will only be a few minutes.