AMPERIF ERROR RECOVERY
Document Type:
Collection:
Document Number (FOIA) /ESDN (CREST):
CIA-RDP94T00858R000600880001-7
Release Decision:
RIPPUB
Original Classification:
K
Document Page Count:
8
Document Creation Date:
December 28, 2016
Document Release Date:
February 12, 2008
Sequence Number:
1
Case Number:
Publication Date:
April 13, 1983
Content Type:
REPORT
Body:
Approved For Release 2008/02/12 : CIA-RDP94T00858R000600880001-7
NDS OPERATIONS PROCEDURE MANUAL SYSTEMS SW & HW
NO. P-AO03 13 April 1983
ORIGINATOR: I STAT
DATA CENTER OPERATIONS BRANCH
DCOB OPERATIONAL PROCEDURE MANUAL OPERATIONS - GENERAL
No. 50-0009. 25 February 1983
AMPERIF CACHE DISC ERROR RECOVERY
PURPOSE:
1. This will establish a procedure to follow when an error occurs with the
Amperif Cache Disc Subsystems.
REFERENCES:
2. The following references are incorporated within these procedures: the
Amperif Cache Disc Operators Manual and the Univac 5046/8434 Operators Manual.
PROCESSING:
3. See the attached documents.
Attachment: a/s
STAT
CACHE ERROR RECOVERY:
There are three major areas where an error can occur in a cache disk
sub-system. They are:
1. Control Units
2. Disc Drives
3. Cache Memory
Before any repair or recovery action is taken, it must first be determined
where the error is occurring. The operator should know what types of errors can
occur in a disc sub-system; for an understanding of these errors, refer to the
Univac Operators Manual on disc sub-systems.
In the process of determining where an error has occurred some initial
procedures should be followed:
1. Only under the direct supervision of a customer engineer and OCO
should the cache memory be initialized. Once the memory has been initialized
all data that was in cache will be lost and cannot be recovered.
2. Do not power off any cache modules or control units. This will
destroy the data that is in cache and it cannot be recovered.
3. Before doing any resetting, double-check the toggle registers for the
type of reset you are about to do.
4. A quick survey of the control unit(s) status and any outstanding
error messages should be recorded. This will give the C.E.'s additional
information when they arrive.
5. A call should be put into the C.E.'s when an error occurs so that
they can give you additional information and/or help via telephone for
recovery procedures.
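The status survey in step 4 can be kept as a small structured record so that nothing is lost before the C.E. arrives. The following is a minimal Python sketch and is not part of the original procedure; the class name, the field names and the sample values are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ErrorSurvey:
    """Hypothetical record of what the operator observed when the error occurred."""
    control_unit: str   # control unit named in the error message (label assumed)
    path: str           # path named in the error message (label assumed)
    drive: str          # logical address of the disc drive (format assumed)
    messages: List[str] = field(default_factory=list)  # outstanding console messages

    def summary(self) -> str:
        """One-line summary that could be read to the C.E. over the telephone."""
        return (f"CU {self.control_unit}, path {self.path}, drive {self.drive}: "
                f"{len(self.messages)} outstanding message(s)")


survey = ErrorSurvey(control_unit="B", path="1", drive="D03",
                     messages=["hypothetical console error text"])
print(survey.summary())
```

Recording the control unit, path and drive together mirrors the three items the text says are contained in the error message.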
In determining where an error occurs one should first look at the disc drive
in question (the logical address of the disc drive, control unit and path will be
contained in the error message). The disc drive is the most probable cause of
error. Being mostly a mechanical device, it has the highest chance of an error
occurring on it.
If an error does occur, a few items should be checked to determine whether the
disc drive is where the error is occurring.
1. Are any fault lights lit on the operator's indicator panel?
2. Is the drive powered on and in a ready condition?
3. Is the write-protect switch on?
4. Are the fans blowing, indicating power to the disc drive?
5. Is a pack in place and the lid closed (only for non-Winchester
type disc drives)?
If any of the above checks reveals a problem, then appropriate action
should be taken to correct the condition. (See Attachment)
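The five drive checks and their first corrective actions can be expressed as a simple lookup. This is a minimal Python sketch under stated assumptions: the condition keys are hypothetical names chosen for illustration, and the action strings paraphrase the attachment rather than quote it.

```python
from typing import Dict, List

# First action for each fault condition, paraphrased from the attachment.
# The dictionary keys are hypothetical names invented for this sketch.
FIRST_ACTION: Dict[str, str] = {
    "fault_light_on": "Reset fault light and answer outstanding messages",
    "not_powered_or_ready": "Power on disc drive and press ready",
    "write_protect_on": "Remove write-protect and answer messages",
    "fans_not_blowing": "Check power to the drive, then try to recycle it",
    "pack_missing_or_lid_open": "Put pack into drive and start drive",
}


def drive_actions(observed: Dict[str, bool]) -> List[str]:
    """Return the first corrective action for every condition observed as true.

    An empty list means none of the drive checks applied, so the problem is
    more likely in a control unit, cache, or path.
    """
    return [action for cond, action in FIRST_ACTION.items() if observed.get(cond)]


print(drive_actions({"write_protect_on": True}))
```

An empty result corresponds to the case described later, where none of the listed drive conditions has occurred.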
Once the action has been taken, answer any outstanding console messages with
an "A" and the system should return to normal. If the same type of problem then
occurs on another drive through the same path and/or control unit, further
action must be taken: there is a high possibility that the problem is occurring
in one of the other areas.
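The rule above (the same problem on more than one drive through the same path or control unit points away from the drives) can be sketched as a grouping of recorded errors. This is an illustrative Python sketch only; the tuple layout and the drive, control unit and path labels are assumptions, not console output formats from the manuals.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Each error is (drive, control_unit, path); all labels are hypothetical.
ErrorEvent = Tuple[str, str, str]


def shared_components(errors: Iterable[ErrorEvent]) -> List[str]:
    """Name the control units and paths on which more than one drive has failed."""
    drives_by_component: Dict[str, Set[str]] = defaultdict(set)
    for drive, control_unit, path in errors:
        drives_by_component[f"control unit {control_unit}"].add(drive)
        drives_by_component[f"path {path}"].add(drive)
    return sorted(name for name, drives in drives_by_component.items()
                  if len(drives) > 1)


events = [("D01", "A", "1"), ("D02", "A", "2")]
print(shared_components(events))
```

Here two different drives failed through control unit A on different paths, so only the control unit is flagged as a shared suspect.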
If none of the previously mentioned situations have occurred the problem will
probably be associated with a control unit, cache and/or path.
The following suggestions may help in isolating the problem.
In dealing with cache disk systems, certain errors or conditions can occur
where a loss of data will be incurred. This is an unfortunate fact: no matter
how many safeguards are introduced to the system, there is always the possibility
of losing data. Some instances where data loss is inevitable are:
1. Loss of a power supply to a cache module.
2. Uncorrectable errors in cache.
3. Loss of a memory board and/or memory control board.
4. Hard errors in the file register board (i.e., FRB parity errors, etc.).
If one of the above conditions does occur, systems (S.P.S.) and a customer
engineer should be notified immediately. It should be noted that, without the aid
of some special tools and diagnosis, the person trying to locate the source of
the problem will not know which condition is occurring, with the exception of
the hard FRB error. He will know only, through his observation of the display
readouts, that a potential problem exists where loss of data will be incurred.
In trying to isolate control units (this also includes path problems) and
cache memory problems, the operator should attempt to put all of the disc drives
into a bypass mode. If this is done successfully and processing continues
normally, it is advisable to leave all of the drives in a bypass mode and record
any errors that have been logged. At this point the C.E. should be called and
time scheduled to look at the problem.
There will be the possibility of errors occurring where the data in cache
cannot be written back to the disc drive. An attempt should be made to do the
following, if the disc drive has no errors:
1. A data copy should be attempted and write-protect placed on the
data copy drive. Refer to the Amperif Cache Disc Operators Manual for data copy
procedures.
2. Disable control unit B and do a bypass all.
a. If it works, leave control unit B disabled and
down via software (console).
b. If the problem recurs, do the above procedure (a) for
control unit A.
3. If the above procedure does not work notify systems personnel and
C.E.'s.
4. At this point the data is stored in the data copy and is good. Cache
can be re-initialized and data restored to cache from the data copy drive.
Another attempt can be made to place all the drives into bypass after the data
has been restored.
5. If the disc drive has an error, attempt to move the pack and I.D.
plug to a spare disc drive and try again to put the drives into bypass.
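The write-back recovery steps above can be read as a decision sequence. The following Python sketch is one hedged interpretation of that sequence, not an Amperif or Univac tool; the function name and its parameters are hypothetical, and `None` is used here to mean "this attempt has not been made yet".

```python
from typing import Optional


def writeback_next_step(drive_has_error: bool,
                        data_copy_done: bool,
                        bypass_ok_with_b_disabled: Optional[bool] = None,
                        bypass_ok_with_a_disabled: Optional[bool] = None) -> str:
    """Return the next step to try; None means that attempt has not been made."""
    if drive_has_error:
        # Step 5: the drive itself is at fault.
        return "Move pack and I.D. plug to a spare drive, then retry bypass"
    if not data_copy_done:
        # Step 1: protect the data before touching the control units.
        return "Attempt data copy and write-protect the data copy drive"
    if bypass_ok_with_b_disabled is None:
        return "Disable control unit B and do a bypass all"
    if bypass_ok_with_b_disabled:
        return "Leave control unit B disabled and down via console"
    if bypass_ok_with_a_disabled is None:
        return "Repeat the procedure for control unit A"
    if bypass_ok_with_a_disabled:
        return "Leave control unit A disabled and down via console"
    # Step 3: neither attempt worked.
    return "Notify systems personnel and C.E.'s"


print(writeback_next_step(drive_has_error=False, data_copy_done=True))
```

Re-initializing cache and restoring from the data copy drive (step 4) is deliberately left out of the sketch, since the procedure reserves cache initialization for direct supervision by a customer engineer and OCO.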
Control Units
In isolating control units, as previously mentioned, all drives should be
placed in bypass if possible. Once this has been accomplished, we can begin to
isolate which control unit and/or path is causing problems.
The procedure to be followed for isolating a control unit is as follows:
1. Before attempting to determine which control unit may be bad, check
the shut-down authorization light coming from the U.P.S. If this light is on,
place the U.P.S. into bypass mode and reset the control units, but do not
initialize cache. Answer all outstanding messages. The system should go back
to normal. It should be noted that the U.P.S. is defective and should be
looked at. It is not advisable to put disk drives in a caching mode; they
should be placed in either write-thru or bypass mode. If a problem persists,
then continue with the following steps.
2. Down, via the console, the control unit associated with the outstanding
error message.
3. Answer the message with the proper response. If the error does not
return and normal processing continues, the downed control unit and/or path is
bad. Time should be scheduled for emergency maintenance.
4. If the error message does continue, do the above procedure (#3) for the
other control unit.
a. If the error still continues, it is possible that a disc drive
or other problem exists.
5. If a control unit does prove to be bad, it would be useful for the
customer engineer to know whether the problem stays with one path or both.
6. In determining which path may be bad, down the path on the bad control
unit that is associated with the outstanding error message. Up the control unit
and see if the problem goes to the other path. If so, do the same procedure
for the other path. If the problem goes through both paths of the control unit,
you can be certain that there is a control unit problem. Leave the control
unit down and all the drives in bypass, and schedule emergency maintenance.
7. If, in the last step, the problem stayed with a path, one can isolate the
problem further, that is, determine whether the problem lies with the path from
the XFER switch to the control units or with a path from an IOU to the XFER switches.
8. When isolating a path problem, one must ensure that both IOU's in
question (the associated IOU's on the XFER switch) are in the system. If they are
not, then notify the OCO's that a configuration change must be made.
9. To determine which path is bad, or whether both are, down the control unit
and both paths in question. Switch the XFER switch and up the associated path and
control unit. If no problems occur, then the likelihood is that the XFER switch
or the path from the IOU's to the XFER switch is bad. If the problem remains, it
will be associated with the control unit or the path from the XFER switch to the
control unit.
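The conclusions of steps 6 and 9 reduce to two observations: whether the problem appears on both paths of the control unit, and whether it remains after the XFER switch is switched. The Python sketch below encodes only that reasoning; the function name and parameter names are hypothetical, and the returned strings are paraphrases of the text.

```python
def locate_path_fault(problem_on_both_paths: bool,
                      problem_after_xfer_switch: bool) -> str:
    """Classify the likely fault location from the observations in steps 6 and 9.

    problem_on_both_paths: the error appeared on both paths of the control unit.
    problem_after_xfer_switch: the error remained after the XFER switch was
        switched and the associated path and control unit were brought back up.
    """
    if problem_on_both_paths:
        # Step 6: both paths affected means the control unit itself is suspect.
        return "control unit"
    if problem_after_xfer_switch:
        # Step 9: problem remains after switching.
        return "control unit or path from XFER switch to control unit"
    # Step 9: no problem after switching.
    return "XFER switch or path from IOU to XFER switch"


print(locate_path_fault(problem_on_both_paths=False,
                        problem_after_xfer_switch=False))
```

The sketch assumes step 8 has already been satisfied, that is, both associated IOU's are in the system before the XFER switch is changed.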
DISK DRIVE ERRORS

FAULT CONDITION I: Fault light indicated on operators panel.
ACTION TO BE TAKEN:
1. Reset fault light and answer any outstanding messages.
2. Recycle drive.
a. If fault does not appear, answer any outstanding messages.
b. If fault remains, remove pack and I.D. plug to known good spare
disc drive and answer any outstanding messages.*

FAULT CONDITION II: Is the disc drive powered on and ready.
ACTION TO BE TAKEN:
1. Power on disc drive and press ready.
a. If drive comes ready, answer message with an "A"; system should go
back to normal.**
b. If drive does not come ready or fault occurs, remove pack and I.D.
plug and place on a known good drive.*

FAULT CONDITION III: Is the write-protect switch on.
ACTION TO BE TAKEN:
1. Remove write-protect and answer messages.**

FAULT CONDITION IV: Are fans blowing (i.e., is there power to the drive).
ACTION TO BE TAKEN:
1. If power is to the disc drive, then try to recycle drive.
a. If drive comes ready and you are able to answer message with an "A",
proceed as normal.**
b. If drive does not come ready, then remove pack and I.D. plug to a
known good spare drive.*
c. If no power is going to disk drive, the associated circuit breaker
should be checked and put back on if tripped. If it continues to trip,
move the pack, etc.*

FAULT CONDITION V: No pack or lid not closed.
ACTION TO BE TAKEN:
1. Put pack into drive, make sure drive is to be in system, start drive and
answer with an "A".

*NOTE: A C.E. should be called and notified of the problem.
**NO NEED TO CALL C.E.
Attachment "A"