PHASE II FINAL REPORT VOLUME V SYSTEM ORGANIZATION, FUNCTIONS, AND PROCEDURES
Document Type:
Collection:
Document Number (FOIA) /ESDN (CREST):
CIA-RDP78-03952A000100050001-7
Release Decision:
RIPPUB
Original Classification:
S
Document Page Count:
493
Document Creation Date:
November 16, 2016
Document Release Date:
February 3, 2000
Sequence Number:
1
Case Number:
Publication Date:
March 1, 1965
Content Type:
REPORT
File:
Attachment | Size |
---|---|
CIA-RDP78-03952A000100050001-7.pdf | 20.08 MB |
Body:
Approved For Release 2000/05/30 : CIA-RDP78-03952A000100050001-7
SYSTEM ORGANIZATION,
FUNCTIONS, AND PROCEDURES
DIRECTORATE OF SCIENCE AND TECHNOLOGY
OFFICE OF COMPUTER SERVICES
WARNING
This material contains information affecting
the National Defense of the United States
within the meaning of the espionage laws,
Title 18, USC, Secs. 793 and 794, the trans-
mission or revelation of which in any manner
to an unauthorized person is prohibited by law.
CONFIDENTIAL
Phase II Final Report
Volume V
SYSTEM ORGANIZATION,
FUNCTIONS, AND PROCEDURES
CHIVE/R-3-65
1 March 1965
TABLE OF CONTENTS
5.1. Introduction
5.1.1. General
5.1.2. System Overview
5.2. System Organization
5.2.1. Background
5.2.2. Proposed Organizational Concept
5.2.3. Position Descriptions
5.3. Data Base
5.3.1. The Selection Problem
5.3.2. Basic Selection Criteria
5.3.3. Sources to be Exploited
5.3.4. Level of Coverage
5.4. CHIVE Indexing Technique
5.4.1. Introduction
5.4.2. Concepts
5.4.3. System Description
5.5. System Files
5.5.1. Introduction
5.5.2. Document Index Files
5.5.3. Document Image Files
5.5.4. Vocabulary Control Files
5.5.5. Unsynthesized Information Files (UIF)
5.5.6. Summary Information Files (SIF)
5.5.7. Special Projects Files
5.5.8. Referral Service Files
5.5.9. Management Data File
5.6. System Flows and Transactions
5.6.1. Document Input
5.6.2. Document Retrieval
5.6.3. Information File Building, Maintenance and Retrieval
5.6.4. Task Tables for System Transactions
5.7. File Conversion
5.7.1. Introduction
5.7.2. Document Index Files
5.7.3. Document Image Files
5.8. Computer Interface
5.8.1. General
5.8.2. Command Language
5.8.3. File Definitions and the EDP File Analyst
5.8.4. Summary
5.A The Organizational Problem
5.A.1. Organizational Objectives
5.A.2. Alternative First-Level Organizational Concepts
5.A.3. Organizational Alternatives Within A Geographic Division
5.B Preliminary Evaluation of the CHIVE Indexing Experiment
5.B.1. Summary Description of Experiment
5.B.2. Preliminary Findings
Feasible Alternatives in Index Design
5.C. CHIVE Indexing Guide
5.C.1. Introduction
5.C.2. Content Indexing System
5.C.3. Header Data Transcription Guide
Tab A Code Schedules
Tab B Project CHIVE Tags
Tab C CHIVE Index Terms
Tab D CHIVE Header Form
Tab E Authorized Abbreviations/CHIVE
5.D Inherited Files
5.D.1. Introduction
5.D.2. Index Files
5.D.3. Document Image Files
FIGURES
5-1 CHIVE System Flow Chart
5-2 UIF File Building Alternatives
5-3 SIF File Building Alternatives
5-4 Document Input Processing
5-5 Document Retrieval Processing
5-6 Information File Maintenance
5.D-1 List of China-Related Inherited Files
5.D-2 Vocabulary Control, Summary and Unsynthesized China-Related Inherited Files
5.D-3 Format A - SR Subject/Commodity File Card
5.D-4 Format B - SR (China) Area Detail File Card
25X6 5.D-5 Format C - SR Organization File Card; SR Personality File Card; SR Foreigner File Card
5.D-6 Format D - SR Soviet Organization File Card; SR Soviet Personality File Card; SR Soviet Foreigner File Card
5.D-7 Format E - All Other Organization File Card
5.D-8 Format F - All Other Personality File Card; All Other Foreigner File Card
5.D-9 Format G - PI Subject/Commodity File Card; PI Area File Card
5.D-10 Subject/Commodity and Area Files
5.D-11 Organization Files and Derivative Files
5.D-12 Job 3 File Statistics
5.D-13 Reports Title Index
5.D-14 Job 3 Card Format
5.D-15 Job 3 (KWIC) Elements of Information
5.D-16 FIB Town/City Information Card Format
5.D-17 FIB Installation Information Card Format
5.D-18 FIB Location Cross Reference Card Format
5.D-19 FIB ICF Coordinate Card Format
5.D-20 FIB ICF City Cross Reference Card Format
5.D-21 FIB ICF Name Card Format
5.D-22 FIB Model-Type Brochure Index Card Format
5.D-23 Punched Card Characteristics of the IRS Document Index File (New)
5.D-24 Punched Card Characteristics of the IRS Document Index File (Old)
5.D-25 Punched Card Characteristics of the Film Index File
TABLES
5-1 CHIVE Inputs
25X1A 5-2 Index Report
5-3 Over-Counter Document Search
5-4 Generation and Input Processing of Formatted Information/Index Records Prepared Under Contract
5-5 Information Analyst Activity Relative to an All-Source, All-File Search for a Named Personality
Chapter 5.1.
INTRODUCTION
5.1.1. GENERAL
This volume of the report is primarily concerned with
the non-EDP aspects of the CHIVE system, that is, the
organization of personnel required to operate the system
and types of personnel needed, the nature and extent of
the data base to be exploited, the indexing philosophy and
technique, the files which will be identified to the user,
system flows and data handling procedures, and the man-
machine interactions projected for the computer-centered
system. Of course, not all design problems have been
resolved. Moreover, even if they had, it would not be
possible to describe within the confines of one volume
all of the transactions which must be performed in a
system as large and complex as this. However, illustra-
tions of representative tasks are included and some
concepts of system data flows are presented to demonstrate
the impact of hardware and programs upon personnel actions.
The recommendations interspersed in this volume
result from Phase II of the design study, including a
preliminary evaluation of the CHIVE Indexing Experiment
conducted between November 1964 and January 1965, and
are supported by material in some of the appendices to
this volume as well as in earlier CHIVE documentation.
The other appendices present further details on the
indexing language and technique and the files to be
inherited from the existing central reference reposi-
tories. All are recommended reading for recipients of
this report who desire more detail on specific aspects
of the system, as well as further background on the
alternative configurations considered and the steps
taken to arrive at the recommended system. A supple-
mentary appendix to this volume will be issued later
describing the CHIVE Indexing Experiment in greater
detail, and reporting the final conclusions derived
therefrom.
5.1.2. SYSTEM OVERVIEW
A simplified graphic view of the CHIVE system can
be obtained by referring to Figure 5-1. In this diagram
the flow paths within the system are separated for
descriptive purposes into three major functional cate-
gories--document input processing (flow path 2), document
retrieval processing (flow path 1), and information file
building and maintenance (flow path 3). The following
paragraphs will summarize briefly the major elements of
the system, leaving the more detailed explanations to
subsequent chapters in the volume.
In general, the philosophy of the CHIVE system is to
combine the required intellectual talents of trained
intelligence information analysts with the processing and
storage capabilities of the computer. The source documents
to be input to the system, the necessary human functions
to be performed relative to these documents (i.e., reading,
selecting, indexing, querying and reporting) and the
outputs to be derived from the system are quite similar to
those which characterize one or more elements of the
existing central reference operation. Only if the proposed
system is compared to an individual register subsystem
within the current OCR complex does the contrast appear,
and then only with respect to certain features of the
existing subsystem.
In terms of file organization, the system follows
the approach used in SR/OCR and DD/OCR in maintaining a
separation between an index and the document holdings to
which it refers. This necessarily has implications in
terms of input time which may compare unfavorably with
some of the current systems which are oriented toward
multiple-filed documents (e.g., BR/OCR), but it also
offers certain advantages in such areas as procedural
standardization, index integration, number and variety
of access points to the files, space requirements, etc.
The information is received primarily in the form
of documents; however, index records to maps, photographs,
and films will also be included in the system, as will
certain machine-language data prepared on contract (but
under CHIVE control) by external organizations (e.g., the
Library of Congress).
Following preparation of the index record (a function
normally performed by humans except where only a limited
retrieval capability seems required), the index will be
converted to machine storage with the aid of an optical
character reader and placed in a random access device,
ultimately the IBM/System 360 Data Cell Drive. The
information storage capacity of one Data Cell Drive will
allow us to accommodate the content of an estimated
600,000 index records (the actual storage capacity is
400 million characters of information), and there is no
practical limit on the number of modules that could be
provided. The same device would be used to hold what
might be called the directory to the index records
themselves, i.e., a list of the terms which appear in
the index records and, for each term, the record and
phrase number(s) containing said term. This would
obviate the need to examine every index record in the
file to see if it contains the term (or terms) sought.
Index entries can be retrieved from the index store at
the rate of about two per second depending on the number
of terms involved in the search formula.
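The directory described above behaves like what would now be called an inverted index. The following is a minimal sketch in Python, offered only as an illustration and not as the report's design: the function names, the sample records, and the assumption that a search formula requires every listed term to be present are all hypothetical.

```python
from collections import defaultdict

def build_directory(index_records):
    """Map each term to the (record number, phrase number) pairs containing it."""
    directory = defaultdict(set)
    for record_no, phrases in index_records.items():
        for phrase_no, terms in enumerate(phrases, start=1):
            for term in terms:
                directory[term].add((record_no, phrase_no))
    return directory

def search(directory, terms):
    """Return the record numbers that contain every term sought.

    Only the directory is consulted, so no index record has to be
    examined unless it is already known to hold a wanted term.
    """
    record_sets = [{rec for rec, _ in directory.get(term, ())} for term in terms]
    return sorted(set.intersection(*record_sets)) if record_sets else []

# Hypothetical index records: record number -> list of phrases,
# each phrase being a list of index terms.
records = {
    101: [["steel", "production"], ["shanghai"]],
    102: [["steel", "imports"]],
    103: [["rail", "shanghai"]],
}
directory = build_directory(records)
print(search(directory, ["steel", "shanghai"]))
```

Because each directory entry keeps the phrase number as well as the record number, a search could also be narrowed to terms that occur within the same phrase, should the search formula call for it.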
CHIVE's recommendation is that most textual docu-
ments should be converted to microfilm and stored either
in the form of 35 mm. aperture cards (containing up to 8
images per aperture) or packed microfiche (sheet microfilm
records containing up to 60 letter-size pages on each
microfiche). Documents in excess of a certain page limit
and those of poor image quality should be kept in hard
copy. Maps, films, and photos will continue to be
stored in the conventional manner in the physical reposi-
tories in which they are now located.
Whether the 35 mm. aperture card or microfiche
storage system is chosen, the document images should be
filed in motorized card files, but should be retrieved
and refiled manually. Assuming 10 million documents were
to be stored on site, the estimated floor space required
for a packed microfiche system would be an area approxima-
tely 30' x 60'; for the 35 mm. system, 40' x 70'. Output
from either the hard copy document or microimage files
would consist of paper copies. The integrity of the
document collection will be maintained such that none
of the master microimages, or original documents if
filed only in hard copy, will leave the file except for
photoduplication or hard copy printing.
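The capacity and floor space figures quoted in this section can be cross-checked with simple arithmetic. The sketch below, in Python, uses only numbers stated in the text; the variable and function names are illustrative assumptions, and the derived averages are rough planning figures rather than values given in the report.

```python
# Figures quoted in the text.
DATA_CELL_CHARACTERS = 400_000_000   # stated capacity of one Data Cell Drive
INDEX_RECORDS_PER_DRIVE = 600_000    # estimated index records per drive
DOCUMENTS_ON_SITE = 10_000_000       # document store size assumed in the text

# Estimated floor areas in feet (width, length) for each storage option.
FLOOR_AREAS_FT = {
    "packed microfiche": (30, 60),
    "35 mm aperture card": (40, 70),
}

def characters_per_record(capacity, records):
    """Average number of characters available to each index record."""
    return capacity / records

def floor_summary(dimensions, documents):
    """Return (area in square feet, documents stored per square foot)."""
    width, length = dimensions
    area = width * length
    return area, documents / area

print(round(characters_per_record(DATA_CELL_CHARACTERS, INDEX_RECORDS_PER_DRIVE)))
for option, dimensions in FLOOR_AREAS_FT.items():
    area, density = floor_summary(dimensions, DOCUMENTS_ON_SITE)
    print(f"{option}: {area} sq ft, about {density:.0f} documents per sq ft")
```

On these figures each index record averages roughly 667 characters, and the packed microfiche option needs 1,800 square feet against 2,800 square feet for the 35 mm. system.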
Chapter 5.2.
SYSTEM ORGANIZATION
5.2.1. BACKGROUND
The organizational configuration recommended by
CHIVE is the product of much thought and discussion
extending back into the Phase I study and reflects a
variety of views expressed by persons both within
the CHIVE design team and within OCR. Because organization
is one of the two most vexing and, at the same time, one of
the most important of the CHIVE design problems, it is not
anticipated that the organizational plan which has
evolved will be attractive to all. Nevertheless, it
appears to offer the best hope of achieving the
desired system objectives, consistent with the human
factor requirements imposed by the environment within
which the system must operate.
The search for a revision of the existing central
reference organizational structure was largely influenced
by the findings of the DD/I survey and the set of system
requirements derived therefrom. These findings and inferred
goals have been described in the CHIVE Phase I Report, in
CHIVE/R-1-63, and (in more abbreviated fashion) in Volume
IV, Chapter 2, of this report. The organization study
itself may be said to have consisted of three phases:
a. An analysis of the personnel or management
requirements imposed on the system by the
overall system objectives.
b. A study of various alternative organizational
configurations which might be adopted, ranging
from a completely decentralized activity to
various kinds of centralized operations,
including alternative configurations at different
hierarchic levels.
c. An evaluation of one organizational concept by
the process of subjecting the concept to a
practical experiment which simulated to some
extent the problems to be encountered in a live
environment.
Phases a. and b. are described in some depth in
Appendix 5.A. to this volume and are briefly reviewed
below. Phase c., which resulted in some revision of the
organizational concept, is discussed in Appendix 5.B. and
only its conclusions are reflected here.
In considering the managerial problem of how best to
organize the input and retrieval functions to be performed,
as well as the personnel to carry out these functions, a
number of organizational requirements were set forth.
These requirements, or objectives, may be summarized
for the purposes of this review as follows:
a. Specialization with minimum processing duplication
b. Minimum customer contact points
c. All-source service from any point
d. Close communication between input and query
handlers
e. Close communication between system operators
and users
f. Document control as the first priority
g. Operator job satisfaction
h. Personnel flexibility
The next step was to pass various organizational
configurations against these objectives to determine which
would appear to offer the best hope of accommodating the
defined goals. Because of the size of the contemplated
activity in terms of the number of personnel needed to
operate the system, this required that alternative con-
figurations be considered not only at the first organiza-
tional level, but at least at one additional level below
that.
For the first cut the following four different
organizational concepts were considered:
a. Retention of the existing OCR configuration
b. Development of a single, all-source document
retrieval system, with a separate biographic
information facility
c. Dispersal of some or all of the information
storage and retrieval activity among the
research and production components
d. Continuation of the central system, but on an
all-source, geographically-organized basis
Where the additional subdivision of personnel would
be required because of the size of a particular component
these additional means of grouping the analysts assigned
thereto were studied:
a. Organization by document source (Collateral,
Comint, etc.)
b. Organization by function (input, retrieval,
information file maintenance, etc.)
c. Organization by class of data to be stored
and retrieved (biographic, installation,
subject/commodity, etc.)
d. Organization by topic (political, scientific,
economic, military, etc.)
The study very quickly made clear that none of the
alternatives considered resolved all problems that could
be anticipated. However, the combination of the geographic
approach at the first level, and topical specialization
(where required) at the second, seemed to come closest
to meeting the organizational objectives outlined above.
It remained to be seen, however, whether an information
analyst could perform all-topic indexing of all-source
documents satisfactorily, and what effect it might have
on his morale and attitude if he had to operate in this
kind of environment.
The CHIVE Indexing Experiment afforded the opportunity
to test the configuration proposed and, as detailed in
Appendix 5.B., identified a number of problem areas which
suggested that some additional organizational and procedural
alternatives might well be considered. Of principal
interest from the organizational point of view was the
recommendation that the geographic concept be retained but
that the coding process per se be separated from the
function of selecting documents and identifying the
subjects or objects to be indexed. Acceptance of this
approach meant some compromise of the single-point indexing
concept but offered the advantage of increased job
satisfaction on the part of the more highly qualified
analyst, helped reduce the selection problem, and suggested
the possibility of acquiring more personnel for less money
to perform the more routine input functions. Since it
still permitted achieving all the other organizational
objectives, it was selected as the alternative best
satisfying the system requirements and is the approach
recommended here.
5.2.2. PROPOSED ORGANIZATIONAL CONCEPT
The responsibility for implementing a specific
organizational configuration must be left to those who
will direct the operation since there are a variety of
factors to be considered which are beyond the purview
of the system designer. To assist those, however, who
will be charged with this activity, it might be useful
to summarize the principal CHIVE organizational recom-
mendations in the context of the major functions to be
performed within the system, and to give some feel for
the interrelationships between these functions since
these could have implications for management in terms
of communication interface, assignment of physical
space, and so forth. This first look will be an
abbreviated one since much of the same ground is
covered (if from a slightly different point of view) in
more detail in other sections of this volume. A set of
position descriptions outlining the duties and responsi-
bilities of the various types of personnel within the
system concludes the chapter.
5.2.2.1. Input Control and Customer Service
The CHIVE system would be built largely around
Information Analysts organized (at the first level) into
some four or five geographic components. It is our view
that it is difficult to identify any better way of
organizing the input and retrieval activity than by
grouping the primary individuals involved by geographic
area. As stated in earlier documentation, this approach
loses the advantage of source specialization in processing
and poses the problem of geographic overlap in document
analysis and query coordination. At the same time, it
contributes to standardization of vocabularies and
procedures important in an all-source environment, and is
in focus with customer inquiries which normally relate to
a particular geographic region of the world. Thus, on
balance, while it does not overcome all operational
problems that can be envisaged, of all the alternatives
considered it seems to come nearest to meeting the system
objectives.
Without specific restrictive criteria (which, thus
far, seem impossible to obtain) with respect to the content
of the documents to be processed, the experienced
Information Analyst, operating in close communication with
his customers, appears to offer the best hope of resolving
the data selection problem. The Information Analyst would,
therefore, be responsible for determining not only what
documents entered the system files but what data within
these documents was captured for retrieval purposes.
The Information Analyst operating out of a geogra-
phic component would also be solely responsible for the
selection and processing of data input to information files
required by customers, and would handle all queries levied
on the system. By virtue of the fact that he was personally
involved in the input process, he would not only be familiar
with the current reporting but would know what material
had been stored for retrospective searching and how to get
at it.
Whether the Information Analyst should also specialize
by topic within area or by some class of intelligence data
(e.g., biographic, installation, etc.) remains a moot point.
CHIVE continues to favor the former in the belief that it
would lessen the number of times a document would have to
be handled, but additional testing of both concepts is
desirable.
5.2.2.2. Index Preparation
The function of physically preparing the index records
to documents, including both the header (bibliographic) as
well as the content data descriptions, would be assigned to
special personnel known as Header Indexers and Content
Indexers, operating in close communication with the
analytical components.
Content Indexers serving one geographic desk, e.g.,
the Far East, should probably be located together as
a unit attached to said component. The Content Indexers,
like the Information Analysts, would be subdivided by
geographic area and each would normally process the output
of his counterpart analyst or analysts.
Content Indexers would each have a set of the dictionaries
and other vocabulary control tools pertinent to his
area of responsibility. In addition, a master set of other
area dictionaries would be located within each content
indexing group for reference purposes.
Content Indexers would translate the items of data
tagged by Information Analysts into the codes and other
descriptors dictated by the vocabulary of the system.
To increase their sense of participation in the more
intellectual aspects of the input process (and, thereby,
reduce turnover), they might be given full responsibility
for general subject indexing as distinct from named-
object control.
Header Indexers would perform a function similar to
content indexing, but on the bibliographic elements of a
document. One group of Header Indexers would operate in a
centralized mode, serving all geographic components by
header indexing, immediately upon receipt, those documents
for which CHIVE has a repository responsibility. Other
Header Indexers would be assigned to each geographic
organization to capture the necessary bibliographic data
pertaining to non-repository-type documents which had
been reviewed by Information Analysts and selected for
retention by the system.
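The division of header indexing between the centralized group and the geographic components amounts to a simple routing rule. The sketch below, in Python, is one illustrative reading of that rule; the `Document` fields, the function name, and the returned labels are hypothetical and do not appear in the report.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Document:
    doc_id: str
    area: str                          # geographic component, e.g. "Far East"
    repository_type: bool              # CHIVE has a repository responsibility
    selected_by_analyst: bool = False  # retained after Information Analyst review

def header_indexing_assignment(doc: Document) -> Optional[str]:
    """Return which group header-indexes the document, or None if no group does."""
    if doc.repository_type:
        # Header indexed centrally, immediately upon receipt.
        return "central header indexing group"
    if doc.selected_by_analyst:
        # Non-repository document retained by an Information Analyst.
        return f"{doc.area} header indexers"
    # Non-repository document not selected for retention.
    return None

print(header_indexing_assignment(Document("A-1", "Far East", repository_type=True)))
print(header_indexing_assignment(Document("A-2", "Far East", False, selected_by_analyst=True)))
print(header_indexing_assignment(Document("A-3", "Far East", False)))
```

The sketch treats repository responsibility as taking precedence over analyst selection, which follows from the text's statement that repository-type documents are header indexed immediately upon receipt.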
5.2.2.3. Dissemination
The dissemination function, apart from any necessary
re-routing of documents within a CHIVE geographic component,
is external to the system per se. However, it might be
advantageous to co-locate dissemination personnel with the
centralized header indexing group to shorten the time
between document receipt and file availability for
repository-type documents.
5.2.2.4. Data Transcription
This function refers to the rather formalized typing
operation required if, as planned, optical recognition
equipment is to be used to convert index and other records
into machine-recognizable form. Header Indexers can type
their inputs in a form suitable for processing by a page
reader. However, a central pool of typists will also be
needed, operating at the system level, to convert the
majority of transcript sheets received from Content
Indexers as well as search requests from Information
Analysts into the graphic quality required. This central
pool can be supplemented by typists assigned to the
various area desks who, in addition to typing finished
reports, memoranda, etc., would also transcribe many of
the file maintenance and query transactions for input
to the page reader.
5.2.2.5. Image Processing and Document File Maintenance
Image processing is that activity conducted by the
so-called "Document Delivery System," i.e., the micro-
filming and associated operations required to convert
incoming documents to microimage form, as well as the
reproduction of items retrieved from the document store
for delivery to customers. This is a relatively discrete
function although, if an aperture card storage system is
employed, it requires some support from the machine side
of the house. Otherwise, its principal interface is with
the document store itself to which materials are passed
after microfilming and from which it receives, in turn,
items to be reproduced.
During the evolutionary development of the CHIVE
system both the new and old system operators will require
access to many of the same document collections. If
the logistical problems are not too severe, it would
seem advisable to co-locate all master document files
in one general physical area to lessen the communication
problem as well as render file maintenance operations
more efficient. This might increase the distance which
now obtains between an existing central document
collection and a set of users, but over time the majority
of users would probably benefit from the establishment
of one "Document Center." Similarly, because of the
close relationship between the document files themselves
and the image processing function, it is recommended that
the latter be connected both physically and organizationally
to the former.*
5.2.2.6. Machine Functions
The principal machine-related activities and hardware
include:
a. EAM personnel and equipment needed to input data
to files not yet absorbed into the new system and
to retrieve data therefrom. Assuming no conver-
sion to an EDP storage medium, the latter, in
particular, will necessitate the retention of an
EAM facility for as long as the inherited files
have value.
b. EDP hardware needed to operate the new system,
including associated I/O devices (e.g., the page
reader), and computer operator personnel.
* Problems involved in co-locating files are discussed in
Volume III.
c. System analysts/programmers (referred to in
this report as EDP File Analysts) who will
develop and refine the machine operations to
be performed, define new files to the system,
etc.
Logically, all of these personnel and operations should
be centralized in one organizational component whether
located within the central reference complex or external
to it.
5.2.3. POSITION DESCRIPTIONS
The personnel involved in making up the CHIVE operator
complex will include the following: Information Analyst,
Content Indexer, Header Indexer, Dictionary Editor, Data
Transcriber, Information Control Clerk, Document File
Clerk, Reproduction Equipment Operator, EAM Operator,
Computer Operator, and EDP File Analyst.
5.2.3.1. Information Analyst
The Information Analyst will be the principal inter-
mediary between the customer and the system. He will be
responsible for selecting what goes into the files and
will screen all output before it is delivered to a
requester. Senior Information Analysts will serve in
various supervisory capacities from the sub-Section
to the Branch or Division level, directing, coordinating,
and reviewing the work performed by their subordinates.
All Information Analysts will hold professional positions,
and will specialize in a particular geographic area and
(where required by reason of work volume) by topic within
area.
Every Information Analyst will be trained in applying
the indexing vocabulary to documents by actual involve-
ment in the coding process. He will also be thoroughly
familiar with all the CHIVE-built files available within
the system as well as the query language used to interro-
gate or modify said files. In addition, he will know what
inherited files were acquired from the existing system
and their general content, although not necessarily the
vocabulary used in these files.
The duties of an Information Analyst will include:
a. Receiving and reviewing the content of documents,
cables, graphics and other incoming data for
information worthy of retention by the central
reference system.
b. Selectively marking the elements of information
to be extracted from the documents for represen-
tation in the system's index files and distribu-
ting the marked documents to Content and/or
Header Indexers.
c. Exploiting the content of document index records
for the purpose of building formatted information
files pertaining to a specific subject or class
of subjects.
d. Preparing file maintenance transcript sheets as
the means of adding data to, or changing data
within, said information files.
e. Receiving requests from customers and preparing
the necessary search prescription after
consulting the relevant vocabulary control files
and other Information Analysts most familiar
with the vocabularies of certain inherited files.
f. Requesting copies of documents as well as dossiers
and other master records from the central docu-
ment repository.
g. Reviewing, analyzing, and synthesizing data
recovered as a result of the search process and
preparing responses in raw or finished form for
delivery to the customer.
h. Advising customers about files or persons external
to CHIVE that might be worthwhile consulting, and
personally contacting same if required.
i. Recording necessary management data relative to
requests received, responses furnished, and other
system processes.
5.2.3.2. Content Indexer
The Content Indexer will be a semi-professional possessing
at least a high school education. His duties will include:
a. Extracting the elements of information in a docu-
ment identified for him by the Information
Analyst.
b. Consulting the relevant dictionaries and other
vocabulary control files for the purpose of
selecting the appropriate controlled terms to
express these items of data.
c. Arranging the data into a form for machine entry
using pro-forma content data transcript sheets.
d. Consulting with the appropriate Information Analysts
and Dictionary Editors with regard to the applica-
tion of the index language and possible revisions
to the system vocabularies.
e. Initiation of additions or changes to the
vocabulary control files through preparation
of file maintenance transcript sheets.
f. Reviewing printout of changes and additions to
the files including incorrect entries.
5.2.3.3. Header Indexer
The Header Indexer will occupy a clerical position and
must be a qualified typist. The duties of the Header
Indexer will include:
a. Extracting the standard header (bibliographic)
data appropriate to the category of document
involved, and expressing this data (where re-
quired) in the codes used by the system.
b. Typing the data in the prescribed manner for
machine entry using the correct header data
transcript sheet.
c. Consulting with the Dictionary Editor for
Header Data with regard to the use of the header
data codes and format, and recommending changes
when required.
5.2.3.4. Dictionary Editor
The Dictionary Editor will be an Information Analyst
with primary responsibility for control of one of the
system's vocabulary files. Some Dictionary Editors will
have system-wide control over the application of terms in
their respective subject areas. Others (e.g., an Organiza-
tion Dictionary Editor) may govern the use of terms only
within a given country or other geographic area. The
duties of a Dictionary Editor will include:
a. General review of the content and format of
sample transcript sheets emanating from
Indexers assigned to the area unit of which
he is a part.
b. Providing advice and counsel to Indexers on
the use of the specific dictionary for which
he is responsible.
c. Reviewing all new entries to the dictionary for
the purpose of determining whether each was a
legitimate entry and whether format and content
met established procedures.
d. Personally initiating changes to a dictionary
where required.
e. Reviewing printouts of changes and additions
and insuring that all revisions to the dictionary
are published and disseminated.
f. Consulting with other Information Analysts and
customers regarding current requirements and
possible improvements to the system's vocabulary
control files.
g. Advising Information Analysts preparing request
statements on the terms to be used in the
query prescription.
5.2.3.5. Data Transcriber
The Data Transcriber includes any person exclusively
assigned to operate a key-driven device from copy provided
via another system operation. The duties of a Data
Transcriber will be as follows:
a. Receive format instructions from Information Analyst,
Content Indexer, or other individual for typing,
tape perforation, or card punching.
b. Prepare typed copy, punched paper tape, or
cards for optical character recognition or
other form of computer entry.
c. Check transcribed copy for accuracy and correct
if necessary.
d. Operate typewriter, Flexowriter-like device,
026 Key Punch, and 056 Verifier.
5.2.3.6. Information Control Clerk
Information Control Clerks will be assigned to
most operational components of the system. Their general
duties will include:
a. Receiving material such as hard copy documents,
machine listings, document request forms, paper
and magnetic tapes, card decks, etc.
b. Accounting for material received and maintaining
necessary special-purpose logs of requests and
other actions.
c. Intra-office routing and delivery of materials
to staff personnel and mailing of system
products to customers.
d. Assisting Information Analysts in the routine
maintenance of manual files including the
insertion of handwritten entries to machine
listings and other hard copy records.
5.2.3.7. Document File Clerk
The duties of the Document File Clerk will include:
a. Filing newly-processed documents or refiling old
materials into one or more of the following types
of document files: personality or installation
dossiers, card files, open-shelf document files,
and 16 mm. or 35 mm. aperture card collections.
b. Receiving requests for documents or other
records to be retrieved and recovering same
from the appropriate files on either a
routine or priority basis.
c. Maintaining dossier and other special-purpose
logs pertaining to transactions affecting the
document files.
d. Recording action taken on document request forms,
forwarding requests for unrecovered documents
to other file repositories for searching, and
transmittal of master records to image
processing for photographic reproduction.
5.2.3.8. Reproduction Equipment Operator
The duties of the Reproduction Equipment Operator
will include:
a. Receiving incoming documents and determining which
are photographable and which must be stored in
hard copy.
b. Operating the appropriate microfilming equipment
required to reduce the documents to a micro-
storage medium and reviewing the quality of the
photographic record.
c. Receiving documents retrieved from the master files
and reproducing same on a variety of image-
processing equipment.
d. Servicing and supplying reproducing equipment.
e. Supplying copies of documents to Information
Control Clerks for delivery to internal or
external requesters.
5.2.3.9. EAM Operator
EAM Operators will be required to process certain
card files inherited from the existing system as well as
select new files. In general, the duties of an EAM
Operator will include:
a. Operating electrical accounting machines
including interpreter, reproducer, tabulator,
sorter, and printer units.
b. Performing routine machine operations in
accordance with conditions outlined by EDP File
Analysts, Information Analysts, and Indexers.
c. Wiring panels in accordance with directions.
5.2.3.10. Computer Operator
In general, the duties of the Computer Operator will
include:
a. Maintaining a schedule and operating log of the
components of the computer complex.
b. Loading and unloading Tape Units.
c. Loading and operating stored programs.
d. Tracing and correcting program errors.
e. Correcting failures in card, paper tape, or
optical character reading equipment.
f. Wiring and/or selecting control panels for
use in card reading machines.
5.2.3.11. EDP File Analyst
The duties of the EDP File Analyst will include:
a. Determining from Information Analysts requirements
for new system files and developing the record
structures, file formats, and output products
needed to establish and maintain such files.
b. Preparing general and special-purpose programs
either for the purpose of converting extant
machine-language files or to provide new data
processing capabilities.
c. Testing newly designed programs utilizing the
computer and necessary input/output units.
d. Conducting studies of system data flow for the
development and refinement of programs.
e. Determining utilization requirements for input/
output devices including displays, and designing
programs to permit exploitation of input/output
capabilities.
f. Designing quantitative techniques and statistical
devices for special program applications.
g. Preparing procedures descriptions including
coding formats and flow charts for operator task
guidance.
Chapter 5.3.
DATA BASE
5.3.1. THE SELECTION PROBLEM
The selection problem has been with OCR since its
inception. No coordinated study of selection as an
entire OCR problem has ever been made. Individual
registers have established selection criteria, some
more formalized than others. An attempt to summarize
these criteria for compatibility, or to establish
common criteria to be used by all registers was not
deemed necessary heretofore. Since each register has
been more or less independent, ipso facto, its criteria
have for the most part been unrelated to those of any
other register. This condition has led to non-uniform
levels of coverage and, in some cases, duplicative
processing of the same subject matter. Regardless of
CHIVE, if OCR adopts a geographical organization
posture, uniform criteria for document series and depth
of subject indexing become mandatory within geographic
component.
5.3.2. BASIC SELECTION CRITERIA
Selection criteria will depend on several factors:
(a) the documents used and information needed by the
analytic offices; (b) the all-source concept and
organizational configuration thereof. These two factors
have to be balanced against the manpower and resultant
capability available for the operation.
There seems to be a consensus of opinion that
several levels of indexing should be applied to the
various categories of documents:
- Entire series to be indexed in depth.
- Entire series to be rejected for depth indexing,
but to receive header or bibliographic control.
- Entire series to be rejected completely.
- Specific documents within a series to be indexed
in depth.
Selection of an indexing level for a particular
document category is contingent upon customer reaction
and acceptance, which determination requires discussion
of interest in series not covered now and re-examination
of series presently covered. Customer participation in
determining selection criteria can mean the success or
failure of the system in terms of usage. Once the level
of indexing is agreed upon, document priorities will need
to be established for implementing the CHIVE system since
all categories cannot be implemented within the initial
System simultaneously.
5.3.3. SOURCES TO BE EXPLOITED
The following major document series are planned
for CHIVE control:
Raw Intelligence Reports (Collateral)
State Airgrams
Military Attache Reports
CIA Reports--00,CS, etc.
Military Command Reports
Selected Other Governmental--AID, USIA, etc.
25X6 International Organizations--NATO, etc.
25X6
25X1A
Cables (Collateral)
CIA-TDCS
Non-CIA
Finished Intelligence
U.S.
Open Publications and Translations
FDD
JPRS
COMINT
Messages
Reports
Photo Interpretation Reports (T/KH)
Maps, Films, and Ground Photos
Miscellaneous
Select Contractual-IIIIII etc.
State Biographic Cards
Unclassified Selected Periodicals, e.g., for
China: Peking Review, Survey of China
Mainland Press, etc.
Criteria for the depth of coverage will be developed
by the CHIVE information analyst working in concert with
the research offices. He will direct the indexer as to
coverage and depth, i.e., which personalities, which
organizations, and/or which subjects should be indexed.
The CHIVE Indexing Experiment has shown the need for title
coverage of most documents regardless of the level of
indexing unless the document or series is completely
rejected. This includes title preparation for those types
to be selectively indexed which have no titles, e.g.,
non-CIA cables.
5.3.4. LEVEL OF COVERAGE
5.3.4.1. Raw Intelligence Reports
Since the information content of IR's supports a
variety of intelligence interests, all IR's will be
considered for some level of indexing. Duplicative
information which frequently occurs between sources
will be eliminated wherever possible, based on the
information analyst's recall capability supplemented
by data contained in dictionaries and identifier
lists.
5.3.4.2. Cables
CIA cables (TDCS's) have always been handled as
Information Reports and should be continued as such.
As for non-CIA cables, the very fragmentary and highly
perishable nature of these cables and the frequent
duplication by follow-up reporting would indicate that
only a small percentage of these cables are worthy of
storage for retrospective search purposes. Only those
cables containing positive foreign intelligence infor-
mation will be indexed for header control as well as
content. All others will be rejected completely--the
Cable Secretariat continuing to retain repository
responsibility for same.
5.3.4.3. Finished Intelligence
The Intelligence Publications Index (IPI) and Special
Register's Job 3 are published by OCR to provide current
awareness, and, to a lesser extent, retrospective subject
and area searching for finished intelligence. In
addition, some finished intelligence is incorporated
into the files of BR and FIB.
Since the Agency has a repository responsibility
for finished intelligence, bibliographic control over
such material will be established in the CHIVE system
for document retrieval purposes. Furthermore, some
named-object indexing of finished intelligence documents
will be performed similar to the control currently
maintained by BR and FIB.
During the evolution of the CHIVE system, the biblio-
graphic and named-object control achieved by the 25X1A
25X1A and subsequent branches will in part
duplicate the contents of the IPI and Job 3. This dupli-
cation seems unavoidable since the issuance of these
publications should continue in order to serve the current
awareness needs of analysts, and it does not seem
feasible during the implementation period to split the
preparation of the publications between CHIVE and the
existing activities. When implementation has been completed,
however, it will be desirable to investigate the feasibility
of producing a permuted title index to finished intelligence
from the machine-stored data base as a replacement for both
the IPI and Job 3.
STATSPEC
5.3.4.5. Translations
FDD and JPRS translations can be considered as one
type. Heretofore in CIA no subject indexing scheme has
incorporated both of these open literature sources. The
manpower needed to cope with this large volume (see Table
5-1) is of significant concern. However, broad customer
interest dictates in-depth subject and named-object
control.
Table 5-1
CHIVE INPUTS
Series
Approximate Annual Volume
Repository Responsibility
Bibliographic (B) and/or Content (C) Control: Priority
Bibliographic (B) and/or Content (C) Control: Remainder
Raw Intelligence
(including TDCS's)
Cables
Finished Intelligence
Translations
FDD
JPRS
COMINT
Photo Interpretation Reports
Maps
Films and Ground Photos
25X6
253,500
192,000
7,803
78,000 items
44,300 items
109,050
7,900
6,000
87,000
903,035
7,200
36
625 items
1,560 items
3,400 items
104 items
12,925
X
X
B and C
Not processed
B only
B and C
B and C
B and C
B and C
B and C
B and C
Not processed
B and C
B and C
B and C
B and C
B and C
B
B and C
B and C
B only
B
B only
B only
B only
B only
B and C
Not processed
Not processed
Not processed
Not processed
Not processed
5.3.4.6. COMINT
All hard-copy SI material with the possible exception
of military order-of-battle data will be considered for
indexing in depth. Teletypes are excluded in their
entirety pending the design of an automatic processing
capability which will take advantage of the fact that the
data is available in machine-language. An information
analyst knowledgeable in both collateral and SI may be
able to spot duplicative information if such exists. One
large series of SI material which, in the present OCR/SR
system, is given cursory control, will be studied to
determine whether it should receive any title or subject
control whatsoever. A few items in this series were
processed during the experiment, but the titles were so
general as to be practically worthless for retrieval.
5.3.4.7. Photo Interpretation Reports
The unquestioned value of this category requires
that all published reports receive in-depth content
and header indexing.
5.3.4.8. Maps, Films, and Photos
These categories of receipts will be excluded from
CHIVE processing control because of the specialized
knowledge needed for their analysis and input, the
difficulty of separating the indexing function from
the acquisition activity, etc. It has been agreed,
however, to have these materials indexed by GR and the
Map Library according to the CHIVE indexing scheme and
the index records will be incorporated into the CHIVE
data base.
5.3.4.9. Other
A number of miscellaneous classes of documents will
also be processed by CHIVE. Most (e.g., press reviews
and surveys) will receive named-object indexing primarily.
The large volume of State biographic cards needs rigid
selection and weeding not only to determine names of
interest but also to eliminate repetitive information.
Like the there is little high-
grade ore contained therein in relation to volume.
Chapter 5.4.
CHIVE INDEXING TECHNIQUE
5.4.1. INTRODUCTION
The most critical design element of the proposed
system is the indexing system to be applied to input
documents; the performance of the system is no better
than the data which it is supplied. The transformation
of textual material to the system language is an expen-
sive process - one which has been given more attention
than any other in the Phase II effort.
5.4.2. CONCEPTS
5.4.2.1. Document/Information Retrieval
The system will provide combined information
retrieval and document retrieval capability. Documents
themselves will be at the heart of the system, with
their index records providing access to them through
content control. The index records will also be the
base from which information files will be built. That
is, in the process of indexing documents, facts about
named things of intelligence interest will be extracted
and stored. The approach will be to extract information
about specific named objects, keep this information in
the context of the document for document retrieval, and
manipulate this information out of context for informa-
tion retrieval. It is not proposed to create non-
redundant summary records from index records at input
time either through human or machine collation. Summary
records will be formed and maintained on select high-
interest personalities, installations, and other finite
subjects, but the creation of these records will be an
analytic activity requiring the synthesis of index
records and documentary information.
In addition to the index records, the indexer working
aids will themselves be a source of answers to questions.
For example, the Organization Identifier List will contain
names of organizations, their locations, type of activity,
etc.
5.4.2.2. Manual Indexing
An investigation of the state-of-the-art of automatic
indexing reveals that it is still largely experimental and
is not sufficiently precise to meet most of the Agency's
retrieval requirements. Automatic indexing techniques
usually involve word frequency counts, assigning weights
to high-frequency words, and storing these words as index
terms. Other techniques include syntactic analysis,
sometimes in conjunction with the above statistical
process. It is obvious that these techniques could not
be applied to an intelligence storage and retrieval
system requiring a high relevance/recall rate, since
much intelligence information is inferential and inter-
pretive and requires analysis for high-quality indexing.
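The statistical technique described above can be illustrated with a minimal sketch. It is written in Python, a present-day language that is not part of the CHIVE design. The stop-word list, the function name frequency_index_terms, and the cutoff top_n are hypothetical choices made only for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; an operational list would be far longer.
STOP_WORDS = {"the", "of", "a", "an", "and", "in", "to", "is", "at", "on", "for"}

def frequency_index_terms(text, top_n=3):
    """Return the top_n most frequent non-stop words as candidate index terms."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    # Rank by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]

sample = ("The institute expanded reactor research. Reactor fuel for the "
          "institute arrived in March. The reactor is at the institute.")
print(frequency_index_terms(sample, top_n=2))
```

A count of this kind surfaces the words a document repeats; it captures nothing that the text only implies, which is the limitation cited above.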
Human indexing, therefore, with its recognized faults
is still superior to automatic techniques and is the only
feasible system for CHIVE. However, some documents will
require only title indexing and in these cases automatic
title-indexing techniques can be applied. The most notable
title-indexing system is the Key-Word-In-Context (KWIC)
method. In this system, the key words in titles are
permuted so that each word appears in its alphabetic file
position along with the other significant surrounding
words from the title. The permuted titles can be machine
stored for searching on demand, or printed listings can
be generated for manual perusal.
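The KWIC permutation described above can be sketched briefly. The example is in Python, a present-day language not used in the report. The stop-word list, the function name kwic_index, the "/" marker for the original start of the title, and the sample titles are all hypothetical; rotating the title around each key word is one common way of presenting the context, not necessarily the layout CHIVE would adopt.

```python
# Hypothetical list of non-significant title words to be skipped.
STOP_WORDS = {"a", "an", "and", "of", "the", "in", "on", "for", "to"}

def kwic_index(titles):
    """Build a Key-Word-In-Context index.

    Returns (keyword, context, title_number) tuples sorted by keyword.
    The context is the title rotated so that the key word comes first,
    with '/' marking where the original title began.
    """
    entries = []
    for number, title in enumerate(titles, start=1):
        words = title.split()
        for position, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue
            context = " ".join(words[position:] + ["/"] + words[:position])
            entries.append((word.lower(), context, number))
    return sorted(entries)

titles = ["Production of Steel in Poland", "Steel Exports"]
for keyword, context, number in kwic_index(titles):
    print(f"{keyword:<12}{context}  [{number}]")
```

The sorted entries can be held in storage for searching on demand or printed as a listing, corresponding to the two uses named above.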
5.4.2.3. Depth--Subjects vs. Named-Objects
It need hardly be argued that intelligence interests
are catholic in nature, and that if an information storage
and retrieval system arbitrarily decides to limit its
coverage to personalities, installations, or conceptual-
type subjects, it automatically limits its ability to
satisfy its total customer population.
Intelligence analysts have found that "named-objects"--
e.g. installations, personalities, organizations--most
often provide the clues to resolving research problems.
OCR request experience is an accurate reflection of this
interest. We recommend, therefore, that these subjects
receive the greatest emphasis; and, in view of OCR experience
relating to the kinds of things users are interested in
concerning named-objects, we recommend that an increased
number of attributes of named-objects be brought under
control. The latter are the elements of information which
identify a named object, e.g., a person's address,
organizational affiliation, etc. In-depth indexing of
named-object attributes does not necessarily have to mean
an equivalent increase in the volume of data indexed or in
indexing time since common attributes, such as addresses,
types of organizations, and products of an installation,
will be stored in indexer identifier lists (see Section
5.4.2.5. below), and it will not be necessary to re-index
this data when it is reported repetitively in documents.
We recommend that subject indexing, that is, the
kind of indexing performed by the Intellofax system and
the Subject/Commodity Section of the Special Register be
continued at least to the present level, but on a broader
data base to include important document series (e.g.,
foreign translations) which are excepted today.
5.4.2.4. Index Language; Linkage
The CHIVE indexing language consists of controlled
entries taken from identifier lists and code schedules,
as well as words and phrases extracted directly from
documents.
5.4.2.4.1. Identifier Lists and Code Schedules
In the case of certain kinds of named-objects, identi-
fier lists are required to ensure that the same organization,
place, etc., is always entered in the same manner so that
information is not missed during retrieval because of
incorrect or synonymous entries. In the subject indexing
area, a subject authority list or code scheme is required
to control the depth of indexing, synonyms, and homographs.
In some cases, the authorized entry form will be
identical to the way the entry will frequently appear in
documents. In other instances, the entry will be converted
to a code to either express the hierarchic structure
built into the identifier list--e.g., the hierarchic
arrangement of organizations in a Communist country--or
to compress a long entry into more abbreviated form to
conserve storage space.
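A code that expresses hierarchic structure might work as in the following minimal sketch, written in Python for illustration only. The two-digits-per-level scheme, the organization names, and the function names are hypothetical and are not taken from any CHIVE code schedule.

```python
# Hypothetical code schedule: each pair of digits adds one level of hierarchy.
ORGANIZATION_CODES = {
    "01": "Ministry of Heavy Industry",
    "0101": "Main Administration for Steel",
    "010101": "Steel Plant No. 1",
    "0102": "Main Administration for Machine Building",
    "02": "Ministry of Transport",
}

def subordinates(code):
    """Return the codes of every organization at or below the given code."""
    return sorted(c for c in ORGANIZATION_CODES if c.startswith(code))

def superiors(code):
    """Return the chain of parent codes, highest level first."""
    return [code[:end] for end in range(2, len(code), 2)]

print(subordinates("0101"))
print(superiors("010101"))
```

Under this assumption a search on a short code recovers an organization together with everything subordinate to it, and the code is also shorter than the full name it replaces.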
5.4.2.4.2. Extracted Words and Phrases
Words or phrases extracted from documents are used
(a) to index certain kinds of named-objects which will
not receive identifier list control, (b) to give greater
specificity to subject indexing, and (c) to provide
information retrieval via the index record.
In the first instance, it is felt that identifier
list control of all named-objects is impractical and
impossible. Where the volume of reporting is reasonably
restricted, or where one can predict fairly well which
named-objects will be the subject of customer queries,
it makes sense to control input through identifier lists.
Such is the case, for example, for place names and priority
organizations and installations. Personalities, however,
are neither few in number nor can one readily anticipate
which names will be requested. Similarly, in the case of
lower-level installations, it would not pay to exercise
a high degree of input control when it is probable that
the referenced information will be retrieved infrequently,
if at all. For both these categories, therefore, we
recommend that the burden of overcoming the synonym
problem be transferred to the output end of the system.
Key words taken from documents are added to subject
index categories to provide greater retrieval specificity
without complicating the subject schedule. The subject
indexing vocabulary provides a medium-depth, generic
searching capability. Key words added to the subject
schedule provide a specific search capability, e.g.,
equipment nomenclatures, types of research, new concepts,
etc.
The third application for entering key words from
documents is to provide a level of information retrieval.
In this case, the entry is uncontrolled, but the class
of entry is searchable. For example, one of the
personality attributes in the CHIVE system is "Reason
for Travel."
information would be provided by the index record which
would aid in selecting documents or in some cases obviate
the need to refer to documents.
5.4.2.4.3. Linkage
Index entries which are related (e.g., an organization
and its address) will be linked together in the index
record so that the relationship can be interrogated at
the index record level, thus negating the need to refer to
documents to determine ties among elements of information.
This is necessary because intelligence documents typically
include many people, organizations, areas, subjects, and
their interrelationships. If there were no way to deter-
mine the contextual relationship between these subjects,
the system would be overburdened with false retrieval
matches (false drops) requiring reference to many
irrelevant documents.
Linkage can be accomplished through the use of
formatted input, as is typical in punch card systems
(i.e., all entries in one defined record are by
definition linked), or by appending a linkage symbol to
each index entry, as is typical in systems utilizing un-
formatted input. Formatted input records are not practical
for CHIVE because of the long record lengths and large
number of variable elements of information included.
Experimentation with appending the linkage symbol to each
entry has worked very successfully and will be adopted.
5.4.2.5. Requirements for Identifier Lists and Thesauri
The use of identifier lists is recommended for the
following reasons:
(a) There is little consistency in the way named-
objects are reported, e.g., the Institute of
Physics of Moscow University may be referred
to as the Moscow Institute of Physics, or the
Moscow Physics Institute, or the Physics
Institute of Moscow University, or the Nuclear
Physics Institute, etc. Even place names are
translated and transliterated in a variety of
ways. Therefore, if named-objects were entered
as reported, it would be a very difficult
retrieval problem to determine the right synonyms
to use in order to find the variant entries. An
identifier list includes variants but allows only
one correct entry format.
(b) An identifier list (e.g., for organizations)
contains not only the name of the organiza-
tion, but also a number of identifying attri-
butes of the organization, including address,
commodities produced, etc. This capsule
summary aids the indexer in identifying and
discriminating among organizations and improves
the quality of the indexing.
(c) As was pointed out earlier, an identifier list
helps decrease redundant indexing because the
common attributes of a named-object do not have
to be repetitively indexed when they are listed
in the identifier list.
(d) Identifier lists are of value for answering
queries of a non-complex nature such as the correct
spelling of an organization or place, the precise
location of a facility, etc.
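As a minimal sketch of reasons (a) and (d), an identifier list can be modeled as a table that maps every reported variant to the single approved entry. The identifying number "ORG-0001", the field names, and the helper functions are hypothetical; only the name variants are taken from the text above.

```python
# Hypothetical identifier list: one approved entry per named-object,
# with the reported variants folded into it.
IDENTIFIER_LIST = {
    "ORG-0001": {  # illustrative identifying number, not a real CHIVE value
        "approved": "Institute of Physics of Moscow University",
        "variants": [
            "Moscow Institute of Physics",
            "Moscow Physics Institute",
            "Physics Institute of Moscow University",
            "Nuclear Physics Institute",
        ],
    },
}


def normalize(name):
    """Fold case and spacing so trivially different spellings compare equal."""
    return " ".join(name.lower().split())


# Reverse lookup built once: any known form -> identifying number.
LOOKUP = {}
for ident, entry in IDENTIFIER_LIST.items():
    for form in [entry["approved"], *entry["variants"]]:
        LOOKUP[normalize(form)] = ident


def resolve(reported_name):
    """Return (identifying number, approved entry), or None if not listed."""
    ident = LOOKUP.get(normalize(reported_name))
    if ident is None:
        return None
    return ident, IDENTIFIER_LIST[ident]["approved"]
```

A name that is not on the list returns None, which corresponds to the case where the indexer must enter the name as reported.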
Identifier lists will be required for installations
and organizations, place names, significant national and
international meetings and conferences, and personalities
on whom physical or logical dossiers are maintained. The
initial identifier lists will be constructed from the
machine language data which exists in OCR, and will be
issued to indexers in machine-listing form organized
geographically in the various sort orders as required.
Key words will be appended to hierarchic classification
terms to reflect the terminology of documents and to provide
greater search specificity. The initial concept is that
these words will be entered as written in documents and
will not be subject to thesaurus control. The key words
may be printed out, however, in answers to queries on the
hierarchic subject codes to which they are appended, and
should aid in determining which documents are relevant.
For example, if a requester is searching for a particular
aluminum alloy and three of the index records retrieved
refer to alloys in which he is not interested, the
requester can screen out these references from further
consideration.
If in the future it is determined that dictionary
control over key word entries will raise the quality of
the indexing and retrieval, key word thesauri can be
created by obtaining listouts of the key words which have
been applied to the individual hierarchic codes. These
key word lists would be turned over to dictionary editors
who would resolve synonym and homograph problems and weed
out undesirable terms. It is felt that this method of
building a thesaurus, i.e., building it from the actual
terminology used in documents, is both superior to and
cheaper than trying to adapt an established dictionary
to the Agency's indexing problem. In addition, one can
take advantage of the uncontrolled key word indexing prior
to the building of the thesauri.
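The listout step described above can be sketched as follows, assuming index entries are available as (subject code, key word) pairs. The subject code "SUBJ-A" and the key words are hypothetical; the function only groups and counts, leaving synonym and homograph decisions to the dictionary editors.

```python
from collections import Counter, defaultdict


def keyword_listout(index_entries):
    """Group uncontrolled key words under the subject code they were appended to.

    index_entries: iterable of (subject_code, key_word) pairs.
    Returns {subject_code: [(key_word, count), ...]}, most frequent first.
    """
    by_code = defaultdict(Counter)
    for code, word in index_entries:
        by_code[code][word.strip().lower()] += 1
    return {
        code: sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        for code, counts in by_code.items()
    }


# Hypothetical entries; the code and key words are illustrative only.
entries = [
    ("SUBJ-A", "aluminum alloy"),
    ("SUBJ-A", "Aluminum Alloy"),
    ("SUBJ-A", "aluminium alloy"),
]
listout = keyword_listout(entries)
```

In this sketch the two spellings remain separate rows in the listout, which is exactly the kind of variant an editor would later merge under one thesaurus term.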
5.4.2.6. Header Data Indexing
The foregoing discussion dealt with CHIVE concepts
related to indexing the subject content of documents.
Another important aspect of document indexing relates
to the so-called header (or bibliographic) elements of
the document such as title, author, control number, etc.
Header data indexing is required for the following
reasons:
(a) To obtain bibliographic control of documents
over which the Agency has a repository
responsibility.
(b) As searching parameters in conjunction with
subject or named-object searches.
(c) To provide minimum index control over docu-
ments which are not indexed in depth.
In the first instance above, header data control
would perform a service comparable to that performed by
the source card file maintained by the CIA Library.
The machine-stored header data record will be used to
verify the receipt of documents in the Agency and to re-
cover specific documents whose control numbers are unknown.
In the second instance, header data control will be used
most often to limit searches (e.g., searches can be restric-
ted to certain document series or dates), or a subject
request can specify that information is required only
when authored by a particular scientist. In the third
instance, header data will provide minimum, but important,
search keys at very little input cost. Permuted title
indexes can be published for certain series (e.g., finished
intelligence) in lieu of in-depth indexing. Similarly,
searches can be made for all reports issued by a particular
post during a specific time period when an important event
occurred. In this latter case, all documents can be
retrieved whether they were subject indexed or not.
Whereas the selection of documents for content indexing
will be subject to well-defined criteria and therefore
limited, it is anticipated that most documents can be
brought under header data control. This possibility is
rendered more likely by the fact that header data indexing
(with the exception of title expansion) can be performed
by clerical personnel, as borne out by the recent CHIVE
Indexing Experiment.
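A permuted title index of the kind mentioned above can be sketched as a key-word-in-context listing built from header data alone. The stop-word list, titles, and control numbers below are hypothetical, and the rotation format is only one of several possible layouts.

```python
STOP_WORDS = {"a", "an", "and", "of", "the", "in", "on", "for", "to"}


def permuted_title_index(titles):
    """Return sorted (key word, rotated title, control number) entries.

    titles: {control_number: title}. Each significant word in a title
    produces one entry, with the title rotated so that word comes first.
    """
    entries = []
    for control_number, title in titles.items():
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue
            rotated = " ".join(words[i:] + ["/"] + words[:i]) if i else title
            entries.append((word.lower(), rotated, control_number))
    return sorted(entries)


# Hypothetical titles and control numbers.
index = permuted_title_index({
    "DOC-1": "Production of Aluminum Alloys",
    "DOC-2": "Aluminum Production Trends",
})
```

Because only the title and control number are needed, such an index can be produced for a series that receives no in-depth subject indexing.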
5.4.3. SYSTEM DESCRIPTION
What follows is a summary description of the indexing
technique. A detailed description is given in Appendix 5.C.
5.4.3.1. Elements of Information and Indexing Tools
As stated above, the CHIVE indexing concept includes
"named-objects" and "subjects." Named-objects refer to
people, places, organizations/facilities, and conferences/
meetings. Subjects include commodities, concepts, research
activities, military activities, and all other topics and
events which do not fall under the above-defined named-
objects.
5.4.3.1.1. Personalities
Personality names will be entered more or less as
they appear on documents. Only those misspellings will
be corrected which it is possible to recognize without
reference to identifier lists or other support files.
The use of name search tools such as the Name
Tables and printouts of unique personal name/surname
combinations entered into the system will be investigated
as substitutes for controlling names during input pro-
cessing. When a specific name is searched, and all of
the records relating to that personality have been
identified, this identification will be retained so that
subsequent searches for the same personality will have
to address only those records which have been entered
since the previous search.
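The retained-identification idea above can be sketched as a cache that remembers, per name, which records matched and how far the file had been examined. The class, the sequence numbers, and the exact-match comparison are assumptions for illustration; an operational search would also have to deal with name variants.

```python
class NameSearchCache:
    """Remember which records matched a name and how far the file was scanned."""

    def __init__(self, records):
        # records: list of (sequence_number, personality_name), in entry order.
        self.records = records
        self.known = {}  # name -> (matching sequence numbers, last sequence scanned)

    def search(self, name):
        """Return (all matching sequence numbers, records examined this time)."""
        matches, high_water = self.known.get(name, ([], 0))
        scanned = 0
        for seq, rec_name in self.records:
            if seq <= high_water:
                continue  # already examined by an earlier search
            scanned += 1
            if rec_name == name:
                matches = matches + [seq]
            high_water = seq
        self.known[name] = (matches, high_water)
        return matches, scanned


# Hypothetical records; names and sequence numbers are illustrative.
records = [(1, "IVANOV"), (2, "PETROV")]
cache = NameSearchCache(records)
first = cache.search("IVANOV")    # examines both records
records.append((3, "IVANOV"))     # a new record enters the system
second = cache.search("IVANOV")   # examines only the new record
```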
A detailed list of the attributes of personalities
which will be indexed is included in Appendix 5.C. Most
of these attributes will be entered in a prescribed
manner and thus will be available for direct searching
in term files. For example, all locations will be entered
from approved gazetteers, dates will be formatted, organi-
zation affiliations will be entered from organization
identifier lists, etc. This will provide the capability
to make information retrieval type queries from the index
5.4.3.1.2. Organizations/Installations
Two levels of control will be applied to organizations
and facilities. Priority organizations will be included
in identifier lists. These lists will also include
significant attributes of the organization, e.g., addresses,
synonymous names, function code, products, etc. The lists
will be built from the machine language data which exists
in SR, BR, and FIB. The organization identifier lists will
be issued on a country basis in several arrangements, i.e.,
by name of organization, by function, and by place name
location.
For organizations on the list, the indexer will
enter an identifying number in lieu of the organization's
name, thus ensuring that all indexed information relating
to a specific organization can be retrieved exclusive of
other organizations with the same or similar names.
Attributes of organizations included in the identifier
lists will not be re-indexed when the same information
is repetitively reported.
Low-level installations and organizations will not
be identifier list controlled. They will not be indexed
by name but rather by location and a function code. It
may be desirable later to produce listings of these
facilities for mapping and aerial photographic customers.
Once these listings are established, it is unlikely that
any further indexing of these facilities would be required
unless the status of the facility changed.
5.4.3.1.3. Area/Locations
For indexing large geographic areas, e.g., blocs,
countries, and provinces within countries, the ISC area
code has proven a satisfactory tool and it is recommended
that it, or a similar country code, be adopted. For place
names within countries, there are a number of gazetteers
available, e.g., OCR generated gazetteers, the NIS
gazetteer, etc. The NIS gazetteer has recognized faults,
but it is generally conceded to be the most authoritative
tool available and it is recommended that it be used as
the authority for entering place names.
The basic gazetteer will be updated with new place
name entries encountered in documents and will be issued
on a country-by-country basis. Place names will be
entered in clear text as they are spelled in the gazetteer,
appended to the appropriate country code. Geographic
coordinates will be entered in index records only when
they are not associated with a place name. Coordinate searches
will be accomplished by a machine search of the gazetteer
to locate the appropriate place names having the desired
coordinates, followed by a search of the place name term
file plus a search for those coordinates that were
entered without an associated place name.
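The two-step coordinate search described above can be sketched as follows. The data structures, place names, coordinates, and document numbers are hypothetical, and expressing "the desired coordinates" as a bounding box is an assumption made for this illustration.

```python
# Hypothetical data; place names, coordinates, and document numbers are illustrative.
GAZETTEER = {  # (country code, place name) -> (latitude, longitude)
    ("XX", "ALPHA"): (55.75, 37.62),
    ("XX", "BETA"): (59.94, 30.31),
}
PLACE_TERM_FILE = {  # (country code, place name) -> document numbers
    ("XX", "ALPHA"): {"DOC-1", "DOC-2"},
    ("XX", "BETA"): {"DOC-3"},
}
BARE_COORDINATES = [  # coordinates indexed without any place name
    ((55.80, 37.70), "DOC-4"),
    ((10.00, 10.00), "DOC-5"),
]


def in_box(point, south, west, north, east):
    """True if a (latitude, longitude) point lies inside the box."""
    lat, lon = point
    return south <= lat <= north and west <= lon <= east


def coordinate_search(south, west, north, east):
    """Two-step search: gazetteer -> place name term file, plus bare coordinates."""
    docs = set()
    for place, point in GAZETTEER.items():
        if in_box(point, south, west, north, east):
            docs |= PLACE_TERM_FILE.get(place, set())
    for point, doc in BARE_COORDINATES:
        if in_box(point, south, west, north, east):
            docs.add(doc)
    return docs
```

The design keeps coordinates out of most index records: they live once in the gazetteer, and only unnamed locations carry coordinates in the index itself.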
5.4.3.1.4. Meetings/Conferences
Significant national and international meetings and
conferences will be controlled in identifier lists.
Earlier comments on the use of identifier lists for
organization control apply to this category also. Less
significant conferences will not be indexed by name,
but will be subject indexed with appropriate ISC subject
codes.
5.4.3.1.5. Subjects
The Intelligence Subject Code has been used throughout
the Intelligence Community for a number of years for subject
indexing, and it is generally recognized as the best
general indexing tool for intelligence documents. For
these reasons, CHIVE has recommended that it be used as
the basic subject indexing tool in a revised OCR system.
However, during the CHIVE Indexing Experiment, several
weaknesses were noted which should be corrected prior to
its adoption in a going system.
5.4.3.1.5.1. ISC Structure
The 1960 revision of the ISC did much to simplify
its structure. However, experience in using this edition
points to several areas where further simplification is
desirable.
(a) Expanded Use of Modifiers: The ISC subject
modifiers are a faceting device which can be
combined with certain subjects to specify
actions or states which affect those subjects.
For example, the modifier "049 Production"
can be combined with any commodity to indicate
production of the commodity. The 1960 revision
greatly expanded the use of these modifiers
over previous editions, but further expansion
is desirable in two ways:
(1) The 1960 revision limited the use of the
modifiers, i.e., each modifier could only
be used with specific chapters or sections
of the ISC. As a result, in sections where
a modifier cannot be applied, it has been
necessary to set up a subject code in lieu
of the modifier. For example, modifier
"069 Government Policies, Laws, Legislation,
etc." can only be applied to the commodity
chapter of the ISC. As a result, a subject
code for government policy has had to be
set up in various non-commodity sections
of the ISC. If the modifiers were freed
and the redundant subject codes deleted,
it would increase the efficient application
of the ISC. During the CHIVE Indexing
Experiment, the modifiers were freely
applied, and no particular difficulties
ensued.
(2) In a subject classification system, the
same subject is often repeated in several
different sections because each section
gives a different meaning or emphasis to
the subject. For example, in most classifi-
cation systems, guided missile subjects
would be found under engineering, production
activities, and military activities. This
repetition is logical, but it complicates
the structure of the system and makes it
hard to apply. A generalist indexer often
finds it difficult to determine whether
the information he is reading is oriented
toward engineering or production. If he
mistakenly puts engineering information
under production, it may be lost in a
later retrieval run. With the addition
of some new subject modifiers, much of
this repetition could be eliminated, i.e.,
the various subject facets could be shown
through the use of modifiers to distinguish
production from military activities, etc.
This would also considerably reduce the size
of the ISC.
(b) Expanded Use of Clear Text: Some of the detailed
subject breakdowns in the ISC could be eliminated
with a more liberal use of clear text. During
the experiment's indexing consistency test, it
was found that there was a low level of consistency
in applying the ISC. This can be attributed to
the depth of subject detail in the ISC, i.e., one
indexer will use "621.349 Uranium" and another
indexer will index the same subject matter using
"621.351 Natural Uranium."
If some of this subject detail were further
reduced so that there was only one subject code
for uranium, the consistent application of the
ISC would rise measurably. Moreover, indexing
specificity (e.g., natural vs. enriched uranium)
could still be achieved by using controlled
clear text as an extension of the subject code.
The advantage of this approach is that with more
consistent application of the ISC there is less
likelihood of losing information. This may
often put a burden on the searcher in that with
fewer subject categories, more material will
initially be retrieved, but this is preferable
to losing information and the free use of clear
text can help alleviate the problem. Thus, if
the clear text is uncontrolled, it can be used
as a screening device to get rid of unwanted
references, or if it is controlled, it can be
used as a searching device to restrict the volume
retrieved.
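The combination of a single subject code, a freely applied modifier, and controlled clear text can be sketched as follows. The record layout and document numbers are hypothetical; the code "621.349" and modifier "049" are the examples quoted in the text, used here on the assumption that only one uranium code remains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubjectEntry:
    document: str
    subject_code: str      # a single code for the subject, e.g. uranium
    modifier: str = ""     # e.g. "049" (Production)
    clear_text: str = ""   # controlled extension of the subject code


def search(entries, subject_code, modifier=None, clear_text=None):
    """Match on subject code; modifier and clear text only narrow the result."""
    hits = []
    for e in entries:
        if e.subject_code != subject_code:
            continue
        if modifier is not None and e.modifier != modifier:
            continue
        if clear_text is not None and e.clear_text.lower() != clear_text.lower():
            continue
        hits.append(e)
    return hits


# Hypothetical index records; document numbers are illustrative.
entries = [
    SubjectEntry("DOC-1", "621.349", "049", "natural"),
    SubjectEntry("DOC-2", "621.349", "049", "enriched"),
    SubjectEntry("DOC-3", "621.349", "", "natural"),
]
```

A search on the code alone retrieves all three records, so nothing is lost to inconsistent code choice; adding the modifier or the clear text then restricts the volume retrieved.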
5.4.3.1.5.2. Subject Schedules for Occupations and
Installations
A subject schedule or code is required for occupations
and installation types in order to respond to queries on
such subjects as all [redacted].
During the recent Indexing Experiment, specified
subject codes in the ISC were designated for this purpose.
Since the ISC was not constructed with this aim in mind,
the designated codes proved quite inadequate. Problems
were caused by the previously noted duplication of
subjects (e.g., an atomic installation could be indexed
in several different places), and by the multiplicity of
subjects in the ISC (i.e., a rather simple code schedule
was required, and the ISC was too detailed for the required
need). In addition, the ISC did not have appropriate
subjects for some occupation and installation categories.
*This need for a generalized subject schedule for occupa-
tions and installation types is to be distinguished from
the requirement to retrieve by specific activity. The
latter capability will be provided either through the
ISC code itself or, where necessary, through ISC plus key
word.
In view of the above problems, it is recommended
that the ISC not be modified to perform this function,
but that separate subject schedules be developed based
on the experience available in the Foreign Installations
Branch and Biographic Register.
5.4.3.1.5.3. Area Rules
The present ISC area rules proved inadequate on
several counts during the recent experiment.
(a) The terminology is sometimes confusing--e.g.,
some rules read "nationality is primary area."
Since the CHIVE indexing procedures provide
for area tags for nationality and primary
country, the terminology is subject to
ambiguous interpretation.
(b) Some of the rules are illogical--e.g., there
are two subject codes in Chapter VII which can
be used for foreign military training and the
area rule for one of the codes is the opposite
of the other.
(c) The CHIVE indexing technique allows more
flexibility in area relationships than is
allowed in the ISC as used in the Intellofax
system. Consequently, there are many subjects
which do not have area rules which should have
them appended for CHIVE purposes.
All these rules should be re-examined and modified before
the CHIVE system becomes operational.
5.4.3.1.5.4. Subject Gaps
Prior to the recent experiment, it was felt that a
number of subjects occurred in Codeword materials which
did not appear in Collateral documents and that, therefore,
the ISC would not be adequate for indexing these materials.
For this reason, sections of the Special Register code
manual were utilized as a supplement to the ISC during
the experiment. As it turned out, the SR supplement was
not used a great deal because the ISC had subject cate-
gories which were almost comparable. However, there are
a limited number of special-purpose subjects which should
be added to the ISC to make it fully suitable for all-
source indexing.
5.4.3.2. Tags
Each entry in the CHIVE indexing system is preceded
by a tag. A tag is a three-character mnemonic symbol which
identifies the entry which follows. Tags are used to:
(a) Distinguish between homographs, e.g., Washington
a person vs. Washington a city or street.
(b) Organize machine files, i.e., separate people's
names from organizations and subjects and
thereby facilitate searching.
The CHIVE tags were made mnemonic as a memory aid.
The first character of a tag represents a major subject
category, e.g., "P" = Personality, "O" = Organization,
etc. The second and third characters further specify
the element of information being indexed, e.g., "PVN" =
Personality Name Variant, "POH" = Personality Organization
Head, etc. (See Appendix 5.C. for a detailed list of the
CHIVE elements of information and their associated tags.)
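The two uses of tags can be sketched as follows: decoding the major category from the first character, and separating term files by tag so that homographs never collide. "PVN" is a tag from the text; "ONM" is a made-up tag for illustration, and the category table holds only the two categories named above.

```python
from collections import defaultdict

# First-character categories named in the text; the full list is in Appendix 5.C.
CATEGORIES = {"P": "Personality", "O": "Organization"}


def category_of(tag):
    """Return the major category encoded in the first character of a tag."""
    if len(tag) != 3:
        raise ValueError(f"tag must be three characters: {tag!r}")
    return CATEGORIES.get(tag[0], "Unknown")


def build_term_files(entries):
    """Organize (tag, value, document) entries into one term file per tag."""
    files = defaultdict(lambda: defaultdict(set))
    for tag, value, document in entries:
        files[tag][value].add(document)
    return files


# The same spelling indexed under two different tags stays in separate files.
term_files = build_term_files([
    ("PVN", "WASHINGTON", "DOC-1"),
    ("ONM", "WASHINGTON", "DOC-2"),
])
```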
5.4.3.3. Phrasing
The requirements for linkage were discussed earlier.
In the CHIVE system, linkage is accomplished through a
system defined as phrasing. A phrase is simply a group
of tags and terms which the indexer relates together
with a unique number which is assigned to each tag and
value in the group. On retrieval, queries can specify
that the input linkage must be present for the query to
be satisfied--i.e., a query may specify all information
on a person [redacted].
Without phrasing (linkage), all documents which contained
these two terms would be retrieved and in some cases the
relationship would be accidental. On retrieval, the phrase
linkage can be reconstituted by testing for those terms
which have the same document accession number and phrase
number.
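The reconstitution test described above can be sketched by grouping index entries on (accession number, phrase number) and requiring every query term to fall in one group. The tags "PNM" and "ONM", the names, and the accession numbers are hypothetical; a document-level search is included only to show the false drop that phrasing avoids.

```python
def linked_search(index, wanted):
    """Return (accession number, phrase number) pairs containing every wanted term.

    index: list of (accession_number, phrase_number, tag, value) entries.
    wanted: set of (tag, value) pairs that must all occur in one phrase.
    """
    phrases = {}
    for accession, phrase, tag, value in index:
        phrases.setdefault((accession, phrase), set()).add((tag, value))
    return sorted(key for key, terms in phrases.items() if wanted <= terms)


def unlinked_search(index, wanted):
    """Document-level match only: the terms may come from unrelated phrases."""
    docs = {}
    for accession, _phrase, tag, value in index:
        docs.setdefault(accession, set()).add((tag, value))
    return sorted(acc for acc, terms in docs.items() if wanted <= terms)


# Hypothetical entries; "PNM", "ONM" and all values are made up for illustration.
index = [
    ("A100", 1, "PNM", "IVANOV"), ("A100", 1, "ONM", "INSTITUTE X"),
    ("A200", 1, "PNM", "IVANOV"), ("A200", 2, "ONM", "INSTITUTE X"),
]
query = {("PNM", "IVANOV"), ("ONM", "INSTITUTE X")}
```

Here document A200 mentions both terms but in different phrases, so only the unlinked search returns it.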
The rule for phrasing is very simple. All terms
which are logically related can be combined in a phrase.
Thus, if a person is affiliated with an organization in
[redacted], these three elements of information can be
combined together in a phrase. However, if additional
information were given that this individual also traveled
to [redacted], an additional phrase would have to be constructed;
otherwise it might be interpreted that the organization, if
it appeared in the same phrase, was located in both
[redacted].
Phrases can be very simple or complex. The simplest
phrase contains only a place name or an area and one subject.
A complex phrase may contain a number of index terms which
constitute a rather detailed biographic sketch of an
individual. Further details on phrasing with examples
are contained in Appendix 5.C.
5.4.3.4. Header Data Indexing
Header data indexing will be performed by clerical
personnel who will type the information on formatted
transcript sheets. During the recent experiment, as
illustrated in the header section of Appendix 5.C., a
single transcript sheet was used for all documents. This
does not appear to be as efficient as developing unique
transcript sheets for different series.
The elements of information comprising header data
will be taken from the document or the information from
the document will be translated into code to achieve
conciseness and uniformity of entry. A formatted
transcript sheet can be used since the header data
elements are fixed in number for each document series,
and the length of entries is either fixed or a maximum
field length can be determined. The use of a formatted
sheet obviates the need for tags and the only linkage
required is to the document control number. The latter
will be appended automatically to each header data term.
A detailed list of the elements of information comprising
header data is included in Appendix 5.C.
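A formatted transcript line can be sketched as a fixed-position record in which field position replaces the tag, and the control number is attached to each resulting term. The field names, positions, and sample values below are hypothetical; the actual elements are those listed in Appendix 5.C.

```python
# Hypothetical layout for one document series: (field name, start, end).
LAYOUT = [
    ("control_number", 0, 8),
    ("date", 8, 14),      # YYMMDD
    ("series", 14, 18),
    ("title", 18, 58),    # maximum field length, blank padded
]


def parse_header_line(line):
    """Split one formatted transcript line into named fields by position."""
    padded = line.ljust(LAYOUT[-1][2])
    return {name: padded[start:end].strip() for name, start, end in LAYOUT}


def header_terms(line):
    """Return (control number, field name, value) for every non-empty field."""
    fields = parse_header_line(line)
    control = fields.pop("control_number")
    return [(control, name, value) for name, value in fields.items() if value]


# Illustrative line assembled field by field.
line = "DOC00001" + "650301" + "RPT " + "Production of Aluminum Alloys"
terms = header_terms(line)
```

A separate layout table per document series would correspond to the series-specific transcript sheets suggested above.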
Chapter 5.5.
SYSTEM FILES
5.5.1. INTRODUCTION
This chapter classifies and describes the logical
files and sub-files which will be available in the CHIVE
system. These are the files which are identified to the
user--i.e., the CHIVE information analyst and, perhaps
ultimately, the research analyst. They are the files he
must be familiar with, if he is to take full advantage of
the resources of the system and exploit it intelligently.
The total number of individual system files,
including old as well as new, might easily exceed a
hundred. However, it is possible to classify all the
various files into no more than nine types, each with
very distinctive functions and properties.
These nine categories are as follows:
Document Index Files: Files containing all the raw
document index records in the system, including not only
the complete index records themselves but the access
mechanism to these records. The documents referenced by
these records may include any form of information carrier
--e.g., maps, photos, films, or other, and need not neces-
sarily be readily accessible to the system.
Vocabulary Control Files: Files required to insure
consistent entry of index terms (tag and value) into the
Document Index Files and other system files. The principal
function of these files is to reduce the synonym problem at
search time. They include "identifier files" for named
objects (which, like scope notes in a code schedule, help
to distinguish one specific subject from another), code
books, dictionaries, thesauri, and other authority lists.
Unsynthesized Information Files: Files consisting of
select phrases or terms extracted from document index
records or directly from the raw documents themselves.
Such files would be built to facilitate retrieval where
a substantial number of requests for the pertinent data
can be anticipated on a continuing basis. Unlike Summary
Information Files (see below), records in these files would
often contain duplicative and/or contradictory information.
Periodically, however, information in such files might be
reviewed and added to the appropriate Summary Information
Files.
Summary Information Files: Files built either from
records (or portions of records) in the Document Index
Files, from records in Unsynthesized Information Files, or
from the raw documents themselves during or after input
processing. The distinguishing feature of these files is
the fact that they will ordinarily contain evaluated, non-
redundant data about named objects or events associated
with named objects. Named-object identifier files could
be placed in this file category, the only apparent
difference being the limited amount of historical data
ordinarily found in such files.
Special Project Files: The unique features of these
files are as follows: (a) the inputs to the files originate
outside CHIVE; (b) CHIVE actually acquires the files and
not simply "profiles" thereof; (c) additions or modifica-
tions to the files can be anticipated; (d) the files do not
use the elements of information and/or vocabulary controlled
in CHIVE. Special Project Files may otherwise have the
properties of any of the file classes named above. These
files will be processed by CHIVE but maintained by CIA or
other agency analysts. The degree of CHIVE involvement in
such files remains to be determined since the responsibility
for such files is currently assigned to the Applications
Division of OCS.
Referral Service Files: These files differ from
Special Project Files in that they are not substantive
data files but rather descriptions or profiles of files
located outside the CHIVE system. Referral Service Files
will consist both of profiles of analysts' special fields
of competence as well as files maintained by analysts and/
or information repositories external to CHIVE. CHIVE will
not maintain, or retrieve data from, the substantive files
themselves. It will simply inform customers of those files
potentially relevant to a given query.
Document Image Files: Files of documents stored by
the CHIVE system. From a functional point-of-view they
include "aspect" systems (where the index is stored
separately from the documents) as well as self-indexed
document files. Both existing OCR document collections
as well as CHIVE-originated document repositories are
encompassed by this category. The storage media for such
files will include hard copy, various types of microimages,
and even digital storage in some instances. Similarly,
the categories of documents involved will differ widely in
size, shape, classification, and point of origin.
Management Data Files: Files of data collected on
the activity of the CHIVE system to (a) enable operational
management to evaluate the cost/performance ratio of the
system and (b) to guide system designers in improving
hardware and software support. From the point-of-view of
what data is collected, most of the Management Data Files
will have to do with either system processing times or
processing volumes.
System Processing Files: Files used to support the
system in processing data. Most such files will be
organized in table form enabling values to be obtained from
arguments. Examples would include a file of legal tags and
other error correction files, decode dictionaries which
would convert codes into clear text for display to a reader,
intermediate files which exist only temporarily during
the processing of a transaction, working storage files,
etc. Since these files are largely internal to the CHIVE
EDP System and the information analyst need not interact
with them in any direct way--only know what functions the
system is capable of performing--they will not be covered
further in this volume but rather in Volume VII of the
report.
For each of the file categories listed above a
second-level categorization may be required, i.e., one
which classifies CHIVE files from the point-of-view of
the origin of the files. These classes are three in
number:
CHIVE-Built Files: Files built by and for the CHIVE
system either from new inputs or through the conversion of
existing OCR files to the format and vocabulary of CHIVE.
These files will be continually updated as part of the
regular processing cycle.
Inherited Files: Files originally established by the
various OCR systems which it was not found possible to
integrate with new CHIVE files. Such files will include
records in hard copy as well as machine language. In some
instances these files may be transferred to another storage
medium (e.g., magnetic tape) if querying and output can
thereby be improved. Similarly, some existing machine-
readable files may be restructured and interrogated in
the vocabulary of a single CHIVE language. Neither
of these changes, however, implies true conversion to
the CHIVE system. Another significant difference
between these files and CHIVE-Built Files is that while
both will be used by the CHIVE information analyst, no
additions will be made to the Inherited Files once the
CHIVE system is fully operational.
Supplemental Files: Files not built or maintained
by CHIVE, nor inherited from OCR, but which contain data
functionally useful to CHIVE as a secondary source of
information. All Special Project Files (see above) fit
this category, as do reference aids of various kinds (e.g.,
Who's Who compilations, gazetteers, commercially published
indexes, etc.) obtained from external sources and left
essentially in the form in which they were received.
In the broadest sense the CHIVE system must
necessarily include not only the new files it creates
but the files it inherits from the existing system. The
discussion of these separate but related subjects, however,
has been divided in the pages to follow to lessen the
possibility of losing the reader in the file forest. In
the main body of this chapter, we will focus primarily on
the CHIVE-Built Files, providing a summary description of
their functions, data content, maintenance criteria, and
other characteristics. Appendix 5.D to this volume
describes the principal Inherited Files which must be
accommodated by the system, with primary attention given
to those which fall in the categories of Document Index
and Document Image files, as defined above.
It should be emphasized that the basic objective in
this chapter is to communicate a more or less static
image of the files in order to simplify understanding of
the structural framework (or file philosophy) of the
system. In Chapter 5.6. we will examine the more dynamic
aspects of file activity within the system, i.e., the
transactions which will affect the files, interactions
which might take place between files, etc.
5.5.2. DOCUMENT INDEX FILES
5.5.2.1. Master Index File (MIF)
The Master Index File of the CHIVE system will contain
the index entries for all the documents available in the
CHIVE system as well as certain classes of documents
located in repositories not under CHIVE management. Examples
of the latter include maps, the storage responsibility for
which will be retained by the Map Library Division of ORR,
and select open-source books and periodicals which may be
accessible only at the Library of Congress or at some other
holding agency. Conceivably, certain documents indexed by
CHIVE may not even be available at all--for example,
select Soviet periodicals never received in this country,
but which were described in secondary sources that were
accessioned. In all cases, however, whether the
original source document is readily available or not, the
preparation of index records for the CHIVE Master Index
File will be under CHIVE format and vocabulary control no
matter where the records are physically prepared.
All index records will be stored in such a manner
that a search, based on certain criteria, will produce all
the records in the system or, at the customer's option,
phrases and/or terms within records which may apply to
the search criteria. The index records will contain
sufficient information to enable the requester to determine
if the document referred to in the index entry should
be requested for detailed study. In the case of named-
object associated information, the entries will have
sufficient information-bearing content to permit summary
data files to be built and responses given to certain
queries directly from the index records themselves without
referral to the source documents.
Records entering the Master Index File will originate
from the following sources:
- CHIVE information analysts processing incoming
documents in the CHIVE geographic divisions.
- Graphic analysts indexing photos and films in the
Graphics Register (GR).
- Map catalogers processing maps in the Map Library
Division (ML), ORR.
- Miscellaneous additional organizations (either under
contract to CHIVE or agreeing to follow CHIVE input
procedures) exploiting primarily foreign language
documents. Examples of such organizations might be
the Library of Congress, FDD, etc.
- Documents received by CHIVE in machine language
(e.g., Comint teletype) on which a limited form of
automatic indexing is to be performed.
- Machine-converted document index files from
existing central repositories.
With regard to input selection criteria, assuming
continuation of present practices, CHIVE will have the
responsibility to serve as the Agency's repository for
community-published positive intelligence materials (with
the exception of cables and maps), and to provide reference
service on "active" documents. In addition, it will pre-
sumably assume OCR's obligation to serve as the office
of record for archival storage of certain CIA document
series.
In order to fulfill these responsibilities, CHIVE
will be obliged to index at least the header (or biblio-
graphic) data for every "intelligence" document received.
By "intelligence" documents we mean all categories of
textual materials generally considered to be in the
mainstream of intelligence reporting. These include
Comint (messages, reports, and teletype), T/KH reports,
USIB-produced IRs, USIB-produced finished intelligence,
the FBIS, photo enclosures to IRs, and USIB-produced trans-
lations of foreign documents. By agreement with the
Library it will also store map index records
generated by ML.
The preparation of index records on other categories
of materials, e.g., cables, non-USIB-produced reports and
studies, films, and original open-source literature, will
depend on the substantive content therein.
The content of document index records can include any
term type permitted by the vocabulary of the CHIVE
indexing system. (For a list of all permissible term
types, see Appendix 5.C.) No single record will, of course,
contain all possible term types since some terms will be
unique to certain kinds of documents.
Outputs from the Master Index File will consist of
scheduled and ad hoc products. The principal items to
be provided within each category are briefly described
below.
Scheduled Products
- KWIC listings of titles or expanded titles of all
documents which have not been content indexed, as
well as the FBIS Daily Reports. The permuted
portion of the listing will be the title and
expanded title, while the reference portion will
include basic header data for the document
including document control number. Separate, as
well as combined, listings will probably be run
for the different categories of documents
involved, e.g., SI Teletype, FBIS, Finished
Intelligence, and Raw Intelligence Reports.
- Map catalog cards in 3" x 5" form containing in
clear text on each card the entire index record
for a map. This record would include accession
number, area code, subject, scale, classifica-
tion, map title, date of publication and name
of publisher. The cards would be outputted in
the sequence of the Map Library Card Catalog to
facilitate interfiling at the Map Library.
- Output similar to the map cards, but reflecting
index records on films stored in the Master
Index. In this instance, the records will
probably be of tab card size to conform with
the size of the existing file. The sequence
will also conform with the existing Intellofax
reference card file on the film collection.
- Accessions lists comprised of clear-text index
records on maps, ground photos, and perhaps
select additional document receipts (e.g.,
tables of contents of foreign scientific
periodicals) processed by CHIVE.
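The KWIC (keyword-in-context) listing named among the scheduled products can be sketched as follows. This is a minimal illustration under stated assumptions, not the CHIVE program: the stop-word list, the "/" marker for the original start of the title, the sample title, and the control-number format are all hypothetical.

```python
# Minimal KWIC (keyword-in-context) sketch. The stop words and the
# "/" wrap marker are illustrative assumptions.
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}


def kwic_entries(title, control_number):
    """Yield (keyword, permuted title, control number) per significant word."""
    words = title.split()
    for i, word in enumerate(words):
        if word.lower() in STOP_WORDS:
            continue
        # Rotate the title so the keyword leads; "/" marks the original start.
        permuted = " ".join(words[i:] + ["/"] + words[:i])
        yield word.lower(), permuted, control_number


def kwic_listing(documents):
    """Build a listing sorted on keyword from (title, control number) pairs."""
    entries = []
    for title, control_number in documents:
        entries.extend(kwic_entries(title, control_number))
    return sorted(entries)
```

In this sketch the permuted portion is the rotated title and the reference portion is reduced to the control number alone; a fuller reference portion would carry the other header data the text describes.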
Ad Hoc (Query) Products
- Listings in natural language of document index
records or subsets thereof (i.e., phrases within
records) containing the search terms specified
in the query. Subject or concept-oriented
queries will normally require output of complete
document index records including header as well
as content data. Named-object-oriented queries
will ordinarily result in the output of select
phrases only which match the search criteria,
together with a limited amount of header data
(e.g., document classification, document type,
and appropriate document control numbers).
- Listings of control numbers only for documents
whose index records match the search parameters.
- Listings containing simply a computed figure of
the number of index records matching the search
parameters. This kind of intermediate output
will enable the customer to broaden or narrow
the search prescription depending on the volume
of the anticipated output.
- Listings of index records containing terms which
match standing customer queries or analyst
interest profiles. Whenever hits occur because
new information is received on a subject or
person of interest to a particular research
analyst, the customer would be notified through
the transmittal of the listing containing the
pertinent record.
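The ad hoc product types listed above differ mainly in how much of each matching record is returned. The sketch below illustrates that distinction under simplifying assumptions: an index record is modeled as a control number, a small header, and a list of phrases, each phrase being a set of terms. The `IndexRecord` class and all function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class IndexRecord:
    control_number: str
    header: dict                                  # e.g. classification, type
    phrases: list = field(default_factory=list)   # each phrase is a set of terms


def matches(record, terms):
    """Return the phrases of a record that contain every search term."""
    return [p for p in record.phrases if terms <= p]


def full_records(records, terms):
    """Subject-oriented output: complete records with at least one hit."""
    return [r for r in records if matches(r, terms)]


def phrases_only(records, terms):
    """Named-object output: matching phrases plus limited header data."""
    return [(r.control_number, r.header.get("classification"), p)
            for r in records for p in matches(r, terms)]


def control_numbers(records, terms):
    """Control numbers only for records matching the search parameters."""
    return [r.control_number for r in full_records(records, terms)]


def hit_count(records, terms):
    """Computed figure of matching records, for sizing the output."""
    return len(full_records(records, terms))


def profile_hits(new_records, profiles):
    """Match newly received records against standing interest profiles."""
    return {analyst: control_numbers(new_records, terms)
            for analyst, terms in profiles.items()
            if hit_count(new_records, terms)}
```

A customer could call `hit_count` first and then broaden or narrow the term set before asking for the fuller outputs, which mirrors the intermediate-output use described in the text.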
5.5.3. DOCUMENT IMAGE FILES
The Document Image File (DIF) is the central repos-
itory for active textual intelligence documents. Maps
and graphics are to be retained within the respective
organizations currently responsible for their retention,
although all of these will be used in conjunction with
the computer-based Master Index File described in the
previous section. The Document Image File shall consist
of those textual intelligence documents for which the
Agency has repository responsibility as well as other
documents which are judged to contain information of
potential value within the intelligence community. As
a central repository it is to be all-source, containing
USIB finished and unfinished intelligence reports,
FBIS, Open-Source literature (including JPRS and FDD
translations), COMINT (messages, reports, and teletypes)
and selected cables. The system will be inclusive of
inherited document image files (see Appendix 5.D.) as well
as newly-accessioned, CHIVE-processed documents. To
effect this, files currently maintained in various
locations (SR, LY/Circ, etc.) would be moved to a single
physical area within the headquarters building along
with the CHIVE document system, thus offering the user
a single point of entry for his reference needs. The
discussion of a proposed approach to implementing this
central repository is contained in section 5.7.3.
The primary purpose of the Document Image File is
to serve as a central reference point from which
identified documents may be retrieved and copied for
distribution. The identification of the documents (by
unique identification number) may be accomplished via a
computer search of the Master Index File, or it may be
known by some other means by the requester. The file
must be responsive to either type of demand. Documents
are not to be circulated outside of the file area;
and requests are to be serviced by producing a durable,
hard-copy replica of the document master for distribu-
tion to the requesting user. The design goals for the
volume and turn-around times in responding to these
file demands are outlined in sections 6.2.1. and 6.5.6.
Aside from its primary purpose of providing a
repository for retrospective reference, a number of
secondary purposes must be served by the system.
First, provision must be made for a backup file capa-
bility. This duplicate file must be produced as a
by-product of the input procedure, and must be suitable
both as an alternate reference point in the event of
loss or destruction of items in the main file, as well
as a means of reconstructing the main file in the event
of catastrophic destruction. Provision for selective
protection of vital records is also within the scope of
the document image subsystem although no special design
consideration has been devoted to this requirement in
this study. An additional implicit requirement of the
document system is the need to provide archival quality
records for those items requiring prolonged retention.
5.5.4. VOCABULARY CONTROL FILES
5.5.4.1. Personality Identifier Files
5.5.4.1.1. Master Dossier File (MDF)
The functions of the Master Dossier File are:
- To identify hard copy folder files maintained by
CHIVE on select personalities.*
- To reduce search time on requests for select
personalities by virtue of the fact that the
information analyst determined in advance of
the request which incoming name references
pertained to these personalities. An analogy
could be drawn here to the difference between
searching a tightly controlled classification
system and an uncontrolled keyword index. A
listing of the Dossier File, wherein is contained
a unique file number and set of attributes for
each personality cited, is directly analogous to
a listing of a classified schedule which contains
both a unique code and often scope notes defining
each term in the listing.
- To provide by means of a printout of the
identifying information on each dossier personality
a summary-type information record which can be used
to answer requests, serve as a reference aid for
research analysts, facilitate screening of dossier
files without the necessity for examining the files
themselves, etc.
*The reason for maintaining hard copy folder files,
in addition to storing documents in the Master Image File
in microimage form, would be the anticipated high request
activity on these select documents which would increase
reproduction costs significantly. An alternative to
maintaining hard-copy personality files would be to
maintain lists of documents referring to select individuals,
and, when one of these individuals' file is requested, to
reproduce all the documents referred to in the list. This
approach of building a dossier on a "demand" basis would
make sense if experience proves that redundancy in name
searches is minimal. Pending further study, however, of
the redundancy factor during Phase III, we have assumed a
requirement for some hard-copy personality files and made
provision for same in the system design described here.
The initial CHIVE Master Dossier File will be derived
from the BR dossier system with possibly some deletions
to the latter file. The subsequent creation of new
dossier records will occur largely as a result of name
searches, taking advantage of the fact that document index
records and their related documents have been analyzed
in the course of answering the customer's inquiry.
Dossier identifier records may refer either to
physical or logical dossiers. In those cases where it
seems desirable to establish a hard-copy file on a
personality, a digital record containing select elements
of identifying information (see below) will be prepared
and added to the Master Dossier File. In addition, all
documents containing information on the individual will
be reproduced and stored in a folder on the personality
involved. Logical dossiers will also be represented by
identifier records in the digitalized MDF, but the docu-
ments relevant thereto will be accessible only in the
Master Image File.
Maintenance of both the hard copy dossiers as well
as the digitalized Master Dossier File--although left to
the discretion of the analyst--would also ordinarily be
performed as a corollary function to name searches and
not at the time incoming documents are indexed. This
means that a given hard-copy folder file will not
necessarily contain all the available documents on an
individual which may be held in the Master Image File
except immediately subsequent to a request having been
answered on said personality. Similarly, the digital
record on a dossier personality will only be current as
of the time of the last request.
The contents of a digital record in the Master
Dossier File will be:
- Personality Name
- Variant Name
- Telegraphic Code
- Dossier Number
- Birth Date
- Citizenship
- Date of Death
- General Occupation
- Organization Affiliation
- Position Title
- Organization Affiliation Date (year only)
- Date Record was Last Updated
- Document Reference Numbers
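The field list above maps naturally onto a single record type. The following sketch is one possible representation, not the CHIVE record layout: the field names are transliterations of the listed elements, and the choice of which fields are required, the string date format, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DossierRecord:
    # Treating name and dossier number as required is an assumption.
    personality_name: str
    dossier_number: str
    variant_names: List[str] = field(default_factory=list)
    telegraphic_code: Optional[str] = None
    birth_date: Optional[str] = None
    citizenship: Optional[str] = None
    date_of_death: Optional[str] = None
    general_occupation: Optional[str] = None
    organization_affiliation: Optional[str] = None
    position_title: Optional[str] = None
    affiliation_year: Optional[int] = None        # year only, per the list
    last_updated: Optional[str] = None
    document_reference_numbers: List[str] = field(default_factory=list)

    def sort_key(self):
        """Key for a listing arranged by name within citizenship."""
        return (self.citizenship or "", self.personality_name)
```

The `sort_key` helper reflects the arrangement "by name within citizenship" that the text gives for the master listings.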
Whenever new dossier records are added to this file
or changes made to existing records as a result of name
searches, the following actions will take place:
(a) Documents not previously filed in physical
dossiers will be reproduced for same.
(b) Dossier identifier records will be created or
updated.
(c) The list of document control numbers attached
to each identifier record will be compiled or
updated.
The effect of (c) will be to establish the identity of
the individual mentioned, thus capturing the results of
the analysis. Persons searching the same name at a
later date will have to employ standard search strategy
techniques only to recover those records from the
Master Index File which might have entered the system
subsequent to the previous search. Earlier references
will be available either via the hard copy dossier
itself or, in the case of logical dossiers, through an
"absolute" search on the document numbers known to be
relevant to the individual concerned. The latter can
be a semi-automatic process in which the information
analyst need only specify the dossier number involved--
i.e., the computer will find the document numbers perti-
nent to the dossier, and either print them out or use
these numbers to locate and output the corresponding
index records.
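The "absolute" search on a logical dossier, together with the compilation of document numbers in step (c), can be sketched as two small operations. The tables, identifiers, and function names below are hypothetical stand-ins; the real Master Index File and Master Dossier File would of course not be in-memory dictionaries.

```python
# Hypothetical stand-in for the document numbers carried in the Master
# Dossier File: dossier number -> numbers already judged relevant.
DOSSIER_DOCUMENTS = {"D-0001": ["DOC-10", "DOC-42"]}

# Hypothetical stand-in for the Master Index File:
# document control number -> index record (reduced to a label here).
MASTER_INDEX = {
    "DOC-10": "Index record for DOC-10",
    "DOC-42": "Index record for DOC-42",
    "DOC-77": "Index record for DOC-77",
}


def absolute_search(dossier_number, want_records=False):
    """Return the document numbers for a dossier, or their index records."""
    numbers = DOSSIER_DOCUMENTS.get(dossier_number, [])
    if want_records:
        return [MASTER_INDEX[n] for n in numbers if n in MASTER_INDEX]
    return list(numbers)


def record_search_results(dossier_number, new_numbers):
    """Step (c): add newly identified document numbers, skipping duplicates."""
    known = DOSSIER_DOCUMENTS.setdefault(dossier_number, [])
    for n in new_numbers:
        if n not in known:
            known.append(n)
    return known
```

Because earlier results are captured against the dossier number, a later searcher would only need a conventional search for records that entered the system after the last update.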
In the initial CHIVE system, scheduled outputs
from the Master Dossier File will consist of:
- Master listings in natural language of the personality
identifier records arranged by name within
citizenship.
- Cumulative supplemental listings in the same
arrangement as the master listings.
Demand (ad hoc) products of the file will include
natural language printouts of the records on a variety
of media (e.g., cards, listings, etc.) in any sort
order desired by customers.
5.5.4.1.2. Name Group Tables
The function of name group tables is essentially
that of any dictionary of synonym and "see also"
references. Such tables properly belong in a list of
"vocabulary control files" since, like any term
dictionary, they serve to relate the several ways in
which a term (in this case personality name) can be
spelled to a standard code.
In the CHIVE system, it is proposed to experiment
with the two kinds of name group tables developed by
and for the Surname Table and
Given Name Table. Each of these tables contains a list
of all the surnames or given names, as the case may be,
which have occurred within the system. Listed with each
name is a reference to the name group to which it has
been assigned.
The functions of these name tables are: (a) to
determine if a specifically spelled surname or given
name is contained within the system, and (b) to associate
a group number to the name. A new name entry in a
document index record, before it can be filed, must
match a name in the name table. If it finds no match,
the machine will print out a notice to the information
analyst to this effect. The latter will then consult
his tables or the...II/CHIVE expert concerned (pro-
cedure to be determined), assign a group number to the
name, and re-enter the record into the machine file.
Ideally, the name group table concept reduces the
intellectual problem for the name searcher by providing
for a guided search of potentially relevant, alternative
name spellings. This capability will not, however,
preclude the searcher from bypassing the name grouping
feature if he wishes the machine to yield only those
records which exactly match the spelling(s) in his
request.
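The behavior described for the name group tables (match a spelling, return its group number, flag unmatched names for the analyst, and let the searcher choose between a grouped and an exact search) might look roughly like the sketch below. The table contents, group numbers, and function names are invented for illustration and say nothing about how the actual tables were organized.

```python
# Hypothetical surname table: spelling -> name group number.
SURNAME_TABLE = {"IVANOV": 101, "IVANOFF": 101, "IWANOW": 101, "PETROV": 102}


class UnknownName(Exception):
    """Raised when a name is not in the table; an analyst must assign a group."""


def group_number(name):
    """Return the group number for a spelling, or signal that it is unknown."""
    key = name.upper()
    if key not in SURNAME_TABLE:
        raise UnknownName(f"{name!r} not in surname table")
    return SURNAME_TABLE[key]


def assign_group(name, group):
    """Analyst action: enter a new spelling under a chosen group number."""
    SURNAME_TABLE[name.upper()] = group


def search_spellings(name, use_groups=True):
    """Return the spellings to search: the whole group, or the exact one only."""
    key = name.upper()
    if not use_groups:
        return [key]
    group = group_number(key)
    return sorted(n for n, g in SURNAME_TABLE.items() if g == group)
```

The `use_groups` flag corresponds to the searcher's option of bypassing the name grouping feature when only exact spellings are wanted.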
Scheduled products of the Name Group Tables will
include listings of both the surname and given name
tables arranged in both name and group number order.
Query products may include the variant names searched
within a given name group as well as "see also" references
to names in related groups.
5.5.4.2. Organization/Facility Identifier Files
The Master Organization/Facility Identifier File
(MOFIF), like other vocabulary control files, is
required to insure consistent indexing of items of
information derived from documents--in this case organi-
zations or facilities (installations). In the sense
used here, organizations and facilities are defined in
the broadest possible terms. They include political,
economic, military, cultural and scientific bodies, as
well as physical installations which are relatively
fixed in terms of geographic location (e.g., a weather
station).
Like the Master Dossier File, the functions of the
MOFIF will be:
- To identify hard-copy folder files maintained by
the system on select organizations where high
request activity is anticipated.
- To provide, via printouts from the file,
identifying information about organizations which
an information analyst can browse in order to
determine (a) whether he has previously assigned
a code or unique identifying number to an organi-
zation and/or (b) whether there is a hard-copy
dossier available on an organization.
- To reduce search time on requests for "controlled"
organizations.
- To provide a summary-type information record
which can be used to answer requests, serve as
a published reference aid, etc.
Not all organization and facility names will be
placed under MOFIF control. Furthermore, not all organi-
zation references encountered in documents will be
indexed by the specific name and/or identifying number
of the organization mentioned. Some organizations will
be indexed only by type using the OTF tag. Still others
(e.g., a laboratory or committee) will be indexed by
their parent organization's code, but will not be
assigned a unique identifying number of their own, con-
sequently they too will not appear in the MOFIF.
The initial CHIVE Master Organization/Facility
Identifier File, which will be composed
will be built during Phase III of the
CHIVE project from existing organizational dictionaries
developed by FIB, SR, and BR. Each organization record
resulting from this process of analysis and synthesis
will include, in addition to the CHIVE-assigned
identifying number and name, the code number or numbers
(if any) by which the organization was previously
identified in the OCR register(s) from which it was
derived. These "cross reference" numbers will help
indicate to the searcher whether there is information
stored on an organization in one or more of the files
inherited from OCR (e.g., SR Detail Index, FIB card
and plant folder files, BR dossier, etc.). The absence
of such cross references in a record would mean either
that there was no inherited information available on
the organization, or that the organization was so loosely
controlled in the earlier system that a subject search
or some other method for accessing the files would be
required to uncover the pertinent data.
As in the case of personalities, it is planned that
certain organizations would have hard-copy dossier files
where a high request rate is anticipated (but see footnote
in section 5.5.4.1.1. regarding request redundancy study).
If this plan is implemented, incoming documents containing
information on dossier-controlled organizations would
probably be added to the dossiers as a part of the initial
processing activity rather than at the conclusion of
search operations on said organizations (as was proposed
in the case of personality dossier maintenance). The
reason for this is that the MOFIF would ordinarily have
to be consulted when indexing an organization in order
to obtain the correct CHIVE identification number for the
organization. Such being the case, little additional
effort would be required to determine from the MOFIF
listing whether a dossier was being maintained on the
organization and, if so, to direct that a copy of the
document be deposited in the dossier concerned.
The contents of a digital record in the Master
Organization/Facility Identifier File will be as follows:
- Translated Name and/or Number
- Functional (Assigned) Name
- Foreign Language or Transliterated Name
- Variant Name(s)
- Previous Name(s)
- Name Abbreviation
- Telegraphic Code
- CHIVE-Assigned O/F Number
- Dossier Indicator
- Cross-Reference Numbers
  - FIB
  - SR
  - COMOR
  - BR
  - BE
  - NPIC
  - TDI
- Address
  - Country
  - Political Subdivision
  - Place Name
  - Coordinates
  - Street Address
  - Cable Address
  - Post Box Number
- Parent Organization
- Type O/F
- Source Citations
- Remarks
Not all the elements of information listed above
will appear in every organization/facility record in the
MOFIF. Not only will the type of organization have an
effect on the elements of information that will customarily
appear in its identifier record (e.g., a political body
will not have a COMOR or BE number, nor perhaps a
specific address), specific elements of information will
be unavailable on many organizations and facilities.
Source citations, where desirable, for items of
data carried in the MOFIF can be included in the identi-
fier records by referencing the control number of the
document which provided the information. One source
reference for each element of information in an MOFIF
record would probably be sufficient. If additional
supporting evidence for a given fact was required, the
index records in the Master Index File could be searched.
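A record with many optional elements, cross-reference numbers to inherited registers, and at most one source citation per element could be modeled along the following lines. This is a sketch under assumptions: only a subset of the listed elements is shown, and the class name, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class OFRecord:
    of_number: str
    translated_name: str
    variant_names: List[str] = field(default_factory=list)
    dossier_indicator: bool = False
    # Register name (e.g. "FIB", "SR") -> number used in that register.
    cross_references: Dict[str, str] = field(default_factory=dict)
    country: Optional[str] = None
    place_name: Optional[str] = None
    of_type: Optional[str] = None
    parent_organization: Optional[str] = None
    # Element name -> control number of the document supplying it.
    source_citations: Dict[str, str] = field(default_factory=dict)
    remarks: str = ""

    def inherited_files(self):
        """Registers in which inherited information may be found."""
        return sorted(self.cross_references)

    def cite(self, element, control_number):
        """Keep a single source reference per element of information."""
        self.source_citations[element] = control_number
```

Keying the citations by element name enforces the one-reference-per-element idea; an empty `cross_references` mapping corresponds to the case where no inherited information is indicated for the organization.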
The "Remarks" field of an MOFIF record is intended
for use in recording historical facts about changes in
organizational nomenclature, hierarchic relationship to
other organizations, etc. This information will not be
directly accessible, but may be displayed on printout to
enable searchers to determine how to formulate a request
which will insure recovery of all pertinent data about an
organization despite organizational changes which might
have taken place over the years.
In the initial operational system, current thinking
is that human interface with this file, as well as all
other vocabulary control files, will be through the
medium of the printed listing. It is recognized that
this is not a wholly satisfactory solution (although a
familiar one), particularly in view of the probable
increase in the number and size of vocabulary control
files which the CHIVE indexer must routinely consult.
For this reason, an in-depth study of the matter is
planned during Phase III of the CHIVE project.
Scheduled outputs of the Master Organization/
Facility Identifier File will consist of:
(a) Master listings in natural language of the
organization identifier records pertaining
to a single country. The types of master
listings required are:
(1) A permuted title listing of all organization
names (including official, assigned,
variants, etc.). The reference portion of
the listing will be ordered by O/F number
and contain the complete identifier record
for each organization referenced.
(2) A listing of MOFIF records, without
organization name permutation, ordered on
place name.
(3) A listing identical to (2) but ordered on
type of organization/facility.
(4) A listing identical to (2) but ordered on
geographic coordinates.
(b) Cumulative supplements to the master listings
issued in the same arrangements as the master
listings but bound together. Alternatively,
supplementary listings may be issued in the
form of pages to be inserted in master listings.
Demand (ad hoc) products of the file will include
natural language printouts of the records on a variety
of media (e.g., cards, listings, etc.) in any sort order
desired by customers.
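The master listings (2) through (4) are the same set of records presented in different sort orders, and the demand products generalize this to any order a customer requests. A minimal sketch of that idea follows; the record fields, the tie-breaking on O/F number, and the sample data (with invented place names) are assumptions for illustration.

```python
# Hypothetical identifier records for a single country.
RECORDS = [
    {"of_number": "OF-3", "name": "Plant A", "place": "Place C",
     "type": "plant", "coordinates": (55.0, 73.4)},
    {"of_number": "OF-1", "name": "Station B", "place": "Place B",
     "type": "weather station", "coordinates": (50.5, 30.5)},
    {"of_number": "OF-2", "name": "Institute C", "place": "Place A",
     "type": "institute", "coordinates": (40.4, 49.9)},
]


def listing(records, order):
    """Return records sorted on one field, with O/F number as tie-breaker."""
    return sorted(records, key=lambda r: (r[order], r["of_number"]))


def of_numbers(records, order):
    """Convenience view: only the O/F numbers, in listing order."""
    return [r["of_number"] for r in listing(records, order)]
```

For example, `of_numbers(RECORDS, "place")` gives the place-name arrangement and `of_numbers(RECORDS, "type")` the arrangement by type of organization/facility.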
5.5.4.3. Meeting/Conference Identifier Files
In the planned central retrieval system a requirement
exists for retrospective searching of documents dealing
with certain meetings and conferences. The conditions
which dictate whether a given conference should be
indexed by name or identifying number cannot be stated
with complete precision at this time. Nevertheless, the
fact that some conferences (possibly international
scientific meetings attended by USSR nationals) must be
controlled dictates that the capability be provided in
the CHIVE system to retrieve the pertinent data, whatever
the input criteria might ultimately turn out to be.
The function, therefore, of the Master Conference
Identifier File (MCIF) is to relate the several ways
in which the name of a conference or meeting may be
spelled to a standard code and, in addition, to supply
other identifying information which would facilitate
distinguishing meetings having similar names.
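The mapping of variant spellings to a standard code can be sketched as follows. This is a minimal illustration only: the record layout follows the MCIF contents listed in this section, but the code format ("C0001"), the sample conference, and the normalization rule are hypothetical and are not taken from the report.

```python
from dataclasses import dataclass, field


@dataclass
class ConferenceRecord:
    code: str        # assigned code number (format is hypothetical)
    name: str        # preferred name of the meeting/conference
    country: str
    city: str
    date: str
    variants: list = field(default_factory=list)  # alternate spellings


def normalize(name):
    """Fold case, punctuation, and spacing so variant spellings compare equal."""
    kept = "".join(ch if ch.isalnum() else " " for ch in name.upper())
    return " ".join(kept.split())


def build_lookup(records):
    """Map every known spelling of a conference name to its standard code."""
    lookup = {}
    for rec in records:
        for spelling in [rec.name, *rec.variants]:
            lookup[normalize(spelling)] = rec.code
    return lookup


records = [
    ConferenceRecord(
        code="C0001",
        name="International Congress on Example Physics",
        country="Country A",
        city="City X",
        date="1964-06",
        variants=["Intl. Congress on Example Physics", "Example Physics Congress"],
    )
]
lookup = build_lookup(records)


def resolve(name):
    """Return the standard code for a spelling, or None if it is not in the file."""
    return lookup.get(normalize(name))
```

A spelling that resolves to None would be referred to the dictionary editor rather than entered automatically; the location, date, and sponsor fields remain available to distinguish meetings with similar names.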
The initial data base for the MCIF may be derived
25X6 from the BR's International Conference
and Travel File. In this instance, a requirement does
not exist to merge separate OCR system vocabularies
since BR is the only organization which maintains a
conference authority file. Informal consultation between
the CHIVE conference dictionary editor and BR's authority
in this area during the evolution of the CHIVE system
should enable standardization to be achieved between the
two systems on the identification of international
meetings which both systems index.
The contents of the MCIF will be as follows:
- Name of Meeting/Conference
- Assigned Code Number
- Location
  - Country
  - City
- Date of Conference
- Type of Meeting (Subject)
- Sponsor Organization
Scheduled outputs from the file in the initial CHIVE
system will consist of the usual master listings arranged
in this instance by name of conference, location, and
sponsoring organization. The name listing will be a
permuted title arrangement with the reference portion of
the listing ordered by code number and containing the
complete identifier record for each conference referenced.
Supplementary listings will also be provided either in
cumulative form or (as indicated earlier) as pages to be
inserted into master listings. Demand products will, of
course, be issued in any sort order desired.
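The permuted title arrangement mentioned above can be illustrated by a short sketch. It assumes a simple rotation of each title around every significant word; the stop-word list, the "/" wrap marker, and the sample titles and codes are hypothetical. The reference portion of the listing would simply be the full records sorted by code number and is not shown.

```python
# Words too common to serve as entry points (hypothetical list).
STOP_WORDS = {"A", "AN", "AND", "FOR", "IN", "OF", "ON", "THE"}


def permuted_entries(code, title):
    """Yield (keyword, rotated title, code) for each significant word."""
    words = title.upper().split()
    for i, word in enumerate(words):
        if word in STOP_WORDS:
            continue
        if i == 0:
            rotated = " ".join(words)
        else:
            # Bring the keyword to the front; "/" marks where the title wraps.
            rotated = " ".join(words[i:] + ["/"] + words[:i])
        yield word, rotated, code


def permuted_listing(titles):
    """Return all permuted entries in alphabetical order of keyword."""
    entries = []
    for code, title in titles.items():
        entries.extend(permuted_entries(code, title))
    return sorted(entries)


titles = {
    "C0002": "Symposium on Example Metallurgy",
    "C0001": "Congress of Example Physics",
}
listing = permuted_listing(titles)
for keyword, rotated, code in listing:
    print(f"{rotated:<40} {code}")
```

Each conference thus appears once for every significant word in its name, so a searcher who remembers only part of a title can still find the assigned code.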
5.5.4.4. Location Identifier Files
The function served by the Master Location Dictionary
(MLD) is to specify the approved entry form for certain
classes of locational-type information listed below.
(Detailed address information, e.g., cable address or
street name, will not be under vocabulary control and,
consequently, will not appear in this file.) Additional
uses of the file will be to confirm file coverage by
location, to show hierarchical and synonymous relationships
between place names and political/administrative
regions of the world, and to support requests by location
defined in terms of country, political subdivision, or
place name.
Location identifier (authority) files will ultimately
be maintained on all countries, but in the initial system
primary concentration will be placed on the element
of the Master Location Dictionary. Specific senior
content indexers will serve as dictionary editors for
certain geographic portions of the file, rejecting or
approving all new entries generated as a byproduct of
the document input process.
The initial Master Location Dictionary will be
constructed on the base of the NIS Gazetteer. The ISC
4-digit classification system will be used to identify
country and political subdivision. Place name entries
will be carried in full text.
Map catalogers at the Map Library, according to the
terms of a tentative agreement arranged between CHIVE and
the Map Library, will employ a modification of the ISC
area code which appends the ML provincial codes to the
ISC area code. This expanded code will permit ML to
continue to index provinces and other political sub-
divisions where this degree of index specificity may not
be required for document retrieval.
During the preparation of the map index transcript
sheets at the Map Library, the map cataloger will either
(a) enter the modified ISC area code in addition to the
ML area code on the transcript form, or (b) enter the ML
area code only. If the former procedure is followed,
both codes will be converted to machine language but only
the modified ISC area code will be stored in the CHIVE
Master Index File. The map catalog cards returned to ML,
however, will carry the ML area code since the ML card
catalog employs this system and the use of another area
code would upset the existing file arrangement. Alter-
natively, a conversion table may be built which would
permit the computer to convert the ML area code appearing
in the index records (option [b] above) to the modified
ISC area code, thus obviating the need for both codes to
be entered on the transcript forms by the cataloger.
The Master Location Dictionary records will contain
the following elements of information:
- ISC 4-Digit Numeric Notation
- Major Area, Subordinate Geographic Region, Country,
or other Political Subdivision (including cross
references)
- Remarks (scope notes and comments on historical
changes)
- Place Name (including cross references)
- Remarks (place name scope notes and comments on
historical changes)
- Geographic Coordinates
Scheduled master and supplemental listings of the
Master Location Dictionary will include:
(a) A listing arranged hierarchically by ISC area
code and containing the major area and political
subdivision names together with any "Remarks"
pertaining thereto.
(b) A listing identical to (a) but ordered alpha-
betically on area and political subdivision
name.
(c) A listing arranged hierarchically by ISC area
code with the minor sort alphabetical by place
name. This listing will also include the
"Remarks" field pertaining to place names, as
well as geographic coordinates.
(d) A listing identical to (c) but ordered on
geographic coordinates.
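The four listing arrangements above amount to different sort orders over the same records, as the following sketch suggests. The record layout is a simplification of the elements listed earlier, the sample codes and names are invented, and ordering 4-digit codes as strings is only an assumed stand-in for the hierarchical arrangement of the ISC area code.

```python
from typing import NamedTuple


class MLDRecord(NamedTuple):
    isc_code: str    # ISC 4-digit area code (sample values are hypothetical)
    area_name: str   # major area or political subdivision
    place_name: str
    lat: float
    lon: float


records = [
    MLDRecord("0210", "Region B", "Port Example", 10.5, 20.0),
    MLDRecord("0100", "Region A", "Zeta Town", 45.0, 5.0),
    MLDRecord("0100", "Region A", "Alpha City", 44.0, 6.0),
]


def listing_a(recs):
    """(a) Areas in ISC code order."""
    return sorted({(r.isc_code, r.area_name) for r in recs})


def listing_b(recs):
    """(b) The same areas in alphabetical order of name."""
    return sorted({(r.area_name, r.isc_code) for r in recs})


def listing_c(recs):
    """(c) Place names: major sort on ISC code, minor sort on place name."""
    return sorted(recs, key=lambda r: (r.isc_code, r.place_name))


def listing_d(recs):
    """(d) Place names ordered on geographic coordinates."""
    return sorted(recs, key=lambda r: (r.lat, r.lon))
```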
Ad hoc (demand) products of the file will include
a geo-coordinate computation capability. This program,
which uses a mathematical technique based on the overlap
of two convex polygons, will allow an information analyst
to retrieve all references to place names falling within
any regular (or irregular) shaped area whose vertices are
known.
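The report does not give the details of the polygon-overlap technique, so the following sketch substitutes a different and simpler method, the standard ray-casting test for a point inside a polygon, to show what an area retrieval of this kind involves. Coordinates are treated as flat (longitude, latitude) pairs, which is an assumption that ignores map projection, and the place names are invented.

```python
def point_in_polygon(lon, lat, vertices):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    vertices is a list of (lon, lat) pairs in order around the boundary;
    the polygon may be regular or irregular.
    """
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def places_within(places, vertices):
    """Return the names of all places falling inside the given area."""
    return [
        name
        for name, (lon, lat) in places.items()
        if point_in_polygon(lon, lat, vertices)
    ]


places = {
    "Alpha City": (6.0, 44.0),
    "Port Example": (20.0, 10.5),
    "Zeta Town": (5.0, 45.0),
}
area = [(4.0, 43.0), (7.0, 43.0), (7.0, 46.0), (4.0, 46.0)]
found = places_within(places, area)
```

The analyst supplies only the vertices of the area of interest; the place names returned can then be used to retrieve the document references indexed under them.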
5.5.4.5. Subject/Commodity Authority Files
5.5.4.5.1. Intelligence Subject Code (ISC)
A modified form of the ISC classified schedule will
be used in CHIVE as the dictionary authority for entry
of all terms of a descriptive, semi-abstract nature
whether they modify named objects or stand alone. The
file is designed to perform three main functions: (a)
to display relationships among these descriptive terms,
(b) to define these terms when required, and (c) to serve
as a code book for input to the computer. The relation-
ships displayed include synonyms and alternate spellings
as well as class inclusion and class membership. The
file also serves an important mechanical role by requiring
that every ISC code in a new index or query be present in
the file before the transaction of file maintenance or
searching is processed, hence controlling input errors.
The file will be maintained manually, that is, all
relationships and original entries will be externally
controlled with changes made only by a change sheet
following approval by the ISC dictionary editor.
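The mechanical check described above, in which every ISC code in a new index record or query must be present in the file before the transaction is processed, might look like the following sketch. The two sample codes and their terms are invented, and the in-memory dictionary merely stands in for the ISC authority file.

```python
# Stand-in for the ISC authority file: 6-digit code -> clear-text term.
# The codes and terms shown are hypothetical.
ISC_FILE = {
    "612300": "Example term A",
    "612310": "Example term B",
}


def validate_transaction(codes, authority=ISC_FILE):
    """Split the codes of one index record or query into accepted and rejected."""
    accepted = [c for c in codes if c in authority]
    rejected = [c for c in codes if c not in authority]
    return accepted, rejected


def process(codes):
    """Process a transaction only if all of its ISC codes are in the file."""
    accepted, rejected = validate_transaction(codes)
    if rejected:
        # The transaction is held back, which is how input errors are controlled.
        raise ValueError(f"unknown ISC codes: {rejected}")
    return accepted
```

Because the file itself is changed only by an approved change sheet, a rejected code signals either a keying error or a term that must first be submitted to the ISC dictionary editor.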
To increase the specificity of the ISC, it will be
augmented by the addition of key words in clear text.
Initially, the information analyst will be permitted to
append these key words to any ISC code without reference
to a controlled list. After a suitable length of time,
however, a key word dictionary separate from the ISC
classified schedule may be developed to provide guidance
for consistent entry.
The ISC is generally recognized as a satisfactory
mechanism for indexing intelligence documents to a
medium level of specificity. It is detailed enough to
organize a document collection into manageable categories,
but not so detailed that it is difficult to learn or
apply with reasonable uniformity. The ISC is particularly
strong for indexing political and socio-economic concepts.
Key word indexing will be used to supplement the ISC
in those areas where it is weakest, and to obtain more
specificity in commodity indexing. Heavy emphasis will
be placed on key word indexing of equipment nomenclatures
and model types. Key words will also be used to index
scientific processes and concepts, as well as military
strategy and tactics. In other fields, e.g., politics,
there will be less need for key word enhancement of ISC
codes.
Specific revisions required of the ISC to accommodate
it to an all-source document base include the following:
- Reduction of the depth of subject coverage in
selected areas to simplify its application.
- Development of separate schedules for the classifi-
cation of such subjects as organization types and
personality occupational categories. (This would
require the deletion of organization types which
are currently scattered throughout the ISC.)
- Expansion of the list of coded modifiers.
- Expansion of the ISC to provide for special subject
requirements unique to certain sources, e.g., photos
and SI documents.
The contents of a digital record in the ISC file will
consist of:
- ISC 6-Digit Numeric Notation
- Clear-Text Term Definition
- Scope Notes
Outputs from the file will include:
- Master listings of the complete ISC dictionary
arranged hierarchically by ISC code with appro-
priate indentations for each lower-level category.
In addition to codes and term definitions, the
listing will contain pertinent scope notes.
- Master listings of the subject index to the ISC
arranged alphabetically by index term, including
"see references."
Both of these types of listings must be classified
as demand products since the frequency and number of
changes to the ISC vocabulary will dictate the periodicity
of master re-runs. Indeed, it is likely that this
vocabulary control file more than any other will be
updated by page inserts rather than by re-issuance of the
complete master schedule.
5.5.4.5.2. Header Data Dictionary
This file encompasses a variety of separate and
distinct system dictionaries which control the entry of
data pertaining primarily to the header portion of
document index records. It includes such specialized
system tables or dictionaries as the following:
- Document Category File
- Report Producing Component File
- Series/Periodical Name File
- Classification File
- Codeword Control Stamps File
- Dissemination Controls File
- Photo Type File
None of these files are of such magnitude that their
size alone or access requirements would justify their
storage in a digital medium. Nevertheless, an important
goal of the system is to present information to the external
customer in a language with which he is familiar. This
must be accomplished even though the information is carried
within the system in a different form. For this reason,
wherever a convention of codes has been established for a
certain type of information, and this information must be
displayed to a user on output, the file must be available
in digital storage and a conversion routine provided to
substitute clear text for codes on output.
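A conversion routine of the kind just described can be sketched as a table lookup applied to each coded header field on output. The field names, the codes, and the sample category term below are hypothetical; only the idea of one code-to-term table per header dictionary comes from the text.

```python
# One code -> term table per header dictionary (all entries hypothetical).
HEADER_DICTIONARY = {
    "classification": {"C": "CONFIDENTIAL", "S": "SECRET"},
    "document_category": {"01": "Example category"},
}


def to_clear_text(record, dictionary=HEADER_DICTIONARY):
    """Return a copy of the record with coded fields replaced by clear text.

    Fields without a table, and codes missing from a table, pass through
    unchanged so that nothing is lost on output.
    """
    out = {}
    for field_name, value in record.items():
        table = dictionary.get(field_name)
        out[field_name] = table.get(value, value) if table else value
    return out


stored = {"classification": "S", "document_category": "01", "title": "Example report"}
displayed = to_clear_text(stored)
```

The record is stored compactly in coded form, and the customer sees only the familiar terms.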
The data content of records in all the separate
authority files making up the Header Data Dictionary is
identical, i.e., code (whether numeric, alphameric, or
alpha) and applicable term. No machine-generated products,
either on a scheduled or demand basis, are presently
envisaged from the file as such, although, as indicated
above, the file will be machine searched to serve other
system functions.
5.5.5. UNSYNTHESIZED INFORMATION FILES (UIF)
No attempt will be made in this section to specify
the particular Unsynthesized Information Files which will
be built by information analysts in the CHIVE system. It
is assumed there will be a continuing requirement for
certain of the analogous information files currently
being maintained (e.g., BR's International Conference and
Travel File), but this is a decision best left to the
information analysts within the system, working in concert
with their external customers. This section merely out-
lines the characteristics of Unsynthesized Information
Files (UIF), explains the rationale underlying their
establishment, and describes some of the methods by which
such files will be constructed and maintained.
As indicated in the introduction to this chapter,
Unsynthesized Information Files consist of select elements
of information about a given subject whether it be a
personality, an installation, or some class of activity
or event. They are to be distinguished from Special
Project Files (discussed in section 5.5.7.), some of
which may be information rather than document reference
type files, in that they reflect only the elements of
information contained in CHIVE document index records.
Similarly, they are distinguishable from Summary Informa-
tion Files (see section 5.5.6.) which are evaluated,
concise statements of fact about similar topics.
Most Unsynthesized Information Files being maintained
in OCR today are the products of specialized input activity
which is separate and distinct from other input processing.
The principal reason for this situation is that the regular
processing system (or systems) cannot readily be modified
to accommodate these specialized indexing requirements,
primarily because of the limitations of the supporting
EAM equipment. A good example is the BR travel index
which, since its inception, has been functionally and
physically separate from the dossier processing system.
In the CHIVE concept of document processing, however,
wherein all the data of significance in the document is
captured in one pass, so to speak, by the person originally
assigned to index the document, the resultant product
will feed both the document reference system (i.e., the
Master Index File) as well as such Unsynthesized Informa-
tion Files as the information analyst has decided to
build. This means, obviously, that the UIF contain data
no different from that stored in the Master Index, only
select subsets of the same records or phrases.
The reader might well ask at this point why have
Unsynthesized Information Files at all if their content
is identical with elements of information stored in the
Master Index File? Why not simply query the Master Index
when a specific set of data is desired?
This brings us to the criteria for establishment
of a UIF:
- The customer's information requirements must be
capable of definition in terms of logical data
units which have specified characteristics--i.e.,
that there is a logical separation of data elements
into related files so that any one file contains
data relative to a given subject or function.
- A sufficient number of requests can be anticipated
on a continuing basis for the particular set of
data elements contained in an information file to
justify establishment of the file.
Where neither of the above conditions obtains, the
data would remain in the Master Index File, and requests
for the retrieval of specific elements of information
would be handled like any other ad hoc queries levied on
the system. On the other hand, if these requirements are
met, it is generally agreed that it is useful to group the
data elements involved into files, organized on a functional
basis, since they can then be handled as logical elements
in the system for maintenance, retrieval, and system
output.
The basis of organization is affected not only by the
type of information to be processed, but also by the
relative activity of data within a file and the user's
control of the information stored in the file. Thus, in
establishing an information file system, the user will
probably want to functionally group his data (e.g.,
personality travel, leader appearances, missile site
order-of-battle, directories of government officials,
etc.).
For particular applications the user may desire to
have his stored information combined on a different basis.
Ordinarily, he would exercise this option only on the
Summary Information Files discussed in the next section,
since the resultant output would be an evaluated,
higher-quality product. However, the system proposed
will provide for a multi-file output capability from any
file stored within the system. This capability will be
achieved by allowing the user to query several files
and selectively assemble on an output work tape the
resultant information. The data will then be presented
to the user in the format he specifies.
For example, if the user has three files containing
25X1 B
With regard to the means by which inputs to the UIF
are to be obtained, it is evident that a separate indexing
and transcription process is not required if the CHIVE one-
time indexing concept is implemented. In other words, the
plan is to exploit the document retrieval system in order
to build information files. On the other hand, once the
data has been put into machine readable form, the informa-
tion file inputs might be derived either by (a) automatic
duplication of portions of the index records during their
input processing into the Master Index, or (b) by
periodically querying the Master Index. Figure 5-2
illustrates these alternatives graphically.
CHIVE proposes to follow the latter path for the
following reasons:
- It will create less of a burden on the machine
processor which would otherwise have to examine
every incoming record to determine if it contained
data relevant to a particular information file.
- There is no real requirement to update the informa-
tion files at the instant that the data is entered
into the machine.
- By requiring some external action to be taken
before data is transferred to an information file,
management control is enhanced.
To facilitate file building, standing queries will
be written, punched, and entered into the system. Thus,
when the information analyst wishes to add new data to
an information file, he can merely call for the pertinent
query by name and the computer will make the necessary
search and load the data into the relevant file.
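The standing-query arrangement can be pictured as a table of named, pre-written queries which the analyst invokes by name. In the sketch below the query names, the record fields, and the sample records are hypothetical, and a Python function stands in for a punched query.

```python
# Stand-in for the Master Index: a list of index records (sample data).
MASTER_INDEX = [
    {"doc_no": "D-001", "subject": "travel", "person": "Person A", "place": "City X"},
    {"doc_no": "D-002", "subject": "appearance", "person": "Person B", "place": "City Y"},
    {"doc_no": "D-003", "subject": "travel", "person": "Person C", "place": "City Z"},
]

# Standing queries, written once and entered into the system under a name.
STANDING_QUERIES = {
    "TRAVEL": lambda rec: rec["subject"] == "travel",
    "APPEARANCES": lambda rec: rec["subject"] == "appearance",
}


def run_standing_query(name, master_index, information_files):
    """Search the Master Index with the named query and load the hits
    into the information file of the same name. Returns the number loaded."""
    predicate = STANDING_QUERIES[name]
    hits = [dict(rec) for rec in master_index if predicate(rec)]
    information_files.setdefault(name, []).extend(hits)
    return len(hits)


information_files = {}
loaded = run_standing_query("TRAVEL", MASTER_INDEX, information_files)
```

Nothing is transferred until the analyst calls for the query, which reflects the management-control argument made above for the human-triggered path.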
[Figure 5-2. UIF File Building Alternatives. Diagram labels: Index Record; Processor; Master Index File; Unsynthesized Information Files; Index Record; DP Processor; Master Index; Human-triggered Information File Building.]
The content of records in a UIF may consist only of
a specified set of fixed elements of information which
appear once in each data record or a combination of
fixed and repetitive elements of information including
some fields of variable length. For example, in a file
elements in a record contained in this file might have
a fixed number of characters or, alternatively, some may
be fixed while others (e.g., "function attended") may be
fields of variable length. Similarly, in the case of a
travel file
Document references, while not vital to information
files which ordinarily do not require consultation of the
documents from which the data was originally extracted,
can nevertheless be included in UIF records where
required by citing the pertinent document control number.
Similarly, the security classification of a record in such
files is easily denoted since the record should typically
carry the same classification as the parent record in
the Master Index. In both cases, the transferral of
the document control number and security classification
from the header portion of a Master Index Record to a record
in a UIF can be accomplished automatically at the same
time that the content data is duplicated for storage in
a UIF record.
Provision will be made to automatically identify
those records in the Master Document Index whose content
has been extracted for a given information file so that
the same records need not be searched later. The system
will also allow the analyst to specify both an "active"
and a "history" file of information for any one functional
area if it seems desirable to save the digital records
representing a file for some indefinite period of time.
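The automatic carry-over of header data and the marking of records already extracted might be combined as in the following sketch. The "extracted_for" marker, the field names, and the sample records are hypothetical devices for illustration; the report does not say how the marking is to be implemented.

```python
def extract_to_uif(master_index, file_name, predicate, content_fields, uif_file):
    """Copy selected content from matching Master Index records into a UIF.

    The document control number and security classification are carried
    over from the header automatically, and each record used is marked so
    that it is skipped the next time the same file is built.
    """
    added = 0
    for rec in master_index:
        if file_name in rec["extracted_for"] or not predicate(rec):
            continue
        entry = {name: rec[name] for name in content_fields}
        entry["doc_no"] = rec["doc_no"]
        entry["classification"] = rec["classification"]
        uif_file.append(entry)
        rec["extracted_for"].add(file_name)  # mark as already extracted
        added += 1
    return added


master_index = [
    {"doc_no": "D-001", "classification": "S", "subject": "travel",
     "person": "Person A", "extracted_for": set()},
    {"doc_no": "D-002", "classification": "C", "subject": "appearance",
     "person": "Person B", "extracted_for": set()},
]


def is_travel(rec):
    return rec["subject"] == "travel"


travel_file = []
first_run = extract_to_uif(master_index, "TRAVEL", is_travel, ["person"], travel_file)
second_run = extract_to_uif(master_index, "TRAVEL", is_travel, ["person"], travel_file)
```

The second run adds nothing, since the only matching record was marked during the first. An "active" and a "history" file could be handled in the same way by passing a different target list.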
Outputs from the UIF will largely consist of periodic
listings of the complete contents of a UIF arranged in
various sequences. Such listings will be used to service
customers as well as information analysts within CHIVE
who will analyze and modify the listed records which will
then be converted back to machine language and input to
Summary Information Files. Some of these listings will
no doubt be required on a regularly scheduled basis,
e.g., a leader appearance listing published weekly.
Others will be demand products issued irregularly as
the need arises.
In addition to the provision of hard-copy machine
listings of an entire file for browsing purposes, a
digital UIF file itself can be queried for records
meeting the request specifications, in which case only
the records satisfying the request will be output. This
mode of man/file interface will probably be used less
than hard-copy browsing; but, unlike the CHIVE Vocabulary
Control Files discussed in section 5.5.4., the means by
which the UIF records will be made available to the
human in the system will not be exclusively through a
hard-copy representation of the file contents.
5.5.6. SUMMARY INFORMATION FILES (SIF)
Summary Information Files (SIF), like the Unsynthesized
Information Files, can be classed as formatted in nature
since their specifications can be pre-defined and the data
elements making up their content can be handled as logical
entities in the system for purposes of input, query,
and output processing. Pertinent input data is organized
by major subject and formatted for ready retrieval and
tabulation by content.
Summary Information Files consist of semi-evaluated
data relative to specific classes of events or named
objects. In format and content they are indistinguishable
from UIF files, differing only in the fact that redundant,
and, usually, contradictory information has been removed
from the SIF files through a process of human analysis
and synthesis of the raw data originally received. While
they cannot be accurately described as containing only
"finished intelligence" (if, by definition, this term is
meant to apply only to the refined outputs of an intelli-
gence research facility), neither is their content truly
"raw" and, for this reason, the expression "semi-evaluated"
has been used advisedly.
Certain of the Vocabulary Control Files which must
exceed the boundaries of a typical dictionary in order to
adequately "identify" a controlled term can also be
properly classified (as pointed out in the introduction
to this chapter) as Summary Information Files. These
include the personality and organization/facility identifier
files discussed in sections 5.5.4.1. and 5.5.4.2., respec-
tively. Not only are these formatted files whose content
is fixed and specific, they contain a variety of summarized,
semi-evaluated facts about named objects which, when
displayed, can serve a variety of information needs other
than that of supporting the document indexer.
An example of an SIF would be a
Officials File. In reality, a type of organization summary
file, since the stored data would consist of a set of
facts about significant public and private organs of
society within the country concerned and not summary data
about personalities as such, the file entries might contain
in tabular form information on the name of an organization,
its subordination, the names of its officers and perhaps
ordinary members of the organization, their individual
position titles, and the dates of appointment and/or
earliest and latest dates of identification for each. The
data would be retrievable on the basis of any of the
categories which make up the file so that answers to such
questions as:
25X1B can be
readily obtained.
Summary Information Files can, of course, include
fully automatic, semi-automatic, and manual data files
ranging from the highly structured to the unformatted,
narrative type. FIB's Installation Summaries File and
the various types of biographic reports on file in BR
are examples of manual summary information files which
are only partially formatted if at all. BR's Who's Who
Card File is a formatted, semi-automated summary file
on personalities. The CHIVE system will also produce and
maintain manual files of summary information, but the
focus of discussion in this section is on the digital,
and not hard-copy, summary files planned for the system.
The criteria dictating the establishment and
maintenance of an SIF file are identical to those for a
UIF file, i.e., the set of data elements making up the
file must be definable and a "substantial" number of
requests for such data must be anticipated. The means
by which input and maintenance transactions will be
obtained, however, will differ from that of the UIF files.
Unlike UIF processing, additions to and modifications
of SIF files cannot be automatically generated since
human judgment is required, if not to identify potentially
relevant inputs, to make the conclusive determination that
a record should be added to a file or a change made to
an existing record.
The process by which these functions will be performed
will vary depending on the nature of a particular SIF file.
Figure 5-3 depicts the various approaches which can be
used.
As indicated in the figure, the SIF file builder
has essentially three options open to him as the means of
inputting data to a Summary Information File: (a) he may
query the basic index records to documents for data
pertinent to an established SIF file (Option 1), (b) he
may obtain a printout of a UIF file which serves as the
raw data base for an associated SIF file (Option 2), or
(c) he may arrange with other information analysts to have
all documents containing information pertinent to his
needs routed to him (Option 3).*
*A fourth option, which has not been suggested because it
would lead to completely duplicative document handling,
would require the SIF specialist to examine all incoming
documents for their possible relevance to a summary data
file. Such an approach would never be required as long as
the initial document indexing was sufficiently specific to
enable the information file specialist to recover the
pertinent index records and/or documents by a search of
the Master Index.
[Figure 5-3. SIF File Building Alternatives. Option 1: search request from SIF specialist; index record listing (select records or phrases). Option 2: search request from SIF specialist; listing of UIF records. Option 3: original documents screened by information analyst; select documents; additional routine input processing. Other diagram labels: requests for select documents; document image file; SIF specialist review; additions or changes to SIF; SIF file.]
Options 1 and 2 both provide for the possibility
that the SIF specialist may wish to examine certain
documents for himself to clarify some fact reported in
the records listed for him as a result of his inquiry.
It is anticipated, however, that in most instances the
listed records will speak for themselves and no reference
to documents will be required in the SIF input process.
SIF files will be available in both digital and
hard-copy (listing) form, and, like the UIF, can include
active as well as history files. Source citations in
the form of document control numbers may be listed at
the end of each summary information record or referenced
to each term (element of information) in the record.
5.5.7. SPECIAL PROJECT FILES
In the remarks made earlier in this chapter with
regard to Special Project Files, it was noted that the
limits of CHIVE responsibility for special file building,
maintenance, and output processing have not as yet been
satisfactorily determined. Furthermore, in a survey of
existing OCR files which were either exclusively or in
25X6 part (see Appendix 5.D.), no files were
identified which in the CHIVE context would be considered
as "special project" in nature. For both of these reasons,
therefore, it is difficult to specify the characteristics
of individual Special Project Files which might be
included either in the initial or final CHIVE system.
One must anticipate, however, that the need for such
special files will be expressed by customers from time to
time, and some remarks may be in order as to the features
which distinguish these files from other system files
and how requirements for such files might be accommodated.
Processing demands of a one-time nature which
necessitate the input, manipulation, and retrieval of a
peculiar set of data will not fall within the Special
Project File category since these will be handled like
any other request. However, if the system were asked to
continue the activity on an indefinite basis, this method
of handling would no longer suffice and a special project
need would have been established.
Special projects will include any files obtained
from organizations external to CHIVE which cannot be
fully integrated with equivalent CHIVE files and which
require machine handling. In this sense the term "special
projects" could apply to certain EAM files inherited from
OCR, as well as to files acquired from other agencies.
Special Project Files will also describe any customer
input requirements which cannot be satisfactorily handled
by the established system for representing the informa-
tion content of documents. The data involved might be
largely numeric in character or, if non-numeric, would
require the extraction of items of information not
planned for inclusion in the system and which, if accepted,
would significantly add to total processing time.
It is well known that individual members and groups
within the CIA customer population have a number of
relatively unique and distinct information handling
problems which cannot be met by a generalized information
system attempting to serve only the common interests of
the many. Examples of such requirements were uncovered
during the earlier fact-finding survey of the DD/I. The
following is but a partial list of some of the needs
expressed:
[List withheld, 25X1B]
In the past when a research analyst, or group of
analysts, developed an information control problem which
was not being satisfactorily handled either in the
appropriate production office or by the central reference
system, one of the following courses of action was
generally adopted:
- The problem was contracted out to some external
organization which obtained the necessary source
documents and did the input processing and retrieval
required to respond to the identified need.
Examples of this approach include OSI's Project
[withheld, 25X1A] and LSD/SI's [withheld, 25X1A].
- A special group was set up within one of the
research offices to perform either or both the
input and data manipulation functions depending
on what was required. The [withheld, 25X1A]
of ORR is a good illustration of this type of
problem solution.
- OCR was asked to expand its operations either by
increasing the depth of its indexing or by broadening
its document coverage, or both. OCR projects
generated by a demand for increasing indexing depth
beyond that normally provided by the basic indexing
systems employed include the SR Test Range
Activity File and the BR Travel File. Illustrations
of projects which required expansion of the
OCR document base can be found in the [withheld, 25X1A],
the Library's Science Information Service (SIS),
the SR PI Reports File, the [withheld], etc.
- Mechanical, if not data extraction, assistance was
requested either of OCR's Machine Division or of
some external machine facility (governmental or
private) where manual techniques for data manipulation
and display were unsatisfactory. Examples
[withheld, 25X1B] File for OBI.
Today, the resources available to an analyst faced
with an unresolved information processing requirement
are much the same--but with one significant difference.
If he is willing to prepare the input data of interest
to him to the point at least where it can be transcribed
into machine-recognizable form, he now has a powerful
machine capability available to him to perform a host of
operations on the data.
The CHIVE EDP System can, of course, increase this
capability. The question, however, which remains is
whether CHIVE should get involved in "special-project"
applications at all, where only a limited set of customer
interests are served, and, if so, whether its involvement
should be restricted to the provision of EDP support or
whether it should also assist in input preparation.
Currently, the OCS/Applications Division is per-
forming a major role in supporting special project
requirements of CIA research analysts. In all such
projects, however, the data extraction responsibility
has been assumed by the customer concerned.
It might be argued that to avoid confusion of
responsibility, the role CHIVE should play in this area
(if any) should be restricted to the assumption of the
responsibility for those special projects where, for
one reason or another, it seems most efficient to have
the central reference organization, rather than research
analysts, prepare the input data. This, however,
heightens the risk of gradually proliferating the informa-
tion processing responsibilities of the CHIVE system to
the point where it might become simply a collection of
special projects.
In the design concept presented here, the responsi-
bility for CHIVE's undertaking certain special projects
has been accepted, but the duty of preparing the input
to Special Project Files is assumed to be the customer's
and the report reflects this philosophy. In the final
analysis, however, the matter can only be resolved by
management decision. If the choice is to include
special projects within CHIVE, including the function
of data preparation, the approach suggested above may
provide a modus vivendi for relating the respective
roles to be played by CHIVE and OCS/Applications in
the handling of these projects.
5.5.8. REFERRAL SERVICE FILES
Current manpower ceilings seriously limit the cover-
age of present central system operations. Even if
available manpower can somehow be more effectively
utilized, the volume of material of potential value is
so great that complete coverage would still not be
possible. Thus, the only alternative appears to be to
develop support from other systems, including centralized
as well as personalized (analyst-driven) file activities.
To do this, CHIVE must identify information resources
available in such systems and determine how best to tap
these resources for the Agency consumer.
In earlier CHIVE documentation, reference was made
to a "support mode" which envisaged not only the referral
of customers to persons or files of possible interest
external to CHIVE, but the actual acquisition of certain
machine-language files and supporting documentation which
would be searched in-house in behalf of CHIVE customers.
The "support mode" concept remains valid in present CHIVE
thinking, but in the further refinement of the design a
distinction has been made between: (a) files outside of
CHIVE's control which are actually available within the
system in either manual or mechanized form, and (b)
information resources not directly accessible to CHIVE to
which customers may be referred. The former have been
classified in this report as "Supplemental Files"
(see section 5.5.1.), i.e., files neither built by CHIVE
nor inherited from OCR, of which Special Project Files
may be one class. The latter are now termed "Referral
Service Files" and are the subject of this section.
The community's information resources are so vast
and scattered that even the simple identification of
all potential sources constitutes a major problem. For
this reason, it is planned to concentrate initially on
the identification of the many different human and file
resources scattered amongst the various service components
and production shops within this Agency. Only after this
is done will any attempt be made to obtain descriptions
of information resources and repositories in other USIB
components.
Perhaps the simplest means of beginning to build a
referral service capability will be to derive a set of
analyst profiles from requests levied against the central
system. Using this technique, search terms chosen for
query purposes will gradually form the set of subject
identifiers descriptive of each analyst's interests. This
approach will be supplemented by the circulation of
questionnaires to analysts throughout the research (and
select service) components of the Agency, which would
solicit narrative statements of their areas of substantive
knowledgeability, including their personal files or files
maintained by their respective sections, branches, or
other organizational component. Not all analysts, of
course, can be expected to respond. However, experience
with similar surveys in other organizations suggests that
a response figure of about 80% is not beyond reason.
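
The profile-building technique described above can be sketched in a few lines. This is only an illustration, assuming each request is available as an (analyst, search terms) pair; the function name `build_profiles`, the `min_count` threshold, and the sample analysts and terms are hypothetical and are not part of the CHIVE design.

```python
from collections import Counter, defaultdict

def build_profiles(requests, min_count=2):
    """Accumulate search terms per analyst and keep the recurring ones.

    requests  -- iterable of (analyst, [search terms]) pairs (assumed layout)
    min_count -- hypothetical threshold: a term must recur this many times
                 before it is treated as descriptive of the analyst
    """
    counts = defaultdict(Counter)
    for analyst, terms in requests:
        # Normalize case and spacing so repeated terms are counted together.
        counts[analyst].update(t.strip().lower() for t in terms)
    return {
        analyst: sorted(t for t, n in c.items() if n >= min_count)
        for analyst, c in counts.items()
    }

# Illustrative data only.
sample_requests = [
    ("analyst-a", ["Missile Test Range", "Telemetry"]),
    ("analyst-a", ["missile test range", "Propellants"]),
    ("analyst-b", ["Grain Production"]),
    ("analyst-b", ["grain production", "Fertilizer"]),
]
profiles = build_profiles(sample_requests)
```

Requiring a term to recur before it enters a profile is one possible way to keep one-time queries from distorting the picture of an analyst's continuing interests; the questionnaire responses would supplement whatever such a count produces.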
The returned questionnaires will be indexed in the
vocabulary of the CHIVE system, including both the ISC
classified schedule as well as key words. These descriptors
will not necessarily be limited to conceptual-type
subjects (although the emphasis will probably be on
these kinds of topics) but will, in all probability,
also contain on occasion named-object identifiers such
as the names of persons, organizations, military
installations, etc. It is not anticipated that the
responses will include every specific subject heading
in an analyst file. However, it is hoped that the
principal categories of information contained in such
files will be described and this alone would greatly
assist analysts in seeking to exploit the Agency's
human and documentary resources.
In addition to storing information descriptive of
the subject matter in which a person or file specializes,
the data which will be contained in these referral
service records will include:
- Name of Individual
- Organization Identification (component to which
analyst or file is attached)
- Address of Individual or File (room and phone
number)
- Descriptive Title of File
- Overall Security Classification of File
- Releasability
- Countries or Geographical Areas Covered by the
Person or File
- Primary Intelligence Activity Supported by the
Person or File (e.g., Missile OB, Ground
Photography, [withheld, 25X1B] Intelligence, etc.)
- File Storage Medium (documents, 5" x 8" cards,
EAM cards, magnetic tape, etc.)
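
The record content listed above might be represented as follows. This is a sketch under stated assumptions: the class name `ReferralRecord`, the field names, the `directory_listing` helper, and the sample values are all hypothetical; only the list of data items comes from the text.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ReferralRecord:
    """One referral service record; field names are illustrative."""
    name: str                  # Name of Individual
    organization: str          # component to which analyst or file is attached
    address: str               # room and phone number
    file_title: str            # Descriptive Title of File
    classification: str        # Overall Security Classification of File
    releasability: str
    areas: List[str] = field(default_factory=list)        # countries or areas covered
    activities: List[str] = field(default_factory=list)   # intelligence activity supported
    storage_medium: str = ""   # documents, cards, magnetic tape, etc.
    descriptors: List[str] = field(default_factory=list)  # subject-matter index terms

def directory_listing(records):
    """Return (descriptor, organization, name, file title) rows in descriptor order."""
    rows = []
    for rec in records:
        for d in rec.descriptors:
            rows.append((d.lower(), rec.organization, rec.name, rec.file_title))
    return sorted(rows)

# Illustrative data only.
records = [
    ReferralRecord(
        name="Analyst A", organization="Component X",
        address="Room 000, ext. 0000", file_title="Sample Card File",
        classification="UNCLASSIFIED", releasability="Internal",
        areas=["Country Y"], activities=["Ground Photography"],
        storage_medium='5" x 8" cards',
        descriptors=["Railroads", "Bridges"],
    ),
]
listing = directory_listing(records)
```

Sorting on a different leading element (organization or area, for example) would give the other file orders a printed listing might need.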
Assuming the cooperation of a reasonable number of
analysts, it is likely that the collected records will be
sufficiently voluminous and file order requirements so
varied that a machine data base will be needed. It is
not contemplated, however, that the Referral Service
Files will be automatically searched at the time queries
are levied against the substantive data files of the CHIVE
system. Rather the content of such files will be made
available in the form of a published Directory of Informa-
tion Resources. This Directory would be issued to CHIVE
system operators and perhaps, in a variety of classifica-
tions, selectively disseminated to Agency consumers.
When specifically requested to do so by a customer,
CHIVE personnel, in addition to searching the basic files
of the CHIVE system, will consult the Directory for the
purpose of determining what other files or intelligence
analysts might possess information pertinent to a given
query. The type of service that will be provided, if
and when a potentially relevant resource is uncovered,
will vary depending on such factors as the location of
the file or the urgency of the request. In some
instances, CHIVE information analysts will act as the
intermediary between the customer and the other informa-
tion resource. In other cases, they will simply refer
him to the appropriate system.
With regard to the provision of a referral service
capability for files outside of CIA, advantage might be
taken of efforts currently being sponsored both by DIA
and by CODIB to collect descriptions of intelligence
data files maintained in an automated form by Department
of Defense elements and USIB member agencies, respectively.
It is planned that a catalog of such files will be
published periodically and may be interrogated on an ad
hoc basis. If these external collection programs prove
successful, the data resulting therefrom might be merged
with the product of the internal file survey to form a
relatively comprehensive record of information resources
throughout the Community.
5.5.9. MANAGEMENT DATA FILE
The CHIVE Management Data File will contain two
types of data:
- Data, obtained by computer methods, about the
processes performed by the EDP portion of the
CHIVE system.
- Data, obtained by manual methods, about the
non-computer processes of the CHIVE system.
The following paragraphs discuss the sources and
collection methodologies for these two types of data,
the reasons for the dichotomy, and the use of this
data to operational management.
5.5.9.1. Collection Techniques
As indicated above, the method employed in collecting
the data (EDP data or manual data) determines the origin
of the data and to a large extent the use of the data by
CHIVE managers.
EDP data collection refers to an activity within the
computer itself. The monitor program system (with its
attendant bookkeeping functions) will supervise all
computer operations. This is implied under the philosophy
of a multiprogramming system. This method of operating
provides a natural means of recording:
(a) Process times. The computer has timing
mechanisms which the monitor can use to record
computation, input, and output times as indi-
vidual entities, as well as the total time the
computer uses to process a transaction.
(b) Error rates and types. A variety of errors
and malfunctions may abort an operation or
degrade the output. Certain of these, e.g.,
misuse of the language, transcription errors,
equipment disorders, and illegal file manipula-
tions, may be more readily detected and recorded
by the computer programs than by manual means.
(c) File activity. It is of significant importance
to determine which files or parts of files
experience a high rate of use. File system
design, program system design, and language
structure are just a few of the areas which
affect the use of the files and are, in turn,
influenced by usage statistics.
This dynamic data may be supplemented by such relatively
static data as:
- Equipment availability
- Day, month, and year
- Priority of the transaction
As this data is recorded (either by the bookkeeping
routines within the monitor or by specially produced
CHIVE programs as adjuncts to the monitor) it should be
entered into a file. This file is essentially a log of
CHIVE EDP transactions and their associated management
data. The normal method of labelling entries in such a
file is by "job" or transaction number. Under the
multiprogramming mode of computer operations, it is
mandatory that each "job" which enters the computer be
uniquely identified. This "job" or transaction number
provides a natural storage and retrieval device.
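
A log of this kind, keyed by job number, could look roughly like the sketch below. It assumes one entry per job carrying the process times, errors, and file usage named in items (a) through (c); the class names `JobLogEntry` and `JobLog`, the field names, and the sample values are hypothetical, and nothing here describes the actual monitor program.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class JobLogEntry:
    job_number: int                  # unique "job" or transaction number
    compute_seconds: float = 0.0     # (a) process times, recorded separately
    input_seconds: float = 0.0
    output_seconds: float = 0.0
    elapsed_seconds: float = 0.0     # total time to process the transaction
    errors: List[str] = field(default_factory=list)      # (b) error types
    files_used: List[str] = field(default_factory=list)  # (c) file activity
    priority: int = 0                # relatively static supplementary data

class JobLog:
    """In-memory stand-in for the EDP management data file."""

    def __init__(self):
        self._entries: Dict[int, JobLogEntry] = {}

    def record(self, entry):
        # Each job entering the computer must be uniquely identified.
        if entry.job_number in self._entries:
            raise ValueError(f"duplicate job number {entry.job_number}")
        self._entries[entry.job_number] = entry

    def lookup(self, job_number):
        return self._entries[job_number]

    def file_activity(self):
        """Count how many logged jobs touched each file."""
        counts = Counter()
        for entry in self._entries.values():
            counts.update(entry.files_used)
        return counts

# Illustrative data only.
log = JobLog()
log.record(JobLogEntry(101, 1.5, 0.5, 2.0, 4.0, files_used=["Master Index File"]))
log.record(JobLogEntry(102, 0.5, 0.5, 0.5, 2.0, errors=["transcription"],
                       files_used=["Master Index File", "SIF"]))
```

The elapsed time is stored as its own field instead of being summed from the parts, on the assumption that under multiprogramming the total need not equal the sum of the individual times.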
Manual data collection refers to that activity
outside the domain of the computer which collects
management data about the processing of transactions.
As presently envisioned, the process will begin when a
transaction is initiated and will end when the transaction
is completed. For example, a query against a file is a
transaction which begins with the request and ends when
the requester obtains the data and materials which
satisfy his request. Between these two events many
functions are performed in many organizational elements.
The majority of the time-consuming and error-prone functions
are performed by people. Data regarding these functions
may be conveniently collected by manual techniques. It is
suggested that data regarding each transaction accompany
the transaction during the entire process. If feasible, a
standard form should be used. Examples of manual data are
as follows:
- Name of Requester
- Requester's Organization
- Name of Analyst
- Analyst's Organization
- Type of Transaction
- Transaction Number
- Dissemination Code
- Time Received and Time Released (by each organiza-
tional unit which handles or is responsible for
the transaction)
- Organizational Identifier (for each component
which handles or is responsible for the transaction)
Not all of these are applicable to each transaction.
However, the last two items--times and organizations--
must be supplied for each component and each transaction
for two reasons:
(a) To account for each transaction and its location
in the system.
(b) To provide a complete file of data for process
evaluation.
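
The two mandatory items, times and organizational identifiers, are what make reason (a) workable. A minimal sketch follows, assuming the standard form is modelled as a routing slip with one line per component; the names `Stop` and `RoutingSlip`, the component names, and the use of clock hours as plain numbers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Stop:
    component: str                    # Organizational Identifier
    received: float                   # Time Received (clock hours, assumed unit)
    released: Optional[float] = None  # Time Released; None while still held

@dataclass
class RoutingSlip:
    transaction_number: int
    stops: List[Stop] = field(default_factory=list)

    def receive(self, component, when):
        # A component may not receive a transaction nobody has released.
        if self.stops and self.stops[-1].released is None:
            raise ValueError("previous component has not released the transaction")
        self.stops.append(Stop(component, when))

    def release(self, when):
        if not self.stops or self.stops[-1].released is not None:
            raise ValueError("no component currently holds the transaction")
        self.stops[-1].released = when

    def current_location(self):
        """Component now holding the transaction, or None if none does."""
        if self.stops and self.stops[-1].released is None:
            return self.stops[-1].component
        return None

# Illustrative data only.
slip = RoutingSlip(transaction_number=4711)
slip.receive("Request Desk", 9.0)
slip.release(9.5)
slip.receive("Search Unit", 10.0)
```

Because every component fills in both times, the slip answers "where is this transaction now" at any moment, and the completed slip supplies the raw data for process evaluation.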
5.5.9.2. Storage, Retrieval, and Processing
The EDP data, due to the collection method, is
naturally stored as a file of data within the CHIVE EDP
system. As such, it can be processed and retrieved through
the use of the CHIVE query language. It is suggested
that initially no language capabilities be added for
this specific purpose since the number and nature of
reports on machine processes which CHIVE management will
require is not completely predictable.
The manual data collected in the system should initially
be stored, retrieved, and processed by manual methods. As
the system is used and evaluated, the file of manual
data and the number of management reports will increase.
At some point, this data must be processed by the EDP part
of the CHIVE system. For this reason, it is important to
design the manual data forms so that, as volume increases
and operational procedures become firm, the data may readily
be input to the computer and integrated into the EDP
management data file. When this point in system evolution
is reached, all manually collected data regarding CHIVE
operations will be retrieved and stored by initiating a
transaction. Thus, data about the processing of the
Management Data File is recorded in the Management Data
File and constitutes a resource which management may use
to study its own evaluative and analytic activities.
5.5.9.3. Reports and Their Use
In discussing reports and their content, a distinction
should be made as to when, during the evolution of the
system, the reports are needed. This is particularly
true in the case of those reports drawn from the EDP
management data file prior to the incorporation of the
manual data.
The purpose of reports based on data collected by
EDP methods is to assist the CHIVE analysts and designers
in improving, correcting, and modifying the EDP portion of
the system. During the initial stages of operational testing
it will be necessary to examine EDP operations carefully in
order to eliminate bottlenecks and optimize equipment
usage. Certain reports will be highly specialized, e.g., an
analysis of disk storage use over some period of time, and
will not be necessary as a regular product. Of continuing
interest will be reports which provide management with an
insight into the amount of time used on the computer and
its various components. This has long-range implications
regarding computer hardware acquisition.
Reports derived from the manually collected data will
vary in frequency and detail as the system gains operational
acceptance. In any new system, there will be imbalances
which must be adjusted if the best results are to be
obtained from the available personnel and equipment.
The parameters which can be measured within the system
are primarily concerned with rates and volumes. It is
suggested that forms be designed and procedures instituted
which will provide managers with raw data on how long a
transaction stays in each component. This is the first
step toward the elimination of delay points in the system.
Shifting of manpower and new procedures will undoubtedly
be necessary. This in turn will prompt another round of
reports and analysis. And so on. Of interest to manage-
ment in terms of long-range changes to the system will be
reports on sources and types of transactions. Such
reports are generated by the present system and will be
produced by CHIVE. Data on the number of cards produced,
number of file accessions, number of references generated,
and number of pages delivered will also provide managers
with the necessary background for making adjustments in
the processing of transactions.
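
As an illustration of the first step toward locating delay points, the raw data on how long a transaction stays in each component can be reduced to an average per component. The sketch assumes each form line yields a (transaction number, component, time received, time released) tuple with times in hours; the function name `mean_dwell_by_component` and the sample figures are hypothetical.

```python
from collections import defaultdict

def mean_dwell_by_component(stops):
    """Average time a transaction spends in each component.

    stops -- iterable of (transaction_number, component, received, released)
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _txn, component, received, released in stops:
        totals[component] += released - received
        counts[component] += 1
    return {c: totals[c] / counts[c] for c in totals}

# Illustrative data only.
sample = [
    (1, "Request Desk", 9.0, 9.5),
    (1, "Search Unit", 9.5, 13.5),
    (2, "Request Desk", 10.0, 11.0),
    (2, "Search Unit", 11.0, 12.0),
]
dwell = mean_dwell_by_component(sample)
slowest = max(dwell, key=dwell.get)  # candidate delay point
```

A report of this kind would be rerun after each shift of manpower or change of procedure, which is the repeated round of reports and analysis described above.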
After all management data has been combined in the
file maintained by the EDP system, reports can be
generated on a regular or demand basis with much less
expenditure of manpower. The nature of the reports
will probably vary little after the shakedown period
is completed. However, the volume of data which must
be manipulated dictates an EDP mode of report generation.
Chapter 5.6.
SYSTEM FLOWS AND TRANSACTIONS
This chapter provides a more detailed view
of system flows and transactions, i.e., the more
dynamic aspects of the data processing activity,
including some descriptions of illustrative tasks.
The document image storage and delivery portion
of the system is covered in outline only, leaving
the more definitive treatment of this subject to
Volume VI. Similarly, only passing mention is
made of the EDP design since it is fully discussed
in Volume VII.
5.6.1. DOCUMENT INPUT
Referring to Figure 5-4, the input to the
system will be described.
The principal categories of incoming
documents will consist of (a) textual-type
documents received in all source classifica-
tions ranging from Unclassified to T/KH, (b)
select documents (principally SI Teletype)
received in machine language (as well as hard copy),
(c) graphic images in the form of ground photography
and films, (d) maps, and (e) machine language (ML)
index records prepared by external organizations
according to CHIVE rules and formats. Graphics and
maps will continue to flow to GR and the Map Library
Division (ML) through their existing acquisition
channels. The only significant change in their oper-
ations will be that they will employ the CHIVE vocabu-
lary in their indexing or cataloguing operations, and
will transmit a copy of their index transcript sheets
to CHIVE for conversion into machine readable form
and entry into the Master Index File. CHIVE in turn
will return to them a printed version of their index
records for entry into their manual files where this
seems desirable.
Documents selected by the information analyst
which are available in machine language and have a
formatted header and title (e.g., SI Teletype) will
bypass indexing and transcription steps and go, in
their machine language versions, directly to the EDP
System where the necessary conversion to CHIVE format
will be performed. The hard copy versions of the
documents will be sent simultaneously to microfilming
for processing into the microimage store (Master Image
File). Other machine language receipts, consisting
of abstracts of foreign scientific and technical
literature, bibliographic records, and formatted in-
formation extracts pertaining to named-object data
appearing in open sources, may likewise be input
directly to the EDP System. Printed versions of
these receipts, however, may be passed to information
analysts within the system who will thereby be afford-
ed the opportunity to review their content, and, if
desired, delete the corresponding machine record
from the EDP file. Since the source documents will
not accompany these ML inputs, no photoprocessing
will be required.
The remainder of this section will deal with
the principal input flow process depicted in Figure 5-4,
i.e., that relating to all-source textual documents.
Upon their receipt in the mail room, these
documents will be counted, batched by type, and
assigned document control numbers where required.
The batches will then be forwarded to a dissemination
unit where the documents will be disseminated to other
offices as well as to CHIVE. Documents to be distributed
to CHIVE will be divided into two categories:
(a) reports for which CHIVE has a repository responsi-
bility, and, therefore, must be kept regardless of
substantive content (hereafter referred to as "R"
documents); and (b) non-repository ("NR") documents
whose retention value can only be determined after
examination by an experienced intelligence information
analyst.
"R" documents (constituting the vast majority of
incoming receipts) will be addressed to the appro-
priate CHIVE subcomponent, but will flow initially to
a centralized Header Indexing Group which will index
the bibliographic data on the documents. Once this
operation is completed, the documents will be transmitted
directly to the Document Delivery System for
image processing into the Document Image File, while
the header index would be sent to the EDP System for
conversion to machine language. The "R" documents
which, in all probability, would be the ones most
often requested by Agency customers in the period
immediately following their receipt, will (by this
process) find their way quickly into the document
store where they will be available for retrieval.
Following image processing, they will be forwarded
to the CHIVE analytical desks marked on the documents
for content review and indexing where warranted.
"NR" documents will bypass the centralized
Header Indexing Group, being forwarded by the
Dissemination Unit directly to the analytical
components within the CHIVE geographic divisions.
Here a further redistribution of some of the "R"
as well as "NR" documents might take place if the
initial dissemination was not sufficiently precise.
In any event, the ultimate recipient of both types
of documents will be an information analyst specializing
in an area or a topic within an area. His responsibility,
relative to the "R" documents, will be to determine
whether content indexing is warranted in addition to
the header indexing already performed. If not, he
will destroy the documents and send a notice to the
EDP System that no content index will be forthcoming.
If the documents, however, do warrant content indexing,
he will mark the parts of the documents which he wants
reflected in the index, and will pass the marked docu-
ments to a Content Indexing Group serving his Division.
(The activity of these individuals is described below.)
"NR" documents will likewise be examined by infor-
mation analysts and will either be destroyed or marked
for some form of indexing. If indexing is required,
the documents will be sent first to header indexing
clerks functioning at the division or desk level.
They will prepare header transcript sheets, like their
counterparts in the centralized Header Indexing Group.
Where content indexing is not required but storage is
desired, the "NR" documents will be sent to the
Document Delivery System for microfilming, while
their corresponding header transcript sheets will be
passed to the EDP System. The remaining "NR" docu-
ments (and transcript sheets) which were to be content
indexed will be forwarded to the Division's Content
Indexing Group where they will rejoin the select "R"
documents discussed above.
In the Content Indexing Group, semi-professionals
known as content indexers will prepare content data
transcript sheets by extracting and formatting the
data identified for them by the information analysts.
A selected portion of this work will be inspected and
revised if necessary. Corrections and changes will
be written on the data sheets.
Once the content data transcript sheets have been
prepared, the marked-up copies of the "R" documents
can be destroyed since an image of these will already
be available in the Document Delivery System. The
indexed "NR" documents, however, will be forwarded
to the Document Delivery System for processing into
the Master Image File.
Content data transcript sheets for both "R" and
"NR" documents will be sent to a Data Transcription
Group where they will be copied by typists.* The
typed index entries, after sight verification, will
then be fed to the EDP System for machine processing.
Within the EDP Subsystem, a Page Reader will
convert the clear-text header and content indexes
into machine language. Following this operation,
punched Work Cards will be generated by the computer
from a portion of the header data record which will
be used in the Document Delivery System (see below)
in the preparation of the microimage store. The
complete digitalized records of the header and content
indexes will be processed by computer programs which
will check the records for format and certain types
of content errors and add them to the pertinent system
files.
*Header data sheets can presumably be typed by the
header indexers who prepared them.
In the Document Delivery System, documents to
be kept in hard copy for reasons of length, image
quality, or other will be shelf-filed in an area
contiguous to the microimage file according to their
meaningful document control numbers. The remaining
documents will be routed to a microfilm section.
There they will be photographed, and, assuming the
storage medium selected is the 35 mm. aperture card,
the resultant product will be an aperture card with
the document batch and serial numbers eye-visible in
the aperture. After these numbers are punched into
the aperture cards, the aperture cards will be mechani-
cally collated on these numbers with the deck of Work
Cards prepared by the computer from the header data
records to the same documents. Following collation,
other data punched in the Work Cards will be reproduced
and interpreted for the Vital Materials Repository (VMR)
and NSA as appropriate. Lastly, a master set of the
cards will be filed in document control number sequence
in the Master Image File.
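
The collation step can be pictured as a match of two card decks on a shared key. The sketch below is illustrative only: the record layouts and the `control_number` field are assumptions made for the example, and a dictionary lookup stands in for the mechanical collator.

```python
from typing import NamedTuple


class ApertureCard(NamedTuple):
    batch: int
    serial: int


class WorkCard(NamedTuple):
    batch: int
    serial: int
    control_number: str  # hypothetical field taken from the header data record


def collate(aperture_cards, work_cards):
    """Pair each aperture card with the Work Card bearing the same batch
    and serial numbers, and report any card left without a partner."""
    work_by_key = {(w.batch, w.serial): w for w in work_cards}
    matched, unmatched_aperture = [], []
    for card in sorted(aperture_cards):
        work = work_by_key.pop((card.batch, card.serial), None)
        if work is None:
            unmatched_aperture.append(card)
        else:
            matched.append((card, work))
    # Work Cards still in the lookup had no aperture card.
    unmatched_work = sorted(work_by_key.values())
    return matched, unmatched_aperture, unmatched_work
```

Returning the unmatched cards separately reflects the practical need to notice a document that was filmed without a header record, or indexed without being filmed.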
5.6.2. DOCUMENT RETRIEVAL
Referring now to Figure 5-5, the recovery of
information from the files will be discussed.
The retrieval process will ordinarily begin with
a customer external to CHIVE originating a request
for data either on a form designed for this purpose,
by telephone contact, or by personal visit to the
system. He will be put in touch with an information
analyst working on the geographic/topical area of
concern. The information analyst will be familiar
with the current reporting, having screened incoming
documents to determine what should be indexed, and
will also have had extensive training in the indexing
vocabulary, the logical files available within the
system, and the query language required to conduct
the computer search.
After ascertaining the clearance level of the
customer, the degree of sensitivity desired in the
search, and the heterogeneity of the document base
to be explored (e.g., "search document and photo
indexes, but not maps or films"), the information
analyst (assuming a machine search is required) will
translate the request into a set of commands using
the formal language developed by CHIVE (see section
7.A,). To prepare the necessary search criteria he
will consult the various Vocabulary Control Files--
e.g., MOFIF, ISC, etc.--in order to derive the proper
terms on which the search should be conducted. This
research might also reveal whether certain inherited
files would be worth interrogating (see section 5.5.4.2.1.).
Having determined what descriptors to employ in the
search, he will obtain a request number from a central
control point and proceed to fill out an inter-leaved
set of request forms on which he will identify himself
(as well as his customer) by name and address, cite
the file(s) to be interrogated, detail the logic and
priority of the search, and define the output format
required. One copy of his request statement will then
be sent to the request control point to be added to
the file of open requests. Assuming, however, that
some inherited files must also be searched since the
date span of the request encompassed the period prior to
the initiation of the CHIVE system, the information
analyst may be required to take one or more of the
following additional steps:
a. Examine hard copy files of cards or documents
co-located with his organization component.
b. Request the retrieval of hard copy records
(e.g., AIRA, one-name cards, etc.) from the
system's centrally-located, master document
collection.
c. Consult with other information analysts
familiar with the contents, vocabularies,
and record formats of machine files inherited
by CHIVE and obtain their assistance
(where required) in preparing the special
request forms to interrogate said files.
The formulated machine requests will be typed,
sight verified, and then transmitted to the Page
Reader via the pneumatic tube system. For those
requests to be passed against the EDP files, the
computer will check for such things as the completeness
of the request statement and validation of the
terms composing the query. All requests will then
be queued for processing against the pertinent
inherited and CHIVE-built files.*
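
The machine checks and queueing just mentioned might look like the following in outline. This is a sketch under stated assumptions: the field names, the rule that a lower number means a more urgent request, and the first-come ordering within a priority are all hypothetical and are not taken from the CHIVE query language.

```python
import heapq
from itertools import count

# Hypothetical names for the items the analyst enters on the request form.
REQUIRED_FIELDS = ("request_number", "analyst", "customer", "files",
                   "terms", "priority", "output_format")


def check_request(request, vocabulary):
    """Return a list of problems; an empty list means the request passes."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS
                if request.get(name) in (None, "", [])]
    # Every search term must be an approved vocabulary entry.
    for term in request.get("terms") or []:
        if term not in vocabulary:
            problems.append(f"unknown term {term!r}")
    return problems


class RequestQueue:
    """Queue of accepted requests: most urgent first, then order of arrival."""

    def __init__(self):
        self._heap = []
        self._arrival = count()

    def add(self, request):
        # The arrival counter breaks ties so requests are never compared.
        heapq.heappush(self._heap,
                       (request["priority"], next(self._arrival), request))

    def next_request(self):
        return heapq.heappop(self._heap)[2]
```

A request that fails `check_request` would be returned to the information analyst for correction rather than queued.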
Searches of unconverted EM files will be con-
ducted as at present, with the output taking the
form of existing machine listings which cite
documents, personality dossiers, installation num-
bers, or photo accession numbers relevant to the
request. For files converted to EDP and the CHIVE-
built Master Index File, the product of the search
will also be a listing, albeit in a different form.
On the first page(s) of the listing will appear
the identity of the information analyst levying the
request, the request itself, and the list of document
control numbers which satisfied the search
criteria. On succeeding pages will appear, depend-
ing on the output format requested, either the
complete "hit" index records or select elements
thereof. (Output of a statistical count of the
number of documents which matched the search
*The periodicity of searches may differ between these
files, i.e., inherited files may customarily be
searched only once a day while the CHIVE-built
files will be searched on a demand basis.
prescription, without the records themselves, is
also possible if the information analyst so desires.)
Codes appearing in the records would be translated
into clear text for ease of understanding by the
information analyst and customer (if the latter
also reviews the listing directly).
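
The layout of the listing described above can be sketched as a small routine. The sketch is illustrative only: the function, its parameters, the element names, and the codes used in any example are hypothetical, and record values are assumed to be simple strings.

```python
def build_listing(analyst, request_text, hits, code_table,
                  elements=None, count_only=False):
    """Return the listing as a list of text lines.

    hits       -- list of dicts, each holding a 'control_number' plus
                  coded elements (string values assumed)
    code_table -- code -> clear-text translation
    elements   -- if given, only these elements of each record are shown
    count_only -- if true, print the statistical count and nothing more
    """
    lines = [f"ANALYST: {analyst}",
             f"REQUEST: {request_text}",
             f"DOCUMENTS MATCHED: {len(hits)}"]
    if count_only:
        return lines
    # First page(s): the document control numbers that satisfied the search.
    lines += [f"  {hit['control_number']}" for hit in hits]
    # Succeeding pages: complete records, or select elements thereof.
    for hit in hits:
        lines.append("")
        for name, value in hit.items():
            if elements is not None and name not in elements:
                continue
            # Translate codes into clear text; leave other values unchanged.
            lines.append(f"{name}: {code_table.get(value, value)}")
    return lines
```

Keeping the control numbers together at the head of the listing matches the later step in which the analyst encircles the numbers of the documents he wishes to order.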
The information analyst will study the various
machine listings received to determine the relevance
of the retrieved records to the search prescription,
and, particularly in the case of inherited file out-
puts, will consult with other information analysts
familiar with the contents and vocabularies of such
files as required. In a certain percentage of cases
the output records may, themselves, answer the
request. If so, the retrieval activity will end with
the information analyst transmitting the desired
information by mail or phone to the customer. On the
other hand, the response might have been such that
he will wish to re-enter the request with improved
criteria.
When the index record output is satisfactory
but, in itself, does not supply the answer sought,
the information analyst may order the pertinent
documents from the Document Delivery System before
transmitting the results of the search to his customer
for review. If so, he will encircle the appropriate
document numbers appearing on the first page
of his listing and send this page to the Document
Delivery System. Where inherited files, however,
are involved he may be ordering personality or instal-
lation dossiers, as well as documents, and will,
therefore, follow a slightly different procedure.
Graphics and map index records uncovered during the
initial search will be transmitted to the customer
who will order these items for himself.
Dossiers, following their retrieval from the
file, will be forwarded directly to the information
analyst requesting same. A replica, rather than the
file copy of all other documents, however, including
those recovered from the existing Intellofax and SR
collections as well as from the microimage and hard-
copy files of CHIVE, will be prepared before being
transmitted to the analyst.
The information analyst will review the output
from the various document files, and, after removing
those documents which do not appear to be pertinent,
will transmit the response to the customer. Alternatively,
the information analyst may be asked to
respond to the inquiry by phone, memorandum, completion
of a customer's response form, or by the preparation
of a narrative report (e.g., a biographic summary).
In the latter case, he would obviously have to supply
information rather than documents, which might ne-
cessitate a more sophisticated analysis and synthesis
of the materials at hand.
Lastly, the information analyst may update
certain of his identifier records, as well as dossier
files, to reflect the results of his analysis (see
section 5.5.4.1.1.), or send a marked copy of his
report (if it deserves retention) back through the
input process for indexing and storage in the Master
Image File. He will also return any master cards or
dossiers to their appropriate files, and report the
closing out of the request by completing his copy
of the request form. The latter will be sent for
processing into the Management Data Files.
5.6.3. INFORMATION FILE BUILDING, MAINTENANCE, AND
RETRIEVAL
As has been pointed out, the CHIVE system, like
the existing central reference operation, will require
a variety of dictionaries and other support tools
(given the general title of Vocabulary Control Files
in this report). In addition, it will maintain sub-
stantive files of information either in unsynthesized
or summary form. Since the procedures for building
such files as well as retrieving data therefrom will
differ substantially from the document indexing and
recovery process, they are reviewed here separately.
Moreover, these files, unlike the Master Index records,
will require continual maintenance, i.e., the deletion
of obsolete or useless data as well as the correction
of, or addition of information to, existing records in
the file. The Master Index File, on the other hand,
will require little maintenance at the sub-record
level as such--only the addition of new records to
the file and the periodic retirement of segments of
the file to a less accessible storage medium.
5.6.3.1. Vocabulary Control File Maintenance
Vocabulary Control Files (e.g., MOFIF, MLD,
etc.) will be consulted by content indexers as well
as header data indexers in order to select the ap-
proved term or code for representing a subject or
named-object mentioned in a document.* These files,
initially, will be represented in listing form
although some alternative reference medium will be
investigated. If the indexer finds no suitable entry
for the topic mentioned in the document, or if the
entry is erroneous or incomplete, he will prepare a
File Maintenance Transcript Sheet on which he will
specify the changes to be made to the file in question,
*The maintenance of the personality identifier file
(Master Dossier Index) is excepted from this discus-
sion since, as the reader will recall from section
5.5.4.1.1., names will not be "identified" during
the input process.
using a portion of the same command language employed
in the retrieval of records from the Master Index File.
The File Maintenance Transcript Sheet will be
passed to a dictionary editor who will be responsible
for reviewing all changes made to this specific vocabu-
lary control file. He will insure that the proposed
transaction is legitimate and proper, and, after enter-
ing the proposed changes by hand in his master listing,
will forward the transcript sheet to the Data Tran-
scription Group for typing.
After the transcript sheet has been copied and
any necessary corrections made, it will be processed
in essentially the same manner as the Document Index
Transcript Sheets, that is, the forms will be convert-
ed to machine language by the Page Reader and the
resultant output fed to the EDP System for updating
the pertinent machine files. A record of the changes
made will then be printed out in the various arrange-
ments required, and returned to the dictionary editor
as well as all indexers using the particular vocabu-
lary control file affected. The frequency of
preparation of these printed supplements to master
listings, as well as the frequency with which the
master listings themselves will be rerun, will vary
depending on the number of changes occurring over a
given period of time. The initial period of CHIVE
operation will permit time for some experimentation
to arrive at the most satisfactory procedure.
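
The update cycle for a vocabulary control file, once the dictionary editor has approved the transcript sheet, might be sketched as follows. The three actions ("add", "change", "delete"), the term-to-code mapping, and the wording of the change record are assumptions made for illustration; the report does not specify the maintenance commands at this level of detail.

```python
def apply_transactions(vocabulary, transactions):
    """Apply (action, term, code) transactions to a term -> code mapping.

    The mapping is updated in place. The returned list is the record of
    changes and rejections that would be printed for the dictionary
    editor and for the indexers using the file.
    """
    record = []
    for action, term, code in transactions:
        present = term in vocabulary
        if action == "add" and not present:
            vocabulary[term] = code
            record.append(f"ADDED {term} = {code}")
        elif action == "change" and present:
            old = vocabulary[term]
            vocabulary[term] = code
            record.append(f"CHANGED {term}: {old} -> {code}")
        elif action == "delete" and present:
            del vocabulary[term]
            record.append(f"DELETED {term}")
        else:
            # Unknown action, duplicate add, or change/delete of a missing term.
            record.append(f"REJECTED {action} {term}")
    return record
```

Rejected transactions are listed rather than silently dropped, so that the editor's hand-entered master listing and the machine file cannot drift apart unnoticed.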
5.6.3.2. UIF and SIF Processing
As indicated previously, formatted information
files consisting of logical data units either in
unsynthesized or summary form may be initiated
either by: (a) analysts external to the CHIVE
system having a pressing and continuing need for the
retrieval of select facts (as distinct from documents)
pertaining to a given subject or function; or (b) by
CHIVE information analysts reacting to the accumu-
lative effect of specific request patterns. Require-
ments of this nature, since they will increase both
the human and machine burden, will be reviewed by
managers at the branch or higher level to determine
the anticipated load on the system and its capacity
to respond to same.
Accepted requests for the establishment of UIF
or SIF files will be assigned to one or more infor-
mation analysts conversant in the subject matter in-
volved, for initiation of the input as well as main-
tenance and retrieval processing. Assuming the data
is to be stored in digital files, the information
analyst responsible for the file will consult first
with a specialist assigned to the EDP System known
as an EDP File Analyst. The latter will be thoroughly
familiar with the internal operations of the EDP
System and, in particular, the method used to establish
new digital files. His duties would be analogous
to those of an individual in the Planning Staff of
the Machine Division/OCR, i.e., he will design the
format and record structure of the machine file required
by the information analyst and see to it that
the file is actually established.
In general, the approach of the area information
analyst will be to use the document retrieval system
to help build the required information files. If
the file, however, is to have the characteristics
of an Unsynthesized Information File (see section
5.5.5, above), the actual involvement of the information
analyst in the input process may not be great
since, presumably, the data requested is already re-
flected in the content of document index records
(i.e., the UIF would be built directly from re-
arranged elements of index records).* Where this
is indeed the case, the information analyst will
periodically direct the computer to take such action
by calling for the appropriate standing query and
record generation job to be run.
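
Building a UIF from rearranged elements of index records amounts to a selection followed by a projection. The sketch below assumes, purely for illustration, that index records are simple name-value collections and that the standing query can be expressed as a yes-or-no test on a record; neither assumption comes from the report.

```python
def build_uif(index_records, standing_query, elements):
    """Select the index records that satisfy the standing query and keep
    only the requested elements, in the requested order."""
    uif = []
    for record in index_records:
        if standing_query(record):
            # Rearrange: emit the chosen elements in the order specified.
            uif.append({name: record.get(name) for name in elements})
    return uif
```

Because the routine reads only what is already in the index records, it can be rerun on a schedule without further effort from the information analyst.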
SIF files, on the other hand, will require more
activity on the part of the information analyst
since they will consist of evaluated, summary records
about named-objects or events. These can only be
*If the data is not already being captured, then the
request must be classified as a "special project"
which would require a procedure all its own.
generated (as suggested in section 5.5.6.) by the
analysis of the output from a UIF file, from the
Master Index File, or by the processing of the in-
coming documents themselves. Assuming the SIF is
to be built from data in a UIF, the information
analyst will review the listed product from a UIF,
comparing it with a listing of any records already
stored in the SIF. If he decides to make a change
to the SIF either by adding new data, deleting what
was there, or by replacing old information with
new, he will prepare a File Maintenance Transcript
Sheet (similar, if not identical, to that used to
update vocabulary control files) on which he will
describe the transactions to be performed. This
form will follow the usual path to typing, thence
to the Page Reader, and finally to the EDP System
for computer processing.
The retrieval of data from either the SIF or
UIF files might be initiated for a variety of reasons,
the principal ones being as follows:
a. To provide a listing of changes to the
master file in order to update the infor-
mation analyst's printed version of the
file.
b. To provide a listing of the complete master
file either for reference use by the infor-
mation analyst* or for periodic publication
and distribution to interested customers.
c. To search, in response to a customer's
request, for a specific fact or correlation
of facts which could not be readily derived
by human browsing of the printed records.
Whatever the reason for initiating a retrieval
transaction, the process will be virtually the same
as that followed in the retrieval of document index
records (using the same retrieval language), with
the exception that no inherited files should be
involved in the search and no documents will ordi-
narily need to be retrieved from the document image
store. Schedules can, of course, be set up for the
levying of standing queries which would cause the
listing of all or a portion of a file on a periodic
basis without any action being required on the part
of the responsible information analyst.
*The listing will be the primary mechanism for
analyst-SIF communication.
5.6.4. TASK TABLES FOR SYSTEM TRANSACTIONS
Examples of the step-by-step procedure by which
some of the system transactions outlined above might
be carried out using the equipment, file organization,
program organization, and operator procedures described
elsewhere in this report are provided below. Obvi-
ously, there are a variety of procedures that might
be used to perform any of these tasks. What is suggested
here must, therefore, be regarded as tentative
and subject to modification as procedures are worked
out in detail during Phase III.
With regard to the method of presentation, it
should be pointed out that written descriptions of
even the most routine human activities make difficult
reading at best. And this is no less true of a data
processing operation, especially when couched in the
language of the systems analyst. Secondly, it is a
fact that if flow charts were prepared of many current
central-reference operations, the resultant products
would also appear relatively complex. Yet, somehow,
humans manage to carry out the operations involved.
Lastly, it should be recognized that some atypical
problems are covered in the task tables which would
not ordinarily be encountered in the average trans-
action. These, necessarily, further complicate the
narrative discussion.
The tables which follow have four columns. The
first column (STEP) contains the number of the oper-
ation. The number is used in the body of the table
to reference deviations from the normal sequence of
operations. The phrase, "go to step 10," will tell
the reader that the next operation in the sequence
is step 10. The second column (AGENT) identifies
the person or equipment which is chiefly responsible
for carrying out the operation. The third column
(LOCATION) shows where most of the operation is
carried out.
The fourth column (OPERATION) has one
or more sentences for each operation which describe
what takes place in the operation. These are either
processing operations, in which some action is taken
on the data covered by the task table, or they are
decision operations in which a question is asked and
the consequences are given for the two or more
possible answers. These consequences are usually
in the form of "go to" statements. The statement,
"STOP," is the last statement in the OPERATION
column for a particular task and indicates that the
task is completed.
Next 3 Page(s) In Document Exempt
Table 5-3
OVER-COUNTER DOCUMENT SEARCH

Step  Agent          Location    Operation
----  -------------  ----------  ----------------------------------------
1.    Requester      Will vary   Communicate available bibliographic
                                 identifying data on document(s) wanted
                                 by phone, mail, or in person to Document
                                 Delivery System, and indicate response
                                 priority.

2.    Information    Document    Prepare request form if not already made
      Control Clerk  Delivery    out.
                     System

3.    Information    Document    If control number is available for the
      Control Clerk  Delivery    requested document, send one copy of the
                     System      request form to the search unit
                                 responsible for the particular
                                 collection or sub-file in which the
                                 document would be stored; if the control
                                 number is not available, go to step 10.

4.    Document File  Document    If the document would ordinarily be in
      Clerk          Delivery    the Microimage File, search the
                     System      motorized card file for the document
                                 control number cited and proceed to step
                                 5; if the document would ordinarily be
                                 in the Hard Copy File, go to step 18.

5.    Document File  Document    If the document is found, remove
      Clerk          Delivery    document, replacing it with an "out"
                     System      card, and send document with request
                                 form attached to reproduction; if
                                 document is not found, and it is in a
                                 category for which the system has a
                                 repository responsibility, forward
                                 request to Hard Copy File searchers and
                                 go to step 18.

6.    Reproduction   Document    Prepare paper copy of document on
      Equipment      Delivery    appropriate image-processing equipment.
      Operator       System

7.    Reproduction   Document    Transmit paper reproduction of document
      Equipment      Delivery    plus request form to request receipt
      Operator       System      point, and return master image to
                                 appropriate files section for refiling.

8.    Information    Document    Deliver copy of document (if found) to
      Control Clerk  Delivery    requester. Otherwise notify requester
                     System      that document is either still in transit
                                 or not available in CHIVE (and why). If
                                 requester wishes, hold the request for a
                                 second search after a suitable time
                                 interval.

9.    Information    Document    Record temporary or final completion of
      Control Clerk  Delivery    action on request form and transmit form
                     System      to Data Transcription Group for typing
                                 and subsequent insertion (via the Page
                                 Reader) into the Management Data Files.
                                 End of Over-Counter Document Search.
                                 STOP.

10.   Information    Document    Telephone, or send copy of request form
      Control Clerk  Delivery    to, EDP System.
                     System

11.   Information    Computer    If a priority request, deliver to
      Control Clerk  Center      console operator; if not priority, send
                                 to key punching and go to step 16.

12.   Computer       Computer    Key the request into the computer using
      Operator       Center      the inquiry console.*

13.   Computer       Computer    Using the document identifying handles
      Operator       Center      provided by the requester (e.g., post,
                                 airgram number, JPRS number, date, or
                                 other), search the header data portion
                                 of the Master Document Index File and
                                 print out the corresponding document
                                 control numbers.

14.   Computer       Computer    Transmit results of printout to
      Operator       Center      Information Control Clerk.

15.   Information    Computer    Telephone or transmit request form with
      Control Clerk  Center      list of document control numbers to
                                 Document Delivery System. Go to step 4.

16.   Key Punch      Computer    Key punch search specifications and
      Operator       Center      transmit cards to operations section to
                                 await batch processing.

17.   Computer       Computer    Insert the request into the computer and
      Operator       Center      go to step 13.

18.   Document File  Document    Search the appropriate segment of the
      Clerk          Delivery    Hard Copy File. If document is found,
                     System      remove document, replacing with an "out"
                                 card, send document with request form
                                 attached to reproduction, and go to step
                                 6. If document is not found, so indicate
                                 on request form, send request form back
                                 to receipt point, and go to step 8.

*Cross reference listings, arranged in various sequences,
will also be available for consultation and may be used
in preference to machine queries to recover document
control numbers where this approach would be equally
effective.
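
The "go to" structure of Table 5-3 can be checked by tracing it mechanically. The sketch below encodes only the branching of the table; the `Case` fields are hypothetical names for the questions asked at the decision steps. One point is an assumption rather than a reading of the table: step 5 does not say what happens when a document is not found and the system has no repository responsibility for it, and the sketch sends that case to step 8 so that the requester is notified.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Answers to the questions asked at the decision steps (hypothetical names)."""
    control_number_known: bool = True
    priority: bool = False
    in_microimage_file: bool = True
    found_in_microimage: bool = True
    repository_responsibility: bool = False
    found_in_hard_copy: bool = False


def next_step(step, case):
    """Return the step that follows `step`, or None after the STOP in step 9."""
    if step == 3:
        return 4 if case.control_number_known else 10
    if step == 4:
        return 5 if case.in_microimage_file else 18
    if step == 5:
        if case.found_in_microimage:
            return 6
        # Assumption: with no repository responsibility, go notify the requester.
        return 18 if case.repository_responsibility else 8
    if step == 9:
        return None
    if step == 11:
        return 12 if case.priority else 16
    if step == 15:
        return 4
    if step == 17:
        return 13
    if step == 18:
        return 6 if case.found_in_hard_copy else 8
    return step + 1  # normal sequence


def trace(case):
    """List the steps performed for one request, in order."""
    path, step = [], 1
    while step is not None:
        path.append(step)
        step = next_step(step, case)
    return path
```

For a request that arrives with a control number for a document found in the Microimage File, `trace(Case())` runs straight through steps 1 to 9; a non-priority request without a control number takes the key punch path through steps 16 and 17 before rejoining the search at step 13.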
Table 5-4
GENERATION AND INPUT PROCESSING
OF FORMATTED INFORMATION/INDEX
RECORDS PREPARED UNDER CONTRACT*
Step Agent
Location
Operation
:. i Informa-
tion
Contractor
Receive and log in the periodical,
monograph, or other publication
, Control
to be exploited.
Clerk
2. Informa-
Contractor
Obtain code designation (if a
tion
serial) from an official list
Control
and enter same on a routing
.: Clerk
sheet clipped to the publica-
,
3. Informa-
Contractor
tion.
Sort and distribute publications
tion
to appropriate translators de-
Control
pending upon language or content
. Clerk
4. Trans-
Contractor
of publication,
Scan content of publication for
labor
data of interest to CHIVE and
determine elements of informa-
tion to be extracted.
*This table illustrates the procedure which might be
followed where the following conditions prevail:
(a) CHIVE can influence the automation of data at
the source- (b) the elements of information to be
extracted lend themselves to a highly formatted
record structure. Information of this type which
enters the central reference system now, but only
in hard copy, includes the Political and Scientific
Biographic Cards from JPRS, Bibliographic Cards
from the MIRA contract at the Library of Congress,
abstracts of scientific articles from FDD, etc,
- 179 -
Approved For Release 2000/05/30 : CISEEIRE78-03952A000100050001-7
Approved For Release 2000/05/?kaETRDP78-03952A000100050001-7
Step
Agent
Location
Operation
5.
Trans-
lator
Contractor
Type formatted transcript sheet
for Ach article, monograph, or
other, containing the pertinent
information required. 'Enter
. .
data in English in the appro-
priate columns or spaces pro-
vided, and in the coding con-
vention required where this does
not require dictionary consul-
tation. For the latter (e.g.,
organization names), enter
descriptor in clear text. Type
"remarks" - type information,
the abstract body (if a scien-
tific article), and similar un-
formatted text at the end of
the index record.
6.
Trans-
lator
Contractor
Clip transcri9t Sheet to publi-
cation and transmit both to
coding group co-located with
the Contractor or internal to
CHIVE'.
7.
Content
Contractor
Add codes, where required, on to
Indexer
or CHIVE
transcript Sheet in addition to
clear text after consulting per-
tinent CHIVE dictionaries.
8.
Content
Contractor
Return publications to file and
Indexer
or CHIVE
send transcript Sheets to typists.
9.
Typist
Contractor
or CHIVE
If typed product is to .be read by
CHIVE's Page Reader, type entries
in form of hard copy; otherwise,
generate paper tape as well as
hard copy on Flexowriter-like
device and go to step 11.
- 180 -
Approved For Release 2000/05/4tcllerRDP78-03952A000100050001-7
IP
Approved For Release 2000/05/30 : AgaT78-03952A000100050001-7
Step
Agent
Location
Operation
Step 10
Agent: Page Reader
Location: CHIVE
Operation: Read typed copy and feed machine-language product to computer.

Step 11
Agent: Computer
Location: CHIVE
Operation: Process records into Master Document Index File.

Step 12
Agent: Computer
Location: CHIVE
Operation: If CHIVE area desk most concerned with input records generated by contractor does not desire to review additions made to the files, input process is completed. End of Input of Formatted Index Records Prepared under Contract. STOP. If opposite is true, print out (on a periodic basis) a hard copy listing of new records entering system, transmit listing to appropriate CHIVE area desk, and go to step 13.

Step 13
Agent: Information Analyst
Location: CHIVE
Operation: Scan output listing for unwanted items.

Step 14
Agent: Information Analyst
Location: CHIVE
Operation: Prepare a File Maintenance Transcript Sheet containing the usual job specifications (e.g., transaction originator, classification, file to be addressed, date, etc.), the numbers of the unique records to be addressed, and the operation (presumably a "delete") to be performed.
Step 15
Agent: Information Analyst
Location: CHIVE
Operation: Send Transcript Sheet via typing and Page Reader to EDP System for processing.

Step 16
Agent: Computer
Location: CHIVE
Operation: Delete unwanted records from the pertinent file.*
*An alternative approach to that taken in steps 13-16
would have the information analyst responsible for
the file make use of a remote display device to screen
additions to the file and make deletions thereto.
Indeed, such a device could be introduced much earlier
in the input cycle as the means by which codes would
be added to the records and any unwanted entries
deleted before file updating is actually undertaken
by the computer.
Table 5-5
INFORMATION ANALYST ACTIVITY RELATIVE TO AN
ALL-SOURCE, ALL-FILE SEARCH FOR A NAMED PERSONALITY

Step 1
Agent: Information Analyst
Location: [illegible]
Operation: Obtain available identifying data (e.g., name, citizenship, occupation, affiliation) on personality wanted by phone, mail or in person from requester.

Step 2
Agent: Information Analyst
Location: [illegible]
Operation: If request has been levied on right party, accept same; if request has been levied on right area desk but wrong Information Analyst (because, on this desk, there is more than one analyst and each specializes in a different topic), transfer request to correct individual.

Step 3
Agent: Information Analyst
Location: [illegible]
Operation: Obtain request number from central control point and enter in first section of interleaved request form the elementary data needed for logging purposes, i.e., name of requester, date, name of analyst handling request, etc.

Step 4
Agent: Information Analyst
Location: [illegible]
Operation: Send one copy of request form to control point for filing with other "open" requests.
Step 5
Agent: Information Analyst
Location: C.G.D.
Operation: Search Master Dossier Index listing for references to inherited as well as CHIVE-built dossiers. If an entry for the personality is found, extract dossier number and date dossier identifier record was last updated.

Step 6
Agent: Information Analyst
Location: C.G.D.
Operation: Enter in the query statement section of one copy of the interleaved request form the specific search parameters to be used in querying the CHIVE-built Master Index File. For example, if the Name Group Table is to be used, enter single spellings of both surname and personal names; if the name group feature is to be bypassed, enter the specific variant spellings to be included in the search; if FNU's are not wanted, so specify; if a dossier is available on the personality, exclude unwanted references already on file in the dossier by specifying that the date of preparation of any document index record containing the desired name should not be of a lesser value than the date the dossier identifier record was last updated. Also list any other factors which will serve to limit the scope of the search, e.g., citizenship, general or specific occupational category, date of birth range, etc.
Step 7
Agent: Information Analyst
Location: C.G.D.
Operation: Assuming the Special Register (SR) name index portion of the Detail File to Comint Reports has not been integrated with the CHIVE Master Document Index, complete the copy of the interleaved request form used for searches of the SR Detail File, consulting (as necessary) with an Information Analyst familiar with the vocabulary and file structure of this inherited file system. Refer to the printed version of the Name Group Table to help select the variant name spellings to be searched in this file, and also include any variant spellings required if the transliteration system employed in this file is unique.

Step 8
Agent: Information Analyst
Location: C.G.D.
Operation: If a dossier was discovered on the personality in step 5, enter its number on the dossier retrieval copy of the request form.

Step 9
Agent: Information Analyst
Location: C.G.D.
Operation: Forward the completed request form resulting from step 6 through typing and Page Reader to the Computer Center for retrieval of the pertinent index records from the CHIVE-built Master Index File; forward the request form resulting from step 7 directly to the Computer Center for manual retrieval and subsequent listing of the relevant name records from the punch card file inherited from SR; forward the dossier request form to the hard copy section of the Document Delivery System for recovery of the dossier desired.
Step 10
Agent: Information Analyst
Location: C.G.D.
Operation: Telephone or communicate in some other fashion the details of the request to the Graphics Register (GR) for manual retrieval of photographs on the individual wanted from the inherited GR Personality Photo File. (Photos on the person processed subsequent to the initiation of the CHIVE system will be uncovered, initially in the form of index records, in the computer search of the Master Index File referred to above.)

Step 11
Agent: Information Analyst
Location: C.G.D.
Operation: While awaiting receipt of the listed index records from the Master Index File and SR Name File, as well as the arrival of the hard copy dossier and photos, investigate any self-indexed card or document files on personalities inherited from BR which may be located either with the area desk or in the central hard copy files of the Document Delivery System. Also examine any Supplementary Files (e.g., Who's Who publications, commercial indexes, etc.) available at the area desk.
Step 12
Agent: Information Analyst
Location: C.G.D.
Operation: Upon delivery of the index listings from the Master Document Index and SR Name File searches, examine the references printed out to determine whether they indeed refer to the person sought. Consult again, if necessary, with an Information Analyst familiar with the SR system to interpret the output from the SR file.

Step 13
Agent: Information Analyst
Location: C.G.D.
Operation: Assuming the request will not be rerun with improved criteria, identify the documents desired by encircling the appropriate document numbers appearing on the first pages of the listings. (Alternatively, the listing may be on a two-part form which will allow the Information Analyst to keep a carbon copy of the index record listing after using the original as an order for documents.)

Step 14
Agent: Information Analyst
Location: C.G.D.
Operation: Transmit the document orders to the Document Delivery System, and any photo control numbers to GR, for retrieval and reproduction of the items desired.
*It is assumed, for the purposes of this table, that all
material available on the personality being searched must
be examined before a response can be made to the requester.
For this reason, the search cannot end with the retrieval
of an index record or card from a manual file.
Step 15
Agent: Information Analyst
Location: C.G.D.
Operation: Assemble all material collected from the various document repositories (i.e., hard copy dossier . . . reproductions of documents from the CHIVE Master Image File, inherited Comint Document File, and GR Personality Photo File . . . original items pulled from self-indexed card or document files . . . and reference works from the Supplementary Files). Remove those items which, after analysis of the documents themselves, prove to be unrelated to the person in question, and prepare the response in the manner requested by the customer. End of All-Source Search for a Named Personality. STOP.
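
The query restrictions described in step 6 of Table 5-5 (a set of acceptable name spellings, plus a cutoff so that references already held in the dossier are not retrieved again) can be pictured with a short sketch. The Python below is illustrative only and is not the CHIVE command language: the record layout, the names `IndexRecord` and `select_records`, and the sample entries are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Set


@dataclass
class IndexRecord:
    """Hypothetical document index record; field names are illustrative."""
    document_number: str
    name: str
    prepared: date


def select_records(records: Iterable[IndexRecord],
                   name_variants: Set[str],
                   dossier_last_updated: Optional[date] = None) -> List[IndexRecord]:
    """Keep records whose name matches a wanted spelling and whose
    preparation date is not earlier than the dossier's last update."""
    selected = []
    for record in records:
        if record.name not in name_variants:
            continue
        if dossier_last_updated is not None and record.prepared < dossier_last_updated:
            continue  # assumed to be represented in the dossier already
        selected.append(record)
    return selected


# Made-up sample entries, for illustration only.
records = [
    IndexRecord("D-101", "SMITH", date(1964, 3, 1)),
    IndexRecord("D-102", "SMYTH", date(1965, 1, 15)),
    IndexRecord("D-103", "JONES", date(1965, 2, 1)),
]
hits = select_records(records, {"SMITH", "SMYTH"}, date(1964, 6, 30))
```

When no dossier exists, the cutoff is simply omitted and every record carrying one of the wanted spellings is returned.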
Chapter 5.7.
FILE CONVERSION
5.7.1. INTRODUCTION
Of the many types of extant central reference
files which might be candidates for full or partial
conversion to the CHIVE system, two are of primary
concern. These are the document index and document
image type files. In the former category are such
files as the following:
- SR Detail Index File (Comint)
- SR Detail Index File (PI)
- FIB Active Installation Index File
- BR Dossier Index File
- IRS Document Index File
- GR Ground Photo Index
Inherited document image files include:
- IRS Document File (includes aperture cards
and hard copy)
- SR Comint Document File
- BR One-Name File
- FIB Active Installation File (includes cards
and folders)
- BR Dossier Folder File
There are, of course, many other types of
central reference files in addition to those listed
above, including some already in machine language.
Most of these, however, are either information
files of such short-term interest that there would
be little reason for converting the existing records,
or are vocabulary control type files which, while
they might be used to build analogous CHIVE indexing
and retrieval tools, would not be converted per se.
The discussion in this section, therefore, will
cover only index and image files, in that order.
5.7.2. DOCUMENT INDEX FILES
5.7.2.1. Reasons for Conversion
One of the most important reasons for converting
the inherited files to the CHIVE system would be to
create a truly centralized source of reference data
and information for the Agency. Conversion of the
existing document index files to magnetic tape under
the CHIVE system would provide a means of establishing
effective data systems management.
The conversion of the inherited files would
result in a reduction in the total number of document
index files that would have to be maintained. In
addition, conversion of these files would tend to
simplify the operating procedures of the document
indexing and retrieval system. By converting, only
one set of procedures would be needed as opposed to
a set of procedures for the inherited files and a
different set of procedures for the CHIVE-built
files if conversion were not undertaken. Further-
more, a reduction in the total number of personnel
in the document indexing and retrieval system and a
reduction in space should be obtained by converting
the inherited files.
5.7.2.2. Degrees of Conversion
There are at least three different degrees or
types of conversion that are possible. The first is
a direct conversion and is probably the simplest and
least expensive. Direct conversion means simply that
the card image would be converted directly to tape.
This type of conversion would not reduce any of the
duplicative information existing in the card files.
Moreover, it is the least desirable because it would
provide the least amount of flexibility.
The second type of conversion is to convert the
card files to the CHIVE format. This would eliminate
any redundancy existing in the card files by pulling
all data that was indexed on any particular document
into one logical CHIVE record. This type of conver-
sion is more desirable since it would provide good
flexibility and would eliminate the built-in redun-
dancy of the existing card files.
The third type of conversion would be a complete
conversion, both syntactic and semantic. The syntactic
aspects of the change would be similar to that described
in the preceding paragraph. The semantic or
vocabulary conversion, however, would require a considerable
amount of intellectual participation by
analysts from the respective areas where the inherited
files originate. This type of conversion would
be the most desirable and most flexible, but it
would also be the most complex and difficult to
accomplish.
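
The second type of conversion can be pictured with a short sketch. The Python below is illustrative only and is not part of the CHIVE design: the card layout (a document number, a file name, and one index term per card) and the names `CardImage` and `to_logical_records` are assumptions made for the example. It gathers every card indexed against the same document into one logical record, which is the step that removes the duplication a direct card-to-tape copy would preserve.

```python
from typing import Dict, Iterable, List, NamedTuple


class CardImage(NamedTuple):
    """Hypothetical card layout: one index term per card."""
    document_number: str
    file_name: str    # e.g. "Area" or "Personality" (assumed labels)
    index_term: str


def to_logical_records(cards: Iterable[CardImage]) -> Dict[str, Dict[str, List[str]]]:
    """Merge all cards for a document into one logical record."""
    records: Dict[str, Dict[str, List[str]]] = {}
    for card in cards:
        record = records.setdefault(card.document_number, {})
        terms = record.setdefault(card.file_name, [])
        # The document number is held once as the key, not once per card,
        # and a term already recorded for the document is not repeated.
        if card.index_term not in terms:
            terms.append(card.index_term)
    return records


# Made-up sample cards, for illustration only.
cards = [
    CardImage("D-001", "Area", "Region A"),
    CardImage("D-001", "Personality", "Name X"),
    CardImage("D-002", "Area", "Region B"),
    CardImage("D-001", "Area", "Region A"),  # duplicate card
]
logical = to_logical_records(cards)
```

A direct conversion, by contrast, would simply write the four card images to tape unchanged. The semantic step of the third type (mapping inherited vocabulary to CHIVE terms) is not shown, since it depends on analyst judgment rather than on record layout.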
5.7.2.3. SR Detail Index File Study
Of the various document index files described
in Appendix 5.D., only one has been looked at in any
detail to ascertain the conversion possibilities.
That file was the SR Detail Index File--the study
being performed to determine the advisability and
feasibility of converting the file from cards to
magnetic tape or to a direct access device. Some of
the findings of this preliminary study are presented
below. Whether these are representative of similar
conclusions that might be reached vis-a-vis other
inherited document index files after investigation
of their individual conversion potential, one cannot
say. Further study of the entire problem will be
required during Phase III before any final recommendations
can be made.
The following are the data that were collected
from the SR study. The number of cards, in millions,
that would have to be read to convert all of the Detail
Index File is as follows:

    No. 1 File-Subject/Commodity                            7.2
    No. 4 File-Area                                         4.0
    No.'s 2,3,6,7,8,9 Files-Organization and Personality    4.1
                                                           ----
    Total                                                  15.6
This means that 15.6 million cards would have
to be read to acquire all of the data in the current
Detail Index File. This data applies to conversion
to tape or conversion to a direct access device. Both
approaches are discussed in the following sections.
5.7.2.3.1. Conversion to a Magnetic Tape File
For the first part of this study, it was assumed
that the Detail Index File would be converted to one
long file ordered on series-document number, with all
data pertaining to any one document constituting a
logical record. The file size converted to tape would
be approximately 930 million characters. This would
result in approximately 40 tapes for the master file,
with that many as first backup also. This indicates
that at least 80 tapes would be required at any one
time to represent the file on tape.
Assuming a thousand cards per minute input rate
with 20% allowed for manual handling, this results in
307 hours of 360/Mod 30 machine time to read the file
in. This is equivalent to approximately 1.3 months
of Mod 30 time (eight hours per day). Assuming the
read-in is performed on extra shift, the minimum cost
would be $3,000. In addition, 30 to 35 hours of 7090
or 360/Mod 60 time would be needed for sorting, merging,
and file building. This cost would amount to
approximately $14,000. Programming and analysis costs
are estimated at $10,000. Therefore, the initial cost
of conversion would be, at a minimum, about $27,000.
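
The read-in estimate can be checked with a few lines of arithmetic. The sketch below is not from the study; the function name and the way the 20% allowance is applied (as a multiplier on pure reading time) are assumptions. Under that reading, the three component card counts listed earlier (15.3 million in all) give about 306 hours, close to the 307 hours quoted, while the printed total of 15.6 million gives about 312 hours. The text as reproduced here does not settle which figure the study used.

```python
def read_in_hours(cards: float,
                  cards_per_minute: float = 1000.0,
                  handling_allowance: float = 0.20) -> float:
    """Machine hours to read a card file, with a manual-handling allowance."""
    reading_hours = cards / cards_per_minute / 60.0
    return reading_hours * (1.0 + handling_allowance)


# Cost components quoted in the text, in dollars.
READ_IN_COST = 3_000       # extra-shift Mod 30 time
SORT_MERGE_COST = 14_000   # 30 to 35 hours of 7090 or Mod 60 time
PROGRAMMING_COST = 10_000  # programming and analysis

hours_from_components = read_in_hours(15.3e6)  # 7.2 + 4.0 + 4.1 million cards
hours_from_total = read_in_hours(15.6e6)       # total as printed
minimum_cost = READ_IN_COST + SORT_MERGE_COST + PROGRAMMING_COST

print(round(hours_from_components), round(hours_from_total), minimum_cost)
```

The three cost components sum to the $27,000 minimum stated in the text.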
It would take a minimum of three hours to read
a tape file of this size. An additional half-hour per
day would be required for input request processing,
sorting of input and output, output processing, output,
and maintenance. It was assumed that the Mod 60 would
be used to do the search processing. This would amount
to approximately $200 per hour. Assuming a once-a-day
search, the approximate monthly machine rental to perform
the maintenance and retrieval of the SR Detail
Index File would be $15,400. This is approximately
two-and-a-half times the present EAM rental of SR's
Machine Branch.
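
The $15,400 monthly figure is consistent with one full pass of the tape file per working day. The sketch below reproduces it; the figure of 22 working days per month is an assumption, since the text does not state the number of days used, and the variable names are illustrative.

```python
FILE_PASS_HOURS = 3.0        # minimum time to read the tape file once
OVERHEAD_HOURS = 0.5         # request processing, sorting, output, maintenance
RATE_PER_HOUR = 200          # approximate Mod 60 charge, in dollars
WORKING_DAYS_PER_MONTH = 22  # assumption: not stated in the text

daily_cost = (FILE_PASS_HOURS + OVERHEAD_HOURS) * RATE_PER_HOUR
monthly_rental = daily_cost * WORKING_DAYS_PER_MONTH

print(daily_cost, monthly_rental)
```

Because the whole file must be passed for every search run, this cost does not fall if fewer requests are batched into the daily run.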
Turn-around time on requests would suffer by
converting a large file of this nature to tape. The
SR personnel contacted indicated that a 24-hour turn-
around on all requests would be unacceptable. They
further indicated that approximately 20% of the requests
handled by SR require a two-hour-or-less response
time. These priority requests are spread
throughout the file, not just in a selected portion
of the file.
The amount of space presently occupied by the
SR Machine Branch (card files and EAM gear) is approximately
4,300 square feet. A reasonable value to
place on this would be about $4 per square foot per
year. Assuming that 3,000 square feet of this area
could be saved by conversion, this would result in
an effective savings of $12,000 per year.
5.7.2.3.2. Conversion to a Direct Access File
Slightly different ground rules were chosen for
this technique than were used on the "long tape file."
Instead of trying to form one logical record from all
the cards existing in the Detail File which originated
from any one document, the existing file structure
was assumed to be transferred to the Data Cell. Also,
it was assumed that a directory or access file of a
very simple nature would be maintained to enhance re-
trieval on this file. It was further assumed that
the IBM 360/Mod 60 would be used to build the file
and perform the operational activities required of
the file.
The file would reside on a 2321 Data Cell which
has a capacity of 400 million characters on line stor-
age. However, the cells on a Data Cell Drive may be
changed much in the same manner that tapes are changed
on tape drives or disk packs on disk drives. Only
one Data Cell Drive, which can have a maximum of
ten cells on line, is required. The converted Detail
Index File would occupy approximately thirty cells
assuming about 75% packing.
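
The storage estimate can be put in numbers. The sketch below assumes that the 400 million characters of on-line capacity are divided evenly among the ten cells on the drive (that is, 40 million characters per cell); the per-cell figure and the function name are not given in the text and are assumptions for illustration.

```python
import math

DRIVE_CAPACITY_CHARS = 400_000_000  # on-line capacity quoted for the 2321
CELLS_ON_LINE = 10                  # maximum cells mounted at one time
PACKING = 0.75                      # packing factor assumed in the study

chars_per_cell = DRIVE_CAPACITY_CHARS // CELLS_ON_LINE
usable_per_cell = int(chars_per_cell * PACKING)


def cells_needed(file_chars: int) -> int:
    """Whole cells required to hold a file at the assumed packing factor."""
    return math.ceil(file_chars / usable_per_cell)


thirty_cell_capacity = 30 * usable_per_cell
cells_for_tape_estimate = cells_needed(930_000_000)
drive_loads = math.ceil(30 / CELLS_ON_LINE)
```

On these assumptions thirty cells hold about 900 million characters, the same order as the 930 million characters estimated for the tape version (which would need 31 cells). The comparison is only indicative, since the tape estimate was derived for a different record structure. Thirty cells also means that about a third of the file is on line at any one time on a single drive.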
The read-in of the file, assuming a thousand
cards per minute reading rate and 20% handling, would
take 307 hours on the 1402 attached to the Mod 60.
The data could be read on to tape as an interim
measure to save some rental on the Data Cell. How-
ever, some of these savings may be absorbed by addi-
tional programming costs. Assuming the Mod 60 would
be operating in a multi-programmed mode, the cost of
initial conversion would be as follows:
    Reader (1402)               $ 1,600
    Channels                        200
    Tapes                         3,000
    Data Cell                       200
    CPU                             100
    Analysis and Programming     10,000
                                -------
    Total                       $15,100
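
As a check on the table, the sketch below sums the itemized costs and compares the result with the minimum quoted earlier for conversion to magnetic tape. The dictionary keys follow the table; the variable names are illustrative only.

```python
direct_access_costs = {
    "Reader (1402)": 1_600,
    "Channels": 200,
    "Tapes": 3_000,
    "Data Cell": 200,
    "CPU": 100,
    "Analysis and Programming": 10_000,
}
TAPE_CONVERSION_MINIMUM = 27_000  # minimum quoted for the tape approach

direct_access_total = sum(direct_access_costs.values())
difference = TAPE_CONVERSION_MINIMUM - direct_access_total

print(direct_access_total, difference)
```

The analysis and programming estimate is the same in both approaches, so the difference comes from machine charges, chiefly the sorting and merging that the direct access approach avoids by keeping the existing card order.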
As was mentioned earlier, the structure of the
file would be the same as exists presently in cards.
Therefore, no sorting for the input conversion is
needed.
Retrieval on the file would take advantage of
the directory to reduce the number of records that
must be read to satisfy a request. A rough estimate
of the average number of cards accessed from the existing
file is in the range of 60 to 70 thousand per
request. Therefore, 100 thousand cards per request
was assumed as a very safe estimate for the direct
access file. Assuming 10 requests per day (based on
current usage), this results in approximately one
million cards being processed per day. The average
time of 137 microseconds per card was estimated for
card processing. This results in approximately
0.83 hours per month CPU time. CPU time for re-
trieval and maintenance is 1.83 hours plus about 10%
for handling which equals approximately two hours
per month. This results in approximately $300-400
per month rental for the Mod 60 (for everything
except the Data Cell). A range of costs is provided
instead of more stable figures because of the dif-
ficulty in estimating for a multi-programming
environment.
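
The CPU estimate follows from the figures in the paragraph above. The sketch below assumes 22 working days per month, which the text does not state; with that assumption the retrieval time comes to about 0.84 hours per month, in line with the 0.83 quoted.

```python
CARDS_PER_REQUEST = 100_000   # "very safe" estimate from the text
REQUESTS_PER_DAY = 10         # based on current usage
SECONDS_PER_CARD = 137e-6     # 137 microseconds per card
WORKING_DAYS_PER_MONTH = 22   # assumption: not stated in the text

cards_per_day = CARDS_PER_REQUEST * REQUESTS_PER_DAY
retrieval_hours = (cards_per_day * SECONDS_PER_CARD
                   * WORKING_DAYS_PER_MONTH / 3600.0)

# Retrieval plus maintenance as quoted, with about 10% added for handling.
RETRIEVAL_AND_MAINTENANCE_HOURS = 1.83
total_cpu_hours = RETRIEVAL_AND_MAINTENANCE_HOURS * 1.10

print(round(retrieval_hours, 2), round(total_cpu_hours, 2))
```

The total of roughly two CPU hours per month is small next to the time the Data Cell itself is in use, which is why the Data Cell rental is estimated separately.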
Estimated use of the Data Cell is approximately
20 hours per month for retrieval and 54 hours per
month for maintenance if the entire file is passed
each maintenance run. These two functions result in
approximately $1200 a month rental.
5.7.2.3.3. Summary
The comments in this summary generally apply to
both parts of the study except where specifically
stated otherwise.
The following table of data was provided, with
some modifications by SR personnel, from the
Report:
25X1A
                              Request Rates

    File           Searches/Mo.  Searches/Day  Requests/Mo.  Requests/Day
    No. 1                32           1.5          167            7.6
    No. 4                62           3.0          443           20.0
    No. 8                24           1.1           41            2.0
    No. 7                14           0.6           88            4.0
    No. 6                11           0.5           33            1.5
    No.'s 2,3,9          73           3.3         1076           50.0
                        ---          ----         ----           ----
    Total               215          10.0         1848           85.1
The table shows, as the headings indicate, the
average requests per month and day. It should be noted
that 90% of the requests against the No. 4 (Area) and
No.'s 2,3,9 (Personality) files are selected by
manually browsing the files. This means that 57% of the
SR requests are handled manually. Further, from these
facts, it is seen that the conversion to tape or direct
access file would effectively replace an EAM system
that is handling an average of only 93 requests per
month. This usage rate is very low.
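
The 57% and 93-per-month figures can be reproduced from the first column of the table. The sketch below is a check under stated assumptions, not part of the study: it applies the 90% manual share to the monthly counts for the No. 4 and No.'s 2,3,9 files and uses the printed total of 215 (the individual entries as reproduced here sum to 216, presumably through rounding).

```python
per_month = {
    "No. 1": 32,
    "No. 4": 62,
    "No. 8": 24,
    "No. 7": 14,
    "No. 6": 11,
    "No.'s 2,3,9": 73,
}
TOTAL_AS_PRINTED = 215
MANUAL_SHARE = 0.90  # share of Area and Personality requests browsed by hand

manual = MANUAL_SHARE * (per_month["No. 4"] + per_month["No.'s 2,3,9"])
manual_fraction = manual / TOTAL_AS_PRINTED
machine_handled = TOTAL_AS_PRINTED - manual

print(manual, round(manual_fraction * 100), machine_handled)
```

About 121.5 of the 215 monthly items are handled by browsing, leaving roughly 93 for the EAM equipment, which is the workload a converted file would actually take over.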
Even if the total request rate were used, it
would still be a low usage rate for a computer driven
file. The last statement is made for two reasons.
First, if the actual number of requests (from a
computer file point-of-view) were 215 per month,
it would be highly questionable whether this would
be large enough to warrant conversion. Second, the
original 215 "requests" do not actually represent
that many requests from a tape or direct access file
standpoint. To explain--a sheet of paper entering
the SR machine area containing instructions for
searching a file may ask for references relating to
pipes, paper, and cars. These parts are treated as
three requests, not one, even though they all would
go against the same file. However, this would repre-
sent only one request against the file from a tape or
direct access point-of-view. Therefore, the total
request rate of 215 per month would have to be divided
by some factor to reflect how many requests this
would represent in a tape or direct access system.
Data on what this factor should be is not available
at this time.
On the basis of these findings it is recommended
that the total Detail File not be converted to magnetic
tape. On the other hand, conversion of the Detail
File to Data Cell storage appears to be economically
feasible. The costs of performing the conversion and
doing the required retrieval on a Data Cell attached
to an IBM 360/Mod 60 are reasonable. Also, the turn-
around time on a request is satisfactory since it
would only take a little over five minutes to read
and process the required 100,000 records to answer a
request. This should leave adequate time for coding,
and outputting the request.
The decision to convert this file, however,
cannot be based on these technical considerations
alone. The usage rate must also be carefully ap-
praised. Finally, it is important to remember that
this conversion problem is but one of many CHIVE
implementation tasks which must be addressed during
the next 18 to 24 month period.
5.7.3. DOCUMENT IMAGE FILES
A comprehensive list of existing document image
files is contained in Appendix 5.D. along with a
capsule description of the function and activity
characteristics of each. Also included for most files
is an appraisal of each file's susceptibility to being
segmented according to geographical area as a means
of transition to the creation of an all-source document
file.
This section will discuss the conversion alter-
natives and recommend a posture for concurrent oper-
ation of inherited and CHIVE-built document image
files. It is felt that the approach presented will
constitute a basis for orderly implementation of a
new central document reference facility.
It is appropriate, first, to look at the reasons
why conversion to a single document system should be
considered. The overriding argument for such a step
is to eliminate the multiple reference points that an
analyst must currently consult and present to him a
central reference point where a comprehensive response
to his request can be provided. A further incentive
for conversion to a central document system would be
intra-Agency standardization of:
- File media and techniques
- Microfilm processing and reproduction equipment
- Hard copy quality and format
Conversion to a central repository and reproduction
facility also presents a potential for reducing oper-
ating costs by combining similar clerical efforts,
and by facilitating the use of more advanced proces-
sing devices.
Assuming then that there are advantages to be
derived from converting to a centralized document
reference facility, let us consider to what degree
this could reasonably be accomplished.
Of about 25 document image files which are candidates
for conversion (files enumerated in Appendix 5.D.),
many can be excluded from consideration. A policy
decision has been made to
the effect that only textual documents are to fall
within CHIVE repository responsibility. This im-
mediately excludes all graphic files (i.e., photo,
film, slide, and map files) which are to remain
the respective responsibilities of GR and the Map
Library.
Another major group of files are the dossiers
which are subject-oriented folders relating to per-
sonalities, organizations, and installations. These
files are maintained and referenced by information
specialists who generally act as intermediaries between
the consumer and the files. It has not been demon-
strated that this type of information reference
service can be improved by conversion of the existing
files to another storage medium. Consequently, for
the present it will be assumed that these files,
which are primarily under the cognizance of BR and
FIB, will be retained in their present form.
The foregoing exclusions restrict the discussion,
then, to document image files currently maintained by
the Library (Intellofax) and SR. These files are
characterized by direct reference activity by the
consumer, and, in most cases, respond by furnishing
the consumer with a document. Primarily, they fulfill
a document retrieval function rather than an infor-
mation retrieval function, and, as such, are prime
candidates for initial implementation as part of a
centralized document reference service. Other files
may prove suitable for incorporation into such a
facility, but they should be evaluated on an ad hoc
basis after a nucleus system has been established.
Our recommendation, therefore, is that an all-source
document reference facility consisting of document
image files within Intellofax and SR be a design
goal for the initial system.
It should be pointed out that the document system
is largely independent of the CHIVE computer/indexing
effort and consequently could be implemented prior to
placing the EDP system on an operational basis. The
centralization goal could be attained either in one
step or on a modular basis. Either all incoming
documents from the two systems could be incorporated
into the new system, or some portion of each (such
as Chicom materials) could be assimilated into the
CHIVE-built system. The latter approach offers the
advantage of limiting the volume during an initial
shakedown phase.
The question remains as to how such an all-
source document reference capability could be
instituted. Essentially, it involves the problem
of somehow combining two diverse inherited systems
and integrating these with a third, new CHIVE-built
system. As a fundamental tenet, total conversion of
the existing document image files to the newly adopted
file medium is not warranted or practical. The in-
herited files are very large in volume, having been
accumulated over a number of years. Conversion to
virtually any new system would require a copy of the
document to be completely re-photographed and re-
processed into the new file medium. Some partial
conversion to the new system might prove advisable
for any segment of the file where high reference
activity, over a long term, can be anticipated.
However, because of the low activity rate of the total
file, the cost of converting records which will never
be active should be avoided. The recommended posture,
therefore, is that inherited files will not be converted
from their current form but will merely be co-located
within a single area along with the CHIVE-built
files. The appropriate processing equipment will be
installed within this same area and a single reference
point will be presented to the consumer. Requests
will be serviced through the appropriate systems, and
responses furnished through a single distribution point
where the proper enforcement of security restraints
will be administered. The inherited files will be
retained for reference purposes only and will not be
augmented. All new items introduced into the file will
be assimilated into the CHIVE-built system.
It is recognized that the recommended approach
perpetuates existing files and techniques while intro-
ducing one additional document system to operate con-
currently. Nonetheless, this approach seems to be
the only feasible way to cut over to a single,
standardized document system and also eliminate the
extreme cost and effort associated with a large-
scale retrospective conversion. Experience has
shown that there is a bias of reference activity
toward more recent materials which would effect a
gradual phasing out of the inherited systems with
the growth of the CHIVE-built document file.
Chapter 5.8.
COMPUTER INTERFACE
5.8.1. GENERAL
The EDP portion of CHIVE will perform the follow-
ing functions:
- Build and maintain files
- Create sub-files from existing files
- Search files and retrieve data from them
- Display data
The techniques chosen to implement these functions
provide a built-in flexibility that will also allow
revisions in the definition of the content and struc-
ture of CHIVE-built files.
In a computer based system, special effort must
be devoted to inputting data, searching for it, re-
organizing it, and subsequently displaying it. An
integral part of the EDP system is a command language
that allows these types of manipulation. It is recog-
nized that "unlimited" flexibility is allowed if
the user can be persuaded to use machine language.
More practically, a set of commands is provided that
permits personnel other than programmers to use the
EDP system.
The CHIVE command language is fully described in
Appendix 7.A. The language allows the user to direct
the performance of the four functions mentioned above.
Full use of the commands requires good knowledge of
the indexing procedures, logic, and the content and
structure of the records and files to be manipulated.
It is planned that only information analysts, diction-
ary editors, and, to some extent, content indexers,
will be trained to use the language.
The responsibilities concerned with defining new
files and modifying existing file definitions will be
assigned to the EDP file analyst. (See section 5.2.3.
for further description.) The EDP file analyst must
be trained to a level similar to that of a programmer,
since he must be able to specify files to the system,
initiate jobs for the machine operations personnel and
participate in subsequent check-out.
5.8.2. COMMAND LANGUAGE
The command language permits the information
analysts to direct the EDP system to provide desired
results and products. The first consideration of the
user is to build and maintain files. The usual file
maintenance operations are provided. They are:
- Adding new data to a file
- Changing existing data
- Deleting existing data
The user can control the file maintenance operations
in either of two ways. The first way is the usual
one of specifying a unique record identification and
then having the desired maintenance performed on that
record. The second way is to specify logical condi-
tions that could qualify a single record or many
records within a file for the specified maintenance
operation. For example, it may be desired to change
the names of all factories named the Stalin Works to
[illegible]. In such a case it is only
necessary to set up the test condition with a replace
command. The desired changes are made without requir-
ing the user to know in advance the unique identifi-
cations of all of the records involved in the trans-
action.
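The second way of controlling maintenance can be sketched as follows. This is a minimal illustration in Python, not CHIVE command syntax (which is given in Appendix 7.A.); the record layout, the term names, the function replace_where, and the placeholder new name are all assumptions made for the example.

```python
# Hypothetical file: each record is a mapping of term -> value.
records = [
    {"id": "R001", "installation_name": "Stalin Works", "country": "A"},
    {"id": "R002", "installation_name": "Radar Plant 7", "country": "A"},
    {"id": "R003", "installation_name": "Stalin Works", "country": "B"},
]


def replace_where(file_records, condition, term, new_value):
    """Set `term` to `new_value` in every record that satisfies `condition`.

    Returns the identifications of the records changed; the user did not
    have to know them in advance.
    """
    changed = []
    for record in file_records:
        if condition(record):
            record[term] = new_value
            changed.append(record["id"])
    return changed


changed_ids = replace_where(
    records,
    condition=lambda r: r.get("installation_name") == "Stalin Works",
    term="installation_name",
    new_value="NEW NAME",  # placeholder; the source does not give the new name
)
print(changed_ids)
```

The test condition alone selects the records; no record identification appears in the request.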
The second concern of the user is to search the
files. The CHIVE command language provides basic
search operators and logical linkage. The available
operators are: and, or, not, greater than, less than,
and equal. In addition, a "scan" operator allows
searches for a contiguous string of characters in a
value field. Notation is provided for specifying that
the character string can be in any position within the
value field or in some relative position. For example,
it may be desired to find all occurrences of the
character string ACZN22 no matter where it occurs in
the value field or only when it is the first six
characters of a value.
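The two forms of the "scan" operator can be illustrated with a short Python sketch. The function name scan, its position parameter, and the sample values are assumptions for illustration only; the actual notation is defined in Appendix 7.A.

```python
def scan(value, pattern, position=None):
    """Return True if `pattern` occurs as a contiguous string in `value`.

    position=None -> the string may be in any position within the value field
    position=n    -> the string must begin at offset n (0 = first character)
    """
    if position is None:
        return pattern in value
    return value.startswith(pattern, position)


# Hypothetical value fields.
values = ["ACZN22-0041", "REF ACZN22", "ACZN2"]

anywhere = [v for v in values if scan(v, "ACZN22")]
leading = [v for v in values if scan(v, "ACZN22", position=0)]
```

Here anywhere holds both values containing the string, while leading holds only the value in which it forms the first six characters.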
Another capability provided by the command lan-
guage is to allow indirect searches. Here we mean
that the user can specify the results of one search
to be used as arguments in a subsequent search. An
example would be: "What universities or colleges
were attended by engineers working at radar plants
in Country A?" A first search is necessary to deter-
mine the names of engineers associated with radar
plants in Country A. A second search can then be
made to associate these engineers with schools. The
command language allows the researcher to specify that
the names of the engineers be automatically used as
input arguments to the second search. Thus the
problems involved with routing an intermediate machine
output to an information analyst, setting up a second
search, and then submitting it to the system are elimi-
nated.
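An indirect search of this kind might be sketched as below. The two files, their term names, and the search helper are hypothetical and stand in for CHIVE files and commands; the point shown is only that the output of the first search is passed directly to the second without manual handling.

```python
# Hypothetical personnel and education files.
employment = [
    {"name": "Engineer 1", "occupation": "engineer", "plant_type": "radar", "country": "A"},
    {"name": "Engineer 2", "occupation": "engineer", "plant_type": "radar", "country": "B"},
    {"name": "Engineer 3", "occupation": "engineer", "plant_type": "radar", "country": "A"},
    {"name": "Clerk 1", "occupation": "clerk", "plant_type": "radar", "country": "A"},
]
education = [
    {"name": "Engineer 1", "school": "University X"},
    {"name": "Engineer 2", "school": "College Y"},
    {"name": "Engineer 3", "school": "College Z"},
    {"name": "Engineer 3", "school": "University X"},
]


def search(file_records, condition):
    """Return the records of a file that satisfy a logical condition."""
    return [r for r in file_records if condition(r)]


# First search: engineers associated with radar plants in Country A.
first = search(
    employment,
    lambda r: r["occupation"] == "engineer"
    and r["plant_type"] == "radar"
    and r["country"] == "A",
)

# The names found become the arguments of the second search automatically.
names = {r["name"] for r in first}
second = search(education, lambda r: r["name"] in names)

schools = sorted({r["school"] for r in second})
```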
New files can be created by preserving the results
of extensive searches of large document files. In
addition, the capability of restructuring records is
provided by the HIT processing commands of the CHIVE
language. These commands allow a user to manipulate
records after they have been found to satisfy search
criteria and before they are transmitted to an out-
put file. The control available permits saving for
output all or specified portions of the original
records. In addition, computations can be specified
and the resulting values can be appended to the new
output records. The resulting files can in turn be
searched and updated in the same manner as any other
system data file.
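The effect of HIT processing can be sketched in Python as follows. The function process_hits, the plant records, and the computed term space_per_worker are assumptions invented for the example; they illustrate saving specified portions of each qualifying record and appending a computed value, not the actual HIT commands.

```python
def process_hits(file_records, condition, keep_terms, computed=None):
    """Build output records from the records that satisfy the search criteria.

    keep_terms -> the portions of the original record saved for output
    computed   -> mapping of new term name -> function of the original record
    """
    computed = computed or {}
    output = []
    for record in file_records:
        if not condition(record):
            continue
        new_record = {t: record[t] for t in keep_terms if t in record}
        for term, compute in computed.items():
            new_record[term] = compute(record)
        output.append(new_record)
    return output


# Hypothetical source file.
plants = [
    {"id": "P1", "country": "A", "floor_space": 1200, "workers": 300, "remarks": "..."},
    {"id": "P2", "country": "B", "floor_space": 800, "workers": 100, "remarks": "..."},
    {"id": "P3", "country": "A", "floor_space": 500, "workers": 250, "remarks": "..."},
]

sub_file = process_hits(
    plants,
    condition=lambda r: r["country"] == "A",
    keep_terms=["id", "workers"],
    computed={"space_per_worker": lambda r: r["floor_space"] / r["workers"]},
)
```

Because sub_file has the same shape as any other file, it can itself be passed to a later search or update.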
The command language also governs printing and
displaying data. Section 7.11. describes output proces-
sing in detail and Appendix 7.C. shows samples of the
types of reports provided by the EDP System. To
specify a report it is only necessary to use the print
command and then to state the name of the file, the
sort sequence, and the output format desired. The
format type includes such parameters as number of lines
per page, width of printed portion of page, top and
bottom literals, pagination, etc. The current report
capability is felt to be adequate at this stage of the
CHIVE development. Additional features will be pro-
vided only after actual need is established in an
operational environment.
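A report request of the kind described (file, sort sequence, and output format) might be modeled as in the sketch below. The function format_report and its parameter names are hypothetical; they mirror the format parameters named in the text (lines per page, width, top and bottom literals, pagination) and do not reproduce the print command of Appendix 7.A.

```python
def format_report(file_records, sort_key, columns, lines_per_page=3,
                  width=40, top="TOP LITERAL", bottom="BOTTOM LITERAL"):
    """Return a list of page strings for the named file.

    Records are sorted on `sort_key`, each printed line is cut to `width`,
    and every page carries the top literal and a numbered bottom literal.
    """
    rows = sorted(file_records, key=lambda r: r[sort_key])
    body = [" ".join(str(r[c]) for c in columns)[:width] for r in rows]
    pages = []
    for start in range(0, len(body), lines_per_page):
        page_no = start // lines_per_page + 1
        page = [top.center(width)]
        page.extend(body[start:start + lines_per_page])
        page.append(f"{bottom} - {page_no} -".center(width))
        pages.append("\n".join(page))
    return pages


# Hypothetical file to be printed.
report_file = [
    {"name": "C", "n": 3},
    {"name": "A", "n": 1},
    {"name": "D", "n": 4},
    {"name": "B", "n": 2},
]
pages = format_report(report_file, sort_key="name", columns=["name", "n"])
print("\n\n".join(pages))
```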
5.8.3. FILE DEFINITIONS AND THE EDP FILE ANALYST
The CHIVE command language allows manipulation
of data in existing files and also permits a way of
creating sub-files which can in turn be processed by
the EDP system. These features directly concern
the information analyst.
The tasks and procedures associated with changing
file definitions and adding new files to the system
are the responsibility of the EDP file analyst. The
CHIVE EDP programs are controlled by external descrip-
tions of the data files to be processed. The data
descriptions taken collectively are called File Format
Tables. Each table describes a file and its consti-
tuent elements. If it is desired to process files
other than those currently defined it is necessary to
add new table descriptions to those already in
existence.
The File Format Tables contain all the informa-
tion about an item that is required to process it.
Included are the terms allowed in a record, term
groupings, which terms are used as identifiers, addres-
sing parameters, occurrence data, how stored, and con-
tent legality parameters. Extensive revisions can
be made to the tables. In addition to adding new files,
terms can be added to or deleted from an existing file.
Legalities can also be changed. It is important to
note that revisions of this type do not require any
maintenance to the EDP programs.
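The external file definition concept can be illustrated by a small table-driven sketch. The table layout, the term names, and the function check_record are assumptions made for the example and are not the actual File Format Table design; the sketch shows only that a new term becomes acceptable by revising the table while the program is left untouched.

```python
def is_text(value):
    """A simple content legality test: a non-empty character string."""
    return isinstance(value, str) and value != ""


# Hypothetical File Format Tables: one table per file.
FILE_FORMAT_TABLES = {
    "installations": {
        "identifier": "id",
        "terms": {
            "id": {"max_occurrences": 1, "legal": is_text},
            "country": {"max_occurrences": 1, "legal": is_text},
            "product": {"max_occurrences": 5, "legal": is_text},
        },
    },
}


def check_record(file_name, record, tables=FILE_FORMAT_TABLES):
    """Validate a record purely from the external description of its file."""
    table = tables[file_name]
    errors = []
    if table["identifier"] not in record:
        errors.append("missing identifier")
    for term, values in record.items():
        spec = table["terms"].get(term)
        if spec is None:
            errors.append(f"term not allowed: {term}")
            continue
        values = values if isinstance(values, list) else [values]
        if len(values) > spec["max_occurrences"]:
            errors.append(f"too many occurrences: {term}")
        errors.extend(
            f"illegal value for {term}: {v!r}" for v in values if not spec["legal"](v)
        )
    return errors


good = {"id": "R001", "country": "A", "product": ["radar", "radio"]}
bad = {"id": "R002", "country": "A", "capacity": "high"}

errors_before = check_record("installations", bad)

# Revising the table, not the program, makes the new term legal.
FILE_FORMAT_TABLES["installations"]["terms"]["capacity"] = {
    "max_occurrences": 1,
    "legal": is_text,
}
errors_after = check_record("installations", bad)
```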
The external file definition concept requires
a special maintenance system. There are two main
functions involved: the first concerns generating
file format tables, and the second involves restruc-
turing existing file data records. File format
tables are generated from descriptions supplied by
file analysts. Some types of table revision will
result in producing a table that is inconsistent with
the existing file. In this case, the existing file
is processed so that its item structure reflects the
new table revisions. After this step it is possible
for the EDP system to operate correctly on the revised
file with the new file format table.
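The restructuring step might look like the following sketch, in which existing records are rewritten to agree with a revised table. The function restructure, the defaults argument, and the sample terms are hypothetical; the source does not describe how deleted or newly added terms are actually handled.

```python
def restructure(file_records, new_terms, defaults=None):
    """Rewrite existing records so that they match a revised term list.

    Terms no longer in the table are dropped; terms newly added receive a
    default value when one is supplied and are otherwise left absent.
    """
    defaults = defaults or {}
    restructured = []
    for record in file_records:
        new_record = {}
        for term in new_terms:
            if term in record:
                new_record[term] = record[term]
            elif term in defaults:
                new_record[term] = defaults[term]
        restructured.append(new_record)
    return restructured


# Hypothetical existing file and table revision.
old_file = [
    {"id": "R1", "country": "A", "obsolete": "x"},
    {"id": "R2", "country": "B"},
]
new_file = restructure(
    old_file,
    new_terms=["id", "country", "source"],
    defaults={"source": "unknown"},
)
```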
5.8.4. SUMMARY
The CHIVE EDP System can be viewed by the informa-
tion analyst as a tool for manipulating data. In order
to get at this information, he must learn the rules
and procedures attendant with the CHIVE command lan-
guage. Forms will be designed to aid and guide in
transcribing the commands. The EDP system is designed
to allow random transactions which will obviate to
some extent the scheduling of input to the machine.
Output will be sufficiently identified so it can be
routed back to the information analyst.
It is recognized that the interaction of the
man and machine is never smooth. For this reason
two remote consoles will be included in the initial
system. These consoles will permit experimenting,
in an operational environment, with the problems of
direct communication between the information analyst
and the EDP System. They should be helpful in expedi-
ting search processing, reducing paper output volumes
and in simplifying the problem of routing requests
to and from the computer.
Appendix 5.A.
THE ORGANIZATIONAL PROBLEM
This appendix describes the reasoning which led
CHIVE to recommend the geographic organization of
input and retrieval personnel with additional topical
specialization for certain priority countries. In it,
various alternative organizational configurations are
described and their advantages and disadvantages dis-
cussed. A formal report on the CHIVE Indexing Experi-
ment which led to some revision of the organizational
concept recommended here--namely, the removal of the
coding responsibility as such from the information
analyst's area of concern--will be published in the
near future as an additional appendix to this Phase II
Report.
5.A.1. ORGANIZATIONAL OBJECTIVES
In considering the overall problem of how best
to organize the functions to be performed and personnel
to carry out these functions in a future storage and
retrieval system, it appears logical to address oneself
first to the primary objectives of the contemplated
system and to derive from these a subset of organi-
zational or management requirements which, if met,
could assist in the attainment of the ultimate system
goals. A particular organizational and management
framework, of course, cannot by itself insure the
achievement of a system superior to that now in
existence. On the other hand, it is equally clear
that despite all the advantages of EDP hardware
(including stored program logic, speeds, etc.) and
new developments in the information retrieval state-
of-the-art, these tools alone are as yet insuffi-
cient to provide any major breakthroughs, and indeed
have inherent disadvantages as well as advantages
which, in the final analysis, must be taken into
account. For this reason the efficient organization
and employment of personnel takes on added significance.
In fact, it may well determine whether a major step
forward is possible.
The principal CHIVE system design objectives
which have been discussed in some detail in earlier
documentation may be summarized for the purposes of
this discussion as follows:
Objectives derived from user needs
1. Broader document coverage
2. Increased indexing specificity
3. More exhaustive indexing
4. Capability to answer more complex questions
5. Reduction of retrieval time
6. Single-service point
7. Common system vocabularies
8. All-source output capability
Objectives derived from [illegible] needs
9. Micro-storage medium
10. Increased transcription speeds
11. Increased file utilization
12. More efficient use of available manpower w/o
unacceptable degradation of system performance
13. Reduction of index and support file query time
14. Reduction of manual labor involved in preparing
system outputs (research aids, acquisition lists,
etc.)
15. Improved communication with customer
16. Increased index record lengths so as to
reduce file proliferation
17. Improved evaluative tools for management
Some of the above are themselves organizational
objectives for CHIVE, e.g., items 6, 8, and 15.
Other listed objectives, if they are to be achieved,
have implications at least for the organizational
side of the total system design effort as well as
for other design tasks. Combining the former with
some deductive reasoning about the latter which is
oriented towards the personnel and management impli-
cations thereof, it is possible to form a list of
what might be called CHIVE organizational require-
ments. This list follows, and it is important to
this discussion since it sets the goals in terms of
which various alternative organizational configur-
ations are compared.
Objectives Influencing CHIVE
Organizational Structure
1. Specialization with minimum processing
duplication
Encourage specialization on the part of
information analysts to the extent possible so
as to improve the quality of inputs and relevance
of outputs to customer needs. At the same time
minimize duplicative processing activities--i.e.,
multiple readings of the same documents, expen-
diture of intellectual time in term selection,
transcription, etc.
2. Minimum customer contact points
Facilitate direct interface between the user
seeking information and the information analyst
most knowledgeable on the problem. Provide a
coordination capability where required, but
organize analysts so as to reduce need for same.
3. All-source service from any point
Organize system so that requester, if he so
desires, can receive all pertinent information
from whatever source that bears on his search
problem.
4. Close communication between input and query
handlers
Enable person querying system store to be
thoroughly acquainted with processed inputs.
Similarly, keep indexers informed of requests
being handled by the system. Ideally, input and
query processors should be one and the same.
5. Close communication between system operators
and users
Operators should be fully cognizant of intel-
ligence needs and priorities of research analysts.
This is especially important in the CIA appli-
cation where the breadth of customer subject
interests and responsibilities and the volume of
the data base are so large as to prevent equal
attention being given to all subjects or sources.
6. Document control--first priority
The primary responsibility of the central
reference system, i.e., to establish a basic
retrospective search capability for all positive
intelligence documents of immediate or potential
interest to the Agency, must not be diluted by
the addition of special tasks which, if permitted
to grow unrestrained, would prevent the achieve-
ment of fundamental goals. Elemental priorities
must be established and adhered to, and personnel
organized in a fashion to bar the drift toward
serving specialized user interests.
7. Job satisfaction
Morale of the central reference personnel
must be maintained to reduce turnover and attract
high-quality persons to the staff. Information
analyst positions should afford opportunities
for career growth and offer sufficient intellec-
tual challenge to interest professional employees.
8. Flexibility in personnel allocations
New processing requirements and shifts in
intelligence interests and priorities should not
unduly upset the central reference operations
and organizational structure. Requirements for
retraining should be minimal if standard vocabu-
laries, input, and retrieval systems prevail
throughout CHIVE. Ideally the shift of one or
more persons to more pressing tasks would not
completely destroy an existing activity assuming
the assignment of more than one person to a
given subject or geographic area to begin with.
5.A.2. ALTERNATIVE FIRST-LEVEL ORGANIZATIONAL CONCEPTS
Keeping in mind the above-listed objectives for
organizing the central reference personnel and acti-
vities, what kind of organizational configuration
would appear to offer the best hope of meeting most
if not all of these aims? In this section we will
review some of the possible alternatives without
necessarily considering all variant approaches which
might theoretically be envisaged. The focus here will
be on the initial, or first-level, organizational
breakdown. In a subsequent section we will address
the problem of how to manage activities within the
rough organizational framework selected.
5.A.2.1. Alternative A - Retention of Present
Configuration
Under this concept the existing structure
of OCR would be accepted as is. Input and querying
would be organized by subject (Biographic Register,
and Intellofax), by
subject within source (Special Register), and
by information carrier (Graphics Register and
Map Library). Specialized EDP systems could be
developed which would be tailored to the needs
and desires of each Register or Division which
might well employ different vocabularies, input
and output processes, document storage media,
etc. Alternatively, all systems might be required
to adopt common file formats, dictionaries, pro-
grams, document storage and delivery systems,
and so forth in order to simplify management
understanding and control of processing activities
and reduce design costs.
The principal advantages of this approach
are operator and management familiarity with ad-
ministering such a system, the availability of
trained personnel and established operational
procedures, the avoidance of any drastic reshuf-
fling of personnel and slots with all the atten-
dant problems associated therewith, and the
assurance of continuing a level of system per-
formance at least as high as that which it now
obtains. In summary, the retention of the exist-
ing configuration is attractive because it would
be the easiest to implement, and because we know
it works even if the efficiency and quality of its
performance are perhaps less than might be desired.
The major reason for not following this route
is that, while the risks are less, the system will
always be constrained by the organizational struc-
ture within which it must operate. Thus the
potential for real improvement will be limited.
Specifically, it would be impossible to make any
real progress toward achieving objectives 1-3
above, and what can be accomplished on objective 8
would be severely limited. Redundant reading and analysis
of collateral documents could scarcely be avoided
and the trend toward all-source information files
might foster duplicative processing (already
initiated by FIB's exploitation of Comint materials)
in the SI area as well. Semi-duplicative document
repositories, such as now exist in FIB, BR, the
Intellofax System, and to a minor extent GR,
would probably persist because of the difficulty
of identifying in advance which repository will
choose to keep a given document. Customers
seeking to exploit all the subsystems would still
be faced with the necessity of interrogating each
system separately unless an inter-system reference
group were provided or the system contacted assumed
the responsibility of querying all others. Either
of the latter potential solutions, however, would
interpose request "interpreters" between the
customer and the ultimate respondent with consequent
ill effects to the communication process.
In brief, while Alternative A is appealing
because of its familiarity, its inherent disad-
vantages are sufficient in number to influence a
search for something better if such can be found.
5.A.2.2. Alternative B - Single, All-Source Document
Retrieval System: Separate Biographic Information Facility
Between the extremes of a completely central-
ized, all-source, all-topic storage and retrieval
system and the existing decentralized configur-
ation of OCR many variations and alternative
combinations can be conceived. That which has
attracted the most attention perhaps is the
concept of merging Intellofax, the Special Regis-
ter, and the Foreign Installations Branch but
leaving the Biographic Register as a separate
activity. Proponents of this approach (some of
whom would also except FIB from the merger) gen-
erally point to the "unique character" of the
BR operation, its "analytical" responsibilities,
its production of finished intelligence, the
fact that it is not a document retrieval system
at all but rather an information file, and so
forth.
Most of those favoring this compromise ap-
proach are somewhat vague on the organizational
details. Some, apparently, would establish an
all-source BR, removing the responsibility for
personality control of Comint materials from the
conjoined Intellofax-Special Register operation.
Others would not make this transfer of responsi-
bility arguing, inter alia, that most BR customers
are not cleared for Comint anyway. Some would
retain the all-source FIB system as a separate
file as well, presumably with installation index-
ing remaining a part of the Intellofax-SR document
input activity. The redundant analysis of docu-
ments common to each of these systems has either
not been considered by those who have recommended
this approach or has been accepted as a necessary
evil.
Of those favoring Alternative B or some vari-
ation thereof, most do so in the belief that there
are indeed advantages to be gained from the all-
source approach, integrated indexing, system
standardization across OCR, common vocabularies
and other reference tools, and other CHIVE goals.
Most would, therefore, adopt CHIVE's system recom-
mendations if biographic data handling at least
were excluded.
What appears, however, to disturb people the
most about the prospect of including biographic
intelligence in a centralized system is the index
transcription problem. It is pointed out first
of all that, while the necessity for filling out
transcript sheets has long been accepted by Intello-
fax and SR analysts, it would not be readily ac-
cepted by BR personnel who, in recent years, have
employed a file system (sometimes referred to as
a "Collectanea" by documentalists)* which requires
no transcription at all. Second, there is the
fact that any transcription requirement, no matter
how limited, would diminish the number of person-
ality references which could be processed by BR
since it would necessarily add to processing time.
Third, there is the argument, frequently expressed,
that BR's need for multiple access points to per-
sonality data has fallen off steadily over the
past several years following the assumption of
community responsibility for political person-
alities. Why have more than name control over
files, the reasoning goes, if the majority of
requests are for specific named individuals?
*This term refers to any file system that uses the
general approach of lifting sections from a single
source document, reproducing these excerpts, and
physically filing them under each of the categories
or key words of interest.
The transcription argument might, indeed,
justify leaving BR outside the central system
concept were it not for the fact that following
such a course helps none at all to resolve BR's
storage and retrieval problems. Examined real-
istically, it appears clear that there are only
two fundamental ways of processing biographic
or any other kind of information: (a) by creat-
ing an index to documents containing the pertinent
information (which index is then screened prior
to the recovery of the documents themselves) or
(b) by filing (and, if necessary, reproducing)
the documents under the terms which constitute
the desired search parameters (i.e., by estab-
lishing a "self-indexed document collection").
If the choice is to take the index path then
certain elementary requirements must be met if
retrieval from the system is to be successful.
In the case of large personality record col-
lections it means the index must carry sufficient
identifying information about the personality
to enable the searcher to distinguish between
personalities bearing similar names. The more
identifying information extracted from the docu-
ment the better, but at the price of increased
transcription time. Alternatively, the more ab-
breviated the index the less the transcription
burden, but at the cost of more irrelevant docu-
ments retrieved.
The "collectanea" (or self-indexed document
file) approach offers the user a reverse set of
advantages and disadvantages. On the one hand,
it virtually eliminates the function of having to
transcribe words from documents. On the other hand,
it vastly increases the physical storage require-
ments of the system by virtue of the fact that
each document must be multiplied by as many file
headings as one chooses to store the document
under. Since no system has unlimited space,
this usually means that the means of access to
the document collection are severely limited in
comparison with document index systems. In addi-
tion, the filing problem is exaggerated by the
explosion of the original document population
(witness BR's assignment of [deleted] file clerks
full-time to its central biographic card file and
[deleted] clerks to its dossier system).
The point of this brief detour into the
[illegible] of personality data handling is to make
clear that nothing is really gained by leaving
BR outside the central system framework unless
it has already been concluded that biographic
data will not be controlled by an index per se.
Even this would not necessarily dictate the
exclusion of BR from integrated processing, since it
would be perfectly possible for the input analyst,
after indexing the remainder of the document's
content, to have the document or selected pages
therefrom reproduced and filed (in hard copy or
microimage form) under the personality names of
interest. If, on the other hand, the decision is
to index biographic information then there are
certain very real benefits in integrating this
index activity with the representation of other
subjects discussed in documents.
As for the remaining arguments deployed in
the cause of keeping BR outside the integrated
processing activity, they have little bearing on
the manner in which biographic data should be
stored and retrieved. Rather, they relate to
the analytical functions to be performed, i.e.,
interpretation, correlation, synthesis, etc.,
after the raw material has been recovered from
the files. Admittedly this intellectual process
could be carried out by a separate group altogether,
as indeed often occurs when a customer (e.g., a
scientific intelligence analyst) chooses to review
and interpret the basic documentation himself.
But it can also be performed, perhaps equally well,
by persons who also index and retrieve biographic
information. Whichever path is chosen it need not
affect where and how documents are processed.
5.A.2.3. Alternative C - Co-located Organizational
Configuration
A radically different organization concept
from those discussed thus far, one which deserves
at least brief consideration, is the notion of
decentralizing document processing in the Agency
by dispersing the activity amongst the research
and production components. Among the arguments
for upgrading the so-called "analyst files" versus
attempting to improve the central reference system
are the following:
- Analyst files will continue to be main-
tained whatever is done centrally. Since
they are a major information retrieval
resource why not make them even more
effective and efficient?
- Providing analysts with manpower support
in the form of information assistants
physically co-located with research per-
sonnel in the production offices would
relieve the analyst of most of his file
maintenance problems and enable him to
devote more time to research.
- Analysts could more readily control what
goes into the files thus reducing input
chaff and providing semi-evaluated re-
trieval.
- Full-time information specialists could
index more material than analysts can
process into their files today thus im-
proving the breadth and depth of coverage.
In the decentralized as in the centralized
system approach, it is possible to think of many
ways in which the processing activity might be
organized. The following, however, are perhaps
the most logical alternatives:
a. Decentralized input and files/central
directory of files
Under this approach OCR would virtually
disappear with the exception of the Library,
FDD, and possibly the Graphics Register.
Analysts would continue to process materials
into their own files but might be provided
some machine assistance in the areas of file
manipulation, storage, and reproduction. In
addition, a master profile or directory of
analyst files would be created and maintained
at some central location. Analysts with a
search problem would consult the directory,
determine which file(s) to peruse, and then
either exploit the file directly or work
through the analyst who maintains the file.
Personnel formerly attached to OCR could
either be assigned to the research analysts
as information assistants where they would
perform the bulk of the input and retrieval
activity, or the research analyst population
might be increased by converting the slots
to intelligence production positions.
b. Decentralized Input/Centralized Files
This scheme would be much the same as
the above in that input processing would
still be performed on a decentralized basis.
The difference would be that, in addition to
maintaining decentralized analyst files,
research analysts and/or their information
assistants would be required to transcribe
their indexing in such a fashion that a
record thereof could be passed to a
central storage and retrieval facility.
Similarly, reproductions of the documents
they wished to store or the pertinent cita-
tions thereto would be sent to central
storage. Adoption of this approach would
greatly increase search specificity over
the directory technique and greatly simplify
the problem of gaining access to the data
files themselves.
c. Decentralized input and files for select
subjects/centralized input and files where
interests overlap
This system is perhaps best represented in
the real world by NSA where files of restricted
interest are co-located with the most appropriate
customer offices, while files of interest to
many are maintained centrally.
d. Centralized input and files/information
specialists co-located with research components
This system would continue the central refer-
ence activity without prejudice to decen-
tralized analyst files, but representatives
of the central system would serve on permanent
or rotational assignments in the customer
offices. Their function would not be to index
material for analysts, nor to actually search
and retrieve material from the central system,
but to improve communications between the
analysts and the central storage and retrieval
operation. They would provide advice to
analysts on the reference services available
to them, transmit their queries to the proper
components, identify unnecessary and/or
duplicative data files, inform the central
service of current intelligence priorities and
anticipated retrieval needs, and in general insure
that both sides of the house achieved a full
understanding of each other's problems, capa-
bilities, and requirements.
There is much that is attractive about all
the above alternatives primarily because all
provide better user definition and control of
what the Agency should be retaining in its record
collections, and because all provide a means
for the analyst to exploit potentially useful
files maintained by others. With the exception
of system 3.d., however, which appears to offer
some significant advantages which might well
be tested on a limited basis, all suffer from
one or more of the following disadvantages which
are sufficiently serious to recommend the rejection
of the decentralized organizational concept as
a practical solution:
- Elimination of part or all of the existing
central processing activities would inevitably
give rise to increased record keeping
by Agency analysts. Indexing by these
analysts would be highly duplicative and
inefficient because of overlapping interests
amongst Agency components. Even today the
duplication of analyst file activity is
sufficiently widespread to cause some to
seek ways in which the situation might be
ameliorated. In a recent study* one re-
search analyst reported that "the files of
* The Analyst's Inbox in the DD/I
Area: Help or Hindrance?, 30 June 1964, OTR/IPC,
Confidential.
several offices within OCI and ORR practi-
cally mirror each other, if not in totality,
then at least in certain subjects." Among
the reasons for this situation, the same
analyst observed, is the failure of manage-
ment to properly define the exact responsi-
bility of the analyst beyond his geographic
area, the necessity for the analyst to be
aware of the "big picture," fear of requests
from Agency officialdom whether they fall
within the analyst's assigned mission or not,
physical distance from other potentially
useful files, etc. Whatever the truth of
these remarks (and all were noted during the
CHIVE Fact-Finding Survey of the DD/I), any
enlargement of the analyst's filing responsi-
bilities would result in a corresponding
increase in duplicate files.
- It would be virtually impossible to establish
and maintain inter-analyst consistency in
indexing, and to enforce adherence to standard
rules and practices. The many components
involved, each responsible to a different line of
command, would make coordination and management
most difficult.
- Analysts regard file maintenance as a necessary
evil. Any suggestion that they expand their
input activities, especially if it requires
them to prepare index records in a fashion
which can be "captured" for storage at a central
location, would meet with great resistance.
- Analysts select only a small percentage of
incoming documents for filing. This
fraction of collected intelligence infor-
mation ordinarily reflects a current
problem bias or that material pertinent
to an analyst's production assignments
for the coming year. Moreover, some
information which would be filed by an
analyst with less experience on the job
would be ignored by the more senior type
who has already stored such information
in his head. Unfortunately, the analyst's
cranium, although a well-recognized part
of the Agency's institutional memory, is
not easily accessed by information seekers
and is lost when the analyst leaves the
Agency.
- Analysts almost universally state that
they want and need a central system for
retrospective search and file back-up.
They do not feel that their own files,
nor even the sum of all files of all
research components even if they could
be made readily available to them, would
fully satisfy their requirements.
- The possibility of co-locating select
central reference files with the primary
users, as suggested in 3.c. above, is
practical only for intelligence organi-
zations having clear demarcations of sub-
ject and area responsibility. Regrettably,
no such pattern prevails in this Agency,
as pointed out in the study referred to
above.
- Agency reference responsibilities to other
USIB components, whether imposed by DCI
directive (e.g., biographic) or the result
of tradition and historical precedent,
could be met only with great difficulty
if the centralized file concept were
abandoned. Interface problems of inde-
scribable complexity would inevitably
arise.
In summary, there appears to be no accept-
able alternative to a central reference system
for a consumer population as large and complex
as that represented by the DD/I and other CIA
and non-CIA components.
5.A.2.4. Alternative D - Centralized, Geographically
Organized Configuration
Assuming the organizational objectives listed
on pages 224-227 are indeed the controlling
parameters in selecting a management framework
for a future information storage and retrieval
system for the Agency, it is difficult to con-
ceive of any better way of organizing the person-
nel involved than by grouping them initially by
geographic area. While this would not overcome
all operational problems that can be envisaged,
of all the systems considered it comes nearest
to meeting the requirements outlined above.
In a geographic organizational arrangement
there would be, perhaps, five major geographic
divisions reporting directly to a single manager,
presumably at the Assistant Director level. Most
of the existing central reference repositories
(i.e., BR, FIB, SR, and DD) would be abolished
and their personnel transferred to the new geo-
graphic components. Previous area assignments
would be taken into account in relocating per-
sonnel.
Documents would be disseminated to the geo-
graphic divisions by an external dissemination
group which would also handle dissemination to
the research offices. These documents would in-
clude all materials of whatever classification,
format, or mode of presentation. International
documents (those dealing with subjects or events
occurring in more than one country) would be
routed to each of the geographic desks concerned
when the application of area expertise in the
indexing process seemed justified by the char-
acter of the subject matter dealt with in the
document. The majority of documents, however,
would be processed by one desk only. A single
master file would be maintained of all documents
indexed by the central reference system.
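
The routing rule just described can be illustrated with a minimal sketch. The division names, the country-to-division table, the `needs_area_expertise` flag, and the function names are all hypothetical and introduced only for illustration; the report names none of them. The sketch also assumes that a multi-country document not judged to need area expertise goes to the desk of its first-listed country, a choice the report leaves open.

```python
from dataclasses import dataclass

# Hypothetical table: the report says only that there would be "perhaps
# five" geographic divisions and does not name or define them.
DIVISION_OF = {
    "USSR": "Division A",
    "China": "Division B",
    "Ghana": "Division C",
}


@dataclass
class Document:
    doc_id: str
    countries: list                     # countries the document deals with
    needs_area_expertise: bool = False  # judgment made from subject matter


# Single master file of every document indexed by the central system.
master_file = {}


def route(doc):
    """Return the geographic divisions that should index the document."""
    divisions = []
    for country in doc.countries:
        division = DIVISION_OF[country]
        if division not in divisions:
            divisions.append(division)
    # International documents go to every desk concerned only when area
    # expertise seems justified; otherwise one desk handles the document
    # (assumed here to be the desk of the first-listed country).
    if len(divisions) > 1 and not doc.needs_area_expertise:
        return divisions[:1]
    return divisions


def index_document(doc):
    """Route the document and record it once in the master file."""
    desks = route(doc)
    master_file[doc.doc_id] = desks
    return desks
```

Under these assumptions a single-country document reaches exactly one desk, and a document appears only once in the master file however many desks index it.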
Most requests would be levied directly on
the geographic unit having responsibility for
the area of concern. Occasional requests would
have to be coordinated between the divisions when
more than one country was involved, but this would
be the exception rather than the rule. The respondent,
under the new configuration, would be familiar with
reporting from all sources on the matter of interest
to the customer, and could thus insure that the data
retrieved reflected the full response potential of
the system.
The proposed configuration would lose the
advantage of source specialization in processing
and would pose occasional problems of geographic
overlap in document indexing and query coordination.
However, these disadvantages are not felt to be
serious. The system would come very close to
achieving all of the organizational goals set
forth earlier as the following review of said
objectives demonstrates:
a. Processing duplication
There would be a minimum amount of
redundant reading and expenditure of
intellectual effort in input processing since
the majority of documents would be
completely processed by the person to whom
they were sent. While the international
document problem will arise, there are
fewer international documents than there
are documents dealing with multiple subjects
(i.e., persons, organizations/installations,
commodities, etc.). Nor must Information
Analyst specialization necessarily
be surrendered. Instead of concentrating
on biographic, installation, or other data,
they could specialize in certain topic areas
of interest to intelligence--e.g., military,
economic, political affairs, etc.--within
the country to which they are assigned.
In addition, the extant duplication of
document files would be eliminated with
concomitant benefits in terms of storage
space, reproduction loads, and filing require-
ments.
b. Customer contact points
Analyst inquiries normally relate to a
particular geographic area of the world,
although the information sought is frequently
diverse in character and not restricted to
any particular collection resource. Under
the configuration proposed, there would ordi-
narily be no need for the requester to inter-
rogate more than one component of the system
since the organization of service personnel
would mirror the manner in which user organi-
zations are themselves organized, i.e., by
topic within country.
c. All-source service
One of the principal advantages of geo-
graphic organization is that, in addition
to the establishment of all-source files,
there is an extra benefit to be derived from
the bringing together of information analysts
who have specialized source background. This
pooling of knowledge will make for more in-
formed reference personnel and will help
remove gaps and ambiguities in the data files
and authority lists developed in separate
source environments.
d. Input-output communication
The geographic organization of central
reference personnel does not, in itself,
assure or encumber communication between
input and query handlers. Rather, this is
affected by the communication processes built
into the system, and by the extent to which
personnel specialize in the various functional
areas of input and output processing. These
matters will be discussed in the next
section.
e. Operator-user communication
While it may seem that geographic organization
per se offers no inherent benefits
over the present central reference configur-
ation in terms of insuring better communi-
cation between information and research
analysts, in fact the information analyst in
the proposed system, by virtue of the fact
that he has access to a wider variety of
sources and shares a subject/area assignment
similar to that of his research counterpart,
should be more cognizant of the latter's
resources and problems and, therefore, be
able to offer him better service. This,
to be sure, is not enough, given the separate
physical and operational environments in
which each operates, and for this reason
experiments such as locating certain Information
Analysts in the research components
should be tried as well.
f. Processing priorities
Geographic organization at the upper
management levels cannot prevent information
personnel from being assigned to respond to
narrow interests. Within the geographic
divisions an organizational structure reflecting
processing concerns (e.g., document control
vs. special file projects) might help
[illegible] doing what, but since personnel
can always be shifted around it is management
control which, in the final analysis,
will determine the direction and continuity
of [illegible].
g. Job satisfaction
It would appear that the system proposed
offers a richer and more meaningful environment
for the information specialist than
is now available to him in the majority
of OCR registers. He would not be assigned
one function only as, for example, the input
analyst in the Intellofax System; he would
have access to a greater variety of docu-
mentary materials; he would be able to
specialize in a substantive area of intelli-
gence concern; he would have contact with
users of the information store and thus gain
some appreciation of the problem to which
his effort was addressed; and, not least
important from the Agency's point of view,
he would be better able to assume a research
position if the opportunity arises for him
to make such a move--as it often does.
h. Flexibility
Common system standards and procedures
across the geographic division, as well as
the increased availability of personnel on
any geographic area, should lessen the problems
entailed in re-allocating personnel to accom-
modate changes in user needs. In a sense,
the bringing together of all persons working
on the same country--persons now scattered
amongst the various OCR registers--is analogous
to the establishment of a medical
clinic composed of specialists in various
subject areas versus the continuation of
individual medical practice. The assemblage
of these various skills increases overall
flexibility and assures the highest quality
services.
Before concluding this section of the discus-
sion, some additional facts may be worth noting.
In a report to the Critical Collection Problems
Committee of USIB, the Director/SCIPS observed that
"information processing activities, as contrasted
with collection or research, generally are not
oriented to area or country organization." However,
he went on to point out, "most of the [illegible]
information handling activities surveyed [by SCIPS]
are concerned with peripheral descriptive data
rather than the substantive content of the informa-
tion items and are, therefore, organized on a
functional basis rather than a geographic
coverage basis." A different situation exists,
he said, "where the process is dependent upon
substantive content such as . . . deep indexing."
In the latter case "then the lowest organization
level is more apt to be structured on a geo-
graphic area basis, like collection and research
activities are prone to be."*
In fact there can be little doubt that the
processing of multi-source documents by geo-
graphically-organized personnel will work.
Within our own agency we have the Analysis Branch
of the Document Division organized on this basis
to process inputs into the Intellofax System.
As is well known, the system deals with a wide
variety of intelligence report series and other
documentary media. BR and FIB are similarly ar-
ranged and, though they confine themselves to
restricted subject areas, are faced with an even
greater diversification of documentary inputs,
including books, periodicals, newspapers and
even photos. Outside CIA there is the DIA
document storage and retrieval system which
receives inputs from all USIB agencies and
indexes persons, organizations, locations, as
well as other subjects. Thus, the issue is not
whether input processing organized on geographic
lines will work, or whether a multiplicity of
document types can be handled by a single
organization, but what the tradeoffs are versus
some other approach to the problem.
5.A.3. ORGANIZATIONAL ALTERNATIVES WITHIN A GEO-
GRAPHIC DIVISION
The preceding section was addressed to the issue
of the first-level organization of the central proces-
sing activity. The problem, however, does not end
here since, even if the geographic division concept
is accepted, each geographic division would be so large
that some division of personnel into more manageable
administrative units would be required.
Referring back to the organizational objectives
listed earlier it appears that if the geographic
arrangement makes sense as the first cut, it would
likewise be the preferred approach at every succeeding
management level within the organization until the
country level itself is reached. For example, if it
did not seem desirable to group persons by document
source or by the subject matter in documents they were
assigned to store and retrieve because of the effects
this would have on processing overlap, interface with
the customer, capability for providing all-source
service, and so forth, then it would make equally
little sense to permit them to creep back into the
system, although at a lower level, if the effect on
the system's performance was still the same.
The geographic concept begins to break down,
however, when the volume of activity (input as well
as requests) on a single country is characteristically
so great that a relatively large number of information
analysts must be assigned to the same country. It
would be possible, of course, to have both the docu-
ments as well as the requests distributed indiscrimi-
nately amongst these analysts, but specialization is
always advantageous if it can be achieved at minimum
or no cost to other system goals.
Since not enough is known at this point about
the input/output traffic that can be expected on every
country in the world, nor what the manpower
requirements and constraints will be on the CHIVE system,
it is impossible to state with any degree of certainty
where a division of personnel within a given geo-
graphic area will be required. For some areas it
seems logical to predict that an analyst will have
complete responsibility for a country, e.g., one of
the emerging states in Africa which is of little
consequence in international affairs and, therefore,
engenders little in the way of intelligence reporting
or analyst interest. On the other hand, many informa-
tion analysts will be required for the larger countries
such as the USSR and China and thus the organization
of these analysts becomes a matter of serious concern.
The most reasonable alternative ways of grouping
personnel assigned to one country would seem to be
the following:
5.A.3.1. Organization by Document Source
Adoption of this approach would mean that
separate groups of analysts would be established
for each major document category. These cate-
gories might be the open literature, collateral
intelligence reports, Comint and T/KH, etc. The
principal advantage to be gained from this method
of organization would be the availability of
personnel trained on a document source basis. It
would have stronger selling power if the indexing
systems used were to differ by source. However,
the latter will not be the case. Its disadvan-
tages are that almost every request would have to
be coordinated among the different source-oriented
units since customers would customarily want more
than one source searched; Information Analysts
would operate in different worlds and none would
have a complete picture of reporting in his particular
area of concern; the tendency would be to
maintain separate rather than integrated all-source
files; and the multiple service-point
problem would remain. On balance, it does not
seem to be a desirable approach.
5.A.3.2. Organization by Function
This system would allocate to certain information
analysts assigned to a country the respon-
sibility for indexing all documents received on
their area, to others the responsibility for
answering all requests on said country, and possibly
to a third group the task of maintaining
"special project" files and establishing and
periodically updating information files consisting
of summarized data about a particular person
or group of persons, installation, or activity.
The notion of distinguishing input from
retrieval personnel is not a new one. Libraries
have traditionally followed this approach in
separating the cataloguing from the reference
librarian function. Many EDP-supported informa-
tion retrieval systems have also chosen this
route, the original 433-L system of SAC being a
prime example in which so-called "query specialists"
(as distinguished from "coding specialists"
and "file modification specialists") were to
handle all searches directed against the system.
The advantages of separating personnel by
the functions named are the following:
- It heightens the job satisfaction of those
assigned to the output end of the activity,
thus reducing personnel turnover and en-
abling the system to recruit and retain
higher-quality personnel.
- Persons unqualified to deal effectively
with requesters can be separated there-
from with less embarrassment to management.
Similarly, persons who have neither the
interest, background, nor temperament to
become effective indexers can be given
assignments more in keeping with their
qualifications.
- New personnel can be trained more quickly
if the job responsibilities are more
narrowly defined. This will reduce the
total amount of unproductive time expended
by the system, a matter of no small signi-
ficance if the turnover rate is reasonably
high.
- By encouraging specialization the quality
of the system's performance is enhanced.
It permits processing to go on undisturbed
by request interruptions with some con-
sequent increase in operational efficiency.
- By formally separating the document storage
and retrieval responsibility from special
and general-purpose information file main-
tenance, system functions would be better
defined and management would have a clearer
picture of their investment in either area.
This would bar the often unnoticed drift
of centralized retrieval systems toward
increased special file-building activities
to the detriment of establishing a basic
retrieval capability over the documents
entering the system.
The principal disadvantages of functional
separation are:
- Query specialists would be unfamiliar with
the inputs to the system except those they
retrieved as the result of searches levied
against the files. As a result they would
tend to lose touch with current intelli-
gence reporting unless some mechanism was
provided for them to read select incoming
documents, review the product of the
indexer activity, or other. In addition,
all persons who index documents as well
as answer requests retain a great deal of
information in their heads which is never
reflected in the index representation of
documents. Subtle though this advantage
may be, it makes for more effective service
to customers in ways too numerous to mention.
And it is most difficult to acquire this
knowledge through any other mechanism than
participating in the input process itself.
- Input specialists would have little ap-
preciation of customer needs. Being barred
from dealing with requesters, they would
not know what subjects to stress in their
input processing, nor how to distinguish
the significant from the insignificant.
- The inevitable tendency would be to con-
sider the query specialist a cut above
the indexer to the detriment of the input
person's morale. As experience has shown
in OCR, the request handler would be re-
garded as having the more interesting job
primarily because, having contact with
users, he could understand better what
contribution the entire activity was
making to the intelligence mission. Those
indexers who were unable to make the
change from input to request handling
because no vacancies developed would
ultimately take positions elsewhere.
Those who remained would tend to repre-
sent the less capable and imaginative
until, ultimately, the entire input staff
would take on these characteristics.
- This approach would conflict with the mode
of operation in most OCR components. With
the exception of the Intellofax System,
most OCR systems have chosen to have
the same individuals handle queries who
handle input to the files. Both the Special
Register as well as sections of the Bio-
graphic Register have actually operated
for varying periods of time on a functional
basis but reverted back to the integrated
configuration. Certainly, the majority
of experienced OCR staff members would
prefer to have information analysts operate
in both modes and would resist the other
approach.
- Peak request or input loads would require
the temporary assignment of personnel to
the duty which was not their prime
responsibility. Indexers who performed
the retrieval function would thereafter
be able to claim, and rightly so, that
they were able to do the job; otherwise
they would not have been called on in
the first instance. This would tend to
weaken management's argument for con-
tinuing the distinction.
As can be seen, while a good case can be
made for either configuration, we tend to favor
not making a formal division of central reference
personnel along functional lines. While there
will, inevitably, be some persons in the system
whose functions will be more or less unique, and
others who because of personality or other limi-
tations will be confined to a restricted area of
operations, these will be the exceptions rather
than the rule and, in the latter case at least,
would not be reflected in the formal organi-
zational structure.
5.A.3.3. Organization by Named Object
This configuration would organize the infor-
mation analysts by the major classes of data stored
and retrieved by the system. For example, within
the USSR Division there might be a Personalities
Branch, an Organization/Installation Branch, and
a Subject/Commodity Branch. The Special Register
is divided on this basis today and, in a sense,
the collateral repositories of OCR, i.e., BR, FIB,
and the Intellofax System, are reflections of the
same concept except on a larger scale.
We know that this approach will work since
it has been proven over many years of operating
experience. Furthermore, by introducing this
kind of division at a much lower operational
level (namely the country desk) than is the case
today, many of the ills of the existing system
such as conflicting vocabularies, overlapping
document files, diverse input/output procedures,
and so on might well be eliminated. It also
offers the advantage of immediately identifiable
manpower trained in these particular areas and,
in addition, permits a high degree of analyst
specialization.
What makes this solution unattractive? The
principal objection is, of course, the fact that
it would be a rare document that would not have
to be read and indexed by all three groups. While
Comint materials would be less troublesome in this
regard, collateral documents and open literature
are not typically oriented to any single type of
named object. Attempts to coordinate the input
effort so as to reduce duplication would be
extremely difficult to implement, and document
dissemination would in all likelihood take the
form of dissemination of the same documents to
all three points. Finally, there would remain
the problem of coordinating the response to queries.
A significant proportion of the requests would
relate to all three subject areas and require a
coordinated response.
In summary, while this configuration is pre-
ferable in many ways to the existing central refer-
ence organization, it would be less efficient and
economical than what might be desired. That there
may be a better alternative was suggested earlier,
and it will be the subject of the next section.
5.A.3.4. Organization by Topic
The major selling points for a topic approach
to the organization of the central reference acti-
vity beneath the country level are (a) that it
corresponds more closely than any other configur-
ation to the kinds of requests we can anticipate
will be levied on the system; and (b) that, while
it does not eliminate entirely the problem of the
multi-subject document, it would seem to confine
the problem to reasonable bounds. If the former
statement is accepted, then organization along
topic lines would lessen the need to coordinate
the search activity in order to provide the
customer with a complete response to his query.
Similarly, if documents tend to relate to a single,
though broad, subject area of intelligence concern
(for example, political affairs, scientific and
technical intelligence, military activities, or
economic matters), then the need for multiple
routing of documents should be diminished and
processing duplication minimized.
A preliminary examination of documents enter-
ing the current system, as well as a review of
queries levied on the system by analysts, indi-
cates that both do tend to concentrate on one
or another of these basic subject areas. This
is not too surprising since these are the classic
divisions of strategic intelligence, and collection
as well as production organizations within the
intelligence community reflect this fact. It
also appears that there is a reasonable balance
of documents as well as queries in each of these
topic areas such that there would not be a pre-
ponderance of personnel assigned to any one field.
As to whether it might be desirable to
further refine the topical breakdown within poli-
tical affairs, economics, etc., this would depend
on the number of information analysts assigned to
any one topic. Additional subdivisions are clearly
possible and could be advantageous in that they
would permit increased analyst specialization and
lessen the span of control problem for super-
visors. On the other hand, these benefits might
ultimately be offset by the inability of the system
to separate documents cleanly on the basis of these
increasingly narrow subject categories.
Documents dealing with two or more major
topics would, of course, be received by the system.
However, this need not cause any undue concern.
Multi-processing of a single multi-subject docu-
ment by different topical specialists is less
important than multi-processing of an international
document by geographic specialists. Such documents
would be directed to the unit which seemed princi-
pally concerned for complete indexing even when
the choice seemed rather arbitrary. If it appeared
that the information reported seemed of more than
average significance, this would not preclude an
information copy of the same document being routed
to another unit.
Research analysts should prefer to deal with
topic-oriented information specialists since they
would find them better able to understand their
search problems. Indeed, such information
specialists might in time become more factually
knowledgeable than their customers since they
would have fewer extraneous responsibilities and
could concentrate their exclusive attention on
the subject at hand.
Appendix 5.B.
PRELIMINARY EVALUATION OF THE CHIVE INDEXING EXPERIMENT
5.B.1. SUMMARY DESCRIPTION OF EXPERIMENT
A joint OCR/CHIVE indexing experiment was con-
ducted from about 15 November 1964 to 15 January 1965.
Approximately two months training preceded the indexing
phase of the experiment, while the query and evaluation
phase is expected to extend through May. The personnel
involved included 16 indexers, 4 senior indexers, 3
clerk typists, and 3 project monitors. The data
25X6 consisted of some 5,000 all-source documents on
25X6 during the period 1 July -
30 September 1964.
The experiment was held to test certain organi-
zational and indexing techniques proposed by CHIVE.
Specifically, it was desired to test the following
major concepts:
- That with adequate supporting tools, a person
can satisfactorily index all of the information
contained in documents, i.e., people, organi-
zations/installations, areas, subjects, etc.
- That all-source materials (including Col-
lateral, SI, and T/KH) not only can be proces-
sed and retrieved in one integrated system
but that certain advantages will accrue there-
from.
- That personnel organization by geographic
area and, if necessary, by topic is feasible
and desirable.
- That the CHIVE indexing approach will provide
at least as many entry points to documents as
that now obtainable from the sum of the indivi-
dual indexes and other controls established
in the various registers of OCR.
- That header data (bibliographic) indexing
can be performed by clerical personnel with
a minimum of guidance.
To test these concepts, an experimental
Branch was established. The Branch was organized into
four topical sections: Political, Economic, Military,
and Scientific and Technical. Each section was headed
by a senior indexer. More than half of the OCR person-
nel assigned to the project had some previous indexing
experience, but less than half were currently full-
time indexers. Each section was allotted personnel
who had experience in working with SI materials,
25X6 background, or familiarity with the Intelligence
Subject Code. Some of the individuals had more than
one of these attributes. Unfortunately, few of
the indexers had previous topical specialization
similar to that employed in the experiment.
The indexing tools used during the experiment
included:
- The Intelligence Subject Code
- A listing of on whom the
Biographic Register maintains dossiers
- The Special Register
Manual
- The NIS Gazetteer
- The Special Register Code Book Supplement
to the ISC
- The CHIVE Indexing Manual
- Miscellaneous dictionaries and other reference
works.
The Intelligence Subject Code and the SR Code Book
Supplement were used to index subjects and commodities.
The BR dossier list, SR Organization Manual, and the
NIS Gazetteer were used as authorities for entering
people, organizations, and place names--that is,
whenever a significant person or organization was
encountered, the indexer had to refer to the dossier
list or organization manual, find the correct entry,
and enter the code assigned by BR or SR. All place
names were checked in the NIS Gazetteer for the
correct entry form. The CHIVE Indexing Manual con-
tained the explanation of the indexing techniques,
the method of transcription, and some preliminary
indexing rules and procedures.
The data base was all-source and consisted of
Collateral intelligence reports, translations, the
FBIS, newspaper articles, Comint,
T/KH materials, and miscellaneous
other series. Each document category was represented
in proportion to the total documents currently re-
ceived in that category during a year. All of the
documents concerned
tions with other countries.
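As an illustration only, the proportional representation of document categories described above can be sketched as follows. The category names and yearly receipt figures are hypothetical placeholders and are not taken from the experiment; only the total of some 5,000 documents comes from the text. The largest-remainder rounding rule is likewise an assumption about how fractional shares might be settled.

```python
def allocate_sample(yearly_counts, sample_size):
    """Split sample_size across categories in proportion to yearly_counts."""
    total = sum(yearly_counts.values())
    allocation, remainders = {}, {}
    for name, count in yearly_counts.items():
        # Whole documents owed to this category, plus the fractional remainder.
        allocation[name], remainders[name] = divmod(sample_size * count, total)
    # Hand out the documents lost to rounding, largest remainder first.
    leftover = sample_size - sum(allocation.values())
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical yearly receipts by document category (illustrative only).
yearly_receipts = {"collateral": 42000, "fbis": 18000, "comint": 30000, "other": 10000}
sample = allocate_sample(yearly_receipts, 5000)
```

Under these assumed figures each category receives the same share of the sample that it holds in a year's receipts, and the allocations always sum to the requested sample size.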
For documents which contained multi-country/
subject content, the rule was established to index
that material which would normally be processed by
an operational Consistency was dif-
ficult to obtain here because the indexers had
slightly different interpretations as to what a
25X1A would process.
It was decided not to apply any selection
criteria, but to index all of the information
concerning As a result, many low-
level personalities and installations, as well as
fragmentary subject matter, were indexed which would
not be captured in an operational system. No
selection criteria were applied because it was felt
that realistic criteria could not be established
prior to the experiment and that artificial criteria
would affect the experimental results. It was further
felt that great indexing depth would aid in estab-
lishing future criteria--that is, that the experi-
ment would show that redundant indexing of many sub-
jects is unrealistic. However, despite the lack of
criteria, an indexing consistency test following the
experiment showed that each indexer tended to apply
his own criteria based on his views of what was
important.
The documents were broken out into the
four topical categories mentioned above. Each senior
controlled the flow of material to his indexers thus
assuring that each processed a variety of sources.
Upon completion of the indexing, the seniors re-
viewed the transcript sheets for accuracy and logic.
However, many errors were not caught because neither
the indexers nor the seniors were as well versed in
the system as would be desirable in an operational
system. In fact, it would be fair to say that it
was not until the end of the experiment that the
indexers and seniors were beginning to gain confi-
dence in what they were doing. In addition, several
of the indexers were not suited to the task and would
have to be given other assignments in an operational
system.
Following review by the seniors, the documents
and transcript sheets were transmitted to the three
typists for header data transcription. One of these
clericals acted as a senior for resolving problems.
In addition, an OCS system analyst who had planned
the header data transcription task oversaw this
phase of the operation. The documents were then
filed by a CHIVE accession number, and the tran-
script sheets were transmitted to key punching.
Computer processing resulted in a print-out of
index records which contained errors. These
listings were reviewed by one of the project monitors
and final corrections were made.
5.B.2. PRELIMINARY FINDINGS
The final results of the experiment await the
conclusion of the query phase. However, prelimi-
nary findings relating to indexer reactions, se-
lection problems, indexing times, etc., can be
described, and these are perhaps the critical
factors affecting the organization of the proposed
CHIVE system.
5.B.2.1. Personnel Considerations
The personnel involved in the experiment were
college-graduate professionals and less than half
had worked in jobs that involved full-time indexing.
Even those with an OCR indexing background had
worked on allied tasks such as querying or diction-
ary building, or had served as experts on some aspect
of indexing. In this experiment they did nothing but
index and found the tools and rules with which they
Figure 5.D-15
JOB 3 (KWIC) ELEMENTS OF INFORMATION

Col.    Field Name        Content Description          Corresponding CHIVE Field

1       File              The numbers (1-4) which      None
                          identify the exploded
                          punch card records.
                          Numbers are suppressed
                          in printout.

2-15    Document          Clear-text transcrip-        Card 03 - Doc/Series
        Identity,         tion of the document         identification no. includes
        Series & No.      series and the docu-         series, no. & year. The
                          ment number. Series          digraph subject portion of
                          and number are sepa-         the series is also carried
                          rated by a blank space.      in field 02-02.

16      Year              Year of publication of
                          the report taken from
                          the documentation. The
                          numbers 0 to 9 are com-
                          bined with overpunches
                          to develop 2-digit year
                          on printout.
                            X overpunch for 4
                            No overpunch for 5
                            n overpunch for 6
                            X overpunch for 7*

        Date of Information**
17-18   From Month        0-12                         01-07 first subfield
19      Year              as in 16 above
20-21   To Month          0-12                         01-07 second subfield
22      Year              as in 16 above

*  All year fields for Job 3 are one column and a 2-digit year
   is developed in the same manner for all.
** No information date on PI; publication date appearing on the
   date line is entered in cols. 20-22.
Figure 5.D-15 (Cont'd.)

Col.    Field Name        Content Description          Corresponding CHIVE Field

        Subject segment
23,37,  Keyword Code      3 or 0 are used to           01-14
51,65                     show whether or not
                          the word following
                          is a keyword and,
                          therefore, to be
                          printed in the alpha
                          list of keywords
                          which comprise the
                          index. 3=Keyword.
                          0=Non-Keyword*

24-36   Clear Text        The clear-text words         01-02 and 01-03
38-50                     taken from the docu-
52-64                     ment. Some but not
66-78                     all are dictionary
                          controlled.

79                        a. SI Reports. The           No one-to-one
                          number 2 identifies          relation with
                          Cuban reports. The           CHIVE code.
                          number 3 identifies UAR
                          reports.

                          b. PI reports. One           01-04
                          of 6 codes C,K,S,T,
                          Z,N used to indicate
                          the security channels
                          in which the document
                          is being handled. For
                          multiple channels,
                          highest indicator is
                          used.

80                        Distribution control         No one-to-one
                          symbols 1-7.                 equivalency. CHIVE
                                                       code could be easily
                                                       expanded to accommo-
                                                       date these entries.

* All Keywords are dictionary controlled. See sections 5 and 6
for sample pages from China area book and Job 3 dictionary.
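
The subject-segment layout of Figure 5.D-15 can be illustrated with a short sketch, offered as an example under stated assumptions rather than as the Job 3 program itself. The column positions and the 3/0 keyword code come from the figure; the function names, the helper that builds a card, and the sample words are hypothetical.

```python
# Columns are numbered from 1, as on the punched card.
SEGMENT_STARTS = (23, 37, 51, 65)   # column holding each keyword code
TEXT_WIDTH = 13                     # clear text occupies the next 13 columns


def subject_segments(card):
    """Return (word, is_keyword) pairs for the non-blank subject segments."""
    card = card.ljust(80)
    segments = []
    for start in SEGMENT_STARTS:
        code = card[start - 1]                        # keyword code column
        word = card[start:start + TEXT_WIDTH].strip() # clear-text columns
        if word:
            segments.append((word, code == "3"))
    return segments


def keywords(card):
    """Words flagged 3 (keyword), i.e. those that would appear in the index."""
    return [word for word, is_keyword in subject_segments(card) if is_keyword]


def make_segment(word, is_keyword):
    """Hypothetical helper: one code column followed by 13 text columns."""
    return ("3" if is_keyword else "0") + word.ljust(TEXT_WIDTH)[:TEXT_WIDTH]


# Illustrative card: columns 1-22 blank, three segments used, 79-80 blank.
sample_card = (" " * 22
               + make_segment("STEEL", True)
               + make_segment("OUTPUT", False)
               + make_segment("PLAN", True)
               + " " * 14
               + "  ")
```

With this sample card, only the words coded 3 are returned by `keywords`, which mirrors the rule that non-keywords are carried on the card but not printed in the alphabetical keyword list.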
Figure 5.D-16
FIB TOWN/CITY INFORMATION CARD FORMAT
Col.  1-3     1.  FIB Country Code
      4-6     2.  FIB Political Subdivision Code
      7-30    3.  Location Name
      31-32   4.  200 Chart Series
      33-35   5.  B. E. - WAC Number
      36-40   6.  B. E. - Town Number
      41-42   7.  Degrees (N/S)
      43-44   8.  Minutes (N/S)
      45-46   9.  Seconds (N/S) (South "X")
      47-49  10.  Degrees (E/W)
      50-51  11.  Minutes (E/W)
      52-53  12.  Seconds (E/W) (West "X")
      54-55  13.  Date of Latest Information (Yr)
      56     14.  Source Code
      57-64  15.  AMS Chart Number
      65-69  16.  Location Identification Code
      70     17.  Town Card Indicator ("X")
      71     18.  Town Information Indicators
      72-74  19.  Cat. Design. Code
      75-80  20.  Town "C" Code
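
A fixed-column format of this kind can be read by slicing each card image at the stated columns. The following is a minimal sketch, assuming an 80-character card image; the column ranges are those of Figure 5.D-16, while the field names, the function name, and the sample values are hypothetical labels chosen for the example.

```python
# (first column, last column), numbered from 1 and inclusive, per Figure 5.D-16.
TOWN_CARD_LAYOUT = {
    "country_code": (1, 3),
    "political_subdivision_code": (4, 6),
    "location_name": (7, 30),
    "chart_series_200": (31, 32),
    "be_wac_number": (33, 35),
    "be_town_number": (36, 40),
    "lat_degrees": (41, 42),
    "lat_minutes": (43, 44),
    "lat_seconds": (45, 46),
    "lon_degrees": (47, 49),
    "lon_minutes": (50, 51),
    "lon_seconds": (52, 53),
    "latest_info_year": (54, 55),
    "source_code": (56, 56),
    "ams_chart_number": (57, 64),
    "location_id_code": (65, 69),
    "town_card_indicator": (70, 70),
    "town_info_indicators": (71, 71),
    "cat_design_code": (72, 74),
    "town_c_code": (75, 80),
}


def parse_card(card, layout):
    """Cut an 80-column card image into named fields, trimming blanks."""
    card = card.ljust(80)
    return {name: card[first - 1:last].strip()
            for name, (first, last) in layout.items()}


# Illustrative card image with made-up codes and a made-up place name.
sample_town_card = "123" + "456" + "SAMPLETOWN".ljust(24) + " " * 50
town = parse_card(sample_town_card, TOWN_CARD_LAYOUT)
```

The same `parse_card` function could be reused for the other FIB card formats by supplying a different layout table, since only the column assignments change from one card type to the next.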
FIB INSTALLATION INFORMATION CARD FORMAT
Col. 1-3 1. FIB Country Code
4-6 2. FIB Political Subdivision Code
7-11 3. Location Identification Code
12-35 4. Installation Name
36-40 5. B. E. Installation Number
41-42
6.
Degrees (N/S)
43-44
7.
Minutes (N/S)
45-46
8.
Seconds (N/S) (South "X")
47-49
9.
Degrees (E/W)
50-51
10.
Minutes (E/W)
52-53
11.
Seconds (E/W) (West "X")
54-55
12.
Date of Latest Information (Yr)
56
13.
Source Code
57-64
14.
FIB Identification Number (Firm #)
65-70
15.
Installation Identification Code (ICC)
71
16.
Installation Use/Assoc. Indicators
72-74
17.
Cat. Design. Code
75-80
18.
Installation "C" Code
Figure 5.D-18
FIB LOCATION CROSS REFERENCE CARD FORMAT
Col.  1-3     1.  FIB Country Code
      4-6     2.  FIB Political Subdivision Code
      7-11    3.  Location Identification Code
      12-14   4.  "See"
      15      5.  (Blank)
      16-35   6.  Location Cross Reference Name
      36-40   7.  (Blank)
      41-42   8.  Degrees (N/S)
      43-44   9.  Minutes (N/S)
      45-46  10.  Seconds (N/S) (South "X")
      47-49  11.  Degrees (E/W)
      50-51  12.  Minutes (E/W)
      52-53  13.  Seconds (E/W) (West "X")
      54-69  14.  (Blank)
      70     15.  Cross Reference Card Indicator ("12")
      71-80  16.  (Blank)
Figure 5.D-19
FIB ICF COORDINATE CARD FORMAT
Col.  1-7     1.  Sequence Number
      8-28    2.  Location
      29      3.  Country Code (Target "X")
      30-31   4.  Country Code
      32-35   5.  Political Subdivision Code
      36-40   6.  (Blank)
      41-42   7.  Degrees (N/S)
      43      8.  (Blank)
      44-45   9.  Minutes (N/S)
      46     10.  "N" or "S"
      47     11.  (Blank)
      48-50  12.  Degrees (E/W)
      51     13.  (Blank)
      52-53  14.  Minutes (E/W)
      54     15.  "E" or "W"
      55-58  16.  "APPR" (If Approximation)
      59     17.  (Blank)
      60-63  18.  WAC Number
      64-69  19.  (Blank)
      70     20.  Control "X"
      71-72  21.  (Blank)
      73     22.  Control "X"
      74-75  23.  (Blank)
      76     24.  Town Folder Indicator
      77-79  25.  (Blank)
      80     26.  Card Type "1"
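
The coordinate fields of the ICF card lend themselves to a simple conversion into signed decimal degrees. The sketch below is illustrative only: the column positions, the hemisphere letters, and the "APPR" marker come from Figure 5.D-19, while the function name, the sign convention (south and west negative), and the sample coordinates are assumptions made for the example.

```python
def icf_coordinates(card):
    """Return (latitude, longitude, is_approximate) from an ICF coordinate card.

    Latitude and longitude are decimal degrees; south and west are negative
    (a convention assumed here, not stated in the figure).
    """
    card = card.ljust(80)
    latitude = int(card[40:42]) + int(card[43:45]) / 60    # cols 41-42, 44-45
    longitude = int(card[47:50]) + int(card[51:53]) / 60   # cols 48-50, 52-53
    if card[45] == "S":                                    # col 46
        latitude = -latitude
    if card[53] == "W":                                    # col 54
        longitude = -longitude
    is_approximate = card[54:58] == "APPR"                 # cols 55-58
    return latitude, longitude, is_approximate


# Illustrative card: columns 1-40 blank, then 55 deg 45 min N, 037 deg 30 min E.
sample_icf_card = " " * 40 + "55" + " " + "45" + "N" + " " + "037" + " " + "30" + "E" + "APPR"
position = icf_coordinates(sample_icf_card)
```

For the sample card the function yields 55.75 degrees north and 37.5 degrees east, flagged as an approximation.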
Figure 5.D-20
FIB ICF CITY CROSS REFERENCE CARD FORMAT
Col.  1-7     1.  Sequence Number
      8-28    2.  Location
      29      3.  Country Code (Target "X")
      30-31   4.  Country Code
      32-35   5.  Political Subdivision Code
      36-38   6.  "See"
      39      7.  (Blank)
      40-63   8.  Location
      64-79   9.  (Blank)
      80     10.  Card Type "2"
Figure 5.D-21
FIB ICF NAME CARD FORMAT
Col.  1-7     1.  Sequence Number
      8-28    2.  Location
      29      3.  Country Code (Target "X")
      30-31   4.  Country Code
      32-35   5.  Political Subdivision Code
      36-63   6.  Firm Name
      64-67   7.  Plant Number
      68      8.  Status "X"
      69-75   9.  Firm Number
      76     10.  Plant Folder Indicator
      77-79  11.  Industrial Category Code
      80     12.  Alpha
Figure 5.D-22
FIB MODEL-TYPE/BROCHURE INDEX CARD FORMAT
Col.  1-20    1.  Model Type/Series
      21-59   2.  Descriptive Name
      60      3.  Tech. Material Indicator
      61      4.  Tech. Material Language*
      62-63   5.  Industry
      64      6.  Category Code (ICC)
      65-71   7.  Dossier Number (Firm #)
      72-73   8.  Date (Month)
      74-75   9.  Date (Year)
      76-78  10.  Country Code
      79-80  11.  USSR Area Code
*Admissible Entries are:
(1) English
(2) Native Language
(3) Other
Figure 5.D-23
PUNCHED CARD CHARACTERISTICS OF THE IRS DOCUMENT INDEX FILE (NEW)
[Card layout diagram. Entries shown, in order: Subject Code, Subject
Modifier Code, Clear Text, Subject, Organization Abbr., Place Name,
Area Code, Source Code, Document No., Pub. Date, Classification Code.]
* NOTE: Numbers indicate action codes. These are literal entries.
Figure 5.D-24
PUNCHED CARD CHARACTERISTICS OF THE IRS DOCUMENT INDEX FILE (OLD)
[Card layout table with columns Fields, Punch pos., and Data. Data
entries, in order: Subject Code, Subject Modifier (Action Code), Area
Code, Classification Code, Source Code, Locator No., Related Area Code,
Related Area Code, Pub. Date, Control No.]
Figure 5.D-25
PUNCHED CARD CHARACTERISTICS OF THE FILM INDEX FILE
[Card layout table with columns Fields, Punch pos., and Data. Data
entries, in order: Subject Code, Area Code, Text (Language) Code, Type
Code, Holding Agency Code, Pub. Date (Yr), Classification Code, Title
No. (Control No.), Availability Code.]