Computing at Fermilab Since the Beginning
Computing at Fermilab is turning a number of corners now and will continue to do so during the next two or so years. With the new Tevatron running period starting up next month, a number of data-acquisition systems will now be VAX-based rather than PDP-11-based, as had been the situation since the beginning of the Fermilab experimental program. Furthermore, there is every expectation that funding will be made available in FY86 and FY87 to build a good-sized building to house the Central Computing Facility and the ancillary functions associated with the data-acquisition equipment and the PREP pool. In addition, a new computer system with few or no historical constraints will also be acquired during that period. As we make these transitions, it appears appropriate to look back over the growth of computing at Fermilab since the beginning.
During the early days, the small amount of computing required for physics or design work was accomplished by using the facilities at Argonne National Laboratory. Leased common-carrier lines connected a low-speed RADS (Remote Access Data System) consisting of a printer and a card reader at Fermilab to the Argonne facility. Furthermore, a daily courier service moved cards and paper as required between the two laboratories. The primary equipment installed at Argonne at that time was an IBM 360 Model 75. Some scientific computing was also performed on the CDC 6600's at the Lawrence Berkeley Laboratory (LBL) and at New York University. During the first part of the 70's, DUSAF, the architectural and engineering firm that did the design and supervised the construction of the Laboratory, used their own IBM 1130 for engineering, management, and business functions. For some time, this system also ran the Laboratory payroll.
The very first computers purchased by Fermilab to support physics experiments were two PDP-15's acquired to support an experiment at the AGS and an early Fermilab experiment. At that time, these were state-of-the-art mini-computers appropriate for data-acquisition functions. One of these two processors continues today to perform functions supporting the Film Analysis Facility (FAF).
The first acquisition by the Laboratory of a scientific computer not intended for data acquisition occurred in 1970. About $500K had been allocated for the acquisition of a medium-sized computer to service the bubble-chamber film-measuring and analysis needs generated by FAF. The two contenders in a competitive acquisition were Digital Equipment Corporation (DEC) proposing a PDP-10 and Xerox Data Systems proposing a Sigma-7. Both proposals represented state-of-the-art computers at that time. The competition between the vendors was very intense, and by the time the dust had settled, DEC had made a proposal that the Laboratory found attractive. They had proposed donating at no cost to URA a development PDP-10 system which they had used internally for their own purposes and for which they no longer had a need. The machine also came with a five-year warranty (i.e., there were no maintenance charges for the first five years of the life of that machine at Fermilab).
Since that machine came before the Central Laboratory building had been built, the computer was installed in the protomain pole building in the Village. A second PDP-10 was later acquired in 1971 when the Princeton-Penn Accelerator was shut down and was located in one of the earliest portakamps acquired by the Laboratory, in a location southwest of the Meson Laboratory target building. Both computers were eventually moved to the Central Laboratory building and continued to serve the Laboratory for interactive and other support computing until 1980. In this time, the two machines had been upgraded a number of times to keep up with Fermilab's growing interactive computing needs.
In 1971 it was necessary to lay the groundwork for a data-acquisition approach for the experimental program at the Laboratory. Up to that time, most experiments at most laboratories operated in an independent manner to acquire the necessary computing hardware for the data-acquisition and monitoring functions of each experiment. It was clear that there would be great advantages if a coherent approach to data acquisition could be established across many, if not all, experiments. Thus early in 1971, a request for proposals (RFP) was put out for the acquisition of three robust mini-computer systems to serve the on-line data-acquisition function. The RFP indicated that these three were the first of at least ten to be acquired during the next two years. The eight competitors in this acquisition included the PDP-15 and PDP-11/20 from DEC, the Sigma-3 from Xerox Data Systems, and the Nova-800 from Data General. By a very small margin, the PDP-11 won over the next closest competitor, which was the Data General Nova. These data-acquisition computers and their peripherals are called BISON systems, a contrived acronym for Basic Instrument for the Support of On-line Needs. Since the initial acquisition, the Laboratory has bought almost 100 PDP-11's of various models, more than half of which are used directly or indirectly in the experimental program. Some of the original PDP-11/20's are still performing Trojan work in a number of ancillary roles, such as engineering test benches, and in low-performance data-acquisition areas, such as field mapping.
As part of the specifications for this data acquisition system, a general-purpose standard interface to experimental equipment was defined. Up to that point, each individual experiment had found its own mechanisms for interfacing the electronics associated with the experiment to the computer. In 1970, the European organization ESONE, with the participation of the U.S. NIM Committee, was in the process of defining the CAMAC standard for both data-acquisition and control purposes. At that time it had been decided to use the new, not quite fully defined CAMAC standard for much of the control system for the Fermilab accelerator. In 1971, when the Laboratory defined the specifications for these first data-acquisition computers, CAMAC had just matured to the point where industry was beginning to be interested in building components and modules. Thus it was possible to couple a requirement for such an interface with the acquisition of the first computers.
A number of parallel efforts were initiated to develop an on-line general-purpose program to gather and write the data to tape and to monitor and analyze the performance of the experiment in real time. The effort that proved most successful was one from a number of Cal Tech groups. The program that has become the standard Fermilab PDP-11 data-acquisition product has been extensively expanded and documented and is now widely distributed within the high-energy physics community and to others, as MULTI/DA.
In 1972, the Department of Energy (DOE), then the Atomic Energy Commission (AEC), initiated an attempt to buy computers for all laboratories in need of additional central computers in a single, centrally coordinated acquisition controlled by the AEC Controller's Office. The then Controller, John Abbadessa, had established a very good track record in Washington in overseeing all dealings between the AEC laboratories and the computing industry. It appeared to him that an even larger economy of scale could be accomplished if the AEC would couple the needs of all the laboratories in one large procurement.
Although every major laboratory within the AEC community, including the weapons labs and the high-energy physics labs, was included in this acquisition, Fermilab was not. To a large extent this was because of a perception both on the part of the AEC and the Fermilab Director that it would be possible for Fermilab to use the computing facilities at Argonne. Consequently, the Fermilab local computing needs were not included in this acquisition.
As a consequence of this major acquisition, Argonne acquired a major upgrade of their central facility with the delivery of an IBM 360/195, and SLAC received the triplex of IBM 370/168 computers. In that same acquisition, totaling almost $40M awarded to IBM and about $30M awarded to CDC, Brookhaven National Laboratory (BNL) and LBL acquired their CDC 7600 computers. Although this coordinated acquisition was a financial success, the logistics and complications of bringing together all the requirements of the various laboratories turned out to be a bureaucratic nightmare. Dealings were long and tedious, and although the acquisition was in some sense quite successful, it has never been repeated by DOE or its predecessor agencies.
After a transition period following their receipt of the CDC 7600, LBL no longer had the need for one of the CDC 6600's that had been their computing mainstay up until that time. Consequently, they declared one CDC 6600 surplus. By that time, it was apparent that it was absolutely necessary to have on-site computing at Fermilab since remote computing to Argonne had serious deficiencies in terms of turnaround and general management requirements. Thus Fermilab asked for the surplus CDC 6600. In December of 1973, as the Central Laboratory building was being readied for occupancy, the surplus computer from LBL was delivered to Fermilab; it was the first major computer system to be acquired by the Laboratory. The former owners deemed it appropriate to attach a plaque to the CPU as shown below [image 1].
The location of the computing area in the Central Laboratory building had been under discussion for a time. It had been estimated that about one floor, including an east wing, a west wing, and a connecting wing, was necessary to accommodate the projected needs for the computing facilities. Furthermore, there was a preference for all equipment to be on one floor, along with the user area. Due to the architecture of the building, the only floors that were so organized were the ground floor, the Atrium floor, and the 13th, 14th, and 15th floors. Clearly the Atrium was inappropriate, as was the basement with its potential for high-bay area functions. Thus, the floors desired by the Computing Department were one of the three upper floors. There were other plans for those three floors, and instead the west wings on the 7th and 8th floors of the building were allocated. Experience since then has demonstrated that these were close to an ideal choice for the Central Computing Facility. First, given the less-than-adequate elevator system of the Central Laboratory building, the central location of the facility in the vertical dimension made the computer resources, which are probably the most broadly utilized resources of the Laboratory, reasonably accessible to everyone within the building. Second, with large amounts of equipment coupled together, the lengths of cable runs between components are seriously limited for high-performance computing systems. Substantially shorter distances are involved if one connects machines vertically, i.e., between floors, rather than machines distributed over an equal area on a single floor. Finally, since the elevators in the Central Laboratory building are not large enough to accommodate most high-powered computing equipment, all such equipment must be delivered by crane through the windows. With an extension, standard cherry-picker cranes (as shown below [image 2]) barely make the distance to the 8th floor. If the facility had been located at a higher level, deliveries of new equipment, difficult today, would have been much more difficult.
Early in 1974, the CDC 6600 was opened to the user community. The entrance to the computer room is shown below [image 3] and early disks are shown on the following page [images 4 and 5]. During the next four years, that machine became the mainstay for computing for the Fermilab physics program. As the demand for computing grew, a second CDC 6600 was acquired in 1975, as was a CDC 6400 in 1976, both from the second-hand commercial marketplace. In addition, during that period more peripherals, disks, and memory were acquired to keep up with the growing needs of the Laboratory. As was the case with almost all major computing systems of that period, the operating system was batch-oriented. Certain important enhancements were introduced at Fermilab to couple the three machines. In this way, automatic load leveling was possible and the computing facility had a very high level of availability. The system worked very effectively during that period, even though the equipment was not state-of-the-art at the time.
Projections of the Laboratory needs had indicated that by 1978 there would be a need for substantially more computing capability at the Laboratory. Plans had already been laid as early as 1975 for an acquisition projected for 1978. After some negotiations with the Energy Research and Development Administration (ERDA), the then Fermilab parent government agency, it was agreed that an acquisition would occur in FY78, but with a delivery delayed until FY79. Thus, in December of 1978 after a competitive acquisition in which Amdahl, CDC, and IBM were the finalists, a CDC system was chosen based upon the then current top-of-the-line processor, the CDC Cyber 175. The system was delivered in phases over a period of two years, growing to a configuration which included three CPU's, 15 gigabytes of disk storage, 28 tape drives and two high-speed and two medium-speed printers. The operating system chosen was the newer interactively-based operating system that CDC had been developing for some years, the NOS operating system, which made a reasonable compromise between an interactive and batch mode of operation.
One less-than-successful project is worthy of mention. Starting in 1973, a project had been initiated to develop an onsite wide-area network to connect the data-acquisition computers directly with the Central Computing Facility. This was the BISON-NET project. The idea was to make it possible for an experimenter to analyze in real time a small fraction of the data being taken. The BISON-NET modules that were developed were CAMAC-based at both the experimenters' and the Central Computer Center ends. At the Central Computer, there was a special interface built to connect CAMAC to the Cybers for information transfer. The modules were based on then state-of-the-art memories, which made it possible to send 1024 words of 24 bits in each transfer. The transfer rate was one megabit per second sent over broadband coax cables, which had been laid between the experimental areas and the Central Laboratory building. The hardware was completed and was operational in 1975. But, because of the lack of manpower, user-convenient software was never adequately developed. Thus, although there were a few users of the facility, the idea never quite caught on. More recently, newer technologies, both in the computing and transmission arenas, have obviated the need for that locally developed system.
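The figures quoted above for BISON-NET imply a simple estimate of how long one transfer occupied the link. The following is a minimal sketch of that arithmetic in Python, using only the block size (1024 words of 24 bits) and the line rate (one megabit per second) given in the text. It assumes the full line rate was available for data and ignores any protocol or CAMAC overhead, which the text does not describe; the variable names are illustrative only.

```python
# Back-of-the-envelope timing for one BISON-NET transfer.
# Figures from the text: 1024 words of 24 bits, sent at 1 megabit per second.
# Assumption (not from the text): no framing or protocol overhead.

WORDS_PER_TRANSFER = 1024
BITS_PER_WORD = 24
LINK_RATE_BITS_PER_SEC = 1_000_000

# Total payload of one transfer, in bits and in 8-bit bytes.
BLOCK_BITS = WORDS_PER_TRANSFER * BITS_PER_WORD
BLOCK_BYTES = BLOCK_BITS // 8

# Time on the wire for one block at the full line rate.
BLOCK_SECONDS = BLOCK_BITS / LINK_RATE_BITS_PER_SEC

# Upper bound on blocks per second if transfers were sent back to back.
MAX_BLOCKS_PER_SEC = LINK_RATE_BITS_PER_SEC / BLOCK_BITS

print(f"bits per transfer:  {BLOCK_BITS}")
print(f"bytes per transfer: {BLOCK_BYTES}")
print(f"time per transfer:  {BLOCK_SECONDS * 1000:.3f} ms")
print(f"max transfers/sec:  {MAX_BLOCKS_PER_SEC:.1f}")
```

Under these assumptions one transfer carries 24,576 bits and takes roughly 25 milliseconds, which is consistent with the stated goal of analyzing only a small fraction of the data in real time.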
The Cyber system, which started working in early 1979, has served the community's needs quite well from that time to the present. Additional increments have been made on a continuous basis to satisfy the growing needs of the Laboratory and to remove bottlenecks as they have developed. However, it was also apparent that, in addition to modest increments in the form of additional memory, disks, tape drives, printers, and communication gear, substantially more CPU power would be required by the end of 1983. Thus, an additional increment was proposed for FY83 to at least double the computing power installed at the time. A total of $5.5M capital equivalent was authorized by the Department of Energy for that purpose. As a result of an RFP, once again CDC won out over Amdahl with the proposal for a dual-processor Cyber 875 with three-quarters of a million words of memory. Other participants in this acquisition competition were IBM and Cray Research.
The first of the two Cyber 875 processors was delivered during the last days of 1983 and was opened to the user community early in 1984. Each of the 875 CPU's computes at a rate about twice that of each of the Cyber 175 processors. The 875 processor is integrated into the overall system that had been developed over the years so that, to the user, it looks just like another processor. The second 875 processor is currently planned for delivery during the summer of 1985. The expectation is that with that processor, the Cyber complex will be complete and mature and will handle much of the Fermilab fixed-target program and engineering needs until close to the end of the decade. The disk system, currently at 25 gigabytes capacity, will also be increased to about 30 gigabytes. No additional major upgrades are planned for that system.
As part of the FY78 acquisition, an Automatic Tape Library (ATL) was also acquired. This system, which contains over 4000 6250 bpi tapes for a storage capacity of approximately 500 gigabytes, serves as an automatic archiving system backing up the online disk system. Thus, as files age and must be archived based upon size of file and last-date accessed algorithms, files are moved from the directly accessible disk system to the archival tapes. An index is kept for future retrieval purposes. Thus, once archived, if a file is recalled, the appropriate tape is automatically called up and the file found on tape is read back onto a disk for access by the program. The robot within the ATL, shown on the next page [image 6], is capable of retrieving any one of the over 4000 tapes in less than 5 seconds. The robot mounts the tapes automatically on a tape drive under direct control of the Cyber CPU and the file is found and transferred. Assuming no instantaneous queuing problems, the mean time for retrieval is less than 2 minutes and the worst-case retrieval time is less than 5 minutes.
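The archive-and-recall cycle described above can be illustrated with a short Python sketch. The text says only that files are selected by "size of file and last-date accessed algorithms" and that an index is kept for retrieval; the actual rules used on the Cyber system are not given. The thresholds, the `DiskFile` record, and the function names below are therefore hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass

# Average capacity per tape implied by the text: roughly 500 gigabytes
# spread over about 4000 tapes.
AVG_BYTES_PER_TAPE = 500e9 / 4000


@dataclass
class DiskFile:
    """Hypothetical record for a file on the online disk system."""
    name: str
    size_bytes: int
    days_since_access: int


def should_archive(f, idle_days=30, large_bytes=10_000_000, large_idle_days=7):
    """Hypothetical policy: large files are archived after a shorter idle time."""
    if f.size_bytes >= large_bytes:
        return f.days_since_access >= large_idle_days
    return f.days_since_access >= idle_days


def archive(files, index, tape_id):
    """Record qualifying files in the index; return the files left on disk."""
    remaining = []
    for f in files:
        if should_archive(f):
            index[f.name] = tape_id  # the index maps file name to tape
        else:
            remaining.append(f)
    return remaining


def recall(name, index):
    """Return the tape to mount for an archived file, or None if not archived."""
    return index.get(name)


disk = [
    DiskFile("run101.dat", 50_000_000, 10),  # large, idle 10 days
    DiskFile("notes.txt", 2_000, 10),        # small, idle 10 days
    DiskFile("old.lst", 5_000, 90),          # small, idle 90 days
]
index = {}
disk = archive(disk, index, tape_id=17)

print("still on disk:", [f.name for f in disk])
print("archived:", sorted(index))
print("tape for old.lst:", recall("old.lst", index))
```

The first constant also makes explicit what the quoted totals imply: about 125 megabytes per tape on average.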
Another important improvement in the facility associated with the FY78 acquisition was the acquisition of a data PBX (switch) manufactured by Micom. This switch, which has grown through the years, is capable of interconnecting any computer or terminal. Speeds up to 9600 baud are supported. In addition to about 750 terminal lines hardwired internally, there are almost 100 lines hardwired to multiplexors over dedicated lines to remote universities and 35 dial-up lines. There are also mechanisms installed now for dial-out procedures as well as callback procedures, all connected via the Micom. Recently, remote Micom switches have been installed and connected with the master switch using T1 (1.544 Mbit/s) telecommunications technology. Currently there are remote switches located in the accelerator Cross Gallery and in the Central Industrial Building. Most recently, a dedicated line operating at 9600 baud with 12 channels has been installed connecting to a similar Micom switch at SLAC. Thus, computers at SLAC and LBL are now available via the switch to any user at Fermilab; conversely, users at those laboratories can access the Fermilab computers.
With the introduction of an interactive system, it became possible to introduce a uniform word-processing facility across the Laboratory. Thus, shortly after the establishment of the stable system based upon the FY78 acquisition, a slightly modified version of what was then the most user-friendly editor available under the NOS operating system was introduced as a Laboratory-wide word-processing system. Although not as friendly to secretarial users as systems one can find in the marketplace, it has proven to be a very comfortable system for many of the secretaries and has the enormous advantage, almost unique among the laboratories and many businesses, of being a uniform system across the Laboratory. The major advantage of a single system across the Laboratory is that secretaries can move from one position to another without having to learn new technology. Furthermore, during those crisis periods when large reports must be constructed for DOE or for some other review process, it is possible to bring together a large number of secretaries from various organizational entities at the Laboratory and make a coherent document with very little intersystem complication. The primary output devices for the central word-processing system have been daisy-wheel-based Diablo (a Xerox subsidiary) typewriters/printers. More recently, a new terminal, the Televideo 970, has been introduced that has the capability of handling both Roman and Greek fonts along with mathematical symbols. Also introduced recently is a new ink-jet technology printer from Exxon capable of printing in one pass a number of different fonts, including Greek and mathematical symbols.
As part of the FY83 acquisition, a local area network, a Network Systems Hyperchannel capable of transmissions at 50 Mbits/sec was also acquired. The Hyperchannel connects the Cyber complex to the business-oriented IBM 4341 and the VAX cluster which serves as an alternate front end to the Cybers. Although the Hyperchannel, schematically shown below [image 7], is currently installed and working and capable of transfers between the various major computing elements, it is not yet as fully integrated into the system as is desirable. Continuing efforts are underway to improve this situation.
Other networks that have been installed at the Laboratory include DECNET connections between almost all VAX computers at the Laboratory and some PDP-11's. They also include broadband communications channels, some of which operate at speeds up to 1.5 megabits/sec, connecting the experimental areas, Wilson Hall, and the Cross Gallery. Finally, in the Central Laboratory Building and Cross Gallery area and separately in the B0 area, there is already installed a backbone ETHERNET system which is also beginning to find usage. With the availability of all of these network backbones, it becomes more and more possible to satisfy new and growing needs with a relatively short turnaround time. The DECNET distribution is shown in the following diagram [image 8].
As the capability in the marketplace has grown and as the needs of the user community have similarly grown, the graphics capability of the Fermilab systems has grown. As part of the FY78 acquisition, a number of Calcomp plotters were installed which were the mainstay of the graphical output of the Computing Center for a number of years. More recently, Benson Varian high-quality electrostatic graphic printer/plotters connected directly to the Cyber mainframes have been replacing the Calcomps. A device-independent graphics system (DIGS) was developed locally and has been very effective. In the earliest days, the only choices of graphics terminals were the Tektronix 4010 and similar devices. In the last several years, a wide range of new competitive devices has become available, typically using bitmapped solid-state memory as opposed to the Tektronix 4010 electrostatic storage-screen-based technology. Thus today the Laboratory uses a simple ADM 3 with a graphics-board attachment as the lowest-level graphics device, but also includes Tektronix 4025 and Tektronix 4014 devices. Most recently, color-graphics terminals have become available at reasonably attractive costs and the Laboratory is beginning to support both the Envision 230 and the Seiko GR1104 color terminals. Seiko color printers are also now available. For the VAX-based machines, there are also VT240's and VT100's with graphics boards which are finding popular use.
In 1983, the engineering community thought it was time to acquire a CAD/CAM system. After extensive study by a CAD/CAM committee, a Cyber-based system, CD-2000, was chosen. This system which is now utilized by a number of different engineering and drafting groups, has been very successful for mechanical design, mechanical drafting, and to some extent circuit board and circuit diagram construction and maintenance. The system has proven to be quite responsive and quite cost effective. Currently there are over 15 stations connected.
In 1973, after Argonne National Laboratory had received their IBM 360/195 computer, they put into surplus an IBM 360/50 computer. This computer was also acquired as surplus by Fermilab and served for a long time as the remote terminal to the Argonne facility, as well as the local computer to service some of the on-site business computing needs such as the payroll and the PREP inventory system. The connection between Fermilab and Argonne was upgraded from a dedicated telephone line to a line-of-sight microwave link at that time as well. In 1980, the Model 360/50 was replaced by an IBM 4331. Since then, the IBM 4331 has been replaced by an IBM 4341 which has been upgraded a number of times with additional memory, disks, and auxiliary memory. At the present time, the memory size is 8 Mbytes, and there is 3.7 gigabytes of disk available.
As the Laboratory moves into the Tevatron period and plans for the major experiments of that period, projections of computer needs, especially those of the collider experiments, CDF and D0, make it evident that architectural limitations of the currently installed systems make it virtually impossible to use these systems to analyze the next generation of experiments. Furthermore, the computing requirements projected for these experiments, based upon both paper extrapolations and experience gained by UA1 and UA2, indicate that the computing cycles required are going to be very large as we move into this period. A CPU requirement curve is shown at the top of the next page [image 9], starting from the very beginnings of the first CDC 6600 installation and extrapolating to the end of the 1980s. It is evident from this curve that substantial additional computing capacity will be required at Fermilab to accommodate these needs. In addition, the computing capability required for these new experiments moves the Laboratory into a new domain requiring many more productivity-enhancing tools. Furthermore, because both the programs and the data sets are so much larger, the architecture of the older equipment is wholly inappropriate.
The Laboratory has requested and DOE has placed in projected FY86 and FY87 budgets a total of $15M to establish a basically new Central Computing Facility. The available space in the Central Laboratory building has become so tight that there is no place within that building to place this new equipment. Therefore, as part of this acquisition there is, in addition to the $15M, an $8.3M request with which to build a new Central Computing Facility building to house the central computing equipment, together with the data-acquisition computer equipment and the PREP support activities. Funding is planned, but not yet authorized and includes the construction of the building primarily during FY86 with completion early in FY87. The computer system itself will be partially acquired in FY86, with the bulk of the acquisition occurring in FY87. A conceptual design of that system as developed by the Next Computer Acquisition Committee is shown below [image 10].
Since the earliest days in which the Laboratory has had an organizational unit which has come to be known as the Computing Department, there has been some type of computer advisory committee. Over the years its character has changed a number of times, but in each case it has performed the function of advising the Director as to directions in which the Laboratory should move in the computing arena. These committees have been very supportive and instrumental in bringing about each of the major computer acquisitions.
After about 15 years of growth, computing at Fermilab stands on the threshold of a new major step. Systems that have been developed over the years have performed well for the Laboratory and have done so in a very cost-effective manner. Current changes in the micro-electronics industry and the whole question of centralized versus distributed computing are very complex issues that are having major effects on the direction in which computing at Fermilab grows. There are many difficult choices to be made as we move through the second half of this decade and into the next. The Laboratory has performed well in the past, and we can look forward to an exciting future bringing the appropriate computing facilities to the user in a cost-effective and user-effective manner.