EPICS: A Control System Software Co-Development Success Story*

M. Knott, Argonne National Laboratory, D. Gurd, Superconducting Super Collider Laboratory, S. Lewis, Lawrence Berkeley Laboratory, and M. Thuot, Los Alamos National Laboratory

Introduction

The Experimental Physics and Industrial Control System (EPICS) is the result of a software sharing and co-development effort of major importance now underway. The initial two participants, LANL and ANL, have now been joined by three other labs, and an earlier version of the software has been transferred to three commercial firms and is currently undergoing separate development. The reasons for EPICS's success are worth enumerating and explaining, and the desire and prospects for its continued development are certainly worth examining.

Control System Transplants

If a successful control system co-development is remarkable enough to report, it is worth commenting on control system transplants in general, and specifically those of accelerators. EPICS is not the first control system to be used at multiple accelerator sites and will not be the last. But, although earlier successful transplants have taken place, there have rarely been co-development efforts among multiple sites. The reasons are numerous, some cultural and some technical.

Technology: One important factor inhibiting the copying of a control system is the rapid technological evolution of the components and techniques used. We are now firmly wedded to the progress of the computer industry since we can rarely, if ever, get ahead of it. So when a few years pass after the successful commissioning of an accelerator and its control system, the components of that system are termed "obsolete." By obsolete, we of course mean that the use of the latest computer system, components, techniques, and software will yield a better and more cost-effective control system. With the sensible move toward standards by both the computer industry and the accelerator community, the use of standards instead of proprietary products or site-specific methods will constitute an obvious advantage.

Specific Accelerator Requirements: The differing technical requirements of a new accelerator are often reasons for resisting the transplant of another accelerator's control system. However, the current trend toward employing similar control system methods and tools for all of the several accelerators of a multi-stage process weakens this argument considerably.

Knowledge of Software Internals: One important reason to develop one's own control system is the need to have an intimate knowledge of the control system developed, since normally the developers are given the task of supporting, expanding, and improving the system throughout its lifetime. This reason, while not ruling out the use of another lab's system since one can generally obtain source code and devote the time needed to understanding all aspects thereof, usually rules out commercial packages.

Unique Requirements: Another reason that commercial solutions are avoided in favor of transplanted or in-house designs is the truly unique requirements of accelerators. There cannot be a large market for control systems with timing precision down to nanoseconds. In addition there is the fact that few or no hard specifications are available at design start and that the system must be able to evolve to satisfy a host of future, unspecifiable requirements. A notable exception in the commercial market is V-System by Vista Systems, which had its origins in the Beam Telescope Control System mentioned below.

Not Invented Here: One non-obvious but possibly significant reason for developing a control system "in house" is the need to avoid anything "not invented here." We expect that sometimes one or more of the above reasons are given in what is actually support of this psychological motive. On occasion, the opposite occurs, and a staff member insists on an inappropriate transplant due to familiarity or personal co-development history.

The Historical Perspective of EPICS

Often not recorded is the genealogy of a multi-year, multi-branch, multi-decision development. We think it is important to describe that development genealogy in this case both to credit the developers and to help the community to benefit from the positive aspects thereof.

Beam Telescope Control System: Several stages of control system development at LANL preceded EPICS [1]. The operator interface, using windows on a workstation, was that developed for the Beam Telescope control system designed and built by LANL for installation at ANL's IPNS [2]. This system employed a VaxStation running a DEC-Windows GUI and used CAMAC as its distributed data collection and network implementation. This work whetted the appetite of the ANL team along with many others when it was demonstrated at ICALEPCS-'87 in Villars, Switzerland.

Toolkit Workshop at LANL: The next significant event was the Accelerator Automation Application Toolkit Workshop held at LANL in 1988 [3]. The initial members of the ANL team were present and participated with accelerator controls people from around the world in what was an electric atmosphere of idea exchange, horror-story telling, suggestions, and debate as the attributes of a new control system were recorded to set the LANL team's development directions. There were, in fact, suggestions and ideas put forth to continue the dialog but these never developed and the community went its separate ways.

GTACS: The outcome of the toolkit workshop was the development of the Ground Test Accelerator Control System, GTACS [4]. This system embodied many of the ideas put forth at the workshop, but more important, it embodied the key core software and tool-based design concepts that would enable it to be used in an exploding array of applications. The work at LANL by Jeff Hill, Bob Dalesio, and others proved to be a sound foundation upon which all of the work to follow stood. The Channel Access core software was a set of rules for the exchange of data between clients and servers and is, in fact, a data exchange "standard," sometimes called a software bus [5]. The tool-based approach to software modularity was not new, but the strict application of the approach in a control system enabled a tremendous flexibility in appearance and application. GTACS was distributed to several other sites and successfully adapted to an array of systems. However, the burden of supporting a control system in use at distant locations and institutions made the LANL team cool to the idea of a co-developer.

ANL Joins LANL: The LANL and ANL teams finally met around a table at ICALEPCS-'89 in Vancouver, Canada. The ANL team had a major new accelerator (the Advanced Photon Source, APS) control system to build and suggested a co-development effort with the LANL-GTA team. Due to previous support problems, it was soon clear that the LANL team wanted to see either the color of ANL's money or real proof of a commitment to contribute. The ANL team proposed both and sent Marty Kraimer to work with and for the LANL team for an extended period. This decision proved pivotal and has become the pattern suggested to any prospective collaborator. Marty was given a complete system to implement, which required that he learn all aspects of GTACS. He returned to Argonne and convinced the rest of the ANL team that GTACS was a system with well-thought-out basic properties and good performance. At the time, the ANL team felt that even if the disparate requirements of the GTA (a proton linear accelerator) and the APS (a positron storage ring synchrotron radiation facility) were to force a deviation and separation of the two teams' development directions, they would at least have gained a tremendous head start in their own mission. In fact, their leader at the time, Marty Knott, was convinced of just such a near-term separation, but the basic properties of GTACS were to prove him wrong.

GTACS Evolves Into EPICS: As the ANL team staffed up and became familiar with GTACS, they began (and were encouraged by the LANL team) to suggest improvements to increase the utility, convenience, and cross-platform transparency of GTACS. Each change was discussed and agreed upon by both teams, but the urgencies of the GTA program soon made serious improvements difficult. It was decided to continue the development of an improved version with the hope of retrofitting the GTA applications in the future. This downward compatibility goal imposed a discipline on both teams and kept the new version from getting out of control with "improvements." Extensible record and device support was implemented and many other minor improvements were made. Finally it was suggested that a more generic name be given to the new version in keeping with the multi-lab co-development effort and to reduce confusion. EPICS, for Experimental Physics and Industrial Control System, was the new name.

ICALEPCS-'91: At Tsukuba, Japan, Bill McDowell, now the ANL team leader, presented a paper referring to the collaboration and its fruits [6]. He was met with a certain level of outright disbelief that such a co-development could take place, let alone succeed. A great deal of interest was generated and EPICS, along with other control systems employing workstations, VME crates, and a LAN, was dubbed a "standard model" by Berend Kuiper [7]. From this point, queries about EPICS have grown exponentially and spread to an international scope.

LBL, SSCL, and CEBAF Join the Collaboration: In 1992, Steve Lewis's controls group at LBL asked to join the co-development effort. This department of LBL has several control system development activities and supplies services on a Lab-wide basis. They have since applied EPICS to some aspects of the ALS and to control of the Gammasphere (at LBL) and STAR (at BNL's RHIC) experiment systems. Dave Gurd's controls team at SSCL also expressed interest in joining the collaboration in 1992 and has since done so, spurring interest in using EPICS for very large control jobs [8]. The CEBAF controls team is the most recent addition to the collaboration.

Commercialization: After a long process, ANL, LANL, and the U.S. Department of Energy (DOE) signed a "non-exclusive license" agreement with three commercial firms for the use of EPICS Release 3.8. The three companies are Kinetic Systems of Illinois, Tate Integrated Systems of Baltimore, and Titan Corp. of Albuquerque. Kinetic Systems is now shipping systems based on EPICS under the name Intuit. Tate is shipping and marketing EPICS under the name TIS-4000 after adding many improvements related both to their process control market and to some of EPICS's current limitations.

Why Did the Initial Collaboration Work?

The question of why and how the initial LANL-ANL collaboration worked has been asked by many.

Serendipity: As often happens, the technical factors were mostly a fortunate confluence of events: a large new construction project (APS) occurring at a time when GTACS was developed enough to prove its viability; a good match of the plans already made by the ANL team and the key features of GTACS; and a time when "workstation wars" made the use of generous quantities of Unix-based computing power a practical reality.

Ego-Suppression: Perhaps the key factor was the willingness of the ANL team to adopt a design invented by others and later, as these newcomers proposed changes, the willingness of the LANL team to accept the suggested changes. In fact, this "ego-suppression" is still seen as a key factor in the continued progress of the collaboration. The willingness of the ANL team to become fully cognizant of GTACS at the beginning is also seen as key to working as a co-development team.

Tool-Based Approach: Another factor in this initial success, and a continuing factor, is the fact that GTACS and its successor EPICS use tool-based approaches [5, 9]. By tool-based we mean the use of software "tools" that are designed with carefully chosen boundaries between layers and modules, use good protocols, and allow independent development, similar to what any good bus protocol provides. Thus, a new tool can be developed to provide a new functionality or to replace an existing one, and the presence of one version or the other, or indeed both, or the complete absence of both, will be invisible to the basic functionality and, hopefully, to all other tools. This approach allows two or more EPICS implementations to use completely different sets of tools, custom-designed to meet the particular requirements of the accelerator or application.
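The decoupling described in the paragraph above can be illustrated with a minimal sketch. This is not EPICS or Channel Access code: the `SoftwareBus` class, its `subscribe` and `publish` methods, the two example tools, and the channel name `demo:current` are all hypothetical names chosen for illustration. The sketch only shows that tools attached to a shared bus need not know whether any other tool is present.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Callback = Callable[[str, float], None]


class SoftwareBus:
    """Hypothetical stand-in for a software bus; not the Channel Access API."""

    def __init__(self) -> None:
        # Map each channel name to the callbacks of the tools watching it.
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callback) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, value: float) -> int:
        """Deliver a value to every subscribed tool; return how many were notified."""
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(channel, value)
        return len(callbacks)


# Two independent "tools": neither knows whether the other is present.
archive: List[float] = []
display: List[str] = []


def archiver_tool(channel: str, value: float) -> None:
    archive.append(value)


def display_tool(channel: str, value: float) -> None:
    display.append(f"{channel} = {value:.1f}")


bus = SoftwareBus()
notified_none = bus.publish("demo:current", 1.0)  # no tools attached yet
bus.subscribe("demo:current", archiver_tool)
notified_one = bus.publish("demo:current", 2.0)   # archiver only
bus.subscribe("demo:current", display_tool)
notified_two = bus.publish("demo:current", 3.0)   # both tools
```

The publisher's code is identical in all three cases; adding or removing a tool changes only who is listening, which is the property the tool-based approach relies on.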

The EPICS Alarm Handler, one of ANL's first contributions to the collaboration, is an example of the value of this approach. A real problem in any control system is the handling, prioritizing, and presentation of alarms. "Alarm storms" can result from simple but fundamental events. The more one wants to anticipate and monitor for problems, the worse the situation becomes. ANL decided early that it needed a solution to this problem and proposed to LANL that ANL develop this tool. The development did not require total knowledge of GTACS, just the channel access portion. As the ANL team developed the Alarm Handler [10], it was given a chance that not many developers of control systems for new accelerators get: to try it out on an operating accelerator, namely the GTA at LANL. Thus the tool was critiqued, revised, and fine-tuned long before it was to become part of the ANL accelerator. LANL incorporated it into the GTA control environment (as have other collaborators). This episode, with its obvious risk-reduction benefit, proved to both LANL and ANL the basic soundness of the collaboration.
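To make the "alarm storm" problem concrete, here is a minimal sketch of one common mitigation: summarizing per-channel alarms by group and presenting only the worst severity in each group. It is an illustration under stated assumptions, not the Alarm Handler's actual design as described in [10]; the `Severity` levels, the `summarize` function, and the group and channel names are hypothetical.

```python
from enum import IntEnum
from typing import Dict, List, Tuple


class Severity(IntEnum):
    """Illustrative severity levels; the ordering is an assumption of this sketch."""
    NONE = 0
    MINOR = 1
    MAJOR = 2


def summarize(alarms: List[Tuple[str, str, Severity]]) -> List[Tuple[str, Severity, int]]:
    """Collapse per-channel alarms into one line per group.

    Each input item is (group, channel, severity). The result lists, for every
    group with at least one active alarm, the worst severity and the number of
    channels in alarm, sorted so the most severe group comes first.
    """
    worst: Dict[str, Severity] = {}
    count: Dict[str, int] = {}
    for group, _channel, severity in alarms:
        if severity == Severity.NONE:
            continue  # channel is healthy; nothing to present
        worst[group] = max(worst.get(group, Severity.NONE), severity)
        count[group] = count.get(group, 0) + 1
    return sorted(
        ((g, worst[g], count[g]) for g in worst),
        key=lambda item: (-item[1], item[0]),
    )


# One upstream fault (say, loss of cooling water) trips many magnet channels.
storm = [("magnets", f"magnet{i}:temp", Severity.MAJOR) for i in range(50)]
storm.append(("vacuum", "pump3:pressure", Severity.MINOR))
storm.append(("rf", "cavity1:power", Severity.NONE))

summary = summarize(storm)
```

Here 52 channel reports reduce to two summary lines for the operator, with the 50 magnet alarms presented as a single major alarm on the `magnets` group.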

Standards: The use of standards wherever possible makes collaborating more acceptable to a prospective partner. Both LANL's and ANL's desires were identical in this respect [6]. The use of standards in the LAN portion of EPICS has allowed ANL to plan for a migration path through FDDI all the way to ATM to provide high bandwidth and fault tolerance. The use of network hubs will protect the initial investment in Ethernet interface modules [11]. The initial concerns of many over the use of Ethernet as a backbone network have thus been laid to rest. Beyond network standards, EPICS is based on a Unix development and operator interface environment, the "C" programming language, the TCP/IP communication protocol, the VME-VXI bus/crate system, and some popular field-bus protocols.

Indirect Benefits: Two examples of development transfers indirectly related to EPICS are the use by ANL of VXI-based rf signal processing electronics and video image processing applications software developed by LANL for the GTA. While not part of the EPICS collaboration, these tools were easily transferable due to their compatibility with GTACS/EPICS.

Co-Development Mechanisms

A variety of communication and coordination methods are used to coordinate the activities of the members of the collaboration. Although personal contact and give-and-take discussions probably exchange the largest amount of data and personal preference information, electronic methods are a close second.

Team Visits: Sometimes the time-honored methods of face-to-face confrontation and discussion are necessary to "hammer out" an agreeable solution to a difficult problem. Such meetings are held about every month or two at a rotating site. Bob Dalesio of LANL is burdened with the tasks of agenda preparation, discussion mediation, and minute taking. This interaction of the collaborating teams at the requirements and design stage before committing to code is extremely beneficial. The thinking and experience of five or more labs really makes a difference.

Electronic Mail: A great deal of information is exchanged with e-mail, the benefits being that it is unambiguous, documented, and point-to-point(s). The geographical separation of the sites is quite invisible with this and other Internet-based methods. A related method, a bulletin board-like set of tools called NOTES, developed at the University of Illinois for such collaborations, serves several purposes. Its forums (HW bug reports, SW bug reports, EPICS suggestions, application program suggestions, and tech-talk, a question-and-response forum) are all used by the EPICS development and application staffs.

Electronic Conferencing: Only two sites, ANL and SSCL, have teleconferencing equipment at this time, but workstation-based "picture-in-picture" teleconferencing has been tested successfully between LBL and ANL. This latter system uses audio and slow-scan video over the Internet and is quite acceptable for this purpose. Common-view slide presentations and multiple pointers and markups are also provided.

Agenda and Priorities: With the four major co-developers having different projects to service, they naturally have different development agendas and priorities. When these priorities intersect or when EPICS developments are produced ahead of the timetables of other co-developers, there is an obvious benefit and a smooth cooperative effort is the result. However, the partnership is unstructured and the co-development partners have no mandatory obligation to perform their "assignments" on time. So, if a tool is needed ahead of the timetables of the others, that tool must be looked at as a "single-site" responsibility.

Release Mechanisms: As with any multi-person software development effort, let alone a wide-area effort, careful attention must be paid to release and configuration control. A suite of code management tools (currently SCCS) is employed for all shared software. This includes the core crate software, most commonly used device drivers, channel access, the display editor and manager, and several major tools such as the alarm manager. Only application programs, control screens, and device drivers developed and used locally are maintained locally. Software modules are checked out over the Internet, revised, debugged, and then checked back in for inclusion in a new release. As an indication of the transparency of this WAN-based system, the files are maintained at LANL, the system administrator, Mike Bordua, is at LBL, and currently most of the software is being developed at LANL and ANL. At this writing, EPICS is at Release 3.11 (Beta-E).

How Well Does it Work?

Are all co-development decisions unanimous and easy? The answer, of course, is "no." Many differences of opinion occur, some accidentally and others seen from far off. An example of an accidental difference was the development of a Graphical Database Configuration Tool (GDCT).

Graphical DCT Development: Maintaining the configuration of databases containing linked record processing has always been a tedious task, and it was apparent to all that a graphical tool would make the task easier. Largely without coordination with the others, ANL, LANL, and SSCL all began developing different approaches to a graphical DCT. All three turned up on the same collaboration agenda in January, 1993. As it turned out, not all solutions had all the desired features, and after some discussion, the graphical editing tool used by SSCL, ObjectViews from Quest, was joined to the methodology of the ANL team and the resulting graphical DCT developed to the point where ANL could use it. This Graphical DCT was then sent back to SSCL and is being enhanced further. A graphical version is also being developed by the SSCL team for the sequencer tool, a tool for specifying state machine attributes and one which will also benefit from a GUI. LANL's implementation of the graphical DCT employs a translator to a CAD package and this version is in use there. Both implementations are fully compatible with EPICS databases and a user has the choice of either tool.

The X Layer Decision: An early disagreement between ANL and LANL was which X-protocol layer should be used for the EPICS GUI: Xlib, a basic-level layer with more primitive but faster operating calls, or the Motif Toolkit layer. ANL's Mark Anderson pressed for the Toolkit approach and claimed that he could produce good performance. ANL's team backed him, but counted on expected workstation performance gains to ensure success. The two teams agreed on a set of goals against which to measure performance, and Mark eventually succeeded in meeting them: bringing up a new application screen in under 2 seconds and updating 1,000 screen elements per second.

Current Status

Is EPICS finished? Is it the best it can be? The answers are "no" and "definitely not!" What about the collaboration and commercialization? These and many other questions are asked every day by all of those participating in the collaboration and by those wondering if they should join. We will try to address the hottest of these topics.

Commercialization: As stated above, EPICS has been licensed for use by three private-sector companies. Two of these, Tate and Kinetic Systems, are actually shipping fully developed systems based on the core features of EPICS and including much of its functionality. Tate has added several features important to the process-control industry such as redundant field crate controllers and a low-cost, WAN-based implementation of the field crate. These will be studied and considered for incorporation into the accelerator community version of EPICS. A continuing cooperative relationship with Tate or another commercial firm is being considered by the current set of EPICS co-developers, due in part to a need to ensure continuity in the future regardless of the continued involvement of the present collaborators.

Recently Added Features: Much of the functionality and features are described elsewhere [1, 11]. A major new function now under final refinement is that of the Motif Editor and Display Manager (MEDM) [12]. This development had to meet several challenging requirements: the Motif-style look-and-feel and Toolkit layer, good performance, compatibility with the specification files of the original editor and display manager, and the ability to rapidly switch between edit and execution contexts. MEDM is now in common use at some sites and may soon become the default toolset.

One of EPICS's strengths is its acceptance of interfaces to third-party tools. Mathematica, PV-Wave, Wingz, and interactive C from the commercial sector and CERN's NODAL and BNL's Devtest from the accelerator community have been successfully interfaced to EPICS's channel access.

Limitations of the Current EPICS: As stated above, EPICS is far from being the end-all of control systems. It has attracted the attention it has because it represents a good platform upon which to build a variety of control systems tailored to particular needs, and the fact that the tool approach lends itself to collaborative development. Quite understandably, EPICS's shortcomings usually stem from how it relates to the needs of a new application. For example, being a LAN-based distributed system, it has no ability to perform the precision timing tasks required in all accelerators, an exception being its ability to time-stamp actions system-wide to the millisecond level using software and to the microsecond using available hardware tools [13]. The solution to such a high-performance requirement has been to build the timing system external to EPICS and have EPICS manage it, distributing pulse delay and width data to specialized transmitters.

As EPICS is applied to very large systems or when used with point-to-point communication links, the need for a name server becomes apparent since EPICS uses LAN broadcast messages to initially establish its links. The applications planned at SSCL are examples of these requirements. Actually, Tate has seen the need to incorporate a name server for these same reasons, and so the incorporation into EPICS has an existence proof.
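The scaling argument in the paragraph above can be sketched as follows. This is a toy model, not the Channel Access protocol: the `Server` class, `resolve_by_broadcast`, `build_directory`, and the crate and channel names are hypothetical. It assumes that a broadcast search is seen by every server on the LAN, whereas a name server answers from a single table.

```python
from typing import Dict, List, Optional


class Server:
    """Hypothetical server (field crate) that hosts a set of named channels."""

    def __init__(self, address: str, channels: List[str]) -> None:
        self.address = address
        self.channels = set(channels)
        self.search_requests_seen = 0

    def answer_search(self, name: str) -> Optional[str]:
        # Every broadcast search costs this server some work, hit or miss.
        self.search_requests_seen += 1
        return self.address if name in self.channels else None


def resolve_by_broadcast(name: str, servers: List[Server]) -> Optional[str]:
    """Ask every server on the LAN; the one that hosts the channel replies."""
    found = None
    for server in servers:
        reply = server.answer_search(name)
        if reply is not None:
            found = reply
    return found


def build_directory(servers: List[Server]) -> Dict[str, str]:
    """Build a name-server table mapping channel name to server address."""
    return {name: s.address for s in servers for name in s.channels}


servers = [Server(f"crate{i}", [f"crate{i}:ch{j}" for j in range(3)]) for i in range(100)]

address_broadcast = resolve_by_broadcast("crate42:ch1", servers)
requests_handled = sum(s.search_requests_seen for s in servers)

directory = build_directory(servers)
address_lookup = directory.get("crate42:ch1")  # one query, no crate involved
```

In this model a single broadcast search is handled by all 100 crates, while the directory lookup touches none of them. A directory also works across point-to-point links, which do not carry LAN broadcasts.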

There have been a great many suggestions for corrections and improvements to EPICS (one of the results of a multi-lab collaboration) and many of these are working their way up the priority ladder. Composite devices, for example, consisting of related groups of process variables upon which vector operations could be performed, are now moving from the specification to the strategy phase. Control permission tools, which limit control of subsystems by name, location, and machine mode, are defined and now await implementation. A complete suite of unified configuration tools is another area of improvement we would like to see.
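As an illustration of what a control permission check by name, location, and machine mode might look like, consider the sketch below. Since the tools are described above as defined but not yet implemented, everything here is an assumption: the `Rule` fields, the `may_write` function, the deny-by-default policy, and the user, console, and mode names are hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import FrozenSet, List


@dataclass(frozen=True)
class Rule:
    """Hypothetical permission rule; field names are assumptions of this sketch."""
    subsystem_pattern: str         # shell-style pattern over process variable names
    users: FrozenSet[str]          # who may write
    locations: FrozenSet[str]      # consoles from which writes are allowed
    machine_modes: FrozenSet[str]  # machine modes in which writes are allowed


def may_write(rules: List[Rule], pv: str, user: str, location: str, mode: str) -> bool:
    """Return True if any rule grants this write; deny by default."""
    return any(
        fnmatchcase(pv, r.subsystem_pattern)
        and user in r.users
        and location in r.locations
        and mode in r.machine_modes
        for r in rules
    )


rules = [
    Rule("rf:*", frozenset({"alice"}), frozenset({"control_room"}),
         frozenset({"maintenance", "studies"})),
    Rule("vacuum:*", frozenset({"alice", "bob"}), frozenset({"control_room", "vacuum_lab"}),
         frozenset({"maintenance"})),
]

allowed = may_write(rules, "rf:cavity1:setpoint", "alice", "control_room", "studies")
wrong_mode = may_write(rules, "rf:cavity1:setpoint", "alice", "control_room", "user_beam")
wrong_place = may_write(rules, "vacuum:pump3:enable", "bob", "office", "maintenance")
```

Only the first request is granted; the second fails on machine mode and the third on location.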

Online Documentation Tools: Two developments related to online documentation are eagerly awaited. One is the planned development of an online logbook toolset integrated with all EPICS run-time tools. The specification aims to meet the requirements of conventional logkeeping with a completely electronic implementation, and will include copy-paste abilities with other graphical tools as well as time-stamped manual and automatic entries (triggered, for example, by alarms or machine-state snapshots).

The other online documentation movement is the posting of all EPICS documentation to the World Wide Web system for viewing and distribution. Currently much of the documentation written at ANL is available for internal use, but wide availability is foreseen. ANL is using the Mosaic viewer, developed at the National Center for Supercomputing Applications at the University of Illinois [14]. The documentation can be viewed by pointing your viewer at http://epics.aps.anl.gov/welcome.html.

Current Collaboration Membership: In addition to the five labs referred to above, Duke University's Free Electron Laser controls team has joined the collaboration. Several other labs have expressed interest and, in fact, have obtained copies of EPICS to evaluate. Many of the synchrotron radiation experimentalists of both the Argonne APS [15] and the LBL Advanced Light Source are planning to use EPICS for beamline control and data collection control. This particular co-usage of EPICS is expected to produce common-use toolsets related to their experimental equipment. In a similar movement, two of the SSCL detector groups are considering using EPICS for their slow controls. In all, a combination of 24 laboratories, universities, and industrial firms are involved with EPICS.

Future Directions: There is no argument as to the value of the multi-lab co-development of EPICS. The real question we ask, now that the ship is launched, is where do we go from here and how do we keep it afloat? The changing priorities of the various labs as they build and then operate their accelerators are a real challenge to the smooth improvement of EPICS. Alternatives are being discussed which include a relationship with a vendor who would provide reintegration of the collaborators' improvements, release control, maintenance, training, and documentation of at least a core set of components and tools. The current collaborators have learned that participation in a serious co-development should not be considered free. A commitment in the form of resources to support EPICS is required for success. If any form of centralized management is to be established, commitment in the form of monetary support should be expected, whether that management is laboratory-based or commercially contracted.

Summary

EPICS has been launched as an unprecedented co-development effort by two, then several accelerator laboratories. It was made possible by a good basic set of tools, a suppression of ego, and a cooperative spirit among its participants. It is hopefully possible to continue to improve and make available to others in our community this certainly incomplete, but potentially great set of control system tools. It will require both continued commitment and good will to fulfill that possibility.

References

[1] Dalesio, L., Hill, J., Kraimer, M., Murray, D., Hunt, S., Claussen, M., Watson, C., Dalesio, J., "The Experimental Physics and Industrial Control System Architecture," submitted to ICALEPCS, Berlin, Germany, Oct. 18-22, 1993.

[2] Clout, P. and Rothrock, R., "The Los Alamos Telescope Control System," TRIUMPF Seminar, Vancouver, British Columbia, Canada, Nov., 1987.

[3] Howell, J., Bjorklund, E., Clout, P., Dalesio, L., Kozubal, A., Mottershead, C., Rothrock, R., Schaller, S., Stuewe, R., and Westervelt, R., "Accelerator Automation Application Toolkit Workshop Presentations," in Proceedings of the Los Alamos Accelerator Automation Toolkit Workshop, Los Alamos, New Mexico, 1988.

[4] Kozubal, A., Kerstiens, D., Hill, J., Dalesio, L., "Run-time Environment and Applications Tools for the Ground Test Accelerator," in Proceedings of ICALEPCS, Vancouver, British Columbia, Canada, 1989, pp. 288-291.

[5] Hill, J., "Channel Access: A Software Bus for the LAACS," in Proceedings of ICALEPCS, Vancouver, British Columbia, Canada, 1989, pp. 352-355.

[6] McDowell, W., Knott, M., Lenkszus, F., Kraimer, M., Daley, R., Arnold, N., Anderson, M., Anderson, J., Zieman, R., Cha, B., Vong, F., Nawrocki, G., Gunderson, G., Karonis, N., and Winans, J., "Standards and the Design of the Advanced Photon Source Control System," in Proceedings of ICALEPCS, KEK, Tsukuba, Japan, 1991, pp. 116-120.

[7] Kuiper, B., "Issues in Accelerator Controls," in Proceedings of ICALEPCS, KEK, Tsukuba, Japan, 1991, pp. 602-611.

[8] Gurd, D., "Control System Plans and Progress at the SSC," submitted to ICALEPCS, Berlin, Germany, Oct. 18-22, 1993.

[9] Dalesio, L., Kraimer, M., Kozubal, A., "EPICS Architecture," in Proceedings of ICALEPCS, KEK, Tsukuba, Japan, 1991, pp. 278-282.

[10] Kraimer, M., Cha, B., and Anderson, M., "Alarm Handler for the Advanced Photon Source," Proceedings of the 1991 IEEE Particle Accelerator Conference, San Francisco, California, 1991, pp. 1314-1316.

[11] McDowell, W., Knott, M., and Kraimer, M., "Status and Design of the Advanced Photon Source Control System," submitted to ICALEPCS, Berlin, Germany, Oct. 18-22, 1993.

[12] Anderson, M., "Man-Machine Interface Builders at the Advanced Photon Source," 1991 Fed/Unix & International Motif Users Meeting, Washington, DC.

[13] Stettler, M., Thuot, M., et al., "A Distributed Timing System for Synchronizing Control and Data Correlation," LINAC Conference, Ontario, Canada, August, 1992.

[14] Bina, E., and Andreessen, M., "About NCSA for X," World Wide Web Uniform Resource Locator: http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/help-about.html (WWW online document, October, 1993).

[15] Coulter, K., Popper, R., Reid, D., Stein, S., Dalesio, L., Fite, C., Stettler, M., and Warren, D., "Motion Control in the Experimental Physics and Industrial Control System," submitted to ICALEPCS, Berlin, Germany, Oct. 18-22, 1993.

*Work supported by U.S. Department of Energy Office of Basic Energy Sciences under Contract No. W-31-109-ENG-38.