Failure Analysis in Software Engineering: Therac-25
Lauren Elsey
lauren.elsey@ufl.edu
University of Florida
CONTENTS
SECTION 1: IDENTIFYING COMMON TYPES OF FAILURE
  1.1: Types of Failure
  1.2: Causes of Error
SECTION 2: TESTING TYPES AND METHODS
  2.1 Functional Testing
    2.1.1 Black Box Testing
    2.1.2 Unit Testing
  2.2 Non-Functional Testing
    2.2.1 Performance Testing
    2.2.2 Usability Testing
  2.3 Standards
    2.3.1 IEEE 1633-2016
    2.3.2 IEEE 16085-2004
SECTION 3: CASE STUDY
  3.1 Case Description
  3.2 Case Investigation
  3.3 Recommendations
Conclusion
References
INTRODUCTION
Software engineering is a rapidly growing field as technology becomes more prevalent. Because society relies so heavily on technology, the software behind that technology must be written to be safe and as free of errors as possible. Not all software errors are evident, so it is important to emphasize finding hidden errors before they endanger individuals.

SECTION 1: IDENTIFYING COMMON TYPES OF FAILURE
1.1: Types of Failure
When it comes to failure in software engineering, very few cases turn out to be dangerous or have a lasting impact. Most errors end up being minor inconveniences in the software interface that are generally fixable. However, failures of software in fields such as healthcare, transportation, or banking can have detrimental effects on many lives. Individuals who experience these software errors may lose money or, in the field of healthcare, may die. There are many types of errors found in software engineering, including unexpected output, unexpected output that seems to be right, and several others [1]. When classifying types of error, there are several generic kinds that can be broken down into more case-specific errors. Table 1 lists many of the more generic groups of errors that are commonly found in software [1].

Software System Failure Modes (SFM)
  M-I-1
    SFM-1: Halt / abnormal termination with clear message
    SFM-2: Halt / abnormal termination without clear message
  M-I-2
    SFM-3: Runs with evidently wrong results
    SFM-4: Runs with wrong results that are not evident
  M-II
    SFM-5: Problematic, confusing, or less informative interface

Software Elements Failure Modes (EFM)
  Software elements:
    EFM-1: Input
    EFM-2: Output
    EFM-3: Communication
    EFM-4: Resource allocation
    EFM-5: Processing
  Generic failure modes of software elements:
    1. Timing / order failure
    2. Interrupt-induced failure
    3. Omission of a required function or attribute
    4. Unintended function or attribute in addition to intended functions and attributes
    5. Incorrect implementation of a function or attribute
    6. Data error which cannot be identified and hence is not rejected by software logic

Table 1: Software System and Software Element Failure Modes [1]

Table 1 separates the generic groupings of error into two main kinds based on how they interact with the main elements of software. These five main elements are input, output, communication, resource allocation, and processing [1]. The first five errors, classified as Software System Failure Modes (SFM), can apply to all software elements [1]. In comparison, Software Elements Failure Modes (EFM) are element-specific errors [1]. To clarify, an element can fail in two ways: through one of the SFMs, or through an element-specific error, each of which falls into one of the six generic failure modes of software elements listed in Table 1 [1].
1.2: Causes of Error
Writing software can be an extensive and taxing process, which leaves a lot of room for error. Errors can arise at any of the many steps between idea generation and the deployment of software into daily life [1]. Figure 2 shows the software process from idea generation to program release, along with possible causes of software failure at each step [1].

Figure 2: Software Writing Process [1]
Stages and possible causes of failure at each stage:
  1. System Engineering and Modeling: lack of communication; no clear vision of project
  2. Requirements Analysis: budget misallocation; unrealistic timeline
  3. Design: insufficient research; rushed planning
  4. Code Generation: improper conversions; unhandled exceptions
  5. Testing: undetected errors; incomplete tests
  6. Launch: successful launch
Further labels in the figure: Software Failure, Device Failure, System Failure, Status of Complex System and Recovery. Causes: Human Error, System Compatibility, Cyber Attack, External Events.

SECTION 2: TESTING TYPES AND METHODS
When testing software, engineers run two different types of testing: functional testing and non-functional testing.
2.1 Functional Testing
Functional testing aims to test the functionality of the code: whether the code runs correctly and whether the desired results are achieved [2]. There are several kinds of functional testing, two of which are black box testing and unit testing [2].

2.1.1 Black Box Testing
Black box testing is a type of functional testing in which the tester has no knowledge of the application's internal code or structure [2]. The tester supplies inputs and compares the outputs against the expected behavior, without regard to how the logic is implemented [2]. Because it treats the software as a sealed box, this method checks what the program does rather than how it does it [2].

2.1.2 Unit Testing
Unit testing is a functional testing method that breaks software into segments [2]. These segments of code are then tested individually to make sure that each runs properly. Unit testing is one of the most common forms of functional testing. Because software programs are large bodies of code combined into one, unit testing helps to break down the code and isolate areas of possible error.

2.2 Non-Functional Testing
Non-functional testing looks at the usability of the code and how it works in real-life applications [2]. While functional testing tests the code directly, non-functional testing looks at how the code interacts with the application [2]. The end goal of non-functional testing is to ensure user satisfaction with the software [2].
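To make the unit testing approach from Section 2.1.2 concrete, the following is a minimal sketch in Python using the standard `unittest` module. The function `validate_dose`, the constant `MAX_DOSE_RADS`, and the numeric limit are hypothetical names and values chosen only for illustration; they are not taken from any real device software. The sketch shows one small segment of code being tested in isolation, including the cases where it should reject its input.

```python
import unittest

# Hypothetical limit, chosen only for illustration.
MAX_DOSE_RADS = 500.0


def validate_dose(requested_rads: float, max_rads: float = MAX_DOSE_RADS) -> float:
    """Return the requested dose if it is within limits, else raise ValueError."""
    if requested_rads <= 0:
        raise ValueError("dose must be positive")
    if requested_rads > max_rads:
        raise ValueError("dose exceeds the configured maximum")
    return requested_rads


class ValidateDoseTest(unittest.TestCase):
    """Each test exercises one behavior of the unit in isolation."""

    def test_accepts_dose_within_limit(self):
        self.assertEqual(validate_dose(200.0), 200.0)

    def test_rejects_dose_above_limit(self):
        with self.assertRaises(ValueError):
            validate_dose(20000.0)

    def test_rejects_non_positive_dose(self):
        with self.assertRaises(ValueError):
            validate_dose(0.0)


def run_tests() -> bool:
    """Run the test case above and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ValidateDoseTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Calling `run_tests()` executes the three tests and returns whether all of them passed. Because each test targets a single behavior, a failing test points directly at the segment of code that needs attention, which is the isolation benefit described above.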
2.2.1 Performance Testing
Performance testing is a method in which the software is checked under a given load [3]. The goal of performance testing is to measure the reliability, speed, scalability, and responsiveness of a program [3].

2.2.2 Usability Testing
Usability testing looks at how well users can utilize the software [2]. The goal of usability testing is to make sure that individuals who are not well versed in the software can understand how it works and navigate it [2]. Usability testing was originally developed to make sure that the software is comprehensible to the individuals working with it, even if they are not professionals in the field. This testing method helps identify failures in the user-friendliness of the software. If the software cannot be used without professionals in the field being present, it should be adjusted so that non-professionals can use it [2]. The method entails working with users of the software to see whether they find any bugs while using it [2].

2.3 Standards
In engineering, standards are often put in place to ensure that the quality of a product is sufficient to minimize possible risks or accidents. In healthcare especially, there are several standards that help ensure patient safety. Two such standards are IEEE 1633-2016 and IEEE 16085-2004 [4, 5].
2.3.1 IEEE 1633-2016
IEEE, the Institute of Electrical and Electronics Engineers, has set several standards to ensure the quality of electrical devices and electrical components. IEEE 1633-2016 is a recommended practice on software reliability; its goal is to set guidelines for how the foundation of software is organized so that reliability can be maintained [4].

2.3.2 IEEE 16085-2004
Along with IEEE 1633-2016, IEEE also set its 16085-2004 standard, which helps to analyze risk in software. This standard helps manage software over the long term by providing a process for risk management and offering techniques that help to minimize risk [5].

SECTION 3: CASE STUDY
3.1 Case Description
One of the most impactful cases of failure in computer engineering is the historical Therac-25 case. The Therac-25 was a medical device used to treat patients with cancer. It was involved in a series of accidents from 1985 to 1987 that led to the deaths of two individuals, with a minimum of four others seriously injured [6]. The Therac-25 worked by directing a beam of electrons at a patient to deliver radiation [6]. When treating cancer, patients are administered a radiation dose of around 200 rads, and a dose of around 500 rads results in death about half of the time [6]. For comparison, the Therac-25 could administer doses of up to 20,000 rads (radiation absorbed dose), roughly 100 times a typical treatment dose [6]. Preceding the Therac-25, the Therac-6 and Therac-20 helped to treat patients with cancer. However, the Therac-25
device relied more heavily on software than its predecessors did. The faulty code and lack of human oversight ultimately led to the deaths and serious injuries of patients due to radiation overdose.

3.2 Case Investigation
There are several reasons that the Therac-25 device failed, both software and non-software related. The main issue was with the software: the Therac-25 ran on the same software as the Therac-20, but the hardware that had compensated for software defects in the Therac-20 was removed [7]. To clarify, the programs were the same, but the Therac-20 had hardware safety measures that the Therac-25 lacked. When it came to testing the software, reports show that the device was mainly tested through integration tests, which mainly ensure that the software works well with the application [8, 3]. While integration tests are important for the overall application, not enough time appears to have been dedicated to debugging, or removing errors from, the software. Furthermore, because of the large similarities between the code for the Therac-25 and the Therac-20, the Food and Drug Administration (FDA), which is in charge of authorizing medical devices in the United States, failed to review the device [8]. The overall lack of quality testing of the Therac-25 software led to unseen errors going uncorrected and eventually harming patients.

Building upon the faulty code, reports demonstrate that there was insufficient communication between the manufacturer of the Therac-25, Atomic Energy of Canada Limited (AECL), and the operators of the machine [6, 8]. Reports show that AECL would often dismiss claims of error with the Therac-25, attributing them to user error rather than to a fault of the company [8]. This unwillingness to claim responsibility is another example of how the Therac-25 device failed. Due
to an overwhelming sense of arrogance, AECL failed to properly communicate the particularities of the Therac-25, leading to issues for the people operating the device [6].

3.3 Recommendations
The main problem leading to the deaths of individuals because of the Therac-25 device was negligence in proper testing methods. It is impossible to attribute this failure to one person or group. The deaths of the patients were due to an overall, gross lack of awareness of the matter.

The first step toward the safety of future patients is to maintain the software. This can be done by having the FDA test every update that a medical device goes through. While testing software updates for medical devices may not seem worth the extra cost, it is almost impossible to fully grasp how the software and the application will work together without those tests. It is imperative that FDA testing of updates to medical software become a priority to ensure that incidents like the Therac-25 case do not happen again.

Another important step is sufficient training of the operators of the machines. While there were several errors in the Therac-25 code, the failure of the nurses and the manufacturing company to ensure proper training for operating the machine also contributed to the deaths of the patients. To ensure that this miscommunication does not occur again, all medical device operators should go through a training period that builds awareness of the devices and of how to properly care for patients.

The final step toward preventing future cases such as this would be to apply standards such as IEEE 16085-2004 and IEEE 1633-2016 to ensure that the software is suitable for the application [4, 5]. These standards aim to prevent future software incidents by demonstrating proper software
testing methods and how to integrate software with its application in the safest way [4, 5]. Neglecting to meet standards can have unintended results and could even lead to more harm, so it is important that all software goes through some form of standardized testing to minimize future risk.

CONCLUSION
Software engineering is a field where errors are not always detrimental. However, with the growing reliance on technology in society, it is important to establish proper testing techniques to ensure the reliability and safety of software. The Therac-25 case is an example of how a negligent testing period can result in the loss of lives. To minimize future risk, standards must continue to improve, and testing practices must continue to be refined.
REFERENCES
[1] T. L. Chu, G. Martinez-Guridi, M. Yue, and J. Lehner, "A Review of Software-Induced Failure Experience," Brookhaven National Laboratory, Upton, NY, Nov. 2006. Accessed: Feb. 22, 2023. [Online]. Available: https://www.bnl.gov/isd/documents/32718.pdf
[2] S. Medewar. "Different Types of Software Testing." Hackr.io. https://hackr.io/blog/types-of-software-testing#:~:text=1.-,Unit%20Testing,kind%20of%20tests%2C%20not%20testers (accessed Feb. 17, 2023).
[3] S. Pittet. "The Different Types of Software Testing." Atlassian.com. https://www.atlassian.com/continuous-delivery/software-testing/types-of-software-testing (accessed Feb. 17, 2023).
[4] "IEEE Recommended Practice on Software Reliability," in IEEE Std 1633-2016 (Revision of IEEE Std 1633-2008), pp. 1-261, 18 Jan. 2017, doi: 10.1109/IEEESTD.2017.7827907.
[5] "Standard for Software Engineering - Software Life Cycle Processes - Risk Management," in ISO/IEC 16085:2004(E) IEEE Std 1540-2001, pp. 1-30, 1 Oct. 2004, doi: 10.1109/IEEESTD.2004.6298075.
[6] P. McQuaid, Ed., "Software Disasters- Understanding the Past, to Improve the Future," in Journal of Software: Evolution and Process, 2012. [Online]. Available: https://onlinelibrary.wiley.com/doi/pdfdirect/10.1002/smr.500
[7] X. Li, "Quality Assurance and Health Management for Software Systems." Order No. 28390851, The University of Texas at Dallas, United States -- Texas, 2019. [Online]. Available: https://www.proquest.com/docview/2497183785?pq-origsite=primo&parentSessionId=RdsQr3UCbunmWYQlTxFVwp4uQYcGRzjtqk5f1wEyXZE%3D
[8] D. Birsch, Ed., "Moral Responsibility for Harm Caused by Computer System Failures," in Ethics and Information Technology, 2004. [Online]. Available: https://link.springer.com/content/pdf/10.1007/s10676-005-5609-5.pdf