What is Multi-Threading?
An algorithm that executes one instruction at a time on a uniprocessor system is called a serial algorithm. On a multiprocessor system, multiple instructions can execute in parallel. A multithreaded algorithm is one that allows multiple parts of its code to execute concurrently.
Multithreading is a CPU feature that allows threads to execute independently while sharing the resources of their process. A thread is a sequence of instructions that executes alongside other threads, so multithreading lets multiple tasks be performed within a single process.
Multithreading in operating systems
Multithreading allows the threads belonging to a process to run concurrently for optimal CPU use. A thread is the smallest unit of execution within a process and is often called a lightweight process; it always exists within a process. Each thread keeps track of the following information:
- A program counter pointing to the next instruction to execute.
- A set of registers holding its current working variables.
- A stack recording its execution history.
Threads of the same process share resources such as the code segment, data segment, and open files, so a change one thread makes to shared data is visible to all the other threads. A word processor, for example, uses separate threads to check the spelling and grammar of the content and to generate a PDF version of the document while the user keeps typing. Likewise, an internet browser uses multiple threads to load material, display animations, and play video across its tabs.
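As a rough illustration, the Java sketch below (the class and thread names are our own, not from the article) shows two threads of one process incrementing a shared counter. Because the threads share the process's data segment, both operate on the same variable:

```java
// Minimal sketch: two threads of a single process sharing data.
public class SharedCounterDemo {
    private static int counter = 0;                 // shared by all threads
    private static final Object lock = new Object();

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                synchronized (lock) {               // serialize updates
                    counter++;
                }
            }
        };
        Thread t1 = new Thread(work, "worker-1");
        Thread t2 = new Thread(work, "worker-2");
        t1.start();
        t2.start();
        t1.join();                                  // wait for both to finish
        t2.join();
        System.out.println("counter = " + counter); // prints 2000
    }
}
```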
Multithreading refers to a central processing unit's ability to run many threads of code, provided the operating system supports it. This approach is not the same as multiprocessing: dynamic multithreading is possible even on a single CPU, whereas on a multiprocessor machine threads can run simultaneously on different cores.
Simultaneous multithreading (SMT) is a processor architecture that combines hardware multithreading with superscalar processing. SMT is more cost-effective than placing many complete processors on a single chip because it can dynamically assign execution resources, cycle by cycle, to the threads that demand them.
Thread lifecycle
A thread's lifetime is divided into several phases, which are listed below.
New: The lifecycle of a newly created thread begins in this state. The thread stays here until the program starts it.
Runnable: Once started, a thread becomes runnable and is considered to be carrying out the work assigned to it.
Waiting: The currently running thread enters the waiting state while it waits for another thread to complete a job, and transitions back once it receives a signal from that thread.
Timed waiting: A thread enters this state when it executes a method with a timeout parameter, and remains there until the timeout expires or a signal arrives.
Terminated (Dead): When a thread completes its task, it reaches this state.
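These phases correspond closely to Java's built-in Thread.State values, so a short sketch can make them concrete. The example below is purely illustrative (the sleeping task is our own choice) and observes one thread in the New, Timed waiting, and Terminated phases:

```java
// Sketch: observing lifecycle phases via Java's Thread.State.
public class LifecycleDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(100);        // puts the thread into TIMED_WAITING
            } catch (InterruptedException ignored) { }
        });
        System.out.println(t.getState()); // NEW: created but not yet started
        t.start();
        Thread.sleep(50);                 // give the thread time to reach sleep()
        System.out.println(t.getState()); // TIMED_WAITING: inside sleep(100)
        t.join();                         // wait for the thread to finish
        System.out.println(t.getState()); // TERMINATED: task completed
    }
}
```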
Execution types of multithreading
There are two types of execution:
Concurrent execution
Concurrent execution occurs when a single processor switches resources among the threads of a multithreaded process, so the threads make progress by taking turns. It can be implemented in several ways: each computational task can be implemented as a separate operating system process, or the tasks can be implemented as a group of threads within a single operating system process.
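A minimal sketch of the thread-based approach: two threads take turns on the processor, and the operating system's scheduler decides the exact interleaving (the yield() call is only a hint, so the output order will vary between runs):

```java
// Sketch: concurrent execution by interleaving two threads.
public class InterleavingDemo {
    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 3; i++) {
                System.out.println(Thread.currentThread().getName() + " step " + i);
                Thread.yield();  // hint to the scheduler to switch threads
            }
        };
        Thread a = new Thread(task, "A");
        Thread b = new Thread(task, "B");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```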
Parallel execution
When each thread of a multithreaded process executes on a distinct processor at the same time, it is known as parallel execution, or parallelism. In the parallel programming model, work proceeds simultaneously on one machine or across several machines. Multithreading means executing multiple sequential streams of instructions (threads) at once: on a single processor the threads are interleaved, whereas on a multiprocessor they can truly execute at the same time, which is called parallel processing. In that case each processor or core runs a separate thread in real time, and an independent software thread can run in parallel on a distinct hardware thread.
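As a hedged sketch of parallelism in Java: a fixed thread pool sized to the number of available cores lets the runtime run one worker per core, though the actual placement of threads on cores is up to the operating system:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: parallel execution with one worker per hardware core.
public class ParallelDemo {
    public static void main(String[] args) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        for (int i = 0; i < cores; i++) {
            final int id = i;
            pool.submit(() -> {
                // On a multiprocessor, these tasks can run at the same time.
                System.out.println("task " + id + " on "
                        + Thread.currentThread().getName());
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```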
Dynamic Multithreading
Dynamic multithreading (DMT) refers to an architecture that executes a single program as several threads created dynamically from the same software. At procedure and loop boundaries, the hardware spawns threads and runs them on a simultaneous multithreading pipeline, which increases processor utilization. Although the DMT processor is built on simultaneous multithreading, the multiscalar architecture significantly influenced its execution method: like the multiscalar, it follows multiple flows of control to reduce instruction-fetch stalls and exploit control independence.
In DMT, program execution begins as a single thread. As instructions are decoded, the hardware divides the program into pieces at loop and procedure boundaries, and these pieces then execute as distinct threads in the SMT pipeline. Control logic keeps a record of the program's thread order and each thread's start PC. A thread stops fetching instructions when its PC reaches the start PC of the next thread in the ordered list. If a thread never reaches the start PC of its successor in the ordered list, the successor is considered mispredicted and is squashed. Because threads reach far ahead into the program in search of parallelism and do not wait for their inputs, data mispredictions are widespread.
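DMT is a hardware mechanism, so it cannot be reproduced in application code; the toy Java model below is entirely our own construction and only illustrates the ordering rule just described. Each "thread" is reduced to a start PC plus the PCs its control flow visits, and a successor thread is squashed when its predecessor never reaches the successor's start PC:

```java
import java.util.List;

// Toy model (our own construction) of the DMT thread-ordering rule.
public class DmtSquashToy {
    // A speculative thread: its start PC and the PCs it actually visits.
    record SpecThread(int startPc, List<Integer> visitedPcs) { }

    public static void main(String[] args) {
        // Ordered list of speculatively spawned threads.
        List<SpecThread> ordered = List.of(
            new SpecThread(0,  List.of(0, 1, 2, 10)),    // reaches PC 10
            new SpecThread(10, List.of(10, 11, 12, 13)), // never reaches PC 20
            new SpecThread(20, List.of(20, 21)));        // will be squashed
        for (int i = 0; i + 1 < ordered.size(); i++) {
            SpecThread cur  = ordered.get(i);
            SpecThread next = ordered.get(i + 1);
            // A thread stops fetching at its successor's start PC; if the
            // current thread never reaches that PC, the successor was
            // mispredicted and is squashed.
            boolean reached = cur.visitedPcs().contains(next.startPc());
            System.out.println("thread@" + next.startPc()
                    + (reached ? " confirmed" : " squashed"));
        }
    }
}
```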
The DMT microarchitecture contains two pipelines: an execution pipeline and a recovery pipeline.
DMT Microarchitecture
Figure (a) is the DMT block diagram. Every thread has its own PC, trace buffer, rename tables, and load and store queues; the threads share the memory hierarchy, functional units, physical register file, and branch prediction tables. The dark-shaded boxes represent hardware that is replicated per thread, while the hardware corresponding to the lightly shaded boxes may be replicated or shared depending on the simulated configuration.
Execution pipeline
The execution pipeline is depicted in figure (b). Instructions are fetched into a per-thread trace buffer and supplied to the rename unit, which maps logical registers onto physical registers and records the destination mappings back in the trace buffer. Load and store queue entries are also allocated at this stage. Instructions issue for execution once their inputs become available, and their results are written back to the physical register file as well as the trace buffers. When instructions complete execution, they are removed from the pipeline in order, freeing physical registers that are no longer needed; this stage is called early retirement. At this point the results are still only speculatively correct, so the instructions and their speculative state are kept in the trace buffers and load/store queues. After all data mispredictions have been detected and repaired, the speculative state is committed, in order, from the trace buffers into a final retirement register file; the load queue entries are then freed and the stores are written to memory. Threads, like instructions, retire from the trace buffers in program order.
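Again as a toy model of our own (not the actual hardware), the sketch below illustrates the two-phase retirement idea: instructions leave the execution pipeline early, but their results stay speculative in a trace buffer until data mispredictions are repaired and the state commits in program order:

```java
import java.util.ArrayDeque;

// Toy model (our construction) of DMT's early vs. final retirement.
public class TwoPhaseRetireToy {
    record Insn(String text, boolean mispredicted) { }

    public static void main(String[] args) {
        // Early retirement: instructions have left the pipeline in order,
        // their results parked in the trace buffer as speculative state.
        ArrayDeque<Insn> traceBuffer = new ArrayDeque<>();
        traceBuffer.add(new Insn("r1 = load a", false));
        traceBuffer.add(new Insn("r2 = r1 + 4", true)); // data mispredicted
        traceBuffer.add(new Insn("store r2, b", false));

        // Final retirement: commit in order, repairing mispredictions first.
        while (!traceBuffer.isEmpty()) {
            Insn insn = traceBuffer.peek();
            if (insn.mispredicted()) {
                System.out.println("recover: re-execute '" + insn.text() + "'");
                traceBuffer.poll();
                traceBuffer.addFirst(new Insn(insn.text(), false)); // repaired
                continue;
            }
            System.out.println("commit: '" + traceBuffer.poll().text() + "'");
        }
    }
}
```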
Recovery pipeline
Figure (c) depicts the data-misprediction recovery procedure. Before a thread finally retires, its register inputs are compared with the final retirement values of the previous thread, and disambiguation logic in the load queues detects memory mispredictions. When either check detects a misprediction, recovery instructions are fetched from the trace buffer rather than the instruction cache. Instructions are fetched in ordered blocks starting at the point of misprediction; they are sorted, and those affected by the fault are merged and passed to the rename unit. The rename unit therefore receives a series of recovery instructions that are in program order within the dynamic trace, though not necessarily contiguous. Local input registers are renamed through a per-thread recovery map table, and logical destinations are assigned new physical registers. The recovery map table is bypassed when a register input is produced outside the recovery sequence; in that case the trace buffer supplies the mapping or the data, whichever is available. The recovery instructions execute and write their results to the new physical registers and to the trace buffers as their operands become available.
Common Mistakes
A common source of confusion is thread ordering. DMT divides a sequential program into dynamically contiguous pieces, but the corresponding threads are not necessarily created in that order: a thread spawns a new thread when it encounters a procedure call or a backward branch, so threads are not always produced in program order. The thread-allocation policy is preemptive when all thread contexts are in use: a new thread that falls earlier in program order preempts the lowest thread in the ordered list.
Concepts and Applications
This subject is significant in college courses at both the undergraduate and graduate levels, particularly for:
- Bachelor of Science in Computer Science
- Master of Science in Information Systems
- Master of Science in Information Technology
Related Topics
- Multithreaded algorithm
- Sequential program in multithreading
- Concurrency and parallelism
Practice Problems
Q1. Can dynamic multithreading happen in a single processor?
- Yes
- No
Correct answer: 1. Yes
Explanation: Yes, dynamic multithreading can happen on a single processor.
Q2. What is the purpose of hardware that supports multiple threads?
- To transition between a blocked thread and a ready-to-run thread quickly.
- To prevent quick switching between a blocked thread and a thread that is ready to run.
- None
- To enable rapid switching in a blocked thread.
Correct answer: 1. To transition between a blocked thread and a ready-to-run thread quickly.
Q3. What is the first phase of the thread Lifecycle?
- New
- Terminated
- Waiting
- None
Correct answer: 1. New
Explanation: The lifecycle of a newly created thread begins in the New state, and the thread stays there until the program starts it.
Q4. A thread is also called a ________________________
- Data segment process
- Heavyweight process
- Overhead process
- Lightweight process
Correct answer: 4. Lightweight process
Explanation: A thread is also known as a lightweight process. It executes multiple tasks at the same time within a process.
Q5. Which of the following is one of the execution types of multithreading?
- Concurrent execution
- Single execution
- Multi execution
- Simultaneous execution
Correct answer: 1. Concurrent execution
Explanation: Concurrent execution occurs when a processor switches resources between the threads of a multithreaded process on a single processor.