CS-GY 6233 Homework 4 Explanations

pdf

School

New York University *

*We aren’t endorsed by this school

Course

6233

Subject

Industrial Engineering

Date

Dec 6, 2023

Type

pdf

Pages

9

Uploaded by CaptainUniverse9620

Report
Samantha Augustin & Samara Augustin CS-GY 6233 - Introduction to Operating Systems 19 November 2023 Homework 4 Explanations Part 2 - Mutex - What circumstances cause an entry to get lost? Analyze the initial code and write a short answer to describe what it means for an entry to be “lost,” and which parts of the program are causing this unintended behavior when run with multiple threads. - Based on the code, we assume that the circumstances that cause for an entry to get lost is that they do not have the resources to access or insert the key-pair into the table. Therefore, in the insert and receive sections which are called multiple times (depending on the number of entries) need to be revised so that every time a new entry needs to be inserted or received they have enough resources to allocate to the thread. This can also be due to the fact that since multiple threads or multiple key-value pairs are being written to the hashtable very quickly, it can cause for some key-value pairs to collide with each other since they are probably trying to access the same space at the same time, so some entries might be deleted or overwritten. - Data From Original Code
- Data From Mutex-Based Code -
- Graph of Prior Running Time vs. Updated Running Time
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
- Estimate of the Timing Overhead - - The timing overhead that I calculated using the data was 147.0130969%. This value shows that the average time taken to retrieve the 100000 keys was 147.0130969% greater than the average time taken to retrieve keys. Therefore, to guarantee correctness, the time taken has to be slowed down by 147.0130969% for all the keys to be retrieved. - Explanation of how you came up with that estimate - To calculate the time overhead, first the average time taken to retrieve the keys for the original code and the mutex-based code was calculated. The average time for the original code was 2.498616882 seconds, while for the mutex-based code was 3.673294059 seconds. Afterwards, the average of mutex-based code was divided by the original code average and multiplied by 100% to give: . 3.673294059 2.498616882 × 100% = 147. 0130969% Part 3 - Spinlock - Explanation -
- If you were to replace all mutexes with spinlocks, what do you think will happen to the running time? Write a short answer describing what you expect to happen, and why the differences in mutex vs. spinlock implementations lead you to that conclusion (it’s okay if your intuition turns out to be wrong, but start with this answer first). - We estimate that if all of the mutexes were replaced with spinlocks that the running time would increase because when spinlocks lock a thread it only does so for a long time compared to mutexes. Therefore, we believe that the execution time would be longer for the spinlocks since it does lock the threads and hold the resources for a long time. This also means that the time overhead would be significantly greater than the time overhead for the mutex-based code. - Graph of Prior Running Time vs. Mutex Running Time vs Spinlock Running Time - Graph - - Table
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
- - Estimate of the Timing Overhead - The timing overhead for the spinlock execution compared to the original execution without mutex or spinlock is 13.00935021 which is a 1300.935021% percent increase. Also, the timing overhead for the spinlock execution compared to the execution with mutexes is .8849109692 which is a 884.9109692% percent increase. Therefore, it shows that on average, the execution for the spinlock compared to the execution without mutexes or spinlocks is 1300.935021% longer. Also, the execution for the spinlock compared to the execution with mutexes is 884.9109692% longer. This shows that using spinlocks makes the program run slower than the others.
- Explanation of how you came up with that estimate - I calculated the timing overhead for the execution time for spinlock in comparison to the execution time of mutexes, by calculating the average of the time to receive for both mutexes and spinlocks. After I received the average time to receive for mutexes (3.673294059) and the spinlocks (32.50538206), I divided the average of the spinlocks by the average of the mutexes which gave me a timing overhead of 8.849109692, which I then multiplied by 100 to get the percent overhead. I followed the same process to receive the timing overhead of the spinlock compared to the execution time of the program without spinlocks or mutexes. Part 4 - Mutex, Retrieve Parallelization - Let’s revisit your mutex-based code. When we retrieve an item from the hash table, do we need a lock? Write a short answer and explain why or why not. - To ensure that the entries were not lost while being inserted or retrieved, locks were added to the insert and retrieve functions. At the beginning of the project, during an attempt to figure out part 2 of the project, we found that no keys were dropped when locks were added to only the insert function and the retrieve function remained the same. However, when we continuously ran the code with varying thread values there began to be some discrepancies with the command line. Therefore, we added the locks to the retrieve function as well which increased the retrieval time significantly but caused for no keys to be lost and for there to be no discrepancies in the command line. From our experience, we can assume that the retrieve function does need locks to make sure that nothing is lost or overwritten while retrieving multiple keys at the same time. However, if the goal was just to prevent the entries from being lost, then adding or not adding the locks is fine, as long as there is a lock in the insert function. This is to make sure everything runs smoothly. Part 5 - Mutex, Insert Parallelization - Describe a situation in which multiple insertions could happen safely (hint: what’s a bucket?). - A situation that would allow for multiple insertions to happen safely would be to have a sufficient amount of buckets and locks. A way to implement this is to allocate a lock per bucket which would allow for different keys to be safely hashed to different buckets when running concurrently, therefore, this would drastically prevent the risk of collision or duplicate hashing. Therefore, in order to implement this idea, rather than using a global lock to use throughout the program I defined a array of locks according to the number of buckets: pthread_mutex_t locks[NUM_BUCKETS]; - Afterwards, in the insert and retrieve functions, I defined the index of the lock in which I would be using for the bucket: int i = key % NUM_BUCKETS;
- Therefore, the lock and unlock functions would be defined as pthread_mutex_lock(&locks[i]); and pthread_mutex_unlock(&locks[i]);
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help