CS-GY 6233 Homework 4 Explanations
pdf
keyboard_arrow_up
School
New York University *
*We aren’t endorsed by this school
Course
6233
Subject
Industrial Engineering
Date
Dec 6, 2023
Type
Pages
9
Uploaded by CaptainUniverse9620
Samantha Augustin & Samara Augustin
CS-GY 6233 - Introduction to Operating Systems
19 November 2023
Homework 4 Explanations
Part 2 - Mutex
-
What circumstances cause an entry to get lost? Analyze the initial code and write a short
answer to describe what it means for an entry to be “lost,” and which parts of the program
are causing this unintended behavior when run with multiple threads.
-
Based on the code, we assume that the circumstances that cause for an entry to get
lost is that they do not have the resources to access or insert the key-pair into the
table. Therefore, in the insert and receive sections which are called multiple times
(depending on the number of entries) need to be revised so that every time a new
entry needs to be inserted or received they have enough resources to allocate to
the thread. This can also be due to the fact that since multiple threads or multiple
key-value pairs are being written to the hashtable very quickly, it can cause for
some key-value pairs to collide with each other since they are probably trying to
access the same space at the same time, so some entries might be deleted or
overwritten.
-
Data From Original Code
-
Data From Mutex-Based Code -
-
Graph of Prior Running Time vs. Updated Running Time
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
-
Estimate of the Timing Overhead -
-
The timing overhead that I calculated using the data was 147.0130969%. This
value shows that the average time taken to retrieve the 100000 keys was
147.0130969% greater than the average time taken to retrieve keys. Therefore, to
guarantee correctness, the time taken has to be slowed down by 147.0130969%
for all the keys to be retrieved.
-
Explanation of how you came up with that estimate
-
To calculate the time overhead, first the average time taken to retrieve the keys for
the original code and the mutex-based code was calculated. The average time for
the original code was 2.498616882 seconds, while for the mutex-based code was
3.673294059 seconds. Afterwards, the average of mutex-based code was divided
by the original code average and multiplied by 100% to give:
.
3.673294059
2.498616882
× 100% = 147. 0130969%
Part 3 - Spinlock
-
Explanation -
-
If you were to replace all mutexes with spinlocks, what do you think will happen
to the running time? Write a short answer describing what you expect to happen,
and why the differences in mutex vs. spinlock implementations lead you to that
conclusion (it’s okay if your intuition turns out to be wrong, but start with this
answer first).
-
We estimate that if all of the mutexes were replaced with spinlocks that the
running time would increase because when spinlocks lock a thread it only
does so for a long time compared to mutexes. Therefore, we believe that
the execution time would be longer for the spinlocks since it does lock the
threads and hold the resources for a long time. This also means that the
time overhead would be significantly greater than the time overhead for
the mutex-based code.
-
Graph of Prior Running Time vs. Mutex Running Time vs Spinlock Running Time
-
Graph
-
-
Table
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
-
-
Estimate of the Timing Overhead
-
The timing overhead for the spinlock execution compared to the original
execution without mutex or spinlock is 13.00935021 which is a 1300.935021%
percent increase. Also, the timing overhead for the spinlock execution compared
to the execution with mutexes is .8849109692 which is a 884.9109692% percent
increase. Therefore, it shows that on average, the execution for the spinlock
compared to the execution without mutexes or spinlocks is 1300.935021% longer.
Also, the execution for the spinlock compared to the execution with mutexes is
884.9109692% longer. This shows that using spinlocks makes the program run
slower than the others.
-
Explanation of how you came up with that estimate
-
I calculated the timing overhead for the execution time for spinlock in comparison
to the execution time of mutexes, by calculating the average of the time to receive
for both mutexes and spinlocks. After I received the average time to receive for
mutexes (3.673294059) and the spinlocks (32.50538206), I divided the average of
the spinlocks by the average of the mutexes which gave me a timing overhead of
8.849109692, which I then multiplied by 100 to get the percent overhead. I
followed the same process to receive the timing overhead of the spinlock
compared to the execution time of the program without spinlocks or mutexes.
Part 4 - Mutex, Retrieve Parallelization
-
Let’s revisit your mutex-based code. When we retrieve an item from the hash table, do
we need a lock? Write a short answer and explain why or why not.
-
To ensure that the entries were not lost while being inserted or retrieved, locks
were added to the insert and retrieve functions. At the beginning of the project,
during an attempt to figure out part 2 of the project, we found that no keys were
dropped when locks were added to only the insert function and the retrieve
function remained the same. However, when we continuously ran the code with
varying thread values there began to be some discrepancies with the command
line. Therefore, we added the locks to the retrieve function as well which
increased the retrieval time significantly but caused for no keys to be lost and for
there to be no discrepancies in the command line. From our experience, we can
assume that the retrieve function does need locks to make sure that nothing is lost
or overwritten while retrieving multiple keys at the same time. However, if the
goal was just to prevent the entries from being lost, then adding or not adding the
locks is fine, as long as there is a lock in the insert function. This is to make sure
everything runs smoothly.
Part 5 - Mutex, Insert Parallelization
-
Describe a situation in which multiple insertions could happen safely (hint: what’s a
bucket?).
-
A situation that would allow for multiple insertions to happen safely would be to
have a sufficient amount of buckets and locks. A way to implement this is to
allocate a lock per bucket which would allow for different keys to be safely
hashed to different buckets when running concurrently, therefore, this would
drastically prevent the risk of collision or duplicate hashing. Therefore, in order to
implement this idea, rather than using a global lock to use throughout the program
I defined a array of locks according to the number of buckets:
pthread_mutex_t locks[NUM_BUCKETS];
-
Afterwards, in the insert and retrieve functions, I defined the index of the lock in
which I would be using for the bucket: int i = key % NUM_BUCKETS;
-
Therefore, the lock and unlock functions would be defined as
pthread_mutex_lock(&locks[i]); and pthread_mutex_unlock(&locks[i]);
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help