LHashCOCS3319Fall2023

docx

School

Sam Houston State University *

*We aren’t endorsed by this school

Course

3319

Subject

Computer Science

Date

Dec 6, 2023

Type

docx

Pages

6

Uploaded by PrivateCrocodile3673

Report
Hash Lab COSC 3319 Fall 2023 Burris Due: All classes prior to High Noon November 24. Late labs will not be accepted! Early submissions are appreciated. For your convenience a file of 200+ random words and names (Words200D16) has been provided on Blackboard. Each line of text in the file is exactly 16 characters in length with the word left justified. You may utilize any code developed in class directly in the lab. While not initially apparent, this lab will be more of a thought problem than coding problem. A careful examination of the sample code utilized in class will reveal a great deal of the code required for the lab has already been developed in class. Indeed, with so many examples , I assume you should be able to complete the “A” option code with little trouble. Hence grading will emphasize evaluation and presentation of results for correctly coded programs . Be professional grade! "C" Option: Assume a main memory hash table of size 128 . For convenience, assume the characters in the key are numbered 1 through 16. “My” hash address must be calculated as follows using Wallgol: HA = ( str( 1:2) / 256 + str( 3:4) / 277 + str( 5:5 ) ) / 2**10 + str(6:6) / 313 + str(7:7) / 3 + str(10:10) For str = “ABCDEFGHIJKLMNOP”: HA = ( AB / 256 + CD / 277 + E ) / 2**10 + F / 313 + G / 3 + ‘J’ All alphabetic/numeric characters in a key are 8 bit ASCII, left justified. Keys containing less than 16 alphabetic/numeric characters are padded with spaces on the right to a total of 16 characters. I have never used this function. It should be substantially less than optimal generating a plethora of collisions. All characters and character strings must be treated as integers in all steps of the calculation (cast or coercion) ! You may use the slice operator or simple subscripting depending on how you define the keys. You must use a highlighter to mark all code implementing the calculation for my hash function and your hash function . I will not grade the lab if you have not marked the code as required ! I recommend using a long_integer to hold the intermediate results of hash calculations. A “slice” in Ada allows the user to treat a substring as a single unit. If “str” has the value “ABCD” then str(2..3) refers to the characters in positions 2 through 3 as a group from the string str, i.e., ”BC.” “str(2..2) or str[2] would be the second character in the string, i.e., “B”. I am assuming you have instantiated appropriate functions (coercion/cast) which converts character strings (8 bits per ASCII character) to long integers (32 bits) so overflow will not occur in the sum.
Each entry placed in the hash table must be a record with 3 fields. The first field is the key, the second the initial hash address, and the third field is the number of probes to insert the key in the table counting the initial hash address as the first probe (sum of first probe plus the number of collisions for insertion). A) Create a hash table in main memory and hash the first 110 key in the data file in the table (approximately 86% full). Use the linear probe technique developed in class to handle collisions. Now look up the first 25 keys placed in the table. Print the minimum, maximum, and average number of probes required to locate the first 25 keys placed in the table. Now search for the last 25 keys placed in the table. Print the minimum, maximum, and average number of probes required to locate the last 25 keys placed in the table. Print the contents of the hash table clearly indicating open entries (this should allow you to see the primary/secondary clustering effect). Calculate and print the theoretical expected number of probes to locate a random item in the table. Explain your empirical results in light of the theoretical results. Your grade point will be highly dependent on your explanation! B) Now search for all 110 keys in the table. Calculate and print the theoretical expected number of probes to locate a random item in the table. Next calculate the actual average number of probes to find all keys in the table. YOU MUST PHYSICALLY SEARCH FOR EACH KEY TO CALCULATE THE STATISTICS! Discuss the empirical results in light of the theoretical results. Your grade points will be highly dependent on your explanation! C) Repeat A and B above but use the random probe for handling collisions as developed in class. Your grade points will be highly dependent on your explanation! D) Present all your results in a single neat tabular typed table . Compare your results to the theoretical results. If there are differences, please explain why . I expect you to physically calculate the theoretical values for comparison to your empirical results for both the linear and random probes ! You may not ask a friend, enemy, hire a consultant, or anyone else for help. You may not look them up in some table or the internet! Providing someone else with the results will not only cause them to fail the lab but you as well . You may however teach them to use a calculator, spreadsheet, log tables, or other techniques, explain ASCII or details of computer architecture. You may help others with coding and logic errors. You may not share your code ! E) My required hash function has multiple weaknesses. Criticize my hash functions specific weaknesses on a technical basis. To receive full credit, you must state clearly and explicitly why my hash function should fail ! I will not be impressed by a statement such as “the table size is a power of two” unless you can specifically tie the failure of my hash function. Based on your criticism, write a better hash function. You must explain explicitly why your hash function should be better from both a theoretical and empirical
standpoint based on your previous explanation of why my function is bad. You will not receive credit for simply writing a hash function that performs better . Implement your hash function and generate the same results as required for parts A thru E presented in tabular format for comparison. Formally evaluate the results as part of your lab (typed evaluation ). The results for all parts of the lab should appear in a single table . Shame on you if your hash function’s performance does not exceed at least my hash for both the linear and random probe. It is possible to exceed the theoretical performance values with sufficient thought. Script Kiddies may have trouble evaluating the failures of my hash function and creating a better function. Looking for “engineers!” "B" Option: First complete the "C" option using a main memory hash table. Now complete the "C" option using a random access (relative) file for the hash table . Main memory may not be used to store any portion of the hash table! Note you are only replacing where the data is stored or retrieved. For example, replacing store “hashTab[ J ] x” with a “write” statement and get “x hashTab[ J]” with a “read” statement using a relative file as opposed to main memory! All the rest of the code remains the same . Generate the same results as required for parts A thru E presented in tabular format for comparison. You must provide a professional discussion of the results. "A" Option: The “C/B” Options must implement my hash function and your hash code. programming actual language function including restrictions on how parameters are passed. Prototypes follow for Ada, Java, “C,” and C++. A) function BurrisHash( parms: in) return HA begin °°°end; B) procedure YourHash( parm1, parms: in Integer; HA: out ) begin °°°end; For Ada, “In” is the equivalent of call by value in Java, “C,” and C++. This is an important safe coding technique. “Call-by-Value” protects against modifying “VALUE” parameters from accidental or malicious modification (Trojan Horse) in the invoking subprogram . “Out” allows the procedure/function to return a value to the invoking program. The invoking program or task cannot inadvertently pass information to the invoked program. Opposite of call-by-value. “Inout” is equivalent to “call by reference” in sequential programming in Java, “C,” and C++. “Inout” has an especially important security role for Ada in real time systems (primary reason for which Ada was developed and utilized) . Nothing is changed/updated in calling
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
task until the “procedure/function ” returns!” For example, assume the main code for the spaceship invokes a separate task to modify its course. In Java, “C,” and C++ if a call parameter passed by reference arer modified in the invoked task it is actually physically modifying the parameter(s) in the main program. If the task incurs an error and terminates you have lost the original data required for the next attempt to adjust course. For “inout” parameters in Ada the task must complete successfully for the modifications to occur in the main program. “Inout” helps insure safe programming allowing the main calling task to recover and restart task that failed with all original data intact . “C”/C++ and Java support procedures but retain the keyword “function.” The typical prototype for the equivalent of an Ada “procedure” is “void function( parm1, parm2: integer; result1, result2: integer CBR) { --- } Parm1 and parm2 are input parameters required for result calculation. Frequently call- by-value (CBV). Result1 and result2 are the values to be returned to the invoking program hence must be of type “reference” (call-by-reference or CBR). Require format all grading options when printing the contents of the hash table for both the linear and random probe: What is stored in the table. For example the results of “your” hash function. You only have to print the contents of the hash table for your hash functions, not mine! address Contents at the address Original hash address Final hash address Number of probes to store/retriev e 1 Joe 1 1 1 2 Mary 74 2 26 3 °°° °°° °°° °°° °°° 128
REPORT ALL GRADING OPTIONS Results of the lab must be submitted in the following order. 1) A cover sheet clearly stating the grading option you have completed, your name and days (MWF or TTh) your class meets. 2) A single table containing all results for all grading options followed by discussion (explanation of results) for the grading option. 3) The code for the grading option. 4) Next include the required memory contents with locations from 1 through 128 including the key (in its final resting spot), and number of probes to store/locate the key in ascending numeric order from 1 through 128. Do this for both the linear and random probes . 5) Repeat the above steps for the “B” Option. Combine the discussion for both the “C” and ”B” options. 6) Repeat the above steps for the “A” Option. Combine the discussion for the “C,” ”B” and “A” options. 7) You must make it easy for me to locate the results. You can do this by numbering the pages and including a table of contents on the cover page. Alternately, some form of tabs braking up the sections in the specified order. Many have accomplished this in the past using scotch tape tabs.
YOU ARE THE EXPERT - I AM THE CLIENT. Convenience me you have done a professional job! Partial Sample Table “C” Option “C” Option: My Hash Linear Your Hash Linear My Hash Random Your Hash Random First 30 Low 40% High 85% Low 40% High 85% Low 40% High 85% Low 40% High 85% Min Probe Max Average Theoretical Last 30 Low 40% High 85% Low 40% High 85% Low 40% High 85% Low 40% High 85% Min Probe Max Average Theoretical Minimum, maximum and average number of probes to locate all keys in the table! A professional verifies their hash function to be sure it is calculating hash addresses correctly. Hint: Compare your hand calculation of hash addresses with those produced by your code. Use your knowledge of the character set to do the calculations by hand. If you are not verifying results, you are not a professional.
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help