Lecture - 18 - 19

pptx

School

Manchester Community College *

*We aren’t endorsed by this school

Course

202

Subject

Computer Science

Date

Nov 24, 2024

Type

pptx

Pages

30

Uploaded by ChiefWater13497

Report
Lecture Notebooks 18, and 19
Using algorithms related to sorting values in NumPy arrays What are we going to learn in Note 18
Selection sort: It is  a sorting algorithm that repeatedly finds the minimum element in the unsorted portion of an array and swaps it with the element at the beginning of the unsorted portion . This process continues until the entire array is sorted. It is O() where n is the size of the array. O() represents how complex of the algorithm. Larger -> more complex Example : Consider [2, 1, 4, 3, 5] Step 1 : [ 2 , 1, 4, 3, 5 ] -> [ 1 , 2 , 4, 3, 5 ] Step 2 : [ 1 , 2 , 4, 3, 5 ] -> keep Step 3 : [ 1 , 2 , 4 , 3, 5 ] -> [ 1, 2 , 3 , 4 , 5 ] Step 4 : [ 1, 2 , 3 , 4 , 5 ] -> keep and finished! Output : array([1, 2, 3, 4, 5])
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
def selection_sort(x): # We define a function called “selection_sort” with input x for i in range(len(x)): # i is the beginning of the unsorted portion swap = i + np.argmin(x[i:]) # find the location/index of the minimum elelment in the # unsorted portion (x[i], x[swap]) = (x[swap], x[i]) # swap the minimum element and the beginning # element return x # Output the sorted x x = np.array([2, 1, 4, 3, 5]) selection_sort(x) Output : array([1, 2, 3, 4, 5])
Bogo sort: It is  based on randomly shuffling the elements of the data structure and then checking if they are correctly sorted. If not, repeat the process. It is O(n×n!) Example : Consider [2, 1, 4, 3, 5] Step 1 : [2 1 4 3 5] -> not sorted -> [3 1 2 5 4] Step 2 : [3 1 2 5 4] -> not sorted -> [5 1 4 3 2] Step 3 : [5 1 4 3 2] -> not sorted -> [2 1 5 3 4] . . . Step 4 : [2 1 5 3 4] -> not sorted -> [3 5 1 2 4] Step 5 : [3 5 1 2 4] -> not sorted -> [5 2 3 1 4] Step 367 : [5 2 3 1 4] -> not sorted -> [1 2 3 4 5] Step 368 : [1 2 3 4 5] -> Sorted and finished! Output : array([1, 2, 3, 4, 5])
def bogosort(x): # We define a function called “bogosort” with input x while np.any(x[:-1] > x[1:]): # Check if the array is correctly sorted np.random.shuffle(x) # If not, randomly shuffling the elementa in the array return x x = np.array([2, 1, 4, 3, 5]) bogosort(x) Output : array([1, 2, 3, 4, 5])
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Quicksort in NumPy: np.sort or x.sort() . It is O(n logn) x = np.array([2, 1, 4, 3, 5]) y = np.sort(x) #np.sort(x) will not modify the input x print(x) print(y) Output : [2, 1, 4, 3, 5] [1, 2, 3, 4, 5] x.sort() # x.sort() will modify the input x. Be careful! print(x) Output : [1, 2, 3, 4, 5]
Quicksort in NumPy (Cont): argsort will returns the indices of the sorted elements. x = np.array([2, 1, 4, 3, 5]) i = np.argsort(x) print(i) Output : [1 0 3 2 4] i.e., The first element of this result gives the index of the smallest element, the second value gives the index of the second smallest, and so on.  print(x[i]) # Indices in I can then be used via fancy indexing to construct the sorted array Output : [1, 2, 3, 4, 5]
Sorting along rows or columns: Combine np . sort with axis . rand = np.random.RandomState(42) X = rand.randint(0, 10, (4, 6)) print(X) Output : [[6 3 7 4 6 9] [2 6 7 4 3 7] [7 2 5 4 1 7] [5 1 4 0 9 5]] # sort each column of X print(np.sort(X, axis=0)) Output : [[2 1 4 0 1 5] [5 2 5 4 3 7] [6 3 7 4 6 7] [7 6 7 4 9 9]]
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Sorting along rows or columns (Cont): rand = np.random.RandomState(42) X = rand.randint(0, 10, (4, 6)) print(X) Output : [[6 3 7 4 6 9] [2 6 7 4 3 7] [7 2 5 4 1 7] [5 1 4 0 9 5]] # sort each row of X print(np.sort(X, axis=1)) Output : [[3 4 6 6 7 9] [2 3 4 6 7 7] [1 2 4 5 7 7] [0 1 4 5 5 9]]
Partial sorts/Partitioning: Use np.partition(X, k) to create an array with the k smallest values in X to the left of the partition, and the remaining values to the right, in arbitrary order. x = np.array([7, 2, 3, 1, 6, 5, 4]) np.partition(x, 3) # three smallest values at the left Output : array([2, 1, 3, 4, 6, 5, 7]) Note : The first three values in the resulting array (i.e., 2, 1, 3) are the three smallest in the array, and the remaining array positions contain the remaining values (i.e., 4, 6, 5, 7). Within the two partitions, the elements have arbitrary order.
Partial sorts/Partitioning (Cont): Multi-dimensional partial sorting rand = np.random.RandomState(42) X = rand.randint(0, 10, (4, 6)) print(X) Output : [[6 3 7 4 6 9] [2 6 7 4 3 7] [7 2 5 4 1 7] [5 1 4 0 9 5]] np.partition(X, 2, axis=1) # two smallest values at the left of each row Output : array([[3, 4, 6, 7, 6, 9], [2, 3, 4, 7, 6, 7], [1, 2, 4, 5, 7, 7], [0, 1, 4, 5, 9, 5]])
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Partial sorts/Partitioning (Cont): Use np.argpartition to obtain the indices of the partition. x = np.random.randint(0,10,(4,6)) print(x) Output : [[9 1 9 7 7 4] [9 7 9 7 6 5] [3 7 0 3 5 6] [2 8 8 0 4 6]] np.argpartition(x, 2, axis=0) # indices of the two smallest values at the top of each column Output : array([[3, 0, 2, 3, 3, 0], [2, 1, 3, 2, 2, 1], [1, 2, 0, 1, 1, 2], [0, 3, 1, 0, 0, 3]])
Using NumPy's structured arrays and record arrays, which provide efficient storage for compound, heterogeneous data. What are we going to learn in Note 19
Scenario: Suppose we have three separate arrays of information about four persons, which are respectively their names, ages and weights. name4 = ['Alice', 'Bob', 'Cathy', 'Doug’] # It is name in our lecture note age4 = [25, 45, 37, 19] # It is age in our lecture note weight4 = [55.0, 85.5, 68.0, 61.5] # It is weight in our lecture note Motivation : We would like to organize the three separate arrays into one single structured data set/structure . For example, [('Alice', 25, 55. ) ('Bob', 45, 85.5) ('Cathy', 37, 68. ) ('Doug', 19, 61.5)] That is, the first, second and third coordinates respectively represent name, age and weight. Advantage : Three separate arrays are combined into one single data set/array and we can easily compute the average weight via np.mean( data [‘ weight ’]) , where data is the name of the data set and weight is the variable name for weight.
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
How to do it: Step 1 : Use compound data type for structured arrays. Important information to specify includes the name of the structured array, size of the structured array, name of the variables and the data type of the variables. # Use a compound data type for structured arrays data = np.zeros( 4 , dtype={'names':(' name ', ' age ', 'weight '), 'formats':(' U10 ', ' i4 ', ' f8 ’)}) # created an empty container array print(data,"\n") print(data.dtype) Output : [('', 0, 0.) ('', 0, 0.) ('', 0, 0.) ('', 0, 0.)] [('name', '<U10'), ('age', '<i4'), ('weight', '<f8’)] Note : 'U10' translates to "Unicode string of maximum length 10," 'i4' translates to "4-byte (i.e., 32 bit) integer," and 'f8' translates to "8-byte (i.e., 64 bit) float." More options later ….
How to do it (Cont): Step 2 : Fill the empty contained array with our lists of values data['name'] = name4 data['age'] = age4 data['weight'] = weight4 print(data) Output : [('Alice', 25, 55. ) ('Bob', 45, 85.5) ('Cathy', 37, 68. ) ('Doug', 19, 61.5)]
How to do it (Cont): Step 3 : Now, we can easily access this handy structured array Example 1 : Get the names of all entries data['name’] Output : array(['Alice', 'Bob', 'Cathy', 'Doug'], dtype='<U10') Example 2 : What information in the last row? data[-1] Output : ('Doug', 19, 61.5)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
How to do it (Cont): Example 3 : Who is in the last row? data[-1]['name’] Output : 'Doug’ Example 4 : Who is under the age 30? data[data['age'] < 30]['name’] Output : array(['Alice', 'Doug'], dtype='<U10')
How to do it (Cont): Example 5 : Sorting by age, then name, if ages are equal # Sort by age, then name, if ages are equal np.sort(data, order=['age', 'name’]) Output : array([('Doug', 19, 61.5), ('Alice', 25, 55. ), ('Cathy', 37, 68. ), ('Bob', 45, 85.5)], dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f8')])
How to do it (Cont): Example 6 : Partitioning by weight: the smallest 2 weights on the left of the array # Partition by weight np.partition(data, 2, order=['weight’]) Output : array([('Alice', 25, 55. ), ('Doug', 19, 61.5), ('Cathy', 37, 68. ), ('Bob', 45, 85.5)], dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f8')]) Note : Pandas provides a  Dataframe  object, which is a structure built on NumPy arrays that offers a variety of useful data manipulation functionality similar to what we have shown here, as well as much, much more. We will learn Pandas in Part II.
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Creating structured arrays Example 1 : Dictionary method np.dtype({'names':('name', 'age', 'weight'), 'formats':('U10', 'i4', 'f8')}) Output : dtype([('name', '<U10'), ('age', '<i4'), ('weight', '<f8')]) Example 2 : Using Python types or NumPy dtypes to produce the same result np.dtype({'names':('name', 'age', 'weight'), 'formats':((np.str_, 10), np.int32, np.float64)}) Output : dtype([('name', '<U10'), ('age', ‘<i4'), ('weight', ‘<f8')])
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Creating structured arrays (Cont) Example 3 : Using compound type np.dtype([('name', 'U10'), ('age', 'i4'), ('weight', 'f8’)]) Output : dtype([('name', '<U10'), ('age', '<i4'), ('weight', '<f8')])
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Quick notes about dtype Data type object (dtype) informs us about the layout of the array: 1. Type of the data (integer, float, string, etc.) 2. Size of the data (number of bytes. 1 byte = 8 bits (0 or 1)) 3. The byte order of the data (little-endian or big-endian) 4. If the data type is a sub-array, what is its shape and data type?
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Quick notes about dtype (Cont) Character Description Example 'b' Byte np.dtype('b') 'i' Signed integer np.dtype('i4') == np.int32 'u' Unsigned integer np.dtype('u1') == np.uint8 'f' Floating point np.dtype('f8') == np.int64 'c' Complex floating point np.dtype('c16') == np.complex128 'S', 'a' String np.dtype('S5') 'U' Unicode string np.dtype('U') == np.str_ 'V' Raw data (void) np.dtype('V') == np.void
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Quick notes about dtype (Cont) Example 1 : >i4 - Byte order is >, which represents big-endian byte ordering - Type is i, which represents integer - Size is 4, means 4 bytes - Altogether, it means Python Program created a data type object containg a 32-bit big- endian integer - i.e., np.int32
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Quick notes about dtype (Cont) Example 2 : <f8 - Byte order is <, which represents little-endian byte ordering - Type is f, which represents floating-point number - Size is 8, means 8 bytes - Altogether, it means Python Program created a data type object containg a 64-bit little- endian floating-point number - i.e., np.float64
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Quick notes about dtype (Cont) Example 3 : <U10 - Byte order is <, which represents little-endian byte ordering - Type is U, which represents unicode string - Size is 10, means 10 bytes - Altogether, it means Python Program created a data type object containg a 10-character little-endian string - i.e., (np.str_, 10) or (np.unicode_, 10)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Appending new array Example : Using np.append(old_array, new_data) Recall : We create the data set data as follows name = ['Alice', 'Bob', 'Cathy', 'Doug'] age = [25, 45, 37, 19] weight = [55.0, 85.5, 68.0, 61.5] dt = np.dtype([('name', 'U10'), ('age', 'i8'), ('weight', 'f8')]) # using compound type data = np.zeros(4, dtype=dt) data['name'] = name data['age'] = age data['weight'] = weight Want : Append the data array data with the record (John ,64, 89)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
Appending new array (Cont) data=np.append(data, np.array([('John',64,89)], dtype=dt)) print(data) Output : [('Alice', 25, 55. ) ('Bob', 45, 85.5) ('Cathy', 37, 68. ) ('Doug', 19, 61.5) ('John', 64, 89. )]
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help