BAN110_2

pdf

School

Seneca College

Course

110

Subject

Statistics

Date

Apr 3, 2024

Type

pdf

Pages

14

Uploaded by BailiffComputer14693

21/02/2024, 14:31
Program Summary - HIMANI_119717239_Assignment2.sas

Execution Environment
Author: u63731080
File: /home/u63731080/BAN110/HIMANI_119717239_Assignment2.sas
SAS Platform: Linux LIN X64 3.10.0-1062.12.1.el7.x86_64
SAS Host: ODAWS02-USW2-2.ODA.SAS.COM
SAS Version: 9.04.01M7P08062020
SAS Locale: en_GB
Submission Time: 21/02/2024, 14:31:52
Browser Host: CPEBC4DFB434483-CMBC4DFB434480.CPE.NET.CABLE.ROGERS.COM
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36
Application Server: ODAMID00-USW2-2.ODA.SAS.COM

Code: HIMANI_119717239_Assignment2.sas

libname record '/home/u63731080/BAN110';

data customer_record1;
    set record.customer_all;
run;

/* Q1. Examine the target variable y:
   Use PROC FREQ to list a simple frequency table for the variable y. */

title 'Simple Frequency Table of target variable y';
proc freq data=customer_record1;
    table y;
run;
title;

/* Q2. Examine the variable "contact" and study its dependency with the target variable y.
   Use PROC FREQ to list a simple frequency table for the variable "contact".
   Examine the output for invalid values. */

title 'Simple Frequency Table of variable Contact';
proc freq data=customer_record1;
    table contact;
run;
title;

/* Q3. Contingency table Contact by y and mosaic plot:
   create a 2x2 contingency table along with a mosaic plot.
   Show the statistics for Table of contact by y. */

proc freq data=customer_record1;
    tables contact * y / chisq plots=mosaicplot;
run;

/* Interpret:
   (a) Based on the mosaic plot, do you assume association between the two variables?
   (b) Based on the Contingency coefficient, is there an association between the two variables?
   Answer:
   (a) Based on the mosaic plot, I would assume that there is an association between the two variables.
       It appears that customers contacted via cellular were the most likely to buy the Certificate of Deposit (CD) from the institution.
       Customers contacted via telephone were the next most likely, and customers contacted by unknown methods were the least likely.
   (b) The contingency coefficient of 0.2541 indicates a medium association between the two variables.
       The closer the contingency coefficient is to 0, the weaker the association; the closer it is to 1, the stronger the association.
*/

/* Q4. Examine the variable "education"

/* 4.1. define a new format, name it education_Check and
   use it to identify invalid values for the variable education.
   Valid values are 'primary', 'secondary', 'tertiary', 'unknown'.
   Refer to program 1.8.
   Chapter 1 - Working with Character Data
   Cody's Data Cleaning Techniques Using SAS, Third Edition */

proc format;
    value $education_check
        'primary', 'secondary', 'tertiary', 'unknown' = 'valid'
        'SECONDARY' = 'invalid';
run;

title 'Checking Invalid values of Education';
proc freq data=customer_record1;
    table Education / nocum nopercent missing;
    format Education $education_check.;
run;
title;

/* 4.2. Use the function lowcase on education column.
   use the same dataset name for output dataset. */

data customer_record1;
    set record.customer_all;
    Education = lowcase(Education);
run;

/* 4.3. show the simple frequency table after the change. */

title 'Simple frquency table of variable Education';
proc freq data=customer_record1;
    table Education / nocum nopercent missing;
run;
title;

/* Q5. Examine the variable "marital".
   5.1. Use PROC print with a where statement to check for data errors in the variable marital.
   Consider the valid values as "single", "married", "divorced".
   Refer to program 1.6.
   Chapter 1 - Working with Character Data
   Cody's Data Cleaning Techniques Using SAS, Third Edition */

title 'Table of Invalid values of variable marital ';
proc print data=customer_record1;
    var marital;
    id customer_id;
    where marital not in ('single', 'divorced', 'married');
run;
title;

/* 5.2. Use the function lowcase on the variable marital. */

/* Reading record.customer_all again rebuilds customer_record1 from the raw data,
   so the 4.2 change to Education is not kept (the same applies to the DATA steps in 6.2 and Q8). */
data customer_record1;
    set record.customer_all;
    marital = lowcase(marital);
run;

/* 5.3. show the simple frequency table after the change. */

title 'Simple Frequency Table of variable marital';
proc freq data=customer_record1;
    table marital / nocum nopercent missing;
run;
title;

/* Q6. Examine the variable "Job".
   6.1. Use PROC FREQ to list a simple frequency table. */

title 'Simple Frequency Table of variable Job';
proc freq data=customer_record1;
    table Job / nocum nopercent missing;
run;
title;

/* 6.2. write a code to combine the categories "admin." and "ADMINISTRATION"
   for the job variable as "admin".
   replace any occurrence of the value "ADMINISTRATION" with "admin". */

data customer_record1;
    set record.customer_all;
    if Job in ('admin.', 'ADMINISTRATION') then Job = 'admin';
run;

/* 6.3. show the simple frequency table after the change. */

title 'Simple Frequency Table of variable Job after change';
proc freq data=customer_record1;
    table Job / nocum nopercent missing;
run;
title;

/* Q7. checking missing values
   Adapt the code in program 7.2. of Chapter 1 so it works on customer_all dataset.

   Refer to program 7.2. Counting Missing Values for Character Variables in Chapter 1 - Working with Character Data Cody's Data Cleaning Techniques Using SAS, Third Edition.

   title "Checking Missing Character Values";
   proc format;
       value $Count_Missing ' ' = 'Missing'
                            other = 'Nonmissing';
   run;

   proc freq data=Clean.Patients;
       tables _character_ / nocum missing;
       format _character_ $Count_Missing.;
   run; */

title "Checking Missing Character Values";
proc format;
    value $Character_Count_Missing ' ' = 'Missing'
                                   other = 'Nonmissing';
run;

proc freq data=customer_record1;
    tables _character_ / nocum missing;
    format _character_ $Character_Count_Missing.;
run;
title;

title "Checking Missing Numeric Values";
proc format;
    value Numeric_Count_missing . = 'missing'
                                other = 'nonmissing';
run;

proc freq data=customer_record1;
    tables _numeric_ / nocum missing;
    format _numeric_ Numeric_Count_Missing.;
run;
title;

/* Q8. create a new variable named jobMF to indicate the most frequent job category
   Reuse the code provided in ch17, section 17.3.2.

   check the most frequent job category based on the output of proc freq.
   create the new variable jobMF
   print the first few observations. */

title 'Simple Frequency Table of variable Job';
proc freq data=customer_record1 order=freq;
    table Job / nocum nopercent missing;
run;
title;

/* Abbreviations: MF-MostFrequent and NM-NotMostFrequent */
data customer_record1;
    set record.customer_all;
    if job = 'management' then jobMF = 'MF';
    else jobMF = 'NM';
run;

proc print data=customer_record1 (obs=10);
run;

/* Q9. Removing units from a value and standardizing
   For a reference example, refer to program 1.10 from
   chapter 1: Working with Character Data Cody's Data Cleaning Techniques Using SAS, Third Edition Section: Removing Units from a Value

   Program 1.10: Converting Weight with Units to Weight in Kilograms
   *Program to Remove Units from Numeric Data;

   data Units;
      input Weight $ 10.;
      Digits = compress(Weight,,'kd'); 1
      if findc(Weight,'k','i') then 2
         Wt_Kg = input(Digits,5.);
      else if not missing(Digits) then
         Wt_Kg = input(Digits,5.)/2.2; 3
   datalines;
   100lbs.
   110 Lbs.
   50Kgs.
   70 kg
   180
   ;
   title "Reading Weight Values with Units";
   proc print data=Units noobs;
      format Wt_Kg 5.1;
   run;
*/

data units;
    input Length $ 10.;
datalines;
100m.
110 ft.
50M.
70 Ft
180
;
run;

proc print data=units;
run;

/* Given the following units data, */
/* (a) use the appropriate function to keep only digits. name the new variable "digits" */
/* (b) use the function findc on length to search for the character 'm' (stands for meter),
   if m is found, keep the value as it is,
   if not, make a foot to meter conversion. */

data units;
    input Length $ 10.;

    /* digits is numeric here because of the outer INPUT, so the INPUT calls below
       convert it back to character implicitly (see the NOTE in the log). */
    digits = input(compress(Length, , 'kd'), best32.);

    if findc(Length, 'm', 'i') then Length_m = input(digits, best32.);
    else Length_m = input(digits, best32.) * 0.3048;

datalines;
100m
110 ft
50M.
70 Ft
180
;
run;

proc print data=units;
run;
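Each DATA step in the listing reads record.customer_all again, so customer_record1 is rebuilt from the raw table every time and only the most recent change survives. A minimal sketch of one way to make the Q4 to Q8 changes accumulate is to apply them in a single pass. The output name customer_clean and the explicit LENGTH statement are assumptions for illustration, not part of the submitted program.

```sas
/* Sketch only: customer_clean is a hypothetical data set name. */
data customer_clean;
    set record.customer_all;
    length jobMF $ 2;                       /* MF = most frequent, NM = not most frequent */

    Education = lowcase(Education);         /* Q4.2 */
    marital   = lowcase(marital);           /* Q5.2 */

    /* Q6.2: merge the two spellings of the admin category */
    if Job in ('admin.', 'ADMINISTRATION') then Job = 'admin';

    /* Q8: 'management' is the most frequent job in the PROC FREQ output */
    if Job = 'management' then jobMF = 'MF';
    else jobMF = 'NM';
run;

/* One frequency table per cleaned variable to confirm the result */
proc freq data=customer_clean;
    tables Education marital Job jobMF / nocum nopercent missing;
run;
```

An alternative that stays closer to the assignment wording ("use the same dataset name for output dataset") would be to keep the separate steps and read customer_record1 instead of record.customer_all in each SET statement after the first.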
Log: HIMANI_119717239_Assignment2.sas
Notes (61)

1    OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
NOTE: ODS statements in the SAS Studio environment may disable some output features.
69
70   libname record'/home/u63731080/BAN110';
NOTE: Libref RECORD was successfully assigned as follows:
      Engine: V9
      Physical Name: /home/u63731080/BAN110
71   data customer_record1;
72   set record.customer_all;
73   run;
NOTE: There were 10578 observations read from the data set RECORD.CUSTOMER_ALL.
NOTE: The data set WORK.CUSTOMER_RECORD1 has 10578 observations and 17 variables.
NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 3421.71k
      OS Memory 27048.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 55 Switch Count 2
      Page Faults 0
      Page Reclaims 544
      Page Swaps 0
      Voluntary Context Switches 17
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 2568
74
75
76   /*Q1. Examine the target variable y:
77   Use PROC FREQ to list a simple frequency table for the variable y. */
78
79   title'Simple Frequency Table of target variable y';
80   proc freq data=customer_record1;
81   table y;
82   run;
NOTE: There were 10578 observations read from the data set WORK.CUSTOMER_RECORD1.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.01 seconds
      user cpu time 0.02 seconds
      system cpu time 0.00 seconds
      memory 2937.75k
      OS Memory 25512.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 56 Switch Count 2
      Page Faults 0
      Page Reclaims 374
      Page Swaps 0
      Voluntary Context Switches 13
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 272
83   title;
84
85   /* Q2. Examine the variable "contact" and study its dependency with the target variable y.
86   Use PROC FREQ to list a simple frequency table for the variable "contact".
87   Examine the output for invalid values. */
88
89   title'Simple Frequency Table of variable Contact';
90   proc freq data=customer_record1;
91   table contact;
92   run;
NOTE: There were 10578 observations read from the data set WORK.CUSTOMER_RECORD1.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.01 seconds
      user cpu time 0.01 seconds
      system cpu time 0.00 seconds
      memory 2044.18k
      OS Memory 25768.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 57 Switch Count 2
      Page Faults 0
      Page Reclaims 328
      Page Swaps 0
      Voluntary Context Switches 13
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 264
93   title;
94
95
96   /* Q3. Contiengency table Contact by y and mosaic plot:
97   create a 2x2 contingency table along with a mosaic plot.
98   Show the statistics for Table of contact by y. */
99
100
101  proc freq data=customer_record1;
102  tables contact * y / chisq plots=mosaicplot;
103  run;
NOTE: There were 10578 observations read from the data set WORK.CUSTOMER_RECORD1.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.15 seconds
      user cpu time 0.07 seconds
      system cpu time 0.01 seconds
      memory 10422.31k
      OS Memory 33332.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 58 Switch Count 4
      Page Faults 0
      Page Reclaims 2291
      Page Swaps 0
      Voluntary Context Switches 225
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 1056
104
105
106  /* Interpret:
107  (a) Based on the mosaic plot, do you assume association between the two variables?
108  (b) Based on the Contingency coefficient, is there an association between the two variables?
109  Answer:
110  (a) Based on the mosaic plot, I would assume that there is an association between the two variables.
111  It apperas that customers contacted via cellular were more likely to buy the Certificate of Deposit(CD) from the
111 ! institution.
112  Customers were contacted via telephone were the next most likely and the least likely were customers contacted by unknown
112 ! methods.
113  (b) According to contingency coefficient with a value of 0.2541 there is medium association between the two variables.
114  The closer to 0 the contigency coefficient is, the association is weaker and closer to 1 the contigency coefficient is,
114 ! the association is stronger.
115  */
116
117
118  /* Q4. Examine the variable "education"
119
120  /* 4.1. define a new format, name it education_Check and
121  use it to identify invalid values for the variable education.
122  Valid values are 'primary', 'secondary', 'tertiary', 'unknown'.
123  Refer to program 1.8.
124  Chapter 1 - Working with Character Data
125  Cody's Data Cleaning Techniques Using SAS, Third Edition*/
126
127  Proc format;
128  value $education_check
129  'primary','secondary','tertiary','unknown' ='valid'
130  'SECONDARY' = 'invalid';
NOTE: Format $EDUCATION_CHECK is already on the library WORK.FORMATS.
NOTE: Format $EDUCATION_CHECK has been output.
131  run;
NOTE: PROCEDURE FORMAT used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 249.71k
      OS Memory 30880.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 59 Switch Count 0
      Page Faults 0
      Page Reclaims 14
      Page Swaps 0
      Voluntary Context Switches 0
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 32
132
133  title'Checking Invalid values of Education';
134  proc freq data=customer_record1;
135  table Education/ nocum nopercent missing;
136  format Education $education_check.;
137  run;
NOTE: There were 10578 observations read from the data set WORK.CUSTOMER_RECORD1.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.01 seconds
      user cpu time 0.01 seconds
      system cpu time 0.00 seconds
      memory 2027.78k
      OS Memory 32168.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 60 Switch Count 2
      Page Faults 0
      Page Reclaims 329
      Page Swaps 0
      Voluntary Context Switches 12
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 272
138  title;
139
140  /* 4.2. Use the function lowcase on education column.
141  use the same dataset name for output dataset. */
142
143  data customer_record1;
144  set record.customer_all;
145  Education=lowcase(Education);
146  run;
NOTE: There were 10578 observations read from the data set RECORD.CUSTOMER_ALL.
NOTE: The data set WORK.CUSTOMER_RECORD1 has 10578 observations and 17 variables.
NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 3425.62k
      OS Memory 33704.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 61 Switch Count 2
      Page Faults 0
      Page Reclaims 521
      Page Swaps 0
      Voluntary Context Switches 16
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 2568
147
148
149  /*4.3. show the simple frequency table after the change. */
150
151  title'Simple frquency table of variable Education';
152  proc freq data=customer_record1;
153  table Education/ nocum nopercent missing;
154  run;
NOTE: There were 10578 observations read from the data set WORK.CUSTOMER_RECORD1.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.01 seconds
      user cpu time 0.01 seconds
      system cpu time 0.00 seconds
      memory 2025.50k
      OS Memory 32168.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 62 Switch Count 2
      Page Faults 0
      Page Reclaims 317
      Page Swaps 0
      Voluntary Context Switches 16
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 264
155  title;
156
157  /* Q5. Examine the variable "marital".
158  5.1. Use PROC print with a where statement to check for data errors in the variable marital.
159  Consider the valid values as "single", "married", "divorced".
160  Refer to program 1.6.
161  Chapter 1 - Working with Character Data
162  Cody's Data Cleaning Techniques Using SAS, Third Edition */
163
164  title'Table of Invalid values of variable marital ';
165  proc print data=customer_record1;
166  var marital;
167  id customer_id;
168  where marital not in ('single','divorced','married' );
169  run;
NOTE: There were 58 observations read from the data set WORK.CUSTOMER_RECORD1.
      WHERE marital not in ('divorced', 'married', 'single');
NOTE: PROCEDURE PRINT used (Total process time):
      real time 0.02 seconds
      user cpu time 0.02 seconds
      system cpu time 0.00 seconds
      memory 2072.78k
      OS Memory 32168.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 63 Switch Count 1
      Page Faults 0
      Page Reclaims 291
      Page Swaps 0
      Voluntary Context Switches 6
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 24
170  title;
171
172  /* 5.2. Use the function lowcase on the variable marital. */
173
174  data customer_record1;
175  set record.customer_all;
176  marital=lowcase(marital);
177  run;
NOTE: There were 10578 observations read from the data set RECORD.CUSTOMER_ALL.
NOTE: The data set WORK.CUSTOMER_RECORD1 has 10578 observations and 17 variables.
NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      user cpu time 0.01 seconds
      system cpu time 0.01 seconds
      memory 3425.21k
      OS Memory 33704.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 64 Switch Count 2
      Page Faults 0
      Page Reclaims 506
      Page Swaps 0
      Voluntary Context Switches 17
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 2568
178
179
180  /* 5.3. show the simple frequency table after the change. */
181
182  title'Simple Frequency Table of variable marital';
183  proc freq data=customer_record1;
184  table marital/ nocum nopercent missing;
185  run;
NOTE: There were 10578 observations read from the data set WORK.CUSTOMER_RECORD1.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.01 seconds
      user cpu time 0.01 seconds
      system cpu time 0.00 seconds
      memory 2024.71k
      OS Memory 32168.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 65 Switch Count 2
      Page Faults 0
      Page Reclaims 321
      Page Swaps 0
      Voluntary Context Switches 12
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 264
186  title;
187
188  /* Q6. Examine the variable "Job".
189  6.1. Use PROC FREQ to list a simple frequency table. */
190
191  title'Simple Frequency Table of variable Job';
192  proc freq data=customer_record1;
193  table Job/ nocum nopercent missing;
194  run;
NOTE: There were 10578 observations read from the data set WORK.CUSTOMER_RECORD1.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.01 seconds
      user cpu time 0.02 seconds
      system cpu time 0.00 seconds
      memory 2025.75k
      OS Memory 32168.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 66 Switch Count 2
      Page Faults 0
      Page Reclaims 313
      Page Swaps 0
      Voluntary Context Switches 13
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 264
195  title;
196
197  /* 6.2. write a code to combine the categories "admin." and "ADMINISTRATION"
198  for the job variable as "admin".
199  replace any occurrence of the value "ADMINISTRATION" with "admin". */
200
201  data customer_record1;
202  set record.customer_all;
203  if Job in ('admin.','ADMINISTRATION') then Job = 'admin';
204  run;
NOTE: There were 10578 observations read from the data set RECORD.CUSTOMER_ALL.
NOTE: The data set WORK.CUSTOMER_RECORD1 has 10578 observations and 17 variables.
NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 3424.18k
      OS Memory 33704.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 67 Switch Count 2
      Page Faults 0
      Page Reclaims 506
      Page Swaps 0
      Voluntary Context Switches 17
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 2568
205
206
207  /* 6.3. show the simple frequency table after the change. */
208
209  title'Simple Frequency Table of variable Job after change';
210  proc freq data=customer_record1;
211  table Job/ nocum nopercent missing;
212  run;
NOTE: There were 10578 observations read from the data set WORK.CUSTOMER_RECORD1.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.01 seconds
      user cpu time 0.02 seconds
      system cpu time 0.00 seconds
      memory 2024.50k
      OS Memory 32168.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 68 Switch Count 2
      Page Faults 0
      Page Reclaims 318
      Page Swaps 0
      Voluntary Context Switches 13
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 264
213  title;
214
215  /* Q7. checking missing values
216  Adapt the code in program 7.2. of Chapter 1 so it works on customer_all dataset.
217
218  Refer to program 7.2. Counting Missing Values for Character Variables in Chapter 1 - Working with Character Data Cody's
218 ! Data Cleaning Techniques Using SAS, Third Edition.
219
220  title "Checking Missing Character Values";
221  proc format;
222  value $Count_Missing ' ' = 'Missing'
223  other = 'Nonmissing';
224  run;
225
226  proc freq data=Clean.Patients;
227  tables _character_ / nocum missing;
228  format _character_ $Count_Missing.;
229  run; */
230
231
232  title "Checking Missing Character Values";
233  proc format;
234  value $Character_Count_Missing ' ' = 'Missing'
235  other = 'Nonmissing';
NOTE: Format $CHARACTER_COUNT_MISSING is already on the library WORK.FORMATS.
NOTE: Format $CHARACTER_COUNT_MISSING has been output.
236  run;
NOTE: PROCEDURE FORMAT used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 247.15k
      OS Memory 30880.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 69 Switch Count 0
      Page Faults 0
      Page Reclaims 14
      Page Swaps 0
      Voluntary Context Switches 0
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 8
237
238  proc freq data=customer_record1;
239  tables _character_ / nocum missing;
240  format _character_ $Character_Count_Missing.;
241  run;
NOTE: There were 10578 observations read from the data set WORK.CUSTOMER_RECORD1.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.04 seconds
      user cpu time 0.05 seconds
      system cpu time 0.00 seconds
      memory 2320.03k
      OS Memory 32168.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 70 Switch Count 2
      Page Faults 0
      Page Reclaims 338
      Page Swaps 0
      Voluntary Context Switches 12
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 280
242  title;
243
244  title "Checking Missing Numeric Values";
245  proc format;
246  value Numeric_Count_missing .='missing'
247  other= 'nonmissing';
NOTE: Format NUMERIC_COUNT_MISSING is already on the library WORK.FORMATS.
NOTE: Format NUMERIC_COUNT_MISSING has been output.
248  run;
NOTE: PROCEDURE FORMAT used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 247.15k
      OS Memory 30880.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 71 Switch Count 0
      Page Faults 0
      Page Reclaims 16
      Page Swaps 0
      Voluntary Context Switches 1
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 0
249
250  proc freq data=customer_record1;
251  tables _numeric_ / nocum missing;
252  format _numeric_ Numeric_Count_Missing.;
253  run;
NOTE: There were 10578 observations read from the data set WORK.CUSTOMER_RECORD1.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.03 seconds
      user cpu time 0.04 seconds
      system cpu time 0.00 seconds
      memory 2229.46k
      OS Memory 32168.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 72 Switch Count 2
      Page Faults 0
      Page Reclaims 328
      Page Swaps 0
      Voluntary Context Switches 15
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 280
254  title;
255
256  /* Q8. create a new variable named jobMF to indicate the most frequent job category
257  Reuse the code provided in ch17, section 17.3.2.
258
259  check the most frequent job category based on the output of proc freq.
260  create the new variable jobMF
261  print the first few observations. */
262
263  title'Simple Frequency Table of variable Job';
264  proc freq data=customer_record1 order=freq;
265  table Job/ nocum nopercent missing;
266  run;
NOTE: There were 10578 observations read from the data set WORK.CUSTOMER_RECORD1.
NOTE: PROCEDURE FREQ used (Total process time):
      real time 0.01 seconds
      user cpu time 0.01 seconds
      system cpu time 0.00 seconds
      memory 2024.50k
      OS Memory 32168.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 73 Switch Count 2
      Page Faults 0
      Page Reclaims 318
      Page Swaps 0
      Voluntary Context Switches 14
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 264
267  title;
268
269
270  /* Abbreviations: MF-MostFrequent and NM-NotMostFrequent */
271  data customer_record1;
272  set record.customer_all;
273  if job='management' then jobMF= 'MF';
274  else jobMF= 'NM';
275  run;
NOTE: There were 10578 observations read from the data set RECORD.CUSTOMER_ALL.
NOTE: The data set WORK.CUSTOMER_RECORD1 has 10578 observations and 18 variables.
NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.01 seconds
      memory 3426.03k
      OS Memory 33704.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 74 Switch Count 2
      Page Faults 0
      Page Reclaims 509
      Page Swaps 0
      Voluntary Context Switches 15
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 2568
276
277
278  proc print data=customer_record1 (obs=10);
279  run;
NOTE: There were 10 observations read from the data set WORK.CUSTOMER_RECORD1.
NOTE: PROCEDURE PRINT used (Total process time):
      real time 0.02 seconds
      user cpu time 0.02 seconds
      system cpu time 0.00 seconds
      memory 1946.40k
      OS Memory 31908.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 75 Switch Count 0
      Page Faults 0
      Page Reclaims 262
      Page Swaps 0
      Voluntary Context Switches 0
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 16
280
281
282  /* Q9. Removing units from a value and standardizing
283  For a reference example, refer to program 1.10 from
284  chapter 1: Working with Character Data Cody's Data Cleaning Techniques Using SAS, Third Edition Section: Removing Units
284 ! from a Value
285
286  Program 1.10: Converting Weight with Units to Weight in Kilograms
287  *Program to Remove Units from Numeric Data;
288
289  data Units;
290  input Weight $ 10.;
291  Digits = compress(Weight,,'kd'); 1
292  if findc(Weight,'k','i') then 2
293  Wt_Kg = input(Digits,5.);
294  else if not missing(Digits) then
295  Wt_Kg = input(Digits,5.)/2.2; 3
296  datalines;
297  100lbs.
298  110 Lbs.
299  50Kgs.
300  70 kg
301  180
302  ;
303  title "Reading Weight Values with Units";
304  proc print data=Units noobs;
305  format Wt_Kg 5.1;
306  run;
307  */
308
309  data units;
310  input Length $ 10. ;
311
312  datalines;
NOTE: The data set WORK.UNITS has 5 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 667.56k
      OS Memory 31140.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 76 Switch Count 2
      Page Faults 0
      Page Reclaims 92
      Page Swaps 0
      Voluntary Context Switches 17
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 264
318  ;
319  run;
320
321  proc print data=units;
322  run;
NOTE: There were 5 observations read from the data set WORK.UNITS.
NOTE: PROCEDURE PRINT used (Total process time):
      real time 0.00 seconds
      user cpu time 0.01 seconds
      system cpu time 0.00 seconds
      memory 604.40k
      OS Memory 31140.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 77 Switch Count 0
      Page Faults 0
      Page Reclaims 64
      Page Swaps 0
      Voluntary Context Switches 0
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 0
323
324
325  /* Given the following units data, */
326  /*(a) use the approriate function to keep only digits. name the new variable "digits" */
327  /* (b) use the function findc on length to search for the character 'm' (stands for meter),
328  if m is found, keep the value as it is,
329  if not, make a foot to meter conversion. */
330
331  data units;
332  input Length $ 10.;
333
334  digits = input(compress(Length, ,'kd'), best32.);
335
336  if findc(Length, 'm', 'i') then Length_m = input(digits, best32.);
337  else Length_m = input(digits, best32.)*0.3048;
338
339  datalines;
NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
      336:52 337:25
NOTE: The data set WORK.UNITS has 5 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 681.46k
      OS Memory 31140.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 78 Switch Count 2
      Page Faults 0
      Page Reclaims 95
      Page Swaps 0
      Voluntary Context Switches 14
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 264
345  ;
346  run;
347
348  proc print data=units; run;
NOTE: There were 5 observations read from the data set WORK.UNITS.
NOTE: PROCEDURE PRINT used (Total process time):
      real time 0.00 seconds
      user cpu time 0.00 seconds
      system cpu time 0.00 seconds
      memory 606.50k
      OS Memory 31140.00k
      Timestamp 21/02/2024 07:31:51 PM
      Step Count 79 Switch Count 0
      Page Faults 0
      Page Reclaims 67
      Page Swaps 0
      Voluntary Context Switches 0
      Involuntary Context Switches 0
      Block Input Operations 0
      Block Output Operations 16
349
350  OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
360
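The NOTE at 336:52 and 337:25 ("Numeric values have been converted to character values") appears because digits is created as a numeric variable by the outer INPUT on line 334 and is then passed to INPUT again, which expects a character argument. A minimal sketch of one way to avoid the implicit conversion is to keep digits as character and convert it once, following the pattern of Program 1.10. The data set name units_clean, the LENGTH statement, and the 10. informat are assumptions for illustration; like the submitted code, the sketch treats a value with no unit as feet.

```sas
/* Sketch only: units_clean is a hypothetical data set name. */
data units_clean;
    input Length $10.;
    length digits $10;

    /* (a) keep only the digit characters; digits stays character */
    digits = compress(Length, , 'kd');

    /* (b) values containing 'm' (any case) are already in meters;   */
    /*     everything else is assumed to be in feet and is converted */
    if findc(Length, 'm', 'i') then Length_m = input(digits, 10.);
    else if not missing(digits) then Length_m = input(digits, 10.) * 0.3048;
    datalines;
100m
110 ft
50M.
70 Ft
180
;

proc print data=units_clean noobs;
    format Length_m 8.2;
run;
```

With these five input rows the arithmetic gives 100, 33.528, 50, 21.336 and 54.864 meters before formatting, the same values the submitted step produces, but without the conversion note.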
21/02/2024, 14:31 Program Summary - HIMANI_119717239_Assignment2.sas about:blank 11/14 Results: HIMANI_119717239_Assignment2.sas Simple Frequency Table of target variable y The FREQ Procedure y y Frequency Percent Cumulative Frequency Cumulative Percent no 5289 50.00 5289 50.00 yes 5289 50.00 10578 100.00 Simple Frequency Table of variable Contact The FREQ Procedure contact contact Frequency Percent Cumulative Frequency Cumulative Percent cellular 7682 72.62 7682 72.62 telephone 712 6.73 8394 79.35 unknown 2184 20.65 10578 100.00 The FREQ Procedure Frequency Percent Row Pct Col Pct Table of contact by y contact(contact) y(y) no yes Total cellular 3313 31.32 43.13 62.64 4369 41.30 56.87 82.61 7682 72.62 telephone 322 3.04 45.22 6.09 390 3.69 54.78 7.37 712 6.73 unknown 1654 15.64 75.73 31.27 530 5.01 24.27 10.02 2184 20.65 Total 5289 50.00 5289 50.00 10578 100.00 Statistics for Table of contact by y Statistic DF Value Prob Chi-Square 2 730.1254 <.0001 Likelihood Ratio Chi-Square 2 759.2990 <.0001 Mantel-Haenszel Chi-Square 1 678.0393 <.0001 Phi Coefficient 0.2627 Contingency Coefficient 0.2541 Cramer's V 0.2627 Sample Size = 10578 Checking Invalid values of Education The FREQ Procedure Education Education Frequency invalid 267 valid 10311 Simple frquency table of variable Education The FREQ Procedure Education Education Frequency primary 1440 secondary 5204 tertiary 3470 unknown 464 Table of Invalid values of variable marital
customer_id   marital
100712        DIVORCED
101546        DIVORCED
102342        DIVORCED
105806        DIVORCED
106060        DIVORCED
106274        DIVORCED
106281        DIVORCED
106425        DIVORCED
107292        DIVORCED
108050        DIVORCED
109071        DIVORCED
109682        DIVORCED
110184        DIVORCED
111996        DIVORCED
112298        DIVORCED
113003        DIVORCED
114990        DIVORCED
115288        DIVORCED
116020        DIVORCED
116168        DIVORCED
118106        DIVORCED
118244        DIVORCED
119011        DIVORCED
119350        DIVORCED
120481        DIVORCED
121491        DIVORCED
121824        DIVORCED
123764        DIVORCED
124067        DIVORCED
124734        DIVORCED
127101        DIVORCED
127897        DIVORCED
127993        DIVORCED
128420        DIVORCED
129317        DIVORCED
129350        DIVORCED
130560        DIVORCED
131231        DIVORCED
131891        DIVORCED
132398        DIVORCED
133929        DIVORCED
135060        DIVORCED
135280        DIVORCED
137607        DIVORCED
137735        DIVORCED
137912        DIVORCED
140911        DIVORCED
142844        DIVORCED
142905        DIVORCED
142972        DIVORCED
143259        DIVORCED
143559        DIVORCED
143772        DIVORCED
143915        DIVORCED
144016        DIVORCED
144114        DIVORCED
144668        DIVORCED
144830        DIVORCED

Simple Frequency Table of variable marital
The FREQ Procedure

marital     Frequency
divorced         1243
married          5942
single           3393

Simple Frequency Table of variable Job
The FREQ Procedure

JOB              Frequency
ADMINISTRATION          51
admin.                1134
blue-collar           1914
entrepreneur           291
housemaid              262
management            2391
retired                757
self-employed          367
services               850
student                375
technician            1768
unemployed             353
unknown                 65

Simple Frequency Table of variable Job after change
The FREQ Procedure

JOB              Frequency
admin                 1185
blue-collar           1914
entrepreneur           291
housemaid              262
management            2391
retired                757
self-employed          367
services               850
student                375
technician            1768
unemployed             353
unknown                 65

Checking Missing Character Values
The FREQ Procedure

Variable    Value        Frequency   Percent
contact     Nonmissing       10578    100.00
month       Nonmissing       10578    100.00
poutcome    Nonmissing       10578    100.00
y           Nonmissing       10578    100.00
default     Nonmissing       10578    100.00
housing     Nonmissing       10578    100.00
loan        Nonmissing       10578    100.00
Education   Nonmissing       10578    100.00
marital     Nonmissing       10578    100.00
JOB         Nonmissing       10578    100.00

Checking Missing Numeric Values
The FREQ Procedure

Variable      Value        Frequency   Percent
day           nonmissing       10578    100.00
campaign      nonmissing       10578    100.00
pdays         nonmissing       10578    100.00
previous      nonmissing       10578    100.00
customer_id   nonmissing       10578    100.00
balance       nonmissing       10578    100.00
AGE           missing             20      0.19
AGE           nonmissing       10558     99.81

Simple Frequency Table of variable Job
The FREQ Procedure

JOB              Frequency
management            2391
blue-collar           1914
technician            1768
admin                 1185
services               850
retired                757
student                375
self-employed          367
unemployed             353
entrepreneur           291
housemaid              262
unknown                 65

Obs  contact  day  month  campaign  pdays  previous  poutcome  y   customer_id  default  balance  housing  loan  Education  AGE  marital   JOB           jobMF
  1  unknown    5  may           1     -1         0  unknown   no       100103  no             2  yes      yes   secondary   33  married   entrepreneur  NM
  2  unknown    5  may           1     -1         0  unknown   no       100106  no           231  yes      no    tertiary    35  married   management    MF
  3  unknown    5  may           1     -1         0  unknown   no       100118  no            52  yes      no    primary     57  married   blue-collar   NM
  4  unknown    5  may           1     -1         0  unknown   no       100119  no            60  yes      no    primary     60  married   retired       NM
  5  unknown    5  may           1     -1         0  unknown   no       100121  no           723  yes      yes   secondary   28  married   blue-collar   NM
  6  unknown    5  may           1     -1         0  unknown   no       100126  no          -372  yes      no    secondary   44  married   admin.        NM
  7  unknown    5  may           1     -1         0  unknown   no       100130  no           265  yes      yes   secondary   36  single    technician    NM
  8  unknown    5  may           1     -1         0  unknown   no       100141  no          2586  yes      no    secondary   44  divorced  services      NM
  9  unknown    5  may           1     -1         0  unknown   no       100161  no             0  yes      no    tertiary    32  married   admin.        NM
 10  unknown    5  may           1     -1         0  unknown   no       100168  no            59  yes      no    tertiary    59  divorced  management    MF

Obs  Length
  1  100m.
  2  110 ft.
  3  50M.
  4  70 Ft
  5  180

Obs  Length  digits  Length_m
  1  100m       100   100.000
  2  110 ft     110    33.528
  3  50M.        50    50.000
  4  70 Ft       70    21.336
  5  180        180    54.864
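
The final listing shows a Length_m column that appears to hold each length in metres. The values are consistent with multiplying by 0.3048 (feet to metres) whenever the raw value carries no "m" marker, including the unitless 180. The DATA step that produced WORK.UNITS is not part of this excerpt, so the following is only a minimal sketch of one way such a conversion could be written. The data set name units_sketch, the $char10. informat, and the rule "no m means feet" are assumptions inferred from the listing, not the assignment's actual code.

```sas
/* Hypothetical reconstruction: not the author's original DATA step. */
data units_sketch;
    input Length $char10.;

    /* Keep only the digit characters, then read them as a number. */
    digits = input(compress(Length, , 'kd'), 8.);

    /* Assumption: a value containing m or M is already in metres;
       anything else (ft, Ft, or no unit) is treated as feet. */
    if findc(Length, 'm', 'i') > 0 then Length_m = digits;
    else Length_m = digits * 0.3048;

    format Length_m 8.3;
    datalines;
100m.
110 ft.
50M.
70 Ft
180
;
run;

proc print data=units_sketch;
run;
```

Checking the rule by hand against the listing: 110 × 0.3048 = 33.528, 70 × 0.3048 = 21.336, and 180 × 0.3048 = 54.864, while the two metre values pass through unchanged.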