Exercise_3 capstone
School: Seneca College
Course: CAPSTONE
Date: Dec 6, 2023
Toronto Police Traffic Case Study - Week 3
By: Arjun - 101391954
Dolly Nair - 101490446
Priyansh Bhardwaj - 101455632
Jay Sehgal - 101453476
Pulkit Patwari - 101401006
1) Sample Data:
The dataset has 59 columns, so a sample cannot show all of them at once. The columns are listed below with their descriptions.
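A wide sample can still be inspected in full. The following is a minimal sketch, assuming the data is loaded into a pandas DataFrame; the frame built here is a hypothetical stand-in with 59 placeholder columns, not the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the real 59-column dataset.
df = pd.DataFrame({f"COL_{i}": range(3) for i in range(59)})

# Option 1: transpose a few rows so that every column becomes a row.
sample = df.head(3).T
print(sample.shape)

# Option 2: lift the display limits for this block only.
with pd.option_context("display.max_columns", None, "display.width", None):
    print(df.head(3))
```

Transposing is usually easier to read in a report, because 59 short rows fit on a page where 59 columns do not.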
2) Metadata of dataset:
Column: Description
'ACCNUM': Accident number.
'YEAR': Year in which the accident occurred.
'DATE': Date on which the accident occurred. The field also contains a time.
'TIME': Time at which the accident occurred.
'Hour': Hour of the accident, in 24-hour format.
'STREET1': One of the street names.
'STREET2': Second nearest street.
'Intersection': Name of the intersection, if the accident occurred at one. Not all accidents occurred at an intersection, but every record has two street names.
'OFFSET': Precise location as a distance and direction, e.g. '20 m North'.
'ROAD_CLASS': Type of road, e.g. Local, Major Arterial.
'District': District to which the location belongs.
'WardNum': Ward number of the location.
'WardNum_X': If WardNum is '03,04', then WardNum_X = 04.
'WardNum_Y': If WardNum is '03,04', then WardNum_Y = 03.
'Division': Division of the location.
'Division_X': If Division is '22,11', then Division_X = 11.
'Division_Y': If Division is '22,11', then Division_Y = 22.
'LATITUDE': Position of the accident location north or south on the Earth's surface, measured in degrees from the equator.
'LONGITUDE': Position of the accident location east or west of the prime meridian.
'LOCCOORD': Location coordinates, e.g. 'Intersection', 'Mid Block', 'Park/Private'.
'ACCLOC': Precise accident location, e.g. 'On Intersection', 'Parking Lot'.
'TRAFFCTL': Traffic sign or signal present at the location, e.g. 'Stop sign'.
'VISIBILITY': Visibility at the time of the accident, e.g. 'Clear', 'Fog'.
'LIGHT': Type of light available at the time, e.g. 'Daylight', 'Dark'.
'RDSFCOND': Road surface condition at the time of the accident, e.g. 'Dry', 'Wet'.
'ACCLASS': Accident classification or severity, e.g. 'Non-Fatal' or 'Fatal'.
'IMPACTYPE': Impact or collision type involved in the accident, e.g. 'Angle', 'Turning Movement'.
'INVTYPE': Involvement type or role of the individual involved in the accident, e.g. 'Driver', 'Passenger'.
'INVAGE': Age range of the injured person.
'INJURY': Type of injury, e.g. 'Major', 'Fatal'.
'FATAL_NO': Number of fatalities in the accident.
'INITDIR': Initial direction or orientation of the vehicles or objects involved in the accident.
'VEHTYPE': Type of vehicle involved, e.g. 'Station Wagon', 'Truck'.
'MANOEUVER': Maneuver or action performed by the vehicle.
'DRIVACT': Driving action or behavior of the individuals involved in the accident.
'DRIVCOND': Condition of the driver, e.g. 'Normal', 'Had been drinking'.
'PEDTYPE': Type of pedestrian involved in the accident.
'PEDACT': Action or behavior of the pedestrian involved in the accident.
'PEDCOND': Condition of the pedestrian, e.g. 'Normal', 'Inattentive'.
'CYCLISTYPE': Type of cyclist, e.g. 'C', 'I', 'M'.
'PEDESTRIAN': Whether a pedestrian was involved in the accident. 'Yes' indicates a pedestrian was involved; an empty cell indicates no pedestrian involvement.
'CYCLIST': Whether a cyclist was involved in the accident.
'AUTOMOBILE': Whether an automobile was involved in the accident.
'MOTORCYCLE': Whether a motorcycle was involved in the accident.
'TRUCK': Whether a truck was involved in the accident.
'TRSN_CITY_': Represents some transportation-related information *
'EMERG_VEH': Whether an emergency vehicle was involved in the accident. 'Yes' indicates an emergency vehicle was involved.
'PASSENGER': Whether a passenger was involved in the accident.
'SPEEDING': Whether speeding was a factor in the accident.
'AG_DRIV': Refers to aggressive driving behavior.
'REDLIGHT': Whether running a red light was a factor in the accident.
'ALCOHOL': Whether alcohol was involved in the accident.
'DISABILITY': Whether a disability was a factor in the accident.
'Hood_ID': ID or code of the neighborhood area.
'Neighbour': Name or label of the neighborhood.
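The WardNum_X / WardNum_Y and Division_X / Division_Y rules above can be sketched as follows. In both examples the _X column takes the last listed value and the _Y column the first. The sketch assumes pandas; the helper name `split_pair` is hypothetical. How single values (such as '10') and three-value entries are handled is an assumption, since the metadata only describes the two-value case.

```python
import pandas as pd

# Tiny illustrative frame; the values mirror the examples in the metadata.
df = pd.DataFrame({"WardNum": ["03,04", "10"], "Division": ["22,11", "42"]})

def split_pair(series):
    """Return (_X, _Y): the last and the first comma-separated value."""
    parts = series.astype(str).str.split(",")
    first = parts.str[0].str.strip()
    last = parts.str[-1].str.strip()
    # Assumption: a single value is copied into both _X and _Y.
    return last, first

df["WardNum_X"], df["WardNum_Y"] = split_pair(df["WardNum"])
df["Division_X"], df["Division_Y"] = split_pair(df["Division"])
print(df)
```

Keeping the values as strings preserves leading zeros such as '03' and '04'.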
3) Frequency Distribution:
Most fields in our dataset hold unique values or many distinct numeric values, so a frequency distribution of them would not yield insights. We therefore consider only the fields for which a frequency distribution is meaningful.
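Distributions of this kind can be produced with a short loop. This is a minimal sketch, assuming pandas (the "dtype: int64" footers in the output suggest it was used); the small frame and the `columns_of_interest` list are illustrative placeholders, not our actual selection.

```python
import pandas as pd

# Illustrative rows only; the real dataset has 59 columns.
df = pd.DataFrame({
    "ACCNUM": [1, 2, 3],
    "YEAR": [2012, 2012, 2009],
    "ACCLASS": ["Non-Fatal Injury", "Fatal", "Non-Fatal Injury"],
})

# Hypothetical selection of columns worth tabulating.
columns_of_interest = ["YEAR", "ACCLASS"]

freq = {}
for col in columns_of_interest:
    # value_counts sorts by count (descending) and skips missing values.
    freq[col] = df[col].value_counts()
    print(f"Frequency Distribution for Column: {col}")
    print(freq[col])
    print("---")
```

For columns with many distinct values, pandas abbreviates the printed series with "..", which is why only the head and tail appear for fields such as STREET1.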
Frequency Distribution for Column: YEAR
2012    453
2009    438
2013    429
2008    417
2011    400
2010    400
2017    393
2018    389
2016    388
2015    350
2014    348
Frequency Distribution for Column: Hour
18    301
17    299
15    264
16    254
14    251
19    248
20    234
21    215
13    210
12    200
10    194
11    193
9     186
8     174
22    171
7     155
6     150
23    148
0     133
2     111
1     105
3      94
5      75
4      40
The STREET1 field has 1,401 distinct values, too many to list individually, so only a summary is shown below.
Frequency Distribution for Column: STREET1
YONGE ST              80
BATHURST ST           78
DUNDAS ST W           77
EGLINTON AVE E        71
FINCH AVE W           63
..
MILVERTON BLVD         1
322 THE WESTWAY        1
RICHVIEW Road          1
SAMMON AVE             1
PICKERING TOWN LIN     1
Name: STREET1, Length: 1401, dtype: int64
Frequency Distribution for Column: STREET2
BATHURST ST         39
LAWRENCE AVE E      36
FINCH AVE E         28
YONGE ST            27
EGLINTON AVE E      27
..
SUMMITCREST DR       1
AILEEN AVE           1
SUMMITCREST Driv     1
ROSECLIFFE AVE       1
GORDON MURISON L     1
Name: STREET2, Length: 2091, dtype: int64
Frequency Distribution for Column: Intersection
ROSEDALE VALLEY RD,BAYVIEW AVE        3
LAKE SHORE BLVD W,ELLIS AVE           2
KING ST W,BRANT ST                    2
DUNDAS ST W,STERLING RD               2
DUFFERIN ST,ST CLAIR AVE W            2
..
DUFFERIN ST,LAPPIN AV                 1
DUPONT ST,LANSDOWNE AVE               1
QUEEN ST E,EASTERN AVE                1
QUEEN ST E,KINGSTON RD                1
STEELES AVE E,PICKERING TOWN LINE     1
Name: Intersection, Length: 282, dtype: int64
Frequency Distribution for Column: OFFSET
5 m South of     13
100 m North o    11
1 m North of     11
10 m West of     11
1 m West of      10
..
66 m North of     1
75 m East         1
220 m South o     1
240 m North o     1
84 m West of      1
Name: OFFSET, Length: 336, dtype: int64
Frequency Distribution for Column: ROAD_CLASS
Major Arterial    2922
Minor Arterial     727
Collector          280
Expressway         236
Local              219
Other                8
Pending              3
Laneway              3
Name: ROAD_CLASS, dtype: int64
Frequency Distribution for Column: District
Toronto and East York    1656
Scarborough              1014
Etobicoke York            942
North York                785
Toronto East York           1
Name: District, dtype: int64
Frequency Distribution for Column: WardNum
10       251
1        211
3        210
11       210
5        207
...
04,09      1
13,14      1
16,19      1
05,04      1
01,07      1
Frequency Distribution for Column: Division
42          414
32          314
22          293
14          275
23          271
41          261
31          242
43          242
53          213
51          206
12          204
11          187
52          185
13          180
33          176
55          162
54          129
54,55        55
14,52        41
51,52        37
11,14        34
12,11        27
33,41        26
41,43        23
33,32        23
23,22        23
13,53        22
54,41        18
33,42        16
32,13        16
32,53        13
51,53        12
33,54        11
14,53         9
33,54,41      6
51,55         5
14,53,52      5
53,52         5
32,13,53      4
55,41         3
13,14         3
00,51         2
23,31         1
42,41,43      1
33,42,41      1
11,13         1
33,32,53      1
D14UE         1
00,52         1
22,11         1
42,43         1
Name: Division, dtype: int64
Frequency Distribution for Column: VISIBILITY
Clear    3752
Rain      492
Snow       86
Other      34
Freez      13
Fog,       12
Drift       6
Stron       1
Name: VISIBILITY, dtype: int64
Frequency Distribution for Column: LIGHT
Daylight              2555
Dark, artificial       842
Dark                   790
Dusk                    65
Dusk, artificial        52
Daylight, artifici      40
Dawn, artificial        33
Dawn                    27
Other                    1
Name: LIGHT, dtype: int64
Frequency Distribution for Column: RDSFCOND
Dry    3478
Wet     783
Oth      44
Loo      39
Slu      26
Ice      13
Pac      11
Spi       1
Frequency Distribution for Column: ACCLASS
Non-Fatal Injury    3805
Fatal                600
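As a quick consistency check, the two ACCLASS counts can be totaled and turned into percentage shares. The sketch below uses plain Python with the counts copied from the distribution above; the variable names are illustrative.

```python
# Counts copied from the ACCLASS frequency distribution above.
acclass_counts = {"Non-Fatal Injury": 3805, "Fatal": 600}

# Total number of classified records.
total = sum(acclass_counts.values())

# Percentage share of each class, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in acclass_counts.items()}

print(total)
print(shares)
```

The total is 4,405, which matches the number of records remaining after cleaning, and fatal accidents make up about 13.6% of them.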
Frequency Distribution for Column: IMPACTYPE
Pedestrian Collisions     1976
Turning Movement           570
Cyclist Collisions         538
SMV Other                  432
Rear End                   310
Angle                      212
Approaching                159
Sideswipe                  104
Other                       58
SMV Unattended Vehicle      46
Frequency Distribution for Column: INVTYPE
Driver                  2603
Vehicle Owner            633
Passenger                364
Pedestrian               287
Motorcycle Driver        255
Cyclist                  109
Truck Driver              87
Other                     32
Motorcycle Passenger      11
Moped Driver              10
Other Property Owner       4
Driver - Not Hit           4
Runaway - No Driver        3
In-Line Skater             1
Wheelchair                 1
Frequency Distribution for Column: INJURY
None       2067
Major      1110
Fatal       231
Minor       190
Minimal     166
Frequency Distribution for Column: INITDIR
South      857
East       842
West       828
North      764
Unknown     57
Frequency Distribution for Column: VEHTYPE
Automobile, Station Wagon      2379
Other                           984
Motorcycle                      255
Bicycle                         108
Pick Up Truck                    64
Passenger Van                    51
Municipal Transit Bus (TTC)      40
Truck - Open                     35
Delivery Van                     26
Truck - Closed (Blazer, etc      22
Truck - Dump                     12
Street Car                       11
Truck-Tractor                    11
Moped                             8
Truck (other)                     5
Bus (Other) (Go Bus, Gray C       5
Taxi                              4
Truck - Tank                      4
Fire Vehicle                      2
Intercity Bus                     2
Tow Truck                         2
Construction Equipment            2
Police Vehicle                    1
School Bus                        1
Off Road - 2 Wheels               1
Frequency Distribution for Column: MANOEUVER
Going Ahead       1757
Turning Left       698
Turning Right      182
Changing Lanes      82
Stopped             71
Slowing or Sto      62
Reversing           56
Parked              53
Other               36
Unknown             36
Overtaking          34
Making U Turn       31
Pulling Away f      13
Pulling Onto S       7
Merging              6
Disabled             1
Frequency Distribution for Column: DRIVACT
Driving Properly                994
Failed to Yield Right of Way    700
Lost control                    352
Improper Turn                   216
Other                           187
Disobeyed Traffic Control       144
Following too Close              88
Exceeding Speed Limit            78
Speed too Fast For Condition     75
Improper Lane Change             48
Improper Passing                 39
Wrong Way on One Way Road         2
Speed too Slow                    2
Frequency Distribution for Column: DRIVCOND
Normal               1607
Inattentive           655
Unknown               406
Ability Impaired,      86
Medical or Physic      72
Had Been Drinking      56
Fatigue                23
Other                  20
Frequency Distribution for Column: PEDTYPE
Pedestrian hit at mid-block    81
Vehicle turns left while ped crosses with ROW at inter.    48
Vehicle is going straight thru inter.while ped cross without ROW    42
Pedestrian involved in a collision with transit vehicle anywhere along roadway    24
Vehicle turns right while ped crosses with ROW at inter.    19
Vehicle is going straight thru inter.while ped cross with ROW    15
Vehicle is reversing and hits pedestrian    13
Pedestrian hit on sidewalk or shoulder    12
Other / Undefined    8
Pedestrian hit a PXO/ped. Mid-block signal    6
Pedestrian hit at private driveway    4
Vehicle turns left while ped crosses without ROW at inter.    4
Unknown    2
Vehicle hits the pedestrian walking or running out from between parked vehicle    2
Pedestrian hit at parking lot    1
Frequency Distribution for Column: PEDACT
Crossing with right of way    87
Crossing, no Traffic Contr    78
Crossing without right of     35
Other                         31
Running onto Roadway          16
On Sidewalk or Shoulder       12
Crossing, Pedestrian Cross     8
Playing or Working on High     4
Person Getting on/off Vehi     3
Coming From Behind Parked      3
Walking on Roadway with Tr     2
Walking on Roadway Against     2
Person Getting on/off Scho     1
Pushing/Working on Vehicle     1
Frequency Distribution for Column: PEDCOND
Normal               138
Inattentive           63
Unknown               41
Had Been Drinking     23
Ability Impaired,      7
Medical or Physic      7
Other                  6
Frequency Distribution for Column: CYCLISTYPE
C    83
M    18
I     6
Frequency Distribution for Column: CYCACT
D    42
I    21
F    20
L    14
O     9
S     2
Frequency Distribution for Column: CYCCOND
N    49
I    29
U    15
H     9
M     2
A     2
O     2
---
Frequency Distribution for Column: PEDESTRIAN
Yes    1974
Name: PEDESTRIAN, dtype: int64
---
Frequency Distribution for Column: CYCLIST
Yes    569
Name: CYCLIST, dtype: int64
---
Frequency Distribution for Column: AUTOMOBILE
Yes    3893
Name: AUTOMOBILE, dtype: int64
---
Frequency Distribution for Column: MOTORCYCLE
Yes    446
Name: MOTORCYCLE, dtype: int64
---
Frequency Distribution for Column: TRUCK
Y    226
Name: TRUCK, dtype: int64
---
Frequency Distribution for Column: TRSN_CITY_
Yes    245
Name: TRSN_CITY_, dtype: int64
---
Frequency Distribution for Column: EMERG_VEH
Y    7
Name: EMERG_VEH, dtype: int64
---
Frequency Distribution for Column: PASSENGER
Yes    1061
Name: PASSENGER, dtype: int64
---
Frequency Distribution for Column: SPEEDING
Yes    639
Name: SPEEDING, dtype: int64
---
Frequency Distribution for Column: AG_DRIV
Yes    2145
Name: AG_DRIV, dtype: int64
---
Frequency Distribution for Column: REDLIGHT
Y    263
Name: REDLIGHT, dtype: int64
---
Frequency Distribution for Column: ALCOHOL
Y    160
Name: ALCOHOL, dtype: int64
---
Frequency Distribution for Column: DISABILITY
Y    119
Name: DISABILITY, dtype: int64
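The indicator columns above hold only a 'Yes' value (printed as 'Y' for some columns) and are otherwise empty, so they are easier to work with as booleans. The following is a minimal sketch, assuming pandas. Treating an empty cell as "not involved" follows the PEDESTRIAN description in the metadata and is an assumption for the other columns. Accepting both 'Yes' and 'Y' is also an assumption, because it is unclear whether 'Y' is the stored value or a truncated display.

```python
import pandas as pd

# Illustrative rows only; None stands for an empty cell.
df = pd.DataFrame({
    "PEDESTRIAN": ["Yes", None, "Yes"],
    "TRUCK": ["Y", None, None],
})

flag_cols = ["PEDESTRIAN", "TRUCK"]
for col in flag_cols:
    # Normalize text, then map 'Yes'/'Y' to True and everything else to False.
    cleaned = df[col].fillna("").astype(str).str.strip().str.upper()
    df[col] = cleaned.isin(["YES", "Y"])

# Summing booleans gives the number of records with each flag set.
counts = df[flag_cols].sum()
print(counts)
```

With boolean flags, the counts shown in the distributions above can be reproduced with a single `sum()` per column, and flags can be combined directly in filters.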
Frequency Distribution for Column: Neighbourh
Waterfront Communities-The Island (77)    168
West Humber-Clairville (1)                132
Bay Street Corridor (76)                  100
Woburn (137)                               89
South Riverdale (70)                       81
...
Lawrence Park North (105)                   6
Danforth (66)                               5
Elms-Old Rexdale (5)                        4
Maple Leaf (29)                             4
Lambton Baby Point (114)                    3
Name: Neighbourh, Length: 140, dtype: int64
4) Summary of key findings:
Reviewing the dataset made it clear that the data must be preprocessed before any analysis. In this preliminary phase we cleaned the data, which included identifying and removing duplicate records, and we checked the significant columns for null or missing values. The dataset initially contained 12,244 records; after processing and cleaning, 4,405 records remain for further analysis. Some descriptive fields, such as VEHTYPE and INITDIR, require clarification from our client. We plan to raise these points at our upcoming meeting so that the analysis has a clear direction.
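The cleaning steps described above can be sketched as follows, assuming pandas. The sample rows and the `key_cols` list are hypothetical, and the sketch does not reproduce the exact steps that reduced the dataset from 12,244 to 4,405 records.

```python
import pandas as pd

# Illustrative rows only; the second row duplicates the first.
df = pd.DataFrame({
    "ACCNUM": [1, 1, 2, 3],
    "YEAR": [2012, 2012, 2013, 2014],
    "ACCLASS": ["Fatal", "Fatal", None, "Non-Fatal Injury"],
})

before = len(df)

# Remove rows that are exact duplicates across all columns.
deduped = df.drop_duplicates()

# Hypothetical list of columns considered significant.
key_cols = ["ACCNUM", "ACCLASS"]

# Count missing values per significant column.
missing = deduped[key_cols].isna().sum()

print(f"records before: {before}, after de-duplication: {len(deduped)}")
print(missing)
```

Reporting the record count before and after each step makes it easier to explain to the client where records were dropped.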
Here are the questions we aim to address in the upcoming meeting:
1) Are there particular criteria or requirements that are essential for this analysis?
2) Do you have any preferences or guidelines on how we should handle missing data or outliers in the dataset?
3) Are there particular statistical techniques, models, or methodologies you would like us to employ during our analysis?
4) Are there specific insights or patterns that you are interested in uncovering within the data?