E-Bike-Rental
School
University of North Carolina, Charlotte
Course
6400
Subject
Information Systems
Date
Dec 6, 2023
Uploaded by DeaconRainPartridge10
Selected scenario:
E-bike rental
Potential data sources: For each source, include a source name and format (structured,
semi-structured, or unstructured).
1. IoT Device Data - Semi-structured, JSON
2. GPS Data - Semi-structured, JSON or CSV
3. Weather Data - Semi-structured, JSON or CSV
4. User Data - Structured, Database
5. Maintenance and Error Logs - Semi-structured, Log files
6. Rental Transaction Data - Structured, Database
7. Traffic Data - Semi-structured, JSON or CSV
8. Environmental Data - Semi-structured, JSON or CSV
9. Social Media and Feedback Data - Semi-structured, JSON
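
To illustrate how the formats listed above differ in practice, the following is a minimal Python sketch that reads one sample of three of them. The field names, the CSV columns, and the log line layout are all hypothetical; real devices and providers will use their own schemas.

```python
import csv
import io
import json
import re

# Hypothetical samples; none of these layouts come from a real device or provider.
iot_message = '{"bike_id": "b-17", "battery_pct": 62, "lat": 35.2271, "lon": -80.8431}'
weather_csv = "observed_at,temp_c,precip_mm\n2023-12-06T14:00:00,11.5,0.0\n"
log_line = "2023-12-06 14:03:21 ERROR bike=b-17 code=E42 brake sensor timeout"

# JSON: self-describing keys, parsed directly into a dictionary.
reading = json.loads(iot_message)

# CSV: the header row supplies the column names; every value arrives as a string.
weather_rows = list(csv.DictReader(io.StringIO(weather_csv)))

# Log file: structure has to be recovered with a pattern (assumed layout).
LOG_PATTERN = re.compile(
    r"(?P<date>\S+) (?P<time>\S+) (?P<level>[A-Z]+) "
    r"bike=(?P<bike>\S+) code=(?P<code>\S+) (?P<text>.*)"
)
log_entry = LOG_PATTERN.match(log_line).groupdict()
```

The log line needs the most work to parse, which is one reason log data usually takes more preparation than database or JSON sources.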
Value, veracity, and variety considerations:
Ensure that the data collected is directly relevant to the goals of the e-bike rental program, which are to
reduce carbon emissions and traffic congestion. Data should help answer critical questions and provide
actionable insights.
Data should be capable of driving decision-making and actions. It should provide insights that can lead to
improvements in bike deployment, maintenance, marketing, and overall program effectiveness.
Ensure that the value derived from the data justifies the expenses associated with IoT devices, data
storage, and analysis.
Ensure that the data collected from IoT devices, sensors, and external sources is accurate and reliable.
Inaccurate data can lead to incorrect conclusions and misguided actions.
Data should be consistent across different sources and over time. Inconsistencies or data anomalies
should be identified and addressed to maintain data quality.
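
As a rough illustration of checking accuracy and catching anomalies, the sketch below validates one JSON telemetry message against plausibility ranges. The field names and the limits are assumptions made for this example, not values from the scenario.

```python
import json

# Hypothetical plausibility limits; real limits depend on the hardware and service area.
LIMITS = {
    "battery_pct": (0.0, 100.0),
    "lat": (-90.0, 90.0),
    "lon": (-180.0, 180.0),
    "speed_kmh": (0.0, 60.0),
}

def validate_reading(raw: str) -> list[str]:
    """Return a list of problems found in one JSON telemetry message."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(record, dict):
        return ["not a JSON object"]
    problems = []
    for field, (low, high) in LIMITS.items():
        value = record.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{field}: missing or not numeric")
        elif not low <= value <= high:
            problems.append(f"{field}: {value} outside [{low}, {high}]")
    return problems
```

An empty list means the message passed every check; anything else can be routed to a review queue instead of the analysis tables.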
Implement measures to protect data integrity, including data encryption during transmission and
storage, as well as access controls to prevent unauthorized modifications.
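
Encryption and access controls are normally provided by the platform, but detecting unauthorized modification can be sketched in a few lines. The example below uses a keyed hash (HMAC-SHA256) from the Python standard library, which is a tamper-detection technique and not encryption itself. The record fields and the key are placeholders for illustration.

```python
import hashlib
import hmac
import json

def sign_record(record: dict, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over a canonical JSON form of the record."""
    message = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def is_unmodified(record: dict, tag: str, key: bytes) -> bool:
    """Check that the record still matches the tag created when it was stored."""
    return hmac.compare_digest(sign_record(record, key), tag)

# Usage with a placeholder key; a real key would come from a secrets manager.
key = b"demo-key-not-for-production"
record = {"bike_id": "b-17", "battery_pct": 62}
tag = sign_record(record, key)
tampered = {**record, "battery_pct": 99}
```

Here `is_unmodified(record, tag, key)` is true, while the same call with `tampered` is false because the stored tag no longer matches.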
Recognize that the data includes structured data (e.g., database tables), semi-structured data (e.g.,
JSON messages, log files), and unstructured data (e.g., free-text feedback). Implement appropriate data
processing techniques and tools to handle this variety effectively.
Ensure that different data sources can be integrated and analyzed together. Establish data pipelines and
data integration processes to harmonize data from diverse sources.
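
One simple way to harmonize two sources is to join them on a shared key. The sketch below attaches an hourly weather observation to each rental by truncating both timestamps to the hour. The record layouts (`started_at`, `observed_at`, `temp_c`) are assumed for the example.

```python
from datetime import datetime

def hour_key(timestamp: str) -> str:
    """Truncate an ISO 8601 timestamp to the hour it falls in."""
    return datetime.fromisoformat(timestamp).strftime("%Y-%m-%dT%H")

def join_rentals_with_weather(rentals, weather):
    """Attach the temperature observed in the same hour to each rental."""
    weather_by_hour = {hour_key(w["observed_at"]): w for w in weather}
    joined = []
    for rental in rentals:
        match = weather_by_hour.get(hour_key(rental["started_at"]))
        # Keep rentals without a matching observation, marking the gap with None.
        joined.append({**rental, "temp_c": match["temp_c"] if match else None})
    return joined
```

Keeping unmatched rentals with a `None` value makes gaps in the weather feed visible instead of silently dropping trips.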
Consider the spatial dimension of your data, especially GPS and location data. Utilize geospatial analysis
techniques to understand usage patterns in specific geographic areas.
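
A basic form of geospatial analysis is counting trip starts per grid cell. The sketch below rounds coordinates to two decimal places, which gives cells of roughly 1.1 km north to south (the east to west width shrinks with latitude). The cell size and the input format are assumptions; a dedicated geospatial library would be a more robust choice for real analysis.

```python
from collections import Counter

def grid_cell(lat: float, lon: float, precision: int = 2) -> tuple:
    """Map a coordinate to a coarse grid cell by rounding."""
    return (round(lat, precision), round(lon, precision))

def trip_starts_per_cell(points, precision: int = 2) -> Counter:
    """Count how many trips start in each grid cell."""
    return Counter(grid_cell(lat, lon, precision) for lat, lon in points)
```

Calling `trip_starts_per_cell(points).most_common(5)` would then list the five busiest cells, which is a starting point for deciding where to deploy bikes.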
Velocity and volume considerations:
The IoT devices on the e-bikes may transmit JSON data very frequently, for example every 2 minutes.
Ensure that your data processing pipeline can handle this high velocity of incoming data, which
includes having the necessary network bandwidth and processing power to receive and process
data in near real-time.
Consider whether real-time monitoring and analysis of data are necessary for your proof of concept. If
so, implement streaming data processing technologies to analyze and respond to data as it arrives,
allowing for immediate insights and actions.
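
The idea of reacting to data as it arrives can be sketched with a plain Python generator that handles one reading at a time. This is only an illustration of the pattern; a production system would more likely use a dedicated streaming platform. The field names and the 15 percent threshold are assumptions.

```python
def low_battery_alerts(readings, threshold_pct: float = 15.0):
    """Yield a bike id each time its battery newly drops below the threshold."""
    alerted = set()
    for reading in readings:
        bike = reading["bike_id"]
        if reading["battery_pct"] < threshold_pct:
            if bike not in alerted:
                alerted.add(bike)
                yield bike
        else:
            # Battery recovered (e.g., after charging), so allow future alerts.
            alerted.discard(bike)
```

Because the function consumes an iterator, it produces each alert as soon as the triggering reading is seen and never needs the whole data set in memory.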
Implement data buffering mechanisms to handle potential spikes in data transmission rates. This ensures
that data is not lost or delayed during peak usage times.
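
A bounded in-memory queue is one simple buffering mechanism. In the sketch below, a producer pushes a burst of messages while a single consumer drains them; when the buffer is full the producer waits rather than dropping data. The buffer size is an arbitrary value for illustration, and a real deployment would typically rely on a durable message broker instead.

```python
import queue
import threading

def run_buffered(messages, buffer_size: int = 100):
    """Pass messages from a bursty producer to a consumer through a bounded buffer."""
    buffer = queue.Queue(maxsize=buffer_size)
    processed = []
    done = object()  # sentinel that tells the consumer to stop

    def consumer():
        while True:
            item = buffer.get()
            if item is done:
                break
            processed.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()
    for message in messages:
        # put() blocks while the buffer is full, so nothing is discarded.
        buffer.put(message)
    buffer.put(done)
    worker.join()
    return processed
```

Blocking the producer trades latency for completeness: messages may be delayed during a spike, but none are lost.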
Given the frequent data transmissions from e-bikes, the volume of data generated can be substantial.
Assess your data storage requirements and choose scalable storage solutions that can handle the data
volume over time.
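
A back-of-the-envelope estimate helps size both the pipeline and the storage. The sketch below uses the 2-minute interval from the example above; the fleet size of 5,000 bikes and the 500-byte message size are assumptions chosen only to make the arithmetic concrete.

```python
def estimate_load(fleet_size: int, interval_seconds: int, message_bytes: int) -> dict:
    """Estimate message rate and raw daily volume for a fleet of reporting bikes."""
    messages_per_second = fleet_size / interval_seconds
    messages_per_day = fleet_size * 86_400 / interval_seconds
    gigabytes_per_day = messages_per_day * message_bytes / 1e9
    return {
        "messages_per_second": messages_per_second,
        "messages_per_day": messages_per_day,
        "gigabytes_per_day": gigabytes_per_day,
    }

# Assumed inputs: 5,000 bikes, one message every 120 seconds, 500 bytes each.
estimate = estimate_load(fleet_size=5000, interval_seconds=120, message_bytes=500)
```

With these assumed inputs the fleet sends about 42 messages per second, 3.6 million messages per day, and roughly 1.8 GB of raw data per day before compression or indexing overhead.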
Determine how long you need to retain the data for analysis and compliance purposes. Define data
retention policies and consider data archiving or deletion strategies to manage storage costs.
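
A retention policy can be expressed as a small rule that maps a record's age to an action. The thresholds below (90 days in active storage, 2 years in archive) are hypothetical; actual values depend on the analysis needs and the regulations that apply.

```python
from datetime import datetime, timedelta, timezone

def retention_action(recorded_at: datetime, now: datetime,
                     hot_days: int = 90, archive_days: int = 730) -> str:
    """Decide whether a record stays in active storage, is archived, or is deleted."""
    age = now - recorded_at
    if age <= timedelta(days=hot_days):
        return "keep"
    if age <= timedelta(days=archive_days):
        return "archive"
    return "delete"

# Usage with a fixed reference time so the result is reproducible.
now = datetime(2023, 12, 6, tzinfo=timezone.utc)
action = retention_action(now - timedelta(days=200), now)
```

In this usage example `action` is `"archive"`, since 200 days falls between the two assumed thresholds.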
Ensure that your data infrastructure is scalable to accommodate the potential expansion of the e-bike
rental program. As the fleet grows, the volume of data will increase accordingly.
Consider implementing data compression techniques to reduce the storage and bandwidth
requirements, especially for historical data that may not require real-time access.
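
As a small illustration of compression for historical data, the sketch below stores a batch of records as newline-delimited JSON compressed with gzip from the Python standard library. The record fields are hypothetical, and the achievable ratio depends heavily on how repetitive the data is.

```python
import gzip
import json

def compress_records(records) -> bytes:
    """Serialize records as newline-delimited JSON and gzip the result."""
    lines = (json.dumps(r, separators=(",", ":")) for r in records)
    return gzip.compress("\n".join(lines).encode("utf-8"))

def decompress_records(blob: bytes) -> list:
    """Reverse compress_records and return the original list of records."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]
```

Telemetry tends to compress well because the same keys repeat in every message, but compressed archives must be decompressed before querying, so this suits data that does not need real-time access.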
Choose appropriate data storage solutions, such as databases or cloud-based storage, that can scale to
accommodate increasing data volumes. Use data streaming and processing for real-time data analysis
and monitoring if needed. Implement data retention policies and automated data lifecycle management
to optimize storage costs and comply with data regulations. Monitor system performance regularly to
ensure that the data pipeline can handle both current and future data velocity and volume
requirements.