Requirements
School
Saint Louis University, Baguio City Main Campus - Bonifacio St., Baguio City
Course
BSBTMGT617
Subject
Computer Science
Date
Jun 23, 2024
Uploaded by hollumeday303
Using Machine Learning on Big Data for Healthcare Communities for Predicting Diseases
Objectives
- To evaluate healthcare datasets and draw meaningful results through predictive modeling, such as basic regression models.
- To develop a system that detects or predicts various sorts of illnesses in a single stage through Streamlit, an open-source Python library, leveraging the Naïve Bayes algorithm, decision tree, random forest, and support vector machine (SVM) classifiers.
- To evaluate the effectiveness of the specific Machine Learning algorithms adopted in the proposed model for exactness and accuracy in deriving the best results.
Research Questions
Below are five research questions that the project will address:
How can the predictive models be developed and integrated into the existing healthcare systems to provide timely and accurate predictions of diseases?
To answer this research question, the project could develop a disease prediction system using one or more Machine Learning algorithms, such as the Naïve Bayes algorithm, decision tree, random forest, and support vector machine (SVM) classifiers. The proposed system could be trained and tested using healthcare data gathered from communities, and its predictive accuracy could be evaluated using performance metrics (e.g., accuracy, precision, recall, F1 score). I can also compare the results of the Machine Learning models with the performance of traditional statistical models, such as linear or logistic regression, to assess their respective advantages and limitations. To integrate the predictive models into existing healthcare systems, the project could collaborate with healthcare organizations to identify appropriate platforms for implementing the models. I can also work with healthcare providers to gather feedback and optimize the models to meet their needs. Challenges to implementation may include data privacy concerns, integration with existing systems, and user adoption.
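The train, test, and evaluate loop described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the data is a synthetic stand-in generated with scikit-learn's `make_classification`, the binary disease label, the hyperparameters, and the names `models`, `evaluate`, and `results` are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a healthcare dataset (assumption: binary disease label).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The four classifiers named in the proposal, plus logistic regression
# as the traditional statistical baseline. SVM and logistic regression
# are scaled first because both are sensitive to feature ranges.
models = {
    "naive_bayes": GaussianNB(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}


def evaluate(model):
    """Fit on the training split and score on the held-out test split."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
    }


results = {name: evaluate(model) for name, model in models.items()}
for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

On real community health data the synthetic generator would be replaced by the project's own loading and preprocessing step, and the scores would of course differ.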
What is the predictive accuracy of different Machine Learning algorithms when applied to big data gathered from healthcare communities for predicting diseases?
To determine the predictive accuracy of different Machine Learning algorithms, the project could use cross-validation and performance metrics (such as accuracy, precision, recall, and F1 score) to evaluate the performance of each algorithm on the healthcare data. The project could also compare the performance of the Machine Learning algorithms to traditional statistical methods and assess their respective advantages and limitations.
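As a rough sketch of the cross-validation step, the snippet below scores each algorithm over five stratified folds with scikit-learn's `cross_validate`. The synthetic dataset, the fold count, and the names `candidates` and `cv_summary` are assumptions chosen for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic placeholder data (assumption); real healthcare records go here.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=1
)

# Stratified folds keep the disease/no-disease ratio similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
scoring = ["accuracy", "precision", "recall", "f1"]

candidates = {
    "naive_bayes": GaussianNB(),
    "decision_tree": DecisionTreeClassifier(random_state=1),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=1),
    "svm": make_pipeline(StandardScaler(), SVC()),
}

cv_summary = {}
for name, model in candidates.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    # cross_validate returns one array per metric under the key "test_<metric>".
    cv_summary[name] = {
        metric: scores[f"test_{metric}"].mean() for metric in scoring
    }

for name, metrics in cv_summary.items():
    print(name, {metric: round(value, 3) for metric, value in metrics.items()})
```

Averaging over folds gives a steadier estimate than a single train/test split, which matters when the algorithms score close to one another.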
How can feature selection techniques be used to identify the most important variables in the healthcare data that are predictive of diseases, and how does this affect the accuracy of the predictive models?
To identify the most important variables in the healthcare data, the project could use feature selection techniques such as recursive feature elimination, or dimensionality reduction techniques such as principal component analysis. The project could also experiment with different subsets of the data to evaluate how the accuracy of the models is affected. Additionally, the project could investigate how the choice of technique impacts the accuracy and interpretability of the predictive models.
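One way to compare these options is sketched below, assuming a synthetic dataset and a logistic regression estimator, neither of which comes from the proposal. Recursive feature elimination (`RFE`) keeps a subset of the original columns, so the retained variables stay nameable, whereas principal component analysis (`PCA`) builds new combined components. The names `feature_names`, `selected`, and `pipelines` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with 20 columns, only some of them informative (assumption).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, n_redundant=2, random_state=2
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

# RFE repeatedly drops the weakest feature until five remain.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5)
rfe.fit(StandardScaler().fit_transform(X), y)
selected = [name for name, keep in zip(feature_names, rfe.support_) if keep]
print("Features kept by RFE:", selected)

# Compare accuracy with all features, RFE-selected features, and PCA components.
pipelines = {
    "all_features": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "rfe_5": make_pipeline(
        StandardScaler(),
        RFE(LogisticRegression(max_iter=1000), n_features_to_select=5),
        LogisticRegression(max_iter=1000),
    ),
    "pca_5": make_pipeline(
        StandardScaler(), PCA(n_components=5), LogisticRegression(max_iter=1000)
    ),
}
accuracy_by_method = {
    name: cross_val_score(pipeline, X, y, cv=5).mean()
    for name, pipeline in pipelines.items()
}
print({name: round(score, 3) for name, score in accuracy_by_method.items()})
```

Placing the selection step inside the pipeline means it is refit on each training fold, which avoids leaking information from the validation fold into the choice of features.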
What are the ethical considerations surrounding the use of Machine Learning on big data from healthcare communities for disease prediction?
To address ethical considerations surrounding the use of Machine Learning on healthcare data, the project could work with healthcare organizations and privacy experts to ensure that data is collected and used in a responsible and ethical manner. This could include obtaining informed consent from patients, implementing appropriate data security measures, and minimizing the risk of bias and discrimination in the models. The project could also investigate the potential societal impact of the predictive models and work to address any unintended consequences.
How can Machine Learning classifiers (such as the Naïve Bayes algorithm, decision tree, random forest, and support vector machine (SVM) classifiers), alongside deep learning techniques, be used to improve the accuracy of disease prediction on big data gathered from healthcare communities?
To improve the accuracy of disease prediction, the project could experiment with the classical classifiers named in the question (Naïve Bayes, decision tree, random forest, and SVM), which are not deep learning methods, and compare them with neural network architectures where the data supports it. The project could also investigate how transfer learning, ensemble methods, and other techniques could be used to improve the accuracy of the models. Challenges to implementing deep learning techniques may include the need for large amounts of training data, longer training times, and the need for specialized hardware.
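The ensemble idea mentioned above could look something like the following sketch, which combines the four classifiers by majority vote using scikit-learn's `VotingClassifier`. The synthetic data, the choice of hard voting, and the names `ensemble`, `ensemble_accuracy`, and `member_accuracy` are assumptions for illustration, not a method specified by the proposal.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic placeholder data with labels 0 and 1 (assumption).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, random_state=3
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=3
)

# Hard voting: each classifier casts one vote and the majority class wins.
ensemble = VotingClassifier(
    estimators=[
        ("naive_bayes", GaussianNB()),
        ("decision_tree", DecisionTreeClassifier(random_state=3)),
        ("random_forest", RandomForestClassifier(n_estimators=100, random_state=3)),
        ("svm", make_pipeline(StandardScaler(), SVC())),
    ],
    voting="hard",
)
ensemble.fit(X_train, y_train)
ensemble_accuracy = accuracy_score(y_test, ensemble.predict(X_test))

# Score each fitted member on its own for comparison. The members are trained
# on label-encoded targets, which equal the original labels here (0 and 1).
member_accuracy = {
    name: accuracy_score(y_test, model.predict(X_test))
    for name, model in ensemble.named_estimators_.items()
}

print("ensemble:", round(ensemble_accuracy, 3))
print({name: round(score, 3) for name, score in member_accuracy.items()})
```

A voting ensemble is not guaranteed to beat its best member, so comparing the two sets of scores is part of the experiment rather than a formality.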
How the Questions Aid Demonstration of My Computing Skills
1. To integrate the predictive models into existing healthcare systems and provide timely and accurate predictions of diseases: this would allow me to demonstrate my proficiency in integrating Machine Learning models with existing software platforms using technologies such as APIs and database management systems.
2. To use Machine Learning classifiers, such as the Naïve Bayes algorithm, decision tree, random forest, and support vector machine (SVM) classifiers, alongside deep learning techniques, to improve the accuracy of disease prediction on big data gathered from healthcare communities: showcasing computing skills in this objective could involve my proficiency with these classifiers and, where deep learning is used, with neural network architectures. Additionally, knowledge of GPU computing and cloud services is necessary to handle the large amounts of data needed for training and testing deep learning models.
3. To evaluate healthcare datasets and draw meaningful results through predictive modeling: with regard to this objective, the skills that I will be able to showcase include proficiency in data preprocessing, feature selection, and model selection techniques.
4. To develop a system that detects or predicts various sorts of illnesses through Streamlit and Machine Learning algorithms: this objective allows me to show computing skills in Python programming, including proficiency in using the Scikit-learn or TensorFlow libraries to build and train Machine Learning models. Additionally, proficiency in Streamlit, an open-source Python library for building web apps, is necessary to create a functional and responsive user interface.
5. To evaluate the effectiveness of specific Machine Learning algorithms in the proposed model for exactness and accuracy in deriving the best results: demonstrating computing skills in this objective could involve proficiency in comparing different Machine Learning algorithms' performance in terms of accuracy, precision, recall, and F1 score. It might also involve