If training_size is float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train examples.

Question
Write a function called divide_dataset_into_training_and_testing, as shown below, that splits a dataset into training and testing sets. It must take an argument called training_size that meets the following requirement:
  • If training_size is a float, it should be between 0.0 and 1.0 and represents the proportion of the dataset to include in the train split. If it is an int, it represents the absolute number of train examples.
Expert Solution
Step 1

Splits multi-view data into random train and test subsets. This utility wraps the train_test_split function from sklearn.model_selection for ease of use.

Parameters:
  • inputs (sequence of indexables) -- Allowed inputs are lists of numpy arrays, numpy arrays, lists, scipy-sparse matrices or pandas dataframes.
  • test_size (float or int, default=None) -- If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is set to the complement of the train size. If train_size is also None, it will be set to 0.25.
  • train_size (float or int, default=None) -- If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size.
  • random_state (int or RandomState instance, default=None) -- Controls the shuffling applied to the data before applying the split. Pass an int for reproducible output across multiple function calls.
  • shuffle (bool, default=True) -- Whether or not to shuffle the data before splitting. If shuffle=False then stratify must be None.
  • stratify (array-like, default=None) -- If not None, data is split in a stratified fashion, using this as the class labels.
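
The interaction between test_size and train_size described above can be illustrated with a small standard-library sketch. The helper name resolve_split_sizes is hypothetical, and the rounding choice (round up for the test count, round down for the train count) is an assumption made for this example, not something stated in the parameter list.

```python
import math


def resolve_split_sizes(n_samples, test_size=None, train_size=None):
    """Return (n_train, n_test) for a dataset with n_samples rows.

    Hypothetical helper: floats are treated as proportions, ints as
    absolute counts, and a missing size is the complement of the other.
    """
    # If neither size is given, fall back to a 25% test split.
    if test_size is None and train_size is None:
        test_size = 0.25

    def to_count(size, rounder):
        if size is None:
            return None
        if isinstance(size, float):
            if not 0.0 < size < 1.0:
                raise ValueError("float sizes must lie strictly between 0.0 and 1.0")
            # Assumed rounding rule, passed in by the caller.
            return int(rounder(size * n_samples))
        return int(size)

    n_test = to_count(test_size, math.ceil)
    n_train = to_count(train_size, math.floor)

    # Whichever size is missing becomes the complement of the other.
    if n_train is None:
        n_train = n_samples - n_test
    elif n_test is None:
        n_test = n_samples - n_train

    if n_train < 1 or n_test < 1 or n_train + n_test > n_samples:
        raise ValueError("sizes do not describe a valid split of the dataset")
    return n_train, n_test
```

For instance, with 100 samples and no sizes given, this sketch assigns 75 samples to training and 25 to testing.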
Returns:

splitting -- List containing the train-test splits of each of the inputs. If a list of arrays or 3D array is one of the inputs, train_test_split operates on each subarray and puts them together into a list of arrays or 3D array for training and one for testing.

Return type:

list, length=2*len(arrays)
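
A minimal sketch of the function the question asks for is given here, using only the Python standard library rather than the sklearn wrapper described above. The seed parameter, the exclusive bounds on training_size, and the choice to return a (train, test) pair of lists are assumptions made for this illustration.

```python
import random


def divide_dataset_into_training_and_testing(dataset, training_size, seed=None):
    """Split a sequence into shuffled (train, test) lists.

    training_size: float in (0.0, 1.0) -> proportion of rows used for training;
                   int -> absolute number of training rows.
    seed: optional int for a reproducible shuffle (assumed extra parameter).
    """
    n = len(dataset)

    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(training_size, bool):
        raise TypeError("training_size must be a float or an int")
    if isinstance(training_size, float):
        if not 0.0 < training_size < 1.0:
            raise ValueError("a float training_size must be between 0.0 and 1.0")
        n_train = int(training_size * n)  # round down to a whole number of rows
    elif isinstance(training_size, int):
        n_train = training_size
    else:
        raise TypeError("training_size must be a float or an int")

    if not 0 < n_train < n:
        raise ValueError("training_size must leave at least one row in each split")

    # Shuffle the row indices, then cut them at n_train.
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    train = [dataset[i] for i in indices[:n_train]]
    test = [dataset[i] for i in indices[n_train:]]
    return train, test
```

Calling it with training_size=0.8 on a 10-row dataset yields 8 training rows and 2 testing rows, while training_size=7 yields 7 and 3. Passing the same seed reproduces the same split.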
