Multinomial Logistic Regression

School

Maasai Mara University

Course

101

Subject

Statistics

Date

Nov 24, 2024

Uploaded by ChefVulturePerson683

Multinomial Logistic Regression and Relative Importance Index

Multinomial Logistic Regression

Multinomial logistic regression (MNLR) is a statistical technique used to analyze the relationship between a categorical dependent variable and one or more independent variables and to predict the probability of each outcome. It is a form of regression analysis in which the dependent variable has more than two levels. The technique is applied in many fields, including medical research, marketing, and economics, and it has also been used in logistics to predict the probability of an event from the factors that may influence it. It is frequently regarded as an extension of binomial logistic regression to a dependent variable with more than two categories (Fávero & Belfiore, 2019). Like other forms of regression, multinomial logistic regression may use categorical and continuous independent variables, and it can include interactions between those variables to predict the dependent variable.

Given the predictors $X_1, \ldots, X_p$, multinomial logistic regression models the probability of each level $j$ of $Y$ by

$$p_j(x) := P[Y = j \mid X_1 = x_1, \ldots, X_p = x_p] = \frac{e^{\beta_{0j} + \beta_{1j} X_1 + \cdots + \beta_{pj} X_p}}{1 + \sum_{\ell=1}^{J-1} e^{\beta_{0\ell} + \beta_{1\ell} X_1 + \cdots + \beta_{p\ell} X_p}}, \quad j = 1, \ldots, J-1,$$

where the remaining level $J$ serves as the reference category, so that $p_J(x)$ equals one divided by the same denominator. For a single variable, the logistic curve on which the model is built, with location $\mu$ and scale $\sigma$, is

$$P(x) = \frac{1}{1 + e^{-(x-\mu)/\sigma}}.$$

How MNLR Has Been Used

MNLR rests on the same concept as binary logistic regression, except that there are more than two potential outcomes. Children's dietary preferences, for instance, are influenced by the decisions made by their parents and by the activities they take part in. One may study the connections between a child's dietary preferences, those of the family, and the activities the child participates in. The levels of the dependent variable are then the various meal options, such as fast food or protein-rich meals. Another investigation might examine how employees' education levels and the amount of time they have spent on the job influence promotion opportunities.
The levels of the dependent variable may be progression to group-dynamics roles, sales jobs, or management positions. The independent variables would include the level of education and the amount of time spent working at the job. Determining whether there is a connection between the independent variables and a dependent variable is the basis of multinomial logistic regression, just as it is for other forms of regression (Liang, Bi, & Zhan, 2020). After the model is fitted, the output provides a set of coefficients for each variable, one per non-reference outcome category. The presentation of the results differs between software programs. The model can be used to estimate the parameters of a given function while allowing for more than one explanatory variable. The technique has been used to study the relationships among different levels of a dependent variable, such as the time elapsed since exposure to a given risk factor and its association with an outcome. Consequently, a solid understanding of logistic regression is crucial for every budding data analyst or machine learning engineer.

The model has been used to predict the probability of an individual acquiring HIV. Such a model may indicate that an individual who is male, belongs to a lower socioeconomic class, lives in a rural area, has less than primary education, and has had unprotected sex with more than one partner is at elevated risk of acquiring HIV. In a study by Pham et al. (2019), regression modeling was used to explore the association between HIV testing status and self-efficacy. A nested multinomial logistic regression model was used to analyze the association between self-efficacy and HIV testing status, with never-tested young men who have sex with men (YMSM) serving as the reference group (Pham et al., 2019). The study found no significant association between HIV testing status and self-efficacy.

Benefits

The benefits of this method are that it can handle complex relationships and interactions between many factors at once. It also uses all available data, thereby reducing bias in estimation and prediction. The technique also allows several predictors to be used at once.
This way, it can be used to predict outcomes that are not just binary, i.e., success or failure, but have more than two possible categories, e.g., a choice among several meal options. In addition, MNLR can be seen as a natural extension of binary logistic regression and has been shown to perform better than other techniques in certain situations. To analyze binary data, logistic regression uses a logistic curve.
The logistic curve represents the probabilities of the outcomes as a function of the values of the independent variables. In logistic regression, the natural logarithm of the odds (the logit) is assumed to have a linear relationship with the predictor variables. Because it does not assume that the data are normally distributed, linearly related, or homoscedastic, multinomial logistic regression is often seen as an attractive analysis. Discriminant function analysis, an alternative to multinomial logistic regression, requires these conditions to be satisfied. Since multinomial logistic regression does not rely on them, it is used more often than discriminant function analysis.

As with any statistical method, multinomial logistic regression still requires certain conditions, such as the assumption of independence among the alternatives for the dependent variable. This implies that a person's preference for, or membership in, one category is unrelated to the other categories. In addition, multinomial logistic regression assumes that the outcome groups are not perfectly separated by the predictors. Coefficient estimates and effect sizes will be unrealistic if a predictor perfectly separates the groups of the outcome measure (Abdillah et al., 2020). In healthcare contexts, the technique can predict how likely people are to get certain diseases or how quickly they will recover. It can also be applied in business contexts, where it has been used to predict which product will sell best at a given time or how much revenue specific products will generate over time.

Analyzing Risk and Ranking Risk

MNLR has been used to analyze risk, rank risk, and make predictions. The technique has been used in various fields, including the military, healthcare, and business. It has been reported to be an effective tool for predicting outcomes such as when people will buy a product, how much they will spend, and what kind of products they will buy.
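
To make the prediction step concrete, the following Python sketch turns linear predictors on the log-odds scale into category probabilities, with the last category treated as the reference. It is only an illustration under stated assumptions: the function name `multinomial_probs` and all coefficient values are hypothetical and are not taken from any of the cited studies.

```python
import math

def multinomial_probs(x, coefs):
    """Return P[Y = j | x] for j = 1..J, with category J as the reference.

    x     : list of predictor values (x_1, ..., x_p)
    coefs : list of J-1 tuples (beta_0j, [beta_1j, ..., beta_pj]),
            one tuple per non-reference category
    """
    # Linear predictor (log-odds against the reference) for each category
    etas = [b0 + sum(b * xi for b, xi in zip(betas, x)) for b0, betas in coefs]
    # Denominator: 1 for the reference category plus the exponentiated predictors
    denom = 1.0 + sum(math.exp(eta) for eta in etas)
    probs = [math.exp(eta) / denom for eta in etas]
    probs.append(1.0 / denom)  # probability of the reference category J
    return probs

# Hypothetical coefficients: three outcome categories, two predictors
example_coefs = [(0.5, [1.0, -0.3]), (-0.2, [0.4, 0.8])]
example_probs = multinomial_probs([1.2, 0.7], example_coefs)
```

The returned probabilities always sum to one, and when every coefficient is zero each category receives the same probability, which is a quick sanity check on any fitted model.
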
In the military context, this technique is often used to analyze the probability of an event happening. For example, it can be used to predict the likelihood of an attack on a specific location or at a specific time. Based on the hypothesized relationships between variables, multinomial logistic regression analysis uses various parameter estimation strategies. When there is apparent stratification or grouping in the dataset, these methods may be thought of as multinomial logistic regression. Unconditional logistic regression describes how strata might be modeled. Here, all cases are analyzed using a single model that incorporates the strata as a set of indicator variables, each representing the cases' membership in a certain stratum. The model is based on four components: the probability of event occurrence, the event's severity, the likelihood that an event will not occur, and the severity if it does not happen. The model can be used in various applications, such as insurance, finance, and healthcare. It can be more accurate than models that do not capture nonlinear relationships between variables. The technique is also used in supply chains, where it can predict the likelihood that a particular product is defective or meets customer specifications.

Authors' Reviews

Authors present their findings on how multinomial logistic regression can be used to model the relationship between different variables (Fávero & Belfiore, 2019; Liang, Bi, & Zhan, 2020). They also provide an overview of how it can be used for forecasting purposes. Those researchers are clearly of the opinion that a person who understands binary logistic regression is likely to understand multinomial logistic regression as well. The viewpoint has some validity because one method is an elaboration of the other, and both make use of maximum likelihood estimation. Other authors assert that the multinomial logit (MNL) model assesses the probability of each category of the dependent variable (Paul et al., 2022). MNL is an effective choice for mode choice modeling because it does not presuppose normality, linearity, or homogeneity of variance.

Relative Importance Index

The Relative Importance Index (RII) is a statistical measure of a variable's relative importance in a given dataset. RII helps partition explained variance across many variables to better understand the contribution made by each predictor in a regression equation.
Relative importance analysis, which builds on R-squared, has two primary uses: to determine which variables might be most predictive of future values of the dependent variable, and to measure how well a model, e.g., linear regression, fits observed data points from an experiment or survey. To calculate the RII, one needs to calculate the mean and variance of each variable in a given dataset and then determine how much variance is explained by each variable. The model ranks the importance of risks to determine how much attention should be given to them. The Relative Importance Index is advantageous because it is not biased towards any type or size of risk, unlike other ranking approaches such as the Pareto Principle, which only considers significant risks with high consequences. The model uses a ratio to measure the severity of each type of risk and then ranks them accordingly.

The technique involves using a statistical program such as SPSS to compute the variance explained by each variable in a linear regression equation. The variance explained is then divided by the total variance, and this value for each variable is multiplied by its corresponding coefficient in the linear regression equation. The resulting value for each variable represents its relative importance in explaining the dependent variable, and the values can be summed across all variables. The process also involves calculating the relative weight of each variable by dividing its standard deviation by the sum of the standard deviations of all variables. The relative weights are then sorted from highest to lowest and plotted on a graph. It can then be seen which variables have the greatest importance to the research project and which can be excluded.

The Relative Importance Index is a ranking method used to analyze risk, rank risk, and improve decision-making. It is a data-driven approach that aims to determine the relative importance of different types of risks. The index can be applied in many industries and to many kinds of problem or risk. Many organizations have used the Relative Importance Index to improve their decision-making process. It has helped them identify which risks are more critical than others and how they should prioritize their efforts.
The index is based on the following three components: the probability of event occurrence, the magnitude of event occurrence, and the impact of event occurrence (Kassem et al., 2020). In tree-based models, a variable used as either a primary or surrogate splitter is considered an important variable. The importance of each of the other variables is expressed relative to the variable with the greatest potential for improvement. Relative variable importance simplifies interpretation by standardizing the importance scores. To get a variable's relative importance, multiply its importance score by 100 percent, then divide by the importance score of the variable with the largest absolute value.
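
The standardization step just described can be sketched in a few lines of Python. This is a minimal illustration, assuming the raw importance scores are already available; the function name `relative_importance`, the variable names, and the scores are hypothetical.

```python
def relative_importance(scores):
    """Rescale raw importance scores so the most important variable is 100.

    scores : dict mapping variable name -> raw importance score
    """
    # Importance score of the variable with the largest absolute value
    top = max(abs(s) for s in scores.values())
    if top == 0:
        raise ValueError("at least one importance score must be non-zero")
    # Multiply each score by 100 percent, then divide by the top score
    return {name: 100.0 * s / top for name, s in scores.items()}

# Hypothetical raw scores for three predictors
raw_scores = {"education": 12.0, "tenure": 6.0, "age": 3.0}
scaled = relative_importance(raw_scores)
# scaled == {"education": 100.0, "tenure": 50.0, "age": 25.0}
```

Rescaling in this way leaves the ranking of the variables unchanged; it only expresses each score as a percentage of the strongest one.
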
The RII can be used for quantitative risk analysis by assigning weights to each risk and then using these weights to calculate an overall weighted average for the index. This weighted average is then compared with a predetermined threshold value to determine whether the project should continue. The RII is calculated by dividing the total risk by the full impact. The higher this number, the greater the relative importance.

Relative index analysis ranks criteria according to their relative importance. The result of this technique is a matrix in which the rows represent the criteria and the columns represent their relative importance. The first row of the matrix represents the most important criterion, and so on. The cells at the intersection of rows and columns represent the weighting for each criterion. The RII can be calculated for each criterion or for an entire set of criteria (Tholibon et al., 2021). RII values range from zero to one, with higher numbers indicating greater relative importance. Values closer to one indicate that the criterion is more important than other criteria in the set, while values closer to zero indicate that it is less important.

The RII is based on the idea that certain risks are more important than others, and it considers both the probability of an event occurring and the consequences if it does happen. The index is a way of assessing the importance of one risk against another, which helps prioritize the risks that need to be addressed. In this form, the index is calculated by multiplying the probability of an event by its severity. For example, if the likelihood of an event happening is 10% and its severity is 50%, then the event has an index of 0.05, or 5% (10% x 50%). This method can be used for any risk assessment exercise.
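
The two calculations above can be sketched in Python as follows. The first function uses the survey-based form RII = ΣW / (A × N), where W are the respondents' ratings, A is the highest possible rating, and N is the number of respondents. The text does not state this formula, so it is an assumption here, chosen because it is a common form of the index and yields values between zero and one. The second function multiplies probability by severity, as in the 10% and 50% example. All names and ratings are hypothetical.

```python
def rii_from_ratings(ratings, highest_rating):
    """Survey-based RII (assumed form): sum of ratings / (highest rating * N)."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 0 or r > highest_rating for r in ratings):
        raise ValueError("each rating must lie between 0 and highest_rating")
    return sum(ratings) / (highest_rating * len(ratings))

def risk_index(probability, severity):
    """Risk index as probability times severity, both given as fractions."""
    return probability * severity

# Hypothetical ratings from four respondents on a 1-5 scale
criterion_rii = rii_from_ratings([5, 4, 3, 5], highest_rating=5)

# The example from the text: 10% likelihood and 50% severity
example_index = risk_index(0.10, 0.50)
```

Ranking criteria then amounts to computing the index for each one and sorting the results from highest to lowest.
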
When the correlation between two predictors is high, the magnitude of a relative weight is driven more strongly by the intercorrelations among all of the predictors, and it will vary with the pattern of interdependencies among them. When the correlations among the predictors are low, the magnitude of the relative weight is not driven by collinearity. When there is a significant degree of collinearity, it is reasonable to presume that relative weights will become more consistent. This is because, even though the correlations between individual pairs of predictors can change from sample to sample, the correlations are more stable as a set: a relatively low correlation between one pair of variables can be counterbalanced by a relatively high correlation between another pair. A slight advantage may therefore be obtained if the relative weights estimated from small samples are less variable when collinearity is high.

Relative weight analysis generates a set of new predictors that are uncorrelated with one another yet have the strongest possible relationship to the original predictors. The dependent variable may be regressed onto this new set of predictors to obtain a series of standardized regression coefficients. This is possible because the newly created predictors are uncorrelated with one another. Authors have defined relative importance as the proportionate contribution that each predictor makes to R², taking into account both the unique contribution of each predictor when it is considered on its own and its contribution when it is combined with the other predictors (Azman et al., 2019; Tholibon et al., 2021). The Relative Importance Index helps researchers determine which keywords to use more often and which to use less often. The index also allows researchers to find new keywords they may not have thought about before (Kaiser et al., 2021). The RII is reported to be effective in identifying risks in projects, systems, or organizations. It has been shown that the model can be used for qualitative and quantitative risk analysis and to identify which risks are more critical than others so that they can be prioritized accordingly.

References

Abdillah, A., Sutisna, A., Tarjiah, I., Fitria, D., & Widiyarto, T. (2020). Application of multinomial logistic regression to analyze learning difficulties in statistics courses. Journal of Physics: Conference Series, 1490, 012012. https://doi.org/10.1088/1742-6596/1490/1/012012

Azman, N. S., Ramli, M. Z., Razman, R., Zawawi, M. H., Ismail, I. N., & Isa, M. R. (2019). Relative importance index (RII) in ranking of quality factors on industrialised building system (IBS) projects in Malaysia. Applied Physics of Condensed Matter (APCOM 2019). https://doi.org/10.1063/1.5118037

Fávero, L. P., & Belfiore, P. (2019). Binary and multinomial logistic regression models. Data Science for Business and Decision Making, 539-615.

Kaiser, M., Chen, A. T.-Y., & Gluckman, P. (2021). Should policy makers trust composite indices? A commentary on the pitfalls of inappropriate indices for policy formation. Health Research Policy and Systems, 19(1). https://doi.org/10.1186/s12961-021-00702-4

Kassem, M. A., Khoiry, M. A., & Hamzah, N. (2020). Using relative importance index method for developing risk map in oil and gas construction projects. Jurnal Kejuruteraan, 32(3), 441-453.

Liang, J., Bi, G., & Zhan, C. (2020). Multinomial and ordinal logistic regression analyses with multi-categorical variables using R. Annals of Translational Medicine, 8(16), 982. https://doi.org/10.21037/atm-2020-57

Paul, T., Chakraborty, R., Afia Ratri, S., & Debnath, M. (2022). Impact of COVID-19 on mode choice behavior: A case study for Dhaka, Bangladesh. Transportation Research Interdisciplinary Perspectives, 100665. https://doi.org/10.1016/j.trip.2022.100665

Pham, M. D., Aung, P. P., Agius, P. A., Pasricha, N., Oo, S. M., Tun, W., … Luchters, S. (2019). Relationship between self-efficacy and HIV testing uptake among young men who have sex with men in Myanmar: A cross-sectional analysis. International Journal of STD & AIDS, 30(1), 20-28. https://doi.org/10.1177/0956462418791945

Tholibon, D. A., Md Nujid, M., Mokhtar, H., Rahim, J. A., Aziz, N. F. A., & Tarmizi, A. A. A. (2021). Relative importance index (RII) in ranking the factors of employer satisfaction towards industrial training students. International Journal of Asian Education, 2(4), 493-503. https://doi.org/10.46966/ijae.v2i4.187

Umaña-Hermosilla, B., de la Fuente-Mella, H., Elórtegui-Gómez, C., & Fonseca-Fuentes, M. (2020). Multinomial logistic regression to estimate and predict the perceptions of individuals and companies in the face of the COVID-19 pandemic in the Ñuble region, Chile. Sustainability, 12(22), 9553.