Graduate Programs Portal (UFBA)

SIGAA - Integrated System for the Management of Academic Activities

GRADUATE PROGRAM IN MATHEMATICS (PGMAT), INSTITUTE OF MATHEMATICS AND STATISTICS. Phone: Not available. E-mail: erharddirk@gmail.com

DEFENSE committee: TATIANA FELIX DA MATTA

A MASTER'S DEFENSE committee has been registered by the program.
STUDENT: TATIANA FELIX DA MATTA
DATE: 06/08/2021
TIME: 14:30
LOCATION: Salvador
TITLE:

New binary regression models using symmetric and asymmetric link functions

KEYWORDS:

Imbalanced data, double Lindley distribution, power and reversal power link functions, maximum likelihood method, predictive performance

PAGES: 53
BROAD AREA: Exact and Earth Sciences
AREA: Probability and Statistics
ABSTRACT:

Regression models with binary response variables (1 for occurrence of the event of interest, or "success"; 0 for non-occurrence, or "failure") have been applied intensively in several areas of knowledge, such as health, finance, and industry. Traditionally, the most widely used binary regression model has been logistic regression. However, it relies on the logit link function, which is symmetric and may be unsuitable in certain situations, for example when one class of the response variable is heavily outnumbered by the other (an imbalanced data set). The main aim of this work is to present new binary regression models using symmetric and asymmetric link functions. The parameters of the proposed models (namely, the double Lindley, asymmetric double Lindley, power double Lindley, and reversal power double Lindley binary regression models) are estimated by the classical maximum likelihood method. To compare the models and select the "best" one, information criteria (AIC and BIC) and measures of predictive performance (AUC, balanced accuracy, sensitivity, specificity, positive and negative predictive values, F1-score, and the Matthews correlation coefficient, among others) are used. Through the analysis of two real data sets, one on breast cancer, obtained from the University of California, Irvine (UCI) Machine Learning Repository, and another from a competition promoted by Santander Bank for the Kaggle community, we show that models using the proposed link functions can provide a better fit and predictive ability than models using standard links, such as the logit.
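
As a rough illustration of the contrast the abstract draws between symmetric and asymmetric links, the sketch below applies the power and reversal power constructions to a baseline CDF F. Everything in it is an assumption for illustration: the standard logistic CDF stands in for the double Lindley CDF, which is not reproduced here; the forms F(eta)^lambda and 1 - F(-eta)^lambda are the commonly used definitions of these constructions rather than formulas taken from the thesis; and all function names are hypothetical. Two of the predictive measures named in the abstract are included to show why plain accuracy can mislead on imbalanced data.

```python
import math


def logistic_cdf(x):
    """Standard logistic CDF, i.e. the inverse logit link (symmetric about 0)."""
    return 1.0 / (1.0 + math.exp(-x))


def power_prob(eta, lam, base_cdf=logistic_cdf):
    """Assumed power construction: P(Y = 1) = F(eta) ** lam, with lam > 0."""
    return base_cdf(eta) ** lam


def reversal_power_prob(eta, lam, base_cdf=logistic_cdf):
    """Assumed reversal power construction: P(Y = 1) = 1 - F(-eta) ** lam."""
    return 1.0 - base_cdf(-eta) ** lam


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2.0


def matthews_corrcoef(tp, fn, tn, fp):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# With lam = 1 both constructions reduce to the symmetric baseline link.
# With lam = 2 the response probability at eta = 0 moves away from 0.5.
p_logit = logistic_cdf(0.0)                # 0.5
p_power = power_prob(0.0, 2.0)             # 0.25
p_reversal = reversal_power_prob(0.0, 2.0) # 0.75

# Hypothetical confusion matrix for an imbalanced test set (10 positives, 90 negatives).
# Plain accuracy is 0.95, while balanced accuracy is 0.75.
bacc = balanced_accuracy(tp=5, fn=5, tn=90, fp=0)
mcc = matthews_corrcoef(tp=5, fn=5, tn=90, fp=0)
```

The extra parameter lam is what lets the response curve approach 0 and 1 at different rates, which is the property a symmetric link such as the logit lacks.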

COMMITTEE MEMBERS:
Chair - 1961783 - PAULO HENRIQUE FERREIRA DA SILVA
Internal - 2019094 - PAULO JORGE CANAS RODRIGUES
External to the Institution - FRANCISCO LOUZADA NETO - USP

Notice registered on: 28/07/2021 10:20