A new trigonometric generalized family of distributions, the "Transmuted Cosine Topp-Leone G Family", is proposed in this study. The family combines the adaptability of the Topp-Leone distribution with the periodicity of the cosine function and transmutation theory to produce a flexible framework that can represent a wide range of real-life phenomena. We derive several statistical properties of the family, including the survival and hazard functions, the moments, and the moment-generating function. The model parameters are estimated by the maximum likelihood method, and a Monte Carlo simulation is performed to assess the behavior and consistency of the estimates. Lastly, we demonstrate the applicability of the family on two lifetime datasets.
Keywords: Topp-Leone G, Cosine-G, Transmuted-G, Monte Carlo simulation, maximum likelihood estimation.
In mathematical modeling and statistical analysis, researchers are constantly exploring new probability distributions and their applications. Many probability models have been proposed for real-world phenomena in disciplines such as technology, medicine, economics, biology, and the environmental sciences. Because of the dynamic nature of the datasets generated nowadays, developing generalized families of distributions has become common practice among researchers. These families are typically obtained through parametric transformations that aim to enhance the performance of the parent distributions. Examples of such techniques include the exponentiated G family developed by (Gupta & Kundu, 1999), the Marshall-Olkin family introduced by (Marshall & Olkin, 1997), the beta-generated family (Eugene et al., 2002), and the Kumaraswamy G family proposed by (Cordeiro & de Castro, 2011).
(Kumar et al., 2015; Souza et al., 2019) introduced a modern approach to generalizing probability distributions using trigonometric transformations. This approach has recently gained popularity among statisticians because of its success in modeling real-world data, often outperforming traditional methods. The trigonometric G families offer several advantages, such as the simplicity of their functions and parameters that remain aligned with those of the original distributions. Among the distributions within this framework, the cosine G family has attracted significant interest because of its desirable properties. Within this context, (Nanga et al., 2023) introduced the cosine Topp-Leone family by merging the Cos-G family with the Topp-Leone G family. (Muhammad et al., 2021) proposed a modified Cos-G family comprising the extended cosine Weibull, extended cosine power, and extended cosine generalized half-logistic models. (Bakr et al., 2022) introduced the odd Lomax trigonometric generalized family, with a specific model called the Lomax cosecant Weibull. (Mahmood et al., 2022) provided an extension of the cosine generalized family, including the extended cosine Weibull, which exhibits various density and hazard rate shapes.
The Topp-Leone (TL) distribution, introduced by (Topp & Leone, 1955), is a probability distribution for modeling lifetime data that has been applied in many fields of study. The TL distribution stands out for its versatility and applicability in modeling a wide range of phenomena. The Topp-Leone G (TL-G) family was proposed by (Al-Shomrani et al., 2016) as an alternative to the exponentiated G family. The TL-G family of distributions and its extensions have been examined in several studies. The Topp-Leone exponential-G family was presented by (Sanusi et al., 2020) and was shown to be useful for modeling positive real datasets. This family was subsequently extended to the Topp-Leone Gompertz-G distribution by (Oluyede et al., 2022), who showed that it can handle non-monotonic hazard rate functions and heavy-tailed data. In a related direction, (Reyad et al., 2018) presented the Topp-Leone odd Lindley G family, which adds flexibility to the model and has been demonstrated to work effectively in real-world data applications. Lastly, (Hassan et al., 2020) presented the inverted TL distribution, which is successful in modeling real datasets and accommodates a range of hazard function shapes.
The transmuted G class of distributions has been used in various statistical applications, demonstrating its flexibility and usefulness in modeling real datasets. A series of studies have introduced and explored the properties of various transmuted G families. (Yousof & Afify, 2015) introduced the transmuted exponentiated G family, which was shown to be flexible and applicable to real datasets. (Afify et al., 2020) expanded this with the Marshall-Olkin transmuted generalized family, and (Yousof et al., 2017) introduced the transmuted Topp-Leone G family, which was also found to have useful properties and applications. (Mohammed & Ugwuowo, 2021) extended this line of work with the transmuted exponential Topp-Leone distribution, which outperformed several competing distributions when fitted to real-life datasets. Together, these families offer a range of options for modeling and analyzing data.
In this study, we propose the Transmuted Cosine Topp-Leone G family (TrCTL-G), which combines the flexibility of the TL distribution with the periodicity of the cosine function through transmutation. This family has the potential to support novel approaches for modeling complex phenomena that display both stochastic and oscillatory characteristics.
2.0 TrCTL-G Family
Here, we present the Transmuted Cosine Topp-Leone G family (TrCTL-G). We first present the Transmuted-G, Cos-G, and Topp-Leone-G families on which it is built.
2.1 Transmuted-G Family
Transmutation adds an extra parameter to an existing distribution. Let G(x) and g(x)
be the cumulative distribution function (cdf) and probability density function (pdf) of a baseline distribution, respectively, indexed by the baseline's own parameter vector. The cdf of the transmuted-G family, as derived by (Shaw & Buckley, 2007), is expressed as
which can also be simplified as
The pdf is obtained by taking the first derivative of (2), which gives
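For reference, the quadratic rank transmutation map is commonly written in the form below. This is a sketch under stated assumptions: G(x) and g(x) denote the baseline cdf and pdf, and the symbol λ (with |λ| ≤ 1) is chosen here for the transmutation parameter.

```latex
\begin{align*}
F(x) &= (1+\lambda)\,G(x) - \lambda\,G(x)^{2}
      = G(x)\,\bigl[\,1 + \lambda - \lambda\,G(x)\,\bigr], \qquad |\lambda| \le 1, \\
f(x) &= g(x)\,\bigl[\,1 + \lambda - 2\lambda\,G(x)\,\bigr].
\end{align*}
```

Setting λ = 0 recovers the baseline distribution.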
2.2 Cosine G family
The use of trigonometric transformations to generate new distributions has attracted the interest of many researchers. (Souza et al., 2019) introduced the Cos-G family and defined its cdf and pdf as follows:
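A commonly cited form of the Cos-G cdf and pdf is sketched below, assuming G(x) and g(x) denote the cdf and pdf of the baseline distribution.

```latex
\begin{align*}
F(x) &= 1 - \cos\!\Bigl(\tfrac{\pi}{2}\,G(x)\Bigr), \\
f(x) &= \tfrac{\pi}{2}\,g(x)\,\sin\!\Bigl(\tfrac{\pi}{2}\,G(x)\Bigr).
\end{align*}
```

Because G(x) runs from 0 to 1, the argument of the cosine runs from 0 to π/2, so F(x) increases from 0 to 1 without adding a parameter.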
2.3 Transmuted Cosine G family
Here we propose the Transmuted Cosine G family by combining the Cos-G family and the Transmuted-G family. The cumulative distribution function (cdf) and probability density function (pdf) of the Transmuted Cosine G family are expressed below:
The pdf can be written as
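One consistent reading of this combination applies the transmutation map u ↦ (1+λ)u − λu² to the Cos-G cdf 1 − cos(πG(x)/2). The sketch below follows that assumption; λ (with |λ| ≤ 1) is a symbol chosen here for the transmutation parameter.

```latex
\begin{align*}
F(x) &= (1+\lambda)\Bigl[1 - \cos\!\Bigl(\tfrac{\pi}{2}G(x)\Bigr)\Bigr]
        - \lambda\Bigl[1 - \cos\!\Bigl(\tfrac{\pi}{2}G(x)\Bigr)\Bigr]^{2}, \\
f(x) &= \tfrac{\pi}{2}\,g(x)\,\sin\!\Bigl(\tfrac{\pi}{2}G(x)\Bigr)
        \Bigl[\,1 - \lambda + 2\lambda\cos\!\Bigl(\tfrac{\pi}{2}G(x)\Bigr)\Bigr].
\end{align*}
```

The bracket in the pdf comes from 1 + λ − 2λ[1 − cos(πG(x)/2)], which simplifies to 1 − λ + 2λ cos(πG(x)/2).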
2.4 Topp-Leone G family
The cumulative distribution function (cdf) and probability density function (pdf) of the Topp-Leone G family, as introduced by (Al-Shomrani et al., 2016), are given by
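The Topp-Leone G family is usually stated in the form sketched below, assuming G(x) and g(x) are the baseline cdf and pdf and α > 0 is a symbol chosen here for the Topp-Leone shape parameter.

```latex
\begin{align*}
F(x) &= \bigl[\,1 - (1 - G(x))^{2}\,\bigr]^{\alpha}
      = \bigl[\,G(x)\,(2 - G(x))\,\bigr]^{\alpha}, \qquad \alpha > 0, \\
f(x) &= 2\alpha\,g(x)\,\bigl(1 - G(x)\bigr)\,\bigl[\,1 - (1 - G(x))^{2}\,\bigr]^{\alpha - 1}.
\end{align*}
```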
By taking the Topp-Leone G cdf in (8) as the parent distribution in the transmuted cosine G family in (6), we propose the new transmuted cosine Topp-Leone G (TrCTL-G) family, with cumulative distribution function (cdf) and probability density function (pdf) given as
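Under the composition described here, the TrCTL-G cdf and pdf can be sketched as follows. The notation is an assumption made for illustration: λ (|λ| ≤ 1) is the transmutation parameter, α > 0 the Topp-Leone shape parameter, and H(x) abbreviates the Topp-Leone G cdf.

```latex
\begin{align*}
H(x) &= \bigl[\,1 - (1 - G(x))^{2}\,\bigr]^{\alpha}, \\
F(x) &= (1+\lambda)\Bigl[1 - \cos\!\Bigl(\tfrac{\pi}{2}H(x)\Bigr)\Bigr]
        - \lambda\Bigl[1 - \cos\!\Bigl(\tfrac{\pi}{2}H(x)\Bigr)\Bigr]^{2}, \\
f(x) &= \pi\alpha\,g(x)\,\bigl(1 - G(x)\bigr)\,\bigl[\,1 - (1 - G(x))^{2}\,\bigr]^{\alpha-1}
        \sin\!\Bigl(\tfrac{\pi}{2}H(x)\Bigr)
        \Bigl[\,1 - \lambda + 2\lambda\cos\!\Bigl(\tfrac{\pi}{2}H(x)\Bigr)\Bigr].
\end{align*}
```

The pdf follows from the chain rule: the derivative of H(x) is 2α g(x)(1 − G(x))[1 − (1 − G(x))²]^(α−1), and it multiplies the transmuted cosine density evaluated at H(x).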
The corresponding survival function and failure rate (hazard) function of the TrCTL-G family are derived as
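Both functions follow from the general definitions S(x) = 1 − F(x) and h(x) = f(x)/S(x). The sketch below assumes the composed form of the TrCTL-G cdf, with H(x) the Topp-Leone G cdf, λ the transmutation parameter, and α the Topp-Leone shape parameter (symbols chosen here for illustration).

```latex
\begin{align*}
S(x) &= 1 - F(x)
      = \cos\!\Bigl(\tfrac{\pi}{2}H(x)\Bigr)
        \Bigl[\,1 - \lambda + \lambda\cos\!\Bigl(\tfrac{\pi}{2}H(x)\Bigr)\Bigr],
        \qquad H(x) = \bigl[\,1 - (1 - G(x))^{2}\,\bigr]^{\alpha}, \\
h(x) &= \frac{f(x)}{S(x)}.
\end{align*}
```

The factorization uses 1 − (1+λ)W + λW² = (1 − W)(1 − λW) with W = 1 − cos(πH(x)/2).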
Quantile function of the TrCTL-G family
We can derive the quantile function of the TrCTL-G family of distributions as follows:
Solving the nonlinear equation (12) with the quadratic formula gives
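The inversion can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the cdf is taken to be the transmutation map applied to the cosine of the Topp-Leone G cdf, the baseline is a Weibull with an assumed shape/scale parameterization, and the names `alpha`, `lam`, and all function names are hypothetical.

```python
import math

def weibull_cdf(x, shape, scale):
    # Assumed baseline parameterization: G(x) = 1 - exp(-(x/scale)^shape)
    return 1.0 - math.exp(-((x / scale) ** shape))

def weibull_quantile(p, shape, scale):
    # Inverse of the assumed Weibull cdf, valid for 0 <= p < 1
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)

def trctl_cdf(x, alpha, lam, shape, scale):
    g = weibull_cdf(x, shape, scale)
    h = (1.0 - (1.0 - g) ** 2) ** alpha          # Topp-Leone G layer
    w = 1.0 - math.cos(0.5 * math.pi * h)        # cosine layer
    return (1.0 + lam) * w - lam * w * w         # transmutation layer

def trctl_quantile(u, alpha, lam, shape, scale):
    # Step 1: solve lam*w^2 - (1+lam)*w + u = 0 for the root lying in [0, 1]
    if abs(lam) < 1e-12:
        w = u
    else:
        disc = (1.0 + lam) ** 2 - 4.0 * lam * u
        w = ((1.0 + lam) - math.sqrt(disc)) / (2.0 * lam)
    w = min(max(w, 0.0), 1.0)                    # guard against rounding error
    # Step 2: undo the cosine layer
    h = (2.0 / math.pi) * math.acos(1.0 - w)
    # Step 3: undo the Topp-Leone layer
    g = 1.0 - math.sqrt(1.0 - h ** (1.0 / alpha))
    # Step 4: undo the baseline cdf
    return weibull_quantile(g, shape, scale)

if __name__ == "__main__":
    x_median = trctl_quantile(0.5, 1.5, 0.5, 2.0, 1.0)
    print(x_median, trctl_cdf(x_median, 1.5, 0.5, 2.0, 1.0))
```

The minus sign in front of the square root selects the root that equals 0 at u = 0 and 1 at u = 1 for every nonzero λ in [−1, 1]. Feeding uniform random numbers into `trctl_quantile` would give inverse-transform samples.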
Useful expansion of the distribution
In this section, we derive a useful representation of the pdf of the TrCTL-G family. The Taylor series expansions of the sine and cosine functions are given as
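For reference, the standard Maclaurin series of the two functions are written out below; the summation index k and the argument z are notation chosen here.

```latex
\begin{align*}
\sin z &= \sum_{k=0}^{\infty} \frac{(-1)^{k}}{(2k+1)!}\, z^{2k+1}, &
\cos z &= \sum_{k=0}^{\infty} \frac{(-1)^{k}}{(2k)!}\, z^{2k}.
\end{align*}
```

Both series converge for every real z, so they can be applied with z set to π/2 times a cdf value.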
Applying (16) we have;
Using binomial expansion
Applying series and binomial expansion
The pdf of TrCTL-G can now be reduced to
The rth moment of the TrCTL-G can be obtained as
We derive the moment-generating function of the TrCTL-G below
When applying the exponential expansion (Taylor series expansion) we have,
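The link between the moment-generating function and the moments can be sketched as follows, assuming X follows the TrCTL-G family with pdf f(x), that the moments exist, and that the series converges for the values of t considered. The notation μ'_r for the rth raw moment is chosen here.

```latex
\begin{align*}
\mu'_{r} &= E\bigl(X^{r}\bigr) = \int_{-\infty}^{\infty} x^{r} f(x)\,dx, \\
M_{X}(t) &= E\bigl(e^{tX}\bigr)
          = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx
          = \sum_{r=0}^{\infty} \frac{t^{r}}{r!}\,\mu'_{r}.
\end{align*}
```

The last equality uses the exponential series e^{tx} = Σ (tx)^r / r! and exchanges summation and integration.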
Parameter estimation
The most frequently used parameter estimation technique in the literature is the maximum likelihood method, owing to its accuracy and consistency. We therefore estimate the parameters of this family by maximum likelihood only.
Let x_1, x_2, ..., x_n be a random sample of size n from the TrCTL-G family, and consider a p × 1 vector of parameters
The following is an expression for the log-likelihood function:
The components of the score function are obtained by taking the partial derivatives of the log-likelihood function with respect to each parameter. The maximum likelihood estimates are obtained by setting these components equal to zero and solving the resulting equations simultaneously. These equations cannot be solved analytically; hence, they must be solved numerically.
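To illustrate what solving numerically involves, the sketch below evaluates a log-likelihood and maximizes it over one parameter by a crude grid search. It rests on several assumptions: the density is the composed TrCTL-G form with an exponential baseline of rate `rate`, the names `alpha` and `lam` are hypothetical labels for the Topp-Leone and transmutation parameters, and the grid search is a stand-in for a proper optimizer rather than the procedure used in the study. The ten observations are the first ten values of dataset two.

```python
import math

def trctle_logpdf(x, alpha, lam, rate):
    # Log-density of the assumed TrCTL-G form with exponential baseline.
    survival_g = math.exp(-rate * x)             # 1 - G(x)
    core = 1.0 - survival_g ** 2                 # 1 - (1 - G(x))^2
    angle = 0.5 * math.pi * core ** alpha
    return (math.log(math.pi * alpha * rate) - 2.0 * rate * x   # log of pi*alpha*g*(1-G)
            + (alpha - 1.0) * math.log(core)
            + math.log(math.sin(angle))
            + math.log(1.0 - lam + 2.0 * lam * math.cos(angle)))

def log_likelihood(data, alpha, lam, rate):
    # Sum of log-densities over the sample
    return sum(trctle_logpdf(x, alpha, lam, rate) for x in data)

def profile_rate(data, alpha, lam, grid):
    # Grid value of the rate with the largest log-likelihood,
    # holding alpha and lam fixed (illustration only)
    return max(grid, key=lambda r: log_likelihood(data, alpha, lam, r))

if __name__ == "__main__":
    sample = [1.3, 1.0, 1.2, 0.9, 1.1, 0.8, 0.5, 1.0, 0.7, 0.5]
    grid = [0.25 * i for i in range(1, 21)]      # 0.25, 0.50, ..., 5.00
    best = profile_rate(sample, 1.5, -0.5, grid)
    print(best, log_likelihood(sample, 1.5, -0.5, best))
```

In practice all parameters would be optimized jointly with a numerical routine, with the constraints |lam| ≤ 1, alpha > 0, and rate > 0 enforced.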
Some special cases of TrCTL-G
Two sub-models of the TrCTL-G family, the transmuted cosine Topp-Leone Weibull (TrCTLW) distribution and the transmuted cosine Topp-Leone exponential (TrCTLE) distribution, are presented in this section.
Transmuted cosine Topp-Leone Weibull
A new distribution from the family is obtained by replacing the baseline functions G(x) and g(x)
with the cumulative distribution function and probability density function of the Weibull distribution, respectively. This yields the transmuted cosine Topp-Leone Weibull (TrCTLW) distribution, with the probability density function presented below.
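The substitution can be sketched numerically as below. This is an illustration under assumptions, not the paper's stated formula: the Weibull cdf is parameterized as 1 − exp(−(x/scale)^shape), the family pdf is taken in the composed transmuted cosine Topp-Leone form, and `alpha`, `lam`, and the function names are hypothetical. The trapezoid rule is used only to check that the density integrates to the matching cdf.

```python
import math

def weibull_cdf(x, shape, scale):
    return 1.0 - math.exp(-((x / scale) ** shape))

def weibull_pdf(x, shape, scale):
    z = x / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))

def trctlw_cdf(x, alpha, lam, shape, scale):
    core = 1.0 - (1.0 - weibull_cdf(x, shape, scale)) ** 2
    w = 1.0 - math.cos(0.5 * math.pi * core ** alpha)
    return (1.0 + lam) * w - lam * w * w

def trctlw_pdf(x, alpha, lam, shape, scale):
    # Note: with alpha < 1 the density is unbounded at x = 0,
    # so evaluate it only for x > 0 in that case.
    G = weibull_cdf(x, shape, scale)
    g = weibull_pdf(x, shape, scale)
    core = 1.0 - (1.0 - G) ** 2
    angle = 0.5 * math.pi * core ** alpha
    return (math.pi * alpha * g * (1.0 - G) * core ** (alpha - 1.0)
            * math.sin(angle) * (1.0 - lam + 2.0 * lam * math.cos(angle)))

def trapezoid(func, a, b, steps):
    # Composite trapezoid rule on [a, b]
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h

if __name__ == "__main__":
    params = (1.5, 0.5, 2.0, 1.0)   # alpha, lam, shape, scale (arbitrary values)
    area = trapezoid(lambda x: trctlw_pdf(x, *params), 0.0, 1.2, 4000)
    print(area, trctlw_cdf(1.2, *params))
```

Replacing the two Weibull functions with the exponential cdf and pdf would give the exponential sub-model in the same way.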
Figure 1 presents the probability density function and hazard function plots of the TrCTLW distribution. The probability density function exhibits reverse-J, right-skewed, and nearly symmetrical shapes, while the hazard function shows increasing and decreasing shapes. These features indicate that the TrCTLW distribution is flexible enough to model many lifetime datasets. Figure 2 displays the cdf and survival plots of the TrCTLW.
Figure 1: TrCTLW PDF and hazard plots
Figure 2: TrCTLW Survival and CDF plots
Transmuted cosine Topp-Leone exponential
Let G(x) and g(x) be the cdf and pdf of the exponential distribution, respectively. We then obtain a new member of the family, the transmuted cosine Topp-Leone exponential (TrCTLE) distribution, with the following pdf.
Figure 3 provides the pdf and hazard function plots of the TrCTLE distribution for various parameter values. The pdf exhibits right-skewed and nearly symmetrical shapes, while the hazard function of the TrCTLE distribution shows increasing and decreasing shapes. These features indicate that the TrCTLE distribution is flexible enough to model diverse lifetime datasets. The cdf and survival function plots of the TrCTLE are presented in Figure 4.
Figure 3: TrCTLE PDF and hazard plots
Figure 4: TrCTLE Survival and CDF plots
Here, we conduct a simulation study to evaluate the accuracy and performance of MLE in the estimation of the TrCTLW distribution’s parameters. Using the quantile function of the TrCTLW distribution, we produce samples of sizes n = 50, 100, 250, 500, and 1000 for 1000 iterations. Then, two sets of the selected parameter values are assigned; set one
and set two
The simulation results compare the actual parameter values with the estimates; the root mean square errors (RMSE) and average biases (AB) are obtained for the two parameter sets and displayed in Table 1. As the sample size rises, the RMSEs decrease for most parameters, while the average biases remain small, although for some parameters they do not shrink steadily toward zero. We may thus conclude that the MLE method performs adequately in estimating the parameters of the TrCTLW distribution.
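The two accuracy measures can be computed as in the short sketch below, which uses their usual definitions: the average bias is the mean of (estimate − true value) over the replications, and the RMSE is the square root of the mean squared difference. The function names and the toy numbers are illustrative and are not taken from the study.

```python
import math

def average_bias(estimates, true_value):
    # Mean signed deviation of the estimates from the true parameter value
    return sum(e - true_value for e in estimates) / len(estimates)

def rmse(estimates, true_value):
    # Square root of the mean squared deviation from the true value
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))

if __name__ == "__main__":
    toy_estimates = [1.1, 0.9, 1.2]   # hypothetical estimates from three replications
    print(average_bias(toy_estimates, 1.0), rmse(toy_estimates, 1.0))
```

In a full study, `estimates` would hold one maximum likelihood estimate per replication for a given parameter and sample size.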
Table 1: The outcome of simulation for the model’s parameter estimation based on MLE
Set one | | | | | | | | | |
Actual values | Sample size | AB | RMSE | AB | RMSE | AB | RMSE | AB | RMSE |
 | 50 | 0.0909 | 0.1637 | -0.0090 | 0.1987 | -0.0403 | 0.0478 | -0.0143 | 0.1189 |
 | 100 | 0.0878 | 0.1408 | 0.0129 | 0.1358 | 0.0416 | 0.0463 | -0.0205 | 0.0962 |
 | 250 | 0.0813 | 0.1180 | 0.0257 | 0.0965 | -0.0428 | 0.0454 | -0.0241 | 0.0736 |
 | 500 | 0.0858 | 0.1091 | 0.0283 | 0.0749 | -0.0424 | 0.0439 | -0.0301 | 0.0600 |
 | 1000 | 0.0850 | 0.0986 | 0.0285 | 0.0574 | -0.0425 | 0.0433 | -0.0316 | 0.0505 |
Set two | | | | | | | | | |
Actual values | Sample size | AB | RMSE | AB | RMSE | AB | RMSE | AB | RMSE |
 | 50 | 0.1940 | 0.2969 | -0.0277 | 0.3040 | -0.1449 | 0.1658 | -0.0997 | 0.2373 |
 | 100 | 0.1906 | 0.2549 | -0.0289 | 0.2056 | -0.1470 | 0.1597 | -0.1163 | 0.1930 |
 | 250 | 0.1664 | 0.2087 | -0.0098 | 0.1418 | -0.1556 | 0.1612 | -0.1178 | 0.1560 |
 | 500 | 0.147 | 0.1783 | 0.0094 | 0.1103 | -0.1619 | 0.1650 | -0.1132 | 0.1396 |
 | 1000 | 0.1329 | 0.1619 | 0.0093 | 0.0812 | -0.1653 | 0.1677 | -0.1033 | 0.1295 |
In this section, we demonstrate the potential of the TrCTL-G family in a practical context through applications to lifetime datasets and investigate how well the family fits real-world data. All computations were conducted in R. We examine two datasets, and the fits of the TrCTLW and TrCTLE are compared with those of the cosine Topp-Leone Weibull (CTLW) and Weibull (WD) distributions. We consider several goodness-of-fit statistics: the Anderson-Darling statistic (AD), the Kolmogorov-Smirnov statistic (KS), and the Cramer-von Mises statistic (CVM). We also compute the log-likelihood along with the information criteria: the Akaike Information Criterion (AIC), the Consistent Akaike Information Criterion (CAIC), and the Bayesian Information Criterion (BIC). The model with the smallest criterion values is taken to have the best fit, as is the model with the smallest KS statistic and the greatest p-value.
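Two of these criteria can be computed from the maximized log-likelihood as sketched below, using the standard definitions AIC = 2k − 2ℓ and BIC = k ln(n) − 2ℓ, where k is the number of estimated parameters and n the sample size. The function names are illustrative. The CAIC is left out of the sketch because its exact formula is not stated here.

```python
import math

def aic(loglik, k):
    # Akaike Information Criterion: 2k - 2*loglik
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    # Bayesian Information Criterion: k*ln(n) - 2*loglik
    return k * math.log(n) - 2 * loglik

if __name__ == "__main__":
    # Log-likelihood -110.58 with four parameters (TrCTLW row, dataset one)
    print(round(aic(-110.58, 4), 2))
```

Reading the value −110.58 reported for the TrCTLW in Table 4 as the log-likelihood with k = 4 gives AIC = 229.16, which agrees with the TrCTLW entry in Table 3.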
Data one: This dataset, presented in Table 2, comprises the total milk production from the first birth of 107 cows of the SINDI race. The data are taken from (Cordeiro & dos Santos Brito, 2012).
Data two: The Pediatric Oncology Group (POG) presented this dataset of standard-risk acute lymphocytic leukemia in children in May 1981 and published it in (Gieser et al., 1998).
Table 2: Lifetime data presentation
Data | Observation |
Dataset one | 0.4365, 0.4260, 0.5140, 0.6907, 0.7471, 0.2605, 0.6196, 0.8781, 0.4990, 0.6058, 0.6891, 0.5770, 0.5394, 0.1479, 0.2356, 0.6012, 0.1525, 0.5483, 0.6927, 0.7261, 0.3323, 0.0671, 0.2361, 0.4800, 0.5707, 0.7131, 0.5853, 0.6768, 0.5350, 0.4151, 0.6789, 0.4576, 0.3259, 0.2303, 0.7687, 0.4371, 0.3383, 0.6114, 0.3480, 0.4564, 0.7804, 0.3406, 0.4823, 0.5912, 0.5744, 0.5481, 0.1131, 0.7290, 0.0168, 0.5529, 0.4530, 0.3891, 0.4752, 0.3134, 0.3175, 0.1167, 0.6750, 0.5113, 0.5447, 0.4143, 0.5627, 0.5150, 0.0776, 0.3945, 0.4553, 0.4470, 0.5285, 0.5232, 0.6465, 0.0650, 0.8492, 0.8147, 0.3627, 0.3906, 0.4438, 0.4612, 0.3188, 0.2160, 0.6707, 0.6220, 0.5629, 0.4675, 0.6844, 0.3413, 0.4332, 0.0854, 0.3821, 0.4694, .3635, 0.4111, 0.5349, 0.3751, 0.1546, 0.4517, 0.2681, 0.4049, 0.5553, 0.5878, 0.4741, 0.3598, 0.7629, 0.5941, 0.6174, 0.6860, 0.0609, 0.6488, 0.2747 |
Dataset two | 1.3, 1.0, 1.2, 0.9, 1.1, 0.8, 0.5, 1.0, 0.7, 0.5, 1.7, 1.1, 0.8, 0.5, 1.2, 0.8, 1.1, 0.9, 1.2, 0.9, 0.8, 0.6, 0.3, 0.8, 0.6, 0.4, 1.1, 1.1, 0.2, 0.8, 0.5, 1.1, 0.1, 0.8, 1.7, 1.0, 0.8, 1.0, 0.8, 1.0, 0.2, 0.8, 0.4, 1.0, 0.2, 0.8, 1.4, 0.8, 0.5, 1.1, 0.9, 1.3, 0.9, 0.4, 1.4, 0.9, 0.5, 1.7, 0.9, 0.8, 0.8, 1.2, 0.9, 0.8, 0.5, 1.0, 0.6, 0.1, 0.2, 0.5, 0.1, 0.1, 0.9, 0.6, 0.9, 0.6, 1.2, 1.5, 1.1, 1.4, 1.2, 1.7, 1.4, 1.0, 0.7, 0.4,0.9, 0.7, 0.8, 0.7, 0.4, 0.9, 0.6, 0.4, 1.2, 2.0, 0.7, 0.5, 0.9, 0.5, 0.9, 0.7, 0.9, 0.7, 0.4, 1.0, 0.7, 0.9, 0.7, 0.5, 1.3, 0.9, 0.8, 1.0, 0.7, 0.7, 0.6, 0.8, 1.1, 0.9, 0.9, 0.8, 0.8, 0.7, 0.7, 0.4, 0.5, 0.4, 0.9, 0.9 , 0.7, 1.0, 1.0, 0.7, 1.3, 1.0, 1.1, 1.1, 0.9, 1.1, 0.8, 1.0, 0.7, 1.6, 0.8, 0.6, 0.8, 0.6, 1.2,0.9, 0.6, 0.8, 1.0, 0.5, 0.8, 1.0, 1.1, 0.8, 0.8, 0.5, 1.1, 0.8, 0.9, 1.1, 0.8, 1.2, 1.1, 1.2, 1.1, 1.2, 0.2, 0.5, 0.7, 0.2,0.5, 0.6, 0.1, 0.4, 0.6, 0.2, 0.5, 1.1, 0.8, 0.6, 1.1, 0.9, 0.6, 0.3, 0.9, 0.8, 0.8, 0.6, 0.4, 1.2, 1.3, 1.0,0.6, 1.2, 0.9, 1.2, 0.9, 0.5, 0.8, 1.0, 0.7, 0.9, 1.0, 0.1, 0.2, 0.1, 0.1, 1.1, 1.0, 1.1, 0.7, 1.1, 0.7, 1.8, 1.2, 0.9, 1.7, 1.2, 1.3, 1.2, 0.9, 0.7, 0.7, 1.2, 1.0, 0.9, 1.6, 0.8, 0.8, 1.1, 1.1, 0.8, 0.6, 1.0, 0.8, 1.1,0.8, 0.5, 1.5, 1.1, 0.8, 0.6, 1.1, 0.8, 1.1, 0.8, 1.5, 1.1, 0.8, 0.4, 1.0, 0.8, 1.4, 0.9, 0.9, 1.0, 0.9, 1.3, 0.8, 1.0, 0.5, 1.0, 0.7, 0.5, 1.4, 1.2, 0.9, 1.1, 0.9, 1.1, 1.0, 0.9, 1.2, 0.9, 1.2, 0.9, 0.5, 0.9, 0.7, 0.3,1.0, 0.6, 1.0, 0.9, 1.0, 1.1, 0.8, 0.5, 1.1, 0.8, 1.2, 0.8, 0.5, 1.5, 1.5, 1.0, 0.8,1.0, 0.5, 1.7, 0.3, 0.6, 0.6, 0.4, 0.5, 0.5, 0.7, 0.4, 0.5, 0.8, 0.5, 1.3, 0.9, 1.3, 0.9, 0.5, 1.2, 0.9, 1.1, 0.9, 0.5, 0.7, 0.5, 1.1 , 1.1, 0.5, 0.8, 0.6, 1.2, 0.8, 0.4, 1.3, 0.8, 0.5, 1.2, 0.7, 0.5, 0.9, 1.3, 0.8, 1.2, 0.9 |
Table 3: The model parameters MLE estimates and information criteria for the dataset one
Model | MLE estimates | | | | AIC | CAIC | BIC |
TrCTLW | 0.3502 | -0.6783 | 2.8620 | 0.5331 | 229.16 | 229.27 | 244.54 |
TrCTLE | 2.2955 | -0.7307 | 1.4308 | - | 285.05 | 285.12 | 296.59 |
CTLW | 0.3837 | - | 3.3219 | 0.3837 | 233.56 | 233.63 | 245.10 |
WD | - | - | 2.7289 | 1.1329 | 231.55 | 231.59 | 239.24 |
Table 4: The test results for the Goodness-of-fit for dataset one
Model | Log-likelihood | KS (p-value) | AD (p-value) | CVM (p-value) |
TrCTLW | -110.58 | 0.08 (0.015) | 2.44 (<0.0001) | 0.43 (<0.0001) |
TrCTLE | -139.52 | 0.14 (<0.0001) | 7.31 (<0.0001) | 1.26 (<0.0001) |
CTLW | -113.78 | 0.09 (0.005) | 8.92 (<0.0001) | 1.50 (<0.0001) |
WD | -113.78 | 0.10 (0.04) | 8.92 (<0.0001) | 1.50 (<0.0001) |
Figure 5: Empirical probability density function plots for the fitted distributions on dataset one
Table 3 presents the MLE estimates of the fitted models and the values of the information criteria (AIC, CAIC, and BIC). The TrCTLW has the lowest AIC and CAIC, while the two-parameter Weibull attains the lowest BIC, so the TrCTLW is competitive in fitting these lifetime data. The log-likelihood and the values of KS, AD, and CVM with their p-values (in parentheses) are shown in Table 4. The proposed distribution appears to be a very competitive model for these data, since it has the largest log-likelihood and the smallest KS, AD, and CVM statistics among the fitted models. The reported KS p-values, however, are below 0.05 for all four models.
Table 5: The model parameters MLE estimates and information criteria for the dataset two
Model | MLE estimates | | | | AIC | CAIC | BIC |
TrCTLW | 0.1246 | -0.6058 | 5.8758 | 4.7018 | -48.97 | -48.58 | -41.28 |
TrCTLE | 1.6035 | -0.6821 | 2.2111 | - | -10.66 | -10.42 | -2.64 |
CTLW | 0.1485 | - | 6.8868 | 5.0200 | -48.38 | -48.15 | -40.37 |
WD | - | - | 2.6012 | 5.3818 | -38.69 | -38.58 | -33.35 |
Table 6: The test results for the Goodness-of-fit for dataset two
Model | Log-likelihood | KS (p-value) | AD (p-value) | CVM (p-value) |
TrCTLW | 28.49 | 0.06 (0.84) | 0.39 (0.37) | 0.06 (0.38) |
TrCTLE | 8.33 | 0.12 (0.07) | 3.80 (<0.0001) | 0.62 (<0.0001) |
CTLW | 27.19 | 0.08 (0.56) | 0.64 (0.09) | 0.10 (0.10) |
WD | 21.34 | 0.08 (0.45) | 0.64 (0.09) | 0.10 (0.11) |
Figure 6: Empirical probability density function plots for the fitted distributions on dataset two
Table 5 displays the AIC, CAIC, and BIC for all models studied. The TrCTLW values are smaller than those of the other distributions, indicating that this new distribution is a highly competitive model for these data. Table 6 shows that the TrCTLW has the greatest KS p-value and the lowest KS, AD, and CVM values for dataset two, which indicates that the TrCTLW distribution fits these data better than the competing models. In addition, we illustrate how well the proposed distribution fits the real data by plotting the fitted TrCTLW density against those of the competing models for datasets one and two. These plots, displayed in Figures 5 and 6, suggest that the TrCTLW fits both datasets more closely than the other models. Based on this, we conclude that the TrCTLW provides the better overall fit to the two datasets.
In this study, we extend the transmuted family introduced by (Shaw & Buckley, 2007) by proposing a new family of continuous distributions, the transmuted cosine Topp-Leone G family (TrCTL-G). Several of its statistical characteristics are examined, including the quantile, hazard, and survival functions, the moments, and the moment-generating function. We use the maximum likelihood (ML) method to estimate the model parameters, and its performance is evaluated through a Monte Carlo simulation. The results show that the ML estimates behave well, with the RMSE generally decreasing as the sample size increases. Finally, the proposed model was fitted to two datasets to illustrate the flexibility of the TrCTL-G family for lifetime data; the TrCTLW outperforms the three competing distributions on the goodness-of-fit statistics and on most of the information criteria.
