The Beta-Hyperbolic Secant (BHS) Distribution

The shape of a probability distribution is often summarized by its skewness and kurtosis. Starting from a symmetric parent density f on the real line, we can modify its shape (i.e. introduce skewness and increase or decrease kurtosis) if f is appropriately weighted. In particular, every density w on the interval (0, 1) is a specific weighting function. Within this work, we follow up a proposal of Jones (2004) and choose the Beta distribution as the underlying weighting function w. Parent distributions like the Student-t, the logistic and the normal distribution have already been investigated in the literature. Based on the assumption that f is the density of a hyperbolic secant distribution, we introduce the Beta-hyperbolic secant (BHS) distribution. In contrast to the Beta-normal distribution and to the Beta-Student-t distribution, BHS densities are always unimodal and all moments exist. In contrast to the Beta-logistic distribution, the BHS distribution is more flexible regarding the range of skewness and leptokurtosis combinations. Moreover, we propose a generalization which nests both the Beta-logistic and the BHS distribution. Finally, the goodness-of-fit of all above-mentioned distributions is compared for glass fibre data and aluminium returns.


Introduction
Several techniques can be applied to symmetric distributions in order to generate asymmetric ones with possibly lighter or heavier tails. In terms of density functions (provided they exist), most of these methods can be represented by

$$g(x; \theta) = f(x)\, w(F(x); \theta), \tag{1}$$

where $g$ denotes the transformed density, $f$ and $F$ the (symmetric) pdf and cdf, respectively, of the original ("parent") distribution and $w$ is an appropriate weighting function on the interval (0, 1) with parameter vector $\theta$ (see, for instance, Ferreira and Steel, 2004). Choosing $w(u; \lambda) = 2F(\lambda F^{-1}(u))$, the skewing mechanism of Azzalini (1985, 1986) is recovered. Similarly, using

$$w(u; \lambda) = \frac{2}{\lambda + \frac{1}{\lambda}}\, \frac{f\left(\lambda^{\operatorname{sign}(0.5-u)} F^{-1}(u)\right)}{f\left(F^{-1}(u)\right)} \tag{2}$$

corresponds to applying different parameters of scale to the positive and the negative part of a symmetric density (see, for example, Fernández, Osiewalski and Steel, 1995, and Theodossiou, 1998).
In particular, every probability density on (0, 1) which is not uniform can be used to introduce skewness and/or to modify the kurtosis of $f$. Choosing the Beta density as weighting function gives

$$w(u; \beta_1, \beta_2) = \frac{u^{\beta_1-1}(1-u)^{\beta_2-1}}{B(\beta_1, \beta_2)}, \qquad \beta_1, \beta_2 > 0, \tag{3}$$

where $B(a, b) = \int_0^1 t^{a-1}(1-t)^{b-1}\,dt$ denotes the Beta function (cf. Jones, 2004). Examples where (3) has been used in the literature are the following:
- Aroian (1941), Prentice (1975): Beta-logistic distribution (also termed exponential generalized beta of the second kind, EGB2, or log F distribution),
- Eugene et al. (2002): Beta-normal (BN) distribution,
- Jones and Faddy (2003): Beta-Student-t distribution.
Within this work we introduce the BHS (Beta-hyperbolic secant) distribution as a weighted hyperbolic secant distribution with weights from (3). The hyperbolic secant distribution itself dates back to Perks (1932). It is symmetric, more leptokurtic than the normal, even more than the logistic distribution but still with existing moments. Both the cumulative distribution function and the inverse cumulative distribution function are given in closed form. Despite its interesting properties, the hyperbolic secant distribution has not received sufficient attention in the literature so far.
Whereas neither the Beta-normal nor the Beta-Student-t distribution guarantees unimodality (except for a special parameterization given in Ferreira and Steel, 2004), the BHS distribution does. In contrast to the Beta-Student-t distribution, all moments of the BHS distribution exist. Although the Beta-logistic and the BHS distribution are very similar, the BHS distribution will be seen to be more flexible for skewed and leptokurtic data. In order to discriminate between both distribution models, a generalized Beta-GSH model, based on Vaughan's (2002) generalized secant hyperbolic (GSH) distribution, is proposed that includes both candidate distributions as special cases.
The paper is structured as follows: The BHS distribution and some fundamental properties are introduced in section 2. Section 3 is devoted to the parameter estimation of the BHS distribution. A generalization of both the Beta-logistic distribution and the BHS distribution is proposed in section 4. In section 5, the BHS distribution is compared with its competitors derived from alternative parent distributions.

Definition of the Beta-Hyperbolic Secant Distribution
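The probability density function of a hyperbolic secant distribution with zero mean is given by

$$f(x) = \frac{1}{\pi \cosh(x)}. \tag{4}$$

It is symmetric and the corresponding cumulative distribution function is

$$F(x) = \frac{2}{\pi} \arctan(e^{x}). \tag{5}$$

The inverse cumulative distribution function is $F^{-1}(u) = \log(\tan(\pi u/2))$. In this scaling the variance equals $\pi^2/4$ rather than one; the scale parameter introduced below absorbs the difference. Combining (1), (3), (4) and (5), the density of the Beta-hyperbolic secant (BHS) distribution is defined by

$$g(x; \beta_1, \beta_2) = \frac{1}{B(\beta_1, \beta_2)\, \pi \cosh(x)} \left(\frac{2}{\pi} \arctan(e^{x})\right)^{\beta_1 - 1} \left(1 - \frac{2}{\pi} \arctan(e^{x})\right)^{\beta_2 - 1}, \tag{6}$$

where $\beta_1 > 0$ and $\beta_2 > 0$ determine the shape of the density. The corresponding cumulative distribution function is

$$G(x; \beta_1, \beta_2) = \frac{1}{B(\beta_1, \beta_2)} \int_0^{F(x)} t^{\beta_1 - 1} (1 - t)^{\beta_2 - 1}\, dt,$$

i.e. the regularized incomplete Beta function evaluated at $F(x)$. Introducing a location parameter $\mu \in \mathbb{R}$ and a scale parameter $\sigma > 0$, the BHS density from (6) generalizes to

$$g(x; \mu, \sigma, \beta_1, \beta_2) = \frac{1}{\sigma}\, g\!\left(\frac{x - \mu}{\sigma}; \beta_1, \beta_2\right).$$

Different densities and their corresponding log-densities with $\mu = 0$, $\sigma = 1$ and varying $\beta_1$, $\beta_2$ are plotted in figure 1.

Define $\theta \equiv (\beta_1 - \beta_2)/2$ and $\beta \equiv (\beta_1 + \beta_2)/2 > 0$. Then $\beta + \theta = \beta_1$ and $\beta - \theta = \beta_2$, and equation (3) can be rewritten as

$$w(u; \beta, \theta) = C(\beta, \theta)\, [u(1-u)]^{\beta - 1} \left[\frac{u}{1-u}\right]^{\theta},$$

with normalizing constant $C(\beta, \theta) = 1/B(\beta + \theta, \beta - \theta)$; in the symmetric case $\theta = 0$, $C(\beta, 0) = 1$ only if $\beta = 1$. Thus, the weighting density can be partitioned into two parts, where the first part essentially governs the amount of kurtosis and the second part the amount of skewness (see figure 2, where both parts are plotted separately). Consequently, a second parameterization of the BHS density is given by

$$g(x; \beta, \theta) = \frac{f(x)}{B(\beta + \theta, \beta - \theta)}\, \big[F(x)(1 - F(x))\big]^{\beta - 1} \left[\frac{F(x)}{1 - F(x)}\right]^{\theta},$$

with $f$ and $F$ from (4) and (5), where symmetry corresponds to $\theta = 0$.

In order to ensure the existence of the Beta function in the last equation, both $\beta + \theta$ and $\beta - \theta$ have to be positive. Hence, it is required that $|\theta| < \beta$, i.e. highly leptokurtic data (that is, small $\beta$) impose tighter restrictions on $\theta$. It also becomes obvious from the above parameterization that $\beta_1$ and $\beta_2$ jointly determine skewness and kurtosis (measured by the third and fourth standardized moment within this work).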
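The definitions above translate directly into code. The following is a minimal sketch, assuming the parent density $f(x) = 1/(\pi\cosh(x))$ that matches the inverse cdf $F^{-1}(u) = \log(\tan(\pi u/2))$; all function names are illustrative and not taken from the paper. Sampling uses the fact that $F^{-1}(U)$ is BHS distributed when $U$ follows a Beta($\beta_1$, $\beta_2$) distribution.

```python
import math
import random

def hs_pdf(x):
    """Parent density f(x) = 1 / (pi * cosh(x))."""
    return 1.0 / (math.pi * math.cosh(x))

def hs_cdf(x):
    """Parent cdf F(x) = (2 / pi) * arctan(exp(x))."""
    return 2.0 / math.pi * math.atan(math.exp(x))

def hs_sf(x):
    """1 - F(x), computed as F(-x) by symmetry to avoid cancellation."""
    return hs_cdf(-x)

def hs_quantile(u):
    """Inverse cdf F^{-1}(u) = log(tan(pi * u / 2)) for 0 < u < 1."""
    return math.log(math.tan(math.pi * u / 2.0))

def bhs_pdf(x, b1, b2, mu=0.0, sigma=1.0):
    """BHS density with shape (b1, b2), location mu and scale sigma.

    Intended for moderate arguments; math.exp and math.cosh overflow
    for |(x - mu) / sigma| beyond roughly 700.
    """
    z = (x - mu) / sigma
    log_beta = math.lgamma(b1) + math.lgamma(b2) - math.lgamma(b1 + b2)
    log_g = (math.log(hs_pdf(z)) + (b1 - 1.0) * math.log(hs_cdf(z))
             + (b2 - 1.0) * math.log(hs_sf(z)) - log_beta)
    return math.exp(log_g) / sigma

def bhs_sample(n, b1, b2, mu=0.0, sigma=1.0, rng=random):
    """Draw n variates by pushing Beta(b1, b2) draws through F^{-1}."""
    return [mu + sigma * hs_quantile(rng.betavariate(b1, b2)) for _ in range(n)]

def trapezoid(func, lo, hi, n):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (hi - lo) / n
    inner = sum(func(lo + i * h) for i in range(1, n))
    return h * (0.5 * func(lo) + inner + 0.5 * func(hi))

if __name__ == "__main__":
    # The density should integrate to (approximately) one.
    total = trapezoid(lambda x: bhs_pdf(x, 2.0, 0.5), -40.0, 40.0, 8000)
    print("integral of density:", total)
    print("five draws:", bhs_sample(5, 2.0, 0.5, rng=random.Random(1)))
```

Working on the log scale keeps the product of powers stable for large shape parameters; for very small shape parameters a Beta draw can underflow to 0, in which case `hs_quantile` raises an error.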
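Lemma 2 (Tail behavior). The BHS distribution has exponentially decaying tails. In particular, the log-density is asymptotically linear with slope determined by $\beta_1$ (left tail) and $\beta_2$ (right tail), respectively.
Proof. Assume $\mu = 0$, $\sigma = 1$ and focus on the right tail of the BHS distribution, with $f(x) = 1/(\pi\cosh(x))$ and $F(x) = (2/\pi)\arctan(e^{x})$. From

$$\lim_{x \to \infty} e^{x} f(x) = \frac{2}{\pi}, \qquad \lim_{x \to \infty} F(x) = 1, \qquad \lim_{x \to \infty} e^{x}\,(1 - F(x)) = \frac{2}{\pi}$$

we conclude that for large $x$

$$g(x; \beta_1, \beta_2) \approx \frac{(2/\pi)^{\beta_2}}{B(\beta_1, \beta_2)}\, e^{-\beta_2 x}, \qquad \text{i.e.} \qquad \log g(x; \beta_1, \beta_2) \approx \text{const} - \beta_2 x.$$

In particular, $\beta_2 < 1$ corresponds to distributions with heavier than plain exponential tails, $\beta_2 > 1$ to distributions with lighter than plain exponential tails. The same argument applies to the left tail, with $\beta_1$ in place of $\beta_2$.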
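The tail statement of Lemma 2 can be checked numerically. The sketch below assumes the parent cdf $F(x) = (2/\pi)\arctan(e^{x})$ and evaluates a finite-difference slope of the log-density far out in each tail; the function names are illustrative.

```python
import math

def bhs_logpdf(x, b1, b2):
    """Log-density of the BHS distribution with mu = 0 and sigma = 1."""
    log_beta = math.lgamma(b1) + math.lgamma(b2) - math.lgamma(b1 + b2)
    log_f = -math.log(math.pi * math.cosh(x))
    log_cdf = math.log(2.0 / math.pi * math.atan(math.exp(x)))
    # 1 - F(x) equals F(-x) by symmetry of the parent distribution.
    log_sf = math.log(2.0 / math.pi * math.atan(math.exp(-x)))
    return log_f + (b1 - 1.0) * log_cdf + (b2 - 1.0) * log_sf - log_beta

def tail_slope(x, b1, b2, h=1.0):
    """Finite-difference slope of the log-density between x and x + h."""
    return (bhs_logpdf(x + h, b1, b2) - bhs_logpdf(x, b1, b2)) / h

if __name__ == "__main__":
    b1, b2 = 2.0, 0.5
    print("right-tail slope:", tail_slope(30.0, b1, b2), "vs -b2 =", -b2)
    print("left-tail slope: ", tail_slope(-30.0, b1, b2), "vs +b1 =", b1)
```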
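Additionally, we derive the score function of the BHS distribution, which plays an important role in the theory of rank tests (see, e.g., Kravchuk, 2005, for $\beta_1 = \beta_2 = 1$).
Lemma 3 (Score function). With $\zeta(x) \equiv \arctan(e^{x})$, the score function of a BHS variable, taken here as $-\partial \log g(x; \beta_1, \beta_2)/\partial x$, is given by

$$\tanh(x) - \frac{1}{2\cosh(x)} \left[\frac{\beta_1 - 1}{\zeta(x)} - \frac{\beta_2 - 1}{\pi/2 - \zeta(x)}\right].$$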
Finally, it can be shown (see Appendix A for a detailed proof) that BHS densities are unimodal for all $\beta_1, \beta_2 > 0$. In general, this does not hold for the Beta-normal and the Beta-Student-t distribution.

Special and limiting cases
First of all, for $\beta_1 = \beta_2 = 1$ the hyperbolic secant distribution is recovered. Setting $\beta_2 = 1$ or $\beta_1 = 1$ yields skew hyperbolic secant distributions. A generalized symmetric family of hyperbolic secant distributions is obtained for $\beta_1 = \beta_2 = \beta$, where $\beta$ governs the amount of kurtosis. Like the Beta-logistic distribution and the Beta-normal distribution, the BHS distribution converges to the normal distribution for $\beta_1, \beta_2 \to \infty$.

Moments of the BHS distribution
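The exponential tail behaviour of the BHS distribution guarantees the existence of all moments. In particular, the $m$-th non-central moment of a BHS density is given by

$$E(X^m) = \int_{-\infty}^{\infty} x^m g(x; \beta_1, \beta_2)\, dx = \frac{1}{B(\beta_1, \beta_2)} \int_0^1 \left[\log \tan\left(\frac{\pi u}{2}\right)\right]^m u^{\beta_1 - 1} (1 - u)^{\beta_2 - 1}\, du.$$

From Gradshteyn and Ryzhik (1994), formulas 1.518.3 and 9.616, we can write

$$\log \tan\left(\frac{\pi u}{2}\right) = \log\left(\frac{\pi u}{2}\right) + \sum_{k=1}^{\infty} a_k u^{2k}, \qquad a_k = \frac{(1 - 2^{1-2k})\, \zeta(2k)}{k}, \tag{8}$$

with the usual Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$. Using this expansion, the following lemma can be derived.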
Lemma 5 (Moments of the BHS distribution). Assume that m > 0.
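In particular, the mean of the BHS distribution is given by

$$E(X) = \log\left(\frac{\pi}{2}\right) + \psi(\beta_1) - \psi(\beta_1 + \beta_2) + \sum_{k=1}^{\infty} a_k\, \frac{B(\beta_1 + 2k, \beta_2)}{B(\beta_1, \beta_2)} \tag{9}$$

with the coefficients $a_k = (1 - 2^{1-2k})\, \zeta(2k)/k$ from (8). Note that $\psi$ denotes the digamma function in the last equation. In contrast to (9), the corresponding formula for the Beta-logistic distribution (with a standard logistic parent) is given by

$$E(X) = \psi(\beta_1) - \psi(\beta_2).$$

From the first four moments we can deduce the skewness and kurtosis coefficients $M_3$ and $M_4$ (i.e. the third and fourth standardized moments) for different parameter combinations of the BHS distribution.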
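As a cross-check on closed-form moment expressions, the first four moments can also be obtained by direct quadrature of the density. The sketch below is not the series approach of this section; it assumes the parent density $f(x) = 1/(\pi\cosh(x))$, uses a plain trapezoidal rule on a truncated range, and all names and the truncation heuristic are illustrative choices.

```python
import math

def bhs_logpdf(x, b1, b2):
    """Log-density of the BHS distribution with mu = 0 and sigma = 1."""
    log_beta = math.lgamma(b1) + math.lgamma(b2) - math.lgamma(b1 + b2)
    log_f = -math.log(math.pi * math.cosh(x))
    log_cdf = math.log(2.0 / math.pi * math.atan(math.exp(x)))
    log_sf = math.log(2.0 / math.pi * math.atan(math.exp(-x)))
    return log_f + (b1 - 1.0) * log_cdf + (b2 - 1.0) * log_sf - log_beta

def bhs_shape(b1, b2, n=20001, tail=50.0):
    """Return (mean, variance, M3, M4) by trapezoidal quadrature.

    The integration range is widened for small shape parameters, since
    the tails decay like exp(-b * |x|); it is capped at 700 to stay
    inside the range of math.exp.
    """
    half_width = min(tail / min(b1, b2, 1.0), 700.0)
    step = 2.0 * half_width / (n - 1)
    xs = [-half_width + i * step for i in range(n)]
    ws = [math.exp(bhs_logpdf(x, b1, b2)) * step for x in xs]
    ws[0] *= 0.5   # trapezoidal end-point weights
    ws[-1] *= 0.5
    total = sum(ws)  # close to one; used to renormalize
    mean = sum(w * x for w, x in zip(ws, xs)) / total

    def central(m):
        return sum(w * (x - mean) ** m for w, x in zip(ws, xs)) / total

    var = central(2)
    return mean, var, central(3) / var ** 1.5, central(4) / var ** 2

if __name__ == "__main__":
    for shape in [(1.0, 1.0), (2.0, 1.0), (0.5, 0.5)]:
        print(shape, bhs_shape(*shape))
```

For $\beta_1 = \beta_2 = 1$ the routine should reproduce the hyperbolic secant values: zero mean, variance $\pi^2/4$ in this scaling, zero skewness and kurtosis 5.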

Moment ratio diagrams
Moment ratio diagrams have been introduced for Pearson-type distributions by Elderton and Johnson (1969) in order to provide a useful visual assessment of skewness and kurtosis. The classical moment ratio plot consists of all possible pairs $(M_3, M_4)$ that can be obtained through different combinations of the shape parameters of the underlying distributions. In general, the relation $M_3^2 < M_4 - 1$ for $M_4 > 0$ holds, i.e. for a given level of kurtosis only a finite range of skewness may be spanned.
Due to the bimodality of the Beta-normal distribution and the non-existence of some moments of the Beta-Student-t distribution, we only compare the BHS distribution with the Beta-logistic (EGB2) distribution in figure 3, below. The possible combinations of skewness and kurtosis (for a given distribution) are indicated by the black area, which was generated from a large number of random draws from the domain of the shape parameters $(\beta_1, \beta_2)$. The dashed line (encompassing the black area) corresponds to the boundary mentioned above. Note that we plotted the exponentiated kurtosis against the skewness in order to highlight the differences between the EGB2 and the BHS distribution. It then becomes visible that the achievable area of the BHS distribution includes that of the EGB2 distribution.

Generalizations: EGB2 versus BHS distribution
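In order to discriminate between the Beta-logistic (EGB2) and the BHS distribution, we can plug a parent distribution into (3) which includes both the logistic distribution and the hyperbolic secant distribution. A promising choice is the GSH distribution of Vaughan (2002) with kurtosis parameter $t > -\pi$ and density

$$f_{GSH}(x; t) = c_1(t)\, \frac{\exp(c_2(t)\, x)}{\exp(2 c_2(t)\, x) + 2 a(t) \exp(c_2(t)\, x) + 1}, \qquad x \in \mathbb{R},$$

where

$$a(t) = \cos(t), \quad c_2(t) = \sqrt{\frac{\pi^2 - t^2}{3}}, \quad c_1(t) = \frac{\sin(t)}{t}\, c_2(t) \qquad \text{for } t \in (-\pi, 0),$$

$$a(t) = 1, \quad c_2(t) = \frac{\pi}{\sqrt{3}}, \quad c_1(t) = c_2(t) \qquad \text{for } t = 0,$$

$$a(t) = \cosh(t), \quad c_2(t) = \sqrt{\frac{\pi^2 + t^2}{3}}, \quad c_1(t) = \frac{\sinh(t)}{t}\, c_2(t) \qquad \text{for } t > 0.$$

The GSH distribution includes the logistic distribution ($t = 0$) and the hyperbolic secant distribution ($t = -\pi/2$) as special cases and has cumulative distribution function given by

$$F_{GSH}(x; t) = \begin{cases} 1 + \dfrac{1}{t}\, \operatorname{arccot}\left(-\dfrac{\exp(c_2(t)\, x) + \cos(t)}{\sin(t)}\right) & \text{for } t \in (-\pi, 0), \\[2ex] \dfrac{\exp(\pi x/\sqrt{3})}{1 + \exp(\pi x/\sqrt{3})} & \text{for } t = 0, \\[2ex] 1 - \dfrac{1}{t}\, \operatorname{arccoth}\left(\dfrac{\exp(c_2(t)\, x) + \cosh(t)}{\sinh(t)}\right) & \text{for } t > 0. \end{cases}$$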
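The nesting property can be illustrated numerically. The sketch below assumes the unit-variance GSH parameterization commonly attributed to Vaughan (2002), with constants $a(t)$, $c_1(t)$ and $c_2(t)$ defined piecewise in $t$; function names are illustrative. It evaluates the density in the algebraically equivalent form $c_1 / (2(\cosh(c_2 x) + a))$, which avoids large intermediate exponentials.

```python
import math

def gsh_constants(t):
    """Return (a, c1, c2) of the GSH density for kurtosis parameter t > -pi."""
    if t <= -math.pi:
        raise ValueError("t must be greater than -pi")
    if t < 0.0:
        a = math.cos(t)
        c2 = math.sqrt((math.pi ** 2 - t ** 2) / 3.0)
        c1 = math.sin(t) / t * c2
    elif t == 0.0:
        a = 1.0
        c2 = math.pi / math.sqrt(3.0)
        c1 = c2
    else:
        a = math.cosh(t)
        c2 = math.sqrt((math.pi ** 2 + t ** 2) / 3.0)
        c1 = math.sinh(t) / t * c2
    return a, c1, c2

def gsh_pdf(x, t):
    """GSH density, written as c1 / (2 * (cosh(c2 * x) + a))."""
    a, c1, c2 = gsh_constants(t)
    return c1 / (2.0 * (math.cosh(c2 * x) + a))

def gsh_moment(t, m, half_width=30.0, n=6000):
    """m-th non-central moment by the trapezoidal rule on a truncated range."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * x ** m * gsh_pdf(x, t)
    return total * h

if __name__ == "__main__":
    x = 0.7
    print("t = -pi/2:", gsh_pdf(x, -math.pi / 2.0),
          "hyperbolic secant:", 1.0 / (2.0 * math.cosh(math.pi * x / 2.0)))
    c = math.pi / math.sqrt(3.0)
    print("t = 0:    ", gsh_pdf(x, 0.0),
          "logistic:", c * math.exp(c * x) / (1.0 + math.exp(c * x)) ** 2)
    print("variance at t = 1:", gsh_moment(1.0, 2))
```

Note that in this parameterization the special case $t = -\pi/2$ is the unit-variance hyperbolic secant density $1/(2\cosh(\pi x/2))$; a scale parameter links it to other scalings of the same distribution.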

Strength of glass fibre
Our first example corresponds to that of Jones and Faddy (2003), who analyzed the strengths of glass fibres. This data set is 'sample 1' of Smith and Naylor (1987) and deals with the breaking strength of n = 63 glass fibres of length 1.5 cm, originally obtained by workers at the UK National Physical Laboratory. Due to the apparent skewness in the data set (see figure 4(a) for a histogram of the data), Jones and Faddy (2003) fitted a Beta-Student-t distribution (using a reparameterized version) to the data, estimating the unknown parameters by means of maximum likelihood. Additionally, we fitted a Beta-normal, a Beta-logistic (EGB2), a Beta-hyperbolic secant (BHS) and a Beta-GSH distribution to the data. The estimation results are summarized in table 1, below. Graphs of the fitted densities are provided in figure 4(b).
Regarding the log-likelihood value L, the Beta-normal distribution seems to fit worst. Both the Beta-logistic and the Beta-hyperbolic secant distribution outperform the Beta-Student-t distribution, in particular if we account for the number of parameters k and focus on the criterion of Akaike, i.e. AIC = −2L + 2k. Moreover, the log-likelihood value of the BHS distribution is higher than that of the EGB2 distribution. Finally, the Beta-GSH distribution provides evidence in favor of the BHS distribution over the EGB2 distribution. Concerning the estimation results of the Beta-Student-t distribution, the parameters $\beta_1$, $\beta_2$, $\nu$ seem to be poorly identified. We therefore fix the number of degrees of freedom at 2, as in Jones and Faddy (2003). Note that the 6th column of table 1 contains the estimated shape parameter beyond $\beta_1$ and $\beta_2$, i.e. the estimated degrees of freedom $\nu$ for the Beta-Student-t distribution and the estimated $t$ of the Beta-GSH distribution, respectively.
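The model comparison via AIC = −2L + 2k amounts to a one-line computation. In the sketch below the log-likelihood values and parameter counts are hypothetical placeholders chosen for illustration only; they are not the entries of table 1.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion, AIC = -2 L + 2 k (smaller is better)."""
    return -2.0 * log_likelihood + 2.0 * n_params

# Hypothetical (log-likelihood, number of parameters) pairs, NOT from table 1.
fits = {
    "model A": (-15.2, 4),
    "model B": (-14.9, 5),
}

ranking = sorted(fits, key=lambda name: aic(*fits[name]))

if __name__ == "__main__":
    for name in ranking:
        print(name, aic(*fits[name]))
```

With these placeholder numbers the extra parameter of model B is not compensated by its slightly higher log-likelihood, so model A ranks first.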

Aluminium returns
Secondly, we focus on the series of daily aluminium prices (in US dollars per tonne) from January 1999 to September 2002 (N = 1195 observations), which can be obtained from the LME (London Metal Exchange). The series of prices and the corresponding log-returns (i.e. differences of consecutive log-prices) are displayed in figure 5.
The (sample) mean of the log-returns is −0.0139 with a (sample) standard deviation of 1.0560. Moreover, there seems to be a certain amount of skewness in the data set (the skewness coefficient, measured by the third standardized moment, is 0.2398), whereas the kurtosis coefficient, in terms of the fourth standardized moment, is 4.4250, reflecting the leptokurtosis of the data. The results of a maximum likelihood estimation are summarized in table 2, below. Though this data set is entirely different from the glass fibre data, the results are nearly identical (concerning the order of the log-likelihood values). Again, the Beta-GSH distribution, with an estimated t = −1.63, favors the BHS distribution over the Beta-logistic distribution, and both outperform the Beta-normal and the Beta-Student-t distribution. Again, the shape parameters of the Beta-Student-t distribution seem to be poorly identified.
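The descriptive statistics quoted above (log-returns and the third and fourth standardized sample moments) can be computed as in the following sketch. The price list is a made-up toy series, not the LME data, and the scaling of returns by 100 (percentage returns) is an assumption suggested by the magnitude of the reported standard deviation; it is not stated in the text.

```python
import math

def log_returns(prices, scale=100.0):
    """Differences of consecutive log-prices, multiplied by `scale`.

    scale=100.0 gives percentage log-returns (an assumption, see above).
    """
    return [scale * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices, prices[1:])]

def standardized_moment(xs, k):
    """k-th central sample moment divided by the k-th power of the std. dev.

    Uses 1/n moments; k = 3 gives the skewness, k = 4 the kurtosis coefficient.
    """
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    mk = sum((x - mean) ** k for x in xs) / n
    return mk / m2 ** (k / 2.0)

if __name__ == "__main__":
    toy_prices = [1500.0, 1512.5, 1498.0, 1505.0, 1530.0, 1521.0]  # hypothetical
    r = log_returns(toy_prices)
    print("skewness:", standardized_moment(r, 3))
    print("kurtosis:", standardized_moment(r, 4))
```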

Summary
A new class of probability densities (the so-called BHS distribution family) is introduced which arises as a special case of the general family explored by Jones (2004) if the hyperbolic secant distribution is chosen as "parent distribution". Its behavior and properties are similar to those of the log F or EGB2 distribution. In particular, the range of possible skewness and kurtosis combinations of the BHS distribution includes that of the EGB2 distribution. Moreover, a generalized distribution model is introduced which includes both the EGB2 and the BHS distribution. Application to glass fibre data and aluminium returns provides empirical evidence in favor of the BHS distribution.
Altogether, this means that g has exactly one root in (−∞, ∞). It also follows that this root yields a relative maximum (and hence the absolute maximum), since g is positive to the left of the root and negative to the right.