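ARTICLE IN PRESS
Statistics & Probability Letters 73 (2005) 189–197
www.elsevier.com/locate/stapro

An exponential inequality for associated variables

Paulo Eduardo Oliveira¹
Dep. Matemática, Univ. Coimbra, Apartado 3008, 3001-454 Coimbra, Portugal
Fax: +351 239832568.
¹ The author was partially supported by Centro de Matemática da Universidade de Coimbra, Fundação para a Ciência e Tecnologia (FCT) and POCTI.

Received 2 April 2004; accepted 22 November 2004
Available online 16 February 2005

0167-7152/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.spl.2004.11.023

Abstract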
We prove an exponential inequality for positively associated and strictly stationary random variables
replacing a uniform boundedness assumption by the existence of Laplace transforms. The proof uses a
truncation technique together with a block decomposition of the sums to allow an approximation to
independence. We show that for geometrically decreasing covariances our conditions are fulfilled,
identifying a convergence rate for the strong law of large numbers.
© 2005 Elsevier B.V. All rights reserved.
Keywords: Association; Exponential inequality

1. Introduction
One of the main tools used for characterizing convergence rates in nonparametric estimation
has been convenient versions of Bernstein type exponential inequalities. There exist several
versions available in the literature for independent sequences of variables with assumptions of
uniform boundedness or some, quite relaxed, control on their (centered or noncentered) moments.
If the independent case is classical in the literature, the treatment of dependent variables is more
recent. The extension to dependent variables was first studied considering m-dependence or
different mixing conditions. An exponential inequality for strong mixing variables was eventually
proved in Carbon (1993) using, besides the strong mixing, the same type of assumptions on the
variables as in the independent case: uniform boundedness or some control
on the moments. Naturally, these latter extensions included some extra terms in the upper
bounds depending on the mixing coefficients. An account of the main results briefly described
above may be found in Bosq (1996). In another direction in controlling dependent variables,
exponential inequalities are also available for martingale differences: Azuma (1967) proved a
version supposing the variables to be uniformly bounded and, more recently, Lesigne
and Volný (2001) obtained an extension assuming only the existence of Laplace transforms.
Another dependence structure that has attracted the interest of probabilists and statisticians
is association, as introduced by Esary et al. (1967). For this dependence structure the idea
of asymptotic independence is not so explicitly stated as in mixing structures. For associated
random variables Birkel (1988) seems to have been the first author to prove some moment
inequalities. An exponential type inequality appeared much later in Ioannides and Roussas
(1998) under the assumption of uniform boundedness and some convenient behaviour on the
covariance structure of the variables. The technique used in this latter reference was adapted in
Henriques and Oliveira (2002a) to prove almost sure consistency results in nonparametric
distribution function estimation, based on associated samples, with description of convergence
rates. In another direction, with a somewhat different method, exponential decay rates for
nonparametric density estimation, also based on associated samples, were proved in Henriques
and Oliveira (2002b). The present article extends the inequality of Ioannides and Roussas
(1998), dropping the boundedness assumption, which is replaced by the existence of
Laplace transforms.
The article is organized as follows: Section 2 describes some auxiliary results and introduces
the truncated variables used to approximate the original variables, the corresponding tails and the
block decomposition of the sums; Section 3 studies the truncated part giving conditions on the
truncating sequence to enable the proof of an exponential inequality for these terms; Section 4
treats the tails left aside from the truncation and, finally, Section 5 summarizes the partial results
into a final theorem. As indicated, the proof technique consists of a truncation which is then
treated using a blocking decomposition of the sums, together with a control on the tails of the
distribution, achieved assuming the existence of Laplace transforms.

2. Definitions, preliminary results and notation
We say that the variables $X_1, X_2, \ldots$ are associated if, for every $n \in \mathbb{N}$ and $f, g : \mathbb{R}^n \to \mathbb{R}$
coordinatewise increasing,
\[ \mathrm{Cov}\bigl( f(X_1, \ldots, X_n),\, g(X_1, \ldots, X_n) \bigr) \ge 0 \]
whenever this covariance exists.
For associated variables there exist some general inequalities justifying the use of assumptions
on the covariance structure. One such inequality, useful in the sequel, appears in
Dewan and Prakasa Rao (1999) and is a generalization of an earlier result by Newman (1980).
It states a version for generating functions of Newman’s (1984) inequality for characteristic
functions.

Lemma 2.1. Let $W_1, \ldots, W_n$ be associated random variables bounded by a constant $M$. Then, for
every $\theta > 0$,
\[ \left| E\left( e^{\theta \sum_{i=1}^{n} W_i} \right) - \prod_{i=1}^{n} E\left( e^{\theta W_i} \right) \right| \le \theta^2 e^{n \theta M} \sum_{1 \le i < j \le n} \mathrm{Cov}(W_i, W_j). \]
This inequality was used in Dewan and Prakasa Rao (1999) to prove an exponential convergence
rate for the nonparametric estimator of the density, but the method used to control all the terms
involved forced the authors to assume a condition that is unattainable for associated variables. In
fact, these authors assumed that $\frac{1}{n} \sum_{i=1}^{n} \mathrm{Cov}(X_1, X_i)$ converges to zero exponentially fast. Now,
for associated variables, all the covariances are nonnegative so that, at best, the series
$\sum_{i=1}^{\infty} \mathrm{Cov}(X_1, X_i)$ is convergent with positive limit and, in such a case, $\frac{1}{n} \sum_{i=1}^{n} \mathrm{Cov}(X_1, X_i)$
converges to zero at the rate $n^{-1}$. That is, only a polynomial decrease rate may be satisfied. The
inequality stated in Lemma 2.1 was later used in Henriques and Oliveira (2002b) to prove a
version of an exponential rate for the kernel estimator for the density, avoiding the difficulties just
described.
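The polynomial rate can be illustrated numerically. The sketch below assumes a hypothetical geometric covariance sequence $\mathrm{Cov}(X_1, X_i) = \rho^i$; this choice, the function name `cesaro_mean` and the value of `rho` are illustrative assumptions and are not taken from the cited works. It shows that $n$ times the averaged covariances settles at the positive constant $\rho/(1-\rho)$, so the average itself decreases like $n^{-1}$ and not exponentially fast.

```python
def cesaro_mean(rho, n):
    """(1/n) * sum_{i=1}^{n} rho**i for a hypothetical covariance sequence rho**i."""
    return sum(rho ** i for i in range(1, n + 1)) / n

rho = 0.5  # illustrative value in (0, 1)

# n * cesaro_mean(rho, n) is the partial sum of the series; it converges to a
# positive limit, so cesaro_mean itself decays only like 1/n.
scaled = [n * cesaro_mean(rho, n) for n in (10, 100, 1000)]
limit = rho / (1.0 - rho)

for n, s in zip((10, 100, 1000), scaled):
    print(n, s, limit)
```

Even in this most favourable case of geometrically small covariances the averaged sum is of exact order $1/n$.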
We quote next a general lemma used to control some of the terms appearing in the course of
the proof.

Lemma 2.2 (Devroye, 1991). Let $W$ be a centred random variable. If there exist $a, b \in \mathbb{R}$ such that
$P(a \le W \le b) = 1$, then, for every $\lambda > 0$,
\[ E\left( e^{\lambda W} \right) \le \exp\left( \frac{\lambda^2 (b - a)^2}{8} \right). \]
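As a sanity check of the bound in Lemma 2.2, the following sketch evaluates both sides for a centred two-point variable, whose Laplace transform is available in closed form. The two-point distribution, the helper names and the grid of values of `lam`, `a` and `b` are assumptions made only for this illustration.

```python
import math

def mgf_two_point(lam, a, b):
    """E(exp(lam * W)) for the centred variable W taking values a < 0 < b."""
    # Centring (E(W) = 0) forces P(W = b) = -a / (b - a).
    p_b = -a / (b - a)
    return (1.0 - p_b) * math.exp(lam * a) + p_b * math.exp(lam * b)

def hoeffding_bound(lam, a, b):
    """Right-hand side of Lemma 2.2: exp(lam^2 (b - a)^2 / 8)."""
    return math.exp(lam ** 2 * (b - a) ** 2 / 8.0)

# Illustrative grid of parameters (not from the article).
checks = [(lam, a, b)
          for lam in (0.1, 0.5, 1.0, 3.0)
          for (a, b) in ((-1.0, 1.0), (-0.5, 2.0), (-3.0, 0.25))]
all_ok = all(mgf_two_point(lam, a, b) <= hoeffding_bound(lam, a, b)
             for lam, a, b in checks)
print(all_ok)
```

For the symmetric case $a = -1$, $b = 1$ the inequality reduces to the familiar $\cosh \lambda \le e^{\lambda^2/2}$.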
Next we introduce the notation that will be used throughout the text. Let $c_n$, $n \ge 1$, be a
sequence of nonnegative real numbers such that $c_n \to +\infty$ and, given the random variables
$X_n$, $n \ge 1$, define, for each $i, n \ge 1$,
\[
\begin{aligned}
X_{1,i,n} &= -c_n I_{(-\infty,-c_n)}(X_i) + X_i I_{[-c_n,c_n]}(X_i) + c_n I_{(c_n,+\infty)}(X_i), \\
X_{2,i,n} &= (X_i - c_n) I_{(c_n,+\infty)}(X_i), \qquad X_{3,i,n} = (X_i + c_n) I_{(-\infty,-c_n)}(X_i),
\end{aligned}
\tag{1}
\]
where $I_A$ represents the characteristic function of the set $A$. For each $n \ge 1$ fixed, the variables
$X_{1,1,n}, \ldots, X_{1,n,n}$ are uniformly bounded, thus they may be treated using Lemma 2.1. Note that, for
each $n \ge 1$ fixed, all these variables are monotone transformations of the initial variables $X_n$. This
implies that an association assumption is preserved by this construction.
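The truncation in (1) can be sketched in a few lines. The function name `truncate`, the sample values and the level `c` below are illustrative assumptions; the code only checks the two properties used in the text, namely that the three parts add up to the original value and that the first part is bounded by the truncation level.

```python
def truncate(x, c):
    """Return (X1, X2, X3) of definition (1) for a value x and a level c > 0."""
    x1 = max(-c, min(x, c))        # clipped part, always in [-c, c]
    x2 = x - c if x > c else 0.0   # excess over the upper level c
    x3 = x + c if x < -c else 0.0  # excess below the lower level -c
    return x1, x2, x3

samples = [-7.5, -2.0, -0.3, 0.0, 1.9, 2.0, 4.25]  # illustrative values
c = 2.0                                            # illustrative truncation level
parts = [truncate(x, c) for x in samples]

for x, (x1, x2, x3) in zip(samples, parts):
    print(x, x1, x2, x3, x1 + x2 + x3)
```

Each of the three maps is nondecreasing in `x`, which is the monotonicity that preserves association.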
The proof of an exponential inequality will use, besides the truncation introduced before, a
convenient decomposition of the sums into blocks. This block decomposition is the means for an
approximation to independence technique on the truncated variables. The tails will be treated
directly using Laplace transforms.
Consider a sequence of natural numbers $p_n$ such that, for each $n \ge 1$, $p_n < n/2$ and define $r_n$ as
the greatest integer less than or equal to $n/(2p_n)$. Define then, for $q = 1, 2, 3$ and $j = 1, \ldots, 2r_n$,
\[ Y_{q,j,n} = \sum_{l=(j-1)p_n+1}^{j p_n} \bigl( X_{q,l,n} - E(X_{q,l,n}) \bigr). \tag{2} \]
Finally, for each $q = 1, 2, 3$ and $n \ge 1$, define
\[
\begin{aligned}
Z_{q,n,od} &= \sum_{j=1}^{r_n} Y_{q,2j-1,n}, \qquad Z_{q,n,ev} = \sum_{j=1}^{r_n} Y_{q,2j,n}, \\
R_{q,n} &= \sum_{l=2r_np_n+1}^{n} \bigl( X_{q,l,n} - E(X_{q,l,n}) \bigr).
\end{aligned}
\tag{3}
\]
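The block construction of (2) and (3) can be made concrete with a short sketch. The function name `block_decomposition` and the data are hypothetical; the input list stands for the already centred summands, and the code simply verifies that the odd blocks, the even blocks and the remainder add up to the full sum.

```python
def block_decomposition(values, p):
    """Split centred summands into odd blocks, even blocks and a remainder.

    values: the n centred terms X_{q,l,n} - E(X_{q,l,n}), l = 1..n
    p:      the block length p_n, assumed to satisfy p < n / 2
    """
    n = len(values)
    r = n // (2 * p)                     # r_n = floor(n / (2 p_n))
    # blocks[j - 1] is the block sum Y_{q,j,n} of definition (2), j = 1..2r
    blocks = [sum(values[(j - 1) * p: j * p]) for j in range(1, 2 * r + 1)]
    z_odd = sum(blocks[0::2])            # Y_1 + Y_3 + ...
    z_even = sum(blocks[1::2])           # Y_2 + Y_4 + ...
    remainder = sum(values[2 * r * p:])  # terms with index above 2 r p
    return z_odd, z_even, remainder

# Arbitrary illustrative data: 23 terms, block length 3, hence r = 3.
values = [float(k % 5) - 2.0 for k in range(23)]
z_odd, z_even, rem = block_decomposition(values, p=3)
print(z_odd, z_even, rem, sum(values))
```

Within `z_odd` (and within `z_even`) consecutive blocks are separated by a gap of `p` indices, which is what makes the approximation to independence workable when covariances decay.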
The proof of the main result is now divided into the control of the bounded terms,
corresponding to the index $q = 1$, and the control of the nonbounded terms that correspond to the
indices $q = 2$ and 3.

3. Controlling the bounded terms
From definitions (1) and (2) it is obvious that $|Y_{1,j,n}| \le 2 p_n c_n$, $j = 1, \ldots, 2r_n$. This enables us to
use Lemma 2.2 to control the Laplace transform of these variables. A straightforward application
of this lemma produces the following upper bounds.

Lemma 3.1. Let $X_1, X_2, \ldots$ be random variables. If $Y_{1,j,n}$, $j = 1, \ldots, 2r_n$, are defined by (2) then, for
every $\lambda > 0$,
\[ \prod_{j=1}^{r_n} E\left( e^{\lambda Y_{1,2j-1,n}} \right) \le \exp\left( \lambda^2 n p_n c_n^2 \right), \]
\[ \prod_{j=1}^{r_n} E\left( e^{\lambda Y_{1,2j,n}} \right) \le \exp\left( \lambda^2 n p_n c_n^2 \right). \]
As was done in Ioannides and Roussas (1998) and Henriques and Oliveira (2002a,b), we will
be interested in controlling the difference between the Laplace transform of a sum of variables
and what we would have if the variables were independent. The latter are the terms appearing on the
left-hand side of the inequalities stated in the previous lemma. This control is achieved by summing the
odd indexed terms on one side and the even indexed terms on the other side, as was done in
Henriques and Oliveira (2002b).
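Lemma 3.2. Let $X_1, X_2, \ldots$ be strictly stationary and associated random variables. On account of
definitions (1), (2) and (3), and for every $\lambda > 0$,
\[ \left| E\left( e^{\lambda Z_{1,n,od}} \right) - \prod_{j=1}^{r_n} E\left( e^{\lambda Y_{1,2j-1,n}} \right) \right| \le \frac{\lambda^2 n}{2} e^{\lambda n c_n} \sum_{j=p_n+2}^{(2r_n-1)p_n} \mathrm{Cov}(X_1, X_j) \tag{4} \]
and analogously for the term corresponding to $Z_{1,n,ev}$.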
Proof. According to (3) and the fact that the variables defined in (1) are associated we have, from
a direct application of Lemma 2.1,
\[ \left| E\left( e^{\lambda Z_{1,n,od}} \right) - \prod_{j=1}^{r_n} E\left( e^{\lambda Y_{1,2j-1,n}} \right) \right| \le \lambda^2 r_n p_n e^{2 \lambda r_n p_n c_n} \sum_{1 \le j < j' \le r_n} \mathrm{Cov}(Y_{1,2j-1,n}, Y_{1,2j'-1,n}). \tag{5} \]
As $2 r_n p_n \le n$, the factors in front of the summation and the exponent are bounded above by the
quantities figuring in (4), so we are left with the sum of the covariances to deal with. Using the
stationarity of the variables it follows that
\[ \sum_{1 \le j < j' \le r_n} \mathrm{Cov}(Y_{1,2j-1,n}, Y_{1,2j'-1,n}) = \sum_{j=1}^{r_n-1} (r_n - j)\, \mathrm{Cov}(Y_{1,1,n}, Y_{1,2j+1,n}). \]
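A further invocation of the stationarity implies that
\[
\begin{aligned}
\mathrm{Cov}(Y_{1,1,n}, Y_{1,2j+1,n}) &= \sum_{l=0}^{p_n-1} (p_n - l)\, \mathrm{Cov}(X_{1,1,n}, X_{1,2jp_n+l+1,n}) \\
&\quad + \sum_{l=1}^{p_n-1} (p_n - l)\, \mathrm{Cov}(X_{1,l+1,n}, X_{1,2jp_n+1,n}) \\
&\le p_n \sum_{l=(2j-1)p_n+2}^{(2j+1)p_n} \mathrm{Cov}(X_{1,1,n}, X_{1,l,n}).
\end{aligned}
\tag{6}
\]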
We now analyze the covariances using the Hoeffding formula (see, for example, Lemma 2 in
Lehmann (1966))
\[ \mathrm{Cov}(X_{1,i,n}, X_{1,j,n}) = \int_{\mathbb{R}^2} \bigl[ P(X_{1,i,n} > u, X_{1,j,n} > v) - P(X_{1,i,n} > u) P(X_{1,j,n} > v) \bigr] \, du \, dv. \tag{7} \]
According to the truncation made in (1), it easily follows that the integrand function vanishes
outside the square $[-c_n, c_n]^2$. Moreover, for $u, v \in [-c_n, c_n]$ we may replace, in the integrand
function, the variables $X_{1,i,n}$ and $X_{1,j,n}$ by $X_i$ and $X_j$, respectively, so that
\[
\begin{aligned}
\mathrm{Cov}(X_{1,i,n}, X_{1,j,n}) &= \int_{[-c_n,c_n]^2} \bigl[ P(X_i > u, X_j > v) - P(X_i > u) P(X_j > v) \bigr] \, du \, dv \\
&\le \int_{\mathbb{R}^2} \bigl[ P(X_i > u, X_j > v) - P(X_i > u) P(X_j > v) \bigr] \, du \, dv = \mathrm{Cov}(X_i, X_j)
\end{aligned}
\]
due to the nonnegativity of the latter integrand function, as follows from the association of the
original variables. Inserting this into the inequalities stated earlier, (5) and (6), the lemma
follows. □
We may now prove an exponential inequality for the sum of odd indexed or even indexed terms.
Lemma 3.3. Let $X_1, X_2, \ldots$ be strictly stationary and associated random variables. Suppose that
\[ \frac{n}{p_n^2 c_n^4} \exp\left( \frac{n}{p_n c_n} \right) \sum_{j=p_n+2}^{\infty} \mathrm{Cov}(X_1, X_j) \le C_0 < \infty. \tag{8} \]
Then, for every $\varepsilon \in (0, 1)$,
\[ P\left( \frac{1}{n} |Z_{1,n,od}| > \frac{\varepsilon}{9} \right) \le (1 + 36 C_0) \exp\left( -\frac{n \varepsilon^2}{162 p_n c_n^2} \right) \tag{9} \]
and analogously for $Z_{1,n,ev}$.
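Proof. Applying Markov’s inequality and using the previous lemma we find that, for every $\lambda > 0$,
\[ P\left( \frac{1}{n} |Z_{1,n,od}| > \frac{\varepsilon}{9} \right) \le \frac{\lambda^2 n}{2} \exp\left( \lambda n c_n - \frac{\lambda n \varepsilon}{9} \right) \sum_{j=p_n+2}^{(2r_n-1)p_n} \mathrm{Cov}(X_1, X_j) + \exp\left( \lambda^2 n p_n c_n^2 - \frac{\lambda n \varepsilon}{9} \right). \tag{10} \]
Optimizing the exponent in the last term of this upper bound we find $\lambda = \varepsilon / (18 p_n c_n^2)$, so that this
exponent becomes equal to $-n \varepsilon^2 / (162 p_n c_n^2)$. Replacing this choice of $\lambda$ into the first term of the
upper bound and taking into account (8) it follows that
\[ P\left( \frac{1}{n} |Z_{1,n,od}| > \frac{\varepsilon}{9} \right) \le 36 C_0 \exp\left( -\frac{n \varepsilon^2}{18 p_n c_n^2} \right) + \exp\left( -\frac{n \varepsilon^2}{162 p_n c_n^2} \right) \le (1 + 36 C_0) \exp\left( -\frac{n \varepsilon^2}{162 p_n c_n^2} \right). \quad \square \]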
To complete the treatment of the bounded terms it remains to control the sum corresponding to
the indices following $2 r_n p_n$, that is, $R_{1,n}$.
Lemma 3.4. Let $X_1, X_2, \ldots$ be strictly stationary associated variables and suppose that
\[ \frac{n}{c_n} \to +\infty. \tag{11} \]
Then, on account of definition (3), for $n$ large enough and every $\varepsilon > 0$, we have $P(|R_{1,n}| > n\varepsilon) = 0$.
Proof. As $R_{1,n} = \sum_{l=2r_np_n+1}^{n} \bigl( X_{1,l,n} - E(X_{1,l,n}) \bigr)$ it follows that $|R_{1,n}| \le 2 (n - 2 r_n p_n) c_n \le 2 c_n$, according
to the construction of the sequences $r_n$ and $p_n$. Now $P(|R_{1,n}| > n\varepsilon) \le P(2 > n\varepsilon / c_n)$ and, using
(11), this is zero for $n$ large enough. □
In order to prove the almost sure convergence of $\frac{1}{n} \sum_{i=1}^{n} \bigl( X_{1,i,n} - E(X_{1,i,n}) \bigr)$ and identify a
convergence rate we will allow $\varepsilon$ in the previous lemmas to depend on $n$ in such a way as to define
a convergent series in the upper bound. Assume that, for some $\alpha > 0$ (we will need to be more
precise on the choice of $\alpha$, but that will become apparent later),
\[ \varepsilon_n = 9 \sqrt{2} \left( \frac{\alpha p_n \log n}{n} \right)^{1/2} c_n. \tag{12} \]
Tracing back the proof of Lemma 3.3, this choice of $\varepsilon_n$ means that the optimizing value of $\lambda$ would
now be $\lambda = \frac{1}{\sqrt{2}\, c_n} \left( \frac{\alpha \log n}{n p_n} \right)^{1/2}$. Inserting these expressions in (10) and repeating the
arguments would lead to the following result.
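Lemma 3.5. Let $X_1, X_2, \ldots$ be strictly stationary and associated random variables. Suppose that for
some $\alpha > 0$,
\[ \frac{\log n}{p_n c_n^2} \exp\left( \left( \frac{\alpha n \log n}{2 p_n} \right)^{1/2} \right) \sum_{j=p_n+2}^{\infty} \mathrm{Cov}(X_1, X_j) \le C_0 < \infty. \tag{13} \]
Then, for $\varepsilon_n$ as in (12), we have
\[ P\left( \frac{1}{n} |Z_{1,n,od}| > \frac{\varepsilon_n}{9} \right) \le \left( 1 + \frac{4}{\alpha} C_0 \right) \exp(-\alpha \log n), \tag{14} \]
and analogously for $Z_{1,n,ev}$.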
As for the term $R_{1,n}$, it is treated exactly as in Lemma 3.4. Repeating the arguments used in that
lemma we would be left with the term $P(2 > n \varepsilon_n / c_n)$. But $n \varepsilon_n / c_n \sim (n p_n \log n)^{1/2} \to +\infty$, so the
argument of Lemma 3.4 still applies.
We may now state a theorem summarizing the partial results described in the lemmas of this
section.
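Theorem 3.6. Let $X_1, X_2, \ldots$ be strictly stationary and associated variables satisfying (13) for some
$\alpha > 0$. On account of definitions (1), (2) and (3) it follows that, for every $\varepsilon_n$ as in (12) and $n$ large
enough,
\[ P\left( \frac{1}{n} \left| \sum_{i=1}^{n} \bigl( X_{1,i,n} - E(X_{1,i,n}) \bigr) \right| > \frac{\varepsilon_n}{3} \right) \le 2 \left( 1 + \frac{4}{\alpha} C_0 \right) \exp(-\alpha \log n). \tag{15} \]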
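Proof. It suffices to write
\[ P\left( \frac{1}{n} \left| \sum_{i=1}^{n} \bigl( X_{1,i,n} - E(X_{1,i,n}) \bigr) \right| > \varepsilon \right) \le P\left( \frac{1}{n} |Z_{1,n,od}| > \frac{\varepsilon}{3} \right) + P\left( \frac{1}{n} |Z_{1,n,ev}| > \frac{\varepsilon}{3} \right) + P\left( |R_{1,n}| > \frac{n \varepsilon}{3} \right) \]
and apply the previous lemmas. □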
Note that the result just proved implies the convergence to zero of the upper bound in (15) but
implies the almost sure convergence to zero of $\frac{1}{n} \sum_{i=1}^{n} \bigl( X_{1,i,n} - E(X_{1,i,n}) \bigr)$ only if we may choose
$\alpha > 1$.

4. Controlling the unbounded terms
The variables $X_{2,i,n}$ and $X_{3,i,n}$ are associated but not bounded, even for fixed $n$. This means that
Lemma 2.1 may not be applied to the sum of such terms. But we may note that these variables
depend only on the tails of the distribution of the original variables. So, by controlling the
decrease rate of these tails we may prove an exponential inequality for sums of $X_{2,i,n}$ or $X_{3,i,n}$. For
this control we will not make use of the block decomposition of the sums $\sum_{i=1}^{n} \bigl( X_{q,i,n} - E(X_{q,i,n}) \bigr)$,
as the condition derived would be exactly the same as the one obtained with a direct treatment
(the upper bound derived would be the same, up to the multiplication by a constant).
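We have, for $q = 2, 3$, recalling that the variables are identically distributed,
\[ P\left( \left| \sum_{i=1}^{n} \bigl( X_{q,i,n} - E(X_{q,i,n}) \bigr) \right| > n \varepsilon \right) \le n P\bigl( |X_{q,1,n} - E(X_{q,1,n})| > \varepsilon \bigr) \le \frac{n}{\varepsilon^2} \mathrm{Var}(X_{q,1,n}) \le \frac{n}{\varepsilon^2} E(X_{q,1,n}^2). \]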
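Lemma 4.1. Let $X_1, X_2, \ldots$ be strictly stationary random variables such that there exists $\delta > 0$
satisfying $\sup_{|t| \le \delta} E(e^{t X_1}) \le M_\delta < +\infty$. Then, on account of definition (1), for $t \in (0, \delta]$,
\[ P\left( \left| \sum_{i=1}^{n} \bigl( X_{q,i,n} - E(X_{q,i,n}) \bigr) \right| > n \varepsilon \right) \le \frac{2 M_\delta n e^{-t c_n}}{t^2 \varepsilon^2}, \qquad q = 2, 3. \tag{16} \]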
Proof. According to the inequality stated before this lemma it remains to control $E(X_{q,1,n}^2)$. Let us
fix $q = 2$, the other possible choice being treated analogously. We will set $\bar{F}(x) = P(X_1 > x)$. Now,
using Markov’s inequality it follows that, for $t \in (0, \delta)$, $\bar{F}(x) \le e^{-tx} E(e^{t X_1}) \le M_\delta e^{-tx}$. Writing the
mathematical expectation as a Stieltjes integral and integrating by parts we find
\[ E(X_{2,1,n}^2) = -\int_{(c_n,+\infty)} (x - c_n)^2 \, \bar{F}(dx) = \int_{c_n}^{+\infty} 2 (x - c_n) \bar{F}(x) \, dx \le 2 M_\delta \frac{e^{-t c_n}}{t^2} \]
from which the lemma follows. □
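The last inequality in this proof rests on the elementary identity $\int_{c}^{+\infty} 2 (x - c) e^{-t x} \, dx = 2 e^{-t c} / t^2$, applied after bounding the tail by $M_\delta e^{-tx}$. The sketch below checks that identity numerically with a plain trapezoid rule; the function name `tail_integral`, the cut-off of the integration range and the values of `t` and `c` are assumptions chosen only for this illustration.

```python
import math

def tail_integral(t, c, steps=200_000, span=60.0):
    """Trapezoid approximation of the integral of 2 (x - c) exp(-t x) over (c, +inf).

    The range is cut at c + span / t, beyond which the integrand is negligible.
    """
    upper = c + span / t
    h = (upper - c) / steps

    def f(x):
        return 2.0 * (x - c) * math.exp(-t * x)

    total = 0.5 * (f(c) + f(upper))
    total += sum(f(c + k * h) for k in range(1, steps))
    return total * h

t, c = 0.7, 3.0  # illustrative values
numeric = tail_integral(t, c)
closed_form = 2.0 * math.exp(-t * c) / t ** 2
print(numeric, closed_form)
```

Multiplying the closed form by $M_\delta$ gives the bound on $E(X_{2,1,n}^2)$ used to obtain (16).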
Note that for this step the association of the variables is irrelevant.

5. Strong convergence and rates
This section summarizes the results obtained earlier. In addition, we show that for
geometrically decreasing covariances the assumptions made are fulfilled. In this case we also
find explicitly the convergence rate that follows.
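Theorem 5.1. Let $X_1, X_2, \ldots$ be strictly stationary and associated random variables satisfying (13)
for some $\alpha > 0$. Suppose that $\varepsilon_n$ is as in (12) and there exists $\delta > \alpha$ satisfying $\sup_{|t| \le \delta} E(e^{t X_1}) \le M_\delta < +\infty$.
Then, on account of definitions (1), (2) and (3), for $n$ large enough,
\[ P\left( \frac{1}{n} \left| \sum_{i=1}^{n} \bigl( X_i - E(X_i) \bigr) \right| > \varepsilon_n \right) \le \left( 2 \left( 1 + \frac{4}{\alpha} C_0 \right) + \frac{2 M_\delta n^2}{9 \alpha^3 p_n \log^3 n} \right) \exp(-\alpha \log n). \tag{17} \]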
Proof. Separate the sum on the left of (17) into three terms, apply (15) and (16) with $\varepsilon_n / 3$ in place
of $\varepsilon$ for the latter. Then choose $t = \alpha$ and $c_n = \log n$ in (16), so that the exponents are equal, and
recalculate $\varepsilon_n$ for this choice of $c_n$. □
Notice that this result requires some extra assumptions on the choice of $\alpha$ in order to derive the
almost sure convergence with rate $\varepsilon_n^{-1} \sim n^{1/2} / (p_n^{1/2} \log^{3/2} n)$.
Suppose now that $\mathrm{Cov}(X_1, X_n) = \rho_0 \rho^n$, for some $\rho_0 > 0$ and $\rho \in (0, 1)$. Then (13) may be
rewritten as
\[ \rho_0 \rho^{p_n} \le C_0 \exp\left( -\left( \frac{\alpha n \log n}{2 p_n} \right)^{1/2} \right) p_n \log n \]
or equivalently, as
\[ p_n \log \rho \le -\left( \frac{\alpha n \log n}{2 p_n} \right)^{1/2} + \log \frac{C_0 p_n \log n}{\rho_0}. \]
This leads to the choice $p_n \sim n^{1/3} \log^{1/3} n$ and $\alpha > 2 \log^2 \rho$. Then, the largest order term in the upper
bound of (17) behaves like $n^{5/3 - \alpha} / \log^{10/3} n$, so we should choose $\alpha > 8/3$. The convergence rate that
follows is of order $n^{1/3} / \log^{5/3} n$. This convergence rate is somewhat slower than the rate obtained
in Ioannides and Roussas (1998), where a rate $n^{1/3} / \log^{2/3} n$ was proved. But in this latter reference
the authors considered uniformly bounded variables, so they needed no truncation. Here we used a
truncation of the variables using the sequence $\log n$, which is responsible for our slower
convergence rate.
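The order of the rate can be checked by plugging the choices $p_n = n^{1/3} \log^{1/3} n$ and $c_n = \log n$ into (12). The sketch below does this with real-valued $p_n$ (ignoring the rounding to a natural number); the function names and the value of `alpha` are assumptions made for illustration. With these choices $\varepsilon_n \, n^{1/3} / \log^{5/3} n$ equals the constant $9 \sqrt{2 \alpha}$ for every $n$.

```python
import math

def eps_n(n, alpha, p_n, c_n):
    """epsilon_n of (12): 9 sqrt(2) (alpha p_n log n / n)^(1/2) c_n."""
    return 9.0 * math.sqrt(2.0) * math.sqrt(alpha * p_n * math.log(n) / n) * c_n

def geometric_case(n, alpha):
    """epsilon_n for p_n = (n log n)^(1/3) and c_n = log n, with real-valued p_n."""
    log_n = math.log(n)
    p_n = (n * log_n) ** (1.0 / 3.0)
    return eps_n(n, alpha, p_n, log_n)

alpha = 3.0  # any value above 8/3; chosen only for illustration
ratios = []
for n in (10 ** 3, 10 ** 5, 10 ** 7):
    rate = n ** (1.0 / 3.0) / math.log(n) ** (5.0 / 3.0)
    ratios.append(geometric_case(n, alpha) * rate)  # should not depend on n

print(ratios, 9.0 * math.sqrt(2.0 * alpha))
```

The constant ratio confirms that $\varepsilon_n^{-1}$ grows like $n^{1/3} / \log^{5/3} n$ under these choices.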
If we suppose that the covariances decrease polynomially, the inequality just derived is not
strong enough to identify a convergence rate. In fact, in this case we would be led to a choice of $p_n$
behaving like $n / \log n$ and this would mean that the corresponding $\varepsilon_n$ would not converge to zero.

References
Azuma, K., 1967. Weighted sums of certain dependent random variables. Tôhoku Math. J. 19, 357–367.
Birkel, T., 1988. The invariance principle for associated processes. Stochastic Process. Appl. 27, 57–71.
Bosq, D., 1996. Nonparametric Statistics for Stochastic Processes. Lecture Notes in Statistics, vol. 110, Springer, Berlin.
Carbon, M., 1993. Une nouvelle inégalité de grandes deviations. Applications. Publ. IRMA Lille 32, XI.
Devroye, L., 1991. Exponential inequalities in nonparametric estimation. In: Roussas, G. (Ed.), Nonparametric
Functional Estimation and Related Topics. Kluwer Academic Publishers, Dordrecht, pp. 31–44.
Dewan, I., Prakasa Rao, B.L.S., 1999. A general method of density estimation for associated random variables.
J. Nonparametric Statist. 10, 405–420.
Esary, J.D., Proschan, F., Walkup, D.W., 1967. Association of random variables, with applications. Ann. Math.
Statist. 38, 1466–1474.
Henriques, C., Oliveira, P.E., 2002a. Convergence rates for the estimation of two-dimensional distribution functions
under association and estimation of the covariance of the limit empirical process. Pré-Publicações do Departamento
de Matemática da Universidade de Coimbra 02–05. Preprint.
Henriques, C., Oliveira, P.E., 2002b. Exponential rates for kernel density estimation under association. Pré-Publicações
do Departamento de Matemática da Universidade de Coimbra 02–23. Preprint.
Ioannides, D.A., Roussas, G.G., 1998. Exponential inequality for associated random variables. Statist. Probab. Lett.
42, 423–431.
Lehmann, E.L., 1966. Some concepts of dependence. Ann. Math. Statist. 37, 1137–1153.
Lesigne, E., Volný, D., 2001. Large deviations for martingales. Stochastic Process. Appl. 96, 143–159.
Newman, C., 1980. Normal fluctuation and the FKG inequalities. Comm. Math. Phys. 74, 119–128.
Newman, C., 1984. Asymptotic independence and limit theorems for positively and negatively dependent random
variables. In: Tong, Y.L. (Ed.), Inequalities in Statistics and Probability, IMS Lecture Notes—Monograph Series,
vol. 5, pp. 127–140.