An exhaustive ADDIS principle for online FWER control

In this paper, we consider online multiple testing with familywise error rate (FWER) control, where the probability of committing at least one type I error remains under control while testing a possibly infinite sequence of hypotheses over time. Currently, adaptive-discard (ADDIS) procedures seem to be the most promising online procedures with FWER control in terms of power. Our main contribution is a uniform improvement of the ADDIS principle and thus of all ADDIS procedures. This means that the methods we propose reject at least as many hypotheses as ADDIS procedures and in some cases even more, while maintaining FWER control. In addition, we show that there is no other FWER controlling procedure that enlarges the event of rejecting any hypothesis. Finally, we apply the new principle to derive uniform improvements of the ADDIS-Spending and the ADDIS-Graph.


Introduction
The decision in hypothesis testing is usually made by comparing a p-value with a prespecified significance level, e.g. α = 0.05. However, if not only one but several hypotheses are tested, this leads to an inflation of the type I error, and some overall error control is needed. In life sciences, for example, it is often essential to avoid any type I error, and therefore the probability of making at least one false discovery, called the familywise error rate (FWER), should be controlled at a previously defined level α ∈ (0, 1). Classical multiple testing theory assumes that a finite number of hypotheses H_1, ..., H_m is predefined at the beginning of the evaluation (Bretz et al. 2016). A simple multiple testing procedure for controlling the FWER in this classical setting is the Bonferroni correction, in which each individual hypothesis is tested at level α/m. In many contemporary trials, however, the hypothesis set grows over time and statistical inference is to be made on the already known hypotheses without knowing the future ones. In this paper, we focus on online multiple testing (Foster and Stine 2008; Javanmard and Montanari 2018). There, the hypotheses arrive one at a time, and a decision on the current hypothesis must be made with access only to the previous hypotheses and decisions. Since the number of future hypotheses is also unknown in advance, it is usually assumed to be infinite.
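As a minimal sketch of the Bonferroni correction just described (Python; the function name and the example numbers are ours for illustration):

```python
def bonferroni_rejections(p_values, alpha=0.05):
    """Classical Bonferroni: test each of the m hypotheses at level alpha/m.
    By the union bound, the FWER is then at most m * (alpha/m) = alpha."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

# With m = 5 hypotheses, each one is tested at level 0.05 / 5 = 0.01.
decisions = bonferroni_rejections([0.004, 0.03, 0.2, 0.011, 0.0009])
```

Note that this requires m to be known in advance, which is exactly what the online setting lacks.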
Online multiple testing problems arise, for example, when public databases are used. Here, several researchers access the database at different times, testing various hypotheses. In addition, many public databases grow over time as new data is collected, leading to the testing of further hypotheses (Robertson et al. 2023b). But comparatively smaller studies, such as specific platform trials, can also be formulated as an online multiple testing problem (Robertson et al. 2023a). Another interesting application is the sequential modification of a machine learning algorithm (Feng et al. 2021, 2022). Here, one starts with an initial model, and at each step a hypothesis test is performed to decide whether a modification of the current model improves the performance. If the test rejects, meaning the performance of the modified model is significantly better, the modified model can be used as the new benchmark for future updates. In order to ensure with high probability that the benchmark model constantly improves, online FWER control is required.

Problem formulation
Let (Ω, A) be a measurable space and P be a set of probability distributions on (Ω, A). Unless otherwise stated, P ∈ P denotes the true data generating distribution in the remainder of this paper. We have a sequence of null hypotheses (H_i)_{i∈N} about P with corresponding p-values (P_i)_{i∈N}. The null p-values are assumed to be valid, meaning P(P_i ≤ x) ≤ x for all x ∈ [0, 1] and i ∈ I_0, where I_0 ⊆ N is the index set of true null hypotheses. Note that this includes uniformly distributed null p-values, where P(P_i ≤ x) = x for all x ∈ [0, 1] and i ∈ I_0, and conservative null p-values, for which P(P_i ≤ x) < x for some x ∈ [0, 1]. Let V(i) denote the number of false rejections up to step i ∈ N. The goal is to determine procedures that generate a sequence of individual significance levels (α_i)_{i∈N}, where each α_i is only allowed to depend on the previous p-values P_1, ..., P_{i−1}, such that

FWER(i) := P(V(i) > 0)

is controlled at level α ∈ (0, 1) for each i ∈ N. Here, P denotes the probability under the true configuration of true and false hypotheses. Since P(V(i) > 0) is increasing in i, it is sufficient to control FWER := P(V > 0), where V = lim_{i→∞} V(i). One distinguishes between strong and weak FWER control. Strong control guarantees FWER ≤ α under any configuration of true and false hypotheses, whereas weak control assumes that all null hypotheses are true. We focus on strong control. While controlling the FWER strongly, the power should be maximised, where power is defined as the expected proportion of rejections among the false hypotheses. In this paper, we aim for uniform improvements of existing procedures. By a uniform improvement we mean a procedure that rejects at least as many hypotheses as the initial procedure and, in some cases, even more. This is the case, for example, when the newly constructed procedure tests the hypotheses at larger individual significance levels than the existing one for all data constellations.

Existing literature and contribution
As a first FWER controlling online procedure, Foster and Stine (2008) introduced Alpha-Spending as an online version of the weighted Bonferroni correction, which sets the individual significance levels (α_i)_{i∈N} such that Σ_{i∈N} α_i ≤ α. This method is conservative, meaning that there exist online FWER controlling procedures that uniformly improve Alpha-Spending. One approach to derive such improvements is the closure principle (Marcus et al. 1976). One could either extend existing Bonferroni-based closed procedures to the online setting (Tian and Ramdas 2021) or derive direct improvements of Alpha-Spending via the online closure principle (Fischer et al. 2022). Most of these Alpha-Spending-based closed procedures can be summarised by choosing (α_i)_{i∈N} such that Σ_{j=1}^{i} α_j 1{P_j > α_j} ≤ α for all i ∈ N. Here, if a hypothesis H_i is rejected (P_i ≤ α_i), the significance level can be reused for future testing, which improves the classical Alpha-Spending. However, simulations have shown that these Alpha-Spending-based closed procedures lead to low online power as well (Tian and Ramdas 2021). The problem is that the individual significance level α_i, and thus the probability of reusing a significance level, tends to zero as i tends to ∞. A more promising approach is the ADaptive-DIScard (ADDIS) principle by Tian and Ramdas (2021). The term ADDIS stems from the simultaneous discarding of conservative null p-values based on a parameter 0 < τ_i ≤ 1 and adaptation to the proportion of non-nulls (p-values corresponding to false null hypotheses) using a parameter 0 ≤ λ_i < τ_i. Precisely, an ADDIS procedure determines individual significance levels (α_i)_{i∈N} such that

Σ_{i∈N} α_i/(τ_i − λ_i) · 1{λ_i < P_i ≤ τ_i} ≤ α.  (1)

In contrast to Alpha-Spending-based closed procedures, ADDIS procedures allow the significance level α_i to be reused if either P_i > τ_i or P_i ≤ λ_i. To correct for this improvement, the additional factor 1/(τ_i − λ_i) needs to be included. Since null p-values are often conservative and thus tend to be large, while non-null p-values tend to be small, we expect this trade-off to be useful. The interpretation of ADDIS procedures is as follows: large p-values (P_i > τ_i) are discarded, meaning not tested, and small p-values (P_i ≤ λ_i) are likely to be non-null and thus cannot lead to a type I error. However, to control the FWER, ADDIS procedures need additional assumptions. Firstly, the null p-values (P_i)_{i∈I_0} need to be independent of each other and of the non-nulls. Secondly, in case of τ_i < 1, the null p-values are required to be uniformly valid, meaning P(P_i ≤ xy | P_i ≤ y) ≤ x for all x, y ∈ [0, 1] and i ∈ I_0 (Zhao et al. 2019). In return, ADDIS procedures lead to a high online power (Tian and Ramdas 2021). The ADDIS-Spending (Tian and Ramdas 2021) and the ADDIS-Graph (Fischer et al. 2023) were proposed as concrete ADDIS procedures satisfying the required conditions.
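The ADDIS bookkeeping can be illustrated with a short Python sketch (the parameter values are our illustrative choices): level is only "used up" by a hypothesis that is neither discarded (P_i > τ_i) nor likely non-null (P_i ≤ λ_i), and what is used is inflated by the factor 1/(τ_i − λ_i).

```python
def addis_spend(p_values, alphas, taus, lambdas):
    """Cumulative level spent under the ADDIS principle: alpha_i is only
    charged (scaled by 1/(tau_i - lambda_i)) when lambda_i < P_i <= tau_i."""
    total = 0.0
    for p, a, tau, lam in zip(p_values, alphas, taus, lambdas):
        if lam < p <= tau:
            total += a / (tau - lam)
    return total

# Only the third p-value falls into (0.16, 0.8], so it alone is charged:
spent = addis_spend([0.9, 0.001, 0.3], alphas=[0.016, 0.016, 0.016],
                    taus=[0.8, 0.8, 0.8], lambdas=[0.16, 0.16, 0.16])
```

The ADDIS condition requires this running total to stay below α.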
Although ADDIS procedures are quite powerful, they are based on the Bonferroni inequality, which leads to conservative procedures. It is known that under independence of the p-values, the Bonferroni procedure can be uniformly improved by the Sidak correction (Šidák 1967), which uses the calculation

FWER = P(∪_{i∈I_0} {P_i ≤ α_i}) = 1 − P(∩_{i∈I_0} {P_i > α_i}) = 1 − Π_{i∈I_0} P(P_i > α_i) ≤ 1 − Π_{i∈I_0} (1 − α_i).  (2)

Tian and Ramdas (2021) have already attempted to apply the same idea to ADDIS procedures. However, since the individual significance level α_i of an ADDIS procedure uses information about the previous p-values, the events {P_i ≤ α_i} are no longer independent of each other, and thus the third equality in equation (2) becomes an inequality (Tian and Ramdas 2021). This implies that the ADDIS-Sidak is conservative as well. Therefore, Tian and Ramdas (2021) left open the question of whether their ADDIS principle can be uniformly improved. In this paper, we answer this question by introducing the exhaustive ADDIS principle, which provides a uniform improvement over the ADDIS principle by utilizing the independence of the p-values. In addition, we show that there is no FWER controlling procedure that enlarges the event of rejecting any hypothesis further.
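To make the Sidak idea concrete, here is a small Python sketch (the weights and the level α = 0.05 are our illustrative choices). With levels of the form α_i = 1 − (1 − α)^{γ_i} and weights summing to one, the product bound in (2) equals α exactly under independence, and each level is at least the corresponding Bonferroni level γ_i · α:

```python
def sidak_levels(gammas, alpha=0.05):
    """Weighted Sidak-type levels alpha_i = 1 - (1 - alpha)^gamma_i. If the
    weights sum to one, then prod(1 - alpha_i) = (1 - alpha)^sum(gammas)
    = 1 - alpha, so the FWER bound 1 - prod(1 - alpha_i) is exactly alpha."""
    return [1 - (1 - alpha) ** g for g in gammas]

gammas = [0.5, 0.3, 0.2]
levels = sidak_levels(gammas)
prod = 1.0
for a in levels:
    prod *= 1 - a
fwer_global_null = 1 - prod  # FWER under the global null with independent uniforms
```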

Overview of the paper
In Section 2, we introduce a general ADDIS algorithm that contains all other ADDIS procedures. Afterwards, we derive the exhaustive ADDIS algorithm as a uniform improvement of the ADDIS algorithm and show that the event of rejecting any hypothesis cannot be further enlarged (Section 3). In Section 4, this exhaustive ADDIS algorithm is used to obtain uniform improvements of the ADDIS-Spending and the ADDIS-Graph. In Sections 5 and 6, we quantify the performance of the constructed procedures by applying them to simulated and real data, respectively. For the complete proofs of the theoretical results we refer to the Appendix; the R code for the simulations can be found in the GitHub repository https://github.com/fischer23/Exhaustive-ADDIS-procedures.

The ADDIS algorithm
In this section, we introduce a general ADDIS algorithm which encompasses all online procedures satisfying the ADDIS principle. This facilitates the interpretation of the ADDIS principle and conveniently introduces the notation needed to construct the uniform improvement in the next section.
In the most general form of the ADDIS principle, the parameters τ_i and λ_i are allowed to depend on the previous p-values as well. Mathematically, α_i, τ_i and λ_i are random variables with values in [0, 1), (0, 1] and [0, τ_i), respectively, that are measurable with respect to G_{i−1} := σ(P_1, ..., P_{i−1}). Furthermore, note that condition (1) is equivalent to Σ_{j=1}^{i} α_j/(τ_j − λ_j) (S_j − C_j) ≤ α for all i ∈ N, where S_i = 1{P_i ≤ τ_i} and C_i = 1{P_i ≤ λ_i}. Since α_i, τ_i and λ_i are measurable with regard to G_{i−1}, they must be fixed before knowing the true values of S_i and C_i. Therefore, we need to make pessimistic assumptions at step i ∈ N, meaning S_i = 1 and C_i = 0. Hence, condition (1) is equivalent to

α_i ≤ (τ_i − λ_i) (α − Σ_{j=1}^{i−1} α_j/(τ_j − λ_j) (S_j − C_j)) for all i ∈ N.

With this, we can formulate a general ADDIS procedure, called the ADDIS algorithm, that contains all other ADDIS procedures.
0. Before the study starts, choose α^{(1)} = α. At each step i ∈ N:
1. Choose τ_i ∈ (0, 1], λ_i ∈ [0, τ_i) and α_i ∈ [0, (τ_i − λ_i) α^{(i)}] based on the information up to step i − 1.
2. Test H_i at level α_i, rejecting H_i if P_i ≤ α_i.
3. Set α^{(i+1)} = α^{(i)} − α_i/(τ_i − λ_i) (S_i − C_i).

For ease of understanding, we illustrate the ADDIS algorithm in Figure 1. The chart includes the end of step i − 1, i.e. the setting of τ_i, λ_i and α_i, and the entire step i. The parameter α^{(i)}, i ∈ N, can be interpreted as the level that can (but does not have to) be spent on the hypotheses {H_j : j ≥ i}. Using Alpha-Spending, one would set α^{(i+1)} = α^{(i)} − α_i for all i ∈ N. However, as seen in Figure 1, in the ADDIS algorithm the entire significance level is shifted to the future hypotheses (α^{(i+1)} = α^{(i)}) if P_i ≤ λ_i or P_i > τ_i, and in turn α^{(i+1)} = α^{(i)} − α_i/(τ_i − λ_i) in the opposite case.
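The ADDIS algorithm can be sketched as a simple loop (Python; the constant parameters τ_i = 0.8, λ_i = 0.16 and the rule of spending half of the permissible level at each step are our illustrative choices, since the algorithm leaves the choice of α_i free):

```python
def addis_algorithm(p_values, alpha=0.2, tau=0.8, lam=0.16, spend_frac=0.5):
    """Sketch of the general ADDIS algorithm with constant tau_i, lambda_i.
    At each step, alpha_i = spend_frac * (tau - lam) * alpha^{(i)} is used,
    which respects the constraint alpha_i <= (tau - lam) * alpha^{(i)}.
    The budget alpha^{(i)} only shrinks when lambda < P_i <= tau."""
    budget = alpha  # alpha^{(1)} = alpha
    rejections, levels = [], []
    for p in p_values:
        a_i = spend_frac * (tau - lam) * budget
        levels.append(a_i)
        rejections.append(p <= a_i)
        if lam < p <= tau:                 # S_i - C_i = 1: level is used up
            budget -= a_i / (tau - lam)
        # otherwise (P_i <= lam or P_i > tau) the full budget is carried over
    return rejections, levels
```

For example, a discarded p-value (0.9) and a very small one (0.001) leave the budget untouched, while a p-value of 0.3 reduces it.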

The exhaustive ADDIS algorithm
In this section, we introduce a uniform improvement of the ADDIS algorithm and show that it cannot be further improved. First, we illustrate the idea for the first two hypotheses and τ_1 = 1. As in the ADDIS algorithm, we set α^{(1)} = α and α_1 ≤ (1 − λ_1) α^{(1)}. For independent and uniformly distributed null p-values we can calculate

P(P_1 ≤ α_1 or P_2 ≤ α_2) ≤ P(P_1 ≤ α_1) + P(α_1 < P_1 ≤ λ_1) α + P(P_1 > λ_1) c = α_1 + (λ_1 − α_1) α + (1 − λ_1) c,

where c denotes the value of α^{(2)} in case of P_1 > λ_1. The last line follows from the independence of the null p-values and since we always choose α_2 ≤ α^{(2)}. Determining c such that the last line equals α under the global null hypothesis and the assumption of uniformly distributed null p-values gives us c = α − α_1 (1 − α)/(1 − λ_1). This already indicates the uniform improvement over the ADDIS algorithm, as the latter sets α^{(2)} to the smaller value α − α_1/(1 − λ_1). In addition, in case of α_2 = α^{(2)}, the last line in the above calculation becomes an equality and thus FWER = α, which suggests that the procedure fully exhausts the significance level. In general, the exhaustive ADDIS algorithm can be formulated as follows.
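The constant c = α − α_1 (1 − α)/(1 − λ_1) can be checked numerically (Python; the parameter values are our illustrative choices): plugging it into the two-hypothesis FWER bound yields exactly α, and c is strictly larger than the remaining budget α − α_1/(1 − λ_1) that the plain ADDIS algorithm would allow.

```python
# Two hypotheses, tau_1 = 1, independent uniform null p-values.
alpha, alpha_1, lam_1 = 0.2, 0.05, 0.16

# Exhaustive choice of alpha^{(2)} when P_1 > lambda_1:
c = alpha - alpha_1 * (1 - alpha) / (1 - lam_1)

# FWER bound: P(P_1 <= a_1) + P(a_1 < P_1 <= lam_1)*alpha + P(P_1 > lam_1)*c
fwer_bound = alpha_1 + (lam_1 - alpha_1) * alpha + (1 - lam_1) * c

# Budget the plain ADDIS algorithm would carry forward in the same case:
addis_budget = alpha - alpha_1 / (1 - lam_1)
```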
Definition 3.1 (Exhaustive ADDIS algorithm). 0. Before the study starts, choose α^{(1)} = α. At each step i ∈ N:
1. Choose τ_i ∈ (0, 1], λ_i ∈ [τ_i α^{(i)}, τ_i) and α_i ∈ [0, (τ_i − λ_i) α^{(i)}/(1 − α^{(i)})] based on the information up to step i − 1.
2. Test H_i at level α_i, rejecting H_i if P_i ≤ α_i.
3. Set α^{(i+1)} = α^{(i)} − α_i (1 − α^{(i)})/(τ_i − λ_i) (S_i − C_i).

For proving that the exhaustive ADDIS algorithm controls the FWER, we use a backward induction similar to the calculation at the beginning of this section.
Theorem 3.2. The exhaustive ADDIS algorithm controls the FWER in the strong sense when the null p-values are uniformly valid, independent of each other and independent of the non-null p-values.
In Figure 2, the exhaustive ADDIS algorithm is illustrated. Note that the main difference between the exhaustive ADDIS algorithm and the ADDIS algorithm (Figure 1) is the factor (1 − α^{(i)}) in the update of α^{(i+1)}. As the algorithm ensures that α^{(i)} ≥ 0 for all i ∈ N, the level that can be spent on the future hypotheses is larger, and therefore the exhaustive ADDIS algorithm is uniformly more powerful. Moreover, we have an additional constraint, as we need to choose λ_i ≥ τ_i α^{(i)} for all i ∈ N. However, one usually sets λ_i ≥ τ_i α^{(i)} anyway in order to exploit the potential of ADDIS procedures. To see this, note that closed procedures allow the significance level to be reused if P_i ≤ α_i without requiring the factor 1/(τ_i − λ_i) (see Section 1.2). Therefore, one should always use λ_i ≫ α_i in ADDIS procedures, which often leads to λ_i ≥ τ_i α^{(i)} automatically. For example, Tian and Ramdas (2021) recommended choosing λ_i = τ_i α, which is always greater than or equal to τ_i α^{(i)}.
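The effect of the modified update, as we read it, can be seen in a small Python sketch comparing the two budget recursions under the same spending rule (the constant parameters and the rule of spending half of the permissible ADDIS level are our illustrative choices):

```python
def run_budgets(p_values, alpha=0.2, tau=0.8, lam=0.16, spend_frac=0.5):
    """Track the remaining budget alpha^{(i)} for the plain ADDIS update and
    for the exhaustive update, whose spent amount carries the extra factor
    (1 - alpha^{(i)}). Both runs use the same alpha_i rule for comparability."""
    addis, exh = alpha, alpha
    for p in p_values:
        a_addis = spend_frac * (tau - lam) * addis
        a_exh = spend_frac * (tau - lam) * exh
        if lam < p <= tau:  # only this case consumes budget
            addis -= a_addis / (tau - lam)
            exh -= a_exh * (1 - exh) / (tau - lam)  # exhaustive loses less
    return addis, exh
```

After a few p-values in (λ, τ], the exhaustive budget is strictly larger, which is exactly the source of the uniform improvement.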
Remark 3.3. An interesting special case of the exhaustive ADDIS algorithm is obtained for τ_i = 1 and λ_i = α^{(i)}, where it basically reduces to the condition α_i ≤ α^{(i)} with α^{(i)} = α − Σ_{j<i} α_j (1 − 1{P_j ≤ α^{(j)}}). It is easy to see that this uniformly improves the Alpha-Spending-based closed procedures (see Section 1.2). However, note that this improvement only works when the null p-values are independent of each other and of the non-nulls.
The exhaustive ADDIS algorithm does not only uniformly improve the ADDIS algorithm, but is also optimal in the sense that the event of rejecting at least one hypothesis cannot be enlarged. Before showing this, we prove that under the global null hypothesis the probability of committing any type I error is exactly α.

Proposition 3.4. Assume the null p-values are uniformly distributed and independent. In addition, let the individual significance levels of the exhaustive ADDIS algorithm be chosen such that Σ_{i∈N} α_i (1 − α^{(i)})/(τ_i − λ_i) (S_i − C_i) = α. Then, under the global null hypothesis, we obtain P(∃i ∈ N : P_i ≤ α_i) = α.

Note that under the global null hypothesis and uniformly distributed null p-values, there will almost surely be infinitely many p-values P_j with λ_j < P_j ≤ τ_j, implying that S_j − C_j = 1. Hence, the condition Σ_{i∈N} α_i (1 − α^{(i)})/(τ_i − λ_i) (S_i − C_i) = α can indeed be fulfilled. For the optimality result, we need the following assumption:

For any event E ∈ A with P(E | I_0 = N) = 0 it also holds that P(E) = 0.  (3)

This ensures that if the probability of an event under the global null hypothesis is zero, it is also zero under the true data constellation. This holds in most models considered in applied statistics (Goeman et al. 2021).
Theorem 3.5. Assume that the assumptions of Proposition 3.4 are satisfied and (3) holds. Then there is no procedure (α̃_i)_{i∈N} with FWER control such that {∃i ∈ N : P_i ≤ α̃_i} ⊇ {∃i ∈ N : P_i ≤ α_i} and P({∃i ∈ N : P_i ≤ α̃_i} \ {∃i ∈ N : P_i ≤ α_i}) > 0.

In particular, note that Theorem 3.5 does not hold for the ADDIS algorithm, as the exhaustive ADDIS algorithm is always at least as good as the ADDIS algorithm and leads to a larger probability of rejecting any hypothesis in some cases.

Uniform improvements of ADDIS-Spending and ADDIS-Graph
When applying the exhaustive ADDIS algorithm, one could just calculate α^{(i)} at each step and then choose τ_i, λ_i and α_i according to the conditions given in Figure 2. However, in some situations the analyst may not want to carry out these steps manually and would like the algorithm to output a specific significance level based on the previous test results. For example, this might be the case when the time between hypothesis tests is very short or when performing a simulation study. Tian and Ramdas (2021) proposed the ADDIS-Spending and Fischer et al. (2023) the ADDIS-Graph as concrete procedures satisfying the ADDIS principle. In this section, we show how these procedures can be improved using the exhaustive ADDIS algorithm.
The construction of the ADDIS-Spending and the ADDIS-Graph starts with a non-negative sequence (γ_i)_{i∈N} that sums to at most one and which can be interpreted as the initial allocation of the significance level α. In order to obtain α^{(i+1)} = α^{(i)} in case of P_i ≤ λ_i or P_i > τ_i, the ADDIS-Spending ignores hypothesis H_i in the future testing process, while the ADDIS-Graph distributes its level according to non-negative weights (g_{j,i})_{i=j+1}^{∞} that sum to at most one for each j ∈ N. To compensate for the fact that we lose the level α_i/(τ_i − λ_i) in case of λ_i < P_i ≤ τ_i when using the ADDIS algorithm (Figure 1), the significance levels of both procedures are multiplied by the factor (τ_i − λ_i). This leads to the individual significance level

α_i = (τ_i − λ_i) α γ_{t(i)}, where t(i) = 1 + Σ_{j<i} (S_j − C_j),

for the ADDIS-Spending and

α_i = (τ_i − λ_i) (γ_i α + Σ_{j<i} g_{j,i} α_j/(τ_j − λ_j) (1 − S_j + C_j))  (4)

for the ADDIS-Graph. We derive the graphical representation of the ADDIS-Graph in the Appendix.
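As a sketch of how we read the ADDIS-Graph's level bookkeeping (Python; the parameter values are illustrative, and this is our paraphrase rather than the authors' implementation): each hypothesis starts with γ_i α on the rescaled 1/(τ − λ) scale, and whenever a hypothesis is discarded or likely non-null, its rescaled level is passed to future hypotheses along the weights g.

```python
def addis_graph_levels(p_values, gammas, g, alpha=0.2, tau=0.8, lam=0.16):
    """Levels of an ADDIS-Graph with constant tau, lam. base[i] holds the
    rescaled level alpha_i / (tau - lam); when P_j <= lam or P_j > tau,
    hypothesis j forwards its rescaled level to i > j with weight g[j][i]."""
    n = len(p_values)
    base = [gammas[i] * alpha for i in range(n)]
    levels = []
    for i in range(n):
        a_tilde = base[i]                  # already includes forwarded level
        levels.append((tau - lam) * a_tilde)
        if p_values[i] <= lam or p_values[i] > tau:  # level can be reused
            for k in range(i + 1, n):
                base[k] += g[i][k] * a_tilde
    return levels
```

For instance, with two hypotheses and g[0][1] = 1, a discarded first p-value doubles the second hypothesis's level.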
The easiest way to uniformly improve these procedures based on the exhaustive ADDIS algorithm is to change the factor (τ_i − λ_i) to (τ_i − λ_i)/(1 − α^{(i)}). This leads to the Exhaustive-ADDIS-Spending (E-ADDIS-Spending) and the Exhaustive-ADDIS-Graph (E-ADDIS-Graph), where α^{(i)} = α − Σ_{j<i} α_j (1 − α^{(j)})/(τ_j − λ_j) (S_j − C_j). Note that one needs to set λ_i ≥ τ_i α^{(i)} in these procedures, as required by the exhaustive ADDIS algorithm.
Since α^{(i)} ≥ 0 for all i ∈ N and at least α^{(1)} = α > 0, these procedures provide uniform improvements of the usual ADDIS procedures. However, except in unrealistic extreme cases, α^{(i)} will tend to zero as i tends to infinity. This implies that these improvements can be very marginal for hypotheses that are tested at a late stage. For this reason, we propose a further approach to exploiting the exhaustive ADDIS algorithm.
Let i ∈ N be arbitrary. Note that

α^{(i)} = (α + Σ_{j=1}^{i−1} α_j α^{(j)}/(τ_j − λ_j) (S_j − C_j)) − Σ_{j=1}^{i−1} α_j/(τ_j − λ_j) (S_j − C_j)

for all i ∈ N. Thus, applying the ADDIS algorithm (Definition 2.1) at each step i ∈ N at the level α + Σ_{j=1}^{i} α_j α^{(j)}/(τ_j − λ_j) (S_j − C_j) is equivalent to applying the exhaustive ADDIS algorithm at level α.
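This equivalence can be checked numerically (Python; the spending rule and parameter values are our illustrative choices): at every step, the remaining exhaustive budget equals the inflated total level minus what the plain ADDIS algorithm would have charged.

```python
# Track the exhaustive budget alpha^{(i)} and, in parallel, the level gained
# relative to plain ADDIS and the level plain ADDIS would have spent.
alpha, tau, lam = 0.2, 0.8, 0.16
p_values = [0.3, 0.9, 0.5, 0.05]
budget, gained, addis_spent = alpha, 0.0, 0.0
for p in p_values:
    a_i = 0.5 * (tau - lam) * budget          # our illustrative spending rule
    if lam < p <= tau:                        # S_i - C_i = 1
        gained += a_i * budget / (tau - lam)  # extra level alpha_i*alpha^{(i)}/(tau-lam)
        addis_spent += a_i / (tau - lam)      # what plain ADDIS would charge
        budget -= a_i * (1 - budget) / (tau - lam)
    # identity: remaining exhaustive budget = inflated level - ADDIS charge
    assert abs(budget - (alpha + gained - addis_spent)) < 1e-12
```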
For example, this could be incorporated into the ADDIS-Graph (4) by distributing the level α_j α^{(j)}/(τ_j − λ_j) in case of λ_j < P_j ≤ τ_j to the future hypotheses according to non-negative weights (h_{j,i})_{i=j+1}^{∞} that sum to at most one for each j ∈ N. This approach leads to the following procedure, which we term the Evenly-Improved-ADDIS-Graph (EI-ADDIS-Graph):

α_i = (τ_i − λ_i) (γ_i α + Σ_{j<i} α_j/(τ_j − λ_j) [g_{j,i} (1 − S_j + C_j) + h_{j,i} α^{(j)} (S_j − C_j)]).
Note that the EI-ADDIS-Graph can be interpreted just like the ADDIS-Graph. However, the EI-ADDIS-Graph also distributes significance level to the future hypotheses if λ_j < P_j ≤ τ_j, but reduced by the factor α^{(j)}. Obviously, this defines a uniform improvement of the ADDIS-Graph. Furthermore, the weights (h_{j,i})_{j∈N, i>j} determine which hypotheses benefit from the improvement, such that the gained significance level can be spent more evenly than with the E-ADDIS-Graph. This may lead to a larger power improvement when compared with the ADDIS-Graph. Also note that the same improvement could be applied to the ADDIS-Spending, as the ADDIS-Spending can be written as a specific ADDIS-Graph (Fischer et al. 2023).

Simulations
In this section, we aim to quantify the gain in power from using the proposed EI-ADDIS-Graph instead of the ADDIS-Graph. To this end, we consider the Gaussian testing setup described in Tian and Ramdas (2021). More precisely, n null hypotheses (H_i)_{i∈{1,...,n}} of the form H_i : μ_i := E(Z_i) ≤ 0 are tested sequentially using z-tests. The test statistics are generated as Z_i ∼ N(μ_i, 1), i ∈ {1, ..., n}, where μ_i = μ_A with probability π_A and μ_i = μ_N otherwise. Note that π_A is the probability of a null hypothesis being false and μ_A the strength of the alternative. Moreover, the null p-values are conservative if μ_N = −2 and uniformly distributed if μ_N = 0.
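The data generating process can be sketched with only the Python standard library (the function name is ours; the one-sided p-value P_i = 1 − Φ(Z_i) is computed via math.erf):

```python
import math
import random

def simulate_p_values(n, pi_a, mu_a, mu_n, seed=0):
    """Gaussian testing setup: Z_i ~ N(mu_i, 1) with mu_i = mu_a with
    probability pi_a (false null) and mu_i = mu_n otherwise (true null).
    One-sided p-value for H_i: mu_i <= 0 is P_i = 1 - Phi(Z_i)."""
    rng = random.Random(seed)
    p_values, is_null = [], []
    for _ in range(n):
        null = rng.random() >= pi_a
        z = rng.gauss(mu_n if null else mu_a, 1.0)
        p = 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))  # 1 - Phi(z)
        p_values.append(p)
        is_null.append(null)
    return p_values, is_null

# mu_n = -2 yields conservative null p-values (stochastically larger than uniform).
p_vals, nulls = simulate_p_values(1000, pi_a=0.3, mu_a=4.0, mu_n=-2.0, seed=1)
```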
Figures 3 and 4 show the estimated power and FWER of Alpha-Spending, the ADDIS-Graph and the EI-ADDIS-Graph for n = 1000 hypotheses, obtained by averaging over 2000 independent trials (Alpha-Spending is included as a reference). The power is represented by the solid lines and the FWER by the dashed lines. In the left plots, the p-values are uniformly distributed (μ_N = 0) and in the right plots they are conservative (μ_N = −2). As done by Tian and Ramdas (2021), we applied the ADDIS procedures at level α = 0.2 with parameters τ_i = 0.8 and λ_i = 0.16 for all i ∈ N. Furthermore, we set γ_i = 6/(π² i²) in Figure 3 and γ_i ∝ 1/((i + 1) log(i + 1)^{1.5}) in Figure 4 for all i ∈ N. In addition, we chose g_{j,i} = γ_{i−j} and h_{j,i} = g_{j,i} for all j ∈ N and i > j in all cases.
The results show that the EI-ADDIS-Graph achieves a power improvement of 0.01 to 0.02 in all cases. With 1000 hypotheses, this means that we expect to make at least one more true discovery when using the EI-ADDIS-Graph instead of the ADDIS-Graph, and even more when the proportion of false hypotheses is large. Furthermore, Figure 3 indicates that the EI-ADDIS-Graph roughly exhausts the significance level. Note that the level is not exhausted in Figure 4, because the (γ_i)_{i∈N} decrease very slowly and, since the testing process stops at step 1000, not the entire level is used.
Since there are also online multiple testing problems, e.g. platform trials, where far fewer than one thousand hypotheses are tested, we also considered a setting with n = 10 hypotheses. In this case, we set the strength of the alternative to μ_A = 2. The results obtained when applying the procedures with the same parameters as in Figure 3 can be found in Figure 5. It is easy to see that in this case the power gain of the EI-ADDIS-Graph is larger. Although we do not present them here, the same simulations were performed for the E-ADDIS-Graph and E-ADDIS-Spending compared to the ADDIS-Graph and ADDIS-Spending, respectively. As expected, the performance improvements are slightly smaller than those obtained with the EI-ADDIS-Graph.

Application to IMPC data
The International Mouse Phenotyping Consortium (IMPC) coordinates a large study to pin down the function of every protein-coding mouse gene (Muñoz-Fuentes et al. 2018). Since the resulting public database is growing over time, IMPC data is used as a benchmark application for online multiple testing (Robertson et al. 2023a). We apply our procedures to 5000 of the p-values available at the Zenodo repository https://zenodo.org/record/2396572 (Robertson et al. 2019), which resulted from the evaluation by Karp et al. (2017). Note that the p-values in the IMPC data set are possibly correlated. However, we apply our procedures assuming independence for illustrative purposes.
We applied Alpha-Spending, the ADDIS-Graph, ADDIS-Spending, E-ADDIS-Graph, E-ADDIS-Spending and EI-ADDIS-Graph with the parameters γ_i ∝ 1/((i + 1) log(i + 1)^{1.5}), g_{j,i} = γ_{i−j}, h_{j,i} = g_{j,i}, τ_i = 0.8 and λ_i = 0.16. The numbers of rejections obtained by the procedures for different FWER levels α can be found in Figure 6. As expected, Alpha-Spending performed worst. In addition, the ADDIS-Graphs led to more rejections than the ADDIS-Spending procedures. The EI-ADDIS-Graph led to the most rejections, which is consistent with our theoretical reasoning that the EI-ADDIS-Graph performs best in practice. In all cases, it rejected more hypotheses than the ADDIS-Graph. The rejection gap between these procedures grows with increasing FWER level. At level α = 0.05 the EI-ADDIS-Graph rejected 3 more hypotheses than the ADDIS-Graph, while it led to 25 additional rejections at α = 0.4.

Discussion
Tian and Ramdas (2021) asked whether their ADDIS principle is uniformly improvable. We answered this question by firstly formulating the ADDIS principle as a general online procedure (Section 2), called the ADDIS algorithm, and secondly introducing an exhaustive ADDIS algorithm (Section 3), which is a uniform improvement of the ADDIS algorithm. Since no additional assumptions are needed to obtain the improvement, the exhaustive ADDIS procedures should be preferred over the usual ADDIS procedures in practice. Furthermore, the proposed methods not only lead to power improvements, but we also provide a general algorithm for their construction which is easy to use.
For practical use, offering software has become increasingly important. Statistical software, such as R packages and Shiny apps, has been introduced to implement online error control procedures (Robertson et al. 2019). As part of future efforts, we will consider creating a new package, or adding new functions to existing software, that implements our proposed procedures.
Our uniform improvement is based on a different proof idea than the usual ADDIS principle, which makes it possible to exploit the independence between the p-values. We wonder whether the same approach can be used to derive adaptive procedures that work under more complex (but still known) dependence structures. This could be addressed in future work.
Note that the independence of the null p-values was used in both inequalities and the uniform validity was applied in the second inequality. Since α^{(1)} = α and G_0 = {∅, Ω}, in particular P(R_{1:i} > 0) ≤ α. As i ∈ N was arbitrary, weak FWER control follows. Now we use this to show strong control. To this end, let I_0 ⊆ N be arbitrary and (α_i)_{i∈N} be obtained by applying the exhaustive ADDIS algorithm to (H_i)_{i∈N}. Since the null p-values are independent of the non-nulls and Σ_{j<i, j∈I_0} α_j (1 − α^{(j)})/(τ_j − λ_j) (S_j − C_j) ≤ α for all i ∈ N, the levels (α_i)_{i∈I_0} could also be obtained by applying the exhaustive ADDIS algorithm to (H_i)_{i∈I_0}. Consequently, strong FWER control is implied by weak control.
Proof of Proposition 3.4. The assertion follows from the proof of Theorem 3.2. For this, first consider the case where the entire significance level is spent on a finite number of hypotheses, thus α^{(i)} = 0 for some i ∈ N. Then α_i = α^{(i)} and we obtain an equality in the initial case of the induction proof. Together with the assumption of uniformly distributed null p-values, all inequalities of the induction proof become equalities, and we obtain that the probability of rejecting any hypothesis is exactly α. Now consider the case where α^{(i)} > 0 for all i ∈ N and let δ > 0 be arbitrary.
Figure 7: Graphical representation of the levels (α_i)_{i∈N}, where α̃_i = α_i/(τ_i − λ_i) and α_i is the level of the ADDIS-Graph. The initial levels are illustrated below each node. In case of P_i > τ_i or P_i ≤ λ_i, the future significance levels are updated according to the weights (g_{j,i})_{j∈N, i>j}.
Figure 1: Illustration of the ADDIS algorithm from the end of step i − 1 to step i.

Figure 2: Illustration of the exhaustive ADDIS algorithm from the end of step i − 1 to step i.

Figure 3: Power and FWER for n = 1000 hypotheses against the proportion of false hypotheses (π_A) for Alpha-Spending, ADDIS-Graph and EI-ADDIS-Graph at level α = 0.2. Solid lines correspond to power and dashed lines to FWER; the strength of the alternative is μ_A = 4 in both plots; p-values are uniformly distributed (μ_N = 0) in the left plot and conservative (μ_N = −2) in the right; procedures were applied with parameters γ_i = 6/(π² i²), g_{j,i} = γ_{i−j}, h_{j,i} = g_{j,i}, τ_i = 0.8 and λ_i = 0.16.