Effects of Lipophilicity and Structural Features on the Antiherpes Activity of Digitalis Cardenolides and Derivatives

There is growing interest in exploring Digitalis cardenolides as potential antiviral agents. We therefore investigated the influence of structural features and lipophilicity on the antiherpes activity of 65 natural and semisynthetic cardenolides assayed in vitro against HSV‐1. The presence of an α,β‐unsaturated lactone ring at C‐17, a β‐hydroxy group at C‐14 and C‐3β‐OR substituents were considered essential requirements for this biological activity. Glycosides were more active than their genins, especially monoglycosides containing a rhamnose residue. The activity was enhanced in derivatives bearing an aldehyde group at C‐19 instead of a methyl group, and inserting a C‐5β‐OH improved the antiherpes effect significantly. The lipophilicity of the cardenolides was assessed by measuring their log P values (n‐octanol‐water partition coefficient) experimentally, which disclosed a range of lipophilicity (log P 0.75±0.25) associated with the optimal antiherpes activity. In silico studies were carried out and resulted in the establishment of two predictive models potentially useful to identify and/or optimize novel antiherpes cardenolides. The effectiveness of the models was confirmed by retrospective analysis of the studied compounds. This is the first SAR study addressing the antiherpes activity of cardenolides. The developed computational models were able to predict the active cardenolides and their log P values.


Introduction
Infections with Herpes Simplex Virus types 1 and 2 (HSV-1 and HSV-2) occur worldwide in high- and low-income countries. [1,2] Acyclovir is the drug of choice for treating herpes infections; however, due to the emergence of resistant strains, [3,4] there is an increasing demand for the search and development of new antiviral drugs. [5,6] Natural products and their semisynthetic derivatives may constitute a potential source of compounds and templates to develop new antiviral drugs.
Digitalis cardenolides are plant secondary metabolites currently classified as a new class of steroidal hormones in view of their isolation from mammals' blood plasma, adrenals and hypothalamus, along with their physiological role and molecular functions. [7] Cardenolides like digoxin are clinically used to treat congestive heart failure and atrial arrhythmias. [8] In recent years, new pharmacological activities have been described for cardenolides, pointing to the potential of these compounds as anticancer and antiviral agents, with mechanisms of action distinct from those of the available drugs. [9,10] The effects of cardenolides against DNA viruses have been demonstrated, including some reports of their antiherpes action. [11][12][13][14]
The goal of this work was to undertake a structure-activity relationship (SAR) study employing a series of natural cardenolides and semisynthetic derivatives (CDs) that had exhibited potent in vitro anti-HSV activity as previously reported by our research group. [13] We have also evaluated the effect of lipophilicity on the antiherpes activity of the selected cardenolides. Additionally, in silico models were developed and applied to discriminate active from inactive CDs based on their molecular and three-dimensional structural properties.

Results and Discussion
In a previous study published by our research group, the antiherpes activity of different CDs was screened against three different viral strains (HSV-1 KOS and 29-R strains, which are acyclovir sensitive and resistant, respectively, and HSV-2 333 strain). The mechanism of antiherpes action of glucoevatromonoside against HSV-1 (KOS strain), here numbered as compound 41, was investigated since it was the most promising cardenolide, showing high selectivity indices. [13] Considering that lipophilicity plays a crucial role in the transport of compounds through biological membranes as well as in the formation of the ligand-receptor complex, [15] the SAR study of 65 CDs was performed, and the influence of lipophilicity on their antiherpes activity was also investigated. Initially, all compounds were tested for antiviral activity against HSV-1 (KOS strain) at 1 μM concentration. To investigate the effect of lipophilicity, the compounds that inhibited viral replication by at least 90 % were considered active (Figure 1). We adopted this restrictive cut-off value because the analysis was based on a single physicochemical parameter. According to this criterion, 17 CDs were active, and all of them had their IC50 values determined herein. However, to support the SAR investigation, concentration-response curves were also built for some selected compounds that were outside the cut-off range. On the other hand, for the in silico prediction study, we adopted a less restrictive cut-off value (30 %) (Figure 1) since the analysis was based on many descriptors. In this case, a total of 27 CDs were active, and the IC50 values were also determined for most of them (Table 1).
Chem. Biodiversity 2022, 19, e202200411
Table 1. Anti-HSV-1 (KOS strain) activity and lipophilicity data of cardenolide derivatives (CDs).
Table 1 column headings: Group; CD; Antiherpes activity; Lipophilicity (log P).
The partition coefficients of 56 CDs (Table 1) were determined by the shake flask method in n-octanol/water, and the results were expressed as log P values. The log P values of highly hydrophilic or highly lipophilic compounds could not be determined experimentally because of their low concentrations in the n-octanol or water phase, which prevented accurate quantitation. The determinations were carried out in triplicate and the results showed RSD < 10 %.
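The shake flask calculation described above can be sketched as follows. This is a minimal illustration, not the authors' protocol: the concentrations are hypothetical, and computing the RSD on the partition ratio (rather than on log P itself) is an assumption, since the text does not state which quantity the RSD refers to.

```python
import math
from statistics import mean, stdev

def log_p(c_octanol: float, c_water: float) -> float:
    """log10 of the n-octanol/water concentration ratio."""
    if c_octanol <= 0 or c_water <= 0:
        raise ValueError("both phase concentrations must be positive")
    return math.log10(c_octanol / c_water)

def summarize(replicates):
    """Return mean log P, its sample SD, and the RSD (%) of the partition ratio."""
    ratios = [c_oct / c_wat for c_oct, c_wat in replicates]
    log_ps = [log_p(c_oct, c_wat) for c_oct, c_wat in replicates]
    rsd_percent = 100.0 * stdev(ratios) / mean(ratios)
    return mean(log_ps), stdev(log_ps), rsd_percent

# Hypothetical triplicate: (n-octanol, water) concentrations in arbitrary units
triplicate = [(5.6, 1.0), (5.9, 1.0), (5.4, 1.0)]
mean_log_p, sd_log_p, rsd_percent = summarize(triplicate)
print(f"log P = {mean_log_p:.2f} ± {sd_log_p:.2f}, RSD = {rsd_percent:.1f} %")
```

A compound whose concentration in one phase falls below the quantitation limit cannot be processed this way, which is the situation described above for the most hydrophilic and most lipophilic CDs.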
The distribution of active (n = 17) and inactive (n = 39) compounds according to their log P values is depicted in Figure 2. The ratio values of active to inactive compounds (Ra/i) were higher for log P values in the ranges of −0.49 to 0.00 (Ra/i = 3.1) and 0.51 to 1.00 (Ra/i = 5.5); for all other ranges, Ra/i values below zero or close to 1.00 were observed. Based on these findings, log P values between 0.51 and 1.00 were considered in the optimal lipophilicity range for antiherpes activity.
The influence of distinct structural features on the antiherpes activity elicited by cardenolides was investigated qualitatively by structure-activity relationship (SAR) analyses. For the analyses, CDs were gathered in 13 groups according to the presence or absence of the investigated structural feature, and the results are presented and discussed in the following subsections. The calculated and experimental log P values of the CDs are shown in Table 1. Whenever possible, digitoxigenin (1), the non-substituted aglycone of cardenolides, was employed as a model compound for comparative analyses.

Group I: Configuration of the Hydroxy Group at C-3
Compounds 1 and 2 are isomers differing in the configuration of the hydroxy group at C-3, which has a beta configuration in natural Digitalis cardenolides. This feature does not seem to contribute to the antiherpes activity since both compounds inhibited virus replication marginally. Log P values obtained for 1 (1.18 ± 0.04) and 2 (1.49 ± 0.09) are outside the optimal range (0.51-1.00) identified for the active cardenolides (Figure 3).

Group II: 14β-Hydroxylation of the Steroid Nucleus
The comparison between the antiherpes activity of compounds 1 and 3 indicated that a 14β-OH group is critical for this activity of both cardenolide aglycones. The influence of this feature can also be observed for the triglycoside derivatives 4, 5 and 6. Among the compounds bearing a 14β-OH group, only 4 inhibited HSV-1 replication above 90 % and its IC50 value was determined. Log P values obtained for 5 (1.80 ± 0.08) and 6 (1.11 ± 0.01) are not within the optimal range (0.51-1.00) required for antiherpes activity, thus accounting for their low activity or inactivity (Figure 4).

Group III: Hydroxylation at Different Positions of the Steroid Nucleus
Compounds 7, 8, 9 and 10, which bear hydroxy groups at 16β, 7β, 8β and 12β, respectively, were inactive (7, 8, 9) or showed a high IC50 value (10) in comparison to the non-hydroxylated digitoxigenin (1) (Table 1). This finding suggests that hydroxylation at those positions is disadvantageous for the selective antiherpes activity of the aglycones, probably because of steric hindrance. [16] Interestingly, the log P values of 7 (0.94 ± 0.03) and 10 (0.66 ± 0.05) are within the range (0.51-1.00) determined for antiviral activity, and they may represent an exception to the proposed model (Figure 5).

Group IV: α,β-Unsaturated Lactone Ring at C-17
The replacement of the lactone ring by a 20,21-ketol group at C-17 increased the IC50 value of 11 in comparison to 1, which corroborates the relevance of this structural feature for the antiherpes activity. In addition, the replacement of the C-17 lactone ring by an α,β-unsaturated lactam ring slightly increased the antiherpes activity of 12 (Table 1). However, when the α,β-unsaturated lactam ring is bound at C-17 through a methylene group (13), the activity is abolished, as is also observed for 14, which bears a lactone ring instead. For both compounds, the presence of the Δ5,6 unsaturation may also contribute to inactivity. In addition, the log P values of 12 (1.05 ± 0.03), 13 (2.09 ± 0.09), and 14 (2.56 ± 0.23) fell outside the lipophilicity range (0.51-1.00) required for antiherpes activity (Figure 6).
The comparison of the antiherpes effect of digitoxigenin (1) with those of compounds 15 to 21 indicates that replacing the C-3 β-OH by groups like β-amino, oxime, 4-amino-1-thio-amino-cyclopentane and its N-acetylated derivative, along with β-bromo, α-bromo and β-chloro, does not affect the antiherpes activity substantially. For some derivatives, a moderate increase of the activity was observed (compounds 16, 18 and 21). Log P was corroborated as a valid parameter to predict the antiherpes activity of cardenolides within this series, since most of the assayed compounds exhibited values outside the optimal range (0.51-1.00).
The insertion of a C-3β amino group in 22, which contains a C-12β-OAc group, is also deleterious to the activity in comparison to compound 10. In turn, the absence of the acetyl group at C-12β-OH in 23 improves the activity moderately. The log P values of 22 (−0.22 ± 0.06) and 23 (−1.12 ± 0.04) are outside the optimal range (0.51-1.00), confirming the low antiherpes activity detected (Table 1).
Among the C-3 esters and amides derived from adamantine (compounds 24, 25, 26 and 27), only 27 induces a significant inhibition of HSV-1, resulting in an IC50 value lower than that of digitoxigenin (1). This finding suggests that a secondary nitrogen at C-4' plays an important role in the antiherpes activity, since the tertiary amide of 26 was inactive, in agreement with the results of other authors. [17] On the other hand, the morpholine 28 was more active than 1, although it has a tertiary nitrogen at C-4'. However, this derivative is an amine, not an amide, which indicates that a nitrogen atom at C-4' possessing free electrons is relevant for the activity. The lipophilicity of compounds 24 (3.12 ± 0.16), 25 (3.17 ± 0.31), and 26 (3.32 ± 0.30) fell outside the optimal range, whereas the log P values of 27 (> 3) and 28 (> 3) could not be determined due to their low solubility in the aqueous phase (Table 1). These two compounds seem to be an exception to the optimal range of log P.
The introduction of 3β-O-dimethyl phosphite and phosphate-O-dimethyl substituents in 29 and 30, respectively, improved the antiviral effects significantly in comparison to digitoxigenin (1). This finding suggests that the presence of groups bearing electronegative atoms is beneficial for the antiherpes activity. Moreover, 29 (0.93 ± 0.05) and 30 (0.86 ± 0.08) showed log P values within the optimal range of lipophilicity (Figure 7). The results show that cardenolides bearing groups other than sugar residues at C-3 can exhibit higher antiherpes activity.

Group VI: 16β-Hydroxylation of the Steroid Nucleus
Gitoxigenin (7) possesses a hydroxy group at the C-16β position and exhibited a lower antiherpes activity than digitoxigenin (1). Acetylation of this group and replacement of the A ring by a six-membered lactam proved to be slightly advantageous for the antiviral activity of 31. In addition, the insertion of an azide function at the C-3β position in 32 did not improve the activity in comparison to 7, and the log P value is outside the ideal lipophilicity range. Similarly, the inclusion of a tertiary amino group at C-3 in 33 improved the antiviral activity only slightly (Table 1). Log P values measured for 7 (0.94 ± 0.03) and 31 (0.76 ± 0.05) are within the optimal lipophilicity range (0.51-1.00), whereas the log P value determined for 32 (1.16 ± 0.05) fell outside the range and 33 showed a borderline value (1.05 ± 0.04), corroborating their low antiherpes activity (Figure 8).

Group VII: Sugar Residues Linked to C-3
The insertion of a digitoxose residue at the C-3-O-β position, as observed in digitoxigenin monodigitoxoside (34), led to a drastic reduction of the IC50 value when compared to the non-glycosylated digitoxigenin (1). The introduction of two digitoxose residues in 35 also resulted in a substantial enhancement of the antiherpes activity. On the other hand, compounds with three or four sugar units in the side chain (36 and 37, respectively) presented higher IC50 values than those of mono- or diglycosides (Table 1). Regarding the lipophilicity of these compounds, only 35 showed a log P value (0.75 ± 0.04) within the established range, and it was the most active derivative in this series (Figure 9). Based on these findings, we can conclude that the length of the oligosaccharide chain is a structural feature relevant to the antiherpes activity of cardenolides.

Group VIII: Introduction of Substituents at C-19 and C-5β in Monoglycoside Derivatives
The comparison of the IC50 values of 34 and 38 showed that the introduction of an aldehyde function at C-19 together with hydroxylation at the C-5β position, as found in 38, enhances antiherpes activity. Additionally, it can be inferred that the methylation of the hydroxy group at the C-3' position in 39 is deleterious for the antiviral activity in relation to 38 (Table 1). In fact, compound 40 showed the lowest IC50 value in this series, probably due to the presence of a rhamnose residue, which bears an axial C-2'-OH and an equatorial C-3'-OH that seem to be important for the antiviral activity. This finding is in agreement with previous data reported for ouabain, for which the role of the hydroxy groups at C-2' and C-3' in cross-linking with the N-terminal 41-K fragment of the Na+,K+-ATPase enzyme has been demonstrated. [18-20] Regarding lipophilicity, 38 and 39 presented log P values (0.52 ± 0.01 and 0.71 ± 0.07, respectively) within the optimal range. Derivative 40 (log P of −0.48 ± 0.02) is an exception to the rule, a finding that could be ascribed to the presence of the rhamnose residue (Figure 10). The above-mentioned results indicate that the nature of the oligosaccharide, the specific glycosidic linkages, and the configuration of the sugar influence the antiherpes activity of cardenolides.

Group IX: Nature of the Sugar Residues Linked to Diglycoside Derivatives
The IC50 values obtained for diglycosides indicate that replacing a digitoxose unit in 35 by a glucose residue in 41 does not affect the antiviral activity. In turn, the substitution of a digitoxose residue by a fucose unit decreased the antiviral activity of 42 substantially in comparison to 41 (Table 1). Fucose possesses a hydroxy group at C-2' and, therefore, it was expected to present antiherpes activity like those CDs containing a rhamnose residue. The differences in the biological response induced by rhamnose and fucose derivatives may be related to the configuration of the hydroxy groups linked to C-2' and C-4': equatorial in rhamnose and axial in fucose. As similarly observed for the aglycones, hydroxylation at the C-12β position increased the IC50 value of 43 in comparison to 35. The IC50 value of 44, a diglycoside derivative hydroxylated at the C-16β position, was higher than that of 41, confirming our previous finding that hydroxylation at this site is deleterious for the antiviral activity. It should also be noted that 42 and 44, the least active compounds of this series, presented log P values (−0.08 ± 0.07 and −0.05 ± 0.02, respectively) outside the optimal range of lipophilicity, whereas 35 (0.75 ± 0.04), 41 (0.54 ± 0.05) and 43 (0.84 ± 0.03) fell within the range (Figure 11).

Group X: Hydroxylation of the Steroid Nucleus and Changes in the Sugar Chain in Triglycoside Derivatives
The methylation of the hydroxy groups at C-3''' and C-4''' of the sugar chain (compounds 45 and 46) increased the antiherpes activity significantly in comparison to 36 (Table 1). This enhancement may be credited to a decrease in lipophilicity since the methylated derivatives 45 (0.58 ± 0.06) and 46 (0.67 ± 0.08) possess log P values within the optimal range, while 36 (1.62 ± 0.03) fell outside.
The hydroxylation of the steroid nucleus at the C-12β and C-16β positions (compounds 4, 47 and 48) increased the IC50 values and the lipophilicity of these derivatives, which were found to present log P values outside the optimum range (1.07 ± 0.03, 1.62 ± 0.03 and 1.19 ± 0.09, respectively). The presence of a C-12β-hydroxy group in the triglycosides 4 and 48 was disadvantageous for the antiherpes activity, as similarly observed for the aglycones previously described in Group III.
The IC50 values of derivatives 36 and 49 revealed that acetylation of the hydroxy group at C-4''' did not affect the antiviral activity significantly. However, peracetylation of the sugar side chains (compounds 50 and 51) reduced the antiherpes activity dramatically (Figure 12). This finding is in agreement with a previous study which demonstrated that only the first sugar residue establishes a hydrogen bond interaction with the subunit of the Na+,K+-ATPase. [21] Unfortunately, it was not possible to determine the log P values for the peracetylated derivatives due to their low solubility in the aqueous phase, but they probably exhibit lipophilicity outside the optimal range. Therefore, the nature and degree of substitution on the sugar chain may substantially affect the antiherpes activity of cardenolides.

Group XI: Sugar Composition of Tetraglycoside Derivatives
The tetraglycoside derivatives did not present potent antiherpes activity, and compound 37 was the most active within this group. The hydroxylation at the C-12β position decreased the activity of 52 in comparison to 37, whereas the introduction of a terminal glucose residue reduced the antiherpes activity of 53 and 54 (Table 1). The high IC50 values obtained for these derivatives are probably related to their inadequate lipophilicity, since compounds 37, 52, 53 and 54 showed log P values outside the optimal range (Figure 13).

Group XII: Acetylation of the Hydroxy Group at C-3β
The comparison of the antiherpes activity of digitoxigenin (1) with the antiviral effects elicited by compounds 55 to 60 demonstrates that the introduction of an acetyl group at the C-3β-OH position is not relevant to the antiviral activity. The presence of an α,β-unsaturated lactam ring at C-17 in 59, or of a C-19 amino group and a 5β-OH in 60, was also irrelevant to the antiherpes activity. Acetylation at C-3β-OH increased the lipophilicity considerably, and all derivatives of this series presented log P values outside the ideal range, which may explain their low antiherpes activity (Figure 14).

Group XIII: Miscellaneous Steroids
Compounds 61, 62 and 63 lack important features for the antiviral activity, namely an unsaturated lactone ring at C-17, a hydroxy group at C-14β (except 62), and C-3β-OR groups; therefore, they exhibited marginal antiherpes activity. Derivative 64 exhibits these favorable structural features, and the insertion of an amine group at C-19 increased its antiviral activity in comparison to digitoxigenin (1). Similarly, the presence of a Δ16,17 unsaturation and an azide group at C-3 enhanced the activity of 65 (Figure 15).
To the best of our knowledge, this is the first SAR study addressing the antiherpes activity of cardiac glycosides undertaken with 65 CDs obtained from natural sources or by semisynthesis. Previous SAR studies investigated the effects of CDs on cancer cell lines and the Na+,K+-ATPase enzyme. [22-25] The results described herein disclose common structural features required for antiherpes activity and Na+,K+-ATPase enzyme inhibition, namely an α,β-unsaturated lactone ring at C-17, a β-OH group at C-14, and C-3β-OH substituents. However, a correlation between the inhibition of this enzyme and the anti-HSV-1 activity of the evaluated compounds was not demonstrated (data not shown). In addition, another cardenolide, 21-benzylidene digoxin, has been reported to inhibit the multidrug exporter Pdr5p with potent cytotoxic activity and low inhibitory effect on the Na+,K+-ATPase enzyme. [26] A recent publication questions some widely accepted dogmas in Na+,K+-ATPase research, including but not limited to the specificity of [...]; the contradictory findings on the digitalis-induced signaling function of Na+,K+-ATPase; and the doubts about the Na+,K+-ATPase structure in native cell membranes. [27] Therefore, the role played by the Na+,K+-ATPase in the antiherpetic activity of digitalis cardenolides should be investigated further and in greater depth.
Figure 13. Chemical structures of CDs (group XI) employed to evaluate the effect of the sugar composition on the antiherpes activity of tetraglycoside derivatives.
Taken together, these results revealed the structural features required for the antiherpes activity of cardenolides, i.e., the presence of an α,β-unsaturated lactone ring at C-17, a β-OH group at C-14 and C-3β-OR substituents. Glycosides were found to be more active than their corresponding genins, especially monoglycosides containing a rhamnose residue. In addition, the presence of a C-19 aldehyde group resulted in more active compounds than those bearing a C-19 methyl group, and insertion of a C-5β-hydroxy group enhanced the activity significantly.
In addition to the qualitative SAR study, the log P values (n-octanol/water partition coefficient) measured experimentally for the cardenolides allowed us to determine a range of lipophilicity (log P between 0.51 and 1.00) associated with the optimal antiherpes activity. The log P values were measured for 56 out of the 65 evaluated compounds. Among the 17 most active CDs, eight fell within the optimum range demonstrating the relevance of log P for predicting anti-HSV-1 activity. Conversely, only three out of the 39 remaining compounds with low antiherpes activity presented log P values within the optimum range, thus accounting for approximately 8 % of false positive results.
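The counts above can be turned into simple rates. The sketch below is only an arithmetic illustration using the figures stated in the text (8 of 17 actives and 3 of 39 inactives inside the log P window); the function names are hypothetical, and treating the window bounds as inclusive is an assumption.

```python
OPTIMAL_LOG_P = (0.51, 1.00)  # log P window reported in the text

def in_optimal_range(log_p, window=OPTIMAL_LOG_P):
    """True when log P lies inside the window (bounds assumed inclusive)."""
    low, high = window
    return low <= log_p <= high

def window_rates(actives_in, actives_total, inactives_in, inactives_total):
    """Fractions of active and of inactive compounds that fall in the window."""
    return actives_in / actives_total, inactives_in / inactives_total

# Counts stated in the text: 8 of 17 actives, 3 of 39 inactives in the window
hit_rate, false_positive_rate = window_rates(8, 17, 3, 39)
print(f"actives inside the window:   {hit_rate:.1%}")
print(f"inactives inside the window: {false_positive_rate:.1%}")

# Mean log P values quoted in the text for compounds 35 and 36
print(in_optimal_range(0.75), in_optimal_range(1.62))
```

Fewer than half of the active compounds fall inside the window, so the window is better read as a filter with few false positives than as a complete description of the actives.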
Previous reports of SAR studies correlating the lipophilicity of cardiac glycosides and their Na+,K+-ATPase enzyme inhibition led to controversial results. [28,29] One group reported a positive correlation between the Na+,K+-ATPase enzyme inhibition and lipophilicity of cardiac glycosides, whereas another group showed a negative correlation between the lipophilicity of cardioactive steroids and their inotropic and Na+,K+-ATPase enzyme inhibitory potencies. [30] The association between lipophilicity and bioactivity has been investigated for other classes of compounds as well as for other biological effects. [31,32] It is worth mentioning that lipophilicity contributes to the interaction of drugs and target proteins; however, the nature of the different functional groups in the drug, their spatial distribution, and other physicochemical features are crucial to determine biological activity. [33] Therefore, although our experimental data disclosed a range of lipophilicity associated with the optimal antiherpes activity of cardenolides, the compounds identified as exceptions to the proposed model are clear evidence that other features are relevant to the biological effect. It should be noted that the experimental log P values obtained in the present work differ significantly from the data generated by the Marvin calculation program reported in Table 1. This finding clearly demonstrates the limitations of a theoretical approach to predict lipophilicity for cardenolides.
Aiming to overcome this limitation, quantitative structure-property relationship (QSPR) models were generated to predict the lipophilicity of CDs using the experimental results obtained herein as input. Initially, 32 HQSAR models were created by varying the fragment distinction parameter, combining the atom flag with all the other flags. The top three models of this parameter screening step (Table 2) presented an acceptable robustness (q² > 0.6).
Model 4 was selected since it was generated by using atoms and H-bond acceptors and donors as the fragment distinction parameter. This model afforded the highest q² value and was simpler than the other two models, which have more than three parameters in the fragment distinction. In the second parameter variation step, increasing the fragment size from 4-7 atoms to up to 10 atoms improved the robustness of the models. Taken together, these results indicate that the atom type, the H-bond acceptors and donors, and larger fragments (up to 10 atoms) explain better the lipophilicity of the studied CDs.
Subsequently, models 38, 47, and 54 were validated by using the test set compounds. From all calculated metrics, model 47 showed the best predictability. These three selected models also presented external validation metrics such as r²ext, q²F1, q²F2, and CCC in accordance with the internal validation metrics (Table 3).
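For readers unfamiliar with these external validation metrics, the sketch below shows one common way to compute them. It is an illustration under stated assumptions, not the software used in this work: r²ext is taken here as the squared Pearson correlation between observed and predicted values, the observed values are log P figures quoted in the text, and the predictions and the training-set mean are hypothetical.

```python
from statistics import mean

def external_metrics(y_obs, y_pred, y_train_mean):
    """r2_ext, q2_F1, q2_F2 and the concordance correlation coefficient (CCC)."""
    n = len(y_obs)
    obs_mean, pred_mean = mean(y_obs), mean(y_pred)
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_train = sum((o - y_train_mean) ** 2 for o in y_obs)  # reference: training mean
    ss_test = sum((o - obs_mean) ** 2 for o in y_obs)       # reference: test mean
    ss_pred = sum((p - pred_mean) ** 2 for p in y_pred)
    cov = sum((o - obs_mean) * (p - pred_mean) for o, p in zip(y_obs, y_pred))
    return {
        "r2_ext": cov ** 2 / (ss_test * ss_pred),
        "q2_F1": 1 - press / ss_train,
        "q2_F2": 1 - press / ss_test,
        "CCC": 2 * cov / (ss_test + ss_pred + n * (obs_mean - pred_mean) ** 2),
    }

# Observed log P values quoted in the text (compounds 38, 35, 1, 36, 40)
y_obs = [0.52, 0.75, 1.18, 1.62, -0.48]
# Hypothetical predictions and hypothetical training-set mean
y_pred = [0.60, 0.70, 1.05, 1.50, -0.30]
metrics = external_metrics(y_obs, y_pred, y_train_mean=0.90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

q²F1 and q²F2 differ only in the reference mean (training set versus test set), while CCC additionally penalizes a systematic offset between observed and predicted values.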
The results shown in Tables 2 and 3 indicate that the three models have similar robustness, but model 47 showed a better external predictive power and could be employed for the prediction of log P values of new cardenolide derivatives. For comparison, the log P values of the dataset compounds were also calculated using other software (Supporting Information, Figure S2), and the three generated models (38, 47, and 54) outperformed the other tested methods, with calculated mean absolute errors (MAE) at least 1 log unit lower (Figure 16).
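The MAE comparison can be illustrated with a short sketch. The observed values are log P figures quoted in the text for compounds 38, 35, 1 and 36; the two sets of predictions and the method names are hypothetical and do not correspond to any of the models or programs compared in Figure 16.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Experimental log P values quoted in the text (compounds 38, 35, 1, 36)
observed = [0.52, 0.75, 1.18, 1.62]

# Hypothetical predictions from two hypothetical methods
predictions = {
    "method_a": [0.60, 0.70, 1.05, 1.50],
    "method_b": [1.40, 1.90, 2.30, 2.90],
}

mae_by_method = {
    name: mean_absolute_error(observed, values)
    for name, values in predictions.items()
}
best = min(mae_by_method, key=mae_by_method.get)
for name, mae in mae_by_method.items():
    print(f"{name}: MAE = {mae:.2f} log units")
print("lowest MAE:", best)
```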
In parallel with the SAR investigation, predictive models based on the available antiviral data were developed. After an initial retrospective validation, these complementary models could be applied in a screening program to identify and/or optimize novel antiviral CDs. Two strategies were employed to achieve this goal: the development of a 3D-pharmacophore model based on common structural features of the most active molecules, and the construction of a PLS-DA model based on molecular descriptors.
The 3D-pharmacophore model was based on three highly active structures (compounds 35, 40, and 45). The final refined model comprised a total of six features (one optional) (Figure 17), which were closely related to the classical pharmacophore features identified in the qualitative SAR: hydrogen-bond acceptor group located at the lactone ring, hydrophobic feature at the C-18 methyl group, and hydrogen-bond donor or acceptor groups at C-14. The remaining features are hydrogen-bond acceptor groups located in the substituent at C-3. To improve predictivity, one of the features was selected as optional during the virtual screening accounting for differences in sugar moieties of aglycones and glycosides.
After model refinement, the complete CD database was screened virtually. Compounds were classified as inactive or active according to the cut-off established during the initial antiviral evaluation (≥ 30 % HSV-1 inhibition). This model properly identified 89 % of the truly active CDs (23 out of 27) and 63 % of the truly inactive CDs (14 false positive hits). The [...] (compounds 18, 23, 31, and 64). Hence, the model performance was considered appropriate and capable of discriminating compounds from chemically related series, including aglycones and glycosides. The model can therefore be employed as a prospective tool to find novel, as yet unidentified antiviral CDs to be investigated in the future.
The second in silico approach was a multivariate classification model (PLS-DA) built in SIMCA. This model relies on molecular descriptors for distinguishing between active and inactive compounds and allows a better evaluation of the molecular features important for the antiviral activity. After the stepwise exclusion of descriptors that did not influence the discriminative power of the model significantly, the final model comprised six descriptors to produce two principal components, resulting in R² of 0.8 and Q² of 0.4. The final descriptors included MLOGP (Moriguchi octanol-water partition coefficient) and Mp (mean atomic polarizability scaled on carbon atom), both correlated negatively with the active compounds. These descriptors corroborate the negative influence of high lipophilicity on the antiviral activity as discussed previously in this study (hydrophobic molecules have higher log P values and are more polarizable). The remaining descriptors correlated positively with the classification of active compounds and disclosed the relevance of molecular volume and hydrogen-bond groups for the antiviral activity: TPSA(Tot) (topological polar surface area using N, O, S, P polar contributions), GGI2 (topological charge index of order 2), Hy (hydrophilic factor), and DISPv (displacement value weighted by van der Waals volume). The model was able to classify correctly 67 % of active (12 out of 18) and 85 % of inactive (22 out of 26) CDs of the training set and reached 78 % of active (7 out of 9) and 83 % of inactive (10 out of 12) CDs of the test set. The predicted score plots are shown in Figure S3, available in the Supporting Information.
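The classification rates quoted for the PLS-DA model follow directly from the counts given above. The sketch below reproduces that arithmetic; the overall accuracy is not reported in the text and is derived here only for illustration, and the function name is hypothetical.

```python
def classification_rates(true_pos, n_active, true_neg, n_inactive):
    """Sensitivity, specificity and overall accuracy from classification counts."""
    sensitivity = true_pos / n_active
    specificity = true_neg / n_inactive
    accuracy = (true_pos + true_neg) / (n_active + n_inactive)
    return sensitivity, specificity, accuracy

# Counts stated in the text: (correct actives, actives, correct inactives, inactives)
splits = {
    "training set": (12, 18, 22, 26),
    "test set": (7, 9, 10, 12),
}

results = {name: classification_rates(*counts) for name, counts in splits.items()}
for name, (sens, spec, acc) in results.items():
    print(f"{name}: actives {sens:.0%}, inactives {spec:.0%}, overall {acc:.0%}")
```

The two sets together contain 27 active and 38 inactive compounds, which matches the 65 CDs and the 27 actives defined by the 30 % cut-off.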
The SAR study reported herein allowed us to identify some structural features required for the antiherpes activity of cardenolides, which were also reported as relevant for their cytotoxicity and anti-human cytomegalovirus activity, namely the steroidal aglycone, the nature and length of the oligosaccharide chain, some specific glycosidic linkages, the configuration of the sugar, and the degree of substitution on the sugar. [34] Furthermore, the obtained data demonstrate that the antiherpes activity of cardenolides bearing groups other than sugar residues at C-3 can be enhanced. Nevertheless, it cannot be ruled out that cardenolides interact with different cellular targets, as evidenced by the recent finding that they are able to bind to angiotensin-converting enzyme 2 of human lung cells, exhibiting anti-SARS-CoV-2 activity. [35] Despite the narrow therapeutic index of cardenolides, it is important to point out their high antiviral activity and the fact that therapy for viral infections is usually not prolonged, which minimizes the occurrence of toxic effects. Therefore, cardenolides should be thoroughly evaluated as antiviral agents, aiming at their introduction in therapy, especially for topical applications like treating labial herpes, where toxic effects would be minimized. It should also be remembered that cardenolides like digitoxin, digoxin, and lanatoside C have been used in clinical practice for over a century, administered by the oral route, indicating that drug repositioning should be considered for their short-term use to treat viral infections, herpes included. Additionally, cardenolides act mainly on the host cells, which are less prone to mutations than viral targets. [36] The limiting factor for the pre-clinical evaluation of cardenolides is the use of rodent models, whose Na+,K+-ATPase is less sensitive to these compounds. [37,38] As a consequence, rodent models are not adequate to evaluate the activity and/or toxicity of cardenolides, which limits the in vivo studies that can be carried out with these compounds.

Conclusion
In conclusion, this study disclosed the major structural features related to the antiherpes activity of CDs. Additionally, an optimal range of lipophilicity required for the antiviral activity was established. The log P was shown to successfully discriminate the active compounds and may represent an important tool for screening and optimization studies. In this sense, QSPR models were generated to predict the lipophilicity of CDs. Furthermore, the in silico studies allowed the development of two predictive models potentially useful to identify and/or optimize novel antiherpes CDs. The developed computational models performed properly in the retrospective analysis, showed complementary performance in predicting active and inactive compounds, and provided informative structure-property trends for antiherpes activity. The applicability of the developed models may be limited to structurally similar compounds and, therefore, should be further validated experimentally through a prospective virtual screening campaign focused on the identification of new optimized antiviral CDs. Since cardenolides comprise a large and structurally diverse group of secondary metabolites, modeling and predicting their biological activities is not an easy task. To date, this is the first report of a retrospective virtual screening of antiherpes CDs, although several groups have employed in silico tools to predict and understand the multiple activities of this class of compounds (i.e., Na+,K+-ATPase enzyme inhibition, cytotoxic effects, and other antiviral approaches). [24-26,39]

Experimental Section
Compounds
A series of 65 CDs was employed in this study; they were obtained from Digitalis lanata leaves [40,41] or from commercial suppliers (Extrasynthèse, Genay, France; Merck, Darmstadt, Germany; Boehringer, Mannheim, Germany; Carl Roth, Karlsruhe, Germany) or by fungi biotransformation [42] or by semisynthesis. [43-45] The purity of the tested compounds was checked by HPLC and NMR analyses (at least 95 % purity; data not shown). A complete list of SMILES codes is available in Supporting Information (Table S1).

Determination of Antiherpes Activity
Vero cells (ATCC CCL 81) were infected with approximately 100 plaque forming units (PFU) of HSV-1 (KOS strain) for 1 h at 37°C. Afterwards, cells were overlaid with minimum essential medium (MEM) containing 1.5 % carboxymethylcellulose either in the presence or absence of different concentrations of the compounds. After 72 h of incubation at 37°C, cells were fixed and stained with naphthol blue-black and viral plaques were counted. The results were expressed as IC50 values (n = 3), defined as the concentration that reduced the number of viral plaques by 50 % compared with untreated controls, as well as the percentages of HSV-1 inhibition at 1 μM (n = 1-3). Acyclovir was used as a positive control. [46]

Determination of Log P
Each compound (0.3 μM) was accurately weighed and transferred to a safe-lock flask (1.5 mL capacity), followed by the addition of 700 μL of water pre-saturated with n-octanol. The solution was sonicated for 20 min in an ultrasonic bath. Subsequently, 700 μL of n-octanol saturated with water was added to the flask, and the mixture was agitated in a mixing block (MB-101, Bioer Technology, China) at 800 rpm for 15 h, at 25°C. The phases were separated by centrifugation at 8,400 × g for 10 min, and approximately 600 μL of each phase was carefully collected to avoid cross contamination with the opposite phase and stored at −20°C in safe-lock flasks. Before HPLC analyses, the phases were centrifuged at 8,400 × g for 15 min. Toluene (log P of 2.8) was employed as a reference compound to assure the reliability of the results, whereas a mixture of water and n-octanol prepared exactly as described for the experiments was used as the blank. For highly lipophilic compounds (log P > 3.0), 1.0 mL of the aqueous phase (from two replicates) was concentrated to dryness in a centrifugal vacuum concentrator at 55°C (Labconco Centrivap, USA). The residues were solubilized in 150 μL of methanol, and an aliquot (90 μL) was injected into the HPLC system. Similarly, the n-octanol phases from two replicates were combined and adequate volumes were taken for HPLC analyses. For basic and acidic compounds, log P values were determined in McIlvaine's buffer pH 4.0 and 8.0 (in triplicate) and expressed as the logarithmic values of the ratio between compound concentrations (measured as peak areas) in the n-octanol and water phases.
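The log P calculation described above, the decadic logarithm of the ratio between the concentrations in the n-octanol and water phases measured as HPLC peak areas, can be sketched as follows. This is an illustrative sketch only: the function names, the optional correction factors (for instance, to account for a phase that was concentrated or diluted before injection), and the example peak areas are assumptions, not values from the study. The window of 0.75 ± 0.25 is the optimal lipophilicity range reported in this work.

```python
import math


def log_p(area_octanol: float, area_water: float,
          factor_octanol: float = 1.0, factor_water: float = 1.0) -> float:
    """Return log10 of the n-octanol/water concentration ratio.

    Peak areas stand in for concentrations. The optional factors are
    hypothetical corrections for concentration or dilution steps applied
    to a phase before HPLC injection.
    """
    if area_octanol <= 0 or area_water <= 0:
        raise ValueError("peak areas must be positive")
    conc_octanol = area_octanol * factor_octanol
    conc_water = area_water * factor_water
    return math.log10(conc_octanol / conc_water)


def in_optimal_range(value: float, center: float = 0.75,
                     half_width: float = 0.25) -> bool:
    """Check whether a log P value lies within center +/- half_width."""
    return abs(value - center) <= half_width


# Hypothetical peak areas, chosen only to illustrate the arithmetic
example = log_p(area_octanol=5600.0, area_water=1000.0)
print(round(example, 2), in_optimal_range(example))
```

With these hypothetical areas the ratio is 5.6, so the computed log P falls inside the optimal window, whereas the toluene reference (log P of 2.8) would fall outside it.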

Computational Modeling
Hologram-based QSAR Models for Log P Prediction
We selected 50 cardenolides for the generation of hologram-based QSAR (HQSAR) models. Compounds whose log P was assessed under different experimental conditions (pH and buffer) were excluded from the dataset, as were compounds with no exact log P values (e.g., log P > 3). The lowest energy conformer of each dataset compound was generated by using OMEGA 2.3.4 software, [47] followed by calculation of the ionization states at pH 7.4 with QUACPAC 1.7.0.2 software (OpenEye Scientific Software, Santa Fe, NM. http://www.eyesopen.com). Training and test set compounds (80 and 20 %) were selected by using a hierarchical cluster analysis (HCA) considering molecular structure (PubChem fingerprints), physicochemical properties (molecular weight, number of H-bond donors and acceptors, polar surface area, number of rotatable bonds and sp3 carbon fraction), and experimental values of log P, according to our previous work. [48] The KNIME platform [49] was employed at this step. Training set compounds were employed to generate the models, and the test set (compounds 1, 13, 20, 26, 35, 38, 40, 53, 58 and 65) was used to perform external validations. HQSAR models were generated by using the Sybyl X 2.1 package (Tripos Inc, Sybyl-X suite, 2013. www.certara.com/software/molecularmodeling). Initially, HQSAR models were generated by combining all fragment distinction parameters and fixing the fragment size at 4-7 atoms. The most robust model (highest internal validation regression coefficient [q²]) was selected to vary the fragment size parameter. At this step, models were generated by using intervals of 2, 3, and 4 atoms ranging from 1 to 11 atoms and fixing the best fragment distinction from the previous parameter screening step. Both parameter screening steps were done by using all default hologram lengths to generate HQSAR models. As the training set comprises 40 compounds, PLS models were generated by using up to 8 principal components, following Occam's razor. [50] The three most robust models were selected for external validation, which was assessed by calculating the q²F1, q²F2, q²F3, CCC and r²m metrics. [51] For comparison, we also calculated log P values by using the Sybyl X 2.1 package, PaDEL descriptor software, [52] the RDKit node of the KNIME platform, [53] QikProp software (Schrödinger Release 2018-3: QikProp, Schrödinger, LLC, New York, NY, 2018) and the AlogPS [54] and Molinspiration (Molinspiration. Calculation of Molecular Properties and Bioactivity Score. Accessed on October 20, 2018. Available at: http://www.molinspiration.com/cgi-bin/properties#) webservers.
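For orientation, the external validation metrics named above can be computed from experimental and predicted log P values as sketched below. This sketch uses the commonly cited definitions of q²F1, q²F2, q²F3 and the concordance correlation coefficient (CCC); it is not taken from the study's own scripts, the function name and the data are hypothetical, and r²m is omitted because its definition depends on regression-through-origin details not given in the text.

```python
from statistics import fmean


def external_metrics(y_train, y_test, y_pred):
    """Return q2_F1, q2_F2, q2_F3 and CCC for a test set.

    y_train: experimental values of the training set
    y_test:  experimental values of the test set
    y_pred:  model predictions for the test set (same order as y_test)
    """
    n_train, n_test = len(y_train), len(y_test)
    mean_train = fmean(y_train)
    mean_test = fmean(y_test)
    mean_pred = fmean(y_pred)

    # Predictive residual sum of squares on the test set
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    # Test-set deviations from the training mean (F1) and test mean (F2)
    ss_test_vs_train = sum((y - mean_train) ** 2 for y in y_test)
    ss_test = sum((y - mean_test) ** 2 for y in y_test)
    # Training-set total sum of squares (F3)
    ss_train = sum((y - mean_train) ** 2 for y in y_train)
    # Terms for the concordance correlation coefficient
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    cross = sum((y - mean_test) * (p - mean_pred)
                for y, p in zip(y_test, y_pred))

    return {
        "q2_F1": 1 - press / ss_test_vs_train,
        "q2_F2": 1 - press / ss_test,
        "q2_F3": 1 - (press / n_test) / (ss_train / n_train),
        "CCC": 2 * cross / (ss_test + ss_pred
                            + n_test * (mean_test - mean_pred) ** 2),
    }


# Hypothetical log P values, for illustration only
train = [0.2, 0.8, 1.5, 2.1, 2.9]
test = [0.5, 1.0, 2.0]
predicted = [0.6, 1.1, 1.8]
metrics = external_metrics(train, test, predicted)
print({name: round(value, 3) for name, value in metrics.items()})
```

All four metrics equal 1 for perfect predictions and decrease as the predictions deviate from the experimental values; q²F2 drops to 0 when the model does no better than predicting the test-set mean.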

Pharmacophoric Modelling and PLS-DA Study
In silico studies were carried out on workstations running Windows 7 and/or Linux CentOS 5, with the aim of generating computational models able to discriminate between active and inactive CDs based on their molecular and three-dimensional structural properties. The total dataset comprising 65 compounds was initially classified into two categories, active and inactive, based on the in vitro antiherpes screening (Table 1). For the computational modeling, active compounds were considered as those that inhibited HSV-1 replication by at least 30 % at 1 μM. For the generation of a common-feature 3D pharmacophore model, the LigandScout/espresso module was used (LigandScout 3.03b, Inte:Ligand, Vienna, Austria). The model was based on three highly active CDs from our experimental dataset and then used for the virtual screening of the complete in-house DC-database. As a second in silico approach, the experimental results were used to develop a partial least-squares projection to latent structures discriminant analysis (PLS-DA) model. For that, molecular descriptors were generated in E-Dragon after 3D structural optimization in the incorporated CORINA. [55] A total of 297 molecular descriptors was obtained and, after the removal of descriptors with zero variance, 186 descriptors were employed as the starting point for model development. The dataset was divided into a training set (2/3, model development) and a test set (1/3, prediction evaluation) based on the selection of structurally diverse compounds by using ChemGPS, a tool for navigation in the biologically relevant chemical space. [56] Detailed information is available in Figure S1 in the Supporting Information. Finally, a PLS-DA model was developed in SIMCA, version 13.0.3.0 (Umetrics, Umeå, Sweden). The model was optimized by stepwise removal of molecular descriptors (guided by variable importance in projection, VIP, and predictive power in the training set), and the statistical validity of the model was assessed by leave-one-out cross-validation of R² (Q²) and a random permutation test.
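Two of the data preparation steps described above, the labelling of compounds as active or inactive at the 30 % inhibition cut-off and the removal of zero-variance descriptors, can be illustrated with a minimal sketch. It does not reproduce the E-Dragon, ChemGPS or SIMCA workflow; the function names, the compound identifiers, the inhibition values and the descriptor values (including the constant descriptor "nConst") are hypothetical.

```python
def label_activity(inhibition_percent: float, threshold: float = 30.0) -> str:
    """Label a compound by its HSV-1 inhibition (%) at 1 uM.

    The 30 % cut-off follows the activity definition given in the text.
    """
    return "active" if inhibition_percent >= threshold else "inactive"


def drop_zero_variance(descriptors: dict) -> dict:
    """Keep only descriptors whose values differ across compounds.

    descriptors maps a descriptor name to its list of values,
    one value per compound.
    """
    return {name: values for name, values in descriptors.items()
            if len(set(values)) > 1}


# Hypothetical inhibition data (% at 1 uM) for three compounds
inhibition = {"CD-a": 85.0, "CD-b": 30.0, "CD-c": 12.5}
labels = {name: label_activity(value) for name, value in inhibition.items()}

# Hypothetical descriptor table; "nConst" is constant and carries no information
table = {
    "MLOGP": [1.2, 0.4, 2.9],
    "TPSA(Tot)": [120.0, 150.5, 96.3],
    "nConst": [1.0, 1.0, 1.0],
}
filtered = drop_zero_variance(table)

print(labels)
print(sorted(filtered))
```

A descriptor that takes the same value for every compound cannot contribute to discriminating the two classes, which is why such columns are removed before model development.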

Supporting Information
SMILES codes, total dataset projected into drug chemical space, training and test sets are available as Supporting Information.