Prediction of Passive Membrane Permeability by Semi‐Empirical Method Considering Viscous and Inertial Resistances and Different Rates of Conformational Change and Diffusion

Abstract Membrane permeability is an important property of drugs in adsorption. Many prediction methods work well for small molecules, but the prediction of middle‐molecule permeability is still difficult. In the present study, we modified a classical permeability model based on Fick's law to study passive membrane permeability. The model consisted of the distribution of solute from water to membrane and the diffusion of solute in each solvent. The diffusion coefficient is the inverse of the resistance, and we examined the inertial resistance in addition to the viscous resistance, the latter of which has been widely used in permeability prediction. Also, we examined three models changing the balance between the diffusion of solute in membrane and the conformational change of solute. The inertial resistance improved the prediction results in addition to the viscous resistance. The models worked well not only for small molecules but also for middle molecules, whose structures have more conformational freedom.


Introduction
Permeability is one of the most important factors in a drug's adsorption and target-binding properties in cells. The understanding and predicting membrane permeability of molecules have been studied for last few decades. It is still one of the hot topics, especially under circumstances where the molecular weights of drug molecules have been increasing and larger molecules often face the lower permeability than smaller drug molecules do. There have been a number of reports on permeability.  The main permeability problems are adsorption in human intestine, extraction from kidney, penetration of the blood-brain barrier, skin permeability, and the permeability of the cell membrane to approach target proteins in cells. Caco-2 cells and MDCK cell systems are two of the model systems that mimic human intestine adsorption and extraction from kidney, respectively. Parallel artificial membrane permeability assay (PAMPA) systems have been popular in vitro assay systems for the past 20 years. [26][27][28][29][30][31][32] PAMPA systems have been improved to mimic in vivo permeability by trying various membrane materials, pH levels of donor and acceptor liquids and the other conditions. Certain mechanisms underlie permeability. [2,3,26] Namely, solute molecules penetrate the cell membrane by diffusion (transcellular), the solute molecules go through the tight junction (paracellular), and transporters and channel proteins work in the influx and efflux processes. Among these mechanisms, PAMPA permeability represents transcellular passive permeability.
Recent advances in molecular dynamics (MD) simulations enable the understanding and evaluation of trans-cellular passive permeability. [4][5][6][7][8][9][10][11][12][13][14][15] In these calculations, MD simulations have been applied to explicit atomic models of membrane molecules with solvent water molecules. Since permeation is a very slow process, biased MD simulations have been popular in this analysis. MD simulations have shown that the distribution of the existence probability of solute and the diffusion of solute in the membrane gave the permeability constant.
On the other hand, many approaches have adopted quantitative structure-property relationship (QSPR) models for passive permeability based on Fick's law. [2,3,26] Previous works have shown the efficiency of this basic theory, and some extensions from this theory have improved prediction accuracy. [16,17] Fick's law explains that permeability is a combination of the transfer of solutes from the donor water into the membrane and the diffusion of the solutes from the donor to the acceptor sides in the membrane. Only the neutral molecule moves into the lipid layer so that the pK a and the partition coefficient (LogP) or the distribution coefficient (LogD) determine the distribution of solute between the donor water and the membrane. The permeability P app is given as follows.
where P app , D, M, and h represent the apparent membrane permeability, distribution coefficient, diffusion constant of the solute, and thickness of the membrane, respectively. The above MD simulations support this assumption. Namely, LogD corresponds to the probability distribution of existence and M corresponds to the diffusion of solute obtained by the MD simulation, respectively. The diffusion constant M is estimated by the Einstein-Stokes relation where k B , T, μ, and R are the Boltzmann constant, temperature, viscosity and radius of the solute, respectively. Since LogD corresponds to the free energy of the transfer from water to membrane, this value could be approximated by the accessible surface area (ASA) method, the generalized Born (GB) method, polar surface area (PSA), and so on. [33] Thus, the following linear regression model can estimate LogP app .
where x, c, and N descriptor are the molecular descriptors, the fitting parameters, and the total number of descriptors, respectively, and the fitting parameters represent the characteristic properties of the membrane.
Recent advances in pharmaceutical research have increased the molecular size of drugs, and middle-molecule drugs, with molecular masses > 500 Da, have become popular. In the last few decades, the major drug targets have been receptors and enzymes. Pharmaceutical companies have released several thousands of drug molecules; however, they are now facing a severe depletion in the druggable targets with conventional strategies. Research activities focusing on protein-protein interaction (PPI) inhibitors have been started instead of the projects on finding ligands to receptors and enzymes. [34,35] To inhibit PPIs, larger molecules are often preferable than the small molecules. However, the hydrophobic and often insoluble physical properties of middle molecules that are distinct from small ligands cause serious problems in their development stages.
While one of the major problems of middle molecules is the low membrane permeability, the synthetic accessibility in chemical modifications is also limited. The synthetic processes of middle molecules are more complicated and time-consuming than that of small molecules. Thus, we need the mechanistic understandings to unveil the dominant factors of membrane permeability, rather than blackbox permeability predictions, to guide a rational modification of the middle molecules.
A number of reports suggested that conformational change is essential in permeability and cyclization as well as in the methylation of main-chain amide groups, and some chemical modifications have improved permeability. [18][19][20][21][22] Many studies have challenged this problem by using QSPR models and have successfully predicted membrane permeability, including that of middle molecules. [23] However, some studies have suggested that it is still difficult to predict the membrane permeabilities of middle molecules.
Diffusion constant M in eq. 1 consists of viscous resistance and inertial resistance. Equation 2, which is the inverse of viscous resistance, assumes the slow migration of solute in equilibrium. Permeability is non-equilibrium process in general, and the density of the solute keeps changing in the experiment. When solute molecules push solvent molecules, the inertial resistance is proportional to the cross section of the solute. In general, we have ignored inertial resistance without examining the actual experimental data. In a non-equilibrium state, the density of solute in the membrane cannot reach equilibrium. The solute can pass through the membrane before the conformational ensemble of the solute reaches equilibrium distribution (see Figure 1A). [4] In this case, the distribution of solute is different from the distribution coefficient D observed in equilibrium.
In the present study, we examined the diffusion process by using a classical QSPR model. We considered inertial resistance the same as viscous resistance. Also, we developed formulas considering the above two cases in which the conformational change occurs faster or slower than the diffusion in the membrane. [5] Full Paper www.molinf.com

Methods
We classify the membrane permeability process into two models according to the speed of permeability. [5] If the membrane permeation is faster than the conformational change of the solute in the membrane, each conformer penetrates the membrane while keeping the 3D structure of each conformer, and the dominant conformer in water could be the main contributor to the permeability (permeability model A in Figure 1A). If the membrane permeation is slower than the conformational change of the solute in the membrane, the most stable conformer in the membrane could contribute to the permeability (permeability model B in Figure 1B). In this article, we apply Fick's law to membrane permeability, since the physical meaning of Fick's law is clear and simple.

Permeability Model A
Since each solute penetrates the membrane without conformational change, the distribution of conformers in the membrane is equal to that in water solvent. The total permeability ratio (P app all ) is the summation of the permeability of all conformers. Here, P app (a), d wat (a), and N conformer are the apparent permeability of the a-th conformer, the distribution of the a-th conformer in water, and the total number of conformers, respectively. The following relation is obvious.
where the brackets < > wat stand for the average over the distribution in water. The fraction of the a-th conformer in water (= d wat (a)) is given by nðbÞ expðÀ E wat ðbÞ=ðk B TÞ (6) where E wat (a) and n(a) are the energy and the degeneracy of the a-th conformer, respectively. When we apply Kubo's cumulant expansion to eq. 5, the first two terms of the expansion are as follows. [36] LogP all app ¼< LogP app ðaÞ > wat À 1 2 LogP app ðaÞ 2 À LogP app ðaÞ The first term is the average LogP app and the second is the deviation of LogP app . A linear regression model that is a weighted summation of the descriptor values approximates the LogP app of the ath conformer as follows, where c and x(a) are the fitting parameters and the descriptor of the a-th conformer, respectively. If the descriptors x A and x B are independent from each other, , where σ stands for the deviation. Thus, if all the descriptors are independent of each other, eq. 7 becomes as follows.

www.molinf.com
In permeability model A, the deviations of all the descriptors, including the ASA and the radius of the solute, contribute to LogP app all in addition to the average values of these descriptors.

Permeability Model B
If the membrane penetration is much slower than the conformation change, the distribution of conformers reaches equilibrium in the membrane and in water. The distribution constant D is given by the partition function of the molecule as follows.
where Z°c t , Z wat , and n(b) are the partition functions in octanol and in water and the degradation number of the bth conformer. E°c t and E wat are the energy of the conformer in octanol and water, respectively. The D value gives the density of molecules on the donor À -membrane interface of the membrane. The summation of diffusions of all the conformers gives the total diffusion.
The fraction of the a-th conformer in the membrane is given by nðbÞ expðÀ E mem ðbÞ=ðk B TÞ (12) As in eq. 7, here we apply Kubo's cumulant expansion to eq. 11, giving In permeability model B, the deviations of the descriptors relating to the diffusion contribute to LogP app all besides the average values of the descriptors. Namely, the deviation of the radius of the solute should contribute to the prediction of LogP app .

Regression and Prediction
Our physical-property prediction method was a principal component regression (PCR) with an L2 regularization term based on the molecular descriptors. [37,38] The regression model was the same as that used in our previous study. Namely, the principal component (PC) analysis projected each compound into each point in a chemical space of the PC, and a multiple linear regression was applied to the molecular coordinates in the chemical space. The principal component axes were selected to minimize the root-meansquare error (RMSE) in regression. The addition of the L2 term reduced the RMSE in the prediction.
The physical property of the i-th molecule V(i) is estimated based on the molecular descriptors {s i b } where b represents the b-th descriptor as follows.
where c j , p i j , and d b j are the intercept parameter of the linear function, the PC vector, and the loading vector, respectively. The PC analysis of the descriptor s i b gives the loading vector d b j and the PC vector (axis) p i j . The values of {c j } are determined by multiple linear regression (MLR). N axis (N axis < N descriptor ) is determined so as to maximize the Q 2 -value obtained by cross validation tests. The parameters are determined based on the learning set and then are used for prediction. The objective function O({c}) consists of the summation of the square error between the experimental (V exptl ) and calculated (V calc ) property values with the following regularization term:

www.molinf.com
Here, the parameter set {c} is determined to minimize the O value. The weighting parameter λ is fixed to a constant (λ = 0.000001) in the present study.
We apply this regression model to P app , LogP, LogD, and pK a . Since the permeability depends on the LogP, LogD, and pK a , the P app prediction model should include the descriptors for LogD, LogP, and pK a prediction. All the prediction models shared the same molecular descriptors. We applied 4-fold CV tests to evaluate the accuracy, and all the values in the tables were predicted in the 4-fold CVs. The CV tests showed RMSE and Q 2 values. The definitions of Q 2 and RMSE are determined as follows.
RMSE ¼ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi P Here, Y pred and Y exp represent the predicted value in validation and the experimental Y value, respectively. In the present study, we do not compare the Q 2 values of different properties, since the variances of the experimental data (denominator in eq. 17) differed from each other among the different data sets. [39]

Regression and Prediction
Models A and B use the degeneracy and energy of the conformers in eqs. 6 and 9 so that we must sample the conformers. Although an exhaustive search of conformers is very difficult and time-consuming, many heuristic conformer generation methods have been proposed. The major approaches are random search and systematic search. [40][41][42][43][44][45][46][47] Especially, the conformer generation of ring systems is more difficult than that of nonring systems. For ring systems, some specialized methods, such as corner and edge flips, have been used. [40,41] We applied a Monte Carlo random-sampling method. [42] In conformer generation, a difficult problem is the estimation of degeneracy (n(a)) in eqs. 6 and 9. Most conformer generation programs focus on the reproduction of the most stable conformers precisely, but the degeneracy is unclear. Long-time MD simulations could measure the degeneracy but are not realistic in this work, since there are too many molecules.
One of the simplest approaches should be a uniform sampling of conformers without solvent bias. If the conformer search was a random sampling of dihedral angles of rotatable bonds of a molecule, the generated structural ensemble should mimic the uniform sampling (see some discussions in APPENDIX A in the supporting information). Unfortunately, this approach could not work in the sampling of large or cyclic compounds, since the ring systems could be opened by the random rotations of the dihedral angles and an atomic collision could occur. Thus, we applied a force field to close the ring systems and to avoid atomic collisions. Our conformer generation program transforms the ring systems of a molecule to nonring systems by removing one of the bonds in the ring. The following energy optimization closes the rings. The force field used was an AMBER-like one including 1-2, 1-3, 1-4, and 1-5 interactions. The 1-4 and 1-5 interaction terms consisted of only van der Waals interactions without electrostatic interactions, since the dielectric constant of water is very different from that of membrane.
A clustering analysis of the conformers generated above estimates the degeneracy number (n(a) in eqs. 6 and 9). The clustering threshold is the heavy atom RMSD < 1.5 Å.

Descriptors
The molecular descriptors consisted of physical and chemical ones. The physical descriptors represent mainly the size of a molecule that is related to the diffusion. The accessible surface area (ASA) is a useful idea with which to represent the solute À solvent interaction. The chemical properties represent mainly the hydrophobicity/hydrophilicity of a molecule that is related to the distribution between water solvent and membrane. In general, charged molecules cannot penetrate a membrane whose dielectric constant is small, but neutral molecules can. The total charge of a molecule is determined by the pK a and the MACCS key, [48] which is a set of substructures that recognizes the functional groups closely related to pK a . [48] The atomic charges were AM1-BCC charges obtained by MOPAC7. [49][50][51] The hydrogen bonds are determined for the hydrogen atoms of OH and NH groups and for the acceptor atoms with lone pairs (O, N, S). The GBSA method was used to calculate the LogD by eq. 10. [33] Since the descriptors include the ASA, the ASA and the LogD term double-counted the solutesolvent interaction. The PCR could combine these dependent terms and thus reduce the number of independent fitting parameters. The atomic solvation parameters in the GBSA method were set to 10 cal/mol/Å2 and À 5 cal/mol/Å2 for water and membrane, respectively.
Models A (eq. 9) and B (eq. 13) include both the average and deviation terms. The average < A > and deviation σ(A) of property A are determined as follows. The conformer generation program in section 2.4 generates the conformers.  (20) where sol represents the solvent (water or membrane). E s°l (i) is the relative energy of the i-th conformer in the solvent compared to that in vacuum.
Since membrane permeability is related to LogD, or to LogP and pK a values as described in eqs. 1, 3, 8, and 13, the descriptors that could approximate LogD, LogP, and pK a values should contribute to P app values. Dissociations of hydrogen atoms depend on the chemical bonds of the functional groups and the electrostatic field generated by the charge distribution of the solute. A previous work proposed a linear correlation between the proton chemical shift and the pK a value, and the chemical shift depends on the electron density on the nucleus. [52] Thus, we adopted the descriptors used in the LogD and pK a predictions. Namely, we used the numbers of hydrogen donors and acceptors, the atomic charges of the hydrogen atoms of the NH and OH groups, and the MACCS key to represent the chemical structures of the solutes.

Data Preparation
The PAMPA permeability data and molecular structures were extracted from ChEMBL database version 24. [53] The ChEMBL assay and compound IDs used are summarized in Table S1. Most of the experimental conditions included pH 7.4 and the observation times were several hours, but the details of the conditions varied. In addition, a variety of membrane materials have been used. [15,[26][27][28][29][30][31][32] A desirable assay data set consists of the P app values obtained under the unique experimental conditions. [18] The careful data selection limited the number and diversity of examined compounds. While there have been differences in the P app values observed through the PAMPA assay, the systems were designed to reproduce P app values of the Caco-2 assay system. To evaluate the wide variety of compounds, the present data set consisted of 737 compound data obtained under the heterogeneous experimental conditions.; There were 70 pairs of the same solute with different P app values; for these 70 pairs, the average difference was 0.401 and the RMSD was 0.500. Thus, we discussed the prediction accuracy of LogP app > 0.5. As with the P app , we also prepared data on the octanol-water distribution coefficient (LogD, 4215 compounds), [54] and the dissociation coefficient (pK a , 240 compounds). [55,56] Astellas Pharmaceutical kindly provided additional experimental data, since there were fewer data on macrocyclic compounds than on acyclic compounds. The observed experimental property data were P app , LogD, and LogS data. The test compounds were 58 selected commercially available macrocyclic compounds. In addition, we asked Enamine Ltd. to evaluate the P app values of two cyclic peptides using the PAMPA assay. The molecular structures and experimental values are summarized in Tables S1 and S2 in the supporting information. [57] We prepared three-dimensional molecular structures of the examined compounds in the present study. The computational procedure was summarized as APPENDIX B in the supporting information. The initial molecular structures were 2D planar structures without hydrogen atoms in the SD file format. A file transfer program, Hgene/myPresto, was used to prepare the three-dimensional molecular structures with hydrogen atoms in electrically neutral forms including acids and amines. The atomic charges were the AM1-BCC charges provided by the MOPAC program. A molecular dynamics simulation program, cosgene/myPresto, gave the energy-optimized structures with the general AMBER force field (GAFF) assigned by the tplgeneL/ myPresto topology generator. [58,59] The energy optimization gave one stable or meta-stable molecular structure for each molecule in vacuum. All the molecular structures were prepared in the Sybyl mol2 file format.
The total data set consisted of 795 data points from the ChEMBL database and the unpublished assay data. Table 1 shows the distributions of molecular weight and the total number of atoms as well as the numbers of heavy atoms, ring structures, rotatable bonds, and atoms included in the biggest ring system of each molecule.

Prediction Models for Models A and B
Our LogP app prediction model represents the dissociation of solute in water, the distribution of solute from donor water (donor) to membrane, and the diffusion of solute in the membrane towards the water (acceptor) phase. Diffusion depends on the solute molecular radius R, and many previous works adopted R as the descriptor in LogP app prediction. [2,16,23,24] The ES term (eq. 2) represents the viscous resistance, while the real solute molecule feels both the viscous resistance and the inertial resistance. The inertial resistance is proportional to the product of the cross section of the solute and the square of the velocity of the molecule. The Taylor series of the total diffusion becomes as follows.
LogM ¼ Log where R, f v , and f i are the average radius of the solute and the respective coefficients of the diffusions related to the viscous and inertial resistance. In the actual calculation, we used the average R (denoted as < R >) instead of R. Thus, we described the diffusion term of LogP app as a weighted summation of < R > , < R > 2 , < R > 3 , and Log < R > . Also, the < R > 3 term represented the permeant's volume, since the transfer of the molecule from the donor phase to the membrane needed a volume change of the membrane, and this change generated additional energy corresponding to pressure in the membrane. In addition to these terms, we adopted the σ(R) term of the cumulant expansion (eqs. 7 and 13). We discussed only a passive permeability. The Fick' law (eq. 1) suggested that the permeability process consisted of several processes in a sequential order (the diffusions in water, the partitioning between water and membrane, the diffusion in membrane) and the LogP app could be given by a summation of the effects of these processes. In addition, our modifications (eqs. 7, 13 and 21) could be represented by linear regression models. Thus, we started from simple linear regression models as the QSPR equations based on Models A (eq. 22) and B (eq. 23). and

Prediction Results by Models A and B: Conformer Dependence
To examine the adequacy of Models A and B, we calculated the conformer dependence of each. Before the validation of the prediction models, we examined the conformer generation. Table 2 shows the average values of < R > wat and < R > mem , σ(A) wat , and σ(A) mem over all of the 795 solute molecules in the dataset, when the number of conformers was set to 100. The conformer generation represented the solute-size change in water and in membrane. The results supported previous findings. [18][19][20][21][22]25] Namely, the solutes in membrane were folded into smaller compact structures than those in water (< R > mem < < R > wat ), and the solutes in water fluctuated more than those in membrane (σ (R) mem < σ(R) water ). These results showed a consensus with the previously reported phenomena. [18][19][20][21][22]25] Thus, we ap- We applied Models A and B to the ensemble of conformers by restricting the maximum number of generated conformers up to 300. Then the 4-fold CVs of LogP app predictions were used to estimate the Q 2 and RMSE values. Figure 2 and Table S3 show the conformer-number dependence of the LogP app prediction. Both Models A and B worked well, and the increase in conformers improved the prediction accuracy. Some previous works showed that the conformer change increased the membrane permeability. [3,6,7,24,25] Namely, if the molecule has some conformers that show hydrophilic or hydrophobic surfaces, it can select stable conformers depending on the solvent (hydrophobic conformer in lipid and hydrophilic conformer in water).
There are two problems that make the validation of the models difficult. One of the problems is the lack of sufficient experimental observations of the conformational dynamics of solute in permeability process. Solution NMR experiments could determine the conformations in different environments including those mimic membrane interiors. However, available NMR experimental data that directly compare aqueous and lipid bilayer environments are rather limited. Often, the limited solubilities of middle molecules prevent the observation of NMR signals in aqueous solution.
In addition, most of the permeability data are for small molecules, which are less flexible than middle molecules and have smaller number of possible conformers. There are only small number of experimentally determined permeability data on middle molecules to establish the dynamicsactivity relationships. The limited number of flexible molecules with membrane permeability data also causes the problem in validating of the models based on the prediction accuracy. Recent progress of the solution NMR and molecular dynamics simulations of membrane systems should elucidate and validate the permeability mechanism in near future. Table S4 also shows the average and maximum CPU times elapsed for one molecule. The longest CPU time was less than 62 minutes. The calculation speed should be faster than the precise MD simulation.
The results obtained by Model A were close to those by Model B. Thus, we combined the models into one by merging their descriptors. Model AB represents the united model, and the fitting parameters were determined in the same way as for Models A and B. The results obtained by Model AB were slightly better than those by Models A and B (see Figure 2 and Table S3). The RMSE values were about 0.5 and close to the deviation of the experimental data (about 0.4). Since these results showed that both Models A and B were possible permeability processes, we could not clarify the ratio of the speeds of conformer change and diffusion. The ratio should be different for each molecule in the data set.
The conformer generation represented two effects. One was the molecular-size change depending on the solvent. The radius of the molecule in water was different from that in membrane. One conformer rich of intra-molecule hydrophobic contacts could be dominant in water and the other conformer rich of intra-molecule hydrogen bonds could be dominant in water. The other effect was the structural fluctuation of the molecule. The dynamics of the molecule in water were different from those in membrane. In this case, multiple conformers existed in water and in membrane, and the population of conformers in water could be different from that in membrane. The deviation terms (σ(R)) in eqs. 22 and 23 correspond to this structural fluctuation of the solute molecule. To evaluate the effect of this deviation term, we removed the deviation terms from eqs. 22 and 23. Figure 2D shows the RMSE values obtained by Models A, B, and AB without the σ(R) terms. The σ(R) terms did not improve the accuracy drastically as the number of con-

Full Paper
www.molinf.com formers increased. These results suggested that the structural change of the solute molecule was important in the LogP app prediction, but the dynamic structural fluctuation of the solute molecule in each condition was not so important in the LogP app prediction.
We examined which terms in eqs. 22 and 23 reflect the conformer number dependence. The permeability process includes the dissociation of solutes in water and the distribution of electrically neutral solute molecules from water to membrane. These processes relate to the LogD, LogP, and pK a values. Thus, the permeability-prediction models (eqs. 22 and 23) should predict the LogD, LogP, and pK a values by adjusting the coefficients {c} of eqs. 22 and 23. Figures 2B and 2 C show the conformer dependences of the predicted LogD and pK a values, respectively. The prediction models worked and the RMSE values of these predictions were similar to those previously reported. These predicted values did not show clear dependence on the number of conformers. Using the same model, only the predicted LogP app values showed the conformer-number dependence among these properties. Considering eq. 1, these results suggested that the conformer-number dependence of LogP app originated from the diffusion process (M in eq. 1) in the present models and that the < R > terms in eqs. 22 and 23 should contribute to the conformernumber dependence of LogP app .
We also examined the RMSE and Q 2 values when all the degeneracies of conformers were identical (n(i) = 1 in eqs. 19 and 20). Figure 2E shows the conformer dependence of the predicted LogP app values with n(i) = 1. The results were worse than those with estimated degeneracy, and increasing the number of generated conformers did not improve the RMSE. These results suggested that the conformer generation worked properly and that the estimation of conformer populations was important in the LogP app prediction.

Contribution of Diffusion Terms to LogP app Prediction
We examined the diffusion process in terms of viscous (ES term, eq. 2) and inertial resistances. Table 3 shows the Q 2 and RMSE values for various combinations of diffusion terms. The diffusion was described by the Log < R > , < R > , < R > 2 , < R > 3 , and σ(R) terms in eqs. 22 and 23. Since the Taylor series in eq. 21 includes higher-order terms than < R > 3 , we examined the effect of < R > 4 too. < R > 3 was proportional to the volume of solute molecule, but it did not originally refer to volume. < R > 2 and < R > 3 correspond to the cross section of the solute molecule and the cross section of it that causes inertial resistance. The < R > and Log < R > terms represented viscous resistance. The < R > 2 and < R > 3 terms improved the accuracy, as did the ES term that corresponded to < R > and Log < R > . The higher-order term (< R > 4 ) and the deviation σ(R) did not improve the results so much.

Molecular Weight and Ring-Size Dependences
We examined how Models A and B worked in the prediction of the LogP app of middle molecules. If the prediction model was adequately constructed, the predicted results should not depend on the molecular size. Since molecules with MW > 500 Da and those with N ring (the number of ringmember atoms of the biggest ring) > 12 are so-called middle molecules and macrocyclic molecules, respectively, we examined the MW and N ring dependences of the predicted data distributions. Figures 3 and 4 show the prediction results obtained by the 4-fold CV of Model B at N structure = 100. Figure 3A and 4 A show the correlations between the predicted and experimental LogP app values, and Figure 3B and 4B show the principal component analysis results as the chemical space of the data points. Since Models A and AB showed the same trends as Model B, these results were omitted. The colors in Figure 3 represent molecular weight (MW). The overall trend was that the set of small molecules showed high permeability and that of large molecules showed low permeability, but the overlaps among different color points were spread across a wide area. The overlap among the smaller and bigger molecules showed wide distribution in the chemical space. The RMSE did not depend on the MW so much. These results suggested that the present prediction model should be robust against the MW difference.
The colors in Figure 4 represent the number of ringmember atoms of the biggest ring (N ring ). The distributions of the same color data points were not localized in the LogP app prediction and chemical space, or the data points of similar size compounds spread across a wide area. The distributions of middle molecules and macrocycles overlapped that of small molecules. The RMSE did not depend on the N ring so much. In addition, Table 4 shows the dependence of the RMSE on the number of rotational bonds (N rot ). N rot represents the flexibility of a molecule. As with the MW and N ring , the N rot dependence of LogP app was weak. These results suggested that the prediction model worked well in the wide molecular-size range and should be robust against various molecular sizes and shapes.
We depicted the PCA plots based on Mordred descriptors to examine the chemical space of the collected molecules in the present study. [61] Mordred consists of 1826 molecular descriptors and it has been widely used in the chemoinformatics. Figures 3C and 4 C show the results. As same as Figures 3B and 3 C, the distributions of molecules were widely spread, and the groups of molecules colored according to their size and shape distributed contiguously except very small molecules < 150 Da (colored in red). Thus,

Verifications of Models A and B
To verify Models A, B and AB, we also examined the L1 regularization (eq. 24) instead of the L2 regularization in eq. 16. The L1 regularization could show the descriptors that show the major contributions to the results. The accuracy was almost equivalent to that obtained by the L2 regularization. Namely, Models A, B, and AB showed the RMSE and Q 2 values of 0.73 and 0.60, respectively, in the 4-fold cross validation tests ( Table 5). The important descriptors suggested by the L1 regularization were the < R > , Log < R > , < R > 2 , some ASA, mainly sub-structures including O and N atoms, and q(OH)/q(NH) among 324 descriptors including 166 MACCS keys. The results were summarized in Table S5 and Figure S1 in the supporting information. The results supported the results obtained by the present regression models.
We examined the robustness of the present method by using hold-out tests and compared the results obtained by the Mordred-descriptor set as an alternative method. We examined five hold-out tests and compared the coefficient sets {c} of generated regression models, in addition to the comparisons of predicted LogP app values. In each hold-out test, the molecules were sorted by the molecular features and the top 25 % bigger molecules form the hold-out set and the other smaller 75 % of molecules were used for the prediction model construction. The considered molecular features were the molecular weight (MW), number of atoms (N atom ), number of ring structures (N cycle ), number of rotational bonds (N rot ), and the number of member atoms of the maximum ring system of molecule (N ring ). In addition to these biased sets, we prepared a non-biased set (None) by a random selection of molecules to make a reference prediction model.
In each hold-out test, we applied the 3-fold cross validations to the teaching sets (the smaller molecules) and generated the prediction models. The prediction models estimated the LogP app values of the molecules in the holdout set (the bigger molecules). Table 6 summarizes the correlation coefficients (R) and the RMSE between the predicted and experimental LogP app values of the hold-out set. These results show that both the present and Mordred descriptors worked well (see Figure S2 in the supporting information).
Then, we examined the robustness of the model construction. Table 7 shows the correlation coefficients between the coefficients {c} of the prediction model obtained by the hold-out test and that of the reference model. The {c} based on the present descriptors did not depend on the difference of the teaching sets so much. On the other hand, the {c} based on the Mordred descriptors depended on the difference of the teaching sets strongly. The PAMPA systems represent a passive permeability only. It means that the {c} does not depend on the choice of the teaching sets. Thus, the present method should be realistic and useful rather than the collection of many descriptors.

Conclusion
We proposed a QSPR method for for evaluating the apparent membrane permeability (P app ) based on an analysis of the diffusion process and the partition function calculation with conformer sampling. This method gener-