Programmable Polymeric‐Interface for Voiceprint Biometrics

In the age of data science, voice data can be treated as one of the crucial assets for strengthening biometric technologies and biomedical applications. However, voice biometrics is still at an emergent stage, facing challenges of voice variability, cross-linguistic variation, and voice spoofing. These long-standing issues are addressed here by deploying an organic, flexible, and printed piezoelectric polymeric interface exhibiting several unique features, such as strong directionality, stability over a large operating temperature range (30-90 °C), broad frequency bandwidth (6 kHz), ultrahigh sensitivity (5.77 V Pa−1), and high signal-to-noise ratio (38-55 dB). By analyzing speech processing parameters acquired by the voice sensor, an individual voiceprint is established, which is further used to train a neural network model for artificial intelligence (AI)-driven voice biometrics, resulting in a remarkable accuracy of >96% for population identification and speaker recognition and >93% for healthcare assessment. Featuring a versatile printing fabrication process, an innovative voiceprint approach, and a robust neural network, the programmable polymeric acoustic interface offers a promising complementary tool to existing biometric technologies and can play a vital role in healthcare monitoring.


Introduction
Biometric information, owing to its unique biological characteristics, plays a key role in various modern-life applications ranging from security to healthcare. [1,2] Fingerprint, deoxyribonucleic acid (DNA), and iris are the currently well-established physical features used in biometric technologies due to their uniqueness. Voice also carries health-related information: patients with Parkinson's disease often express common symptoms through changes in voice quality, including reduced loudness and monotone pitch. [7,14] Similarly, a person subject to chronic obstructive pulmonary disease (COPD) may have a deeper or more breathy voice. [15] Despite numerous research groups working on acoustic sensing and voice recognition, the field of voice biometrics is still in its early stage of development. To date, a few voice recognition systems based on capacitive, [16,17] piezoresistive, [18-20] and triboelectric effects [21-23] have been proposed to decode individual information. However, these still face limitations of insufficient sensitivity, limited working distance, environmental robustness, durability, or ease of integration. Moreover, for a broader deployment of voice biometrics, one of the critical challenges lies in the fabrication of an acoustic sensor (also referred to as a voice sensor) able to capture full-fledged human voice features. In this context, piezoelectric acoustic sensors have been demonstrated as promising devices [5,24-28] due to their high sensitivity, wide bandwidth response, high signal-to-noise ratio, and strong directional properties. For practical applications, these sensors should also be environmentally robust enough to perform well under outdoor conditions. In this work, we propose a high-fidelity, programmable, wearable polymeric interface as an acoustic sensor with a superior signal-to-noise ratio (SNR) that addresses these challenges and, above all, can distinguish individuals within a population and differentiate speakers by gender, health condition, and voice spoofing
or impersonation attacks (Figure S1, Supporting Information). The programmable polymeric interface, deployed as the programmable acoustic sensor (PAS), consists of a printed sandwiched structure of poly(3,4-ethylenedioxythiophene):poly(styrenesulfonate) (PEDOT:PSS)/poly(vinylidene fluoride-co-trifluoroethylene) (P(VDF-co-TrFE))/PEDOT:PSS on a polyethylene naphthalate (PEN) substrate. The piezoelectric PAS converts sound-wave-induced vibration of the polymeric membrane into an electrical signal and exhibits a wide linearity range (57-71 dB) over a large bandwidth (6 kHz), with high sensitivity (5.77 V Pa−1), broad temperature stability (30-90 °C), and a high signal-to-noise ratio (SNR) (38-55 dB) with increasing sound intensity.
To further demonstrate the efficacy of our voice sensor, specifically in the domain of person identification, an artificial intelligence (AI) algorithm was implemented that provides distinct voice-signature differentiation in recognizing the complete English alphabet and is further validated through population identification, health-condition determination, and speaker recognition with classification accuracies of >96%, >93%, and >96%, respectively. In an effort to overcome the long-standing limitations of voice biometrics, our high-resolution AI-driven voice sensor expands the toolbox of existing biometric technologies through its capability to record voice signatures and identify individuals' voice traits based on speech processing parameters.

Polymeric-Interface Fabrication and Characterizations
With its ability to ensure high precision, repeatability, uniform deposition, and large-scale production with simplicity, screen printing emerges as an ideal technique for the fabrication of high-performance, flexible, and wearable piezoelectric sensors. The piezoelectric polymer P(VDF-co-TrFE) was chosen as the active material owing to its light weight, low acoustic impedance, high compliance, and broadband acoustic properties, which compensate for its weaker dielectric and electromechanical coupling properties compared with conventional piezoelectric ceramics. P(VDF-co-TrFE) can be solution-processed or printed, allowing any desired flexible shape to be freely designed on planar, curved, or more complex geometries. [31] Figure 1a illustrates the structure of the screen-printed programmable piezoelectric acoustic sensor (PAS), consisting of a 5 μm thick electroactive piezoelectric copolymer, P(VDF-co-TrFE), sandwiched between two PEDOT:PSS electrode layers (≈200 nm) on a 50 μm polyethylene naphthalate (PEN) substrate. The presence of both top and bottom electrodes in a printed sensor ensures a uniform electric field distribution, enabling efficient polarization of the piezoelectric material. The device also includes 2 μm silver (Ag) lines as electrical connections for signal acquisition. Figure 1b shows the flexible, semi-transparent screen-printed PAS. A high-resolution cross-sectional scanning electron microscopy (SEM) image of the PAS (Figure 1c) confirms the efficient screen printing of the several homogeneous layers, with controlled thicknesses and well-defined interfaces. The morphology of the annealed P(VDF-co-TrFE) thin film, acquired by atomic force microscopy (AFM) over a 1 μm² area, is shown in Figure 1d and can be described as homogeneously distributed rice-like domains consisting of stacks of crystalline lamellae of well-ordered polymer chains separated by amorphous interlamellar regions of disordered polymer chains.
[32] Local switching piezoelectric measurements were carried out in piezoresponse force microscopy (PFM) mode, which tracks polarization switching in response to a voltage applied perpendicular to the substrate. In Figure 1e, the alternation of high (light blue) and low (red) piezoresponse within elongated domains ≈7 nm wide, with a period of 16 nm, corresponding to the crystalline and amorphous phases, respectively, is clearly visible. Probing the polarization switching of a crystal (light-blue domain) of the semi-crystalline P(VDF-co-TrFE) thin film revealed piezo- and ferroelectric behavior, as shown by the butterfly loop of the PFM amplitude and the large hysteresis in the phase versus DC voltage, presented in Figure 1f,g, respectively.
A sinusoidal voltage of 1 V amplitude, at frequencies between 10² and 10⁵ Hz, was applied to the device to determine the complex relative permittivity ɛ (ɛ′, ɛ″, and tan δ = ɛ″/ɛ′). The evolution of the dielectric constant with temperature is characterized by a broad maximum. The maximum of ɛ′ is located at the Curie transition, and a thermal hysteresis is observed between the heating and cooling cycles, with the Curie temperatures on heating and cooling being frequency-independent. These observations correspond to the well-known Curie transition from the ferroelectric phase at low temperature to the paraelectric phase at high temperature (Figure 1h; Figures S2 and S3, Supporting Information). The XRD spectrum of the P(VDF-co-TrFE) copolymer shows a diffraction peak at 2θ = 20.0°, which corresponds to an interchain lattice spacing of 4.439 Å (calculated from Bragg's equation 2d sin θ = nλ) for the (110, 200) reflection of the ferroelectric crystalline phase (Figure S4, Supporting Information). [33,34] In the Fourier transform infrared (FTIR) spectrum, peaks at 1288 and 848 cm−1 indicate the presence of the β phase (Figure S5, Supporting Information). Electric displacement versus electric field (D-E) hysteresis loops are commonly used to classify the behavior (ferroelectric, relaxor ferroelectric, or paraelectric) of fluorinated polymers, based on the switching of spontaneous dipoles in response to an applied external electric field, which is also employed for the polarization of the electroactive film. D-E loops of the ≈5 μm thin film were measured by applying a triangular voltage waveform up to about 150 V μm−1 at a frequency of 1 Hz between the bottom electrode and the grounded top electrode. Figure 1i shows the ferroelectric D-E loop, which exhibits a coercive field (E_C) of ≈50 V μm−1 and a large remnant polarization (P_r) of ≈7 μC cm−2. Finally, the inverse piezoelectric effect was characterized by carrying out 3D particle-velocity measurements on the piezo-active surface. The membrane was excited by a five-cycle sinusoidal burst centered at 100
kHz by an electrical voltage of 225 V peak-to-peak. The full vibrational motion, consisting of all three orthogonal velocity components, was quantified via 3D laser Doppler vibrometry (3D LDV). Figure 1j illustrates the out-of-plane particle velocity of the piezoelectric transducer, where the insets depict the measured scan over the piezo-active surface.
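The interchain spacing quoted above follows directly from Bragg's law. As a quick numerical check, the sketch below assumes Cu Kα radiation (λ ≈ 1.5406 Å, an assumption; the text does not state the anode material) for the 2θ = 20.0° reflection:

```python
import math

# Bragg's law: n*lambda = 2*d*sin(theta), with n = 1 for the first-order peak
lam = 1.5406                       # assumed Cu K-alpha wavelength (Å)
theta = math.radians(20.0 / 2)     # Bragg angle is half of the 2θ value
d = lam / (2 * math.sin(theta))    # interchain lattice spacing (Å)
```

The result, d ≈ 4.44 Å, is consistent with the reported 4.439 Å.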

Performance Evaluation of Polymeric Acoustic Interface
The acoustic response of the PAS was first characterized as a function of the working distance from the speaker, to define a suitable distance at which it can effectively capture human voices and be employed for acoustic source localization. Figure S6 (Supporting Information) illustrates the PAS data acquisition set-up in our anechoic chamber. Figure 2a shows that the output voltage decreases from 480 to 20 mV as the distance increases from 1 to 20 cm under a sound pressure level (SPL) of ≈70 dB from the acoustic source, due to sound-wave attenuation during propagation. Sensitivity is one of the critical metrics of a sensor's performance, reflecting its ability to convert sound waves into an electrical signal. For a given variation in air pressure, a highly sensitive sensor produces a correspondingly larger electrical signal, typically quantified as the sensor's output voltage per unit sound pressure (Figure 2c). This high efficiency and sensitivity of the PAS is a virtue of the high electroactivity of screen-printed P(VDF-co-TrFE) and the precise control of the thickness and excellent adhesion of the different layers. [35] Signal-to-noise ratio (SNR) is another key figure of merit when testing the fidelity of acoustic sensors and microphones. Here, the as-developed PAS shows a linear SNR up to 71 dB (Figure 2d). In a clamorous soundscape, a linear relationship between SNR and SPL can be the key to unlocking the full potential of recorded speech, enabling faithful capture of voice signals. The excellent SNR of the PAS can be attributed to its fabrication process, which enables homogeneous interfacing of the different layers.
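The sensitivity and SNR figures of merit used throughout this section can be computed from first principles. A minimal sketch, using the standard 20 µPa SPL reference and purely illustrative voltages (not the paper's calibration data):

```python
import math

P_REF = 20e-6  # standard SPL reference pressure (Pa)

def spl_to_pressure(spl_db):
    # Invert SPL = 20*log10(p / p_ref) to recover the pressure in Pa
    return P_REF * 10 ** (spl_db / 20)

def sensitivity_v_per_pa(v_out, spl_db):
    # Output voltage per unit sound pressure (V/Pa)
    return v_out / spl_to_pressure(spl_db)

def snr_db(v_signal, v_noise):
    # Signal-to-noise ratio in decibels
    return 20 * math.log10(v_signal / v_noise)

p70 = spl_to_pressure(70)            # ≈ 0.063 Pa at 70 dB SPL
s = sensitivity_v_per_pa(0.36, 70)   # hypothetical 360 mV reading -> V/Pa
s_dbv = 20 * math.log10(s)           # same sensitivity expressed in dBV/Pa
```

For example, a signal of 480 mV over a 6 mV noise floor corresponds to 20·log10(80) ≈ 38 dB SNR.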
[36] Owing to its electromechanical properties and relatively thin profile (≈55 μm), the PAS exhibits a broad bandwidth of 6 kHz (Figure 2e), a key asset for capturing the large range of frequencies required for an accurate and natural representation of the human voice and for the detection of basic speech elements (phonemes, tones, and words). In music production or broadcasting, a broader bandwidth is likewise desirable to capture the full range of vocal harmonies and overtones. [37] Furthermore, to assess the ability of the PAS to acquire error-free acoustic signals, a 360° spatial directionality plot was measured. The recorded data show pronounced directional features, indicating that the PAS is directionally selective, as its sensitivity varies with angle: the highest response (≈480 mV) occurs at 0° and 180°, and the lowest (≈25 mV) at 90° and 270° (Figure 2f). This directional selectivity demonstrates highly selective reception of sound waves from a specific direction as well as a certain tolerance to environmental noise interference from unwanted sounds arriving from other directions, two key features for speech recognition. Precise control of the shape and size of the sensor elements enables directional sensitivity patterns that are more complex and specific than those produced by other techniques. [38] Finally, we compared the output signal of our printed PAS with that of a commercial microphone in order to evaluate its reliability and performance. Both devices exhibit extremely similar frequency-spectrum signals; the only difference lies in the signal intensity (spectrogram), indicating that the PAS generates an electrical signal similar to the original sound, albeit at a lower intensity, thus validating the present technology (Figure 2g,h).
[39] To go one step further and implement our PAS as a wearable voice sensor for measuring acoustic waves in everyday life, it needs to be lightweight, bendable, and flexible like human skin, and to withstand temperature fluctuations. In this regard, the performance stability of the PAS was evaluated as a function of temperature and applied strain. Figure 2i (upper) shows that the PAS exhibits excellent temperature stability from room temperature up to 90 °C, with no noticeable change (<1%) in sensitivity. The sensitivity of the PAS under applied strain was probed by bending the printed flexible PAS onto cylinders of different radii, as shown in Figure 2i (lower), demonstrating its stability even when rolled up to a 5 mm radius of curvature. A table comparing the acoustic features of different devices, indicating the superiority of our fabricated acoustic interface, is provided in the Supporting Information (Table S1, Supporting Information).

Voiceprint Identification
The human voice is generated through a complex physiological process involving the vocal cords of the larynx and expresses distinct features that vary among individuals (Figure 3a). In particular, the length and thickness of the vocal cords affect the pitch of the voice, while the size and shape of the throat and mouth impact the timbre, or tone. [40] These physical features are determined by genetics and can vary significantly from person to person, leading to a wide range of individual voice characteristics. Despite this significant variation in vocal traits, developing a reliable approach to creating a unique individual voice signature remains a challenge. [41,42] In this regard, our approach to overcoming this difficulty involves plotting an individual's voice as a single point on a 3D plot that considers the duration of speech (t), the pitch (p), and the amplitude (amp) (Figure S7, Supporting Information). This graphical representation, also referred to as a voiceprint plot, is a powerful and promising tool for voice biometrics. Specifically, we first recorded the acoustic patterns of the 26 letters (A-Z) of the English alphabet, pronounced by one person under identical conditions for each letter, in an anechoic chamber (Figure S8 and Video S1, Supporting Information). Behind each recorded spectrum (spectrogram) corresponding to a letter, as shown in Figure 3b, lies a unique mechanical stimulation that corresponds to a specific electrical signal defined by its time duration (t), pitch (p), and amplitude (amp). These features were plotted on a 3D voiceprint plot (Figure 3c), which highlights the unique acoustic characteristics of each letter, with each occupying a specific position on the plot.
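The three voiceprint coordinates (t, p, amp) can be estimated from a raw waveform in a few lines. The sketch below uses a synthetic 400 Hz tone as a stand-in for a PAS recording and a simple autocorrelation pitch estimator; the sampling rate and search band are assumptions, and a real pipeline would add voicing detection and windowing:

```python
import numpy as np

fs = 8000                                  # assumed sampling rate (Hz)
t = np.arange(0, 0.5, 1 / fs)              # 0.5 s utterance
x = 0.6 * np.sin(2 * np.pi * 400 * t)      # synthetic 400 Hz "voiced" tone

duration = len(x) / fs                     # t: duration of speech (s)
amplitude = np.abs(x).max()                # amp: peak amplitude

# p: pitch via the autocorrelation peak, searched over lags for 100-500 Hz
ac = np.correlate(x, x, mode="full")[len(x) - 1:]
lo, hi = fs // 500, fs // 100
lag = lo + np.argmax(ac[lo:hi])
pitch = fs / lag
```

Each recording then maps to one point (duration, pitch, amplitude) on the 3D voiceprint plot.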
To gain further insight into the relationships among the letter signals generated by the PAS, a correlation matrix was plotted (Figure 3d). The correlation matrix provides valuable insight into the nature and strength of the relationships between variables. [43] The voiceprint of each letter exhibits a complex and highly variable relationship with those of the other letters. Correlation coefficients, indicating the strength and direction of these relationships, can be positive or negative (Figure S9, Supporting Information). Positive coefficients suggest a direct relationship between specific voiceprint parameters: an increase in one parameter tends to be associated with an increase in the other, and vice versa. These relationships are influenced by the distinct physiological processes involved in producing different letters, with various articulatory structures contributing to their specific voice features. [44] The observed complexity and variability in the correlation matrix underscore the diverse nature of letter generation. Each letter possesses unique voiceprint characteristics that reflect the required articulatory movements and vocal-tract configurations. Consequently, the relationships between the pitch, duration, and amplitude parameters vary significantly among letters, emphasizing the intricate interplay of articulatory mechanisms and vocal-tract dynamics in speech production. [45]
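A letter-to-letter correlation matrix like that in Figure 3d can be obtained directly from the per-letter feature vectors. The sketch below uses random placeholders for five letters (real values would come from PAS recordings):

```python
import numpy as np

rng = np.random.default_rng(1)
# Placeholder voiceprint feature vectors (pitch, duration, amplitude)
# for a handful of letters; purely illustrative data.
letters = ["A", "B", "C", "D", "E"]
features = rng.normal(size=(len(letters), 3))

# Pearson correlation between every pair of letters (rows)
corr = np.corrcoef(features)
```

The resulting matrix is symmetric with unit diagonal, and its off-diagonal signs correspond to the positive/negative relationships discussed above.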

Voiceprint Deployment in Population Identification and Healthcare Anomaly Detection
To further demonstrate the practical applications of the PAS, we conducted a study with 20 volunteers, comprising an equal number of males and females (average age ≈28 years), who were asked to say the sentence "I Love Science". By analyzing the voiceprint features of their speech, the PAS was able to classify the population based on vocal features such as pitch, time duration, and amplitude (see Videos S2 and S3, Supporting Information). Each individual's voice signature was visualized on a spectrogram depicting the variation in pitch frequency and amplitude arising from their tone, timbre, and speaking style (Figure 4a). The corresponding time-domain plots are provided in Figure S10 (Supporting Information).
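A spectrogram such as that in Figure 4a is a short-time Fourier transform of the waveform. A minimal NumPy sketch, with assumed frame sizes and a synthetic tone in place of a recorded voice:

```python
import numpy as np

fs, n_fft, hop = 8000, 256, 128            # assumed rates and frame sizes
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2 * np.pi * 400 * t)            # stand-in for a recorded voice signal

# Hann-windowed frames -> magnitude spectrogram (n_frames x n_fft//2 + 1)
win = np.hanning(n_fft)
frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft, hop)]
spec = np.abs(np.fft.rfft(frames, axis=1))

freqs = np.fft.rfftfreq(n_fft, 1 / fs)
peak_hz = freqs[spec.mean(axis=0).argmax()]  # dominant frequency across frames
```

The dominant bin lands within one frequency-bin width (fs/n_fft) of the 400 Hz tone.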
A deep-learning neural network model, termed the programmable acoustic neural network (PANN), was developed for the classification of voice features using the Keras library. It consists of one input layer, two hidden layers, and one output layer, with dropout layers in between to avoid overfitting/underfitting (more details are given in the Experimental Section; Figure S11 and associated discussion, Figure S1, Supporting Information). [46] The learning rate of a neural network is one of the critical hyperparameters for an accurately trained network.
Indeed, the learning rate has numerous effects on neural network performance, such as the rate of convergence, the risk of overshooting, and training instability. [47] Consequently, we optimized the learning rate by running the PANN at different rates. We found that as the learning rate increases, the model converges faster, in fewer epochs, compared with slower training rates (Figure 4b). Similarly, a faster learning rate leads to a faster decrease of the loss function (Figure 4c; Figure S11, Supporting Information). A larger learning rate makes the model take bigger steps when updating itself during training, which results in faster convergence with fewer training iterations. Notably, even though a larger learning rate reduces the time and number of epochs needed for training, several issues, such as skipping over local optima, poor generalization, and unstable training, were observed in return, highlighting the importance of an optimal learning rate for training the neural network.
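The convergence/overshoot trade-off described above can be illustrated on the simplest possible loss, a 1D quadratic, where gradient descent behaves exactly like the PANN's weight updates in miniature:

```python
def gd_steps(lr, w0=1.0, tol=1e-3, max_iter=1000):
    # Gradient descent on f(w) = w**2 (gradient 2w); returns the number
    # of updates needed to reach |w| < tol, capped at max_iter
    w, steps = w0, 0
    while abs(w) >= tol and steps < max_iter:
        w -= lr * 2 * w
        steps += 1
    return steps, w

fast_steps, _ = gd_steps(0.4)    # larger rate: converges in a few steps
slow_steps, _ = gd_steps(0.05)   # smaller rate: many more iterations
_, w_diverged = gd_steps(1.1)    # too large (|1 - 2*lr| > 1): iterates grow
```

With lr = 0.4 the iterate shrinks by a factor 0.2 per step; with lr = 0.05 only by 0.9, requiring far more iterations; with lr = 1.1 the update overshoots and the loss diverges, mirroring the unstable-training regime observed for the PANN.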
To further optimize the performance of the PANN, we investigated the impact of varying the training/testing proportion, which has been found to significantly influence network performance. [48,49] The accuracy of the PANN first increases and then decreases as the training/testing proportion increases (Figure 4d); a similar trend was observed for the losses. This trend arises because a very high training proportion increases the likelihood of overfitting and hinders the model's ability to correctly identify unknown samples, whereas a too-small training proportion leads to underfitting. To assess the PANN's ability to distinguish between the positive and negative classes and to aid in determining an appropriate prediction threshold, a receiver operating characteristic (ROC) curve was plotted, indicating perfect classification between the true positive rate (TPR) and false positive rate (FPR), with an area under the curve (AUC) equal to 1 (Figure S12, Supporting Information). Our PANN, with an accuracy of >96% for classifying the male and female population, outperforms a commercially available neural network toolbox (C-Tool) (≈84%) (Figure 4e; Figures S13 and S14, Supporting Information).
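The AUC quoted above can be computed without plotting the ROC curve at all, via the rank-based (Mann-Whitney) identity. A minimal sketch, assuming untied scores:

```python
import numpy as np

def roc_auc(scores, labels):
    # Rank-based estimate of the area under the ROC curve:
    # AUC = P(score of a random positive > score of a random negative)
    scores, labels = np.asarray(scores), np.asarray(labels)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Perfectly separated scores reproduce the AUC = 1 case reported for the PANN
auc_perfect = roc_auc([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
auc_mixed = roc_auc([0.9, 0.2, 0.8, 0.1], [0, 0, 1, 1])
```

Perfect separation gives AUC = 1, matching the reported result; overlapping score distributions pull the value toward 0.5.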
The unique features of each individual's voice were plotted on a voiceprint boxplot (Figure 4f), allowing clear discrimination between the male and female populations, as indicated by their distinct voice characteristics (Figure S15, Supporting Information). Females generally speak with a higher pitch and a shorter duration, whereas men generally possess larger and thicker vocal cords and can more easily produce louder and longer speech. [50] Principal component analysis (PCA) was also carried out on the voiceprint to classify the male/female population (Figure S16, Supporting Information), and a pair plot demonstrates the relationships among all the features (Figure S17, Supporting Information). [51] To quantify the ability of the PAS to classify the male and female population, a confusion matrix was computed, indicating that the population is detected with an accuracy of >96% (Figure 4g). In a second phase, voice signals from one male with both a healthy and a hoarseness-affected throat were recorded in order to demonstrate the disease-detection capabilities of the PAS based on voice features. Figure 4h shows a boxplot that differentiates the two conditions on the voiceprint plot. Clearly, the hoarseness-affected throat yields a lower pitch and smaller amplitude with a longer speaking time compared with the healthy throat. This can be attributed to the fact that inflamed vocal cords require more effort to produce sound, leading to vocal fatigue and difficulty in sustaining speech for long periods. The changes in pitch and amplitude result from disrupted vibratory patterns caused by inflammation. [52] A confusion matrix was plotted to further evaluate the recognition of healthy/affected conditions by the PAS and to quantify the proportions of correctly and incorrectly classified samples by the PANN.
Figure 4i demonstrates that the PAS integrated with the PANN can correctly distinguish healthy from affected conditions based on voice features with an accuracy of >93%. The architecture of the PANN and the AI-driven voiceprint generation are illustrated in Figures S18 and S19 (Supporting Information), respectively.
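The confusion matrices in Figure 4g,i reduce to a simple counting structure, with accuracy as the trace divided by the total. A sketch with hypothetical labels (not the study's data):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=2):
    # cm[i, j] counts samples of true class i predicted as class j
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Hypothetical labels: healthy (0) vs hoarseness-affected (1)
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1]
cm = confusion_matrix(y_true, y_pred)
accuracy = np.trace(cm) / cm.sum()  # correct predictions over all predictions
```

Here one healthy sample is misclassified, giving cm = [[3, 1], [0, 4]] and an accuracy of 7/8.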
To exemplify the further capabilities of the PAS for deployable voice biometrics, such as individual identification in the security sector, we recorded the voice signals of five volunteers (40 recordings from each) and trained the PANN on them. The voiceprint features of all the volunteers were analyzed using PCA, which revealed distinct clusters formed by the five individuals (Figure 4j). Furthermore, the centroids of these clusters (hollow black stars) were clearly separated from each other, indicating a clear distinction between the voice characteristics of each volunteer. [52] The classification accuracy for individual speech, assessed with a confusion matrix, reached >96% (Figure 4k). The high accuracy and well-separated voiceprint features confirm the robustness and effectiveness of the PAS-integrated PANN for speaker recognition as well.
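The PCA projection behind Figure 4j can be sketched with an SVD of the mean-centred feature matrix; the data below are synthetic stand-ins for the 5 speakers × 40 recordings of voiceprint features:

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic voiceprint features: 5 speakers x 40 recordings x 3 features
# (duration, pitch, amplitude), clustered around per-speaker means
centers = rng.normal(size=(5, 3))
X = np.vstack([rng.normal(loc=c, scale=0.1, size=(40, 3)) for c in centers])

# PCA via SVD of the mean-centred data matrix
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T              # projection onto the first two components
explained = S**2 / (S**2).sum()     # variance ratio per component
```

Plotting `scores` colored by speaker reproduces the cluster-and-centroid picture; the explained-variance ratios justify keeping only the first two components.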

Conclusion
Owing to its high sensitivity and broad frequency range, our programmable polymeric interface emerges as a breakthrough for acquiring high-fidelity voice data. Combined with artificial intelligence (AI), this organic wearable acoustic sensor has shown its ability to quantitatively sense, analyze, and recognize voice signatures, as demonstrated by population classification, healthcare assessment, and speaker recognition. It clearly appears as an emerging pillar of voice biometrics, offering an extra shield for authentication and access control as well as non-invasive early-stage disease diagnosis. In a broader context, we believe that voice biometrics paves the way for the implementation of next-generation interaction technologies, including voice-controlled cars, smart homes, and human-machine interfaces.

Experimental Section
Device Fabrication: The multi-layered PAS was entirely screen printed using an Ekra X5 professional screen and stencil printer at the ELORPrintTec facility. First, electrical connections were screen-printed from a silver ink (DM-SIP-3060S) purchased from Dycotec Materials. A polyester screen with 100 threads per cm was used with a pressure of 100 N and a speed of 100 mm s−1. After printing, the silver ink was dried at 110 °C for 15 min. Then, electrodes were screen-printed from a PEDOT:PSS ink (EL-P5015) purchased from Agfa with a pressure of 80 N and a speed of 100 mm s−1 and then annealed at 120 °C for 15 min. The piezoelectric semi-crystalline polymer P(VDF-co-TrFE) was screen-printed from a piezoelectric ink (FC20 INK P) purchased from Piezotech Arkema. A dual annealing step at 80 °C for 5 min and 135 °C for 30 min was carried out immediately after printing. High-quality homogeneity and adhesion between the different layers and the substrate were observed.
Characterization Techniques: Dielectric measurements were performed on a Solartron 1260A impedance analyzer. Electric displacement-electric field (D-E) loops were recorded with a TF Analyzer 2000 (aixACCT Systems). High-resolution SEM images were recorded with a TESCAN VEGA3 instrument at an accelerating voltage of 15 kV. AFM and PFM measurements were acquired on Dimension Icon and Multimode 8 instruments (Bruker Corporation).
Data Acquisition: All measurements were acquired with the non-invasive PAS remotely from the volunteers within the institute, and consent was obtained from every volunteer prior to data recording. The data collection for building the AI model was carried out 40 times for each volunteer, spread over different times, to ensure the diversity of the data needed for a robust AI model.
PANN Architecture: The PANN was built as a sequential neural network model using the Keras library, consisting of one input layer, two densely connected hidden layers, and one output layer, containing 128, 64, 32, and 1 neurons, respectively. The activation function used in each hidden layer was the rectified linear unit (ReLU), with a sigmoid activation function for the output layer. The sigmoid function is commonly used in binary classification because it maps the input to a continuous value in the range 0-1, which can be interpreted as an estimate of the probability that the input belongs to the positive class. During training, the Adam optimizer updates the weights of the neurons based on the gradient of the loss function with respect to the weights. To place a greater penalty on incorrect predictions, binary cross-entropy was employed as the loss function. Additionally, to prevent overfitting by reducing the co-adaptation of neurons in the PANN, two dropout layers were added with dropout rates of 0.3 and 0.2, respectively. (A more detailed description is provided in Figure S16 and the associated discussion, Figure S1, Supporting Information.)
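The text builds the PANN in Keras; as a dependency-free sketch, the inference-time pass of the described architecture can be written in plain NumPy. The input dimension (3: duration, pitch, amplitude) and the random weights are assumptions for illustration only; dropout and the Adam/binary-cross-entropy training loop are noted in comments rather than implemented:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Layer widths as described in the text: 128 -> 64 -> 32 -> 1;
# the 3-feature input dimension is a hypothetical choice.
dims = [3, 128, 64, 32, 1]
params = [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
          for m, n in zip(dims[:-1], dims[1:])]

def forward(x):
    # ReLU on hidden layers, sigmoid on the output layer.
    # (During training, dropout with rates 0.3 and 0.2 would follow the
    # hidden layers, with Adam minimizing binary cross-entropy.)
    h = x
    for i, (W, b) in enumerate(params):
        z = h @ W + b
        h = sigmoid(z) if i == len(params) - 1 else relu(z)
    return h

probs = forward(rng.normal(size=(4, 3)))  # 4 voiceprints -> class probabilities
```

Each output lies strictly in (0, 1) and is read as the probability of the positive class, as described above.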

Figure 1. Design, fabrication, and materials characterization of the acoustic sensor. a) Schematic of the programmable printed acoustic sensor (PAS). b) Digital image of the PAS (scale bar 1 cm). c) Cross-sectional scanning electron microscopy (SEM) image of the PEDOT:PSS/P(VDF-co-TrFE)/PEDOT:PSS interfaces on the PEN substrate. d) AFM morphology image of P(VDF-co-TrFE) (scale bar 200 nm). e) Displacement visualization image of the P(VDF-co-TrFE). f) PFM amplitude and g) phase loop of P(VDF-co-TrFE). h) Dielectric constant and i) D-E loop characterization of P(VDF-co-TrFE). j) Laser Doppler vibrometer characterization of the out-of-plane particle velocity of the printed P(VDF-co-TrFE) film. The lower inset shows the control (undeformed/baseline) region, whereas the upper inset indicates the deformed region.
Volts per Pascal (V Pa−1) or decibels relative to 1 V Pa−1 (dBV Pa−1). The sensitivity (S_a) of the PAS, measured as a function of sound intensity, reaches 5.77 V Pa−1 over a 57-71 dB range of sound intensity. Figure 2b shows the linear relation between the voltage response and the sound pressure level (SPL), indicating that the sensor generates consistent and reliable signals across different sound intensities; it also demonstrates the reproducibility of the signal at different sound intensities. Such high values are attributed to the efficient conversion of mechanical stimulation into electrical signals by P(VDF-co-TrFE). In fact, the maximum sensitivity per sensing area was found to reach ≈18.3 mV Pa−1 mm² in the 50-70 dB range, among the most promising values reported for self-powered piezoelectric and triboelectric transduction-based acoustic sensors (Figure 2c).

Figure 2. Performance characterization of the PAS. a) Output voltage versus distance for the PAS. b) Output response of the PAS at different sound intensities. c) State of the art of self-powered acoustic sensor sensitivity per sensing area. d) Variation of the signal-to-noise ratio (SNR) with sound intensity. e) Variation of sensitivity with frequency. f) Directionality plot of the PAS. Acoustic signals generated by g) a commercial microphone (LYM02, GYVAZLA) and h) the PAS, with their corresponding spectrograms. i) Normalized sensitivity as a function of (upper) temperature and (lower) bending radius, representing the stability of the PAS against temperature and bending.

Figure 3. Voice biometric acquisition. a) Schematic representation of a fibroscopic view of the larynx. b) Generation of the dictionary of the English alphabet (A-Z) with the PAS. c) Signature of each letter on the voiceprint plot, indicating that each letter can be clearly recognized, as each has a specific pitch, amplitude, and speaking duration. d) Correlation matrix plotted from the voiceprints of the English alphabet.

Figure 4. Voiceprint identification and deployment. a) Spectrograms of the volunteers speaking the sentence "I Love Science". b) Accuracy and c) loss plots at different learning rates for the PANN. d) Effect of different training/testing proportions on the accuracy and loss of the PANN. e) Accuracy comparison between a commercially available deep-learning tool (C-Tool) and our PANN. f) Voiceprint plot for population identification (green and purple boxes represent pitch and amplitude, respectively) and g) its corresponding confusion matrix. h) Voiceprint plot for healthcare detection (red and blue boxes represent pitch and amplitude, respectively) and i) its corresponding confusion matrix. j) Principal component analysis (PCA) plot and k) confusion matrix for speaker recognition involving five volunteers.