Optimising air quality prediction in smart cities with a hybrid particle swarm optimisation-long short-term memory-recurrent neural network model

In smart cities, air pollution is a critical issue that affects individual health and harms the environment. Air pollution prediction can supply important information to all relevant parties so that they can take appropriate initiatives. Air quality prediction is a hot area of research. The existing research encounters several challenges, namely poor accuracy and incorrect real-time updates. This research presents a hybrid model based on long short-term memory (LSTM), recurrent neural network (RNN), and a Curiosity-based Motivation method. The proposed model extracts a feature set from the training dataset using an RNN layer and achieves sequence learning by applying an LSTM layer. Also, to deal with overfitting in the LSTM, the proposed model utilises a dropout strategy. In the proposed model, input and recurrent connections can be dropped from activation and weight updates using the dropout regularisation approach, and a Curiosity-based Motivation model is used to construct a novel motivational model, which helps in the reconstruction of the long short-term memory recurrent neural network. To minimise the prediction error, particle swarm optimisation is implemented to optimise the LSTM neural network's weights. The authors utilise an online Air Pollution Monitoring dataset from Salt Lake City, USA with five air quality indicators, including SO2, CO, O3, and NO2, to predict air quality. The proposed model is compared with an existing Gradient Boosted Tree Regression model, an existing LSTM, and a Support Vector Machine based regression model. Experimental analysis shows that the proposed method achieves a Root Mean Square Error (RMSE) of 0.0184, a Mean Absolute Error of 0.0082, a Mean Absolute Percentage Error of 2002*10^9, and an R²-score of 0.122. The experimental findings demonstrate that the proposed LSTM model achieved the best RMSE performance on the prescribed dataset and statistically significant superior outcomes compared to existing methods.

and noise quality, keeping tabs on traffic and the surrounding area, directing drivers to available parking spots, and more [1]. The Internet of Things (IoT) plays a vital role in various monitoring systems. The Internet of Things is a system where physical objects are linked together and can exchange data.
Our goal with IoT is to make city life easier by linking everything to the web, gaining environmental awareness, delivering data consistently via various wired and wireless networks, and processing that data to provide new, valuable services. Implementing solutions for environmental monitoring has become a crucial concern for cities and local governments as they work to enhance the quality of life for their residents. Expansion, industrial pollution, climate change, and ecological disasters are a few of the causes that degrade air, water, and soil quality. Smart city technologies, such as environmental monitoring, utilise a dispersed sensor network connected via the IoT to collect and transmit data in real time that can be used to improve the administration of city services such as transportation, parking, utilities, garbage collection, and public safety. Air pollution sensors, temperature and humidity gauges, and sensors that measure water and surface-water quality are typical IoT sensors in an environmental monitoring system; they generate real-time data over wired, wireless, and cellular protocols and relay it to the city through a fast network [2].
The analytics platform collects the data, processes it, and presents the results to municipal officials through interactive dashboards, maps, and reports. Cities may better manage a wide range of public health and environmental issues with the help of environmental monitoring, which provides complete, real-time information on a massive number of data points. Environmental monitoring systems allow metropolitan areas to take various steps. It is essential to lessen the chances of ecological disasters happening. Cities can avert oil spills, chemical leaks, and natural gas explosions, as well as the resulting costs and negative consequences, if they are better able to detect disasters in the making and reduce the destructive force of natural calamities. Warnings about storms and floods, for example, can help communities reduce the number of lives lost and the amount of property damage. Public health must also be enhanced [3].
Monitoring may aid cities in providing a safer and healthier environment for their residents by keeping tabs on water and air quality and reducing their exposure to harmful chemicals. How to keep our natural resources safe is now a prominent research topic. Gas emissions, oceanic fluctuations, and temperature variations are just a few environmental factors that may be tracked with environmental monitoring technologies. Using this information, cities can better safeguard their precious natural resources. Environmental monitoring can detect and prevent air, water, and soil threats that breach regulatory codes such as the "Clean Water Act" [4]. Most of the population now wants to live in urban areas; thus, city governments are moving their attention from focusing only on economic expansion to recognising the urgent need to address climate change. Gas emissions, traffic management, and trash management are all growing issues due to global warming and the rapid expansion of current infrastructure. Implementing innovative programmes to address problems such as traffic jams, smog, a lack of renewable resources, and diseases related to a lack of cleanliness is essential. Traffic lights, smart parking, smart buildings, and automated waste management are smart city infrastructures that can be built using ICTs, which should help reduce pollution and boost public safety [5].
Data collection is a crucial part of environmental monitoring since it helps to demonstrate how the environment functions, how it impacts people, and how it may be managed. Environmental monitoring data captures sources such as precipitation, sewage, and emissions from cars. Environmental monitoring makes civilian lives safer, more convenient, less taxing on the environment, and more fruitful. The Internet of Things may be implemented in several ways, each of which caters to a particular sector of the smart work we are working towards. Environmental studies track air, soil, and water as the primary natural resources. Water samples for chemical, radiological, and biological data about population density are the essence of water monitoring. Grab samples are tested for salt, pollution, and acidity as part of soil monitoring, which helps farmers assess soil quality and foresee problems such as erosion, flooding, and threats to biodiversity [6]. Other forms of information, such as traffic counts, population breakdowns, security, resource scarcity, building and home health, city infrastructure, and food security, are also monitored and analysed as part of the environmental monitoring of people's homes and urban areas.
The scope of environmental surveillance now includes the entire planet, in what is called "global warming gas monitoring". Gases are the root cause of climate change and its associated severe weather, food supply disruptions, and health problems due to smog and pollution. The vast volumes of data that must be sorted, monitored, analysed, and proactively used to produce answers for everyday difficulties present the most significant difficulty in smart environments. In some instances, such as when using CCTV to monitor people's actions, environmental monitoring consists solely of checking in on the sensors' performance. For environmental monitoring apps to be beneficial, they need to be able to interpret measured data and come up with strategies for resolving practical problems. Harvests in farming, climate change, city pollution, and low productivity in labour-intensive manufacturing are only some of the daily struggles people face [7].
• Airborne dust particles: Air quality checking is one way of environmental monitoring. The genesis of original fine dust is typically natural. Vehicles, home heating systems, farms, and factories contribute to the complicated chemical reactions that produce fine secondary dust. Environmental monitoring allows researchers to determine the composition of dust in different geographic locations, such as "cities versus factories", and worldwide. Experts can better measure the impact of dust pollution on people's health if the environment is monitored for dust [3]. Monitoring the environment aids city planners and the manufacturing sector in creating laws for allowable amounts of dust released by cars and industrial activities [8].
• Conditions of crops: Wet leaf sensors assess the amount of wetness on plant leaves and then analyse that data to tell farmers how their crops are doing and to reveal seasonal and annual patterns in how the weather affects crop yields.
• Water quality: The mining sector routinely collects and analyses water samples to develop effective water management strategies and foresee the effects of mining on the environment.
In smart environments, there are primarily two kinds of environment monitoring software.
• Control and monitoring: Environmental disasters and hazardous waste from industrial operations are only two examples of environmental occurrences that may be monitored and controlled using these tools. Activities, such as population expansion, affect the environment, which may be quantified using these programs. Marine scientists researching the effects of fishing restrictions on seafood supplies and the influence of plastic waste on marine life are only two examples of the inputs that may be analysed by monitoring and control applications. Those who study environmental psychology collect and analyse environmental data to determine how it affects human health and behaviour [4].
• Facilitators of ecological improvement: Machine learning (ML) algorithms and their applications are widely used to boost the overall efficiency of ecological monitoring sectors. They provide all the benefits of the information gathered through smart environmental monitoring. These applications help to build and maintain houses, buildings, and cities in an eco-friendly way.
• A framework for a smarter environment: Smart material and immaterial parts are examples of the aspects that support intelligent environments. Any appliance connected to the web wirelessly or via a wired connection is considered an IoT device. But not all IoT gadgets in futuristic wire-free societies operate over wireless networks.
Components of an IoT device include sensors, actuators, network connectivity support technologies, and functional software such as application programming interfaces. The IoT allows for the hands-free and immediate sharing of data amongst physical items, humans, and computer programmes [5].
• Actuators and sensors: Data in smart environments comes from various software programmes, sensors, and actuators. Environmental monitoring software gets input data from sensors, including temperature, proximity, gas, smoke, water, and air quality monitors. A sensor, such as a motion detector or a light switch, is a physical device that measures and transmits information about the state of the physical world. Electrical impulses are sent to an actuator, such as a switch or a valve, and the device acts, producing a measurable result, such as the activation of a fan or the opening of a window. Sensor data is often uploaded to cloud databases and data lakes to create intelligent, self-learning apps and is then watched and analysed [6].
The idea of "Smart Cities" has arisen as an exciting new direction in our age of fast urbanisation, with the potential to solve many of the problems that city life brings. A major problem in cities is the air pollution that people breathe, which poses a major risk to people's health and the environment. The use of state-of-the-art technology, namely air quality prediction models, is now necessary to lessen the impact of these threats. Due to increased energy consumption, transportation, and industrial activity, air pollution has worsened as a result of urbanisation. Effective solutions are urgently needed since poor air quality has negative consequences on human health, including respiratory ailments and cardiovascular difficulties. One potential solution to these problems is the implementation of smart city strategies, which are data and technology driven. One of these strategies should be the forecasting of air quality. To keep tabs on a wide range of environmental factors, "smart cities" use a vast network of sensors and data sources. These sensors, strategically placed across cities, gather data on air pollution, weather, and other pertinent variables in real time. Modern air quality prediction models use this information and apply ML algorithms to spot trends and make predictions about the air's future state. In order to provide city planners and lawmakers with useful insights from raw sensor data, ML techniques are crucial. Accurate forecasts, adaptation to new circumstances, and the ability to detect complicated correlations between variables are all capabilities of these algorithms. With the help of past data, these models can improve over time, leading to better decisions made in the here and now. Predictions of air quality that are both accurate and useful allow people to plan their outdoor activities in a way that reduces their exposure to dangerous pollutants and the likelihood of health problems. Smart Cities are able to take preventative actions against pollution by monitoring and forecasting air quality trends. These tactics include modifying traffic patterns, improving energy efficiency, and expanding green areas. Predicting air quality enables more effective use of resources. During times of heavy pollution, for example, emergency services can be notified in advance, and public transit timetables can be changed to reduce emissions. By keeping the public in the loop about expected air quality, we can raise awareness and rally support for efforts to curb pollution.

| Problem definition
Predicting air pollution concentrations at a given place and time is the goal. This may be framed as a regression operation (predicting a concentration) or a classification operation (classifying the air quality level).

| Inputs
• Past air quality data contains PM2.5 and NO2 time series data.
• Meteorological data such as temperature, humidity, wind speed and direction, air pressure, and more.
• Temporal elements include hours, days, months, and possibly seasonal tendencies.
• Geographic characteristics include latitude and longitude to account for geographical variations.
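As an illustration of how inputs like these could be assembled into a supervised-learning dataset, the following sketch pairs lagged pollutant readings with co-occurring meteorological and temporal variables. The variable names, lag count, and sample values are illustrative assumptions, not values taken from this paper.

```python
import numpy as np

def build_features(pm25, no2, temperature, hour_of_day, lags=3):
    """Assemble a feature matrix X and target vector y from raw inputs.

    Each row pairs `lags` past PM2.5 readings with the most recent
    NO2, temperature, and hour-of-day values; the target is the next
    PM2.5 value. All names here are hypothetical examples.
    """
    X, y = [], []
    for t in range(lags, len(pm25)):
        row = list(pm25[t - lags:t])                        # lagged pollutant history
        row += [no2[t - 1], temperature[t - 1], hour_of_day[t - 1]]
        X.append(row)
        y.append(pm25[t])                                   # next-step target
    return np.array(X), np.array(y)

# Toy hourly series (illustrative numbers only)
pm25 = np.array([12.0, 15.0, 14.0, 18.0, 20.0, 17.0])
no2 = np.array([30.0, 32.0, 31.0, 35.0, 36.0, 33.0])
temp = np.array([21.0, 22.0, 22.5, 23.0, 24.0, 23.5])
hour = np.array([0, 1, 2, 3, 4, 5])

X, y = build_features(pm25, no2, temp, hour)
print(X.shape, y.shape)   # (3, 6) (3,)
```

Each of the three rows is then a training example mapping recent history to the next observed concentration, which is the shape of input a sequence model or regressor expects.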

| Output
In a regression task, the estimated target air pollutant concentration is a continuous number; in a classification task, it is categorised into predefined classes ("good", "moderate", "unhealthy"). Data accuracy, model interpretability, and the necessity for ongoing model modification are some of the obstacles that must be overcome in order to fully realise the enormous potential of air quality prediction in Smart Cities. Ensuring the continuous effectiveness of air quality prediction projects requires future research to concentrate on algorithm refinement, sensor capability enhancement, and multidisciplinary partnerships. One revolutionary way to tackle the complicated problems of urban air pollution is to incorporate air quality prediction activities into Smart Cities. Urban areas may improve the health of their residents and make their communities more sustainable by using data and ML to control and reduce air pollution. The importance of air quality prediction in Smart Cities is going to grow as technology keeps getting better.
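A minimal sketch of the classification variant of this output, assuming hypothetical AQI-style cut-offs (the paper does not state its category thresholds, so the numbers below are examples only):

```python
# Hypothetical thresholds loosely modelled on AQI-style bands;
# the actual cut-offs would depend on the pollutant and standard used.
def categorise(concentration):
    """Map a continuous pollutant concentration to a quality label."""
    if concentration <= 50:
        return "good"
    elif concentration <= 100:
        return "moderate"
    else:
        return "unhealthy"

labels = [categorise(c) for c in (30.0, 75.0, 160.0)]
print(labels)   # ['good', 'moderate', 'unhealthy']
```

A regression model would instead output the continuous concentration directly and apply such a mapping only for reporting.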
Air pollution research encounters several challenges, namely accuracy and real-time data analysis. According to public studies, many companies that have adopted the IoT are struggling to discover ways to use the data they collect in better business decisions. Workers who lack specialised IoT skills have a more difficult time managing and monitoring critical IoT systems than those working with more mature systems and clear best-practice standards. Due to the rapid development of IoT in industry, not every nation agrees with the international regulations that try to restrict pollution and keep an eye on the longevity of limited supplies [9]. There is no de facto standard for IoT gadgets; ZigBee, for instance, has competed with Bluetooth's mesh network services. Compatibility and safety issues might occur without a common standard.

| Research questions
The research questions that define the applicability of this research model are as follows.
RQ1: How does the combination of particle swarm optimisation (PSO) and long short-term memory recurrent neural network (LSTM-RNN) enhance the accuracy and efficiency of predicting air quality changes compared to traditional methods or standalone models?
RQ2: What are the synergistic effects of PSO in optimising LSTM-RNN hyperparameters for improved performance?
RQ3: How does PSO aid in the optimal tuning of LSTM-RNN hyperparameters for air quality prediction?
RQ4: Does the optimised model generalise well across different smart city environments, considering variations in geographical, meteorological, and pollution factors?
Some IoT-based systems employ advanced communication protocols and technologies, leading to unwieldy and hard-to-maintain settings. An Internet of Things node's authentication, authorisation, and interconnection occur centrally, using a client-server architecture. There is a potential for a bottleneck on the server side as the node count rises. Internet of Things computing models, in which IoT devices act as hubs for time-critical processes and cloud servers handle data processing and analytical duties, may be a potential solution to bottlenecks. The data used and created by the IoT can be both organised and unstructured. Internet of Things systems may need more interfaces to handle unstructured data; legacy systems may need to be updated. In mission-critical systems, errors in ML algorithms can lead to unnecessary manual labour by producing false positives or false negatives when it comes to understanding the surroundings and the data they produce. A platform that can analyse all aspects of an intelligent global network and send out alerts and notifications automatically is the answer. Monitoring involves the recording and interpreting of measured data. In contrast, conditional monitoring checks the operation of sensors (such as CCTV sensors). The complete article is organised as follows. Chapter 1 covers the
introduction, chapter 2 covers the related work, chapter 3 covers the materials and methods, and chapter 4 covers the implementation results and discussion.Chapter 5 covers the conclusion and future works.

| RELATED WORK
This section covers the analysis of various existing air quality prediction research studies. Using a deep learning technique, a risk assessment for four subtypes of cardiovascular disease was carried out for the Taichung Area [10]. Among the top 10 significant causes of mortality in Taiwan in 2011, cardiovascular illnesses occupied the second and third spots, respectively. Features are extracted and classified using an autoencoder and Softmax. Results from Softmax are used to calculate the likelihood that each sample will be affected by one of four types of cardiovascular disease. Data is broken down further to show how trends vary by age group, geography, and time of year. Using deep learning characteristics of objects, the authors in Ref. [11] developed a method for classifying ecological and environmental factors. In the first step, a deep convolutional neural network (DCNN) is trained to recognise and categorise various ecological features. Sub-images are cropped from each region and utilised to represent the appropriate region in order to extract deep information from regions with irregular shapes. The trained DCNN is then used to extract deep features from these sub-images.
The Softmax classifier is used to make predictions about the classes to which all of the individual photos belong. In the proposed winner-takes-all approach, the class of a given region is established by summing the classes of its constituent parts. Finally, themed maps depicting ecologically significant features of the area are completed. In a series of classification experiments conducted on a dataset of high-resolution remote sensing photographs of several ecologically significant features, the proposed method achieves an impressive classification accuracy of 98.44%. In addition, the accuracy of categorising regions of an irregular form reaches 96.77%. These outcomes demonstrate the efficacy of the suggested method. To provide a consolidated environmental monitoring service, the authors in Ref. [12] introduced a data mashup service enabled by the IoT, where multimedia data is gathered from several IoMT platforms and fed into an environmental deep learning service to identify anomalies in potentially dangerous environments. Using a multi-resolution wavelet transform, useful features are extracted inside each region and then fed into a discriminative classifier to get at the hidden patterns. Scenario and experimental findings for a data mashup service enabled by IoMT are also provided.
A deep learning approach to emotion categorisation is discussed in Ref. [13]. Data from smartphones and wearables are gathered to create a dataset in a real-world investigation. It combines the signal dynamics and the temporal interactions of the three types of sensors. According to research [14], the area of artificial intelligence has undergone a radical change over the past decade, with deep learning serving as a driving force in this evolution. Deep learning's capacity to self-learn discriminative patterns from data makes it an attractive computational strategy for automating the categorisation of visual, spatial, and audio data. The authors describe how supervised deep learning is already being used and where it has the potential to go in the field of environmental protection. Then, the authors go on to detail various technical and implementation-related hurdles that might prevent the widespread use of this technology in actual conservation initiatives. They highlight research goals to help avoid these future issues and make this technology more widely available to environmental scientists and conservationists.
The Smart Cities and Open Data Reuse (SCORE) project, financed by the European Commission [15], investigated the potential of deep learning in flood monitoring. The authors employ deep learning to create an accurate picture classification model for varying degrees of drain obstruction. The methodology allows for analysing and classifying images into many categories based on their context. In developing this model, they experimented with filtering in terms of segmentation as one method to focus solely on the region of interest inside a picture, hence improving classification accuracy. In their application, segmentation is used during the pre-processing of data, just before the training phase begins. To train and validate the model, they relied on publicly accessible photos gathered from the population. The model's classification accuracy was boosted by segmentation.
One of the most challenging aspects of environmental epidemiology is quantifying the combined effects of many exposures that occur near one another [16]. Efficiency may be increased, and understanding the impact of our surroundings on public health is crucial. By adopting comprehensive measures that consider multiple aspects of the urban environment, we can gain valuable knowledge. This innovative approach is at the forefront of exposure science. It might pave the way for more effective scalability of studies for widespread coverage and lead to novel methods of comprehending environmental effects on health. Innovative techniques should be supplemented with conventional exposure measurements. However, enough data sets are needed for training and evaluating them first.
In addition to enhancing the clarity and accuracy of estimates in data-poor scenarios, deep learning shows considerable potential for improving environmental health by complementing current measures in data-rich circumstances. Efforts across disciplines are required to exploit this potential fully [17]. The research proposed a deep active learning technique that automatically assesses the unlabelled data's uncertainty. Seventeen thousand photos were utilised in the studies, all taken on real-world building sites. Seven hundred and twenty training photos were needed for 80% mean Average Precision (mAP) with the random learning approach; just 180 were needed when utilising active learning for construction item recognition. In addition, the mAP of a deep learning model built using active learning was 93.0%, whereas it was only 89.1% with the random learning strategy. The findings emphasise the significant benefits of uncertainty-based data sampling on the model's performance and show the promise of the suggested approach. The outcomes of this work can give valuable insights and new research paths for the construction research community and increase the feasibility of vision-based monitoring on building sites [18].
A novel deep learning model, the Entity Dense Net model, is discussed in Ref. [19]. It also allows researchers to "peek inside the black box" and extract the spatiotemporal features of PM2.5. Damage to buildings from natural and artificial causes is compiled. Much research has gone into vibration-based approaches using the monitored structure's vibration response to evaluate its status and spot damage. However, no research has documented the shift from conventional techniques to ML and deep learning. Their work aspires to address this void by providing a detailed overview of the most current applications of ML and DL algorithms used for vibration-based structural damage detection in civil structures, as well as presenting the highlights of the older approaches. To identify individual plants, the authors in Ref. [20] used geospatial analysis with a unique object identification algorithm to create an integrated process. A Trimble UX5 (HP) flew over 12 neighbourhoods in the Dubai Emirate over the course of six months to collect aerial data.
Drone photographs were processed using various detection methods; these images included both visible and infrared spectrums. Evaluations and improvements were also made using cutting-edge geo-processing techniques. The authors used example images from the datasets to illustrate their points. The primary objective is to provide experts with a method of evaluating the extent of green plant cover using data gleaned from processed photos. Findings suggest that unmanned aerial systems combined with deep learning algorithms might be a viable option for long-term agricultural mapping.
BirdNET, created in Ref. [21], is a deep neural network trained to recognise 984 species of birds native to North America and Europe based only on their calls. Over four years and 121 species, BirdNET attained an average accuracy of 0.791 for single-species recordings. A model to evaluate the relative merits of two methods is discussed in Ref. [22]. The vehicle's trajectories need to be optimised according to nonhomogeneous interest coverage criteria, which is a challenging optimisation issue. As the resolution at which the water resource is monitored is scaled up, the problem becomes more severe. The evolutionary method is 50% more effective at the lowest resolution but scales poorly. The double Deep Q-Learning method looks less resilient than the evolutionary technique. However, it has higher convergence regarding learning stability and sparsity of the trajectory optimality.
A generalisation study shows that the deep learning method is 35% better at adapting to new circumstances. Progress in using deep learning for geological hazard assessments is summarised in Ref. [23]. This research first discusses various uncrewed aerial vehicles, satellite platforms, and in-situ monitoring systems, which may be used to collect Earth observation data. Standard deep learning models such as convolutional and recurrent neural networks (RNNs) are presented together with their historical context. For the first time, the authors in Ref. [24] describe a low-cost camera-based automated system for monitoring the dynamics of fat, oil, and grease (FOG) layers in wastewater pumping stations at high frequency (minutes) and for long periods (months). Data were collected from a wastewater pumping station in Rotterdam, The Netherlands, over a period of 6 months with a frequency of 2 min to showcase the methodology used in this research.
Two deep learning-based approaches for gauging urban disparities using satellite and ground-level images are discussed in Ref. [25]. The goal is to use London data as a case study to examine three specific outcomes: income, population density, and environmental deprivation. These factors are measured and categorised into decile classes. Mean Absolute Error (MAE) evaluates the efficacy of the proposed multimodal models compared to their respective unimodal counterparts. In areas where street photos are available, the accuracy of income, population density, and living conditions estimates is improved by 20, 10, and 9 percentage points, respectively, when satellite tiles are added to street-level imagery. Gains of 6, 10, and 11 percentage points are possible using dense information from satellite photos and the sparser details from street-level photographs. By integrating knowledge of porous media flow behaviour with advanced deep learning techniques, a highly efficient data assimilation-reservoir response forecasting process can be achieved [26].
The workflow uses a framework based on Ensemble Smoother Multiple Data Assimilation to update geologic properties and make predictions about reservoir performance from pressure history and CO2 plumes inferred from seismic inversion, along with the quantified uncertainty associated with those predictions. The process can finish history matching and reservoir forecasting with uncertainty quantification in less than an hour on a standard desktop computer. Recent research studies, over 400 in total, were analysed in a bibliometric analysis by [27]. Studies show that deep learning is used in agriculture to digitise fields with more precision than conventional image processing approaches. Most of the literature studies plant identification issues, such as weed and pest categorisation. Convolutional neural network architectures form the backbone of their techniques. The authors also suggested that this research may be used as a roadmap for future investigations into the practical uses of deep learning in agriculture by academics and industry professionals.
The strain data from an aeroplane under different aerodynamic loads were utilised by [28] to train a convolutional neural network (CNN) model. An A350 aircraft's aerodynamic loads were simulated numerically using the vortex lattice approach, which included inserting damaged parts at random positions. Matrix, uncertainty, and sensitivity analyses were performed using the trained CNN model in the damage detection procedure. The research demonstrates that 99% accuracy can be achieved in damage detection without noise, while 97% accuracy can be achieved with 2% Gaussian noise. In the CNN model, a threshold value of 1.5 yielded the highest overall accuracy (83%) for damage localisation compared to the other threshold values tested. Results showed the proposed technique to be timely, precise, and reliable.
A straightforward method for keeping tabs on tiny sea cucumbers grown in cages on the sea bottom, combining a time-lapse camera with deep learning-based image processing, is discussed in Ref. [29]. Over 2 months, the camera captured many time-lapse photographs of sea cucumbers. The training model's accuracy, recall, and F-measure ended at 0.72. Despite the limited picture resolution, the validation model could count sea cucumbers in a cage at an acceptable monitoring level. The second step is applying the custom model to each camera picture for automatic detection. This discovery has implications for the construction of sea cucumber cages for bottom-of-the-sea aquaculture. After studying these papers, the authors see that there is scope for improvement in the performance of the RNN with the LSTM model in air quality level prediction in smart cities. Table 1 shows the existing research in air quality forecasting research.

| METHODS AND MATERIAL
In this work, RNNs using LSTMs learn sequential data patterns and structures. This is useful in time series prediction, since the order of input elements matters. The authors combine the output representations from the RNN and LSTM modules to create a fused representation of both short-term and long-term dependencies, and apply the Curiosity-Based Motivation module to modulate the learning process, influencing the model's exploration behaviour [41]. The combination of PSO with LSTM-RNN is novel in the context of air quality prediction. The PSO algorithm optimises the hyperparameters of the LSTM-RNN, allowing efficient learning and adaptation to the dynamic nature of air quality data. This research presents a hybrid model based on LSTM, RNN, and the Curiosity-based Motivation method. Optimising LSTM networks with curiosity-based motivation entails integrating the LSTM design with methods that stimulate model exploration and learning. With the help of the RNN, curiosity-driven strategies encourage inquiry and discovery [42].

| Recurrent neural network
Using a regular feed-forward network for learning and prediction is impossible with sequential or time series data; predicting future values calls for a system that stores and recalls relevant historical data. Recurrent neural networks (RNNs) are an offshoot of traditional feed-forward artificial neural networks better suited to processing sequential input, and they may be taught to retain learnt information. An RNN is a subset of artificial neural networks optimised for processing time series and other sequence-based data. Generally, feed-forward neural networks apply only to data items that may be treated as isolated entities. The neural network must be adjusted to account for dependencies between data points if the data is presented as a sequence in which one data point depends on an earlier data point. The most common use of RNNs is in language models, where predicting the next letter in a word or the next word in a sentence depends on the information already available [43].
An intriguing experiment involves an RNN that, after being fed data from Shakespeare's works, effectively produces text that sounds as if Shakespeare himself wrote it. Using recurrent neural networks to write is an example of computational creativity: artificial intelligence can mimic human ingenuity because it has mastered the rules of syntax and semantics from its training set.
Like other artificial neural networks, RNNs are built from computational elements that process data in a manner loosely modelled on the brain. The network is built up from interconnected layers of artificial neurons, also known as network nodes, that can take in data and send it on to other nodes.
The strength of a signal, and the network's final output, is affected by the edges, or weights, that link the nodes. Occasionally, artificial neural networks perform only a linear input-to-output operation; for example, image recognition systems rely on convolutional neural networks, a type of feed-forward neural network. RNNs may be stacked to perform bidirectional processing. Like feed-forward neural networks, RNNs move data from input to output; in contrast to feed-forward neural networks, however, RNNs incorporate feedback loops, such as backpropagation through time, into the computation. This allows RNNs to analyse sequential and temporal data by linking inputs across time steps. When the input sequence is trimmed to reduce the number of time steps, the resulting RNN is called a truncated backpropagation-through-time neural network. This is helpful for sequence-to-sequence models built using RNNs when the number of input time steps is larger than the number of output steps. A typical artificial neural network uses forward projections to make predictions and backward projections to assess the past; these are not employed in tandem as in a bidirectional recurrent neural network [44].
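As an illustration of the recurrence described above, the following minimal sketch shows how a single-unit RNN carries a hidden state across time steps so that earlier inputs influence later outputs. This is a simplified scalar example, not the authors' implementation; the weights `w_x`, `w_h`, and `b` are arbitrary illustrative values.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: the new hidden state mixes the current
    input with the previous hidden state through a tanh nonlinearity."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def rnn_forward(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Run the step over a whole input sequence, carrying the hidden
    state forward so earlier inputs influence later outputs."""
    h = 0.0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states
```

Note that even when a later input is zero, the hidden state remains nonzero because it carries a memory of earlier inputs through the recurrent weight `w_h`.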

| Long short-term memory units
The vanishing gradient problem plagues traditional RNNs and prevents them from being trained to their full potential, which in turn leads to subpar results. It occurs in multilayer neural networks that analyse large amounts of complex data. Gradient-based learning, the standard approach to RNN improvement, degrades as RNNs increase in size and complexity, and parameter tuning in the early layers becomes too time-consuming and computationally costly. In 1997, computer scientists Sepp Hochreiter and Jürgen Schmidhuber devised a solution to this problem, which they named LSTM networks. These use LSTM units to divide information into short-term and long-term storage sections. By doing so, RNNs can determine whether a piece of information is crucial enough to be stored and looped back into the network for future use; it also helps RNNs determine what information may safely be ignored.
Long short-term memory networks are a subclass of RNNs designed to learn from time-series data and make predictions based on long-term dependencies. Four parts working together make up a single LSTM repeating module. To train the LSTM, the parameters are modified so that the input window of historical data results in the smallest possible discrepancy between the predicted and actual values for the following measurement. Only the value beyond this point in time can be predicted using a sequential approach based on the window of historical information.
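The four interacting parts of the LSTM repeating module mentioned above can be sketched as a single scalar cell. This is a hedged illustration of the standard LSTM gate equations, not the authors' network; the parameter dictionary `p` and its key names are an illustrative convention.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step with the four interacting parts: forget gate f,
    input gate i, candidate cell g, and output gate o."""
    f = sigmoid(p['wf'] * x + p['uf'] * h_prev + p['bf'])
    i = sigmoid(p['wi'] * x + p['ui'] * h_prev + p['bi'])
    g = math.tanh(p['wg'] * x + p['ug'] * h_prev + p['bg'])
    o = sigmoid(p['wo'] * x + p['uo'] * h_prev + p['bo'])
    c = f * c_prev + i * g   # long-term cell state
    h = o * math.tanh(c)     # short-term hidden state
    return h, c
```

The cell state `c` is what lets the unit decide, via the forget and input gates, which information to store long-term and which to discard.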

TABLE 1 Summary of existing research in air quality forecasting. (Only fragments of the table were recoverable; the "Outcomes" column includes entries such as "Improved quantification and prediction of air quality", "Mobile monitoring and automated machine learning", and "Not specified in the provided information".)

| Processing of data
This phase involves consolidation, cleaning, splitting of the input window and the output, scaling, and partitioning of the data for training and validation; these are all necessary steps in preparing the data for LSTM networks. Data from many sources can be consolidated by integrating them. Outliers, missing entries, failed sensors, and other missing or damaged data should be eliminated during the data cleansing process. There are inputs (the previous time-series window) and outputs (the predicted next value). A sequence of functions is applied to the inputs to arrive at a forecast of the result. The loss (objective) function for fitting is the squared difference between the expected and measured output.
The training process can be improved by using a scaling transformation to convert all data to the 0-1 interval. To test the model's fit independently of the training, the data is split into training (e.g., 80%) and validation (20%) sets. With cross-validation, the training data may be split into different groups and each optimised individually; models are then compared with respect to the consistency of their parameters. The layers of an LSTM network establish a connection between the input data window and the network's outputs, and the network often consists of several layers rather than just one [45].
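The preparation steps above (0-1 scaling, input-window/output pairing, and an 80/20 train-validation partition) can be sketched as follows; the function names are illustrative, not taken from the original implementation.

```python
def min_max_scale(series):
    """Rescale all values to the 0-1 interval used during training."""
    lo, hi = min(series), max(series)
    return [(v - lo) / (hi - lo) for v in series]

def make_windows(series, window):
    """Split the series into (input window, next value) pairs."""
    pairs = []
    for i in range(len(series) - window):
        pairs.append((series[i:i + window], series[i + window]))
    return pairs

def train_val_split(pairs, train_frac=0.8):
    """Hold out the tail of the data for validation."""
    cut = int(len(pairs) * train_frac)
    return pairs[:cut], pairs[cut:]
```

Splitting at the tail rather than at random preserves the temporal order, which matters for time-series validation.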

| Verifying long-short term memory predictions
When validating a model, it is also vital to find out how well it does when it leverages previous forecasts to anticipate future results without any actual data. This is crucial information for evaluating the model's performance in a predictive application. The forecast is a set of predictions made without the benefit of prior measurement results.

| Particle swarm optimization
One of the bio-inspired algorithms, PSO is a straightforward approach to finding the best possible answer to a problem. Compared to other optimisation techniques, it stands out because it requires neither the gradient nor a differential form of the objective function, and it has relatively few hyperparameters to tune.
In 1995, Kennedy and Eberhart came up with the idea for PSO. Sociobiologists hold that animals that travel in groups "may profit from the experience of all other members," as stated in the cited research: while one bird flies around looking for food, the others in the flock can benefit from what it finds and learn from it.
Particle swarm optimisation shines at finding the extrema of a function defined on a high-dimensional vector space.
In other words, PSO is a method that uses the collective intelligence of a population of simple agents. The swarm is made up of a number of individual particles, each of which represents a potential answer. These potential answers coexist and work in tandem. Each swarm particle flies throughout the search region, trying several landing spots. The swarm of moving particles stands in for the ever-evolving set of candidate solutions, and the region to search is the solution space itself.
Figure 1 demonstrates that each particle remembers its own best solution (personal optimum) and the swarm's best solution (global optimum) as the generations (iterations) proceed. It then adjusts two variables: the velocity at which it flies and its current position. Each particle takes into account its own and its neighbours' flight histories to dynamically modify its trajectory, adjusting its position based on data including its current location, velocity, and the distances between its present location and both its personal optimum and the swarm's optimum. Particle swarm optimisation is generally good at exploring the search space efficiently and can help identify promising regions of hyperparameter space for LSTM networks; in complex search spaces with multiple optima, PSO has performed well in finding good solutions. In this work, the authors integrate PSO as an optimisation algorithm to dynamically adjust hyperparameters and weights within the hybrid model and define a PSO optimisation mechanism to search for optimal configurations of the RNN, LSTM, and Curiosity-Based Motivation components.
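A minimal PSO loop matching the description above (each particle tracks its personal best and is pulled toward both that and the swarm-wide best via a velocity update with inertia weight `w` and acceleration coefficients `c1`, `c2`) might look like the following sketch. All parameter values here are illustrative defaults, not the paper's settings.

```python
import random

def pso_minimise(fitness, dim, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-10.0, 10.0)):
    """Minimal PSO: particles move under inertia plus pulls toward
    their personal best and the swarm's global best."""
    rnd = random.Random(0)  # seeded for reproducibility
    lo, hi = bounds
    pos = [[rnd.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p) for p in pos]
    g = pbest[min(range(n_particles), key=lambda i: pbest_f[i])][:]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rnd.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rnd.random() * (g[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < fitness(g):
                    g = pos[i][:]
    return g
```

For instance, minimising the quadratic `(x - 3)**2` converges to a position near 3 without any gradient information, which is the property the text highlights.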
FIGURE 1 Flow chart of particle swarm optimization (PSO). DALAL ET AL.

| Proposed hybrid PSO-LSTM-RNN for air quality monitor
Time series analysis often employs LSTM networks. Long short-term memory networks can learn the sequential relationships between observations in a series, making them a good choice for time series forecasting. It is important to note, however, that LSTM models have received some criticism. Long short-term memory is not always accurate at making predictions: it predicts the following observation using delayed values from the previous one. This can come in handy when pinpoint precision at a specific moment is needed. The training of RNNs and LSTMs is also challenging because of the memory-bandwidth-bound computation they demand.
This can be quite challenging for a designer and greatly limits the application of neural network solutions. In short, LSTM necessitates four linear (MLP) layers per cell for each time step in the sequence. Because of the high memory-bandwidth requirements of linear layers, a system with a small number of processing cores will struggle to process them, and increasing memory bandwidth is far more complex than adding additional processing cores (not enough lines on a chip, long wires from processors to memory, etc.). When building an LSTM prediction model, the weights are the most crucial of the several parameters. Applying a PSO method to these settings can help obtain more accurate predictions. By using the LSTM hidden-layer weights as input to PSO, we may prevent the network from converging on a suboptimal solution. The LSTM's initial output error is used as the fitness of the particle swarm, which is then evaluated against the given conditions. Depending on the local and global extrema, the initially random particle swarm adjusts its own parameters. Figure 2 shows the working of the proposed PSO-LSTM-RNN model.
To do this, the PSO method is employed to optimise the LSTM algorithm, with the value of the ideal particle position vector in the particle swarm being used as the starting value of each weight in the LSTM network. Particle swarm fitness is defined as the mean square error of each output neuron on the given training set, and particle size may be determined from the neural network model's architecture. We then use the fitness function to determine each particle's fitness score; a lower score indicates a more accurate network output, meaning that the corresponding particles function more efficiently. By constantly updating the particle positions, the error of the network's output layer may be steadily minimised. The particle with the least error is chosen as the best one for the next iteration.
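The fitness evaluation described above, where a particle's fitness is the mean square error of the network's output on the training set, can be sketched as follows. Here `linear_predict` is a deliberately simplified, hypothetical stand-in for the LSTM forward pass; a real particle would encode the network's gate weights rather than a slope and intercept.

```python
def mse_fitness(weights, samples, predict):
    """Particle fitness: mean square error of the network's output over
    the training set; a lower score means a fitter particle."""
    total = 0.0
    for x, y in samples:
        total += (predict(weights, x) - y) ** 2
    return total / len(samples)

def linear_predict(weights, x):
    """Illustrative stand-in predictor: slope and intercept only."""
    return weights[0] * x + weights[1]
```

A particle whose weights reproduce the training targets exactly scores zero, and PSO drives the swarm toward such low-error weight vectors.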

| Curiosity-based motivation
The curiosity-based incentive is constructed using a motivation learning algorithm. The proposed framework constructs internal models of sensory data and establishes associations between those models and the behaviours it has learnt to perform. Motivation learning can happen even if an outcome is not immediately relevant to the machine. This screening process prevents irrelevant observations from being stored in a machine's memory, even though they may be relevant enough for Novelty (NI)-based learning because they are unpredictable to the system. In the absence of additional motives, such a system is nevertheless capable of engaging in NI-based learning.
Figure 3 shows the execution of Motivation in the proposed model. To learn motivations, it is necessary to have a way to generate motives and objectives that are abstract in nature. This mechanism controls impulses, chooses objectives, and monitors their realisation. To a large extent, the functioning of motivations at any given time is affected by conflicting events and attention-switching signals, which originate from contact with the environment.
FIGURE 2 Proposed particle swarm optimization (PSO)-long short-term memory (LSTM)-recurrent neural network (RNN) model.
Observational learning for motives yields sensory states in which events alter reciprocally. It plays a significant role in identifying locations where fresh experiences may be found to inspire the agent's engaging behaviour. In response to piqued curiosity, the agent may zero in on a particular feature of its surroundings. Before describing the motivation learning algorithm, it is essential to describe observations, events, novelty (NI), interestingness, and attention. The first phase focuses on observation functions. The observation function specifies the combinations of sense data that will stimulate inference. An agent given fewer sensations to observe can narrow its focus to a more specific region of the state space. Each observation Ob_State(t) considers every part of the perceived state at time t, as defined by parameters such as year_month_day_hour, DeviceID, PM1, PM2_5, PM10, MicsRED, MicsNOX, MicsHeater, Temperature, Humidity, and Ozone. The difference function provides data on how much two experiences differ. Each event is defined by a set of conditions on the different variables, and the event function specifies the agent's recognition of those conditions as events. Depending on the number of sensations to be altered, events may be of varying durations or perhaps non-existent.

| Function for identifying new information
A NI state is generated when the framework's conceptual state is compared to memories of earlier experiences constructed in long-term memory. The introspective search can discover NI by comparing an agent's present conceptual state with its recollections of earlier experiences. The interestingness function assigns a score to a given circumstance based on the degree to which NI is present. With selective attention, the agent may zero in on a particular object while simultaneously filtering out irrelevant details. The authors use a "maximum interestingness" method to decide what to focus on in order to generate drive in the agent (Algorithm 1).

Algorithm 1 Curiosity-inspired Motivational Algorithm in the Proposed LSTM-RNN Prediction Model

Step 1: Apply the observation function:
Ob_State(t) = (Ob1(t), Ob2(t), ..., Obn(t)), where Obn(t) = Sn(t) for all n,
Ob_State defines the state in time interval t, Ob1(t) defines the first observation in time interval t, and n is the number of iterations. Merge the sensations received in the sensing phase (Equation 1); sensation results motivate and help in next-level reasoning.
Step 2: Apply the difference function: calculate the difference value for the two sensation data {Sn(t), Sn(t')}.
Step 3: Apply the event function: calculate the sensation change for each variable, using the event function defined above.
Step 4: Apply the novelty identification method: a novelty identification method NM compares the memory outcomes of past experiences (M_new ∈ M_old) for each new state (New_state ∈ Old_state). Introspective search can identify novelty by comparing an agent's existing conceptual state to memories of past knowledge.
Step 5: Apply the introspective searching technique to evaluate an interestingness value for a situation's involvement: a novelty value NI is calculated for the new state (New_state ∈ Old_state) by Equation (6).
Step 6: Repeat for each new state (i) ∈ Old_state.
Step 7: Repeat for each NI(j) ∈ NI(t).
Step 8: Set the new attention value by Equation (7):
New_attention = max NI(t)    (7)
Step 9: Create the new motivation value using the new attention value.
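Steps 4-8 of the algorithm above can be sketched as follows, under the assumption (not spelled out in the text) that a state's novelty is its smallest difference from any remembered state, and that attention goes to the most novel candidate, that is, New_attention = max NI(t). Function and argument names are illustrative.

```python
def novelty(new_state, memories, difference):
    """NI of a state: its smallest difference from any remembered state,
    so states close to past experience score near zero."""
    if not memories:
        return float('inf')
    return min(difference(new_state, m) for m in memories)

def next_attention(candidates, memories, difference):
    """Step 8 analogue: attend to the candidate state with maximum NI."""
    return max(candidates, key=lambda s: novelty(s, memories, difference))
```

With scalar states and absolute difference, a candidate far from every memory wins attention, which matches the intent of curiosity-driven exploration.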

| Mathematical model for proposed long short-term memory recurrent neural network model
The proposed hybrid model is based on the LSTM and RNN methods, and it also utilises the features of Curiosity-based Motivation (the motivational algorithm). In the proposed model, the LSTM-RNN is implemented using the backpropagation algorithm. The proposed model first calculates the gradient values for the available weights at a specific timestamp. Let the input set X_input be at time stamp Time_S, and let the cell state C_state carried from time interval (Time_S - 1) into (Time_S) be (C_state - 1), with Time_S = 0 initially.
-Initialise all the available weights for each LSTM Gate.
- Once the gradients pass through an LSTM gate, the value is determined by Gradient_delta = dGradient/dh_t.
- Once the mean square error is applied, Gradient_delta is calculated by Equation (18).
- Here Pv is the predicted value and Ah the actual value.
- The final gradient can now be calculated for each LSTM gate.
- Using the LSTM gradient for the output gate, we can calculate the outcome for all the gates, that is, the input, output, and forget gates.
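For the mean square error loss named above, the gradient with respect to the prediction follows directly from the definition: the derivative of (Pv - Ah)^2 with respect to Pv is 2(Pv - Ah). A small sketch with a finite-difference check:

```python
def mse(pv, ah):
    """Squared error between predicted value Pv and actual value Ah."""
    return (pv - ah) ** 2

def mse_gradient(pv, ah):
    """Gradient of the squared error with respect to the prediction,
    i.e. the error signal fed back through the LSTM gates."""
    return 2.0 * (pv - ah)
```

The finite-difference quotient of `mse` around a point agrees with `mse_gradient`, which is the consistency check one would apply before wiring such a gradient into backpropagation.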

| Dataset
Twenty-five pollution sensors from the Air Pollution Monitoring Network in Salt Lake City, Utah, USA, were used to compile this dataset, which was requested from the associated group at the University of Utah [40]. Every 60 seconds, each air quality sensor transmits a packet of data (assuming the monitor is functioning correctly). Environmental sensors, including an optical particle counter (Plantower PMS3003), a temperature and humidity sensor (Texas Instruments HDC1080), and a sensor for detecting oxidising and reducing gases (SGX SensorTech MiCS4514), are included in each pollution monitor.
In this dataset, readings are aggregated and averaged over an hour per device (one row per device per hour). Reference data come from the FEM tropospheric ozone equipment at the Hawthorne Monitoring Site, operated by the Utah Department of Air Quality (DAQ), and are provided every 60 min. Hourly ozone readings from the DAQ system are attached to the corresponding rows in the dataset; there are never more than 25 sets of ozone readings (one for each of the 25 air pollution monitors) [46].

| Data pre-processing
Raw data is nearly never usable for training ML models, and much of a data scientist's work is spent on data preparation and cleaning. Feature engineering prepares variables for use in ML models: handling missing values, encoding categorical variables, mathematically transforming data, and generating new variables from existing ones are all examples of feature-engineering transformations. The first stage in implementing effective ML models is data preparation, in which we convert raw data from multiple sources into a usable format [47]. Figure 4 shows the functional architecture of the proposed LSTM-RNN prediction model.

| Handling missing values
The data acquired in a research study often has issues with missing data. The issue arises when an observation's feature does not have a corresponding value in the dataset. Many factors can contribute to a dataset having gaps: missing information may originate with the dataset's respondents, or it might be absent because of a clerical mistake when entering the data. To train properly, most ML algorithms require data in which each observation has a value for every feature. Missing data can introduce bias into the parameter estimation process, reducing the effectiveness of ML models; this might lead us to interpret the data incorrectly and cause further problems. For this reason, missing data is a problem for ML models and must be handled properly. Several methods are employed to deal with missing information [48].
Common approaches are: deleting the missing observation(s); mean-value imputation; hot-deck imputation; and regression imputation. In hot-deck imputation, we replace the missing value of an observation with a randomly picked value from observations in the sample that have similar values on other variables; this guarantees that the imputed value is drawn from the interval where the real value may fall rather than being predetermined. A related variant substitutes the value from the available observation that most closely matches the missing one, so the imputed value is not chosen randomly. Regression imputation fits a regression model to the feature lacking data and then uses the model's predictions to fill in the missing value; the method maintains the connections between characteristics, which gives it several advantages over simpler techniques such as mean and mode imputation [49].
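Two of the imputation strategies listed above, mean imputation and (a simplified, unconditioned form of) hot-deck imputation, can be sketched in a few lines. Missing entries are represented here by `None`, an illustrative convention; a full hot-deck scheme would additionally condition the donor pool on similar values of other variables.

```python
import random

def mean_impute(values):
    """Replace each missing entry (None) with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def hot_deck_impute(values, seed=0):
    """Replace each missing entry with a randomly drawn observed value,
    so imputed values always fall within the observed range."""
    rnd = random.Random(seed)
    observed = [v for v in values if v is not None]
    return [rnd.choice(observed) if v is None else v for v in values]
```

The hot-deck variant never produces a value outside the observed data, which is the property the text attributes to it.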

| Handling outliers
Outliers can be a massive difficulty in data analysis and ML. A small number of extreme cases can have a catastrophic effect on an ML algorithm or on visualisation results, so it is vital to discover outliers and deal with them cautiously.
➢ Finding abnormal patterns: identifying anomalies presents no significant difficulty. A few methods exist for spotting anomalies: boxplot; histogram; statistical analysis (mean and standard deviation); interquartile range; Z-score; percentile.
➢ Processes for dealing with extreme values: one of the trickiest choices in data analysis is what to do with an outlier once it has been identified: should it be removed or adjusted? We first examine a few techniques for filtering out extreme cases.
➢ Z-score: having shown how to use the Z-score to identify an outlier, we now wish to eliminate these anomalies and obtain clean data. The authors computed the Z-score; this only requires one line of code [50].
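The Z-score filtering step described above might look like the following sketch, assuming a conventional threshold of 3 standard deviations (the paper does not state which threshold it used):

```python
import statistics

def drop_outliers(values, threshold=3.0):
    """Keep only values whose Z-score magnitude is below the threshold."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [v for v in values if abs((v - mean) / sd) < threshold]
```

The filtering comprehension itself is the "one line of code" the text refers to; the rest is just computing the mean and standard deviation it needs.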

| Hyperparameter optimisation
The hyperparameter selection process for the hybrid model designed to anticipate air quality variations, promoting environmentally friendly development of smart cities, was thorough and systematic. The PSO component was designed with a swarm size of 30 to balance exploration and exploitation. After an extensive investigation, the learning rate was established at 0.2 and the inertia weight at 0.7 to enhance convergence and exploration. The design was optimised by incorporating two LSTM layers, each containing 100 neurons, as well as a dropout rate of 0.3 to reduce overfitting across the LSTM and RNN modules. The LSTM's learning rate was set at 0.001 to regulate the size of the optimisation step. A batch size of 32 was used to optimise memory usage throughout the training phase. The model underwent training for 50 epochs to achieve convergence while avoiding overfitting. The hyperparameters were chosen considering factors such as model convergence, computing efficiency, preventing overfitting, and establishing the appropriate exploration-exploitation compromise in the optimisation technique. Thorough testing and verification using suitable performance metrics were carried out to ensure that the selected hyperparameters enhanced the model's accuracy in forecasting air quality variations, in line with the goals of sustainable smart city development.
FIGURE 4 Architecture of the proposed long short-term memory recurrent neural network (LSTM-RNN) prediction model.
The hyperparameters for the RNN component of the hybrid model were meticulously selected to enhance the overall architecture. Two layers, each consisting of 100 neurons, were used to capture temporal dependencies, similar to the LSTM. To prevent overfitting, a dropout rate of 0.3 was used, and a learning rate of 0.001 was selected for optimal convergence in the training process. The hyperparameters were chosen to achieve a balance between model complexity and generalisation while preventing underfitting and overfitting, in line with the overarching objective of improving the model's accuracy in forecasting air quality variations to support responsible growth in smart cities. A thorough experimental method along with performance validation was carried out to confirm that the selected hyperparameters enhanced the efficiency of the RNN in the hybrid model. Table 2 presents hyperparameter details for the proposed hybrid model.
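For reference, the hyperparameter values stated in this section can be collected into a single configuration mapping; the key names are illustrative, not taken from the original code.

```python
# Hyperparameters as stated in the text; key names are illustrative.
hyperparameters = {
    # PSO component
    'swarm_size': 30,
    'pso_learning_rate': 0.2,
    'inertia_weight': 0.7,
    # LSTM / RNN components (two layers of 100 neurons each)
    'lstm_layers': 2,
    'neurons_per_layer': 100,
    'dropout_rate': 0.3,
    'learning_rate': 0.001,
    # Training
    'batch_size': 32,
    'epochs': 50,
}
```

Keeping the configuration in one mapping makes it straightforward to log alongside results and to feed to a tuning loop.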

| RESULTS AND DISCUSSION
This section covers the experimental results of proposed and existing models on the air quality dataset.

| Performance metrics for evaluation
Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and the Coefficient of Determination (R2-Score) were used to measure the performance of time series prediction [51].
➢ RMSE measures how distant the data points are from the regression line and is calculated as the standard deviation of the prediction errors; the higher the value, the more misaligned the prediction [52].

RMSE = sqrt( (1/n) * Σ (Pi - Ai)² ), where Pi is the predicted and Ai the actual value.
➢ MAE is the average of the absolute differences between the predicted and actual values over all instances in the test set, with all individual differences weighted equally; it thus measures the average magnitude of the prediction errors while ignoring their direction.
➢ MAPE is a relative statistic in which the average relative error is expressed as a fraction of the true value.
➢ R2-Score measures how well a linear regression model predicts changes in a dependent variable from changes in the independent variables.
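The four metrics can be computed directly from their definitions; the following plain-Python sketch is one way to do so (equivalent implementations exist in common libraries).

```python
import math

def rmse(pred, actual):
    """Root mean square error."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred))

def mae(pred, actual):
    """Mean absolute error."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

def mape(pred, actual):
    """Mean absolute percentage error (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for p, a in zip(pred, actual)) / len(pred)

def r2_score(pred, actual):
    """Coefficient of determination: 1 minus residual over total variance."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(pred, actual))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives RMSE, MAE, and MAPE of zero and an R2-Score of 1, which provides a quick sanity check of the implementations.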

| Experimental results and comparisons
In the experimental analysis, we measured how well the proposed hybrid LSTM-RNN performed using the RMSE, MAE, MAPE, and R2-Score, and compared it with the existing research models GBTR, Existing LSTM, and SVMR. The data were collected from the University of Utah Air Pollution Monitoring Network dataset for Salt Lake City from 2019-07-26 to 2021-05-14. The proposed algorithm has been implemented in Python. The first step of the implementation phase is initialising the biases and weight matrices. Sets with the minimum,

| Experiment 1
In the first experiment, the proposed LSTM-RNN model is tested on PM 2.5 in the proposed dataset, and the pollution level is calculated. Table 3 compares the performance of the proposed hybrid LSTM-RNN using the MAE, MAPE, and R2-Score. The table shows that our proposed prediction approach has higher performance regarding RMSE and MAE errors and model expressiveness than the different optimisation algorithms.
Table 3 displays a thorough performance assessment of the proposed model compared with various optimisation algorithms, such as Transformer, gated recurrent units (GRU), LSTM + RNN + ant colony optimisation (ACO), and LSTM + RNN + GA, using key metrics such as RMSE, MAE, MAPE, and the Coefficient of Determination (R2-Score). The findings demonstrate that the proposed model is significantly better, as seen by its lower RMSE and MAE values, which indicate improved accuracy and fewer discrepancies from real data. The significantly reduced MAPE highlights the model's accurate predictions with a minimal percentage error, essential for dependable forecasting. The higher R2-Score confirms that the proposed model fits the data well, demonstrating its usefulness in capturing variations in air quality. The superior performance is due to the hybrid PSO-LSTM-RNN method, which combines PSO for parameter tuning and the recursive structure of the neural network for capturing temporal relationships. This demonstrates the model's adaptability and reliability in forecasting air quality changes.
Table 4 displays a thorough performance assessment of the proposed model in contrast to Transformer and GRU for predicting air quality. The proposed model demonstrates better prediction accuracy than Transformer and GRU, as shown by reduced RMSE and MAE values. The RMSE for the proposed model is 0.0184, compared with Transformer (0.0074) and GRU (0.082). The proposed model has the lowest MAE of 0.0082, demonstrating its superior accuracy compared to Transformer (0.0237) and GRU (0.0197). The proposed model shows greater performance with an R2-Score of 0.1227, surpassing both Transformer (0.0591) and GRU (0.0784). There appears to be an inconsistency in the reported MAPE figures that needs to be explained. Overall, the results indicate that the proposed model performs strongly in forecasting air quality, highlighting its significance for environmentally conscious growth in smart cities.
Figure 7 illustrates the efficacy of the hybrid LSTM-RNN model in forecasting air quality dynamics. The data are based on information gathered on the stated date, and the model underwent training for 20 epochs. Figure 7 and Tables 3 and 4 together offer a thorough assessment of the predictive accuracy and dependability of the proposed hybrid model.
TABLE 3 Performance evaluation of the proposed model with different optimization algorithms.

| Experiment 2
In experiment 2, the RMSE and MAE parameters are calculated for PM 2.5 for the proposed LSTM-RNN model and the existing models, that is, GBTR, Existing LSTM, and SVMR. Figure 8 shows the MAE results for the proposed LSTM-RNN model and the existing models. The proposed LSTM-RNN model achieves MAE results of 2.12 for 2 h, 2.25 for 4 h, 2.89 for 6 h, 3.65 for 8 h, and 4.12 for 10 h, which are lower than those of the existing models. A lower MAE value indicates better model performance.
Figure 9 shows the RMSE results for the proposed LSTM-RNN and the existing models. The proposed LSTM-RNN model achieves RMSE results of 1.13 for 2 h, 1.45 for 4 h, 1.97 for 6 h, 2.25 for 8 h, and 2.87 for 10 h, which are lower than those of the existing models. A lower RMSE value indicates better model performance.
Figure 10 shows the MAPE results for the proposed LSTM-RNN model for the first 2 h in the ranges 15-30, 30-40, 40-70, and 70+. As per WHO standard guidelines, PM 2.5 in the range 0-20 has little impact on the human body due to its low value, so we do not discuss this range of MAPE results. Figure 11 presents the corresponding MAPE results. In the third experiment, we initially computed results with and without hyperparameter tuning for both the proposed and existing models. Subsequently, a comprehensive analysis was conducted using 10-fold cross-validation on the air quality dataset to further evaluate and compare the models.

Analysis for Hyperparameters
Optimising hyperparameters is an essential stage in the creation of ML models. It entails methodically choosing the best hyperparameters, which are external configuration choices, to improve the model's performance and generalisation. Typical hyperparameters include learning rates, regularisation strengths, and architecture-specific factors. The approach frequently uses methods such as grid search, random search, or more sophisticated techniques such as Bayesian optimisation to systematically explore the hyperparameter space and determine the combination that yields the best model performance. Graphs with hyperparameter tuning often demonstrate enhanced performance compared to those without tuning, underscoring the need to select suitable hyperparameters to maximise the model's efficacy.
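A grid search, the simplest of the methods named above, can be sketched in a few lines. The grid values and the scoring function below are hypothetical stand-ins; in practice the score would come from training the LSTM-RNN and measuring validation RMSE:

```python
from itertools import product

# Hypothetical hyperparameter grid for the LSTM-RNN model
grid = {
    "learning_rate": [0.001, 0.01],
    "dropout": [0.2, 0.5],
    "hidden_units": [32, 64],
}

def validation_rmse(params):
    # Stand-in for training the model and scoring it on a validation set;
    # a toy deterministic function so the search loop is runnable.
    return params["learning_rate"] * 10 + params["dropout"] + 64 / params["hidden_units"]

# Evaluate every combination in the grid and keep the lowest-error one
best = min(
    (dict(zip(grid, values)) for values in product(*grid.values())),
    key=validation_rmse,
)
print(best)  # {'learning_rate': 0.001, 'dropout': 0.2, 'hidden_units': 64}
```

Grid search is exhaustive and therefore grows exponentially with the number of hyperparameters, which is what motivates random search and Bayesian optimisation for larger spaces.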
Table 5 displays a comparative study of experimental findings for the proposed hybrid model and current models, including RMSE, MAE, MAPE, and R2-Score. The trials were conducted with and without hyperparameter tuning. The proposed hybrid model consistently surpasses existing models in both cases. With hyperparameter tuning, the proposed hybrid model demonstrates higher prediction accuracy, achieving the lowest RMSE of 0.012, MAE of 0.009, and MAPE of 3.5%, along with the highest R2-Score of 0.95. The proposed hybrid model remains competitive even without tuning, demonstrating its robustness. The success of the proposed hybrid model is due to innovative features such as curiosity-driven motivation, PSO for weight optimisation, and the effective integration of neighbouring stations and weather records, which collectively improve its adaptability and forecasting performance.

| Analysis for k-fold cross-validation
K-fold cross-validation is crucial for analysing an air quality dataset. Ensuring the reliability and robustness of predictive models is critical in air quality prediction for effective environmental monitoring and public health management. K-fold cross-validation is particularly useful for managing the inherent unpredictability and complexity of air quality data. Air quality datasets frequently vary over time and space, and including data points from various sources helps account for this variability when training and evaluating models. K-fold cross-validation systematically rotates distinct subsets of the dataset between training and testing to assess a model's ability to generalise across diverse situations and time frames. In air quality prediction, this verifies that the model's performance is not biased by any particular sample of data when forecasting pollutant levels under various conditions. It is crucial for obtaining a dependable and impartial evaluation of the model's precision, thereby improving the accuracy of forecasts for real-world air quality situations. The k-fold cross-validation findings thus support a thorough assessment of the model's performance on the intricacies of air quality datasets.
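The rotation of folds described above can be sketched without any ML library. The helper below builds k contiguous folds over sample indices and pairs each fold, in turn, with the remaining folds as training data:

```python
def k_fold_indices(n_samples, k):
    # Split sample indices into k near-equal contiguous folds
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(n_samples, k):
    # Each fold serves exactly once as the test set; the rest form the training set
    folds = k_fold_indices(n_samples, k)
    splits = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        splits.append((train_idx, test_idx))
    return splits

splits = cross_validate(10, 5)
print(len(splits))   # 5
print(splits[0][1])  # [0, 1] -- the first test fold
```

For time-ordered air quality data, a time-series-aware split (training only on past folds) is often preferred over this plain rotation, since shuffling future readings into the training set can leak information.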
Table 6 illustrates the results of a 10-fold cross-validation for the proposed hybrid model, LSTM+RNN+ACO, and LSTM+RNN+GA in predicting air quality. Each row corresponds to a fold, showing metrics such as RMSE, MAE, MAPE, and R2-Score. The "Avg" row represents the average performance across all folds.
Table 6 presents the findings of the 10-fold cross-validation, showing that the proposed hybrid model outperforms the existing models, LSTM+RNN+ACO and LSTM+RNN+GA, in forecasting air quality. The proposed hybrid model consistently shows improved accuracy in predicting air quality levels, with an average RMSE of 0.013 and MAE of 0.010, indicating reduced prediction errors. The average MAPE of 3.8% indicates a low level of error, highlighting the model's accuracy. The R2-Score of 0.94 indicates a good amount of explained variance and dependability in the predictions. The existing models show higher average RMSE and MAE, increased MAPE, and lower R2-Scores, indicating worse performance. The proposed hybrid model thus demonstrates strong and consistent performance in air quality prediction, highlighting its dependability and efficacy.

| Ablation analysis
The ablation study focuses on the roles of PSO, RNN, and LSTM in predicting air quality changes in smart cities, as shown in Table 7. Experiment 3 integrates PSO, resulting in a notable improvement in performance metrics: the MAE lowers to 0.0189 and the MAPE decreases to 0.0258, while the R2-Score remains constant at 0.081. Experiment 4 combines curiosity-driven motivation with LSTM, RNN, and PSO, leading to highly positive results: the model attains an MAE of 0.0082, a MAPE of 0.0184, and an R2-Score of 0.1227. The results demonstrate a synergistic effect when LSTM, RNN, PSO, and curiosity-based motivation are combined, resulting in enhanced prediction accuracy and overall model performance.
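The PSO component added in Experiment 3 searches for network weights that minimise prediction error. The paper does not publish its PSO settings, so the following is only a minimal sketch of the canonical algorithm, with a toy quadratic loss standing in for the LSTM's validation error; the inertia and acceleration constants are assumed values:

```python
import random

random.seed(0)

def pso_minimise(loss, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5):
    # Minimal particle swarm optimiser: each particle remembers its personal
    # best position, and the swarm shares a global best that steers velocities.
    pos = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = loss(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy "prediction error" surface standing in for the LSTM's validation loss
best_w, best_err = pso_minimise(lambda w: sum(x * x for x in w), dim=3)
print(round(best_err, 6))  # close to 0
```

In the paper's setting, `loss` would evaluate the LSTM-RNN on validation data for a candidate weight vector, which makes PSO attractive precisely because it needs no gradients.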

| Discussion
The LSTM-RNN hybrid model represents significant progress in forecasting air quality, outperforming current techniques. The hybrid architecture combines the advantages of Long Short-Term Memory (LSTM) and RNN, using LSTM's ability to capture long-term dependencies and the RNN's efficacy with sequential input. The architectural design is further improved by integrating Curiosity-based Motivation, which introduces a new incentive system that adjusts and enhances the model's predictive skills over time.
An in-depth assessment using key metrics further confirms the model's strength. Table 3 shows that the RMSE of the proposed model is consistently lower than that of other methods, indicating its higher accuracy in forecasting changes in air quality. Table 4 supports this claim by demonstrating lower MAE values for the proposed model, suggesting stronger agreement between predicted and observed air quality values. Table 5 demonstrates that the proposed approach excels at reducing percentage errors, specifically the MAPE.
The proposed model stands out for using several air quality indicators, such as SO2, CO, O3, and NO2. This comprehensive approach improves the model's capacity to capture the wide variety of environmental factors that influence air quality. The temporal analysis shown in Figures 9 and 10 strengthens the model's reliability and efficiency over various time periods. The experimental MAPE findings demonstrate the model's effectiveness in forecasting changes in air quality within the initial 2-4 h.
Overall, the proposed LSTM-RNN hybrid model proves to be a strong and dependable method for forecasting air quality. The architectural synergy, Curiosity-based Motivation, and consistent outperformance across evaluation metrics, as shown in Tables 3-5, along with the temporal assessments in Figures 9 and 10, establish it as a leading contender in the field, demonstrating its potential for sustainable development in smart cities.

| CONCLUSION & FUTURE SCOPE
The combination of LSTM-RNN with PSO and curiosity-based motivation in the proposed model offers a viable method to improve the accuracy of air quality predictions. The paper describes the process of implementing the integrated model, starting with creating a univariate forecasting model for each subset of the air quality data. A curiosity-driven incentive is implemented to predict the current station's data by combining information from nearby stations and weather histories. This study introduces a prediction model for environmental quality using LSTM, in line with the increasing focus on environmental governance due to worsening air quality. The model uses data from the University of Utah AIRU Pollution Monitoring Network to compute the Air Quality Index (AQI) based on factors such as temperature, PM2.5, PM10, SO2, wind direction, NO2, CO, and O3.
The article begins by providing an overview of the development, technological features, and present state of air quality monitoring, leading into a full explanation of the environmental forecasting model. The model utilises dropout regularisation, Curiosity-based Motivation, and PSO optimisation to improve the LSTM-RNN, reducing prediction errors. The proposed method outperforms existing methods such as GBTR, Existing LSTM, and SVMR in terms of performance metrics: 0.0184 (RMSE), 0.0082 (MAE), 2002*10^9 (MAPE), and 0.122 (R2-Score). The results show that the proposed LSTM model outperforms other techniques in terms of RMSE on the specified dataset, highlighting its precision in forecasting AQI.
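The dropout regularisation mentioned above drops units at random during training so the network cannot over-rely on any one connection. A minimal sketch of inverted dropout (the formulation commonly used in modern frameworks; the paper does not specify which variant its dropout layer uses):

```python
import random

random.seed(1)

def dropout(activations, rate):
    # Inverted dropout: drop each unit with probability `rate` during training
    # and rescale survivors by 1/(1-rate) so the expected activation is unchanged,
    # meaning no rescaling is needed at inference time.
    keep = 1.0 - rate
    return [a / keep if random.random() < keep else 0.0 for a in activations]

# Hypothetical activations from one recurrent layer
acts = [0.5, 1.2, 0.8, 0.3]
print(dropout(acts, rate=0.5))  # each entry is either 0.0 or doubled
```

In the proposed model this masking is applied to both the input and recurrent connections of the LSTM, excluding the dropped units from activation and weight updates.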
While the proposed LSTM-RNN model excels at accurate AQI forecasting, there remains a realistic awareness of its limits when forecasting extreme conditions. Future efforts will concentrate on enhancing the model to address severe situations and verifying its resilience across a wider spectrum of environmental circumstances. Improving the accuracy of air quality forecasting through these developments has significant potential to improve public health and guide decisions on environmental and health policies.

- Distinction Function: The difference between two sensations, SL(t) and SL(t'), in the experienced states S(t) and S(t'), is quantified by a distinction function.

Figure 7 Performance of the proposed hybrid long short-term memory recurrent neural network (LSTM-RNN).
Figure 8 Mean Absolute Error (MAE) results for the proposed LSTM-RNN model and existing models.
Figure 9 Root Mean Square Error (RMSE) results for the proposed LSTM-RNN model and existing models.
Figure 10 Mean Absolute Percentage Error (MAPE) experimental results for the proposed model for the first 2 h.
Figure 11 Mean Absolute Percentage Error (MAPE) experimental results for the proposed model for the first 4 h.

Figures 12 and 13 display the performance metrics of an ML model under different hyperparameter configurations, with and without hyperparameter tuning. The graphics clearly illustrate how the model's performance changes with various hyperparameter values.

Table 2 Hyperparameter details for the proposed hybrid model.
Table 5 Experimental results with and without hyperparameter tuning.

Using 20 epochs means the dataset was processed 20 times during training, improving the model's predictive skill and supporting the evaluation of its accuracy and its capacity to detect patterns in the air quality data.

Table 7 presents an ablation study investigating the effects of altering model components on performance metrics such as MAE, MAPE, and R2-Score. Experiment 1 examined the plain LSTM model, yielding an MAE of 0.0584, a MAPE of 0.0478, and an R2-Score of 0.067. Experiment 2, which incorporates an RNN into the LSTM model, shows improvement with a lower MAE of 0.0489, a lower MAPE of 0.0397, and a higher R2-Score of 0.078.