Internet of Things for Noise Mapping in Smart Cities: State of the Art and Future Directions

Noise pollution has been an issue since ancient times. Recently, this problem has been exacerbated due to rapid population growth and urbanization. Noise mapping is a strategic action plan that visualizes the long-term and real-time noise pollution of our cities, industrial sites, and other regions of interest. This article first discusses the working principle of general model-based noise mapping and the lessons learned. Then, in-depth descriptions of the technical challenges and design issues of noise mapping using mobile crowdsensing and acoustic sensor networks are presented. Finally, we provide our insights for future research directions regarding artificial intelligence assisted noise prediction, constructive interference for multimedia transmission, and simultaneous noise sensing and sound energy harvesting as well as inaudible sound attacks and defense.


Introduction
In recent years, the rate of urbanization has rapidly accelerated. Half of the global population already lives in urban areas, according to the World Urbanization Prospects, and the proportion is estimated to increase to 68 percent by 2050. Urbanization advances all aspects of our society, from infrastructure to public services. However, it also has many detrimental effects, one of which is noise pollution. Ample scientific evidence shows that noise pollution harms people's psychological and physiological health.
International organizations and governments have been making considerable efforts to mitigate this issue. For example, in 2002, the European Commission announced the Environmental Noise Directive, which requires its members to produce noise maps and update them every five years. The fourth round of noise mapping will be completed by 2022. Noise mapping is one critical application in smart cities because it has the following advantages:
• It allows us to gain a comprehensive understanding of invisible environmental noise in terms of sources, level, and distribution.
• It provides noise information that is accessible to the public, thus allowing them to assess potential risks.
• It guides governments in developing noise reduction plans and preserving quiet areas.
In October 2018, the World Health Organization released new environmental noise guidelines with the aim of prompting governments to evaluate noise exposure and implement sustainable policy actions. Meanwhile, a significant number of researchers have studied noise pollution from the perspectives of human health effects [1], noise mapping technology [2], noise cancellation [3], and so on.
Noise mapping has traditionally relied on manual sound collection. Appointed workers periodically traveled to the sites being investigated and collected noise samples using professional devices. The data was then brought back to the relevant departments for analysis and reporting. Large-scale noise mapping by manual sampling is time-consuming, expensive, and sometimes dangerous. To address these issues, computational model-based noise mapping was proposed and has become the main approach today, because it not only generates large-scale noise maps efficiently but also predicts future environmental noise levels. However, many years of practice have shown that the computational model-based approach has several limitations, such as coarse-grained estimation, few simulation scenarios, static visualization, and a massive computation burden.
As the next technological revolution, the Internet of Things (IoT) [4] is expected to fundamentally overcome the above challenges. First, ubiquitous real-world sound sensing can be achieved by the huge numbers of intelligent devices such as sensor nodes, smartphones, wearable devices, self-driving cars [5], and unmanned aerial vehicles. Second, the collaboration between in-network nodes, edge devices, and cloud platforms could make ubiquitous computing a reality. Third, cutting-edge technologies, for example, artificial intelligence (AI), big data, and knowledge discovery and data mining, can significantly enhance the extraction of intelligence. The IoT is thus transforming the implementation and capability of noise mapping techniques. Its ubiquitous sensing ability enables large-scale noise measurement. Ubiquitous computing enables fine-grained noise classification and dynamic noise mapping. Ubiquitous intelligence facilitates people-centric noise visualization and data-driven noise mitigation. In sum, next-generation noise mapping is taking the form of a smart city application that merges IoT technologies.
In the rest of this article, we first illustrate noise mapping based on computational models, along with the lessons learned. Subsequently, the state-of-the-art noise mapping approaches are systematically overviewed. Possibilities for future research are presented next, followed by the conclusion.

Background
Computational Model-Based Noise Mapping
For the past decade, the computational method has been used to generate noise maps. The basic idea of this method is to exploit the acoustic emission behavior of noise sources and sound propagation characteristics to calculate the sound pressure level. Figure 1 illustrates the working principle.
In the first step, two types of datasets are required as input data: sound source characteristics and environment characteristics. Sound source characteristics are used to assess noise emission from road traffic, aircraft, and industrial production, while environment characteristics are required to assess noise propagation attenuation at receptors.
In the second step, sound pressure levels in predefined scenarios are calculated based on computational models. For example, the ISO 9613-2 model describes the behavior of sound propagation attenuation in outdoor environments. The traffic noise model (TNM) version 3.0 was released in 2017 by the U.S. Federal Highway Administration (FHWA) for highway traffic noise modeling. Generally speaking, the noise emission of road traffic is modeled by traffic type, flow intensity, heaviness, and other parameters. Aircraft noise emission is represented through the type of aircraft and flight track. Information regarding equipment type, sound power spectrum, location, and directivity is used to model the sound power emission of industrial activities. In addition, the atmospheric environment, ground type, barriers, and building reflection are needed to calculate noise attenuation at the points where people are exposed.
In the third step, the impacts of noise pollution are collected, such as the number of exposed people beyond limit values, the number of exposed dwellings, and the seriously affected areas. Information regarding people and dwellings in certain areas can be obtained from geographic information systems (GIS) or professional socio-demographic datasets. Finally, noise maps are generated to guide plans of action and help the public understand the situation.
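To make the second step concrete, the following is a minimal sketch of a point-source calculation in the spirit of ISO 9613-2, where the level at a receiver is the source sound power level plus a directivity correction minus the total attenuation. Only the geometric divergence term is computed explicitly; atmospheric absorption, ground effect, and barrier attenuation are lumped into a single placeholder argument (`other_attenuation_db`), which is an assumption made here for brevity. All function names are illustrative.

```python
import math


def geometric_divergence_db(distance_m, reference_m=1.0):
    """Attenuation from spherical spreading of a point source (dB)."""
    return 20.0 * math.log10(distance_m / reference_m) + 11.0


def receiver_level_db(sound_power_db, distance_m,
                      directivity_db=0.0, other_attenuation_db=0.0):
    """Level at the receiver: source power + directivity - total attenuation.

    other_attenuation_db is a placeholder for atmospheric, ground, and
    barrier terms, which a real model computes separately.
    """
    total_attenuation = geometric_divergence_db(distance_m) + other_attenuation_db
    return sound_power_db + directivity_db - total_attenuation


def combine_levels_db(levels_db):
    """Energetic sum of contributions from several sources at one receiver."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))


# Two hypothetical sources seen from the same receiver point.
contributions = [receiver_level_db(100.0, 10.0), receiver_level_db(95.0, 25.0)]
total_db = combine_levels_db(contributions)
```

Because levels are logarithmic, contributions are summed in the energy domain rather than added directly, which is why two equal sources raise the level by about 3 dB rather than doubling it.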

Lessons Learned
One lesson learned from the computational model-based method is that its estimates are coarse-grained: several parameters are required to calculate the sound pressure level, but in many cases they cannot be obtained accurately. In addition, the investigated scenarios are usually predefined, and thus only simulate general ambient noise. In fact, abnormal noise events (e.g., road work, traffic jams, and festive activities) are also of interest. Unfortunately, the simulation results do not include these events. The second lesson learned is the small number of simulation scenarios. Noise models for road traffic are well studied [6], but the public is also frequently exposed to social noise, construction noise, and industrial noise, which are difficult to model accurately. Furthermore, noise models for indoor environments still need to be comprehensively investigated. Other lessons learned include static visualization and the massive computation burden, because massive data computation and processing are required on a central platform to generate large-scale noise maps. Frequent input data updates, which dynamic visualization requires, are also difficult.

State of the Art
Sound Mobile Crowdsensing
Mobile devices, smartphones especially, have become necessities in our daily lives. People use smartphones to take photographs, record videos, and upload these to online social networks. Many interesting smartphone applications take advantage of the various built-in sensors. In addition, the number of smartphone users is expected to reach 6.1 billion in 2020, which constitutes 70 percent of the world's population. These facts make a crowdsensing paradigm possible.
For noise mapping applications, environmental noise and its effects on people's health could be monitored through smartphones, smart watches, and smart bracelets. An embedded microphone is able to record sounds in the user's surroundings. Location information can be obtained via the global navigation satellite system (GNSS), compasses, and wireless signals. Users' experience of a certain place can also be captured through photographs and videos. Furthermore, heart rate sensors are able to monitor users' health conditions in real time. After collecting the multi-scale data, powerful neural engines and processors on smart devices can calculate the sound pressure level and recognize sound sources. The results obtained from all participants are then aggregated in the cloud server through WiFi or 5G communication. Finally, the information is visualized on mobile applications. In this way, the public can understand the situation of one location or region.
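As a rough illustration of how a device might turn microphone samples into a level, the sketch below computes the RMS level of normalized samples in dB relative to full scale, adds a per-device calibration offset to estimate the sound pressure level, and averages several windows into an equivalent continuous level. The offset value and the function names are hypothetical; a real application would obtain the offset by comparing the device against a reference sound level meter.

```python
import math


def level_dbfs(samples):
    """RMS level of normalized samples (range -1..1) in dB re full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    rms = max(math.sqrt(mean_square), 1e-12)  # floor avoids log10(0) on silence
    return 20.0 * math.log10(rms)


def estimated_spl_db(samples, calibration_offset_db):
    """Estimated sound pressure level; the offset is device-specific."""
    return level_dbfs(samples) + calibration_offset_db


def equivalent_level_db(window_levels_db):
    """Equivalent continuous level over equal-length windows (energy mean)."""
    mean_energy = sum(10.0 ** (l / 10.0) for l in window_levels_db) / len(window_levels_db)
    return 10.0 * math.log10(mean_energy)


# A full-scale sine wave stands in for one recorded window.
window = [math.sin(2.0 * math.pi * k / 100.0) for k in range(1000)]
spl = estimated_spl_db(window, calibration_offset_db=94.0)  # hypothetical offset
leq = equivalent_level_db([50.0, 70.0])
```

Note that the equivalent level of a 50 dB window and a 70 dB window is dominated by the louder one, since the averaging happens in the energy domain.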

Technical Challenges
Crowdsensing technology enables all members of the public to be involved in noise monitoring and thus obtain a comprehensive understanding of the environment in which they live. Such an understanding can also increase public awareness and prompt people to actively reduce noise. However, much progress must be made before crowdsensing technology can be adopted for noise mapping. Six technical challenges are discussed as follows.
Noise Measurement Accuracy: The practicability of using smartphones instead of professional sound level meters for noise assessment has been studied. Enda et al. [7] conducted a detailed evaluation of the measurement accuracy of mobile devices and noise monitoring applications. Seven iOS and Android applications were tested on 100 smartphones from six manufacturers. The results show that smartphones have limitations in measuring ambient noise, especially for background noise and high sound pressure. Moreover, measurement accuracy varies with the application and the mobile device model.
Positioning Accuracy: Accuracy also concerns position. Normally, the positioning precision of mobile phones is approximately several meters. But in crowded urban areas and indoor environments, the accuracy could deteriorate due to the attenuation of satellite signals. As a result, inaccurate noise values and positioning errors produce an incorrect map of the noise situation. The cross-calibration of noise level and position through multisource data fusion could be one way to increase measurement accuracy.
Context Awareness: Microphones are frequently in use when people use a smartphone to make voice calls or send voice messages. A smartphone-based crowdsensing system should be able to detect the status of the smartphone to decide whether a sound level recording task should be executed. Moreover, the phone is sometimes held in a person's hand, or placed in a pocket or bag, which significantly affects noise measurement accuracy. Therefore, a crowdsensing system's context awareness ability is crucial for recognizing the status and placement of smartphones. At present, most crowdsensing systems rely on smartphones to monitor noise. We expect smart watches and smart bracelets to play important roles in noise mapping since they are "closer" to ambient noise. In addition to providing health risk alarms, they could also relieve the burden of context awareness.
Energy Consumption: In the era of IoT, the public depends heavily on smartphones. These devices are charged every night so that they can function throughout the day. If noise monitoring tasks consume much energy or even deplete the smartphone's charge, people would experience increased anxiety, and interest in participatory noise sensing would decline sharply. Thus, energy-efficient task scheduling and low-power sensing are necessary.
Privacy Preservation: Personal health data and time-location information are sensitive. Participants' health conditions and behaviors can be inferred from excessively disclosed data, which is a safety hazard to both individuals and society as a whole. Therefore, research regarding privacy-preserving data sharing is needed.
Fine-Grained Mapping: When participating measurements are sparse in time and space, estimating the noise in uncovered areas from limited crowdsourced data becomes an urgent task.
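One simple baseline for estimating noise at uncovered locations is inverse distance weighting, sketched below. This is a swapped-in generic interpolation technique chosen for illustration, not a method proposed in the surveyed systems, and the coordinates and levels are invented. Levels are averaged in the energy domain, since decibel values should not be averaged arithmetically.

```python
import math


def idw_noise_estimate(measurements, x, y, power=2.0):
    """Estimate the level (dB) at (x, y) from (x, y, level_db) measurements."""
    if not measurements:
        raise ValueError("at least one measurement is required")
    weighted_energy = 0.0
    weight_sum = 0.0
    for mx, my, level_db in measurements:
        distance = math.hypot(x - mx, y - my)
        if distance == 0.0:
            return level_db  # query point coincides with a measurement
        weight = 1.0 / distance ** power
        weighted_energy += weight * 10.0 ** (level_db / 10.0)
        weight_sum += weight
    return 10.0 * math.log10(weighted_energy / weight_sum)


# Hypothetical crowdsourced readings: (x in m, y in m, level in dB).
readings = [(0.0, 0.0, 60.0), (10.0, 0.0, 70.0)]
midpoint_db = idw_noise_estimate(readings, 5.0, 0.0)
```

Such a baseline ignores buildings, barriers, and source locations, so it is best read as a starting point against which learned or model-assisted estimators can be compared.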

Related Systems and Mobile Applications
Many sound crowdsensing systems have been proposed in recent years, including NoiseSensee, GRCSensing, and SONYC [2]. Moreover, a variety of mobile applications have been developed for noise mapping, such as NoiseTube and the National Institute for Occupational Safety and Health (NIOSH) Sound Level Meter App. A summary of related work is presented in Fig. 2. The determination of design criteria not only helps us compare the abilities of different systems but also serves as a guideline for engineering development and research. The criteria [8] to assess their performance are as follows.
Personal Exposure and Community Exposure: The first two criteria mean visualization should reflect both current and historical values of sound pressure level around individuals and communities to enable users to understand the noise exposure situation.

Risk Assessment: Since environmental noise has harmful effects on people's physiological and psychological health, a health risk assessment is expected to notify users about illness prevention and timely treatment. One example of a commercial off-the-shelf product is the Apple Watch Series 5, which can measure ambient sound levels and keep track of users' health.
Experience Capture: The sound levels themselves cannot directly represent events that occur, such as machine noise or the roar of a crowd. To capture the experience, artificial intelligence-based noise source recognition is necessary. Other approaches to capture events that occur include the use of raw audio data, photographs, and videos.
Continuity and Energy-Awareness: These two criteria mean the system needs to balance monitoring performance and energy consumption due to the limited battery capacity and high energy consumption of multimedia sensors used in noise mapping applications.
Correctness and Calibration: Measurements obtained by smartphones and wearables are not as accurate as those obtained by professional sound level meters, so a calibration mechanism is essential for correctness.
Context-Awareness and Unobtrusiveness: As mentioned in the discussion of technical challenges, context awareness enables adaptive sound measurement. In addition, intelligent operation without user interaction is also necessary for unobtrusiveness.
Open Source and Interoperability: Finally, openness and interoperability facilitate rapid application promotion.
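For the correctness and calibration criterion, one simple mechanism is a linear fit between device readings and a co-located reference meter. The sketch below uses ordinary least squares on paired readings; the data values are invented for illustration, and real devices may need frequency-dependent or nonlinear corrections that this sketch does not cover.

```python
def fit_linear_calibration(device_db, reference_db):
    """Least-squares slope and intercept mapping device readings to reference."""
    n = len(device_db)
    mean_x = sum(device_db) / n
    mean_y = sum(reference_db) / n
    sxx = sum((x - mean_x) ** 2 for x in device_db)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(device_db, reference_db))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_calibration(reading_db, slope, intercept):
    """Correct a raw device reading using the fitted line."""
    return slope * reading_db + intercept


# Hypothetical paired readings taken side by side (dB).
phone = [40.0, 50.0, 60.0, 70.0]
meter = [45.0, 54.0, 63.0, 72.0]
slope, intercept = fit_linear_calibration(phone, meter)
corrected = apply_calibration(80.0, slope, intercept)
```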

Acoustic Sensor Networks
Sound mobile crowdsensing has many advantages. However, a significant limitation is its sensing sparsity in both space and time. Acoustic sensor networks (ASNs) complement sound crowdsensing well in that they provide continuous noise monitoring. However, different system designs critically influence the quality of service. Existing ASNs are typically classified into three categories: ASNs based on dedicated equipment, ASNs based on improved customized sensor nodes, and ASNs based on low-cost sensor nodes. A detailed description of the three types of networks was presented in [9]. Here, we compare them in Table 1 and discuss the main design concerns, which are as follows.
Hardware Cost: The cost of acoustic sensor nodes affects network deployment decisions. With low-cost hardware, it is easy to ensure large-scale urban coverage. If the hardware is expensive, however, a compromise must be made between the deployment cost and area coverage.
Accuracy: The choice of acoustic sensors influences data quality. Dedicated sound level meters can collect accurate data, but they are expensive and large. MEMS microphones could mitigate this problem, but calibration algorithms are required.
Scalability and Reliability: The network architecture and protocol stack must be carefully designed to achieve scalable and reliable ASNs. Flexible mechanisms for devices to join and leave the network facilitate the adjustment of the monitoring area. Robust medium access control and routing ensure that the communication performance does not significantly fluctuate due to adverse external environments.
Capability: Real-time sound sensing and multimedia stream transmission consume a considerable amount of energy. In addition to battery power supply, harvesting ambient energy from solar power, wind, radio waves, and vibrations is ideal. Local processing capability is important for identifying sound events and reducing data transport. Equipping devices with GNSS, cameras, and other kinds of sensors can provide more insightful information for noise mapping.

Future Research Directions
Despite the numerous academic and industrial efforts that have been made regarding noise mapping applications, research on noise mapping technology is still in its infancy. In this section, we highlight four promising directions for future research.

AI-Assisted Noise Prediction
Compared with the current model-based noise mapping solutions, mobile crowdsensing and ASNs are able to record real noise data. However, they cannot by themselves predict future noise levels, which is vital for evaluating the performance of any noise control measure in advance. In the era of IoT, all devices are interconnected. Using information regarding sound sources and the environment, artificial intelligence could be leveraged for noise prediction.
An artificial neural network (ANN) is a good candidate for this task since it can be trained to predict noise by analyzing examples under various scenarios. A basic architecture for traffic noise prediction is depicted in Fig. 3, which consists of an input layer, one or more hidden layers, and one output layer. The number of different types of vehicles, traffic speeds, road widths, building heights, and other factors can be considered the input signals. After learning via sample observations, such as real noise measurements at the major roads of a city, the network is capable of producing predicted traffic noise at the output layer. A few pilot studies [10] have also proven the feasibility of traffic noise prediction using artificial neural networks.
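The structure described above (input layer, hidden layer, single output) can be sketched in a few lines of plain Python. The example below trains a tiny network with one hidden layer by stochastic gradient descent. The two inputs, the network size, and above all the training data are assumptions: the targets come from an invented toy relation, not from real measurements or a validated traffic noise model.

```python
import math
import random


def make_synthetic_samples(n, rng):
    """Invented toy relation between normalized inputs and a normalized level."""
    samples = []
    for _ in range(n):
        flow = rng.uniform(0.1, 1.0)   # normalized vehicles per hour
        speed = rng.uniform(0.1, 1.0)  # normalized mean speed
        target = 0.5 * math.log10(10.0 * flow) + 0.3 * speed
        samples.append(((flow, speed), target))
    return samples


class TinyMLP:
    """One tanh hidden layer and one linear output neuron."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return hidden, output

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr):
        """One gradient step on the squared error of a single sample."""
        hidden, output = self.forward(x)
        error = output - target
        for j, h in enumerate(hidden):
            # Backpropagate through the output weight before updating it.
            grad_hidden = error * self.w2[j] * (1.0 - h * h)
            self.w2[j] -= lr * error * h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_hidden * xi
            self.b1[j] -= lr * grad_hidden
        self.b2 -= lr * error


def mean_squared_error(model, samples):
    return sum((model.predict(x) - t) ** 2 for x, t in samples) / len(samples)


rng = random.Random(0)
data = make_synthetic_samples(200, rng)
model = TinyMLP(n_in=2, n_hidden=4, rng=rng)
mse_before = mean_squared_error(model, data)
for _ in range(200):  # epochs
    for x, t in data:
        model.train_step(x, t, lr=0.05)
mse_after = mean_squared_error(model, data)
```

In practice the inputs would include the factors listed above (vehicle counts by type, speeds, road widths, building heights), and the model would be evaluated on held-out measurements rather than on its own training set.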
There are several challenges when an ANN is employed to address the noise prediction problem. First, there is a trade-off between the performance and complexity of an ANN system. Prediction accuracy can be improved by increasing the number of input signals. However, a more powerful platform is then required and the burden of data annotation increases. Therefore, it is critical to decide the most suitable number of variables for the input layer. Second, a neural network is typically trained on a particular set of data, for example, measurements obtained from a single city. Due to variations in local conditions (e.g., weather conditions and vehicle specifications), a neural network that performs excellently in one region may not work well in other regions, and thus may produce mistaken predictions. Therefore, the generality and applicability of an ANN-based noise prediction model need to be guaranteed. Third, the ANN-based approaches proposed in these pilot studies all focus on traffic noise. However, traffic noise is not the only source that contributes to environmental noise pollution. Other major causes include industrial noise generated by factory equipment, construction noise from activities like road maintenance, and social noise that occurs in homes, commercial zones, education zones, and other public places. Sophisticated ANN-based approaches for these types of noise have yet to be investigated. Fourth, dynamic fine-grained noise prediction at a large scale is challenging because sensor data processing and calculation require massive computing resources. Therefore, novel learning and computing paradigms are needed to improve the system's computing capability and efficiency.
Here, we present some potential solutions to the challenges listed above. Suitable input variables for an ANN could be determined through sensitivity analysis and importance ranking, which identify the factors that most affect noise levels in various application scenarios. Generative adversarial networks (GANs) are expected to be applied to noise prediction to enhance the generality and robustness of AI expert systems. In addition to artificial neural networks, the accuracy, computational cost, and applicability of other machine learning algorithms for environmental noise prediction in different scenarios must be investigated. Finally, distributed edge computing with federated learning is a promising method by which the massive number of resource-constrained IoT devices can be used for efficient large-scale fine-grained noise prediction.
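To illustrate the federated learning idea, the sketch below shows the aggregation step of federated averaging, a common rule in which a server combines locally trained model parameters weighted by each device's number of training samples. The article does not prescribe a specific aggregation rule, so this choice is an assumption, as are the flat parameter vectors and the example values.

```python
def federated_average(client_updates):
    """Combine (weights, num_samples) pairs into one global weight vector.

    Each client's parameters are weighted by its share of the total samples,
    so raw audio and measurements never leave the device.
    """
    total_samples = sum(n for _, n in client_updates)
    dimension = len(client_updates[0][0])
    averaged = [0.0] * dimension
    for weights, num_samples in client_updates:
        for i, w in enumerate(weights):
            averaged[i] += w * num_samples / total_samples
    return averaged


# Two hypothetical edge nodes with different amounts of local data.
global_weights = federated_average([
    ([1.0, 2.0], 1),
    ([3.0, 4.0], 3),
])
```

Only model parameters are exchanged in this scheme, which also fits the privacy concerns raised for crowdsensed data.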

Constructive Interference for Multimedia Transmission
Raw sound signals, images, and related videos are valuable for context-aware noise mapping. Therefore, multimedia information along with noise level information should be delivered to edge nodes and clouds. However, the areas to be monitored might already be crowded with high-speed wireless communications. For example, information exchanges among vehicles and roadside units will become more frequent as the number of self-driving cars and the use of the Internet of Vehicles increase. In megacities, the wireless spectrum in central business districts, railway stations, and airports is already overcrowded. As a result, the adverse wireless communication environment largely affects both the reliability and latency of ASNs.
Constructive interference-based wireless transmission is a robust technique to protect data against radio interference and achieve ultra-low delivery latency in multi-hop low-power wireless networks. Constructive interference is an interesting phenomenon in low-power wireless networks that was first explored in 2008 by Dutta et al. Based on this concept, the game-changing protocol Glossy was proposed by Ferrari et al. in 2011. Glossy transformed protocol design for wireless sensor networks and has attracted significant attention from researchers studying embedded wireless systems and networks. In recent years, most communication protocols and network services [11] presented in flagship academic conferences and international scientific competitions have been built on top of constructive interference.
The basic principle of constructive interference is that one receiver can successfully decode identical packets from multiple transmissions if the radio signals from the senders tightly overlap in both the time and frequency domains. However, it is not easy to implement such a technique on existing low-power sensing platforms because tight synchronization is challenging and the timing requirements of constructive interference on different platforms largely depend on modulation techniques and data rates. In addition, the packet size, the number of concurrent senders, and the transmission power also affect its dependability. Furthermore, most existing implementations are designed for the TelosB mote, which was introduced to the research community as a test platform over 10 years ago.
At present, increasing numbers of hardware platforms with enhanced processing and communication capabilities are being introduced to emerging smart applications. The public testbeds D-Cube and FIT IoT-Lab already support powerful hardware platforms such as the Nordic nRF52 series and Zolertia Firefly. Now is the time to study the feasibility of realizing constructive interference on these platforms under different physical layer technologies. When constructive interference meets noise mapping, a significant issue is the efficient transmission of multimedia traffic because a larger packet size leads to a higher packet corruption rate. This topic has rarely been investigated so far. It provides new opportunities for existing research on multimedia transmission to be integrated with constructive interference to further improve the reliability, energy efficiency, and bit flux for noise mapping. Possible methods include combining inter-frame and intra-frame coding or compressive sensing with constructive interference to ensure better synchronization by reducing frame length.
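The effect of packet size on corruption can be illustrated with a deliberately simple model in which bit errors are independent, so a frame survives only if every bit does. This independence assumption is ours and is optimistic for constructive interference, where timing offsets tend to produce correlated errors; the bit error rate below is likewise a made-up value.

```python
def packet_success_probability(bit_error_rate, frame_bytes):
    """Probability that a frame arrives intact under independent bit errors."""
    return (1.0 - bit_error_rate) ** (8 * frame_bytes)


def expected_transmissions(bit_error_rate, frame_bytes):
    """Mean number of attempts until one intact copy is received."""
    return 1.0 / packet_success_probability(bit_error_rate, frame_bytes)


# Hypothetical bit error rate; 127 bytes is the maximum IEEE 802.15.4 frame.
ber = 1e-3
long_frame = packet_success_probability(ber, 127)
short_frame = packet_success_probability(ber, 32)
```

Under this model, shrinking the frame through coding or compressive sensing raises the per-frame success probability, at the cost of sending more frames for the same multimedia payload.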

Simultaneous Noise Sensing and Energy Harvesting
Energy-efficient noise monitoring is challenging for the following reasons. First, general ambient noise is of concern, and monitoring it requires frequent sampling. Second, abnormal events such as road work and traffic jams greatly contribute to noise pollution, and they are of more interest to researchers and official departments. However, the exact times at which these abnormal events occur are uncertain. Traditional periodic sensing approaches can only capture part of these events and might lose critical information. Third, sensor nodes and smart devices have limited energy reserves and cannot perform noise sensing tasks continuously. Fourth, the audio sensors and cameras used in noise mapping applications are power-hungry and rapidly accelerate energy consumption.
Currently, many applications need power-hungry sensors. To extend the lifetimes of these applications, sensorless sensing technologies have been proposed. The basic idea is to use the radio as a sensor. Examples include human activity recognition and in-home sleep monitoring using radio signals. Here, we envision a new form of sensorless sensing, namely, the power supply as a sensor. Sound energy harvesters could serve not only as a promising ambient energy source but also as passive sensors for noise mapping [12,13].
We illustrate the novel simultaneous sensing and energy harvesting system in Fig. 4. The noise signals of interest from road traffic, social activities, industrial production, and construction are first captured through piezoelectric material, electromagnetic induction, and the triboelectric effect. Next, converter circuits are used to convert ambient sound energy into electrical energy, which is then stored in a supercapacitor or rechargeable battery. The voltage monitor observes the harvested energy level at a predefined rate. These measured values are used as the inputs of the sound-voltage model, which calculates the corresponding sound pressure level. Thus, the system is able to passively record environmental noise levels without any audio sensors. The harvested sound energy could be periodically forwarded to the power supply unit. The amount of electrical energy depends on sound energy density and conversion efficiency, which means a self-sustained sensorless noise monitoring system without an external power supply would be possible. In addition, the system could be equipped with an additional loudness sensor and camera. The task scheduler could thus decide whether fine-grained and context-aware sensing should be performed according to the noise thresholds.
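A minimal sketch of the sound-voltage model and the threshold-based task scheduler is given below. It assumes the harvester voltage is proportional to sound pressure, so that one calibration point (a reference voltage observed at a known level) fixes the mapping. This linear-transducer assumption, the calibration values, and the function names are all hypothetical; a real harvester would need an empirically fitted model.

```python
import math


def spl_from_voltage(voltage_v, ref_voltage_v, ref_spl_db):
    """Map a monitored harvester voltage to an estimated level in dB.

    Assumes voltage scales linearly with sound pressure, calibrated at
    one point (ref_voltage_v observed at ref_spl_db).
    """
    if voltage_v <= 0.0 or ref_voltage_v <= 0.0:
        raise ValueError("voltages must be positive")
    return ref_spl_db + 20.0 * math.log10(voltage_v / ref_voltage_v)


def should_wake_sensors(voltage_v, ref_voltage_v, ref_spl_db, threshold_db):
    """Scheduler rule: power up microphone and camera only above a threshold."""
    return spl_from_voltage(voltage_v, ref_voltage_v, ref_spl_db) >= threshold_db


# Hypothetical calibration: 10 mV corresponds to 70 dB.
loud = should_wake_sensors(0.1, 0.01, 70.0, threshold_db=85.0)
quiet = should_wake_sensors(0.01, 0.01, 70.0, threshold_db=85.0)
```

With this rule, the power-hungry sensors stay off during ordinary ambient noise and are only activated when the passive estimate suggests an abnormal event.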

Inaudible Sound Security and Monitoring
In the IoT, millions of smart objects are equipped with a voice-controllable system as the human-machine interface. Through voice commands, these smart objects (e.g., smartphones, robots, or cars) are able to execute corresponding actions such as making phone calls, cleaning rooms, and driving the car automatically. By enabling hands-free human-machine interaction, voice assistants help people do many things with ease, especially children, the elderly, and disabled individuals.
However, it is worth noting that hidden voice commands can be generated by malicious operators in the following two ways. First, sound signals between 20 Hz and 20 kHz are audible to both humans and microphones, while ultrasound signals are inaudible to humans yet can be recorded by microphones. That means that malicious operators can easily use software defined radios or even modified commercial smartphones to send inaudible voice commands via ultrasound channels without people being aware of the attacks. Second, malicious audible voice commands can be disguised as white noise and embedded into public broadcast programs and videos uploaded to online social media platforms. When people listen to or watch these multimedia resources, the malicious voice commands are recognized by voice-controllable systems during the speech recognition stage while people do not perceive the command delivery. Such hidden voice command attacks could lead to serious consequences, such as traffic accidents, indiscriminate transactions, and incorrect operations. They could significantly affect the progress of smart homes, autonomous driving, agricultural robotics, and other applications. Therefore, inaudible sound attacks and defense are critical issues that must be addressed.
Most of the existing security solutions that serve to mitigate this issue focus on detection mechanisms on the receiver side to determine whether voice commands come from legitimate users or malicious operators. Popular detection methods include the analysis of the indelible nonlinear voice trace [14] and pop noise location and classification [15].
Here, we present an alternative method: over-the-air monitoring of inaudible sound attacks. The ASNs and mobile devices used for audible environmental noise mapping could simultaneously monitor inaudible sound. For example, the sensor node could monitor the sound signal at one's home and send the detected voice commands to a screen for visualization to notify the homeowners of this situation and thus enable them to take action. In driving scenarios, a smartphone can monitor hidden voice commands in real time to ensure the safety of drivers and passengers.
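As a first building block for such monitoring, a node could check whether a captured window contains significant energy above the audible band. The sketch below uses the Goertzel algorithm to measure power at a set of probe frequencies above 20 kHz and compares it with the total signal energy. The sampling rate, probe spacing, and threshold are assumptions, and the sketch only flags ultrasonic content; it does not demodulate or recognize any command.

```python
import math


def goertzel_power(samples, sample_rate_hz, target_hz):
    """Squared magnitude of the signal component at target_hz (Goertzel)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * target_hz / sample_rate_hz)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def has_ultrasonic_content(samples, sample_rate_hz,
                           probe_step_hz=1000, threshold=0.1):
    """Flag a window if any probe above 20 kHz holds a large energy share."""
    energy = sum(x * x for x in samples)
    if energy == 0.0:
        return False
    n = len(samples)
    nyquist_hz = int(sample_rate_hz / 2)
    for probe_hz in range(21000, nyquist_hz, probe_step_hz):
        # A pure tone exactly at the probe frequency yields a share near 1.
        share = goertzel_power(samples, sample_rate_hz, probe_hz) / (0.5 * n * energy)
        if share > threshold:
            return True
    return False


# Synthetic test windows: 10 ms at an assumed 96 kHz sampling rate.
fs = 96000
ultrasonic_tone = [math.sin(2.0 * math.pi * 25000 * k / fs) for k in range(960)]
audible_tone = [math.sin(2.0 * math.pi * 1000 * k / fs) for k in range(960)]
```

This kind of check requires a microphone and converter that sample well above 40 kHz, which is one reason broadband sound capture is a prerequisite for the monitoring approach.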
Unfortunately, the current noise mapping approaches focus primarily on the calculation of sound pressure level. In the future, the application of enhanced noise mapping to over-the-air monitoring of inaudible sound attacks is expected to be investigated, in which broadband sound signal capture at different frequencies and speech recognition are essential.

Conclusion
Many academic research and industrial efforts have contributed to IoT-based noise mapping. However, a comprehensive literature review and systematic analysis of this area is still lacking. This article closes the gap by presenting in-depth descriptions of the latest advances in noise mapping technologies. Specifically, we discussed the technical challenges of sound mobile crowdsensing and summarized the criteria for the design of crowdsensing systems and mobile applications. In addition, we compared the cost, accuracy, scalability, reliability, and capability of the three types of acoustic sensor networks. Moreover, we presented detailed discussions of the four identified research directions from the perspectives of motivations, technical issues, and potential solutions.

FIGURE 1. The working principle of computational model-based noise mapping.

FIGURE 2. Summary of crowdsensing systems and mobile applications for noise mapping.

FIGURE 3. Artificial neural network for traffic noise prediction.

FIGURE 4. Architecture of simultaneous environmental noise monitoring and energy harvesting in noise mapping application.
Ye Liu, Xiaoyuan Ma, Lei Shu, Qing Yang, Yu Zhang, Zhiqiang Huo, and Zhangbing Zhou
Digital Object Identifier: 10.1109/MNET.011.1900634
Ye Liu is with Nanjing Agricultural University; Xiaoyuan Ma is with Shanghai Advanced Research Institute, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences; Lei Shu (corresponding author) is with Nanjing Agricultural University and the University of Lincoln; Qing Yang is with the University of North Texas; Yu Zhang is with Loughborough University; Zhiqiang Huo is with the University of Lincoln; Zhangbing Zhou is with China University of Geosciences and TELECOM SudParis.

TABLE 1. Comparison of Acoustic Sensor Networks.