Authentic Dialogue Generation to Improve Youth’s Awareness of Cybergrooming for Online Safety

This paper addresses cybergrooming and sexual misconduct in artificial-intelligence-based educational programs. Although cybergrooming has been recognized as a cybercrime, programs that protect youth from it remain scarce. We present a generative chatbot framework, SERI (Stop cybERgroomIng), that can generate fluent, authentic cybergrooming conversations between a perpetrator chatbot and a potential-victim chatbot. Furthermore, we propose deep-reinforcement-learning-based dialogue generation with a stage-related reward that steers the conversation toward an expected grooming stage. We also minimize the potential ethical issues introduced by perverted language when the chatbots are deployed in cybersecurity education programs. We evaluated SERI's conversations with open-source referenced and unreferenced metrics and with human evaluation. We developed SERI as a platform for deploying the perpetrator chatbot to interact with youth users, observing their responses and collecting their reactions when the perpetrator asks for private or sensitive information.


I. INTRODUCTION
Cybergrooming is the practice of establishing emotional, intimate, and trusting relationships on the Internet with potential victims, usually children and teenagers, in order to use them for online sexual abuse or exploitation, which often leads to offline sexual crimes [1]. Research on detecting cybergrooming perpetrators has mainly focused on conversation analysis, which is reactive rather than proactive. An example of reactive defense is detecting cybergrooming by leveraging lexical features [2]. In contrast, proactive approaches require prevention or vulnerability analysis [3] in advance of crimes.
Since prior research has shown the unique vulnerability of youths as teenagers under puberty [4], it is critical to take proactive interventions that protect youth from becoming potential victims of cybergrooming. Further, Lateef [5] has shown the high effectiveness of simulation-based learning that provides immersive, realistic situations, such as medical emergencies, disaster response, or military battlefields. Hence, our work aims to develop a generative chatbot framework as a proactive cybergrooming prevention approach. This framework will provide authentic dialogues between a cybergroomer and a potential victim without exposing youth to potential harms, such as perverted, sexually sensitive language.
We name our generative chatbot framework 'SERI,' for Stop cybERgroomIng. We designed SERI as a pre-stage phase and validated its generation performance before deploying it to real human youth users, which will be done in our future work. SERI will let a youth engage in an authentic dialogue with a stranger or acquaintance and learn how to respond when such a person asks for sensitive or private information. SERI consists of two chatbots, one playing a perpetrator and the other playing a potential victim.

II. RELATED WORK
In this section, we provide an overview of the state-of-the-art related work in terms of cybergrooming detection, chatbot application tools, and deep reinforcement learning (DRL)-based conversation generation. We also identify their limitations and the gaps we fill in this work.

A. Cybergrooming Detection
Several traditional machine learning (ML) algorithms, such as support vector machine (SVM), k-nearest neighbors (KNN), Random Forest, Decision Tree, fuzzy logic, Naïve Bayes, and neural network (NN) classifiers [2], have been studied to detect cybergrooming in online forums or social media platforms by leveraging lexical and behavioral features. Former studies have identified cybergrooming attack stages [6] between perpetrators and victims based on the conversational relationship. Perpetrators usually build a relationship and evolve to a closer stage to realize the cybergrooming crime. While most previous studies aimed at detecting and analyzing features of potential perpetrators [7], there is no research on identifying the characteristics of potential victims in cybergrooming scenarios.

B. Chatbot Application Tools
An early-stage chatbot, named Negobot, was developed to detect and analyze potential pedophiles in social networks [8]. A game-theoretic reward can push the chatbot toward the next grooming stage or keep it at the current stage. In recent years, pre-trained language models, such as BERT and GPT, and sequence-to-sequence models, such as BART and T5, have demonstrated superior capabilities in natural language understanding and generation from large-scale data training. Several chatbot programs have explored pre-trained language models for conversation generation. For example, TransferTransfo [9] extends GPT-2 with a multi-task objective, combining unsupervised prediction tasks. However, no previous chatbots have been developed to counter cybergrooming by simulating conversations between a cybergroomer and a victim.

[Fig. 1. Architectural overview of the SERI framework: Step I (TextCNN stage classifier trained on the PJ dataset), Step II (T5 fine-tuning), Step III (T5 fine-tuning with a DRL policy and reward), and Step IV (stage evolution), producing the leading perpetrator chatbot and the following victim chatbot.]

C. DRL-based Conversation Generation
As a typical approach to learning efficient and effective dialogue strategies, reinforcement learning (RL), such as Q-learning or SARSA (state-action-reward-state-action), has been commonly used [10] to identify optimal dialogue strategies that provide high-quality conversation at minimum retrieval cost. DRL has been used to evaluate emotion [11], to make interactive RL (IRL) evaluation affordable and faster, and to generate dialogue style transfer based on GPT-2 and BART sequence-to-sequence models [12]. However, no prior work has leveraged RL to generate dialogues that model the behaviors and strategies of online social attackers (e.g., cybergroomers) given their attack goals and intents.

III. PROPOSED APPROACH: SERI
Fig. 1 provides an architectural overview of the proposed SERI framework. SERI contains two chatbots built through four components: (1) training a cybergrooming stage classifier on each perpetrator utterance in the PJ dataset; (2) fine-tuning both the perpetrator and potential-victim chatbots on the large-scale ConvAI2 dataset; (3) fine-tuning the two chatbots on the preprocessed PJ dataset, where the perpetrator chatbot is additionally trained with a DRL policy whose reward measures how likely the generated utterance is to come from the target grooming stage; and (4) advancing the perpetrator chatbot to a higher-level stage to continue the dialogues.

A. Classifying Perpetrators' Messages per Stage
Zambrano et al. [7] partially labeled the PJ dataset with the six grooming stages of cybergrooming conversations. To assign a stage label to the remaining conversations of the PJ dataset, given each utterance u, we use TextCNN [13] as the stage classifier. The output of the convolutional layer after dropout, denoted as \mathbf{u}, is the contextual representation of u. We then apply a linear function with the softmax function to classify u into one of the six stages. The stage classifier is trained to minimize a categorical cross-entropy loss L_{CE}:

L_{CE} = -\sum_{u \in U} \sum_{s \in S} y_{u,s} \log \tilde{y}_{u,s}, \quad \text{where} \quad \tilde{y}_u = \mathrm{softmax}(W\mathbf{u} + b).  (1)

Here U is the set of utterances in a dialogue and S is the set of all target stages. \tilde{y}_u denotes the vector of probabilities over all stages for u, and \tilde{y}_{u,s} is the probability of predicting u as stage s. The indicator y_{u,s} denotes whether s is the true stage label of u (y_{u,s} = 1) or not (y_{u,s} = 0). The parameters W and b of the dense layer are learnable.

The six stages in [7] were identified by unsupervised clustering algorithms and are not clearly delineated, because some perpetrator utterances could fall into multiple stages under their definitions. Their stage 1 covers contacting and getting to know the target, while their stage 4 establishes trust with the target victim [7]; we merge these into our new stage s1. Similarly, we combine their stage 2 (establishing a link to friends or family) and stage 3 (collecting location and parents' information) into our new stage s2, which collects private information and social relationships.
Table I summarizes the key conversation contents and topics covered by each restructured stage for the perpetrator. Stages s3 and s4 are the same as the original stages 5 and 6. In the end, through the TextCNN stage classifier and stage consolidation, we assign a stage label to each utterance in the PJ dataset.
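The dense layer and cross-entropy loss above can be illustrated with a minimal sketch (plain Python stands in for the actual TextCNN; the feature vector `u`, weights `W`, and bias `b` are toy values, and four stages are used for brevity):

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution over stages."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def stage_probabilities(u, W, b):
    """Linear layer over the contextual representation u, then softmax."""
    logits = [sum(w_i * u_i for w_i, u_i in zip(row, u)) + b_j
              for row, b_j in zip(W, b)]
    return softmax(logits)

def cross_entropy(y_true, y_pred):
    """Categorical cross-entropy for a one-hot stage label."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))

# Toy example: 3-dim contextual representation, 4 stages (s1..s4)
u = [0.2, -0.1, 0.5]
W = [[0.1, 0.0, 0.3], [0.2, 0.1, -0.1], [0.0, 0.4, 0.2], [-0.3, 0.2, 0.1]]
b = [0.0, 0.0, 0.0, 0.0]
probs = stage_probabilities(u, W, b)
loss = cross_entropy([1, 0, 0, 0], probs)  # true stage is s1
```

In the actual classifier, `u` would come from the convolutional layer with dropout, and `W` and `b` would be learned by minimizing this loss over the labeled PJ utterances.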

B. Fine-tuning the Chatbots on the ConvAI2 Dataset
We apply two sequential fine-tuning steps to both chatbots, starting from an original T5 (Text-to-Text Transfer Transformer) checkpoint. The first step uses the large-scale ConvAI2 dataset, containing dialogue turns on broad topics, to improve the fluency of the generated conversations. The second step further fine-tunes the models on the PJ dataset. To train the perpetrator chatbot with the ability to lead the conversation, we use the leading person's dialogues in ConvAI2 to train the perpetrator chatbot and the other person's responses to train the victim chatbot. In our SERI perpetrator and victim chatbots (see Fig. 2), a training unit concatenates four or five consecutive utterances from the ConvAI2 dialogues: five consecutive utterances per unit for the perpetrator chatbot and four for the victim chatbot. In each training unit, the prediction target (i.e., the ground-truth response) is the last utterance. Because both kinds of training units begin with a perpetrator utterance, each unit provides a dialogue history of three or four preceding utterances for the victim and perpetrator chatbots, respectively.
Given a source dialogue history x, the T5 model generates a response by optimizing the following objective:

L_{T5}(\Theta) = -\sum_{t} \log p_{\Theta}(y_t \mid y_{<t}, x),  (2)

where \Theta is the set of T5 model parameters and y_t is the t-th token of the response utterance.
After fine-tuning T5 on the ConvAI2 dataset, we further fine-tune the victim chatbot on the domain-specific PJ dataset, which shifts it toward generating cybergrooming responses.
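The construction of training units described above can be sketched as follows (a simplified illustration assuming strictly alternating, perpetrator-first turns; utterance texts are invented):

```python
def build_training_units(dialogue, unit_len):
    """Slide over a dialogue (perpetrator speaks first, turns alternate)
    and return (history, target) pairs: the last utterance in each window
    is the prediction target, the preceding ones the dialogue history."""
    units = []
    # Step by 2 so every training unit starts with a perpetrator utterance.
    for start in range(0, len(dialogue) - unit_len + 1, 2):
        window = dialogue[start:start + unit_len]
        units.append((window[:-1], window[-1]))
    return units

dialogue = ["P: hi", "V: hello", "P: how old are you?", "V: 14",
            "P: cool, where do you live?", "V: near the mall"]

# Perpetrator chatbot: 5-utterance units, so the target is a perpetrator turn.
perp_units = build_training_units(dialogue, 5)
# Victim chatbot: 4-utterance units, so the target is a victim turn.
victim_units = build_training_units(dialogue, 4)
```

With an odd unit length the target falls on the leading (perpetrator) speaker, and with an even length on the responding (victim) speaker, matching the 5-vs-4 choice in the paper.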

C. Fine-tuning the Perpetrator Chatbot with a DRL Policy
The perpetrator gradually levels up through the grooming stages in Table I to build a trusting relationship with a victim and achieve the ultimate grooming goal. Accordingly, the perpetrator chatbot learns to generate stage-related conversations during fine-tuning on the PJ dataset. We optimize a DRL policy so that generated utterances move closer to the intended stage.
State. A state is the two previous dialogue turns, i.e., four consecutive utterances [u1, u2, u3, u4]. The dialogue history is vectorized by feeding the concatenation of u1 to u4 into a T5 encoder.
Action. An action corresponds to the next dialogue utterance to be generated. The action space is effectively unbounded, since any token sequence within the max-length hyperparameter can be generated.
Reward. We implement a classification-confidence-based reward to encourage the chatbot to follow the expected grooming stages. We reuse the TextCNN stage classifier trained in Section III-A and evaluate how well a generated sentence y^s matches the target stage through the classifier's confidence:

p_{\phi}(\cdot \mid y^s),  (3)

where y^s is the generated sentence, p_{\phi}(\cdot \mid y^s) denotes the probability distribution over all target stage labels, and \phi refers to the parameters of the stage classifier, which are fixed during fine-tuning. The stage-related reward is:

r = p_{\phi}(s_g \mid y^s),  (4)

where s_g is the correct stage from the ground truth.
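The reward computation reduces to reading off the fixed classifier's probability for the target stage; a minimal sketch (the probability values are toy numbers standing in for TextCNN outputs):

```python
def stage_reward(stage_probs, target_stage):
    """Classification-confidence reward: the fixed stage classifier's
    probability that the generated utterance belongs to the target stage."""
    return stage_probs[target_stage]

# Toy classifier distribution over stages s1..s4 for one generated utterance
probs = {"s1": 0.1, "s2": 0.7, "s3": 0.15, "s4": 0.05}
r = stage_reward(probs, "s2")  # high reward: utterance matches the target stage
```

An utterance classified away from the target stage (e.g., s4 here) would receive a correspondingly small reward, pushing the policy back toward the intended stage.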
Gradients and objectives. The policy gradient is given by:

\nabla_{\Theta} L_{RL} \approx -\, r \, \nabla_{\Theta} \log p_{\Theta}(y^s),  (5)

where r is the stage-classifier reward, y^s is sampled from the distribution of model outputs at every decoding time step, and \Theta denotes the model parameters.
The overall objective combines the T5 model loss in Eq. (2) and the policy gradient of the reward in Eq. (5). We tested multiple candidate ratios between the two terms and found that 1:0.3 is the best ratio between the T5 loss and the policy gradient. Fig. 3 summarizes the procedure for estimating the loss with the DRL policy.

Output filtering. To ensure the generation of consistent and logically smooth (i.e., human-like) conversations, each chatbot produces five candidate utterances at every turn. We choose the best utterance based on each candidate's connectivity score and similarity score with respect to the previous utterance. The connectivity scores are computed with the pre-trained BERT next-sentence-prediction function and ensure the consistency of a response with the previous context. The similarity scores are computed with a Semantic Textual Similarity model [14] and maintain the diversity of generated utterances by preventing duplicate generation. Given the different scales of the connectivity and similarity scores, we find that 1:3 is the best ratio between them for output filtering.
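The output-filtering step can be sketched as a simple candidate-ranking loop. This is our reading of the procedure, not the paper's exact formula: in particular, the similarity score is treated here as a penalty (to discourage near-duplicates of the previous utterance), which is an assumption, and the score values are toy numbers standing in for BERT and STS outputs.

```python
def select_utterance(candidates, connectivity, similarity,
                     w_conn=1.0, w_sim=3.0):
    """Rank candidate utterances. Connectivity (BERT next-sentence score)
    rewards coherence with the previous utterance; similarity to the
    previous utterance is penalized to avoid duplicate-sounding replies.
    The 1:3 weighting follows the ratio reported in the paper; the penalty
    sign is an assumption."""
    best, best_score = None, float("-inf")
    for cand in candidates:
        score = w_conn * connectivity[cand] - w_sim * similarity[cand]
        if score > best_score:
            best, best_score = cand, score
    return best

cands = ["sure, what's up?", "sure, what's up!?", "i like trains"]
conn = {"sure, what's up?": 0.9, "sure, what's up!?": 0.9, "i like trains": 0.2}
sim = {"sure, what's up?": 0.1, "sure, what's up!?": 0.8, "i like trains": 0.05}
best = select_utterance(cands, conn, sim)
```

The first candidate wins: it is as coherent as the second but far less similar to the previous utterance, and far more coherent than the third.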

D. Stage Evolution of the Perpetrator Chatbot.
The perpetrator chatbot not only generates utterances resembling a given stage but also evolves the grooming stage to a higher level after a sufficient number of dialogue turns (e.g., 20). For example, if the chatbot stays at stage s1 for 20 rounds, comprising 10 perpetrator responses and 10 victim responses, the perpetrator moves forward to stage s2. Each stage starts with a specific trigger sentence (see the third column of Table I) to direct the conversation toward that stage's topic. If the perpetrator is too aggressive, the victim may become aware of the malicious intent and terminate the conversation, causing the cybergrooming attack to fail. Otherwise, the victim can continue the conversation even when the perpetrator is not making good grooming progress.
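The stage-evolution logic amounts to a small counter-based controller; a sketch under the assumptions above (trigger sentences taken verbatim from Table I, round threshold configurable):

```python
TRIGGERS = {  # trigger sentence opening each stage (from Table I)
    "s1": "hi , how are you doing today?",
    "s2": "you parents know you be chatting with me?",
    "s3": "how many pictures you have, any sexy?",
    "s4": "what will we do if you meet me?",
}
STAGES = ["s1", "s2", "s3", "s4"]

class StageController:
    """Advance the perpetrator to the next grooming stage after a fixed
    number of dialogue rounds (20 utterances = 10 per speaker by default)."""
    def __init__(self, rounds_per_stage=20):
        self.rounds_per_stage = rounds_per_stage
        self.stage_idx = 0
        self.rounds = 0

    def record_utterance(self):
        """Count one utterance; return the trigger sentence when a new
        stage begins, otherwise None."""
        self.rounds += 1
        if (self.rounds >= self.rounds_per_stage
                and self.stage_idx < len(STAGES) - 1):
            self.stage_idx += 1
            self.rounds = 0
            return TRIGGERS[STAGES[self.stage_idx]]
        return None

    @property
    def stage(self):
        return STAGES[self.stage_idx]

# Shortened threshold for illustration: advance after 4 utterances.
ctrl = StageController(rounds_per_stage=4)
triggers = [ctrl.record_utterance() for _ in range(4)]
```

After the fourth utterance the controller enters s2 and emits its trigger sentence, which would be injected to open the new stage's conversation.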

IV. EXPERIMENTAL SETUP

A. Datasets & Data Preprocessing
The ConvAI2 dataset [15] is a two-person casual chat dataset with topic labels. We collect 2,000 dialogues with more than 60,000 utterances under the "history" label from ConvAI2. We also downloaded the PJ dataset from the official PJ website [16]; it consists of 100 grooming conversations with more than 100,000 dialogue turns between real perpetrators and professionally trained volunteers acting as potential victims. The PJ dataset is split randomly into training, validation, and testing sets with a ratio of 8:1:1. The other key parameters, their meanings, and their default values used in our SERI framework are summarized in Table II.
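The random 8:1:1 split can be sketched as follows (a generic illustration; the seed and the dialogue identifiers are arbitrary):

```python
import random

def split_dialogues(dialogues, ratios=(8, 1, 1), seed=42):
    """Shuffle dialogues and split them into train/validation/test sets
    following the 8:1:1 ratio used for the PJ dataset."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    items = list(dialogues)
    rng.shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

# 100 PJ conversations -> 80 training, 10 validation, 10 testing
train, val, test = split_dialogues(range(100))
```

Splitting at the dialogue level (rather than the utterance level) keeps all turns of one conversation in the same partition, avoiding leakage between training and evaluation.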

B. Metrics
The performance of the SERI chatbots is evaluated in terms of the quality of the automatic dialogues, using referenced metrics, unreferenced metrics, and human evaluation. Commonly used referenced metrics include BLEU, ROUGE, and BERTScore [17]. BLEU combines a penalty based on the length of the generated sentence with the n-gram precision between the generated sentence and the references, while ROUGE computes the n-gram recall. BERTScore is based on the pre-trained BERT model, computing BERT embeddings and pairwise cosine similarity between generated sentences and references. The unreferenced metrics are perplexity and MaUde scores. The perplexity score indicates the understandability of a sentence, where a lower perplexity represents higher fluency. The MaUde score judges language quality in multiple aspects, such as fluency, reasonableness (i.e., logical flow), and repetition avoidance.
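The n-gram precision behind BLEU and the n-gram recall behind ROUGE can be illustrated at the unigram level (a simplified sketch that omits BLEU's brevity penalty and multi-reference clipping; the sentences are invented):

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_precision(candidate, reference, n=1):
    """Clipped n-gram precision (the core of BLEU): fraction of candidate
    n-grams that also appear in the reference."""
    cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
    overlap = sum(min(c, ref[g]) for g, c in cand.items())
    return overlap / max(sum(cand.values()), 1)

def ngram_recall(candidate, reference, n=1):
    """n-gram recall (the core of ROUGE-N): fraction of reference n-grams
    recovered by the candidate."""
    cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
    overlap = sum(min(ref[g], cand[g]) for g in ref)
    return overlap / max(sum(ref.values()), 1)

cand = "how are you doing today".split()
ref = "how are you today".split()
prec = ngram_precision(cand, ref)  # 4 of 5 candidate unigrams match
rec = ngram_recall(cand, ref)      # all 4 reference unigrams recovered
```

Full BLEU and ROUGE aggregate these counts over higher-order n-grams and, for BLEU, apply a brevity penalty; library implementations are used for the actual evaluation.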
The human evaluation is conducted by two graduate students and one NLP expert. Each participant completes all 200 randomly selected evaluation questions, each containing four previous utterances and two candidate responses (one generated by the SERI chatbot and one from the original PJ dataset). Annotators choose the response that is more human-like in terms of consistency and logical fluency.

V. EXPERIMENTAL RESULTS & ANALYSIS

A. Referenced Metrics-based Analysis
Table III shows the BLEU, ROUGE, and BERTScore results of our chatbots. The perpetrator chatbot has lower BLEU and ROUGE scores than the victim chatbot, reflecting a lower similarity between SERI's dialogues and the ground truth. This is because most online chats are informal, without strict grammar or fluency rules. The higher BERTScore of the perpetrator can be explained by two factors: (1) BERT fails to learn informative contextual representations for many functional and uninformative words, such as "yes," "haha," or "why"; and (2) BERTScore is highly sensitive to certain word pairs and fails to capture meaningful semantics in very short messages.

B. Unreferenced Metrics-based Analysis
The perplexity scores of the ground-truth dialogues and the SERI-generated ones are shown in Table III. The ground-truth dialogues show much higher perplexity than SERI's generated dialogues. Since perplexity measures how easy a text is to understand, the lower perplexity of SERI's dialogues indicates that the original PJ dialogues contain more informal expressions and grammatical or logical errors than SERI's.
Since the MaUde score measures the reasonableness of dialogues, it is notable that SERI's dialogues achieve a slightly higher MaUde score than the original PJ dataset in Table III. This shows that our SERI chatbots can generate dialogues with language quality comparable to the ground-truth dialogues.

C. Human Evaluation Analysis
At least two out of the three annotators preferred the SERI-generated response for 74 of the 200 evaluation questions. This corresponds to a 37% success rate on this Turing-style test, demonstrating SERI's promising role in dialogue generation.

D. Impact of ConvAI2 Fine-tuning and DRL
As shown in Table III (with vs. without ConvAI2 fine-tuning), the model fine-tuned on the ConvAI2 dataset performs better on all five metrics than the model without it. We also observe that the model with DRL overall outperforms the model without DRL (with vs. without DRL in Table III), reaching higher BLEU and ROUGE scores. DRL simulates the stage strategy of a real perpetrator, which leads to a higher similarity between the ground-truth and generated conversations. Although the model with DRL shows lower performance on perplexity (i.e., higher perplexity), it outperforms the model without DRL on the MaUde score. Overall, this shows that DRL introduces the perpetrator's goal-driven conversation without a significant loss in the quality of the generated text.

VI. DISCUSSIONS AND CONCLUSIONS
The main research challenges we encountered lie in the existing metrics for dialogue evaluation. First, existing metrics cannot effectively reflect the logical fluency between an utterance and its history. We observed that if conversations are free of grammar errors, the existing metrics give high scores without considering logical flow. Second, existing metrics cannot measure the performance of domain-specific applications such as our chatbot. That is, no existing metric provides a meaningful measure of the grooming effect of the perpetrator's utterances on the victim's vulnerability to cybergrooming.
Despite these challenges, we made the following key contributions:
• We provide a chatbot framework that simulates a cybergrooming conversation between a perpetrator and a potential victim, serving as a tool for an immersive learning setting that increases cybergrooming awareness.
• We employ DRL to generate authentic, strategic dialogues in which the perpetrator has a clear, ultimate attack goal of offline sexual exploitation.
• SERI is trained on the T5 model via two sequential steps: fine-tuning the perpetrator and victim chatbots on the ConvAI2 and PJ datasets.
• The perpetrator chatbot is supplied with dialogues and rewards based on the corresponding grooming stage to guide the dialogue generation.
• Our results demonstrate the high quality of dialogues generated by SERI. Remarkably, in the human evaluation, SERI's responses were preferred in approximately 37% of comparisons against the original dialogues in the PJ dataset.

As for limitations, we found that the domain-specific PJ dataset has informal styles and poor readability, which complicated training. Although we cleaned some raw PJ data, including informal expressions and perverted language, through text preprocessing, it is not trivial to improve the quality of SERI's dialogue generation further. Informal expressions and poor use of language may in fact be key features of natural conversations with perpetrators, whereas conventional metrics are mainly based on formal, dictionary language and therefore often cannot capture the quality of human-like language.
For future research, we plan to: (1) develop new metrics that capture the naturalness of humans' informal language and expressions; (2) investigate how game theory can optimize the current sequence-to-sequence model to introduce a perpetrator's strategic conversations; (3) explore the stage-related aspects of the problem, such as when the perpetrator should initiate a new stage; and (4) test users' acceptance of the final platform with a larger sample of participants, using established instruments for measuring the acceptance of technological systems.

VII. ETHICAL STATEMENT
We aim to develop the SERI chatbot framework to learn to generate simulated conversations between cybergroomers and potential victims, especially children and teenagers. SERI will later be integrated into a cybergrooming prevention program to improve the sensitivity and awareness of potential victims of online grooming. As there are no proactive programs available to prevent cybergrooming, even though it is a serious societal concern, this work contributes significantly to educating and protecting youths from online sexual grooming. While recognizing the remarkable benefits and contributions of SERI, we also acknowledge the potential risks and concerns it might introduce. Here we discuss several regulations and strategies to ensure that SERI will be properly and ethically used by educators, parents, and children:
• While there are few human participants in this work, we will commit to an Institutional Review Board (IRB) review before deploying the chatbots in training programs.
• Given the potential concerns, we will not release the programs and models of SERI to the public. Instead, we will restrict access to qualified parties (e.g., educational institutes, research labs, certified parents) for research and education purposes upon request. One may argue that highly intelligent perpetrators could rebuild our proposed chatbot based on this paper. However, without the exact code and implementation details of the developed models, it is highly unlikely that perpetrators could reproduce this chatbot exactly, and human perpetrators are likely to remain more capable than the human-like chatbots. Hence, the proposed chatbot will be used to increase youths' awareness of cybergrooming as a proactive defense and to prevent the potential harm of perverted language used in cybergrooming conversations.
• A potential ethical concern of SERI lies in the inappropriate and sexual language generated by the chatbots. To address this problem, we will leverage available resources,
such as profane lexicons, and design computational approaches to automatically detect obscene words in real time and replace them with moderate ones, avoiding any potential bad influence on users, especially children and teenagers.
• We will also design various monitoring mechanisms and strategies to ensure the safety of SERI and prevent potential ethical concerns or risks while delivering it as an education program. For example, we will design automatic approaches to track the conversations between SERI and its users, detect potential grooming activities, and provide alerts whenever grooming is about to happen. We will also follow the regulations and standards stated in legal systems, such as the General Data Protection Regulation (GDPR), and properly use and store conversational data.

Fig. 2. Two sample training units for the perpetrator and pseudo-user (i.e., potential victim) chatbots.

TABLE I
CYBERGROOMING STAGES & TRIGGER SENTENCE OF EACH STAGE

Stage | Conversation Content | Trigger Sentence of Each Stage
s1 | Greetings, casual talk to initiate a trust relationship | "hi , how are you doing today?"
s2 | Private information collection: identity (name, age, gender); social relationships (family, school, location); interests and schedule | "you parents know you be chatting with me?"
s3 | Sexual questions or conversations; sending/requesting sexual pictures or videos | "how many pictures you have, any sexy?"
s4 | Attempts at in-person contact; requesting an online or in-person meeting | "what will we do if you meet me?"

TABLE II
PARAMETERS AND THEIR DEFAULT VALUES USED IN THE SERI FRAMEWORK