The project that launched E-2 Unlimited Technologies, LLC, began in 2007 when an undergraduate student proposed developing automated techniques for identifying online predatory content. The project soon expanded to incorporate the detection of cyberbullying, self-harm, relationship violence, and other harmful material in online text. More than 65 students, many of them students of color and women in computer science, have participated in the project, and many have co-authored our research papers. The project has received multiple rounds of funding from the National Science Foundation. Learn more at http://www.chatcoder.com.
For the past decade, we have applied machine learning technologies to detect cyberbullying and Internet sexual predation (Bigelow et al., 2016; Kontostathis et al., 2012; Reynolds et al., 2011). This work was informed, in part, by the application of Olson’s luring theory for child predation (Olson et al., 2007) to the online domain (Kontostathis et al., 2009). As an example of the effectiveness of our algorithms, the PAN2012 Sexual Predator Identification competition contained two subtasks: the identification of anonymized user IDs that are “owned” by Internet sexual predators, and the identification of sexually explicit or predatory posts. Our fully automated system detected up to 87% of the predatory authors and 50% of the predatory lines in the test set (Kontostathis et al., 2012). This level of performance earned tenth place in the author detection task and second place in the predatory line detection task.
We have since applied similar techniques to cyberbullying detection. This task is considerably more complex, and even fewer resources are available for evaluating algorithm effectiveness. For this task, we took a more experimental approach and achieved precision as high as 92% at rank 100 (Kontostathis et al., 2013; Bigelow et al., 2016). More recently, we have worked to reduce our reliance on a hand-crafted dictionary for these detection tasks and have developed a semi-supervised machine learning approach that builds a dictionary automatically. This tool is very lightweight in terms of memory and runtime requirements. It also converges quickly, reaching a high degree of accuracy after only three iterations of input to the learner.
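For readers unfamiliar with the metric, "precision at rank 100" is the fraction of the 100 highest-scored items that are truly positive. The following is a minimal sketch of that calculation, not the evaluation code used in the cited papers; the function name `precision_at_k` and the sample scores and labels are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float], labels: Sequence[bool], k: int) -> float:
    """Return the fraction of the k highest-scored items whose label is True.

    Hypothetical helper for illustration: `scores` are classifier confidence
    values and `labels` are the corresponding ground-truth judgments.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    # Rank items from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    # Count true positives among the top k; divide by k even if fewer
    # than k items exist, which is the conventional definition.
    hits = sum(1 for _, is_positive in ranked[:k] if is_positive)
    return hits / k


# Made-up example: five posts scored by a detector, three truly harmful.
example_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
example_labels = [True, False, True, True, False]
example_result = precision_at_k(example_scores, example_labels, k=3)
```

In this made-up example, two of the three highest-scored posts are true positives, so precision at rank 3 is 2/3. A result of 92% at rank 100 would mean that 92 of the top 100 ranked posts were judged to be true instances.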
One obstacle to detecting harmful behavior on social media is the lack of high-quality data for studying these problems. The existing datasets are often dated, and while in many cases there is a reasonable presumption that the communication is between youth, in others the available data is known to include adults posing as youth (McGhee et al., 2011). To provide a more robust corpus for research on these topics, we completed a multi-year study of the prevalence of cyberbullying among youth. The first phase of our study collected data about the relationship between self-disclosure and teen cyberbullying behavior. The second phase consisted of multiple focus group discussions to determine impressions and attitudes about cyberbullying and self-disclosure from particular subgroups of teens. Finally, in the third phase, we deployed cell phones to youth ages 10 to 14 and collected all textual data on the phones for a full year, amounting to more than 800,000 messages.
E-2 Unlimited Technologies, LLC, was founded in 2019 to commercialize this research in support of our mission to Safeguard Kids and Maintain Privacy (SKAMP).
April Edwards (formerly Kontostathis) earned a Ph.D. in computer science from Lehigh University and an MA in mathematics from Duke University. Dr. Edwards currently holds a position as a Professor of Cyber Science at the US Naval Academy. Dr. Edwards is an expert in machine learning technologies and text analysis. She has co-authored numerous articles describing methods for determining the most critical values in the reduced dimensionality matrix and improving Latent Semantic Indexing (LSI) efficiency and effectiveness. She has authored many articles on machine learning techniques to detect cyber predation and cyberbullying. Dr. Edwards began her academic career as a professor in 2003 and served as department chair, associate dean, Interim Vice President for Academic Affairs, and Interim Dean of the College at Ursinus College in Collegeville, PA. She also served as the Vice President for Academic Affairs and Dean of the Faculty at Elmhurst College in Elmhurst, IL. Dr. Edwards has served as PI or co-PI for grants of over $1.2 million from the National Science Foundation.
Before academia, April worked for 13 years for Electronic Data Systems (EDS), where she specialized in project management, database design, and administration.
Lynne Edwards is a Professor of Media and Communication Studies at Ursinus College in Collegeville, PA, and a Distinguished Fellow at the Annenberg School of Public Policy at the University of Pennsylvania. Dr. Edwards is the author of several publications, including “Victims, Villains, and Vixens: Teen girls and Internet Crime” in Girl Wide Web: Girls, the Internet, and the Negotiation of Identity, edited by Sharon Mazzarella (Peter Lang Publishing, 2005); “Black Like Me: Value Commitment and Television Viewing Preferences of U.S. Black Teenage Girls” in Black Marks: Minority Ethnic Audiences and Media (Ashgate, 2001); “Choices and Voices: Deciding Between Qualitative and Quantitative Methodologies” in Communication Impact: Designing Research That Matters, edited by Susanna Hornig Priest (Lanham: Rowman & Littlefield, 2005); and “Slaying in Black and White: Kendra as Tragic Mulatta in Buffy the Vampire Slayer” in Fighting the Forces: Essays on the Meaning of Buffy the Vampire Slayer (Rowman & Littlefield, 2002). Dr. Edwards also co-edited Buffy Goes Dark: Essays on the Final Two Seasons of Buffy the Vampire Slayer on Television (McFarland Publishers, 2008).
Bigelow, J., Edwards, A., Edwards, L. 2016. Detecting Cyberbullying using Latent Semantic Indexing. In Proceedings of Cybersafety 2016. October 2016. Indianapolis, IN.
Edwards, A., Demoll, D., Edwards, L. 2020. Detecting Cyberbullying Activity Across Platforms. In S. Latifi (ed.), 17th International Conference on Information Technology–New Generations (ITNG 2020), Advances in Intelligent Systems and Computing 1134. https://doi.org/10.1007/978-3-030-43020-7_7
Kontostathis, A., Edwards, L., Leatherman, A. 2009. ChatCoder: Toward the Tracking and Categorization of Internet Predators. In Proc. Text Mining Workshop 2009 held in conjunction with the Ninth SIAM International Conference on Data Mining (SDM 2009). Sparks, NV. May 2009.
Kontostathis, A., Reynolds, K., Garron, A., Edwards, L. 2013. Detecting Cyberbullying: Query Terms and Techniques. In Proceedings of the 5th ACM Web Science Conference (WEBSCI2013). Paris, France. May 2013.
Kontostathis, A., West, W., Garron, A., Reynolds, K., Edwards, L. 2012. Identifying Predators Using ChatCoder 2.0 - Notebook for PAN at CLEF 2012. In Proceedings of CLEF 2012. Rome, Italy. September 2012.
McGhee, I., Bayzick, J., Kontostathis, A., Edwards, L., McBride, A., Jakubowski, E. 2011. Learning to Identify Internet Sexual Predation. International Journal of Electronic Commerce. Volume 15, Number 3. Spring 2011.
Olson, L., Daggs, J., Ellevold, B., Rogers, T. 2007. Entrapping the Innocent: Toward a Theory of Child Sexual Predators’ Luring Communication. Communication Theory, 17(3):231–251.
Reynolds, K., Kontostathis, A., Edwards, L. 2011. Using Machine Learning to Detect Cyberbullying. In Proceedings of the 2011 10th International Conference on Machine Learning and Applications Workshops (ICMLA 2011). December 2011. Honolulu, HI.