Reading list on online hate detection
- Characterizing and Detecting Hateful Users on Twitter
- The Risk of Racial Bias in Hate Speech Detection
- Automated Hate Speech Detection and the Problem of Offensive Language
- Detecting the Hate Code on Social Media
- Hate Lingo: A Target-Based Linguistic Analysis of Hate Speech in Social Media
- Learning to Decipher Hate Symbols
- Analyzing the Targets of Hate in Online Social Media
- Peer to Peer Hate: Hate Speech Instigators and Their Targets
- Spread of Hate Speech in Online Social Media
- Mobilizing the Trump Train: Understanding Collective Action in a Political Trolling Community
- A Benchmark Dataset for Learning to Intervene in Online Hate Speech
- Stereotypical Bias Removal for Hate Speech Detection Task using Knowledge-based Generalizations
- TextBugger: Generating Adversarial Text Against Real-world Applications
- Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency
- Generating Natural Adversarial Examples
- Combating Adversarial Misspellings with Robust Word Recognition
- Generating Natural Language Adversarial Examples
- Natural Language Adversarial Attacks and Defenses in Word Level
- Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey
- Analysis Methods in Neural Language Processing: A Survey
- Adversarial NAACL 2019
- Natural Adversarial Examples
- FreeLB: Enhanced Adversarial Training for Language Understanding
- Adversarial Training for Free!
- You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle
- Reading Thieves’ Cant: Automatically Identifying and Understanding Dark Jargons from Cybercrime Marketplaces
- Generating Text via Adversarial Training
- GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution
- SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient
- Defending Neural Backdoors via Generative Distribution Modeling
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Adversarial Examples for Evaluating Reading Comprehension Systems
- A Survey: Towards a Robust Deep Neural Network in Text Domain
- TextFool: Fool your Model with Natural Adversarial Text
- Universal Adversarial Perturbation for Text Classification
- Towards Robust Toxic Content Classification
- Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment
- Deep Text Classification Can be Fooled
- Adversarial Attacks and Defenses in Images, Graphs and Text: A Review
- Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples
- Universal Adversarial Triggers for Attacking and Analyzing NLP
- AdvCodec: Towards a Unified Framework for Adversarial Text Generation
- Detecting Egregious Responses in Neural Sequence-to-Sequence Models
- Say What I Want: Towards the Dark Side of Neural Dialogue Models
- Finding Social Media Trolls: Dynamic Keyword Selection Methods for Rapidly-Evolving Online Debates
- Must-read Papers on Textual Adversarial Attack and Defense (TAAD)
- On the Robustness of Self-Attentive Models
- How the Embedding Layers in BERT Were Implemented
- TextShield: Robust Text Classification Based on Multimodal Embedding and Neural Machine Translation
- Hidden resilience and adaptive dynamics of the global online hate ecology
- Defending Against Neural Fake News
- MIT CSAIL’s TextFooler generates adversarial text to strengthen natural language models
- CNN-generated images are surprisingly easy to spot... for now
- Multi-Modal Sarcasm Detection in Twitter with Hierarchical Fusion Model
- Towards Multimodal Sarcasm Detection (An Obviously Perfect Paper)
- Exploring Hate Speech Detection in Multimodal Publications
- Exploring Deep Multimodal Fusion of Text and Photo for Hate Speech Classification
- Generating Counter Narratives against Online Hate Speech: Data and Strategies
- Attend and Attack: Attention Guided Adversarial Attacks on Visual Question Answering Models
- Attacking Visual Language Grounding with Adversarial Examples: A Case Study on Neural Image Captioning
- Fooling Vision and Language Models Despite Localization and Attention Mechanism
- Adversarial Training and Robustness for Multiple Perturbations
- Awesome Multimodal ML
- A Survey of Black-Box Adversarial Attacks on Computer Vision Models
- Adversarial Examples: Attacks and Defenses for Deep Learning
- On Evaluating Adversarial Robustness
- Adversarial Examples Are Not Bugs, They Are Features
- DeepSec: A Uniform Platform for Security Analysis of Deep Learning Model
- On Adaptive Attacks to Adversarial Example Defenses
- Benchmarking Adversarial Robustness
- Towards a Robust Deep Neural Network in Texts: A Survey
- Textual Adversarial Attack and Defense Reading List
- Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
- What are you talking about? Text-to-Image Coreference
- Tutorial on Multimodal Machine Learning
- Exact Adversarial Attack to Image Captioning via Structured Output Learning with Latent Variables
- Spam Review Detection with Graph Convolutional Networks
- Exploring Visual Relationship for Image Captioning
- Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling
- Relational Inductive Biases, Deep Learning, and Graph Networks
- Deep Anomaly Detection on Attributed Networks
- What Do We Understand About Convolutional Networks?
- Order Matters: Semantic-Aware Neural Networks for Binary Code Similarity Detection
- Living the Meme: AI as Funny as Humans for Generating Image Captions
- Memeify: A Large-Scale Meme Generation System
- SimMeme: A Search Engine for Internet Memes
- On the Origins of Memes by Means of Fringe Web Communities
- NeuronInspect: Detecting Backdoors in Neural Networks via Output Explanations
- PI-Bully: Personalized Cyberbullying Detection with Peer Influence
- Fairness in Deep Learning: A Computational Perspective
- “I am uncomfortable sharing what I can’t see”: Privacy Concerns of the Visually Impaired with Camera Based Assistive Applications
- Techniques for Interpretable Machine Learning