Online survey research, a fundamental method for data collection in many scientific studies, is facing an existential threat because of large language models, according to new research published in the Proceedings of the National Academy of Sciences (PNAS). The author of the paper, associate professor of government at Dartmouth and director of the Polarization Research Lab Sean Westwood, created an AI tool he calls "an autonomous synthetic respondent,” which can answer survey questions and “demonstrated a near-flawless ability to bypass the full range” of “state-of-the-art” methods for detecting bots.
According to the paper, the AI agent evaded detection 99.8 percent of the time.
"We can no longer trust that survey responses are coming from real people," Westwood said in a press release. "With survey data tainted by bots, AI can poison the entire knowledge ecosystem.”
Survey research relies on attention check questions (ACQs), behavioral flags, and response pattern analysis to detect inattentive humans or automated bots. Westwood said these methods are now obsolete after his AI agent bypassed the full range of standard ACQs and other detection methods outlined in prominent papers, including one paper designed to detect AI responses. The AI agent also successfully avoided “reverse shibboleth” questions designed to detect nonhuman actors by presenting tasks that an LLM could complete easily, but are nearly impossible for a human.
“Once the reasoning engine decides on a response, the first layer executes the action with a focus on human mimicry,” the paper, titled “The potential existential threat of large language models to online survey research,” says. “To evade automated detection, it simulates realistic reading times calibrated to the persona’s education level, generates human-like mouse movements, and types open-ended responses keystroke by-keystroke, complete with plausible typos and corrections. The system is also designed to accommodate tools for bypassing antibot measures like reCAPTCHA, a common barrier for automated systems.”
The AI, according to the paper, is able to model “a coherent demographic persona,” meaning that in theory someone could sway any online research survey to produce any result they want based on an AI-generated demographic. And it would not take that many fake answers to impact survey results. As the press release for the paper notes, for the seven major national polls before the 2024 election, adding as few as 10 to 52 fake AI responses would have flipped the predicted outcome. Generating these responses would also be incredibly cheap at five cents each. According to the paper, human respondents typically earn $1.50 for completing a survey.
Westwood’s AI agent is a model-agnostic program built in Python, meaning it can be deployed with APIs from big AI companies like OpenAI, Anthropic, or Google, but can also be hosted locally with open-weight models like LLama. The paper used OpenAI’s o4-mini in its testing, but some tasks were also completed with DeepSeek R1, Mistral Large, Claude 3.7 Sonnet, Grok3, Gemini 2.5 Preview, and others, to prove the method works with various LLMs. The agent is given one prompt of about 500 words which tells it what kind of persona to emulate and to answer questions like a human.
The paper says that there are several ways researchers can deal with the threat of AI agents corrupting survey data, but they come with trade-offs. For example, researchers could do more identity validation on survey participants, but this raises privacy concerns. Meanwhile, the paper says, researchers should be more transparent about how they collect survey data and consider more controlled methods for recruiting participants, like address-based sampling or voter files.
“Ensuring the continued validity of polling and social science research will require exploring and innovating research designs that are resilient to the challenges of an era defined by rapidly evolving artificial intelligence,” the paper said.