Interesting contribution to the discussion by Ed Felten.
My concern today is whether Flu Trends can be manipulated. The system makes inferences from how people search, but people can change their search behavior. What if a person or a small group set out to convince Flu Trends that there was a flu outbreak this week?
An obvious approach would be for the conspirators to do lots of searches for likely flu-related terms, to inflate the count of flu-related searches. If all the searches came from a few computers, Flu Trends could presumably detect the anomalous pattern and the algorithm could reduce the influence of these few computers. Perhaps this is already being done; but I don’t think the research paper mentions it.
A more effective approach to spoofing Flu Trends would be to use a botnet — a large collection of hijacked computers — to send flu-related searches to Google from a larger number of computers. If the added searches were diffuse and well-randomized, they would be very hard to distinguish from legitimate searches, and the Flu Trends would probably be fooled.
This possibility is not discussed in the Flu Trends research paper. The paper conspicuously fails to identify any of the search terms that the system is looking for. Normally a paper would list the terms, or at least give examples, but none of the terms appear in the paper, and the Flu Trends web site gives only “flu” as an example search term. They might be withholding the search terms to make manipulation harder, but more likely they’re withholding the search terms for business reasons, perhaps because the terms have value in placing or selling ads.