In 2004, when color printers were still somewhat novel, PCWorld magazine published an article headlined: “Government Uses Color Laser Printer Technology to Track Documents.”
It was one of the first news reports on a quiet practice that had been going on for 20 years. It revealed that color printers embed coded patterns in the documents they print, and that these patterns contain the printer’s serial number and the date and time of printing. The patterns are made up of dots less than a millimeter in diameter, in a shade of yellow that cannot be detected by the naked eye against a white background.
The existence of the hidden dots gained renewed interest this week when they were found embedded in a top-secret report by the US National Security Agency (NSA) that was published by The Intercept on June 5. About an hour after the report was published, the Department of Justice (DOJ) announced that the Federal Bureau of Investigation (FBI) had arrested a suspected leaker. The 25-year-old NSA contractor, Reality Leigh Winner, was charged “with removing classified material from a government facility and mailing it to a news outlet.”
In an affidavit released by the DOJ, an FBI agent described how Winner had been tracked down. The scanned copy of the document, which The Intercept had given to the government to confirm its authenticity, “appeared to be folded and/or creased,” the agent wrote, “suggesting they had been printed and hand-carried out of a secured space.”
When researchers later discovered the tracking dots embedded in the document, many quickly assumed that the NSA had used them to find Winner. However, according to the affidavit, an internal government audit found that only six people had printed out the classified documents. Winner was one of those six people, and the audit found that she had also sent an email to the news outlet from her work computer. An analysis of the dots was therefore probably not necessary to track down Winner, despite several misleading news reports that suggested otherwise. But their presence has nonetheless revived long-standing privacy concerns.
By analyzing the dots in the top-secret document, researchers were able to conclude it came from a printer with a serial number of 29535218, model number 54, and that it was printed on May 9, 2017, at 6:20 a.m., at least according to the printer’s internal clock. In a case where a leaker had covered his or her tracks more carefully, or where the leaked documents had been printed by far more than six people, or perhaps printed on a non-government printer, the dots certainly could have come into play.
In addition to the yellow-dot technology, Xerox implemented another feature around the same time that forced color copiers to shut down if they detected steganographic markings indicating that the document being copied was currency. In 1994, the US Central Intelligence Agency approached Xerox about using the same technology to stop the unauthorized copying of classified documents, and Crean provided some ideas in a brainstorming session with two agents that year, he said. He wasn’t aware of whether the agency used any of his ideas, but the functionality to detect currency, he said, “was in most of the machines at least through the mid-2000s.”
“The possible misuses of this marking technology are frightening,” wrote the Electronic Frontier Foundation (EFF) in a blog post responding to the article. “Individuals using printers to create political pamphlets, organize legal protest activities, or even discuss private medical conditions or sensitive personal topics can be identified by the government with no legal process, no judicial oversight, and no notice to the person spied upon.”
Eventually, a volunteer working with the EFF noticed that the dots represented a binary code, Schoen said. That insight allowed the group to crack the logic behind the dots and to read the information embedded in any document that used yellow-dot steganography. The organization published the results of its work, along with an interactive tool to decode the dots.
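To give a rough sense of what reading dots as a binary code can look like, here is a minimal Python sketch. It is not the EFF's tool, and the layout is an assumption made purely for illustration: each column is taken to hold eight positions (dot = 1, blank = 0), with the first position acting as an odd-parity check and the remaining seven forming a binary number, and the five columns are taken to carry minute, hour, day, month and two-digit year. The names `decode_column`, `decode_timestamp` and `EXAMPLE_GRID` are hypothetical. The sample grid encodes the printing time reported for the leaked document.

```python
from typing import List

def decode_column(column: List[int]) -> int:
    """Return the 7-bit value of one column of dots.

    Assumed (hypothetical) layout: column[0] is a parity dot and
    column[1:] are value bits, most significant first. A valid column
    has an odd total number of dots.
    """
    if len(column) != 8 or any(bit not in (0, 1) for bit in column):
        raise ValueError("expected 8 entries, each 0 or 1")
    if sum(column) % 2 != 1:
        raise ValueError("parity check failed")
    value = 0
    for bit in column[1:]:
        value = (value << 1) | bit  # shift in one bit at a time
    return value

def decode_timestamp(grid: List[List[int]]) -> str:
    """Decode five columns, assumed to be minute, hour, day, month, year."""
    minute, hour, day, month, year = (decode_column(c) for c in grid)
    # Assumption: the two-digit year belongs to the 2000s.
    return f"20{year:02d}-{month:02d}-{day:02d} {hour:02d}:{minute:02d}"

# Illustrative grid for May 9, 2017, 6:20 a.m. (parity dot first).
EXAMPLE_GRID = [
    [1, 0, 0, 1, 0, 1, 0, 0],  # minute = 20
    [1, 0, 0, 0, 0, 1, 1, 0],  # hour   = 6
    [1, 0, 0, 0, 1, 0, 0, 1],  # day    = 9
    [1, 0, 0, 0, 0, 1, 0, 1],  # month  = 5
    [1, 0, 0, 1, 0, 0, 0, 1],  # year   = 17
]

if __name__ == "__main__":
    print(decode_timestamp(EXAMPLE_GRID))
```

The parity position is what lets a decoder notice a misread column: if a single dot is missed or a stray speck is counted, the total is no longer odd and the column is rejected rather than silently decoded to the wrong value.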
“Other companies came up with other variants of that scheme that were more complicated, harder to decode. Canon kind of twist theirs around in a spiral,” Crean said, “but everybody was basically putting a small digital set of bits smack dab all over the print.”
“What we’ve learned is that there is a second generation of the technology that some of the manufacturers have switched over to,” Schoen said. “We’ve never cracked that or even had a way to detect it.”