What is wkhtmltopdf and why does its presence in PDF metadata indicate phishing?

wkhtmltopdf is an open-source command-line tool that converts HTML pages into PDF documents. It has legitimate uses, but established consumer brands rarely use it to generate customer-facing documents; they tend to use commercial PDF libraries or cloud rendering services. Phishing kit operators use wkhtmltopdf to convert HTML lure pages (fake invoices, renewal notices, callback prompts) into static PDFs. The rendered PDF carries no JavaScript or form fields, and a callback lure that relies on a phone number rather than a hyperlink leaves no URLs to extract, so the file appears clean to gateway scanners that look for active content. The Creator field in the PDF's document information dictionary records which tool produced the file, which makes wkhtmltopdf a useful forensic indicator of kit-rendered phishing documents when the sender and the claimed brand do not fit together.

What is a TOAD phishing attack?

TOAD stands for Telephone-Oriented Attack Delivery. The attacker sends an email (often with a PDF attachment) that contains no malicious links or payloads detectable by gateway scanners. Instead, the document instructs the recipient to call a phone number to resolve a billing issue, cancel an account, or dispute a charge. When the victim calls, they reach an attacker posing as customer support who then walks them through steps that compromise credentials, install remote-access tools, or authorize fraudulent payments.

Why does a subject-to-attachment mismatch matter for detection?

Phishing kits often recycle template components across campaigns. An operator running a shipping-lure campaign may attach a Geek Squad renewal PDF without updating the document's internal metadata or title. Gateways that scan only email body content or link reputation miss this inconsistency entirely. Detection systems that cross-reference the email subject, body theme, and attachment metadata can flag the mismatch as an anomaly even when all individual components appear technically clean.

How did the Gmail API dispatch pattern contribute to this attack?

The Received header showed the message was sent via HTTPREST through gmailapi.google.com, indicating the attacker used the Gmail API rather than a standard SMTP client. Gmail API dispatch passes all Gmail authentication checks (SPF, DKIM, DMARC, and ARC all pass) because the message genuinely originates from Google's infrastructure. This means the sending reputation is Google's, not the attacker's throwaway account, making reputation-based filtering ineffective.


The PDF Creator Field Knew What the Subject Line Didn't: wkhtmltopdf and the Geek Squad Shipping Mismatch

Audian Paxson July 12, 2025

TL;DR A threat actor sent a shipping-lure email from a Gmail address via the Gmail API, attaching a PDF whose Creator metadata field read 'wkhtmltopdf 0.12.6' and whose internal document Title read 'Geek Squad Subscription Renewal', directly contradicting the package-tracking subject. The body carried only an opaque alphanumeric token: no URLs and no phone number visible to scanners. The wkhtmltopdf tool is a known phishing-kit component used to render HTML lure pages into static PDFs without active content, so gateway scanners return a clean verdict while the social-engineering payload remains inside the document.

Severity: Medium
Tags: Callback-Phishing, Social-Engineering, Impersonation
MITRE: T1566.001, T1204.002, T1656

Most phishing PDFs try to hide their malicious content. This one hid its identity, and the metadata gave it away before anyone opened the file.

A recipient at a regional services organization received an email from briliansupriyadiwill[@]gmail[.]com with the subject "Package Scanned A507JYE85A8N3_SFFC4~Z7EACLM4FBM1.2CRG." The body contained only that same opaque alphanumeric string: no carrier name, no tracking URL, no CTA, no context. One PDF attachment accompanied it. Gateway verdict on the PDF: clean. No JavaScript, no AcroForm fields, no embedded files, no extractable URLs.

The story was in the metadata the scanner didn't surface.

What the PDF Creator Field Reveals That Body Scanning Misses

PDF documents store a document information dictionary, referenced from the file trailer, that records, among other fields, the Creator (the application that generated the content) and the Title (the document's internal name). Static analysis of the attachment showed:

Creator: wkhtmltopdf 0.12.6
Title: Geek Squad Subscription Renewal
Subject line: Package Scanned (shipping lure)

Those three data points form the core of this attack's forensic tell. The subject frames the message as a shipping notification. The PDF title reveals the actual lure: a Geek Squad subscription renewal claim. And the Creator field identifies exactly how the document was produced.
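To make the metadata check concrete, here is a minimal sketch of pulling Creator, Producer, and Title out of raw PDF bytes using only the Python standard library. It assumes the values are stored as plain, unescaped literal strings in an uncompressed object; real files often use UTF-16 or hex-encoded strings, or keep the dictionary inside a compressed object stream, so a full PDF parser is the safer choice in production. The function name and the sample bytes are illustrative, not taken from the analyzed attachment.

```python
import re

# Matches simple literal-string entries such as "/Creator (wkhtmltopdf 0.12.6)".
# Assumption: values are unescaped literal strings in an uncompressed object.
_INFO_FIELD = re.compile(rb"/(Creator|Producer|Title)\s*\(([^()\\]*)\)")


def extract_info_fields(pdf_bytes: bytes) -> dict[str, str]:
    """Return the first Creator, Producer and Title values found in raw PDF bytes."""
    fields: dict[str, str] = {}
    for match in _INFO_FIELD.finditer(pdf_bytes):
        name = match.group(1).decode("ascii")
        # Keep only the first occurrence of each field name.
        fields.setdefault(name, match.group(2).decode("latin-1"))
    return fields


# Hypothetical fragment shaped like the metadata described in this post.
sample = (
    b"1 0 obj\n<< /Title (Geek Squad Subscription Renewal)\n"
    b"/Creator (wkhtmltopdf 0.12.6)\n/Producer (Qt 4.8.7) >>\nendobj\n"
)
info = extract_info_fields(sample)
print(info)
```

The point of the sketch is that these fields are cheap to read and can be surfaced next to the attachment verdict without rendering the document.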

wkhtmltopdf is a command-line HTML-to-PDF renderer used extensively in callback phishing kits. Its appeal to phishing operators is straightforward: convert a fully designed HTML lure page (complete with branding, urgency language, and a phone number) into a static PDF. The rendered output carries no JavaScript and no forms, and a lure that relies on a phone number rather than a hyperlink leaves no clickable links either. Every gateway check that looks for exploitable mechanisms in PDF attachments returns clean. The social-engineering payload (the callback number and the scripted urgency) survives in the visual layer of the rendered document, invisible to scanners that inspect document structure without reading the rendered content through text extraction, optical character recognition, or visual analysis.
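The kind of structural check that returns "clean" on such a file can be sketched as a token count over the raw bytes. This is a deliberately naive illustration, assuming uncompressed objects and an illustrative token list; it is not how any particular gateway works, and the sample byte strings are hypothetical.

```python
# PDF name tokens commonly associated with active or linked content.
# Assumption: the list is illustrative, not exhaustive.
ACTIVE_CONTENT_TOKENS = (
    b"/JavaScript",
    b"/JS",
    b"/AcroForm",
    b"/EmbeddedFile",
    b"/URI",
    b"/Launch",
    b"/OpenAction",
)


def active_content_hits(pdf_bytes: bytes) -> dict[str, int]:
    """Count occurrences of each active-content token in the raw bytes."""
    return {tok.decode("ascii"): pdf_bytes.count(tok) for tok in ACTIVE_CONTENT_TOKENS}


def looks_static(pdf_bytes: bytes) -> bool:
    """True when none of the tokens appear, i.e. the structural verdict is 'clean'."""
    return not any(active_content_hits(pdf_bytes).values())


# Hypothetical fragments: a plain page object and a link annotation action.
static_fragment = b"<< /Type /Page /Contents 4 0 R >>"
link_fragment = b"<< /Type /Action /S /URI /URI (https://example.invalid) >>"

print(looks_static(static_fragment), looks_static(link_fragment))
```

A file that passes this check is structurally inert, which says nothing about what its rendered text asks the reader to do.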

MITRE ATT&CK T1566.001 covers spearphishing via attachment. T1204.002 (user execution: malicious file) applies when the victim opens the PDF and reads the callback instructions. T1656 (impersonation) covers the Geek Squad identity claim embedded in the document.

The Shipping Subject as Misdirection

The subject line and body token serve a specific purpose: they get the email past content filters without triggering brand-impersonation detection for Geek Squad. A Geek Squad-themed subject would match known phishing patterns in many gateway configurations. A shipping token does not.

The mismatch is not accidental. Phishing kit operators running volume campaigns frequently reuse template components, a shipping email wrapper here and a Geek Squad renewal PDF there, without synchronizing them. The result is an internal contradiction that tells the analyst what the gateway score cannot: these pieces come from different templates assembled for evasion, not for coherence.

The social engineering mechanism depends on urgency and authority. Geek Squad renewal phishing typically claims an auto-renewing subscription charge and instructs the recipient to call immediately to cancel. The callback phone number connects to an attacker posing as support, who then guides the victim toward credential disclosure, remote-access tool installation, or a fraudulent refund transaction.


Gmail API Dispatch and the Authentication Laundering Problem

The Received header showed the message dispatched via gmailapi.google.com with HTTPREST, not a standard SMTP client. This indicates the attacker sent through the Gmail API, an automated-send pathway that fully authenticates through Google's infrastructure.
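Spotting this dispatch path in a raw message can be sketched with the standard library's email parser. The regular expression encodes only the two strings named above (gmailapi.google.com and HTTPREST); the sample message, including its placeholder identifier and date, is hypothetical and not the actual header from this incident.

```python
import re
from email import message_from_string

# Looks for a hop recorded as "by gmailapi.google.com with HTTPREST".
_GMAIL_API_HOP = re.compile(
    r"\bby\s+gmailapi\.google\.com\s+with\s+HTTPREST\b", re.IGNORECASE
)


def sent_via_gmail_api(raw_message: str) -> bool:
    """True when any Received header records a Gmail API (HTTPREST) hop."""
    msg = message_from_string(raw_message)
    for header in msg.get_all("Received", []):
        # Collapse folded header lines into single-spaced text before matching.
        if _GMAIL_API_HOP.search(" ".join(str(header).split())):
            return True
    return False


# Hypothetical message; the identifier and date are placeholders.
raw = (
    "Received: from 000000000000 named unknown by gmailapi.google.com\n"
    " with HTTPREST; Thu, 01 Jan 1970 00:00:00 +0000\n"
    "From: sender@example.invalid\n"
    "Subject: Package Scanned\n"
    "\n"
    "body\n"
)
print(sent_via_gmail_api(raw))
```

On its own this signal is not malicious, since legitimate applications also send through the Gmail API. It is most useful as one input alongside sender history and content analysis.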

SPF passed (Google's outbound IP designated as permitted for gmail.com). DKIM passed (signed by gmail.com). DMARC passed (policy p=NONE, result pass). ARC sealed at i=2 by google.com. Every authentication check returned a positive result because every authentication check was evaluating Google's infrastructure, not the attacker's identity.
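A short sketch shows why a clean authentication result carries little information here. It reads method results out of an Authentication-Results header value and marks them as uninformative when everything passes for a free-mail domain. The header string, the function names, and the free-mail list are assumptions for illustration. The simple regex keeps only the last result per method and ignores comments and properties.

```python
import re

# Captures "method=result" pairs such as "spf=pass" or "dmarc=fail".
_RESULT = re.compile(r"\b(spf|dkim|dmarc|arc)=([a-z]+)", re.IGNORECASE)


def auth_results(header_value: str) -> dict[str, str]:
    """Map each authentication method to its (last) reported result."""
    return {m.group(1).lower(): m.group(2).lower() for m in _RESULT.finditer(header_value)}


def auth_is_uninformative(
    results: dict[str, str],
    sender_domain: str,
    free_mail_domains: tuple[str, ...] = ("gmail.com",),  # assumption: illustrative list
) -> bool:
    """True when SPF, DKIM and DMARC all pass but only vouch for a free-mail provider."""
    all_pass = all(results.get(k) == "pass" for k in ("spf", "dkim", "dmarc"))
    return all_pass and sender_domain.lower() in free_mail_domains


# Hypothetical header value modeled on the results described above.
header = (
    "mx.example.invalid; dkim=pass header.i=@gmail.com; "
    "spf=pass smtp.mailfrom=gmail.com; dmarc=pass (p=NONE) header.from=gmail.com"
)
results = auth_results(header)
print(results, auth_is_uninformative(results, "gmail.com"))
```

When the second function returns true, the pass results confirm that Google sent the message and nothing about who is behind the account.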

First-time sender status and the implausible local-part of the sending address (briliansupriyadiwill[@]gmail[.]com reads as a string-generated account name, not a human's) are behavioral signals that authentication headers cannot encode. These are exactly the signals that impersonation detection must evaluate when the cryptographic layer provides no evidence of spoofing.

What Detection Requires When the File Is Technically Clean

The case illustrates a gap that gateway-layer PDF scanning cannot close. Static analysis correctly returned a clean verdict: no executable content, no network callouts, no exploit primitives. That verdict is accurate and useless simultaneously.

Closing the gap requires two capabilities working together. First, metadata extraction that surfaces Creator, Producer, and Title fields alongside the attachment verdict, in the primary analyst view rather than a supplemental log. wkhtmltopdf in the Creator field, combined with a document Title that names a brand the sender has no relationship to, is a high-confidence indicator of kit assembly.
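The combination described here (a kit-associated renderer plus a brand the sender has no relationship to) can be sketched as a small rule. The renderer list and the brand-to-domain mapping are illustrative assumptions that a real deployment would maintain separately; the function name is hypothetical.

```python
# Assumption: renderers an analyst has chosen to treat as noteworthy.
KIT_RENDERERS = ("wkhtmltopdf",)

# Assumption: illustrative mapping from brand names to domains expected to send for them.
BRAND_DOMAINS = {
    "geek squad": ("geeksquad.com", "bestbuy.com"),
}


def _domain_matches(sender_domain: str, expected: str) -> bool:
    """True for an exact match or a subdomain of the expected domain."""
    sender_domain = sender_domain.lower()
    return sender_domain == expected or sender_domain.endswith("." + expected)


def kit_assembly_indicator(creator: str, title: str, sender_domain: str) -> bool:
    """True when a noteworthy renderer produced a document naming an unrelated brand."""
    renderer_hit = any(r in creator.lower() for r in KIT_RENDERERS)
    title_lower = title.lower()
    unrelated_brand = any(
        brand in title_lower
        and not any(_domain_matches(sender_domain, d) for d in domains)
        for brand, domains in BRAND_DOMAINS.items()
    )
    return renderer_hit and unrelated_brand


print(kit_assembly_indicator("wkhtmltopdf 0.12.6", "Geek Squad Subscription Renewal", "gmail.com"))
```

Requiring both conditions keeps the rule from firing on every wkhtmltopdf-generated file, which matters because the tool also has ordinary uses.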

Second, cross-field coherence checking: does the email subject match the attachment's declared content? A shipping-themed subject on an email carrying a "Geek Squad Subscription Renewal" PDF is incoherent. Humans notice incoherence. Automated scanners that evaluate each field in isolation do not.
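A coherence check of this kind can be sketched with simple keyword themes. The theme names and keyword lists below are assumptions for illustration; a production system would more plausibly use a trained classifier, but the comparison logic is the same: label each field, then look for disagreement.

```python
# Assumption: illustrative keyword lists, one per lure theme.
THEME_KEYWORDS = {
    "shipping": ("package", "parcel", "shipment", "tracking", "delivery", "scanned"),
    "subscription": ("subscription", "renewal", "auto-renew", "invoice"),
}


def themes(text: str) -> set[str]:
    """Return every theme whose keywords appear in the text."""
    lowered = text.lower()
    return {
        theme
        for theme, words in THEME_KEYWORDS.items()
        if any(word in lowered for word in words)
    }


def is_incoherent(subject: str, attachment_title: str) -> bool:
    """True when both fields have a recognizable theme and the themes do not overlap."""
    subject_themes = themes(subject)
    title_themes = themes(attachment_title)
    return bool(subject_themes) and bool(title_themes) and subject_themes.isdisjoint(title_themes)


print(is_incoherent("Package Scanned A507JYE85A8N3", "Geek Squad Subscription Renewal"))
```

The check stays silent when either field has no recognizable theme, so it flags contradictions and does not penalize missing information.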

IRONSCALES detected the behavioral anomaly (first-time external Gmail sender, opaque body, PDF attachment with mismatched internal metadata) and flagged the incident for quarantine. The Themis AI engine identified the credential-theft pattern at the campaign level, matching the assembly characteristics to known social engineering kits despite the absence of any technically malicious PDF component.

Indicators of Compromise

Type	Indicator	Context
Sender address	briliansupriyadiwill[@]gmail[.]com	Attacker-controlled Gmail account; first-time sender; dispatched via Gmail API HTTPREST
Attachment filename	EKW5ZT9FMPMTSI6Z1QB[.]pdf	Randomized filename; MD5 f19651e085bee0ed7f9bc79833e26d76; SHA-256 503de9c70cd31a4c40a4a3879dfa8557a025c9945e0e440c280e936c16b29e00
PDF Creator metadata	wkhtmltopdf 0.12.6	Known phishing-kit HTML-to-PDF renderer; presence indicates kit-assembled document
PDF Title metadata	Geek Squad Subscription Renewal	Mismatches shipping-themed email subject; reveals actual lure brand
PDF Producer metadata	Qt 4.8.7	Qt rendering engine paired with wkhtmltopdf; consistent kit signature
Dispatch method	Gmail API HTTPREST	Full SPF/DKIM/DMARC/ARC pass; authentication reflects Google infrastructure, not sender identity
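For teams that want to sweep stored attachments against the hashes in the table, a minimal standard-library sketch follows. The two digests are copied from the table above; the function names are illustrative, and matching on exact hashes will only catch byte-identical copies of this one file.

```python
import hashlib

# Digests copied from the Indicators of Compromise table.
IOC_MD5 = {"f19651e085bee0ed7f9bc79833e26d76"}
IOC_SHA256 = {"503de9c70cd31a4c40a4a3879dfa8557a025c9945e0e440c280e936c16b29e00"}


def attachment_digests(data: bytes) -> dict[str, str]:
    """Compute the MD5 and SHA-256 hex digests of an attachment's bytes."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def matches_ioc(data: bytes) -> bool:
    """True when either digest appears in the IOC sets."""
    digests = attachment_digests(data)
    return digests["sha256"] in IOC_SHA256 or digests["md5"] in IOC_MD5


# Usage with placeholder bytes; a real sweep would read each stored attachment.
print(matches_ioc(b"placeholder attachment bytes"))
```

Because kits typically randomize filenames and regenerate files, the metadata fields in the table are likely to be more durable indicators than the hashes.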

Email Attack of the Day is a daily series from IRONSCALES spotlighting real phishing attacks caught by Adaptive AI and our community of 35,000+ security professionals. Each post breaks down a real attack: what it looked like, why it worked, and what to do about it.

Related attacks

Attack	What happened
The Calendar Invite That Was a Bill: Malwarebytes Impersonation via Same-Day Domain and Google Calendar	Attackers registered infodeliv.com the same day they sent a Google Calendar .ics invite demanding a $479.33 Malwarebytes charge.
The .com That Wasn't the .org: TLD Confusion in a Payroll Email With an Empty Body	A payroll email about annual salary and benefits arrived from the .com version of a nonprofit's domain.
Microsoft Bookings as a Weapon: When DMARC Says Trust Me and ARC Quietly Disagrees	A phishing email sent from bookings.microsoft.com passed every authentication check.
Perfect Authentication, Zero Payload: The Yahoo Free-Mail BEC That Microsoft Flagged but Didn't Block	A Yahoo free-mail account with perfect SPF, DKIM, and DMARC authentication sent a zero-payload account change request to a state government health agency.
The RSA Follow-Up That Wasn't: How a Post-Conference Calendar Invite Fooled Three Inboxes	A calendar invite landed right after RSA Conference, appearing to be a follow-up from an internal VP.
