Post-incident review

Yesterday I discovered that I had leaked 3,638 email addresses by uploading them to a public GitHub repository.

Here's what happened, how it happened, what I did about it, and what I'm doing to make sure it can't happen again.

TL;DR

This post will be long, so here's the essential stuff if that's all you care about.

Yesterday morning (2026-03-03 06:00 -- all times in this post are UTC) I was alerted by email and private message that someone's unique email address, which they had only ever used on my services, had received a phishing email purporting to be from PayPal.

There was only one way for this to be possible: I had allowed the address to be stolen. I acknowledged this with a blog post at 06:46, requesting that people let me know if they were affected.

Through the day I received, or could infer from my own data, 17 addresses that either did, or did not, receive the phishing email.

Analysis of these 17 data points led me to the realisation that I had uploaded a folder full of text files containing email addresses to a public GitHub repository. This made them visible to anyone who cared to go looking. I confirmed this theory at 18:00 and immediately made the repository private. I acknowledged this with a blog post at 18:18.

The first upload of 1,650 addresses occurred on 2024-10-08. Further updates were uploaded through 2025. The final number of addresses in the repository was 3,638. Because I don't know when the data was scraped from GitHub, I can't be sure which of those addresses were harvested. My assumption must be that they all were.

I am deeply sorry. I strive every day to be an exemplar of 'the good Internet'. In this instance, I have failed you quite miserably.

How to determine if your address was leaked

Immediately after this post has been published I will email the 3,638 addresses that were leaked.

From: Johnny 'Decimal' Noble <hello@johnnydecimal.com>
At: around 2026-03-04 18:00 UTC
Subject: Important: your email address was exposed [D25.14.44]

If you receive this email then your address was leaked.

You can also check if you received the phishing email:

From: PayPal Security <no.con**star@ca**a.cl>
At: around 2026-03-03 05:30 UTC
Subject: Please confirm your identity

If you received this email then your address was leaked. (Check your spam folder: many providers correctly identified it as a phishing attempt.)

If you do not receive either of these emails then your address was not leaked.

Was any other data leaked?

Like names, addresses, dates of birth, passwords, or access to accounts?

No. The only data leaked was a list of email addresses in a text file.

What's the impact of the leak?

The email address that was leaked will receive spam and phishing attempts.

To be clear, your account has not been compromised. Only your email address has been made public. You do not need to change your password (assuming it is already unique -- see What can you do? at the end of this post).

Full timeline

06:00
- Leak notification received.
- Incident D25.14.44.2026-03-03 created.
06:46
- Public informed of breach.
08:16, 08:32, 08:47, 09:40, 10:43, 11:51, 12:55, 14:02, 14:43, 15:31
- Data received from affected users helped me to build the picture.
- I'll explain below why this took so long.
18:00
- I realised what it was and used local data to confirm.
- The repository was made private.
18:18
- Public informed of cause.

How it happened

Here's why I had a folder full of your email addresses, and how they ended up on GitHub.

Over the years I've used a variety of third-party platforms to deliver websites and services, as well as to host a public email list.

Buttondown
- Used to host the mailing list from 2019–2024.
- Allowed users to opt in to email marketing.
- Users hold an account of sorts; while you don't have a password, the platform records your email address.
Netlify
- Used to host johnnydecimal.com and jdhq.johnnydecimal.com from 2019–current.
- Users do not hold an account at Netlify: it's an infrastructure service only.
Gumroad
- Used to sell the Workbook from 2023–2024.
- Allowed users to opt in to email marketing.
- Users held an account at Gumroad.
Thinkific
- Used to sell various products from 2023–2025.
- Users held an account at Thinkific.
Shopify
- Used to sell various products from 2024–2025.
- Allowed users to opt in to email marketing.
- Users held an account at Shopify.
Stripe
- Used to process payments from 2024–current.
- Users hold an account of sorts at Stripe: you can't log in, but they hold a record of transactions linked to your email address.
PayPal
- Used to process payments from 2024–current.
- Users may hold an account at PayPal.
- It's also possible to check out as a guest, but in any case your email address is recorded.
Listmonk
- Used to host the mailing list from 2025–current.
- Allows users to opt in to email marketing.
- Users hold an account of sorts; while you don't have a password, the platform records your email address.
- The link at the bottom of every email allows you to unsubscribe (in which case your email address is retained) or delete your data entirely.
PikaPods
- Used to host Listmonk from 2025–current.
- Users do not hold an account at PikaPods: it's an infrastructure service only.
Amazon Simple Email Service (SES)
- Used to send emails from 2025–current.
- Users do not hold an account at Amazon: it's an infrastructure service only.
Clerk
- Used to host JDHQ user accounts from 2025–current.
- Users hold an account at Clerk.

If this sounds like a nightmare, it's because it is. But this is the reality of running a small online business: you stitch together that business based on the services available to you at the time. This is a function of features, cost, and experience. As your business grows, you move between platforms.

To be explicit: none of this is the fault of any of these platforms. I list them all merely to demonstrate the range of stuff one has to deal with.

The reality of running this type of business is that you spend a lot of time consolidating user data from these services. Why is this person in this data set and are they the same person as this person in that data set? You do this so you can mail the right people the right information; send subsets of people offers that are only relevant to them; migrate user accounts from old platforms to new platforms.

As a small business with limited resources there's only one practical way to do this analysis: Excel. In a bad month I might spend 50% of my time 'mashing' data like this. I hated having to do it, and eliminating it was a very strong driver in my building JDHQ: because now you have a single account, forever. No more mashing of CSV files exported from half a dozen platforms.

(Ironically, this leak is partially a result of my very, very strong desire to never send anyone an email that they don't want. I've spent dozens of hours over the last few years meticulously poring over and cross-referencing this data, taking pains to ensure that someone who opted out on this platform didn't get opted back in on that platform. Had I not bothered, some of this data might not have been on my laptop. C'est la vie.)
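To make the cross-referencing concrete, here is a minimal Python sketch of the rule. It is an illustration only, not the Excel process described above: the column names `email` and `opted_in`, the platform labels, and the sample rows are all assumptions made for the example. The rule it applies is the conservative one, where an opt-out on any platform wins.

```python
import csv
import io


def normalise(email: str) -> str:
    """Trim whitespace and lower-case so the same address matches across exports.

    Lower-casing the whole address is a practical simplification, not a strict rule.
    """
    return email.strip().lower()


def consolidate(exports: dict[str, str]) -> dict[str, bool]:
    """Merge CSV exports into {email: may_send_marketing}.

    An address may receive marketing only if it opted in on at least one
    platform and never opted out on any platform.
    """
    opted_in: set[str] = set()
    opted_out: set[str] = set()
    for text in exports.values():
        for row in csv.DictReader(io.StringIO(text)):
            email = normalise(row["email"])
            if row["opted_in"].strip().lower() == "true":
                opted_in.add(email)
            else:
                opted_out.add(email)
    return {e: (e in opted_in and e not in opted_out) for e in opted_in | opted_out}


# Hypothetical sample exports; real exports have different columns per platform.
exports = {
    "platform_a": "email,opted_in\nAlice@example.com,true\nbob@example.com,false\n",
    "platform_b": "email,opted_in\nalice@example.com,true\nbob@example.com,true\n",
}
result = consolidate(exports)
print(result)
```

In this made-up data, Bob opted out on one platform and in on another, so he is not mailed.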

So we have a folder full of CSV files

That's where we are in the story: I have a folder full of CSV files from these various platforms that I've been using to consolidate and migrate accounts, and to send email directly from my laptop via Amazon SES.

So how did they end up on GitHub? Sheer stupidity.

This repository started on 2024-01-10 as an intentionally-public archive of emails sent to the mailing list. It contained 3× text files.

The problem began on 2024-10-08 when I started using the same folder to store the output from these CSV files.¹ Forgetting that the linked repository was public, I committed these files and 'pushed' them to GitHub. At this point they were available publicly.

Why this git/GitHub thing?

It's natural to use git -- which is software independent of GitHub, the website -- to manage a dataset like this. It gives you version control, which is really useful. If you mess up, you can just 'roll back' to a previous state.

Using git locally isn't the problem. Not realising that the linked GitHub repository was public, and pushing to it, was the fatal mistake. I'll address this below.

From 2024-10-08, this folder, which lives at ~/dev/amazon-ses on my laptop, continued to be used as a place where I stored lists of email addresses so that I could send email via Amazon SES. Every time I did that, I committed the changes and pushed them to GitHub. And so by yesterday, the repository held 3,638 unique email addresses.
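One safeguard against this kind of mistake is to scan a folder for anything that looks like an email address before it is committed or pushed. The sketch below is a hypothetical check, not something that was in place at the time. The regular expression is deliberately loose, and the function name `find_addresses` and the sample files are made up for this example.

```python
import re
import tempfile
from pathlib import Path

# Deliberately loose pattern: good enough to flag a file, not to validate an address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_addresses(folder: Path) -> dict[str, set[str]]:
    """Return {relative file path: unique addresses found}, skipping the .git directory."""
    hits: dict[str, set[str]] = {}
    for path in sorted(folder.rglob("*")):
        if not path.is_file() or ".git" in path.relative_to(folder).parts:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        found = {m.lower() for m in EMAIL_RE.findall(text)}
        if found:
            hits[str(path.relative_to(folder))] = found
    return hits


# Demonstration on a throwaway folder with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "newsletter.txt").write_text("Hello, this is issue 12.\n")
    (root / "send.sh").write_text(
        "TO=alice@example.com\nTO=bob@example.com\nTO=alice@example.com\n"
    )
    hits = find_addresses(root)
    total = len(set().union(*hits.values())) if hits else 0
    print(f"{total} unique address(es) in {len(hits)} file(s)")
```

Run from a pre-commit or pre-push hook, a non-empty result would be a prompt to stop and check whether the remote repository is public.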

My own password hygiene

So that I don't need to make the point below with regard to each specific service, I'll make it here.

I have exclusively used 1Password for password management since 2009. All of my passwords are unique and they exceed all modern standards for entropy. Where 2FA is an option I always enable it.

It was, at least, nice to know from the start that this leak wasn't the result of bad password management.

My 'secret key' hygiene

Separate from passwords, developers of sites like mine have 'secret keys' that are used by servers and services to talk to each other. From JDHQ, for example, you can opt in to product notification emails. This is possible because the service that serves JDHQ, Netlify, has the secret key for Listmonk and can talk to it directly via the software I've written. If you have the secret key for a service, you can read all of the data contained therein.

I was already planning an article detailing the steps I take to secure these keys, so I'll just note here that they're also all stored in 1Password, and that I had already taken what I believe to be extraordinary lengths to secure them against attack. More to follow.

Listmonk and PikaPods

I 'self-host' my mailing list using Listmonk hosted on a PikaPod.² At 06:31 I identified that the Postgres instance used by Listmonk was open for login using a public console. This isn't enabled by default: I had turned it on a few months earlier so that I could connect directly to the database using client software on my Mac.

Forgetting to disable that access was definitely a mistake -- see action 4, below -- but the risk seemed low. The username and password are set by PikaPods and both were secure: the username not being admin or similar, and the password being a 24-character string. Checks of the console software, Adminer 5.4.2, showed no vulnerabilities.

For these reasons, I dismissed the possibility of a leak via this route. Because I was so sure that this wasn't the cause, I didn't email PikaPods until 15:36. Finally doing so, I asked them if there were any logs for this console endpoint that might be useful.

Their reply just 45 minutes later was stunningly helpful, clearly written by a caring human. I could not have recommended them more before this happened, and yet here we are, my endorsement stronger than ever. A superb service, utterly without fault.

At this point (16:13) I hadn't positively eliminated this as the cause, but it seemed vanishingly unlikely.

Buttondown

Early data supported a theory that Buttondown's service had been compromised. As a reminder, they hosted my mailing list from 2019–2024. So it's not surprising that there was a 1:1 match between the leaked email addresses and my Buttondown exports. I mailed their support at 08:57 emphasising:

"This is NOT an accusation -- I'm trying to figure this out myself. It's purely FYI, and I genuinely hope for you that I'm wrong."

– thinking that, if a leak had occurred, their support might appreciate the data.

Through the day I continued to receive data points, none of which disproved this idea. This analysis was basic science at work: build a hypothesis, collect data, see if data supports hypothesis.

Of course good science is about a falsifiable hypothesis, and at 15:31 I was made aware of 3× addresses that had received the phishing email but were not in my Buttondown exports. It was a relief to rule Buttondown out as a cause, and I notified them immediately.
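The reasoning here reduces to a set check: if a source was the origin of the leak, every address known to have received the phishing email must appear in it. A minimal Python sketch, with invented source names and addresses standing in for the real data:

```python
def unexplained(sources: dict[str, set[str]], phished: set[str]) -> dict[str, set[str]]:
    """For each candidate source, return the phished addresses it cannot explain.

    An empty set means the hypothesis survives; a non-empty set falsifies it.
    """
    return {name: phished - addresses for name, addresses in sources.items()}


# Hypothetical data for illustration only.
sources = {
    "old_mailing_list": {"a@example.com", "b@example.com", "c@example.com"},
    "laptop_folder": {"a@example.com", "b@example.com", "c@example.com", "d@example.com"},
}
phished = {"a@example.com", "b@example.com"}

# Early in the day: every phished address appears in both sources,
# so neither hypothesis can be ruled out yet.
early = unexplained(sources, phished)

# A new report arrives for an address that was never on the mailing list.
phished.add("d@example.com")
late = unexplained(sources, phished)

for name, gaps in late.items():
    print(name, "survives" if not gaps else f"falsified by {sorted(gaps)}")
```

Addresses that did not receive the email are weaker evidence, since a provider may have filtered the message before the recipient ever saw it.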

Again, I can't sing the praises of a company enough. Anita from Buttondown support felt like a friend yesterday, keeping in touch even after I informed them of this finding. If you need an email newsletter, use Buttondown. They're truly good people.

How I came to the answer

Around 16:30 we went for our usual end-of-the-day walk. Still not knowing the cause, I replayed all of this to Lucy. Thoughtful questions followed and, via this conversation, the thought occurred to me. Back to science, this is Occam's Razor in a nutshell: given all possibilities, the simplest should be considered the most likely.³

The simplest possibility was that I, as the holder of all of this data on my laptop, had committed it to a public repository. Once I got home, I confirmed this at 18:00.

A note on the data sources involved

You might be on this list despite having never signed up for my mailing list. For example, hundreds of email addresses from JDHQ members are impacted. Those addresses aren't on the public list, because I never sign you up without your explicit opt-in.

These addresses are in the data because I used the leaked scripts and Amazon SES to send transactional emails as well as marketing emails to the public list. Similarly, addresses from previous platforms may be present.

Lessons learned

Looking out of the window earlier this week I saw a fire engine, sirens blaring, scream up behind a learner driver. Should be part of your driving test, I thought. Because you can't truly be prepared for the panic induced by sirens a metre behind you until it happens.

In a way, I'm glad this happened. Without trying to minimise the event, it's fair to say that on the spectrum of security and data leaks, this is about as benign as it gets. Better that this happens now when I can learn from it, than a much worse event happens later.

Because if this hadn't happened, I would have spent the day developing JDHQ, where you'll soon have the ability to store your own notes, and create your own IDs.⁴ It is impossible to convey the depth of responsibility I feel for this data. I lie awake at night thinking about how to keep it secure. (An enjoyable problem to solve, to be clear.)

Like I said above, I've already been planning a post addressing this data and the measures I'll take to protect it. So here let's just talk about what I learned yesterday.

If you don't have it you can't lose it

I can't stress enough that I do not want your personal data. Having it means having responsibility for it: and then look what happens.

You may notice that I don't ask for your name when you sign up for JDHQ. Requiring a name is an option at Clerk, where your user account is held. I turned it off. You may notice that I don't ask for your address when you make a purchase. This is an option at Stripe, who processes your payment. I turned it off.

Still, in diagnosing this issue yesterday I realised how much data I have on this laptop. Again -- see above -- this is largely unavoidable. I run a business, I need to manage the business's data. For example, every quarter I need to download my transactions for tax reporting.

But I can definitely change my behaviour.

Action 1: make customer data more difficult to access

You might think I'd be safer just deleting it, but this data proved very useful in troubleshooting yesterday's issue. Without it I'd have been blind. So I'd rather not delete it; I don't think there's a problem that this would solve.

The problem to solve is that having this stuff sitting in folders on my laptop is too loose.⁵ It's too easy for something to leak; too easy for me to think of these files in the same way that I think of all my other files. Instead, I need to be acutely aware, when I'm interacting with them, that I'm in some special place. I need to be on high alert.

I have created an APFS encrypted disk image and moved all existing customer data to it. It is mounted on-demand. The password is in 1Password and requires a manual copy/paste: I won't store it in my Keychain, so the disk can't ever mount without my explicit action.

Mounting it will forever recall this incident, and I'll be vigilant. I'll do what I need to do and unmount it. There's no chance that I'll make the mistake of pushing something in there up to the cloud.

Action 2: only download what I need

When I download data from these platforms -- say that quarterly tax analysis from Stripe -- my tendency has been to be lazy, and grab everything. As in, choose all the columns, download everything offered.

Because if you're not exactly sure what you need, it's more convenient to have everything to hand than to have to go back and get what you missed.

From today, I'll only download the specific data that I need to do the job. For Stripe's tax analysis, that might be as minimal as the transaction ID, amounts, and the country of purchase. Your name and email address aren't a factor in my reporting a quarterly sales tax total to the Australian Taxation Office -- so why even request them in the export?
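Where a platform's export screen doesn't allow choosing columns, the same idea can be applied immediately after download by reducing the file to an allowlist of columns before it is saved anywhere. This is a sketch only: the column names and sample rows below are hypothetical and do not reflect Stripe's actual export format.

```python
import csv
import io

# Hypothetical column names; a real export will differ.
NEEDED = ["transaction_id", "amount", "country"]


def keep_columns(raw_csv: str, needed: list[str]) -> str:
    """Return a copy of the CSV containing only the allowlisted columns."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    missing = [c for c in needed if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"export is missing expected columns: {missing}")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=needed, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # columns not in `needed` are silently dropped
    return out.getvalue()


# Made-up export containing personal data that the tax report doesn't need.
raw = (
    "transaction_id,amount,country,name,email\n"
    "tx_001,25.00,AU,Alice Example,alice@example.com\n"
    "tx_002,40.00,NZ,Bob Example,bob@example.com\n"
)
minimal = keep_columns(raw, NEEDED)
print(minimal)
```

An allowlist is used rather than a blocklist so that any new personal-data column a platform adds later is dropped by default.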

Action 3: proactively delete accounts

I still had a Buttondown account that I wasn't using. It still contained thousands of email addresses.

It, and its associated data, has been deleted. If I think of any other similar accounts, I'll delete them.

Action 4: reminders to disable open services

While it proved not to be relevant, I was disappointed to find that I'd left the Listmonk Postgres database console access enabled. This was an unnecessary risk and happened because I simply forgot to disable it when I was finished.

In the future, if I open up anything like this -- which, again, is going to be necessary to run a business -- I'll set a timed reminder for myself to close it when finished.

Action 5: GitHub is not a backup service

I have a tendency to think of GitHub as a useful backup service. Do a quick git push and now the precious data that I just spent all day massaging into shape is copied to the cloud.

This isn't what GitHub is for! Never again, for any data.

Action 6: compliance/reporting

Thanks to my Discord for flagging the possibility that I might need to register this breach with various national authorities.

I've investigated a few, and this event is below the reporting threshold. GDPR says that I need to keep internal breach logs and learn from the event, which I was already doing. If you're aware of a stricter requirement from your local authority, let me know and I'll gladly comply. Obviously I can't check them all.

What can you do?

You, as in the reader. What can you do to make yourself safer online?

Lucy and I have been talking for at least a year about producing a (free) video series addressing the basics of online hygiene. We've moved that idea to the top of the list and will start working on it immediately.

I need to get this post published so I won't go into details here, but the two things you can do to put yourself above 99% of everyone else are:

- Use a password manager. Allow it to generate random passwords for you, different for every site. This isn't optional in 2026.⁶
- Use unique email addresses for each service that you sign up for. This is more difficult, and introduces complexity.

We'll cover both of these in the course.

Questions?

If you have any questions please post them at the dedicated forum thread for this incident.

If you run a small business and need help: we are here. This stuff is difficult: I messed up and I'm supposed to be an expert. Please ask and we will help you.

End of incident review.

100% human. 0% AI. Always.

¹ More accurately, the leaked data took the form of lists of email addresses in a .sh script, which called the aws sesv2 command to send an email. For the sake of simplicity I'll continue to refer to 'CSV files' in the main story; the data is identical, the only difference being a file extension.
² Acknowledging that this is a stretch of the definition of 'self-hosting'; the point being that PikaPods provides me the raw instance, and that configuration and management of this instance is my responsibility. They sit between PaaS and SaaS.
³ I know that's a common misreading of Occam's Razor, which actually states that the theory that introduces the fewest new elements should be considered the most likely. Close enough.
⁴ One person has already written me asking why I think storing user data in JDHQ is a good idea. This isn't the post to prosecute that question but briefly: I'm building features that I want. I think you'll find them useful too. If you never want to use them … just don't.
⁵ Noting that I don't consider the laptop itself to be an attack vector. It requires my password immediately after being locked, has FileVault full-disk encryption, and has no open Internet ports.
⁶ Coincidentally, a friend of Lucy's was hacked last week. They got her Microsoft OneDrive account and everything in it. She's in the process of getting new passports and driver's licences. A true nightmare. Why? Shared passwords. Your password must be unique.