In this podcast, we hear from a few IT experts about the worst IT nightmares they've ever encountered.
Greg: Hi, I'm Greg Mooney and this is Defrag This. Every IT expert has a story to tell about when everything has gone wrong on the business network. It's these nightmares of the IT professional that keep IT proactive, because when it comes to keeping the stack up and running, Murphy's Law always applies. There are always ghosts in the system ready to wreak havoc. It's happened to me and it's probably happened to you. These are the stories that keep IT professionals up at night and shivering under the covers. We're going to talk to a few IT pros and get to hear some of their IT horror stories. Some of the stories today will be common occurrences, but horrifying nonetheless, while others may be downright spooky.
The first one up, we have from Adam Bertram, also known as Adam the Automator. You can follow him @adbertram on Twitter, and you can check out his website at adamtheautomator.com. So, sit down, relax and be prepared to hear some horrible IT nightmares.
The Great Server Crash of 2001
Adam: Picture this. It was probably 15 years ago. Let's all get around the campfire here. It's been about 15 years ago, I was in a credit union, a small credit union, probably the second job out of college, you know, all bright-eyed and bushy-tailed in IT. I was going to save the world. At the credit union, I was a network administrator so I was pretty much in charge of all the minor switching and routing and I did some help desk stuff. I kind of did everything at the time. And during that time, I was kind of looking into getting into InfoSec, into security. So, I had studied for and passed the CompTIA Security+ certification and I was studying for my CISSP at the time. And during that time, we routinely did tape backups.
At that time, the company was doing a lot of tape backups. So, one of my jobs was to go in the data center every now and then and just, you know, swap out tape, put in tapes and that sort of thing. So, being my security conscious self, I decided that the tapes needed to be encrypted because one time that I would bring the tape, take the tape out of the drive, and one day I forgot to put it in the little storage location we had and brought it back to my desk. I looked at it and I thought, you know, there's gotta be...you know, what if I leave my desk or what if...I can't remember her name, the admin assistant actually took it home in her purse but just leave it out somewhere and somebody would take this and, you know, it had some critical data on it for a file server that we had at the time. And what I realized was that, you know, they should encrypt these things.
So, at the time we were using, I believe it was Veritas Backup Exec, I think now it's Symantec Backup Exec, and that had an option to encrypt the tapes as we did them. So, you know, that was a good idea, talked to the boss, said, "Yes, that's a good idea. You know, that shouldn't be a problem."
So, I went along and set it in the Backup Exec program, and I encrypted all of the tapes. We had about, I don't know, it was only, like, maybe 5 or 10 servers at the time or something like that. And I encrypted all of them. It was great and the boss said, "Yep, we're happy, good. We're secure. If the tape gets lost, then we don't have any issue with somebody trying to get into the data."
Well, six months later came by and we had one of our main file servers die. It just went kaput. One of the RAID arrays just completely corrupted a lot of the drives in there and the data was unrecoverable. So, during that time, we were at...as everybody was freaking out, trying to figure out how to get this data back, we thought, "Well, we're making backups every day. No problem. So I'll just put the tape in there and just bring up a new RAID array, create some partitions and restore the data and it'll be good to go."
I rebuilt the array, created all the partitions like we were supposed to and put the tape in, and up popped a password prompt. And that's when I kind of had that, "Oh my God" moment because I didn't know the password. What I didn't realize was that I had created some super complex password that nobody, and right then I realized not even I myself, would remember or know, because...I don't know, for some reason I didn't have a password manager, I didn't write it down. For some reason, I thought I would remember it but I never did. I never actually told anybody what the password was.
So, I was sitting there at this password prompt with the entire credit union essentially just saying, "Oh my God, the sky is falling." I need this password. I tried everything I knew, every password under the sun that I thought it could possibly be, and it was not any of them. This went on for a couple days. I would try passwords, I would Google, try to figure out how to crack the encryption. I think I even called Veritas support at the time, and they said, "Sorry, we don't know the password. We can't decrypt the data."
And a couple days went by, it was days when this was a very important file server at the time, and I cannot for the life of me remember how we actually got the data back or if we did. But, I believe we had actually sent the tape off to one of those really expensive, like, drive savers, one of those expensive data recovery places. And we finally got the data off the drive, and I think we were able to replicate, get some of the data off of some other servers and get it on there just to get people working again, and then the rest of the data, you know, it took a week or two to get everything back. From what I recall, I think we were able to get everything back. I don't know, it's probably been 15 years or so to date.
So, that ended up being, as I clearly remember one of the marketing guys calling it at the time, "The Great Server Crash of 2002," or maybe it was 2001 or something. And so, it was an infamous server crash because it was so important and I simply did not remember the password. Now thankfully, I don't know how but I managed to keep my job. I was able to get everything to a fairly normal state after that, but I think, you know, out of all of the 21 years or something I've been in IT, I think that's probably the most memorable one that I have.
Greg: That was truly a spooky tale from Adam Bertram. Next up, we have Tim Warner, who is a Microsoft MVP and an Azure solutions architect, based in Nashville, Tennessee. Here is his IT horror story.
Evil IoT Devices from Hell
Tim: I was doing some IT consulting. That's really been my gig, so to speak, for the last several years. And a client I had ran a relatively small home-based business and he was concerned that he was exposing his proprietary business data inadvertently by using network-attached storage devices. And so, as part of my work for this client, I hit up Shodan, which if listeners don't know, Shodan, shodan.io, is a website that spiders the web collecting data from any IoT device in the world that happens to be internet-exposed.
And so, by running some Shodan searches, you can find some very scary things: misconfigurations, routers and storage that are misconfigured or left at their defaults. And in my work analyzing my client's environment on Shodan, I happened across the Iomega vulnerability that was exposed a few years ago, just a particular brand of home consumer-based, basically, network storage. And if you just left everything at the default, you were exposing that entire device to the world. And I didn't go too far down the rabbit hole, but my heart just kind of swelled for a moment because I saw that there was this particular family that happened to be out in California, and I just clicked in for a moment and saw they had their whole life exposed to the internet by purchasing this device. I actually don't remember if it was Iomega specifically, so let me make that clear. But they had a default-settings device exposing their health records, their tax information, everything, and I just felt tremendous compassion for this family, and I felt that it was important that I let them know.
So on one of the documents, of course I didn't snoop through all their documents, but I saw a document that had contact information, and I was quite nervous about doing it but I called them up. And he was obviously completely flabbergasted. I identified myself by name, told him what I was doing and how I came across his data, and told him he might wanna unplug that thing immediately. And we spent 10 minutes, him just trying to understand the context of the conversation.
I mean, I can understand from his point of view, this came out of left field. This guy from Tennessee calls him, who he's never known, probably never will know, telling him that his whole life and his family's life, pictures and everything are exposed to the internet. But he calmed down after a while and he thanked me, and that was the last I heard from him. But, it was a horror story just getting me thinking about, you know, inadvertently exposing personal data through IoT.
One of the hazards of the Internet of Things, of course, is problems like insecure defaults. And even if you take the device off its defaults but don't update it, you could wind up with vulnerabilities. It shows just how fragile privacy is. And the counterbalance between...I'm not a security professional as such, I'm an IT generalist, but the counterbalance between altruism, wanting to do something nice for another human being, but also in the back of my mind thinking, do I have legal exposure if I do so. That was one of the possible horrors that came to my mind, convincing this guy, for which I didn't have to lie, I was able to be completely honest, that I didn't have nefarious motives. I came across his data innocently and I just wanted to do him a solid to help him protect his family.
Greg: Ooh, that one had me shivering in my boots. We need more Good Samaritans like Tim out there. The next one we have is from Dan Franciscus, who is a systems engineer and VMware Certified Professional specializing in VMware, PowerShell, and other Microsoft-based technologies. You can check out Dan's blog at www.winsysblog.com. Here is his IT horror story.
The Disappearing Internet Explorer Favorites
Dan: So, yeah, so two horror stories I have actually, and both happened at my first job. And they were pretty bad actually. I've fortunately not screwed up this bad since then but...so the first one was, you know, it was a fairly small organization, you know, a couple hundred people. It was actually a school so there was, you know, there was about I wanna say 1500 students and about, you know, 400 staff.
But so, at the time, we were using Active Directory and Group Policy to do config management on, you know, end user machines. And one of the things we came up with is we wanted to push out Internet Explorer Favorites, you know, a couple links for everyone, for the website homepage or HR sites and stuff like that. Basically things that everyone would need. And so, you know, I was very green and I did not know Group Policy real well.
So, what I ended up doing is I made the change, I pushed it out without any testing whatsoever, pretty typical in schools unfortunately, and instead of adding on the Internet Explorer Favorites, I wiped everyone's Favorites out for the entire organization instantly, as soon as that Group Policy was pushed. So, yeah, it was kind of bad. Thankfully, people actually weren't that upset. I guess they didn't really use them as much as I thought they did. But yeah, it was pretty bad. And so, yeah, that was my first big horror story.
And so, at the time, you know, we were basically two people in IT and I did not figure out a way to solve it at the time. I'm not sure if there really is one actually, if you wipe it out. It was Windows XP. So I never actually solved it, we kind of just had to roll with it and people had to add them back manually if they wanted them. So yeah, part of my advice there would be to always test things before you push them out obviously, and I think most orgs are pretty good with that now. With change management processes and, you know, those things don't happen as much as they used to.
Ghosts of Active Directory
My second horror story, again, same organization, you know, probably around the same time, within a year. So, we had home directories for everyone on their network share, right. So, people connect through Active Directory, you have your own network share for your personal files. And so, we were doing some changes...so, the way network shares work, you have root permissions on the share usually, and each person has access to their folder in that share, at least that's the way we did it. And we were messing around with the permissions on the root share, and unfortunately, you know, the change we made was inherited by the folders underneath that, by everyone's personal folder, and it removed access, individual access to all the folders.
So, within like, you know, 10 minutes, no one had access to their personal files anymore. So, yeah, that was pretty bad. And so, we actually had to...this is prior to me learning any kind of scripting or PowerShell. Nobody in the organization knew it, and so we had to individually go in and basically add the permission back for each user for, like, 1000 people. So, it was pretty tedious and, you know, it was bad. So again, advice, obviously it's mostly just you have to test things, you know? If you're gonna make a change, you really have to test it well, you have to know what the outcome's gonna be, and you know, I learned the hard way. And you know, both were really bad mistakes to make, but you do learn from those mistakes as I did, and you have to test things.
Greg: Right you are, Dan. Test all the things. Next up, we have co-host, Jeff Edwards who would like to share a spooky story that he found online. You may have heard it, you may have not, but let's hand it over to him.
The Unexplainable Email Bug
Jeff: Anyone who works in IT has at one point or another, received a ridiculous ticket from a user, sometimes a higher up, who thinks they've discovered a new type of bug on their system, something that can't be explained. Inevitably, the bug turns out to be simply user error, but that's not the case in this next story which flips that stereotype on its head.
Our story begins a long time ago in a place far, far away. Okay, not that far. The time is 1995 and the place is the University of North Carolina at Chapel Hill. That's where the hero of our story, Trey Harris, spends his time as an IT tech, running the campus email systems. One day, when Trey was minding his own business, going about his job, enjoying a nice latte, he got a call from the chairman of the statistics department who said he's having a problem sending email out of the department. "What's the problem?" asked Trey. "We can't send mail more than 500 miles," the chairman explained. Trey choked on his latte. "Come again?" Sounding agitated now, the chairman repeated himself. "We can't send mail more than 500 miles from here," he said. "A little bit more actually. Call it 520 miles but no further."
Trey tried to hold his composure. He was shocked. This was the exact problem that email was invented to solve. That's not how email works. This is what he told the chairman. "Email doesn't really work that way generally," Trey said, trying to hide the panic in his voice and be as polite as he could. This was, after all, the chairman of the department. "What makes you think you can't send mail more than 500 miles?" asked Trey. Now the chairman was getting angry. "It's not what I think," the chairman said. "You see, when we first noticed this happening a few days ago..." "Wait, a few days?" interjected Trey. "And you couldn't send email this whole time?" "We could send email, just not more than 500 miles," said the chairman. "Five-hundred miles, right, I got that," said Trey. "But why didn't you call earlier?" "Well, we hadn't collected enough data to be sure of what was going on until just now," said the chairman. "Right," thought Trey. This was the chairman of the statistics department, of course.
The chairman continued. "Anyway, I asked one of my geostatisticians to look into it, and she's produced a map showing the radius within which we can send email to be slightly more than 500 miles. There are a number of destinations within that radius that we can't reach either or can only reach sporadically, but we can never email further than this radius, 500 miles." "I see," said Trey, putting his head in his hands. "When did this start? You said a few days ago, but did anything change in your systems at that time?" "Well," said the chairman, "The consultant came in and patched our server and rebooted it, but I called him and he said he didn't touch the mail system." "Okay, let me take a look and I'll get back to you," said Trey, scarcely believing that he was playing along. It wasn't April Fool's Day. He tried to remember if someone owed him a practical joke. Despite the seeming ridiculousness of the request, Trey's curiosity was piqued. He had to know what was going on, plus it was his job to fix the emails.
He logged into the statistics department's servers and sent out a few test mails. A test mail to his own account went out without a hitch. Ditto for ones sent to Richmond, Atlanta, and Washington. Another to Princeton, which was 400 miles away, worked. But then he tried to send an email to Memphis, which was 600 miles away. It failed. He tried Boston, it failed. Detroit, failed. Trey got his address book out and tried anyone over 500 miles he could find, anyone he could think of, trying to narrow the problem down. An email to a friend 420 miles away in New York worked, but one to an ex-girlfriend in Providence, 580 miles away, failed. Trey was beginning to wonder if he was losing his sanity, so he thought of a test. He emailed a friend who lived in North Carolina but whose ISP was in Seattle. Thankfully, the email failed. Trey breathed out a sigh of relief. If the problem had to do with the geography of the human recipient and not with their mail server, Trey would have broken down in tears.
So, Trey had established the unbelievable, that the problem as reported was true. The statistics department could not send emails more than 500 miles. What are the odds of that? The next step was to take a look at the sendmail.cf file, the configuration file. It looked fairly normal, in fact, it looked familiar to Trey. He diffed it against the sendmail.cf in his home directory. It hadn't been altered. It was the sendmail configuration file he'd written and he was fairly certain that he hadn't enabled the "fail mail over 500 miles" option.
At a loss for what to do next, Trey tried telnetting into the SMTP port. The server happily responded with a SunOS sendmail banner. Alarm bells started going off in Trey's head. Wait a minute, a SunOS sendmail banner? At the time, Sun was still shipping Sendmail 5 with its OS even though Sendmail 8 was fairly mature. Being a good sysadmin, Trey had standardized on Sendmail 8. Also being a good sysadmin, he had written a sendmail configuration file that used the nice, long, self-documenting option and variable names available in Sendmail 8, rather than the cryptic punctuation mark codes that had been used in Sendmail 5.
All at once, the pieces started falling into place and Trey had an epiphany. When the consultant had patched the server, he had apparently upgraded the version of SunOS and in doing so, he had downgraded Sendmail back to Sendmail 5. The upgrade helpfully left the Sendmail configuration file alone, even though it was now the wrong version. It just so happens that Sendmail 5, at least the version that Sun shipped, which had some tweaks, could deal with the Sendmail 8 configuration file. Most of the rules had, at that point, remained unchanged. But the new long configuration options were seen as junk and skipped, and the Sendmail binary had no defaults compiled in for most of these. So, finding no suitable settings in the Sendmail configuration file, it set them to zero.
One of these settings that was automatically set to zero was the timeout to connect to the remote SMTP server. Trey performed some experimentation and established that on this particular machine, with its typical load, a zero timeout would abort a connect call in slightly over three milliseconds. At that time, the campus network had an odd feature: it was 100% switched. An outgoing packet wouldn't incur a router delay until hitting the POP and reaching a router on the far side. So, the time to connect to a lightly loaded remote host on a nearby network would actually largely be governed by the speed-of-light distance to the destination, rather than incidental router delays.
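The failure mode in this part of the story, where unrecognized options are skipped and a setting with no built-in default quietly ends up as zero, can be illustrated with a toy parser. This is only a sketch: the option names, the line syntax, and the `load_settings` function are all made up for illustration and are not real Sendmail syntax or behavior.

```python
# Toy model only: option names and line syntax here are hypothetical,
# not real Sendmail configuration syntax.

# The "old" parser only understands short, single-letter option codes.
KNOWN_SHORT_OPTIONS = {"r": "connect_timeout_s"}  # hypothetical mapping


def load_settings(config_lines):
    """Parse option lines, silently skipping any option name it doesn't know."""
    # No sensible compiled-in default: the setting starts at zero.
    settings = {"connect_timeout_s": 0}
    for line in config_lines:
        line = line.strip()
        if not line.startswith("O"):
            continue  # not an option line
        name, _, value = line[1:].strip().partition("=")
        key = KNOWN_SHORT_OPTIONS.get(name.strip())
        if key is None:
            continue  # long option name looks like junk to the old parser
        settings[key] = int(value)
    return settings


# A config written with a long, self-documenting name is ignored,
# so the timeout stays at zero.
print(load_settings(["O ConnectTimeout=300"]))
# The same setting written with the short code is picked up.
print(load_settings(["O r=300"]))
```

The point of the sketch is that nothing fails loudly: the parser accepts the file, and the only symptom is a value nobody chose.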
Feeling slightly giddy, Trey typed "units" into his shell: "1311 units, 63 prefixes. You have: 3 millilightseconds. You want: miles. Times 558.84719, divided by 0.0017893979." Five hundred miles, or a little bit more.
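The arithmetic behind that conversion can be reproduced in a few lines of Python. This is a minimal sketch, not Trey's method: the function name `light_travel_miles` is an illustrative choice, and it only computes the one-way distance light covers in a vacuum in a given time, ignoring the round trip a real connection needs.

```python
# How far does light travel in about three milliseconds?
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
METERS_PER_MILE = 1609.344            # international mile


def light_travel_miles(seconds):
    """One-way distance in miles that light covers in a vacuum in `seconds`."""
    return SPEED_OF_LIGHT_M_PER_S * seconds / METERS_PER_MILE


# Three milliseconds of light travel, i.e. 3 millilightseconds.
print(round(light_travel_miles(0.003), 2))  # prints 558.85
```

The result matches the 558.84719 figure quoted in the story, which is where the "500 miles, or a little bit more" comes from.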
Big thank you to Trey Harris for sharing this story on his website 20 years ago. Since then, it's become a massive hit on the IT internet. You can read this story yourself at ibiblio.org/harris/500milemail.html. There's also a FAQ where Trey addresses some of the math inconsistencies you may have noticed. Thanks.
Greg: Wow, I've heard some spooky stories in the past but that one from Jeff might take the cake. Anyways, that's about all the time we have today but we hope you enjoyed our IT Horror Stories on "Defrag This." Remember you can follow us at @defrag_this or @ipswitch on Twitter. Or you can check out more podcasts at blog.ipswitch.com/podcasts. Until next time, I'm your host, Greg Mooney. You all stay safe out there.