Transitions: Munki in a Cloud

It’s MacDevOps YVR week, one of my favorite weeks of the year. This morning, Clayton Burlison released an awesome package called terraform-munki that does something super helpful: it provides a set of Terraform templates that create the resources you need within your own AWS account, namely an S3 bucket and a CloudFront Distribution with a TLS certificate.

This is exactly what I’ve been working around in my development of munki-in-a-cloud, which will replace munki-in-a-box due to the deprecation of Server.app’s web services by Apple later this year. I have the script done, except for the creation of the CloudFront Distribution, which I was reading all about when Clayton suddenly said “Oh! I did that! And I’m releasing it this week!”

So I’ll be figuring out which parts of terraform-munki are helpful to this new project and will get used or adapted into munki-in-a-cloud.

The goal is the same as munki-in-a-box: A script to create a functional munki environment and repository, and make it ready for use in the cloud. With a future version of macOS removing the Web service functionality entirely, it seems prudent to look at good cloud options.
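To make the "functional munki environment and repository" part a little more concrete, here is a minimal sketch of the first thing such a script has to do: lay out the standard top-level folders of a Munki repo. This is not code from munki-in-a-cloud or terraform-munki; the function name and the default path are my own placeholders.

```shell
# Create the standard top-level folders of a Munki repository.
# The default path ./munki_repo is a hypothetical placeholder;
# pass your own location as the first argument.
make_munki_repo() {
  repo_dir="${1:-./munki_repo}"
  for sub in catalogs icons manifests pkgs pkgsinfo; do
    mkdir -p "$repo_dir/$sub" || return 1
  done
  # Print the repo path so callers can capture it.
  echo "$repo_dir"
}
```

Calling `make_munki_repo ~/munki_repo` gives you an empty repo to populate with Munki's own tools. Getting it into the cloud could then be something like `aws s3 sync ~/munki_repo s3://your-bucket-name`, where the bucket name is hypothetical and the bucket itself is what the Terraform templates would create.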

If you’ve got opinions on a project like this, I’d love to talk more with you. Find me on the Mac Admins Slack to talk about it more.

Installing Ubuntu 17.10 on ESXi, from a Mac Client, to the ESXi Server

This is a guide as much for me as it is for anyone else. I came to the conclusion I wanted a testbed for Reposado and Margarita, and as much as Clayton Burlison has the install of Reposado and Margarita on lock, I needed a refresher on how to create a new Ubuntu VM on my ESXi-capable Mac Mini.

First up, in VMware Fusion, connect to the server. File > Connect to Server… will give you access to the virtual machines stored on the server. You will then see a list of all the VMs currently on the server, active or not:

ESXi Host VM List

From here, you can click the + at top left to add a new VM:

ESXi Host New VM Screen

Since we are putting the VM directly on the server and not our local machine, select “Create a virtual machine on a remote server.”

ESXi Host Server Picker

Next up, you will be asked to select the server. Choose your local ESXi host.

ESXi Host Choose Host and Datastore

From here, you get to select which Datastore you want to store the new virtual machine on. If you had multiple volumes, you could select it here, whereas I just have my internal storage volume.

ESXi Host Choose HW Vers

VMware will then ask you to select a Hardware Version. There might be reasons to choose earlier versions, depending on what your local situation is like, but I’m up to date, so I’m choosing version 11.

ESXi Host Pick Network

Next, you get to choose which Network you’ll put it on. If you had multiples, you’d want to select the correct VLAN. I just have one, so I’m keeping it right where it is. You can also have VMs that have no network interface, and that’s an option here, too.

ESXi Host Pick OS

Since I’m running Ubuntu Server, 64-bit, for the final project, I’m selecting this version also for my sandbox VM.

ESXi Host Pick Firmware

If you wanted to opt to use UEFI or Secure Boot, here would be your opportunity! Ubuntu doesn’t need that, so I’m just clicking through.

ESXi Host Pick Disk Size

Last but not least, it’s time to pick your disk size. Since I’m using Reposado and Margarita, it’s a 200GB minimum to enter this party.

Now that we have our virtual machine, we need to get our copy of Ubuntu 17.10 Server. I grabbed mine from the Release Notes Page, which includes links to the Ubuntu download system. As long as you have an ISO, you should be fine to get started. Before you turn the VM on, you need to attach that ISO to the VM’s CD-ROM Drive. In the Virtual Machine’s Settings, you can select CD-ROM, and then specify the locally-stored ISO file to use as a connected volume.
VMware Settings CD ROM

Once you have selected the Ubuntu ISO file to attach as a CD, you are free to boot your virtual machine, and you’ll be presented with the next few screens as part of the process.

Ubuntu 1 Language Select

Select your preferred language for the Installer to use.

Ubuntu 2 Install Starter

Select the option to Install Ubuntu Server.

Ubuntu 3 Language Select

Select your preferred language for the __operating system__ to use.

Ubuntu 4 Region Select

Select the preferred region for the __operating system__ to use.

Ubuntu 5 Keyboard Config

Pick the keyboard you’re using.

Ubuntu 6 User Creation

Ubuntu 7 Set Password

Set up your admin username and password. Don’t forget these. Store them in a 1Password item if you can.

Ubuntu 9 Set Timezone

Set your timezone.

Ubuntu 10 Set Storage
Ubuntu 11 Volume Config

Also set up how you want the volume to be formatted. Defaults are fine, but you might choose to use Logical Volume Manager to handle your storage.

Ubuntu 12 Actually Doing Stuff

Now, it will install the OS and you’ll get an occasional screen to set up an HTTP proxy, allow security updates automatically, etc.

Ubuntu 13 Proxy

Then you can choose to install just the server, or a bunch of extra tools. Since I’m using Clayton’s guide for installing Reposado and Margarita, and it has the needed download commands, I’m just going to take the OS as it’s given to me.
Ubuntu 14 Select Additionals

More installing will occur here. Get a glass of water, you’re probably not hydrated enough today.

Ubuntu 15 More Installing

After this, you’ll be prompted to set up the GRUB Bootloader. Since this is the only Linux install on the virtual disk (It is, right? It would be really weird if it weren’t.) you can accept this configuration.

Ubuntu 17 Grub

And after that all completes, it’s time to get rolling through to your VM!

Ubuntu 18 Time to Restart

You can eject the CD-ROM on the next startup cycle.

Ubuntu 19 Logged In

Once we’ve got the box up, we want to install sshd to allow for remote access, because while it’s nice to have direct command line via the VM, ssh is so much more convenient!

We’re going to need to run a couple of commands here to get it:

sudo apt-get update
sudo apt-get install openssh-server

This will install the standard OpenSSH server and prepare it for use, allowing you to log in remotely. There are a million fiddly bits associated with OpenSSH, and you may want to customize it so that two-factor works, or so that hardware tokens like a YubiKey act as your key. That’s an exercise best left for the reader. For now, I’m not publicly exposing that interface as part of this process.
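If you'd rather script this step, here is one rough way to wrap it, with a dry-run switch so you can see what would happen before it touches the system. The `run` and `setup_sshd` names and the `DRY_RUN` variable are my own inventions. The last command assumes the service unit is named `ssh`, which is how it appears on the Ubuntu systems I've used.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Install the OpenSSH server, then confirm the service is active.
setup_sshd() {
  run sudo apt-get update &&
  run sudo apt-get install -y openssh-server &&
  run systemctl is-active ssh
}
```

Running `setup_sshd` on its own only prints the three commands. `DRY_RUN=0 setup_sshd` actually runs them and stops at the first failure.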

Now that you have an Ubuntu 17.10 server ready to go, you can follow all the instructions in Clayton’s guide for Ubuntu 14.04 for installing Reposado and Margarita (it all still works as of 17.10!)

The Mac Pro That Just Would Not Shut Down

This post is part travelogue, part breadcrumbs, part manual on how to troubleshoot a problem. It is posted here as a signpost to others, to discuss how I approached a particular problem with a 2013 Mac Pro cylinder that refused to power down.

The problem began innocuously enough. Sometime after 10.13.3, my Dad’s 2013 Mac Pro, which powers his digital photography workflows, refused to shut down. The behavior was odd: if you chose Shut Down from the Apple Menu – or, as we began the process, Restart – it would close out of all open applications, log out the active user session, and then sit at a black screen, doing close to nothing. External storage volumes would occasionally report disk activity with blinking indicator lights, but the system would never shut down, never return to a login screen, and, when a key on the keyboard was pressed, would make a simple “bonk” noise that indicates that input is currently unwelcome. The mouse cursor would still move, but a click would be fruitless.

The system was effectively hung after logout, but before the shutdown task was complete.

The Players

It’s important to know the actors in any play, so let’s discuss setup.

My Dad in his retirement enjoys photography, and he takes a lot of pictures. He’s also quite skilled at digital editing, and keeps Adobe Creative Suite handy to do this work. Photoshop, Bridge, Lightroom: these are his stock in trade. He has several nice cameras that take ridiculously large images, so he has amassed a collection of external storage, including a Promise RAID, a couple of OWC ThunderBay arrays, and some various and sundry external storage volumes that are helpful in keeping good backups.

I have learned my backup paranoia sensitivity by watching him take the same diligence he once used to operate a submarine’s nuclear reactor to keep his backups in line. He maintains a bootable clone and multiple Time Machine backups, which came in very handy as we began to troubleshoot the problem.

So, we had:

  • A 2013 Mac Pro, well equipped with D500 Graphics cards and 64GB of RAM
  • 2 x NEC ColorSync displays, connected via Mini DisplayPort and USB for calibration purposes
  • 2 x OWC Thunderbay arrays for storage, running with SoftRAID involved for redundancy and/or speed
  • 1 x Promise Pegasus R6 array
  • 1 x OWC Thunderbolt 2 Dock for additional USB storage
  • 2 x secondary USB 3 hubs for connecting peripherals, including some film scanners and some printing tools

There’s a lot of hardware here.

Where Do We Start?

Well, there’s the obvious steps, the ones that would surely be on everyone’s list. Suffice to say, they’re easy, so we tried them first:

1. PRAM Zap
2. SMC Reset

They were entirely fruitless. The machine was heard audibly guffawing as we tried it. Still, I know Apple’s going to want us to do it before we beg for warranty support, so, we did them.

Next were removing hardware factors:

3. Disconnect Everything But The Displays, Keyboard and Mouse

I was honestly hopeful this was going to be it, because it was going to point to some sort of weird hardware combo that led to a race condition at shutdown time or something.

4. Update All The Things

SoftRAID was a point release or two out of date, so we updated the kernel extension and driver, and tried again.

This was also not it.

5. Reinstall the Operating System

This used to be an awful experience. Thankfully, it’s not anymore. Boot from Recovery, run the installer again, restart, cross fingers, sacrifice goat, dance naked under the light of the full moon.

Failure. Again.

6. Safe Boot

This was where things took an interesting turn. And actually, we did this part for the first time after step 3, but we started to look at the serious methods below to look for a permanent solution.

After a Safe Boot, we were able to shut down the machine.

Safe Boot, as Apple helpfully explains, will:

  • Load only the required System kexts
  • Prevent LaunchDaemons and LaunchAgents from loading
  • Disable user-installed Fonts
  • Delete Caches for the Kernel and the System, and reset the Font caches

That isolated our problem to one of those four areas. As the /Library/Extensions folder is protected by System Integrity Protection, starting there seemed foolhardy. We opted for the LaunchDaemons and LaunchAgents.

At first, I started by disabling a few that we might be able to live without entirely, leaving some key LDs and LAs in place to keep the environment usable in the event that solved the problem. That was not helpful. I eventually did what many old-school Mac Admins will remember doing: disabling them all, in the hopes that a clean boot would identify the culprit. Then, at least, you can use split-half testing to identify which of these objects was causing the problem.

But this wasn’t successful either.
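For anyone repeating the LaunchDaemon and LaunchAgent experiment, here is a rough sketch of a single split-half step: move half of the plists in a folder into a holding folder, restart, and see whether the problem follows. The function name and the holding folder are my own placeholders, and this is a sketch of the idea rather than the exact commands I ran.

```shell
# Move the first half (alphabetically) of the .plist files in SRC into HOLD.
# Prints the original path of each file it moves. With an odd count,
# the larger half is moved. Runs in a subshell so no variables leak out.
quarantine_half() (
  src="$1"
  hold="$2"
  mkdir -p "$hold" || exit 1
  # tr strips the padding that some versions of wc add.
  total="$(find "$src" -maxdepth 1 -type f -name '*.plist' | wc -l | tr -d ' ')"
  half=$(( (total + 1) / 2 ))
  find "$src" -maxdepth 1 -type f -name '*.plist' | sort | head -n "$half" |
  while IFS= read -r plist; do
    mv "$plist" "$hold/" && echo "$plist"
  done
)
```

Run as root, something like `quarantine_half /Library/LaunchDaemons /var/tmp/ld-hold` (the holding path is hypothetical) sets aside half of the daemons. If the machine then shuts down cleanly, the culprit is in the holding folder. If not, move the files back and repeat with the other half.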

The Fonts definitely weren’t the problem, so we left that part alone.

This was the point at which we began to consider Serious Measures™ to fix the issue.

But First: Why This Method?

Troubleshooting a problem of this difficulty is an effort to balance severity of the solution’s effects on the user’s working environment and the ability to seek out a non-destructive solution. It would be cavalier to just wipe the internal drive and re-stage the machine without knowing what wouldn’t fix it, and it could be destructive to the workflow of the user, so we left that for last. What we opted to try were the solutions I’d call the “quick” fixes:

  • Zap the PRAM
  • Reset the SMC
  • Safe Boot to clear the System and Kernel Caches

These steps can solve tricky problems and they do it in just a few minutes, getting the user back on their feet quickly. These are non-destructive solutions as they only dump resources that are quickly rebuilt by the system in a programmatic way.

What I really wanted in the midst of all this was an equivalent to the verbose boot for the shutdown process, but that process doesn’t exist, and searching through the Console and System log in the 10.12+ era is worse than looking for a needle in a haystack. So I started to look for key identifiers of a potential solution by eliminating variables.

Searching for a hardware problem can be a challenge, especially if you have bus conflicts, or related issues due to the large number of USB and Thunderbolt devices in play, so I removed those from the equation early to eliminate a key source of potential interference in the system’s good operation.

With those gone, reducing the number of root-level processes seemed to be the next key target, as user-level events were eliminated by the logout of the user. A quick attempt at culling ancient remains of programs long out of use, but whose LaunchAgents and LaunchDaemons remained behind, was part of what came next. I was sincerely hoping that there was some obscure abandoned LD or LA that was triggering our failed shutdown, but with those gone, we were left with just one solution.

As a last-ditch effort, we booted from an external clone to test whether or not it might be the internal SSD that was causing the issue. The failure to shut down was independent of the volume that it was booted from, and related purely to what was stored in the OS.

The Solution: Nuke. Pave. Migrate Back.

Yes, this is the sad end to this tale. We were left with a solution that was unappealing in its chance to damage the user’s workflow, but we were out of options. So, we opted for:

  1. Create a bootable clone of the boot volume
  2. Back the boot volume up to a Time Machine destination (or two. or three.)
  3. Boot from Recovery
  4. Wipe the boot volume
  5. Reinstall macOS High Sierra from Recovery
  6. Test the system’s functionality to look for a hardware error.
  7. Once testing is complete, use Migration Assistant to restore from the bootable clone.

This is what solved our issue.

I suspect we had a rogue or old kext that was protected by SIP, and had we disabled SIP and done split-half testing we might have found it sooner.
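If you want to chase the kext theory on your own machine, a reasonable first look is the list of loaded kernel extensions that don't come from Apple. This is a minimal sketch, assuming `kextstat` is available as it is on 10.13. The function name is my own, and it simply drops any line mentioning an Apple bundle identifier.

```shell
# Read `kextstat` output on stdin and drop every line that names a
# com.apple.* bundle, leaving the header and any third-party kexts.
non_apple_kexts() {
  grep -v 'com\.apple\.'
}
```

Usage is `kextstat | non_apple_kexts`. Anything left below the header line is a candidate for the same split-half treatment.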

But, this is how I worked the problem, and I hope that if you’re reading this in front of your Mac Pro that won’t shut down, you might take some ideas from this post.

Some Thanks

Many thanks to John Lamb, Eric Holtam, Ron Sanders, Owen Pragel, and Graham Gilbert for advice and encouragement!

With Apologies to the Bard…

Friends, Mac Admins, countrymen, lend me your ears.
I come to bury Server, not to praise him.
The evil that people do lives after them;
The good is oft interrèd with their bones.
So let it be with Server. The noble Mac Admin
Hath told you Server was ambitious.
If it were so, it was a grievous fault,
And grievously hath Server answered it.
Here, under leave of Server and the rest—
For the Mac Admin is an honorable person;
So are they all, all honorable people—
Come I to speak in Server’s funeral.
Server was my friend, faithful and just to me.
I speak not to disprove what some Mac Admin spoke,
But here I am to speak what I do know.
You all did love them once, not without cause:
What cause withholds you then, to mourn for them?

The End of Munki-in-a-Box

Today, Apple released some updated guidance for macOS Server. Apple will be deprecating the following services in Server.app in the Spring: Calendar, Contacts, DHCP, DNS, Mail, Messages, NetInstall, VPN, Websites, and Wiki.

With no native Websites functionality, Munki-in-a-Box will cease to function correctly against a Server.app installation.

I’m faced with a few choices:

  • Pick a Webserver to install
  • Pick a method to install it
  • Refactor MIAB to do just that
  • Make sure that this continues to work going forward

None of these seem like good value propositions when I’m moving a different direction for my practice (that is, S3 + CloudFront + Middleware).

Perhaps my time would be better spent creating documentation around a standardized basic solution for Munki-in-the-Cloud instead. Got an opinion either way? Bug me on the Mac Admins Slack.

Interviewing Chip Pearson

We interviewed Chip Pearson of Jamf Software this week on the pod, and the episode dropped today. If you’ll pardon the cliché, it’s our best episode yet. Chip gives us a view into what it was like to be a Mac Admin in the days when Timbuktu was king, when AppleTalk Zones were still a big potential hurdle, and the Mac IIcx was the workstation du jour. He talks about quitting his job one morning, and what it’s like to move from consultant to product to chairman of the board.

I remember very clearly seeing Chip speak at Macworld when I was still a very wet-behind-the-ears IT consultant, and his message of empathy and focus on the role of the individual in tech’s embrace within your organization was critical.

This was a fun interview, and it’s worth your time. Thanks Chip!

We All Come From Somewhere

This photo, from the Davis Enterprise, circa 1986-ish?, shows me with my third grade teacher, my elementary school principal, my Dad, and an engineer from the University in the town I grew up in.

I am absolutely a product of growing up in a University town, with involved parents, with schools that saw the future early and had the connections to figure it all out.

The next generation of people that do what we do now are out there, and they have to be more representative of our world than our current community is.

We all come from somewhere. There’s a spark in everyone’s life that makes a job a career. We can be the principal, the engineer, the parent, the teacher. Who are we preparing for what comes next? That’s what Carole Franti, Mary Ellen Dolcini, Adam Bridge and Charles Soderquist taught me, starting with that Apple ][. That’s why I give talks and teach workshops now.

Complex is not a Pejorative

From Volume 100 of Techno Bits:

The existing complexity of DEP, though, gives us choices. There’s no reason we couldn’t setup multiple MDMs for multiple departments within an organization, allowing central management of assets, and separate management of devices at the department levels, allowing for good competition between MDM vendors in the mid-level of the environment. Having multiple options is good, because it gives us choice, and it avoids obvious anti-trust complaints.

Read the whole thing. Complex shouldn’t be immediately pejorative. Good management can, and almost should, be considered a complex task.