Scaling down the home lab, slightly

Since my last post about my new Zion 4U storage server I have been more and more impressed by Unraid and have started using Docker and KVM virtualization on it.

Since then I have moved some VMs over to KVM virtualization on Unraid, which left only my Domain Controllers, IPAM, and VCSA on my VMware environment. My secondary DC was also broken, so I was running with just a single DC. This made me re-think how the home lab was configured, and I decided to simplify some of the many moving parts.

The main hassle is recovering from a power outage. Typically I have to manually intervene to get most things back online. Having systems more consolidated will allow a faster and more hands-off recovery.

Continue reading…

Website Refresh! v2.0

The vSkilled blog website has had some major improvements and is now officially launched as version 2.0!

The previous design had been in use since late 2014. Over time, design elements and plugins stopped working altogether or started causing various issues. I had worked tirelessly to improve the page loading times but had exhausted all my options on the old design. I knew a new design was going to be needed, and I began slowly scoping out what I wanted for the new website refresh.

Version 2 (2017 – present)

Version 1 (2014 – 2017)

As you can see, I wanted to keep a similar layout, just simplified and easier to maintain, and I believe that has been accomplished. The cleaner design looks more professional and is easier to read. I think the single-post style instead of a post grid also makes the front page more attractive and relevant. Continue reading…

Home Lab Rebuild

Some changes to my home lab have been long overdue. The latest full outage on Sept 4, 2017, caused by a power brown-out, had me realizing that improvements could be made. There have not been any major changes to the lab since 2015. In 2016 I upgraded the storage in NAS1, upgraded the memory in VMH02, and added Ubiquiti UAP-AC-LITE access points and a security camera.

Now I’m going back to the drawing board and doing a fresh rebuild. The goal this time around is to be simple and redundant.

  1. Hardware firewall: I have custom built a 1U Supermicro server that will be used as the new firewall. It has an Intel Xeon X3470 CPU, 8GB RAM, quad gigabit LAN ports, and a low-power 200W power supply. I’ve also replaced the stock passive CPU heat-sink with the Thermaltake Engine 27 low-profile heat-sink. It’s a well-balanced combination of performance, power, and noise. In the old lab design the virtualized firewall introduced too many dependencies and greatly increased the complexity of the network. During a power outage it also required me to have a VM host and storage online, which do not last long on UPS batteries. Having a low-power hardware firewall gives me more flexibility and faster recovery from a total lab black-out.
  2. Additional UPS backup power: There will now be a third UPS battery for the home lab. I will dedicate one UPS to the core networking equipment and try to keep the load on it under 25% to maximize the battery runtime. The rest of the gear will be balanced over the other two UPS batteries.
  3. Standard Virtual Switches: I will be removing the Virtual Distributed Switch and LACP on the ESXi hosts. This is a tough call, but I have weighed the options. The VDS in my environment is overkill: I have two hosts, with only one of them on at a time, so in my scenario the VDS’s only purpose is configuration sync. I don’t use traffic shaping, private VLANs, LLDP, etc.! The only loss in moving down to a VSS is having to manually keep the port groups identical on each host, plus no LACP. That doesn’t concern me because the port groups hardly ever change, and a small script can keep them in sync, as sketched after this list.
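Since dropping to standard switches means keeping the port groups identical by hand, a short PowerCLI script can take most of the pain out of it. The sketch below is just one way to approach it; the vCenter address, host names, vSwitch name, and the port group/VLAN list are all placeholder examples, not my actual configuration.

    # PowerCLI sketch: push the same port groups to vSwitch0 on each host.
    # All names and VLAN IDs below are placeholders.
    Import-Module VMware.PowerCLI
    Connect-VIServer -Server "vcsa.lab.local"

    $portGroups = @(
        @{ Name = "VM Network"; VlanId = 0  },
        @{ Name = "Servers";    VlanId = 10 },
        @{ Name = "DMZ";        VlanId = 50 }
    )

    foreach ($vmHost in Get-VMHost "vmh01.lab.local", "vmh02.lab.local") {
        $vSwitch = Get-VirtualSwitch -VMHost $vmHost -Name "vSwitch0" -Standard
        foreach ($pg in $portGroups) {
            # Only create the port group if this host doesn't already have it.
            if (-not (Get-VirtualPortGroup -VirtualSwitch $vSwitch -Name $pg.Name -ErrorAction SilentlyContinue)) {
                New-VirtualPortGroup -VirtualSwitch $vSwitch -Name $pg.Name -VLanId $pg.VlanId | Out-Null
            }
        }
    }

    Disconnect-VIServer -Confirm:$false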

Continue reading…

Disaster strikes as NAS3 crashes

This past weekend we had a power brownout for about 4 hours, which caused my servers to fail over to battery power. The batteries don’t last long with servers running. Something apparently went sour with the automatic shutdown of NAS3, which is used only for my VMware virtual machines, and it shut down improperly. The RAID array has crashed.

I don’t have anyone to blame but myself, and I knew this day would eventually come. NAS3 was in RAID-0, meaning striping with no redundancy, and a failed RAID-0 array typically means total data loss. I take nightly backups of this entire NAS, so I am aware of and prepared for the risk of using striping. That does not make recovering from it any more fun.

Adding additional redundancy for blackouts

One of the hardest things to recover from in my current home-lab environment is a total power blackout. Everything right now is planned and designed around losing individual components: one disk, one switch or network cable, and so on. However, when everything is off and I need to bring things back online, it’s a painstaking and very manual process. Over time my environment has also become more and more complex. This latest outage has me scratching my head over how to recover faster and more simply from a power blackout.

Continue reading…

Tuning Large Windows DHCP Servers

I’ve been involved in setting up some very large Windows DHCP deployments during my time working as a consultant at Long View Systems. Along the way I’ve found some interesting challenges and caveats of using Windows DHCP, especially when you’re working with DHCP-enabled dynamic DNS updates. I wanted to write a quick post about this for my own reference, and hopefully it will come in handy for others as well.

  • DHCP Failover Scopes
  • Administration Overhead
  • DhcpLogFilesMaxSize
  • DynamicDNSQueueLength
  • DnsRegistrationMaxRetries

DHCP Failover Scopes

I’ve covered this topic extensively in my Windows Server 2012 R2 – DHCP High Availability / Fail-over Setup Guide series. Basically, if you are deploying Windows DHCP on Server 2012 or newer, you should be using DHCP Failover (not to be confused with split-scope or MS clustering).
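For reference, creating a load-balanced failover relationship is a one-liner with the DHCP cmdlets. This is only an illustration; the server names, scope ID, and shared secret below are made up.

    # Illustration only: 50/50 load-balanced failover for one existing scope.
    # Server names, scope ID, and secret are placeholders.
    Add-DhcpServerv4Failover -ComputerName "dhcp01.corp.local" `
        -PartnerServer "dhcp02.corp.local" `
        -Name "dhcp01-dhcp02" `
        -ScopeId 10.20.0.0 `
        -LoadBalancePercent 50 `
        -MaxClientLeadTime 01:00:00 `
        -AutoStateTransition $true `
        -SharedSecret "use-a-strong-secret-here"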

Administration Overhead

If you’re working with more than 100 scopes using only the default DHCP MMC snap-in, you’re gonna have a bad time.

Almost 1,000 DHCP scopes, 150k+ IP addresses

Performing administration tasks in the console with a large number of scopes becomes very repetitive and time consuming, as each task normally requires many clicks. Making mass changes is also very difficult or next to impossible. You may find yourself turning to PowerShell scripting to resolve this problem. The DHCP Server cmdlets in Windows PowerShell are very easy to use, and Microsoft has great documentation on them. I found myself writing PowerShell scripts to make mass changes much easier and less vulnerable to the human error that the very repetitive default GUI invites; a short example of such a script follows below. Continue reading…
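As an example of the kind of mass change that is painful in the MMC but trivial in PowerShell, the snippet below points every scope on a server at a new pair of DNS servers. The server name and DNS addresses are placeholders for illustration.

    # Placeholder example: update DHCP option 6 (DNS servers) on every scope.
    $dhcpServer = "dhcp01.corp.local"
    $newDns     = "10.0.0.10", "10.0.0.11"

    foreach ($scope in Get-DhcpServerv4Scope -ComputerName $dhcpServer) {
        Set-DhcpServerv4OptionValue -ComputerName $dhcpServer `
            -ScopeId $scope.ScopeId `
            -DnsServer $newDns
    }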