my personal blog about systemcenter

Archive for April, 2016

The Argument for the Anti Home Lab


First off, knowledge is king.

Second, if it’s faster/easier/better to use a home lab, don’t stop 🙂 keep learning.

That being said, it’s beyond my understanding why more people don’t use shared labs.

Pooling resources is what we have been doing for the last decade, and for many years before that.

If everyone is buying NUCs that aren’t utilized all of the time, why is that a good idea?

If you have unlimited money to spend on home labs, keep going down that path.

There are plenty of arguments against a shared lab:
     Red tape
     X broke my lab, now I have to rebuild
     Y powered off my server during a demo
     Company won’t pay for hardware
     Company won’t pay for power/colo
     We won’t use people to maintain it; we are a consulting company, not a hosting company
     Just use Azure/Ravello/AWS/Google Cloud/Air
     If I leave the company I lose access to the lab

If each company that has consultants took 1 or 2 hours of billable time each month and “used” that for a shared lab, I believe everyone would be better off.
And since many agree that pay isn’t always the first reason to join a company, perhaps offering access to a real playground will give you easier access to talent and, if nothing else, help retain the people you have.

So can a company really afford not to provision lab equipment for their staff?

We also have some good friends who help out sometimes, and they get access to the playground; help can be anything from hardware/software/time to just being good guys/girls.

We are a small company (even by Danish standards), but we still have and maintain a rack where we keep our gear for testing/playing/demoing.

One of the reasons is that many (at least imo) don’t want hardware at home even if the company paid for it: it’s noisy and uses too much electricity, and at least when we are using edge stuff, the lack of multiple public IP addresses is a pain.

If I didn’t have a decent lab, I wouldn’t spend the time I do poking around.

Waiting for slow hardware isn’t good for my mood
Fabric work is hard to do purely virtual, you need iron at some point (for now)
Forgetting to power off cloud usage is a pain
Forgetting to power on before a demo is a pain
Having to power on to test/check something is a pain

So what did we do?

Got a rack at a colo with a 100 Mb/s connection (not impressive, but more than enough for playing around) and 32 IP addresses, with IPv6 almost there.

Bought old servers: if someone decommissioned an old server, we bought it rather than letting them throw it out.

This meant that our playground, at very little cost, went from a few old HP servers to a fully stuffed C7000 with 16 blades and now back to rack servers.
The cost, between brokers and used hardware, has been in the range of 1-2 hours per person per month. We gave away most of the old blades, as we didn’t have any use for them, so they live on helping other places; same for the C7000.
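
To make the "1-2 hours per person per month" figure a bit more concrete, here is a minimal sketch of the arithmetic in Python. The head count and the hourly rate are purely hypothetical placeholders, not our real numbers; plug in your own.

```python
def monthly_lab_budget(people: int, hours_per_person: float, hourly_rate: float) -> float:
    """Return the monthly lab budget as the value of the billable hours set aside.

    All three inputs are assumptions you supply yourself; nothing here
    reflects actual figures from the post.
    """
    if people < 0 or hours_per_person < 0 or hourly_rate < 0:
        raise ValueError("inputs must be non-negative")
    return people * hours_per_person * hourly_rate


# Hypothetical example: 10 consultants at an hourly rate of 1000 (any currency).
PEOPLE = 10
RATE = 1000

low = monthly_lab_budget(PEOPLE, 1, RATE)   # 1 hour per person per month
high = monthly_lab_budget(PEOPLE, 2, RATE)  # 2 hours per person per month

print(f"Monthly lab budget: between {low:.0f} and {high:.0f}")
print(f"Yearly lab budget:  between {low * 12:.0f} and {high * 12:.0f}")
```

Even with a small head count, the yearly figure is usually enough to pick up decommissioned servers from a broker, which is the whole point.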

Rack and power are covered, but we don’t power on servers that aren’t used, so we try to be reasonable; there is no point in having 5 hosts online that aren’t used for anything.

Moving from 1G to 10G cost a bit for the switch, and we don’t have RDMA or 25/50/100G in scope, but 10G is good enough for most of the testing, as it isn’t performance we are benchmarking.

And with S2D for Windows Server 2016 we ended up with 4 NVMe boards and 20 SSDs, again small sizes and not enterprise class, but good enough for testing and way better than simulating everything in VMs.

For storage we mostly have DAS, but we also have a few NetApps, one of which was rented out to a customer and then returned, so it’s “free” as the rent covered the base cost.

How does it look?

We try to structure it into Demo, Playground and Reserved

Demo       : Stable environment, controlled changes so it’s always ready for a customer demo, no breaking the demo allowed except for named VMs for DR/Backup
Playground : Everything goes, don’t expect anything to be rock solid, still don’t break anything on purpose, send an email to warn people if you do crazy stuff

Reserved   : Named host / part of a host / VLAN, dedicated to a named user

And when I say “try”, it’s because during the whole Windows refresh cycle rebuilds are more frequent than we prefer, but it will become more stable as we go toward GA of the 2016 wave.

And once in a while, broken access requires heading onsite (killed the firewall with an upgrade the night before a holiday).

Learnings: DO NOT KEEP any production on any part of the environment; separate compute/storage/firewall/IP, everything.

We have some servers in the same rack that are “production”, but the only things that are shared are the dual PDUs, and we haven’t broken those yet.

Licensing is covered mostly by trials and a few NFR licenses, which again ties back to the refresh of whole environments.

What can’t we do currently ?
        Top of mind items for now

        No NSX
        No RemoteFX/OpenGL
        No FC integration (currently)
        No TOR integration (aka no Arista)
        No Performance Testing (consumer grade SSD, no RDMA, 10G)
        No Very Large Scale Testing (2k+ VMs; with dedupe and delta disks we probably could, but not full VMs)
        No Virtual Connect or anything fancy blade-wise from other vendors (but again, we don’t believe in blades anymore)
        No hardcore fault domains

Wish List for near future

        RDMA / High Performing Storage
        More Azure running aka balance between credits and features aka send more money

Wish List for 2018+

        Everything in Azure   

/Flemming , comments at [email protected] or @flemmingriis