Posts tagged Development

Why I’m building a Custom Development Home Server

I’m a pretty diehard Azure guy and have been since PDC09, when I was trying to get storage tables/queues and VMs implemented at my job at the time. Almost all of my projects land on Azure; it’s just the easiest place for me to put them, if a little bit spendy.

I wasn’t always an Azure or cloud guy. Before I was a developer I was a systems administrator, responsible for servers ranging from SQL to Exchange at a small company. We had 4 full racks of servers, plus a bunch of others with network equipment, firewalls, UPSes and the like. During those days I couldn’t imagine not having control of the metal, and there’s almost nothing like unpacking a new piece of equipment, setting it up and getting it racked.

Back in those days, I had racks and servers all over my place: all the old stuff we decommissioned from the office. When I had all of it up and running the noise was insane, but I was used to it from working in the server room most of the day.

Oh how times have changed. Besides a beefy workstation and some other small devices and systems, everything else I do is Azure based and has been for years. But try as I might, there are just some things that I need on-prem at my house, so I will take you through that thought process.

First, where I live doesn’t have the best Internet. I now pay for business-grade Charter Spectrum, but the upload on that service is still pathetic, only 10Mbps. That may sound like a lot, but I plan on having some cameras installed at my place and each one of those requires 2Mbps. I used to use CrashPlan for my backup solution, but with them going business only I took a hard look at what I was backing up, and between $100 or more a year and the cost in resources (upload) it was pretty steep. It just made sense to keep backups in-house with occasional syncing to OneDrive or some other storage provider.
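
To put numbers on that upload squeeze, here is a minimal sketch in Python. The 10Mbps and 2Mbps figures come from the paragraph above; the function name and the idea of reserving some headroom for backups and everything else are assumptions made purely for illustration.

```python
def cameras_supported(upload_mbps: float, per_camera_mbps: float,
                      reserved_mbps: float = 0.0) -> int:
    """Number of camera streams that fit in the upload link after reserving headroom."""
    usable = upload_mbps - reserved_mbps
    if usable <= 0:
        return 0
    return int(usable // per_camera_mbps)


# 10 Mbps up, 2 Mbps per camera: five cameras would saturate the link entirely.
print(cameras_supported(10, 2))
# Keeping a hypothetical 4 Mbps free for backups and normal use leaves room for three.
print(cameras_supported(10, 2, reserved_mbps=4))
```

Either way, there is not much upload left over for cloud backups once the cameras are streaming.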

So I looked into NAS options, finally settling on a Synology one, but pricing it out was pretty ouch: $500 for the unit, plus the drives I wanted to add to it (with the expansion frame). All in, it would be about $1,500 for the NAS and the storage (with fault tolerance, i.e. RAID5) that I wanted. That’s a lot of money!

Recently I’ve started to work on other skills, mainly Containers and non-Microsoft development stacks. Although I could use Azure for this, spinning up and keeping that stuff in the cloud could get pretty expensive. Do I want to spend $300+ a month for some VMs and serverless resources?

So I decided a local server would solve a lot of my needs. I could use it as a NAS, as a server to test deployments, and for any long running processes (like a TeamCity build server). So I will need a system with enough cores, RAM and disk space for all those workloads.

An off the shelf system is also going to be a little on the big and loud side, and I’m not too interested in having another jet turbine in the office. Based on that and the price of a system with enough cores, I’m going to have to build it myself.

I found this blog post, Build Your Own 32 Core Home Lab Server, and it pretty closely matched my needs, so I decided to give it a go. The post is over a year old now, so I will chronicle my experience buying the gear now, the cost and the setup.

So here are my targets. Ideally I’d like to get a 48 core system, but at a quick glance some of those procs seem pretty expensive. I want to be able to support 6 decently spec’ed VMs; I’m equating those to D4MS Azure instances, which are 4 Core, 16GB of RAM and cost about $167/mo.

CPU 40 Core
RAM 128GB
HDD 32TB RAID 5

My goal is to keep the cost below $2,500 (which would be 1k above a fully loaded NAS), which would start paying for itself after 14 months. As much as I can I’ll be trying to buy the parts new from Amazon, and then fall back to eBay when needed (for cost or availability).
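
The payback math can be sketched roughly like this in Python. The $2,500 budget and $167/mo VM price come from the post. Which monthly cloud spend the hardware is compared against is an assumption here, and power, cooling and my own time are ignored. Measured against a single VM of that size running all month, the result lands close to the 14 months quoted above; measured against all six running all month it is far shorter.

```python
import math


def months_to_break_even(hardware_cost: float, monthly_cloud_cost: float) -> int:
    """Whole months of avoided cloud spend needed to cover the hardware cost."""
    if monthly_cloud_cost <= 0:
        raise ValueError("monthly_cloud_cost must be positive")
    return math.ceil(hardware_cost / monthly_cloud_cost)


BUDGET = 2500       # target build cost from the post
VM_MONTHLY = 167    # approximate monthly price of one 4 core / 16GB VM, from the post

# Compared with one such VM running all month (assumed comparison).
print(months_to_break_even(BUDGET, VM_MONTHLY))
# Compared with all six target VMs running all month (assumed comparison).
print(months_to_break_even(BUDGET, 6 * VM_MONTHLY))
```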

If you’re a First Responder or know one check out Resgrid which is a SaaS product utilizing Microsoft Azure, providing logistics, management and communication tools to first responder organizations like volunteer fire departments, career fire departments, EMS, search and rescue, CERT, public safety, disaster relief organizations.

Kicking the can down the road

Being a developer is an interesting profession. On one hand there is an engineering feel to it, but on the other hand it’s far more like art. For me, calling myself a “Software Engineer” is more for marketing than anything else. But let’s face it, little of what developers/programmers do is actual engineering.

Engineers are certified, go through an apprenticeship process, then design and create things in the real world that, for the most part, have to stand the test of time. To me Electrical, Mechanical, Structural, etc. are the real engineers, and we developers haven’t yet earned the right to use that title in the way we do.

How often in a development meeting do we sit there, talk about a design/architecture/code flaw that could cause issues down the road, but decide to not even begin to address it? It happens so often we have an industry term for it: “Tech Debt”. Can you imagine some structural engineers having the same discussion?

Tom: “So I cobbled together this bridge design from a bunch of other actual bridge designs and sample designs and we look ready to go.”

Mike: “Nice, but looking at these designs, if we have the amount of traffic we estimate, and we fully expect the bridge to be popular in the near future, it will start leaning to one side.”

Tom: “Yeah, but we have a deadline, we can always go back and add more supports later.”

Structural Engineers can’t go back and refactor the bridge after it’s been built as easily as developers can with code. There are also lots of other differences, but what it really boils down to is that engineers can’t kick the can down the road as easily as developers can, and so we do.

We talk about “performance as a feature”, but what about “ease of maintenance as a feature”, “scalable architecture as a feature”, “security as a feature” or “testing as a feature”? Every time we kick something down the road, label it as “Tech Debt” and put it in the backlog, if we’re being truthful with ourselves we know full well that unless it catches fire we will almost never be back to address it.

Eventually all that Tech Debt will catch up to you and when it does, it can cost you business, alienate customers, hurt your image or even cause your company to fail. I’ve talked about this problem before and at Resgrid I try and follow a 60/40 approach. 60% new features/customer facing bugs, 40% tech debt, testing, automation and tooling.

Here are some guidelines I feel will help stop us from kicking cans down the road and start turning developers into engineers.

  1. Design/Code for the developers around you.
    One thing I’ve always admired about the military is the “Do it for the person next to you” attitude most service members have. The same can be said for the fire service. Sure, you start off wanting a thrill, but after the first few times you’re going in because it’s your duty and because you’re doing it for your crew. Developers need to have the same mentality: don’t code for yourself or for your company, craft code for the developers on your team. Do it for the person in the cube next to you.
  2. Balance Architecture & Implementation
    I’ve seen my fair share of architecture astronauts in my day and they can do more harm than good. The architecture needs to match the problem domain and how it’s going to be implemented. The architecture you use for an internal LoB app will be different than one that will be deployed on the cloud. No architecture is perfect and it never will be. Ideally you need to design your architecture for the worst case scenario 5 years out. How many users do you expect in 5 years? Double that, then ask how you are going to handle it. How many web servers and databases will you need? Is that the scale at which you should have been using eventing, CQRS, etc.?
  3. Code for now and for the future
    If your response to something new is “well that’s the way we’ve always done it” or some variation of it, you’re coding in the past. Yes, it’s painful to constantly keep up to date with the latest trends and best practices, but you’re hurting yourself and the entity you’re working for by not incorporating the current best practices. You will find it increasingly hard to find developers that want to work, or even know how to work, in your ‘brand new’ code base. I’ve been interviewing a lot of developers lately and almost none of them with less than 7 years’ experience have ever worked on a WebForms app. Starting a new project? Using WebForms? You will find it very difficult to find developers that know that technology in the near future. This same principle goes for patterns & practices.
  4. Don’t Invest in Tooling/Automation too Early
    When you’re first starting out on a project, get the minimum amount of tooling/automation you need. This is a train of thought used by startups. Time spent on setting up elaborate tooling and complex automation is time not spent properly architecting your project, implementing features or fixing bugs. Because tooling/automation lives out-of-band of your core project you can cycle on it more quickly, and once you’ve done things by hand for a while you know exactly where the pain points are. When you first start off, you’ll just be guessing. A lot of tooling/automation is coupled to your environment as well, so you may start off deploying locally but then move to Azure, and your tooling will have to change at that point. Start working on automation/tooling once your project is established, is maintaining good velocity and your target environments are well known.
  5. 60/40 Every Sprint, In Every SDLC Phase
    Whether you’re just starting your app or are in maintenance mode, balance the work between features/bugs (the 60%) and tech debt, testing, automation and tooling (the 40%). Break down complex technical debt items into smaller pieces and work on them every sprint. I call the 40% bucket “Preventative Code Maintenance” and it should be the #3 priority on your backlog at all times.
  6. Document your culture and live it
    Have coding contracts and guidelines before you start any project. Your codebase, especially early on, should be unified and feel cohesive. If I go into Mike’s code it should feel like Sally’s. Utilize tools like StyleCop/FxCop and ReSharper to nudge people in the right direction. Utilize Pull Requests, Peer Reviews or Pair Programming to keep your codebase like a well run HOA. No one developer should ‘own’ code; go in, clean up code and fix broken windows. Practice Scout Coding at all times. Not everyone on the team needs to be in complete agreement on something, but once the team commits to it everyone needs to be on board. You can either succeed as a team or fail individually.
  7. “Whatever” as a feature
    When you’re documenting what your application or system should do, right there should be “It should be performant, it should scale, it should be maintainable and it should be secure”. Even for internal only applications, if your app goes down or is slow, it’s costing you money in wasted employee time. You shouldn’t, on a whim, sacrifice performance or security to push out a new feature without the business knowing the full extent of the tradeoff.
  8. Automate Deployments with Smoke Tests
    At first glance this may seem contradictory to #4, but that item is based on timing. When you first start working on the project you won’t be deploying to a production or pre-prod environment. Much later down the road (hence the timing aspect) you should completely automate your deployments. In the day of Containers, Slot Deployments and CI servers you should never be manually modifying your production environment. One day you will mess up and it could cost you. Yes, the “rm -rf” guy was a hoax, but it’s also a cautionary tale.
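
As a rough illustration of #8, a post-deployment smoke test can be as small as "hit a few URLs and fail the deployment if any of them don’t answer with a 200". The sketch below is Python using only the standard library. The host name and paths are made up for the example, and the demo uses a stubbed fetch function so it runs without a network; in a real pipeline you would leave the default `http_status` in place.

```python
from typing import Callable, Iterable, List
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status for url, or 0 if the request fails outright."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:      # 4xx/5xx responses are raised as HTTPError
        return err.code
    except (URLError, OSError):   # DNS failure, refused connection, timeout
        return 0


def smoke_test(base_url: str, paths: Iterable[str],
               fetch: Callable[[str], int] = http_status) -> List[str]:
    """Return the paths that did not answer with HTTP 200."""
    root = base_url.rstrip("/")
    return [path for path in paths if fetch(root + path) != 200]


# Demo with a stubbed fetch and a hypothetical staging host.
fake_statuses = {
    "https://staging.example.com/": 200,
    "https://staging.example.com/health": 503,
}
failed = smoke_test("https://staging.example.com", ["/", "/health"],
                    fetch=lambda url: fake_statuses.get(url, 0))
print(failed)  # ['/health']
```

A CI server can run something like this right after the deploy step and roll back, or refuse to swap slots, when the returned list is not empty.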

We have to balance getting features out or getting the product out with all of the ‘back of the house’ concerns. But we have to remember that we spend our days in the ‘back of the house’. If the architecture isn’t good, the code isn’t well formed or doesn’t meet standards, or patterns and practices aren’t being followed, it’s only going to slow development down and cause bugs and arguments.

You will never be bug free

Recently I was on a call with a miffed client dealing with issues in a software product. Understandably they were upset that what they were paying for had an issue that impacted them. All of that is pretty SOP but then an IT guy pipes up:

“Do you guys have any testing? How did this make it into production? You should be catching all bugs during development!”

Coming from a business person this is somewhat of an expected statement, but from someone in the technology field this is borderline ignorant. Here is the cold hard fact: you will never, ever be bug free. If you think you are 100% bug free, you’re not; the bugs just haven’t been exposed or reported yet.

This has nothing to do with your methodology (Agile/Waterfall/Kanban), your delivery (SaaS, Mobile App, Desktop App) or your audience (Consumer, Business, Gov). This is just the reality of software development, but it’s not limited to just software.

Mariner 1

On July 22, 1962, Mariner 1 was launched on its way to Venus. A few minutes after launch Mariner 1 began to fly off course and the guidance system failed to correct it. As the spacecraft started to veer toward North Atlantic shipping lanes it was destroyed by the safety officer.

So what caused the issue? It was a typo.

Pentium

The year was 1994, and Intel’s brand new Pentium chip was out to the masses. After many years in development this was the chip to usher in a new era after the 486. How could this go wrong? Well, it did: the chips made some mistakes during floating point division.

What’s to blame? A faulty division table.

Mars Climate Orbiter

1962 too old? They didn’t have QA or automated testing back then, you say. Plus Pentium is a hardware issue! Well, on December 11th, 1998 the Mars Climate Orbiter launched and was on its way to Mars. On September 23rd, 1999 NASA lost contact with the orbiter as the craft started to enter orbit.

What caused this issue? One team used English units and another team used Metric units.
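
The mismatch is usually described as thruster impulse being reported in pound-force seconds while the consuming software expected newton-seconds. One common defense against this whole class of bug is to stop passing bare numbers around and make the unit part of the type. Here is a minimal sketch of that idea in Python; the `Impulse` class and its method names are invented for this example and have nothing to do with the actual flight or ground software.

```python
from dataclasses import dataclass

LBF_S_TO_N_S = 4.448222  # newton-seconds per pound-force second


@dataclass(frozen=True)
class Impulse:
    """An impulse value that is always stored in newton-seconds."""
    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value: float) -> "Impulse":
        return cls(value)

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        # Convert at the boundary so everything downstream sees one unit.
        return cls(value * LBF_S_TO_N_S)

    def __add__(self, other: object) -> "Impulse":
        if not isinstance(other, Impulse):
            return NotImplemented  # adding a bare number raises TypeError
        return Impulse(self.newton_seconds + other.newton_seconds)


total = Impulse.from_newton_seconds(10.0) + Impulse.from_pound_force_seconds(1.0)
print(round(total.newton_seconds, 3))  # 14.448
```

The caller has to say which unit a number is in when it enters the system, and mixing an `Impulse` with a plain float fails loudly instead of quietly producing a wrong trajectory.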

F-22 Raptor

I have first hand knowledge of how much dealing with dates, times and time zones sucks. But thankfully I’ve never encountered an issue quite like this one. In 2007, during a 15 hour flight from Hawaii to Okinawa, Japan, multiple on-board computers crashed when the planes crossed the International Date Line. The F-22 Raptor, with a simulated KD Ratio of 241-2, was grounded due to a software issue dealing with the International Date Line.

How many lines of code does it take to cripple an advanced fighter? Just a couple.

Flash Crash 2010

It was a calm and sunny day in NYC. The date: May 6th, 2010. The time: 2:32 PM. The next 36 minutes would go down in history for the NYSE. A trading algorithm ran amok by some accounts, or worked too well by others, causing a massive and almost instantaneous drop of the DJIA by 9.2%, an intra-day swing of 1,010.14 points.

How much damage can a guy in his basement do? A lot apparently.

Knight Capital

On August 2nd, 2012, between 9:30AM and 10:00AM, Knight Capital’s automated market making trading platform started generating erratic trades. We all know to buy low and sell high, but apparently in this case the system didn’t get that memo and started buying high and selling low.

The cost of some bad logic? $440 Million and a company.

The above stories are just a super small sampling of software bugs that made it past, in some cases, some very rigorous testing and QA. You think you have it tough running your software changes through QA, talk to anyone who’s worked in the aircraft or defense industry.

We should do everything possible to catch, fix or remediate bugs or issues before they make it into production. But no matter how rigorous your testing, QA or automation process is, there will always be bugs in production. Once you accept this as a fact of life you can look into minimizing the impact those bugs have.

First, monitor your production systems from every angle you can: hardware, software, traffic, errors, analytics, etc. When you develop a baseline of your system running normally you can use that profile to determine when it isn’t running properly.
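
A very simple version of "compare against a baseline" is to flag any reading that sits too many standard deviations away from what you normally see. The sketch below is Python; the latency numbers and the threshold of 3 are made-up values for illustration, and real monitoring products use far more sophisticated models than this.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomalous(baseline: Sequence[float], observed: float,
                 threshold: float = 3.0) -> bool:
    """True when observed is more than `threshold` standard deviations from the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return observed != mu  # a perfectly flat baseline: any change stands out
    return abs(observed - mu) / sigma > threshold


# Hypothetical response times (ms) captured while the system was healthy.
normal_latency_ms = [120, 118, 125, 122, 119, 121, 123, 120]

print(is_anomalous(normal_latency_ms, 124))  # False: within normal variation
print(is_anomalous(normal_latency_ms, 480))  # True: far outside the baseline
```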

Second, pay attention to production logs and automate notification of actual errors. Don’t put a bunch of Exception logging code all over the place and have a system notify you every time it’s tripped; that will just become noise and you’ll ignore it.
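
One way to keep notifications from turning into noise is to send the first occurrence of an error and suppress repeats of the same error for a while, keeping a count of what was held back. This is a minimal Python sketch of that idea; the class name, the five minute window and the error key format are all assumptions for the example.

```python
import time
from typing import Dict, Optional


class AlertThrottle:
    """Allow one notification per error key per time window and count the rest."""

    def __init__(self, window_seconds: float = 300.0) -> None:
        self.window_seconds = window_seconds
        self._last_sent: Dict[str, float] = {}
        self.suppressed: Dict[str, int] = {}

    def should_notify(self, key: str, now: Optional[float] = None) -> bool:
        # `now` can be injected for testing; otherwise use a monotonic clock.
        now = time.monotonic() if now is None else now
        last = self._last_sent.get(key)
        if last is None or now - last >= self.window_seconds:
            self._last_sent[key] = now
            return True
        self.suppressed[key] = self.suppressed.get(key, 0) + 1
        return False


throttle = AlertThrottle(window_seconds=300)
key = "NullReferenceException@Checkout"  # hypothetical error signature
print(throttle.should_notify(key, now=0))    # True: first occurrence goes out
print(throttle.should_notify(key, now=10))   # False: repeat inside the window
print(throttle.should_notify(key, now=400))  # True: window has passed
```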

Third, make it stupid simple for users to report errors and issues. Your users, for the most part, will not bother to inform you of most errors they encounter. They just don’t care about your software or service that much. It’s when it truly impacts their life or business that you will hear about it and by then it’s too late.

Finally, jump on customer-impacting production bugs right away. You ideally want to fix these issues within hours, not days. I’ve pulled many all-nighters fixing production issues and it’s not fun, but your users appreciate it. Keep them informed and over-communicate with a status/service page, on social media and via email.

Bugs in production happen; you are just fooling yourself if you think otherwise. It’s how quickly you address those bugs and how you handle the affected customers that really matters. Don’t put all your eggs in the “catch all bugs during development” basket, save some for the production and customer service side.
