Jan 05 2016

Today’s SDLC and Cybersecurity – Coding, SMEs, and Spaghetti

One of our friends in the insurance world sent us an email with an interesting article about the nature of “secure programming” (see here for the article) and how it has influenced some of the information security challenges the market is facing. I initially shared the thoughts below in reply to that email, but figured they were worth posting. Enjoy!

— —

Over 30 years ago, one of my personal heroes (Ken Thompson, coder extraordinaire, creator of the B programming language, and later a co-creator of Go) said something very profound in his Turing Award lecture (the Turing Award is akin to the Nobel Prize of computer science): “The moral is obvious. You can’t trust code that you did not totally create yourself. (Especially code from companies that employ people like me.) No amount of source-level verification or scrutiny will protect you from using untrusted code.” (“Reflections on Trusting Trust,” K. Thompson, 1984.)

I got into computers (hardware/software development) at around age 12, and as part of teaching myself C, Basic, Assembly, etc., I devoured anything published by the likes of Thompson and Dennis Ritchie, including said article. I was running a handful of popular national and international BBSs (Bulletin Board Systems were the pre-World-Wide-Web gathering places for techies, using dial-up modems), and wanted to create extensions for my BBS platform (Renegade) to let users do more cool things on my BBS than on the competitors’. I developed my own short Best Practices guide for the trusted few friends joining my development effort, which read: “(1) Never use GOTO unless coding in ASM; (2) Avoid reusing [someone else’s] code; (3) Compile-link-run after every function or 50 LOCs, whichever comes first; (4) Verify every heap allocation/deallocation; (5) …” Curiously, some 25 years later that Best Practices guide written by a kid still holds true (in fact, it was integrated into a “Secure eXtreme/Agile Programming Paradigm” a while back for a Fortune 500 company that heavily focuses on security).

I see three major problems with today’s software development lifecycle (SDLC). The full list is long, but three immediately come to mind as worth mentioning:

  1. Blind and ignorant code reuse

Principles such as DRY (“Don’t Repeat Yourself”), although they have merit in certain circumstances, have become the mantra of sloppy coders and newbies who are far too willing to cut and paste some random code into their projects and call it ‘best practices.’ Legal and ethical issues aside, true experts in the field have recently published studies disputing the benefits of these misguided methods, which academia and industry alike have advocated since the 1990s (including fanatical and often needless use of OO programming techniques, wrapping someone else’s work without thoroughly reviewing or even understanding its underpinnings, etc.). Some argue these ‘best practices’ were pushed during the mid-90s dot-com bubble to enable a code-cranking frenzy by the lowest common denominators of the software engineering realm (e.g., college freshmen ‘programmers’ who had made a clock in Java in their CS101 class and decided to drop out of college to chase the six-figure salaries being thrown at them). Also, by re-wrapping the same pieces of proverbial digital golden nuggets, employers could let our ‘IT guy’ buddies, liberal arts majors, etc. get in on the action, significantly driving down the cost of coders (which is exactly what happened from the late 1990s through the mid 2000s).

Was ‘democratizing’ software engineering by lowering the bar a bad thing? Absolutely not! I’m all for it. I think anyone who can should be a coder and contribute. But just because (almost) anyone can get into the passenger transport business today by signing up with Uber, Lyft, etc., it doesn’t mean they’re qualified to drive the school bus, haul semi-trucks full of precious cargo, or fly a passenger plane. Some minimum level of rigor and quality control is required, and it is obviously lacking, given the deluge of terribly written, buggy, and unsustainable code at the enterprise level alone (don’t even get me started on the other stuff that’s made its way into popular mobile app stores; hopefully those apps are not managing every hospital’s medical devices, yet…).

My point was demonstrated by Bob Beck during the Heartbleed fiasco: the coding atrocities his team found in OpenSSL during their code review were sobering, and sadly, I would put OpenSSL contributors at or above the 90th percentile of the software developer community in skillset and knowledge base; the majority of today’s software developer community seems capable only of writing (terrible) Java, JavaScript, Python, and Ruby code without even understanding the nuances of OSs, kernels, hardware, networking, security, etc. (see http://www.infoq.com/news/2014/05/libre-ssl-first-30-days for links to more of Bob’s excellent presentations, etc.). (Disclaimer: Of the programming languages I have used extensively, I am very biased towards C, C#, Python, and Erlang; very biased against Java, C++, and VBasic; and neutral with respect to JavaScript, ColdFusion, Go, Perl, PHP, OpenCL, OpenGL, CUDA, ADA, Fortran, SQL, Verilog, and VHDL.)

Another paper, from researchers at Stanford and UT Austin and published two years before Heartbleed, called OpenSSL (and the like) “the most dangerous code in the world” (https://dl.acm.org/citation.cfm?id=2382204). Without making assumptions about the library developers’ competence or the quality of their products, this study pointed out the horrendous mistakes made by application developers (including those at Google, Amazon, Chase, the Apache foundation, etc.) in using security libraries such as OpenSSL, which essentially meant the little padlock icon in the browser (supposedly indicating a secure SSL/TLS connection with a bank, email provider, etc.) meant nothing. Attackers equipped with some open-source sniffing tools and a weekend’s worth of YouTube tutorials could effectively eavesdrop on the traffic, and a few more hours of YouTubing on “packet crafting,” etc., would let them inject code into the traffic. To put things into perspective, those application developers in my opinion sit around the 75th percentile of the coding community in quality control and INFOSEC knowledge. You can imagine what kind of security one should expect from the garden-variety developer.
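The core finding of that study was broken SSL/TLS certificate validation in non-browser software. As a rough illustration (my own sketch using Python’s standard `ssl` module, not code from the paper), here is the classic mistake next to what application code should do instead:

```python
import ssl

# The classic mistake documented in "the most dangerous code in the
# world": silently disabling certificate and hostname verification,
# which turns TLS into unauthenticated encryption (trivially
# man-in-the-middle-able by anyone on the network path).
def insecure_context():
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False          # hostname no longer matched
    ctx.verify_mode = ssl.CERT_NONE     # any certificate accepted
    return ctx

# What application code should do instead: take the library's
# hardened defaults, which verify the chain and the hostname.
def secure_context():
    return ssl.create_default_context()
```

Note that the secure path is literally less code than the insecure one, which makes the prevalence of the mistake all the more damning.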

  2. The epidemic of the “Multicerted SMEs”

I remember in the 1990s, when I was running my own PC repair / networking business as a teenager, that many organizations were at the mercy of the stereotypical ‘IT guy’: the bearded, overweight former mechanic, plumber, etc. who took a handful of certification courses at a local technical institute, was trained in connecting wires, and maybe picked up how to run a handful of configuration scripts along the way. Needless to say, they were relatively inexpensive to hire (compared to a bona fide engineer), and I was quite happy delegating the easy-to-do tasks to them, allowing myself to focus on the more complex challenges. Unfortunately, today’s supposed ‘cyber experts’ (whatever ‘cyber’ means) come from similar backgrounds as our old IT buddies, but are even more inept, as the certification processes have become even more watered-down, more irrelevant, and more fraught with cheating. Sadly, as I recently experienced with a graduate of one of the US military’s most prestigious graduate schools, even a 4.0 GPA and an MS degree in ‘cyber’ are no guarantee of fundamental competence in core topics such as cryptology, networking, databases, backend/frontend technologies, low-level programming, high-level programming, algorithm development, testing, architecture / system design, probability and statistics, etc. This is by no means an isolated case, and outside of the likes of Google and Facebook, the majority of folks doing the hiring at most organizations (especially in the public sector) lack the technical skillset to put applicants through the technical paces during the interview. ‘Academic institutions’ and ‘cyber’ cash-cow curricula at established and lesser-known institutions alike have been popping up over the past decade, cranking out unqualified graduates with plenty of wall flair but deadly-bad skills behind an actual keyboard.

If ‘cybersecurity’ can be analogized to a car race, it is not reasonable to assume that any cab driver with a valid driver’s license would (or even should) be qualified to enter a Formula One race, but that is precisely what is happening today in the realm of ‘cyber’: newbies watch a couple of YouTube videos on IDA Pro and Kali, slap together a few dozen lines of lousy (and mostly pilfered) Python, and claim to be ‘developers’ or ‘cybersecurity researchers.’ Sadly, the consumers of cybersecurity are so desperate and ignorant that they often put their faith and trust in the hands of these Multicerted SMEs, with predictable results.

Am I hating on the IT guy, or the pseudo-technical degree holder who can crank out more PPTs on TP than there are asses to wipe? Absolutely not! I have the utmost regard for the handyman who takes care of the plumbing, roofing, concrete, painting, electrical, etc. at my house. He is one awesome MacGyver, and I’d be living in a tent if it weren’t for his ninja-like repair fu. But I don’t think it’s a good idea to have him and his team design and build the International Space Station, or a business high-rise in downtown Manhattan. A city built by a bunch of handymen looks like the favelas of Brazil: I’d never hate on their skills, but I’d prefer not to live there. In today’s IT/INFOSEC realm, we can’t escape the digital favelas, the ever-growing, all-consuming mountains of digital waste that surround every aspect of our IoT and pollute the very packets we sniff on a millisecond basis. A few of us actually see it, many more are bothered by it, and all of us are impacted by it. None of us seems able to stop it.

  3. Security alla Spaghetti

I recall from my time studying at Bocconi University in Milan that the Italian system of education, government, etc. had a tendency to borrow a proven concept from another nation, mix it heavily with unnecessary fluff, and present a highly complicated and less effective version of it as something new. For instance, the very same book on econometrics (written by my undergraduate professor) that ran only 200 pages in English had been translated into well over 1,000 pages in Italian, adding nothing of nutritional value to the original content other than tons of proverbial bleached-flour noodles. Why? So that Italian students keen on photocopying textbooks would be discouraged from pirating, since copying would cost more than purchasing a legitimate copy (this was the pre-digital-book era).

Sadly, the same has become the current state of cybersecurity: highly unqualified entities and individuals suffering from what I deem the Seven Deadly Cybersecurity “Eyes” (Ignorance, Irreverence, Impotence, Incompetence, Inadvertence, Inanity, and Irrelevance) are extremely prolific at churning out the PowerPoint-du-jour on ‘cyber’ and security, taking simple (though unproven and likely incorrect) concepts and presenting them as novel and effective solutions. Back in the 1970s through the 1990s, a seemingly endless parade of charlatans polluted the public airwaves and conferences posing as mentalists, psychics, and Gurus of [fill-in-the-blank]; cybersecurity now attracts the same crowd. Many of the well-regurgitated concepts such as “CIA” (Confidentiality-Integrity-Availability) and “Risk = Impact x Likelihood” are so mathematically, technically, and logically unsound that they make no sense outside of MBA-level courses, the office of your local insurance salesman, or DISA (the DISAstrous, DISAppointing federal agency that many hope would DISAppear from promulgating any more regulatory digital dysentery). The problem is that computers deal in binary, and saying “we have assured 100% CIA” means absolutely nothing when a group of Romanian high school students has encrypted your entire datacenter using widely available ransomware (in which case your information is securely protected in full integrity and is readily available, just not to you). Some of these concepts can be improved by tacking on additional qualifiers (e.g., add Authenticity, Usability, and Possession to CIA), but the philosophical/business-oriented concepts are still far from adequate.
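To make the objection to “Risk = Impact x Likelihood” concrete, here is a toy sketch (numbers invented purely for illustration) of how the scalar collapses incomparable situations into the same score:

```python
# Toy illustration of why "Risk = Impact x Likelihood" loses
# information: a frequent nuisance and a rare catastrophe collapse
# to the same scalar, even though they call for completely different
# engineering responses.
def risk_score(impact, likelihood):
    return impact * likelihood

nuisance    = risk_score(impact=2,    likelihood=0.50)   # ~1.0
catastrophe = risk_score(impact=1000, likelihood=0.001)  # ~1.0
# Identical scores, incomparable realities: no budget, architecture,
# or response decision can be read back out of the single number.
```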

So why are we stuck on B-school jargon that insults the intelligence of real scientists and engineers who may know a thing or two about security? Because, for the most part, there is no adequate common language or even framework for discussing ‘cybersecurity,’ INFOSEC, or even IT. Despite a number of half-baked and failed attempts, there is still no common lexicon to describe, qualitatively and quantitatively, what it means to be “secure,” “at risk,” etc. Can you really convey Shakespearean sonnets if neither the presenter nor the audience knows a word of English? Let’s say the presenter is smarter than the average bear, actually has a gist of what Hamlet is supposed to be like, and tries to act out a scene for his colleagues by pointing a finger at his skull. Members of the audience may only notice the giant bald spot on the presenter’s head and presume Shakespeare was waxing poetic about Minoxidil. A random observer may interpret the speaker’s strained facial expressions, jargonesque language, and finger pointing as a cry to be put out of his misery, and fulfill the request with a giant rock. Some of us have scars to prove it.

I remember a very well-known and well-regarded cybersecurity entity pitching me their IDS/IPS/SIEM/WAF/DLP product with the claim that “your system is only 87% secure; we can make it 99.999%.” I asked, “What does that even mean? Show me your model in source code.” The ‘sales-engineers’ replayed the whole animated slide deck while their “technical team” wore the clear “crap, this guy saw right through it and now wants us to explain our ‘Machine Learning’ algorithm, which we just ripped off GitHub last week and have no idea what it really does” look on their faces. Their lead engineer applied for a job with my group a little after the company spun off its cybersecurity business, and was duly denied.

Aside from the ‘engineers’ exploiting the lack of a common lingo to stick it to the illiterate CTOs, CIOs, and CISOs, another reason for the current security-alla-spaghetti dieting fad is that, in the mind of the naïve consumer, there is no clear distinction between ‘checklist’ or ‘virtual’ security and actual security. The blind ‘STIG-patch-pray’ and similar regimens make the IT-guy-turned-CIO feel warm and fuzzy about having done ‘something’ about ‘it,’ without an honest and in-depth technical look at the root causes of security vulnerabilities. A few years ago, I wrote a paper (never submitted, as I expect to meet the ill fate of the likes of Georg Ferdinand Ludwig Philipp Cantor, but with a fraction of his intelligence) analytically (i.e., in closed-form mathematics) proving why achieving 100% ‘cybersecurity’ (or anything close to it) is a fool’s errand (using Lévy processes and Brownian ratchets, among various concepts from group and control theory). There is fundamental stochasticity in the very elements of the hardware and software that comprise a computing system, which, according to recent publications, even casts doubt on the Church-Turing thesis as applied to the actual, physical systems currently in use. In a nutshell, no one can create an accurate weather-prediction model, and weather is in many ways far less complex than all the tessellations and state spaces within the execution of a typical computer application on commodity hardware with millions of logic gates inside doped substrates.
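The weather analogy can be made concrete with a textbook chaotic system. The sketch below (my illustration, not taken from the unpublished paper) iterates the logistic map at its fully chaotic parameter: perturbing the initial state at the twelfth decimal place destroys any resemblance between the two trajectories within a few dozen steps, which is exactly why exact long-horizon prediction of such systems is hopeless:

```python
# Illustrative only: sensitivity to initial conditions in the
# logistic map x -> r*x*(1-x) with r = 4 (fully chaotic regime),
# the same phenomenon that defeats exact long-horizon prediction
# of weather -- or of full program state on real hardware.
def logistic_orbit(x0, steps, r=4.0):
    xs, x = [], x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs

a = logistic_orbit(0.2, 60)
b = logistic_orbit(0.2 + 1e-12, 60)  # perturbed at the 12th decimal
divergence = max(abs(p - q) for p, q in zip(a, b))
# An error of 1e-12 roughly doubles each step, so after a few dozen
# iterations the two orbits bear no resemblance to each other.
```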

The situation is exacerbated by the lack of the adequate, comprehensive expertise required to promulgate useful cybersecurity concepts. At most, there may be a few dozen individuals worldwide with full mastery of hardware and software on a single architecture (say x86, ignoring massive alternate ecosystems such as ARM) and all their nuances (the key words being full mastery, not merely “I have taken a bunch of graduate-level courses on the topic” or “I’ve been working at Intel for 30 years”). There is an even smaller subset with an in-depth understanding of probability and statistics, predictive modeling, control theory, risk management, etc. who would be qualified to opine on the root causes of the cybersecurity problem and on possible solutions that may actually work. (I have been on the source selection committee of a number of DARPA-related projects attempting to solve similar problems, including creating a whole new secure computing architecture from scratch, and having seen proposals from some of the world’s brightest minds in this area, I can probably count the number of true experts in this field on my fingers.)

Even then, there is a problem of execution: Who is going to implement this model of cybersecurity, and who is going to supervise the process to assure correctness, trust, etc.?  Now we are operating in the null space of a complex Venn diagram as we have way too many ‘ideas people’ who can’t execute a single one of their own ideas, let alone someone else’s.  As my high school teacher used to gripe: “Too many geniuses with diarrhea in the mouth and constipation in the brain do not a pyramid build.”

Possible solutions that are technically sound, but very expensive:

  1. Create dedicated teams of expert coders (e.g., those who have written at least a million original, professional-grade SLoCs in each language under review) and reverse engineers (who have reversed thousands of lines of disassembled code) to perform a comprehensive code review of every module and dependency that comprises an application and its operating environment
  2. Require formal methods (e.g., create a mathematical model of all the possible state-spaces of the application and perform IV&V using symbolic/concolic execution) as part of the T&E/C&A
  3. Create a dedicated and highly skilled group of pen testers to put every system and component through its paces on a continual basis
  4. Create an exclusive contract with a trusted manufacturer (including design, fab, packaging, etc.) for ALL the hardware components for a given ecosystem
  5. Adopt a highly secure operating base and combine the BIOS/UEFI/kernel/OS into one integrated, formally V&V’ed (again, formal means exhaustive mathematical modeling) ASIC with no firmware, softcores, etc. whatsoever (essentially, the core aspects of the system become read-only memory; need “patching”? Swap out the ASIC for an upgraded one; if you need to patch as often as Windows does, hire yourself a competent technologist who can find you a more suitable ecosystem, as there are many out there)
  6. Get rid of the Multicerted SMEs (as the job description by one of the top “Cyber” entities recently acquired by Raytheon reads: “certifications are neither required nor helpful”)
  7. Etc., etc. [but few will understand these, and none of those who do have the authority/responsibility/ability to execute them]
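For a sense of what item (2) means in miniature, consider a toy component whose input space is small enough to enumerate outright (a sketch of the idea only; real formal methods rely on symbolic/concolic execution precisely because real state spaces cannot be brute-forced like this):

```python
# Miniature version of item (2): exhaustive verification of a toy
# component over its entire (finite) input space.
def saturating_add(a, b):
    """8-bit addition that clamps at 255 instead of wrapping."""
    return min(a + b, 255)

def verify_exhaustively():
    # Enumerate every one of the 256 x 256 possible input pairs and
    # check the safety properties on each -- the brute-force analogue
    # of modeling all reachable states of a real application.
    for a in range(256):
        for b in range(256):
            r = saturating_add(a, b)
            assert 0 <= r <= 255    # result stays within 8 bits
            assert r >= max(a, b)   # addition never loses value
    return True
```

For 65,536 states this finishes instantly; for the state space of a real application on commodity hardware, it never finishes, which is why the formal-methods route is both the right answer and an expensive one.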

Possible solutions that are technically questionable, but less expensive:

  1. Settle for the status quo: Fuzzy-security, snake oil, AdHocile development methodology, and assume the risk (e.g., insurance and reinsurance)
  2. Dump the computers for mechanical typewriters (I believe this has been working well, based on the low number of reported cyber incidents)
  3. Hire someone who looks and sounds like the “guru” out of the HBO show Silicon Valley or the movie The Guru, and introduce him/her to your clients as the organization’s new Senior Architect Chief of Systems Hacking Information Technologist – you’d be amazed how far unintelligible pseudo-technical mumbo jumbo can go in the current environment (e.g., much of the current IT/INFOSEC leadership); by the time they realize what happened you’ll either be sipping on Seaweed shakes in Belize or be beyond reproach as the appointed global Cyber Tsar