Your operational tools deliver continuous monitoring and alerting—why doesn’t your security suite? No single path exists to a rugged DevOps approach that works for every organization, but certain key principles and techniques are used by the DevOps elite that give them distinct advantages.
You can use these and revamp your organization’s processes and behaviors to gain efficiencies in your security operations. Security can no longer be thought of as being a separate step in a launch. Instead, security must be integrated into the overall processes of development and deployment.
As organizations move more deeply into continuous patterns of development and deployment, the importance of implementing continuous security behaviors becomes non-negotiable.
In this presentation recorded at BSidesSF, Evident.io CEO Tim Prendergast discusses how organizations can better understand their value to an attacker, define the battlefield to their own advantage, and identify potential Rugged DevOps allies within the organization. He also covers why it is time to embrace continuous security cycles and automate security acceptance tests as part of the QA process, and the value of operationalizing security alerts and remediation efforts to achieve a more agile security posture.
Tim Prendergast, co-founder and CEO of Evident.io, seeks to help others avoid the pain he endured when helping Adobe adopt the cloud at a massive scale.
After years of building, operating, and securing services in AWS, Tim set out to make security approachable and repeatable for companies of all sizes. Tim led technology teams at Adobe, Ingenuity, Ticketmaster, and McAfee.
The full presentation transcript:
MC: The sponsors for making this available and happen. Also, if you have a chance, please fill out some feedback. Information is located on your ticket and also on the website on the schedule page. Today we have Tim Prendergast. Please give a welcome applause. Thank you.
Tim: Thank you everyone. Let me adjust this thing. I’m conscious of the fact that I stand between you guys and the bar, so thank you to whoever organized me to be the fall guy for that one. But there’s so much material to cover here. I kind of took the approach of giving you guys the seeds for thinking about this change of paradigm. I’ll talk through some technical examples, but I’m not going to go through a bunch of demos or dive into a bunch of code examples, because that’s something that actually is better suited for you to explore on your own journey. But let’s explore today what it means to take a Rugged DevOps approach to securing and protecting the Cloud, which is really as everyone thinks about it.
So it’s your data center that someone else is hosting for you now, or it’s the data center you host that’s used in a very different way. Of all the things that have happened around this Cloud movement, I think the most exciting thing that’s happening now is specifically around security. Sure, it’s cool to elasticize workloads and have something where I can spin resources up and down on demand and just pay for what I use. But the value it brings when we talk about the approach we take to security, and the challenges we face as we move into this era, are far more exhilarating.
While people get excited about, like, “Hey, I can run a few servers or a thousand servers,” the idea that, hey, all of those servers can take dynamic security postures on, that they can be reflexive and react to the kinds of requests coming into them, and that these changes can happen in a time frame of seconds, is a totally different paradigm than what we’ve been used to in security. And it’s something you should really think about starting to take advantage of.
I’m going to give you guys, like, step one, because I don’t know everyone’s background. What is Cloud? I’ll settle this argument that people have had forever. It is programmatic infrastructure. It could be anything that falls under that. But think of it in the sense that it’s API-driven infrastructure. It’s not what VMware is going to sell you guys. It’s not what Amazon Web Services specifically tells you it is. It really is a different way for us to interface with infrastructure that goes beyond that rack-and-stack approach we took.
Since I referenced it, what is DevOps? If I ask people to raise hands, I’m sure everyone pretty much knows. It’s really just smart people building things in a more intelligent way. It means automation. It means scale. It means let’s get away from the typical behaviors we used to do. When you think about this acceleration towards deploying infrastructure and applications and services we call DevOps, there’s specific needs that have arisen that drive people to get into this state and those needs are specifically focused around flexible, rapid iterations and testing and validation of what you’re doing and then iterating forward on that. Meaning don’t waste six months to put this new innovation out there for your users.
It means wait a week after you see people start using what you built, and then learn how they’re using it. Learn the cool things you can do with it and then iterate forward on that. At the same time, security has never gone away. All of us are here for a particular reason. All of us here have jobs that focus in some aspect on this. We have to bring to a business or an entity some sense of control, some sense of stability, some sense of risk aversion, because we all know that a major compromise tends to end a lot of small and medium businesses.
And large businesses, while they kind of slough them off, suffer massive reputational damage over time. If I surveyed the crowd and had everyone just shout out the name of a company that has a bad reputation for security, I know probably the five or six that most everyone’s going to shout out. That’s a sad state in the industry, but it’s also telling as to how we, as humans, emotionally connect to people not doing everything they can to protect us. The thing that’s really interesting about this is we inadvertently jammed these two worlds together, and we did that because security is still a relevant function for a business, but the business has also said, “We have to move faster.”
And historically, what? Acceleration is not the best friend of security? The dynamics and the mass change introduce a little bit of chaos and make it hard to derive signal from noise, right? But there’s a new way for us to actually think about doing this where we can actually have the best of both worlds. This is the thing that I struggle with, which is I have to step out of my traditional ivory tower, paratrooper, knight-in-shining-armor, save-the-day kind of role, where you do incident response and that’s the end of the security discussion with a lot of people and that’s where you convince them to be part of the solution.
Then I step into this educator role, this embracing role of, like, “Hey, you know the guy on the other side of the phone that’s building this application and managing this infrastructure is a really bright person. He just hasn’t had 15 years of getting his butt kicked in this industry like I have. If I could actually teach him or enable him in some way to not have to go through that same pain that I went through, and to learn from what I’m trying to tell him is the right way to do things without me being the guy with the big stick, then we could actually come out way ahead.”
So quick survey. Who here thinks you have enough people at your company doing security? I’ll see you outside, buddy. Who here has open job reqs for security engineers? Yeah, basically everybody. So this is the pain point we see. We can’t just poof into existence enough people to actually manage security. We can’t say, “Oh cool, next year’s graduating class is coming out fully ready to come protect the infrastructure we’ve been building for years.” It’s one of those things where you have to take advantage of the people that you can train up. What’s really interesting is the people that really get this right now are those engineers and product teams that are building the infrastructure that you have been used to protecting in the data center and now have to protect in the Cloud.
This paradigm’s totally shifted. The one thing that you see that’s really different here is not only the ownership aspect of the infrastructure layer, but really we’ve taken these broader concepts where you built walls around the garden, right? And we’re saying, “We actually have resources that can exist for seven minutes.” I didn’t get up and find that my data center had a thousand servers yesterday and 200 today. Like, you’ve been robbed, right?
I also didn’t get up and walk in there expecting to see 1 rack of servers and find that I now have, like, 500 racks of servers. It doesn’t happen at the data center. Your expectation of things being static is how we built security around it. We said, “Cool. I can put zones for network and charter domains in place. I can put firewalls around those. I can actually put IDS, IPS in line of everything I own. I have ways to get what I want to get so I can try and do my job better.”
The Cloud basically took that and threw it out the window. It said, “Oh. By the way, you don’t own this and you’re not allowed to come plug directly into it. Just trust us. We’re doing everything for you.” The reality of the situation is they do everything below a certain layer. But you, as the customer, own everything that’s really relevant: the app layer, the network traffic stack. What’s really changed is the way you interface with these Cloud infrastructure services, right?
So if you think about it, has anyone here done API programming, like Tcl/Expect, that kind of stuff, against, like, old firewalls, PIXes, you know, IDS, IPS? It’s terrible. Nothing’s the same. The unifying constant of Cloud infrastructure that’s programmatic is we have an API interface that now speaks universally to identity, network, storage, server. It’s actually a really different approach now. We can actually talk on one language plane to all these different layers and then cross-correlate across them.
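As a rough illustration of what talking to every layer on one language plane makes possible, the Python sketch below cross-correlates network and identity data. It assumes the inventory has already been pulled from the provider's API into plain records; the `SecurityGroup` and `Instance` classes, their fields, and the `exposed_admin_instances` helper are hypothetical names made up for this example, not part of any provider SDK.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class SecurityGroup:
    """Hypothetical record for a network-layer object returned by an API."""
    group_id: str
    # CIDR blocks allowed inbound; "0.0.0.0/0" means the whole internet.
    inbound_cidrs: List[str] = field(default_factory=list)


@dataclass
class Instance:
    """Hypothetical record for a server-layer object returned by an API."""
    instance_id: str
    security_group_ids: List[str]
    role_name: Optional[str] = None  # identity-layer role attached to the server


def exposed_admin_instances(
    instances: List[Instance],
    groups: List[SecurityGroup],
    admin_roles: Set[str],
) -> List[str]:
    """Return IDs of instances that are internet-reachable AND hold an admin role."""
    open_groups = {g.group_id for g in groups if "0.0.0.0/0" in g.inbound_cidrs}
    return [
        i.instance_id
        for i in instances
        if i.role_name in admin_roles
        and open_groups.intersection(i.security_group_ids)
    ]
```

Neither fact is alarming on its own; the finding only appears when the network answer and the identity answer are joined, which is the cross-correlation the talk describes.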
This is actually enabling a whole new era of security for us. It’s one of those things that we’re excited to talk about. Yeah, little box and it’s all changed. So, why is it cool? Because the vectors have actually totally changed. This didn’t happen in the data center. We call this the side door to the Cloud. When you think about it, it’s not like someone puts up a new server inside of your firewall in the Cloud and then someone could get total control of your data center because that server came online.
That just generally doesn’t happen that commonly in the real world unless you’re getting hardcore [inaudible 00:07:45]. What’s happened is we’ve taken the stack where you had …you can call it a three-tier stack. Whatever you want to call it. It’s just logical. Your web and application hosts are the same. These are containerized functions that you can port around wherever you want. The way we defend the applications and services at the app layer is the secure coding life cycle, verifying that the OWASP Top 10 and 20 are being guarded against, and doing typical Qualys and Nmap scans.
This kind of stuff isn’t that new. It’s not really changing much. What changed is when we talk about where all my data records that really matter are. We’ve actually dissolved the typical layer of running giant Oracle SQL clusters and pushed those into managed database services, so that isn’t a lot of infrastructure anymore. We’ve also taken the traditional storage hosts, which by the way, as much as we all pay for them, are just a big stack of Unix or Windows boxes, right? We’ve taken those and dissolved those, and we’ve stored these objects in, like, storage services.
They can be archival. They can be object store. It could be Dropbox for all we care, right? Someone else is giving us a presentation layer that we engage and push data to, and when we need the data back, we ask for it. That contract of getting data back and forth at an acceptable latency has said, “We don’t have to manage these things.” The other thing it said implicitly, which we all don’t accept, is that now I have to figure out how to protect it over in these services where they’re publicly accessible.
In the data center, no one can just walk through the wall into your infrastructure and start ripping out servers. We have multiple instances where publicly accessible Cloud APIs have led to the complete demise of companies in this industry. Look at HD Moore’s paper on S3 bucket walking: it’s a global namespace. You can just type in random names and eventually you’re going to get hits back. You’ll get hits back that have medical records, legal records, because there are so many permutations and so many micro wrappers of security policy around every instance of an object or asset that you can create in the Cloud. How do you know that the 3 billionth object you’ve created is consistent with the other 2.99 billion, right?
How do you know if someone internally didn’t change that policy and make it accessible and pop the link on Pastebin so everyone’s now pulling down your entire payroll records? This is actually the challenge you face in the Cloud now because we’re no longer in complete control, but we have a new way to think about this.
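One way to approach the consistency question programmatically is to scan every policy document for statements that grant access to everyone. The Python sketch below is a minimal version of that idea, assuming the policies have already been fetched as JSON strings in the AWS policy format (`Statement`, `Effect`, `Principal`). The function names are hypothetical, and the check deliberately ignores `Condition` blocks and ACLs, so treat a hit as something to review rather than a verdict.

```python
import json


def is_public_statement(statement):
    """True if an Allow statement names every principal ("*")."""
    if statement.get("Effect") != "Allow":
        return False
    principal = statement.get("Principal")
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS")
        values = aws if isinstance(aws, list) else [aws]
        return "*" in values
    return False


def find_public_policies(policies):
    """Given {resource_name: policy_json_string}, return the names whose
    policy contains at least one public Allow statement, sorted by name."""
    flagged = []
    for name, raw in policies.items():
        statements = json.loads(raw).get("Statement", [])
        if isinstance(statements, dict):  # a lone statement may not be wrapped in a list
            statements = [statements]
        if any(is_public_statement(s) for s in statements):
            flagged.append(name)
    return sorted(flagged)
```

Run on a schedule or on every policy change, a check like this answers "is object 3 billion consistent with the rest" without a human opening each policy by hand.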
I will become very unpopular very quickly. We haven’t done a good job in the industry of product. I started on the security product side of the house. I went defender and now I’m back on product. We haven’t done a good job keeping up with the change. Chef and Puppet and these infrastructure-as-code components have been around five, seven plus years, right? And really popular the past three or four. Where’s the security innovation been, right? We still think that you can just take…I kid you not. I won’t name names. I’ve had vendors who take this big load balancer that you have in your data center and just say, “Deploy our virtual appliance in the Cloud. It will do the exact same thing.” You sit there and you’re like, “I went through this whole iteration of redesigning my architecture, the way I think of things and concepts, and loosely coupling my application and destroying all these concepts from 20 years of infrastructure. Now you want me to undo that and then put something right in the middle of it that’s a giant single point of failure.”
This is not the way that we engineer for Cloud. Fundamentally, the failing has been that the forward thinkers and thought leaders that have embraced programmatic infrastructure have not been those security companies. They’re still building products and solving challenges they know happen in the data center. Meanwhile, the world’s moving into a new paradigm, and we’re lagging because of it. Part of that challenge also is the way that we’ve structured ourselves in the industry and community.
If you think about it, the DevOps community has done a great job with ChefConf, PuppetConf, all these DevOps-centric conferences. People get up here and they open the kimono, and they dump all their innovation out. You guys can copy it, get it off GitHub, and go home and start from their new baseline. Fundamentally, in the security industry, we’ve not been as open, and we’ve not been as welcoming to others, and we haven’t been able to share as elicitly [SP] as that community has. That’s one of the things that we have to think about changing over time at a social layer, or else we’re going to be stuck here.
How do you freshen the stack, right? Think about it. There’s a whole set of major services that comprise probably 60 to 70% of your infrastructure in the Cloud that are completely API driven. No IP address, can’t network scan them, can’t Nmap them, nothing. All you get is, hey, the API endpoint’s open. But you know what? There are thousands of functions and millions of assets behind those APIs that you now have to actually protect and think about how you guard. If you go top down…I’m an old network nerd. I went with the OSI model, kind of, around here.
You think about the business process. That’s ADP, Workday. None of us are going to worry about that. By the way, they’re the original Cloud services, so if your company says, “We don’t use Cloud services,” ask them where your payroll and your social security and all your benefits data is. Someone else’s Cloud. Application security is still relatively the same. We’re getting new Cloud-intelligent WAFs. Signal Sciences shout out, those guys are awesome. They put out some new product this week.
We have Cloud-aware agents. We’ve seen the endpoint evolve to understand, but the endpoint hasn’t evolved to understand that we’re going to have totally serverless infrastructure like AWS Lambda and some of these other code PaaS layers. Then you have all the rest of this: identity, network, administrative relationships, and roles. Regionality of your data that is only accessible through API and protected as such. I don’t know if Adrian’s here. If he is, I want five bucks after this.
Adrian’s totally nailed it. The first analyst I saw that really got it. It was like, you can now push a button and destroy a data center. How scary is that, right? If the wrong person can get credentials or can phish you, boop, goodnight. Oh, by the way, your backups are in the Cloud. Those are gone too. That’s how frightening it is. That’s how much onus is on the people running infrastructure, and not to scare people, but the people running infrastructure tend to be people with not a ton of security experience and not the battle scars that a lot of people here have from making mistakes and having to do incident response and correct it over time.
How do you bridge the gap? It totally makes sense that we’d go to security as code, right? If these API-driven services are completely code oriented, and on the back end what our security infrastructure looks like is a JSON blob, then we can actually build automation around that construct. Because the providers are using automation to build from that same JSON blob, what we do to the infrastructure through that JSON is what comes out the other end. So we create a one-to-one construct in security where we say, “What I say is what I do, and what I say is what you do.” That means watching the shift and the dynamism in your environment programmatically allows you to intercede programmatically. It’s not, “Oh, by the way, someone launched a bunch of Bitcoin mining machines in a region in APAC at 3 in the morning.” It means if someone launches anything that I don’t want to be launched, terminate it immediately. Right?
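A minimal Python sketch of that "terminate it immediately" rule might look like the following. The allowed regions and instance types are made-up policy values, the instance records are plain dictionaries standing in for whatever the provider's API returns, and `terminate` is a callable supplied by the caller, so the sketch does not assume any particular SDK.

```python
# Hypothetical policy: anything launched outside these sets is unwanted.
ALLOWED_REGIONS = {"us-east-1", "us-west-2"}
ALLOWED_TYPES = {"t3.micro", "m5.large"}


def violations(instance):
    """Return the list of policy reasons this instance should not be running."""
    reasons = []
    if instance["region"] not in ALLOWED_REGIONS:
        reasons.append("region not allowed")
    if instance["type"] not in ALLOWED_TYPES:
        reasons.append("instance type not allowed")
    return reasons


def enforce(instances, terminate):
    """Call terminate(instance_id) for every violating instance.

    Returns {instance_id: reasons} so the action can be logged or audited.
    """
    actions = {}
    for inst in instances:
        reasons = violations(inst)
        if reasons:
            terminate(inst["id"])
            actions[inst["id"]] = reasons
    return actions
```

In a real deployment `terminate` would wrap the provider's terminate call, and passing `print` or a list's `append` instead gives a dry run that only reports what would be removed.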
Don’t page someone and have a human wake up and go log in and see if they can figure out what’s wrong. We can articulate what security means in the Cloud, and we can actually enforce it programmatically, which is something we’ve not been able to effectively do before. Totally basic example. This is Amazon JSON Structure. Sad. Looks horrible on the screen, sorry. Fifty percent of the time we see people who are building and rolling new images and talking about patch management in the Cloud and all these things, struggle with it.
And the reason they struggle with it is: how do you know what should be where at any point in time? Humans are horrible at going around and taking, like, AMI IDs and knowing which one should be in every region, because they are different in every region. Thank you, Amazon. They’re different in every region. They change as you roll out new patches. So if you guys have, like, the SSL issues that came out recently, you have to [inaudible 00:15:32] and patch and reroll your images globally and roll them back out. People have gotten sophisticated enough to actually bake and build AMIs using infrastructure automation and push them, but then, on the back end, the security process is broken.
We have successfully seen people start to actually integrate these audits and checks into the CI/CD pipeline. While you’re actually doing a build and push of a new image, not only does it mark that as the new golden image, so you know what’s no longer in date and you have a complete inventory of what has to be taken out of that infrastructure, you can also now ask globally: is anything running during the life cycle of this image that’s not approved? You want to know. I’m an attacker. I get access to your Cloud infrastructure. I’m not going to launch a golden image, guys.
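The golden-image check described here can be sketched in a few lines of Python. The sketch assumes the pipeline keeps a mapping of region to approved AMI IDs and that the list of running instances has already been pulled from the provider's API; `promote_golden`, `unapproved_instances`, and the record fields are hypothetical names for this example.

```python
def promote_golden(golden_amis, new_build):
    """Return a new {region: set_of_ami_ids} mapping in which each region named
    in new_build ({region: ami_id}) has its golden image replaced by the fresh
    build. Regions not in new_build keep their current golden images."""
    updated = {region: set(amis) for region, amis in golden_amis.items()}
    for region, ami_id in new_build.items():
        updated[region] = {ami_id}
    return updated


def unapproved_instances(running, golden_amis):
    """Return sorted (region, instance_id, ami_id) tuples for every running
    instance whose AMI is not a golden image for its own region."""
    findings = []
    for inst in running:
        approved = golden_amis.get(inst["region"], set())
        if inst["ami"] not in approved:
            findings.append((inst["region"], inst["id"], inst["ami"]))
    return sorted(findings)
```

Calling `promote_golden` as the last step of an image build and then `unapproved_instances` across all regions yields both lists the talk mentions: instances still on the image that just went out of date, and anything that was never launched from an approved image at all.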