Overly Sharpened Blog

Help me Roberto, my web server just got hacked!

Reposted with kind permission from Robert Moir

… I wrote ages ago back when I could still be bothered to blog.

This question keeps being asked by the victims of hackers breaking into their web servers. The answers very rarely change, but people keep asking the question. I’m not sure why. Perhaps people just don’t like the answers they’ve seen when searching for help, or they can’t find someone they trust to give them advice. Or perhaps people read an answer to this question, focus too much on the 5% that makes their case special and different from the answers they can find online, and miss the 95% of the question and answer where their case is near enough the same as the one they read online.

That brings me to the first important nugget of information. I really do appreciate that you are a special unique snowflake. I appreciate that your website is too, as it’s a reflection of you and your business or at the very least, your hard work on behalf of an employer. But to someone on the outside looking in, whether a computer security person looking at the problem to try and help you or even the attacker himself, it is very likely that your problem will be at least 95% identical to every other case they’ve ever looked at.

Don’t take the attack personally, and don’t take the recommendations that follow here or that you get from other people personally. If you are reading this after just becoming the victim of a website hack then I really am sorry, and I really hope you can find something helpful here, but this is not the time to let your ego get in the way of what you need to do.

You have just found out that your server(s) got hacked. Now what?

Do not panic. Absolutely do not act in haste, and absolutely do not try and pretend things never happened and not act at all.

First: understand that the disaster has already happened. This is not the time for denial; it is the time to accept what has happened, to be realistic about it, and to take steps to manage the consequences of the impact.

Some of these steps are going to hurt, and (unless your website holds a copy of my details) I really don’t care if you ignore all or some of them, but following them will make things better in the end. The medicine might taste awful but sometimes you have to overlook that if you really want the cure to work.

Stop the problem from becoming worse than it already is:

  • The first thing you should do is disconnect the affected systems from the Internet. Whatever other problems you have, leaving the system connected to the web will only allow the attack to continue. I mean this quite literally; get someone to physically visit the server and unplug network cables if that is what it takes, but disconnect the victim from its muggers before you try to do anything else.
  • Change all your passwords for all accounts on all computers that are on the same network as the compromised systems. No really. All accounts. All computers. Yes, you’re right, this might be overkill; on the other hand, it might not. You don’t know either way, do you?
  • Check your other systems. Pay special attention to other Internet facing services, and to those that hold financial or other commercially sensitive data.
  • If the system holds anyone’s personal data, make a full and frank disclosure to anyone potentially affected at once. I know this one is tough. I know this one is going to hurt. I know that many businesses want to sweep this kind of problem under the carpet but I’m afraid you’re just going to have to deal with it.

Still hesitating to take this last step? I understand, I do. But look at it like this:

In some places you might well have a legal requirement to inform the authorities and/or the victims of this kind of privacy breach. However annoyed your customers might be to have you tell them about a problem, they’ll be far more annoyed if you don’t tell them, and they only find out for themselves after someone charges $8,000 worth of goods using the credit card details they stole from your site.

Remember what I said previously? The bad thing has already happened. The only question now is how well you deal with it.

Understand the problem fully:

  • Do NOT put the affected systems back online until this stage is fully complete, unless you want to be the person whose post was the tipping point for me actually deciding to write this article. I’m not linking to the post so that people can get a cheap laugh; I’m linking to warn you of the consequences of failing to follow this first step.
  • Examine the ‘attacked’ systems to understand how the attacks succeeded in compromising your security. Make every effort to find out where the attacks “came from”, so that you understand what problems you have and need to address to make your system safe in the future.
  • Examine the ‘attacked’ systems again, this time to understand where the attacks went, so that you understand what systems were compromised in the attack. Ensure you follow up any pointers that suggest compromised systems could become a springboard to attack your systems further.
  • Ensure the “gateways” used in any and all attacks are fully understood, so that you may begin to close them properly. (e.g. if your systems were compromised by a SQL injection attack, then not only do you need to close the particular flawed line of code that they broke in by, you would want to audit all of your code to see if the same type of mistake was made elsewhere).
  • Understand that attacks might succeed because of more than one flaw. Often, attacks succeed not through finding one major bug in a system but by stringing together several issues (sometimes minor and trivial by themselves) to compromise a system. For example, using SQL injection attacks to send commands to a database server, discovering the website/application you’re attacking is running in the context of an administrative user and using the rights of that account as a stepping-stone to compromise other parts of a system. Or as hackers like to call it: “another day in the office taking advantage of common mistakes people make”.

Make a plan for recovery and to bring your website back online and stick to it:

Nobody wants to be offline for longer than they have to be. That’s a given. If this website is a revenue generating mechanism then the pressure to bring it back online quickly will be intense. Even if the only thing at stake is your / your company’s reputation, this is still going to generate a lot of pressure to put things back up quickly.

However, don’t give in to the temptation to go back online too quickly. Instead move as fast as possible to understand what caused the problem and to solve it before you go back online, or else you will almost certainly fall victim to an intrusion once again. And remember, “to get hacked once can be classed as misfortune; to get hacked again straight afterward looks like carelessness” (with apologies to Oscar Wilde).

  • I’m assuming you’ve understood all the issues that led to the successful intrusion in the first place before you even start this section. I don’t want to overstate the case but if you haven’t done that first then you really do need to. Sorry.
  • Never pay blackmail / protection money. This is the sign of an easy mark and you don’t want that phrase ever used to describe you.
  • Don’t be tempted to put the same server(s) back online without a full rebuild. It should be far quicker to build a new box or “nuke the server from orbit and do a clean install” on the old hardware than it would be to audit every single corner of the old system to make sure it is clean before putting it back online again. If you disagree with that then you probably don’t know what it really means to ensure a system is fully cleaned, or your website deployment procedures are an unholy mess. You presumably have backups and test deployments of your site that you can just use to build the live site, and if you don’t then being hacked is not your biggest problem.
  • Be very careful about re-using data that was “live” on the system at the time of the hack. I won’t say “never ever do it” because you’ll just ignore me, but frankly I think you do need to consider the consequences of keeping data around when you know you cannot guarantee its integrity. Ideally, you should restore this from a backup made prior to the intrusion. If you cannot or will not do that, you should be very careful with that data because it’s tainted. You should especially be aware of the consequences to others if this data belongs to customers or site visitors rather than directly to you.
  • Monitor the system(s) carefully. You should resolve to do this as an ongoing process in the future (more below) but you should take extra pains to be vigilant during the period immediately following your site coming back online. The intruders will almost certainly be back, and if you can spot them trying to break in again you will certainly be able to see quickly if you really have closed all the holes they used before plus any they made for themselves, and you might gather useful information you can pass on to your local law enforcement.

Reducing the risk in the future.

The first thing you need to understand is that security is a process that you have to apply throughout the entire life-cycle of designing, deploying and maintaining an Internet-facing system, not something you can slap a few layers over your code afterwards like cheap paint. To be properly secure, a service and an application need to be designed from the start with this in mind as one of the major goals of the project. I realise that’s boring and you’ve heard it all before and that I “just don’t realise the pressure man” of getting your beta web2.0 (beta) service into beta status on the web, but the fact is that this keeps getting repeated because it was true the first time it was said and it hasn’t yet become a lie.

You can’t eliminate risk. You shouldn’t even try to do that. What you should do however is to understand which security risks are important to you, and understand how to manage and reduce both the impact of the risk and the probability that the risk will occur.

What steps can you take to reduce the probability of an attack being successful?

For example:

  • Was the flaw that allowed people to break into your site a known bug in vendor code, for which a patch was available? If so, do you need to re-think your approach to how you patch applications on your Internet-facing servers?
  • Was the flaw that allowed people to break into your site an unknown bug in vendor code, for which a patch was not available? I most certainly do not advocate changing suppliers whenever something like this bites you because they all have their problems and you’ll run out of platforms in a year at the most if you take this approach. However, if a system constantly lets you down then you should either migrate to something more robust or at the very least, re-architect your system so that vulnerable components stay wrapped up in cotton wool and as far away as possible from hostile eyes.
  • Was the flaw a bug in code developed by you (or a contractor working for you)? If so, do you need to re-think your approach to how you approve code for deployment to your live site? Could the bug have been caught with an improved test system, or with changes to your coding “standard”? (For example, while technology is not a panacea, you can reduce the probability of a successful SQL injection attack by using well-documented coding techniques.)
  • Was the flaw due to a problem with how the server or application software was deployed? If so, are you using automated procedures to build and deploy servers where possible? These are a great help in maintaining a consistent “baseline” state on all your servers, minimising the amount of custom work that has to be done on each one and hence hopefully minimising the opportunity for a mistake to be made. Same goes with code deployment - if you require something “special” to be done to deploy the latest version of your web app then try hard to automate it and ensure it always is done in a consistent manner.
  • Could the intrusion have been caught earlier with better monitoring of your systems? Of course, 24-hour monitoring or an “on call” system for your staff might not be cost effective, but there are companies out there who can monitor your web facing services for you and alert you in the event of a problem. You might decide you can’t afford this or don’t need it and that’s just fine… just take it into consideration.
  • Use tools such as tripwire and nessus where appropriate - but don’t just use them blindly because I said so. Take the time to learn how to use a few good security tools that are appropriate to your environment, keep these tools updated and use them on a regular basis.
  • Consider hiring security experts to ‘audit’ your website security on a regular basis. Again, you might decide you can’t afford this or don’t need it and that’s just fine… just take it into consideration.
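
To make the tripwire idea concrete, here is a minimal sketch of what a file-integrity check does at heart: record a checksum for every file while you know the system is clean, then compare against that record later. This is not how tripwire itself works internally. The function names make_baseline and check_baseline are invented for this illustration, and the sketch assumes GNU coreutils and findutils (sha256sum, sort -z, xargs -0 -r) are installed.

```shell
#!/bin/bash
# Illustrative helpers only; the names are hypothetical and not part of any tool.

# Record a SHA-256 checksum for every regular file under a directory.
# Store the baseline file outside that directory, ideally on another machine,
# so an intruder cannot quietly rewrite it.
make_baseline() {
    local dir="$1" baseline="$2"
    (cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum) > "$baseline"
}

# Compare the directory against a previously recorded baseline.
# Prints files that changed or went missing and returns non-zero if any did.
# Limitation: files added after the baseline was taken are not reported.
check_baseline() {
    local dir="$1" baseline="$2"
    (cd "$dir" && sha256sum --check --quiet) < "$baseline"
}
```

Typical use would be to run make_baseline /var/www /root/www.sha256 right after a clean deployment, then run check_baseline /var/www /root/www.sha256 on a schedule and treat any output as something to investigate. The paths here are placeholders.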

What steps can you take to reduce the consequences of a successful attack?

If you decide that the “risk” of the lower floor of your home flooding is high, but not high enough to warrant moving, you should at least move the irreplaceable family heirlooms upstairs. Right?

  • Can you reduce the number of services directly exposed to the Internet? Can you maintain some kind of gap between your internal services and your Internet-facing services? This ensures that even if your external systems are compromised the chances of using this as a springboard to attack your internal systems are limited.
  • Are you storing information you don’t need to store? Are you storing such information “online” when it could be archived somewhere else? There are two points to this part; the obvious one is that people cannot steal information from you that you don’t have, and the second point is that the less you store, the less you need to maintain and code for, and so there are fewer chances for bugs to slip into your code or systems design.
  • Are you using “least access” principles for your web app? If users only need to read from a database, then make sure the account the web app uses to service this only has read access, don’t allow it write access and certainly not system-level access.
  • If you’re not very experienced at something and it is not central to your business, consider outsourcing it. In other words, if you run a small website talking about writing desktop application code and decide to start selling small desktop applications from the site then consider “outsourcing” your credit card order system to someone like Paypal.
  • If at all possible, make practicing recovery from compromised systems part of your Disaster Recovery plan. This is arguably just another “disaster scenario” that you could encounter, simply one with its own set of problems and issues that are distinct from the usual ‘server room caught fire’/’was invaded by giant server eating furbies’ kind of thing. (edit, per XTZ)

And finally

I’ve probably left out no end of stuff that others consider important, but the steps above should at least help you start sorting things out if you are unlucky enough to fall victim to hackers.

Above all: Don’t panic. Think before you act. Act firmly once you’ve made a decision, and leave a comment below if you have something to add to my list of steps.

Minimum Standards

You’re a knowledge worker.

A fancy term that just means you use your computer for actual honest real creative work. Not talking about time-sheets and a contact list here. That spreadsheet that serves as your company’s ERP. The irreplaceable original files making up your portfolio. The curriculum for the class you’re teaching. The knowledge that you’ve gained and reified into something communicable. These things have value. But only as long as they exist.

This feels like a good time to point out that your hard drive is probably going to die this year. “Oh.”

Okay, maybe not this year. Really, it’s only a 5% chance or so. But a 5% chance of an unrecoverable loss of data is enough to keep me up at night. There are many things that can cause you grief in this department.

The point is to acknowledge that the universe tends towards maximum irony and to start acting like failures are expected, so that when they happen you’re not left with the choice of redoing a month’s work, or a $5,000 recovery bill, or simply being forced out of business because that information was both truly irreplaceable and irrecoverable. All you need is a minimum standard of care. You wear your seatbelt when driving. You have working smoke detectors where you sleep. And you have automatic nightly backups of your data.

Well, you will, soon enough. :)

Backups

The common wisdom: “You need backups! What if the building burns down?”

This is not why you should have backups. The common wisdom is really just a ready-made statement to show that you think about Big Problems, while giving you an excuse to procrastinate and generally ignore the issue. Yes, it’s a problem that should be addressed, but our goal today is the minimum standards of hygiene, and worrying about redundancy and geographic distribution and backup windows is just going to overwhelm you and give you an excuse to give up on the whole thing. And we’re not giving up today.

No, you’re going to get your backup situation figured out because your hard drive is going to die this year. Yes, really. Hard drives have lifetimes measured in years, not decades. And your machine isn’t exactly new, is it?

We’re going to do a daily backup. Backups are annoying because they take a long time to copy everything, and the machine is slow due to the extra load on the disk while they’re running. But if you back up every day, then you only ever have one day’s worth of extra data to move.

We’re going to back up everything. Every day. Getting selective and only backing up particular files is a very good way to ensure that you’re missing only the most vital files. Trust me, you don’t want there to be any question that that work you did last week for the first time in a new program is being included.

And, we’re going to back everything up every day, automatically. It’s vitally important that the backup happens whether or not you remember to start it. And running it by hand will tempt you into changing the process by hand, and for this task inconsistency is your absolute enemy.

We want an automatic process that backs everything up every day.

So, what tool do we use for this?

The common approach is to use a nice point-and-click tool to run the backups. You should not use one of these tools.

Transparency is your ally in this task. You need to understand each link in your backup process in as much detail as you can, and this means minimizing the number of links in the chain. Point-and-click tools excel at creating intricate setups that are not the simplest thing that could possibly work.

You’re going to need three things. You already have two of them.

  • External Drive

    You need an external drive that plugs into your computer using a USB cable or similar. This shouldn’t set you back more than $100-$150, but it’s not optional.

    Backing up to CDs or DVDs practically guarantees that you won’t perform the backups on a regular basis, and makes the whole process far more painful than it needs to be.

  • Scheduler

    On any modern unix (Linux, BSD, Apple’s OS X, and so on), the scheduler will be cron. Commonly, there will be a folder called /etc/cron.daily, and any script placed in that folder will be run once per day at a suitable time. Exactly what we need.

    On Windows, there’s typically a built-in scheduler service which is adequate for the task.

  • Copier

    Again, the tool we need is already available on any modern unix. The rsync tool will reliably copy everything, automatically checking that everything was written correctly, and keeping any special data necessary that other tools may not include.

    On Windows, I’d strongly recommend grabbing a copy of rsync from Cygwin or wherever.

What we want is a very simple script, so simple that you can understand it.

#!/bin/bash

# -v Print the names of the files to the screen as we back them up.
# -a Archive mode: recurse into directories and preserve permissions,
#    ownership, timestamps and symlinks.
# -x Don't go exploring mount points that we run across
# --delete Delete files from the backup if they're no longer found.
# / The source: copy everything from the root drive.
# /media/disk/backups/root The target: copy everything to here.
rsync -vax --delete / /media/disk/backups/root

Save this in a file called “backup”, make it executable, and add it to your /etc/cron.daily folder. Tomorrow morning, check that your external drive has a copy of all your data on it, and bask in a warm glow knowing you’re doing better than 90% of your peers.
Much better, right?
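
One failure mode the script above doesn’t guard against: if the external drive isn’t mounted when cron fires, the target path is just an ordinary directory on the internal disk, and rsync can end up writing the “backup” there instead. Here is a sketch of a guard, assuming a Linux system with the mountpoint utility from util-linux. BACKUP_MOUNT, drive_is_mounted and run_backup are names invented for this sketch, and /media/disk is simply the mount point implied by the script above.

```shell
#!/bin/bash
# Sketch only: the same rsync call as the script above, wrapped in a mount check.
# All names below are illustrative, not part of the original script.

# Where the external drive is expected to be mounted (assumption).
BACKUP_MOUNT="${BACKUP_MOUNT:-/media/disk}"

# Succeeds only if the given directory is the root of a mounted filesystem.
drive_is_mounted() {
    mountpoint -q "$1"
}

# Run the backup, but refuse to do anything if the drive is missing.
run_backup() {
    if ! drive_is_mounted "$BACKUP_MOUNT"; then
        echo "backup: $BACKUP_MOUNT is not mounted, skipping" >&2
        return 1
    fi
    rsync -vax --delete / "$BACKUP_MOUNT/backups/root"
}
```

In the real cron script the last line would simply be run_backup. Cron normally mails anything a job prints, so the “not mounted” message should reach you if local mail is set up, which is the point: a skipped backup ought to be noisy.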

Other Games of Life

I’ve been playing with duplicating some of the results on the emergence of cooperation in variations of iterated prisoner’s dilemma.

An interesting (if somewhat trivial) thing to note is how easily behaviour similar to, but distinct from, the Game of Life emerges. A small grid randomly populated only with Defectors and Cooperators, for instance, often produces glider-like patterns.

My goal is to implement some dynamic programming approach to generating strategies, and the attached source betrays that goal with some complexity that is unnecessary for the behaviour I’m talking about, but anyways.

Minting is Rare

Or, “How to resolve revocation with an immutable capability-secure world.”

A path is stored not by minting a new reference to the target (a hardlink), but rather by storing the path itself (a symlink). Each segment of a path represents a node that serves as a proxy for the rest of the path (onion routing).

Now, where does this leave me if I want permissions to be baked into a capability, and generally immutable?

On the one hand, I can now manage mutability in a sane way in the UI layer, because now the size of the local neighborhood is under the control of the node. This is a bit of a return to the heavier-weight approach to linking that I was originally considering, although it still only requires write access to one side of the link, plus mint access (which could be a limited form of write in the mutable case, but it doesn’t have to be).

On the other hand, immutability is really really nice. Specifically, being able to decode the permissions and determine if an operation is allowable offline is a big win, as it allows for some fairly aggressive caching even in the worst case, and in the best case may actually reduce the time-complexity through memoization.

Notes on the Implementation of a Blog

Publishing is currently a three step process consisting of writing an entry perfectly, publishing it to a staging point, and then committing the contents of that staging point to the real blog. This is causing me a small amount of grief:

  • Links don’t work on the staging site
  • Comments are published to the staging site rather than the front page (this is not quite a feature)

Approaches I’ve considered to fix this:

  • Just use Blogger as intended 

    Not going to happen because of principles I’ll explain some other time

  • Rewrite the content as Blogger uploads it to the staging site

    A bit more tempting, although it’s brittle. More importantly, it opens me up to all sorts of security vulnerabilities that I’d rather avoid in what should be a rock solid “deploy” script.

  • Replace Blogger with something I write myself

    Now we’re getting somewhere!

However, understand that when I say “replace blogger”, I’m not starting from square one. Some time ago, I actually used a framework of my own invention based on a capability security model inspired by Richard Kulisz (don’t ever tell me I didn’t give credit where credit is due :p). The problem being that I wasn’t following my own advice, and didn’t have a backup of the material in a usable form when the inevitable happened. Ah well, live and learn, it wasn’t that fun to work with anyway.

The idea here is to do it again, but with a focus on creating something on top of that architecture that ends up feeling more like a blog than a wiki.

So, how do you create a blog on top of a CSM architecture? Why, I’m glad you asked…

A question about PyPy’s JIT

Although I’m sure this is already obvious to the PyPy people, I’m quite interested to see how close they are to a system that would be capable of efficiently executing interpreters written on top of the existing system.

PyPy is a Python implementation written in Python. The translation and JIT architecture (as I understand it) uses manually inserted hints [pdf] to indicate which variables belong to the interpreter vs the interpreted program, so that the JIT can accurately determine when the interpreted program has looped (as opposed to the interpreter itself). This is important because, in general, optimizing the code of the interpreter itself has fairly limited gains: you gain a faster interpreter, but execution is still interpreted. The hints allow the system to distinguish the accidental work of interpretation from the essence of what is being interpreted.

But it’s not recursive. Even though the runtime has the required logic, it is missing the hints, and so an interpreter running on top of this stack will run faster, but it won’t make the jump to direct execution.

The question that intrigues me is this:

Would it be possible to generate those hints dynamically to make the gains available to higher level interpreters?

Wait, I know what happens next…

Interesting service being launched today by Wolfram Research

Any teenager is capable of making a machine from scratch that can seriously injure a person. The difference between a motor that you can stop with a bare hand, and one that will simply take your hand off, can be remarkably incremental.

I wonder how close we have to get to that threshold in order to have enough experience to see how close we are.

No, I don’t think the imminent launch implies an imminent hard takeoff. I just find myself wondering if we’ll recognize the potential of the technique, or whether we’ll just make two big lumps of fissile uranium by accident (“Oooo! Shiny!”) and get on with the business of bashing them together.

Kelly criterion

Aside from the obvious issue of compensating for errors, I don’t have a good intuitive understanding of why one wouldn’t want to maintain a bet as close to the kelly bet as possible.

The usual complaint seems to be that you want to minimize your downside risk in the short term, and a Kelly bet is concerned only with maximizing the long term gains.
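
For reference, this is the standard textbook form of the Kelly bet for a simple win-or-lose wager, written out so the “long term gains” being maximized are explicit. The symbols (p, q, b, f) are the conventional ones rather than anything defined in this post, and the numbers in the example are made up.

```latex
% p: probability of winning, q = 1 - p: probability of losing,
% b: net odds (profit per unit staked on a win), f: fraction of bankroll staked.

% Expected log-growth of the bankroll per bet:
\[ g(f) = p \ln(1 + b f) + q \ln(1 - f) \]

% Setting g'(f) = 0 gives the Kelly fraction:
\[ f^{*} = p - \frac{q}{b} \]

% Example: an even-money bet (b = 1) won 55% of the time:
\[ f^{*} = 0.55 - \frac{0.45}{1} = 0.10 \]
```

Staking less than f* lowers both the growth rate and the swings; staking more lowers the growth rate while raising the swings, which is roughly where the short-term complaint comes from.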

What doesn’t sit well:

  • “Short term” is already well known to be a bad measuring stick. You measure your progress over months and years, not days and weeks. Optimizing for the short term isn’t much different than sitting down at a $1-$2 no limit poker game with your last $50, hoping that you can play so conservatively that you somehow stave off the inevitable ruin.
  • I suspect that the simple explanations are applying a rule-of-thumb in order to model the needs of a career player: the competing constraints of paying the bills while also growing the bankroll. My intuition is that including this explicitly, or baking the effect into an “effective bankroll” (instead of a fudge factor on the ideal bet) would give a better overall result, or at the very least, a more intuitive explanation.

I need to think about this more.

Card-counting and evolutionary algorithms

I’m a bad blackjack player. Bad enough that I refrain from playing anywhere except a friendly home situation with no money at stake, where I definitively demonstrate how bad I actually am.

That said, I find myself intrigued:

Given a population of players based on existing counting techniques, with crossover and mutation, and a fitness function that included optimizing for function size and minimum worst-case runtime memory usage, what sort of counting technique might we turn up?

HTML is to the web as assembler is to computing

I’m kinda surprised that there haven’t been more projects taking a translation/compilation approach to working around IE’s rendering deficiencies. We have a good specification of how things are supposed to work, and many years of experience with how they actually work in various browsers.

Is it that the folks interested in compilers and such just aren’t interested in web technology? (Well, when I put it that way…)

IE6 CSS Fixer is an example of the sort of approach that I think could yield big benefits, especially with guidance from someone with some compiler-writing experience. Splitting the general problems of graphic design and application development from the tinkering needed for cross-browser issues would mean one less rather annoying pebble in many folks’ shoes.