France requires silly new data retention policies

Yes, this isn’t as exciting and dramatic (and traffic-generating) as “France outlaws hashed passwords”, the headline on Slashdot and Hacker News, but it’s the reality of the situation.

France has passed a new law requiring companies to store “…users’ full names, postal addresses, telephone numbers and passwords. The data must be handed over to the authorities if demanded.”

While it’s pretty stupid to require storing passwords that can be handed over to the authorities (probably to allow them to use those passwords to access services outside France), there’s nothing which prevents the continued secure use of password hashes.

A simple system which meets these new requirements is:

  • Store password hash with salt in live database as is best practice
  • Encrypt the plain text password using public key encryption, and store the encrypted value in another database in a record along with the plain text username. If the username already exists, replace the stored value with the new one (see the sketch after this list).
  • Store the private key offline in a secure bank vault (or 2), using multiple USB keys for data protection
  • If and when the government require access, company director goes to bank vault, retrieves USB key, uses private key to decrypt stored password value of that single user, then returns USB key to bank vault
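
To make that concrete, here is a minimal sketch of the storage side in Python, assuming the ‘cryptography’ package for the RSA public-key step; the function name, database objects and key handling are placeholders rather than a production design:

    import os
    import hashlib
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding

    def store_password(username, password, public_key, live_db, escrow_db):
        # 1. Live system: salted hash only, as per normal best practice
        salt = os.urandom(16)
        pw_hash = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100000)
        live_db[username] = (salt, pw_hash)

        # 2. Escrow store: the plain text password encrypted with the public
        #    half of the offline key pair, replacing any previous value
        escrow_db[username] = public_key.encrypt(
            password.encode(),
            padding.OAEP(
                mgf=padding.MGF1(algorithm=hashes.SHA256()),
                algorithm=hashes.SHA256(),
                label=None,
            ),
        )

Decryption only ever happens offline, with the private key fetched from the vault, so a compromise of the live systems still exposes nothing beyond the salted hashes.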

It’s a hassle, and it’s definitely a bit silly, but this new law doesn’t “require” any massive reduction in security if implemented correctly. Yes, the private key could provide access to every user’s plain text password, but that’s no different in kind from the existing risks around things like hashing algorithms, salts, and source code security.

And if a company doesn’t implement it correctly? Well, the same recommendation as always applies – never reuse passwords across multiple sites, especially for your email account, which can be used to retrieve or reset passwords via most websites’ “Lost your login details?” functions.

Control totals – the sanity check of migrations

I spend a lot of my time working on system migrations; it’s pretty much all I’ve done for the last few years. Often I work with the same clients again and again, but when it’s a new client and they don’t have much experience with migrations, some of the things that are best practice can be quite hard to justify.

One of those practices that people sometimes have a problem spending time on is building sensible control totals: a set of values which should clearly and simply define the success or failure of a migration.

If you are migrating a financial application, a very simple set of control totals might be:

  • Number of accounts to be migrated
  • Total outstanding debt across all accounts to be migrated
  • Number of accounts in bad debt to be migrated
  • Number of bank account details to be migrated
  • Number of credit card details to be migrated

Once you have these rules written in plain English, you then need to get someone to build the source database extract queries to produce these figures, and someone to build the same rules against the target database.

Then, when you perform your migration tests, you execute the control totals on the source database prior to the migration, and the same totals on the target database post-migration. If the two sets of figures don’t match, you have a problem – either a set of accounts has not migrated, or the wrong data values have been inserted.
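
As a rough sketch of how this can be automated, the following Python (using the standard sqlite3 module purely for illustration – the SQL, table and column names here are invented) runs each named control total against both databases and flags any mismatch:

    import sqlite3

    # Illustrative only: in practice the source and target queries are written
    # separately against each system's real schema.
    SOURCE_TOTALS = {
        "accounts to migrate": "SELECT COUNT(*) FROM legacy_accounts",
        "total outstanding debt": "SELECT SUM(debt_outstanding) FROM legacy_accounts",
        "accounts in bad debt": "SELECT COUNT(*) FROM legacy_accounts WHERE status = 'BAD_DEBT'",
    }
    TARGET_TOTALS = {
        "accounts to migrate": "SELECT COUNT(*) FROM accounts",
        "total outstanding debt": "SELECT SUM(outstanding_debt) FROM accounts",
        "accounts in bad debt": "SELECT COUNT(*) FROM accounts WHERE bad_debt = 1",
    }

    def run_totals(conn, totals):
        # Each query returns a single value; collect the results by name
        return {name: conn.execute(sql).fetchone()[0] for name, sql in totals.items()}

    def compare(source_conn, target_conn):
        source = run_totals(source_conn, SOURCE_TOTALS)
        target = run_totals(target_conn, TARGET_TOTALS)
        for name in SOURCE_TOTALS:
            status = "OK" if source[name] == target[name] else "MISMATCH"
            print(f"{status}: {name} (source={source[name]}, target={target[name]})")

    # Example usage against two SQLite files (any DB-API connection works):
    # compare(sqlite3.connect("legacy.db"), sqlite3.connect("target.db"))

In practice you would capture the source figures before the migration and compare them against the target figures afterwards; any MISMATCH line tells you exactly which total to dig into.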

While people often say that control totals don’t matter, because individual record failures will be highlighted by database loading scripts, their real value is in verifying that the data across both the legacy and new systems is in its expected format.

An example of the issues that control totals can highlight: the source application stores prices as a decimal value in the ‘prices’ table, so a price of $10 is stored as ‘10.00’, while the target application stores prices as an integer number of cents, so the same $10 is stored as ‘1000’. If you insert the decimal value without multiplying it by 100, and simply count the number of records inserted into ‘prices’ to verify the load, then you may not find out until it’s too late that you’ve just cut every price in the system to a fraction of what it should be.

A control total on ‘Total sale price of all products’ would quickly and easily pinpoint the exact issue.
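
As a toy illustration of that (the numbers are invented), a record count passes while a sum-based total fails loudly:

    # Legacy system holds prices as decimal dollars, the new system expects
    # integer cents, and the conversion to cents was forgotten.
    source_prices = [10.00, 24.99, 5.50]        # legacy 'prices' values in dollars
    migrated = [int(p) for p in source_prices]  # bug: should be int(round(p * 100))

    print(len(migrated) == len(source_prices))             # True - record count still passes
    print(round(sum(source_prices) * 100), sum(migrated))  # 4049 vs 39 - the total gives it away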

Hopefully this will help some people out there with their own system migrations; it’s certainly helped me crystallise my own thoughts on their worth for any future clients who ask why they should spend time and money on what at first glance can look like a pointless extra step.

Public Cloud Computing – From “We can’t…” to “We can”

From the first day of public cloud computing, there have been people saying “We can’t use public cloud computing, because…”, followed by a range of reasons, all perfectly legitimate but generally based on company policies or long-held fears about shared resources, security, and support, rather than on technical limitations.

Over the past few years, Amazon and the other public cloud providers have been chipping away at these reasons, with Amazon recently upgrading their “Virtual Private Cloud” offering from a VPN connection to their servers to one that now includes controllable, secure networking of your instances.

Now, Amazon have launched “Dedicated Instances”, an offering where you pay a flat rate of an extra $10 per hour per region when you launch any number of dedicated instances. By “dedicated instance”, Amazon mean an instance running on hardware that’s only running instances launched by you, no one else. No more multitenancy resource fears on the server, reduced worries about over-commitment of hardware resources, potential weaknesses in the Xen hypervisor, etc.

You still get many of the benefits of public clouds – no up-front costs, the massive volumes of AWS leading to lower overheads, commodity services, and so on – but you pay a slightly higher per-hour price to remove one of the major hurdles in moving to public cloud computing.

I’m sure that dedicated EBS will be coming along soon, and perhaps dedicated S3 storage for people using more than something like 10TB of data – the amount that would justify a dedicated shelf of storage replicated to multiple locations?

While these recent moves won’t let everyone use the public cloud to reduce their computing costs and improve their flexibility, it’s a big step in moving people from “We can’t do this because” to “We can do this, now let’s get on with it”.

And that’s got to be good, hasn’t it?