August 30, 2013

I’ve spoken before about perceived versus actual risk in terms of Facebook privacy, but one issue that is even more relevant to a lot of our customers is just how safe and secure content like Microsoft Office documents (.doc, .xls, .ppt files) and PDF files are when they are hosted on their site.

I have to log in to upload a document, so it’s secure, right?


WordPress 3.6 login screen

The common misconception when using a content management system (CMS) is that your documents are secure because they’re behind a login prompt. If you log in to your website, upload a file and then publish a link to it on a password-protected page, then no one will be able to find that document, right? You had to log in to upload it, didn’t you? Yes. And the page that links to it is password protected, yeah? Absolutely. So someone has to log in to read it, right?

Wrong.

What you’ve password protected on the page is simply a link to the document, not necessarily the document itself or the location that you’ve saved it to.  For ease of access, anything you upload to your website (by default) is likely to be publicly available – unless you’ve specifically chosen a CMS or document management solution that provides a membership portal or something similar. There are a number of suitable products out there that do this, though it’s pretty important to understand that WordPress – one of our main weapons of choice – doesn’t do this natively.
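To make the difference between protecting the page and protecting the file a bit more concrete, here is a toy model in Python. It is only a sketch: the paths and the `can_view` function are made up for illustration, and a real CMS and web server are far more involved than two sets and an `if` statement.

```python
# Toy model of a CMS site: pages go through the CMS, uploads are served directly.
# Both paths below are hypothetical examples, not taken from a real site.
PROTECTED_PAGES = {"/members/reports/"}
UPLOADED_FILES = {"/wp-content/uploads/2013/08/sales-report.pdf"}


def can_view(path: str, logged_in: bool) -> bool:
    """Return True if a visitor can fetch `path` in this simplified model."""
    if path in PROTECTED_PAGES:
        # The CMS checks the login before showing the page (and the link on it).
        return logged_in
    if path in UPLOADED_FILES:
        # The web server hands the file to anyone who knows the URL.
        return True
    return False  # unknown path


# An anonymous visitor cannot see the page that holds the link...
print(can_view("/members/reports/", logged_in=False))
# ...but can still open the document if they have its address.
print(can_view("/wp-content/uploads/2013/08/sales-report.pdf", logged_in=False))
```

The login only guards the first branch. Nothing in the second branch ever asks who you are, which is the whole problem.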

So anyone can find my document then?

Well… No. And errr… yes.

If you have a document hosted on your website and a link to it doesn’t appear anywhere on your website – or is on a password protected page – then there will be no way for a general visitor clicking through your site to accidentally come across it. What this method of “hiding” access to your file doesn’t take into account, however, is someone sharing the link to the document.

Example: You upload a commercially sensitive sales report to your website and send the link out to the sales team, or a customer, or your supplier via email.  There is nothing stopping the recipient of that email forwarding on that link to other people to give them access to that report.  If your content is super sensitive then you certainly shouldn’t be hosting it directly on an open platform.

The other (even bigger) issue is that Google is awesome. And that can be a problem. Microsoft and Adobe have built fantastic, open, SEO-friendly document formats that search engines such as Google, Bing and Yahoo can readily index. That’s a good thing when you’re uploading the latest race results or committee meeting minutes for your classic car club. It’s not a good thing when you’re uploading a list of your employees and contractors and their after-hours contact details.

So what can we do?

We can put all of your documents into a password protected folder on your website.  That’s pretty simple and it stops search engines from getting anywhere near your documents.  What it doesn’t do is make life easy for your end user. Every time they try to access your document they’ll be asked to enter a username and password, providing a hurdle to a comfortable User Experience (UX).
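One common way a folder gets password protected is HTTP Basic authentication, where the browser pops up a username and password box. The sketch below shows roughly what the server is checking each time. It assumes Basic authentication is the mechanism in use, and the `/protected/` folder, the credentials and the function names are all invented for the example. A real site would leave this to the web server and would store a password hash, not the password itself.

```python
import base64
import hmac

# Hypothetical credentials, for illustration only.
USERNAME = "sales"
PASSWORD = "correct-horse"


def is_authorised(authorization_header):
    """Check an HTTP Basic 'Authorization' header against the expected login."""
    if not authorization_header or not authorization_header.startswith("Basic "):
        return False
    try:
        encoded = authorization_header[len("Basic "):]
        decoded = base64.b64decode(encoded).decode("utf-8")
    except ValueError:  # covers bad base64 and bad UTF-8
        return False
    username, separator, password = decoded.partition(":")
    if not separator:
        return False
    # compare_digest avoids leaking information through timing differences.
    user_ok = hmac.compare_digest(username.encode(), USERNAME.encode())
    pass_ok = hmac.compare_digest(password.encode(), PASSWORD.encode())
    return user_ok and pass_ok


def status_for(path, authorization_header=None):
    """Return the HTTP status for a request; 401 makes the browser ask for a login."""
    if path.startswith("/protected/") and not is_authorised(authorization_header):
        return 401
    return 200


header = "Basic " + base64.b64encode(b"sales:correct-horse").decode("ascii")
print(status_for("/protected/sales-report.pdf"))          # no login supplied
print(status_for("/protected/sales-report.pdf", header))  # correct login supplied
```

Every request for a file in the folder goes through the same check, which is why a search engine crawler without the login never gets to the document, and also why your users see the prompt.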

There are also methods that web developers can use to request that search engines do not index certain parts of your website, including the location you upload your files to. Well-behaved search engines such as Google, Bing and Yahoo will honour that request and won’t provide links to your sensitive data to the general public. What it won’t stop, though, are less scrupulous search engines or even malware crawlers trawling through your documents and pulling out the data therein. There’s also the concern that you might want some documents to be indexed and some not to be. Most CMS platforms – WordPress included – don’t work like this out-of-the-box, as they upload all files to a single location which has to be publicly accessible for good SEO.
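The usual way of making that request is a robots.txt file. As a rough illustration, the Python sketch below uses the standard library’s `urllib.robotparser` to show how a polite crawler decides whether it may fetch a URL. The robots.txt content, the example.com addresses and the `is_allowed` helper are assumptions made up for the example; `/wp-content/uploads/` is WordPress’s default upload folder, but your site may differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking every crawler to stay out of the uploads folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-content/uploads/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


report = "https://example.com/wp-content/uploads/2013/08/sales-report.pdf"
about = "https://example.com/about/"

print(is_allowed(ROBOTS_TXT, "Googlebot", report))  # uploads folder is disallowed
print(is_allowed(ROBOTS_TXT, "Googlebot", about))   # the rest of the site is fine
```

Notice that the check happens on the crawler’s side. A crawler that never bothers to call anything like `is_allowed` can simply download the report, which is why robots.txt is a polite request and not a lock.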

So does that mean there’s nothing you can do to prevent access to your files? Not entirely, no. Though there are very few simple answers to this ever-present concern.

File hosting services

There’s a plethora of file hosting services that can be used to ensure that your documents are not being indexed by search engines. Dropbox is one of our personal favourites, though there are others, such as the far more collaborative document storage solutions of Google’s Google Drive, Microsoft’s SkyDrive and Evernote, to name but a few. I notice even Apple is getting in on the cloud application game with the recent inclusion of beta versions of Pages, Numbers and Keynote on icloud.com. There’s such a range of products and prices for these services too (there are even a few freebies amongst them), and all of them provide more fine-grained control over access to your precious documents and data. You can choose to share them with no one, someone, a whole team or the entire internet. It’s really up to you.

The beauty of using these third party services is that they take care of the security for you.  They also stop the search engines from getting to your sensitive data.

Problem solved then?

Well… not really, no. As with almost all web-based services, there is the human element to consider. Just as in the example of sharing a link, it is possible to share login details (usernames and passwords) along with the document link. If you’re in charge of state secrets and you post them anywhere on the internet that doesn’t include biometric scanning then you’re going to have a bad day. But in most cases best effort gets you 99% of the way to solving your problem, and you’re almost certainly not going to be able to guarantee 100% document security.

What can I do if I’m concerned about my data?

Talk to us. If you’re in doubt about how secure your files are then we’re happy to help you out and try to find a solution that’s right for you. In the meantime, if you’re worried that Slugworth might find your recipe for the Everlasting Gobstopper… Just. Don’t. Post. It. Online.