Crypteron introduces secure, searchable encryption

What's the best way to search through my encrypted data?

We've had customers ask us this question several times before. So now we're proud to announce that Crypteron natively supports searchable encryption! In this post, we'll be covering exact searches, wildcard searches as well as fuzzy searches. And we'll be doing that via industry-standard, battle-tested encryption algorithms. All this is available today and follows the same uber-simple programming model you've come to expect from us. Hooray 🎉!

But first - the natural tension

One of the fundamental objectives of strong encryption is to eradicate all patterns from encrypted data. Everything should look like noise or garbage. And all garbage should look the same. However, searching depends on patterns to traverse the search space. So there is a natural tension between strong encryption and efficient searching. These fundamentally opposing forces are why it's very difficult to combine encryption with searching.

The broken ways

Before we dive into how Crypteron does it, let's cover some of the broken ways other platforms have taken to achieve searchable encryption. They perform "encryption" in a way that utterly breaks the promises of modern, strong cryptography. Such crypto-sins include using AES in ECB mode or using a zero or constant initialization vector (IV). So even if you are technically "encrypting" your data, it's not secure. In fact, some systems make this bad situation even worse by then encrypting each word individually! This utterly destroys AES security. Even the simplest of frequency analysis attacks can effortlessly decrypt the "encrypted" data, in real time, on low-end mobile processors.

The images below show the original unencrypted source ("plain text"), an "encrypted" version using the above-mentioned kludges/hacks, and finally a version encrypted the right way with modern encryption.

[Images: original source | broken cryptography (ECB encrypted) | cryptography done correctly]

Another example

[Images: original source | broken cryptography (ECB encrypted) | cryptography done correctly]

You can visually see the leakage of information above. For those concerned with compliance, none of the above approaches would pass NIST or NSA criteria.
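The same leak can be reproduced in a few lines. The following is a minimal sketch using only the standard Java crypto APIs; the class and method names (EcbLeakDemo, encryptEcb, encryptGcm, newKey) are made up for this illustration and are not part of Crypteron. It encrypts the same word twice in each mode and checks whether the two outputs are identical.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

// Illustration only: hypothetical helper names, not Crypteron code.
final class EcbLeakDemo {

    // AES in ECB mode: deterministic, so equal plaintexts give equal ciphertexts.
    static byte[] encryptEcb(SecretKey key, byte[] plain) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);
            return cipher.doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // AES in GCM mode with a fresh random 12-byte IV; returns IV followed by ciphertext.
    static byte[] encryptGcm(SecretKey key, byte[] plain) {
        try {
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] ct = cipher.doFinal(plain);
            byte[] out = Arrays.copyOf(iv, iv.length + ct.length);
            System.arraycopy(ct, 0, out, iv.length, ct.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Generates a random 256-bit AES key for the demo.
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] word = "Maria".getBytes(StandardCharsets.UTF_8);
        System.out.println("ECB repeats: "
                + Arrays.equals(encryptEcb(key, word), encryptEcb(key, word)));
        System.out.println("GCM repeats: "
                + Arrays.equals(encryptGcm(key, word), encryptGcm(key, word)));
    }
}
```

With ECB the two ciphertexts match, which is exactly the pattern a frequency analysis exploits. With GCM and a random IV they do not.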

The experimental ways

Exotic and experimental cryptography such as homomorphic encryption or order-preserving encryption attracts a lot of academic interest. The ultimate goal is to permit certain operations (like searching) over encrypted data without loss of privacy or integrity. However, order-preserving encryption has been proven to leak data. Homomorphic encryption hasn't proven very strong either. Plus, depending on which expert you talk to, it's about a million to a billion times slower than today's encryption systems. Commercial feasibility, if ever, is projected to be about 20-30 years away!

The point is that there is no need to risk your valuable data on unproven, experimental encryption algorithms. You get a false sense of security, end up wasting your security budget and get distracted from real solutions.

The solution

Short version

If you're in a hurry, just know that Crypteron users only have to put [Secure(Opt.Search)] (in C#) or @Secure(opts = Opt.SEARCH) (in Java) in front of their search fields. Crypteron takes care of everything behind the scenes. Here are actual examples showing it in action.

C# Example

// Attributes on data class
public class Patient
{
    public int Id {get; set;}
    
    [Secure]
    public string FullName {get; set;}
    
    [Secure(Opt.Search)]
    public string SocialSecurityNumber {get; set;}
}

// To search for SSN 123-456-7890, 
// generate a search prefix
var searchPrefix = 
    SecureSearch.GetPrefix("123-456-7890");

// Use the search prefix in a query
var foundPatient = secDb.Patients.Where(p =>
    p.SocialSecurityNumber.StartsWith(searchPrefix)
);

Java Example

// Annotations on data class
public class Patient
{
    private int id;

    @Secure
    private String fullName;

    @Secure(opts = Opt.SEARCH)
    private String socialSecurityNumber;
}

// To search for SSN 123-456-7890,
// generate a search prefix
final String searchPrefix = 
    SecureSearch.getPrefix("123-456-7890");

// Use the search prefix in a query:
final TypedQuery<Patient> query =
    entityManager.createQuery("SELECT p FROM Patient p WHERE p.socialSecurityNumber LIKE :searchPrefix", Patient.class);
query.setParameter("searchPrefix", searchPrefix + "%");
final Patient foundPatient = query.getSingleResult();

Long version - behind the scenes orchestration

Behind the scenes, Crypteron is generating an in-place, cryptographically secure, distributed search index. This happens as each piece of data is added, and the index is constructed on-the-fly on a per searchable column/field basis. The distributed search index uses an HMAC-SHA256 primitive, and the HMAC cryptographic keys are entirely separate from the data encryption keys. This encrypted search index is distributed across all searchable fields and its storage adds about 33 bytes to each searchable field. The run-time performance impact is negligible, almost the same as non-searchable fields. Of course, you are shielded from all the complexities - the platform orchestrates it auto-magically behind the scenes.

When searching the database, Crypteron's SDK provides an API that returns a search token. You pass this search token to the database to perform a native query, all without decrypting any data! So if you're searching for a "Maria" in your database - you immediately get it back at native lookup speeds.
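To make the idea concrete, here is a rough sketch of how a keyed search token could be derived with HMAC-SHA256 and matched by prefix. This is not Crypteron's actual implementation: the class BlindIndexSketch, its methods, the Base64 encoding and the "token in front of the ciphertext" layout are all assumptions made for illustration, and the resulting token length is not meant to match the 33 bytes quoted above.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;

// Illustration only: hypothetical names and layout, not Crypteron's implementation.
final class BlindIndexSketch {

    // Derives a deterministic token from the value under a dedicated HMAC key.
    // The same key and value always give the same token; without the key the
    // token cannot be recomputed from a guessed value.
    static String searchToken(byte[] hmacKey, String value) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(hmacKey, "HmacSHA256"));
            byte[] tag = mac.doFinal(value.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().withoutPadding().encodeToString(tag);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Assumed storage layout: token first, then the separately encrypted value.
    static String storedField(String token, String encryptedValue) {
        return token + encryptedValue;
    }

    // What a StartsWith / LIKE 'token%' query would evaluate in the database.
    static boolean matches(String storedField, String token) {
        return storedField.startsWith(token);
    }

    public static void main(String[] args) {
        byte[] hmacKey = "demo-hmac-key-not-for-production".getBytes(StandardCharsets.UTF_8);
        String token = searchToken(hmacKey, "Maria");
        String row = storedField(token, "<ciphertext placeholder>");
        System.out.println("token length: " + token.length());
        System.out.println("match for Maria: " + matches(row, searchToken(hmacKey, "Maria")));
        System.out.println("match for Marie: " + matches(row, searchToken(hmacKey, "Marie")));
    }
}
```

Because the token is computed with a key that is separate from the data encryption key, the database can compare tokens without ever seeing plaintext or being able to decrypt the stored value.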

All under warranty

What's great is that all other Crypteron features continue to work just fine. This means your actual data is encrypted with AES in GCM mode (super strong) and uses unique, cryptorandom IVs. You also get both self-integrity and tamper protection. Note that tamper protection is subtly distinct from self-integrity. Integrity means that an attacker cannot modify encrypted data (e.g. an intern's salary) without an alarm going off. Tamper protection ensures that one cannot replace one perfectly fine encrypted value with another (e.g. replace the intern's encrypted salary with the CEO's encrypted salary) without an alarm going off. Crypteron effortlessly gives you both.
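The difference between the two properties can be sketched with plain AES-GCM. One common way to get tamper protection, assumed here purely for illustration and not confirmed as Crypteron's mechanism, is to bind each ciphertext to its location (row and column) through GCM's additional authenticated data (AAD). The class ContextBoundGcm and the context strings are hypothetical.

```java
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Optional;

// Illustration only: hypothetical names, not Crypteron's implementation.
final class ContextBoundGcm {
    private static final int IV_LEN = 12;
    private static final int TAG_BITS = 128;

    // Encrypts with a fresh random IV and authenticates the context as AAD.
    // Output layout: IV followed by ciphertext and tag.
    static byte[] encrypt(SecretKey key, String context, byte[] plain) {
        try {
            byte[] iv = new byte[IV_LEN];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            cipher.updateAAD(context.getBytes(StandardCharsets.UTF_8));
            byte[] ct = cipher.doFinal(plain);
            byte[] out = Arrays.copyOf(iv, IV_LEN + ct.length);
            System.arraycopy(ct, 0, out, IV_LEN, ct.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the plaintext, or empty if the data or its context was altered.
    static Optional<byte[]> decrypt(SecretKey key, String context, byte[] blob) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, blob, 0, IV_LEN));
            cipher.updateAAD(context.getBytes(StandardCharsets.UTF_8));
            return Optional.of(cipher.doFinal(blob, IV_LEN, blob.length - IV_LEN));
        } catch (AEADBadTagException e) {
            return Optional.empty(); // the "alarm going off"
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] keyBytes = new byte[32];
        new SecureRandom().nextBytes(keyBytes);
        SecretKey key = new SecretKeySpec(keyBytes, "AES");

        byte[] ceoSalary = encrypt(key, "employee:1:salary",
                "900000".getBytes(StandardCharsets.UTF_8));

        // Tamper attempt: copy the CEO's valid ciphertext into the intern's row.
        System.out.println("swap accepted: "
                + decrypt(key, "employee:7:salary", ceoSalary).isPresent());

        // Integrity attempt: flip one bit of an otherwise valid ciphertext.
        byte[] modified = ceoSalary.clone();
        modified[modified.length - 1] ^= 1;
        System.out.println("modification accepted: "
                + decrypt(key, "employee:1:salary", modified).isPresent());
    }
}
```

Both attempts are rejected: the bit flip fails the GCM tag check on its own, while the swap only fails because the row context is part of what is authenticated.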

Advanced searches

Wildcards (e.g. "Mar*")

The above is great when handling exact matches like, for example, "Maria". But what about other search patterns? For example, "Mar*"? The general idea is to first list your search requirements. Then build a specific search index for each as an optimization. This may sound complex, but it's really simple. Let's illustrate with an example. We'll use C# syntax; Java is similar.

Business requirement: Must be able to search by the first three letters of a customer's first name (Example: "Mar*")

Steps:

  1. Create another field, say, FirstNameFilter. This will only contain the first three letters, in lower case, of the customer's first name. So while FirstName may contain "Maria", FirstNameFilter will contain "mar". As you'll see, the lower-case trick increases the versatility of this approach.
  2. Mark the FirstNameFilter field as Secure-Searchable, i.e. [Secure(Opt.Search)]. Note that FirstName itself could be marked as [Secure] or [Secure(Opt.Search)]. The latter adds an exact search use case if you have one.
  3. Lower-case the first three letters you receive and pass them to GetPrefix() in the Crypteron SDK/agent library to get the search token.
  4. Issue the query as usual to the database using that search token.

This way you'll get all customers like Mary, mary, Martha, maRTHa, Margaret, mariA, marie, Marilyn etc at native database search speeds.
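The filter value from step 1 is simple to compute. Below is a small Java sketch of that pre-processing step; the class NameFilter and the method firstNameFilter are hypothetical helpers written for this post's example, and trimming surrounding whitespace is an extra assumption. The same function would be applied both when saving the record and when preparing the user's search input, before handing the result to SecureSearch.getPrefix(...).

```java
import java.util.Locale;

// Illustration only: hypothetical helper, not part of the Crypteron SDK.
final class NameFilter {

    // Returns the first three letters of the name in lower case
    // (or the whole name if it is shorter than three characters).
    static String firstNameFilter(String firstName) {
        String normalized = firstName.trim().toLowerCase(Locale.ROOT);
        return normalized.substring(0, Math.min(3, normalized.length()));
    }

    public static void main(String[] args) {
        String[] names = {"Mary", "mary", "Martha", "maRTHa", "Margaret", "mariA"};
        for (String name : names) {
            System.out.println(name + " -> " + firstNameFilter(name));
        }
    }
}
```

Every name in the list maps to "mar", so a single search token covers all of them regardless of how the name was capitalized.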

Extend this pattern if you have similar search requirements on other fields. For example: Search via first 3 characters of last name or last 4 of social security number.

Fuzzy searches (e.g. "1-2-3", "12 3", "123")

What about fuzzy searches? For example, a US formatted phone number where (123) 456-7890, 1234567890 and 123 456 7890 all really mean the same thing.

You guessed it - create a special search index. Except now you pre-process the string to strip the non-digit characters. The same approach works for dates like 12/31/2012, 12-31-2012 or 12.31.2012 and so on.
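As a sketch of that pre-processing step, the Java helper below strips everything except digits before the value is stored in the special search field or turned into a search token. The class FuzzyNormalizer and the method digitsOnly are hypothetical names used only for this illustration.

```java
// Illustration only: hypothetical helper, not part of the Crypteron SDK.
final class FuzzyNormalizer {

    // Removes every character that is not an ASCII digit.
    static String digitsOnly(String raw) {
        return raw.replaceAll("[^0-9]", "");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("(123) 456-7890"));
        System.out.println(digitsOnly("123 456 7890"));
        System.out.println(digitsOnly("12/31/2012"));
        System.out.println(digitsOnly("12.31.2012"));
    }
}
```

All phone formats collapse to 1234567890 and all three date formats to 12312012. One caveat for dates: this only works if months and days are zero-padded, since 1/12/2012 and 11/2/2012 would otherwise both normalize to 1122012.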

Conclusion

There you have it, searchable encryption over your encrypted data at native search speeds. The above scenarios should cover the vast majority of business requirements. All without compromising the security of your data via broken or experimental cryptography.

If you have any questions, comments or concerns, please do not hesitate to drop us a line at [email protected]. We'd love to solve your data security challenges.

by Sid Shetye time to read: 5 min