Wednesday, July 20, 2005

Google@moon

Google map now has a branch in the moon! Google Moon, as it called, is unfortunately lacking the powerful locality search for now. Not until July 20th, 2069..... sigh!

Tuesday, July 19, 2005

Should we use Transparent Data Encryption?

If your data requires the highest level of privacy and security, by all means, use it. However if it falls somewhere in the mid to low range, then you may want to carefully consider your options.
 
Transparent Data Encryption (TDE) is a new feature of Oracle 10gR2 database that provides transparent encryption and decryption of table columns. Transparent means there is no code change required, all is handled internally by the database. So when you issue a SQL query, the database automatically decrypts the information and presents it to you. The developer may not even realize that the column has been encrypted. This in essence provides out of the box solution for physical data protection. No one can peep into your backup tape and see all info in clear text.
 
As with any additional layer of operation, TDE will need extra CPU cycles and thus will add to overall application latency. For a comparison, I did a quick test to see the difference in the speed of my queries for both with and without TDE.
 
Here is the setup: I have a hypothetical phonebook table and populated 1 million entries to it. Created two tables: normal table and one with TDE for two of its columns. Created an index for one of its column and compare the execution time.
 
Setup of normal table:
 

SQL> create table phonebook(

  2    ID      NUMBER,

  3    NAME    VARCHAR2(24),

  4    REGION  NUMBER,

  5    AREA    NUMBER,

  6    NO      NUMBER

  7  );

 

Table created.

 

Elapsed: 00:00:00.03

SQL > Begin

  2    dbms_random.initialize(614573482);

  3    For i in 1..1000000

  4    Loop

  5       Insert /*+ append */ into phonebook

  6       values(i, dbms_random.string('U',24), round(dbms_random.value(1,10)), round(dbms_random.value(100,200)), round(dbms_random.value(90000000,99999999)));

  7       If mod(i, 10000) = 0 then

  8       Commit;

  9      End if;

 10    End loop;

 11    Commit;

 12  End;

 13  /

 

PL/SQL procedure successfully completed.

 

Elapsed: 00:06:00.94

SQL> select count(*) from phonebook;

 

  COUNT(*)

----------

   1000000

 

Elapsed: 00:00:00.06

SQL > create index phonebook_idx on phonebook(AREA);

 

Index created.

 

Elapsed: 00:00:04.70

SQL> exec dbms_stats.gather_table_stats( 'jay', 'phonebook', cascade=>true );

 

PL/SQL procedure successfully completed.

 

Elapsed: 00:00:03.81

 
Setup of TDE table:
 

SQL> create table phonebook_secure(

  2    ID      NUMBER,

  3    NAME    VARCHAR2(24),

  4    REGION  NUMBER,

  5    AREA    NUMBER ENCRYPT USING 'AES128' NO SALT,

  6    NO      NUMBER ENCRYPT USING 'AES128'

  7  );

 

Table created.

 

Elapsed: 00:00:00.03

SQL> Begin

  2    dbms_random.initialize(614573482);

  3    For i in 1..1000000

  4    Loop

  5       Insert /*+ append */ into phonebook_secure

  6       values(i, dbms_random.string('U',24), round(dbms_random.value(1,10)), round(dbms_random.value(100,200)), round(dbms_random.value(90000000,99999999)));

  7       If mod(i, 10000) = 0 then

  8       Commit;

  9      End if;

 10    End loop;

 11    Commit;

 12  End;

 13  /

 

PL/SQL procedure successfully completed.

 

Elapsed: 00:08:12.75

SQL> select count(*) from phonebook_secure;

 

  COUNT(*)

----------

   1000000

 

Elapsed: 00:00:00.17

SQL> create index phonebook_secure_idx on phonebook_secure(AREA);

 

Index created.

 

Elapsed: 00:00:18.58

SQL> exec dbms_stats.gather_table_stats( 'jay', 'phonebook_secure', cascade=>true );

 

PL/SQL procedure successfully completed.

 

Elapsed: 00:00:06.20

SQL > select count(*) from phonebook_secure;

 

  COUNT(*)

----------

   1000000

 

Elapsed: 00:00:00.16

 

Note: the random generator will create the same set of data as we fed it with the same seed number.
 
And here are some queries against both of them:
 

SQL> select count(*) from phonebook_secure where area = '123';

 

  COUNT(*)

----------

      9884

 

Elapsed: 00:00:00.10

 

Execution Plan

----------------------------------------------------------

   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=68 Card=1 Bytes=4)

   1    0   SORT (AGGREGATE)

   2    1     INDEX (RANGE SCAN) OF 'PHONEBOOK_SECURE_IDX' (INDEX) (Co

          st=68 Card=9786 Bytes=39144)

 

Statistics

----------------------------------------------------------

        607  recursive calls

          0  db block gets

        162  consistent gets

         82  physical reads

          0  redo size

        351  bytes sent via SQL*Net to client

        504  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

         10  sorts (memory)

          0  sorts (disk)

          1  rows processed

 

SQL> select count(*) from phonebook where area = '123';

 

  COUNT(*)

----------

      9884

 

Elapsed: 00:00:00.07

 

Execution Plan

----------------------------------------------------------

   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=23 Card=1 Bytes=4)

   1    0   SORT (AGGREGATE)

   2    1     INDEX (RANGE SCAN) OF 'PHONEBOOK_IDX' (INDEX) (Cost=23 C

          ard=9832 Bytes=39328)

 

Statistics

----------------------------------------------------------

        283  recursive calls

          0  db block gets

         63  consistent gets

         29  physical reads

          0  redo size

        351  bytes sent via SQL*Net to client

        504  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

          6  sorts (memory)

          0  sorts (disk)

          1  rows processed

 

SQL> select region, name, area, no from phonebook_secure where area = '156' order by 1,2;

 

    REGION NAME                           AREA         NO

---------- ------------------------ ---------- ----------

        10 ZXETXFLVDSGEUMEAXSVYZUYD        156   90593596

        10 ZZVHCAVLYHBCJVDAWGBXBKVW        156   94108548

        10 ZZYZOWHOQGQPPITIKUMQAYDQ        156   91832718

………………

 

9947 rows selected.

 

Elapsed: 00:00:21.77

 

Execution Plan

----------------------------------------------------------

   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=4131 Card=9786 Byt

          es=371868)

 

   1    0   SORT (ORDER BY) (Cost=4131 Card=9786 Bytes=371868)

   2    1     TABLE ACCESS (FULL) OF 'PHONEBOOK_SECURE' (TABLE) (Cost=

          4029 Card=9786 Bytes=371868)

 

Statistics

----------------------------------------------------------

         64  recursive calls

          0  db block gets

      18021  consistent gets

      18002  physical reads

          0  redo size

     412900  bytes sent via SQL*Net to client

       7797  bytes received via SQL*Net from client

        665  SQL*Net roundtrips to/from client

          1  sorts (memory)

          0  sorts (disk)

       9947  rows processed

 

SQL> select region, name, area, no from phonebook where area = '156' order by 1,2;

 

    REGION NAME                           AREA         NO

---------- ------------------------ ---------- ----------

        10 ZXETXFLVDSGEUMEAXSVYZUYD        156   90593596

        10 ZZVHCAVLYHBCJVDAWGBXBKVW        156   94108548

        10 ZZYZOWHOQGQPPITIKUMQAYDQ        156   91832718

 

9947 rows selected.

 

Elapsed: 00:00:11.18

 

Execution Plan

----------------------------------------------------------

   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1606 Card=9832 Byt

          es=373616)

 

   1    0   SORT (ORDER BY) (Cost=1606 Card=9832 Bytes=373616)

   2    1     TABLE ACCESS (FULL) OF 'PHONEBOOK' (TABLE) (Cost=1505 Ca

          rd=9832 Bytes=373616)

 

Statistics

----------------------------------------------------------

         64  recursive calls

          0  db block gets

       6688  consistent gets

       6671  physical reads

          0  redo size

     412900  bytes sent via SQL*Net to client

       7797  bytes received via SQL*Net from client

        665  SQL*Net roundtrips to/from client

          1  sorts (memory)

          0  sorts (disk)

       9947  rows processed

 

Summary:
 

Operation

Normal Table

TDE Table

% Difference

Insert

00:06:00.94

00:08:12.75

26.7 %

Create index

00:00:04.70

00:00:18.58

37 %

Count(*) statement

00:00:00.07

00:00:00.10

30 %

Select statement

00:00:11.18

00:00:21.77

48.6 %

 
It looks like the operation against normal column is typically 26 - 50% faster than that of TDE column.
Yes, statistically we should collect more sample data in order to get more precise figures, but this quick test gives a general feel of it.
 
Keeping this in mind, we should consider the followings when deciding whether to use TDE in our application:
  • Can we architect the system to cater for the extra processing? Extra processing means our system will not scale as well as we thought.
  • Can we keep references to the encrypted columns to absolute minimum? Eg. encrypted primary key should be avoided.
  • If no extra changes implemented, will the response time as seen by the end user be acceptable?
If the answer for most of the questions is NO, then maybe we should find other way than using TDE.
 
For more information about TDE, check out Oracle documentation.