top of page

External Tokenization In Snowflake

Updated: Mar 16

Authors: Kumar Shivam Singh & Kashish Bassi


Snowflake offers a wide range of security features like column masking policies, row-level access policies, and external tokenization to its customers. This blog describes external tokenization and its implementation on the AWS platform using a sample application from Protegrity (tokenization provider).


External Tokenization

The external tokenization feature allows the accounts to tokenize data before loading it into Snowflake and then detokenize the data at query runtime. Tokenization is the process of substituting a sensitive data element with a token value that has no meaningful value if breached. External tokenization works using Snowflake masking policies along with external functions.


The data is tokenized using a tokenization provider and then loaded into Snowflake; this ensures that sensitive data is never exposed unnecessarily. When any user executes a query, the masking policy is applied to the column at every location where the column appears. Based on the role hierarchy and other policy conditions, the masking policy calls the external functions. The user gets the data values either as the original (detokenized) or tokenized values.


Benefits
  • Pre-load tokenized data: As the data is already tokenized using a tokenization provider before loading into the Snowflake, it ensures that the sensitive data is never exposed unnecessarily to a user.

  • Ease of use: Once defined, a policy can be applied to any number of columns across the database and schema.

  • Change management: The content of the masking policy can be easily changed without having to reapply across all columns


Encryption vs. Tokenization

Encryption

Tokenization

Converts plain text value into an unreadable form called ciphertext with the help of encryption key and algorithm

Substitutes the data value with a token value with no meaningful value and uses a token database that stores the relationship between the token and the original value

Scales up to large data volumes with encryption keys to decrypt data

Scaling-up in Tokenization gets difficult as database size increases

Can be used for structured as well as unstructured data

Can be used only for structured data fields

The original data leaves the organization but in the encrypted form

The original data never leaves the organization and thus satisfies compliance requirements


External Tokenization Integration options on Different Cloud Platforms:

1. AWS – External Tokenization on AWS can be done using either of the two integration partners:

  • Protegrity

  • Baffle

2. Azure – External Tokenization on Azure can be done using either:

  • Baffle

  • Custom integration

3. Google Cloud Platform – External Tokenization on Google Cloud Platform(GCP) can be done using Custom integrations.


External Tokenization on AWS with Protegrity Sample Application

The flow of data in external tokenization using the Protegrity solution for Snowflake is provided in the figure below


Credits: Protegrity for Snowflake


Steps for External Tokenization with Protegrity Sample application:

1. Create a Warehouse; Database named as ‘PROTEGRITYDATABASE’ in the Snowflake account. Create table ‘PROTEGRITYTABLE’ in the database to store the tokenized data.


2. Create External Functions to protect and unprotect data elements using the Protegrity service.

The code for creating the external function on the Name column as per the Protegrity trial experience is given below. The external functions for the remaining columns can be created in a similar manner.

— 1 NAME

CREATE OR REPLACE SECURE EXTERNAL FUNCTION PTY_UNPROTECT_NAME ( val varchar )

RETURNS varchar

NOT NULL

IMMUTABLE

COMMENT = ‘Unprotects for first and last name using an alpha token type.’

API_INTEGRATION = protegrity_api

HEADERS =(

‘X-Protegrity-HCoP-Rules’='{\”payload_type\”:\”JSON\”,\”jsonpaths\”:[{\”jsonpath\”:\”$.data[*][1]\”,\”op_type\”:\”unprotect\”,\”data_element\”:\”deName\”}]}’

)

CONTEXT_HEADERS = ( current_user, current_timestamp, current_account )

AS ‘https://vbfsp49jci.execute-api.us-east-1.amazonaws.com/SF_CUSTOMER’;

————————————————————————————————————————-

Credits: Protegrity for Snowflake


3. Masking policies, once applied to the columns, determine whether a user will be able to view the de-tokenized value or the tokenized value based on the role and specified conditions. Create Masking policies on all the columns as per the Protegrity Trial experience code given below:


— Masking Policy 1

create or replace masking policy pii_deName as (val string) returns string ->

case

when current_role() in (‘HR’,’FINANCE’, ‘CUSTOMERREP’) then PTY_UNPROTECT_NAME(val)

else val

end;

— Masking Policy 2

create or replace masking policy pii_deCity as (val string) returns string ->

case

when current_role() in (‘HR’,’FINANCE’, ‘CUSTOMERREP’) then PTY_UNPROTECT_CITY(val)

else val

end;

— Masking Policy 3

create or replace masking policy pii_dePostCD as (val string) returns string ->

case

when current_role() in (‘HR’) then PTY_UNPROTECT_POSTCD(val)

else val

end;

— Masking Policy 4

create or replace masking policy pii_deAddress as (val string) returns string ->

case

when current_role() in (‘HR’) then PTY_UNPROTECT_ADDRESS(val)

else val

end;

— Masking Policy 5

create or replace masking policy pci_deIBAN as (val string) returns string ->

case

when current_role() in (‘FINANCE’, ‘CUSTOMERREP’) then PTY_UNPROTECT_IBAN(val)

else val

end;

— Masking Policy 6

create or replace masking policy pii_deSSN as (val string) returns string ->

case

when current_role() in (‘HR’,’FINANCE’, ‘CUSTOMERREP’) then PTY_UNPROTECT_SSN(val)

else val

end;

— Masking Policy 7

create or replace masking policy pii_deAge as (val string) returns string ->

case

when current_role() in (‘HR’,’FINANCE’, ‘CUSTOMERREP’) then PTY_UNPROTECT_AGE(val)

else val

end;

— Masking Policy 8

create or replace masking policy pii_deEmail as (val string) returns string ->

case

when current_role() in (‘HR’,’FINANCE’, ‘CUSTOMERREP’) then PTY_UNPROTECT_EMAIL(val)

else val

end;

— Masking Policy 9

create or replace masking policy pii_deDOB as (val date) returns date ->

case

when current_role() in (‘HR’) then cast(PTY_UNPROTECT_DOB(val) as date)

else val

end;

— Masking Policy 10

create or replace masking policy pci_deCCN as (val string) returns string ->

case

when current_role() in (‘FINANCE’,’CUSTOMERREP’) then PTY_UNPROTECT_CCN(val)

else val

end;

————————————————————————————————————————-

Credits: Protegrity for Snowflake


4. Apply masking policies to columns using the below code:


alter table PROTEGRITYTABLE modify column first_name set masking policy pii_deName;

alter table PROTEGRITYTABLE modify column last_name set masking policy pii_deName;

alter table PROTEGRITYTABLE modify column city set masking policy pii_deCity;

alter table PROTEGRITYTABLE modify column postcode set masking policy pii_dePostCD;

alter table PROTEGRITYTABLE modify column street set masking policy pii_deAddress;

alter table PROTEGRITYTABLE modify column iban set masking policy pci_deIBAN;

alter table PROTEGRITYTABLE modify column ssn set masking policy pii_deSSN;

alter table PROTEGRITYTABLE modify column age set masking policy pii_deAge;

alter table PROTEGRITYTABLE modify column email set masking policy pii_deEmail;

alter table PROTEGRITYTABLE modify column birthday set masking policy pii_deDOB;

alter table PROTEGRITYTABLE modify column cc set masking policy pci_deCCN;

– Loading the tokenized file provided by Protegrity


Select the warehouse



And load the Tokenized CSV File provided by Protegrity



Choose file format



Load the table –



Now, we are all set to show some of the cases that will demonstrate how the Protegrity for Snowflake Solution delivers only the data that a user needs to perform their job function with the help of external functions.


CASES

1.

Privileged Role like- ACCOUNTADMIN or SYSADMIN

Running Query-

SELECT * FROM PROTEGRITYTABLE;



Insight-

The query is performed by any member of the Privileged Role, i.e. ACCOUNTADMIN or SYSADMIN. We can see that the Privileged Role is not able to view any sensitive data in the clear. Tokenized value can be seen in the field since the data at rest is being de-identified.

This proves that sensitive data at rest is protected as the privileged roles themselves are not able to view de-tokenized data.


2.

Role – HR Role (Jacek)

Running Query-

SELECT * FROM PROTEGRITYTABLE;



Insight-

The query is performed by any member of the HR Role. Jacek from HR shows that even if Jacek gets to see much other information like the first name, email, etc. still financial account numbers like IBAN should not be seen. In these queries, the data at rest is de-identified or protected.


3.

Role – FINANCE Role (Monica)

Running Query-

SELECT * FROM PROTEGRITYTABLE;



Insight-

The constraints on the finance role are applied so that they can see the Credit Card Number, but SSN is masked to show the last four digits.


3.

Role – CUSTOMERREP Role (Maria)

Running Query-

SELECT * FROM PROTEGRITYTABLE;



Insight-

The Customer Service Rep is restricted to the last four digits of the SSN, Credit Cards are Nulled, and the IBAN account is masked.


CONCLUSION-

Encryption is a good technology for organizations that are using it for point-to-point encryption. And depending on who’s managing the keys, you can reduce scope using encryption in that manner.


As far as data at rest in your environment, you’re not reducing the scope, and you’re not reducing cost whatsoever because you maintain the keys. You can get back to the original data set.


So why not Tokenize it and get it out of your environment? If somebody

steals a token, you can’t ever get back to the original value. What you’re going to

have stolen is a surrogate value that has absolutely no meaning to it. So

tokenization is a great technology for organizations that don’t need to retain the

sensitive data or necessarily ever get back to it.


References: Protegrity for SnowflakeProtegrity Solution for Snowflake Trial Experience

111 views0 comments

Recent Posts

See All
bottom of page