Snowflake Features- Data Sharing, Data Masking, Monitoring

Author: Manohar Perabathula


DATA SHARING


Introduction:

Snowflake data Sharing is a simple feature that enables the sharing of data between two snowflake accounts. So by using data sharing we can share data to required consumers like media parties etc.

The data can be shared with the consumer from the provider end using the secure data sharing feature in the snowflake account.

Data can be shared with two types of accounts. They are,

1. Full Account

2. Reader Account.


Full Account:

If the consumer has a snowflake account or is a part of snowflake, then data sharing is possible.


Steps for Data Providers:

Data Sharing allows sharing of data such as snowflake database tables, secure views, and secure user-defined functions between two snowflake accounts.

  • Login to Snowflake accounts and switch the role to account administrator.

  • Click share and click outbound and then create a shared object.



  • Add consumers by mentioning “Account Type” as Full.



  • Enter the consumer's account name.


Steps for Data Consumers:
  • Login to Snowflake accounts and switch the role to account administrator.

  • Click share and go to inbound. The shared data will be visible.

  • Create a database for share.



Reader Account:

If the consumer is not having a snowflake account or is not a part of a snowflake, the data sharing is possible through a Reader Account.



Steps for Data Providers:
  • Login to Snowflake accounts and switch the role to account administrator.

  • Click share and click outbound and then create a shared object.



  • Add consumers by mentioning “Account Type” as Reader.



  • Create a Reader account by providing all credentials like user name, password, etc.



  • After creating a Reader Account, the Account URL will generate.



  • Then go back to the shared object and add the Reader account as a consumer.



Steps for Data Consumers:
  • Login to the Reader account and switch the role to account administrator.

  • Click share and go to inbound. The shared data will be visible.



  • Create a database for the share.



Since only the metadata will be consumed in the data sharing (not the physical database), the consumer does not pay anything for storage. The consumer will get charged for computing resources (virtual warehouse) created to query the shared data. Data providers charged for the storage cost.


Steps to perform data sharing using SQL commands:
  • Use CREATE SHARE to create a share.

  • Use GRANT<Privilege>.... TO SHARE to grant required object privileges to share.

  • Use ALTER SHARE to add one or more snowflake accounts to share.

So for better understanding let us consider an example

Assume a database named “Demo_DB” with a schema named “Sharing” and table named “Report” and the database,schema and table will be shared with different snowflake accounts.


Commands:

Use role accountadmin; Create share data_sharing; Grant usage on database demo_db to share data_sharing; Grant usage on schema mask to share data_sharing; Grant select on table demo_db.sharing.employee_info to share data_sharing; Show grants to share data_sharing; Alter share data_sharing add accounts=rv12365,sz54321; Show grants of share data_sharing;


Conclusion:

So data sharing plays a crucial role in sharing data with different clients or consumers using a snowflake account. Even though consumers are not a part of the snowflake account, the data will get shared through reader accounts.


Reference:

https://docs.snowflake.com/en/user-guide/data-sharing-intro.html


DATA MASKING

Data Masking is used for hiding or masking the particular column information in a table using the masking policy.The main aim of data masking is to protect the sensitive data in the required tables. At query runtime, the masking policy is applied to the column at every location where the column appears. Depending on the masking policy conditions, the SQL execution context, and role hierarchy, Snowflake query operators may see the plain-text value, a partially masked value, or a fully masked value.



Steps in creating Data Masking

  • Create any new roles and users, if needed, & grant privileges

Createuser Manu password= Aditya default_role= masking_admin must_change_password=False;

grantrole masking_admin touser manu;

grantselecton all tablesinschemamasktorole masking_admin;

grant all privilegesondatabase demo_db torole masking_admin;

grant all privilegeson warehouse demo_wh torole masking_admin;

grantapply masking policyonaccounttorole masking_admin;

grantcreate masking policyonschemamasktorole masking_admin;


For example, considering a scenario and creating a new role “masking_admin”and new user “manu” and granting all privileges on databases,schemas and warehouses.


  • Create and apply a masking policy.


create or replace masking policy masking1 as (email_id string) returns string -> case when current_role() in ('MASKING_ADMIN') then email_id else '**Masked**' end; alter table employee_info modify column email_id set masking policy masking1;


So here we are creating a masking policy named ”Masking1” and applying a masking policy for a column named “email_id” in the”employee_info” table.

Generally, there are many ways to mask sensitive data. We use dynamic data masking to hide the required columns.

Dynamic Data masking is a feature introduced in snowflake and eliminates the need for third-party tools to encrypt or mask the data.

In dynamic data masking, the data get masked by masking policies. It provides a column-level security feature in snowflake.


Requirement:

To implement the data masking on electrification data.

Users with an account admin role will be able to see the data. The users with an analyst role will be able to see masked data.


Masking Policy:

create or replace masking policy data_mask1 as (val string) returns string ->

case

when current_role() in ('ACCOUNTADMIN','SECURITYADMIN','SYSADMIN') then val

else '*******'

End;


Masking for multiple columns in a table:


altertable RURAL_ELECTRIFICATION_WB modify

column C_2000 set masking policy data_mask1,

column C_2001 set masking policy data_mask1,

column C_2002 set masking policy data_mask1,

column C_2003 set masking policy data_mask1,

column C_2004 set masking policy data_mask1,

column C_2005 set masking policy data_mask1,

column C_2006 set masking policy data_mask1,

column C_2007 set masking policy data_mask1,

column C_2008 set masking policy data_mask1,

column C_2009 set masking policy data_mask1,

column C_2010 set masking policy data_mask1,

column C_2011 set masking policy data_mask1,

column C_2012 set masking policy data_mask1,

column C_2013 set masking policy data_mask1,

column C_2014 set masking policy data_mask1,

column C_2015 set masking policy data_mask1,

column C_2016 set masking policy data_mask1, column C_2017 set masking policy data_mask1, column C_2018 set masking policy data_mask1, column C_2019 set masking policy data_mask1, column C_2020 set masking policy data_mask1;


The users with an account admin can able to see the data



After applying the masking policy, the analyst will be able to see masked data but not able to see actual data.



Conclusion:

Hence, Data masking plays a crucial role in snowflake accounts as it hides the most sensitive or confidential data. We can mask any number of columns and also unmask that data as per the user requirement.


Reference:

https://docs.snowflake.com/en/sql-reference/sql/create-masking-policy.html



MONITORING

Introduction:

Resource monitoring is used to limit the usage of a warehouse at the account level to control the overall credit usage.


Generally, the number of credits usage mainly depends on the size of the warehouse and how long it runs in that particular warehouse.


Only the account admin has access to the resource monitoring.


The resource monitor can monitor the user-created warehouses only. Every warehouse has a limit for a specific range. If the limit is reached, a notification alert is sent considering the credits consumed till then.


If the limit is exceeded, the warehouse gets suspended automatically by sending a notification.

Generally, a single monitor can manage all warehouses at a particular account level to limit credit usage. We can also assign a single monitor to a single warehouse.


Requirement:

To implement Resource monitoring in snowflake.


At the account level, three resource monitors are created for three warehouses that assign a single monitor for a single warehouse.




So, to send the alert notifications, there are some preferences like web and email. Using these preferences, the monitor will send alerts when the limit reaches the credit range.



Storage Monitoring:

In addition to that we are also able to see storage monitoring at a particular account level. Storage monitoring gives the overall total amount of storage used in a particular month.



Conclusion:

Thus, The Resource monitor in snowflake plays a prominent role in controlling overall compute cost. Moreover, it will also monitor warehouses used by cloud services and reduce the unexpected credit usage in running warehouses.


Reference:

https://docs.snowflake.com/en/user-guide/resource-monitors.html



75 views0 comments

Recent Posts

See All