CI/CD In Snowflake Using Schemachange And Jenkins

Author: Sravya Beecharaju



CI/CD:

CI/CD stands for Continuous Integration/Continuous Deployment. It is a core practice in agile methodology for delivering project changes continuously: whenever a change is committed to the central repository, it is automatically built, tested and deployed.



Jenkins:

Jenkins is an open-source DevOps tool written in the Java programming language. It is used for continuous integration and continuous delivery/deployment, implementing CI/CD workflows (called pipelines) for any development project. It helps automate software development processes such as building, testing and deploying.


Schemachange:

Schemachange is a database change management tool used to manage Snowflake objects. It is a Python-based tool that follows an imperative-style approach to database change management, and it plays a major role in enabling DevOps for database objects.


Jenkins-Schemachange-Snowflake:
  • Snowflake objects are managed using the database change management tool schemachange together with the CI/CD tool Jenkins.

  • A proper folder structure has to be maintained for schemachange, i.e., under a root folder (e.g. migrations), all the script files are kept with proper version numbers.

  • The SQL scripts for Snowflake database objects are committed to a version control tool such as Bitbucket.

  • A pipeline job is configured in Jenkins to build once a day. During the build, schemachange compares the scripts against its change history: new versioned scripts (and changed repeatable scripts) are deployed to Snowflake, while already-applied scripts are skipped.

  • Jenkins is hosted on an AWS EC2 instance so that it is accessible to everyone.

  • To install and configure Jenkins on an AWS EC2 instance, refer to the Jenkins on EC2 document.
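The folder structure described in the steps above can be sketched as follows. This is a minimal, hypothetical example: the folder name migrations and the file names and SQL contents are placeholders, not part of any real project.

```shell
# Hypothetical example of the schemachange root folder layout.
# Versioned scripts follow the V<version>__<description>.sql convention.
mkdir -p migrations
cat > migrations/V1.1.1__create_demo_table.sql <<'EOF'
-- illustrative content only
CREATE TABLE IF NOT EXISTS DEMO_TABLE (ID INT);
EOF
cat > migrations/V1.1.2__add_demo_column.sql <<'EOF'
-- illustrative content only
ALTER TABLE DEMO_TABLE ADD COLUMN NAME VARCHAR;
EOF
ls migrations
```

schemachange applies versioned scripts in version order, so V1.1.1 would run before V1.1.2.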


Jenkins-Bitbucket integration:
  • In Jenkins, on the left-hand side, go to Manage Jenkins.

  • Click Manage Plugins.

  • Under the Available tab, search for the Bitbucket plugin using the search box at the top right.

  • Select the checkbox next to the plugin, then click Install without restart.




  • After installing the plugin successfully, go to Manage Jenkins from the left tab and scroll down to select Manage Credentials.


  • Click on Add Credentials.

  • Provide the Bitbucket app password as the password and its associated username as the username, then click OK.



Now, Bitbucket is integrated with Jenkins.

Pipeline job creation

Next is to create a pipeline job.

Go to the Dashboard and click on New Item.

Enter a job name, select Pipeline and click OK.



You will be navigated to the configuration page of the pipeline.

Under General, select "This project is parameterised" and add the parameter types and their default values.



Okta is used to sign in to Snowflake, so an authenticator parameter is added with the Okta URL as its default value.

Provide the Snowflake account, username and password with which to log in to Snowflake.

Note that the password parameter must be named SNOWFLAKE_PASSWORD, with the user's password as its default value.

Provide parameters for the warehouse, role and database to be used. The database must already exist.
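As a sketch, the parameters above can be thought of as the following environment variables. The names SF_ACCOUNT, SF_USERNAME, SF_ROLE, SF_WAREHOUSE and SF_DATABASE match those referenced in the Jenkinsfile shown later in this article; all values here are placeholders, not real credentials.

```shell
# Placeholder values only -- substitute your own. The password parameter
# must be named exactly SNOWFLAKE_PASSWORD, as schemachange reads the
# password from the environment rather than from a command-line flag.
export SNOWFLAKE_PASSWORD='<user-password>'
export SF_ACCOUNT='xy12345.ap-south-1'   # example account identifier (assumption)
export SF_USERNAME='DEPLOY_USER'
export SF_ROLE='SYSADMIN'
export SF_WAREHOUSE='DEPLOY_WH'
export SF_DATABASE='DEMO_DB'             # this database must already exist
```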



Under Build Triggers, select Build periodically and specify a time at which the build should be triggered automatically.



Scroll down to the Pipeline tab, select Pipeline script from SCM as the Definition and Git as the SCM.

Under Repositories, give the Bitbucket repository URL and the Bitbucket credentials (which were added earlier).



Click on advanced and specify as below



Specify the other configurations as below



Mention the Script Path as Jenkinsfile.

A Jenkinsfile contains all the pipeline steps to be executed, along with the command to run schemachange.


python schemachange/cli.py [-h] [-f ROOT_FOLDER] -a SNOWFLAKE_ACCOUNT -u SNOWFLAKE_USER -r SNOWFLAKE_ROLE -w SNOWFLAKE_WAREHOUSE [-d SNOWFLAKE_DATABASE] [-c CHANGE_HISTORY_TABLE] [--vars VARS] [--create-change-history-table] [-ac] [-v]


The Jenkinsfile is committed to the Bitbucket repository. Uncheck Lightweight checkout and click Save.



Now the pipeline job is created and configured to automatically trigger a build at the specified time and deploy the database changes to Snowflake.

Jenkinsfile:

To deploy scripts to Snowflake using schemachange through the Jenkins pipeline, the schemachange repository is required.

It is cloned into the Jenkins workspace and installed using pip to run the deployment command (for pip installation on EC2, refer to the Jenkins on EC2 document).

migrations is the name of the root directory in which the SQL script files are committed.


pipeline {
    agent any
    environment {
        AWS_DEFAULT_REGION = 'ap-south-1'
    }
    stages {
        stage('Test run') {
            steps {
                echo "Production Deployment with Schemachange"
            }
        }
        stage('Get Schemachange') {
            steps {
                sh "git clone https://github.com/Snowflake-Labs/schemachange.git"
            }
        }
        stage('Run schema change') {
            steps {
                withCredentials([aws(accessKeyVariable: 'AWS_ACCESS_KEY_ID', credentialsId: '<aws-creds-uname-in-jenkins>', secretKeyVariable: 'AWS_SECRET_ACCESS_KEY')]) {
                    sh "python3 -m pip install schemachange --upgrade"
                    sh "python3 schemachange/schemachange/cli.py deploy -f migrations -a ${SF_ACCOUNT} -u ${SF_USERNAME} -r ${SF_ROLE} -w ${SF_WAREHOUSE} -d ${SF_DATABASE} -c ${SF_DATABASE}.SCHEMACHANGE.CHANGE_HISTORY --create-change-history-table"
                }
            }
        }
    }
}


SQL scripts:

Schemachange follows an imperative approach to database change management. Script file names must follow a versioning convention such as V1.1.1__file_name.sql.

All the SQL script files are stored under a root directory in the repository.



Note that there must be two underscores (__) between the version and the file name.
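The double-underscore rule can be checked with a small shell function. This is a sketch; is_versioned is a hypothetical helper for illustration, not part of schemachange.

```shell
# Check whether a file name follows the V<major>.<minor>.<patch>__<name>.sql
# convention, with a double underscore separating version from description.
is_versioned() {
  case "$1" in
    V[0-9]*.[0-9]*.[0-9]*__*.sql) echo "valid" ;;
    *) echo "invalid" ;;
  esac
}
is_versioned "V1.1.1__file_name.sql"   # prints "valid"
is_versioned "V1.1.1_file_name.sql"    # prints "invalid" (only one underscore)
```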

The execution status of the SQL scripts can be found in the CHANGE_HISTORY table under the SCHEMACHANGE schema in Snowflake, which is created by default when --create-change-history-table is passed.


Implementation:

If there are any new versioned scripts or changed repeatable scripts that have not yet been deployed to Snowflake, schemachange recognises them and executes those scripts in the Snowflake environment provided in the pipeline job configuration.



If there are no changes to the script files, i.e., all the changes are already deployed in Snowflake, then those script files are skipped.



Conclusion:

Using schemachange and Jenkins with Snowflake, database objects and their changes can be managed, and production deployments to Snowflake can be scheduled. This helps in deploying database changes through a pipeline using modern software practices.


