
Installing Customized Python Packages (PIP) In Snowflake

Author: Rupesh Neve


Introduction:

We know that Snowflake's Snowpark supports the Python programming language, so we can import Python libraries in our code. Snowflake has partnered with Anaconda to make a set of Python packages available, but a look at the Anaconda repository for Snowpark shows that not every Python library is there, as Snowpark is continuously evolving.


In this blog, we will discuss how to resolve such dependencies from a Snowflake Python stored procedure. We will look closely at wheel files and how to use them to install Python packages that are not available in the Anaconda repository.

Our use case is importing the dbt Core Python library into Snowflake through a stored procedure; unfortunately, it is not available in the repository. We need an approach that lets us import any custom Python library and use it for computation. To achieve this, we uploaded wheel files to a stage and used them in our computation. Let's dive into the approach and explore the implementation.


Solution:

First, check whether the required package is available in the Anaconda repository. If it is not, download its wheel file from PyPI. So what is a wheel file? It is a built distribution of a Python package, published on PyPI. Wheel files are always zip archives, and this is more secure because we can simply unzip the file and read the code for the whole package.
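Since a wheel is just a zip archive, Python's standard zipfile module can open it. The snippet below is a minimal sketch of inspecting a wheel's contents; to keep it runnable without a download, it builds a tiny stand-in archive in memory, and the package name demo_pkg is hypothetical.

```python
import io
import zipfile


def list_wheel_contents(wheel_file):
    """Return the names of all files stored inside a wheel (a zip archive)."""
    with zipfile.ZipFile(wheel_file) as archive:
        return archive.namelist()


# Build a tiny stand-in wheel in memory so the example runs without a download.
# "demo_pkg" is a hypothetical package, not a real PyPI project.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("demo_pkg/__init__.py", "VERSION = '1.0'\n")
    archive.writestr("demo_pkg-1.0.dist-info/METADATA", "Name: demo-pkg\nVersion: 1.0\n")

contents = list_wheel_contents(buffer)
print(contents)
```

For a real wheel, you would pass the path of the downloaded .whl file to list_wheel_contents instead of the in-memory buffer.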


A wheel file's name follows this convention:

{dist}-{version}(-{build})?-{python}-{abi}-{platform}.whl


Some wheels contain code written in C, and some are written purely in Python. Wheel file example:


cryptography-2.9.2-cp35-abi3-macosx_10_9_x86_64.whl
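To make the naming convention concrete, here is a small sketch that splits a wheel filename into its parts with a regular expression. The function name parse_wheel_filename and the pattern are our own illustration, not part of any packaging library, and the pattern assumes that an optional build tag starts with a digit.

```python
import re

# One named group per component of the naming convention.
# The build tag is optional and is assumed to start with a digit.
WHEEL_NAME = re.compile(
    r"^(?P<dist>[^-]+)-(?P<version>[^-]+)(?:-(?P<build>\d[^-]*))?"
    r"-(?P<python>[^-]+)-(?P<abi>[^-]+)-(?P<platform>[^-]+)\.whl$"
)


def parse_wheel_filename(filename):
    """Split a wheel filename into dist, version, build, python, abi, platform."""
    match = WHEEL_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a valid wheel filename: {filename!r}")
    return match.groupdict()


tags = parse_wheel_filename("cryptography-2.9.2-cp35-abi3-macosx_10_9_x86_64.whl")
for part, value in tags.items():
    print(part, "=", value)
```

For the example file, the build entry is empty because the name carries no build tag.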


How to import these files into Snowflake:


1. Create an internal stage in Snowflake.


2. Find the .whl file on PyPI: go to PyPI and search for your desired package, then open "Download files" in the navigation menu.



3. Download the file to your local machine.


4. Open SnowSQL and upload the .whl file to mystage using the PUT command.


5. Once the file is uploaded, verify the stage contents with list @mystage;


6. As discussed above, .whl files are zip archives, so they need to be unzipped before use. Create a Python wrapper that unzips the file, and upload the wrapper to the stage.


7. Write your stored procedure in Python and import the module you need for your computations.
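The unzip wrapper from the steps above could look roughly like the sketch below. It is not the exact wrapper we used: the helper name add_wheel_to_path and the demo package are hypothetical, and the example runs locally against a wheel built on the fly. Inside a Snowflake handler, the wheel path would instead point into the procedure's import directory (Snowflake's documentation describes reading it from sys._xoptions["snowflake_import_directory"]; please verify that against the current docs) and the extraction directory would be a writable location such as /tmp.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def add_wheel_to_path(wheel_path, extract_root):
    """Unzip a wheel under extract_root and make its packages importable."""
    wheel_name = os.path.basename(wheel_path)
    target_dir = os.path.join(extract_root, wheel_name[: -len(".whl")])
    # Extract only once; later calls reuse the unzipped directory.
    if not os.path.isdir(target_dir):
        with zipfile.ZipFile(wheel_path) as archive:
            archive.extractall(target_dir)
    if target_dir not in sys.path:
        sys.path.insert(0, target_dir)
    importlib.invalidate_caches()
    return target_dir


# Local demonstration with a hypothetical pure-Python wheel built on the fly.
# In Snowflake, wheel_path would be the staged file in the import directory.
work_dir = tempfile.mkdtemp()
wheel_path = os.path.join(work_dir, "demo_pkg-1.0-py3-none-any.whl")
with zipfile.ZipFile(wheel_path, "w") as archive:
    archive.writestr(
        "demo_pkg/__init__.py",
        "def greet():\n    return 'hello from demo_pkg'\n",
    )

add_wheel_to_path(wheel_path, os.path.join(work_dir, "unzipped"))
demo_pkg = importlib.import_module("demo_pkg")
message = demo_pkg.greet()
print(message)
```

Once the unzipped directory is on sys.path, the stored procedure can import the package like any other module. If several processes may unzip the same file at once, the extraction step would also need a lock, which this sketch leaves out.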


Note: In Snowflake, we can't use wheel files that are written in C, as C code needs executable rights. How do we identify a wheel written in C?


cryptography-2.9.2-cp35-abi3-macosx_10_9_x86_64.whl


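The tags after the version, cp35-abi3-macosx_10_9_x86_64, indicate that the module is written in C: they tie the wheel to a specific Python implementation, ABI, and platform. A pure-Python wheel typically ends in -none-any.whl instead.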
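This check can be automated with a small heuristic, sketched below. The function name looks_pure_python is our own, and the rule it applies (ABI tag "none" and platform tag "any") is a filename-based approximation; it does not inspect the wheel's contents. The second filename is a hypothetical example.

```python
def looks_pure_python(wheel_filename):
    """Guess from the filename tags whether a wheel is pure Python."""
    if not wheel_filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {wheel_filename!r}")
    stem = wheel_filename[: -len(".whl")]
    # The last three dash-separated parts are the python, abi and platform tags.
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return abi_tag == "none" and platform_tag == "any"


compiled = looks_pure_python("cryptography-2.9.2-cp35-abi3-macosx_10_9_x86_64.whl")
pure = looks_pure_python("demo_pkg-1.0-py3-none-any.whl")
print(compiled, pure)
```

A wheel that fails this check is a candidate to skip, or to look for in the Anaconda repository instead.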


Conclusion:

Using wheel files, we can install custom Python packages that are not available in the Anaconda repository and use them in Snowflake.




