top of page

Load Data To Snowflake From MongoDB Via Fivetran

Author: Ayushi Rawat


Introduction

MongoDB is famous for its capability to handle dynamic data and scalability since it is well organized. This blog will guide you through the steps of how you can load data from MongoDB to Snowflake via Fivetran. Fivetran provides a direct connector for MongoDB at source which makes the implementation pretty simple.


Let’s first understand what you can expect from this blog.


What will be covered in this Blog?

  • What is MongoDB?

  • Setup Snowflake environment

  • Adding destination in Fivetran

  • Configure the Source

  • Adding data sources in Fivetran


Let’s get started!


What is MongoDB?


MongoDB is known as a NoSQL database, which stands for ‘not only SQL’. In layman terms, it stores data in the form of folders with all kinds of data, like images, text, and more whether it's structured, semi-structured, or unstructured. It uses a document-oriented model, where the data is stored in JSON-like structure.


Let us start by setting up the Snowflake environment.


Setup Snowflake environment


Setting up Snowflake is easy, you only need to add the respective role, database, and tables that Fivetran will have access to and provide the necessary grants required on the SF objects. Below is a sample script you need to execute for the same. You can modify the script based on your requirements, for more details on this, you can read the Fivetran Setup guide for Snowflake.


begin;


-- create variables for user / password / role / warehouse / database (needs to be uppercase for objects)

set role_name = 'FIVETRAN_ROLE';

set user_name = 'FIVETRAN_USER';

set user_password = 'password123';

set warehouse_name = 'FIVETRAN_WAREHOUSE';

set database_name = 'FIVETRAN_DATABASE';


-- change role to securityadmin for user / role steps

use role securityadmin;


-- create role for fivetran

create role if not exists identifier($role_name);

grant role identifier($role_name) to role SYSADMIN;


-- create a user for fivetran

create user if not exists identifier($user_name)

password = $user_password

default_role = $role_name

default_warehouse = $warehouse_name;


grant role identifier($role_name) to user identifier($user_name);


-- set binary_input_format to BASE64

ALTER USER identifier($user_name) SET BINARY_INPUT_FORMAT = 'BASE64';


-- change role to sysadmin for warehouse / database steps

use role sysadmin;


-- create a warehouse for fivetran

create warehouse if not exists identifier($warehouse_name)

warehouse_size = xsmall

warehouse_type = standard

auto_suspend = 60

auto_resume = true

initially_suspended = true;


-- create database for fivetran

create database if not exists identifier($database_name);


-- grant fivetran role access to warehouse

grant USAGE

on warehouse identifier($warehouse_name)

to role identifier($role_name);


-- grant fivetran access to database

grant CREATE SCHEMA, MONITOR, USAGE

on database identifier($database_name)

to role identifier($role_name);


-- change role to ACCOUNTADMIN for STORAGE INTEGRATION support (only needed for Snowflake on GCP)

use role ACCOUNTADMIN;

grant CREATE INTEGRATION on account to role identifier($role_name);

use role sysadmin;

commit;


Once the Snowflake environment is ready, let us add Snowflake as our destination in the Fivetran account.


Adding destination in Fivetran


Once you are inside the Fivetran account, you can navigate to the destinations tab from the left panel. Hit the ‘Add destination’ button, and it will prompt you to add a destination name.


Next, select your data’s destination. Select Snowflake from the list of available destinations.

Next, you need to fill in some necessary information like host details, Snowflake credentials, authentication method, snowflake username and database, Data processing location, Cloud service provider etc. which will help Fivetran establish the connection and access the objects in Snowflake.


Refer to the screenshot below for a better understanding.

Once done, hit the Save and Test button, Fivetran will save the property configurations and test the connection.

Once all the connection tests are successfully passed, you have set up Snowflake as your destination and can view the details. Now, let us configure the source.


Configure the Source


In Atlas Could, navigate to your MongoDB DEPLOYMENT section from the left panel and switch to the Data service tab. Under Database Deployments, hit the connect button for the cluster you want to connect.


Refer to the screenshot below for a better understanding

Copy and save the host details from here.


Next, let's add the IP address in IP Access List inside the Network Access tab in order to whitelist the Fivetran IPs in the firewall. Copy the CIDR IPs from Fivetran configuration page and paste in MongoDB access list.


Refer to the screenshot below for a better understanding.


Now, let us add MongoDB as the source in Fivetran.


Adding data source in Fivetran


Navigate to the Connectors tab from the left panel. Hit the ‘Add connector’ button, and it will prompt you to choose from the list of available sources. Select MongoDB and click Set Up.


Fill in the necessary information like host details, MongoDB credentials, Connection Method, Pack mode, and IP address configurations, which will help Fivetran establish the connection and fetch data from Source.

Once done, Fivetran will run all the connection tests, if they pass successfully, you have set up MongoDB as your source and can view the details.

Now, we are all set to trigger our first initial sync.

Once the initial sync is triggered, you can monitor the logs in the Logs tab, and configure what data to pull from the Schema tab and your dashboard will look something similar to this. You can view the graph in the sync history and can toggle between an hour, a day, and a week.

Data in Snowflake


Once the initial sync is complete, you should be able to see the data inside Snowflake, and it will start reflecting as soon as the sync starts. I used dummy data in MongoDB for the blog demonstration purpose.


Conclusions

To wrap it up, the process of loading data from MongoDB to Snowflake through Fivetran offers a user-friendly and efficient way to handle data.


The Logging and alerting mechanism of Fivetran can help you to create resilient data ingestion pipelines.



References:
142 views0 comments

Recent Posts

See All
bottom of page