Author: Mahima Khandelwal
This blog aims to provide detailed information about GeoSpatial data and its handling in Snowflake.
GeoSpatial data files majorly come with .Geotiff extension, which can further be converted to GeoJson. We can then load this GeoJson data into Snowflake’s variant column. To make the analysis easier, we can flatten this data to a Geospatial data type.

What is GeoTiff?
A GeoTIFF is a standard .tif or image file format that includes additional spatial (georeferencing) information embedded in the .tif file as tags.

An image of Los Angeles from Planet’s February 2018 Basemap. ©2018 Planet Labs Inc, CC BY-SA 4.0.
GeoJSON
GeoJSON is a way of representing geographic data. It is a JSON-based format designed to represent geographical features with their non-spatial attributes.
Sample GeoJson data with a single value:

GeoJson properties may vary based on images.
Setting up Snowflake
You can create a free trial Snowflake account here - Snowflake Trial. The free trial provides 400 credits.
GeoJson to Snowflake - Raw Layer
We will be working with sample GeoJson data and using Snowflake’s internal stage to store data files. You can load data into the internal stage using the put command in snowsql.
Steps:
1. Create file format and internal stage
The below script creates a file format of type ‘Json’ for GeoJson data with gzip compression and an internal stage with the created file format.
create or replace file format JsonFormat
type = json,strip_outer_array = true,
compression = gzip;
create stage GeoJsonStage
file_format = JsonFormat;
2. Load data to the internal stage
The put command is used to load data from the local system to the internal stage we created in the previous step. You will need snowsql to execute this command.
put file:///POC/AllFiles/geotiff-1k-values-1k-rows_Modified.geojson @GeoJsonStage
3. List stage
View the loaded file using the list command in Snowflake.
list @GeoJsonStage;
4. Create raw table
Create a table with a variant column to load GeoJson - Semi-structured data.
create table Geo1Values1MRows(GeoValues variant);
5. Load data
Copy into command lets you load the data from the staged file into an existing table.
copy into Geo1Values1MRows
from @geojsonstage/geotiff-1-value-1M-rows_Modified.geojson.gz
file_format = JsonFormat;
Loaded data looks like this:

Here we have one variant column with json dump. This is our landing zone which is a faithful representation of source data.
Snowflake Raw Layer to Staging Layer- Flatten Data
Now that we have data loaded in a raw table let us flatten it with our column of interest to get meaningful insights.
In the above example, we will be flattening our data based on properties, latitude, longitude, and coordinates.
1. Create a table with the required columns
We need a table with 3 integer columns for properties-value, latitude & longitude, and a Geospatial data type - GEOGRAPHY column for coordinates.
create table Geo1Values1MRows_Modified(
Val number,
LNG number(15, 6),
LAT number(15, 6),
Coordinates geography
)
2. Load data
Insert data in the staging table, using a select query. We use colon notation(:) between the variant column name and any first-level element.
insert into
Geo1Values1MRows_Modified
select
Geovalues:properties.value as value,
Geovalues:geometry.coordinates [0] as LNG,
Geovalues:geometry.coordinates [1] as LAT,
TO_GEOGRAPHY(Geovalues:geometry) as Coordinates
from
Geo1Values1MRows;
Querying GeoSpatial Data in Snowflake
Once the data is loaded in our staging table let us query a select statement to fetch all the results.

We can also query latitude and longitude to get details of single and/or multiple coordinates.
1. Latitude & longitude - Single record.
Query on single latitude and longitude:
select * from
Geo1Values1MRows_Modified
where
lng = 4.341056
and
lat = 52.181202;

2. Latitude & longitude - Multiple records.
Below query results in all records between the specified latitude and longitude.
select * from
Geo1Values1MRows_Modified
where
(
lng < 4.341078
and lng > 4.00000
)
and (
lat > 52.179936
and lat < 52.18125
);

Let’s see how can we play along with geospatial data type in Snowflake -
1. St_within: Returns true if the 1st geography object is fully contained by the 2nd geography object.
select * from
Geo1Values1MRows_Modified
where
st_within(
coordinates,
st_makepoint(4.440488898317344, 52.18029904071257)
);

2. St_dwithin : Returns true if the minimum geodesic distance between two points is within the specified distance.
select * from
Geo1Values1MRows_Modified
where
st_dwithin(
coordinates,
st_makepoint(4.440488898317344, 52.18029904071257),
200
);

References
https://docs.snowflake.com/en/sql-reference/data-types-geospatial.html
https://docs.snowflake.com/en/user-guide/querying-semistructured.html