SFMC SQL & AWS — A Query Studio Alternative
I heard whispers that the much-loved SFMC Labs package “Query Studio” is nearing its end of life.
This article explores an alternative, high-performance solution leveraging some AWS products with little to no code (set-and-forget for .zip files).
Previously I wrote about doing a similar activity with GCP, which included some required steps to map field names; this approach removes the mapping process for the lazy or time-poor people out there.
We are going to be exporting data out of SFMC into AWS S3.
From there we will unzip any .zip files with a Lambda function, crawl the .csv files with Glue to create tables, and then run SQL over those tables in Athena.
Data Extracts & File Transfers
For any SFMC consultant or developer this is bread-and-butter stuff: we can move, manipulate and manage data via Automation Studio activities. As of a few months ago, SFMC added AWS S3 (Simple Storage Service) as an external file location that can be interacted with from SFMC (data import, file transfer out).
This means that we can now push data out of SFMC and into an S3 bucket of our choice in AWS. This could be a Data Extension extract OR a tracking extract (a .zip file containing data from the data views and lists).
**TIP:** make sure you choose a delimiter of “|” rather than a comma, so that commas inside field values don’t break your columns.
Unzipping with a Lambda Function
This Lambda function is only necessary if you plan to use tracking extracts in SFMC, as the output is a .zip file that you need to unzip to get at the .csv files inside.
The function needs to be triggered by a PUT event on the bucket that your SFMC File Transfer activity is loading its data into (a sketch of wiring this up follows the tip below).
**TIP:** make sure it’s hosted in the same region as the bucket, and make sure the Lambda timeout and memory are sufficient to match your job size (check the Lambda doco for max limits).
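As a rough sketch, the trigger can be wired up with boto3 like this. The bucket name and function ARN are placeholders, and the function also needs a resource-based policy allowing S3 to invoke it:

```python
import boto3

s3 = boto3.client("s3")

# Placeholder bucket name and Lambda ARN -- substitute your own.
s3.put_bucket_notification_configuration(
    Bucket="my-sfmc-bucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": "arn:aws:lambda:ap-southeast-2:123456789012:function:sfmc-unzip",
                "Events": ["s3:ObjectCreated:Put"],
                # Only fire for .zip files so the function isn't
                # re-triggered by the .csv files it writes back.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": ".zip"}]}
                },
            }
        ]
    },
)
```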
Code (runtime Python 3.8):
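Here’s a minimal sketch of the handler using boto3 and the standard-library zipfile module. It extracts in memory, so it assumes the archive fits within the Lambda’s memory allocation:

```python
import io
import zipfile
from urllib.parse import unquote_plus

import boto3

s3 = boto3.client("s3")


def lambda_handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # S3 event keys are URL-encoded (e.g. spaces arrive as '+').
        key = unquote_plus(record["s3"]["object"]["key"])

        # Abort if the object isn't a .zip (e.g. a plain .csv extract).
        if not key.lower().endswith(".zip"):
            print(f"Skipping non-zip object: {key}")
            continue

        # Pull the archive into memory -- fine for typical extracts,
        # but size your Lambda memory to match your job.
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

        with zipfile.ZipFile(io.BytesIO(body)) as archive:
            for name in archive.namelist():
                if name.endswith("/"):  # skip directory entries
                    continue
                # Write each extracted file back into the same bucket.
                s3.put_object(Bucket=bucket, Key=name, Body=archive.read(name))

        # Clean up the original .zip once its contents are extracted.
        s3.delete_object(Bucket=bucket, Key=key)
```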
This function will unzip the archive and place the files in the same bucket; it will also delete the .zip file that the files came in.
If the file isn’t a .zip, it will abort.
Once the automation has run and the Lambda has executed, you will see your raw files in your bucket.
In my example below I exported a data extract with most of the options selected from a sandbox SFMC BU.
Glue Crawlers & Athena
From there you can go to AWS Glue and create a crawler; the steps are fairly straightforward and well documented. The crawler is an abstraction of a Hadoop job running over your S3 bucket files, mapping the schema of the files into tables.
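If you’d rather script the crawler than click through the console, here’s a sketch with boto3. The crawler name, IAM role, database and S3 path are all placeholders:

```python
import boto3

glue = boto3.client("glue")

# Placeholder names -- substitute your own role, database and bucket path.
glue.create_crawler(
    Name="sfmc-extract-crawler",
    Role="arn:aws:iam::123456789012:role/GlueCrawlerRole",
    DatabaseName="sfmc_extracts",
    Targets={"S3Targets": [{"Path": "s3://my-sfmc-bucket/"}]},
)
glue.start_crawler(Name="sfmc-extract-crawler")
```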
Once the crawler has done its crawling, you can connect the data store to Athena and start writing SQL queries!
You can explore your database tables on the left and interact with a query editor and results table in the same UI. You can also do your typical SQL actions such as creating tables, views, etc.
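The console is the natural home for this, but as a sketch you can also fire queries programmatically with boto3. The database, table and column names below are hypothetical placeholders for whatever your crawler produced, and Athena needs an S3 location for its results:

```python
import boto3

athena = boto3.client("athena")

# Hypothetical database/table/columns -- use the names your crawler created.
response = athena.start_query_execution(
    QueryString='SELECT subscriberkey, eventdate FROM "sent" LIMIT 100',
    QueryExecutionContext={"Database": "sfmc_extracts"},
    ResultConfiguration={"OutputLocation": "s3://my-sfmc-bucket/athena-results/"},
)
print(response["QueryExecutionId"])
```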