
Databricks query federation with Snowflake. Easy and Fast!

Introduction: In the same way that it is possible to read and write Snowflake data from inside Databricks, it is also possible to use Databricks with query federation against diverse SQL engines, including Snowflake. The currently supported engines are: … We are going to demonstrate how it works with Snowflake. We will first create a table in Databricks, …
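As a rough sketch of what the setup can look like with Databricks Lakehouse Federation (all hostnames, credentials, and object names below are placeholders, not taken from the post), you register a connection and expose a Snowflake database as a foreign catalog:

```python
# Minimal sketch run from a Databricks notebook, where `spark` is the
# SparkSession provided by the runtime. All identifiers are placeholders.
spark.sql("""
    CREATE CONNECTION IF NOT EXISTS snowflake_conn TYPE snowflake
    OPTIONS (
      host 'myaccount.snowflakecomputing.com',  -- placeholder account host
      port '443',
      sfWarehouse 'COMPUTE_WH',
      user 'federation_user',
      password '***'  -- placeholder; prefer Databricks secrets in practice
    )
""")

# Expose a Snowflake database as a foreign catalog, then query it directly.
spark.sql("""
    CREATE FOREIGN CATALOG IF NOT EXISTS snowflake_cat
    USING CONNECTION snowflake_conn
    OPTIONS (database 'MY_DATABASE')
""")

spark.sql("SELECT * FROM snowflake_cat.public.my_table LIMIT 10").show()
```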

Useful Databricks/Spark resources

- Memory Profiling in PySpark: https://www.databricks.com/blog/2022/11/30/memory-profiling-pyspark.html
- Run Databricks queries directly from VSCode: https://ganeshchandrasekaran.com/run-your-databricks-sql-queries-from-vscode-9c70c5d4903c
- Spark testing with chispa: https://github.com/alexott/spark-playground/tree/master/testing
- Best Practices for Cost Management on Databricks: https://www.databricks.com/blog/2022/10/18/best-practices-cost-management-databricks.html
- UDFs in PySpark: https://docs.databricks.com/udf/python.html
- Pandas UDFs: https://docs.databricks.com/udf/pandas.html
- Introducing Pandas UDFs for PySpark: https://www.databricks.com/blog/2017/10/30/introducing-vectorized-udfs-for-pyspark.html (a minimal sketch follows below)
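Since a couple of the links above cover Pandas UDFs, here is a minimal, self-contained sketch of one (the function and column names are invented for illustration):

```python
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf

spark = SparkSession.builder.getOrCreate()

# A vectorized (Pandas) UDF: it receives a whole pd.Series per batch
# instead of one row at a time, which is usually much faster than a
# plain Python UDF.
@pandas_udf("double")
def fahrenheit_to_celsius(temp_f: pd.Series) -> pd.Series:
    return (temp_f - 32) * 5.0 / 9.0

df = spark.createDataFrame([(32.0,), (212.0,)], ["temp_f"])
df.withColumn("temp_c", fahrenheit_to_celsius("temp_f")).show()
```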

Smallest Analytical Platform Ever!

In some of my free time I’ve started working on a project to build the smallest useful analytics platform on the cloud (starting with Azure). The purpose is to use it as a PoC to show to colleagues, managers, and prospective customers, or just to have fun and play. It’s publicly available on my GitHub repo …

Using Azure Private Endpoints with Databricks

In this article I will show how to avoid going out to the public internet when using resources inside Azure, especially when they are in the same subscription and location (datacenter). Why might we want a private endpoint? That’s a good question: for both security and performance. …
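As a quick illustration of the idea (the hostname below is a made-up placeholder, not from the post): once a private endpoint and its private DNS zone are configured, the resource’s public FQDN should resolve to a private address from inside the VNet. A tiny Python check:

```python
import ipaddress
import socket

# Hypothetical FQDN; substitute the resource you front with a private endpoint.
host = "mystorageaccount.blob.core.windows.net"

ip = socket.gethostbyname(host)
is_private = ipaddress.ip_address(ip).is_private

# From inside the VNet, with the private DNS zone linked, the name should
# resolve to a private address, meaning traffic stays on the Azure backbone;
# a public IP here suggests the private endpoint is not being used.
print(f"{host} -> {ip} ({'private' if is_private else 'public'})")
```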

Databricks connectivity to Azure SQL / SQL Server

Most of the developments I see inside Databricks rely on fetching or writing data to some sort of database. Usually the preferred method for this is through the use of a JDBC driver, as most databases offer one. In some cases, though, it is also possible to use a Spark-optimized driver. This …
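For reference, a generic JDBC read from PySpark looks roughly like this (server, database, table, and credentials are placeholders, and the SQL Server JDBC driver must be available on the cluster):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Placeholder connection details for an Azure SQL database.
jdbc_url = (
    "jdbc:sqlserver://myserver.database.windows.net:1433;"
    "database=mydb;encrypt=true"
)

df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "dbo.my_table")   # table (or subquery) to read
    .option("user", "my_user")           # prefer secrets in practice
    .option("password", "my_password")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .load()
)

df.show(5)
```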

Unload data from AWS Redshift to S3 in Parquet

Following the previous Redshift articles, in this one I will explain how to export data from Redshift to Parquet in S3. This can be interesting when we want to archive infrequently queried data so it can be queried more cheaply with Spectrum, to keep it in S3 as an archive, or to move it to another storage solution like Glacier. The …
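The core of this is Redshift’s UNLOAD command with FORMAT AS PARQUET. As a sketch (cluster endpoint, credentials, IAM role ARN, bucket, and table are all placeholders), issued from Python via the redshift_connector package:

```python
import redshift_connector

# Placeholder connection details for a Redshift cluster.
conn = redshift_connector.connect(
    host="my-cluster.abc123.eu-west-1.redshift.amazonaws.com",
    database="mydb",
    user="my_user",
    password="my_password",
)
conn.autocommit = True

# Export the table to Parquet files in S3, using an IAM role attached to
# the cluster for S3 write permissions (the ARN below is a placeholder).
cursor = conn.cursor()
cursor.execute("""
    UNLOAD ('SELECT * FROM public.my_table')
    TO 's3://my-bucket/exports/my_table/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftUnloadRole'
    FORMAT AS PARQUET
""")
cursor.close()
conn.close()
```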