Using cells
SQL and Python cells
While SQL is a powerful query tool, Python is sometimes more appropriate, for example when you need to carry out more complex operations such as:
- data manipulation
- data visualisation
- data modelling
Cells in a Databricks notebook run in the language you selected when you created the notebook. To use a different language in a particular cell, put a magic command on the first line of that cell. In a Python notebook, the magic command %sql makes the cell run as SQL. In an SQL notebook, the magic command %python makes the cell run as Python. The magic command applies only to the cell it appears in.
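The rule above can be sketched as a small toy model in plain Python. This is not how Databricks itself is implemented: the function name cell_language, the MAGIC_LANGUAGES table and the sample cells are all hypothetical, and the sketch simply assumes that the first word on the first line of a cell decides its language. The %md command is the markdown magic command covered later in this guidance.

```python
# Toy model only: maps the magic commands named in this guidance to a language.
MAGIC_LANGUAGES = {"%sql": "sql", "%python": "python", "%md": "markdown"}


def cell_language(cell_source: str, notebook_language: str) -> str:
    """Return the language a cell would use under the rule described above."""
    lines = cell_source.splitlines()
    first_line = lines[0] if lines else ""
    tokens = first_line.split()
    first_token = tokens[0] if tokens else ""
    # No recognised magic command on the first line: use the notebook language.
    return MAGIC_LANGUAGES.get(first_token, notebook_language)


# Hypothetical cells from a notebook that was created with Python selected.
cells = [
    "df = spark.table('example_table')",
    "%sql\nSELECT COUNT(*) FROM example_table",
    "%md\n# Data checks",
]
languages = [cell_language(cell, "python") for cell in cells]
print(languages)  # ['python', 'sql', 'markdown']
```

The first cell has no magic command, so it falls back to the notebook language. The other two are switched by their first line.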
All cells support syntax highlighting: each term in the code is coloured according to its category (for example keywords, strings and comments), which makes the code easier to read.
Databricks uses Spark SQL, which has subtle but important differences from T-SQL, the dialect used by Microsoft SQL Server. Refer to the Databricks SQL reference guidance for more information.
Below is an example of an SQL cell:
Below is an example of a Python cell (using PySpark):
Markdown cell
To include documentation in a notebook, create a markdown cell by putting the %md magic command at the start of the cell. This is called a markdown magic cell, and its contents are rendered as formatted text rather than run as code. Headings in markdown cells also let you organise your code into sections, which can be expanded and collapsed with the -/+ buttons on the left.
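For example, a cell that contains %md on its first line followed by a line reading # Data checks renders "Data checks" as a top-level heading.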
Last edited: 11 January 2024 1:29 pm