What is OPENJSON function ?
OPENJSON is a powerful T-SQL function that parses JSON text, returning objects and properties as rows and columns. This function converts JSON string formats into schema object formats, making it invaluable for SQL database management.
How to declare OPENJSON functon ?
OPENJSON function is declare after “from” sentence
Ex.
select values from OPENJSON(@jsonvalue) |
We case use path sentence in declaration to declare which JSON file we would like to explode from file/string. It’s really useful with we have JSON with hierarchy or multilevel format,
Ex.
select values from OPENJSON(@jsonvalue, “$.data” ) |
What’s more we can use WITH clause to declare schema type from JSON string. Important is to be aware when we explode column but below is still JSON string, in this case we need to declare type nvarchar with AS JSON clasue.
OPENJSON( jsonExpression [ , path ] ) [ <with_clause> ] <with_clause> ::= WITH ( { colName type [ column_path ] [ AS JSON ] } [ ,…n ] ) |
Ok so let’s try it !
To proceed with converting a JSON format string for use in an MS SQL database and transforming it into a BeerReceipt table, you’ll first need to define the destination schema for this table. The process involves specifying the table structure in SQL, including column names, data types, and any constraints to ensure the JSON data maps correctly to the SQL table columns. This step is crucial for successfully importing JSON data into the database and converting it into a usable format for querying and analysis within your SQL environment.
CREATE TABLE [dbo].[BaweryRecipts]( [id] [int] NULL, [name] [nvarchar](100) NULL, [tagline] [nvarchar](100) NULL, [abv] [decimal](6, 2) NULL, [ibu] [int] NULL, [target_fg] [int] NULL, [target_og] [int] NULL, [ebc] [int] NULL, [srm] [int] NULL, [ph] [decimal](6, 2) NULL, [value_volume] [int] NULL, [volume_unit] [nvarchar](50) NULL, [boil_volume_value] [int] NULL, [boil_volume_unit] [nvarchar](50) NULL, [fermantation_temp_value] [int] NULL, [fermantation_temp_unit] [nvarchar](25) NULL, [twist] [nvarchar](25) NULL, [mesh_temp_temp_value] [int] NULL, [duration] [nvarchar](25) NULL ) |
{
„id”: 1,
„name”: „Buzz”,
„tagline”: „A Real Bitter Experience.”,
„abv”: 4.5,
„ibu”: 60,
„target_fg”: 1010,
„target_og”: 1044,
„ebc”: 20,
„srm”: 10,
„ph”: 4.4,
„attenuation_level”: 75,
„volume”: {
„value”: 20,
„unit”: „litres”
},
„boil_volume”: {
„value”: 25,
„unit”: „litres”
},
„method”: {
„mash_temp”: [
{
„temp”: {
„value”: 64,
„unit”: „celsius”
},
„duration”: 75
}
],
„fermentation”: {
„temp”: {
„value”: 19,
„unit”: „celsius”
}
},
„twist”: null
},
„food_pairing”: [
„Spicy chicken tikka masala”,
„Grilled chicken quesadilla”,
„Caramel toffee cake”
]
}
I would like to transform JSON to destination table schema.
To transform JSON into a destination table schema, first, insert the JSON string into the database, recommended as nvarchar(max) format. Then, use the OPENJSON function to parse this string.
For inserting the string, options include using Azure Data Factory (ADF) or Databricks to create a pipeline, manually inserting, or utilizing other ELT/ETL tools.
Additionally, exploring the use of external tables is another method, though more details on that will be provided later. Initially, demonstrating the use of OPENJSON can be as simple as copying the string and declaring it as an nvarchar(max) variable.
When processing JSON with the OPENJSON function, the result is a table with „key,” „value,” and „type” columns, where „key” is the property name from the JSON file in nvarchar(4000) format, „value” represents the data from JSON, and „type” describes the value’s data type.
Value of the Type column | JSON data type |
0 | null |
1 | string |
2 | number |
3 | true/false |
4 | array |
5 | object |
What we can do with this result ?
With the results from OPENJSON, you can use the WITH clause to create a first-level schema, organizing the JSON data into a structured format. This approach involves transferring all JSON data into a schema format, with JSON strings stored in columns of nvarchar(max) type. This method allows for more structured and accessible data manipulation and querying within your SQL database, making it easier to work with JSON data effectively.
When using the OPENJSON function for columns with nested JSON strings, be cautious of performance issues with large child datasets. The solution is to split nested JSON into new tables and create relationships based on IDs from the parent JSON file. This approach can mitigate performance problems arising from processing large amounts of data, especially when dealing with multiple files and records. Further details and optimizations for handling complex JSON structures in SQL Server will be discussed later. This method helps manage data efficiently while maintaining system performance.
select * from openjson(@jsonValue) with ( [id] int, [name] nvarchar(100), [tagline] nvarchar(100), [abv] decimal(6,2), [ibu] int, [target_fg] int, [target_og] int, [ebc] int, [srm] int, [ph] decimal(6,2), [attenuation_level] int, [volume] nvarchar(max) AS JSON, [boil_volume] nvarchar(max) AS JSON, [method] nvarchar(max) AS JSON, [food_pairing] nvarchar(max) AS JSON ) lv1 outer apply openjson([volume]) outer apply openjson([boil_volume]) outer apply openjson([method]) outer apply openjson([food_pairing]) |
Please focus on next-level child JSON from columns volum, boil_volume, method, food_paring.
For volume and boil_volume, it is easy:
with ( [value_volume] int '$.value’, [volume_unit] nvarchar(50) '$.unit’ ) lv2 outer apply openjson([boil_volume]) with ( [boil_volume_value] int '$.value’, [boil_volume_unit] nvarchar(50) '$.unit’ ) lv3 |
Handling more complex scenarios involves nested JSON at the third level within the second level. In this case, it’s necessary to extract mesh_temp and save it as JSON, then repeat the outer apply join to access mesh_temp, unit, and duration. The full query is as follows:
„For methods involving more complex scenarios, due to nested JSON in the third level inside the second level, we need to extract mesh_temp and save it as JSON. Subsequently, we repeat the outer apply join process to retrieve mesh_temp, unit, and duration.”
This correction ensures clarity and grammatical accuracy in the description of handling nested JSON data structures within SQL queries.
DECLARE @jsonValue nvarchar(max) = '{ „id”: 1, „name”: „Buzz”, „tagline”: „A Real Bitter Experience.”, „abv”: 4.5, „ibu”: 60, „target_fg”: 1010, „target_og”: 1044, „ebc”: 20, „srm”: 10, „ph”: 4.4, „attenuation_level”: 75, „volume”: { „value”: 20, „unit”: „litres” }, „boil_volume”: { „value”: 25, „unit”: „litres” }, „method”: { „mash_temp”: [ { „temp”: { „value”: 64, „unit”: „celsius” }, „duration”: 75 } ], „fermentation”: { „temp”: { „value”: 19, „unit”: „celsius” } }, „twist”: null }, „food_pairing”: [ „Spicy chicken tikka masala”, „Grilled chicken quesadilla”, „Caramel toffee cake” ] } ’ INSERT INTO [dbo].[BaweryRecipts] select [id], [name], [tagline], [abv], [ibu], [target_fg], [target_og], [ebc], [srm], [ph], [value_volume], [volume_unit], [boil_volume_value], [boil_volume_unit], [fermantation_temp_value], [fermantation_temp_unit], [twist], [mesh_temp_temp_value], [mesh_temp_unit] [duration] from openjson(@jsonValue) with ( [id] int, [name] nvarchar(100), [tagline] nvarchar(100), [abv] decimal(6,2), [ibu] int, [target_fg] int, [target_og] int, [ebc] int, [srm] int, [ph] decimal(6,2), [attenuation_level] int, [volume] nvarchar(max) AS JSON, [boil_volume] nvarchar(max) AS JSON, [method] nvarchar(max) AS JSON, [food_pairing] nvarchar(max) AS JSON ) lv1 outer apply openjson([volume]) with ( [value_volume] int '$.value’, [volume_unit] nvarchar(50) '$.unit’ ) lv2 outer apply openjson([boil_volume]) with ( [boil_volume_value] int '$.value’, [boil_volume_unit] nvarchar(50) '$.unit’ ) lv3 outer apply openjson([method]) with ( [fermantation_temp_value] int '$.fermentation.temp.value’, [fermantation_temp_unit] nvarchar(25) '$.fermentation.temp.unit’, [mash_temp] nvarchar(max) '$.mash_temp’ as json, [twist] nvarchar(25) '$.twist’ ) lv4 outer apply openjson([mash_temp]) with ( [mesh_temp_temp_value] int '$.temp.value’, [mesh_temp_unit] nvarchar(25) '$.temp.unit’, [duration] int '$.duration’ ) lv5 |
Conclusion
OPENJSON is a versatile tool for SQL database administrators, offering a method to efficiently handle JSON data within SQL Server. By following the outlined practices, developers can enhance their database management and querying strategy