Working with JSON in an RDBMS – Part 3, MySQL

The MySQL database logo has a dolphin in it

Welcome to the third and final installment covering how to store and interact with JSON data in an RDBMS. We’ve covered using JSON in PostgreSQL and Microsoft SQL Server. What we’ve learned so far is that Microsoft pushes you to store the data in an nvarchar field and then interact with a series of functions whereas PostgreSQL provides specific data types (json and jsonb) that provide a bit more power, though the functionality comes with a bit of a learning curve. Now let’s talk about how to store and query JSON data in MySQL.

Now, let’s spend some time looking at what kind of functionality MySQL provides for storing and working with JSON.

Use Case for JSON in a MySQL RDBMS

Just as in the first two parts of the series (JSON in MS SQL Server and JSON in PostgreSQL), we will use the same use case scenario to explore the MySQL functionality. We’re going to work with a provider (e.g. a physician, hospital, clinic) contract with a health insurance company. These provide a good analogy to a JSON document because they originate from actual documents (contracts between two parties) and they often contain contract language as well as tables and payment terms.

We’ll use the same table structure in MySQL that we used in both our prior articles. MySQL provides a “json” data type that we can use to store our contract terms document. We’ll see a number of similarities between MySQL and PostgreSQL as we proceed.

  • provider
    • id (bigint, auto_increment, pk)
    • provider_name (varchar(100))
    • npi (varchar(10))
  • provider_contract
    • id (bigint, auto_increment, pk)
    • provider_id (bigint, fk to provider.id)
    • contract_terms (jsonb) > this is where our JSON document will go
    • contract_eff_date (datetime)
    • contract_term_date (datetime)

Once again, we’ll use the same JSON documents and provider details that we did in our previous article for consistency.

Our JSON Data Structure

Princeton Plainsboro Teaching Hospital Contract:

{
    "out_of_area": "true",
    "service_rates": [
          {"service_description": "Office Visits 1", "service_code_range_start": "99211", "service_code_range_end": "99221", "flat_rate": "45.00"},
          {"service_description": "Office Visits 2", "service_code_range_start": "99222", "service_code_range_end": "99251", "flat_rate": "56.00"},
          {"service_description": "Anesthesia", "service_code_range_start": "00100", "service_code_range_end": "00952", "billed_percentage": "0.7401"}
    ]
}

Sacred Heart Hospital Contract:

{
    "out_of_network": "true",
    "service_rates": [
         {"service_description": "Misc Services", "service_code_range_start": "81200", "service_code_range_end": "82000", "billed_percentage": "0.6132"}
    ]
}

Setting Up Our Data

The beautiful thing about SQL is that some things don’t change from one platform to another. A lot of things do, but some things don’t. That means we can reuse our INSERT queries from MS SQL Server and PostgreSQL to get data into our MySQL database with the minor exception that we need to adjust our date formatting for the second set of INSERT queries to work.

INSERT INTO provider (provider_name, npi) VALUES ('Princeton Plainsboro Teaching Hospital', '0123456789')
INSERT INTO provider (provider_name, npi) VALUES ('Sacred Heart Hospital', '1234560987')
INSERT INTO provider (provider_name, npi) VALUES ('Princeton Plainsboro Teaching Hospital', '0123456789')
INSERT INTO provider (provider_name, npi) VALUES ('Sacred Heart Hospital', '1234560987')
INSERT INTO provider_contract (provider_id, contract_terms, contract_eff_date, contract_term_date)
VALUES (
1,
'{ "out_of_area": true, "service_rates": [ {"service_description": "Office Visits 1", "service_code_range_start": "99211", "service_code_range_end": "99221", "flat_rate": "45.00"}, {"service_description": "Office Visits 2", "service_code_range_start": "99222", "service_code_range_end": "99251", "flat_rate": "56.00"}, {"service_description": "Anesthesia", "service_code_range_start": "00100", "service_code_range_end": "00952", "billed_percentage": "0.7401"}] }',
'2019-01-01 00:00:00',
'2020-12-31 23:59:59'
)
INSERT INTO provider_contract (provider_id, contract_terms, contract_eff_date, contract_term_date)
VALUES (
2,
'{ "out_of_network": true, "service_rates": [{"service_description": "Misc Services", "service_code_range_start": "81200", "service_code_range_end": "82000", "billed_percentage": "0.6132"}] }',
'2020-01-01 00:00:00',
'2020-12-31 23:59:59'
)

And just like that, our MySQL database is ready to go. Just as with PostgreSQL, MySQL is going to validate our JSON document before it allows us to store it in a “json” type field. So we don’t have to do any sort of query to determine up front if we have valid JSON to work with. The database handles that for us, as it should.

Query JSON Data in MySQL

We’ll follow the same path we did before in terms of how we want to interact with data. The first thing that we want to do is query our database. The first thing we want to do is determine providers that out of area or out of network. MySQL provides us with similar functionality and very similar syntax for performing this task.

SELECT id, contract_terms->'$.out_of_area' AS out_of_area,
       contract_terms->'$.out_of_network'
FROM provider_contract
WHERE contract_terms->'$.out_of_area' = true;

You can see here that the query is eerily similar to the PostgreSQL query. We’re using the -> operator to retrieve the value stored in the JSON path that we specify (e.g. $.out_of_area). Note that with MySQL, we’re providing a proper JSON path to the value we want. We have to include the $. which is a proper JSON path. If we don’t include this, we’ll get an error. I actually prefer this though because it adheres to the standard more closely than the PostgreSQL implementation.

When we run this query, we’ll get the expected results: a single record (provider contract ID #1) with out_of_area = true and out_of_network is NULL because that field doesn’t exist in the document for contract ID #1.

Note again that using the -> operator will return the JSON element found at the specified path whereas using the ->> operator will return the text value of the element located at that path.

A More Advanced JSON SQL Query

The next query that we want to run is to look at our service rates within the contract terms document and identify the flat rate for the Office Visits 1 service. This presents a challenge because service rates is stored as an array in the contract terms document. We might have some service rates that are Office Visits 1 and others that represent different services.

Further, some services might have a flat rate and some might have a billed percentage. So let’s look at how we would extract our flat rates for Office Visits 1 services from our contract terms using MySQL. We’re going to use MySQL’s JSON_TABLE function to do this.

This will feel very similar to the WITH clause that we used in SQL Server in that we’re going to essentially define a table-structure within the function and then we’ll get back the columns we define, populated with data from our JSON document (when available).

SELECT p.id AS provider_id,
    p.provider_name,
    pc.id AS provider_contract_id,
    c.flat_rate
FROM provider_contract pc
INNER JOIN JSON_TABLE(pc.contract_terms,
    '$.service_rates[*]' COLUMNS (
        rowid FOR ORDINALITY,
        svc_desc varchar(500) PATH '$.service_description',
        flat_rate decimal(12, 4) PATH '$.flat_rate',
        billed_percentage decimal(12, 4) PATH '$.billed_percentage'
    )) c ON true
INNER JOIN provider p on pc.provider_id = p.id
WHERE c.svc_desc = 'Office Visits 1';

This actually feels easier than PostgreSQL did and I like that it relies directly on a proper JSON path to operate since that helps enforce standards that translate elsewhere. In this case, we’re using JSON_TABLE to convert the service rates array into a table structure.

We’re cherry-picking the fields we want from that array and turning those into columns. Notice that we specify $.service_rates[*] as the root path to the array we want and then from there, the paths we define for each column are relative. We’re using JSON_TABLE in an INNER JOIN clause, so as far as the rest of our query is concerned, this is just another table.

That means we can select from it and filter on it however we need to. Overall, this approach feels like the cleanest and most intuitive approach, but its not far from the PostgreSQL implementation so it’s pretty easy to translate from one to the other.

Returning JSON Documents

The next thing we want to do is return a JSON document that contains our contract terms along with the information for the provider. MySQL makes this pretty easy as well, providing the JSON_OBJECT function that allows you to specify field names and values explicitly. It then returns a JSON document that matches your specification exactly.

SELECT JSON_OBJECT('id', p.id,
    'provider_name', p.provider_name,
    'npi', p.npi,
    'contract_id', pc.id,
    'contract_terms', pc.contract_terms,
    'contract_eff_date', pc.contract_eff_date,
    'contrat_term_date', pc.contract_term_date) AS json_document
FROM provider_contract pc
INNER JOIN provider p on pc.provider_id = p.id;

This is a straightforward and easy to understand approach. It’s still not as clean in my view as Microsoft’s method of adding a single line at the end of the query, but it’s still very intuitive and works exactly as you’d expect it to.

Final Challenge: Query for Rates with a Given Service Code

Finally, we want to build a query that would allow us to determine the rate for a given service code. This means that we’re going to have a service code and we want to identify the range that it falls within. From there, we want to retrieve the service description, the flat rate or billed percentage, and some other details about it. This is as simple as using the query that we identified above to get the flat rate for Office Visits 1 services and tweaking the columns we’re retrieving and the filtering criteria:

SELECT p.id AS provider_id,
    p.provider_name,
    pc.id AS provider_contract_id,
    pc.contract_eff_date,
    pc.contract_term_date,
    c.svc_desc,
    c.svc_range_start,
    c.svc_range_end,
    c.flat_rate,
    c.billed_percentage
FROM provider_contract pc
INNER JOIN JSON_TABLE(pc.contract_terms,
    '$.service_rates[*]' COLUMNS (
        rowid FOR ORDINALITY,
        svc_desc varchar(500) PATH '$.service_description',
        svc_range_start varchar(50) PATH '$.service_code_range_start',
        svc_range_end varchar(50) PATH '$.service_code_range_end',
        flat_rate decimal(12, 4) PATH '$.flat_rate',
        billed_percentage decimal(12, 4) PATH '$.billed_percentage'
    )) c ON true
INNER JOIN provider p on pc.provider_id = p.id
WHERE c.svc_range_start <= '99215'
    AND c.svc_range_end >= '99215';

Here, the only changes we’ve really made are to include the service code range start and end fields in our JSON_TABLE function and then we modify our WHERE clause to get the service rate record where our service code (99215) falls within that range. This is a pretty simple and straightforward change, made easier by the fact that we’re using JSON_TABLE within our join. Again, once we have that in place, we can treat the columns it returns as a table and use it in our SELECT and WHERE clauses however we need to.

As with PostgreSQL, we barely scratched the surface of what MySQL provides us for working with JSON data types. The MySQL documentation is very easy to understand, with plenty of examples. It doesn’t feel like such a steep learning curve to pick up the MySQL functions. I’m not sure if that’s because I tackled PostgreSQL first or if it’s because the MySQL documentation feels cleaner and more intuitive or maybe a combination of both.

That’s All Folks!

So that concludes the series on using JSON in an RDBMS. As I mentioned early on, making the decision to do this is one that you should take into careful consideration. If there are specific uses where it would be helpful then it might make sense.

From a big picture perspective, if you have a handful of places in a large application where the document structure makes sense but everything else is highly transactional then this could be a great option that gives you flexibility without the added complexity of running multiple database platforms.

On the other hand, if your application is primarily document-driven with very little transactional requirement, you’d likely find that it’s better to go with a true NoSQL platform instead. Every situation is different, so I’d recommend looking at the needs, the pros, and the cons. But hopefully this series can help you make a more informed decision before your next project.

As always, Happy Coding!

%d bloggers like this: