Monday, 25 August 2025

Analysing Data with Access Crosstab Queries

In the previous blog post, we used a data set downloaded from the UK based HM Land Registry website to demonstrate how Access can be used in the context of Data Analysis.  As you may remember, the data set contained details of 2023 house sales in the Salford area of Greater Manchester, and we created a series of Access Queries to count the number of sales which occurred in seven separate price bands. This gave us an overview of what had occurred in the Salford area housing market that year.

In today’s post we are going to process the same data set with an Access Crosstab Query to give us a breakdown of average house sale price by property type and postcode area.  This will show us the differences in average house sale price according to where a property is located and the type of building it is (ie whether the property is detached, semi-detached, terraced, or a flat).  As we shall see, our crosstab query will restructure the dataset in such a way that property type and postcode values from the original table become row and column headings in the resulting new table.  Furthermore, the values produced in the new table are derived from an aggregated average of individual house sale amounts from the original.

Let’s consider what this means in practice.  The original table is comprised of rows which represent individual house sales with each column containing a specific piece of information about the sale.  Here is a sample of the 2,942 record dataset we downloaded and cleansed in the last blog post (see Using Access for Data Analysis).

Salford House Sales 2023
Figure 1: Sample from original dataset of house sales in Salford during 2023.

As we can see each house sale record has its own unique id, sale amount (price_paid), date (deed_date), postcode and property type. As stated above, we are going to create a crosstab query which will restructure the table so that each row represents a property type and each column a postcode area. Each value in the table is the average sale amount (ie price_paid) for all properties for the property type and postcode area of its particular row and column respectively.

The table below shows how the crosstab query results will look:

 

Postcode

Postcode

Postcode

Postcode

Postcode

Property type

Average £

Average £

Average £

Average £

Average £

Property type

Average £

Average £

Average £

Average £

Average £

Property type

Average £

Average £

Average £

Average £

Average £

Property type

Average £

Average £

Average £

Average £

Average £


Before we look at how to create the crosstab query itself, let us first clarify what the property type and postcode data actually means. Property type data can be one of four different letters – D, S, T or F.  These are the first letters of the following property types – Detached, Semi-detached, Terraced or Flat. For those of you outside of the UK and unfamiliar with these property types: Detached means a stand-alone building containing a single home; Semi-detached means a single building comprised of two separate homes; Terraced means a row of multiple adjoining homes; and Flat means a large building divided into multiple apartments. Detached homes tend to be the most expensive property type, followed by Semi-detached, Terraced and Flats respectively. However, location has a direct bearing on the property’s sale price, so a semi-detached property in a desirable area may be more expensive than a detached property located somewhere less desirable.  This is where the properties postcode value comes in.  Postcodes are comprised of two separate parts eg M28 1FU.  The full postcode tells us which part of a street the property is located, so each full postcode may be shared by a small number of other properties on the same street.  The first part of the postcode eg M28, on the other hand, relates to the general area where the property is located, so there can be hundreds of other properties sharing this part of the code.  As such this can give us a good idea about the desirability of the properties location, and it is this part that we shall use for the column headings in our crosstab query.


Creating the Access Crosstab Query

Let us now look at how the crosstab query is created.

In the previous blog post on Using Access for Data Analysis we downloaded a dataset from the UK based HM Land Registry website and imported it into an Access table.  We called the table tblSalfordHouseSales (after data cleansing).  We shall be using this table to create the crosstab query below. 

Here are the step by step instructions:

  1. Open the Access database in which tblSalfordHouseSales is located and select the CREATE tribbon.
  2. Click the QUERY DESIGN icon from the QUERIES group to open the QUERY DESIGN window.
  3. Click the ADD TABLES icon from the QUERY SETUP group of the QUERY DESIGN ribbon. This opens the ADD TABLES sidebar (if it is not open already).
  4. Select the TABLES tab from the ADD TABLES sidebar, and drag the tblSalfordHouseSales table onto the top section of the QUERY DESIGN window.
  5. Next, we need to click the CROSSTAB icon from the QUERY TYPE group of the QUERY DESIGN ribbon.  This adds CROSSTAB row to the QUERY DESIGN grid in the lower half of the QUERY DESIGN window.

    Figure 2: : The CROSSTAB icon in the QUERY TYPE group of the QUERY DESIGN ribbon.


    Figure 3: The QUERY DESIGN grid after selecting property_type in first column. Notice the extra CROSSTAB row heading appear after clicking the CROSSTAB icon in the QUERY TYPE ribbon group.


  6. The first field we are going to add to the QUERY  DESIGN grid is property_type.    Do this by  selecting the property_type field from the drop down list in the first column of the  FIELD row of the QUERY DESIGN grid, or by dragging property_type down from the tblSalfordHouseSales table box in the top half of the window.
  7. As explained above, each data value in the property_type field will act as a row heading in our crosstab query.  To set this up on the QUERY DESIGN grid, we go down to the CROSSTAB row on the property type column, and select ROW HEADING from the drop down list as in figure 3 above.
  8. The next field we are going to add to the QUERY DESIGN grid is postcode.  However, as explained above, we are not interested in the full postcode, just the general postcode area.  You may remember that we discussed how UK postcodes are comprised of two parts and it is the first part which tells us the general area.  As such, we need to use a string function to extract the first part of the code from the full field value.  We are going to achieve this end using the InStr function to count the number of characters in the first part of the code, and then using the Left function to extract these characters from the full field value. We use the InStr function to search for the space character which separates the two parts of the code.  The value the function returns is the numerical character position of the space.  The syntax for this is as follows:

 

InStr([postcode], “ “) -1

 

By subtracting 1 from the return value, we get the character count of the first part of the code. For example, if the postcode was “SK14 5RP”, our function returns 5 for the space position, and then 4 for the character count after we subtract 1.

 

We then use the Left function to extract the number of characters in the character count from the left side of the string. The syntax for this is as follows:

 

Left([postcode], InStr([postcode], “ “)-1)

 

… and this returns the first part of the postcode as a new string value ie “SK14” for the example used above. 

 

As such, we can now add this function to the field row in the second column of the QUERY DESIGN grid. The string data this function will return will form the column headings in our crosstab query result, so select COLUMN HEADING in the crosstab row drop down list as in the screenshot below:

Figure 4: NB Once you enter the expression, Access gives the column it’s own alias of Exp1.  You can overtype this with postcode_area to give it a more meaningful name. Doing so does not affect the query result in any way.

10. Now that we have sorted our row and column headings, we just need to take care of the actual values where the rows and columns intersect. As you may remember, these are going to be the average price paid for a property of a specific type (ie the property_type row) in a specific area (ie the postcode_area column). Do this by selecting the price_paid field from the drop down list in the third column of the  FIELD row of the QUERY DESIGN grid, or by dragging price_paid down from the tblSalfordHouseSales table box in the top half of the window.

11. Next, we need to change the value in the TOTAL row of the price_paid column in the QUERY DESIGN grid.  Change the value from GROUP BY to AVG.  This will give us the average price_paid aggregate for properties of a specific type in a specific area.

12. This is where we set the CROSSTAB row of the price_paid column in the QUERY DESIGN grid. Select VALUE from the dropdown list as in figure 4 above. This simply tells Access that the price_ paid field is constituting the values in the crosstab query.

13. Finally, we just need to set field properties for price_paid so that we get a whole number without any decimal places. Do this by clicking anywhere in the price_paid QUERY DESIGN GRID column, and then clicking the PROPERTY SHEET icon in the SHOW/HIDE group of the QUERY DESIGN ribbon, and then set the FORMAT property to CURRENCY and the DECIMAL PLACES property to 0.  This will make the query results much easier to read.

All we need to do now is click RUN in the RESULTS group of the QUERY DESIGN ribbon to see the results of our crosstab query!

Figure 5: Average price paid for property types in the different Salford poscode areas (data taken from sales during 2023).

As we can see, the different property type codes from data in our original table form the row headings in the crosstab query, and the postcode areas form the column headings. The actual table values, as we have learnt above, are an aggregated price_paid average of all properties of a specific type in a specific area.  So, for example, the average price paid of all detached properties in the M27 postcode area is £335,592. If we compare this to the average price paid of the same property type in the M28 postcode area we see the value of £440,601 is more expensive, thereby suggesting that the latter area is more desirable. As such, the crosstab query has given us a meaningful summary of the 2,942 rows of data from our original table.  In so doing it has helped us understand what the original data was telling us.