Showing posts with label external data. Show all posts
Showing posts with label external data. Show all posts

Monday, 26 May 2025

Using Access for Data Analysis

Normally, when we think of  MS Access, we do so in the context of database development; that is to say, as software to create database applications for long term use to keep track of data such as customer orders, income and expenditure, and stock control. Such database applications are used by one or more people over a period of time to store, process and retrieve information, and often support the day to day operation of a business.  

In this tutorial, however, we are  going to look at how MS Access can be used in the context of data analysis.  This use is very much different from the database development model described above. Rather than setting up a database for long term use in the day to running of a business operation for example,  we are going to use Access in a short term context, as a tool to import data from an external source, and then analyse that data through Access Queries to make it meaningful from a particular perspective. This summarised data can be presented to, say, a management team to make informed decisions based on what the information reveals.

To do this we are going to be working with 2023 house sale data for the Salford area of Greater Manchester.  The process will involve extracting a dataset from the UK based HM Land Registry website, importing the dataset into Access, creating query's to categorise and aggregate the data, and then create a report containing a clustered column chart to present the data visually.

Downloading the Dataset

The first step in the process is to download the dataset from the HM Land Registry website. This will involve filling in an online webform to specify the data we require. In our case this is all data between 1st January and 31st December 2023 in the district of SALFORD.

 1. Click this link the HM Land Registry link to open the web form in your browser.


2. Enter "SALFORD" in the DISTRICT textbox, 01/01/2023 and 31/12/2023 for the DATE range (note the UK date format) , and select "ALL" from the HOW MANY RESULTS option group at the bottom. 
3. Click the SHOW RESULTS button.

This opens a new web page showing 2947 property sale transactions for 2915 properties. We now need to download the results into a CSV file.

4. Click the  DOWLOAD DATA button at the top of the form.  This opens a web page giving us options for downloading:




5. Click GET ALL RESULTS AS CSV WITH HEADERS button to start the download. 

A dataset called ppd_data.csv will be saved to your downloads folder.  You can now open the file to inspect the data.  It will open in MS Excel by default and if you scroll to the bottom of the worksheet there should be 2947 rows.


Importing the Dataset into Access

We can now go ahead and import the data into our Access Database.

1. Click NEW DATA SOURCE on the Access EXTERNAL DATA ribbon. Choose FROM FILE when the context menu opens, and then click TEXT FILE.


This will start the GET EXTERNAL DATA - TEXT FILE wizard.

2. Click the BROWSE button on the first page of the wizard and select the text file which has been saved in your downloads folder.  Leave the top radio button selected - it reads "import the source data into a new table in the current database".  Click OK at the bottom of the wizard window.


3. The next page (see screenshot below) shows a sample of the data we are importing.  At the top it reads "Your data seems to be in a delimited format.  If it isn't, choose the format that more correctly describes your data".  If you look at the sample of data displayed, you see each field is separated by a comma.  We can therefore leave the "Delimited" radio button checked, and click NEXT.



4. The next page of the wizard (see screenshot below) asks us to choose the delimiter that separates our fields. As we noted above, our data is separated by commas, so we should now select the "Comma" radio button.  We also need to click the "First Row Contains Field Names" tickbox.  You may get pop up message explaining some of the field names have been modified to make them compatible with Access.  This is fine. Just click OK. We can now click NEXT to move on to the next page of the wizard.



5. This page (see screenshot below) gives us the option to specify information about each of the fields we are importing.  This includes specifying the data type for the field. We are going to change the data type for the price_paid field from SHORT TEXT to CURRENCY.  To do this, highlight the price_paid column and select CURRENCY from the DATA TYPE combo box. We will also change the data type on the deed_date field from SHORT TEXT to DATE WITH TIME. Another thing we are going to change is the index type (ie the Indexed field option) on the unique_id field from "Yes (Duplicates OK)" to "Yes (No Duplicates). This will ensure the data set does not contain any duplicate records.

At this stage we can clean the data set further by excluding any fields that we do not need.  To do this, highlight the field for exclusion and tick the "Do not import field" tick-box in the Field Options section.  We are going exclude the estate_type field, and all the fields from transaction_category and after. 

We can then click NEXT to continue.




6. The next page of the wizard lets us choose a primary key.  There are three options: these are "Let Access add primary key", "Choose my own primary key" and "No primary key".  We are going to select our own primary key, so tick the middle option and select the "unique_id" field from the combo box list. Then click NEXT.



7. In the last page of the wizard we are asked to choose a name for the table into which our data is to be imported.  Entry the table name "tblSalfordHouseSales" and click FINISH.


We can now take an initial look at the table of data we have imported.  Go to the Access navigation pane and open tblSalfordHouseSales (see screenshot below).


As we can see, the data set contains 2946 records.  Each record provides information about the sale of a given property and consists of  13 fields.  Some of  most useful fields we shall be using for data analysis in this tutorial are price_paid, deed_date and postcode. We also have the option of analysing house sales by property_type and whether the property is a new_build.


Checking the Data

Now we have imported the data set into Access, we can start checking it's accuracy.  However, before we start, lets make a copy of the imported table so we can undo any changes that we make without having to repeat the download and import process.  Do this by right clicking tblSalfordHouseSales in the navigation pane and select COPY from the context menu.  Then simply right click a blank space in the navigation pane and select PASTE.  A window opens where we can enter a new name to call the table we are pasting, and select which paste option we would like.  We are going to call the new table "tblSalfordHouseSalesOriginalData" and select the Structure and Data paste option.  This will paste a new table based on the same structure, and containing the same data, as the one we copied. Then click OK.




The first thing we are going to check are the postcodes (zip codes) in our dataset.  Since  the data we have downloaded is for the Salford area of Greater Manchester, each record in the dataset should start with one of the following values in the postcode field: - M3, M5,  M6, M7, M27, M28, M30, M38, M44, and M50 (after which there is a space followed by the second part of the postcode) . As such, we are going to create a query which groups the data into groups comprised of the first three values of the postcode, and provide some basic aggregate information for each grouping. From this information, we should be able to pick out any obvious anomalies contained in the data.

Create Query to Group & Aggregate Postcode Data




1. Create a new query by clicking the QUERY DESIGN icon on the CREATE ribbon.
2. Click the ADD TABLES icon (on the QUERY DESIGN ribbon) to display the ADD TABLES sidebar if it is not visible already.
3. Drag the tblSalfordHouseSales table from the tables list in the sidebar over to the QUERY DESIGN window.
4. Click the TOTALS icon in the SHOW/HIDE group of the QUERY DESIGN ribbon.  This adds a TOTAL row to the QUERY DESIGN grid.  This enables us to specify whether the field is to group the data contained therein, or perform an aggregation on it.
5. The first column in the query grid (see screenshot above) is going give us a list of postcode groups.  All the post code groups in the Salford area (see above) are 2 or 3 characters long.  So to extract the postcode group value from the value stored in the postcode field , we are going to create a calculated field in the first row of the first column of the query grid. The calculation is based on the postcode field in tblSalfordHouseSales and utilises the Left string function to take the first 3 characters of the field value.  This is enclosed within a RTrim function to remove the right blank space if the postcode group is two characters rather than three. It is written as follows: 

RTrim(Left([postcode], 3))

We will use the alias PCode as the field name. This is written ...

PCode:

... directly before the RTrim  function syntax. So altogether the first row of the first column in the query grid (see screenshot above) will read ...

PCode: RTrim(Left([postcode], 3))

We then set this field to GROUP BY in the TOTALS row.

6. In the second column of the query grid,  we are going to count the number of records within each postcode group.  So, in the ADD TABLES side bar, drag the unique_id field from tblSalfordHouseSales down to the first row of the second column in the query grid, and select COUNT from the drop down list in the TOTALs row.
7. In the third column of the query grid we are going to SUM the total of price_paid to give us the total price paid within each postcode group.  So drag price_paid from tblSalfordHouseSales down to the first row of the third column in the grid, and select SUM in the TOTALs row.
8. Click the RUN or DATASHEET icon  in the RESULTS group of the QUERY DESIGN ribbon to see the query results.



Looking at the query results above, we can see that most of the properties have a post code in the Salford area - these are M3, M5,  M6, M7, M27, M28, M30, M38, M44, and M50.  However there are three properties which do not have any postcode, and one property which has a postcode starting with WA1.  Taken together these come to a total value of £10,249,999.  In order to ensure our data set is accurate, we need to investigate why these properties do not have a Salford post code and remove them if necessary.  To do this we shall create another query to get the full information for the dubious records.



1. Repeat steps 1 to 3 above to create a new query based on the tblSalfordHouseSales table.
2. Drag the asterix * from the  tblSalfordHouseSales table down to the first row of the first column on the query design grid.  The asterix is a quick way to include all the fields in the table without having to drag each one individually.  However, if we need to enter a criteria for a particular field (which we do) we can still drag that field down separately .. 
3. As such, drag the postcode field down to the first row of the second column in the grid.
4 Click the CRITERIA row in the postcode column and enter the following expression: 

Is Null  or Like "WA1*"

The criteria Is Null Or Like "wa1*" will pull out any record containing a
postcode with a null value or a value starting with "wa1".

5. Click the RUN or DATASHEET view icon in the RESULTS group of the QUERY DESIGN ribon the see the query results



The first record returned by our query is for a property with the postcode WA14 5DY which (if you scroll across) is actually located in the town of Altrincham.  If you are familiar with the geography of Greater Manchester you will know that this town is in the borough of Trafford rather than Salford. As such, this record does not belong in our dataset and should be removed.  When we examine the data for the remaining three records, it is apparent that they pertain to land in a commerce park, a parking space, and a brown field site. As such they fall outside the scope of our interest in house sales, and can also be removed from our dataset. Its worth reflecting that had we not done this check, our dataset would have inaccurate to the value of £10,249,999.

A good way to remove these records from the dataset is to go back into QUERY DESIGN view and change the query type from SELECT to DELETE.




After changing the query type from SELECT to DELETE, you will see an addition row appear in the query grid entitled "Delete".  You will also see that Access has filled in the values for this row.  In the first column it has added "From" which means the records are being deleted from the tblSalfordHouseSales table, and in the second column it has added "Where" which means all records in the table matching our criterion Is Null Or Like "WA1*" will be deleted.  

Click the RUN icon in the RESULTS group of the QUERY DESIGN ribbon to run the delete query. 

Access will display a warning dialog saying that the 4 rows containing dubious records are going to be deleted from our table.



Click YES to continue the delete operation.

If we now go back and reopen tblSalfordHouseSales the record count will now show 2942 records, 4 less than the 2946 records we had previously.



There are many other checks that we can do to ensure our dataset is clean, but the steps mentioned above give us an idea of how we can use Access's queries and other features in this process.  Let us now move on and see how we can use Access to analyse our dataset.

Analysing Data with Access Queries

To illustrate how Access can be used to make sense of our housing sale dataset, we are going to create a number of price bands and count the number of sales in each band. This will give us a basic understanding of the housing market in Salford during 2023.  

This task will involve the creation of a subquery for each individual price band, and a parent query to pull together the sale counts within each band, showing them all in one row of aggregate data.  Lets begin with the subqueries.

We are going to create a total of seven price bands as follows:

PRICE BAND                                     SUBQUERY
< £100,000                                            qrySubBelow100K            
>= £100,000 and < £200,000                qrySub100Kto200K                
>= £200,000 and < £300,000                qrySub200Kto300K
>= £300,000 and < £400,000                qrySub300Kto400K
>= £400,000 and < £500,000                qrySub400Kto500K
>= £500,000 and < £1,000,000             qrySub500Kto1000K
>= £1,000,000                                       qrySubAbove1000K


Each of the above queries is based on the unique_id and price_paid fields of tblSalfordHouseSales.  Once again, the query is going to use the TOTALS row in the QUERY DESIGN grid because we need to aggregate the data returned.  We are going to set up our query to count the number of rows returned in the dataset when we add a criterion to the price_paid field. To do this we shall set the value in the TOTALS row of the unique_id column to COUNT,  and the value of the TOTALS row of the price_paid column to WHERE.  All we need to do then is enter the criterion in the CRITERIA row of the price_paid column.  For example, when we create qrySubBelow100K to count the number of sales less than £100,000  the criterion we enter is  <100000 as in the screenshot below:

You may have noticed how the tickbox in the SHOW row
of the QUERY DESIGN grid is unticked when we select WHERE 
in the TOTAL row for price_paid. This is because the only purpose 
of the price_paid column is to filter the rows to be included in the
count of unique_id's.

When we run the query a single row and column is returned with the count of unique_id's matching the query criterion entered for price_paid. For records with a price paid value less than £100,000 this should value should be 140.



Here are the instructions for creating the subqueries for the seven price bands.  We will need to repeat this process seven times for each of the subquerie:

1. Create a new query by clicking the QUERY DESIGN icon on the CREATE ribbon.
2. Click the ADD TABLES icon (on the QUERY DESIGN ribbon) to display the ADD TABLES sidebar if it is not visible already.
3. Drag the tblSalfordHouseSales table from the tables list in the sidebar over to the QUERY DESIGN window.
4. Click the TOTALS icon in the SHOW/HIDE group of the QUERY DESIGN ribbon
5.In the first column of the QUERY DESIGN grid, select the unique_id field from the drop down list in the top row.
6. We now need to go down to the TOTAL row in the first column and select COUNT from the drop down list.
7. In the second column of the QUERY DESIGN grid, select the price_paid field from the drop down list.
8. Going down to the TOTAL row of the second column, select WHERE from the drop down list.
9. We now need to enter the criterion in the CRITERIA row of the second column.  The criterion we enter depends which of the seven subqueries we are working on.  The criterion for qrySubBelow100K, for example, is <100000. The criteria for all seven queries are listed in the table below.
10. Click the SAVE icon at the top left of the screen and the name of subquery that we were creating.

WHERE CRITERIA                      SUBQUERY
< 100000                                           qrySubBelow100K            
>= 100000 AND < 200000               qrySub100Kto200K                
>= 200000 AND < 300000               qrySub200Kto300K
>= 300000 AND  < 400000              qrySub300Kto400K
>= 400000 AND < 500000               qrySub400Kto500K
>= 500000 AND < 100000               qrySub500Kto1000K
>= 1000000                                       qrySubAbove1000K

In addition to the seven subqueries above, we shall also create another subquery (qrySubCountAll) to give us a grand total for 2023 Salford house sales across all price bands.  This subquery is the same as those we created above, except we do NOT need the price_paid column or any WHERE clause therein. We just need the COUNT in the unique_id column as in all the previous subqueries:


When this query is run we get the same output format as the previous seven subqueries (see screenshot below).  The result is a count of all house sales in tblSalfordHouseSales.  This should be 2942.



Now we have created all our subqueries we can begin work on the parent query which pulls all the subquery results together in one row of aggregate data.

The Parent Query

We are going to call our parent query qryHousePriceGroups.  Each column in the query will display the aggregate total count of house sales for each of the price bands.  The last column will display the grand total.  To do this we will need to add all 8 subqueries we created to the QUERY DESIGN grid window and use the appropriate subquery as the data source for each of the parent query columns.



As we can see in the screenshot above, we have given each column name an alias such as "Below 100K" and "100K to 200K".  This will make the query output easier for users to read and understand.

Here are the step by step instructions for creating the parent query.

1. Create a new query by clicking the QUERY DESIGN icon on the CREATE ribbon.
2. Click the ADD TABLES icon (on the QUERY DESIGN ribbon) to display the ADD TABLES sidebar if it is not visible already.  Then select the QUERIES tab at the top of the sidebar.  You should see all the subqueries that we have created previously.
3. Drag all 8 subqueries over to the QUERY DESIGN window.  We will now start to dragging the CountOfUnique_id fields from each subquery onto the QUERY DESIGN grid below.
4. The first CountOfUnique_id field we are going to drag is from the qrySubBelow100K subquery.  Drag this down to the cell in the top row of the first column in the grid.  
5. We shall now enter the alias name for the column.  We do this by typing the name Below 100K in the same cell just before CountOfUnique_id separated from it by a colon as follows:

Below 100K: CountOfUnique_id

6. In the top row of the next column, we are going to drag CountOfUnique_id from the qrySub100Kto200K subquery.  We shall give this the alias 100K to 200K.
7. Repeat step 6 for each of the subqueries using the following alias's:

200K to 300K:
300K to 400K:
400K to 500K:
500K to 1000K:
Above 1000K:
All:

When we run the query we should get this result:



As we can see, out of a total of 2,942 house sales, the vast majority fall within the 100K to 200K, and 200k to 300K price groups.  The 100K to 200K group contains 1,258 sales, and the 200K to 300K group contains 992 sales, giving a total of  2,250 sales out of 2,942, or 76% of the total 2023 Salford housing sale count. The figures also reflect the fact that Salford has some poorer districts as well as those which are very affluent.  There were 140 house sales (5%) with  property values of less than £140,000, but 100 sales (5%) with values between £500,000 and £1000,000, and 26 sales (1%) above £1000,000.

Lets end by showing this data visually in a clustered column chart embedded  in an Access report.  Here are the step by step instructions:

1. Click the REPORT DESIGN icon in the REPORTS group of the CREATE ribbon.
2. Resize the REPORT DESIGN grid so we get a larger section of screen in which to create the chart.
3. Click the INSERT MODERN CHART drop down menu in the CONTROLS group of the REPORT DESIGN ribbon.  We are going to select the CLUSTERED COLUMN CHART from the COLUMN submenu.

4. The cursor will now change to a small chart symbol.  Position the cursor in the top left hand corner of the DETAIL section of the design grid.  A small chart will embed itself on the grid.  At this stage we just have sample data displayed in the chart.
5. Resize the chart by clicking it's border on the bottom left corner and dragging the corner down and across so we get a bigger chart.

New chart showing sample data.

6. We will now connect the chart to our parent query. We can do this from the CHART SETTINGS sidebar. Click anywhere inside the chart so it is selected. Then click the CHART SETTINGS icon in the TOOLS group of the REPORT DESIGN ribbon (if the sidebar is not displayed already).


7.  Click the QUERIES radio button under DATA SOURCE, then select the parent query that we created earlier from the drop down list.
8. Go down to the VALUES (Y axis) section below.  You will see a list of checkboxes with the price bands we created in our query. Tick each of the boxes except ALL. That is from Below 100K down to Above 1000K. You will notice there is a COUNT OF aggregate function contained within brackets after each of the check box labels. We need to change the aggregate function to NONE by clicking the drop down list which appears when you hover the mouse over the text. You should now see the data from our parent query appear on the chart. We now need to customise the appearance of the chart to make it more user friendly.
Chart showing parent query data ready
to be customised.

9. The first customisation we shall make is the addition of a data label above each chart column.  This shows the value for the count of sales for each of the price bands. To do this select the FORMAT tab at the top of the CHART SETTINGS side bar. Select the first price band Below 100K from the drop down list under the DATA SERIES title. Then tick the Display Data Label checkbox. 



    You should now see the value 140 appear above the Below 100K column on the chart.

10. Select the next price band from the Data Series drop down list and tick the Display Data Label checkbox. Do this for the rest of the price band columns on the chart.
11. For the remaining customisations, we are going to use the chart property sheet. Click anywhere inside the chart so it is selected, then click the PROPERTY SHEET icon from the TOOLS group of the REPORT DESIGN ribbon.
12. The next customisation we are going to make is the addition of a chart title. Select the FORMAT tab on the property sheet and locate the HAS TITLE and CHART TITLE properties further down the sheet.  Ensure the HAS TITLE property is set to YES, and enter "2023 Salford House Sales" for the CHART TITLE property value.



13. We shall next move the legend from the top of the chart to the bottom.  To do this locate Legend Position also on the FORMAT tab of the property sheet.  Change the property value to "bottom".
14. The final customisations we are going to make involve adding a title to the vertical axis and remove the text saying "140" below the horizonal axis.  To do this locate the primary values axis title on the property sheet and change the value to "Count of Sales";  and then locate the category axis font color property, click the ellipse button (...) in the value cell, and select white from the colour chart.  This will hide the unwanted text from view.



15. We can now open the report by clicking the REPORT VIEW button on the VIEWS group of the  REPORT DESIGN ribbon. The result should look like the screenshot below:




We can now see the data from our parent query in visual form, thereby gaining a clearer view of the concentration of house sales in the 100K to 200K, and 200K to 300K price bands.

There are, of course, many other dimensions we can use to analyse our data.  For example, we can create queries to group house sales according to property type, and find out the average value in each group.  We could also have downloaded a larger dataset which includes sales over a number of years, and create a chart showing how prices have changes over particular timespans. I intend to cover areas such as these in future posts.


Friday, 17 May 2019

Linking Access to an Excel Worksheet

Excel is great for number crunching, and can even be used as a basic database to store information.  However, when it comes to displaying and presenting information, the MS Access database has some big advantages over it's peer in the Office 365 suite of software. The reason for this is that Access has a number of tools and features specifically designed to make data easier to read and understand - the Reports facility being just one.

The good news is that Access has the ability to connect to, and work with, Excel data in two different ways.  Firstly, Access can import data from an Excel worksheet into an Access table.  This method lets us work with a "snapshot" of data from Excel as it was at the time the data was extracted (there is more information about importing from Excel in my post on "Importing and Exporting Data between Access and Excel").  The second method is to set up a "live" link between the Excel worksheet and the Access table. This has the advantage of giving us direct access to current data in the spreadsheet; so any updates, additions or deletions will show up in Access (after data is refreshed/reloaded) as soon as the data is saved in Excel. However, please do note that this connection is read-only; new data cannot be saved back into the Excel spreadsheet without specifically doing an additional export.

Today we are going to learn how to set up this "live" link. To do so, I have set up a sample Excel worksheet with fictitious personnel records.

Linking to Excel
  1. Create a new Access Database, or open an existing one.
  2. Select the EXTERNAL DATA ribbon.
  3. Click the NEW DATA SOURCE icon in the IMPORT AND LINK group.
    External Data Ribbon
    Above: Adding a New Data Source from the External Data ribbon.
  4. Select FROM FILE from the drop-down menu, and then EXCEL from the sub-menu.  This opens the GET EXTERNAL DATA - EXCEL SPREADSHEET dialog form.
    Get External Data - Excel Spreadsheet dialog form
    Above: The Get External Data - Excel Spreadsheet dialog form.
  5. Click the BROWSE button and select the Excel spreadsheet you want to link to.
  6. Click the LINK TO A DATA SOURCE BY CREATING A LINKED TABLE radio button, and then click OK.  
  7. The next page of the wizard opens where you should see a sample of the worksheet you are linking to.  If your worksheet had column headings, ensure the FIRST ROW CONTAINS COLUMN HEADINGS box is ticked, and click NEXT. (Please note that if your spreadsheet has more than one worksheet or named range, you will see an additional wizard page asking you to select the specific worksheet or range before the page in the screenshot below).
    MS Access - Link Spreadsheet Wizard
    Above: First page of the Link Spreadsheet Wizard.
  8. Enter the name you are going to call the linked Access table in the last page of the wizard, and click FINISH.

Once you have done this, you will see the new linked table appear in the left-hand Access navigation pane under TABLES.  Note how the linked table icon is a blue arrow pointing to the Excel logo, thereby indicating the table is linked to an Excel spreadsheet.

MS Access table linked to Excel
Above: The table linked to Excel.


Now the link has been established to the Excel worksheet, we are free to base forms, reports and queries on the linked table as if it is native to Access itself.



Friday, 4 May 2012

Using the Access Database Splitter

Last year I wrote a post on Linking to an External Data Source.  This is where we split an Access database  into a Back End (which stores all the tables and data), and a Front End (which contains all the forms, queries and reports).  There are a number of advantages to observing this practice.  To begin with, splitting a database file in this way is all but essential if it is to be deployed on a network in a multi-user environment. Another big advantage is that we can easily replace a front end file with a new updated version with minimal disruption - the actual data stored in the back end tables is not touched in this process. 

If you read the my post last year, you may remember that I showed you how to link a front end Access file to an existing back end database.  This is fine if the back end database already exists, or if you decide to create the application in two separate files from the start.  However, many developers create or inherit a complete database application as one stand alone file, and then proceed to split the database.  Splitting the database can be done manually, but this is often time consuming and there is always the possibility of making errors in the process.  An easier and quicker way of doing this is by using the Access Database Splitter.  This is a wizard that splits the database into a front and back end, as described above.  

Splitting an Access Database
The database that I am working with here is an Order Management System containing a number of tables.  If you want to try this on one of your own database files, it does not matter how many tables it contains - it could have a hundred or it could just have one.  
Figure 1
  1. Click the ACCESS DATABASE icon (located in the MOVE DATA group of the DATABASE TOOLS ribbon).  See figure 1 on on the right.  This opens the DATABASE SPLITTER Wizard.
  2. Click the SPLIT DATABASE button at the bottom of the Wizard (see figure 2).   
  3. You are now asked to select one of your folders for the new back end Access file.  Browse to a location of your choice, enter a back end file name, and then click SPLIT.  After a moment or two, you should get a message box saying "Database Successfully Split".
  4. Click OK.
Figure 2: The Database Splitter Wizard from Stage 2.
You can now look at the navigation pane of the front end database and see the linked tables (see figure 3).  You can try adding data to one of your tables and then open the back end file to see it stored there.  You could also try copying the front end file in order to access the back end tables with different front ends.  In so doing you can see the basic principle behind how multi-user systems operate*.
Figure 3 (above): This is how linked tables appear in the Front End Database file.
Figure 4:  Your Database is now split into two files one for the front end and one for the back end.

*Tip:  If you want to run more that one Access front end at a time, you will need to open multiple instances of the Access Database.  So rather than clicking on the access front end file directly (ie from within Windows Explorer), you should open Microsoft Office Access from the windows START button or DESK TOP and then open each front end file from the GETTING STARTED WITH MICROSOFT OFFICE ACCESS screen.   

Friday, 17 June 2011

Importing and Exporting Data between Access and Excel

When you work with any sort of external data in relation to Microsoft Access, you have three options.  You can Import, Export or Link.  The difference between Import and Export is simple: you are either importing  data into Access, or exporting data out of Access.  The data you are working with, in this case, is a snapshot. In the case of Importing, once data has entered your Access application from an external data source, any changes made to the original data source, are not reflected in your imported data.  The same principle applies when you export data to an external application - any changes you subsequently make in Access is not reflected in your exported data.  This is in contrast to the option we covered in the last blog post on Linking Access to an External Data Source, where the connection made is 'live'.  Any changes made in Access is reflected in the external application , and any changes made in the external application is reflected in Access.

In this blog post we are specifically going to be looking at Exporting and Importing snapshot data to and from Microsoft Excel.  NB you can also partially link an Access Database to an Excel spreadsheet, but the connection in this case in not completely live - you cannot change data from within the Access Database (which is then reflected in Excel), although data changed within Excel is reflected within Access - in other words, the connection is read-only.

Exporting Access Data to Excel

Let's begin by going through the procedure of Exporting Access data to Excel.  This is a procedure you might go through to export an Access Query or Table to Excel in order to perform further analysis.  We shall be using the Access database from my last blog post which you can download from here.
  1. Open the database containing the table to be exported.  
  2. Click the name of the table in the NAVIGATION PANE so that it is highlighted in orange.  Alternatively you could actually open the table.
  3. Click the EXCEL Icon located in the EXPORT GROUP of the EXTERNAL DATA Ribbon.



  4. When the EXPORT dialogue box opens, click the BROWSE button and select a location and filename for the newly exported excel spreadsheet.  Then click the EXPORT DATA WITH FORMATTING AND LAYOUT and the OPEN THE DESTINATION FILE AFTER THE EXPORT OPERATION IS COMPLETE check box's.

    Figure 1:  The EXPORT dialogue box for exporting to Microsoft Excel.

  5. Then click the OK button below.  This completes the export process.  The finished result can be seen in the screen shot below:
Figure 2: Table data exported from Access into Excel.

Importing Data into Access from Excel


Now that we have exported our Access Table into Excel, lets have a go at Importing data back into a new Access Table from an Excel spreadsheet.  This is a procedure Access Developers often make when they are convert an Excel spreadsheet to Access.  For sake of convenience, we are going to Import data from the same Excel Spreadsheet which was created when we did our original Export Procedure.
  1. Open the Access database into which we are going to Import the data.  You can use the same database as before, because we are going to import the data into a new table.
  2. Click the IMPORT EXCEL SPREADSHEET Icon located in the IMPORT GROUP of the EXTERNAL DATA Ribbon.

  3. This opens the GET EXTERNAL DATA dialogue box for Microsoft Excel.  Click the BROWSE button and search for the spreadsheet file to Import.
  4. You then have three options.  You can Import the data into a new table, append the data into an existing table, or create a link.  We are going to Import the data into a new table, so click the top check box which says IMPORT THE SOURCE DATA INTO A NEW TABLE IN THE CURRENT DATABASE.  Then click OK.

    Figure 3:  The GET EXTERNAL DATA dialogue box for Importing to Excel.

  5. This opens the first page of the IMPORT SPREADSHEET WIZARD (see screenshot below).  Our spreadsheet contains column headings which can be used as field names, so click the Check Box which says FIRST ROW CONTAINS COLUMN HEADINGS.


    Figure 4: The first page of the IMPORT SPREADSHEET WIZARD.

  6. Click the NEXT button for the second page of the Import Spreadsheet Wizard.  This is where we are able to specify information about each of the fields we are importing - information such as DATA TYPE, INDEX, and FIELD NAME.  Click the ID column heading so that it is highlighted (if it is not already).  Then change the INDEX to YES(NO DUPLICATES) and the DATA TYPE to LONG INTEGER.  This is so we can use the ID field as the table's primary key.

    Figure 4: Setting the  FIELD OPTIONS in the IMPORT SPREADSHEET WIZARD.

  7. Click NEXT to move to the third page of the Import Spreadsheet Wizard.  This is where we explicitly define the Primary Key for the table.  We already have a suitable column to use as the Primary Key, so we do not need Access to do this for us (the default option).  As such, click the middle Option Box where it says CHOOSE MY OWN PRIMARY KEY.  Then select the ID field from the drop down list (if it is not already selected).

    Figure 5: Defining the Primary Key.

  8. Click NEXT to bring up the final page of the Import Spreadsheet Wizard.  This asks you to enter the name of the table.  Change the default name to tblExcelImport (a new table name that does not already exist) and then click the FINISH button.

    Figure 6: Selecting the new Table Name for the Imported Data.
The table has now been imported and appears in the NAVIGATION PANE of your Access Database. You are now free to open the form and view the records, and may also edit the new table in DESIGN VIEW if you wish.