Showing posts with label calculated field. Show all posts
Showing posts with label calculated field. Show all posts

Friday, 2 March 2012

Calculating Questionnaire Percentages using a Multi-Level Query

Last Saturday I received an interesting question from Ogwang Denis on my Access Tutorial Facebook page. He had created an Access Database to store the results of a survey questionnaire comprised of a number of Yes/No questions, and wanted to know how he could process the answers in a query to produce a percentage.  

The solution to this problem is not quite as straight forward as it may at first seem. As Oswang correctly points out, the table field we want to process is of the Yes/No Data Type.  Moreover, each answer to this question is contained in separate interviewee records'.  The task can be articulated into three distinct stages:
  1. Count how many interviewees' have been asked this question regardless of answer given.
  2. Count how many interviewees' answered YES to this question.
  3. Calculate the percentage of Yes answers on the basis of information obtained in stages 1 and 2 above.
Each of these three stages is accomplished with a separate Query.  Moreover, the 3rd stage query uses the summary data provided by the two queries created in the first two stages. As such the final query is multi-levelled - the actual table data represents the first level; the stage 1 and 2 queries, the second; and the third stage query represents the third and final level.  We will see how this multi-levelling is achieved shortly.

Sample Exercise:
Let's look at a simplified database that I created in order to illustrate these stages.  The table, from which the queries are based, contain three questions in each record.  Each question is of the Yes/No Data Type.  

Figure 1: The sample data we shall be working with is
stored in a table called tblQuestionnaire.   Each record represents a separate
interviewee's set of answers.  In this exercise we are just interested in their answers' to Question1.
Stage 1:
The first stage involves creating a query to count the number of interviewee records - that is to say, the count of how many times Question1 was asked.  This just requires a single query column based on the count of  the ID field in tblQuestionnaire.  We shall save this query as qryQuestion1Count:

Figure 2: The stage one query saved as qryQuestion1Count.
Stage 2:
The second stage involves creating a query to count the number of YES answers to Question1. This is achieved by counting the ID field of tblQuestionnaire as we did in the first query, but this time we shall also add a second query column based on the Question1 field. This column needs to contain a criteria to filter out the YES answers (since -1 represents YES when dealing with the Yes/No Data Type, the criteria will be = -1).  As such we are creating a query to count the ID fields of all records WHERE Question1 = -1. We shall call this query qryQuestion1Yes:

Figure 3:  The stage two query saved as qryQuestion1Yes.  This counts the number of
ID fields (and hence records) in tblQuestionnaire WHERE Question1 = -1.
Stage 3:
For the third and final stage, we need to create a query based on the previous two queries.  We do this by selecting qryQuestion1Count (the stage one query), and qryQuestion1Yes (the stage two query), from the QUERIES tab of the  SHOW TABLE dialog form, when the Query Design Window opens.  

Figure 4: The Third Stage Query - qryQuestion1Percent.
As you can see from the screenshot in figure 4 above, both sub-queries (used in the main stage 3 query)  contain the same field name - CountOfId.  The difference between them is, as we have already seen, qryQuestion1Count counts the number of ID's of records where Question1 was asked (regardless of answer given), and qryQuestion1Yes counts the number of ID's of records where Question1 has been answered YES.  From this information we can create a Calculated Field to work out the percentage of Question1 YES answers.  In the screenshot above, I have divided 100 by the count of the number of times Question1 was asked, and multiplied this by the number of times Question1 was answered YES.  On account of the duplicate field names used in this calculation, I have had to refer to each field in turn with the query name, from which it has derived, preceding - ie [qryQuestion1Count].[CountOfId] and [qryQuestion1Yes].[CountOfId]. I have also given this Calculated Field the Alias of Percentage. (There is more information about Calculated Fields in the post I published last week).


The Result: 
In our sample data the question was asked 20 times, and was answered YES 10 times. As such, if we work out the calculation manually, the figures are:  100/20 * 10. The answer is 50%, which is indeed the result of the stage three query when it is run.


Figure 5: Query result returned by qryQuestion1Percent on sample data. 





  


Friday, 24 February 2012

Using Calculated Fields in Queries

Queries play an important part (in relational database design) in pulling information from different tables together. Consider the Many to Many relationship at the centre of an Order Management System.  Here we have three tables - Orders, Order Details and Products.  When we come to create the record source for the order items section of the invoice report,  it is necessary to get data regarding an order items quantity from the Order Details table, and that of product cost per unit from the Products Table. Then it is the OrderId from the Orders Table which groups all Order Details records into a single order.

Figure 1 (above):  A simplified example of a Many to Many Relationship from an Order Management Database.




Figure 2: A sample Invoice Report taken from one of my Order Management Database designs.
The Order Items Section is located in the middle of the report.  CostPerUnit is
displayed here as Unit Cost and Amount as Total.  This  section contains
details of each individual order item which, taken together,
makes up the whole order.
As well as bringing information together in this way, query's play another important role in as much as they are commonly used to calculate and process information from these tables.  This may be done by means of a Calculated Field.  This is basically a new query column, the values of which being derived from some calculation or expression.  As such, we are effectively creating new data for our database.  So why is this important?  Well, consider the Order Management Database Invoice mentioned earlier.  As already stated, the Order Details table contains data pertaining to an order items quantity, and the Products table to an items Cost Per Unit.  There is nowhere in a well designed ('normalised') database where we store the total of quantity*cost per unit. It is considered good practice to calculate this information (so we avoid storing unnecessary data which uses extra memory, and makes the database less efficient and user friendly).  The Calculated Field used in a query is a common way of doing this.

Creating a calculated field is really quite easy.  Rather selecting a table field on the top row of the query design grid, we manually type the name (or alias), followed by a colon, then followed by the calculation or expression.  So if we wanted to create a calculated field to process quantity * cost per unit, we would use this syntax:

Amount: [Quantity]*[CostPerUnit]

I should point out that the alias used here - ie Amount - is entirely arbitrary text chosen by the database developer. It is, however, good practice to make it something meaningful in order to make it easier for the developer to understand at a later point in time.

Let look at how this fits in with the rest of the query:

Figure 3: Query with Calculated Field on the right.

As you can see the query contains a mixture of fields from all three tables involved in the many to many relationship.  The Calculated Field has been added to the column on the right.  You may have noticed that the calculated field looks slightly more complicated than the syntax I wrote earlier in the post.  This is because we are also specifying the table name as well as the field name.  As such:

[tblOrderDetails].[Quantity] 

... refers to the Quantity field of tblOrderDetails, and:

 [tblProducts].[CostPerUnit] 

... refers to the CostPerUnit field of tblProducts.

It is essential to write the full reference in this way if any tables or sub-queries involved in the query contain duplicated field names.  Although this is not the case in this example, it does no harm to get into the habit of writing the full syntax anyway.

And this is the output from the query showing the result of the calculated field on the right:

Figure 4: The Query Output.
Since this query has an ID field corresponding to tblOrder.OrderId, it can now be used as the record source for the Order Items section of an Invoice Report or Orders Subform.  To do this the LINK MASTER FIELDS property of the subform is set to ID and the LINK CHILD FIELDS is set to OrderId.  There is then the matter of creating the record source for the main section of the invoice report or orders form (this also involves creating Calculated Fields to get the Order Totals).  Unfortunately this goes beyond the scope of this particular post.

Friday, 9 December 2011

An Expression to Obtain a Full Name from Three Separate Fields

There is a very good reason why we separate name fields when we create an Access Table.  If we store the Title, First Name and Surname in a single field, we limit the capability of our application to interrogate this part of our data.  For example, if we store a full name in a single field, we would not be able to sort a list of names into alphabetical order (because Access cannot differentiate between a title, first name and surname). Moreover, we would also not be able to extract just the title and surname in order to address a letter. This is because the individuals first name comes in between, thereby preventing us from using the name in a mail merge operation.

As such, there is a general database design convention of storing all the elements of a name in separate fields. This gives us maximum flexibility and control when we come to process this data.  What's more, we still have the capability of joining (concatenating) the full name back together again through use of an expression in a query's calculated field, for example.  So how is this done exactly?

Imagine we have a list of names stored in a table.  We have separate fields for Title, FirstName, and Surname.

Figure 1:  An Access Table containing a list of names
stored in separate fields.
The expression we are going to use to concatenate the name into a single whole is as follows:

        [Title] &" " & [FirstName] & " " & [Surname]


The three fields are separated by two ampersands (&), and a string containing a single space in between.  The ampersand concatenates the various elements of the name, and the empty space between the quotation marks simply creates a space between the three fields when joined together.  So we have a total of five separate elements concatenated by the ampersand operator - ie Title space FirstNamespace Surname.

As mentioned above, this expression can be used in a calculated field of an Access Query.  To do so, just enter an alias (ie the name we are going to call the calculated field) with a colon in front of the expression.  For example:


        FullName: [Title] & " " & [FirstName] & " " & [Surname]


This is entered into the FIELD row of the query design grid as follows:

Figure 2: Expression to concatenate a full name
entered into the Query Design Grid.
Then, when we run the query we get a list of full names, each one appearing as a single field:

Figure 3: Concatenated Names appearing in
Query Result.