Saturday, April 17, 2010

Filter multiple ID records by Max Date in Excel using MS Query

Let’s say you have a list of clients, an HVAC contractor has a customer list of service dates of furnace repair, or dates that you have serviced your car, or a Dentist’s clients list of teeth cleanings, or the example we’ll use, orders purchased by customers. We’ll focus on the orders by customer where we need to determine their last order date. But basically, any records that may have multiple entries for the same key field where you need to filter the maximum date for that key. In the case of the HVAC contractor who serviced many customers - many times over the years, they need to be able to determine the last date the customer was serviced. Of course this information could be sorted in decending order and quickly filtered by client, but what if you need to look-up all of your clients that have not had service within the past year? In order to do this the records need to be first filtered down to the maximum date for each key field, in this case, the customer, THEN those records filtered greater than one year. This then would be an accurate list of customers who have not had service within the past year. The same for determining the last order date of a customer’s many orders. When was the last time customers purchased something, and show me those that haven’t purchased anything in the past year!

Well jeez, that should be easy enough if your data is in Excel right? Wrong! Give it a go, try that pivot table (which there is a way to do it in a pivot, but still, it isn’t the best way), or try to filter or to group. Go ahead. Good luck. Let me know if you find an easier way than what I have below.

Many Excel users have no idea of an import feature called MS Query. If you utlize ODBC at all then you probably have seen this utility. And even if you have used it, did you know that you can point MS Query right back to the exact same workbook you are working on to filter data?

That is what I will show you here. We are going to assume that you already have the data in Excel. And if you don’t, you can still use MS Query to connect to many different sources. Here though, I’ll focus and demonstrate accessing the Excel data within the same workbook.

If you want to recreate this, I am using the Northwind database Orders table.

Reminder, when using your own data, this needs to be a table with column headings that should be in the first row. So first let me show you how there are multiple records for the same customer within this Orders table.

image

I’ll filter by customer ID 4 by clicking the filtered drop down of Customer ID in this example. This is just to show you the multiple records for this customer.

image

Notice I have five records for this customer ID. Now imagine  trying to determine their last order dates of hundreds of customers with hundreds of transactions, and then trying to filter those greater than one year. Normally it would be a nightmare in Excel. But I’ll show you an easy way to do this.

image

To get started, first SAVE YOUR SPREADSHEET. Then create a new worksheet, in this case I have one called MSQueryResults, and click on Import, Data, From Microsoft Query.

image

Select Excel Files, Un-Check the box as shown. Click ok.

image

Locate your spreadsheet, the one you are in right now that you just saved, and select it. Click OK.

image

Select the sheet name of your data. IF you do not see any sheet names, click on Options and be sure the System Tables box is checked. Click Add.

image

Notice how it adds the table to the query screen. Click Close.

image

Here I’ve added the Customer ID (unique ID), and the Order Date. I did this by double clicking on them.

 image

Now, where Order Date is listed in the result pane, double click on Order Date and select Total of Max. This is assuming the Order Date is truly a date value. If not then the Order Date fields needs to be wrapped with DATEVALUE(Order Date). However, here the date is a true date so that is not necessary.

image

Now click ! to refresh the screen and your customers are now filtered by Max date!

image

At this point there are a lot of options. A query filter could be created, the date could be formatted, a parameter query could be created to prompt for date, the same table could be added and linked if more fields are required, etc. To keep this simple, we’ll return this to Excel and then use Excel to format and filter.

Click the Return to Excel button.

image

Click OK and return the data to Excel.

image

image

At this point the Date field can be formatted and a filter created. As you can see, because this is a date field, there are many options now to filter further the returned result set from MS Query.

image

image

So there you have it. I took a large volume of data with multiple records with the same customer ID, used MS Query to generate a new filtered result-set by Max Date, returned it to the same Excel spreadsheet, and filtered those results. The great thing is the result set can be refreshed any time it is needed.

The next time you have a filter challenge within Excel, remember the MS Query option of looking at your same spreadsheet to return a filtered result set.

4 comments:

  1. Wow Rick. I need to give this a try! I forgot you were working on this... sorry!

    ReplyDelete
  2. Rick

    I tried this and it didn't work.

    I am very familiar with MS Query and use it quite often to pull data from multiple sources. I have never used to it to pull data from an existing excel worksheet.

    I have a dataset of multiple ids with multiple effective dates and need to pull the most current date. I tried your method and it still pulled in all the records.

    below is a sample of my records

    Eff Date Rate ID Name
    04/01/05 200 51520 Abramson
    01/01/06 210 51520 Abramson
    01/01/07 280 51520 Abramson
    04/01/05 100 60471 Adams
    01/01/06 105 60471 Adams
    01/01/07 125 60471 Adams
    08/01/07 110 60470 Agnew
    10/01/07 125 60470 Agnew
    01/01/08 130 60470 Agnew

    I'm trying to pull in the most recent rate for each id using the max effective date.

    when I used your method it pulled in all the records.

    what am I doing wrong?

    Shirley Dumas
    Reporting Mgr
    shirley.dumas@troutmansanders.com

    ReplyDelete
  3. Hi Shirley,

    Thank you for reading my blog! I believe your problem may be with the format of your date field. You may need to wrap the field with this function: DATEVALUE(your field). This will ensure it is recognized as a date field so then the max function will work. The entire formula is. MAX(DATEVALUE(your field)). You would place that as a column in MSQUERY. Then it should return your max date of that field from your results. Let me know if that helps or if you need further assistance!

    Rick Black

    ReplyDelete
  4. awesome!! thanks!!!!!!! :)

    ReplyDelete