Archive
MDX #28– Time functions will not work with subquery
I had a post last month, MDX #25 – Slicer or Sub-Cube?, verifying that the query context did not change with the subquery.
This can pose problems if you want to use the Time-series functions in MDX. Two functions come into mind, YTD() and PeriodsToDate().
Both functions need to know which member on the Date dimension we are operating on.
PeriodsToDate and YTD function will not work with a subquery
Let’s try this query, which tests the PeriodsToDate function with a subquery. It will not work with the created member Reseller Sales YTD. Created member [x] is here to show what the current member on the Datedimension is.
–PeriodsToDate will not work with the subquery
WITH
MEMBER [Measures].[Reseller Sales YTD] AS
Aggregate(
PeriodsToDate(
[Date].[Calendar].[Calendar Year],
[Date].[Calendar].CurrentMember
),
[Measures].[Reseller Sales Amount]
)
MEMBER [Measures].[x] AS
membertostr([Date].[Calendar].CurrentMember)
SELECT
{ [Measures].[Reseller Sales Amount],
[Measures].[Reseller Sales YTD],
[Measures].[x] } ON 0
FROM
( select
{ [Date].[Calendar].[Month].[March 2008]:
[Date].[Calendar].[Month].[April 2008]
} on 0
from [Adventure Works]
)
PeriodsToDate and YTD function will work with modification
The fault is not the subquery, rather it’s because we did not provide the query context.
We can modify the above query to provide the context by adding a member expression [Date].[Calendar].[Month].MEMBERS to the ROWS axis.
– PeriodsToDate works with [Date].[Calendar].[Month].MEMBERS on the ROWS axis
WITH
MEMBER [Measures].[Reseller Sales YTD] AS
Aggregate(
PeriodsToDate(
[Date].[Calendar].[Calendar Year],
[Date].[Calendar].CurrentMember
),
[Measures].[Reseller Sales Amount]
)
MEMBER [Measures].[x] AS
membertostr([Date].[Calendar].CurrentMember)
SELECT
{ [Measures].[Reseller Sales Amount],
[Measures].[Reseller Sales YTD],
[Measures].[x] } ON 0,
{ [Date].[Calendar].[Month].MEMBERS } ON 1
FROM
( select
{ [Date].[Month Name].[Month Name].[March 2008]:
[Date].[Month Name].[Month Name].[April 2008]
} on 0
from [Adventure Works]
)
Our result is showing both months on the ROWS axis, and with the correct YTD reseller sales for each month.
MDX #27–Who are Ascendants?
Many function names in MDX are very family-friendly. Children, parent, ancestors, descendants, ascendants, are all terms we often use in telling family stories. Except ascendants. I never really understand what this word means until I ran into this word in MDX.
Ancestors VS. Ascendants
I checked the www.thefreedictionary.com, and here are the definitions:
ascendant – someone from whom you are descended (but usually more remote than a grandparent).
ancestor – a person from whom another is directly descended, esp. someone more distant than a grandparent.
These two definitions are very close. Nothing really stands out for me to tell the difference.
Fortunately, in MDX, the difference between them is very easy to tell and also easy to demonstrate.
Ascendants() – returns all of the ancestors of a member in a set
That is, from the member itself up to the top of the member’s hierarchy. This is easy to see in this query.
SELECT
Measures.[Reseller Order Count] ON COLUMNS,
Ascendants(
[Sales Territory].[Sales Territory].[Northwest]
) ON ROWS
FROM
[Adventure Works]
In the [Sales Territory] dimension, Northwest is part of United States, which is part of North America. [All Sales Territories] is the root member and sits on top of all other members.
With one Ascendants() function, we get all the ancestors, including the member [Northwest]itself and the most top level member, the [All Sales Territories].
Plural S returns set; singular returns member
Notice that the above query returns a set back. This set contains all the members in [Northwest]‘s family tree.
With the same logic, we can make a good guess that the function Ancestors() will return a set too, while the function Ancestor() will only return a member.
This logical thinking actually applies to other functions as well. Children() will return a set, while FirstChild() returns a member.
Ancestor() VS. Ancestors()
The word Ancestors is a plural, so you might think that it will return all the upward members in the hierarchy. It is true, sort of. It can return all the upward members in the hierarchy, but only one at a time.
The following query can demonstrate this:
SELECT
Measures.[Reseller Order Count] ON COLUMNS,
Ancestors(
[Sales Territory].[Sales Territory].[Northwest],
1
) ON ROWS
FROM
[Adventure Works]
We can only see United States because Ancestors() function can only return one specific ancestor at the specified level.
If you change the level number to 2, you should see only North America.
Again, the plural only means that it is returning a set; a one-member only set in this case.
Ancestor() and Ancestors() – they return one specific ascendant member at a time, at a specific level
If you change the above query to use the singular Ancestor() function, you will get the same result, only that it is a member, not a set.
MDX #26 – SSN can only be referenced as a member property in MDX
I’ve blogged about the Properties() function before.
MDX #11 – How to get number of cars each customer owned using Properties() function?
MDX #12 – Do not forget the TYPED flag in the Properties() function
The Properties() function is used to explore the attribute relationships in a dimension.
If an attribute is also enabled as an attribute hierarchy, then of cause, we can also reference it as Attribute Hierarchy in MDX.
AttributeHierarchyEnabled = False
In the Employee dimension in the Adventure Works cube, employees’ SSN numbers is not enabled as an Attribute Hierarchy. Its property AttributeHierarchyEnabled is set to False.
Here are what will happen in SSAS, in MDX, and in client applications when AttributeHierarchyEnabled is set to False.
In SSAS: A disabled attribute hierarchy cannot be be used as a level in a user defined hierarchy
However, it can still be used to order the members of another attribute.
In MDX: 1) A disabled attribute hierarchy cannot be referenced as Attribute Hierarchy
This query referenced the SSN as attribute hierarchy, and no result is returned.
In MDX: 2) A disabled attribute hierarchy can only be referenced as a member property
This query referenced SSN as the property of current member of the employee. It shows the correct SSN for each employee.
In Client App: 1) Will not be used for browsing in client application
In Client App: 2) Will only be visible to client applications as a member property
In SSAS: still makes sense to order and optimize the disabled attribute
Because the members of these attribute hierarchies are still used for dimensioning fact data, ordering and optimizing the members of these attribute hierarchies can still improve performance.
Therefore, the following properties of these attributes can still be enabled.
- AttributeHierarchyVisible = True: (the attribute will be visible to the attribute hierarchies list in client applications)
- IsAggregatable = True: (it will still have the All top level member)
- AttributeHierarchyOptimizedState = FullyOptimized
- AttributeHierarchyOrdered = True
- GroupingBehavior = EncourageGrouping
MDX #25 – Slicer or Sub-Cube?
Slicer, Axes and Calculations Can All Filter Data in MDX
Every developer with SQL background knows how the WHERE clause works. You put some condition in the WHERE clause, such as TheDate = Yesterday (pseudo code), and it will only return data for yesterday.
Not so straightforward in MDX. We should expect more complex behaviors in MDX because of the multi-dimensional nature of the cubes.
But how different it can be.
There are many topics to explore, including why we prefer to call the WHERE clause slicer in MDX, how tuple is constructed, when default member is used, how slicer (WHERE clause if you insist) can be used to limit data, and how axes can be used, how the MDX engine decides when to use slicer, axes, and calculation formula to limit the data, etc..
Sub-Select Can Filter Data Too
In this blog, I’ll explore just one aspect of how we can use both the slicer and a sub-cube to limit the data, and where they are the same, and where they might give you different results.
Both Slicer and Sub-Select Produces the Same Result
Run these two queries (separating by GO), you get the same Internet Sales Amount from both queries, for July 1, 2008.
So our conclusion so far is that slicer and sub-select should give same results.
If you make such statement to some MDX experts, they will tell you that results from using a member in the slicer can be different from using the same member in a sub-select.
So what can be different?
While Sub-Select does not change the query context, the slicer does
Run this query pair. We are using the currentmember function to show what date we are currently at. In MDX’s term, we are checking the current member of the Date hierarchy in the query context (quite mouthful).
It turned out that the “current date” from the two queries is not the same. The first one with slicer says we are currently at just one day, July 1, 2008. The second one with the sub-select says we are actually currently at All Periods (all days in the entire Date dimension; the root member of the Date dimension).
Now we know that the query context (where we currently at) is different, depending on where we are putting our member, in the slicer or in the sub-select. The sub-select does nothing to change the query context, while the slicer changes the context according to the tuple (in the above example we only have one member in the tuple) we put in the slicer.
You might already figure out that we need to be careful now about the context.
When Sub-Select and Slicer Might Give Different Results
Here is an example where your filtered result might not be exactly what you want if you are using the sub-select.
When using the currentmember as a filter, You should expect the sub-select will give you the Internet Sales Amount for the entire date range in your Date dimension, not just from one day, July 1, 2008.
Confusing? Yes, it is. But no complaining?
For my other MDX blogs, visit http://bisherryli.wordpress.com/category/mdx-challenges/
MDX #24 – Leave out a dimension member using Except() Function
Functions such as NonEmpty(), Filter() and Exists() must use a numeric expression
Many of the MDX functions, such as NonEmpty(), Filter() and Exists(), seem to be easy enough to use, but they present quite a challenge for people with T-SQL background. The common theme among them is that they all claim to return dimension members, such as a list (or a set of tuples in MDX lingo) of countries, but they must use a fact measure (a numeric expression in another word) as the condition, such as the sales amount must be more than $10K. We are just not used to using fact measures.
Check out this post for an example of the NonEmpty() function.
MDX #1 – Why should we always provide set_expression2 in NonEmpty() function
Use Except() function to leave out a dimension member
In many situations, we just simply want to leave out a dimension member, such as one of the Product Categories, no matter what the sales amount is.
Run this simple query, and you will get the Order Quantity for all the Product Categories for each month.
SELECT [Date].[Month of Year].Children ON COLUMNS,
[Product].[Product Categories].[All].Children ON ROWS
FROM [Adventure Works]
WHERE ([Measures].[Order Quantity])
What if I simply just want to leave out the Components in my query? Components are not the “final” products. Clothing, Bikes and Accessories are. So I have quite a good reason to leave it out of the query.
Now run this query with the help of the Except() function. Components is no longer in the result.
SELECT [Date].[Month of Year].Children ON COLUMNS,
Except(
[Product].[Product Categories].[All].Children ,
[Product].[Product Categories].[Components]
) ON ROWS
FROM [Adventure Works]
WHERE ([Measures].[Order Quantity])
The two sets must be from the same hierarchy and at the same level
The basic syntax of the Except() function is:
Except(Set_Expression1, Set_Expression2)
It returns SET 1, minus the SET 2. Please make sure that both sets are:
1) from the same hierarchy, and
2) at the same level
If you want to leave out Brakes, the following query will not work, because the two sets in the Except() function is not at the same level.
(The following would not work.)
SELECT [Date].[Month of Year].Children ON COLUMNS,
Except(
[Product].[Product Categories].[All].Children ,
[Product].[Product Categories].[Subcategory].[Brakes]
) ON ROWS
FROM [Adventure Works]
WHERE ([Measures].[Order Quantity])
This query should work because both sets are at the Subcategory level.
SELECT [Date].[Month of Year].Children ON COLUMNS,
Except(
[Product].[Product Categories].[Subcategory].members ,
[Product].[Product Categories].[Subcategory].[Brakes]
) ON ROWS
FROM [Adventure Works]
WHERE ([Measures].[Order Quantity])
The Except() function does not involve any numeric expressions
So hopefully, it’s easier for people to understand with T-SQL background.
MDX #23 – “Hello World!” Lesson in MDX
Almost every tool we learned has some sort of “Hello World!” tutorial lesson.
So here comes the “Hello World!” lesson in MDX.
Putting “Hello World!” directly on the X axis doesn’t work
You would think this query would work, by putting “Hello World!” directly on the X axis.
Unfortunately it doesn’t.
[Measures].[x] is a perfectly legitimate and unique name for a tuple
The X axis expects a tuple/set expression. Or put it simply that it expects something like this:
[A Dimension].[A Hierarchy].[A Member]
To save some typing, I am going straight to the one special dimension in any SSAS cube, that is, the [Measures] dimension. This special dimension also has only one hierarchy, which happens to have the same name, [Measures]. This save me time to type it twice.
To further save myself typing, I am going to use one letter x for the member.
[Measures].[x] is a perfectly legitimate and unique name for a tuple. It’s a very rare chance that any cube designer would name any of the measures [x].
Do not use any existing member
If you try to use a measure that has already existed in the cube to represent “Hello World!”, you will get an error.
MDX for arithmetic calculation
If you are like me who writes so much SQL code every day that I never use the Calculator software on my PC for arithmetic calculation. SELECT 9999999 + 1 is a perfectly fine query and will return 10000000.
A bit more typing than in TSQL, but this works in MDX.
Even better, if you want to format it, go ahead use the format_string.
We can create a member on any dimension
[Measures] is not the only choice for a perfect “Hello World!” lesson. As a matter of fact, you can use any dimension in the cube, but it requires a bit more typing.
Let’s pick the Date dimension.
But don’t forget to pick a hierarchy
It didn’t work, because unlike the special [Measures] dimension, the [Date] has many hierarchies, including a dozen or so attribute hierarchies and a few user-defined hierarchies.
Let’s pick the user-defined hierarchy [Calendar].
If you want to summarize the “Hello World!” tutorial lesson in MDX, go right ahead!
For my other MDX blogs, visit http://bisherryli.wordpress.com/.
MDX #22 – What to use for explicit sorting on dimensions
I often think that I am lucky to work in a profession that I enjoy very much. It makes everything I do a fun thing to do. Majority of the fun comes from the fact that I have my hands in almost every aspect of building a data warehouse. From business analysis, to ETL, to star schema/cube design and development to reporting, and to customer service. My blogs most of the time reflect the different roles I play in my profession.
In this example, I am going from a reporting requirement to the MDX queries that allow me to meet the requirement.
Note: all the examples are demonstrated in the sample Adventure Works database (star schema or the multi-dimensional cube) from Microsoft.
Very often we need to produce reports that compare monthly data. Providing two Month reporting filters will be the first step.
My last two blogs dealt with some aspects of this kind of reporting requirements.
Reporting Requirements
In this example I need to accomplish the followings:
1) Get all the months that have data
2) Sort the months in descending order
3) Default the first filter to the most recent month that has data
4) Default the second filter to the previous month that has data
In this blog, I’ll try to accomplish 1) and 2).
To get all the Months, let’s put the Date dimension on Y axis
To make sure that the months I get have Sales data, let’s put the [Sales Amount] measure on X axis.
Note: If you’ve been using the Query editor in SSRS, you might have noticed that the only “Dimension” that is allowed on the X axis is the [Measures] dimension. So you cannot place the Date dimension on the X axis. This can create a bit confusion first, but it really doesn’t prevent me from writing MDX queries to my hearts desire.
Use the key word non empty to remove the months that do not have data
Note: Both the keyword NON EMPTY and the NONEMPTY(set_expression1, set_expression2) function will evaluate the cells for "empty" condition. The first only works on the axis. The latter can be used in any parts of the MDX queries and covers much more ground because it can take any two sets as parameters (although the second one is optional). Both should work in this example.
So far, I have met the first requirement 1) Get all the months that have data.
Explicitly sort the months in descending order
In order to meet the second requirement, I need to explicitly sort the months in descending order. I know I need some values I can find in the cube about the month for the sorting. Before I can figure out that, I need some help on all the things I can find about the month.
They are the member properties that I need to find out.
The Member Value is the one I can use to sort the months explicitly
All looks good, except the first member_key which returned a NULL value. It turned out that I cannot really use the member_key function here because the [Month Name] attribute has a composite key. Open the Date Dimension Editor, we can see the KeyColumns has a collection of Year and Month Number. The property Key0 and Key1 will give us the correct key values.
But the keys cannot provide help to sort the months. The Member Value is the one I can use to sort the months explicitly.
Use attribute [Month Name] rather than user hierarchy [Calendar]
Note: Since I am using the attribute [Month Name] for sorting, I’ll need to use it also on the Y axis. Otherwise using the [Calendar] user hierarchy will not work.
It took us a few tries along the way, but the final query is simple enough.
To summarize, here are what we’ve learned
1) Many member properties, including ID, key, caption, name, member value, can be potential candidates for explicit sorting.
2) I used to use member key for explicit sorting on dimensions. But it will not work on the attributes that have composite keys. To get to the composite keys, we need to use the Key0, Key1, etc.
3) Member Value can be a good choice if we need to sort by numerics. Sometimes, Member Name can also be a good choice if we need to sort alphabetically.
MDX #21 – Days in a month
ParallelPeriod() function is very useful when we need to present data side-by-side for two parallel periods
ParallelPeriod() function is often used when we need to present data side-by-side for two parallel periods, such as in the example below where daily data needs to be presented side-by-side for each day in October and November.
ParallelPeriod() is unaware which month is longer
If you have used the ParallelPeriod() function for this purpose you will surely appreciate its power. You will also experience some serious frustration too. Because ParallelPeriod() is unaware which month is longer, and it’s extremely frustrating to try to line up the days for the days when there is no data. Ideally, the query needs to be dynamic and be aware which month is longer.
Query to determine how many days in each month
So the first thing I’ll need to do is to query the cube to determine how many days in each month (days that have data).
This query is very similar to the query in MDX #20 – Last Data Date, with the Existing key word.
Without the Existing keyword, it will actually give me the number of days in the entire Date dimension. The Existing keyword forces the calculated member to be in the context of the query, which is for the month of July 2004 only.
Since I do not care each day in the month, the COUNT() function allows me to just see the number of days in the month.
If you need to query both months of July 2004 and August 2004, wrap both members up in { } (so they will be a set), and place it in the WHERE clause.
MDX #20 – Last Data Date
Need to check the last time when our ETL processes ran successfully
Data warehouse developers routinely query the last data date (the last date when we have fact data) to check the last time when our ETL processes ran successfully.
It’s not hard to do it in TSQL. It just requirements a bit of typing. In this example, I am showing the last data date in its readable format and the Sales Amount on that day.
A lot easier to do this in MDX
It’s actually a lot easier to do this in MDX. With the help of the TAIL() function and the NONEMPTY() function, the last data date is easy to query without any sub-queries or calculations.
Always verify
To make sure we got it right, run this simple query to verify.
Note: There is no need to order by the Date dimension. It’s already ordered in the SSAS.
MDX #19 – There is more to do to concatenate row values into column
Continuing from MDX #18 – Easy to concatenate row values into column in MDX.
I have not accomplished the goal of concatenating the set of top resellers in each country.
Count() function can count the number of tuples in a set
In this query, I am adding a count to show how many top sellers in the set for each country. The number [Top n] is 21, which is obviously wrong. It should show 3 for each country. 21 in some sense is correct too, since there are indeed 21 tuples in the set (3 resellers x 7 countries including an ALL country).
Adding a Existing key word to forces set [Top n Resellers per Country] to be evaluated within the current query context
Autoexists rule, which prevents tuples of attributes from the same overall dimension (such as attributes zip code and city in customer dimension) from existing if the dimension did not have at least one combination of them (meaning valid zip codes in a city.
Here [Country] and [Reseller] are two attributes that come from two different dimensions [Sales Territory] and [Reseller], the Autoexists rule doesn’t seem to apply.
Existing is a useful MDX operator that evaluates set_expression within the current member context (current member context sounds a little foreign, but it really means the sub-space in our Y axis which is the [Country] in dimension [Sales Territory] in the above query). If the existence of members in set [Top n Resellers per Country] depends on the current member context (in all dimensions other than just those of the set [Top n Resellers per Country]), the returned set will reflect the context. The long verbiage means that the returned set will be only in the context of each country, which is 3.
See EXISTING Keyword (MDX) for more details.
Show all 3 tuples in each country in a big string
SetToStr() function takes a set and converts it into a big string with each member being fully qualified.
The big string is not really useful to our business users.
Get a list of the reseller name only
We can use the second variation of the Generate() function to strip off the qualifiers and only get the name value for each member.
Although the Item() function works with the above query, I am still a little fuzzy on how the nested Item() function works.
This is how the set of 3 tuples looks like for France. Each tuple has two fully qualified members
{ ([Sales Territory].[Sales Territory Country].&[France],[Reseller].[Reseller].&[638]),
([Sales Territory].[Sales Territory Country].&[France],[Reseller].[Reseller].&[175]),
([Sales Territory].[Sales Territory Country].&[France],[Reseller].[Reseller].&[85])
}
I guess the first Item(0) returns each tuple, and the second item() function returns the first member France when index is 0 and returns the second member when the index is 1. Item() function works on zero-based index.
With the help of Count(), Existing, Generate() and Item() function…..
Finally we concatenated the three top resellers in each country.



