She Learns Data, She Leads
Introduction
As a data analyst, you are on a journey. Think about all the data that is being generated each day and that is available in an organization, from transactional data in a traditional database, telemetry data from services that you use, to signals that you get from different areas like social media.
For example, today's retail businesses collect and store massive amounts of data that track the items you browsed and purchased, the pages you've visited on their site, the aisles you purchase products from, your spending habits, and much more.
With data and information as the most strategic asset of a business, the underlying challenge that organizations have today is understanding and using their data to positively effect change within the business. Businesses continue to struggle to use their data in a meaningful and productive way, which impacts their ability to act.
A retail business should be able to use their vast amounts of data and information in such a way that impacts the business, including:
- Tracking inventory
- Identifying purchase habits
- Detecting user trends and patterns
- Recommending purchases
- Determining price optimizations
- Identifying and stopping fraud
Additionally, you might be looking for daily or monthly sales patterns. Common segments to examine include day-over-day, week-over-week, and month-over-month, so that you can compare, for instance, how sales this week are performing against the same week last year.
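A week-over-week comparison like the one described above can be sketched in a few lines. This is a minimal illustration only; the sales figures are made up.

```python
# Sketch: comparing weekly sales against the same week last year.
# All figures below are hypothetical, purely for illustration.
this_year = {"week_23": 18_400, "week_24": 19_950}
last_year = {"week_23": 17_000, "week_24": 21_000}

def yoy_change(week: str) -> float:
    """Percentage change for a week versus the same week last year."""
    return (this_year[week] - last_year[week]) / last_year[week] * 100

for week in this_year:
    print(f"{week}: {yoy_change(week):+.1f}% vs. last year")
```

The same arithmetic underlies the day-over-day and month-over-month segments; only the grouping period changes.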
The key to unlocking this data is being able to tell a story with it. In today's highly competitive and fast-paced business world, crafting reports that tell that story is what helps business leaders take action on the data. Business decision makers depend on an accurate story to drive better business decisions. The faster a business can make precise decisions, the more competitive it will be and the greater its advantage. Without the story, it is difficult to understand what the data is trying to tell you.
However, having data alone is not enough. You need to be able to act on the data to effect change within the business. That action could involve reallocating resources within the business to accommodate a need, or it could be identifying a failing campaign and knowing when to change course. These situations are where telling a story with your data is important.
Ultimately, this challenge reaches the bottom line. You need to be able to look at the data and facilitate trusted business decisions, and then to look at metrics and clearly understand the meaning behind those metrics.
Data analysis exists to help overcome these challenges and pain points, ultimately assisting businesses in finding insights and uncovering hidden value in troves of data through storytelling. As you read on, you will learn how to use and apply analytical skills to go beyond a single report and help impact and influence your organization by telling stories with data and driving that data culture.
Before data can be used to tell a story, it must be run through a process that makes it usable in the story. Data analysis is the process of identifying, cleaning, transforming, and modeling data to discover meaningful and useful information. The data is then crafted into a story through reports for analysis to support the critical decision-making process.
As the world becomes more data-driven, storytelling through data analysis is becoming a vital component and aspect of large and small businesses. It is the reason that organizations continue to hire data analysts.
Data-driven businesses make decisions based on the story that their data tells, yet most businesses still don't use their data to its full potential. Data analysis is, and should be, a critical function in every organization, helping to evaluate customer sentiment, perform market and product research, and identify trends and other insights that affect the business.
While the process of data analysis focuses on the tasks of cleaning, modeling, and visualizing data, the importance of data analysis to business cannot be overstated. The core components of analytics are divided into the following categories:
- Descriptive
- Diagnostic
- Predictive
- Prescriptive
- Cognitive
Descriptive analytics
Descriptive analytics help answer questions about what has happened based on historical data. Descriptive analytics techniques summarize large semantic models to describe outcomes to stakeholders.
By developing key performance indicators (KPIs), these strategies can help track the success or failure of key objectives. Metrics such as return on investment (ROI) are used in many industries, and specialized metrics are developed to track performance in specific industries.
An example of descriptive analytics is generating reports to provide a view of an organization's sales and financial data.
Diagnostic analytics
Diagnostic analytics help answer questions about why events happened. Diagnostic analytics techniques supplement basic descriptive analytics, and they use the findings from descriptive analytics to discover the cause of these events. Then, performance indicators are further investigated to discover why these events improved or became worse. Generally, this process occurs in three steps:
- Identify anomalies in the data. These anomalies might be unexpected changes in a metric or a particular market.
- Collect data that's related to these anomalies.
- Use statistical techniques to discover relationships and trends that explain these anomalies.
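Step 1 of this process can be sketched with a simple statistical rule of thumb: flag values that fall more than two standard deviations from the mean. This is one common technique among many, and the daily sales figures below are made up for illustration.

```python
from statistics import mean, stdev

# Sketch of "identify anomalies in the data": flag metric values more
# than two standard deviations from the mean. Sample numbers are made up.
daily_sales = [120, 115, 130, 125, 118, 122, 410, 119]

mu, sigma = mean(daily_sales), stdev(daily_sales)
anomalies = [x for x in daily_sales if abs(x - mu) > 2 * sigma]
print(anomalies)
```

The flagged value (410) is the kind of unexpected change that the subsequent steps would collect related data for and explain.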
Predictive analytics
Predictive analytics help answer questions about what will happen in the future. Predictive analytics techniques use historical data to identify trends and determine if they're likely to recur. Predictive analytical tools provide valuable insight into what might happen in the future. Techniques include a variety of statistical and machine learning techniques such as neural networks, decision trees, and regression.
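One of the techniques named above, regression, can be sketched by hand. This is a minimal least-squares linear fit on made-up monthly sales figures, not a production forecasting method.

```python
# Minimal sketch of least-squares linear regression, one of the
# predictive techniques mentioned in the text. Data is hypothetical.
months = [1, 2, 3, 4, 5, 6]
sales = [100, 110, 119, 131, 140, 151]

n = len(months)
mean_x = sum(months) / n
mean_y = sum(sales) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(months, sales))
         / sum((x - mean_x) ** 2 for x in months))
intercept = mean_y - slope * mean_x

# Extend the fitted trend line to estimate month 7.
forecast = slope * 7 + intercept
print(round(forecast, 1))
```

The fitted slope answers "how fast are sales growing per month?", and extending the line answers "what might next month look like if the trend recurs?"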
Prescriptive analytics
Prescriptive analytics help answer questions about which actions should be taken to achieve a goal or target. By using insights from prescriptive analytics, organizations can make data-driven decisions. This technique allows businesses to make informed decisions in the face of uncertainty. Prescriptive analytics techniques rely on machine learning as one of the strategies to find patterns in large semantic models. By analyzing past decisions and events, organizations can estimate the likelihood of different outcomes.
Cognitive analytics
Cognitive analytics attempt to draw inferences from existing data and patterns, derive conclusions based on existing knowledge bases, and then add those findings back into the knowledge base for future inferences, creating a self-learning feedback loop. Cognitive analytics help you learn what might happen if circumstances change and determine how you might handle those situations.
As the amount of data grows, so does the need for data analysts. A data analyst knows how to organize information and distill it into something relevant and comprehensible; they know how to gather the right data and what to do with it. In other words, they make sense of data overload.
A data analyst fills one of several critical roles in an organization, helping to uncover and make sense of information to keep the company balanced and operating efficiently. Therefore, it's vital that data analysts clearly understand their responsibilities and the tasks they perform on a near-daily basis. Data analysts are essential in helping organizations gain valuable insights into the expanse of data that they have, and they work closely with others in the organization to help reveal valuable information.
The following figure shows the five key areas that you'll engage in during the data analysis process.
Prepare
As a data analyst, you'll likely divide most of your time between the prepare and model tasks. Deficient or incorrect data can have a major impact, resulting in invalid reports, a loss of trust, and poor business decisions, which can lead to lost revenue and other negative business outcomes.
Before a report can be created, data must be prepared. Data preparation is the process of profiling, cleaning, and transforming your data to get it ready to model and visualize.
Data preparation is the process of taking raw data and turning it into information that is trusted and understandable. It involves, among other things, ensuring the integrity of the data, correcting wrong or inaccurate data, identifying missing data, converting data from one structure to another or from one type to another, or even a task as simple as making data more readable.
Data preparation also involves understanding how you're going to get and connect to the data and the performance implications of the decisions. When connecting to data, you need to make decisions to ensure that models and reports meet, and perform to, acknowledged requirements and expectations.
Privacy and security assurances are also important. These assurances can include anonymizing data to avoid oversharing or preventing people from seeing personally identifiable information when it isn't needed. Alternatively, helping to ensure privacy and security can involve removing that data completely if it doesn't fit in with the story that you're trying to shape.
Data preparation can often be a lengthy process. Data analysts follow a series of steps and methods to put data into a proper context and state, eliminating poor data quality so that the data can be turned into valuable insights.
Model
When the data is in a proper state, it's ready to be modeled. Data modeling is the process of determining how your tables are related to each other. This process is done by defining and creating relationships between the tables. From that point, you can enhance the model by defining metrics and adding custom calculations to enrich your data.
The model is another critical component that has a direct effect on the performance of your report and overall data analysis. A poorly designed model can have a drastically negative impact on the general accuracy and performance of your report. Conversely, a well-designed model with well-prepared data will ensure a properly efficient and trusted report. This notion is more prevalent when you are working with data at scale.
From a Power BI perspective, if your report is performing slowly, or your refreshes are taking a long time, you will likely need to revisit the data preparation and modeling tasks to optimize your report.
The process of preparing data and modeling data is an iterative process. Data preparation is the first task in data analysis. Understanding and preparing your data before you model it will make the modeling step much easier.
Visualize
The visualization task is where you get to bring your data to life. The ultimate goal of the visualize task is to solve business problems. A well-designed report should tell a compelling story about that data, which will enable business decision makers to quickly gain needed insights. By using appropriate visualizations and interactions, you can provide an effective report that guides the reader through the content quickly and efficiently, therefore allowing the reader to follow a narrative into the data.
The reports that are created during the visualization task help businesses and decision makers understand what that data means so that accurate and vital decisions can be made. Reports drive the overall actions, decisions, and behaviors of an organization that is trusting and relying on the information that is discovered in the data.
The business might communicate that they need all data points on a given report to help them make decisions. As a data analyst, you should take the time to fully understand the problem that the business is trying to solve. Determine whether all their data points are necessary because too much data can make detecting key points difficult. Having a small and concise data story can help find insights quickly.
With the built-in AI capabilities in Power BI, data analysts can build powerful reports, without writing any code, that enable users to get insights and answers and find actionable objectives.
An important aspect of visualizing data is designing and creating reports for accessibility. As you build reports, it is important to think about people who will be accessing and reading the reports. Reports should be designed with accessibility in mind from the outset so that no special modifications are needed in the future.
Many components of your report will help with storytelling. From a color scheme that is complementary and accessible, to fonts and sizing, to picking the right visuals for what is being displayed, they all come together to tell that story.
Analyze
The analyze task is the important step of understanding and interpreting the information that is displayed on the report. In your role as a data analyst, you should understand the analytical capabilities of Power BI and use those capabilities to find insights, identify patterns and trends, predict outcomes, and then communicate those insights in a way that everyone can understand.
Advanced analytics enables businesses and organizations to ultimately drive better decisions throughout the business and create actionable insights and meaningful results. With advanced analytics, organizations can drill into the data to predict future patterns and trends, identify activities and behaviors, and enable businesses to ask the appropriate questions about their data.
Previously, analyzing data was a difficult and intricate process that was typically performed by data engineers or data scientists. Today, Power BI makes data analysis accessible, which simplifies the data analysis process. Users can quickly gain insights into their data by using visuals and metrics directly from their desktop and then publish those insights to dashboards so that others can find needed information.
Manage
Power BI consists of many components, including reports, dashboards, workspaces, semantic models, and more. As a data analyst, you are responsible for the management of these Power BI assets, overseeing the sharing and distribution of items, such as reports and dashboards, and ensuring the security of Power BI assets.
Apps can be a valuable distribution method for your content and allow easier management for large audiences. This feature also allows you to have custom navigation experiences and link to other assets within your organization to complement your reports.
The management of your content helps to foster collaboration between teams and individuals. Sharing and discovery of your content are important so that the right people can get the answers they need. It is also important to ensure that items are secure: the right people should have access, and data should not be exposed beyond the intended stakeholders.
The management of Power BI assets helps reduce the duplication of efforts and helps ensure security of the data.
Telling a story with the data is a journey that usually doesn't start with you. The data must come from somewhere, and getting it into a place that you can use takes effort that is likely out of your scope, especially at enterprise scale.
In the recent past, roles such as business analyst and business intelligence developer were the standard for data processing and understanding. However, the rapid growth in the volume and variety of data has caused these roles to evolve into more specialized sets of skills that modernize and streamline data engineering and analysis.
The following sections highlight these different roles in data and the specific responsibility in the overall spectrum of data discovery and understanding:
- Business analyst
- Data analyst
- Data engineer
- Data scientist
- Database administrator
Business analyst
While some similarities exist between a data analyst and business analyst, the key differentiator between the two roles is what they do with data. A business analyst is closer to the business and is a specialist in interpreting the data that comes from the visualization. Often, the roles of data analyst and business analyst could be the responsibility of a single person.
Data analyst
A data analyst enables businesses to maximize the value of their data assets through visualization and reporting tools such as Microsoft Power BI. Data analysts are responsible for profiling, cleaning, and transforming data. Their responsibilities also include designing and building scalable and effective semantic models, and enabling and implementing the advanced analytics capabilities into reports for analysis. A data analyst works with the pertinent stakeholders to identify appropriate and necessary data and reporting requirements, and then they are tasked with turning raw data into relevant and meaningful insights.
A data analyst is also responsible for the management of Power BI assets, including reports, dashboards, workspaces, and the underlying semantic models that are used in the reports. They are tasked with implementing and configuring proper security procedures, in conjunction with stakeholder requirements, to ensure the safekeeping of all Power BI assets and their data.
Data engineer
Data engineers provision and set up data platform technologies that are on-premises and in the cloud. They manage and secure the flow of structured and unstructured data from multiple sources. The data platforms that they use can include relational databases, nonrelational databases, data streams, and file stores. Data engineers also ensure that data services securely and seamlessly integrate across data platforms.
Primary responsibilities of data engineers include using on-premises and cloud data services and tools to ingest, egress, and transform data from multiple sources. Data engineers collaborate with business stakeholders to identify and meet data requirements, and they design and implement solutions.
Data scientist
Data scientists perform advanced analytics to extract value from data. Their work can vary from descriptive analytics to predictive analytics. Descriptive analytics evaluate data through a process known as exploratory data analysis (EDA). Predictive analytics are used in machine learning to apply modeling techniques that can detect anomalies or patterns. These analytics are important parts of forecast models.
Descriptive and predictive analytics are only partial aspects of data scientists' work. Some data scientists might work in the realm of deep learning, performing iterative experiments to solve a complex data problem by using customized algorithms.
Anecdotal evidence suggests that most of the work in a data science project is spent on data wrangling and feature engineering. Data scientists can speed up the experimentation process when data engineers use their skills to successfully wrangle data.
On the surface, it might seem that a data scientist and data analyst are far apart in the work that they do, but this conjecture is untrue. A data scientist looks at data to determine the questions that need answers and will often devise a hypothesis or an experiment and then turn to the data analyst to assist with the data visualization and reporting.
Database administrator
A database administrator implements and manages the operational aspects of cloud-native and hybrid data platform solutions that are built on Microsoft Azure data services and Microsoft SQL Server. A database administrator is responsible for the overall availability and consistent performance and optimizations of the database solutions. They work with stakeholders to identify and implement the policies, tools, and processes for data backup and recovery plans.
The database administrator is also responsible for managing the overall security of the data, granting and restricting user access and privileges to the data as determined by business needs and requirements.
What is Power BI?
Microsoft Power BI is a tool that helps organize and visualize data from different sources. It lets you connect to data, clean and structure it, create visualizations, and easily share your findings with others.
Data Visualization in Power BI
Visually representing data lets people interpret and analyze it faster. For example, it's easier to find the most profitable year in a bar plot than by scrolling through a spreadsheet!
Why Power BI?
There are several tools for business intelligence, so why Power BI? According to Gartner, Power BI is the leading BI tool. Over 97% of Fortune 500 companies use Power BI, and in total, Power BI has over six million customers. With so many companies around the world using Power BI, learning and mastering this BI tool can help you progress in your data-related career. Are you convinced yet? Let's learn more!
Power BI Desktop
Power BI Desktop is one of the main components of Power BI. It is a data analysis and report creation tool available on your local computer. It includes powerful features, like the Query Editor, and it's free to download. In this course, we'll use Power BI Desktop. This course will use the free version of Power BI, but there is a paid version called Power BI Pro available. Power BI Pro allows you to publish work on the Power BI cloud platform, called Power BI Service. It also allows you to collaborate with other Power BI users. Even though we're using the free version, everything you learn in this course will be applicable to the paid version.
Power BI Service
The second main component is Power BI service, the cloud version of Power BI. It can only be accessed with a Power BI Pro account. Reports can be edited in Power BI service, but not to the same extent as in Power BI Desktop. Instead, the main purpose of Power BI service is to share and distribute reports. You will commonly use Power BI Desktop to create a report and Power BI service to share it.
Power BI Interface - Three views
Let's quickly review some key areas of the Power BI interface. When a new Power BI Desktop window opens, the icons on the left side give access to the three views: Report, Data, and Model.
Power BI Interface - Canvas area
The canvas area in the middle of the report view is where visualizations are created and arranged.
Power BI Interface - Filters pane
In the Filters pane you can filter data visualizations.
Power BI Interface - Visualizations pane
In the Visualizations pane you can add, change, or customize visualizations.
Power BI Interface - Data pane
Finally, the Data pane shows the available fields. You can drag these fields onto the canvas, the Filters pane, or the Visualizations pane to create or modify visualizations.
Consider the scenario where you have imported data into Power BI from several different sources and, when you examine the data, it is not prepared for analysis. What could make the data unprepared for analysis?
When examining the data, you discover several issues, including:
- A column called Employment status only contains numerals.
- Several columns contain errors.
- Some columns contain null values.
- The customer ID in some columns appears as if it was duplicated repeatedly.
- A single address column has combined street address, city, state, and zip code.
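The last issue, a single column that combines street address, city, state, and zip code, is typically fixed by splitting the column into separate fields. Here's a minimal sketch, assuming the parts are comma-separated; the sample row is made up.

```python
# Sketch: splitting a combined address column into separate fields,
# assuming comma-separated parts. The sample value is hypothetical.
address = "123 Main St, Springfield, IL, 62704"

street, city, state, zip_code = [part.strip() for part in address.split(",")]
print(street, city, state, zip_code)
```

In Power Query Editor, the equivalent operation is Split Column by delimiter, which records the same logic as an applied step.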
You start working with the data, but every time you create visuals on reports, you get incorrect results; even simple reports about sales totals are wrong.
Dirty data can be overwhelming and, though you might feel frustrated, you decide to get to work and figure out how to make this semantic model as pristine as possible.
Fortunately, Power BI and Power Query offer you a powerful environment to clean and prepare the data.
Clean data has the following advantages:
- Measures and columns produce more accurate results when they perform aggregations and calculations.
- Tables are organized, where users can find the data in an intuitive manner.
- Duplicates are removed, making data navigation simpler and producing columns that can be used in slicers and filters.
- A complicated column can be split into two simpler columns, or multiple columns can be combined into one column for readability.
- Codes and integers can be replaced with human readable values.
In this module, you will learn how to:
- Resolve inconsistencies, unexpected or null values, and data quality issues.
- Apply user-friendly value replacements.
- Profile data so you can learn more about a specific column before using it.
- Evaluate and transform column data types.
- Apply data shape transformations to table structures.
- Combine queries.
- Apply user-friendly naming conventions to columns and queries.
Power Query Editor in Power BI Desktop allows you to shape (transform) your imported data. You can accomplish actions such as renaming columns or tables, changing text to numbers, removing rows, setting the first row as headers, and much more. It is important to shape your data to ensure that it meets your needs and is suitable for use in reports.
You need to use Power Query Editor to clean up and shape the data before you can start building reports.
Get started with Power Query Editor
To start shaping your data, open Power Query Editor by selecting the Transform data option on the Home tab of Power BI Desktop.
When you import data from multiple sources into Power BI Desktop, the data retains its predefined table and column names. You might want to change some of these names so that they are in a consistent format, easier to work with, and more meaningful to a user. You can use Power Query Editor in Power BI Desktop to make these name changes and simplify your data structure.
Rename a query
It's good practice to change uncommon or unhelpful query names to names that are more obvious or that the user is more familiar with. For instance, if you import a product fact table into Power BI Desktop and the query name displays as FactProductTable, you might want to change it to a more user-friendly name, such as Products. Similarly, if you import a view, the view might have a name that contains a prefix of v, such as vProduct. People might find this name unclear and confusing, so you might want to remove the prefix.
Replace values
You can use the Replace Values feature in Power Query Editor to replace any value with another value in a selected column.
Replace null values
Occasionally, you might find that your data sources contain null values. For example, a freight amount on a sales order might have a null value if it's synonymous with zero. If the value stays null, averages will not calculate correctly. One solution is to change the nulls to zero, which produces a more accurate freight average. The same steps that you followed previously will help you replace the null values with zero.
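The effect on the average is easy to demonstrate. This sketch uses made-up freight values to show why leaving nulls in place skews the result, and what replacing them with zero does instead.

```python
# Sketch: why null freight values skew an average, and the fix the
# text describes (replace nulls with zero). Sample values are made up.
freight = [10.0, None, 6.0, None, 8.0]

# Averaging only the non-null values ignores the zero-cost orders...
non_null = [v for v in freight if v is not None]
avg_ignoring_nulls = sum(non_null) / len(non_null)

# ...whereas replacing nulls with zero counts every order.
replaced = [0.0 if v is None else v for v in freight]
avg_with_zeros = sum(replaced) / len(replaced)

print(avg_ignoring_nulls, avg_with_zeros)
```

With five orders, two of them free, the average freight per order is 4.8, not 8.0; which answer is "correct" depends on whether null really does mean zero in your source.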
Remove duplicates
You can also remove duplicates from columns to only keep unique names in a selected column by using the Remove Duplicates feature in Power Query.
In this example, notice that the Category Name column contains duplicates for each category. As a result, you want to create a table with unique categories and use it in your semantic model. You can achieve this action by selecting a column, right-clicking on the header of the column, and then selecting the Remove Duplicates option.
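Conceptually, Remove Duplicates keeps one row per value while preserving the order in which values first appear. A minimal sketch, with made-up category names:

```python
# Sketch of Remove Duplicates: keep one row per category name while
# preserving first-seen order. The sample values are hypothetical.
category_names = ["Bikes", "Clothing", "Bikes", "Accessories", "Clothing"]

# dict.fromkeys keeps insertion order and drops repeated keys.
unique_categories = list(dict.fromkeys(category_names))
print(unique_categories)
```

The resulting single-column table of unique categories is exactly what you'd want as a dimension table in your semantic model.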
Best practices for naming tables, columns, and values
Naming conventions for tables, columns, and values have no fixed rules; however, we recommend using the language and abbreviations that are commonly used within your organization and that everyone agrees on and considers common terminology.
A best practice is to give your tables, columns, and measures descriptive business terms and replace underscores ("_") with spaces. Be consistent with abbreviations, prefixes, and words like "number" and "ID." Excessively short abbreviations can cause confusion if they are not commonly used within the organization.
Also, by removing prefixes or suffixes that you might use in table names and instead naming them in a simple format, you will help avoid confusion.
When replacing values, try to imagine how those values will appear on the report. Values that are too long might be difficult to read and fit on a visual. Values that are too short might be difficult to interpret. Avoiding acronyms in values is also a good idea, provided that the text will fit on the visual.
When you import a table from any data source, Power BI Desktop automatically starts scanning the first 1,000 rows (default setting) and tries to detect the type of data in the columns. Some situations might occur where Power BI Desktop doesn't detect the correct data type. Where incorrect data types occur, you'll experience performance issues.
You have a higher chance of getting data type errors when you're dealing with flat files, such as comma-separated values (.CSV) files and Excel workbooks (.XLSX), because data was entered manually into the worksheets and mistakes were made. Conversely, in databases, the data types are predefined when tables or views are created.
A best practice is to evaluate the column data types in Power Query Editor before you load the data into a Power BI semantic model. If you determine that a data type is incorrect, you can change it. You might also want to apply a format to the values in a column and change the summarization default for a column.
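Evaluating a column's data type amounts to asking whether every value in it parses as the intended type. This sketch checks a text column of made-up order totals for values that would become errors on conversion to a number:

```python
# Sketch: checking whether every value in a text column parses as a
# number, similar in spirit to evaluating a column's data type before
# loading it. Sample values are made up.
order_totals = ["19.99", "5.00", "N/A", "12.50"]

def parses_as_number(value: str) -> bool:
    try:
        float(value)
        return True
    except ValueError:
        return False

bad_values = [v for v in order_totals if not parses_as_number(v)]
print(bad_values)  # values that would become errors on type conversion
```

Finding "N/A" here before loading is exactly the kind of issue you'd fix in Power Query Editor, for example by replacing the value or by removing the row.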
To continue with the scenario where you're cleaning and transforming sales data in preparation for reporting, you now need to evaluate the columns to ensure that they have the correct data type. You need to correct any errors that you identify.
Implications of incorrect data types
The following information provides insight into problems that can arise when Power BI doesn't detect the correct data type.
Incorrect data types will prevent you from creating certain calculations, deriving hierarchies, or creating proper relationships with other tables.
Change the column data type
You can change the data type of a column in two places: in Power Query Editor and in the Power BI Desktop Report view by using the column tools. It is best to change the data type in the Power Query Editor before you load the data.
Change the column data type in Power Query Editor
In Power Query Editor, you can change the column data type in two ways. One way is to select the column that has the issue, select Data Type in the Transform tab, and then select the correct data type from the list.
Another method is to select the data type icon next to the column header and then select the correct data type from the list.
As with any other change that you make in Power Query Editor, the change to the column data type is saved as a programmed step. This step is called Changed Type, and it runs again every time the data is refreshed.
After you have completed all steps to clean and transform your data, select Close & Apply to close Power Query Editor and apply your changes to your semantic model. At this stage, your data should be in great shape for analysis and reporting.
Profiling data is about studying its nuances: determining anomalies, examining and understanding the underlying data structures, and querying data statistics such as row counts, value distributions, minimum and maximum values, averages, and so on. Profiling is important because it lets you shape and organize the data so that interacting with it and identifying its distribution is uncomplicated, which in turn makes developing report elements on the front end nearly effortless.
Assume that you are developing reports for the Sales team at your organization. You are uncertain how the data is structured and contained within the tables, so you want to profile the data behind the scenes before you begin developing the visuals. Power BI has inherent functionality that makes these tasks user-friendly and straightforward.
Examine data structures
Before you begin examining the data in Power Query Editor, you should first learn about the underlying data structures that data is organized in. You can view the current semantic model under the Model tab on Power BI Desktop.
On the Model tab, you can edit specific column and table properties by selecting a table or columns, and you can transform the data by using the Transform Data button, which takes you to Power Query Editor. Additionally, you can manage, create, edit, and delete relationships between different tables by using Manage Relationships, which is located on the ribbon.
Find data anomalies and data statistics
After you have created a connection to a data source and have selected Transform Data, you are brought to Power Query Editor, where you can determine if anomalies exist within your data. Data anomalies are outliers within your data. Determining what those anomalies are can help you identify what the normal distribution of your data looks like and whether specific data points exist that you need to investigate further. Power Query Editor determines data anomalies by using the Column Distribution feature.
Select View on the ribbon, and under Data Preview, you can choose from a few options. To understand data anomalies and statistics, select the Column Distribution, Column Quality, and Column Profile options. The following figure shows the statistics that appear.
Column quality and Column distribution are shown in the graphs above the columns of data. Column quality shows you the percentages of data that is valid, in error, and empty. In an ideal situation, you want 100 percent of the data to be valid.
Note
By default, Power Query examines the first 1,000 rows of your data set. To change this, select the profiling status in the status bar and select Column profiling based on entire data set.
Column distribution shows you the distribution of the data within the column and the counts of distinct and unique values, both of which can tell you details about the data counts. Distinct values are all the different values in a column, including duplicates and null values, while unique values do not include duplicates or nulls. Therefore, distinct in this table tells you the total count of how many values are present, while unique tells you how many of those values only appear once.
Column profile gives you a more in-depth look into the statistics within the columns for the first 1,000 rows of data. This column provides several different values, including the count of rows, which is important when verifying whether the importing of your data was successful. For example, if your original database had 100 rows, you could use this row count to verify that 100 rows were, in fact, imported correctly. Additionally, this row count will show how many rows that Power BI has deemed as being outliers, empty rows and strings, and the min and max, which will tell you the smallest and largest value in a column, respectively.
This distinction is particularly important for numeric data because it will immediately notify you if you have a maximum value that is beyond what your business identifies as a "maximum." This check calls those values to your attention so that you can focus your efforts when delving deeper into the data. For text columns, as seen in the previous image, the minimum value is the first value and the maximum value is the last value in alphabetical order.
With this knowledge, you can include in your toolkit the ability to study your data in an efficient and effective manner.
What is DAX?
DAX is a collection of functions, operators, and constants that can be used in a formula, or expression, to calculate and return one or more values. DAX helps you create new information from data already in your model.
Why is DAX so important?
It’s easy to create a new Power BI Desktop file and import some data into it. You can even create reports that show valuable insights without using any DAX formulas at all. But, what if you need to analyze growth percentage across product categories and for different date ranges? Or, you need to calculate year-over-year growth compared to market trends? DAX formulas provide this capability and many other important capabilities as well. Learning how to create effective DAX formulas will help you get the most out of your data. When you get the information you need, you can begin to solve real business problems that affect your bottom line.
Prerequisites
You might already be familiar with creating formulas in Microsoft Excel, and that knowledge will be helpful in understanding DAX. But even if you have no experience with Excel formulas, the concepts described here will help you get started creating DAX formulas and solving real-world BI problems right away.
We’ll focus on understanding DAX formulas used in calculations, more specifically, in measures and calculated columns. You should already be familiar with using Power BI Desktop to import data and add fields to a report, and you should also be familiar with fundamental concepts of Measures and Calculated columns.
Semantic model columns have a set data type, which ensures that all column values conform to that data type. Column data types are defined in Power Query, or in the case of calculated columns, it's inferred from the formula. Measure data types, similar to calculated column data types, are inferred from the formula.
Model data types aren't the same as DAX data types, though a direct relationship exists between them. The following table lists the model data types and DAX data types. Notice the supported range of values for each data type.
Binary DAX Data type
The Binary data type is used in the data model to store images or other non-structured types of information. It is not available in DAX. It was mainly used by Power View, but it might not be available in other tools such as Power BI.
Boolean DAX Data type
The Boolean data type is used to express logical conditions; its only possible values are TRUE and FALSE.
Currency DAX Data type
The Currency data type, also known as Fixed Decimal Number in Power BI, stores a fixed decimal number. It can represent four decimal digits and is internally stored as a 64-bit integer value divided by 10,000.
DateTime DAX Data type
DAX stores dates as a DateTime data type. This format uses a floating-point number internally, wherein the integer corresponds to the number of days since December 30, 1899, and the decimal part identifies the fraction of the day. Hours, minutes, and seconds are converted to decimal fractions of a day. Thus, the following expression returns the current date plus one day (exactly 24 hours):
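As a minimal sketch, the expression described above can be written as follows:

```dax
-- NOW() returns the current date and time as a DateTime value;
-- adding 1 advances it by exactly one day (24 hours).
= NOW() + 1
```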
Decimal DAX Data type
A decimal number is always stored as a double-precision floating point value.
Integer DAX Data type
The DAX Integer data type stores a 64-bit value.
String DAX Data type
Every string in DAX is stored as a Unicode (UTF-16) string, where each character is stored in at least 16 bits (2 bytes).
Variant DAX Data type
The Variant data type is used for expressions that might return different data types depending on conditions; for example, an expression that returns 1, 0, or "N/A".
By using Data Analysis Expressions (DAX), you can add three types of calculations to your semantic model:
- Calculated tables
- Calculated columns
- Measures
Calculated tables
You can write a DAX formula to add a calculated table to your model. The formula can duplicate or transform existing model data, or create a series of data, to produce a new table. Calculated table data is always imported into your model, so it increases the model storage size and can prolong data refresh time.
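As a sketch, a common calculated table is a date table generated with the CALENDAR function (the table name and date range here are illustrative):

```dax
-- Generates a single-column table with one row per date in 2023.
-- The result is imported into the model, adding to its storage size.
Date = CALENDAR(DATE(2023, 1, 1), DATE(2023, 12, 31))
```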
Note
A calculated table can't connect to external data; you need to use Power Query to accomplish that task.
Calculated columns
You can write a DAX formula to add a calculated column to any table in your model. The formula is evaluated for each table row and it returns a single value. When added to an Import storage mode table, the formula is evaluated when the semantic model is refreshed, and it increases the storage size of your model. When added to a DirectQuery storage mode table, the formula is evaluated by the underlying source database when the table is queried.
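For example, a calculated column might multiply two existing columns (the Sales table and its Quantity and Unit Price columns are assumed names):

```dax
-- Evaluated once per row of the Sales table; on an Import table,
-- the results are computed at refresh time and stored in the model.
Sales Amount = Sales[Quantity] * Sales[Unit Price]
```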
Measures
You can write a DAX formula to add a measure to any table in your model. The formula is concerned with achieving summarization over model data. Similar to a calculated column, the formula must return a single value. Unlike calculated columns, which are evaluated at data refresh time, measures are evaluated at query time. Their results are never stored in the model.
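A minimal measure sketch, assuming a Sales table with a Sales Amount column, looks like this:

```dax
-- Evaluated at query time in the filter context of each visual;
-- the result is never stored in the model.
Revenue = SUM(Sales[Sales Amount])
```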
Occasionally, measures can be described as explicit measures. To be clear, explicit measures are model calculations that are written in DAX and are commonly referred to as simply measures. Yet, the concept of implicit measures exists, too. Implicit measures are columns that can be summarized by visuals in simple ways, like count, sum, minimum, maximum, and so on. You can identify implicit measures in the Fields pane because they're shown with the sigma symbol ( ∑ ).
Note
Any column can be summarized when added to a visual. Therefore, whether they're shown with the sigma symbol or not, when they're added to a visual, they can be set up as implicit measures.
Additionally, no such concept as a calculated measure exists in tabular modeling. The word calculated is used to describe calculated tables and calculated columns, which distinguishes them from tables and columns that originate from Power Query. Power Query doesn't have the concept of an explicit measure.
The DAX function library consists of hundreds of functions, each designed to accomplish a specific goal.
Because DAX originated with the Power Pivot add-in for Microsoft Excel 2010, over 80 functions are available that can also be found in Excel. It was a deliberate design strategy by Microsoft to ensure that Excel users can quickly become productive with DAX.
However, many functions exist that you won't find in Excel because they're specific to data modeling. Some of them are:
- Relationship navigation functions
- Filter context modification functions
- Iterator functions
- Time intelligence functions
- Path functions
Functions that originate from Excel
The following sections consider several useful functions that you might already be familiar with because they exist in Excel.
The IF DAX function tests whether a condition that's provided as the first argument is met. It returns one value if the condition is TRUE and returns the other value if the condition is FALSE.
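As a sketch, an IF-based calculated column might look like this (the table and column names, and the threshold, are illustrative assumptions):

```dax
-- Labels each row "Large" when the amount meets a threshold, otherwise "Small".
Order Size = IF(Sales[Sales Amount] >= 1000, "Large", "Small")
```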
Many Excel summarization functions are available, including SUM, COUNT, AVERAGE, MIN, MAX, and many others. The only difference is that in DAX, you pass in a column reference, whereas in Excel, you pass in a range of cells.
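For example, the same total expressed both ways (the Sales[Amount] column is an assumed name):

```dax
-- In Excel you would sum a range of cells:  =SUM(B2:B100)
-- In DAX you sum an entire model column:
Total Amount = SUM(Sales[Amount])
```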
Many Excel mathematic, text, date and time, information, and logical functions are available as well. For example, a small sample of Excel functions that are available in DAX include ABS, ROUND, SQRT, LEN, LEFT, RIGHT, UPPER, DATE, YEAR, MONTH, NOW, ISNUMBER, TRUE, FALSE, AND, OR, NOT, and IFERROR.
Functions that don't originate from Excel
Two useful DAX functions that aren't specific to modeling and that don't originate from Excel are DISTINCTCOUNT and DIVIDE.
DISTINCTCOUNT function
You can use the DISTINCTCOUNT DAX function to count the number of distinct values in a column. This function is especially powerful in an analytics solution. Consider that the count of customers is different from the count of distinct customers. The latter doesn't count repeat customers, so the difference is "How many customers" compared with "How many different customers."
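The difference can be sketched with two measures (Sales[CustomerKey] is an assumed column):

```dax
-- Counts the customer key on every sale row, including repeat customers.
How Many Customers = COUNT(Sales[CustomerKey])

-- Counts each customer key only once, no matter how many orders they placed.
How Many Different Customers = DISTINCTCOUNT(Sales[CustomerKey])
```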
DIVIDE function
You can use the DIVIDE DAX function to achieve division. You must pass in numerator and denominator expressions. Optionally, you can pass in a value that represents an alternate result. The DIVIDE function's syntax is DIVIDE(<numerator>, <denominator> [, <alternate_result>]).
The DIVIDE function automatically handles division by zero cases. If an alternate result isn't passed in, and the denominator is zero or BLANK, the function returns BLANK. When an alternate result is passed in, it's returned instead of BLANK.
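To illustrate both behaviors, here is a hedged sketch (the [Profit] and [Revenue] measures are hypothetical names):

```dax
-- Returns BLANK when [Revenue] is zero or BLANK:
Profit Margin = DIVIDE([Profit], [Revenue])

-- Returns 0 instead of BLANK when [Revenue] is zero or BLANK:
Profit Margin (Safe) = DIVIDE([Profit], [Revenue], 0)
```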
This function is convenient because it saves your expression from having to first test the denominator value. The function is also better optimized for testing the denominator value than the IF function. The performance gain is significant because checking for division by zero is expensive. What's more, using the DIVIDE function results in a more concise and elegant expression.
Tip
We recommend that you use the DIVIDE function whenever the denominator is an expression that could return zero or BLANK. In the case that the denominator is a constant value, we recommend that you use the divide operator (/), which is introduced later in this module. In this case, the division is guaranteed to succeed, and your expression will perform better because it will avoid unnecessary testing.
Overview
Data modeling in Power BI has transformed the way businesses analyze and visualize their data, providing a powerful tool for organizations to gain insights and make informed decisions.
But what exactly is data modeling in Power BI, and how can it help you unlock the full potential of your data to drive revenue?
This beginner's guide to data modeling in Power BI will take you on a journey through the fundamentals, guiding you from the very basics to more advanced techniques. By the end, you will have the tools and knowledge to create data models that drive insightful visualizations and inform critical business decisions.
Key Takeaways
- Data modeling in Power BI is essential for efficient data organization and powerful insights.
- Optimizing data models improves performance and user experience.
- It involves understanding tables, relationships, cardinality, and cross-filter direction to build models.
- Advanced techniques such as active/inactive relationships, USERELATIONSHIP measures, custom calculations using DAX, and optimizing key columns can unlock the full potential of data.
Understanding Data Modeling in Power BI
Data modeling in Power BI is the process of creating visual representations of data structures, relationships, and properties. It is primarily based on the connections between tables, representing data from one or more sources, which forms the basis of precise and efficient Power BI reports.
The core concepts of data modeling in Power BI, especially when working with multiple data sources, include:
- Star schemas
- Primary and foreign keys
- Cardinality
- Cross-filter direction
- Active and inactive relationships
In traditional Excel reporting, users generally bring different fields from multiple tables into a primary table by using VLOOKUP(), HLOOKUP(), or INDEX/MATCH() functions. Additionally, they will supplement these lookups with conditional computations, such as SUMIFS(), AVERAGEIFS(), and SUBTOTAL(), among others.
A data model eliminates the need for multiple lookups through the implementation of relationships among tables. Relationships allow modern analysts to quickly perform analysis over multiple tables and across much larger volumes of data.
Relationship settings
Prior to making connections or importing data, make sure that you verify current settings for your report. This verification can potentially save you time during setup or help you avoid more work to remove inaccurate autodetected relationships.
Go to File > Options and Settings > Options > [Current File] Data Load > Relationships.
If the Autodetect new relationships checkbox is selected, Power BI will attempt to identify and create relationships between tables after loading tables into your model from Power Query. This process is based on similarities between table and column names, but it's imperfect.
Relationship introduction
You can display a relationship by using a line connecting the two tables. When you hover over the line, the two corresponding fields that the relationship is built on will be outlined. The following example shows that the CustID field from the Quotes table aligns with the ID field from the Customers table.
Indicators along the line help a viewer understand cardinality, cross filter direction, and whether the relationship is active (solid line) or inactive (dotted line). This concept will be discussed in more detail later.
Creating these relationships is quick and simple, with many different methods that have different levels of detail or simplicity.
The simplest way to create a relationship is to go to the Data Model or Diagram view and then drag a field from one table to a field on another table. The data model will interpret the relationship between the two fields, and if it's reasonably apparent, the relationship will display.
Another more detailed approach to creating relationships is through the Manage Relationships window. You can access this window in Power BI on the Home tab while in the Model view or in Excel on the Data tab through the Relationships button.
The initial view when you select one of these buttons is a summary of all relationships in the data model.
Note
The Relationships window in Excel contains the same information, along with the same options, such as the Autodetect feature.
From the Manage relationships window, you can choose to Edit an existing relationship or create a New relationship. Both selections will take you to the same Relationship definition wizard.
To define the relationship in this manner, you'll select the two tables from the dropdown lists and then highlight the corresponding columns/fields to link in the relationship. The wizard will automatically determine cardinality and cross filter direction of the relationship. It also gives you the option to make the relationship active.
Note
You can use multiple columns to define a relationship by holding the Ctrl key and selecting the fields. A numerical value will display next to each icon to show how they align with each other.
Cardinality and cross-filter direction are complex settings that have ramifications across your data model in terms of performance, functionality, and ease of use.
Cardinality
The Cardinality setting describes how many values in one table can relate to values in the other. This relationship can be one-to-many (1 to *), one-to-one (1 to 1), or many-to-many (* to *). These definitions describe how many values of one type might be found in each table.
One-to-many
In a one-to-many relationship, a unique value in one table will be found many times in the other table. This relationship is most commonly used within data models. An example would be a product table with multiple properties that are related to each product that is tied to a sales table that has multiple records for each product.
One-to-one
In a one-to-one relationship, a value will have one unique entry in both tables. One-to-one relationships are uncommon, and an author will generally merge one-to-one relationships into a single table for use in a data model.
Many-to-many
In a many-to-many relationship, a unique value that is taken from the matching column will have multiple entries in both tables. Many-to-many relationships require careful consideration and advanced techniques for appropriate management. When starting out with Power BI, you should avoid many-to-many relationships because they can occasionally produce undesired results regarding measures.
Cross filter direction
The Cross filter setting tells the data model how filters propagate between two tables. In the Model view, the arrow(s) in the middle of the line represent the cross filter direction.
Active and inactive
Periodically, you might want to set up multiple relationships between tables (for example, Order date and Ship date fields from a sales table and the Date field from a calendar table). You are only allowed one active relationship between two tables at a time, but you can set up inactive relationships for use in more advanced Data Analysis Expressions (DAX) measures.
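An inactive relationship can be activated for a single calculation with the USERELATIONSHIP function. The following sketch assumes an inactive relationship between a hypothetical Sales[Ship Date] column and 'Date'[Date]:

```dax
-- Activates the inactive Ship Date relationship for this measure only;
-- the model's active Order Date relationship is ignored while it evaluates.
Sales by Ship Date =
CALCULATE(
    SUM(Sales[Sales Amount]),
    USERELATIONSHIP(Sales[Ship Date], 'Date'[Date])
)
```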
Initially, Excel limited users to only a few standard aggregations or calculations with PivotTables, including Sum, Count, Average, Min, Max, Product, Count Numbers, StdDev, StdDevp, Var, and Varp. When you use a data model, DISTINCTCOUNT() becomes available as well.
Furthermore, when you use the DAX language, you'll have numerous functions available, which will allow you to calculate whatever you want. Many of the limitations and barriers that you previously encountered in your analysis will be removed.
Measures calculate a result from an expression formula. When you create your own measures, you'll use the DAX formula language, which is similar to Excel formulas. DAX also has many of the same functions as Excel.
You can create two types of measures: implicit and explicit.
Implicit measures
Implicit measures are helpful for beginners who need a straightforward way to get started. With this method, you only need to drag a field from a table and then drop it where you want it.
Implicit measures use a column from a data table (for example, Sales Amount) by dragging the field into a visual in Power BI. These measures are helpful when you're first starting in Power BI and they allow you to calculate Sum, Count, Average, Min, Max, and DistinctCount. When a field is in the values section, select the dropdown list to determine which summarization calculation that you want to perform on the field. These types of measures work for basic tables and summaries but are limited compared to explicit measures.
Explicit measures
Explicit measures require you to use the DAX formula language to explicitly write out the expression. These measures pay off over time and allow you to create custom analytical metrics, such as Profit Margin YTD. Writing DAX measures becomes easier over time as you gain experience.
You can create explicit measures by writing a formula in the editor. These measures offer the most flexibility and give you the power to use all capabilities of DAX.
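As a sketch of the kind of custom metric mentioned above, a year-to-date profit margin could be written as follows (the Sales columns and the 'Date' table are assumed names, and 'Date' is assumed to be a marked date table):

```dax
-- Year-to-date profit divided by year-to-date revenue.
Profit Margin YTD =
DIVIDE(
    TOTALYTD(SUM(Sales[Profit]), 'Date'[Date]),
    TOTALYTD(SUM(Sales[Sales Amount]), 'Date'[Date])
)
```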
The primary goal of data visualization is to communicate information clearly and effectively to report consumers. That's why selecting the most effective visual type to meet requirements is critical. Selecting the wrong visual type could make it difficult for report consumers to understand the data, or worse, it could result in the misrepresentation of the data.
Visual selection can be challenging because so many visuals are available to choose from. To help you select an appropriate visual, the following sections provide tips and guidance to help you meet specific visualization requirements.
1. Categorical visuals
Often, bar or column charts are good choices when you need to show data across multiple categories. Selecting which type depends on the number of categories and the kind of information that you want to visualize. For example, if many category values are available, you should avoid selecting a visual where color is used to split the data, such as a stacked bar chart with a category legend. Instead, use the category dimension on the axis of a bar chart.
Additionally, you should avoid a line chart with a categorical X-axis because the line implies a relationship between elements that might not exist. In the following example, notice that the line chart visual implies a relationship between the product categories on the X-axis.
Make sure that you determine the sort order (ascending or descending) by what you want to first draw peoples' attention to so that it provides the report consumer with an intuitive visual that is organized to produce a natural flow.
2. Time series visuals
Always use a line or column chart to show values over time. The X-axis should present time, sorted from earliest to latest periods (left to right).
Other Power BI core visuals that you can use for time series data include:
- Stacked column chart
- Area chart
- Line and stacked column chart
- Ribbon chart, which has the added benefit of showing rank changes over time.
3. Proportional visuals
Proportional visuals show data as part of a whole. They effectively communicate how a value is distributed across a dimension. Column and bar chart visuals work well for visualizing proportions across multiple dimensions.
Note
Proportional visuals can't plot a mix of positive and negative values. They should be used when all values are positive or all values are negative.
Other Power BI core visuals that you can use for proportional visualization include:
- 100% Stacked Column chart
- Funnel chart
- Treemap
- Pie chart / Doughnut chart
4. Numeric visuals
Often presented by card visuals, numeric values show high-level callouts that demand immediate attention. They can be powerful in dashboard and analytical reports because they communicate important data quickly.
You can also use a multi-row card to display multiple values in a single visual.
5. Grid visuals
Often overlooked, tables and matrices can effectively convey a lot of detailed information. Tables have a fixed number of columns, and each column can express grouped or summarized data. Matrices can have groups on columns and rows. Adding conditional formatting options, such as background colors, font colors, or icons, can enhance values with visual indicators. This extra context supports simple report consumption and can bring balance to a report page.
Additionally, matrices provide one of the best experiences for hierarchical navigation. They allow users to drill down, on the columns or rows, to discover detailed data points of interest.
The table and matrix Format options provide a high degree of control to format and style grid values.
6. Performance visuals
Communicating performance involves describing a value and its comparison to a target. Any difference between the value and target is its variance, which can be favorable or unfavorable. Color or icons can convey status. For example, when the variance is unfavorable, you can display a red color or an exclamation mark (!) icon.
Other Power BI core visuals that you can use to show performance include:
- Gauge
- KPI
- Table, with conditional formatting
- Matrix, with conditional formatting
7. Geospatial visuals
When a semantic model has geospatial information, it can be conveyed by using map visuals. Power BI includes several core map visuals. Each visual offers various formatting options that, when appropriately applied, can help highlight geospatial data.
Note
A map visual can occupy considerable space on the report page. Also, geospatial data doesn't always need to be shown in maps. If location isn't highly relevant to the requirements, consider using a categorical visual instead.
Power BI Desktop gives you a variety of options for customizing how your selected visualizations look, such as the colors and format of the text that they contain. You should take time to explore the options to determine what impact they each have on a visual.
The formatting options that are available will depend on the type of visualization that you selected.
Common formatting options include the Title, Background, and Border.
In the Title section, you can add a title to the visual, if it does not have one, or edit the title, if it has one already. The aim of the title is to clearly describe what data is being presented in the visual. You can format the title by changing the text, text size, font, color, background, and alignment. The subsequent section shows an example of customizing a title.
In the Background section, you can set any color or image as the background for the visual. If you plan to use an image as a background, try to select an image that won't have lines or shapes that would make it difficult for the user to read the data. It is best to keep a white background so the presented data can be clearly seen. The subsequent section shows an example of customizing a background.
In the Border section, you can set a border around the visual to isolate the visual from other elements on the canvas, which helps make it easier for the user to read and understand the data. You can change the border color and radius to be consistent with your color scheme.
If a General section is available, you'll be able to set the precise size and place for your visual on your canvas. This option might be suitable if the drag-and-drop feature is not placing the visual exactly where you want it to be. It can also be useful to ensure that you have aligned specific visuals consistently.
You might also be able to format the colors and labels for specific data values. In the Data colors section, you can set the colors that you want to use for the data values in the visual. You can use different colors for different fields, but always try to be consistent when it comes to selecting those colors. It is best to use the same color scheme throughout the report. In the Data labels section, you can change fonts, size, and colors for all labels in the visual. Try to use solid colors so the labels are clearly visible. For example, if the background is white, use a black or dark grey color to display your labels.
The Tooltips section allows you to add a customized tooltip that appears when you hover over the visual, based on report pages that you create in Power BI Desktop. Tooltips is a great feature because it provides more contextual information and detail to data points on a visual. The default tooltip displays the data point's value and category, but your custom tooltips can include visuals, images, and any other collection of items that you create in the report page. The subsequent section shows an example of customizing a tooltip.
Key performance indicators (KPIs) are excellent in helping you track progress toward a specific goal over time. To use a KPI, you need three pieces of information:
- A unit of measurement that you want to track, for instance total sales, number of employee hires, number of loans serviced, or number of students enrolled.
- A goal for the measurement so that you can compare your progress with that goal.
- A time series, for instance daily, monthly, or yearly.
In Summary
Designing a Power BI report involves laying out objects over one or more report pages. While you have a great deal of flexibility in how you do that, what matters most is that you meet requirements by clearly communicating the data to report consumers. Chances of success are higher when you select the correct visual types. Beyond the correct visual types, you then can consider overlaying analytics options, ordering the visual elements in an intuitive way, and applying suitable format and styling options.
There are many kinds of data visualization, some commonly used and some more specialized. Power BI includes an extensive set of built-in visualizations, which can be extended with custom and third-party visualizations. The rest of this unit discusses some common data visualizations but is by no means a complete list.
Tables and text
Tables and text are often the simplest way to communicate data. Tables are useful when numerous related values must be displayed, and individual text values in cards can be a useful way to show important figures or metrics.
Bar and column charts
Bar and column charts are a good way to visually compare numeric values for discrete categories.
Line charts
Line charts can also be used to compare categorized values and are useful when you need to examine trends, often over time.
Pie charts
Pie charts are often used in business reports to visually compare categorized values as proportions of a total.
Scatter plots
Scatter plots are useful when you want to compare two numeric measures and identify a relationship or correlation between them.
Maps
Maps are a great way to visually compare values for different geographic areas or locations.
Interactive reports in Power BI
In Power BI, the visual elements for related data in a report are automatically linked to one another and provide interactivity. For example, selecting an individual category in one visualization will automatically filter and highlight that category in other related visualizations in the report. In the image above, the city Seattle has been selected in the Sales by City and Category column chart, and the other visualizations are filtered to reflect values for Seattle only.
This learning path introduces a process for designing and delivering compelling Power BI reports. It begins by describing a proven design process created by leading report design experts. The process encompasses phases to understand the report users and their requirements, explore pleasing report designs, and develop reports all the way into production.
This module provides you with a strong foundation on which to learn how to plan your report design requirements.
Learning objectives
In this module, you will:
Determine business goals.
Identify your audience.
Determine report types.
Define user interface requirements.
Define user experience requirements.
Introduction
As with any project, a good starting point when designing reports is to define clear goals. In your reporting project, those goals should help you determine your reporting requirements for:
Audiences
Report type
User interface requirements
User experience requirements
This module uses a scenario concerning the Contoso Skateboard Store company, which specializes in direct-to-consumer skateboard sales. To modernize their workflow and gain a deeper understanding of their product sales and inventory management, the company needs to create a set of reports to deliver data to their employees.
Currently, the finance and supply chain divisions operate in vertical silos. These divisions use data differently, which has resulted in many inconsistencies in how they interpret the data when making decisions. These inconsistencies have often led to disagreements and misalignment at the executive level because the numbers don't match.
The company also currently consumes this information in formats that aren't optimized for a modern workforce. Employees commonly spend many hours creating improvised reports from disparate data sources. These reports answer simple questions like "What were our sales in the Los Angeles area last month?" These improvised reports have contributed to a duplication of effort, resulting in overlapping reports that might occasionally use different calculation logic over the same data.
The company has committed to developing a new suite of reports to meet the following business requirements:
Promote a standard view of their data and business logic that is a single source of truth across all divisions.
Ensure reports deliver up-to-date data that is no older than 24 hours for sales and one hour for inventory.
Create modern sales and inventory reports that can be accessed at any time by their employees, whether they are in the office, working from home, or on the road. Reports are intended for use by finance managers, supply chain managers, and data analysts.
Design reports that feel natural and are user-friendly.
Publish reports that have sufficient depth of detail and allow report consumers to discover new insights.
Apply consistent corporate brand standards.
Identifying the audience is one of the most important steps in the report design process. It enables the report author to create a final result that can be efficiently used and will meet the needs of the report consumer.
The three broad report consumer audiences are:
Executive
Analyst
Information worker
An executive is a person who is charged with making plans and decisions that often involve a medium or long-term focus. Executives are responsible for making the business run smoothly. For example, the C-level executives at the Contoso Skateboard Store would be an executive audience.
An analyst is a person who provides guidance to the organization. Analysts can be responsible for a range of tasks, often with goals of determining the effectiveness of business strategies, developing or improving processes, or implementing change. A business analyst (or data analyst) in the sales division of the Contoso Skateboard Store is an example of an analyst audience.
An information worker is someone who uses data to help make decisions or take actions. Often, these decisions and actions are operational in that they are done on a daily basis. The inventory manager at the Contoso Skateboard Store, who needs up-to-date information about stock levels, is an example of an information worker audience.
Generally, report design can be classified by report type, and often there's a direct mapping between the report audience and the report type. Audience needs can be met by one, or possibly a combination, of four report types:
Dashboard
Analytical
Operational
Educational
Commonly, executives work with dashboards, analysts work with analytical reports, and information workers work with operational reports.
Each report type has a different approach to the user interface (UI) and user experience (UX) requirements. These requirements are described in more detail later in this module.
Dashboard
The primary goal of a dashboard is to convey the story as quickly as possible. User interactions are deliberately limited, and insights are highly curated toward the audience. Report visuals are focused, self-explanatory, and clearly labeled. A dashboard directly communicates the meaning behind the data to minimize misinterpretation or confusion.
A good example of a dashboard is an executive dashboard, which often presents high-level metrics that are displayed on a single page. Dashboards help answer questions such as "How are we doing?" or "Are we there yet?"
At the Contoso Skateboard Store, a dashboard is a report that allows users to view several analytics values, targets, statuses, and trends.
Analytical reports
An analytical report is the most common report type; it serves a variety of report consumer use cases while providing a structured space for analysis.
The primary goal of an analytical report is to help report consumers discover answers to a broad array of questions by interacting with the report and its visuals. Analytical reports often have many slicers to filter report data, and they often contain complex visuals that expose in-depth detail of the data.
Report pages are often expressly designed for interactivity with a focus on UX features. Multiple pathways are often provided for the report consumer to follow, allowing them to explore a topic of interest, share their findings, or return to where they started. Interactive features, such as drill down, drill through, and tooltips, let report consumers peel back layers and add context and detail.
A good example of an analytical report is one that extends beyond the "How are we doing?" type of question to answer the "Why did that happen?" or "What might happen next?" type of questions.
An example of an analytical report at the Contoso Skateboard Store would be a sales analysis report that allows drilling into sales revenue from year, down to quarter, month, and day.
Operational reports
Operational reports are designed to give the report consumer the ability to monitor current or real-time data, make decisions, and act on those decisions. Operational reports can include buttons that allow the report consumer to navigate within the report and also beyond the report to perform actions in external systems. Frequently, operational reports serve as a hub for action that is used by report consumers as part of their daily activity and workload.
This type of report should minimize the number of analytical features to ensure that focus remains on the operation that it's designed to serve. A streamlined user experience is the primary aim for this report type because excessive clicking or illogical flow can lead to high dissatisfaction.
A good example of an operational report is one that allows monitoring of a manufacturing production line. When an unexpected event arises, such as equipment malfunction, a button could allow workers to start a maintenance request.
An example of an operational report at the Contoso Skateboard Store would be an inventory report that informs the report consumer of current stock levels and highlights low stock levels or back orders. It also includes a Submit Order button that allows users to create a purchase order.
Educational reports
Educational reports assume that the report consumer is unfamiliar with the data or context, so the reports must provide clear narrative detail and guidance to help with understanding. This type of report is often used in journalism and by governments to disseminate information to large audiences that have varying levels of understanding of the subject.
A good example of an educational report is one that describes the rollout of COVID-19 vaccination progress and that can be filtered by the home geographic region of the report consumer.
Define user interface requirements
UI requirements relate to how reports are consumed and to the appearance and behavior of reports. Aspects to consider include form factor, input method, style and theme, and accessibility.
Form factor
In the context of report design, form factor refers to the size of the hardware that is used to open reports and to the page orientation (portrait or landscape).
Generally, today's computers are accompanied by large-sized monitors and, occasionally, multiple monitors. Large monitor screens are ideal for viewing reports in web browsers, especially when the reports include several visuals or complex visuals laid out in landscape orientation.
Mobile devices, which are typically used when the report consumer is away from their desk, have a smaller form factor. Phones and tablets display content in portrait orientation by default. Smaller form factors demand a different design approach compared to reports that are designed for larger form factors. Accordingly, when you design for a smaller form factor, strive for fewer and less complex visuals. Also, visuals should be large to help ease viewing and interactivity.
Mobile devices also encompass augmented and mixed reality technologies, where head-mounted devices display reports that are superimposed over the wearer's real-world surroundings.
Occasionally, the design of a report that is intended for a large form factor can be refactored to create a mobile view. The mobile view might contain a subset of visuals from the full-sized report.
Input method
When defining UI requirements, also consider input methods that are supported by devices or applications.
While a computer has a keyboard and pointing device (mouse), mobile devices rely on common gestures, such as tap, double-tap, drag, pinch, spread, or press. Report consumers who are using mobile devices can also use on-screen keyboards, voice control, or barcode and QR code readers. Augmented or mixed reality devices rely heavily on hand gestures or body movement.
Knowing how your report consumers are likely to interact with reports should influence your report designs. Consider the number of visuals that display together on the page, the complexity of visuals, spacing between visuals, and the use of interactive design elements such as tooltips, slicers, buttons, or filters.
When a report is embedded in an application, input can be received programmatically from the application. For example, when the application user opens a page to view history for a specific customer, a filter automatically passes to an embedded report to filter and display sales history for that customer.
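To make this concrete, here's a minimal sketch of how an embedding application might construct such a filter for the Power BI JavaScript API. The filter shape follows the documented basic-filter schema; the table and column names (`Customer`, `City`) and the surrounding application code are hypothetical examples, not part of this module's scenario.

```typescript
// Sketch: building a basic filter for the Power BI JavaScript (embedding) API.
// Table and column names below are hypothetical placeholders.
interface BasicFilter {
  $schema: string;
  target: { table: string; column: string };
  filterType: number; // corresponds to models.FilterType.Basic in powerbi-models
  operator: "In" | "NotIn" | "All";
  values: (string | number | boolean)[];
}

function buildBasicFilter(
  table: string,
  column: string,
  values: (string | number | boolean)[]
): BasicFilter {
  return {
    $schema: "http://powerbi.com/product/schema#basic",
    target: { table, column },
    filterType: 1, // Basic
    operator: "In",
    values,
  };
}

// Filter an embedded sales report down to a single city.
const cityFilter = buildBasicFilter("Customer", "City", ["Seattle"]);
```

In a real embedding app, this object would typically be passed in the embed configuration's `filters` array, or applied to an already-embedded report, so that the report opens pre-filtered for the selected customer or city.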
Style and theme
UI requirements should also consider style and theme. Strive to design reports with a consistent and distinctive appearance that is determined by a deliberate theme.
The report theme should express your organizational branding or aim to complement it. At a minimum, the theme should include the following elements:
A brand mark or logo symbol.
A palette of colors that align to, or complement, organizational branding. The colors should also be sufficiently different so that, when applied to visuals, they provide appropriate contrast when appearing side by side.
Text settings, including font selection, sizes, and color.
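As a sketch, these elements can be captured in a Power BI report theme, which is a JSON file that report authors import into their reports. The color values and font below are hypothetical placeholders for actual brand values:

```json
{
  "name": "Contoso Skateboard Brand",
  "dataColors": ["#0F6CBD", "#E8491D", "#36A24C", "#F2C80F"],
  "background": "#FFFFFF",
  "foreground": "#252423",
  "tableAccent": "#0F6CBD",
  "textClasses": {
    "title": { "fontFace": "Segoe UI", "fontSize": 14, "color": "#252423" }
  }
}
```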
To manage style and theme changes effectively and efficiently, design reports that use images and themes that are stored in a central repository. This approach improves change management: Changes that are applied to the repository can automatically cascade through to reports.
Accessibility
UI requirements should also factor in accessibility. Reports need to communicate to the broadest audience possible, so you should consider how report consumers with low or no vision, or other physical disabilities, can fully experience the reports.
While you might not be aware of report users with accessibility issues today, it's a good idea to design for accessibility from the outset. Accessibility requirements can influence your report designs in terms of form factor, input method, and style and theme.
To support people with low or no vision, consider using:
Clear and large-sized fonts
Well-spaced and large visuals
Sufficiently contrasting colors
Intuitive report navigation that can be operated by keyboard and understood by screen readers
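"Sufficiently contrasting colors" can be checked objectively. The following sketch computes the WCAG 2.x contrast ratio between two hex colors, which accessibility guidance suggests should be at least 4.5:1 for normal text; the helper function names are our own, not part of any Power BI API.

```typescript
// Sketch: WCAG 2.x contrast ratio between two "#RRGGBB" colors, useful for
// validating that theme colors meet accessibility guidance.
function relativeLuminance(hex: string): number {
  const [r, g, b] = [1, 3, 5].map((i) => {
    const c = parseInt(hex.slice(i, i + 2), 16) / 255;
    // sRGB linearization per the WCAG definition
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

function contrastRatio(hexA: string, hexB: string): number {
  const [lighter, darker] = [relativeLuminance(hexA), relativeLuminance(hexB)]
    .sort((a, b) => b - a);
  return (lighter + 0.05) / (darker + 0.05);
}

// Black text on a white background yields the maximum ratio of 21:1.
const ratio = contrastRatio("#000000", "#FFFFFF");
```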
Some physical disabilities can inhibit or prevent report consumers from fully interacting with reports in the way that you intend. When possible, ensure that you offer alternative paths for consumers to follow to achieve the interactive design outcomes. For example, add alt text to visuals for screen reader support, and set the tab order for keyboard navigation.
UX requirements relate to how reports deliver the expected report consumer needs. To assess user experience requirements, you should consider the audience, report type, and the UI requirements.
Other UX requirements can include:
Support for interactions, such as:
Drill up, drill down, or drill through to details.
Navigation within the report or to other reports.
Filters or slicers that can be applied to report visuals, specific pages, or all pages.
Data export as specific data formats, such as Microsoft Excel or a comma-separated value (CSV) file.
Support for ad hoc questions to retrieve a response in the form of a data visualization.
Configuring data alerts to notify people when specific data values change or exceed predefined thresholds.
Links to open webpages.
Actions to open applications, write back data entry values, or trigger workflows.
What-if analysis that allows the report consumer to modify "what-if" values to understand the consequences of different scenarios. For example, what-if analysis could allow consumers to predict sales revenue based on different consumer demand estimates.
Page layouts that can extend over multiple pages and are suitable for printing as multi-page documents.
Printing the report to a physical printer or as a PDF document.
Subscribing to the report so that it can be automatically delivered as a document on a scheduled basis.
Adding commentary, feedback, or engaging in a conversation about the report.
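For the what-if analysis item above, Power BI typically pairs a numeric range (what-if) parameter with a DAX measure that applies the selected value. The following measure is a hedged sketch; the table and measure names (`'Demand Estimate'`, `[Sales Revenue]`) are hypothetical:

```dax
-- Hypothetical what-if measure: scales base revenue by the demand
-- estimate a report consumer selects on a what-if parameter slicer.
Predicted Revenue =
VAR DemandChange =
    SELECTEDVALUE ( 'Demand Estimate'[Demand Estimate], 0 )
RETURN
    [Sales Revenue] * ( 1 + DemandChange )
```

With a measure like this in a visual, moving the Demand Estimate slicer immediately recalculates predicted revenue, letting consumers explore different demand scenarios.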
The following report allows you to explore different report design outcomes for the Contoso Skateboard Company. The report shows the three factors that you should consider: audience, form factor, and other requirements. Select an option for each factor, and then select Go! to reveal a recommended report design.
Possible design recommendations that you can explore include:
Executive > Browser > None
Executive > Browser > Print-friendly
Executive > Phone > None
Executive > Tablet > None, and others