Welcome, everyone, to today's webinar. Our topic here is the value of integrated data quality automation. So as most of you will understand, data profiling and data quality assessments are some of the most essential elements of truly understanding the data within your organization and determining whether that particular data is really fit for use or not. And most importantly, the kind of data that you're playing with-- does it present any potential risks to your organization?
So you need a clear view of the data quality within your landscape: what the quality of data is across your source systems, how the data is being moved from which source to which target, and what transformations are being applied. Do you have good quality data? Do you have bad quality data? Does the data quality need to be enriched within certain of those source systems?
Without having this kind of clear visibility, your IT teams, your data governance teams, and your business users are at [INAUDIBLE] typical disadvantage in understanding what kind of data exists within your organization. And, most importantly, they are still leveraging that data, which results in costly inefficiencies and wrong business decisions across your enterprise.
So on this webinar today, you will learn how the new integrated data quality automation capabilities within the Erwin Data Intelligence product can help you leverage a data-driven data catalog that helps you understand or helps you initiate the need for data profiling and quality assessments within your organization. You will also be able to understand how you can take advantage of automated data profiling and data quality scoring capabilities in order to get a true view of the fitness of your data.
And, most importantly, you will be able to present these data quality measures or data quality scores to all your relevant stakeholders and consumers throughout the data intelligence and discovery journey, so that you are able to understand the quality of data within your organization and visualize the data quality impact, whether you are an IT user, a business user, or a governance user.
Every time you look at a particular source system or at your data lineage, you can understand what the quality of data is as the data is flowing between different systems. And more importantly, if you are a consumer of a particular system, you are presented with the right level of insight into the quality of data within that particular system and whether that particular data is fit for use or not.
So data quality is one of the most important pillars in a data-driven organization. And its value to that organization is further enhanced and deeply enriched when data quality is tightly interwoven with both data intelligence and governance. And the importance of data quality cannot further be [INAUDIBLE] stated based on what is presented here on this particular slide.
So at Erwin, we conducted a state of data governance and empowerment survey with about 2018 business leaders. A set of questions was provided to them, and based on the responses that each of these 18 business leaders provided, the outcome was pretty unanimous on the importance of data quality to their business.
So as you can see over here, about 47% said that defining data governance was as good as ensuring data quality within their organization. About 41% said that improving data quality was one of the top drivers for their data governance programs. And about 45% stated that data quality was one of the top challenges for them in order to maximize their data dominance.
46% said that, again, data quality was one of the biggest bottlenecks in the data value chain. About 55% of folks were interested in automating certain aspects of data quality. And about 90%-- and again, this was one of the most unanimous results that came out of the survey-- believe that data initiatives are improving data quality across organizations.
So as you can see from this particular slide over here, the importance of data quality is only increasing as organizations continue on data intelligence and data governance initiatives. For anything to do with data, with making your organization data-driven, data quality is a very essential aspect of your data initiatives for ensuring that you are on the right track and, more importantly, that you are able to get the right business outcomes.
So this particular survey is available on our erwin.com website, and folks that are interested can go to our website and get this report to further deep dive into each of these specific areas and understand what type of questions were asked and what level of response was provided.
But taking that aspect of data quality, here at Erwin by Quest we understand the importance that data quality has in our data governance and intelligence landscape. And as such, we have introduced a data quality offering in our latest version, 12.0, that is integrated with and driven by our data catalog. With the new Erwin Data Intelligence solution, you can now leverage your data catalog and the metadata in it in order to identify your profiling and quality assessment requirements. You're able to automate your data profiling and quality scoring and also share these data quality metrics across your organization.
So this is not just restricted to your IT teams. With the integration and the data quality offering that we now have in the Erwin Data Intelligence platform, you're able to share these really important insights around your data quality metrics with your IT users, your business users, and your governance users as well. So pretty much everyone in your organization that needs access to data and that needs to understand what the quality of data is within each of your source systems now has that available within the Erwin Data Intelligence platform.
And, more importantly, you're also able to empower your business users with the right level of data quality understanding so that-- again, this is presented in various forms using the data lineage visualization, using the mind map visualization. You look at a particular data source-- you're able to get instant insights with regards to what the quality of data is within each of those systems so that your business users, who are typically consumers of the data, are able to understand what the quality of data is within those systems and then be able to determine whether the data is fit for use and, most importantly, be able to leverage this data in order to make the right business decisions.
So with that I'll just quickly switch over to a quick demo of the platform over here. So in version 12, we have launched a new data quality offering, again, which is completely driven based on the data catalog. So we have seen in the past that certain data quality tools operate in a standalone manner or are not really well integrated with a data catalog. So considering the fact that within the Erwin Data Intelligence platform over here, [INAUDIBLE] this is a data catalog, and we have data literacy capabilities built on top of it, we now took this one level further and integrated the product with a data quality engine.
So within the platform over here, [INAUDIBLE] just within the data catalog, you can see a listing of all your metadata sources. And for each of these sources, you're able to easily understand which of these sources once you harvested the metadata-- so for example, if you go into a particular table or a flat file, et cetera, you can look at the underlying metadata within that particular source system or the data source itself.
And then, depending on which of these needs to undergo profiling, you're able to easily understand the need for a profiling assessment on each of these data sources, and with the click of a single button, you're able to push this to the data quality engine. So if you go into a particular source here, you can enable the data quality sync on any of these relevant data sources. This particular data source gets sent over to our data quality engine.
And once that is done, the profiling is executed on each of your data sources in our data quality engine, which is another module that is built in with the product itself, and you will then be able to see the profiling results. But the [INAUDIBLE] linking or the value is in the fact that you can start this entire process from the data catalog itself.
So you can pick and choose which sources you want to profile and send them over to the data quality engine. The data quality engine does all the work. It processes or profiles all of your data sets. It puts a level of scoring around it. And then all of the scoring is sent back to your data catalog in an automated manner.
So while you are a user within the data catalog platform, you're able to browse through your data sources. And while browsing through the data sources, you're able to visualize what the quality score is on each of these particular data sources. So you have scoring at the column level. You have scoring at the table level. And, most importantly, these scores get rolled up to the environment level as well. So you can understand what the quality of data is within each of those data sets.
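As an illustration of the roll-up just described, here is a minimal sketch in Python. The webinar does not say how the product aggregates scores, so the unweighted average used here is purely an assumption, and the table names, column names, and score values are hypothetical.

```python
from statistics import mean

# Hypothetical column-level quality scores (0-100), keyed by table, then column.
column_scores = {
    "customers": {"customer_id": 100.0, "email": 80.0, "phone": 60.0},
    "orders": {"order_id": 100.0, "ship_date": 50.0},
}


def table_score(columns):
    """Table score as the unweighted mean of its column scores (an assumption)."""
    return mean(columns.values())


def environment_score(tables):
    """Environment score as the unweighted mean of its table scores (an assumption)."""
    return mean(table_score(cols) for cols in tables.values())


# Roll column scores up to the table level, then to the environment level.
table_scores = {name: table_score(cols) for name, cols in column_scores.items()}
env_score = environment_score(column_scores)

print(table_scores)
print(env_score)
```

A real engine might weight columns by importance or row count instead; the point is only that each level is derived from the level below it.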
And then this scoring is based on three different criteria within the platform, but you also have the ability to go into each of these data attributes, your tables, or your columns, and define your own business rules as well and change the thresholds. So depending on the criteria that have been defined by you, you're able to go into each of these aspects over here. So if I just go back to this particular data set and, let's say, I drill down into this CSV file over here and into the properties, you'll be able to go into each of these attributes.
So, for example, if I drill down into the attributes over here, you can get a preview of your data from that particular table or view, in this case. What you can also do is you can go into each of these attributes over here-- let's say I'm interested in the column name attribute. I can drill down into the column name attribute. I can see some statistics around what kind of data each of those attributes contains.
I can go into the property section of that particular table or column. Let's say I'm interested in the column name again. I can go to the property section here, get some more insights with regards to what kind of data this contains. If the attribute contains sensitive data, the tool is automatically able to detect and let you know that based on the profiling, it also finds or has detected some amount of sensitive data.
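To make the idea of column statistics and sensitive-data detection concrete, here is a minimal Python sketch. It is not the product's profiling logic: the chosen statistics, the email regular expression, and the `profile_column` helper are all assumptions made for illustration.

```python
import re

# Deliberately simple email pattern, for illustration only (an assumption).
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def profile_column(values):
    """Return basic profiling statistics for a list of string-or-None values."""
    non_null = [v for v in values if v not in (None, "")]
    total = len(values)
    return {
        "row_count": total,
        "null_count": total - len(non_null),
        "distinct_count": len(set(non_null)),
        "completeness_pct": 100.0 * len(non_null) / total if total else 0.0,
        # Flag the column as possibly sensitive if every non-null value looks like an email.
        "looks_like_email": bool(non_null)
        and all(EMAIL_PATTERN.match(v) for v in non_null),
    }


# Hypothetical sample values for a single column.
sample = ["ann@example.com", "bob@example.com", None, "ann@example.com"]
stats = profile_column(sample)
print(stats)
```

Pattern matching on values is only one possible detection approach; matching on column names or using a classifier would be others.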
You can go into the Rule tab over here. You can add your own rules in a business friendly manner. You can pick multiple conditions, or you can also define complex SQL as well. So if I go into this Department Check here, you can define your business rules based on complex SQL conditions. Or you can define simple if-then-else conditions as well.
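A rule built from a SQL condition plus a threshold can be sketched as follows, using Python's standard `sqlite3` module. The table, the allowed department values, the 90% threshold, and the `run_rule` helper are hypothetical; only the rule name "Department Check" comes from the demo, and the product's actual rule syntax is not shown in the webinar.

```python
import sqlite3

# In-memory database with a hypothetical employees table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", "Sales"), ("Bob", "Finance"), ("Cy", None), ("Di", "Unknown")],
)

# Hypothetical rule: a SQL condition each row should satisfy, plus a pass threshold.
rule = {
    "name": "Department Check",
    "condition": "department IN ('Sales', 'Finance', 'HR')",
    "threshold_pct": 90.0,
}


def run_rule(conn, table, rule):
    """Compute the share of rows satisfying the rule and compare it to the threshold.

    The table name and condition are interpolated into the SQL text, so this
    is only appropriate for trusted rule definitions.
    """
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    passed = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {rule['condition']}"
    ).fetchone()[0]
    pass_pct = 100.0 * passed / total if total else 0.0
    return {
        "rule": rule["name"],
        "pass_pct": pass_pct,
        "ok": pass_pct >= rule["threshold_pct"],
    }


result = run_rule(conn, "employees", rule)
print(result)
```

Here the NULL and "Unknown" departments both fail the condition, so half the rows pass and the rule falls below its threshold.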
So this way you're able to build your data quality rules repository and adjust the thresholds in order to execute these rules on your data sets. And as and when these particular rules get executed behind the scenes, one of the most important aspects is that this entire process is automated. So you pick and choose which data sources need to undergo a profiling assessment. The data quality engine keeps running the profiling behind the scenes. You are able to run this on a periodic schedule, or you can run it on an as-needed basis as well.
So depending on the execution or the frequency of execution, the results from the data quality engine are periodically sent back to the data catalog. And where the value further increases for your organization is the fact that while you're browsing through the data catalog, you're able to visualize these data quality scores.
And, more importantly, as a business user or as a consumer of data within your organization, if you go into our data discovery module, where you can go in and start searching for assets or browsing for the different assets that you have within your organization, you can come in over here.
Let's say, I'm interested in my customer data. So I can just come here to search for my customer data and then drill down into the relevant area of my charts. So let's say I'm interested in a particular table over here. I can drill down into Customer over here, and [INAUDIBLE] this particular customer data, I can see different representations of customer.
So from here, let's say I'm interested in this particular customer data over here. I can click this particular customer data set. I can see what the quality score is, what the impact score is on each of these data sources. More importantly, let's say I go and run the data lineage. I am running or visualizing the data lineage-- let's expand this so that I can see this at a table level-- so on the data lineage visualization, I can enable the data quality score.
So when I do this, I can instantly see how the quality of data is as data is being moved between different sources or between different parts within your ecosystem. So from this visualization here, I can understand that the data quality for a particular source system is around 65%. Then it moves to about 80%, but then it drops to 30% over here. So there is a big red flag for you over here. And you would need to further analyze why the quality of data is this bad within this particular source system. But then, again, due to some minor transformations or due to some further enrichment of data, you're able to again bring it back to about 65%, while this particular BI report over here does not have a quality score on it.
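The kind of check a reader performs visually on that lineage view can be sketched in Python. The scores 65, 80, 30, and 65, and the unscored BI report, come from the demo; the stage names, the threshold of 50, and the `flag_hops` helper are hypothetical.

```python
# Lineage path as (stage name, quality score) pairs. Scores follow the demo;
# stage names are hypothetical. None means no score is available.
lineage = [
    ("source_system", 65),
    ("staging", 80),
    ("warehouse", 30),
    ("data_mart", 65),
    ("bi_report", None),
]

# Hypothetical threshold below which a stage is treated as a red flag.
THRESHOLD = 50


def flag_hops(lineage, threshold):
    """Return stages that have no score or whose score is below the threshold."""
    flagged = []
    for name, score in lineage:
        if score is None:
            flagged.append((name, "no score"))
        elif score < threshold:
            flagged.append((name, "below threshold"))
    return flagged


flags = flag_hops(lineage, THRESHOLD)
print(flags)
```

With these inputs, the 30% stage is flagged as below threshold and the BI report is flagged as unscored.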
So this instant level of data quality visibility is now available for you as you visualize your data lineage. And similarly, if you are a business user that is more interested in the business understanding of the metadata itself, let's say I'm someone that's interested, again, in customer, I can pick out a customer in this case and run the business mind map visualization on a particular attribute. And even on this visualization, depending on which of these tables, columns, or data sources have a data quality assessment, wherever there is a data quality score, that score is available to you as part of your visualization.
So let's say I'm further interested in this Northwind data source over here. I can click on the Northwind data source. I can see what the quality score is. And each of these, again, has hyperlinks that will directly take you to the data quality engine so you can further deep dive into those respective areas.
So this way, just to recap, with version 12, we have a new data quality offering that pretty much lets you decide which are the data sources that you want to profile. Everything is driven and initiated from the data catalog itself. So that way, there is a tight integration between the data catalog and your data quality engine. You're able to pick and choose which data sources need to undergo a profiling assessment so that you can understand the quality of data within each of those data sources.
And, furthermore, you can also go into other areas of defining your own business rules, executing those business rules, and tweaking your thresholds. And, more importantly, if you're someone who wants to go further, beyond just the data profiling phase itself-- you want to go into cleansing and remediation, or you want to go into defining more complicated rules and being able to run them on a periodic schedule, et cetera-- you can do all of this using the data quality component as well.
That's pretty much what is summarized on this particular slide over here-- I'll just pull this back up. With the new data quality offering, you are able to profile and score your data attributes and your data sources. You can also take it beyond just the profiling and scoring. You are able to monitor and observe your data as it changes over a period of time. You are able to do trend analysis within the platform to see how the data has changed over the last six months or over the last year. You are also able to remediate the data and correct the data within the platform.
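A very simple form of the trend analysis mentioned here is to compare each scored snapshot with the previous one. The Python sketch below assumes monthly snapshots; the dates, the score values, and the `month_over_month` helper are hypothetical and do not reflect how the product stores or charts its history.

```python
# Hypothetical monthly quality-score snapshots for one data source.
history = [
    ("2023-01", 62.0),
    ("2023-02", 64.5),
    ("2023-03", 61.0),
    ("2023-04", 70.0),
]


def month_over_month(history):
    """Return (period, change versus the previous period) for each snapshot after the first."""
    return [
        (cur[0], round(cur[1] - prev[1], 1))
        for prev, cur in zip(history, history[1:])
    ]


deltas = month_over_month(history)
# Net change between the first and last snapshot.
net_change = round(history[-1][1] - history[0][1], 1)

print(deltas)
print(net_change)
```

A negative entry marks a month in which quality dropped and may be worth investigating.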
So all of this is available to you as part of the Erwin Data Intelligence v12 platform, where we have a new data quality offering built into the product. And with this new offering, we now have an expanded portfolio which contains a data catalog, a data quality solution, and a data literacy solution as well. And when all of these come together with our standard data connectors and smart data connectors, the value that you get out of the platform is further enriched.
So on the Erwin Data Intelligence product over here-- so with this expanded offering into data quality, we are now able to combine our data catalog, a data quality, and a data literacy engine in order to support both your IT and business needs.
With that, we come to the end of this webinar here today, which focused on data quality and the value that it adds when you automate it via a data catalog. Also, as previously mentioned, you are able to download the state of data governance report from our erwin.com website. So for folks that want to go and download that report and get more insights into the survey that was done, you have the report available on our website.
And based on the webinar today, if you would like to get a full demo of the new offerings that are available within the Data Intelligence platform, please reach out to your sales counterpart or just go and register on our website so you are able to get access to a full demo environment or a full offering of the Data Intelligence platform.
And also, a quick reminder-- we have our annual Quest Empower summit on November 1 and 2 this year, and we have a great lineup of speakers to talk about their respective data initiatives. So to know more and book your seat, I request that you go to quest.com/empower in order to register. This is a virtual event, so you can go and register on the site. We still have slots available. So I do recommend that you go to our website, which is quest.com/empower, and book your seat.
So with that, I'd like to thank you for your time today on this data quality webinar.