There are many books and posts that talk about the so-called revolution of artificial intelligence, or AI. I appreciate their enthusiasm. However, I don’t think it is really being recognized that much of what has been invented can be used in practice to greatly reduce the time and effort needed to create the kinds of things we have had to create for decades. Those things are essential to the creation of AI, which is predicated on learning and context.
One of the things we will address is the proclivity to use a great deal of language, much of which is not very semantic. Today folks are describing architectures that already have descriptions. We have a mix of marketing terms and creative terms that mean the same thing but reuse words from the past. It’s causing confusion.
Just one example is the term dimension. It meant one thing to me 10 years ago; today it is being used in a different context. In AI, is it a dimension, a parameter, or a feature? Many people have spent many years in IT, and to them language is important. It is semantic.
Another is the term similarity. It is used completely differently in AI than in traditional fuzzy matching. True, the concept is the same, but the technical use is different.
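To make the difference concrete, here is a minimal sketch that scores the same word pairs both ways. The fuzzy score uses Python's standard `difflib.SequenceMatcher`, which compares characters. The AI-style score is cosine similarity over vectors. The three-number "embeddings" are invented purely for illustration and do not come from any real model.

```python
import math
from difflib import SequenceMatcher


def fuzzy_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]: the traditional fuzzy-matching view."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors: the view used with AI embeddings."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional "embeddings", invented for illustration only.
# A real model would produce hundreds or thousands of dimensions.
embeddings = {
    "physician": [0.9, 0.1, 0.3],
    "doctor": [0.8, 0.2, 0.3],
    "docket": [0.1, 0.9, 0.2],
}

# Fuzzy matching looks at spelling.
fuzzy_doc_docket = fuzzy_similarity("doctor", "docket")
fuzzy_doc_phys = fuzzy_similarity("doctor", "physician")

# Vector similarity looks at position in the embedding space.
cos_doc_docket = cosine_similarity(embeddings["doctor"], embeddings["docket"])
cos_doc_phys = cosine_similarity(embeddings["doctor"], embeddings["physician"])

print(f"fuzzy  doctor/docket={fuzzy_doc_docket:.2f}  doctor/physician={fuzzy_doc_phys:.2f}")
print(f"cosine doctor/docket={cos_doc_docket:.2f}  doctor/physician={cos_doc_phys:.2f}")
```

With these made-up vectors the two measures rank the pairs in opposite order: spelling says "doctor" is closer to "docket", while the vectors say it is closer to "physician". Same word, similarity, but a different technical use.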
There is no doubt about the benefit of what has been created through the use of neural networks and transformers. They can have a tremendous positive impact on delivering business intelligence with the aid of artificial intelligence, machine learning, and, subsequently, deep learning.
I have been deeply involved in business intelligence, data quality, data profiling, and MDM and Data Governance for several decades.
I would like to take you on a journey and help you exploit all these capabilities, both today’s and yesterday’s. We are experiencing an evolution in what we are doing. It is not a revolution. If anything, I hope to help achieve a basic understanding of the terminology used in information architecture and of the various techniques we have that will help. Frankly, nobody has a corner on the best approach; it has all been done before at the logical level. I want to be a part of helping us leverage, reuse, and apply what we have been doing for decades to what has been introduced in the last several years. You have to judge among the three goals: cheaper, faster, better. All we can guarantee is cheaper and faster. It’s up to you, not necessarily the technology, to make it better.
I’d like to offer some advice on transitioning the skills and knowledge you have worked hard to retain so that they include some of the new AI and LLM developments. It’s actually less impactful, and better, than you may think.
We’ll address that in a little bit. For now, let’s talk about prompt engineers. You most likely have current SMEs, or experts, on your current data and requirements.
First, did you notice I used the term AI scientist instead of data scientist? A data scientist today is actually an AI model scientist, and they will help you apply the models. We are concerned here with how many folks already hold opinions that are heuristic, not necessarily fact-based. We are going to suggest some techniques and provide some mentoring to explore this. An important factor in AI is proper training. We specialize in providing techniques and mentoring for separating information, which is formulated and is an opinion, from facts or data, which cannot change.
There is a series of steps involved in preparing data for use in AI and chat in the form of LLMs. This is not much different from what you do today, and you may already have most of the information gathered. It is important to realize that these steps are critical if you are to have confidence in your model’s output, which will be the result of integrating your Word documents, your presentations, your spreadsheets, and, of course, your actual data.
We will distinguish the modeling of information (words) from the modeling and requirements for data preparation. There is a difference that is extremely important and in line with what you’ve been doing. I know that data preparation is not glamorous, but in my 20+ years I have learned that you’ll get nowhere without proper data preparation. You can’t AI it, and you can’t impute it. You need to discuss requirements with people, write them down, and then execute them. AI will make the legwork faster, but in the end you’ll have to review it; otherwise you may end up needlessly retracing your steps because of improper preparation. This can be avoided by following the proper steps. In order to properly design the requirements for your model, we would collect the following:
1. Word document
2. Presentations
3. Spreadsheets
4. Data reports
5. Data quality report for AI preparation
6. Internet
7. Other sources (network, Internet, or local)
We have suggested tools, techniques, and open source options for each of these. Don’t let that bother you. However, with today’s capabilities in AI, it is important to integrate your words, your thoughts, your abstractions, and your actual data together in order for you to get trustworthy results from your AI.
We will be providing a separate post on each of these and then, finally, on how they come together. Our point is that what you’ve been doing to understand and form requirements for traditional BI can be reused and extended for AI.
We in IT have complicated and diluted the concept and process of analyzing data and business metrics incredibly in the last few decades. We seem to be focusing on the word data.
“There is a subtle difference between data and information.”
There is a subtle difference between data and information. Data are the facts or details from which information is derived. Individual pieces of data are rarely useful alone. For data to become information, data needs to be put into context.
The history of temperature readings all over the world for the past 100 years is data.
If this data is organized and analyzed to find that global temperature is rising, then that is information.
The number of visitors to a website by country is an example of data.
Finding out that traffic from the U.S. is increasing while that from Australia is decreasing is meaningful information.
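As a small illustration of the step from data to information, the sketch below takes raw visitor counts and derives a trend per country. The numbers and the `trend` helper are hypothetical, made up for this example only.

```python
# Hypothetical monthly visitor counts by country (invented numbers).
# On their own these are just data.
visits = {
    "US": [1200, 1350, 1500],
    "Australia": [800, 720, 650],
}


def trend(series):
    """Label a series by comparing its last value with its first."""
    if series[-1] > series[0]:
        return "increasing"
    if series[-1] < series[0]:
        return "decreasing"
    return "flat"


# Putting the data into context (change over time, per country)
# is what turns it into information.
information = {country: trend(counts) for country, counts in visits.items()}
print(information)
```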
Often data is required to back up a claim or conclusion (information) derived or deduced from it.
For example, before a drug is approved by the FDA, the manufacturer must conduct clinical trials and present a lot of data to demonstrate that the drug is safe.
Because data needs to be interpreted and analyzed, it is quite possible — indeed, very probable — that it will be interpreted incorrectly. When this leads to erroneous conclusions, it is said that the data are misleading. Often this is the result of incomplete data or a lack of context.
For example, your investment in a mutual fund may be up by 5% and you may conclude that the fund managers are doing a great job. However, this could be misleading if the major stock market indices are up by 12%. In this case, the fund has underperformed the market significantly.
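The arithmetic behind that example is simple enough to sketch. The function name `excess_return` is my own label for "fund return minus benchmark return", expressed in percentage points.

```python
def excess_return(fund_return: float, benchmark_return: float) -> float:
    """Fund performance relative to the benchmark, in percentage points."""
    return fund_return - benchmark_return


fund = 5.0     # the fund is up 5%
market = 12.0  # the major indices are up 12%

gap = excess_return(fund, market)
if gap > 0:
    verdict = "outperformed"
elif gap < 0:
    verdict = "underperformed"
else:
    verdict = "matched"

print(f"The fund {verdict} the market by {abs(gap):.1f} percentage points.")
```

The 5% figure alone is data. Only with the 12% benchmark as context does it become information.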
Synthesis: the combining of the constituent elements of separate material or abstract entities into a single or unified entity (opposed to analysis, the separating of any material or abstract entity into its constituent elements).
The simple action of linking data file metadata names to a business glossary of terms will result in deeply insightful and informative business analysis.
“Analysis: the separating of any material or abstract entity into its constituent elements.”
For a business manager to perform analysis, you need to be able to start the analysis with understandable business terminology.
You then need to provide the manager with the ability to decompose, or break apart, the result.
There are three essential sets of capabilities and associated techniques for analysis and lineage.
I have been in this business over 45 years, and I’d like to offer one example of the power of the concept of a metadata mart and lineage as it relates to business insight.
A lineage, information and data story for BCBS
I was called on Thursday and told to attend a meeting on Friday between our company’s leadership and the new Chief Analytics Officer. He was the prototypical “new school” IT director.
I had been introduced via LinkedIn to this director a week earlier as he had followed one of my blogs on metadata marts and lineage.
After a brief introduction, our leadership began to speak, and the director immediately held up his hand. He said, “Please don’t say anything right now. The profiling you provided me is at the kindergarten level, and you are dishonest.”
The project was a 20-week, $900,000 effort, and we were in week 10.
The company had wanted to do a proof of concept and better understand the use of the Informatica Data Quality (IDQ) tool, as well as get direction for a data governance program.
To date, what had been accomplished was an accumulation of hours of effort and billing that had not resulted in any tangible deliverable.
The project had focused on the implementation and functionality of the popular vendor tool and on canned data profiling results, not on providing information to the business.
The director commented on my blog post and asked if we could achieve that at his company. I, of course, said yes.
Immediately, I proposed that we use a methodology that would allow us to focus on a top-down process of understanding critical business metrics and a bottom-up process of linking data to business terms.
My basic premise was that unless your deliverable from a data quality project can provide you business insight from the top down, it is of little value. In essence, you’ll spend $900,000 to tell a business executive they have dirty data, at which point he will say to you, “So what’s new?”
The next step was to use the business terminology glossary that existed in Informatica Metadata Manager and map those terms to source data columns and source systems, which is not an extremely difficult exercise. However, this is the critical step in providing a business manager with the understanding and context of data statistics.
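The mapping itself can be pictured as a very small data structure. The sketch below is not how Informatica Metadata Manager stores it; it only illustrates the idea, and every term, system, table, and column name in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMapping:
    """One link between a business glossary term and a physical column."""
    business_term: str
    source_system: str
    table: str
    column: str


# Hypothetical mappings, invented for illustration.
mappings = [
    ColumnMapping("Subscriber SSN", "CLAIMS", "MEMBER", "SSN_NBR"),
    ColumnMapping("Subscriber SSN", "ELIGIBILITY", "SUBSCR", "SOC_SEC_NO"),
    ColumnMapping("Dependent Age", "ELIGIBILITY", "DEPENDENT", "DEP_AGE"),
]


def columns_for_term(term):
    """Top down: start from a business term and find its physical columns."""
    return [
        (m.source_system, m.table, m.column)
        for m in mappings
        if m.business_term == term
    ]


def term_for_column(source_system, table, column):
    """Bottom up: start from a column and find its business term, if any."""
    for m in mappings:
        if (m.source_system, m.table, m.column) == (source_system, table, column):
            return m.business_term
    return None


print(columns_for_term("Subscriber SSN"))
print(term_for_column("ELIGIBILITY", "DEPENDENT", "DEP_AGE"))
```

Once this link exists, any statistic computed on a column can be reported under the term a business manager already understands.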
The next step was the crucial one: we made a slight modification to the IDQ tool so that the profiling results could be stored in a metadata mart and a business dimension from the business glossary could be associated with the reporting statistics.
We were able to populate my predefined metadata mart dimensional model by using the tool the company had already purchased.
Lastly, by using a dimensional model, we were able to allow the business to apply their current reporting tool.
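For readers who have not seen a metadata mart, here is a minimal sketch of the general shape, using an in-memory SQLite database. This is not the model from the project and not IDQ's own schema. The table names, column names, and numbers are all assumptions made up for illustration. The point is only that profiling statistics sit in a fact table, and the business term is a dimension any reporting tool can group by.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A tiny star schema: two dimensions and one fact table (hypothetical names).
conn.executescript("""
CREATE TABLE dim_business_term (
    term_id   INTEGER PRIMARY KEY,
    term_name TEXT NOT NULL
);
CREATE TABLE dim_source_column (
    column_id     INTEGER PRIMARY KEY,
    source_system TEXT NOT NULL,
    table_name    TEXT NOT NULL,
    column_name   TEXT NOT NULL
);
CREATE TABLE fact_profile_result (
    term_id       INTEGER NOT NULL REFERENCES dim_business_term(term_id),
    column_id     INTEGER NOT NULL REFERENCES dim_source_column(column_id),
    rows_profiled INTEGER NOT NULL,
    rows_failed   INTEGER NOT NULL
);
""")

# Invented sample rows standing in for profiling output.
conn.executemany(
    "INSERT INTO dim_business_term VALUES (?, ?)",
    [(1, "Subscriber SSN"), (2, "Dependent Age")],
)
conn.executemany(
    "INSERT INTO dim_source_column VALUES (?, ?, ?, ?)",
    [
        (10, "CLAIMS", "MEMBER", "SSN_NBR"),
        (11, "ELIGIBILITY", "SUBSCR", "SOC_SEC_NO"),
        (12, "ELIGIBILITY", "DEPENDENT", "DEP_AGE"),
    ],
)
conn.executemany(
    "INSERT INTO fact_profile_result VALUES (?, ?, ?, ?)",
    [(1, 10, 1000, 40), (1, 11, 500, 10), (2, 12, 800, 8)],
)

# Top down: statistics summarized by business term.
by_term = conn.execute("""
    SELECT t.term_name, SUM(f.rows_profiled), SUM(f.rows_failed)
    FROM fact_profile_result f
    JOIN dim_business_term t ON t.term_id = f.term_id
    GROUP BY t.term_name
    ORDER BY t.term_name
""").fetchall()

# Drill down: break one business term apart by source system.
drill = conn.execute("""
    SELECT c.source_system, f.rows_failed
    FROM fact_profile_result f
    JOIN dim_business_term t ON t.term_id = f.term_id
    JOIN dim_source_column c ON c.column_id = f.column_id
    WHERE t.term_name = ?
    ORDER BY c.source_system
""", ("Subscriber SSN",)).fetchall()

print(by_term)
print(drill)
```

Because it is an ordinary dimensional model, the two queries are the same kind of summary and drill-down a BI tool already knows how to generate.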
Within six weeks we provided an executive dashboard based on a metadata mart that allowed the business to reassess their plans involving governance and a data lake.
Upon realizing the issues they faced in their business metrics, they accelerated the data governance program and postponed the data lake until a future date.
Here are some of the results of their ability to analyze their basic data statistics, now mapped to their business terminology:
6,000 improperly formed SSNs
35,000 dependents of subscribers over 35 years old
Thousands of charges to PPO plans outside the counties they were restricted to
Mysterious double counts in patient eligibility counts. Managers were now able to drill into those counts by source system and find that a simple Syncsort utility had been used improperly and had duplicated records.
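The drill-down in that last finding can be sketched in a few lines. This is not what was run on the project. The source system names and member IDs are invented, and the logic simply counts how many extra copies of each record appear per source system.

```python
from collections import Counter

# Hypothetical eligibility records as (source_system, member_id) pairs.
records = [
    ("ELIG_A", "M001"),
    ("ELIG_A", "M002"),
    ("ELIG_B", "M003"),
    ("ELIG_B", "M003"),  # duplicated record
    ("ELIG_B", "M004"),
    ("ELIG_B", "M004"),  # duplicated record
]


def duplicate_counts_by_source(rows):
    """Count surplus copies of each record, grouped by source system."""
    seen = Counter(rows)
    extra = Counter()
    for (source, _member), n in seen.items():
        if n > 1:
            extra[source] += n - 1
    return dict(extra)


print(duplicate_counts_by_source(records))
```

In this made-up data every surplus record comes from one source system, which is the kind of pattern that points you at a single misused load step rather than at the data as a whole.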