Podcast: Play in new window | Download
Subscribe: Apple Podcasts | Google Podcasts | Spotify | Android | TuneIn | RSS

Dan McCreary has years of experience selling AI solutions to executives. He uses a metaphorical story to show the importance of making your enterprise as intelligent and nimble as possible.
His story of the evolutionary heritage of jellyfish and flatworms seemed to me like a great way to kick off this new podcast.
We talked about:
- the importance of helping an executive audience visualize the benefits of any technical solution, in particular the role of storytelling that will help your message stick
- the jellyfish and flatworm metaphor that he uses to help executives visualize their competitive environment
- how a knowledge graph lets companies build internal maps of their company and environment
- how a knowledge graph can enable micro-personalization
- how adding precision to a model improves your ability to predict customer behavior
- his simple description of embeddings: a way that we find when two things are similar
- his take on the benefits of labeled property graphs over older RDF-based graph models
- the idea of “reference frames” articulated by Jeff Hawkins and how knowledge graphs come closest to modeling them
- how three main ways of representing data – neural networks, knowledge graphs, and reference frames – are all based on graph network models
- the importance of freeing data from spreadsheets to enable the full productivity benefits of AI
- his insight that knowledge representation is the hardest part of AI
Dan’s bio
Dan McCreary is a solution architect focusing on AI and generative AI architectural patterns. In the past, he worked at Bell Labs with the creators of the UNIX operating system, with Steve Jobs at NeXT Computer, and founded his own consulting firm with over 75 employees. His background includes topics such as scale-out enterprise knowledge graphs, high-performance computing, and NoSQL databases. He is the co-author of the book “Making Sense of NoSQL” and is a frequent blogger on AI strategy. He has been closely following the growth of knowledge graphs and generative AI. He is a huge fan of GPT-4.
Connect with Dan online:
Video
Here’s the video version of our conversation:
Podcast intro transcript
This is the Content and AI podcast, episode number 1. As I was getting ready to launch this new show, Dan McCreary shared on LinkedIn a story that he uses to help executives understand why they need a smarter approach to their data and knowledge management. I always appreciate a good origin story – especially when I’m in the process of starting something new – so his comparison of the evolutionary heritage of jellyfish and flatworms resonated with me. I hope you like the story, too, as well as Dan’s take on knowledge representation, which he thinks is the hardest part of AI.
Interview transcript
Larry:
Hey everyone. Welcome to episode number one of the Content and AI podcast. I’m delighted to start off the series with Dan McCreary. Dan is an AI consultant based in Minneapolis, Minnesota in the US. Welcome to the show Dan, tell the folks a little bit more about what you’re up to these days.
Dan:
Thank you very much for having me. I have been working in the field of knowledge representation for most of my career. My background is – early on I did chip design for Bell Labs. I worked in the supercomputing industry, worked for Steve Jobs for a couple of years, and then I've spent a lot of time starting my own companies and consulting. And I just recently left a Fortune 5 healthcare company where I ran a generative AI center of excellence.
Larry:
Nice. Yeah, so you’ve been doing this stuff for a little while and that’s why I wanted to start off. I’ve been thinking of launching this podcast for a couple months now, and I saw this article that you wrote that said, “Aha, there’s my trigger here. Let’s go.” You wrote this brilliant piece that you’ve been kind of shopping around because just for everybody’s background, Dan has been explaining to executives for decades about how to get the most out of computers and computer stuff. And his latest thing – as an AI consultant – is helping people understand where to make smart investments in AI. And so he’s been looking for ways to explain that he came up with this brilliant metaphor of the jellyfish and the flatworm. Tell us about that, Dan.
Dan:
Well, first of all, anybody who’s in technology and has an intimate understanding of how bits move across networks, and then goes into a room full of executives who maybe have a finance background or a healthcare background but can’t visualize the difference between two databases, can get very quickly frustrated trying to guide them. And one of the things that I’ve always learned is that if your audience can’t visualize what you’re trying to explain, they won’t make the right decisions. With AI today, say your executives are pondering a $1 million or $10 million or $100 million investment – they have to be convinced that it’s the right thing. And what’s interesting is they’ll often have you in a one-hour consulting meeting and then they’re going to go away. And the question is, what will they remember? They’re not going to remember data and facts and bytes and bits and all this stuff, but they will remember a story if that story attaches to their emotional memory.
Dan:
We think of emotions as attaching memory to our brain. And so the idea here is to develop a compelling set of stories so that when you’re not in the room, they can talk about it and they can say, “Hey, are we a jellyfish or are we a flatworm?” And then they’ll make the right decisions if you give them the right metaphors, right? That’s the whole thing about being a good thought leader – having really good stories. By the way, they have to be accurate. You can’t just make things up. They have to be able to talk to their friends and say, “Hey, do you understand Dan’s jellyfish and flatworm story?” and have them say, “Yes, that makes sense.” Well, let me just tell you the story real quick. Okay. So in the evolution of animals on planet Earth, about 600 million years ago, we had two animals.
Dan:
One is the jellyfish, and jellyfish live in a very simplified open-ocean environment, so they only need a very simple set of rules: go towards the light, go down in the dark, and hope that by following those simple rules, fish wander into their tentacles. On the ocean floor, though, the world was very different. These flatworms were crawling around, and they had to know how to move towards their prey – they’re often considered the first hunters – and also avoid their predators. So they had to remember things, and they had to remember maps of where the good places to go were and where the bad places were. And they had to have complex rules so that if they turn around, they move away from their predators. So motion is complicated. It really tells you that you have to have a map of the world around you.
Dan:
That’s the flatworm. And most scientists say that the flat one was probably the very first animal to have a central nervous system. And most of other animals that move around their environment evolved from that. So I use that as a metaphor for asking companies to understand, are you floating in a competitive environment that’s simple? Do you sell one product to one consumer and you have no competitors? Right? And there are a few companies that do that, right? They’re specialized manufacturers, they make one part, they sell it to another manufacturer, they get the same contract every year. They’re very good at what they do. They’re so specialized, they don’t have competition. They have a simple world. They don’t need to have a huge massive IT department to simulate their competitive landscape. But in the real world, you have many products. You often sell to many different types of consumers, and every one of your products may have a hundred different competitors.
Dan:
That’s a complex world. That’s not like a simple jellyfish that floats in the open ocean. That’s a flatworm company, a company that has to start to take an understanding of the world around them and build internal maps. And by the way, there are companies that do have IT systems. You’ve got an IT system that runs your website, it has a log of who’s coming to your website. There’s a search field you can store who’s searching for what on your website. You have your customer relationship management system, you have your sales system, you have your commission system, you have your product management system, you have your inventory system. You got all these systems, and they’re all little silos. And what we’ve learned is that if you want to have intelligent agents that help knowledge workers make decisions, putting a hundred data silos in the cloud is not going to help you build intelligent agents that are helping your knowledge workers get their jobs done.
Dan:
What we need is to be like the flatworm where we centralize knowledge in a brain, our central nervous system as it were. And the manifestation of that is a knowledge graph where you can model the complexity of the real world in all of its detail to make good decisions and make good predictions about, hey, if I introduce this product, this is going to be a change in revenue. If I see this change happening in these products, here’s my prediction of why it’s happening. All of those things, if you have all your data spread across these silos, you’re not going to be able to have intelligent agents. So the way I say this is that if you want to be a flatworm company, you need to centralize your knowledge in a knowledge graph and you need to use all of the power of modeling the outside world to precisely predict your customer’s behavior. Does that make sense?
Larry:
That makes perfect sense. And one thing I love about that – I run a lot in the knowledge graph world, and in there the classic use case for a knowledge graph is what you just described, the enterprise knowledge graph, where you essentially have almost a digital twin of your company-
Dan:
There you go.
Larry:
… make decisions, all that good stuff.
Dan:
Customer 360.
Larry:
Yeah, exactly. And I know you have a background in BI and other stuff. That’s a rabbit hole. Let’s save that rabbit hole for another conversation. But I want to try to tie this specifically to content practice in a couple of ways. First, silo busting, or silo spanning, is a concern to a lot of people, especially content systems people – there’s kind of a new content orchestration role emerging in enterprises, driven often by an omnichannel strategy, a need to make sense across a number of previously siloed channel delivery systems. So there’s that, and it seems like a knowledge graph is a great mechanism for connecting content coming from different parts of an organization. One way I think about all the stuff you just described about the enterprise: at some point you’re going to say something about all that, and that’s where content comes in, and I’m like, “Great, you’re saying this about that. I can help you with content.” Can you talk a little bit about how you see knowledge graphs helping content practitioners of different kinds?
Dan:
Absolutely. There’s this whole field now called micro-personalization where we don’t take one set of content and send it to all of our customers. We take into account what did they purchase, when did they purchase it, what did they like, what did they not like? What other things have they purchased, what competitors are they using? And we can now take that knowledge and we can use generative AI to generate content that’s highly personalized to that profile of that user, right? It’s not one content for everybody, it’s everybody generates their own content.
Dan:
And that’s a really good example of that. I think what we’re seeing is that you still need people to create great content, great messaging and things, but the point is we now have tools to take all of your existing content, put it in what’s called a vector database, find the most similar content, and then create a new prompt, find the most similar content, put it through a generative AI tool, and create very, very highly customized content. But where that comes from is often driven by what’s in your knowledge graph. And so you think of content as a product, we’re taking details from your knowledge graph, we’re matching it to those content products. We’re synthesizing new things and we’re customizing and mass personalization and we’re doing it really cost effectively.
Larry:
Yeah. And I don’t want to disappoint you, but that’s the dream and everybody’s talking, but there’s very little of that actually happening. Again, the way you just described it, I’m going to steal that and use that when I’m trying to pitch people on the dream. But one of the things in there that you just mentioned is that that notion of taking what you’ve got, doing some magic with it and automatically generating a customized version for a client.
Larry:
I’m thinking of how we used to publish. It used to be easy. You just wrote a speech and delivered it to an audience, you wrote an article and you did some audience analysis. This kind of, I think we’re more writing – one of the truisms, especially in the technical content world where this structured content designed to do what you just described – is that it’s like you begin to get this very abstract situation where the author of the content who’s designing for an intent and a purpose of that content, it’s separate from the way it ultimately expresses in the world. Help me figure out how knowledge graphs can help manage that enterprise challenge of helping authors do things differently, helping page builders and app builders do things differently.
Dan:
Yeah, I always go back to this metaphor: imagine your customer – a specific customer – lives at a vertex in a connected network. And what you’re going to do is a random walk around that customer to see all of the touchpoints of that customer. What are touchpoints? A call to your call center, a direct mail piece you might’ve sent out that they responded to, an NPS survey, a complaint, a comment. All of those things are touchpoints. And when you put the customer at the center of a circle, look at all of their connections, and traverse those, you’re going to come up with a really good description of your customer.
Dan:
And that description effectively can go in as a prompt to a generative AI system. And the instruction for the prompt is: take this general message and customize it based on these hundred facts that I learned wandering around the customer knowledge graph, and, bang, you’ve generated it. There are many companies doing that exact pattern today, but they’re struggling because they put things in flattened entity-relationship models and stick them in the cloud. They only grab 10% of their real knowledge about a customer. They don’t harvest social media to see their tweets and their LinkedIn posts and all that stuff. They’re only using 10% of the knowledge available, and so their content isn’t going to be competitive.
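Here’s a small, self-contained sketch of the traversal Dan describes – a random walk out from a customer vertex that collects the touchpoints it passes. The graph, node IDs, and edge properties are all invented for illustration; a real system would run this against a graph database rather than an in-memory dictionary.

```python
import random
from collections import defaultdict

# A toy property graph: adjacency list of (relationship, neighbor, properties).
graph = defaultdict(list)

def add_edge(src, rel, dst, **props):
    graph[src].append((rel, dst, props))

add_edge("customer:42", "CALLED", "call:881", topic="billing question")
add_edge("customer:42", "RESPONDED_TO", "survey:NPS-2024", score=9)
add_edge("customer:42", "PURCHASED", "product:A17", channel="web")

def collect_touchpoints(start, steps=50, seed=0):
    """Random walk from the customer vertex, collecting a fact from each edge visited."""
    rng = random.Random(seed)
    facts, node = set(), start
    for _ in range(steps):
        edges = graph.get(node)
        if not edges:
            node = start  # dead end: restart the walk at the customer
            continue
        rel, nxt, props = rng.choice(edges)
        facts.add(f"{node} -{rel}-> {nxt} {props}")
        node = nxt
    return sorted(facts)

# These collected facts become the "hundred facts" fed into the personalization prompt.
print(collect_touchpoints("customer:42"))
```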
Larry:
As you’re saying that you’re reminding me of, we all want better information about how we’re going to serve this customer that we’re talking about. You were talking a minute ago about touchpoints. I do a lot of customer journey mapping and sort of service-design-style work in my current practice, and you’re always doing your best to identify them and you get to a point, but we’re mostly working with personas like a representative blob of people. But to go from that to real personalization, what’s the magic in this technology that permits us to go from designing for an audience exemplified in a persona to addressing the needs of a specific individual? I think it’s like you just said, you walk around them and vectorize it, but is there more to it than…
Dan:
I go back to the jellyfish and the flatworm story. Jellyfish do have simple rules that they work with, but the flatworms that evolved had a precise model of their world. And the whole lesson is that the more precise your model is, the better your ability to predict people’s behavior is going to be. I take that to mean: the more data you gather about a customer, the better you’re going to be able to decide whether or not content is appropriate for them. And AI and vector databases and embeddings – this word “embeddings” has a huge wrapper around it, so let me go through it really quickly.
Dan:
Embeddings are a way that we find when two things are similar – that’s all you have to remember. The old AI was counting and summing, counts and amounts, right? OLAP cubes and BI and analytics. The new AI is all about comparing and similarity. Given a customer, can I find their 100 most similar customers, and can I do that in a hundred milliseconds? That’s what vector indexes – or what we call concept indexes – are helping us do right now. And there are companies doing that right now, and there are new products attempting to do that.
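As a rough illustration of that “find the 100 most similar customers” query, here’s a brute-force nearest-neighbor search in Python over made-up customer embeddings. Production systems would use an approximate vector (concept) index rather than a full scan, but the idea is the same: compare a query vector against everything and keep the closest matches.

```python
import numpy as np

rng = np.random.default_rng(0)
customers = rng.normal(size=(10_000, 192))  # one made-up embedding row per customer
customers /= np.linalg.norm(customers, axis=1, keepdims=True)  # unit length

def top_k_similar(customer_id, k=100):
    """Brute-force cosine search; real systems use an approximate vector index."""
    query = customers[customer_id]
    scores = customers @ query              # cosine similarity, since vectors are unit length
    nearest = np.argsort(-scores)[1 : k + 1]  # skip the customer itself
    return nearest, scores[nearest]

ids, scores = top_k_similar(7)
print(ids[:5], scores[:5])
```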
Dan:
The rule of thumb that I always have, though, is that your ability to execute with precision is going to be based on the precision of your model and your ability to integrate things. Now, just like your brain takes a lot of energy – it’s a 20-watt bulb, it takes a good percentage of our calories to fuel our brain – it costs money to build an integrated knowledge graph, because you have to harvest all this data and bring it together. And it’s not just throwing it in a repository, it’s connecting it together. You can have a website that gathers who’s coming to your website, but if you can’t associate it with the cookie or the ID of the person who came in, you’re not going to be able to tie it together. That takes time and focus…
Larry:
And that starts to get… Oh, I’m sorry. Go ahead. You were going to say.
Dan:
Anyway, the summary is that there’s a very direct correspondence between what you can model and the precision of the predictions you make, and that’s the story of the flatworm and the jellyfish.
Larry:
Got it. Yeah, and just in terms of – not to get too deep into the technical part of it, but I think both knowledge graphs and LLMs rely on embeddings and vectorization and all that stuff to do their thing. But whereas LLMs are just predicting based on occurrence – what’s likely to come next – a knowledge graph has this model. And tell me about that model. Like you described earlier with an enterprise, you described a whole bunch of activities going on in there. As I understand it, that model typically starts with a domain understanding, like some kind of domain modeling or domain discovery. How do you get from that to a knowledge graph?
Dan:
When you start out trying to model your customers, you are going to start to create a data model for that. Anybody who’s trained in entity-relationship modeling, ORM – object relational modeling – or UML, those are all modeling techniques. The modern format of that is called an LPG model, a labeled property graph. And LPG models have this wonderful property that they tend to scale better than anything else. The reason is that every time you create a relationship, you can attach any number of properties to that relationship. In many of the older semantic web and RDF systems, once you had a relationship, if you wanted to add a property to it, you had to break the model, and that made all your queries invalid, right? It didn’t scale well. It was really good for on-the-wire representations, but it didn’t evolve well. And what we’ve now learned is that these labeled property graphs – specifically in-memory native labeled property graphs that are distributed over large clusters – scale really, really well.
Dan:
And as a result, there are many companies now that are storing literally tens of billions of vertices about their customers, updated daily based on those interactions. And they can retrieve all of that information about any one customer in under a hundred milliseconds. You just can’t do that with what are called OLAP cubes, star schemas, snowflake schemas – they all bog down because they don’t really excel at relationship traversal. And once you understand that, your life becomes easy.
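For readers who want to see what a labeled property graph looks like in practice, here’s a tiny sketch using Python’s networkx library as a stand-in for a real graph database. The node IDs and properties are invented; the point is that a relationship carries its own properties, and you can add new ones later without breaking the model or existing queries.

```python
import networkx as nx

# A labeled property graph: nodes and relationships each carry arbitrary properties.
g = nx.MultiDiGraph()
g.add_node("customer:42", name="Pat", segment="enterprise")
g.add_node("product:A17", name="Widget Pro")

# A relationship with its own label and properties.
key = g.add_edge("customer:42", "product:A17", label="PURCHASED",
                 date="2024-03-01", channel="web", amount=129.00)

# Later, enrich the same relationship with another property -- no remodeling needed,
# which is the flexibility Dan contrasts with older RDF-style systems.
g.edges["customer:42", "product:A17", key]["discount_applied"] = True

# Traverse the customer's outgoing relationships.
for _, dst, data in g.out_edges("customer:42", data=True):
    print(dst, data)
```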
Dan:
Now I’m a little different. I’m dyslexic and I can see data moving through systems better than a lot of other people down to the chip level, down to instruction-set level. So I can see all this data happening, and it’s just very frustrating to me to see people who think that they know how to model the world because they’ve used spreadsheets and they realize the human brain doesn’t have any tables in it. We don’t have spreadsheets in our brains. All we have is networks. And once you start to read the readings of Jeff Hawkins and understand reference frames, you realize that you’re never going to be able to create great predictions about your customers by using a spreadsheet or a table in a relational database. Not going to happen.
Larry:
Let’s talk a little bit more about reference frames, because I’m pretty conversant in LLMs and knowledge graphs, but in reference frames is this new idea that you introduced in that article, at least that I hadn’t heard of it before, and one of the things I love about it is that the reference that they’re referring to is human intelligence. This is all about artificial intelligence, and most of my people are human-centered designers, so I’m sure they’re going to be curious about a human-centered guy like Jeff Hawkins.
Dan:
Yes, absolutely. So Jeff is amazing because he’s a brain architect, right? He’s trying to observe the world and infer the structure of our brain from the way he sees the world. He has some wonderful stories in his book, and I have blogs analyzing them. But the idea is: if you pick up a coffee cup and run your finger around its edge with your eyes closed, your brain kind of predicts what’s going to happen. It predicts structure. And his observation is that if we understand how flatworms started out with motion, and how they built structural representations in the neurons of their brain – place cells and grid cells, if you want to go into it – the only way our brain could evolve from that into the neocortex is to use this thing called reference frames.
Dan:
And reference frames are really kind of a knowledge representation that uses graphs, right? The neuron is the fundamental atomic unit, but what it does is associate a thing in our world, like a chair, with a single vertex, a single neuron – and it does this massively in parallel. Then think of 150,000 things voting in parallel: they all come together and vote on, when you see something, what it most probably is. And then that reasoning takes over.
Dan:
So it’s a structure and voting process that happens very much in parallel that Jeff really has. And what I think the future of AI is to get closer to reference frames. And the closest thing we have to that is knowledge graphs. LLMs, to be honest, they’re kind of a parlor trick. They’re not really a representation of the world. They’re a representation of symbols and language we use to communicate the world, which is kind of interesting. They’re certainly ready. You can do cool things, but you’re not going to get that precision that you have if you have a knowledge graph and it’s getting closer towards reference frames. So think of the world as three competing representations, neural networks, knowledge graphs, and reference frames, and the future is a synergy of all three of those systems.
Larry:
And each of those, it sounds like, is a kind of graph-y model. Is one of our challenges, to really benefit from this stuff, to go from sort of this old-school tabular thinking to more of a graph mindset?
Dan:
Yes, I do a lot of stories about what we call the tyranny of tables. Tabular representations have been around since humans started to write on cuneiform tablets 5,000 years ago. So we have 5,000 years of a legacy of trying to represent the world in a convenient form – rows and columns – because it’s a very easy way to write, an easy way to view in a spreadsheet. But you know what? The brain doesn’t do that. There are no tabular structures in our brain.
Dan:
There’s only networks and the three ways of representing data, neural networks, knowledge graphs, and reference frames, none of them use tables. So we have to work very, very consciously to remove tabular representations from our standing of the world. And if we don’t, we are destined to the only model of what we can model in a spreadsheet or a table of a relational database, and great for counts and amounts, don’t get me wrong, right? Ola cubes are wonderful things. If you’re counting the number of sales in each store, that’s not ai, that’s not similarity, that’s not embedding, that’s not generative AI, that’s not content generation based on that stuff. Those all use networks.
Larry:
Okay, so network thinking – are you going to write a book about that or something? It seems like we need help with this, for sure. I also want to say, it’s both network and graph. One of the things about knowledge graphs – I’ve done a lot of entity relationship diagramming and that kind of stuff, and a lot of people in the design field talk about boxes and arrows and connecting things. But graph thinking – I’m reminded all of a sudden of when I first learned, years ago, what a tesseract was. It’s the four-dimensional equivalent of a cube, and somebody, I can’t remember who, in some science fiction described this thing, and I was like, “Whoa, okay. I kind of get that.” And just as you can mimic a 3D cube on a 2D plane, you can mimic a tesseract in a 3D cube. Is that the kind of next dimension we need to add to our thinking to really benefit from this stuff?
Dan:
I love the way you use the next-dimension metaphor. When I try to explain what embeddings are, I always start out with a two-dimensional map: imagine two cities, each with a longitude and latitude – can we find out how close together they are? Then I ask, can you extend that to a third dimension? If something’s higher up in the mountains, it’s farther away, and most people can do that. We can easily do metaphors in two and three dimensions, but it’s harder for our brain to think in higher dimensions. But I assure you the math is the same, and embeddings are just 150 to 200 dimensions of similarity. We let our data scientists or machine learning models figure out how to represent data in those embeddings, and our brains do similar things. Reference frames are about connecting things that are similar.
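To ground the “same math, more dimensions” point, here’s a tiny Python example. The city coordinates are rough, and the “embedding” vectors for chair, couch, and lamp are invented purely for illustration; real embeddings would come from a trained model and have far more dimensions.

```python
import math

def euclidean(a, b):
    """Distance is computed the same way whether a point has 2 coordinates or 200."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two dimensions: rough (latitude, longitude) for two cities.
minneapolis = (44.98, -93.27)
st_paul = (44.95, -93.09)
print(euclidean(minneapolis, st_paul))

# More dimensions: made-up embedding vectors for a few household objects.
chair = [0.12, -0.40, 0.88, 0.05, 0.31]
couch = [0.10, -0.35, 0.90, 0.02, 0.28]
lamp = [0.75, 0.60, -0.20, 0.44, -0.10]
print(euclidean(chair, couch))  # small distance: chair and couch are similar
print(euclidean(chair, lamp))   # larger distance: chair and lamp are less similar
```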
Dan:
In our brains, a chair and a couch both have a neuron, and they’re kind of next to each other, just like we can find the distance between embeddings. So we’re learning more about the brain every year. We’re learning more about neural networks, we’re learning more about knowledge graphs, and as we learn, we’re finding a consensus that shows that if we represent our data correctly, our companies are going to make better decisions and empower these agents. This is all about agent enablement, by the way. The home run here is that every knowledge worker in your company should have 100 intelligent agents they work with, but those intelligent agents are going to be helpless if your data is trapped in spreadsheets on somebody’s desktop. Your job as an architect today is to get the data out of those spreadsheets and into that central knowledge graph, so that intelligent agents can make your knowledge workers more productive. That’s what AI is going to be about in corporations. Now, we’re still years away from Skynet – the knowledge graph becoming conscious – but that is the path forward, and it’s not about spreadsheets and relational databases.
Larry:
Mm-hmm. But we’re pretty close to having a bunch of interns helping us, so that’s good. Hey, Dan, I can’t believe we’re-
Dan:
Good metaphor. I like it.
Larry:
Thanks. Yeah, I can’t believe we’re coming up close to time already. I like to keep these around a half hour. But before we wrap up, is there anything last, anything that’s come up in the conversation or that you just want to make sure we share before we leave?
Dan:
I think the summary is what Jeff Hawkins has in his books: knowledge representation is the hardest part of AI. If you focus just on all the little algorithms and which is the best large language model, you’re missing the point. You need to think about how your organization is going to represent knowledge in a way that lets you get answers out of that knowledge efficiently. If you put that as your number one rule, I think you’re going to be well guided. So that’s my general advice: think of knowledge representation first, before you think of all the bag of tricks that large language models give you.
Larry:
Thank you. I love that, and I think that’s going to resonate with content folks as well. Oh, hey, one very last thing, Dan. If folks want to stay in touch or follow you online, what’s the best way to connect?
Dan:
The best way is just to connect with me on LinkedIn – send me a LinkedIn connection request, just Dan.McCreary or Dan McCreary on LinkedIn. Just search for me there, and I’m sure you’ll find me pretty quickly. I do have a GitHub repository, and I have a lot of things on my Medium blog, but all of those are linked from my LinkedIn account too.
Larry:
Great. Well, thanks so much, Dan. Thanks for helping me kick off this new podcast, and thanks for all your insights.
Dan:
Good luck. I hope it goes really well.