Building the Infrastructure to Win the AI Race
Transcript
SUMMARY KEYWORDS
AI adoption, generative AI, national security, data normalization, LLMs, DoD challenges, AI infrastructure, multi-agent systems, GPU availability, data fusion, commercial applications, AI agents, policy reforms, dual-use technology, productivity focus.
SPEAKERS
Brian, Akhil, Maggie
Brian 00:00
We're going to need to figure this out faster than our adversaries on how to utilize LLMs coupled with agents right in order to advance mission as fast as we possibly can.
Maggie 00:12
Welcome to the Mission Matters Podcast, a podcast from Shield Capital where we explore the intersection of technology startups and national security. I'm Maggie Gray,
Akhil 00:22
And I'm Akhil Iyer,
Maggie 00:23
And we are your hosts from the investment team at Shield Capital. In this podcast, we discuss the technical challenges of developing and deploying commercial technology to national security customers, as told from the founders’ perspective. In this episode, we're joined by Brian Raymond, the CEO of Unstructured, a startup building the future of data tooling for generative AI, transforming unstructured data into structured data that can be used by AI models. It's common wisdom that AI models are only as good as the data they're trained on and the data they have access to. However, much of the world's most valuable data is not stored online in a structured way that AI models can access. Rather, it's stored in PDFs, PowerPoint slide decks, emails, photocopies of paper documents, satellite images and other hard-to-process data formats.
Akhil 01:25
Brian saw this phenomenon firsthand while working in government, in the Central Intelligence Agency and on the National Security Council, and then again as a VP at Primer AI, which is a startup building natural language processing tools for the US government. Brian was inspired to build Unstructured in order to accelerate the adoption of AI and a lot of the tooling needed for both the public and private sectors.
Brian, so excited, man, for you to join us here today. Really excited about what you've built already and what's yet still to come. And we're excited to just dive right in. So I'd love to first maybe chat about this: you came from a super sexy background, the Central Intelligence Agency, and here you are digging through the dirty laundry of the AI infrastructure ecosystem. What got you excited about working on those types of really unsexy piping and infra issues as you were thinking about what to do next and building your own company?
Brian 02:22
Thanks for having me, and excited for the discussion today. It was a really, let's call it, non-linear journey. I spent several years as a CIA analyst, did a few tours overseas, left government after a stint on the National Security Council, and after business school ended up at a company called Primer AI, where they were doing some really interesting work with natural language processing and knowledge graphs. I joined the company very early on, right after they closed their Series A, and had really a front-row seat to AI in action. This is in like the 2018/2019 timeframe. At the time, natural language processing and language models more broadly were having a bit of a renaissance after being dormant for several years. And so there was a lot of excitement about how to use these models and a lot of potential for accelerating various missions. We were working at the time with companies like Walmart, Goldman Sachs and others. And then we also had an In-Q-Tel work program, and so I had the privilege of helping lead a lot of the work related to that In-Q-Tel work program, where we were deploying these models, model pipelines and knowledge graphs into classified spaces, and saw firsthand just the friction of trying to leverage these incredibly powerful models, which are tiny models now in retrospect, in conjunction with mission-relevant data. And so after four years at Primer I was looking at this problem space, and I said, look, there's this really ugly, nasty problem that no one else is really well positioned to solve at the far left-hand side of the equation, which is: how do you take raw enterprise data and rapidly render it in usable formats so you can use it in conjunction with these incredible models? That was really how Unstructured was born.
Akhil 04:20
That's awesome, Brian. For those who may not know, can you briefly explain what a work program is? But more importantly, as you were looking at that type of program, how unique or differentiated was their messy data infrastructure portion relative to some of the commercial companies you had looked at?
Brian 04:39
That's a great question. So the In-Q-Tel model is one in which they invest in dual-use technology companies, primarily companies that already have great traction in the commercial space and are fairly early on, and as part of their equity investment comes a body of work to help more closely tailor that dual-use technology to networks that handle sensitive information and to workflows that are more relevant to the national security or defense space. So you take that base application and introduce some amount of additional work on it to make it more deployable and usable in those applications. In our case at Primer, the work program lasted almost three years. It was an extensive body of work with an incredible outcome that led to a multi-year contract coming out of it. And a lot of the manual work for that two-and-a-half, three-year work program was around taking raw XML data and then transforming that into chunked JSON. I mean, if that's not enough to put you to sleep, I don't know what is, but this was really a problem from hell, because each document, each bit and byte of data, was non-standard. And so we'd have to hard-code all these regex pipelines, these regular-expression pipelines, to try and transform it, and then maybe a document layout might change, and then it'd break the pipelines, and then you'd have poor inference. Then you'd spend all this money on this knowledge graph, and you'd have to go back and redo it. It was really a lot more manual than a lot of the magic around AI would lead you to believe. The good news with that, though, is that the XML-to-JSON problem in this specific instance is one everyone kind of deals with. They deal with PDF to JSON. They deal with dozens and dozens of different file types to get data into the lingua franca for language models, for LLMs, whichever it may be, which is generally chunked, embedded JSON or Markdown or HTML. But in any of it, you need to render it into a canonical schema in a standardized format.
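To make the fragility Brian is describing concrete, here is a minimal, hypothetical Python sketch of that kind of pipeline: a hard-coded rule pulls fields out of non-standard XML and renders them into a canonical chunked-JSON schema. The element names, schema fields and regex are illustrative only, not Primer's or Unstructured's actual format, and the comments mark where a layout change silently breaks things.

```python
import json
import re
import xml.etree.ElementTree as ET

# Hypothetical example of the pattern described above: raw, non-standard XML
# whose layout can change at any time and silently break hard-coded rules.
RAW_XML = """
<report>
  <hdr title="Daily Summary" date="2019-03-02"/>
  <body>
    <para>Convoy activity observed near the northern checkpoint.</para>
    <para>Weather delayed resupply by 6 hours.</para>
  </body>
</report>
"""

# A hard-coded regex "pipeline" of the kind that breaks when the layout shifts,
# e.g. if the title moves into its own element instead of an attribute.
TITLE_RE = re.compile(r'title="([^"]+)"')

def to_canonical_chunks(raw_xml: str) -> list[dict]:
    """Render raw XML into a canonical, chunked JSON schema (illustrative fields)."""
    root = ET.fromstring(raw_xml)
    title_match = TITLE_RE.search(raw_xml)
    title = title_match.group(1) if title_match else "UNKNOWN"  # silent failure point
    chunks = []
    for i, para in enumerate(root.iter("para")):  # breaks if <para> becomes <p>
        chunks.append({
            "chunk_id": i,
            "text": (para.text or "").strip(),
            "metadata": {"source_title": title, "source_format": "xml"},
        })
    return chunks

print(json.dumps(to_canonical_chunks(RAW_XML), indent=2))
```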
Maggie 06:54
Yeah, Brian, thank you so much for sharing that background on Primer. You've obviously been working in the national security space for a long time and been working at the intersection of AI and national security for a long time as well. So I'd really be curious to hear, when you started back in the government and at Primer, what was really the state of AI adoption within DoD and AI infrastructure within DoD, and how has it changed over the last decade or so to where we are today?
Brian 07:28
It’s a really interesting topic, because around 10 years ago you saw the Natural Language Toolkit, NLTK, and a range of other models begin to emerge. But it wasn't really until 2017, when transformer models were introduced, that you saw the industry around this begin to explode, as well as the use cases that were reachable by machine learning begin to emerge. Shortly after that, Project Maven was formed, right? And you saw significant investments around object detection and image classification models, so really around computer vision, that sort of work; there was a lot less around the language models. And a few of the reasons are that, around computer vision, there was an immediate need for this in Afghanistan at the time, and a lot of work going on in Syria around taking drone feeds, for example, and being able to rapidly classify targets in them. It was a great application for the tech at the time, and so they were able to put money against labeling data, fine-tuning models, deploying these models at scale, and then nesting those within different mission applications.
On the language side, there was intense interest. But one of the main challenges with these models is that, if you think of it like horse blinders, they had very small token windows, or context windows. Today we often see context windows of 50,000 tokens, a half million or several million tokens, so you can put an entire book in that context window, think ChatGPT, and ask questions of it. Back then, the context window was around 15 to 20 words. And so what you were able to do with it was not really take a huge amount of data and then ask questions of it easily. Instead, what you had to do was structure that data using these models, so pull out named entities or topics, or do different classification tasks, and then index all of that and then do Boolean queries against it. That is as slow and boring as it sounded coming out of my mouth right now, but it held a lot of potential for interesting things, right? And so the limitations on the models, call it pre-2022, relegated a lot of the natural language use cases to really interesting proofs of concept that weren't necessarily scaled in the same way that these computer vision use cases scaled out of Maven and across DoD.
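As a rough illustration of the pre-LLM pattern Brian describes, the sketch below structures the text first (a stand-in extraction function in place of a real NER or classification model), builds an inverted index over the extracted fields, and then runs Boolean queries against it. The corpus and all names are invented for illustration.

```python
from collections import defaultdict

# Toy corpus standing in for documents that a small-context-window model
# could only process a sentence or two at a time.
DOCS = {
    1: "Port activity increased near Gwadar; two cargo vessels arrived.",
    2: "Cargo vessels departed Karachi after a brief weather delay.",
    3: "No unusual activity reported at the northern border crossing.",
}

# Stand-in for per-sentence model outputs (named entities, topics). A real
# pipeline would call an NER / classification model here.
def extract_fields(text: str) -> dict:
    entities = [w.strip(";.,") for w in text.split() if w[:1].isupper()]
    topics = ["maritime"] if "vessels" in text else ["ground"]
    return {"entities": entities, "topics": topics}

# Build an inverted index over the extracted structure, then run Boolean
# queries against it: the "slow and boring" but workable approach.
index = defaultdict(set)
for doc_id, text in DOCS.items():
    fields = extract_fields(text)
    for term in fields["entities"] + fields["topics"]:
        index[term.lower()].add(doc_id)

def boolean_and(*terms: str) -> set[int]:
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()

print(boolean_and("maritime", "karachi"))  # -> {2}
```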
Maggie 10:24
So then, what are some of the different ways you're seeing people in DoD actually adopt AI today, and what are some of the big use cases that are popping up that are delivering real value to end users?
Brian 10:38
We’re about two and a half years into the LLM generation of machine learning, right? Or this most recent epoch. ChatGPT was released in November of 2022, and then through the first half and throughout 2023, industry as a whole, as well as data scientists in government, started to try and understand, “Okay, how could we use this to engage with our data in new and interesting ways?” And so a new term has now proliferated: RAG, retrieval-augmented generation. I believe the first RAG deployment was done in January 2023, like two months after ChatGPT was released, on a classified network, and since then most of the use cases have been knowledge related. So really accelerated search, as well as some light levels of automation.
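For readers unfamiliar with the term, a minimal retrieval-augmented generation loop looks roughly like the sketch below: embed chunks of your data, retrieve the most similar chunks for a question, and stuff them into the prompt. The embedding function here is a deliberately crude bag-of-words stand-in, and the final LLM call is stubbed out, since the actual model and provider vary by deployment.

```python
import math

# Stub embedding: a real deployment would call an embedding model; a
# bag-of-words vector keeps the sketch self-contained.
def embed(text: str) -> dict:
    vec = {}
    for tok in text.lower().split():
        vec[tok] = vec.get(tok, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[t] * b.get(t, 0) for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

CHUNKS = [
    "The maintenance manual requires a 200-hour inspection of the rotor assembly.",
    "Travel vouchers must be submitted within five business days of return.",
    "The rotor assembly inspection checklist includes a torque verification step.",
]
INDEX = [(embed(c), c) for c in CHUNKS]  # "bring your data to the model"

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return prompt  # a real RAG app would send this prompt to an LLM

print(answer("What does the rotor assembly inspection involve?"))
```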
Some of the things that hindered adoption, at least from 2022 through, call it, the end of last year, were, one, the political climate. There were huge amounts of anxiety among the previous administration, at the highest level of the government, with adopting a forward-looking Gen AI strategy. Instead, there was a huge amount of effort to study it, to measure risk, to mitigate risk, rather than operationalize the usage of it. What that resulted in was a proliferation of bottoms-up initiatives. And if you go on LinkedIn any day, you see lots of discussion on this, but you see NIPRGPT, CamoGPT, SOF Chat, as well as a range of other applications across the defense space that are all aimed at bringing your data to these models and being able to not only query it, but hopefully help summarize it, automate it, maybe draft new knowledge products in the process. That really defined, I'd call it, 2023 and 2024. The limiting factors there were model availability, GPU availability, as well as budget and support among service chiefs, COCOM chiefs and others, which really wasn't there. So you saw a thousand sprouts kind of spring up from enterprising units across the landscape using open source tools as well as what limited commercial tools were available.
Maggie 13:34
On that topic of where people are really seeing success adopting this technology in DoD, what would you say are some of the unique challenges of adopting this technology within DoD compared to just a traditional enterprise, and where are there similarities in those challenges?
Brian 13:55
I'd say the use cases are almost indistinguishable, from what we were talking about around chat-with-your-data type applications to trying to, we'll talk more about this later, deploy agents across your data in order to automate, to go from kind of copilot to autopilot in a lot of use cases. That's similar. What's different is that DoD is great at buying, let’s call it, solutions, right? They're not great at buying infrastructure. And so compare that to if you go work with an investment bank or a large CPG company, a Fortune 500 in any case.
The way that Gen AI has unfolded has been: the CEO told the CIO or the CTO or the Chief Data Officer, “We've got to go figure out Gen AI.” Somebody got made the Gen AI czar back in 2023, and then the Gen AI czar cast about for use cases across the organization, and big companies typically identified between 50 and 100 use cases. And then what do they do? They put resources across those, and so they'll resource them with people, with compute, with models, with data, and then they'll graduate those that have the most potential into production, and then scale them, right? And so you have a CIO who's got a relationship with one of the hyperscalers from a cloud standpoint, they have a relationship with Anthropic, OpenAI, Google Gemini, whatever it might be, right? And they're able to make those infrastructure investments that are required to vector resources across lots of these use cases.
In contrast to that, in the federal space, especially the defense space, you do not have that. You do have cloud contracts, right, with models that are riding on those, but you don't have CIOs who have directives from service chiefs to go find those use cases and provide them with compute, with models, with data, with support in terms of human capital, in order to identify which of those is going to hold the most promise against mission sets. Rather, what's happening is that they're using SBIRs. They're using $5 or $10 million contract vehicles. They're using individual seat licenses or having individuals buy tokens to utilize open source or more universal, DIY, do-it-all kinds of platforms, in order to try and work their way up. And so I'm optimistic that that might change in the future, but I think it's going to take a lot of concerted effort, because what you have here that you don't necessarily have in the commercial space is ATOs. In the commercial space you might have one ATO-like process to go through: if you want to work with JP Morgan, you've got to get your models approved, you've got to get through all the security scans, but there's just one CISO that you've got to work with, unlike in the government.
You would have thought that CDAO would have solved this, or would have had the agency to solve this through Alpha One and various initiatives, but that never happened. Instead, what they did was more experimentation, which is what the government's great at. There's almost no infrastructure, so to speak, being allocated from the actual DoD, from the Pentagon down, in order to enable Gen AI use cases. And so that creates a very different environment for a commercial company trying to operate with these customers compared to the private sector.
Akhil 17:47
Brian, this is an awesome topic. I love that we can dive into a little bit of this. Certainly, many of our companies at Shield Capital, really at the application layer, are looking to do this, and they want democratized, easy access to models, compute and infrastructure so they can focus on the application built with their customer. So two questions for you based upon what you just said. One, what's the ballpark: how many months or years is the government and DoD behind even the most highly regulated commercial space when it comes to some of this AI infrastructure, from a maturation standpoint?
Brian 18:24
I would say two to three years. I would say they’re still in 2023 kind of land right now, Akhil. I'm not casting stones, but from the ability to procure and the ability to take action – you know, there are lots of folks with a vision, a lot of folks with deep technical expertise, but in their ability to actually get things done, I would say that we're at least a couple of years behind, if not further. I think it's going to take Congress, and it's going to take the current administration really focusing on this intensely if they're going to solve it, and it'll probably require legal reforms as well as policy reforms within DoD. We're already seeing some great work being done with the CIO shops around this, but it's the tip of the iceberg. It's going to require a whole lot more work to really move at a pace that we haven't seen previously.
Akhil 19:13
Yeah, thanks, Brian. Let's say you had to make one call or spend one dollar to best accelerate AI infrastructure in the government. What would that go to?
Brian 19:23
It would go somewhere around something that works really, really well in the commercial space, Akhil: application or infrastructure companies like Unstructured working in tandem with hyperscalers in order to rapidly access customers. And that means a few different things. That means, one, network access. So for example, with Azure, AWS, GCP, we're able to execute private offers through their marketplaces and then just deploy our containers easily into customers' VPCs. Related to that, we're also able to transact using those marketplace mechanisms very effectively. Flipping it around, on the customer side, they can access us and other emerging technologies really easily. But then they can spin up large clusters of CPUs and GPUs, and they have other cloud infrastructure services that are readily available. And so I think right now there's a huge amount of friction in actually capitalizing on the potential of these hyperscalers for transact-ability, for network-ability, for data availability, as well as all those other ancillary services that are required around this. You've got this sports car that you can't really take out of the garage in some ways. And I don't really want to go around heaping praise on hyperscalers, but in this case they really have reduced friction in the private sector. It's had almost no impact in the public sector, though: despite them having marketplaces and these things, they're just not operating in remotely the same manner that they are in the commercial space.
Akhil 21:06
Yep, that totally makes sense. Brian, I know we've been a little critical here of the government space over the last couple of minutes, but obviously all of us here, you, Maggie, me, are in this business because the mission matters, and we're excited by the applications. Can you share maybe a more positive story? What was one deployment on the public sector side, or maybe even a mission-critical commercial side, that was really exciting for you, either from a technical aspect or from an impact aspect?
Brian 21:34
Early on? So we were like a three-month-old company, three months post slide deck, and I was wandering around AUSA trying to make friends, and I bumped into Nick Frazier from USASOC. I was chatting about what we were doing, and he's like, I actually have an army of contractors hard-coding pipelines exactly like what your team had to do at Primer as part of your work program. So can we do some sort of design partnership and just start working?
And so our open source project, actually, we don't really advertise this and we don't have it on the website, but the fingerprints all over that are actually from us being in Slack together with USASOC AI division's developers on real-world use cases. That open source project now has been downloaded almost 35 million times. It's used in almost every single Gen AI application in government. It's used by 85% of the Fortune 1000, and it was actually built in tandem, sitting side-saddle with the folks down at Fort Bragg. And so that was incredibly exciting to see, and to see all of the different use cases that folks have built around our open source in government. On the commercial side, we're in some really interesting spaces right now, working with a range of companies, from airlines to finance to biotech and pharma.
A really interesting customer that we have is actually Nestle, who's headquartered just there in Northern Virginia, and they are making some huge investments in generative AI and in agentifying their AI stack. And so we are in a really unique position to deliver data to their agents, which are not only helping with internal knowledge management work, but also creating new products, helping identify market opportunities and new product strategies for them. And so it's really kind of incredible for me to wake up every day and see all the different ways that our kind of foundational infrastructure is being used to populate a range of different use cases across these industries.
Akhil 23:59
That’s awesome. Totally awesome, Brian, and awesome to see where you've come from those early days to now. I want to ask one question on the Nestle point before turning over to Maggie to maybe dive a little bit more into the technical aspects. And that is maybe a broader question: Nestle, how are they thinking about the return on investment with something like this?
Brian 24:17
I would say a common theme, what's almost universal among the use cases on the private sector side, is that it's almost all productivity focused, right? Because you could focus on driving revenue, or you can focus on being more efficient with cost centers. And so they're doing some really interesting work around procurement and legal. They're doing some really interesting work around research and development and utilizing generative AI to deliver near-term returns in those areas. That doesn't mean that there aren't revenue opportunities. We've seen companies like Writer and Typeface and others leap out to huge, huge valuations by helping accelerate marketing. We’ve got companies like Apollo and others that are using generative AI to help make sellers more efficient. But I think that the cost centers in these companies are so large that it's not necessarily about laying people off and reducing workforce. It's more about: how can you 2x or 3x your current headcount by allowing them to be a lot more efficient, and maybe hold that OPEX constant as the business continues to grow?
And so that's something that's very similar across sectors. There isn't really a revenue side in defense, I guess maybe tip-of-the-spear type stuff. But we're working with some of the testing centers, for example, that, as part of their work, are responsible for putting together incredibly complex deliverables and packages for higher-ups as they're testing munitions or different aircraft, whatever it might be. And those R&D-type centers, right? They're so manual, they're so paper focused, that they're really ripe for disruption with generative AI. And you see a lot of demand coming from that as well.
Maggie 26:21
So Brian, I wanted to turn to talk through what it actually looks like to build a generative AI app for a customer like the Department of Defense. You know, let's say I have an amazing RAG app. I've done all my user interviews. People love it. They desperately want it. I want to be able to chat with my documents, and maybe some of that data that I want to be able to chat with is classified. It's going to have to be deployed on classified networks. I guess my first question on that: what does it actually take to get the data I want to chat with into a state where it can be used by this model? And how do I actually keep that data up to date as new data is constantly being developed?
Brian 27:07
Yeah, that's really kind of the central question right now. There's a bunch of different approaches or strategies that you're seeing play out in the space. So you have companies like us, where we're selling a piece of infrastructure. We're doing the normalization, the extract-transform-load of that raw data, and then we're doing a baton pass to a database, typically a graph or a vector database. And then there's a framework, typically, utilized to orchestrate between an LLM and the data that's in that external memory and that's being continuously updated, in order to deliver that capability. And so in that case, these are large investments by data teams, usually data engineering teams as well as cloud infrastructure teams, who want to own each layer of the cake. They want to own the data, they want to own the data structures, they want to own the data storage and manage that, and that's incredibly important to them.
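A minimal sketch of that baton pass might look like the following, using the open-source unstructured library to normalize and chunk a file and then handing the records to a vector store. The import paths reflect recent releases of the library and may differ by version, and the vector-store client is a hypothetical stand-in rather than any specific database API.

```python
# Sketch of the "baton pass": normalize raw files into chunked JSON with the
# open-source unstructured library, then hand the chunks to a vector store.
from unstructured.partition.auto import partition       # file -> document elements
from unstructured.chunking.title import chunk_by_title  # elements -> chunks

def normalize(path: str) -> list[dict]:
    elements = partition(filename=path)   # PDF, DOCX, HTML, email, ...
    chunks = chunk_by_title(elements)     # section-aware chunking
    return [
        {"text": chunk.text, "metadata": chunk.metadata.to_dict()}
        for chunk in chunks
    ]

def baton_pass(records: list[dict], vector_store) -> None:
    # vector_store is hypothetical: any graph or vector database client that
    # accepts text plus metadata and handles embedding and indexing downstream.
    for record in records:
        vector_store.upsert(text=record["text"], metadata=record["metadata"])

# Usage (illustrative):
# records = normalize("quarterly_report.pdf")
# baton_pass(records, vector_store=my_vector_db_client)
```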
Now another delivery model is one like Ask Sage, like what Nick's done. There, he's used our open source with some other tooling, taking Weaviate as a vector database, and then he has done a lot of work around the frameworks for how those LLMs orchestrate with that data in that external knowledge base to deliver value: ATO-in-a-box, chat-your-data, whatever that might be. And so it's almost pre-packaged, and he's done a lot of that layering for you, but you have a lot of adjustability. You can choose which models you use, you can choose how your data is indexed. There's a lot there, and it's really almost a B2C motion, right, where we're a pure business-to-business motion. Nick's delivering a really interesting product that's like a B2C motion.
Then you have companies like Legion, formerly known as Yurts, Primer, and a range of others that are going kind of far beyond what Nick's doing in terms of vertically integrating that stack. They may have proprietary chunking strategies, proprietary embedding models. They might be using things like graph RAG, so they might be using graph databases alongside vector databases. They may have optimized their retrieval models and re-rankers with their embedding models in order to optimize it. A corollary in the commercial space is companies like Contextual, which is doing a very similar thing, where they're doing a lot of the extreme kind of fine-tuning of all the different parameter settings on all of this in order to deliver a very exquisite sort of user experience. But they own the whole box, right?
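The retrieval optimizations Brian mentions often take the shape of a retrieve-then-rerank pattern, sketched below with stand-in scoring functions: a cheap first pass pulls candidate chunks, and a more expensive re-ranker (in practice a cross-encoder or similar model, stubbed here) reorders them before they reach the LLM.

```python
def first_pass_score(query: str, chunk: str) -> float:
    # Crude lexical overlap standing in for an embedding-based retriever.
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / len(q) if q else 0.0

def rerank_score(query: str, chunk: str) -> float:
    # A real stack swaps this for a cross-encoder scoring (query, chunk) pairs.
    phrase_bonus = 0.5 if query.lower() in chunk.lower() else 0.0
    return first_pass_score(query, chunk) + phrase_bonus

def retrieve_and_rerank(query: str, chunks: list[str], k: int = 20, final_k: int = 3) -> list[str]:
    # Cheap first pass over everything, expensive second pass over the top k.
    candidates = sorted(chunks, key=lambda c: first_pass_score(query, c), reverse=True)[:k]
    return sorted(candidates, key=lambda c: rerank_score(query, c), reverse=True)[:final_k]
```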
And so there are a few different models here on how to engage, and you're seeing the same thing in the commercial space, where you can invest in a vertically integrated stack and just buy the box, or be one of the teams that want to own each layer in the same way that they have, say, with the modern data stack, where they may have bought the ETL/ELT provider, like Fivetran or dbt, indexed into, say, Databricks, Snowflake or BigQuery, and then used, say, Microsoft Power BI on top, and then they have sovereignty over each layer of the cake. And so there's a lot of parallels between how this is playing out today and how it played out with structured data over the last five or 10 years.
Maggie 31:01
So once I have all my data set up, all ready to go in whatever database of choice, what models do you have access to on the high side to actually then plug into that data?
Brian 31:17
This is a good question. This is maybe one that you all would have some thoughts on as well. I mean, I think increasingly you're starting to see the two big labs have their models somewhat available on Secret, like on SIPR, or on JWICS, right, with Anthropic and OpenAI models. Now, how available those are, how much volume and quota, right? It's even a challenge for us in the commercial space to get quota to do what we're doing. And so, though they may be available, they might just be available for experimentation. And you know, these things are incredibly GPU hungry, and so the size of the clusters that they're running on in each respective network is the hard constraint on throughput, right? And then how many applications are trying to hit that model creates real scarcity for the ability to not only experiment, but also put these applications into production.
Maggie 32:17
And then what is the state of compute hardware that's available on the high side for, I guess, both storing this data and these databases, as well as actually running these models?
Brian 32:32
It really depends. So there have been, and continue to be, investments in GPUs, but there's a big but, and the but is, you have federated networks, right? And so something that may be available to the Navy, or might be available to the Air Force or to NSA, doesn't necessarily mean it's available for your application that's for an Army PEO, right? And so the total number is growing, but it's incredibly federated.
But it also comes back to an earlier point we were talking about, which is, where's the budget for that coming from? Is there a CIO shop that's creating budget and allocating it? I hear lots of stories about different units or wings that want to go and use it, but then they need to have a minimum commit to a hyperscaler of, say, a million dollars over 12 months for some fixed set of nodes. And so that's a real friction point, right? And that friction doesn't necessarily exist in the commercial sector, just because GPUs are more plentiful, right? And the demand's a lot more elastic, or the supply is more elastic.
Maggie 33:52
To what extent have you started to see the adoption of actual agents doing some amount of tool use – both within Enterprise and then within the Department of Defense?
Brian 34:06
Enterprise: in the last six months, it's exploded, and particularly in the last 90 days it's absolutely exploded. We've seen barriers fall with agent experimentation. We've seen frameworks mature, and then really interesting paradigms, like MCP, begin to proliferate that will make it easier for agent-to-agent communications as these multi-agent systems mature.
In the defense space, we're seeing desire. We're seeing some experimentation, but it is incredibly constrained compared to what we're seeing in the commercial space. There's a lot of advertising going on, and a lot of fostering and positioning going on, but the ability to train and deploy models is mainly being done in sandboxes right now. But there are really interesting use cases, and they hold a lot of promise.
Maggie 35:08
What do you think will be the first couple use cases of agents, in particular, for the Department of Defense?
Brian 35:15
I think a lot of them are going to be extremely boring things right now, right? And I don't think that's a bad thing at all, because there's a lot of boring bureaucracy that's required in order to enable that giant war-fighting machine that is the US military. So things like what Advana has been focused on, right: passing the audit, workforce management, things like that. They're just perfect for agents.
Right now, the sexier things that will probably get the Wired articles are probably going to be around rapid mission planning, right, closing the OODA loop faster with agents. There was a really interesting blog that was written back in February by Nikita, the CEO of Neon, which just got acquired by Databricks, and it showed the number of Neon databases that were being created each day by agents versus by humans. And you saw this enormous J curve, where you saw tens of thousands of these databases being created by agents every day, and that just accelerating. You see numbers like lines of code written with Cursor every day; you see that same sort of J curve. If they do make these tools available and they resource them, you're going to see the pace of innovation around kind of the point of the spear really, really accelerate. It's going to take data fusion. It's going to take the infrastructure. But it's going to be really, really exciting to see what the folks that are close to these missions are able to come up with when these tools are in their hands.
Maggie 36:53
So it's going to start with just filling out paperwork and whatnot, and then slowly move closer and closer to the battlefield.
Brian 37:00
Absolutely.
Maggie 37:02
Brian, I'm curious. You've been doing this for a long time. What does it take to build trust with the Department of Defense to get them to trust this technology? You know, particularly if we want to start deploying AI agents that have the ability to use tools that might actually change something in the world?
Brian 37:21
I think that, for a lot of the senior leaders, it's not that they're distrustful necessarily today, it's that what they need to be able to buy is a capability, and the Department's not really well positioned, and the capital markets aren't really well positioned either, to deliver capabilities for them. So let me talk about that for a second, contrasting it with the commercial sector we talked about earlier. They'll have some $50, $100, $500 million a year commit with a hyperscaler or a multi-cloud strategy. They'll allocate resources across dozens of teams. Those teams will effectively build capabilities, right? And they can experiment with them rapidly, and then they can scale those that are successful. In contrast, in the defense space, you have that sort of thing with SBIRs a little bit, and with these other various kinds of funding lines, but none of them are large enough in terms of a critical mass of funding to actually demonstrate the capability. And it takes so long over the period of performance of these things to deliver it, that what it does is incentivize capital markets to fund defense tech companies that will package everything up into a bespoke capability, right, that they could then go sell. I think Vannevar has done a great job of this. Companies like Legion and others are doing this as well. And so I think it's really around packaging up a really tight capability for them to go and fund. That's something that fits within the funding paradigm in DoD, but that isn't really a natural motion in the commercial sector.
Maggie 39:06
Where have you seen glimmers of success within the US government in terms of AI adoption? Are there any groups or initiatives within DoD that are particularly forward leaning?
Brian 39:16
I think most of it these days, at least from where we sit, which is primarily around natural language data: Army, Air Force, and Special Operations. We're not seeing as much yet from the Navy, from Space Force, from the Marine Corps. Within Army, we see a lot of experimentation going on around Linchpin, on C5ISR, around various initiatives there in order to do data fusion that kind of filters into TITAN.
On the Air Force side, you have so much technical talent spread across ACC, AMC, AFTC, Air Force Research Labs that what you've ended up with is like a huge amount of experimentation. And from that, they've sort of begun to develop really tight theses around where they think they're going to be able to accrue value and go out and start getting tools that they want.
And then with SOCOM, it's really around the teams at Fort Bragg, around Army Special Operations Command and their AI division. They've been a leader from the beginning on this, at least as a partner to us, and they've, I think, been a leader across the Department of Defense. And Frazier in particular demonstrated incredible technical depth, but also that initiative that you really see and expect from SOCOM, to move quickly and to mature quickly. And so those are really the three areas where we're seeing the most pull these days.
Maggie 41:00
What do you think has made those organizations successful, and what can the rest of the DoD and US government learn or take from those initiatives?
Brian 41:10
I think on the SOCOM side, it's that you resource and empower really hard chargers, and you set aggressive goals, right? What they had there, though, was compute, network, data, and human capital. Those are the four things, right, that we were talking about earlier that are needed to be successful in the commercial space, and General Braga and then others later on really made it a priority to resource that. And so that set the conditions for success.
You've had that to a lesser extent with Air Force, but they've been able, like the enterprising Airmen, have been able to piece that together in really interesting ways across different parts of it.
And then on the Army side, you see a really strong mission set there around AI, and really aggressive goals around that, though they have slightly different mission sets and they're buying it and working in different ways. And so I would say the combination of resources at SOCOM, deep talent in the Air Force, and, you know, mission on the Army side have, at least from where we're sitting, accelerated those three organizations.
Akhil 42:33
Hey, Brian, we all know that at some bottom-up level, folks are just going to figure it out, right? Airmen, sailors, soldiers, Marines, Coast Guardsmen, obviously Space Force too. Your critiques with regards to scalability and the importance of a top-down strategy are still sticking with me. Folks are just figuring out how to get data, compute, and talent at the lower levels. And so I wanted to return to this whole idea you were talking about. Specifically, could you talk about one or two key policy changes the DoD needs to make that would better support the scalability of some of these bottom-up initiatives?
Brian 43:10
Look, I think that there's been way too much discussion (that's my hot take) on contract vehicles over the last couple of years, and not nearly enough discussion around those factors that you just talked about. So you have these huge companies that have been intensely lobbying for reforms on acquisition. That's fine. Let's do it. Let's keep making progress there. That's not going to be sufficient, though, to take you where you need to go.
I firmly believe in the mission of an organization like CDAO, or the CDAOs for each of these respective organizations. My preference would be centralized, because I'm selfish and I'm in the commercial sector, and I could have one person to sell to who could be responsible for the infrastructure, for the network access, for the compute availability, for the model availability, and for getting the data into those architectures as well. That mission needs to be solved. And so if it's done at the command level or at the service level, fine, but I think that mission set for CDAO is a valid one, despite the execution flaws that we've seen over the last few years.
Akhil 44:30
Yep, that totally makes sense, Brian, and to be fair, correct me if I'm wrong, that's not necessarily selfish of you. I think what we're looking at also is what is in the interest of the users and the customers trying to build applications, and what I believe should be the case is that they should be spending less time trying to connect things and more time trying to figure out what is the right application and what is the right type of usage given the compute constraints, whether they're operating on edge or cloud or wherever. That's what they should be focused on, unless I have that wrong.
Brian 45:00
Yeah, absolutely right, absolutely right. I think that any day that's spent trying to get a hold of data or trying to marshal compute resources is a day wasted that could have been spent on mission, right? And so that's just friction that needs to be eliminated.
Maggie 45:16
Brian, let's say tomorrow Congress decides to appropriate $10 billion or $50 billion or something to CDAO, they call you in to give them advice on how to spend that money. How would you advise them to spend a new large budget?
Brian 45:36
Have more than one or two authorizing officials for ATOs – so that's number one. My advice always is to go hunt for bottlenecks, right? God bless them, but they have like two people there that can actually sign ATOs. Go solve that. Okay, that's one. Two, go make sure that there's GPU availability across all the regions and all the federated networks. Three, provide an easy way to transact. Tradewinds is not the easy way to transact. Allow marketplaces and consumption-based contract vehicles; make it as similar as possible to the private sector. And four, go make this stuff available for almost nothing to the folks at the ground level who are closest to the mission problems, right, who are in the best possible position to drive that innovation, and then scale the hell out of that as quickly as you can, right?
Akhil 46:39
I had one follow-up question on the agent piece. You mentioned some of the bottlenecks, whether it's MCP or something else. You know, there are some commanders out there that are really excited about that agent-to-agent future. You talked about wanting to drive down bottlenecks: what's the one or two bottlenecks, maybe specific to agentic maturation in government, that you think need to be addressed, and addressed fast, or something that the commercial space is already pretty far ahead on?
Brian 47:06
Look, I think MCP clients and servers have proliferated over the last several months, and that's been great. I think that those by themselves are not enough for these multi-agent systems to be successful. I think it's really well-defined and capable APIs, coupled with MCP or other emerging frameworks, that enable multi-agent systems. Defining those protocols for how these systems are going to talk to each other, this next-generation technology, is really going to be the linchpin to unlocking the efficacy of multi-agent systems, and especially multi-agent systems coupled with these multi-step, deep reasoning models.
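To illustrate what "well-defined APIs coupled with MCP" can look like in practice, here is a schematic tool declaration and a JSON-RPC-style call in the general shape that MCP uses. The tool name, fields and values are hypothetical, and this is an illustration of the pattern rather than a verbatim rendering of the protocol spec.

```python
import json

# Schematic of a well-defined tool exposed to agents: the tool is declared with
# an explicit input schema, and an agent invokes it with a JSON-RPC-style
# message. Field names follow the general shape of MCP; the tool itself is
# invented for illustration.
TOOL_DECLARATION = {
    "name": "get_readiness_summary",        # hypothetical tool
    "description": "Summarize unit readiness for a given date range.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "unit_id": {"type": "string"},
            "start_date": {"type": "string", "format": "date"},
            "end_date": {"type": "string", "format": "date"},
        },
        "required": ["unit_id", "start_date", "end_date"],
    },
}

TOOL_CALL = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {
        "name": "get_readiness_summary",
        "arguments": {"unit_id": "1-508", "start_date": "2025-05-01", "end_date": "2025-05-31"},
    },
}

print(json.dumps(TOOL_CALL, indent=2))
```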
Maggie 47:51
I was going to ask, you know, one word you keep saying, Brian, is experimentation. DoD is really good at experimentation. What does it take to actually move beyond experimentation and actually bring this technology to the end user?
Brian 48:08
Yeah, I think this comes back to familiar challenges, right? Let's take all the GIDE challenges, the Global Information Dominance Experiments that were done. You were able to find sandbox network environments, you were able to get sandbox data, you were able to have it for a finite amount of time, and you didn't have to worry about changing people's job titles or upskilling folks that had been doing manual processes. And so if you want to actually scale these things, you need them to be on real networks, so you solve ATO problems. You need to have real data. You need to have someone that's sufficiently invested in this that they're willing to upskill their workforce.
And so in the private sector, we've seen really a few phases. We saw a proliferation of proofs of concept with Gen AI, then a consolidation of those that they were going to go scale. And then everyone thought, okay, the next phase is production. Wrong. There's actually this third step here in a four-step act that folks didn't really anticipate, which was that they needed to go and rethink all their people ops. So what are the job criteria, what are you actually doing, what are the processes inside these organizations, so that they're actually well positioned to capitalize on this tooling? Otherwise you just have a cool app that sits idle and isn't effectively utilized. That requires top-down cover, top-down pressure, right? And top-down resources, if you're going to actually make that happen. And so that's actually a really critical step here: you've got to have the data, you've got to have the network, you've got to have the models. But these are actually humans that are using this. They're using it to accomplish a task, right? And you need to think deeply at an organizational level about how you're actually shaping those job functions and those processes so that you can actually capitalize on these tools.
Maggie 50:11
What are some of the most common mistakes that you've seen startups make when trying to build and deploy AI for the Department of Defense?
Brian 50:19
The most common mistake is to get sucked into abandoning dual-use technology. It's really, really easy to land some initial contracts and to end up as a software vendor that has loose, if any, product-market fit in the commercial space, but great product-market fit within one particular part of one service. Now, there are companies that have done that and done it incredibly well, but I could probably count them on one hand. Vannevar is the most recent, right? They've crushed it, so a tip of the hat to Vannevar for doing that. That is very, very difficult to replicate for most others. The rule of thumb that I tell a lot of the other CEOs I talk to about doing business in this space is: if you can maintain a single, unified code base for your commercial clients and your public sector clients, then you're on the right track. Now, if you're forking it and having to Frankenstein things, then you should really take pause and think about the scalability, the speed, and the breadth of customers in the public sector that you can actually go sell to, versus the commercial space, because the public sector is a great place to land. It's a great place to steer a mature business. In my opinion, it's an absolutely horrible space to try and scale a venture-backed startup.
Maggie 52:02
What would you say is your most controversial hot take about AI and or AI infrastructure?
Brian 52:09
I'm not sure if I have hot takes here, other than I would put myself in the extreme optimist camp around agents and around how quickly agents are going to be useful. And so, while I'm optimistic about the tech, I'm also worried about what that portends in terms of competition. I think we should utilize it to drive innovation and drive productivity increases as fast as we can. You can bet that the Chinese are going to be using that as well, for offensive purposes. And so this warming that we've had between DC, and the Pentagon in particular, and Silicon Valley, I hope that continues. I hope that deepens, and I hope that accelerates, because we're going to need to figure this out faster than our adversaries: how to utilize LLMs coupled with agents in order to advance mission as fast as we possibly can.
Akhil 53:17
I love that optimism, Brian, about where things are headed on agent-to-agent collaboration and the AI front. I know you have some pointed critiques, but I know they come from a place where we want to make the ecosystem better and we want to advance against the global competitors. Speaking of global competitors, Brian, where are we vis-a-vis the PRC and others? Obviously, there's plenty of news out there about R1, DeepSeek, et cetera, but you're seeing it also not just from the model stack, but from the infrastructure stack.
Brian 53:47
Look, I think that there was a broad consensus in 2023 and 2024 that we had a fair lead here in the US, that sanctions on GPU availability were effective, and that we were in a pretty good spot. I think that we have a very different view of that here in Q2 2025, and that, from a hardware standpoint, the Chinese have effectively circumvented a lot of the sanctions and are actually building, or cultivating domestically, incredibly advanced GPU manufacturing capability.
Second, from a software standpoint and a model training standpoint, the only thing that we have over the Chinese at this point is momentum, right? It's the speed of innovation. And so I think that it's absolutely critical that we support our labs, that we support OpenAI, that we support Anthropic, Google, Meta and others, and continue to innovate as rapidly as they have. I think that's the only real advantage that we're going to have that's going to be potentially durable over time: speed of innovation. And I think that we're focused on the right things right now, which is going to be electricity, really power generation, as well as making sure that we don't stifle innovation through excessive regulation of this industry.
Akhil 55:20
Awesome. Thanks, Brian. Brian, we're coming here to the end, and I want to focus on you and your company a little bit as we close out. Maybe the first ask: what was your biggest surprise, going back to when this was on the drawing board, on the napkin, in terms of building Unstructured from then to now?
Brian 55:38
Yeah, I'd say one thing I'd like to think I anticipated, but didn't, was ChatGPT launching six months in. So here we were, fall of 2022, just groveling for GitHub stars, and then ChatGPT drops that November, and demand explodes, right? And so from a timing standpoint, we were really fortunate. I'd say on the flip side of that, it was moving so fast during that period of time; every week it seemed like the technology was changing. What hasn't really changed is the foundational pieces: you need to normalize all the data and get it in a consistent format, and you need to index it in a way that's discoverable by the models. And there's a lot of art to getting these architectures to work, even now in 2025.
Akhil 56:37
Yeah, Thanks, Brian. It's been awesome to see the journey you've been on where is unstructured going from here? What are you looking to do in the next couple months? Who you're trying to bring on in terms of awesome folks? Would love to hear a little bit about where that's headed.
Brian 56:50
I'd say, especially within the last quarter, quarter and a half, the market has really matured, and so we're receiving a lot of inbound now from Fortune 100 and Fortune 500 companies that are now looking at their Gen AI data strategy and looking to normalize. And so we have, call it, a couple dozen really challenging and ambitious deployments that we're working on as we speak, that I'm really excited about. And then we're doing a similar thing within and across the DoD. From a talent standpoint, a lot of this comes down to really phenomenal engineers, right, that understand the customers, that are incredibly capable in their respective fields of expertise, and that can move at the pace that innovation is happening today. And so we expect to grow significantly over the back half of this year, and are really excited for what the future portends. Awesome.
Akhil 57:52
Brian, thanks so much for being with us today. Always great to get your insights and have some conversations and see you be able to do some awesome stuff again, both in the commercial space and in the government space, and we look forward to continuing to see that grow.