Agencies are laying the groundwork for artificial intelligence and automation tools by first mastering their vast stores of data.
The core work of inventorying data and improving data maturity in agencies is a foundational step toward deploying AI and automation tools across the federal government.
But in the process of getting that work done, agency data managers say they have an easier time getting the right data to the right people at the right time.
Damian Kostiuk, deputy chief data officer for U.S. Citizenship and Immigration Services, told Federal News Network during a recent roundtable that his office is looking at ways to use data to improve the agency’s customer experience.
“If we’re going to be able to do one of the AI/ML projects and actually solve a lot of the core issues around the agency, before we even get to ML and AI, you have to have the data to solve the problem,” Kostiuk said Nov. 17 at the ATARC AI and Data Summit.
USCIS is specifically looking at how data can improve the digital experience on its website and how to make processing its caseload easier and faster.
But to help get that work started, Kostiuk said his office has cataloged the agency’s memorandums of understanding that set policy for sharing and managing data.
“Sometimes it’s actually been a bit disorganized in the past, and it’s been all over the place. But we’ve done a lot of due diligence to try to shore that up, to make sure there are good controls over it,” he said. “When memorandums of understanding have been signed, and as we are negotiating them, they have always included huge elements on data standards and data quality requirements… In particular, we are really emphasizing putting data standards in place across DHS and the federal space.”
Data management at USCIS also requires having clear safeguards in place for how certain datasets may be used.
“We have a lot of data that can only be used for non-law enforcement purposes, and we have data that can only be used for law enforcement purposes. They must be very bifurcated; they cannot see each other. So there are natural limits, per se, to what you can do, but we wholeheartedly adhere to them. And therefore you have to have really good data management to make sure that you don’t spill and have those two bits get mixed up,” Kostiuk said.
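As a loose illustration of the bifurcation Kostiuk describes, one common pattern is to tag every dataset with its sole permitted purpose and refuse any request that doesn’t match. The Python sketch below is hypothetical, with made-up dataset names and purpose labels, not USCIS’s actual system:

    from enum import Enum

    class Purpose(Enum):
        LAW_ENFORCEMENT = "law enforcement"
        NON_LAW_ENFORCEMENT = "non-law enforcement"

    # Hypothetical catalog mapping each dataset to its only permitted use.
    DATASET_PURPOSES = {
        "benefits_caseload": Purpose.NON_LAW_ENFORCEMENT,
        "investigations": Purpose.LAW_ENFORCEMENT,
    }

    def check_access(dataset: str, purpose: Purpose) -> None:
        # Refuse any request whose purpose doesn't match the dataset's tag,
        # keeping the two categories of data fully bifurcated.
        allowed = DATASET_PURPOSES[dataset]
        if purpose is not allowed:
            raise PermissionError(
                f"{dataset} may only be used for {allowed.value} purposes")

    check_access("benefits_caseload", Purpose.NON_LAW_ENFORCEMENT)  # permitted
    # check_access("investigations", Purpose.NON_LAW_ENFORCEMENT)  # raises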
While few agencies have deployed AI tools beyond limited pilots, Kostiuk said USCIS, by improving data maturity, has improved its “time to market” for delivering the right data to the right people at the right time.
“In the past, if you had to spend nine months trying to clean data before you could even get your project going, you’re not really going to help the American people very quickly,” he said.
Kostiuk said the current state of the agency’s data allows tasks to be completed in weeks that previously would have taken months or years.
“Because the quality of the data was so much better, because the data sharing agreements were in place, and we had connections across the rest of the immigration space, we were able to complete projects in two weeks, which I swear would have been at least nine months back then… Critically, the data was good to go, so we can make good decisions quickly and get the insights we need, whether it’s statistics for management to make policy decisions, or for operators to actually get the job done,” he said.
US Copyright Office migrates data from legacy systems
Suman Shukla, chief of the data management section of the US Copyright Office’s product management division, said the agency has used optical character recognition to scan some of its records, but is considering AI tools to speed up this workload.
The US Copyright Office has 41 million records in its card catalog, a mixture of typed and handwritten documents.
“The biggest challenge for us is capturing these images, extracting the metadata, and performing real-time keyword searches to find the information we’re looking for,” Shukla said. “The people who do the copyright work don’t have to physically come into the building to pull the drawers and look at the cards to find out what work was done.”
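The pipeline Shukla outlines (scan a card, extract its text, index the words for keyword search) can be sketched roughly as follows. The file names are placeholders, and pytesseract is just one common open source OCR option, not necessarily what the Copyright Office uses:

    from collections import defaultdict

    from PIL import Image
    import pytesseract  # binding for the Tesseract OCR engine

    # Simple inverted index: keyword -> set of card scans containing it.
    index = defaultdict(set)

    def ingest_card(path: str) -> None:
        # OCR one scanned catalog card and index every word it contains.
        text = pytesseract.image_to_string(Image.open(path))
        for word in text.lower().split():
            index[word].add(path)

    for card in ["card_0001.png", "card_0002.png"]:  # placeholder scans
        ingest_card(card)

    # A keyword lookup replaces physically pulling drawers of cards.
    print(sorted(index.get("copyright", set())))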
Shukla said the US Copyright Office needs access to decades of data whenever a copyright claim is challenged in court. Copyright protections in the United States cover the lifetime of the author, plus an additional 70 years.
“We have an immediate urgency to provide all kinds of information related to this court work,” Shukla said. “Our data cannot simply be archived and left behind; it must be archived in such a way that it can be retrieved whenever it’s needed in these situations.”
Shukla said the US Copyright Office conducted a data management initiative to understand the current state of its data. The initial analysis shed light on what data the agency has and where it is stored.
Shukla said the analysis also helped the agency retire some of its legacy systems and move data to newer systems.
“There is data that you can simply publish online [under a] free and open data policy. Some information may be releasable under FOIA, but is not readily available to the public. Some information is reserved for sharing within the agency or a team. And there is classified information, so you can’t share everything,” Shukla said.
Alexis Banks, a computer specialist with the EPA’s Office of Chemical Safety and Pollution Prevention, said her agency is considering AI to flag names and signatures on archived documents.
“It’s very important that people get this data in time, so they can make impactful decisions. Now there’s this whole cycle of how we do this. We collect the data, we have to clean the data, we have to organize the data. But for some agencies, this data is placed in all these different places, and we need to be able to compile it all in one place so we have a faster system. It’s just how you do it. And so moving forward, we’ve minimized the time, and that’s the whole point, trying to make things work faster,” Banks said.
USDA data maturity accelerates payments to farmers
Xuan Pham, senior actuary at the Department of Agriculture’s Risk Management Agency, said the agency invested early in data management and used the data to ensure crop insurance payments reach farmers faster.
“In the past, if a farmer went to a county office to fill out a paper application, it could take hours and hours of work. What we’ve done, because we have that data, is pre-populate the application for them. All they have to do is look at it, check it, make sure it’s correct, and that’s it. So instead of taking months and months, we were able to reduce that time to weeks, and that’s huge,” Pham said.
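A minimal sketch of that pre-population step, with hypothetical field names standing in for RMA’s actual records:

    # Hypothetical farmer record already on file with the agency.
    on_file = {
        "name": "A. Farmer",
        "county": "Story County, IA",
        "crop": "corn",
        "insured_acres": 320,
    }

    def prefill_application(record: dict) -> dict:
        # Copy known values into the form so the farmer only has to
        # review and confirm them, not re-enter everything by hand.
        fields = ("name", "county", "crop", "insured_acres")
        form = {field: record.get(field, "") for field in fields}
        form["confirmed_by_applicant"] = False  # awaiting the farmer's review
        return form

    print(prefill_application(on_file))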
The agency also provides weekly county-level machine-readable data on the causes of crop loss.
“As an agency, we’ve really built a structure to collect that data from the field and to be able to know exactly what’s going on. And so, because of that, we benefit a lot,” Pham said. “We benefit a lot from this foresight, this leadership that has happened over several decades.”