Automation Doesn't Fix Bad Data. It Industrializes It.

Name: Convergent LLC
Address: 800 Battery Ave., STE 300, Atlanta, GA, 30339, US
Telephone: 866-567-7221

Article | August 20, 2024 | By Robert Ferris McKnight

I've spent the better part of three decades inside the data layer of retirement systems — as an Oracle DBA, as the lead on database architecture for trading and recordkeeping operations, and currently as the practice lead for Convergent's DTCC operations and Relius work. From that vantage point, the most common failure I see in retirement automation is a simple one: teams automate workflows that sit on top of data that's quietly broken.

The automation runs. It runs fast. It runs at scale. It also propagates every defect in the source data with mechanical efficiency, and now the firm has a problem that's harder to find than the manual one it replaced.

What "broken data" looks like in a recordkeeping environment

In every engagement where automation has gone sideways, I've found some combination of the following before anyone touches the workflow:

Participant identifiers that don't reconcile across systems. The recordkeeping platform has one identifier. Payroll has another. The custodian has a third. Operations has been bridging these by hand for years using a mix of partial SSN matches, fuzzy name matching, and institutional memory. Automation that assumes the identifiers reconcile doesn't reconcile them; it just produces silent mismatches at production scale.

Plan rule configurations that disagree across systems of record. The plan document says one thing. The recordkeeping configuration says another. The compliance testing tool says a third. In normal operations this is invisible because the same person interprets all three and makes the right call. Automate any single piece, and the disagreement becomes a defect.

Reference data that drifts. Deduction codes added by one team, contribution types added by another, plan types added by a third. None of it formally governed. The data dictionary that should be authoritative is a wiki page nobody has updated in 18 months.

Historical data with gaps. The participant database has clean records back to 2015 and a mix of imputed, partial, and missing values before that. The automation built on top either ignores history (bad for compliance) or trusts it indiscriminately (bad for accuracy).

Audit trail discontinuities. Records have updated_at timestamps but no who-changed-it metadata. Or there's a who-changed-it field, but it points to a generic service account because the upstream system never passed the actual user identity through. Compliance auditors find this every time.

None of these are exotic. All of them are routine. And every one of them turns automation from an operational win into an audit risk.
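
The identifier problem described above can be surfaced with a small cross-system match. This is a minimal sketch, not a production matcher: the extracts, the field names (`rk_id`, `emp_id`, `ssn_last4`, `dob`), and the choice of SSN last four plus birth date as the match key are all assumptions made for illustration. Anything that fails to match exactly once goes to a review list instead of being guessed at.

```python
from collections import defaultdict

# Hypothetical extracts: each system keys the same participants differently.
# The match key (SSN last 4 + birth date) is an illustrative assumption.
recordkeeping = [
    {"rk_id": "RK-001", "ssn_last4": "1234", "dob": "1980-02-14"},
    {"rk_id": "RK-002", "ssn_last4": "5678", "dob": "1975-09-30"},
    {"rk_id": "RK-003", "ssn_last4": "9012", "dob": "1990-06-01"},
]
payroll = [
    {"emp_id": "E-17", "ssn_last4": "1234", "dob": "1980-02-14"},
    {"emp_id": "E-42", "ssn_last4": "5678", "dob": "1975-09-03"},  # transposed day
]


def match_key(row):
    """Build the comparison key shared by both extracts."""
    return (row["ssn_last4"], row["dob"])


def reconcile(left, right, left_id, right_id):
    """Return (matched pairs, unmatched left ids, unmatched right ids)."""
    index = defaultdict(list)
    for row in right:
        index[match_key(row)].append(row[right_id])

    matched, unmatched_left, seen_right = [], [], set()
    for row in left:
        hits = index.get(match_key(row), [])
        if len(hits) == 1:
            matched.append((row[left_id], hits[0]))
            seen_right.add(hits[0])
        else:
            # Zero hits and ambiguous hits both go to review, never a guess.
            unmatched_left.append(row[left_id])

    unmatched_right = [r[right_id] for r in right if r[right_id] not in seen_right]
    return matched, unmatched_left, unmatched_right


matched, rk_only, payroll_only = reconcile(recordkeeping, payroll, "rk_id", "emp_id")
print("matched:", matched)
print("recordkeeping only:", rk_only)
print("payroll only:", payroll_only)
```

The two unmatched lists are the point of the exercise: they are the records operations has been bridging by hand, now counted instead of assumed away.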

The work before the work

The clients that automate successfully do something specific before they write a single workflow rule: they audit the data layer. The audit isn't glamorous. It's also non-optional. The questions it answers:

Is the data model fit for the use case? Most recordkeeping databases were designed for batch processing, not real-time analytics. Asking them to support real-time participant intelligence without rethinking the schema is a recipe for read amplification and lock contention.
Are the identifiers actually unique? Run the reconciliation. Look at what doesn't match. Decide whether you have a data-cleansing problem, a process problem, or both.
Is reference data governed? If new deduction codes can appear in production without a documented review, every automation downstream is fragile by construction. Fix the governance before you build the automation.
Is the audit trail complete enough to satisfy a compliance review? If the answer is "if we cross-reference three systems and one Outlook archive," the answer is no. The automation will inherit this gap and make it bigger.
What's the actual data quality rate? Pick a sample of 10,000 records. Score them against the rules the automation is going to enforce. The clients who win at automation know this number before they start. Those who don't know it find out later, usually during the first audit cycle after launch.
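
The quality-rate question above reduces to a small scoring loop. The following is a sketch under stated assumptions: the three rules, the field names (`hire_date`, `deferral_pct`, `plan_type`), and the list of plan types are hypothetical stand-ins for whatever rules the automation will actually enforce. Only the 10,000-record sample size comes from the text.

```python
import random

# Hypothetical rules; replace with the rules the automation will enforce.
RULES = {
    "has_hire_date": lambda r: r.get("hire_date") is not None,
    "deferral_pct_in_range": lambda r: (
        r.get("deferral_pct") is not None and 0 <= r["deferral_pct"] <= 100
    ),
    "known_plan_type": lambda r: r.get("plan_type") in {"401k", "403b", "457b"},
}


def quality_rate(records, rules, sample_size=10_000, seed=0):
    """Score a random sample; return (share of fully clean records, failures per rule)."""
    if not records:
        raise ValueError("no records to score")
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    if len(records) <= sample_size:
        sample = records
    else:
        sample = rng.sample(records, sample_size)

    failures = {name: 0 for name in rules}
    clean = 0
    for rec in sample:
        failed = [name for name, check in rules.items() if not check(rec)]
        for name in failed:
            failures[name] += 1
        if not failed:
            clean += 1
    return clean / len(sample), failures


# Tiny illustrative data set (invented values).
records = [
    {"hire_date": "2010-01-04", "deferral_pct": 6, "plan_type": "401k"},
    {"hire_date": None, "deferral_pct": 4, "plan_type": "401k"},
    {"hire_date": "2018-03-12", "deferral_pct": 140, "plan_type": "403b"},
    {"hire_date": "2020-07-01", "deferral_pct": 5, "plan_type": "KSOP"},
]
rate, failures = quality_rate(records, RULES)
print(f"clean rate: {rate:.0%}", failures)
```

Reporting failures per rule alongside the overall rate shows which rule is driving the number, which is usually what decides whether the fix is data cleansing or process.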

A working sequence

When the data work is done in the right order, the automation goes in cleanly. The sequence I run on these engagements:

Profile the data. Quantitative pass — completeness, uniqueness, referential integrity, range/domain checks. Most teams skip this and start at workflow design. It's almost always a mistake.
Define the canonical model. Decide which system is the source of truth for each entity. Document it. Get the operations leads to sign off, because they're the ones who'll be navigating exceptions.
Build reconciliation, not assumption. Whatever the canonical model says, build a reconciliation step that proves the assumption holds. Run it on a schedule. Alert on drift.
Then automate. Now the workflow can run against data it can trust. The exception rate stays manageable, the audit trail stays clean, and the speed gains are real.
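
The profiling pass in the first step can be sketched in a few lines. This is an illustration only: the table shapes, the column names (`participant_id`, `birth_date`, `code`, `amount`), and the set of valid contribution codes are assumptions, and a real pass would more likely run as SQL against the database itself. It covers the four checks named above: completeness, uniqueness, referential integrity, and range/domain.

```python
from collections import Counter


def profile(participants, contributions, valid_codes):
    """Run the four profiling checks over two in-memory tables (lists of dicts)."""
    total = len(participants)

    # Completeness: share of non-missing values per column.
    columns = sorted({key for row in participants for key in row})
    completeness = {
        col: sum(1 for r in participants if r.get(col) not in (None, "")) / total
        for col in columns
    }

    # Uniqueness: participant ids that appear more than once.
    counts = Counter(r["participant_id"] for r in participants)
    duplicates = sorted(pid for pid, n in counts.items() if n > 1)

    # Referential integrity: contributions pointing at no known participant.
    known = set(counts)
    orphans = [c for c in contributions if c["participant_id"] not in known]

    # Domain and range: unknown codes, negative amounts.
    bad_codes = [c for c in contributions if c["code"] not in valid_codes]
    bad_amounts = [c for c in contributions if c["amount"] < 0]

    return {
        "completeness": completeness,
        "duplicate_ids": duplicates,
        "orphan_contributions": len(orphans),
        "domain_violations": len(bad_codes),
        "range_violations": len(bad_amounts),
    }


# Invented sample rows for illustration.
participants = [
    {"participant_id": "P1", "birth_date": "1980-02-14"},
    {"participant_id": "P2", "birth_date": None},
    {"participant_id": "P2", "birth_date": "1975-09-30"},
    {"participant_id": "P3", "birth_date": ""},
]
contributions = [
    {"participant_id": "P1", "code": "PRETAX", "amount": 250.0},
    {"participant_id": "P9", "code": "PRETAX", "amount": 100.0},
    {"participant_id": "P3", "code": "BONUS2", "amount": -50.0},
]
report = profile(participants, contributions, {"PRETAX", "ROTH", "MATCH"})
print(report)
```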

The reverse order — automate now, fix data later — is the path that produces the war stories.
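
The "build reconciliation, not assumption" step can be illustrated with a mismatch-rate check between the system designated as source of truth and a downstream copy. This is a sketch under assumptions: the record layout, the compared fields, the baseline, and the tolerance are all hypothetical, and scheduling and alert delivery are left to whatever job scheduler and notification channel the firm already runs.

```python
def mismatch_rate(canonical, replica, fields):
    """Share of canonical records that are missing or different in the replica.

    Both inputs are dicts keyed by participant id.
    """
    compared = mismatched = 0
    for pid, truth in canonical.items():
        other = replica.get(pid)
        compared += 1
        if other is None or any(truth.get(f) != other.get(f) for f in fields):
            mismatched += 1
    return mismatched / compared if compared else 0.0


def has_drifted(rate, baseline, tolerance=0.01):
    """True when the rate exceeds the agreed baseline by more than the tolerance."""
    return rate > baseline + tolerance


# Invented data: recordkeeping is assumed to be the source of truth.
recordkeeping = {
    "P1": {"plan_type": "401k", "deferral_pct": 6},
    "P2": {"plan_type": "401k", "deferral_pct": 4},
    "P3": {"plan_type": "403b", "deferral_pct": 10},
    "P4": {"plan_type": "401k", "deferral_pct": 5},
}
payroll_copy = {
    "P1": {"plan_type": "401k", "deferral_pct": 6},
    "P2": {"plan_type": "401k", "deferral_pct": 8},  # value disagrees
    "P3": {"plan_type": "403b", "deferral_pct": 10},
    # P4 is missing entirely
}

rate = mismatch_rate(recordkeeping, payroll_copy, ["plan_type", "deferral_pct"])
alert = has_drifted(rate, baseline=0.25)
print(f"mismatch rate: {rate:.0%}, alert: {alert}")
```

Comparing against a recorded baseline rather than against zero is a design choice: it lets the check run from day one on imperfect data and still flag the moment things get worse.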

What recordkeepers should look for in a partner

If you're scoping an automation engagement right now, the diagnostic question to ask the prospective partner is this: how will they assess the data before they touch the workflow? If the answer involves a database profiling pass, a reconciliation step, and a governance review, that partner has done this before. If data assessment is deferred to implementation, the automation will industrialize whatever is already broken.

Clean the data. Then automate. That's the order.

About the author

Robert Ferris McKnight

Principal Architect — Databases & Trading Operations

Every retirement transaction — every contribution posted, every trade executed, every participant statement generated — ultimately lives or dies in the database. Robert Ferris McKnight has been the person retirement organizations trust with that responsibility since 1997. With nearly three decades of hands-on experience spanning retirement trading systems, operations, and database architecture, Robert brings an encyclopedic command of the data layer that underpins the entire retirement recordkeeping ecosystem.
