He emails me (but not for attribution):
The key to the whole thing is the back-end. The big problem here is likely having to query various legacy DBs in real time (e.g., have the website fire a query to see if your name and SSN match). This was a swing-for-the-fences approach by the original technical team, and in my view a huge blunder (though hindsight is 20/20, and I’ve made some very dumb technical decisions in my life). They already have an Oracle (basically, a modern relational database) instance up and running. The crucial change would be to grab a batch transfer from the legacy systems as of, say, October 25th and move ALL of it into the Oracle instance. You could then have every query fire against the Oracle database. This would radically simplify every technical problem, as you would now have end-to-end control over the DB, the logic, and the website. You could structure highly simplified queries and build the whole data model around exactly one job: make this one website work. You could then periodically batch-update from those source legacy systems, say daily or weekly, once you were in production.
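A minimal sketch of the pattern the email describes, with sqlite3 standing in for the Oracle staging instance; the table, columns, and CSV layout are hypothetical. Load one batch extract, then have the site's lookup hit only the local copy:

```python
# "Extract once, query locally" sketch: sqlite3 stands in for the Oracle
# staging DB; applicant/ssn/full_name/source_system are illustrative names.
import csv
import sqlite3

def load_batch_extract(conn: sqlite3.Connection, csv_path: str) -> None:
    """Bulk-load one legacy extract (e.g., the Oct. 25 snapshot) into staging."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS applicant (
            ssn TEXT PRIMARY KEY,
            full_name TEXT NOT NULL,
            source_system TEXT NOT NULL
        )
    """)
    with open(csv_path, newline="") as f:
        rows = [(r["ssn"], r["full_name"], r["source_system"]) for r in csv.DictReader(f)]
    conn.executemany(
        "INSERT OR REPLACE INTO applicant (ssn, full_name, source_system) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()

def name_matches_ssn(conn: sqlite3.Connection, full_name: str, ssn: str) -> bool:
    """The website's query now runs against the local staging DB, not a legacy system."""
    row = conn.execute(
        "SELECT 1 FROM applicant WHERE ssn = ? AND full_name = ?", (ssn, full_name)
    ).fetchone()
    return row is not None
```

In production the load would presumably go through Oracle's own bulk-load tooling rather than `executemany`, but the shape of the change is the same: one refresh job per source system, and every website query pointed at the single staging database.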
This occurred to me also. You create extracts of the data on the legacy systems, and then you run the website against the extracts. Some issues:
1. You have to hope that the existing system has a clean separation of business logic from the physical location of the data. That way, you just change the statements that access data to find it in the extract rather than in the legacy system (a sketch of that separation follows this list). If there is business logic mixed in with the data-access statements, then there is more to re-code and test.
2. The job of building the data model for the extract database is not trivial.
3. The job of tuning the database to obtain good response times is not trivial.
4. In some sense, you have two databases. You have a “read-only” database of extracts from other systems. You also have a database that is written to as the user goes through the process. Tuning the performance of that latter database is a nontrivial problem, although they should already be working on it.
5. We start with a deficit in terms of testing, and that deficit gets much larger if we redo the data interchange process.
6. This creates new security problems, and it may create new logistical problems for the people managing the legacy databases. Not that those issues will be given priority at this point.
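The separation assumed in item 1, sketched minimally: business logic depends on an abstract lookup, so swapping the legacy system for the extract database means swapping one adapter rather than re-coding the rules. All names here are hypothetical.

```python
# Data-access separation sketch: the business rule never knows where the
# data physically lives, so re-pointing it at the extract is one adapter.
from typing import Protocol

class IdentityLookup(Protocol):
    def ssn_matches(self, full_name: str, ssn: str) -> bool: ...

class LegacyMainframeLookup:
    """Adapter that would call the legacy system in real time (stubbed here)."""
    def ssn_matches(self, full_name: str, ssn: str) -> bool:
        raise NotImplementedError("real-time call to the legacy system")

class ExtractDbLookup:
    """Adapter that reads the nightly/weekly extract database instead."""
    def __init__(self, conn):
        self.conn = conn
    def ssn_matches(self, full_name: str, ssn: str) -> bool:
        row = self.conn.execute(
            "SELECT 1 FROM applicant WHERE ssn = ? AND full_name = ?", (ssn, full_name)
        ).fetchone()
        return row is not None

def eligibility_check(lookup: IdentityLookup, full_name: str, ssn: str) -> str:
    # Business logic depends only on the interface, not on the data's location.
    return "proceed" if lookup.ssn_matches(full_name, ssn) else "flag for review"
```

If the data-access statements are instead scattered through the business logic, there is no single seam like `IdentityLookup` to cut at, which is exactly the re-code-and-test risk item 1 points to.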
As an expert on both web development and relational DB systems, I agree with this assessment.
I think there are two additional issues flowing from this approach.
7. The organizational difficulties in getting data extracts from one entity and transferring them to another are frequently non-trivial (although perfectly soluble). There are a number of technical issues to overcome, including the data model of the extract itself (fully normalized or not, etc.), its format (XML? delimited text? etc.), and the details of the transfer (nightly FTPs, web services, etc.; push or pull?). Every organization has its own patterns for dealing with this stuff, so there’s going to be some learning curve for those who have to leave their comfort zone to accommodate the final decisions.
8. Most significantly: security. A mad rush into this design is going to lead to all sorts of security concerns, mostly around how the data dumps are handled (e.g., will they be encrypted?), how that data is staged at the central system, and how access to it is controlled. And when in a rush like this, it’s very easy to give security short shrift. I predict serious security vulnerabilities as a consequence, although if we’re very lucky those will never come to light.
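A minimal sketch of the dump-handling concern in item 8: encrypt each extract before it leaves the source system and decrypt it only on the staging host. This assumes the third-party `cryptography` package; file names and key handling are purely illustrative.

```python
# Encrypt-in-transit sketch for extract files (pip install cryptography).
from cryptography.fernet import Fernet

def encrypt_extract(plain_path: str, enc_path: str, key: bytes) -> None:
    """Encrypt an extract file before it is pushed/FTP'd to the central system."""
    with open(plain_path, "rb") as f:
        token = Fernet(key).encrypt(f.read())
    with open(enc_path, "wb") as f:
        f.write(token)

def decrypt_extract(enc_path: str, plain_path: str, key: bytes) -> None:
    """Decrypt on the staging host, ideally into a locked-down directory."""
    with open(enc_path, "rb") as f:
        data = Fernet(key).decrypt(f.read())
    with open(plain_path, "wb") as f:
        f.write(data)

if __name__ == "__main__":
    key = Fernet.generate_key()  # in practice, managed out-of-band, never shipped alongside the dump
    # encrypt_extract("medicaid_extract.csv", "medicaid_extract.csv.enc", key)  # hypothetical file names
```

The hard parts are exactly the ones a rush skips: who holds the keys, how the staged plaintext is purged, and who can read it in between.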
If it were one system you were integrating, yes, this is an approach. Even ten. But there are 50+ Medicaid systems, many federal systems, insurer systems, etc., and they are probably all of different vintages. How many of them will just hand over their data? Probably the feds, some states; insurance companies? They could probably be compelled, but it would take time. And after all of this is in your mega-database, you still have all the issues of the 50+ Medicaid systems having different data or dirty data or incomplete data, and the same with the insurance companies, so what do you gain except the elimination of network latency? If you’re going to clean the data first, well, you are not going to transfer the data on 10/25 and have it ready in any reasonable amount of time.
I would also add that if this solution would fix the problems, then the problems could also have been fixed by simply adding more bandwidth. That seems unlikely.
No, I don’t think that’s the case. The problem isn’t bandwidth but latency (and not network latency, but application latency).
Sending a hypothetical request to Social Security asking “is John Doe in Dallas TX really 123-45-6789?” amounts to trivial bandwidth. The actual problem is the amount of time it takes for that system to catch the request, query its own databases, and make the reply.
The difficulties might be due to ancient technology platforms that are hard to adapt to web systems (AJAX calls sending queries to an old IBM 360 and expecting JSON in response, maybe?). They may also be due to the agencies “owning” the data having organized it in a manner different from the access path needed by healthcare.gov; simply lacking appropriate indexes could cause problems in responsiveness, and a separate copy could have an entirely different indexing scheme optimized for healthcare.gov’s requirements.
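A quick sketch of that indexing point: the local copy can carry indexes matched to healthcare.gov's one access path even if the source agency's copy never would. sqlite3 again stands in for whatever the staging database actually is, and the schema is invented.

```python
# Index the copy for the one lookup the site actually performs.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE applicant (ssn TEXT, full_name TEXT, city TEXT, state TEXT)")

# The recurring question: "is John Doe in Dallas TX really 123-45-6789?"
# A composite index turns it from a full scan into a point lookup.
conn.execute("CREATE INDEX idx_applicant_identity ON applicant (ssn, full_name)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT 1 FROM applicant WHERE ssn = '123-45-6789' AND full_name = 'John Doe'"
).fetchall()
print(plan)  # query plan shows the index being used rather than a table scan
```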
I do agree completely with your insight about data cleansing / ETL.
OK, so it’s application latency: throw in more RAM, add more processors. Either network or application latency can be fixed with more hardware/infrastructure.
I second @mb – ETL solutions *seem* straightforward, but in my experience they almost never are. And just the administrative tools alone, like internal webapps, could easily take another year or longer to implement.
One example that comes to mind is handling multiple matching entries for a given name and SSN. What does the consumer-facing website show the user? What does the backend process do? Is there an internal site, or any other tools, where discrepancies can be resolved (or even seen)? And besides the significant technical challenges, there are related administrative and ‘business’ challenges that need to be solved as well.
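To make that example concrete, here is a minimal sketch of the decision the backend has to make for a name-and-SSN lookup: no match, one match, or several conflicting matches that have to go somewhere a human can see them. The statuses, table, and review hand-off are all hypothetical.

```python
# Match-resolution sketch: the ambiguous case is the expensive one, because it
# needs an internal tool and a business process, not just code.
from dataclasses import dataclass

@dataclass
class MatchResult:
    status: str          # "not_found" | "verified" | "needs_review"
    records: list

def queue_for_manual_review(rows) -> None:
    # Stand-in for the internal discrepancy-resolution site the comment asks about.
    print(f"{len(rows)} conflicting records queued for review")

def resolve_identity(conn, full_name: str, ssn: str) -> MatchResult:
    rows = conn.execute(
        "SELECT ssn, full_name, source_system FROM applicant "
        "WHERE ssn = ? AND full_name = ?",
        (ssn, full_name),
    ).fetchall()
    if not rows:
        return MatchResult("not_found", [])
    if len(rows) == 1:
        return MatchResult("verified", rows)
    # Multiple legacy systems disagree: park it for a human rather than guess.
    queue_for_manual_review(rows)
    return MatchResult("needs_review", rows)
```

The code is the easy third of the problem; deciding what the consumer sees and who works the review queue is the rest.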
Given that example, I would respectfully disagree with this approach, which I would view as far more complex and risky. This is basic data that changes frequently and in all likelihood will be changed specifically to facilitate signing up on these exchanges. Syncing large databases always sucks. It is done often enough, for the reasons given, but it is always hard.
The great thing about using the web is that solid interchange technologies have emerged, and it really doesn’t matter what the legacy systems are. If a legacy system can provide a valid AJAX request service, or some equivalent, they can work together.
The SS number example is a query the government will need over and over across many systems, not just healthcare.gov. If they can’t develop secure, rapid, well-understood AJAX responses from other federal servers to such common questions, then they have no business trying to build systems like this.
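For what it's worth, the contract for such a call could be very small. A sketch of the client side, where the endpoint URL, field names, and the existence of any such SSA service are all assumptions for illustration:

```python
# Client-side sketch of a shared SSN-verification call: a tiny JSON request,
# a tiny JSON reply. The URL and payload shape are hypothetical.
import json
import urllib.request

VERIFY_URL = "https://api.ssa.example.gov/v1/verify-ssn"  # hypothetical endpoint

def verify_ssn(full_name: str, ssn: str, city: str, state: str) -> bool:
    payload = json.dumps({
        "full_name": full_name,
        "ssn": ssn,
        "city": city,
        "state": state,
    }).encode("utf-8")
    req = urllib.request.Request(
        VERIFY_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    # The request itself is a few hundred bytes; latency, not bandwidth, is the cost.
    with urllib.request.urlopen(req, timeout=5) as resp:
        answer = json.load(resp)          # e.g. {"match": true}
    return bool(answer.get("match"))
```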
Exactly. Creating an uber database is a desperate move that can work in theory but will in practice only make the system even more brittle than it already is.
I don’t think any government is capable of making software in the modern way; even most big businesses fail at it. This is going to make it hard for governments to govern in a networked software world.
Nothing is going to make this system work. They are going to walk back the expectations in a big way and then announce victory.
I’ve been working on SOA projects for healthcare insurance providers for the past 5 years. They can’t figure out how to securely integrate the dozens of systems within their own 4 walls that they have acquired via merger and acquisition.
I expect a steady stream of NPR stories about how the ACA caused premiums to double, deductibles to double, and usage of mandated free services to double and swamp doctors’ capacity; about claims denied because enrollment and entitlement can’t be verified; and about claims delayed because the claim was never received.
Some Presidents invade Iraq/Vietnam. Some Presidents try changing every healthcare regulation and system for every citizen, everywhere, all at once. Hubris comes in many flavors. And political insiders always profit by it.
If the small-government crowd can’t capitalize on constant over-reach, it’s because the median voter has porridge for brains (see B. Caplan). But then, insiders always win and treat outsiders like the Maasai treat their cattle.
There’s a world of difference between integrating complex systems that were developed by separate organizations and developing the means for one system to ask a handful of questions of another system via secure web-service requests. It’s like the difference between a book and a few paragraphs.
Ultimately, you may very well be right and this whole thing will collapse under its own weight. But some of these software problems are solvable.
As a follow-up, if their legacy system simply can’t handle queries like this, have the SSA follow this advice once internally: sync the data internally to a modern DB architecture tuned to handle it. Then they can provide this service for the dozens of other systems that might be approved to access such data (a sketch of that service side follows below). Don’t force every system in the government that needs to validate SS numbers in real time to build out its own solution. The government shouldn’t allow dozens of anachronistic standards for such things to evolve anyway. There should be one API for this across the entire government.
Clearly this is a core competency the SSA (and pretty much any major agency) needs going forward.
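A minimal sketch of that service side: the data-owning agency fronts its internally synced copy with one small verification API that every approved system calls. Flask, the route, the toy table, and the payload shape are all assumptions for illustration, not a description of any real SSA system.

```python
# Service-side sketch of a single shared verification API (pip install flask).
import sqlite3
from flask import Flask, jsonify, request

app = Flask(__name__)
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE person (ssn TEXT PRIMARY KEY, full_name TEXT)")
db.execute("INSERT INTO person VALUES ('123-45-6789', 'John Doe')")  # toy data only

@app.route("/v1/verify-ssn", methods=["POST"])
def verify_ssn():
    body = request.get_json(force=True)
    row = db.execute(
        "SELECT 1 FROM person WHERE ssn = ? AND full_name = ?",
        (body.get("ssn"), body.get("full_name")),
    ).fetchone()
    # Answer the common question once, here, instead of in every consuming system.
    return jsonify({"match": row is not None})

if __name__ == "__main__":
    app.run(port=8080)
```

The earlier client-side sketch would then be the only integration any consuming system has to write.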
If the Federal Obamacare site’s main problem is extracting data from other legacy systems, why do states’ Obamacare sites work?
The states’ Obamacare sites will have to extract data from the same legacy systems as the federal Obamacare site, e.g., SSA, IRS, and insurance company systems.
Speaking as someone who knows next to nothing about databases or web development but who works with Medicaid agencies in several states, the data they possess is pretty standardized in my experience, at least, so that should cut down on the complexity of the querying issues. Not identical, obviously, but states have to maintain certain records and file quarterly reports with CMS if they want to get their federal matching funds, and those reports are entirely standardized, so the states’ databases all have to be able to generate the same kinds of information.
If there were difficult problems here, I would guess they’d be due to the fact that these systems all seem to have been set up in the eighties and likely have not been updated much since then, so legacy hardware issues could arise quite often, as others mentioned above. The data itself seems pretty uniform, though, in the seven or eight states my company does work for.
The irony is the general form of healthcare.gov is basically a referral web store, which means all it has to do is allow browsing of the product catalog, which is heavily geography-based (due to silliness in ACA law, not anything “real”). It doesn’t have to do inventory or billing, which is usually the hardest part of a web store. It’s analogous – at least at the user end – to realtor.com, not amazon.com.
Referral-style web stores that have offline order fulfillment by partners were among the earliest type of e-commerce sites, and have been well-understood in terms of system architecture for a very long time.
Sourcing the data from numerous input sources is obviously a challenge, but far from an unsolved one; pretty much any integrated website with partners does some sort of regular load/push to the catalog cache or direct querying of external systems, or often a combination of both. And yes, the subsidy stuff adds a complication (and is a political wheeze to hide the real cost of plans), but it should be the easiest bit as the feds control all those parts of the architecture.
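A minimal sketch of that catalog-cache pattern: serve plan listings from a periodically refreshed local cache, with the refresh drawing on the partner systems on a schedule. The function and field names are hypothetical, and the partner call is stubbed out.

```python
# Catalog-cache sketch: browsing is geography-based, read-only, and tolerant
# of day-old data, so a scheduled refresh is enough.
import time

CACHE_TTL_SECONDS = 24 * 3600           # e.g., a nightly refresh
_catalog_cache = {"loaded_at": 0.0, "plans_by_zip": {}}

def fetch_plans_from_partners(zip_code: str) -> list[dict]:
    # Stand-in for the direct query to insurer/partner systems.
    return [{"plan_id": "EX-1234", "issuer": "Example Health", "zip": zip_code}]

def refresh_catalog(zip_codes: list[str]) -> None:
    _catalog_cache["plans_by_zip"] = {z: fetch_plans_from_partners(z) for z in zip_codes}
    _catalog_cache["loaded_at"] = time.time()

def browse_plans(zip_code: str) -> list[dict]:
    """What the consumer-facing 'web store' page would call."""
    if time.time() - _catalog_cache["loaded_at"] > CACHE_TTL_SECONDS:
        refresh_catalog([zip_code])
    return _catalog_cache["plans_by_zip"].get(zip_code, [])
```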
But would you hire 40+ beltway bandits to build this sort of thing, and have a bunch of nontechies handle the coordination and integration? Nope. That’s where the real fail happened.