Database Design: Why 73% of IT Projects Fail Before They Even Begin

Do you know why that new business app that was supposed to 'change everything' has become slow and error-prone? The problem often lies where no one looks: in the database. By some industry estimates, 73% of digital transformation projects fail or severely underperform, and a poorly designed database is frequently the culprit. Poorly designed databases cost billions, slow everything down and turn data scientists into dirty data cleaners. Yet many companies continue to use monstrous Excel spreadsheets or improvised Access databases. In this article, you'll discover the three design phases that make all the difference, when to use relational or NoSQL, how to prepare data for AI, and the mistakes I've seen cost millions.

Reading time: 9 minutes

Have you ever wondered why that new business application that was supposed to ‘revolutionise everything’ turned into a nightmare of slowness, errors and out-of-control costs?

The answer, in most cases, lies in a place that no one ever looks: the database. More precisely, in how it was not designed.

According to recent industry analysis, 73% of digital transformation projects fail or severely underperform. And the silent culprit? A database that was poorly designed from the outset. We are talking about billions of pounds wasted every year in the UK, companies competing with their hands tied, and data scientists spending 80% of their time ‘cleaning’ data instead of extracting value.

Yet many businesses – especially small and medium-sized ones – continue to use outdated or completely unstructured databases. Excel spreadsheets with thousands of rows. Access with a single gigantic table. Or worse: ‘home-made’ systems that no one has ever really designed.

In this article, you will discover why database design is not a technical luxury, but the foundation on which every modern application, every AI system, and every business process that really works is built.

What Does “Designing” a Database Really Mean?

When we talk about database design, we are not talking about choosing between MySQL and MongoDB. We are talking about information architecture: how you organise, structure and connect the data that represents your business.

Database design goes through three fundamental phases, each with specific objectives.

The conceptual phase is where it all begins. Here, you define what needs to be tracked, without worrying about how yet. This is the time for crucial questions: what are the key entities in your business? An e-commerce business needs to track products, customers and orders. But what about categories? Reviews? Suppliers? And how do they relate to each other?

The Entity-Relationship (ER) model is the main tool in this phase. A rectangle represents an entity, a diamond a relationship. Simple on paper. Devastating when wrong. Because a mistake here – forgetting a critical relationship, misdefining a cardinality – spreads like cancer throughout the entire system.

The logical phase translates that conceptual model into structures that a database can actually manage. This is where normalisation, primary keys, and referential integrity constraints come into play. This is where you decide whether your application will be fast or slow. Whether the data will be consistent or chaotic.

Let’s consider a real-life case. An order management system initially modelled the ‘customer’ with multiple attributes for shipping addresses. Sounds reasonable. Then the business grew. Customers with 50 different addresses. The table became a monster of NULL columns. Queries turned into nightmares. All because the logical phase had not anticipated scalability.
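The fix for a case like this is usually a one-to-many relationship: addresses move into their own table, one row per address. Below is a minimal sketch using Python's built-in sqlite3 module. The table and column names (customer, shipping_address, label and so on) are assumptions made for illustration, not the schema of the system described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# Hypothetical schema: one row per address instead of address_1 ... address_N columns.
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE shipping_address (
    address_id  INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    label       TEXT NOT NULL,
    street      TEXT NOT NULL,
    city        TEXT NOT NULL
);
""")

conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Acme Ltd')")

# A customer with 50 addresses is simply 50 rows: no NULL columns, no schema change.
conn.executemany(
    "INSERT INTO shipping_address (customer_id, label, street, city) VALUES (?, ?, ?, ?)",
    [(1, f"Site {i}", f"{i} Example Road", "London") for i in range(1, 51)],
)

address_count = conn.execute(
    "SELECT COUNT(*) FROM shipping_address WHERE customer_id = ?", (1,)
).fetchone()[0]
print(address_count)
```

With this shape, a customer with one address and a customer with fifty use exactly the same table structure.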

The physical phase is where theory meets reality. SQL, indexes, constraints, triggers. It’s time to optimise for the real world: data volumes, access patterns, performance requirements. Every index has to be updated on each write, so a poorly placed one can make writes several times slower. A well-designed one can make a query a hundred times faster.
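To see what an index changes, you can ask the database how it plans to run a query. The sketch below uses SQLite through Python's sqlite3 module. The orders table and the index name are hypothetical, and the exact wording of the plan varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL, total REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, 10.0) for i in range(10000)],
)

def query_plan(connection):
    """Return SQLite's plan for a lookup of one customer's orders."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT total FROM orders WHERE customer_id = ?", (42,)
    ).fetchall()
    # The human-readable description is the fourth column of each plan row.
    return " ".join(row[3] for row in rows)

plan_before = query_plan(conn)  # full table scan: every row is examined
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn)   # index search: only matching rows are visited

print(plan_before)
print(plan_after)
```

The same check works in the other direction: an index that never appears in the plans of your critical queries only adds cost to every write.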

But here’s the uncomfortable truth: in the rush to digitise, many companies skip the design phase altogether. They start directly with the code. “We’ll create the tables as we need them.” The result? A database that is technically functional but structurally compromised. It’s like building a skyscraper starting with the roof.

The Invisible (But Devastating) Impact on Business

Let’s talk about money. About time. About missed opportunities.

A poorly designed database is like having a blocked artery. The heart pumps, the blood flows, but not enough. Not fast enough. The symptoms manifest themselves everywhere:

Performance gradually declines. At first, everything works. You have 1,000 customers and 5,000 orders. Then you grow. 100,000 customers. 500,000 orders. And suddenly, that query that used to respond in 200 milliseconds now takes 30 seconds. Users abandon their shopping carts. The call centre is inundated with complaints. Everyone looks to the development team. But the problem originated months earlier, in the design decisions.

Tuning has its limits: indexes and restructuring can optimise a system, but only if the foundation is solid. You can put a Ferrari engine on a rusty chassis, but you’ll never win the race.

Maintenance costs skyrocket. Every new feature becomes an odyssey. Add a field? You have to modify 15 tables. Create a report? You need a query that joins eight different tables and takes hours just to write. Developers spend more time fighting with the database than creating value.

I have personally seen projects where 70% of the IT budget is absorbed by the ‘routine maintenance’ of poorly designed systems. Rewriting queries. Optimising tables. Correcting data inconsistencies. It’s like constantly patching up a sinking boat instead of building a new one that floats.

Data quality inevitably deteriorates. Without well-designed integrity constraints, dirty data creeps in. Duplicates. NULL values where they shouldn’t exist. References to non-existent records. Data scientists know this all too well: they spend 80% of their time ‘cleaning’ data instead of analysing it. And do you know what the paradox is? That cleaning has to be done over and over again. Because the source – the database – keeps producing dirt.
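Each of these three kinds of dirt can be stopped at the source by a declared constraint. The following is a small sketch with SQLite via Python's sqlite3 module, where the customer and customer_order tables are made up for the example. Note that SQLite only enforces foreign keys once the pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for REFERENCES to be enforced
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    email       TEXT NOT NULL UNIQUE
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total       REAL NOT NULL CHECK (total >= 0)
);
""")
conn.execute("INSERT INTO customer (customer_id, email) VALUES (1, 'ada@example.com')")

# Three typical kinds of dirty data, each blocked by a constraint.
dirty_statements = {
    "duplicate": "INSERT INTO customer (email) VALUES ('ada@example.com')",
    "null": "INSERT INTO customer (email) VALUES (NULL)",
    "orphan": "INSERT INTO customer_order (customer_id, total) VALUES (999, 10.0)",
}

rejected = []
for label, statement in dirty_statements.items():
    try:
        conn.execute(statement)
    except sqlite3.IntegrityError:
        rejected.append(label)

print(rejected)
```

Data that never gets in does not have to be cleaned later, which is the point of putting these rules in the schema rather than in application code alone.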

Data scientists are far more productive when they work with data that is structured and clean from the outset. It is not a question of tools or algorithms. It is a question of foundations.

Relational vs NoSQL: Choosing the Right Foundation

Here, confusion reigns supreme. “NoSQL is the future!” some shout. “Relational databases are dead!” others proclaim. The reality? Both claims are wrong.

Relational databases still dominate – and rightly so – contexts where data structure is defined and consistency is critical. Banking systems. Order management. Enterprise resource planning. Anywhere that ACID (Atomicity, Consistency, Isolation, Durability) transactions are non-negotiable.

Relational design follows rules that have been established for decades. Normal forms. Foreign keys. Referential constraints. It is an art that many are familiar with. MySQL, PostgreSQL, Oracle – mature, documented technologies that are supported everywhere.

But there is a cost: rigidity. Changing the schema is painful. Scaling horizontally is complex. And when data becomes heterogeneous – think IoT data, JSON documents, semi-structured data of the modern world – the relational model shows its limitations.

NoSQL databases were created to solve precisely these problems. MongoDB, Neo4j, Cassandra – each with a different model for different needs.

Document databases such as MongoDB offer great flexibility. Collections are ‘schemaless’: you can insert documents with different structures into the same collection. That suits fast-evolving start-ups, web applications where requirements change every sprint, and the integration of heterogeneous data from multiple sources.

But be careful: schemaless does not mean “without design”. Quite the contrary. MongoDB requires a different design, not no design. Its documented modelling patterns, such as “extended reference”, “bucket” and “subset”, are strategies for optimising access, keeping duplication under control and managing complex relationships.
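As an illustration of one of these patterns, the extended reference copies into a document only those fields of a related document that are read together with it, so the common read needs no second lookup. The sketch below uses plain Python dictionaries rather than a MongoDB driver, and the field names and the build_order_document helper are hypothetical.

```python
def build_order_document(order_id, customer, items):
    """Build an order that embeds only the customer fields the order view needs."""
    return {
        "_id": order_id,
        # Extended reference: the id plus a few frequently read fields,
        # not the whole customer document.
        "customer": {
            "customer_id": customer["_id"],
            "name": customer["name"],
            "shipping_city": customer["address"]["city"],
        },
        "items": items,
    }

# A full customer document, as it might live in its own collection.
customer = {
    "_id": "c-1",
    "name": "Ada Lovelace",
    "email": "ada@example.com",
    "address": {"street": "1 Example Road", "city": "London"},
    "preferences": {"newsletter": True},
}

order = build_order_document("o-100", customer, [{"sku": "A1", "qty": 2}])
print(order["customer"])
```

The trade-off is deliberate duplication: if a copied field changes in the customer document, the design has to say whether existing orders are updated or keep the value they were created with.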

I have seen teams switch to MongoDB thinking “finally free from design!” only to end up with gigantic documents, extremely slow queries, and a data structure that no one understands. Flexibility without strategy is chaos.

The uncomfortable truth? It’s not about choosing. It’s about designing for context. An e-commerce platform might use PostgreSQL for orders and transactions (critical consistency), MongoDB for product catalogues (schema flexibility), and Redis for caching and sessions (speed).

And with the rise of AI and machine learning, this complexity increases. Data-hungry AI models are not satisfied with a single database. They devour data lakes, real-time streams, distributed storage. And everything has to be orchestrated. Designed. Not improvised.

Designing for Artificial Intelligence and Modern Applications

Let’s be clear: AI has changed everything.

I’m not just talking about ChatGPT or generative models. I’m talking about every modern application that uses machine learning for recommendations, predictions, automation. And they all have one thing in common: they crave structured, clean, accessible data.

But there is a problem. Traditional databases were not designed for AI. They were designed for transactions, for CRUD operations (Create, Read, Update, Delete), for human users filling out forms. Not for algorithms that devour millions of records looking for patterns.

The bottleneck of data lakes. Many companies have responded by creating ‘data lakes’: gigantic repositories where all data is dumped. The idea: ‘Let’s keep everything, AI will find the value’. The result? Data swamps where no one can find anything. Why? Lack of design. Data without structure, without metadata, without governance.

Designing for AI requires thinking differently from the outset:

Feature engineering embedded in the database. Instead of extracting raw data and then transforming it, why not pre-calculate features directly in the database? Aggregations, metrics, statistics. The “pre-aggregated values” pattern in MongoDB (its documentation calls it the computed pattern) does exactly that: it calculates common metrics (sums, averages, counts) and keeps them up to date. When AI requests that data, it’s ready, with no aggregation to run at read time.
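The idea can be sketched in a few lines: the aggregates are updated at write time, so reads never recompute them. This example uses a plain Python dictionary to stand in for a stored document. The review_stats field and the record_review function are assumptions for illustration; in MongoDB itself the counters would typically be maintained with an update operator such as `$inc`.

```python
def record_review(product, rating):
    """Apply one new rating and refresh the stored aggregates."""
    stats = product["review_stats"]
    stats["count"] += 1
    stats["sum"] += rating
    # The average is stored too, so readers never have to compute it.
    stats["average"] = stats["sum"] / stats["count"]
    return product

# Hypothetical product document with its pre-computed statistics.
product = {
    "_id": "p-1",
    "name": "Kettle",
    "review_stats": {"count": 0, "sum": 0, "average": None},
}

for rating in (5, 4, 3):
    record_review(product, rating)

print(product["review_stats"])
```

The cost moves to the write path, which is usually the right trade when a value is read far more often than it changes.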

Time-series optimisation. IoT, sensors, events, logs: modern applications generate a tsunami of time-series data. A poorly designed database simply dies under these loads. The “bucket” pattern groups measurements into optimised documents, drastically reducing disk accesses and query times. I have seen implementations go from 20 seconds to 200 ms for time-series queries simply by applying this pattern.
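Here is a rough sketch of the bucket idea, with one document per sensor per hour instead of one document per reading. The hourly granularity, the field names and the bucket_by_hour function are assumptions chosen for the example, not a prescribed layout.

```python
from collections import defaultdict
from datetime import datetime

def bucket_by_hour(readings):
    """Group (sensor_id, timestamp, value) readings into one document per sensor-hour."""
    buckets = defaultdict(lambda: {"count": 0, "sum": 0.0, "measurements": []})
    for sensor_id, timestamp, value in readings:
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        bucket = buckets[(sensor_id, hour_start)]
        bucket["measurements"].append({"ts": timestamp, "value": value})
        # Summary fields let common queries skip the individual measurements.
        bucket["count"] += 1
        bucket["sum"] += value
    return [
        {"sensor_id": sensor_id, "hour_start": hour_start, **bucket}
        for (sensor_id, hour_start), bucket in sorted(
            buckets.items(), key=lambda item: item[0]
        )
    ]

readings = [
    ("s1", datetime(2024, 1, 1, 10, 5), 20.5),
    ("s1", datetime(2024, 1, 1, 10, 35), 21.5),
    ("s1", datetime(2024, 1, 1, 11, 0), 22.0),
]
documents = bucket_by_hour(readings)
print(len(documents))
```

A query for one sensor over one day then touches at most 24 documents, however many readings each hour contains.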

Hybrid relationships for knowledge graphs. AI does not only work with isolated tables. It works with networks of information. Products linked to categories, categories to trends, trends to users, users to purchases. A graph. Graph databases such as Neo4j were created for this purpose. However, well-designed relational or document-based systems can also support these structures – if provided for in the initial design.
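As a sketch of how a relational system can hold such a network, the links can be stored as rows in an edge table and followed with a recursive query. The example below uses SQLite through Python's sqlite3 module. The edge table, the node labels and the depth limit of 5 are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE edge (
    source TEXT NOT NULL,
    target TEXT NOT NULL,
    PRIMARY KEY (source, target)
);
INSERT INTO edge VALUES
    ('user:ada', 'product:kettle'),
    ('product:kettle', 'category:kitchen'),
    ('category:kitchen', 'trend:home-cooking'),
    ('user:bob', 'product:lamp');
""")

# Follow links outward from one node; the depth limit guards against cycles.
rows = conn.execute(
    """
    WITH RECURSIVE reach(node, depth) AS (
        SELECT target, 1 FROM edge WHERE source = ?
        UNION
        SELECT edge.target, reach.depth + 1
        FROM edge JOIN reach ON edge.source = reach.node
        WHERE reach.depth < 5
    )
    SELECT node FROM reach ORDER BY depth
    """,
    ("user:ada",),
).fetchall()

reachable = [row[0] for row in rows]
print(reachable)
```

This works well for shallow traversals. For deep or highly connected graphs, a dedicated graph database is likely to be the better fit.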

And then there is scalability. An AI model in production can receive thousands of requests per second. Each one requires reading data from the database, processing it, and writing results. If your database is not designed for this throughput – indexes, partitioning, caching, replication – the system will collapse at the first real load.

Best Practices and Mistakes that Cost Millions

After years of database design, I have seen the same mistakes repeated. And they cost a lot.

Mistake #1: Normalising everything (or nothing). Normalisation is a pillar of relational design. It eliminates redundancy and ensures consistency. But taken to extremes, it becomes a nightmare. Tables fragmented into 20 different pieces. Queries requiring 15 JOINs. Disastrous performance.

The principle? Normalise for consistency, denormalise for performance. Strategically. If you read a piece of data 1,000 times for every time you write it, duplicating it may be the best choice. The important thing is to know this, design for it, and manage it.
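One way to "manage it" is to let the database keep the duplicate in sync. The sketch below uses SQLite via Python's sqlite3 module with made-up category and product tables: the category name is copied onto each product so that listings need no JOIN, and a trigger propagates renames.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    category_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE product (
    product_id    INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    category_id   INTEGER NOT NULL REFERENCES category(category_id),
    category_name TEXT NOT NULL  -- deliberate duplicate of category.name
);
-- Keep the duplicate in sync whenever the source value changes.
CREATE TRIGGER sync_category_name AFTER UPDATE OF name ON category
BEGIN
    UPDATE product SET category_name = NEW.name
    WHERE category_id = NEW.category_id;
END;
""")

conn.execute("INSERT INTO category (category_id, name) VALUES (1, 'Kitchen')")
conn.execute(
    "INSERT INTO product (product_id, name, category_id, category_name) "
    "VALUES (1, 'Kettle', 1, 'Kitchen')"
)

# A rare write: the rename reaches every product through the trigger.
conn.execute("UPDATE category SET name = 'Kitchenware' WHERE category_id = 1")

# The frequent read needs no JOIN.
category_on_product = conn.execute(
    "SELECT category_name FROM product WHERE product_id = 1"
).fetchone()[0]
print(category_on_product)
```

The duplication is a documented decision with a single owner of the truth (category.name), which is what separates strategic denormalisation from accidental redundancy.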

Mistake #2: Ignoring access patterns. I have seen databases designed “perfectly” from a theoretical point of view. Normal forms respected. Clean schema. Then you go into production and discover that 80% of queries always access the same data in the same way. But there are no indexes. There is no optimisation. The database is “correct” but unusable.

Design must start from real use cases. How will you read the data? How many writes vs. reads? What are the critical queries? What latency is acceptable? The answers determine the design.

Mistake #3: Immutable schema. The world changes. Business evolves. New requirements emerge. If your database schema is set in stone, you’re dead. NoSQL databases have an advantage here – flexibility is built in. But relational databases can also be designed for evolution.

The ‘schema versioning’ pattern is elegant: add a version field to each record. V1, V2, V3. The application can handle all versions. You can migrate gradually. No downtime. No risky big bangs. But you have to design it that way from the start.
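Below is a minimal sketch of how an application might read records of mixed versions. The schema_version field name, the customer shape and the upgrade function are hypothetical. Records without a version field are treated as version 1.

```python
def upgrade_v1_to_v2(doc):
    """V1 stored a single 'name'; V2 splits it into first and last name."""
    first, _, last = doc["name"].partition(" ")
    upgraded = {key: value for key, value in doc.items() if key != "name"}
    upgraded.update({"first_name": first, "last_name": last, "schema_version": 2})
    return upgraded

# One upgrade step per version; a future V3 would add an entry for key 2.
UPGRADES = {1: upgrade_v1_to_v2}
CURRENT_VERSION = 2

def read_customer(doc):
    """Return the record in the current shape, upgrading older versions on read."""
    doc = dict(doc)  # work on a copy, leave the stored record untouched
    version = doc.get("schema_version", 1)
    while version < CURRENT_VERSION:
        doc = UPGRADES[version](doc)
        version = doc["schema_version"]
    return doc

old_record = {"_id": 1, "name": "Ada Lovelace"}
new_record = {"_id": 2, "first_name": "Alan", "last_name": "Turing", "schema_version": 2}

print(read_customer(old_record))
print(read_customer(new_record))
```

Old records can then be rewritten in the new shape gradually, for example whenever they are next saved, instead of in one large migration.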

Mistake #4: Underestimating data governance. Who can modify what? Who approves changes to the schema? Where are decisions documented? Without governance, the database becomes a jungle. Duplicate tables. Fields with cryptic names. No one knows what contains what anymore.

Best Practice #1: Start with the conceptual model. Always. Even if you think you’re “wasting time”. An hour spent drawing an ER diagram saves you weeks of refactoring later.

Best Practice #2: Document obsessively. SQL comments. Up-to-date ER diagrams. Data dictionary. Internal wiki. Sounds boring? Imagine two years from now when you need to modify that database and you can’t remember why that table exists.

Best Practice #3: Test with real data (and real volumes). That database works fine with 100 test records. But what about 10 million? Load realistic data. Run real queries. Measure. Optimise. Repeat.

Best Practice #4: Plan for disaster recovery. Corrupted databases. Data centres on fire. Ransomware. It’s not a question of “if” but “when”. Automatic backups. Geographic replication. Tested restore procedures. Designing a database is not just about schema – it’s about resilience.

The Database as the Foundation of Corporate Value

Here’s the truth that few admit: the database is the business.

Not the interfaces. Not the colourful dashboards. Not the ‘microservices’ or any other architectural buzzword. The database is where the data representing customers, products, transactions and relationships lives. It is the institutional memory. It is the most valuable asset.

Yet we continue to treat it as a technical detail. Something to delegate to the ‘DBA’ or ‘backend developer’ without giving it much thought. A monumental mistake.

The companies that excel today – Amazon, Netflix, Airbnb – have one thing in common: they designed their databases as a strategic competitive advantage. They didn’t take MySQL out-of-the-box and start writing queries. They designed data models that enable the functionality they want, at the speed they need, with the scalability they require.

Amazon isn’t fast because it has more powerful servers. It’s fast because every product, every review, every recommendation lives in a database designed precisely for that type of access. Netflix doesn’t personalise content because it has magic algorithms. It does so because viewing data, preferences and context are structured in a way that makes it possible in real time.

And when they get it wrong? They redesign. They migrate. They invest months and millions. Because they know that the cost of not having the right database is infinitely higher.

Start Today, Not Tomorrow

If you are reading this article, you probably have a database. Or you are about to create one. Or you know that your existing database is a problem.

The good news? It is never too late to improve. Even a legacy database can be progressively restructured, normalised and optimised. Even a new project can get off to a good start.

Start with a simple question: If you had to rebuild this database from scratch today, what would you do differently?

That answer contains the roadmap. Then start with the requirements analysis. Draw an ER model even if you use NoSQL. Identify critical access patterns. Choose the right technology for the right problem. Design for scalability, not just for the present.

And remember: a well-designed database is invisible. Users don’t see it. They don’t complain about speed. They don’t notice errors. It’s just there, silent, solid, reliable. Like the foundations of a building that no one looks at but without which everything would collapse.

That 73% of projects do not fail for lack of budget or technology. They fail for lack of foundations. It’s time to start with the basics.
