
    Data Architecture Fundamentals for Beginners

    June 12, 2026 · 6 min read

    When we talk about modern systems, we usually think of APIs, interfaces, cloud, and Artificial Intelligence.

    But there is a component that supports absolutely everything.

    The database.

    If we remove the frontend, the system continues to exist.

    If we remove the AI layer, the system continues to function.

    But if we remove the database:

    There are no products.
    
    There are no orders.
    
    There are no payments.
    
    There are no conversations.
    
    There is no history.
    
    There is no business.
    

    That's why understanding data architecture is an essential skill for any software engineer.

    In this article, you will learn:

    • The role of the database;
    • How to model business entities;
    • How to create relationships;
    • How to ensure consistency;
    • How to use PostgreSQL in modern systems;
    • How to integrate AI with pgvector and RAG;
    • How to think like a data architect.

    The Role of the Database

    The database is the permanent memory of a system.

    Anything that must outlive a single request has to be stored.

    Examples:

    Products
    
    Orders
    
    Payments
    
    Customers
    
    Conversations
    
    Documents
    

    Imagine that a customer closes their browser right after making a purchase.

    The order continues to exist.

    Why?

    Because it was persisted.

    The database is responsible for:

    • storing;
    • querying;
    • updating;
    • ensuring consistency.

    The database is the permanent memory of the business.


    Modeling the Domain

    Before creating tables, we need to understand the business.

    The first question is:

    What needs to exist in the system?
    

    In a modern e-commerce system, we can identify:

    Product
    
    Cart
    
    Order
    
    Payment
    
    Conversation
    
    Message
    
    Document
    

    These elements are called entities.

    What is an Entity?

    An entity represents something important to the business.

    Examples:

    • Product
    • Order
    • Payment
    • Conversation

    An important rule:

    We model business concepts. We don't model screens.

    Wrong:

    CheckoutScreen
    

    Correct:

    Order
    

    Product: The Central Entity

    In an e-commerce system, almost everything revolves around the product.

    A simple structure could contain:

    id
    name
    slug
    description
    price
    stock_quantity
    is_active
    created_at
    updated_at
    

    The slug field is usually used in friendly URLs:

    industrial-lamp-black
    

    The product is directly related to:

    Cart
    
    Order
    

    That's why it's usually one of the most important entities in the domain.
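
    As a rough illustration, the field list above could become a table like the one below. This is a minimal sketch, not a definitive schema: it uses Python's built-in sqlite3 module with an in-memory database so it runs anywhere, and the column types, the UNIQUE constraint on slug, and storing price as integer cents are all assumptions. In PostgreSQL we would more likely reach for NUMERIC, BOOLEAN, and TIMESTAMPTZ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column names follow the field list above; types and constraints are assumptions.
conn.execute("""
    CREATE TABLE products (
        id             INTEGER PRIMARY KEY,
        name           TEXT    NOT NULL,
        slug           TEXT    NOT NULL UNIQUE,              -- used in friendly URLs
        description    TEXT,
        price          INTEGER NOT NULL CHECK (price >= 0),  -- assumed: cents
        stock_quantity INTEGER NOT NULL DEFAULT 0 CHECK (stock_quantity >= 0),
        is_active      INTEGER NOT NULL DEFAULT 1,           -- stored as 0/1 in SQLite
        created_at     TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
        updated_at     TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
""")

conn.execute(
    "INSERT INTO products (name, slug, price, stock_quantity) VALUES (?, ?, ?, ?)",
    ("Industrial Lamp Black", "industrial-lamp-black", 12900, 10),
)

# Look the product up the way a URL handler would: by slug.
product = conn.execute(
    "SELECT id, name, price FROM products WHERE slug = ?",
    ("industrial-lamp-black",),
).fetchone()

# A second product with the same slug is rejected by the UNIQUE constraint.
try:
    conn.execute(
        "INSERT INTO products (name, slug, price) VALUES (?, ?, ?)",
        ("Another Lamp", "industrial-lamp-black", 9900),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

    Making the slug unique is a design choice: if two products shared a slug, a friendly URL could no longer identify a single product.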


    Shopping Cart

    The cart represents an intention to purchase.

    Good modeling separates:

    Cart
    
    Cart Item
    

    The cart represents the session.

    The items represent the products added.

    Example:

    Cart
    ↓
    Cart Items
    ↓
    Product
    

    This separation allows:

    • multiple products;
    • quantity control;
    • price history.

    An important practice is to store the product's price on the cart item at the moment it is added.

    This way, later changes to the catalog price don't alter what the customer already saw in the cart.
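
    The cart and cart item split, together with the stored price, can be sketched as follows. This is an illustration under assumptions, not a reference schema: SQLite (through Python's sqlite3 module) stands in for PostgreSQL, and the table names, the unit_price column, and the add_to_cart helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
    CREATE TABLE products (
        id    INTEGER PRIMARY KEY,
        name  TEXT    NOT NULL,
        price INTEGER NOT NULL            -- assumed: cents
    );
    CREATE TABLE carts (
        id         INTEGER PRIMARY KEY,
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TABLE cart_items (
        id         INTEGER PRIMARY KEY,
        cart_id    INTEGER NOT NULL REFERENCES carts(id),
        product_id INTEGER NOT NULL REFERENCES products(id),
        quantity   INTEGER NOT NULL CHECK (quantity > 0),
        unit_price INTEGER NOT NULL,      -- price copied when the item is added
        UNIQUE (cart_id, product_id)      -- one row per product per cart
    );
""")


def add_to_cart(cart_id, product_id, quantity):
    """Add a product to the cart, copying its current catalog price."""
    (price,) = conn.execute(
        "SELECT price FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    conn.execute(
        "INSERT INTO cart_items (cart_id, product_id, quantity, unit_price) "
        "VALUES (?, ?, ?, ?)",
        (cart_id, product_id, quantity, price),
    )


conn.execute("INSERT INTO products (id, name, price) VALUES (1, 'Industrial Lamp', 12900)")
conn.execute("INSERT INTO carts (id) VALUES (1)")
add_to_cart(cart_id=1, product_id=1, quantity=2)

# The catalog price changes later; the cart item keeps the price the customer saw.
conn.execute("UPDATE products SET price = 14900 WHERE id = 1")

cart_total = conn.execute(
    "SELECT SUM(quantity * unit_price) FROM cart_items WHERE cart_id = 1"
).fetchone()[0]
catalog_price = conn.execute("SELECT price FROM products WHERE id = 1").fetchone()[0]
```

    The cart total is computed from cart_items.unit_price, so it stays at the original price even though the catalog now shows a different one.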


    Orders and Payments Are Not the Same Thing

    A common mistake in beginner systems is to mix orders and payments.

    But they are different concepts.

    Example:

    Order created
    ↓
    Payment failed
    

    The order continues to exist.

    Therefore, we need separate entities.

    Order
    ↓
    Payment
    

    The order represents the customer's commitment to purchase.

    The payment represents a financial event.

    This separation improves:

    • auditing;
    • traceability;
    • consistency.
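
    To make the separation concrete, here is a small sketch in which one order receives a failed payment and then an approved one. It is only an illustration: SQLite (through Python's sqlite3 module) stands in for PostgreSQL, and the column names and status values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
    CREATE TABLE orders (
        id     INTEGER PRIMARY KEY,
        status TEXT    NOT NULL DEFAULT 'pending',  -- assumed values: pending, paid
        total  INTEGER NOT NULL                     -- assumed: cents
    );
    CREATE TABLE payments (
        id       INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL REFERENCES orders(id),
        status   TEXT    NOT NULL,                  -- assumed values: failed, approved
        amount   INTEGER NOT NULL
    );
""")


def order_status(order_id):
    """Return the current status of an order."""
    return conn.execute(
        "SELECT status FROM orders WHERE id = ?", (order_id,)
    ).fetchone()[0]


conn.execute("INSERT INTO orders (id, total) VALUES (1, 25800)")

# First attempt fails: the event is recorded, and the order is still there.
conn.execute("INSERT INTO payments (order_id, status, amount) VALUES (1, 'failed', 25800)")
status_after_failure = order_status(1)

# Second attempt succeeds: a new payment row is added and the order moves to paid.
conn.execute("INSERT INTO payments (order_id, status, amount) VALUES (1, 'approved', 25800)")
conn.execute("UPDATE orders SET status = 'paid' WHERE id = 1")

payment_history = [
    row[0]
    for row in conn.execute("SELECT status FROM payments WHERE order_id = 1 ORDER BY id")
]
```

    Because every attempt is its own row, the failed payment remains visible for auditing after the order has been paid.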

    Modeling Conversations

    Modern systems often have integrated support.

    For this, we usually use two entities:

    Conversation
    
    Message
    

    The conversation represents a session.

    The message represents an interaction.

    Flow:

    Conversation
    ↓
    Messages
    

    This structure allows:

    • complete history;
    • AI integration;
    • transfer to human support.
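
    A conversation with its messages could be stored along these lines. This is a sketch under assumptions: SQLite (through Python's sqlite3 module) stands in for PostgreSQL, and the sender column with its three roles is a hypothetical way to tell the customer, the AI, and a human agent apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
    CREATE TABLE conversations (
        id         INTEGER PRIMARY KEY,
        started_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TABLE messages (
        id              INTEGER PRIMARY KEY,
        conversation_id INTEGER NOT NULL REFERENCES conversations(id),
        sender          TEXT    NOT NULL
                        CHECK (sender IN ('customer', 'ai', 'human_agent')),
        content         TEXT    NOT NULL,
        created_at      TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
""")

conn.execute("INSERT INTO conversations (id) VALUES (1)")
conn.executemany(
    "INSERT INTO messages (conversation_id, sender, content) VALUES (1, ?, ?)",
    [
        ("customer", "Can I return a product?"),
        ("ai", "Let me check the return policy for you."),
        ("human_agent", "Hi, I'm taking over from the assistant."),
    ],
)

# The complete history is a single ordered query over one conversation.
history = conn.execute(
    "SELECT sender, content FROM messages WHERE conversation_id = 1 ORDER BY id"
).fetchall()

# A message cannot point at a conversation that does not exist.
try:
    conn.execute(
        "INSERT INTO messages (conversation_id, sender, content) VALUES (999, 'customer', 'Hello?')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

    The foreign key is what turns the arrow in the flow above into a rule the database enforces.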

    Knowledge Base for AI

    Modern AI applications need to store knowledge.

    Examples:

    Return Policy
    
    FAQ
    
    Warranties
    
    Documentation
    
    Procedures
    

    A good practice is to separate:

    Document
    
    Chunk
    

    Why?

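    Because large documents don't work well in RAG systems: a single embedding for a long document blurs its topics together, and retrieving the whole document fills the model's context with irrelevant text.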

    By dividing them into small chunks, we can retrieve only the relevant content.

    Flow:

    Document
    ↓
    Chunks
    ↓
    Embeddings
    ↓
    Vector Search
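
    The first step of this flow, turning a document into chunks, can be sketched with a simple splitter. This is a minimal example under assumptions: the split_into_chunks function, the character-based window, and the max_chars and overlap values are all hypothetical, and real systems often split on sentences, paragraphs, or tokens instead.

```python
def split_into_chunks(text, max_chars=200, overlap=40):
    """Split text into character windows that overlap by `overlap` characters."""
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    if not text:
        return []
    chunks = []
    start = 0
    while True:
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this window already reached the end of the text
        start += max_chars - overlap
    return chunks


# A made-up document, long enough to need several chunks.
document = " ".join(f"word{i}" for i in range(100))
chunks = split_into_chunks(document, max_chars=200, overlap=40)
```

    The overlap is a design choice: repeating a little text at the boundaries reduces the chance that a sentence cut in half is lost to both neighboring chunks.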
    

    What is pgvector?

    Traditionally, PostgreSQL was used mainly for relational data.

    Today, we can also turn it into a semantic search platform.

    This is possible using the extension:

    pgvector
    

    With it, we store:

    Text
    
    +
    
    Embedding
    

    Example:

    Return Policy
    
    ↓
    
    [0.21, -0.77, 0.44, ...]
    

    When the user asks a question:

    Can I return a product?
    

    The system converts the question into an embedding and searches for the most similar stored vectors.

    Result:

    Relevant chunks found
    

    This approach is the basis of modern RAG systems.
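
    To show what "most similar" means, here is the ranking idea in plain Python. It is a toy sketch, not how pgvector is implemented: the three-dimensional vectors and the chunk texts are made up by hand, real embeddings come from an embedding model and have hundreds of dimensions or more, and with pgvector the database performs this ranking itself (the extension provides distance operators such as <=> for cosine distance).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_embedding, stored_chunks, k=2):
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(
        stored_chunks,
        key=lambda chunk: cosine_similarity(query_embedding, chunk[1]),
        reverse=True,
    )
    return ranked[:k]


# Hand-made (text, embedding) pairs standing in for rows of a chunks table.
stored_chunks = [
    ("Return policy: how to send a product back.", [0.9, 0.1, 0.0]),
    ("Shipping: delivery options and tracking.", [0.1, 0.9, 0.1]),
    ("Warranty: coverage for manufacturing defects.", [0.2, 0.1, 0.9]),
]

# Pretend this is the embedding of "Can I return a product?".
question_embedding = [0.8, 0.2, 0.1]

results = top_k(question_embedding, stored_chunks, k=2)
best_text = results[0][0]
```

    Sorting every chunk works for a handful of rows. At scale, a vector index is what keeps this search fast.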


    Indexes and Performance

    As tables grow, queries that must scan every row become slower.

    The solution is to use indexes, which let the database locate matching rows without reading the whole table.

    Example:

    CREATE INDEX idx_products_slug
    ON products(slug);
    

    Other frequently indexed fields:

    status
    
    product_id
    
    order_id
    
    conversation_id
    

    Without indexes, even good databases can become slow.
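
    One way to see an index at work is to ask the database for its query plan before and after creating it. The sketch below does this with SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. That is an assumption made so the example runs anywhere: PostgreSQL has its own EXPLAIN command with different output, and the query_plan helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, slug TEXT)")
conn.executemany(
    "INSERT INTO products (name, slug) VALUES (?, ?)",
    [(f"Product {i}", f"product-{i}") for i in range(1000)],
)


def query_plan(sql, params=()):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


lookup = "SELECT id, name FROM products WHERE slug = ?"

# Without an index, the only option is to scan the table row by row.
plan_before = query_plan(lookup, ("product-500",))

conn.execute("CREATE INDEX idx_products_slug ON products(slug)")

# With the index, the planner can jump straight to the matching slug.
plan_after = query_plan(lookup, ("product-500",))
```

    Indexes are not free: each one takes storage and must be updated on every write, so we index the columns we actually filter or join on.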


    Consistency and Transactions

    One of the biggest advantages of PostgreSQL is its ability to maintain consistency.

    Imagine:

    Payment approved
    

    We need to update:

    Payment
    
    and
    
    Order
    

    These operations must happen together.

    To do this, we use transactions.

    BEGIN;
    
    UPDATE payments ...
    
    UPDATE orders ...
    
    COMMIT;
    

    If something fails:

    ROLLBACK;
    

    Everything returns to its previous state.

    This prevents critical inconsistencies.
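
    The same idea can be tried end to end with a failure injected between the two updates. This is a sketch under assumptions: SQLite (through Python's sqlite3 module) stands in for PostgreSQL, and the approve_payment helper, its fail_midway flag, and the status values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE payments (
        id       INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL REFERENCES orders(id),
        status   TEXT    NOT NULL
    );
    INSERT INTO orders (id, status) VALUES (1, 'pending');
    INSERT INTO payments (id, order_id, status) VALUES (1, 1, 'pending');
""")


def approve_payment(payment_id, order_id, fail_midway=False):
    """Mark the payment approved and the order paid as one unit of work."""
    # Used as a context manager, the connection commits if the block
    # finishes and rolls back if the block raises an exception.
    with conn:
        conn.execute("UPDATE payments SET status = 'approved' WHERE id = ?", (payment_id,))
        if fail_midway:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute("UPDATE orders SET status = 'paid' WHERE id = ?", (order_id,))


def statuses():
    """Return (payment status, order status) for the sample rows."""
    payment = conn.execute("SELECT status FROM payments WHERE id = 1").fetchone()[0]
    order = conn.execute("SELECT status FROM orders WHERE id = 1").fetchone()[0]
    return payment, order


try:
    approve_payment(1, 1, fail_midway=True)
except RuntimeError:
    pass
after_failure = statuses()  # the first UPDATE was rolled back too

approve_payment(1, 1)
after_success = statuses()
```

    After the simulated failure, the payment is not left approved while the order is still pending: either both rows change or neither does.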


    Observability of the Data Layer

    Databases also need to be monitored.

    Some important metrics:

    • slow queries;
    • open connections;
    • CPU usage;
    • memory usage;
    • database growth;
    • disk space.

    Good observability allows us to identify problems before they impact users.
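
    As a small illustration of the first metric, slow queries, an application can time its own queries and record the ones that cross a threshold. This is only a sketch: the timed_query helper and the threshold value are hypothetical, SQLite stands in for PostgreSQL, and PostgreSQL can also log slow statements on its own (for example through the log_min_duration_statement setting).

```python
import sqlite3
import time

SLOW_QUERY_THRESHOLD_SECONDS = 0.1  # hypothetical value; tune it per system

slow_queries = []  # in a real system this would go to logs or a metrics backend


def timed_query(conn, sql, params=(), threshold=SLOW_QUERY_THRESHOLD_SECONDS):
    """Run a query and record it if it takes at least `threshold` seconds."""
    started = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - started
    if elapsed >= threshold:
        slow_queries.append({"sql": sql, "seconds": elapsed})
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO orders (status) VALUES ('pending')")

# A very high threshold: this query is never recorded as slow.
all_orders = timed_query(conn, "SELECT id, status FROM orders", threshold=60.0)

# A threshold of zero: every query is recorded, which is handy for a demo.
pending = timed_query(
    conn, "SELECT id FROM orders WHERE status = ?", ("pending",), threshold=0.0
)
```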


    The Lumina Store Model

    In the end, the domain of our example store, Lumina Store, is composed of:

    Products
    
    Carts
    
    Cart Items
    
    Orders
    
    Payments
    
    Conversations
    
    Messages
    
    Documents
    
    Chunks
    

    Each entity has a specific responsibility.

    Together, they support the entire system operation.


    Conclusion

    Data architecture goes far beyond creating tables.

    It involves understanding the business, modeling concepts correctly, and ensuring that data remains consistent over time.

    Throughout this article, we've seen:

    • the role of the database;
    • how to model entities;
    • how to create relationships;
    • how to use PostgreSQL;
    • how to support AI with pgvector;
    • how to ensure consistency with transactions;
    • how to monitor the data layer.

    The main lesson is simple:

    Databases don't just store information. They store the state of the business.

    And understanding this is one of the first steps to thinking like a data architect.
