Data Objects

A Data Object is a business concept — Customer, Order, Invoice, Product, Contract — modeled once and shared across the applications and integrations that touch it. The Data Object layer answers three questions the portfolio keeps coming back to: what information does the business care about, which applications operate on it, and which flows move it around. It does not describe where that information physically lives. Storage is modeled separately, as an IT Component.

A Data Object represents a meaningful business concept — something the business names and discusses, regardless of how any particular application stores it. “Customer” is a Data Object whether it lives as a row in a relational database, a document in a JSON store, or a Salesforce account record. Three tests, applied together, settle most cases:

  1. The business names it. A non-technical stakeholder recognizes the word and can describe what one instance of it means. “Customer”, “Invoice”, “Shipment” pass. “customer_master_v2”, “InvoiceDTO”, “ShipmentDocumentEntity” do not.
  2. It is stable. The concept outlives the systems that store it. Applications come and go; a Customer is still a Customer across generations of CRM.
  3. Multiple applications care about it. If exactly one application ever touches it, it is probably an internal implementation detail of that application, not a landscape-level Data Object.

Typical examples: Customer, Order, Invoice, Product, Contract, Employee, Shipment, Claim, Policy, Transaction.

What does not belong as a Data Object:

  • A database table or a document collection. These are physical storage — model the database or document store as an IT Component.
  • A DTO, message payload, or API response shape. These are transport formats internal to an application or an integration.
  • A UI form or screen model. These describe presentation, not the underlying business concept.
  • A report or view. These are derived outputs; the Data Object is the raw business concept the report draws from.

Data Objects and IT Components sit on different axes of the same picture. Data Objects describe what information the business cares about; IT Components describe the technology that stores and moves it.

If it is…                                               | Model it as  | Why
--------------------------------------------------------|--------------|----------------------------------------------------
A business concept stakeholders name                    | Data Object  | Portfolio-level; independent of where it is stored.
A database, warehouse, lake, or bucket                  | IT Component | Technology the business does not discuss by name.
A specific database instance holding multiple concepts  | IT Component | One PostgreSQL cluster can store many Data Objects.

The relationship between the two layers is implicit: applications use IT Components (the databases and stores) and operate on Data Objects (the business concepts). Albumi does not link a Data Object to a specific IT Component directly. If you need to know “where does Customer live”, follow the applications that operate on Customer and look at the IT Components those applications use.
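The indirect lookup described above can be sketched in a few lines. This is a minimal illustration, not Albumi's API: the `Application` class, its `operates_on` and `uses` fields, and the sample landscape are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Application:
    # Hypothetical shape: names of the Data Objects the application operates on
    # and of the IT Components it uses.
    name: str
    operates_on: set[str] = field(default_factory=set)
    uses: set[str] = field(default_factory=set)


def candidate_stores(data_object: str, applications: list[Application]) -> dict[str, set[str]]:
    """Map each application operating on data_object to the IT Components it uses."""
    return {
        app.name: set(app.uses)
        for app in applications
        if data_object in app.operates_on
    }


apps = [
    Application("CRM", {"Customer", "Contract"}, {"PostgreSQL cluster"}),
    Application("Billing", {"Customer", "Invoice"}, {"Billing warehouse"}),
    Application("Wiki", set(), {"Object bucket"}),
]
stores = candidate_stores("Customer", apps)
```

The result is a list of candidates rather than a definitive answer: an application may use IT Components that hold none of the Customer data it operates on.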

Getting granularity right is the hardest part of modeling Data Objects. A workable default: one Data Object per business concept the organization names independently and multiple applications share.

  • Too coarse-grained. Collapsing several distinct concepts into one — for example, a single “Party” Data Object that is supposed to cover Customers, Suppliers, and Employees at once, when the business actually treats them as three different things with different owners, processes, and systems.
  • Too fine-grained. Splitting a single business concept into many small ones. The common trap is Address: unless addresses live on their own and are referenced independently by multiple applications (a reusable Address book), Address is an attribute of Customer, not a separate Data Object. The same applies to Contact Info, Line Item, Attachment, and similar sub-structures.

The signal that you are modeling at too fine a grain: Data Objects that are only ever carried together in the same integration, only ever operated on by the same applications, and never discussed independently by the business. Collapse them into the parent concept.
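Two of the three signals above are mechanical and can be checked against an export of the catalog. The sketch below assumes two hypothetical mappings, `operated_by` (Data Object name to application names) and `carried_by` (Data Object name to integration names); whether the business discusses the concepts independently still needs a human answer.

```python
from itertools import combinations


def merge_candidates(
    operated_by: dict[str, set[str]],
    carried_by: dict[str, set[str]],
) -> list[tuple[str, str]]:
    """Return pairs of Data Objects that share identical application and integration sets."""
    names = sorted(operated_by)
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if operated_by[a] == operated_by[b]
        and carried_by.get(a, set()) == carried_by.get(b, set())
    ]


# Hypothetical sample data.
operated_by = {
    "Customer": {"CRM", "Billing"},
    "Address": {"CRM", "Billing"},
    "Invoice": {"Billing"},
}
carried_by = {
    "Customer": {"crm-to-billing"},
    "Address": {"crm-to-billing"},
    "Invoice": {"billing-to-ledger"},
}
pairs = merge_candidates(operated_by, carried_by)
```

A pair in the output is only a prompt for review, not a verdict: Address may still deserve its own Data Object if the business names and governs it independently.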

If you do need to express a containment relationship (Order contains Order Line, or Account contains Sub-Account), Albumi supports a parent/child hierarchy on Data Objects. A child is still a first-class Data Object with its own classification, owner, and set of operating applications; the parent link only records the business containment and does not move data between them.

The Classification attribute records the sensitivity of the information the Data Object represents. It is the primary signal the portfolio uses to spot where sensitive data travels and which applications and integrations deserve extra scrutiny. Four levels:

  • Public — information intended for or appropriate for public disclosure. Product catalog, published marketing content, regulatory filings already on the public record.
  • Internal — information meant for employees and internal systems but not for external release. Organizational charts, internal policies, routine business data without personal or financial sensitivity.
  • Confidential — information whose disclosure would harm the business or the people it describes. Customer records, pricing, contracts, most employee data, non-public financial figures.
  • Restricted — information whose disclosure would cause serious harm or violate specific obligations. Credit card numbers, health records, authentication credentials, data under legal hold.

Pick the highest level any instance of this Data Object would carry. A Customer Data Object that sometimes contains payment details should be Restricted, not averaged down to Confidential because most customers don’t have a card on file. Classification is optional, but leaving it unset means the Data Object is invisible to every view that filters or colors by sensitivity.
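The "highest level wins" rule is easy to express once the four levels are ordered. The enum and the `classify` helper below are illustrative assumptions, not part of Albumi; only the level names and their order come from the list above.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Classification(IntEnum):
    # Ordered from least to most sensitive, following the four levels above.
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


def classify(instance_levels: Iterable[Classification]) -> Optional[Classification]:
    """Return the highest level any instance carries, or None when nothing is known."""
    return max(instance_levels, default=None)


# Most customers are Confidential, but some carry payment details.
customer_level = classify([
    Classification.CONFIDENTIAL,
    Classification.CONFIDENTIAL,
    Classification.RESTRICTED,
])
```

Taking the maximum rather than the most common value is the point: one Restricted instance is enough to make the whole Data Object Restricted.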

Classification describes sensitivity in general terms. Compliance flags call out specific regulatory regimes that attach to the data regardless of its classification level. They are independent booleans — a Data Object can carry any combination of them.

  • PII — the Data Object contains personally identifiable information about natural persons. Names, email addresses, phone numbers, government identifiers. Drives privacy reviews, data-subject-access workflows, and retention discussions.
  • PCI — the Data Object is in scope for payment card handling. Card numbers, cardholder names tied to card data, authentication values. Drives network segmentation and audit-scope conversations.

Flags are set independently of Classification. A PII Data Object is usually at least Confidential, but audit and privacy teams filter on the flag, not on the classification level. Use both: Classification for the portfolio-wide picture of sensitivity, flags for the concrete regulatory conversations.
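To make the independence concrete, here is a small sketch of a catalog where Classification and the two flags are separate fields. The `DataObject` class and its field names are hypothetical; they only mirror the attributes described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataObject:
    name: str
    classification: Optional[str] = None  # "Public", "Internal", "Confidential", "Restricted", or unset
    pii: bool = False  # independent of classification
    pci: bool = False  # independent of classification and of pii


catalog = [
    DataObject("Customer", "Restricted", pii=True, pci=True),
    DataObject("Employee", "Confidential", pii=True),
    DataObject("Product", "Public"),
    DataObject("Shipment"),  # unclassified, no flags
]

# Audit and privacy scopes filter on the flags, not on the classification.
pii_scope = [d.name for d in catalog if d.pii]
pci_scope = [d.name for d in catalog if d.pci]
unclassified = [d.name for d in catalog if d.classification is None]
```

The `unclassified` list is the set of Data Objects that sensitivity-based views would not show at all.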

A Retention Period attribute records how long instances of this Data Object should be kept — free-text so it can match the actual policy language (“7 years after contract end”, “delete on request”, “indefinite”). Use it when the retention rule is policy-driven rather than technical.

Every link between an Application and a Data Object is annotated with the operations the application performs on that data — Create, Read, Update, Delete. The link is declared on the Application side, per Data Object. An application that only displays Customer records declares Read; an application that edits Customer records declares Read and Update; the application where new Customer records are originated declares Create.

What the operation set captures:

  • Who produces the data. Applications with Create are the origination points; without at least one, a Data Object has no source in the landscape.
  • Who consumes the data. Applications with Read but no write operations are consumers — they will want the data to flow in from somewhere.
  • Who modifies the data after creation. Applications with Update or Delete have write privileges on existing records; when several applications write to the same Data Object, that is a data-quality conversation worth having explicitly.
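The three readings of the operation set can be derived mechanically from the per-application links. The sketch below assumes a hypothetical mapping from application name to its operations on one Data Object; the `Op` flag type and the `roles` helper are illustrative, not an Albumi interface.

```python
from enum import Flag, auto


class Op(Flag):
    CREATE = auto()
    READ = auto()
    UPDATE = auto()
    DELETE = auto()


WRITES = Op.CREATE | Op.UPDATE | Op.DELETE


def roles(links: dict[str, Op]) -> dict[str, list[str]]:
    """Group applications into producers, read-only consumers, and modifiers."""
    return {
        "producers": sorted(a for a, ops in links.items() if Op.CREATE in ops),
        "consumers": sorted(
            a for a, ops in links.items() if Op.READ in ops and not (ops & WRITES)
        ),
        "modifiers": sorted(
            a for a, ops in links.items() if ops & (Op.UPDATE | Op.DELETE)
        ),
    }


# Hypothetical links for the Customer Data Object.
customer_links = {
    "CRM": Op.CREATE | Op.READ | Op.UPDATE,
    "Support Desk": Op.READ | Op.UPDATE,
    "Reporting": Op.READ,
}
summary = roles(customer_links)
```

An empty `producers` list means the Data Object has no source in the landscape, and more than one entry in `modifiers` is the multiple-writers case worth discussing.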

Albumi does not currently designate one of the applications as the authoritative master, System of Record, or single source of truth for a Data Object. If that is how your organization governs data, record the decision in a supporting note or tag on the application for now — see the open questions above.

An Integration can carry one or more Data Objects between its source and target applications. Each Data Object on an integration is also annotated with operations, but the operation set is narrower at this layer: Create, Update, Delete only. Read is not an integration-layer operation — reading is a local concern of an application, and a data flow that moves records from one application to another is always a write-shaped operation on the receiving side (a new record is created there, an existing one is updated, or one is deleted).
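The narrower operation set can be expressed as a simple validation rule. This is a sketch of the rule as stated above, with hypothetical names; it does not describe how Albumi enforces it.

```python
APPLICATION_OPS = frozenset({"Create", "Read", "Update", "Delete"})
# Integrations carry write-shaped operations only, so Read is excluded.
INTEGRATION_OPS = APPLICATION_OPS - {"Read"}


def validate_integration_ops(ops):
    """Return the operations as a frozenset, rejecting anything not valid on an integration."""
    requested = set(ops)
    invalid = requested - INTEGRATION_OPS
    if invalid:
        raise ValueError(f"not valid on an integration: {sorted(invalid)}")
    return frozenset(requested)


accepted = validate_integration_ops(["Create", "Update"])
```

Passing `["Read"]` raises `ValueError`, which mirrors the rule that reading is a local concern of an application rather than something a flow does.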

Optionally, the Data Object link on an integration carries transformation notes — a free-text field for documenting mappings, enrichments, or format conversions that the integration applies. Use it when the transformation is business-relevant; leave it unset for pass-through flows.

For a single Data Object, the set of integrations carrying it — source, target, operations, transformation — is its lineage: where the data is produced, which flows move it, and which applications receive it. Albumi exposes the lineage as a view on the Data Object’s detail page.
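Lineage, as defined here, is a filter over integrations. The following sketch assembles it from a hypothetical in-memory model; the `Integration` and `Carried` classes and the sample flows are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Carried:
    data_object: str
    operations: frozenset  # subset of {"Create", "Update", "Delete"}
    transformation: str = ""  # optional free-text notes


@dataclass(frozen=True)
class Integration:
    name: str
    source: str
    target: str
    carries: tuple  # tuple of Carried links


def lineage(data_object: str, integrations: list) -> list:
    """List (source, target, operations, transformation) for every flow carrying data_object."""
    rows = []
    for integ in integrations:
        for link in integ.carries:
            if link.data_object == data_object:
                rows.append(
                    (integ.source, integ.target, sorted(link.operations), link.transformation)
                )
    return rows


flows = [
    Integration(
        "crm-to-billing", "CRM", "Billing",
        (Carried("Customer", frozenset({"Create", "Update"}), "map account id to billing id"),),
    ),
    Integration(
        "billing-to-ledger", "Billing", "Ledger",
        (Carried("Invoice", frozenset({"Create"})),),
    ),
]
customer_lineage = lineage("Customer", flows)
```

Each row answers the three lineage questions for one flow: where the data comes from, where it goes, and what happens to it on the way.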

A Data Object has one owner and one organization, and the two answer different questions.

  • Owner — the person accountable for the Data Object as a business concept. The single human to go to for decisions about its definition, classification, and retention. Ownership is also the governance signal that decides who can edit the Data Object directly versus propose a change; a non-owner must propose the change rather than edit directly. See Permissions & Roles.
  • Organization — the organizational unit the Data Object belongs to for classification and reporting (the data-owning department, the business domain that stewards it). This is not a permission dimension. Membership in the owning organization grants nobody any access.

Every Data Object has an organization — if nothing specific applies, it sits under the workspace root organization. The owner may be unset; the organization is always set.

A Data Object is a reference entity. Its status is either Active or Archived — there is no five-stage lifecycle with planned, phase-in, and retirement dates. The same Customer concept lives for as long as the business uses the word; applications that store or process Customer records do go through a lifecycle, but the concept they operate on does not.

Archive a Data Object when the business has stopped using the concept and no application or integration still references it. Archiving is reversible; it is a status signal, not a deletion.
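The second half of that guideline, that nothing still references the Data Object, can be checked before archiving. The sketch below treats it as a modeling check over two hypothetical mappings; the source does not say whether Albumi enforces it, and whether the business has stopped using the concept remains a human judgment.

```python
def can_archive(
    data_object: str,
    application_links: dict[str, set[str]],
    integration_links: dict[str, set[str]],
) -> bool:
    """True when no application or integration still references data_object."""
    used_by_apps = any(data_object in objs for objs in application_links.values())
    used_by_integrations = any(data_object in objs for objs in integration_links.values())
    return not (used_by_apps or used_by_integrations)


# Hypothetical landscape: application or integration name -> Data Objects it references.
application_links = {"CRM": {"Customer"}, "Billing": {"Customer", "Invoice"}}
integration_links = {"crm-to-billing": {"Customer"}}

voucher_ok = can_archive("Voucher", application_links, integration_links)
customer_ok = can_archive("Customer", application_links, integration_links)
```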

A Data Object takes part in four relationships:

  • Operated on by → Application. For each Application that touches this Data Object, the CRUD operations it performs. Edited from the Application.
  • Carried by → Integration. The flows that move this Data Object between applications. Each link records CUD operations and optional transformation notes. Edited from the Integration.
  • Associated with → Business Capability. The capabilities this Data Object supports. Edited from the Data Object.
  • Parent → Data Object. Optional containment hierarchy. A child Data Object is still a first-class concept; the link records business containment, not data movement. Edited from the Data Object.

Common modeling mistakes:

  • Modeling database tables as Data Objects. One row of customer_master in PostgreSQL is not a Data Object; it is a physical manifestation of one. The Data Object is Customer; the database is an IT Component used by the applications that operate on Customer.
  • Modeling DTOs, message payloads, or API response shapes as Data Objects. These are transport formats. They change with every contract revision; the business concept underneath does not.
  • Splitting one concept into many fine-grained ones. Address, Contact Info, Line Item, Attachment — if they are always embedded in a larger concept and never referenced independently, they are attributes, not Data Objects. Model them as structure inside the parent concept.
  • Collapsing different concepts into one. The opposite failure: one “Party” Data Object covering Customers, Suppliers, Partners, and Employees because they all have names and addresses. If the business treats them as different things — different owners, different applications, different processes — they are different Data Objects.
  • Claiming an authoritative source by convention. It is tempting to verbally agree that “the CRM is the System of Record for Customer” and leave it at that. Albumi does not currently record this designation, so whatever is not in the tool is not in the governance record — flag the assumption explicitly until the product supports it.
  • Forgetting Classification and compliance flags. An unclassified Data Object with no PII or PCI flag is invisible to every report that filters on sensitivity. Set them at the time you create the Data Object; revisiting the entire catalog later is a much bigger job.
  • Using the organization as access control. The Data Object’s organization is classification, not permission. A user in the same organization as a Data Object has no special rights on it.