Salesforce orgs accumulate data quickly. Case comments, emails, logs, and integration feeds expand storage usage, raise licensing costs, and gradually degrade performance. At the same time, compliance frameworks like GDPR and HIPAA require organizations to retain certain records for defined periods and delete them when that period ends.
Without a clear archiving policy, teams face mounting storage costs, slower systems, and regulatory risk. A documented retention plan addresses all three by defining what to keep, what to offload, and when to delete.
This guide outlines a practical approach to building a Salesforce data archiving policy. You'll learn how to identify archivable records, map policies to regulatory requirements, use native Salesforce tools to automate retention, and maintain audit-ready documentation—all without introducing unnecessary complexity.
Step 1: Map Regulatory and Business Requirements
Before archiving data, you need to define how long each type of record should be kept and why. The foundation of any archiving policy is a clear set of retention rules based on both legal obligations and internal needs.
Start with Regulatory Requirements
Define minimum retention periods based on the regulations that apply to your data. Failing to align with these requirements can lead to noncompliance, unnecessary data exposure, or missed deletion deadlines during audits.
- HIPAA: Retain protected health information for at least 6 years.
- SOX: Keep financial records and related communication for 7 years.
- GDPR: Retain personal data only as long as necessary. Systems must support deletion on request (Article 17).
- PCI-DSS: Cardholder data should only be kept while there’s a legal or business need, typically 2–3 years.
GDPR also requires support for targeted deletions, meaning archives must allow for secure removal of individual records without affecting related data.
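On platform, big objects (one of the archive stores covered in Step 4) support this kind of targeted removal. A minimal sketch, assuming a hypothetical Case_Archive__b big object indexed on (Closed_Date__c, Case_Id__c); Database.deleteImmediate removes only the row identified by its index values:

// Hypothetical big object; all index fields must be set to identify the row.
Case_Archive__b row = new Case_Archive__b();
row.Closed_Date__c = Datetime.newInstance(2019, 6, 1);
row.Case_Id__c = '500000000000001';  // placeholder record Id
Database.deleteImmediate(row);

Because the delete targets a single indexed row, related archived records are untouched, which is exactly the behavior Article 17 requests demand.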
Include Business Requirements
A strong retention policy also supports operational goals such as cost control, system performance, and data accessibility. These considerations are just as important as compliance and should be factored into every retention rule.
- Reducing storage costs through archiving
- Improving performance by keeping production tables lean
- Retaining access to historical data for audits or analysis
The goal is to maintain what’s necessary, remove what’s not, and keep systems responsive.
Use Salesforce Tools to Enforce Policy
Once retention rules are defined, the next step is enforcing them with tools that provide visibility, traceability, and audit support. Salesforce offers native capabilities that help teams apply retention policies without relying on third-party systems or manual workarounds. Salesforce tools that support defensible retention include:
- Shield Event Monitoring: Captures user activity across the org, including data exports, record views, and configuration changes. This helps verify who accessed which data and when.
- Field Audit Trail: Preserves a complete, tamper-evident history of changes to specified fields, even beyond Salesforce's standard retention limits.
These tools allow you to track data usage and changes over time, supporting both regulatory audits and internal policy reviews. Export logs regularly to confirm that records are retained or deleted according to defined timelines, and integrate them into your documentation to demonstrate compliance when requested.
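For example, with Event Monitoring enabled, log files can be pulled on a schedule through the standard EventLogFile object. A minimal SOQL sketch (ReportExport is one of many available event types):

// Requires a Shield or Event Monitoring add-on license.
List<EventLogFile> logs = [
    SELECT Id, EventType, LogDate, LogFileLength
    FROM EventLogFile
    WHERE EventType = 'ReportExport' AND LogDate = LAST_N_DAYS:7];
// The LogFile field holds the CSV payload for download and offline retention.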
Align with Legal and Compliance Early
Before automating retention, review timelines and exceptions with legal and compliance teams. Confirm how overlapping regulations apply and ensure documentation is audit-ready. Once approved, these rules become the basis for all downstream automation, ensuring your archiving policy is both operationally efficient and regulator-ready.
Step 2: Inventory and Classify Salesforce Data
Before applying retention rules, you need a complete and accurate picture of your data. A structured inventory helps identify what exists, how fast it’s growing, and what requires special handling due to volume or sensitivity. Follow these four steps to establish that baseline:
- Extract your metadata schema: Start by exporting metadata to understand which objects and fields are active in your org. Use the Salesforce CLI command:
sfdx force:mdapi:describemetadata -u prod --json > metadata.json
For a more detailed view, retrieve full metadata files containing object and field definitions. These files can be compared using metadata diff tools to detect changes between orgs and streamline documentation and policy reviews.
- Quantify data growth: Use the Storage Usage screen in Salesforce Setup and run basic queries (e.g. SELECT COUNT() on Task, Case, EmailMessage) to identify high-volume objects; see the sketch after this list. Tables with hundreds of thousands of records are strong candidates for archiving, since offloading even a small percentage can improve performance and reduce storage costs.
- Classify by data sensitivity: Review field names, descriptions, and help text to identify sensitive data, such as emails, ID numbers, card data, financial values, or health information. Tag fields according to relevant regulations like HIPAA, SOX, or GDPR. This step ensures retention and deletion policies align with legal obligations and internal controls.
- Identify AppExchange-managed objects: Use the following query to list managed package objects:
SELECT QualifiedApiName FROM EntityDefinition WHERE NamespacePrefix != null
Objects prefixed with namespaces like npsp__ or sbqq__ are often part of large transactional datasets maintained by external vendors. Because their update cycles are outside your control, it’s important to include them in your retention matrix and monitor their growth over time.
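The growth check in the second step can be scripted as anonymous Apex. A minimal sketch; the object list is illustrative:

// Rough row counts for common high-volume objects (anonymous Apex).
// COUNT() queries consume one query row per counted record, so very large
// tables may need the Storage Usage screen or async reporting instead.
System.debug('Tasks: ' + [SELECT COUNT() FROM Task]);
System.debug('Cases: ' + [SELECT COUNT() FROM Case]);
System.debug('Emails: ' + [SELECT COUNT() FROM EmailMessage]);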
A complete inventory ensures that every automation and policy decision rests on an accurate understanding of your org’s data structure and compliance footprint.
Step 3: Create Your Retention and Deletion Matrix
A retention matrix turns policy into object-level rules your team can automate. Start by listing high-volume or regulated objects. For each one, define how long to retain records, what happens after that period, and whether legal holds apply.
Example retention rules:
- Case
Criteria: Status = Closed
Retention: 5 years (operational)
Action: Archive, then delete
Legal Hold: No
- EmailMessage
Criteria: Related Case closed > 2 years
Retention: 2 years (retail privacy)
Action: Delete
Legal Hold: Yes, if under litigation
- Payment__c
Criteria: Status = Settled
Retention: 7 years (SOX)
Action: Archive, then delete
Legal Hold: No
- Patient_Record__c
Criteria: Discharged
Retention: 6 years (HIPAA)
Action: Archive, then delete
Legal Hold: Yes, if under regulatory review
Salesforce logs have short default retention windows: Field History is kept for 18 months (24 months when queried via the API), and the Setup Audit Trail covers only the last 180 days. If your policy requires longer timelines, account for exporting or archiving those logs separately.
When defining the Action, choose:
- Delete when records can be permanently removed
- Anonymize when identifiers must be stripped but the record must be retained for analysis
Apply similar logic to other objects and test in a sandbox before production. Track exceptions, such as legal holds or audit freezes, within the matrix. Record who approved each exception and when. Store the matrix alongside your Flow or Batch Apex jobs in version-controlled storage to document change history. Review your matrix with legal and compliance teams twice a year to keep retention periods aligned with evolving requirements.
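One way to make the matrix machine-readable is a custom metadata type, so automation reads the same rules the policy documents. A sketch, assuming a hypothetical Retention_Rule__mdt type with one record per object:

// Custom metadata is deployable and version-controlled alongside code.
for (Retention_Rule__mdt rule : [
        SELECT Object_API_Name__c, Retention_Years__c,
               Post_Retention_Action__c, Legal_Hold__c
        FROM Retention_Rule__mdt]) {
    System.debug(rule.Object_API_Name__c + ': keep ' + rule.Retention_Years__c +
        ' years, then ' + rule.Post_Retention_Action__c);
}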
Step 4: Choose an Archiving Architecture
Your archiving strategy must balance performance, compliance, and cost based on how your teams access and use historical data. Where you store archived data affects retrieval speed, compliance posture, and overall system complexity. Most Salesforce teams adopt one of three approaches. Each comes with tradeoffs:
- Salesforce big objects
- No additional license fees, but capacity limits apply
- Supports near-instant queries on indexed fields
- Access is controlled through object and field permissions rather than sharing rules, and platform encryption and Shield support are not available
- Requires precise indexing; non-indexed queries are slow and resource-intensive (see the query sketch at the end of this step)
- Works well for moderate data volumes and frequent on-platform reporting needs
- External data lakes
- Lowest storage cost at scale, often pay-as-you-go
- Requires ETL or Salesforce Connect for access, adding latency and complexity
- Access controls, encryption, and audit logs are managed outside Salesforce
- Suitable for high-growth orgs with multi-cloud strategies, long-term storage needs, or external analytics use cases
- Integrated platforms (e.g. Flosum)
- Look for tools that include the following features:
- Archive, CI/CD, and backup are bundled into one system
- Archived data can be restored directly within release pipelines
- Supports encryption (e.g. BYOK), tamper-evident logs, and audit-ready access
- Enables rollback at both the code and data layer without switching tools
- Ideal for teams automating deployments who need integrated recovery and compliance controls
When choosing an architecture, consider:
- Data volume: Big Objects handle billions; lakes handle petabytes
- Access patterns: Choose based on whether queries are frequent, ad hoc, or audit-driven
- Security model: Know whether encryption, access control, and logs must stay within Salesforce
- Automation needs: Integrated platforms support full-lifecycle workflows for release, rollback, and recovery
Regardless of architecture, apply least-privilege access to archived data, restoration jobs, and scheduling logic. Build your archive to meet today's needs, but structure it to scale as your data, compliance scope, and team complexity grow.
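To make the indexing constraint on big objects concrete: queries must filter the index fields in order, with a range allowed only on the last field referenced. A sketch against the hypothetical Case_Archive__b used earlier, indexed on (Closed_Date__c, Case_Id__c):

// Valid: range filter on the first (and only referenced) index field.
List<Case_Archive__b> rows = [
    SELECT Case_Id__c, Subject__c
    FROM Case_Archive__b
    WHERE Closed_Date__c >= 2018-01-01T00:00:00Z
      AND Closed_Date__c < 2019-01-01T00:00:00Z];
// Filtering on a non-index field such as Subject__c would be rejected.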
Step 5: Automate Archiving, Deletion, and Governance
Manual cleanup jobs can’t keep up with growing record volumes. A sustainable archiving policy depends on automation and clear governance. This includes how records are processed, who owns the policy, and how audit readiness is maintained over time.
Automation Options
Automation reduces manual effort, minimizes risk, and ensures consistency across high-volume data operations. The right approach depends on your org’s scale, data complexity, and how tightly archiving needs to align with your deployment process.
- No-Code Automation: Use Salesforce Flow with time-based triggers to archive and delete records. Scheduled Flows can filter records using criteria like LastModifiedDate < LAST_N_YEARS:5, copy them to a Big Object, and remove them from the source table. Assign a dedicated integration user with least-privilege access to run the Flow. This limits risk and satisfies audit expectations.
- Code-Based Automation: For large volumes or more complex requirements, use Batch Apex. A batch class can archive records in chunks and log results to a custom object such as Archive_Run__c. Trigger the batch from your CI/CD pipeline to align archiving with deployments.
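A minimal Batch Apex sketch of this pattern, using the hypothetical Case_Archive__b big object and the Archive_Run__c log object; field names are illustrative:

global class CaseArchiveBatch implements Database.Batchable<SObject>, Database.Stateful {
    global Integer archivedCount = 0;

    global Database.QueryLocator start(Database.BatchableContext bc) {
        // Same filter logic the Flow version would use.
        return Database.getQueryLocator(
            'SELECT Id, CaseNumber, Subject, ClosedDate FROM Case ' +
            'WHERE Status = \'Closed\' AND ClosedDate < LAST_N_YEARS:5');
    }

    global void execute(Database.BatchableContext bc, List<Case> scope) {
        List<Case_Archive__b> archive = new List<Case_Archive__b>();
        for (Case c : scope) {
            archive.add(new Case_Archive__b(
                Case_Id__c = c.Id, Case_Number__c = c.CaseNumber,
                Subject__c = c.Subject, Closed_Date__c = c.ClosedDate));
        }
        // Big object DML is immediate and can't share a transaction with
        // standard DML, so source-record deletion runs as a follow-up job
        // once pre/post counts have been verified.
        Database.insertImmediate(archive);
        archivedCount += scope.size();
    }

    global void finish(Database.BatchableContext bc) {
        insert new Archive_Run__c(
            Records_Archived__c = archivedCount, Completed_At__c = System.now());
    }
}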
Regardless of the method, always:
- Preserve parent-child relationships using stored record IDs
- Count records before and after execution
- Abort jobs if counts deviate from expected volumes
- Log every job run, including filters used, record counts, and execution metadata
- Trigger real-time alerts via Platform Events if jobs fail or process unexpected volumes
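For the alerting point, a hedged sketch assuming a hypothetical Archive_Alert__e platform event; the job publishes an event that a subscribed Flow or external system turns into a notification (actualCount, expectedCount, and tolerance come from the job's own bookkeeping):

// Publish an alert when actual volume deviates from the expected count.
if (Math.abs(actualCount - expectedCount) > tolerance) {
    EventBus.publish(new Archive_Alert__e(
        Job_Name__c = 'CaseArchiveBatch',
        Detail__c = 'Expected ' + expectedCount + ', processed ' + actualCount));
}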
Automate rollback procedures as well. Restore records from Big Objects or off-platform storage, reapply relationship keys, and run validation checks to ensure ownership and sharing rules are rebuilt correctly. Document restore rehearsals and keep records alongside your policy for audit review.
Governance and Oversight
Strong governance ensures that archiving policies are consistently enforced and that retention decisions are traceable across teams. It also creates accountability, so exceptions, legal holds, and policy changes are reviewed and approved through a documented process. Assign roles to enforce policy and maintain audit readiness:
- Data Owner: Defines retention rules for specific objects and approves exceptions
- Compliance Officer: Maps policy to regulations like HIPAA, SOX, and GDPR
- Release Manager: Embeds archive and deletion jobs into release pipelines and maintains runbooks
Support these roles with structured documentation:
- A version-controlled policy document with approved retention rules
- A changelog linking policy updates to Jira tickets or pull requests
- An exception register noting datasets retained beyond standard windows, with rationale and expiration dates
Use Salesforce native tools like Field Audit Trail and Event Monitoring to log changes and access history. Link these logs to dashboards to monitor compliance reviews and surface anomalous activity. For stronger guarantees, extend log retention beyond the platform defaults and add supplemental controls that keep exported logs immutable.
Review your policy every 6 to 12 months. Reconfirm storage growth against your retention matrix, validate that automated jobs are running on schedule, and reassess legal holds with your compliance team. Route any updates through the same approval and testing process you use for production code. With clear roles, automated execution, and documented oversight, your archiving policy becomes a sustainable system—ready for regulators and built to scale with your org.
Step 6: Test, Deploy, and Monitor
Before deploying any archiving or deletion logic, run a full test in a sandbox that mirrors production. Use a representative dataset spanning several years, and confirm that record counts before and after execution match expectations. Include restore tests to verify that relationships and field values can be fully rebuilt.
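The count comparison can be scripted as two anonymous Apex snippets run around the job. A minimal sketch using the same filter as the archive query:

// Before the job: record the candidate count.
Integer expected = [SELECT COUNT() FROM Case
                    WHERE Status = 'Closed' AND ClosedDate < LAST_N_YEARS:5];
// After the job: the same query should return zero (or the known remainder).
Integer remaining = [SELECT COUNT() FROM Case
                     WHERE Status = 'Closed' AND ClosedDate < LAST_N_YEARS:5];
System.debug('Archived ' + (expected - remaining) + ' of ' + expected);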
Store test scripts alongside your automation code in version control, and re-run them after schema changes to catch fields that may need to be included in the archive or masking logic. Once validated, deploy during a low-traffic window. Tag the release and log expected record counts to create a baseline for audits.
Set up ongoing monitoring to track job success, storage reclaimed, and API usage. Use native tools like Storage Usage, Event Monitoring, and custom objects to capture batch run stats and surface results in a dashboard. Trigger alerts for job failures or anomalies, and log failed chunks for targeted reprocessing. A reliable, well-tested process gives operations teams the transparency and control needed to maintain data hygiene without disruption.
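Batch health can also be checked directly with SOQL against AsyncApexJob, a standard object, and fed into the same dashboard:

// Recent runs of the archive batch, with error counts for alerting.
List<AsyncApexJob> runs = [
    SELECT Status, NumberOfErrors, JobItemsProcessed, TotalJobItems, CompletedDate
    FROM AsyncApexJob
    WHERE ApexClass.Name = 'CaseArchiveBatch'
    ORDER BY CreatedDate DESC
    LIMIT 10];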
Flosum: Archive and Enforce Retention Natively
Flosum automates data archiving and deletion directly inside Salesforce, using the same pipelines teams already rely on for deployments. Retention rules are enforced during each release, with archived records copied, deleted, and logged under a least-privilege integration user. All actions are version-controlled and audit-ready. Encryption, role-based restore permissions, and compliance-aligned hosting support HIPAA, SOX, and GDPR requirements.
Dashboards track storage savings and job performance. Sandbox seeding with masked data lets teams validate retention logic before going live. Flosum reduces storage costs, simplifies compliance, and keeps archiving fully integrated with DevOps.