Feature Flags on AWS with AppConfig

Feature flags are a technique that lets you change an application’s behavior without deploying a new version of the code. This means you can have new code in production, hidden behind a flag until you decide to turn it on. If something fails when you enable it, you simply turn the flag off, with no need to roll back the entire deployment.

This ability to control changes safely is recommended within the Reliability pillar of the AWS Well-Architected Framework.

Practical example: creating a feature flag

1. Create the application in AppConfig

Go to AWS AppConfig (within AWS Systems Manager) → Applications → Create application. Enter a name (for example, BankingPlatform, the name the Lambda function refers to later) and click Create application.

2. Create the environment

Inside the application, go to Environments → Create environment. Enter a name (for example, prod) and click Create environment.

3. Configuration profile

Go to Configuration profiles → Create configuration profile:

Name: for example FraudFlags.
Configuration type: Feature flags.
Click Next.

4. Define the flag

Enter a Flag key (for example enable-new-fraud-model).
Leave State = Disabled (we’ll enable it later).
Click Next and then Save and continue to deploy.

5. First deployment

Choose the Environment (for example prod).
Deployment strategy: AppConfig.AllAtOnce (sufficient for testing).
Start deployment.
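
The console steps for defining and deploying the flag can also be scripted. The following is a minimal sketch, not the only way to do it: it assumes a boto3 AppConfig client is passed in as client (for example, boto3.client("appconfig")) and that the application, environment, and configuration profile already exist. The helper names build_flag_document and publish_and_deploy are hypothetical.

```python
import json


def build_flag_document(flag_key, enabled=False):
    """Build a hosted feature-flag document with a single boolean flag."""
    return {
        "version": "1",
        "flags": {flag_key: {"name": flag_key}},
        "values": {flag_key: {"enabled": enabled}},
    }


def publish_and_deploy(client, app_id, env_id, profile_id, document,
                       strategy_id="AppConfig.AllAtOnce"):
    """Save a new hosted configuration version, then start its deployment.

    `client` is assumed to be a boto3 AppConfig client; the IDs are the ones
    AppConfig assigned when the resources were created.
    """
    version = client.create_hosted_configuration_version(
        ApplicationId=app_id,
        ConfigurationProfileId=profile_id,
        Content=json.dumps(document).encode("utf-8"),
        ContentType="application/json",
    )
    return client.start_deployment(
        ApplicationId=app_id,
        EnvironmentId=env_id,
        DeploymentStrategyId=strategy_id,
        ConfigurationProfileId=profile_id,
        ConfigurationVersion=str(version["VersionNumber"]),
    )
```

Calling build_flag_document("enable-new-fraud-model", enabled=True) and passing the result to publish_and_deploy would correspond to enabling the flag and deploying the change.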

6. Create the Lambda function

Create a Lambda function in Python 3.12 with:

Environment variables:

APPCFG_APPLICATION=BankingPlatform
APPCFG_ENVIRONMENT=prod
APPCFG_PROFILE=FraudFlags

Layers → Add layer → AWS layers: add AWS AppConfig Lambda Extension.

IAM role with permissions:

appconfig:StartConfigurationSession
appconfig:GetLatestConfiguration
appconfig:GetConfiguration (only for the legacy GetConfiguration API; the extension itself relies on the two actions above)

Lambda code:

import json
import logging
import os
import sys
import urllib.request

# Basic logger
logger = logging.getLogger()
logger.setLevel(logging.INFO)
if not logger.handlers:
    h = logging.StreamHandler(sys.stdout)
    h.setFormatter(logging.Formatter("%(asctime)s [%(levelname)s] %(message)s"))
    logger.addHandler(h)

# Environment variables
APP     = os.environ["APPCFG_APPLICATION"]
ENV     = os.environ["APPCFG_ENVIRONMENT"]
PROFILE = os.environ["APPCFG_PROFILE"]

# AppConfig Agent endpoint (Lambda extension)
APPCFG_URL = f"http://localhost:2772/applications/{APP}/environments/{ENV}/configurations/{PROFILE}"

def lambda_handler(event, context):
    # 1) Read the feature flags from the Agent (local cache)
    with urllib.request.urlopen(APPCFG_URL, timeout=1.5) as resp:
        cfg = json.loads(resp.read().decode("utf-8"))

    # 2) Get the state of the release flag
    enabled = cfg["enable-new-fraud-model"]["enabled"]

    # 3) Application logic based on the flag
    if enabled:
        logger.info("New fraud model: ENABLED (release flag on)")
    else:
        logger.info("New fraud model: DISABLED (release flag off)")

    # 4) Respuesta
    return {
        "statusCode": 200,
        "body": json.dumps({
            "flag": "enable-new-fraud-model",
            "enabled": enabled
        })
    }
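
The handler above fails if the agent is unreachable or the flag key is missing. One possible hardening, shown here only as a sketch, is to fall back to a safe default (flag off) in those cases. The helper names fetch_flags and flag_enabled are hypothetical, and whether "off" is the right default depends on the feature.

```python
import json
import urllib.request


def fetch_flags(url, timeout=1.5):
    """Fetch the flag document from the agent; return {} if it cannot be read."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection errors and timeouts (URLError is a subclass);
        # ValueError covers malformed URLs and invalid JSON or encoding.
        return {}


def flag_enabled(cfg, key, default=False):
    """Return the flag's boolean 'enabled' value, or `default` if it is absent."""
    entry = cfg.get(key) if isinstance(cfg, dict) else None
    if isinstance(entry, dict) and isinstance(entry.get("enabled"), bool):
        return entry["enabled"]
    return default
```

In the handler, the lookup would then become flag_enabled(fetch_flags(APPCFG_URL), "enable-new-fraud-model"), so a missing flag behaves like a disabled one.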

7. Test with the flag disabled

Run the Lambda function. In CloudWatch Logs you’ll see a message indicating that the old version is in use (flag disabled).

8. Enable the flag

In AppConfig, open the enable-new-fraud-model flag, change its state to Enabled (On), and click Save version.

9. Deploy the change

In the Configuration profile, Start deployment → choose the environment → Deployment strategy (AppConfig.AllAtOnce for testing) → Start.

10. Validate the active flag

Invoke the Lambda again. The logs will now indicate that the new version is in use (flag enabled).

Best practices

When implementing feature flags, it’s worth following practices that maximize the benefit and prevent antipatterns. Here are the five most important ones and how AWS AppConfig supports or automates each:

Roll out features gradually

Turning a feature flag on for 100% of your users at once is risky. It’s safer to do a progressive rollout, watching metrics and feedback at each increment.

AWS AppConfig makes this easy through configurable deployment strategies: linear growth (for example, increasing by 10% every 5 minutes) or exponential growth, plus a bake time during which AppConfig keeps monitoring alarms after the rollout reaches 100% and before the deployment is considered complete. You can choose a predefined strategy (for example, AppConfig.Linear20PercentEvery6Minutes) or create your own. This limits the blast radius of any potential problem, in line with the AWS Well-Architected Framework.
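
To make the growth types concrete, the sketch below computes the sequence of rollout percentages for a linear and an exponential strategy, and builds the arguments for a custom linear strategy. It is an illustration under assumptions: the exponential formula G*(2^N) follows the AppConfig documentation as I understand it, the total duration is assumed to be the number of steps times the interval, and the helper names are hypothetical. The resulting dictionary is meant to be passed to the boto3 create_deployment_strategy call.

```python
def rollout_steps(growth_factor, growth_type="LINEAR"):
    """Return the cumulative percentages reached at each step, capped at 100."""
    if growth_factor <= 0:
        raise ValueError("growth_factor must be positive")
    steps, n = [], 0
    while True:
        if growth_type == "LINEAR":
            percent = growth_factor * (n + 1)
        else:  # EXPONENTIAL: G * 2^N
            percent = growth_factor * (2 ** n)
        steps.append(min(100, percent))
        if percent >= 100:
            return steps
        n += 1


def linear_strategy_params(name, step_percent, minutes_per_step, bake_minutes=10):
    """Build keyword arguments for a custom linear deployment strategy."""
    steps = rollout_steps(step_percent, "LINEAR")
    return {
        "Name": name,
        "GrowthType": "LINEAR",
        "GrowthFactor": float(step_percent),
        # Assumption: total duration = number of steps x minutes per step.
        "DeploymentDurationInMinutes": len(steps) * minutes_per_step,
        "FinalBakeTimeInMinutes": bake_minutes,
        "ReplicateTo": "NONE",
    }
```

For the "10% every 5 minutes" example, linear_strategy_params("Linear10PercentEvery5Minutes", 10, 5) yields a 50-minute deployment followed by the bake time.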

Enable automatic rollback (auto-rollback)

Set up monitoring and alarms that automatically turn a flag off if something goes wrong. AWS AppConfig integrates with Amazon CloudWatch alarms: you attach alarms (on error rate, latency, and so on) to an environment as monitors. If one of them goes into the ALARM state during a deployment, including the bake time, AppConfig automatically rolls the configuration back to its previous version.

This is invaluable, because an automated reaction is usually faster than a human operator. To take the principle further, complement it with Chaos Engineering on AWS to validate that the alarms and the rollback actually work when needed.
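
As a rough sketch of how alarms can be wired up programmatically, the helper below builds the arguments for the boto3 update_environment call, which accepts a Monitors list of alarm ARNs and the IAM role AppConfig assumes to read them. The function name monitor_params is hypothetical, all ARNs are placeholders, and the limit of five monitors is my reading of the service documentation rather than something stated in this article.

```python
MAX_MONITORS = 5  # assumption: documented per-environment limit


def monitor_params(app_id, env_id, alarm_arns, alarm_role_arn):
    """Build update_environment arguments that attach CloudWatch alarms."""
    if not alarm_arns:
        raise ValueError("at least one alarm ARN is required")
    if len(alarm_arns) > MAX_MONITORS:
        raise ValueError(f"at most {MAX_MONITORS} monitors are supported")
    return {
        "ApplicationId": app_id,
        "EnvironmentId": env_id,
        "Monitors": [
            {"AlarmArn": arn, "AlarmRoleArn": alarm_role_arn}
            for arn in alarm_arns
        ],
    }
```

With a boto3 AppConfig client, the result would be applied with client.update_environment(**monitor_params(...)).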

Validate the configuration input

Implement syntactic and semantic validations for each flag or parameter. AWS AppConfig lets you attach Validators to your configuration profiles:

JSON Schema to validate structure and valid ranges.
Custom Lambda functions for more complex checks.

For example, validating that a discount percentage is between 0 and 100, or that a log level only accepts DEBUG, INFO, WARN, ERROR. If someone enters a value outside those rules, AppConfig blocks the deployment before it affects customers.
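
The two validator types can be sketched as follows for the discount and log-level example. This is an illustration, not a definitive implementation: the property names discountPercent and logLevel are hypothetical, the schema targets JSON Schema draft-04 on the assumption that this is the version AppConfig evaluates, and the Lambda event is assumed to carry the configuration as a base64-encoded content field.

```python
import base64
import json

ALLOWED_LOG_LEVELS = ("DEBUG", "INFO", "WARN", "ERROR")

# Structural validation: attach as a validator of type JSON_SCHEMA.
CONFIG_SCHEMA = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type": "object",
    "properties": {
        "discountPercent": {"type": "number", "minimum": 0, "maximum": 100},
        "logLevel": {"enum": list(ALLOWED_LOG_LEVELS)},
    },
    "required": ["discountPercent", "logLevel"],
}


def schema_validator():
    """Validator entry for a configuration profile (Content is a JSON string)."""
    return {"Type": "JSON_SCHEMA", "Content": json.dumps(CONFIG_SCHEMA)}


def validate_config(config):
    """Semantic checks; returns a list of human-readable error messages."""
    errors = []
    discount = config.get("discountPercent")
    is_number = isinstance(discount, (int, float)) and not isinstance(discount, bool)
    if not is_number or not 0 <= discount <= 100:
        errors.append("discountPercent must be a number between 0 and 100")
    if config.get("logLevel") not in ALLOWED_LOG_LEVELS:
        errors.append("logLevel must be one of " + ", ".join(ALLOWED_LOG_LEVELS))
    return errors


def lambda_handler(event, context):
    """Lambda validator: raising an exception rejects the configuration."""
    config = json.loads(base64.b64decode(event["content"]))
    errors = validate_config(config)
    if errors:
        raise ValueError("; ".join(errors))
```

Keeping the checks in validate_config, separate from the handler, makes them easy to unit test without invoking Lambda.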

Use the AppConfig Agent as a caching layer

Querying the service on every call is inefficient. Keep a local cache of the flags in the application using the AppConfig Agent (extensions for Lambda, ECS, EC2). The agent fetches the configurations efficiently, keeps them in memory, and refreshes them in the background, reducing latency and load.

In production, using the AppConfig Agent is best practice because it also handles reconnections and fallback if AppConfig or the network has problems. It adds a buffer between your system and the source of truth.
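
When reading from the agent, the local URL does not have to be hard-coded. The sketch below builds it from the application, environment, and profile names, honours the AWS_APPCONFIG_EXTENSION_HTTP_PORT environment variable used by the Lambda extension (default 2772), and optionally appends the flag query parameter to request a single flag. The function name agent_url is hypothetical, and the exact response shape for a single-flag request should be checked against the agent documentation.

```python
import os
from urllib.parse import quote, urlencode


def agent_url(app, env, profile, flag=None, environ=os.environ):
    """Build the localhost URL served by the AppConfig Agent."""
    port = environ.get("AWS_APPCONFIG_EXTENSION_HTTP_PORT", "2772")
    path = "/".join([
        "applications", quote(app, safe=""),
        "environments", quote(env, safe=""),
        "configurations", quote(profile, safe=""),
    ])
    url = f"http://localhost:{port}/{path}"
    if flag:
        # Ask the agent for one flag instead of the whole profile.
        url += "?" + urlencode({"flag": flag})
    return url
```

The polling frequency of the agent's background refresh is configured separately, through the AWS_APPCONFIG_EXTENSION_POLL_INTERVAL_SECONDS environment variable on the function.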

Clean up obsolete feature flags

Feature flags should have a lifecycle. Temporary ones (release flags) should be marked for future removal once they have served their purpose.

AWS AppConfig helps with this governance: you can mark a flag as short-term and assign it a target deprecation date. The console shows which flags have passed their date by marking them as “Overdue”, making it easier to spot flags that still need to be removed from the code.

Documenting which flags exist and who is responsible for removing them prevents the dreaded flag rot: dozens of old toggles that clutter the code.
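
Before deleting an overdue flag from AppConfig, it helps to confirm that no code still reads it. The following is a small, hypothetical helper (not an AppConfig feature) that scans a source tree for references to a flag key; it assumes flags are referenced by their literal key string.

```python
from pathlib import Path


def find_flag_references(root, flag_key, suffixes=(".py",)):
    """Return (file path, line number) pairs where the flag key appears."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if flag_key in line:
                hits.append((str(path), lineno))
    return hits
```

An empty result for find_flag_references("src", "enable-new-fraud-model") suggests the flag can be retired; any hits point to the code paths that still need cleanup.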

Conclusion

Feature flags, used with discipline, let you separate code deployment from feature release, reducing risk and improving reliability. With AWS AppConfig you can manage them centrally, validating configurations, enabling automatic rollbacks, and making gradual deployments easier. These practices turn change into a controlled, predictable process, which is key to the resilience of any system in production.

How we apply this at Caleidos

For clients with mission-critical platforms, AppConfig becomes the backbone of change control: feature flags + dynamic configuration + gradual rollouts with CloudWatch alarms as a kill switch. Combined with full-stack observability and Chaos Engineering, it lets you make changes in production with confidence.

Caleidos designs and implements this layer as part of our DevOps & Automation service and operates it 24×7 with Caleidos Lens©.

