Optimize GCS Costs with Python: Intelligent Storage Class Recommendations

Google Cloud Storage (GCS) offers multiple storage classes with varying pricing models depending on data access frequency. While "Standard" is great for active data, long-retained infrequently accessed data is better suited for "Nearline", "Coldline", or "Archive".
In this post, we'll explore a Python-based tool that analyzes your GCS buckets, recommends optimal storage classes based on object age, and estimates cost savings automatically.
Overview of What This Script Does
This script:
Lists all buckets in a specified GCP project.
Inspects each object to determine how old it is.
Suggests a better storage class if applicable.
Calculates monthly and annual savings from switching classes.
Fetches the current lifecycle policy and recommends a better one.
Prints a detailed report and saves it as a JSON file.
Workflow Diagram
Here's a high-level visual representation of how the tool operates:

Use Cases
1. Cost Optimization
You might be paying for Standard storage for data that hasn't been accessed in months. This tool suggests when to move data to Nearline, Coldline, or Archive, potentially saving thousands annually.
2. Storage Auditing
Helps you understand what's inside your buckets and how old the data is. Critical for compliance and cleanup tasks.
3. Policy Recommendations
Lifecycle rules help automate tier transitions. This script suggests sensible rules:
30 days → Nearline
90 days → Coldline
365 days → Archive
Requirements
Python 3.x
Google Cloud SDK configured and authenticated
Python package: google-cloud-storage
Install the required package:
pip install google-cloud-storage
Make sure you authenticate using:
gcloud auth application-default login
Authenticate with Google Cloud (Service Account Method)
To enable your script to access GCS securely, follow these steps:
Step 1: Create a Service Account
Go to IAM & Admin → Service Accounts
Click "+ CREATE SERVICE ACCOUNT"
Enter a name like gcs-cost-optimizer
Click Create and Continue
Step 2: Assign IAM Roles
Attach the following roles:
Storage Viewer (roles/storage.viewer): minimum for read access
Storage Admin (roles/storage.admin): required if you want to manage lifecycle policies
Click Done.
Step 3: Create and Download a JSON Key
In the service account page, go to the Keys tab.
Click "Add Key" → "Create New Key"
Select JSON, download, and store it safely:
vernal-zone-452806-a4-2fd65f98b08a.json
Keep this file secure; it contains credentials.
Step 4: Set the Environment Variable
Export the credential path to your shell:
export GOOGLE_APPLICATION_CREDENTIALS="vernal-zone-452806-a4-2fd65f98b08a.json"
Or inline:
GOOGLE_APPLICATION_CREDENTIALS="vernal-zone-452806-a4-2fd65f98b08a.json" python gcs_cost_optimizer.py
Key Code Snippet
Here's how to kickstart the analysis:
from google.cloud import storage
from datetime import datetime, timezone
import json

# Mumbai region pricing per GB per month in USD
STORAGE_PRICING = {
    "Standard": 0.023,
    "Nearline": 0.016,
    "Coldline": 0.006,
    "Archive": 0.0025
}

def recommend_storage_class(last_modified):
    now = datetime.now(timezone.utc)
    age_days = (now - last_modified).days
    if age_days <= 30:
        return "Standard"
    elif 30 < age_days <= 90:
        return "Nearline"
    elif 90 < age_days <= 365:
        return "Coldline"
    else:
        return "Archive"

def calculate_cost_and_savings(size_bytes, current_class, recommended_class):
    size_gb = size_bytes / (1024 ** 3)
    current_price = STORAGE_PRICING.get(current_class, 0)
    recommended_price = STORAGE_PRICING.get(recommended_class, 0)
    current_monthly_cost = current_price * size_gb
    new_monthly_cost = recommended_price * size_gb
    if current_price <= recommended_price:
        monthly_savings = 0
        annual_savings = 0
    else:
        monthly_savings = current_monthly_cost - new_monthly_cost
        annual_savings = monthly_savings * 12
    return round(current_monthly_cost, 4), round(new_monthly_cost, 4), round(monthly_savings, 4), round(annual_savings, 4)

def suggest_lifecycle_policy():
    return {
        "rule": [
            {"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"}, "condition": {"age": 30}},
            {"action": {"type": "SetStorageClass", "storageClass": "COLDLINE"}, "condition": {"age": 90}},
            {"action": {"type": "SetStorageClass", "storageClass": "ARCHIVE"}, "condition": {"age": 365}}
        ]
    }

def print_report(report):
    print(f"\nProject ID: {report['project_id']}")
    for bucket in report["buckets"]:
        print(f"\nBucket: {bucket['bucket_name']}")
        if "error" in bucket:
            print(f"  Error: {bucket['error']}")
            continue
        if bucket.get("empty_bucket"):
            print("  Empty bucket.")
        else:
            print("  Objects:")
            for obj in bucket["objects"]:
                print(f"    - {obj['object_name']}")
                print(f"      Last Modified: {obj['last_modified']}")
                print(f"      Recommended Class: {obj['recommended_storage_class']}")
                print(f"      Monthly Cost (Current): ${obj['estimated_monthly_cost_current']}")
                print(f"      Monthly Cost (After): ${obj['estimated_monthly_cost_after_policy']}")
                print(f"      Monthly Savings: ${obj['estimated_monthly_savings']}")
                print(f"      Annual Savings: ${obj['estimated_annual_savings']}")
        # Print current lifecycle policy
        print("  Current Lifecycle Policy:")
        current_policy = bucket.get("current_lifecycle_policy", [])
        if current_policy:
            for i, rule in enumerate(current_policy, 1):
                action = rule.get("action", {})
                condition = rule.get("condition", {})
                print(f"    Rule {i}:")
                print(f"      Action: {action}")
                print(f"      Condition: {condition}")
        else:
            print("    No lifecycle policy set.")
        # Print suggested lifecycle policy
        print("  Suggested Lifecycle Policy:")
        for rule in bucket["lifecycle_policy_suggestion"]["rule"]:
            age = rule["condition"]["age"]
            storage_class = rule["action"]["storageClass"]
            print(f"    - After {age} days -> {storage_class}")

def list_buckets_and_generate_json_report(project_id, output_file="gcs_report.json"):
    storage_client = storage.Client(project=project_id)
    buckets = storage_client.list_buckets()
    report = {"project_id": project_id, "buckets": []}
    for bucket in buckets:
        bucket_info = {
            "bucket_name": bucket.name,
            "lifecycle_policy_suggestion": suggest_lifecycle_policy(),
            "objects": []
        }
        try:
            # Fetch current lifecycle policy
            bucket.reload()
            bucket_info["current_lifecycle_policy"] = list(bucket.lifecycle_rules)
            # List objects
            blobs = list(storage_client.list_blobs(bucket.name))
            if not blobs:
                bucket_info["empty_bucket"] = True
            else:
                for blob in blobs:
                    recommended_class = recommend_storage_class(blob.updated)
                    # Use the object's actual class ("STANDARD" -> "Standard")
                    # rather than assuming everything is Standard
                    current_class = (blob.storage_class or "STANDARD").capitalize()
                    current_monthly, new_monthly, monthly_savings, annual_savings = calculate_cost_and_savings(
                        blob.size, current_class, recommended_class
                    )
                    object_info = {
                        "object_name": blob.name,
                        "last_modified": str(blob.updated),
                        "current_storage_class": current_class,
                        "recommended_storage_class": recommended_class,
                        "estimated_monthly_cost_current": current_monthly,
                        "estimated_monthly_cost_after_policy": new_monthly,
                        "estimated_monthly_savings": monthly_savings,
                        "estimated_annual_savings": annual_savings
                    }
                    bucket_info["objects"].append(object_info)
        except Exception as e:
            bucket_info["error"] = str(e)
        report["buckets"].append(bucket_info)
    # Print and save
    print_report(report)
    with open(output_file, "w") as f:
        json.dump(report, f, indent=4)
    print(f"\nReport saved to: {output_file}")

# Example usage
if __name__ == "__main__":
    project_id = "vernal-zone-452806-a4"
    list_buckets_and_generate_json_report(project_id)
Code Walkthrough
1. Storage Class Cost Mapping
Defines GCP's Mumbai-region pricing in USD per GB per month:
STORAGE_PRICING = {
    "Standard": 0.023,
    "Nearline": 0.016,
    "Coldline": 0.006,
    "Archive": 0.0025
}
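Plugging sample numbers in makes the stakes concrete. For instance, a 500 GB object at these rates (the figure is illustrative):

```python
STORAGE_PRICING = {
    "Standard": 0.023,
    "Nearline": 0.016,
    "Coldline": 0.006,
    "Archive": 0.0025
}

# Per-class monthly cost of a 500 GB object
size_gb = 500
monthly = {cls: round(price * size_gb, 2) for cls, price in STORAGE_PRICING.items()}
print(monthly)  # {'Standard': 11.5, 'Nearline': 8.0, 'Coldline': 3.0, 'Archive': 1.25}
```

Moving that object from Standard to Archive would save 11.50 - 1.25 = $10.25/month, or $123.00/year.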
2. Determine Best Storage Class
This function checks how old an object is:
def recommend_storage_class(last_modified):
    now = datetime.now(timezone.utc)
    age_days = (now - last_modified).days
    ...
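A quick way to sanity-check the thresholds with synthetic timestamps (this standalone version mirrors the script's tiering logic):

```python
from datetime import datetime, timezone, timedelta

def recommend_storage_class(last_modified):
    # Same 30/90/365-day boundaries as the script
    age_days = (datetime.now(timezone.utc) - last_modified).days
    if age_days <= 30:
        return "Standard"
    elif age_days <= 90:
        return "Nearline"
    elif age_days <= 365:
        return "Coldline"
    return "Archive"

now = datetime.now(timezone.utc)
print(recommend_storage_class(now - timedelta(days=10)))   # Standard
print(recommend_storage_class(now - timedelta(days=60)))   # Nearline
print(recommend_storage_class(now - timedelta(days=200)))  # Coldline
print(recommend_storage_class(now - timedelta(days=400)))  # Archive
```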
3. Cost Comparison
Calculates what you'd save by transitioning storage class:
def calculate_cost_and_savings(size_bytes, current_class, recommended_class):
    ...
4. Lifecycle Policy Suggestion
Hardcoded lifecycle suggestion for automation:
def suggest_lifecycle_policy():
    return {
        "rule": [
            {"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"}, "condition": {"age": 30}},
            ...
        ]
    }
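The script only prints this suggestion. If you also want to apply it, the google-cloud-storage client exposes lifecycle helpers on Bucket objects. A hedged sketch, assuming you hold roles/storage.admin on the bucket (`apply_suggested_rules` is a name introduced here, not part of the script):

```python
def apply_suggested_rules(bucket):
    # `bucket` is a google.cloud.storage Bucket (or any object with the same
    # interface). Each call appends a SetStorageClass lifecycle rule;
    # patch() persists the updated configuration to GCS.
    bucket.add_lifecycle_set_storage_class_rule("NEARLINE", age=30)
    bucket.add_lifecycle_set_storage_class_rule("COLDLINE", age=90)
    bucket.add_lifecycle_set_storage_class_rule("ARCHIVE", age=365)
    bucket.patch()
```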
5. Bucket Analysis and Report Generation
The main function:
def list_buckets_and_generate_json_report(project_id, output_file="gcs_report.json"):
    ...
Output Example
Console Printout:
Bucket: my-backup-data
  Objects:
    - database_backup_2022.sql
      Last Modified: 2022-04-10
      Recommended Class: Archive
      Monthly Cost (Current): $0.0344
      Monthly Cost (After): $0.0037
      Monthly Savings: $0.0307
      Annual Savings: $0.3684
JSON Report (partial):
{
    "project_id": "my-gcp-project",
    "buckets": [
        {
            "bucket_name": "my-backup-data",
            "objects": [
                {
                    "object_name": "database_backup_2022.sql",
                    "recommended_storage_class": "Archive"
                }
            ]
        }
    ]
}
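Because the report is plain JSON, it's easy to post-process, for example to total the estimated annual savings across every bucket. The field names below match what the script writes; `total_annual_savings` is a helper added here for illustration:

```python
import json

def total_annual_savings(report):
    # Sum per-object savings; empty or errored buckets have no "objects"
    return round(sum(
        obj.get("estimated_annual_savings", 0)
        for bucket in report.get("buckets", [])
        for obj in bucket.get("objects", [])
    ), 4)

# With the real file: report = json.load(open("gcs_report.json"))
sample = {"buckets": [
    {"objects": [{"estimated_annual_savings": 0.3684}]},
    {"empty_bucket": True},
]}
print(total_annual_savings(sample))  # 0.3684
```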
How to Run It
- Update the project ID in the if __name__ == "__main__" block:
  project_id = "your-gcp-project-id"
- Run the script:
  python gcs_cost_optimizer.py
- Review the output:
  Console for a human-readable report
  gcs_report.json for structured data
Final Thoughts
This script empowers cloud cost optimization with minimal effort. Keep in mind that the colder classes carry retrieval fees and minimum storage durations, so factor access patterns in before transitioning. Automating such audits monthly can lead to:
Improved data hygiene
Significant cost reductions
Strategic policy automation






