Production Deployment
Pointing Alloy at the live Railway app and verifying production monitoring.
The Workflow
1. Merge monitoring branch → main via PR
2. Railway auto-deploys updated code
3. Verify /metrics endpoint is live on Railway
4. Update Alloy config to scrape Railway URL
5. Confirm data flows into Grafana dashboard
Step 1 — Merge Via PR
We used a feature branch monitoring/grafana-setup and merged via Pull Request. This:
- Keeps main clean until monitoring is ready
- Documents what changed and why
- Follows a standard, reviewable Git workflow
Step 2 — Verify Production /metrics
curl https://finpay-api-production.up.railway.app/metrics | head -10
Expected output:
# HELP process_cpu_user_seconds_total Total user CPU time spent in seconds.
# TYPE process_cpu_user_seconds_total counter
process_cpu_user_seconds_total 0.812398
...
Step 3 — Update Alloy Config for Production
prometheus.scrape "finpay_api" {
targets = [
{ __address__ = "finpay-api-production.up.railway.app" },
]
metrics_path = "/metrics"
scheme = "https"
scrape_interval = "15s"
forward_to = [prometheus.remote_write.grafana_cloud.receiver]
}
Two changes from local config:
- __address__ → Railway production URL
- scheme = "https" → required for HTTPS endpoints
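The address, scheme, and path are separate settings because they correspond to separate parts of a URL. As an illustration, the sketch below splits a full metrics URL into those three pieces. toScrapeSettings is a hypothetical helper and not part of Alloy.

```javascript
// Sketch: split a full metrics URL into the pieces that the
// prometheus.scrape block above sets separately.
// `toScrapeSettings` is a hypothetical helper, not part of Alloy.
function toScrapeSettings(metricsUrl) {
  const u = new URL(metricsUrl); // WHATWG URL, available globally in Node
  return {
    address: u.host,                     // host[:port] only, no scheme or path
    scheme: u.protocol.replace(':', ''), // "https" or "http"
    metricsPath: u.pathname,             // e.g. "/metrics"
  };
}

console.log(toScrapeSettings('https://finpay-api-production.up.railway.app/metrics'));
```

Note that the value given to __address__ carries no https:// prefix; the scheme travels in its own setting.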
Troubleshooting: Redis Connection Failure
After deployment we saw this in Railway logs:
Redis error read ECONNRESET
Redis error Reached the max retries per request limit (which is 3)
Root Cause
The code checked for UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN to use the Upstash REST client. In Railway, the variable was named UPSTASH_REDIS_REST_URL but contained a rediss:// TCP URL instead of the https:// REST URL.
// Code expected this branch to activate:
if (process.env.UPSTASH_REDIS_REST_URL && process.env.UPSTASH_REDIS_REST_TOKEN) {
  const { Redis } = require('@upstash/redis');
  // ...
} else {
  // But fell into this branch instead:
  const IORedis = require('ioredis');
  // Tried to connect via TCP to an HTTPS URL, and failed
}
Fix
| Variable | Correct Value |
|---|---|
| UPSTASH_REDIS_REST_URL | https://correct-cougar-XXXXX.upstash.io |
| UPSTASH_REDIS_REST_TOKEN | The token from Upstash REST tab |
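A startup check could surface this kind of misconfiguration before the first Redis call is made. The sketch below assumes only what is stated above: the REST URL must use https://, and a redis:// or rediss:// value is a TCP URL. checkUpstashEnv is a hypothetical helper, and the example URL is made up.

```javascript
// Sketch of a startup check for the two variables in the table above.
// `checkUpstashEnv` is a hypothetical helper, not part of @upstash/redis.
function checkUpstashEnv(env) {
  const problems = [];
  const url = env.UPSTASH_REDIS_REST_URL;
  const token = env.UPSTASH_REDIS_REST_TOKEN;

  if (!url) {
    problems.push('UPSTASH_REDIS_REST_URL is not set');
  } else if (/^rediss?:\/\//.test(url)) {
    problems.push('UPSTASH_REDIS_REST_URL holds a TCP URL (redis:// or rediss://); the REST client needs https://');
  } else if (!url.startsWith('https://')) {
    problems.push('UPSTASH_REDIS_REST_URL should start with https://');
  }
  if (!token) problems.push('UPSTASH_REDIS_REST_TOKEN is not set');
  return problems;
}

// Example with a made-up TCP URL in the REST variable.
console.log(checkUpstashEnv({
  UPSTASH_REDIS_REST_URL: 'rediss://example.upstash.io:6379',
  UPSTASH_REDIS_REST_TOKEN: 'placeholder-token',
}));
```

In an application, the same function could be called with process.env at boot and any returned problems logged, which gives a clearer message than ECONNRESET.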
The Railway logs showed Redis connection errors but didn't tell us why the code chose the wrong branch. Reading the actual src/config/redis.js file was the only way to understand the conditional logic. Logs tell you what happened — code tells you why. Always read both.
Verify Alloy is Scraping Production
curl http://localhost:12345/metrics | grep "conn_established"
Expected:
net_conntrack_dialer_conn_established_total{dialer_name="prometheus.scrape.finpay_api"} 2
net_conntrack_dialer_conn_established_total{dialer_name="remote_storage_write_client"} 71
- prometheus.scrape.finpay_api → Alloy connected to Railway ✅
- remote_storage_write_client → Alloy connected to Grafana Cloud ✅
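The grep output can also be checked programmatically. This sketch parses the two lines shown above into a per-dialer count. connEstablishedByDialer is a hypothetical helper, and the pattern assumes dialer_name is the only label on the line, as in the sample.

```javascript
// Sketch: extract established-connection counts per dialer from Alloy's
// own /metrics text. `connEstablishedByDialer` is a hypothetical helper.
// Assumes dialer_name is the only label, as in the sample output above.
function connEstablishedByDialer(text) {
  const pattern = /^net_conntrack_dialer_conn_established_total\{dialer_name="([^"]+)"\}\s+(\S+)$/;
  const counts = {};
  for (const line of text.split('\n')) {
    const m = pattern.exec(line.trim());
    if (m) counts[m[1]] = Number(m[2]);
  }
  return counts;
}

const sample = [
  'net_conntrack_dialer_conn_established_total{dialer_name="prometheus.scrape.finpay_api"} 2',
  'net_conntrack_dialer_conn_established_total{dialer_name="remote_storage_write_client"} 71',
].join('\n');

const counts = connEstablishedByDialer(sample);
// Both dialers should have at least one established connection.
console.log(
  counts['prometheus.scrape.finpay_api'] > 0,
  counts['remote_storage_write_client'] > 0
);
```

A count of zero, or a missing entry, for either dialer would point to the leg of the pipeline that is not connecting.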
Production Dashboard Results
| Panel | Value | Status |
|---|---|---|
| CPU Usage | 0.3% at idle | ✅ Healthy |
| Memory | 96–104 MiB | ✅ Normal warm-up |
| HTTP Request Rate | 0.07 req/s | ✅ Railway health checks |
| HTTP Error Rate | 0 req/s | ✅ Zero errors |
| Response Time P95 | 8ms | ✅ Excellent |
| Event Loop Lag | 2–3ms | ✅ Healthy |
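To make the P95 row concrete: a P95 of 8ms means 95% of requests completed in 8ms or less. The sketch below illustrates the idea with a nearest-rank percentile over raw samples. The latency values are invented for illustration, and dashboards built on Prometheus histograms typically estimate percentiles from buckets instead, so this is not how the panel itself is computed.

```javascript
// Sketch: nearest-rank percentile over raw latency samples.
// Illustrates what "P95" means; not the dashboard's actual calculation.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  // 1-based nearest rank: smallest rank covering at least p% of samples.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical latencies in ms: 19 fast requests and one slow outlier.
const latenciesMs = [...Array(19).fill(8), 250];
console.log(percentile(latenciesMs, 95));  // 8
console.log(percentile(latenciesMs, 100)); // 250
```

The single 250ms outlier does not move P95, which is why P95 is usually read alongside the maximum or P99 rather than alone.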
Keep Alloy Running
Alloy started from a terminal stops when that terminal closes. To keep it running, manage it as a systemd service:
sudo systemctl enable alloy
sudo systemctl start alloy
sudo systemctl status alloy
1. What is the difference between the Upstash REST URL and the TCP URL?
2. Why did the app fall into the ioredis branch instead of the Upstash REST branch?
3. When scraping a production HTTPS endpoint with Alloy, what additional config is required?
4. What does a P95 response time of 8ms on production indicate?