Scrutiny
Self-hosted server monitoring tool that provides a hard drive S.M.A.R.T. monitoring application and web interface.
Hard drive health monitoring, honestly reviewed. No marketing fluff, just what you get when you self-host it.
TL;DR
- What it is: Open-source (MIT) web dashboard for S.M.A.R.T hard drive monitoring — takes the raw output of smartd and turns it into a readable health UI with historical trends and real-world failure thresholds [README].
- Who it’s for: Homelab operators, small business owners, and server administrators running more than two or three drives who want early warning before a disk fails silently [README][1].
- Cost savings: Enterprise drive monitoring (Nagios plugins, commercial NAS software, hosted observability stacks) starts at $50–$300/mo. Scrutiny is MIT-licensed and runs free on any server already hosting your drives.
- Key strength: It doesn’t just surface raw S.M.A.R.T attributes — it applies real-world failure rate data to decide which attributes actually matter, filtering out the hundred-plus vendor metrics down to the ones that predict imminent death [README].
- Key weakness: The README itself warns it’s a “Work-in-Progress” with “rough edges,” the project’s commit velocity has slowed, and third-party reviews are sparse — it occupies a useful niche but hasn’t attracted the community documentation depth you’d want before betting production data on it [README].
What is Scrutiny
If you’ve ever run smartctl -a /dev/sda on a Linux server, you’ve seen the problem Scrutiny solves. The output is a wall of text — 100-plus attributes with names like Raw_Read_Error_Rate, Reallocated_Sector_Ct, and Spin_Retry_Count — and zero guidance on which of those actually matter. The manufacturer-set thresholds are often useless: some are never set, some are so conservative they only confirm a drive is already dead, and none of them track whether an attribute is getting worse over time [README].
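For context, an abridged and purely illustrative slice of that output looks like this (attribute rows are representative; the values are invented for illustration):

```text
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  -           0
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  -           3
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   -           0
194 Temperature_Celsius     0x0022   112   103   000    Old_age   -           38
... (dozens more rows, most of them irrelevant to failure prediction)
```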
Scrutiny is a web dashboard that wraps smartd (the Linux S.M.A.R.T monitoring daemon) and fixes those four problems directly. It auto-detects connected drives, polls their S.M.A.R.T data on a schedule, stores historical readings in InfluxDB, and applies real-world failure thresholds — derived from large-scale drive failure datasets — to surface only the metrics that actually predict failure [README].
The project is maintained by Jason Kulatunga (AnalogJ) and sits at 7,581 GitHub stars with 261 forks. It’s MIT-licensed, written in Go, and ships as a Docker image that bundles the web UI, the collector, and InfluxDB in a single container [README][GitHub]. As of this review the README still carries a “NOTE: Scrutiny is a Work-in-Progress and still has some rough edges” warning — which is worth taking seriously before deploying it anywhere critical.
The architecture has two pieces: a web server (dashboard, API, InfluxDB connection) and a collector (runs on each machine with drives, polls smartctl, ships results to the web server). For single-machine setups, the omnibus image bundles everything. For multi-server setups, you run the collector image on each remote machine and point them all at a single web server [README][1].
Why people choose it
The honest answer is: because nothing else in the self-hosted space hits this exact combination of web UI + historical trending + real-world thresholds at zero cost.
smartd alone is powerful but headless. It runs as a daemon, emails you when something crosses a threshold, and offers no historical view. If you’re managing a headless server — or several — you want a dashboard you can glance at, not a log file to grep [README].
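For contrast, the baseline it improves on is essentially a one-line config. A typical /etc/smartd.conf entry looks like this (the mail address and self-test schedule are examples, not recommendations):

```conf
# Monitor all detected drives (-a), enable offline testing and attribute
# autosave, run a short self-test daily at 02:00 and a long test every
# Saturday at 03:00, and email a warning on any S.M.A.R.T failure.
DEVICESCAN -a -o on -S on -s (S/../.././02|L/../../6/03) -m admin@example.com -M daily
```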
One self-hosters’ roundup [1] describes deploying Scrutiny across a multi-node cluster: the omnibus image on one machine, the collector image on all the others, everything feeding into a single dashboard. The author specifically calls out the temperature graph and the clean S.M.A.R.T data display as working well in practice over a month of runtime. Notably, they hadn’t had any drive failures during that period, so the alerting path went untested — a fair caveat to pass along.
The real-world failure threshold angle is the genuine differentiator. Standard S.M.A.R.T thresholds are set by drive manufacturers, who have commercial reasons to set them conservatively (a threshold that never triggers means fewer warranty returns). Scrutiny’s thresholds are calibrated against actual population-scale failure data, which means it distinguishes between an attribute that’s technically above zero but statistically irrelevant and one that’s correlated with near-term failure [README].
For anyone running a NAS, a Proxmox cluster, a TrueNAS box, or a general-purpose server with more than a handful of drives, Scrutiny solves a real operational problem: knowing which drive to replace before it takes a RAID array down mid-rebuild.
Features
Based on the README and first-hand deployment descriptions:
Core monitoring:
- Web UI dashboard focused on critical S.M.A.R.T metrics [README]
- smartd integration — wraps the existing daemon, no re-implementation [README]
- Auto-detection of connected hard drives via smartctl --scan [README]
- Historical S.M.A.R.T attribute tracking stored in InfluxDB [README]
- Customized failure thresholds using real-world failure rate data [README]
- Temperature tracking and history graphs [README][1]
Multi-host support:
- Omnibus image (web + collector + InfluxDB) for single-machine deployments [README]
- Separate collector image for remote machines — all report to one dashboard [README][1]
- Configurable via YAML (example.collector.yaml, example.scrutiny.yaml) [README]
RAID and hardware compatibility:
- All RAID controllers supported by smartctl are supported automatically [README]
- Supports device type overrides in config for cases where --scan misidentifies the device [README]
- Requires explicit --device passthrough in Docker for each physical disk [README]
Alerting:
- Configurable webhook notifications [README] (a hedged config sketch follows this feature list)
- Specific notification providers beyond generic webhook endpoints are not enumerated in the README
Deployment:
- Provided as Docker image; manual installation also supported [README]
- Pinnable semver tags (v0.8.2-omnibus, v0.8-web, v0-collector) [README]
- Helm chart not mentioned in README; Docker Compose example provided [README]
Not yet implemented:
- Hard drive performance testing and tracking — listed as “(Future)” in the README [README]
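On the webhook alerting item above: notification targets live in the web server’s YAML config. The sketch below is recalled from the general shape of example.scrutiny.yaml rather than quoted from it; treat the key names and URL formats as assumptions to verify before relying on alerts.

```yaml
# scrutiny.yaml (web server) -- notification sketch; verify key names
# against example.scrutiny.yaml in the repo before depending on it.
notify:
  urls:
    - "https://hooks.example.com/scrutiny"   # generic webhook endpoint (hypothetical URL)
    - "discord://token@channel-id"           # provider-style URL, if your build supports it
```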
Pricing: SaaS vs self-hosted math
Scrutiny has no SaaS tier. There is no hosted version, no commercial license, no paid plan. It’s MIT-licensed software you run on your own hardware.
The cost comparison here isn’t “Scrutiny vs. Scrutiny Cloud” — it’s “Scrutiny vs. the alternatives”:
Commercial NAS software (Synology, QNAP): Bundled S.M.A.R.T monitoring exists in both platforms but is locked to their proprietary hardware. If you’re running drives on a Linux server or a custom NAS build, this isn’t an option.
Hosted observability stacks: Datadog, New Relic, and similar platforms can ingest S.M.A.R.T metrics via custom agents or node exporters. Realistically, you’re looking at $15–$30/mo per host at minimum, plus the engineering time to wire up custom dashboards. None of them ship with pre-built real-world failure thresholds for hard drives.
Prometheus + Grafana + smartctl exporter: This is the DIY alternative. The smartctl_exporter Prometheus exporter exists and works. The tradeoff is build time — you’re assembling the stack from parts (Prometheus scrape config, InfluxDB or another TSDB, Grafana dashboard JSON, alert rules) rather than deploying one container. For someone who already runs Prometheus, this is arguably the better path. For someone who doesn’t, Scrutiny is significantly faster to get running.
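For reference, the Prometheus side of that DIY stack is a small scrape block per host. The sketch below assumes smartctl_exporter’s commonly used default port 9633; confirm it against the exporter you actually deploy.

```yaml
# prometheus.yml fragment: scrape smartctl_exporter on each drive host.
scrape_configs:
  - job_name: "smartctl"
    static_configs:
      - targets:
          - "nas01.example.lan:9633"   # assumed exporter port; verify locally
          - "nas02.example.lan:9633"
```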
Self-hosted Scrutiny:
- Software: $0 (MIT)
- Server cost: $0 if running on existing hardware (the drives you’re monitoring are presumably already on a machine)
- Time to deploy: 15–30 minutes for a single machine; 45–90 minutes for a multi-machine setup
The cost story for Scrutiny is less about dollar savings and more about data insurance. A single undetected drive failure in a non-redundant setup can cost you everything on it. A RAID rebuild on a drive array with one healthy drive and one quietly degrading drive is a well-documented failure scenario. Scrutiny’s value is knowing about the degrading drive before the rebuild starts.
Deployment reality check
The Docker path is straightforward — the README’s quickstart is a single docker run command — but there are a few non-obvious requirements worth flagging before you start.
What you need:
- Docker (or Docker Compose for the multi-container path)
- --cap-add SYS_RAWIO on the container — required for direct drive access [README]
- Explicit --device=/dev/sdX for every drive you want monitored [README]
- /run/udev:/run/udev:ro mount for drive detection [README]
- Ports 8080 (web UI) and 8086 (InfluxDB) exposed
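Putting those requirements together, a single-machine omnibus launch looks roughly like the following. The image name, pinned tag, and host paths are assumptions assembled from the registry, tags, and data paths cited elsewhere in this review; substitute your own device list and verify against the README’s quickstart.

```bash
# Omnibus container: web UI, collector, and bundled InfluxDB in one.
# Bind mounts keep config and time-series data on the host so they
# survive container recreation (host paths are assumed, not prescribed).
docker run -d --name scrutiny \
  --cap-add SYS_RAWIO \
  -p 8080:8080 -p 8086:8086 \
  -v /run/udev:/run/udev:ro \
  -v /opt/scrutiny/config:/opt/scrutiny/config \
  -v /opt/scrutiny/influxdb:/opt/scrutiny/influxdb \
  --device=/dev/sda --device=/dev/sdb \
  ghcr.io/analogj/scrutiny:v0.8.2-omnibus
```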
What can go wrong:
First, RAID controllers are a variable. Some pass S.M.A.R.T data through cleanly; others don’t. If smartctl --scan on the host doesn’t correctly identify your device types, you’ll need to override them manually in example.collector.yaml. The README links to a troubleshooting document for this, which is a signal that it’s a real enough issue to document [README].
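When --scan gets a device type wrong, the override lives in the collector config. The sketch below shows the general shape; the key names are recalled from example.collector.yaml rather than quoted, and the megaraid type strings are illustrative, not your controller’s answer.

```yaml
# collector.yaml -- force device types that smartctl --scan misreads.
# Verify key names against example.collector.yaml in the repo.
devices:
  - device: /dev/sda
    type: "sat"                 # plain SATA behind a passthrough HBA
  - device: /dev/bus/0
    type:
      - "megaraid,0"            # per-disk addressing behind a MegaRAID card
      - "megaraid,1"
```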
Second, the “latest” tag problem. The README explicitly warns against using latest-style tags because they can update silently. For production deployments, pin a specific version (v0.8.2-omnibus, etc.) [README]. This is standard Docker hygiene, but worth emphasizing: silent, unannounced updates are exactly the kind of surprise your monitoring infrastructure exists to catch, not cause.
Third, the InfluxDB bundling. The omnibus image ships InfluxDB internally. For production environments where you might want to back up or query the time-series data independently, you’ll want to understand that the data lives inside the container volume at /opt/scrutiny/influxdb. Plan your backup strategy accordingly.
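In practice that means keeping the InfluxDB path on a bind mount or named volume you control and snapshotting it like any other stateful directory. A minimal sketch, assuming the bind-mount layout from the docker run example above:

```bash
# Stop the container briefly for a consistent copy, archive the
# time-series data and config directories, then bring it back up.
docker stop scrutiny
tar czf scrutiny-backup-$(date +%F).tar.gz /opt/scrutiny/influxdb /opt/scrutiny/config
docker start scrutiny
```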
Fourth, multi-server setup requires collector configuration. The collector image on remote machines needs to know where the web server API is. This is config-file driven and not difficult, but it’s not zero-config — you’re looking at a collector YAML per remote host pointing at your central Scrutiny instance [README][1].
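Concretely, each remote node carries a small collector config pointing at the hub. This is a hedged sketch; the api/endpoint and host/id keys are recalled from example.collector.yaml rather than quoted, so confirm them against the repo before deploying.

```yaml
# collector.yaml on a remote node -- report to the central web server.
# Hostname is illustrative; verify key names against example.collector.yaml.
host:
  id: "nas02"                                      # label for this node in the dashboard (assumed key)
api:
  endpoint: "http://scrutiny-hub.example.lan:8080"
```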
Realistic time estimate for a technical user: 15–30 minutes for a single-machine omnibus deployment. 45–90 minutes for a three-machine setup with separate collectors. The [1] review confirms the multi-server path works — the author ran it across multiple nodes in a cluster setup.
For non-technical users: this is meaningfully harder than a Synology DSM plugin. You need comfort with Docker, drive device paths, and basic Linux CLI. If you’ve never opened a terminal on a Linux box, this isn’t the right starting point.
Pros and cons
Pros
- Real-world failure thresholds. The core differentiator: instead of surfacing 100+ raw attributes equally, Scrutiny applies population-scale failure data to highlight only metrics that actually predict drive death [README]. This is what smartd alone cannot do.
- MIT licensed. No commercial tier, no “community edition” gating, no fair-code restrictions. The whole thing is yours [README].
- Multi-server, single dashboard. The collector architecture means you can monitor drives across an entire homelab cluster from one UI — confirmed working in practice [1].
- Historical trending. InfluxDB backend means you can track an attribute getting worse over time, not just whether it crossed a threshold today [README].
- Omnibus image simplicity. For single-machine deployments, one Docker run command gets you a working dashboard [README].
- Temperature history. Visible in the dashboard UI, useful for identifying drives running hotter than others [README][1].
- Webhook alerting. Notification integration via webhooks [README].
Cons
- “Work in progress” warning is real. The README itself says this with “rough edges” — take that seriously when evaluating for production use [README].
- Sparse third-party coverage. As of this review, deep independent reviews of Scrutiny are nearly non-existent. The one found [1] is brief and positive but doesn’t cover failure modes or alerting behavior in practice.
- Drive passthrough is manual. You explicitly list each --device in the Docker run command. If you add a drive to the server, you need to update the container configuration and restart [README].
- Requires explicit Docker capabilities. --cap-add SYS_RAWIO is a non-trivial privilege. In locked-down environments this may not be approvable [README].
- InfluxDB bundling adds weight. The omnibus image is heavier than a pure web-only container. On very resource-constrained machines (Raspberry Pi class hardware with limited RAM), this matters.
- RAID pass-through is hit-or-miss. Hardware RAID controllers often don’t expose individual drive S.M.A.R.T data. Scrutiny can’t fix this if smartctl can’t see through your RAID card [README].
- No native Prometheus metrics export. If you already run Prometheus, you can’t scrape Scrutiny directly — the data lives in InfluxDB.
- Performance testing not implemented. Listed as a future feature; not available now [README].
- Slower commit velocity. The project has ~899 commits across its lifetime but recent activity should be verified before adopting for long-term production use.
Who should use this / who shouldn’t
Use Scrutiny if:
- You run a homelab, NAS, or small-business server with four or more drives and you want to know which one is quietly dying.
- You’re already running Docker and comfortable with device passthrough configuration.
- You want multi-server drive monitoring aggregated in one dashboard.
- You want historical trend data that raw smartctl output can’t provide.
- You’re replacing a commercial NAS platform that bundled S.M.A.R.T monitoring and losing that visibility on migration.
Skip it (use smartd + email alerts instead) if:
- You have two or three drives on a single machine and email alerts from smartd are sufficient.
- You want zero maintenance overhead — smartd is a mature daemon that ships with most Linux distributions and requires no containers.
Skip it (use Prometheus + smartctl_exporter instead) if:
- You already run a Prometheus/Grafana observability stack and want S.M.A.R.T data in the same system.
- You want native alerting via Alertmanager.
- You prefer infrastructure-as-code over Docker Compose files.
Skip it entirely if:
- Your drives are behind a hardware RAID controller that doesn’t pass S.M.A.R.T data through to smartctl — Scrutiny can’t help if the raw data isn’t accessible.
- You’re on Windows or macOS — this is Linux-native infrastructure software.
Alternatives worth considering
- smartd (standalone) — Ships with smartmontools, installed in most Linux distros. CLI-only, no historical trending, no real-world thresholds. Free, zero dependencies. The baseline that Scrutiny improves on [README].
- Netdata — Broader system monitoring including drive health via S.M.A.R.T. More complex to configure for drive-specific alerting. Has a hosted cloud option. Better fit if you want unified system + drive monitoring.
- Prometheus + smartctl_exporter + Grafana — The DIY path. More setup time, more control, integrates with existing observability stacks. No real-world failure thresholds out of the box — you’d build those alert rules manually.
- TrueNAS / Openmediavault — Bundled S.M.A.R.T monitoring included. Only relevant if you’re running their full NAS operating system, not individual Linux servers.
- Checkmk / Nagios — Enterprise monitoring platforms with S.M.A.R.T plugins. Overkill for most self-hosters; commercial licensing for full features.
Bottom line
Scrutiny fills a narrow but genuine gap: it turns the raw, undifferentiated output of smartd into a dashboard you can actually use to make decisions — which drive is degrading, how fast, how its temperature compares to last month. The real-world failure threshold logic is the feature that separates it from just plotting S.M.A.R.T numbers in Grafana. For anyone running more than a few drives across multiple servers, the multi-collector architecture is practical and confirmed to work in deployment.
The honest caveats are real, though. The “work in progress” label isn’t false modesty, the external review community is thin, and RAID pass-through remains controller-dependent. If your setup is simple (one machine, software RAID or no RAID), Scrutiny is a low-risk addition to your monitoring stack. If you’re building something you can’t afford to be blind to, test it thoroughly before relying on its alerting.
If the Docker configuration and device passthrough are the blockers, upready.dev handles these deployments one-time for clients. You keep the infrastructure; we do the setup.
Sources
- [1] Enchanted Code — “Now Self Hosted #4” (covers Scrutiny deployment across multi-node cluster, temperature graphs, S.M.A.R.T data display). https://enchantedcode.co.uk/blog/now-self-hosted-4/
Primary sources:
- [README] GitHub repository and README — https://github.com/AnalogJ/scrutiny (7,581 stars, MIT license, 261 forks)
- Docker Hub / GitHub Container Registry — scrutiny package versions: https://github.com/AnalogJ/scrutiny/pkgs/container/scrutiny/versions