Best Smart Ring: SLEEP 💤
Which smart ring measures sleep best? I spent a night in a sleep lab to compare the sleep tracking accuracy of the top models against the gold standard—polysomnography (PSG). How did Oura, Samsung, Ultrahuman, and others perform? The results might surprise you!

I spent a night in a sleep lab to test how accurately smart rings measure sleep compared to polysomnography (PSG). I brought along all my precious—Oura 3, Oura 4, Ultrahuman Air, Samsung Galaxy Ring, Circular Slim, Circul, RingConn Gen2, EQ R3, and as a bonus, the new Garmin Fenix 8 watch.
The previous article in this series looked at the summary data: the total duration of each sleep stage.

If you haven’t read it, here are the key takeaways:
- Oura Ring 4 and Samsung Galaxy Ring performed very well.
- Total time in each sleep stage isn’t enough. Even if a ring correctly estimates the duration of deep sleep, it might not pinpoint the exact time it actually occurred.
In this article, you’ll learn:
- 🥼 What happens during a night in a sleep lab.
- 🔭 How sleep stages are actually determined.
- 📱 How to export data from smart ring apps.
- 🌗 How smart rings perform using a 2-stage model (sleep vs. wake).
- 📉 How they compare in the 4-stage model (hypnogram, confusion matrix).
- 🏅 Which device had the most accurate measurements?
- 🤔 What’s the final verdict?
For full context, I recommend reading the entire article.
Let’s dive in! 💥
🥼 Sleep Lab: How Is Sleep Measured?
If you suspect a sleep disorder, your primary care doctor can refer you for an overnight study at a sleep lab. After scheduling an appointment, you’ll arrive at the facility in the evening, where a team of specialists will attach various sensors to your body. These sensors monitor your brain waves, eye movements, muscle activity, heart rate, and breathing throughout the night.
The process of attaching all the electrodes and straps isn’t exactly pleasant—it takes some time, and you might feel a bit like a cyborg. It’s not the most comfortable way to sleep, either, since you’re covered in wires, bands, and electrodes, which is far from a typical bedtime setup. If you tend to toss and turn at night, it can be quite bothersome.
The "fun" continues in the morning when it’s time to remove all the sensors. Some come off easily, while others provide an unexpected (but free) partial depilation. Once everything is off, you can take a shower to wash away the conductive gel. Then, you head home and wait for the results and a consultation with a specialist.
Overall, it’s not the most comfortable night, but if you suspect sleep issues, it’s definitely worth doing.





Evening Gallery: Beer (an important detail, as you’ll soon find out), rings, feet, the ring-covered cyborg, and a close-up of a medical oximeter.
🔭 How Are Sleep Stages Determined?
Most smart devices categorize sleep into four main stages: wake, light sleep, deep sleep, and REM sleep. But how are these stages identified in a sleep lab?
Polysomnography: The Gold Standard for Sleep Measurement
A sleep lab uses polysomnography (PSG) to monitor various physiological signals:
🔹 EEG (Electroencephalography) – records brain activity
🔹 EOG (Electrooculography) – tracks eye movements
🔹 EMG (Electromyography) – measures muscle tension
🔹 ECG and pulse oximetry (PPG) – monitor heart activity and blood oxygen, alongside airflow sensors and chest belts that track breathing
💍 How Do Smart Rings and Watches Track Sleep?
Wearable devices like smart rings and watches don’t have EEG capabilities, so they rely on different data sources:
🔸 Heart rate variability (HRV)
🔸 Changes in heart rate throughout the night
🔸 Body movements (accelerometer)
These devices use machine learning algorithms to estimate sleep stages based on these metrics. Since it's an indirect measurement, the goal is to get as close as possible to the accuracy of polysomnography.
⛏️ Mining Sleep Data: How to Export Sleep Data from Smart Rings
A huge thanks to the sleep lab (which preferred to remain anonymous) for providing reference data from my experimental night. This gave me a solid benchmark for comparison.
But getting data from smart rings? That was a whole different challenge! And it made me wonder—how much do I really own my own wearable data? If an app only shows you a hypnogram image without exact timestamps, that’s not enough.
So, how did different platforms handle data exports? Let’s find out!
Device | API / Data Export | Rating |
---|---|---|
Oura Ring (Oura 3 & 4) | Great API, data can be downloaded via GET in Postman | 👍 |
Ultrahuman Air | Excellent API, support helped with setup | 👍 |
RingConn Gen2 | Allows export, but not detailed enough. Had to manually transcribe. | 👍👎 |
Circular Slim | No API, manual data extraction was even more painful than RingConn | 👎 |
EQ R3 (ECTRI app) | No export option, manual transcription only | 👎 |
Samsung Galaxy Ring | API is internal, data not visible in the app. Workaround: transfer data to Sleep as Android. | 👎 🤬 |
Circul | Impossible to extract data. | ❌ |
Aizo | Data mysteriously disappeared. | ❌ |
Garmin Fenix 8 | Manual reading from Garmin Connect | 👎 |
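Once an export is in hand, it still has to be turned into timestamped stages. Here is a minimal sketch of that step, assuming the export contains one digit per 5-minute epoch. As far as I can tell from its documentation, Oura's v2 API does this in a field called `sleep_phase_5_min`, but treat the field name and the digit mapping as assumptions to verify. The function name and the sample date are made up for illustration.

```python
from datetime import datetime, timedelta

# Assumed digit-to-stage mapping (my reading of Oura's v2 API docs;
# verify against the current documentation before relying on it).
STAGE_BY_DIGIT = {"1": "deep", "2": "light", "3": "rem", "4": "wake"}


def decode_stage_string(start, digits, epoch_minutes=5):
    """Turn a string such as '4221' into (timestamp, stage) pairs."""
    step = timedelta(minutes=epoch_minutes)
    return [(start + i * step, STAGE_BY_DIGIT[d]) for i, d in enumerate(digits)]


# Hypothetical example: four epochs starting at 23:00.
epochs = decode_stage_string(datetime(2025, 1, 1, 23, 0), "4221")
for timestamp, stage in epochs:
    print(timestamp.strftime("%H:%M"), stage)
```

Devices that only offer manual transcription end up in the same `(timestamp, stage)` shape, just typed in by hand.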
📊 How Do Smart Rings Perform in the Two-Phase Sleep-Wake Model?
We have both reference data and smart ring outputs—now it’s time for analysis. We’ll start with a broad overview and gradually dive deeper into the details.
🔍 The Simplest Sleep Classification: Two-Phase Model
The most basic method of sleep analysis divides the night into just two states: sleep vs. wakefulness (awake/wake phase).
📌 How to Interpret the Results?
Each device detects sleep onset and wake-up differently: some recognize the initial wake phase, while others skip the transition entirely. These variations can also affect wake detection in the morning. To keep the comparison fair, I manually aligned the final wake phase across all devices so that every recording ends under the same conditions.
📌 Why Is the Two-Phase Model Important?
Many sleep tracking platforms use Total Sleep Time (the span from sleep onset to final awakening, minus any wake phases in between) as a key input for further calculations. It feeds into sleep scores, readiness scores, and other derived metrics, so even small differences in wake detection can noticeably shift the final results.
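To make the two-phase idea concrete, here is a small sketch that collapses a list of per-epoch stage labels into sleep vs. wake and sums up Total Sleep Time. The label names, the function names, and the sample night are assumptions for illustration, not the format of any particular app.

```python
def to_two_phase(stages):
    """Collapse four-phase labels into 'wake' or 'sleep'."""
    return ["wake" if s == "wake" else "sleep" for s in stages]


def total_sleep_minutes(stages, epoch_minutes):
    """Count every non-wake epoch and convert the count to minutes."""
    asleep = sum(1 for s in to_two_phase(stages) if s == "sleep")
    return asleep * epoch_minutes


# Hypothetical night fragment with 5-minute epochs.
night = ["wake", "light", "light", "deep", "wake", "rem"]
two_phase = to_two_phase(night)
tst = total_sleep_minutes(night, epoch_minutes=5)
print(two_phase, tst)
```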
📉 How Accurately Did Smart Rings Track Sleep?
Let’s examine how much the results vary across devices. The graph below compares total sleep duration (blue) and wake phases (orange) for the PSG reference and each tested device.

The following graph shows how far each device's wake phase deviates from the PSG reference value, in percent.
• Negative values indicate that the device underestimated the wake phase (it detected less wake time than actually occurred).
• Positive values mean the device overestimated the wake phase (it detected more wake time than there actually was).
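The percentage deviation itself is simple arithmetic. A minimal sketch, with made-up wake durations rather than the values measured in this test:

```python
def percent_deviation(device_value, reference_value):
    """Signed deviation of a device from the reference, in percent."""
    return 100.0 * (device_value - reference_value) / reference_value


# Hypothetical numbers: PSG saw 40 minutes awake.
under = percent_deviation(30, 40)  # device reported 30 min awake
over = percent_deviation(50, 40)   # device reported 50 min awake
print(under, over)  # -25.0 25.0
```

Note that short reference values inflate the percentage: missing 10 minutes out of 40 already shows up as 25%.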

📊 How Do the Results Look in the Four-Phase Model? (Hypnogram, Confusion Matrix)
After analyzing sleep vs. wakefulness, we now shift to the four-phase sleep model, which distinguishes between:
🔹 Deep Sleep – The deepest stage, essential for body regeneration and muscle growth.
🔹 Light Sleep – A transitional phase, stabilizing breathing and heart rate.
🔹 REM Sleep – The dreaming phase, characterized by high brain activity, crucial for memory and learning.
🔹 Wakefulness – Brief awakenings during the night, often unnoticed.
Just like most smart ring applications, we will now look at the night through a hypnogram. It makes it easy to see how closely each device's sleep stages follow the PSG reference over the course of the night.
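Comparing hypnograms epoch by epoch only works if all devices sit on the same time grid. Sleep labs typically score PSG in 30-second epochs, while apps tend to report longer blocks or start/end intervals. The sketch below shows one possible way to expand intervals into fixed epochs. The function name, the `"unknown"` filler label, and the sample times are assumptions, not a description of how any vendor stores its data.

```python
from datetime import datetime, timedelta


def intervals_to_epochs(intervals, start, end, epoch_seconds=30, fill="unknown"):
    """Expand (start, end, stage) intervals into one label per fixed epoch."""
    step = timedelta(seconds=epoch_seconds)
    labels = []
    t = start
    while t < end:
        label = fill  # used when no interval covers this epoch
        for seg_start, seg_end, stage in intervals:
            if seg_start <= t < seg_end:
                label = stage
                break
        labels.append(label)
        t += step
    return labels


# Hypothetical two-minute fragment of a night.
t0 = datetime(2025, 1, 1, 23, 0)
segments = [
    (t0, t0 + timedelta(minutes=1), "light"),
    (t0 + timedelta(minutes=1), t0 + timedelta(minutes=2), "deep"),
]
labels = intervals_to_epochs(segments, t0, t0 + timedelta(minutes=2))
print(labels)
```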
Below is an interactive hypnogram where you can select the device comparison that interests you. (For the best experience, I recommend using landscape mode.)
As the evening gallery hinted, I had a beer before bed. During the night it sent me to the bathroom, and afterwards I couldn't fall back asleep for a long time. Oura and Samsung captured this moment well and correctly identified the extended wake phase. Ultrahuman and Garmin at least registered that I took a few steps (accelerometer).
I think it's now visually clear who is "pulling sleep phases out of thin air." But what about hard data, actual numbers?
One of the best ways to objectively evaluate measurement accuracy is the confusion matrix.
📌 How does the confusion matrix work?
- Each row represents the actual sleep phase according to PSG (polysomnography).
- Each column shows how the tested device classified that phase.
- If a device were perfectly aligned with the reference (PSG), we would see 100% along the diagonal – meaning it always correctly assigned the sleep phase.
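Here is a minimal sketch of how such a matrix can be computed from two aligned lists of epoch labels, with each row normalized to 100%. The function name and the tiny sample night are made up for illustration. The real matrices in this article come from a full night of data.

```python
from collections import Counter

STAGES = ["wake", "light", "deep", "rem"]


def confusion_matrix_percent(psg, device, stages=STAGES):
    """Rows: actual stage (PSG). Columns: device label. Each row sums to 100%."""
    if len(psg) != len(device):
        raise ValueError("both label lists must cover the same epochs")
    counts = Counter(zip(psg, device))
    matrix = {}
    for actual in stages:
        row_total = sum(counts[(actual, predicted)] for predicted in stages)
        matrix[actual] = {
            predicted: (100.0 * counts[(actual, predicted)] / row_total
                        if row_total else 0.0)
            for predicted in stages
        }
    return matrix


# Hypothetical eight-epoch night.
psg = ["deep", "deep", "deep", "deep", "light", "light", "rem", "wake"]
device = ["deep", "deep", "deep", "light", "light", "deep", "rem", "wake"]
cm = confusion_matrix_percent(psg, device)
print(cm["deep"])  # 75% deep, 25% light for the PSG deep epochs
```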
If this isn’t clear, let’s break it down using an example with the RingConn Gen2.

For deep sleep, the agreement between PSG and RingConn is 66.3%.
When RingConn classified a phase as Deep sleep, it was actually Light sleep in 25.6% of cases, according to PSG.
We can also verify this on the hypnogram:

The confusion matrices for all tested devices are attached below.
🏅 Which device had the most accurate measurements?
After a detailed comparison of sleep phases, the hypnogram, and confusion matrices, it's time for the final verdict. Which smart device best replicates PSG, and which one is just guessing sleep phases?
If we translate the findings from the confusion matrix into a single comprehensive graph, the results look like this:
- The horizontal axis represents the average accuracy of sleep phase detection.
- The vertical axis shows the accuracy of the worst-detected phase.
The further up and to the right ↗️ a device is, the better it performed. An ideal device would be as close as possible to the top right corner. 📊
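Both coordinates can be read straight off the diagonal of a confusion matrix. The sketch below assumes that "average accuracy" means the unweighted mean of the four diagonal values, which is my guess at the simplest reading. The numbers are invented and do not belong to any tested device.

```python
def summarize_diagonal(matrix):
    """Return (mean, minimum) of the per-stage agreement on the diagonal."""
    diagonal = [matrix[stage][stage] for stage in matrix]
    return sum(diagonal) / len(diagonal), min(diagonal)


# Hypothetical row-normalized matrix (percent), rows = PSG, columns = device.
example = {
    "wake":  {"wake": 60.0, "light": 30.0, "deep": 0.0,  "rem": 10.0},
    "light": {"wake": 5.0,  "light": 80.0, "deep": 10.0, "rem": 5.0},
    "deep":  {"wake": 0.0,  "light": 30.0, "deep": 70.0, "rem": 0.0},
    "rem":   {"wake": 5.0,  "light": 5.0,  "deep": 0.0,  "rem": 90.0},
}
average, worst = summarize_diagonal(example)
print(average, worst)  # 75.0 60.0
```

Plotting the worst phase next to the average keeps a device from hiding one badly detected phase behind three good ones.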
Taking all the information into account, the final verdict is as follows:
Final Verdict
Device | Agreement with PSG | Strengths | Weaknesses |
---|---|---|---|
🏆 Oura 4 | ⭐⭐⭐⭐⭐ | Best accuracy for REM and deep sleep | Occasional minor differences in wake detection |
🥈 Samsung Galaxy Ring | ⭐⭐⭐⭐☆ | Excellent REM sleep detection | Slightly overestimates light sleep |
🥉 Garmin Fenix 8 | ⭐⭐⭐☆☆ | Good deep sleep detection | Doesn't always distinguish REM and light sleep correctly |
Ultrahuman Air | ⭐⭐⭐☆☆ | Decent light sleep detection | Sometimes slightly overestimates deep sleep |
RingConn Gen2 | ⭐⭐☆☆☆ | Fairly good REM detection | Often confuses light and deep sleep |
Circular Slim | ⭐☆☆☆☆ | Occasionally detects deep sleep correctly | Major errors in recognizing REM and wake phases |
EQ R3 (ECTRI app) | ❌☆☆☆☆ | - | Almost useless for sleep tracking |
🤔 Conclusion
I am fully aware of the limitations of this test – primarily the small sample size (n=1) and the fact that it was based on a single night of measurements. However, my goal was to provide the most accurate comparison possible of available smart rings against the gold standard – PSG.
At the same time, I’d like to add that sleep phases aren’t the most essential metric for me. I find broader data points like resting heart rate, HRV, and similar metrics to be much more valuable indicators.
Among the rings, I would rank the Ultrahuman Air third (the Garmin Fenix 8 in the table above is a watch), as it was also one of the few devices that registered the obvious awake phase (the bathroom visit).
In the future, I’d like to focus on long-term testing with continuous measurement against a reference EEG headband, which will allow repeated and more detailed sleep phase tracking. I also plan to choose a different representative of the "clones", as the EQ R3, combined with the ECTRI app, completely failed in this test.
📌 And this is just the beginning! This test is the first part of a major smart ring comparison series. Sleep tracking is just the start – next up is a step-counting accuracy test, compared against the scientific reference device StepWatch 5.
P.S. Results from Oura 3 can be found in the appendix here:
