FAQ

How do you detect sleep stages?

In this version of SomnoBot, sleep stages are automatically detected using a well-validated neural network called RobustSleepNet (RSN) [1], which achieves state-of-the-art detection accuracy. RSN was developed, trained and validated by Antoine Guillot and Valentin Thorey (Dreem, Paris) and is publicly available here. SomnoBot has no affiliation with Dreem or the authors of RSN. We appreciate the creativity, perseverance and work of the hundreds of researchers who have developed and validated algorithms to identify sleep stages in human polysomnographic data. To date, only a few algorithms have been trained and validated on a large number of polysomnographic recordings from subjects with different medical conditions, from different studies, and from different clinics. Among these few algorithms, we consider RSN to be one of the best publicly available and validated models.

Your polysomnographic recordings (PSGs) are scored by SomnoBot on your computer in your browser. This means that no PSGs are transmitted to us or any third party. When you visit our website, our implementation of RSN is downloaded to your web browser which runs the neural network locally.

SomnoBot will detect sleep stages regardless of whether your recordings are strongly contaminated with artifacts. This is often helpful, since contaminated recordings can still be scored. However, SomnoBot will also yield sleep stages for segments that contain no usable signal, such as flatlines (e.g., when electrodes were not plugged in). This is expected behavior. We plan to detect artifacts in a future version of SomnoBot.
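Until artifact detection is available, you may want to flag flatline segments yourself before interpreting the scores. The following is a minimal sketch, not part of SomnoBot: the function name, the 30-second epoch length, and the amplitude threshold are all assumptions chosen for illustration, and the signal is assumed to be a plain list of samples from one channel.

```python
def flag_flatline_epochs(signal, sampling_rate_hz, epoch_seconds=30,
                         min_peak_to_peak=1e-6):
    """Return the indices of epochs whose peak-to-peak amplitude is
    below min_peak_to_peak.

    Hypothetical helper: epoch_seconds and min_peak_to_peak are
    illustrative defaults, not values used by SomnoBot.
    """
    samples_per_epoch = int(sampling_rate_hz * epoch_seconds)
    if samples_per_epoch <= 0:
        raise ValueError("an epoch must contain at least one sample")

    flat_epochs = []
    # A trailing partial epoch is ignored.
    n_epochs = len(signal) // samples_per_epoch
    for i in range(n_epochs):
        epoch = signal[i * samples_per_epoch:(i + 1) * samples_per_epoch]
        if max(epoch) - min(epoch) < min_peak_to_peak:
            flat_epochs.append(i)
    return flat_epochs


# Toy signal sampled at 2 Hz: three 30-second epochs, the middle one flat.
toy_signal = (
    [((k % 7) - 3) * 10.0 for k in range(60)]
    + [0.0] * 60
    + [((k % 5) - 2) * 10.0 for k in range(60)]
)
flat = flag_flatline_epochs(toy_signal, sampling_rate_hz=2)
print(flat)
```

Sleep stages reported for the flagged epochs should be treated with caution, because the network had no physiological signal to work with.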

Which web browser should I use?

We use progressive web app (PWA) technology to run SomnoBot locally on your computer. Therefore, we recommend that you use a recent version (v. 114 or later) of a web browser such as Chrome, Chromium, or Firefox. Chromium-based browsers such as Edge will also work. Users have also reported that SomnoBot works with the Safari web browser.

Do you have access to my data to detect sleep stages?

No, your polysomnographic data remain on your computer. We don't have access to your data. They are not transferred or copied to anyone. When you visit our sleep stage detection website, a neural network is downloaded to your web browser. Sleep stages are detected in your web browser, which runs the neural network. All the calculations are done by your computer, so there is no need to transfer your data to us or any other party.

Still not convinced? Try it for yourself: Once you have opened the sleep stage detection website in your web browser and the page has finished loading (this can take up to a minute depending on your internet connection), you can disconnect from the internet. You will be able to select a PSG file and let SomnoBot detect sleep stages. Note that for technical reasons beyond our control, you must disable private browsing (also called "private mode" or "incognito mode") in your browser before using SomnoBot offline.

Which channels should I select to detect sleep stages?

We recommend selecting at least two EEG channels and one EOG channel. The neural network was designed to handle arbitrary EEG montages, and its original developers demonstrated that sleep scoring performance varied only slightly for different EEG montages (see Ref. [1], Table 4). However, their results also suggest that selecting EEG and EOG channels will yield more accurate scores than selecting only a single channel such as EOG. These results are supported by our own experiments, in which we compared SomnoBot’s scores to the scores of the Cleveland Family Study [2] (data not shown).

Are the sleep stages correctly detected?

We have taken every possible measure to ensure that SomnoBot's sleep scores are accurate. However, as a good scientist, you should not blindly trust the sleep scores. It is always a good idea to check the scores on a sample of epochs from your recordings.

Below we explain

what it means to be "accurate" in sleep stage scoring,
what we suggest as a simple protocol that you can use to evaluate whether SomnoBot is indeed accurately detecting sleep stages in your data, and
how we ensured that the implementation of the underlying neural network (RobustSleepNet) is correct.

What does it mean to be "accurate" in sleep stage scoring?

Even expert scorers make errors. When different human experts manually score the same PSG recording, they will usually disagree on some epochs (inter-rater variability), despite their best efforts and training [3]. When asked to score a recording twice, even a single human expert will usually not identify exactly the same sleep stages for all epochs (intra-rater variability) [4]. One approach to increasing the accuracy of sleep scoring is to have different experts score the same recordings and derive an expert consensus score from these scores. Agreement between experts depends on the sleep stage, with substantial agreement reported for W and R stages, moderate agreement for N2 and N3 stages, and fair agreement for N1 stage [3]. In fact, a comprehensive study suggests that the percentage of epochs where all expert scorers are in agreement decreases with the number of expert scorers involved, reaching only 20-30% of epochs in the limit of an infinite number of expert scorers [5].
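To make the notions of expert consensus and full agreement concrete, here is a minimal sketch. It assumes a simple per-epoch majority vote, which is only one possible way to derive a consensus and is not necessarily the procedure used in the cited studies. The function names and the toy scores are hypothetical.

```python
from collections import Counter


def majority_consensus(scorings):
    """Per-epoch majority vote across scorers.

    Assumption for illustration: an epoch with a tie between the two
    most frequent stages gets None instead of a stage.
    """
    n_epochs = len(scorings[0])
    if any(len(s) != n_epochs for s in scorings):
        raise ValueError("all scorers must score the same number of epochs")

    consensus = []
    for epoch_labels in zip(*scorings):
        counts = Counter(epoch_labels).most_common()
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            consensus.append(None)  # tie, no majority stage
        else:
            consensus.append(counts[0][0])
    return consensus


def unanimous_fraction(scorings):
    """Fraction of epochs on which all scorers assigned the same stage."""
    epochs = list(zip(*scorings))
    return sum(len(set(labels)) == 1 for labels in epochs) / len(epochs)


# Toy example: three scorers, five epochs.
scorer_a = ["W", "N1", "N2", "N3", "R"]
scorer_b = ["W", "N2", "N2", "N3", "R"]
scorer_c = ["W", "N1", "N2", "N2", "N1"]
scorings = [scorer_a, scorer_b, scorer_c]

consensus = majority_consensus(scorings)
fraction = unanimous_fraction(scorings)
print(consensus)  # ['W', 'N1', 'N2', 'N3', 'R']
print(fraction)   # 0.4
```

In this toy example the majority vote yields a stage for every epoch, yet all three scorers agree on only two of the five epochs. Adding more scorers can only keep or lower the unanimous fraction, which is consistent with the trend reported in [5].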

In light of these results, we suggest that the scoring performance of automated scoring systems be compared to that of expert scorers. Indeed, the underlying neural network (RobustSleepNet) used in SomnoBot has been shown to produce more accurate scores when compared to an expert consensus than when compared to scores from individual experts (Ref. [1], Table 6). This is encouraging and suggests that the network mimics an expert consensus more closely than individual experts.

A simple protocol to evaluate detection accuracy in your data

We always recommend visualizing the hypnogram to check for anomalies. Keep in mind that N1 is particularly difficult to score for both expert scorers and automated systems.

If you want to assess whether SomnoBot scores sleep as you do, we recommend the following.

Select one of your PSG recordings and score sleep manually.
Let SomnoBot detect sleep stages for the selected recording.
Compare your scores with SomnoBot's scores by calculating Cohen's Kappa.

Below, we link to resources to help you do this. The higher the Cohen's Kappa, the better the agreement between your scores and SomnoBot's scores. Keep in mind that you cannot expect perfect agreement between your scores and SomnoBot's. Even human experts do not achieve perfect agreement (see previous section).
Compare your Cohen’s Kappa value with those obtained between pairs of expert scorers of the DODO/DODH datasets [6].

If your Kappa value is within the distribution of Kappa values, this is a good indication that SomnoBot is detecting sleep stages similar to you, with an agreement comparable to that between two expert scorers. If your Kappa value is below the distribution of Kappa values, then SomnoBot’s scores are in less agreement with your scores than two expert scorers would normally be in agreement with each other. In such a case, you may want to reconsider whether or not you want to use SomnoBot for your data.
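The comparison step in the protocol above can be sketched in a few lines of Python. This is a minimal illustration of the standard Cohen's Kappa formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name and the toy hypnograms are hypothetical, and both hypnograms are assumed to be lists with one stage label per epoch.

```python
from collections import Counter


def cohens_kappa(scores_a, scores_b):
    """Cohen's Kappa between two equally long sequences of stage labels."""
    if len(scores_a) != len(scores_b) or len(scores_a) == 0:
        raise ValueError("both hypnograms must be non-empty and equally long")

    n = len(scores_a)
    # Observed agreement: fraction of epochs with identical labels.
    observed = sum(a == b for a, b in zip(scores_a, scores_b)) / n

    # Chance agreement: sum over stages of the product of the
    # two scorers' marginal frequencies.
    counts_a = Counter(scores_a)
    counts_b = Counter(scores_b)
    expected = sum(counts_a[s] * counts_b[s] for s in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("Kappa is undefined when both scorers use one stage only")
    return (observed - expected) / (1.0 - expected)


# Toy example: ten epochs scored manually and by SomnoBot (made-up labels).
manual = ["W", "W", "N1", "N2", "N2", "N2", "N3", "N3", "R", "R"]
somnobot = ["W", "W", "N2", "N2", "N2", "N2", "N3", "N2", "R", "R"]

kappa = cohens_kappa(manual, somnobot)
print(round(kappa, 3))  # 0.733
```

Here the two hypnograms agree on 8 of 10 epochs (p_o = 0.8) and the chance agreement is p_e = 0.25, which gives a Kappa of about 0.733. A real recording contains several hundred epochs, so the estimate will be far more stable than in this toy example.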

Figure: Comparison of Cohen’s Kappa. Probability distributions of the agreement (measured by Cohen’s Kappa) between pairs of expert scorers for scoring sleep in recordings of healthy subjects (DODH) and subjects with obstructive sleep apnea (DODO). The distributions indicate a ‘substantial’ agreement (Kappa values between 0.61 and 0.80) for the majority of recordings. Expert scorers tend to agree less on sleep scores for subjects with sleep apnea than for healthy subjects; for the former, we can observe recordings with only ‘fair’ agreement (Kappa values between 0.21 and 0.40). Datasets DODO and DODH are publicly available [6]; this figure was created by the authors of SomnoBot.

Kappa values between 0.21 and 0.40 indicate a ‘fair’, between 0.41 and 0.60 a ‘moderate’, between 0.61 and 0.80 a ‘substantial’, and of 0.81 or above a ‘near-perfect’ agreement between scorers [3]. Compare the Cohen’s Kappa value you determined between your scores and SomnoBot’s scores for a sample recording with the distributions shown in the figure to determine whether you want to trust SomnoBot’s scores or not.
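The mapping from Kappa values to agreement categories can be written as a small lookup. This is a sketch under two assumptions: values that fall between two listed ranges (for example 0.405) are assigned to the lower category, and values below 0.21, which the text above does not name, are reported as "below fair". The function name is hypothetical.

```python
def agreement_label(kappa):
    """Map a Cohen's Kappa value to the agreement categories listed above.

    Assumptions for illustration: lower bounds are inclusive, and values
    below 0.21 are labelled "below fair" because the text gives no name
    for that range.
    """
    if kappa >= 0.81:
        return "near-perfect"
    if kappa >= 0.61:
        return "substantial"
    if kappa >= 0.41:
        return "moderate"
    if kappa >= 0.21:
        return "fair"
    return "below fair"


for value in (0.85, 0.73, 0.50, 0.30, 0.10):
    print(value, agreement_label(value))
```

The label is only a coarse summary. Comparing your Kappa value with the distributions for pairs of expert scorers remains the more informative check.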

There are several scripts available online that can help you calculate Cohen’s Kappa in your scripting language of choice, for instance Matlab, Python, or R. Alternatively, you can use an Excel sheet that we created to help you calculate Cohen's Kappa, which is available online here.

How we ensured that the neural network was correctly implemented

SomnoBot uses a well-validated neural network called RobustSleepNet (RSN) [1], which achieves state-of-the-art detection accuracy. RSN was developed, trained and validated using data from various clinics and sleep studies and is publicly available here. To run RSN in your browser, we ported the neural network to SomnoBot using the same model weights as in the original publication [1]. However, computers operate with finite numerical precision, which varies between programming languages, libraries and computing hardware. In rare cases, this inevitably leads SomnoBot to predict a different sleep stage for a given epoch than the original implementation does. We tested how often such a discrepancy occurs on 70 PSG recordings from the IS-RC dataset [7]. We observed that 1 out of 84,347 epochs was scored differently compared to the original implementation, which corresponds to a misclassification rate of about 0.001%. We consider this rate to be negligible, indicating that our implementation closely follows the original one.
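The reported rate follows directly from the epoch counts. The sketch below reproduces the arithmetic and shows how such a comparison could be counted for a pair of hypnograms. The function names and the toy hypnograms are hypothetical, and this is not the test code used by the SomnoBot authors.

```python
def count_differing_epochs(scores_a, scores_b):
    """Number of epochs on which two equally long hypnograms differ."""
    if len(scores_a) != len(scores_b):
        raise ValueError("hypnograms must have the same number of epochs")
    return sum(a != b for a, b in zip(scores_a, scores_b))


def discrepancy_rate_percent(n_different, n_total):
    """Share of differing epochs, expressed in percent."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_different / n_total


# Counts reported in the text: 1 differing epoch out of 84,347.
rate = discrepancy_rate_percent(1, 84_347)
print(f"{rate:.4f}%")  # 0.0012%

# Toy hypnograms (made-up labels) showing the epoch-wise comparison.
original = ["W", "N1", "N2", "N2", "R"]
ported = ["W", "N1", "N2", "N3", "R"]
n_diff = count_differing_epochs(original, ported)
print(n_diff)  # 1
```

One differing epoch in 84,347 is roughly 0.0012%, which rounds to the 0.001% quoted above.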

Can I use SomnoBot to score animal data or human intracranial data?

No. The neural network underlying SomnoBot that detects sleep stages has only been trained on scalp EEG recordings. We strongly advise against using SomnoBot for animal or intracranial data.

References

[1]

Guillot, A. & Thorey, V. RobustSleepNet: Transfer Learning for Automated Sleep Staging at Scale. IEEE Transactions on Neural Systems and Rehabilitation Engineering 29, 1441–1451 (2021). doi:10.1109/tnsre.2021.3098968

[2]

Zhang, G.-Q., Cui, L., Mueller, R., Tao, S., Kim, M., Rueschman, M., Mariani, S., Mobley, D. & Redline, S. The National Sleep Research Resource: Towards a Sleep Data Commons. Journal of the American Medical Informatics Association 25, 1351–1358 (2018). doi:10.1093/jamia/ocy064

[3]

Lee, Y. J., Lee, J. Y., Cho, J. H. & Choi, J. H. Interrater Reliability of Sleep Stage Scoring: A Meta-Analysis. Journal of Clinical Sleep Medicine 18, 193–202 (2022). doi:10.5664/jcsm.9538

[4]

Younes, M., Raneri, J. & Hanly, P. Staging Sleep in Polysomnograms: Analysis of Inter-Scorer Variability. Journal of Clinical Sleep Medicine 12, 885–894 (2016). doi:10.5664/jcsm.5894

[5]

Bakker, J. P., Ross, M., Cerny, A., Vasko, R., Shaw, E., Kuna, S., Magalang, U. J., Punjabi, N. M. & Anderer, P. Scoring Sleep with Artificial Intelligence Enables Quantification of Sleep Stage Ambiguity: Hypnodensity Based on Multiple Expert Scorers and Auto-Scoring. Sleep 46, (2022). doi:10.1093/sleep/zsac154

[6]

Guillot, A., Sauvet, F., During, E. H. & Thorey, V. Dreem Open Datasets: Multi-Scored Sleep Datasets to Compare Human and Automated Sleep Staging. IEEE Transactions on Neural Systems and Rehabilitation Engineering 28, 1955–1965 (2020). doi:10.1109/tnsre.2020.3011181

[7]

Kuna, S. T., Benca, R., Kushida, C. A., Walsh, J., Younes, M., Staley, B., Hanlon, A., Pack, A. I., Pien, G. W. & Malhotra, A. Agreement in Computer-Assisted Manual Scoring of Polysomnograms Across Sleep Centers. Sleep 36, 583–589 (2013). doi:10.5665/sleep.2550