| Characteristic | Incident fracture (−) (n=606) | Incident fracture (+) (n=608) | P value |
| --- | --- | --- | --- |
| Age (years), mean (SD) | 72.0 (7.6) | 72.9 (8.2) | <.001 |
| Females, n (%) | 422 (69.6) | 417 (68.6) | .96 |
| Height (cm), mean (SD) | 156.9 (8.5) | 157.1 (8.3) | .32 |
| Weight (kg), mean (SD) | 58.29 (9.5) | 58.11 (10.4) | .23 |
| BMI (kg/m²), mean (SD) | 23.7 (3.4) | 23.5 (3.5) | .39 |
| Current smoker, n (%) | 115 (18.9) | 169 (27.8) | .001 |
| Current drinker, n (%) | 113 (18.7) | 123 (20.2) | .92 |
| Use of steroids, n (%) | 47 (7.8) | 114 (18.8) | <.001 |
| Possible secondary osteoporosis, n (%) | 45 (7.4) | 77 (12.7) | .001 |

a Variables were compared between groups using the 2-sided Student t test for continuous variables and the χ² test for categorical variables. Fracture (−) and (+) groups represent participants who did not and did experience fractures at 5 years of follow-up, respectively.
b Use of steroids was defined as the use of prednisolone 5 mg daily or equivalent for over 3 months.
c Possible secondary osteoporosis includes patients with osteoporosis and a concurrent diagnosis of type 1 diabetes, osteogenesis imperfecta in adulthood, hyperthyroidism, hypogonadism, premature menopause (<45 years), chronic malnutrition, malabsorption, or chronic liver disease.
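Footnote a states that categorical variables were compared with the χ² test. As an illustrative sketch (not the authors' code), the steroid-use comparison from the table can be reproduced with a standard-library Pearson χ² test for a 2×2 table; the `chi2_2x2` helper name is our own.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction) for the
    2x2 table [[a, b], [c, d]]; returns (statistic, two-sided p-value)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d          # row margins
    col1, col2 = a + c, b + d          # column margins
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-square with 1 df: P(X >= stat) = erfc(sqrt(stat / 2))
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Steroid use from the table: 47/606 in the fracture(-) group, 114/608 in fracture(+)
stat, p = chi2_2x2(47, 606 - 47, 114, 608 - 114)
```

With these counts the statistic is large and the p-value falls well below .001, matching the "<.001" entry in the table.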
As demonstrated in Table 3 , for the development set, the models using images that included both vertebral bone and paravertebral muscle showed significantly better AUROC, accuracy, and precision values compared to those using bone-only images. Specifically, the bone-only images had an AUROC of 0.677 (95% CI 0.674-0.680) and accuracy of 0.669 (95% CI 0.665-0.673). In contrast, the images including both bone and muscle exhibited an AUROC of 0.739 (95% CI 0.737-0.741) and accuracy of 0.719 (95% CI 0.715-0.722; all P <.001). The fracture risk assessment tool (FRAX) model for major osteoporotic fracture and hip fracture showed lower AUROCs of 0.557 and 0.563, respectively, indicating a significantly better performance of our image model (all P <.001).
Similar trends were observed in the external validation set, where bone-only images yielded an AUROC of 0.815 (95% CI 0.806-0.824) and accuracy of 0.754 (95% CI 0.752-0.756), while the combined bone and muscle images demonstrated an AUROC of 0.827 (95% CI 0.821-0.833; P=.04) and accuracy of 0.812 (95% CI 0.798-0.826; P <.001), though the specificity value was similar between the 2 groups. The FRAX model for major osteoporotic fracture and hip fracture had AUROCs of 0.810 and 0.780, respectively. Again, these results confirmed the superior predictive capability of our image-based model (all P <.001).
| Metric | Bone only (development) | Bone + muscle (development) | P value | Bone only (external validation) | Bone + muscle (external validation) | P value |
| --- | --- | --- | --- | --- | --- | --- |
| AUROC (95% CI) | 0.677 (0.674-0.680) | 0.739 (0.737-0.741) | <.001 | 0.815 (0.806-0.824) | 0.827 (0.821-0.833) | .04 |
| Accuracy (95% CI) | 0.669 (0.665-0.673) | 0.719 (0.715-0.722) | <.001 | 0.754 (0.752-0.756) | 0.812 (0.798-0.826) | <.001 |
| Sensitivity (95% CI) | 0.746 (0.739-0.753) | 0.761 (0.746-0.776) | .23 | 0.645 (0.613-0.677) | 0.704 (0.675-0.733) | .054 |
| Specificity (95% CI) | 0.601 (0.586-0.616) | 0.634 (0.625-0.643) | .002 | 0.844 (0.810-0.877) | 0.855 (0.835-0.875) | .43 |
a AUROC: area under the receiver operating characteristic curve.
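The AUROC point estimates and 95% CIs in the table above can be obtained with a rank-based AUROC plus a percentile bootstrap. The sketch below, in pure Python on hypothetical toy scores, illustrates one common way to compute such intervals; the paper does not specify its exact CI procedure, and all names and data here are illustrative.

```python
import random

def auroc(labels, scores):
    """AUROC via the Mann-Whitney relation: the probability that a randomly
    chosen positive case is scored higher than a randomly chosen negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=300, seed=0):
    """Percentile 95% CI for AUROC from resampled (label, score) pairs."""
    rng = random.Random(seed)
    pairs = list(zip(labels, scores))
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]
        ys, ss = zip(*sample)
        if 0 < sum(ys) < len(ys):          # need both classes in the resample
            stats.append(auroc(ys, ss))
    stats.sort()
    return stats[int(0.025 * len(stats))], stats[int(0.975 * len(stats))]

# Toy labels and scores standing in for the model's real outputs
rng = random.Random(42)
labels = [int(rng.random() < 0.5) for _ in range(120)]
scores = [y * 0.5 + rng.random() for y in labels]
a = auroc(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
```

On this toy data the positive class is shifted upward by 0.5, so the AUROC lands well above chance with a CI of nonzero width.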
The image model using vertebral bone and muscle showed significantly higher performance than the clinical models in predicting vertebral fractures during the 5-year follow-up period in both the development and external validation sets ( Figure 4 , Table 4 ). In the development set, the images that included vertebral bone and muscle had significantly better AUROC and accuracy than clinical model D, which included age, sex, BMI, history of alcohol consumption, smoking, possible secondary osteoporosis, type 2 diabetes mellitus, HIV, hepatitis C infection status, and renal failure (AUROC 0.667, 95% CI 0.661-0.672 and accuracy 0.640, 95% CI 0.661-0.649; all P <.001, Table 4 ). In addition, the performance did not change significantly when the clinical variables were added to the image-only model (Table S1 in Multimedia Appendix 1 ).
| Model | AUROC (95% CI) | P value | Accuracy (95% CI) | P value | Sensitivity (95% CI) | P value | Specificity (95% CI) | P value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Development set | | | | | | | | |
| Image-only | 0.739 (0.737-0.741) | Reference | 0.719 (0.716-0.722) | Reference | 0.761 ± 0.024 (0.746-0.776) | Reference | 0.634 (0.625-0.643) | Reference |
| Clinical model A | 0.647 (0.643-0.651) | <.001 | 0.620 (0.614-0.626) | <.001 | 0.681 (0.643-0.719) | .03 | 0.575 (0.549-0.601) | <.001 |
| Clinical model B | 0.631 (0.626-0.636) | <.001 | 0.612 (0.610-0.614) | <.001 | 0.675 (0.639-0.711) | .02 | 0.558 (0.517-0.598) | .003 |
| Clinical model C | 0.663 (0.659-0.667) | <.001 | 0.637 (0.631-0.643) | <.001 | 0.723 (0.694-0.752) | .11 | 0.553 (0.521-0.585) | <.001 |
| Clinical model D | 0.667 (0.661-0.672) | <.001 | 0.640 (0.661-0.649) | <.001 | 0.729 (0.690-0.768) | .13 | 0.560 (0.527-0.593) | .005 |
| FRAX (MOF) | 0.557 | <.001 | 0.557 | <.001 | 0.442 | <.001 | 0.672 | <.001 |
| FRAX (hip) | 0.563 | <.001 | 0.556 | <.001 | 0.449 | <.001 | 0.663 | <.001 |
| External validation set | | | | | | | | |
| Image-only | 0.827 (0.821-0.833) | Reference | 0.812 (0.798-0.826) | Reference | 0.704 (0.675-0.733) | Reference | 0.855 (0.834-0.875) | Reference |
| Clinical model A | 0.731 (0.725-0.737) | <.001 | 0.651 (0.629-0.673) | <.001 | 0.715 (0.683-0.747) | <.001 | 0.656 (0.629-0.683) | <.001 |
| Clinical model B | 0.733 (0.725-0.737) | <.001 | 0.654 (0.625-0.683) | <.001 | 0.728 (0.673-0.783) | <.001 | 0.662 (0.621-0.703) | <.001 |
| Clinical model C | 0.745 (0.733-0.757) | <.001 | 0.669 (0.646-0.692) | <.001 | 0.713 (0.678-0.748) | <.001 | 0.720 (0.689-0.751) | <.001 |
| Clinical model D | 0.749 (0.736-0.762) | <.001 | 0.675 (0.643-0.707) | <.001 | 0.729 (0.690-0.768) | <.001 | 0.686 (0.650-0.722) | <.001 |
| FRAX (MOF) | 0.810 | <.001 | 0.810 | <.001 | 0.262 | <.001 | 0.887 | <.001 |
| FRAX (hip) | 0.780 | <.001 | 0.685 | <.001 | 0.705 | <.001 | 0.682 | <.001 |
b Image model represents the model using bone and muscle.
c Model A includes age and sex.
d Model B additionally includes BMI.
e Model C additionally includes history of drinking, smoking, and possible secondary osteoporosis.
f Model D includes age, sex, BMI, history of alcohol consumption, smoking, possible secondary osteoporosis, type 2 diabetes mellitus, HIV, hepatitis C infection status, and renal failure.
g FRAX: fracture risk assessment tool.
h MOF: major osteoporotic fracture.
i Since this was calculated for a single data set, there are no 95% CI values.
As depicted in Figure 4 , in the external validation set, the images including vertebral bone and muscle showed a significantly better AUROC and accuracy than the clinical model D (AUROC 0.749, 95% CI 0.736-0.762 and accuracy 0.675, 95% CI 0.643-0.707; all P <.001). The results were similar for clinical models A, B, C, and D, which showed poorer performance than the image model.
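The P values comparing the image model against each clinical model could come from several tests; the paper does not name one. Below is a hedged, pure-Python sketch of a paired case-resampling bootstrap for the AUROC difference between two models evaluated on the same cases, with synthetic scores standing in for the real model outputs; all function and variable names are our own.

```python
import random

def auroc(labels, scores):
    """Rank-based AUROC: probability that a positive outscores a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def paired_bootstrap_p(labels, scores_a, scores_b, n_boot=300, seed=0):
    """Two-sided bootstrap p-value for the AUROC difference between two
    models scored on the same cases (cases are resampled jointly)."""
    rng = random.Random(seed)
    n = len(labels)
    observed = auroc(labels, scores_a) - auroc(labels, scores_b)
    deltas = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if not 0 < sum(ys) < n:
            continue  # each resample must contain both classes
        deltas.append(auroc(ys, [scores_a[i] for i in idx])
                      - auroc(ys, [scores_b[i] for i in idx]))
    tail = sum(d <= 0 for d in deltas) / len(deltas)
    return observed, 2 * min(tail, 1 - tail)

# Hypothetical scores: model A separates the classes well, model B is noise
rng = random.Random(1)
labels = [int(rng.random() < 0.5) for _ in range(120)]
scores_a = [y + 0.5 * rng.random() for y in labels]
scores_b = [rng.random() for _ in labels]
delta, p = paired_bootstrap_p(labels, scores_a, scores_b)
```

Resampling cases jointly preserves the per-case pairing between the two models, which is what makes the comparison a paired one rather than two independent CIs.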
In this study, we developed and externally validated a vertebral fracture prediction model by using abdominal CT images. In the development cohort, the performance of predicting vertebral fractures represented by AUROC was 0.688 (SD 0.001) by using images of vertebral bone-only and 0.736 (SD 0.003) by using images of vertebral bone and paravertebral muscle. In the validation cohort, the performances (AUROC) were 0.698 (SD 0.001) and 0.729 (SD 0.002) for images of vertebral bone-only and images of vertebral bone and paravertebral muscle, respectively. In addition, the performance of the model using images of vertebral bone and muscle was significantly better than that of the clinical models using age, sex, BMI, use of steroids, smoking status, and possible secondary osteoporosis, which showed performances of 0.635 (SD 0.002) and 0.698 (SD 0.021), respectively, for the development and validation cohorts.
Our results show that the image models using vertebral bone and muscle performed better than those using images of vertebral bone only. Osteosarcopenia, defined as the combined occurrence of bone loss and sarcopenia, is one of the critical risk factors for osteoporotic fractures [ 25 , 26 ]. The paravertebral muscles are essential components of the vertebral column and are associated with osteoporotic vertebral fractures [ 27 , 28 ]. In previous studies, information retrieved from muscle images, such as cross-sectional area, volume, and degree of fat infiltration in the paravertebral muscle, was correlated with vertebral stability and the risk of fractures [ 28 , 29 ]. Specifically, Kim et al [ 30 ] reported lower cross-sectional areas and greater fat infiltration of the paravertebral muscles in patients with vertebral fractures than in those without fractures. This implies that not only the density and quality of the bones but also the quality of the muscles supporting and communicating with the bones are correlated with the risk of fractures [ 17 ]. Fat infiltration in the muscles, called myosteatosis, has been reported to be associated with an increased risk of fractures [ 17 , 31 ]. Thus, in line with previous studies, our results imply that information from images of the paravertebral muscles, in addition to information from images of the vertebral bones, can help predict vertebral fractures more accurately.
Further, the image-based learning model with images of both vertebral bone and muscle showed better performance than the clinical variable–based models. This finding is consistent with a previous report that showed that information from the images of vertebral bones and muscles from CT scans can be used to predict major osteoporotic fractures and is comparable with FRAX [ 32 ]. Another group reported different algorithms by using opportunistic CT-based bone assessments for osteoporotic fracture prediction [ 33 ]. They showed that CT-based predictors (vertebral compression fractures, simulated DXA T-scores, and lumbar trabecular density) with metadata of age and sex showed better performance in AUROC than FRAX [ 33 ]. However, in that model, muscle information was not considered [ 33 ], which may further improve the performance. In addition to the attenuation information, we used information from the image itself on the quality of the bone and muscle structure, similar to the trabecular bone score [ 13 ]. The trabecular bone score is an algorithm used to calculate the microstructure of the bone based on DXA images [ 34 ]. More than 50% of the osteoporotic fractures occur in patients with a normal or osteopenic range of BMD [ 35 ], which implies that the microarchitecture of the bone is also a key determinant of bone strength [ 36 ]. Similarly, in our study, the model used the information on the qualities of bones and muscles from CT images, demonstrating the potential value of CT images that may include rich and various informative data for the metabolic diseases of bones and muscles.
We also observed that the performance did not significantly change when clinical variables were added to the image-only model. It is possible that information such as age and sex is already reflected to some extent in the image itself [ 37 ]. Therefore, the improvement in performance could be insignificant because this information is a redundant input to the model. It is widely accepted that there is a noticeable sex difference in the size of the vertebral body and paravertebral muscles [ 37 ], and BMI could be positively correlated with the size of the vertebrae and muscles. Moreover, although based on high-resolution peripheral quantitative CT, a previous report showed that each bone has different characteristics according to age and sex, such as calcification and size, which could have influenced our analysis [ 38 ]. Vertebral endplate calcification also increases with age, implying that age information can be reflected in the image [ 39 ]. Furthermore, as reported in previous studies, smoking and alcohol consumption status can be associated with low muscle mass [ 40 ], which may explain why adding simple clinical variables to the image does not significantly improve the model: the image already contains some clinical information. These results are clinically promising, as opportunistic CT scans without detailed clinical variables may in the future automatically provide the risk of osteoporotic fractures.
To extract pertinent information from each CT scan, we designed an image-only model to prevent overfitting and to focus on the essential regions. Since 3D CNN models, which have a large number of parameters to be optimized, tend to overfit the training data [ 41 ], the CNN encoder of our model took consecutive 2D images as its input data while keeping their sequential information with the RNN decoder [ 42 ]. The input processing strategy served as a robust data augmentation method because our model could exploit different 2D image sets from a single 3D CT scan at each training iteration. In addition, an attention module was applied to the CNN encoder to further enhance its robustness. The attention module automatically guided the image model to concentrate on essential regions [ 43 ] for the prediction of vertebral fractures. Thus, the attention CNN-RNN model avoids making predictions based on background regions, except for the vertebral body and paravertebral muscles. Unlike previous CNN model–based deep learning algorithms, which were limited to 2D X-ray analysis or bone texture analysis, our CNN-RNN model showed robust performance in fracture prediction. Owing to its design to mitigate the overfitting problem of conventional 3D CNN models [ 42 ], the CNN-RNN model could extract effective information from 3D CT images, which were intractable in previous approaches. In addition, the attention module forced our model to focus on important regions in the CT images by removing the effects of the background regions [ 43 ].
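To make the attention idea concrete, the minimal sketch below applies softmax attention pooling to per-slice feature vectors, the kind a CNN encoder would emit for consecutive 2D slices before the RNN decoder. This is an illustration of attention-weighted aggregation under our own simplifying assumptions, not the authors' architecture; the scoring weights `w` would be learned in a real model.

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_pool(slice_feats, w):
    """Score each slice's feature vector with weights w, softmax the scores
    into attention weights, and return the weighted sum (one vector per scan)."""
    scores = [sum(wi * fi for wi, fi in zip(w, f)) for f in slice_feats]
    alphas = softmax(scores)
    dim = len(slice_feats[0])
    pooled = [sum(a * f[j] for a, f in zip(alphas, slice_feats))
              for j in range(dim)]
    return pooled, alphas

# Hypothetical 4-dimensional features for 6 consecutive CT slices
rng = random.Random(0)
feats = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]
w = [0.5, -0.2, 0.1, 0.3]   # toy scoring weights (learned in a real model)
pooled, alphas = attention_pool(feats, w)
```

Because the attention weights sum to 1, slices with low scores (e.g., background-dominated slices) contribute little to the pooled representation, which mirrors how an attention module suppresses regions outside the vertebral body and paravertebral muscles.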
Our study has several limitations. First, owing to the retrospective design, the data set did not contain BMD, an essential predictor of osteoporotic fracture; it was therefore difficult to compare a clinical model containing BMD with the image model. Second, the model provided a 5-year fracture prediction instead of a 10-year prediction owing to the follow-up duration of the data set, which is relatively short for real-world practice; because of this short time frame, we also could not show results for nonvertebral fractures, as the number of cases was too small. Third, the paravertebral muscles were included without distinction among the psoas, intervertebral, multifidus, longissimus, iliocostalis, and quadratus lumborum muscles, making it difficult to interpret the contribution of each muscle. In addition, the number of images in the development set may not be sufficient for model optimization. Moreover, because the model was based on contrast-enhanced CT scans, its utility in other contrast settings could be limited. There was also a disparity in vertebral fracture incidence between the development and external validation sets, which may affect the external validity and generalizability of our fracture prediction model. The retrospective nature inherently carries the potential for selection bias, including confounding by indication; although we employed propensity score adjustment to mitigate this bias, residual bias may still be present. Another limitation is the exclusion of poor-quality radiographic imaging data from our models, which might have introduced detection bias and affected the diagnostic accuracy of our models in correctly identifying positive versus negative fracture cases. Finally, we could not assess the reproducibility of these measurements through interexaminer and intraexaminer κ value assessments.
Future prospective studies could benefit from including such reproducibility assessments.
Our study has several strengths. It was longitudinally designed to observe future fracture events in patients who did not have baseline fractures. Furthermore, in the development cohort, we used controls with matched clinical variables, which made it possible to attenuate the effects of the major clinical variables in the model. The model was also externally validated, which helped prove its generalizability. In addition, the model used the image itself as an input, which made it possible to utilize information on vertebral bone and muscle quality and quantity. This inclusion of the muscle image reflected the interplay between muscle health and fracture risk. For instance, factors such as muscle mass and muscle steatosis, which are visible in CT images as darker and more heterogeneous areas compared to normal muscle, could be crucial inputs. These muscle attributes, automatically analyzed by the CNN, contribute significantly to the model’s ability to discern patients at higher risk of fractures, offering a more comprehensive view than bone analysis alone. Moreover, by sequentially adding bone and muscle images to the model, it was possible to assess the degree to which muscles and bones contributed to model performance, thereby increasing the interpretability of the model. Finally, the differences in clinical characteristics between the development and external validation sets were purposefully leveraged to assess the generalizability of our model across populations with varying clinical profiles.
In this study, we showed that a deep learning model of the CNN-RNN structure based on CT images of the muscle and vertebral bone could help predict the risk of vertebral fractures. The model using images of the vertebral bone and muscle showed better performance than the model using images of the vertebral bone-only. This implies that the information from the muscle images provides additional key information for predicting fractures. In addition, the model using images showed better performance than the model using clinical variables, suggesting that images can provide useful information in addition to having known clinical variables. This study has clinical significance in suggesting that opportunistic CT screening with deep learning algorithms utilizing bone and muscle images may contribute to identifying patients with a high fracture risk in the future. Further prospective studies are needed to broaden the applicability of our model.
The study was funded by the National Research Foundation of Korea (grants 2020R1A2C2011587 and 2021R1A2C2003410).
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.
None declared.
Supplementary table and figure.
AUROC: area under the receiver operating characteristic curve
BMD: bone mineral density
CNN: convolutional neural network
CT: computed tomography
DXA: dual-energy X-ray absorptiometry
FRAX: fracture risk assessment
RNN: recurrent neural network
Edited by A Mavragani; submitted 27.04.23; peer-reviewed by M Liu, SH Lee, R Mpofu; comments to author 07.12.23; revised version received 27.01.24; accepted 30.05.24; published 12.07.24.
©Sung Hye Kong, Wonwoo Cho, Sung Bae Park, Jaegul Choo, Jung Hee Kim, Sang Wan Kim, Chan Soo Shin. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 12.07.2024.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
Employees are quitting. The talent gap is widening. And leaders are grappling with the hybrid dilemma—what an imminent return to the office might look like and why.
In this episode of McKinsey Talks Talent , HR expert David Green, coauthor (with Jonathan Ferrar) of Excellence in People Analytics (Kogan Page, July 2021), speaks with McKinsey’s Bryan Hancock and Bill Schaninger about a talent market in the throes of changes—and how HR leaders can use people analytics to navigate the current inflection point successfully.
The McKinsey Talks Talent podcast is hosted by Lucia Rahilly.
Lucia Rahilly: So, David, we are closing in on two years of the COVID-19 pandemic. It’s obviously been a massively challenging crisis that has put both lives and livelihoods at risk for employees across the globe. What has this crisis meant for the role of HR?
David Green: Well, I suppose it’s been HR’s chance to shine, and in many companies, it has. There’s an elevated role for the CHRO (chief human resources officer), which means more expectations for the function, a thirst for data to drive decisions around people, more interest from the C-level, and more demands from the C-level as well. Good people analytics teams have been more focused around employees and understanding how employees are feeling at the various stages of the pandemic, and then are building that into their approach to hybrid work.
Bryan Hancock: I think that the role of HR and the role of the CHRO is going to continue to be elevated for the next few years. The pandemic was a unique human event that affected individuals, that affected people. Now, as we come back and we adjust to the new normal, HR has an opportunity to continue to step up, to continue to innovate and continue to use data, facts, and insights in how they guide, not just intuition.
Bill Schaninger: There’s a little bit of “watch what you wish for.” Because now—while HR is unbelievably front and center, with critical roles and critical pools and “how are we going to respond to the return to the office,” et cetera—that same light shines on deficiencies in the function.
Where maybe in the past HR might’ve carried some folks who were pleasant or good order takers or good caretakers, what HR is demanding now is being numerate, understanding the value tree, really knowing how to use analytics—all that kind of stuff. It’s all laid bare now. So, it is a wonderful time for the function, but the bar for individuals has been raised dramatically.
Lucia Rahilly: In the US we hear so much about what some are calling the Great Resignation and we are calling the Great Attrition : employees reassessing their priorities and quitting their jobs at record rates.
David Green: There’s a lot of column inches devoted to it here, and in Europe as well, although maybe not quite as much as in the US. We work with around 90 large global organizations (David Green is managing partner of Insight222), about half of them headquartered in the US. The person I’m speaking to is usually the head of people analytics. They’ve had a lot of panicked executives saying, “Oh my God, everyone’s leaving.” When they actually look at their data, in most cases it’s not more than they would normally expect it to be. They’re seeing the numbers—maybe a little bit higher, in some cases, than in 2019—but maybe what we would expect as almost a correction from 2020. In most cases, certainly, the companies that I’ve spoken to, they’re not seeing numbers that justify some of the panic at the moment.
Bryan Hancock: I think what we are seeing is that some people are choosing to leave the workforce and not necessarily go to another job. When you look in the US at workforce participation rates, they’re down. If you look at who is leaving, it’s disproportionately women , and it is disproportionately people toward retirement age. When we start looking at other populations, that’s where you need a real focus on the facts and the insights that let you solve the problems around flexibility, comfort of coming to work—whatever it may be. Which is to say, let’s not necessarily paint with a broad brush. Although we do see dissatisfaction broadly, let’s really dive into who’s leaving and why.
Bill Schaninger: One of the things I’ve been toying with, and I don’t know that we have a great answer, is do we have to do a fundamental reset, almost, on the offer? We’re facing this moment where there’s been a fixation on wages, but even as the wages have gone up, in many cases to $25, $30 an hour, you’re missing the point, which is “who I work for, the conditions I work under, the nature of the interactions—it has to be better.” I’m curious as to your experience of that part. It’s beyond the data, you know what I mean? This idea of a higher calling.
David Green: I definitely think there’s a purpose that people want to have at work, and that’s now coming out. Employee expectations have gone up. It’s been happening for a while, but maybe the pandemic acts as a bit of a catalyst to this. What’s fascinating is some of the research that you’ve been doing at McKinsey. There’s a growing disconnect between executives around the return to work and employees who aren’t ready yet. Generally, there are large numbers that want more hybrid work moving forward.
I wonder if one of the consequences of the Great Resignation and all the press around it is that maybe some of these executives will start to be a bit more flexible and come closer to what employees are looking for in the hybrid workplace, which actually will benefit executives in the long run. Maybe there’ll be a good consequence of all the column inches that have been written about the Great Resignation.
I like the way that you guys have kind of reframed it as the Great Attraction , depending on the way a company approaches it.
Lucia Rahilly: Bill or Bryan, are you seeing that shift in mindset among employers—toward embracing, or at least being more accepting of, a hybrid culture?
Bill Schaninger: I think two-thirds are still in the stage of “it’s either transitory or slowly they’ll come to their senses, and we’re going to bring them back.” Maybe a third are wrapping their heads around the hybrid model and saying, “Well, this could be pretty interesting.”
Lucia Rahilly: David, tell us a little bit about what we’re talking about when we discuss people analytics and how it helps HR leaders improve retention during this interval of churn.
David Green: Excellence in People Analytics has got 30 case studies of real-life people analytics in companies. There’s a couple that touch on attrition. What can people analytics do? I think that the key thing is to separate the signal from the noise. It can help organizations understand if they actually have a problem with attrition and, if so, where, what job families, what locations? Is it people that have been tenured for a certain time? Is it certain groups? As Bryan said, it is women who are disproportionately leaving the workplace.
If attrition is a problem, what can you do about it? If it’s in parts of the business that you’re either looking to divest or invest in less, attrition can arguably be your friend. If it’s in areas of the business that you’re really trying to grow, and people are leaving and going to your competitors, then clearly it’s a problem that you want to try and address. But you need to understand why people are leaving—if they are leaving—before you can even think about what you can do to solve it.
Lucia Rahilly: Bryan, walk us through some of the Great Attrition/Great Attraction research that we did.
Bryan Hancock: There’s a disconnect between what an employer and an employee think the main issue is. The employer is saying, “Hey, people must be leaving for another job, a better job, and better pay.” Employees are saying, “No, I’m leaving because I don’t feel valued at work.” Even asking the right questions and getting the right frame can—before you get more advanced forms of analytics on it—bring a fact-based and broader lens to make sure we’re having the right conversation.
A really good people analytics function combines the broad view—the broad understanding of organizational research, the broad understanding that this is a field that’s been around for a while, and we know what motivates people—and then brings that to bear to highlight individual facts.
Bill Schaninger: We started getting the data back, and I said, “Isn’t it an interesting pattern here, all the things that the managers are saying are exogenous: ‘the employee is maximizing for the money, my competitor is being foolish about raising the floor.’” It’s everything that was outside them, that allowed them to point the finger at someone else. Managers should just hold up the mirror to themselves and realize that they’ve caused this environment where employees don’t feel valued. They don’t feel well looked after. They feel like they’re a piece of machinery.
I’m hopeful we can help managers without, maybe, poking them in the eye so much, but maybe it takes a little poke in the eye.
David Green: Microsoft has published some research that they’ve been doing during the pandemic. They found that managers are even more important in a remote or hybrid work environment. They need to be checking in, to be doing one-on-ones regularly. If they’re not, don’t be surprised if people get demotivated and decide to leave. Understanding that is the job of people analytics.
Then we can start doing something about attrition, which is a problem in organizations, and start to nudge managers and leaders around behaviors that will actually encourage people to stay because they feel valued, they feel looked after, they’re given a great employee experience. If you do these sorts of things, then people are going to be much less inclined to look elsewhere.
Yes, people sometimes will get a 40 percent pay raise on a new job. That’s just going to happen. There’s not much you can do about that. You can obviously make sure that you’re paying market rates or above-market rates, if that’s what you want to do. But I think that by creating the right culture in the organization and making people feel more valued, you can keep people more than you lose.
Bryan Hancock: The point of the research on the middle manager is exactly what we’re seeing at our clients. In the course of the pandemic, what we saw is that some people are naturally very good managers—they knew how to check in, how to use the one-on-ones. Then, on the other end of the spectrum, there are some people that never checked in.
At one point during the pandemic, there was a survey, and 40 percent of the employees surveyed said that no one had called to check in on them—no manager, no individual. And those people were 40 percent more likely to be exhibiting some sign of mental distress. I think companies are now recognizing that and saying, “Hey, if the role of the manager got elevated during the pandemic, what does it mean in a hybrid world?” And a number of organizations are now saying, “Gosh, if it mattered when everybody was remote, doesn’t it matter at least as much, if not more, when we’ve got a mixed model, with some people in the office, some remote? Don’t we need to have those one-on-one coaching skills, as well as intentionality about when we’re all coming together as a team and when we’re separated?”
David Green: That, again, is where people analytics teams come in—listening to employees, conducting regular pulse surveys, looking at some of the passive data as well. By looking at some of the metadata, people analytics teams can see the managers who are checking in regularly with their employees and understand the behaviors that drive engagement, that drive performance from teams.
Lucia Rahilly: David, in our Women in the Workplace research, we saw that women managers were much likelier than men managers to call to see how their reports were doing. We also know from other research that women and people of color have been among the most affected during the pandemic and that people of color, in particular, are more likely than White employees to attribute quitting to a lack of a sense of belonging in the organization. Do you see analytics as playing a role in promoting diversity, equity, and inclusion in the workplace?
David Green: We conducted annual research among over a hundred organizations this year. And one of the questions we asked was, “What are the top three areas in your organization where people analytics is adding the most value?” Diversity, equity, and inclusion came out on top—54 percent of respondents included that in their top three. And that’s gone up significantly since we did that research last year.
Now we’re seeing that people analytics is really helping organizations move beyond counting diversity, to measuring inclusion. We’re still at the early stages of that, in many respects. Companies are starting to understand the importance of inclusion and belonging. They’re measuring it in surveys, and they’ve got people analytics teams that can be on top of that as well.
Second, by looking at some of the passive network analysis as well, you can start to understand the links and the strength of relationships within teams and between teams. I think that is helping. Leaders want to be better at diversity, equity, and inclusion and to meet the expectations of the employees. They also want their organizations to be better at diversity, equity, and inclusion.
Bill Schaninger: That pivot toward moving upstream and asking, “What’s the felt experience”—that really encouraged us to go back and look at how we were measuring inclusion, not just to ask a few “engagement-y” questions but instead to ask, “What’s your sense of the organization overall; what are you personally experiencing in your company and team and with your manager?”
Bill Schaninger: When you think about the advanced math, how do you get some of these insights without losing people in the math? At some of our clients, you get some really cool quant jocks, and they lose everyone on the third word.
David Green: It’s turning that complex math into a compelling story that’s going to resonate with whichever audience you’re delivering it to and the impact that it has on the objective. I wonder if, in addition to the kind of active-based network analysis that’s been going on for years, we now have the technology to do this at scale, looking at some of the metadata. Of course, you need to be careful about the ethics and the privacy and make sure there’s benefit for employees, of course, as well. But you’re right. You’ve got to take quite complicated insights and turn them into a compelling story that drives action you can then measure.
Bryan Hancock: We took our new inclusion assessment and put it in our inaugural Race in the Workplace survey. Our focus was on Black leaders in corporate America. What became clear is that more Black workers in corporate America were leaving before they ever got promoted. But the numbers were so small, in terms of the absolute—maybe out of every hundred workers, you might’ve had one or two more Black workers than expected leave and one or two fewer White workers.
So from an awareness standpoint, an individual manager wouldn’t pick it up. But when you look at the data you say, “This is like an invisible revolving door. What’s going on in there?” That’s something that makes an executive say, “OK, I now know what I need to do with our new entry-level diverse talent. I know I need to focus on that. Now, let me go back and figure out the next level of detail. What are the levels of initiatives? How do I check up on this? How do I follow up?”
David Green: Another thing from network analytics is that I’ve seen a few examples where high-performing women who don’t have strong networks at the senior level don’t get promoted and leave the organization. Men, who are quite good, generally, at changing their networks as they move up an organization, were getting promoted. I think the academic research backs this up. When you make people aware of that, then they might change their behaviors and consciously build those networks.
Bill Schaninger: David, you said something earlier that was interesting about the challenges with privacy. The US has some challenges on the data front, often around security and what you’re doing with hashing and things like that. Europe, I’ve always found to be way more sensitive to the idea of a “Big Brother-ish” tracking of my movements. In your experience, what’s the balance there? Because the insight you can get from this is pretty awesome.
David Green: I think you’re right. There is a balance there. At any organization that wants to use people analytics, it’s OK to start small, be transparent right from the start, think about the benefit for the employees, and ask, “What are the benefits that we’re trying to drive out of this? What is the business problem we’re trying to solve?”
You’ve got to speak to your privacy team. You’ve got to engage with works councils in Europe. You have to clearly articulate the benefit for employees and how you’re going to protect that data. It can be frustrating in terms of time because it slows up the process at the start. But as you say, you can get some really, really rich insights out of some of these technologies that actually have a really clear and direct benefit for employees and the business.
Bryan Hancock: Are you seeing a link now between the people analytics team and the real-estate team? We’re hearing a lot of organizations start to ask, “What should the workspace look like? What does it do?” Are you seeing linkages across the teams or are they existing in silos?
David Green: They’re definitely starting to see linkages, particularly in the companies that are maybe more advanced in people analytics. They are bringing exactly that sort of data together. And they’re thinking, “OK, in some parts of the world, our people are back in the office, but we’ve got these hybrid work models in place now. People are using the office differently. We need to measure how people are using the office and then redesign the workplaces with intention.” And I think we’ll see more of that in the next 18 months, two years.
Bill Schaninger: Maybe six months ago, Bryan, you and I had a run of webinars. There was an architect from Atlanta talking about “repurposing the space.” So much of this was around flexibility. I think the consensus was that we’d been on a two- or three-decade run about increasing the density and lowering the square footage per person, and we were perfectly happy when we had teleworkers and remote workers. Now we may need to go in another direction and pay a little bit more for configurability if we’re talking about a combination of individual work, team-based work, or even lecture hall kind of communication.
The people agenda now is almost stemming the tide of dramatically increasing the span of bosses, increasing the density of office space, hoteling—that whole thing. So much of this had almost gone unchecked. Now we’re saying, “Hey, if we want to bring them back, we’ve got to use the workspace differently.”
I’ve found in Europe that you often have a bit more intervention on things like sunlight. Are you seeing that or is that a US thing and we’re just late to the party?
David Green: We’re definitely seeing that. It makes sense, doesn’t it? Part of understanding people is understanding how they use workspaces. If we can make workspaces more productive, then that’s good. People become more productive; hopefully, more engaged; and maybe less likely to leave, as well.
Lucia Rahilly: What’s the role of analytics in helping HR leaders to fill the surging volume of open roles as folks quit and the talent gap widens?
David Green: The two big use cases of people analytics, going back years, have been attrition and recruiting. It’s almost like coming back full circle now in many respects. The people analytics teams have access to technology that can really help companies. We’re using analytics to automate parts of the recruitment process. In many respects, that actually widens the funnel. By automating, you can potentially open up the process and get a more diverse set of people applying in the first place, which is obviously good.
One big investment bank I spoke to recently is using analytics to help hiring managers understand how many applicants they’re likely to get based on the education and experience requirements in their role profiles—and how tweaking one or two things might change the applicant pool. If they change the language they’re using in the job description, for example, they might get more female applicants when they’re looking for a software engineer. I think analytics is playing a big, big role in that. You can look at analytics across the recruitment process. You can start to see where you might be suffering significant candidate drop-off. You can start to understand whether you have a problem converting offers into acceptances.
I would argue that recruiting doesn’t stop once the person starts. You also need to think about onboarding. You need to understand whether managers are having one-to-ones with new starters in the first week or two. Does that have an impact on people’s time to productivity? Does that have an impact on first-year attrition? There’s so much that analytics can do.
And then the other bit that I haven’t mentioned is bringing some of that external data in to understand things like supply of talent, demand of talent, locations where we might want to hire talent—particularly now that hybrid’s potentially opening the game around that as well.
Bill Schaninger: You mentioned framing the description of the job as a way of making it more appealing to candidates. I’m assuming that the lexicon you’re using triggers different behavior. That’s great.
David Green: Using natural-language processing helps to understand words that may put off female applicants or other groups. There’s academic research which says if you put bullet points on a job description, men will apply if they meet half of them; women tend not to apply unless they feel they meet at least 90 percent of them. So the more bullet points, the more you can have a very biased male slate, perhaps.
Bryan Hancock: Have you seen organizations navigate and manage through all of the new offerings and make sure that they pick the types of data and types of insights that will matter most to them, not just the ones that seem cool to a person who heard about them on a podcast?
David Green: You probably need someone on your team spending half their time scanning and understanding the market, running proofs of concept. A lot of the smaller vendors will do that, but you’re right; it’s not there everywhere. So now the regulators are coming in. There’s recent regulation in New York around using AI in the hiring process. The US Equal Employment Opportunity Commission is looking into it as well—the use of algorithms in hiring and people management generally. But, of course, the most important thing is that you’ve got to make sure that what these tools are telling talent acquisition professionals is actually valid. You’ve got to be careful around bias. Particularly if you’ve got a problem with diversity in your organization, you don’t want to perpetuate that through hiring as well.
Lucia Rahilly: Last question: Where do HR leaders stand in terms of their own skills in data-driven decision making? Do you see that there’s work to be done?
David Green: I think there’s work to be done. We did a survey with a focus on data-driven culture. Over a hundred companies participated. Ninety percent said that their CHROs have now communicated that people analytics is a core component of HR strategy, but only 42 percent said that their companies have a data-driven culture for HR at the moment. You could argue that the first sign is that the CHRO says it’s important and uses these data in conversations with executives. Maybe they celebrate people in the HR team who are using data, setting that as an example to others, making it very clear that it’s expected.
And there are technologies coming in that are enabling organizations to democratize the data, both for HR’s business partners, who are particularly important in this, but also for managers in the business. This is a big change for HR, so you’ve got to bring in change management and support people through that process. Data literacy is a core skill that they need to have.
Bryan Hancock: I think HR is well along the journey. We now have an understanding that HR is no longer just in the business of feeling good about people. It is in the business of bringing data, facts, and insight into the people side of work. I think there is a real understanding and appreciation of that across the board.
What we’re doing is shifting the skills of folks who used to deal with transactional issues and may have dealt with investigations—a number of things that required a different skill set. Now we’re shifting them not just to have data literacy but also to ask the right questions, to synthesize in the right way, and to compellingly advocate for solutions. The next push on analytics isn’t just the analytics but how to equip the team to use it.
David Green: It’s absolutely key. There is this mistaken idea that suddenly everyone in HR needs to become a data scientist or a statistician. But as you said, the important thing is the ability to ask the right questions and maybe to work with the business to develop hypotheses you can test with analytics. Then it’s communicating the insights and driving the change in order to implement them.
Lucia Rahilly: Let’s close there. David, thanks so much for being with us today.
David Green: Well, it’s been a pleasure. I’ve really enjoyed the conversation.
David Green is managing partner of Insight222. Lucia Rahilly is global editorial director of McKinsey Global Publishing and is based in McKinsey's New York office.
Comments and opinions expressed by interviewees are their own and do not represent or reflect the opinions, policies, or positions of McKinsey & Company or have its endorsement.